Problems with using lines of code to measure the size of a product

Internal product attributes describe a software product in a way that depends only on the product itself. The main reason for measuring internal product attributes is that it helps to monitor and control the product during development.

Measuring Internal Product Attributes

The main internal product attributes are size and structure. Size can be measured statically, without having to execute the product, and it gives an indication of the effort needed to create the product. Similarly, the structure of the product plays an important role in planning its maintenance.

Measuring the Size

Software size can be described with three attributes −

  • Length − It is the physical size of the product.

  • Functionality − It describes the functions supplied by the product to the user.

  • Complexity − Complexity is of different types, such as −

    • Problem complexity − Measures the complexity of the underlying problem.

    • Algorithmic complexity − Measures the complexity of the algorithm implemented to solve the problem.

    • Structural complexity − Measures the structure of the software used to implement the algorithm.

    • Cognitive complexity − Measures the effort required to understand the software.

The measurement of these three attributes can be described as follows −

Length

There are three development products whose size measurement is useful for predicting the effort needed for development. They are the specification, the design, and the code.

Specification and design

These documents usually combine text, graphs, and special mathematical diagrams and symbols. Specification measurement can be used to predict the length of the design, which in turn is a predictor of code length.

The diagrams in the documents have a uniform syntax, such as labelled digraphs, data-flow diagrams, or Z schemas. Since specification and design documents consist of text and diagrams, their length can be measured as a pair of numbers representing the text length and the diagram length.

For these measurements, the atomic objects are to be defined for different types of diagrams and symbols.

The atomic objects for data flow diagrams are processes, external entities, data stores, and data flows. The atomic entities for algebraic specifications are sorts, functions, operations, and axioms. The atomic entities for Z schemas are the various lines appearing in the specification.
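As a minimal sketch of the pair-of-numbers idea, the Python snippet below measures a specification fragment as (text length, diagram length). It assumes that text length is counted in words and that the diagram has already been reduced to a list of element kinds. The function name `document_length`, the kind labels, and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Atomic object kinds for a data-flow diagram, as listed in the text.
DFD_ATOMIC_KINDS = {"process", "external_entity", "data_store", "data_flow"}


def document_length(text: str, diagram_elements: list[str]) -> tuple[int, int]:
    """Return (text length, diagram length) for one document.

    Assumption: text length is counted in words. Diagram length is the
    number of elements whose kind is one of the atomic object kinds.
    """
    text_length = len(text.split())
    kinds = Counter(diagram_elements)
    diagram_length = sum(kinds[kind] for kind in DFD_ATOMIC_KINDS)
    return text_length, diagram_length


# Hypothetical specification fragment with a small data-flow diagram.
spec_text = "The system accepts an order and stores it for later billing."
spec_diagram = [
    "external_entity",  # customer
    "process",          # accept order
    "data_store",       # orders
    "data_flow",        # customer -> accept order
    "data_flow",        # accept order -> orders
]

length_pair = document_length(spec_text, spec_diagram)
print(length_pair)
```

Other counting rules, such as characters or sentences for the text part, would fit the same pair structure.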

Code

Code can be produced in different ways, such as with procedural languages, object-oriented languages, and visual programming. The most commonly used traditional measure of source code length is lines of code (LOC).

The total length,

LOC = NCLOC + CLOC

i.e.,

LOC = non-commented lines of code + comment lines of code
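The decomposition can be illustrated with a small Python sketch. It rests on several assumptions that a real counting standard would have to spell out: comments start with `#`, blank lines are not counted at all, and a line holding both code and a trailing comment is counted once, as NCLOC. The function name `count_loc` and the sample program are hypothetical.

```python
def count_loc(source: str) -> dict[str, int]:
    """Count NCLOC, CLOC and LOC for source text that uses '#' line comments."""
    ncloc = 0
    cloc = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue      # assumption: blank lines are not counted
        if stripped.startswith("#"):
            cloc += 1     # comment-only line
        else:
            ncloc += 1    # code line; a trailing comment does not change this
    return {"NCLOC": ncloc, "CLOC": cloc, "LOC": ncloc + cloc}


sample = """\
# Compute the area of a rectangle.
def area(width, height):

    # Width and height are assumed non-negative.
    return width * height  # trailing comment
"""

counts = count_loc(sample)
print(counts)
```

Changing any of these rules changes the count for the same program, which is why LOC figures are only comparable when the counting convention is stated.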

Apart from lines of code, alternatives such as the size and complexity measures suggested by Maurice Halstead can also be used for measuring length.

Halstead’s software science attempted to capture different attributes of a program. He proposed three internal program attributes, namely length, vocabulary, and volume, that reflect different views of size.

He began by defining a program P as a collection of tokens, each classified as either an operator or an operand. The basic metrics for these tokens were −

  • μ1 = Number of unique operators

  • μ2 = Number of unique operands

  • N1 = Total occurrences of operators

  • N2 = Total occurrences of operands

The length of P can be defined as

$$N = N_{1}+ N_{2}$$

The vocabulary of P is

$$\mu =\mu _{1}+\mu _{2}$$

The volume of the program, i.e., the number of mental comparisons needed to write a program of length N, is

$$V = N\times \log_{2} \mu$$

The program level of a program P of volume V is,

$$L = \frac{V^\ast}{V}$$

where $V^\ast$ is the potential volume, i.e., the volume of the minimal-size implementation of P.

The inverse of level is the difficulty −

$$D = 1 / L$$

According to Halstead's theory, we can calculate an estimate $L'$ of the level L as

$$L' = 1 / D = \frac{2}{\mu_{1}} \times \frac{\mu_{2}}{N_{2}}$$

Similarly, the estimated program length is $\hat{N} = \mu_{1}\times \log_{2}\mu_{1}+\mu_{2}\times \log_{2}\mu_{2}$

The effort required to generate P is given by,

$$E = V / L = \frac{\mu_{1}N_{2}N\log_{2}\mu}{2\mu_{2}}$$

where the unit of measurement of E is the number of elementary mental discriminations needed to understand P.
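The formulas above can be tried out with a short Python sketch. It assumes the tokens have already been classified by hand into operators and operands, since that classification depends on the language and the counting rules. It also uses the estimated level in place of L when computing effort, as the right-hand side of the effort formula does. The function name `halstead_metrics` and the sample statement `z = x + y * x` are hypothetical.

```python
import math


def halstead_metrics(operators: list[str], operands: list[str]) -> dict[str, float]:
    """Compute Halstead's size metrics from pre-classified token lists.

    Both lists must be non-empty, otherwise the estimates are undefined.
    """
    mu1 = len(set(operators))   # number of unique operators
    mu2 = len(set(operands))    # number of unique operands
    n1 = len(operators)         # total occurrences of operators (N1)
    n2 = len(operands)          # total occurrences of operands (N2)

    length = n1 + n2                                          # N
    vocabulary = mu1 + mu2                                    # mu
    volume = length * math.log2(vocabulary)                   # V
    level_est = (2 / mu1) * (mu2 / n2)                        # estimated level
    difficulty = 1 / level_est                                # D
    length_est = mu1 * math.log2(mu1) + mu2 * math.log2(mu2)  # estimated length
    effort = volume / level_est                               # E
    return {
        "N": length,
        "mu": vocabulary,
        "V": volume,
        "L_est": level_est,
        "D": difficulty,
        "N_est": length_est,
        "E": effort,
    }


# Hand-classified tokens of the hypothetical statement: z = x + y * x
operators = ["=", "+", "*"]
operands = ["z", "x", "y", "x"]

metrics = halstead_metrics(operators, operands)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For this statement the unique counts are 3 operators and 3 operands, the totals are N1 = 3 and N2 = 4, and so the length is 7 and the vocabulary is 6.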

The other alternatives for measuring the length are −

  • In terms of the number of bytes of computer storage required for the program text

  • In terms of the number of characters in the program text
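These two measures can differ for the same program text, as the following Python sketch shows. It assumes the text is stored in UTF-8. With another encoding the byte count would change while the character count would not. The sample program text is hypothetical.

```python
# Program text containing one non-ASCII character (e with acute accent).
program_text = 'print("h\u00e9llo")\n'

length_in_characters = len(program_text)
length_in_bytes = len(program_text.encode("utf-8"))  # assumption: UTF-8 storage

print(length_in_characters, length_in_bytes)
```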

Object-oriented development suggests new ways to measure length. Pfleeger et al. found that a count of objects and methods led to more accurate productivity estimates than those using lines of code.
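A rough way to obtain such counts for Python source is sketched below with the standard `ast` module. It treats each class definition as a stand-in for an object and counts only functions defined directly in a class body as methods. Both choices are assumptions made for illustration, not the counting rules used in the study mentioned above. The function name and sample classes are hypothetical.

```python
import ast


def count_classes_and_methods(source: str) -> tuple[int, int]:
    """Return (number of classes, number of methods) found in Python source."""
    tree = ast.parse(source)
    classes = [node for node in ast.walk(tree) if isinstance(node, ast.ClassDef)]
    # Count functions defined directly in each class body as methods.
    methods = sum(
        isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
        for cls in classes
        for item in cls.body
    )
    return len(classes), methods


sample_source = """
class Account:
    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount


class Ledger:
    def record(self, entry):
        pass
"""

class_count, method_count = count_classes_and_methods(sample_source)
print(class_count, method_count)
```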

Functionality

The amount of functionality inherent in a product gives a measure of product size. There are many different methods for measuring the functionality of software products. We will discuss one such method ─ Albrecht’s Function Point method ─ in the next chapter.

What is the problem with using lines of code?

The problem with using lines of code per day as a productivity metric is that it measures the complexity of the solution, not the complexity of the problem. Like most metrics, it also means very little without context. For these reasons, LOC is a poor indicator of productivity.

What are the disadvantages of LOC?

Drawbacks of LOC −

  • It is defined only on code. For example, it cannot measure the size of a specification.

  • It characterises only one specific view of size, namely length. It takes no account of functionality or complexity.

  • Bad software design may lead to an excessive number of lines of code.

  • It is language dependent.

  • Users cannot easily understand it.

Why lines of code LOC is not an efficient method for estimating project size?

Because LOC only counts the volume of code, it can only be used to compare or estimate projects that use the same language and follow the same coding standards.

What are the units of measuring lines of code?

The unit for this metric is LOC, the abbreviation of lines of code, and its symbol is Ss. Because programs can be long, the unit KLOC, for one thousand lines of code, is also used. This unit has the symbol S.