HALSTEAD's SOFTWARE METRICS

Divya Srinivasan

Halstead’s Software Metrics:

Halstead’s metrics are based on the idea that a computer program is essentially an implementation of an algorithm made up of tokens, which can be classified as either operators or operands.

"A computer program is an implementation of an algorithm considered to be a collection of tokens, which can be classified as either operators or operands."
— Maurice Halstead

Token Count in Halstead's Metrics:

In this approach, a program is viewed as a series of tokens—the smallest elements of code that carry meaning. These are grouped into two categories:
  • Operators: Symbols or keywords that perform actions (e.g., +, if, return)
  • Operands: Variables or values on which the operators act
All of Halstead’s software metrics are derived from the following basic counts:
  • n1 = Number of distinct operators
  • n2 = Number of distinct operands
  • N1 = Total occurrences of operators
  • N2 = Total occurrences of operands

Program Size (N):

The total size of the program, in terms of tokens used, is calculated as:
N = N1 + N2
This total reflects the overall length of the program based on both operators and operands.

Halstead's Software Metrics:

Program Volume (V):

Definition:
Represents the size of the program in bits, assuming a uniform binary encoding of the program's vocabulary.
Formula:
V = N * log₂(n)
Where:
N = Total number of tokens (operators + operands)
n = Total number of unique tokens (distinct operators + operands)

Program Level (L):

Definition:
Measures the level of abstraction in the program.
L = 1 indicates a highly abstract and efficient program (minimum volume).
Formula:
L = V* / V
Where:
V* = Theoretical minimum volume
V = Actual volume of the program

 Program Difficulty (D):

Definition:
Estimates how difficult the program is to understand or modify. Difficulty increases with more unique operators and operands.
Formula:
D = (n₁ / 2) * (N₂ / n₂)
Where:
n₁ = Number of distinct operators
n₂ = Number of distinct operands
N₂ = Total occurrences of operands

Programming Effort (E):

Definition:
Represents the mental effort required to develop or understand the program.
Formulas:
E = V / L
or
E = D * V
Where:
V = Program volume
D = Difficulty
L = Program level
Unit: Elementary mental discriminations

 Estimated Program Length (N̂):

Definition:
Halstead proposed that the length of a well-structured program depends only on the number of unique operators and operands.
Formulas:
Actual length:
N = N₁ + N₂
Estimated length:
N̂ = n₁ * log₂(n₁) + n₂ * log₂(n₂)

 Alternate Formulas for Estimated Length:

Additional approximations to estimate program length:
  • NJ = log₂(n₁!) + log₂(n₂!)
  • NB = n₁ * log₂(n₂) + n₂ * log₂(n₁)
  • NC = n₁ * √n₁ + n₂ * √n₂
  • NS = (n * log₂(n)) / 2

Halstead’s Extended Metrics:

In addition to basic software science metrics, Halstead also introduced concepts like potential minimum volume, vocabulary size, and language level, which help evaluate the complexity and abstraction of a program more deeply.

 Potential Minimum Volume (V*)

Definition:
Represents the theoretical minimum volume required to express a solution to a problem, using only the essential input and output operations.
Formula:
V* = (2 + n₂*) * log₂(2 + n₂*)
Where:
n₂* = Number of unique input and output parameters
This value helps compare the actual program volume to the minimal necessary size.

Size of Vocabulary (n)

Definition:
Measures the total number of unique tokens used to construct the program.
Formula:
n = n₁ + n₂
Where:
n₁ = Number of unique operators
n₂ = Number of unique operands
n = Program vocabulary size
A larger vocabulary typically indicates a more complex program.

Language Level (L')

Definition:
Indicates how efficiently an algorithm is expressed in a specific programming language. Higher-level languages require less effort to express the same logic compared to lower-level languages.
Formula:
L' = V / D²
Where:
V = Actual program volume
D = Program difficulty
A higher L' implies a higher-level language or a more expressive implementation.

Language Quality Indicator (λ or lambda)

Definition:
Used to assess the quality of the implementation based on the language level and the potential volume.
Formulas:
λ = L * V*
λ = L² * V
These expressions compare actual implementation efficiency against the theoretical minimum.


Counting Rules for C Language

Used for Halstead’s Software Science Metrics

When applying Halstead’s metrics to C programs, specific rules are followed to accurately count operators and operands.

What to Include and Exclude

Exclude:

Comments
Identifier declarations
Function declarations

Include:

  • Variables and constants → Counted as operands
  • Function calls → Counted as operators
  • Looping constructs (for, while, do-while) → Counted as operators
  • Control statements (if, else, switch, case) → Counted as operators
  • Reserved keywords like return, break, continue, sizeof, etc. → Counted as operators
  • Symbols like brackets [ ], braces { }, commas ,, and semicolons ; → Counted as operators
GOTO statements:

1.
goto → Operator
2.Associated label → Operand
  • Unary and binary operators (+, -, *) → Counted separately for each use-case
Array variables:

1.array_name[index] → array_name and index = operands
2.[] = operator

Structure variables:

1.struct.member or struct->member →
struct and member = operands
. and -> = operators
2.Members with the same name in different structures = counted as unique operands

Global variables:
  • Used across different modules = counted as multiple occurrences
Local variables:
  • Same name in different functions = counted as unique operands
Ignore:

All preprocessor directives (#include, #define, etc.)

Example Application

Consider a C program that implements a sorting algorithm (as shown in the figure).

Task:

  • Identify and list all operators and operands
  • Calculate Halstead metrics:
  • Vocabulary (n)
  • Program length (N)
  • Volume (V)
  • Difficulty (D)
  • Effort (E)
  • Language level (L′)
  • Language quality (λ)

 Solution:

A table can be constructed listing each unique operator and operand, along with their respective counts (n₁, n₂, N₁, N₂). Using these values, you can then apply Halstead’s formulas to compute the software science measures.

Given Data
  • 𝑁1=53
  • 𝑁2=38
Program Length

𝑁=N1+N2=53+38=91

Vocabulary of the Program

n=n1+n2=14+10=24

Volume

V=N×log2n=91×log224=417 bits

Estimated Program Length

N^=(14×3.81)+(10×3.32)=53.34+33.2=86.45\hat{N} = (14 \times 3.81) + (10 \times 3.32) = 53.34 + 33.2 = 86.45
Conceptually Unique Input and Output Parameters (
n2=3
  • x: Array holding integers to be sorted (used as both input and output)
  • N: Size of the array to be sorted
Potential Volume

V=5log25=11.6

Program Level

L=VV

Halstead’s Software Metrics

We use the formula:

E^=VL=D×V\hat{E} = \frac{V}{L} = D \times V
Therefore, 10,974 elementary mental discriminations are required to construct the program.

Interpretation
This is likely a reasonable effort and time estimate for producing such a simple program.

Our website uses cookies to enhance your experience. Learn More
Accept !