[Published in Open Source For You (OSFY) magazine, October 2013 edition.]

Sparse is a semantic parser written for static analysis of the Linux kernel. Here’s how you can use it to check C code, including the kernel sources.

Sparse implements a compiler front-end for the C programming language, and is released under the Open Software License (version 1.1). You can obtain the latest sources via git:

$ git clone git://git.kernel.org/pub/scm/devel/sparse/sparse.git 

You can also install it on Fedora using the following command:

$ sudo yum install sparse

Passing ‘C=1’ to the make command when building the Linux kernel runs Sparse on only those C files that need to be recompiled. Using ‘make C=2’ runs Sparse on all the source files, whether or not they need to be recompiled. Sparse supports a number of options that produce useful warning and error messages. To disable a warning, use the ‘-Wno-option’ syntax, where ‘option’ is the name of the warning. Consider the following example:

void
foo (void)
{
}

int
main (void)
{
  foo();

  return 0;
}

Running sparse on the above decl.c file gives the following output:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include decl.c

decl.c:2:1: warning: symbol 'foo' was not declared. Should it be static?

The ‘-Wdecl’ option is enabled by default, and warns about any non-static variable or function that is defined without a prior declaration. You can disable it with the ‘-Wno-decl’ option. To fix the warning, either declare the function foo() as static, or provide a prototype for it before its definition. A similar warning was observed when Sparse was run on the Linux 3.10.9 kernel sources:

arch/x86/crypto/fpu.c:153:12: warning: symbol 'crypto_fpu_init' was not declared. Should it be static?
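
As a rough sketch of the two usual ways to silence this warning, the fragment below marks a file-local function as static and gives an exported function a prototype ahead of its definition. The names helper() and api_entry() are hypothetical and are not part of decl.c; in a real project the prototype would normally come from a header file.

```c
/* Hypothetical names for illustration; neither appears in decl.c. */

/* Fix 1: a function used only in this file gets internal linkage. */
static int
helper (int v)
{
  return v + 1;
}

/* Fix 2: an exported function is declared before it is defined.
   This prototype would usually live in a shared header. */
int api_entry (int v);

int
api_entry (int v)
{
  return helper(v) * 2;
}
```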

While the C99 standard allows declarations after a statement, the C89 standard does not permit it. The following decl-after.c example includes a declaration after an assignment statement:

int
main (void)
{
  int x;

  x = 3;

  int y;

  return 0;
}

When the C89 standard is selected with the ‘-ansi’ or ‘-std=c89’ option, Sparse emits a warning, as shown below:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include -ansi decl-after.c

decl-after.c:8:3: warning: mixing declarations and code
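
One way to keep such code acceptable to C89 is to move every declaration to the top of the block, before the first statement. The following is only a sketch; the function name sum_pair() and the values used are assumptions made for illustration.

```c
/* Hypothetical C89-compatible variant: all declarations come
   before the first statement of the block. */
static int
sum_pair (void)
{
  int x;
  int y;

  x = 3;
  y = 4;

  return x + y;
}
```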

This Sparse command line step can be automated with a Makefile:

TARGET = decl-after

SPARSE_INCLUDE = -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
                 -I/usr/include

SPARSE_OPTIONS = -ansi

all:
	sparse $(SPARSE_INCLUDE) $(SPARSE_OPTIONS) $(TARGET).c

clean:
	rm -f $(TARGET) *~ a.out

If a function whose return type is void returns a void-valued expression, Sparse can issue a warning. This check is not enabled by default, and must be requested explicitly with the ‘-Wreturn-void’ option. For example:

static void
foo (int y)
{
  int x = 1;

  x = x + y;
}

static void
fd (void)
{
  return foo(3);
}

int
main (void)
{
  fd();

  return 0;
}

Running Sparse on the above void.c file results in the following output:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include -Wreturn-void void.c

void.c:12:3: warning: returning void-valued expression
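
A possible rewrite that avoids the warning is to call the void function as a statement of its own, and then return without an expression. In this sketch, the counter ‘total’ and the function name add() are hypothetical; they are introduced only so that the calls have a visible effect.

```c
/* Hypothetical counter, added so the effect of the calls is observable. */
static int total;

static void
add (int y)
{
  total = total + y;
}

static void
fd (void)
{
  add(3);   /* call the void function as a plain statement */
  return;   /* return without an expression */
}
```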

The ’-Wcast-truncate’ option warns about truncation of bits during casting of constants. This is enabled by default. An 8-bit character is assigned more than it can hold in the following:

int
main (void)
{
  char i = 0xFFFF;
  
  return 0;
}

Sparse warns of truncation for the above code:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include trun.c 

trun.c:4:12: warning: cast truncates bits from constant value (ffff becomes ff)

A truncation warning from Sparse for the Linux 3.10.9 kernel is shown below:

arch/x86/kvm/svm.c:613:17: warning: cast truncates bits from constant value (100000000 becomes 0)
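
Two common ways of dealing with such a warning are to choose a type that is wide enough for the constant, or to make the intended truncation explicit with a mask. The sketch below illustrates both for the small trun.c example, not for the kernel code; the function names are hypothetical, and it is an assumption that an explicitly masked constant that fits the target type does not trigger the warning.

```c
/* Option 1: use a type wide enough to hold the constant. */
static unsigned short
wide_value (void)
{
  unsigned short v = 0xFFFF;

  return v;
}

/* Option 2: keep the 8-bit type, and state the truncation explicitly. */
static unsigned char
low_byte (void)
{
  unsigned char c = 0xFFFF & 0xFF;

  return c;
}
```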

Any incorrect assignment between enums is checked with the ’-Wenum-mismatch’ option. To disable this check, use ’-Wno-enum-mismatch’. Consider the following enum.c code:

enum e1 {a};
enum e2 {b};

int
main (void)
{
  enum e1 x;
  enum e2 y;

  x = y;

  return 0;
}

Testing with Sparse, you get the following output:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include enum.c

enum.c:10:7: warning: mixing different enum types
enum.c:10:7:     int enum e2  versus
enum.c:10:7:     int enum e1     

Similar Sparse warnings can also be seen for Linux 3.10.9:

drivers/leds/leds-lp3944.c:292:23: warning: mixing different enum types
drivers/leds/leds-lp3944.c:292:23:     int enum led_brightness  versus
drivers/leds/leds-lp3944.c:292:23:     int enum lp3944_status 
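
When a value really does need to move from one enum type to another, an explicit conversion function makes the mapping visible in the code. This is a minimal sketch under assumed names: the enumerators A1, B1, A2, B2 and the function e2_to_e1() are hypothetical, and are not taken from enum.c or from the kernel driver.

```c
/* Hypothetical enum types with two values each. */
enum e1 { A1, B1 };
enum e2 { A2, B2 };

/* Map each enum e2 value to an enum e1 value explicitly,
   instead of assigning one enum type to the other. */
static enum e1
e2_to_e1 (enum e2 v)
{
  switch (v) {
  case A2:
    return A1;
  case B2:
    return B1;
  }

  return A1;   /* fallback for values outside the enum */
}
```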

NULL is of pointer type, while the number 0 is of integer type. Any assignment of 0 to a pointer is flagged by the ‘-Wnon-pointer-null’ option. This warning is enabled by default. An integer pointer ‘p’ is set to zero in the following example:

int
main (void)
{
  int *p = 0;

  return 0;
}

Sparse flags the use of 0 as a NULL pointer:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include nullp.c 

nullp.c:4:12: warning: Using plain integer as NULL pointer

Given below is another example of this warning in Linux 3.10.9:

arch/x86/kvm/vmx.c:8057:48: warning: Using plain integer as NULL pointer

The corresponding source code on line number 8057 contains:

vmx->nested.apic_access_page = 0;
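
The usual fix is to write NULL wherever a null pointer is intended. The following sketch shows the idea on a small stand-alone function; first_positive() is a hypothetical helper and is unrelated to the kernel code above.

```c
#include <stddef.h>   /* NULL, size_t */

/* Hypothetical helper: returns a pointer to the first positive
   element, or NULL if there is none. */
static int *
first_positive (int *values, size_t count)
{
  int *found = NULL;   /* NULL rather than a plain 0 */
  size_t i;

  for (i = 0; i < count; i++) {
    if (values[i] > 0) {
      found = &values[i];
      break;
    }
  }

  return found;
}
```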

The GNU Compiler Collection (GCC) has an old, non-standard syntax for initialisation of fields in structures or unions:

static struct
{
  int x;
} local = { x: 0 };

int
main (void)
{
  return 0;
}

Sparse issues a warning when it encounters this syntax, and recommends the use of the C99 syntax:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include old.c 

old.c:4:13: warning: obsolete struct initializer, use C99 syntax
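
The C99 form uses a designated initialiser, written as ‘.field = value’. The sketch below extends the structure with a second field and adds an accessor; the field ‘y’, the values, and the function local_sum() are assumptions made for illustration.

```c
static struct
{
  int x;
  int y;   /* hypothetical second field */
} local = { .x = 3, .y = 4 };   /* C99 syntax replaces "x: 3" */

/* Hypothetical accessor, so that the initialised values are used. */
static int
local_sum (void)
{
  return local.x + local.y;
}
```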

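This warning is controlled by the ‘-Wold-initializer’ option, which is also enabled by default.

The ‘-Wdo-while’ option checks whether the body of a do-while loop is a compound statement, that is, whether it is enclosed in braces: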

int
main (void)
{
  int x = 0;

  do
    x = 3;
  while (0); 

  return 0;
}

On running while.c with Sparse, you get:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include -Wdo-while while.c

while.c:7:5: warning: do-while statement is not a compound statement

This option is not enabled by default. The form of the do-while construct that Sparse expects, with the loop body in braces, is as follows:

int
main (void)
{
  int x = 0;

  do {
    x = 3;
  } while (0); 

  return 0;
}

An undefined identifier used in a preprocessor conditional can be detected with the ‘-Wundef’ option. This must be specified explicitly. The macro FOO is not defined in the following undef.c code:

#if FOO
#endif

int
main (void)
{
  return 0;
}

On running Sparse on undef.c, the following warning is shown:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include -Wundef undef.c

undef.c:1:5: warning: undefined preprocessor identifier 'FOO'
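
One way to avoid the warning is to give the macro a default value before it is tested. This is a sketch only: FOO is treated here as a hypothetical feature switch, and foo_enabled() is an assumed helper that reports its state.

```c
/* FOO is a hypothetical feature switch. Give it a default value,
   so that "#if FOO" never tests an undefined identifier. */
#ifndef FOO
#define FOO 0
#endif

static int
foo_enabled (void)
{
#if FOO
  return 1;
#else
  return 0;
#endif
}
```

If the intent is only to test whether the macro exists, ‘#ifdef FOO’ or ‘#if defined(FOO)’ can be used instead.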

The use of parenthesised strings in array initialisation is detected with the ’-Wparen-string’ option:

int
main (void)
{
  char x1[] = { ("hello") };

  return 0;
}

Sparse warns of parenthesised string initialisation for the above code:

$ sparse -I/usr/include/linux -I/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include \
         -I/usr/include -Wparen-string paren.c

paren.c:4:18: warning: array initialized from parenthesized string constant
paren.c:4:18: warning: too long initializer-string for array of char
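
Dropping the parentheses, and the redundant braces, gives the conventional form of the initialisation. The function greeting_length() in this sketch is a hypothetical wrapper, added only so that the array is used.

```c
#include <string.h>   /* strlen */

/* Hypothetical wrapper around the corrected initialisation. */
static size_t
greeting_length (void)
{
  char x1[] = "hello";   /* no parentheses around the string literal */

  return strlen(x1);
}
```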

The ’-Wsparse-all’ option enables all warnings, except those specified with ’-Wno-option’. The width of a tab can be specified with the ’-ftabstop=WIDTH’ option. It is set to 8 by default. This is useful to match the right column numbers in the errors or warnings.

You can refer to the following manual page for more available options:

$ man sparse