Saturday, April 28, 2012

Clang static analyzer.

From the website:

Clang Static Analyzer

The Clang Static Analyzer is a source code analysis tool that finds bugs in C and Objective-C programs.

Currently it can be run either as a standalone tool or within Xcode. The standalone tool is invoked from the command-line, and is intended to be run in tandem with a build of a codebase.

The analyzer is 100% open source and is part of the Clang project. Like the rest of Clang, the analyzer is implemented as a C++ library that can be used by other tools and applications.

What is Static Analysis?

The term "static analysis" is conflated, but here we use it to mean a collection of algorithms and techniques used to analyze source code in order to automatically find bugs. The idea is similar in spirit to compiler warnings (which can be useful for finding coding errors) but takes that idea a step further and finds bugs that are traditionally found using run-time debugging techniques such as testing.

Static analysis bug-finding tools have evolved over the last several decades from basic syntactic checkers to those that find deep bugs by reasoning about the semantics of code. The goal of the Clang Static Analyzer is to provide an industrial-quality static analysis framework for analyzing C and Objective-C programs that is freely available, extensible, and has a high quality of implementation.

In other words, this is an open source program designed to analyze code without running it. Clang reads the source code of the program you are working on and reports the problems that its rule set finds in that code.

Several open source projects are interested in using Clang to make their code base better.  In my opinion, Clang is only part of an overall solution that includes unit testing of interfaces combined with high level integration testing of the overall released project.  But it is better than no testing at all.

The primary target for Clang is OS X, where you download it and follow the instructions here: For other platforms, such as my Ubuntu system, the web site led me to this page:

I followed those directions and had an executable ready to test against source code after just a few hours.

I did have to do a

sudo make install

in both the llvm and  build directories.  It installed everything in


Once everything is working you build your project with

scan-build make

This does the build by replacing the normal compiler with a special compiler that both compiles and analyzes the code.  At the end of the run

My plan is to get Clang fully built and begin testing against a few small projects to see what it says, then scale up to complete open source projects.

The page recommends that the build be done in debug mode with assertions enabled to help control the way the program is analyzed.

I had to use the command:

/usr/local/bin/scan-build ./configure
/usr/local/bin/scan-build -v -V make

in order to make everything work correctly with the way I had installed the software.

It took hours to build nmap using this method.

Almost immediately I got this warning:

warning: Value stored to 'num_host_exp_groups' is never read
  num_host_exp_groups = 0;

The normal compiler checks would not find this, because the variable was used before this point.

2041   /* Free host expressions */
2042   for(i=0; i < num_host_exp_groups; i++)
2043     free(host_exp_group[i]);
2044   num_host_exp_groups = 0;
2045   free(host_exp_group);

Line 2044 has no effect because, according to the analyzer, nothing reads num_host_exp_groups after that assignment.
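To make the idea of a dead store concrete, here is a minimal sketch in C. It is not nmap's code: the struct group_list type and the functions make_groups, free_groups_local and free_groups are names I made up for illustration. The first cleanup function resets a local copy of the count, which is the pattern reported above; the second resets a field that the caller can still see, so that store is not dead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for illustration; not nmap's actual data structure. */
struct group_list {
    char **items;
    int count;
};

/* Build a list of n small heap strings so there is something to free. */
struct group_list make_groups(int n)
{
    struct group_list list = { NULL, 0 };
    if (n <= 0)
        return list;
    list.items = malloc((size_t)n * sizeof *list.items);
    if (list.items == NULL)
        return list;
    for (int i = 0; i < n; i++) {
        list.items[i] = calloc(1, 1);   /* may be NULL; free(NULL) is harmless */
        list.count++;
    }
    return list;
}

/* Dead store: count is a local copy, so resetting it just before the
 * function returns changes nothing the rest of the program can observe. */
void free_groups_local(char **items, int count)
{
    for (int i = 0; i < count; i++)
        free(items[i]);
    count = 0;      /* the kind of store reported as "never read" */
    free(items);
}

/* Not a dead store: the caller sees the reset through list, and a second
 * call finds count == 0 and items == NULL, so nothing is freed twice. */
void free_groups(struct group_list *list)
{
    assert(list != NULL);
    for (int i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = 0;
}
```

Calling free_groups twice on the same list is safe in this sketch, while free_groups_local leaves the caller holding a stale pointer and count no matter what it assigns to its own copy.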

This is the output from the error report: 

Then it took about 10-20 minutes to check each of the rest of the files.

The second thing it found was:

Which was in  

It is a little scary that the buflen appears to be checked only after writing to the buffer.

warning: Value stored to 'state' is never read
      state = 0;
      ^       ~
warning: Value stored to 'foundgood' is never read
          foundgood = true;
          ^           ~~~~
warning: Value stored to 'foundgood' is never read
          foundgood = true;
          ^           ~~~~
warning: Value stored to 'seq_stddev' is never read
      seq_stddev = 0;

The report page will even sort on a few of its fields.

I cleared out all the .o files instead of just nmap's and found these warnings:

linear.cpp:1092:9: warning: ‘loss_old’ may be used uninitialized in this function
linear.cpp:1090:9: warning: ‘Gnorm1_init’ may be used uninitialized in this function
linear.cpp:1376:9: warning: ‘Gnorm1_init’ may be used uninitialized in this function
linear.cpp:1805:15: warning: Call to 'malloc' has an allocation size of 0 bytes
        int *start = Malloc(int,nr_class);
linear.cpp:21:32: note: expanded from macro 'Malloc'
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))
                               ^      ~~~~~~~~~~~~~~~~
linear.cpp:2000:30: warning: Assigned value is garbage or undefined
                                        model_->w[j*nr_class+i] = w[j];
                                                                ^ ~~~~

This one actually seems more serious, if true.

Evidently the analyzer found a path on which nr_class is 0 there, which causes malloc to be asked for zero bytes for "start".
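As a rough illustration of why this matters: in C, malloc(0) may return either NULL or a pointer that must not be dereferenced, so any later write through "start" would be a bug. One way to rule the case out is to reject a non-positive count before allocating. The sketch below reuses the Malloc macro quoted in the warning; the helper alloc_class_starts is a hypothetical name of mine, not something in linear.cpp.

```c
#include <stdlib.h>

/* Same macro as the one quoted from linear.cpp in the warning above. */
#define Malloc(type,n) (type *)malloc((n)*sizeof(type))

/* Hypothetical helper: refuse non-positive counts so that malloc is
 * never asked for 0 bytes. Returns NULL when there is nothing to allocate
 * or when the allocation fails; the caller must check before using it. */
int *alloc_class_starts(int nr_class)
{
    if (nr_class <= 0)
        return NULL;
    return Malloc(int, nr_class);
}
```

Whether NULL is the right answer for "no classes" depends on the caller; the point is only that the zero case gets decided explicitly instead of being left to malloc.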

I set the -k option to keep going, and am letting everything run for as long as it needs to run.

About 6 hours later I checked and it had finished up.  The reports are very complete:


This is an example of one of the errors, but as you can see the NULL does appear to be checked beforehand. The documentation talks about using asserts to remove some of these errors in a debug build.
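Here is a small sketch of what the assert approach might look like. The functions find_name and name_length are hypothetical examples of mine and are unrelated to nmap. The idea, as I understand it, is that the analyzer treats a failed assert as the end of that path, so after the assert it no longer considers the case where the pointer is NULL. This only works while assertions are compiled in, which fits the recommendation to analyze a debug build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup that returns NULL when the id is out of range. */
const char *find_name(int id)
{
    static const char *names[] = { "zero", "one", "two" };
    if (id < 0 || id >= 3)
        return NULL;
    return names[id];
}

/* The caller promises a valid id. The assert documents that promise and,
 * in a build without NDEBUG, tells the analyzer that name is not NULL
 * from this point on, so strlen(name) is not reported as a NULL use. */
size_t name_length(int id)
{
    const char *name = find_name(id);
    assert(name != NULL);
    return strlen(name);
}
```

If the build defines NDEBUG the assert disappears, so the same code may be reported again in a release configuration.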

Overall this looks like an interesting tool to use alongside other tools.

The reports take up a lot of space, because each bug report shows the path through the branches that leads to the failure, and so the tool makes a copy of the source file for each bug report.

The compressed size of the reports was 3.7 MB, and the uncompressed size was 47 MB.
