
Keyword - static analysis


Sunday, August 10 2008

Why the "line of code" is indeed a good metric

When I first learned about source code metrics, I was amazed that people used lines of code to compare software. To me, it showed a lack of imagination.

At the beginning of the week, I started a small, quick experiment: extracting metrics from the SATE 2008 test cases. This experiment focuses on function-wise properties, so for each function I have to extract a handful of metrics:

  • McCabe's cyclomatic complexity, which measures code complexity; it is a good metric for estimating how difficult a given piece of code will be for a human to understand (very important for security-related problems)
  • Lines of code
  • Lines of comments
  • Number of local variables
  • Number of parameters (which represents the coupling between the function and the rest of the program)
  • Number of function calls
  • Number of functions that are "sources"
  • Number of functions that are "sinks"
  • Number of C standard functions (obviously, only for C test cases)
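
To make two of these metrics concrete, here is a minimal Python sketch that counts lines of code and approximates McCabe's cyclomatic complexity for a single C function body. It is only an illustration, not the extractor used for this experiment: the name crude_metrics, the list of decision tokens and the sample body are all assumptions, it works on raw text rather than on a parsed AST, and it only recognizes // comments.

```python
import re

# Tokens that open a new decision point in C-like code. This list is an
# assumption of the sketch; a real tool would walk a parsed AST instead.
DECISION_RE = re.compile(r"\b(?:if|for|while|case)\b|&&|\|\||\?")


def crude_metrics(body: str) -> dict:
    """Return rough per-function counts: code lines, comment lines, complexity."""
    lines = [line.strip() for line in body.splitlines()]
    # Only whole-line // comments are recognized (a simplification).
    comment_lines = [l for l in lines if l.startswith("//")]
    code_lines = [l for l in lines if l and not l.startswith("//")]
    code_text = "\n".join(code_lines)
    return {
        "loc": len(code_lines),
        "comments": len(comment_lines),
        # Cyclomatic complexity approximated as 1 + number of decision points.
        "mccabe": 1 + len(DECISION_RE.findall(code_text)),
    }


# Hypothetical function body, used only to exercise the sketch.
SAMPLE = """
// copy src into dst, bounded by n
int i;
for (i = 0; i < n && src[i]; i++) {
    dst[i] = src[i];
}
if (i == n) {
    return -1;
}
return i;
"""

print(crude_metrics(SAMPLE))
```

On this sample, the for, the && and the if each count as one decision point, which is why even such a crude counter tends to grow with the length of the function.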

At first, lines of code was implemented because it's easy to compute, and it also gives an important value if we want to normalize the other metrics. We also decided to introduce the number of "sources" and "sinks" in order to study input validation weaknesses later on...

Anyway, after running some statistics on the output, I was amazed to observe that the Pearson correlation coefficient between McCabe and lines of code was never less than 0.90, a very strong linear relationship. (Strictly speaking this is not "90% correlated": r = 0.90 corresponds to r² = 0.81, that is, about 81% of the variance in one metric is accounted for by the other.) I have to say that there are huge limitations in the parsers we use to extract the information; for instance, the C code is not preprocessed. This result is only valid for the C test cases; the average correlation observed in the Java test cases is around 0.60...
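
For readers who want to reproduce this kind of check, here is a minimal Python sketch of the Pearson correlation coefficient between two per-function metric lists. The function name pearson and the numbers in the example are made up for illustration; they are not the SATE 2008 measurements.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up per-function values (NOT the experiment's data).
loc = [10, 20, 30, 40]
mccabe = [2, 4, 6, 8]

r = pearson(loc, mccabe)
print(r, r ** 2)  # r, and the share of variance explained (r squared)
```

In this toy example the complexity is exactly proportional to the length, so r is 1; real data would of course be noisier.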

Of course, further statistical analysis will be necessary to conclude anything on this subject. If we were unlucky with the selection of test cases, that could have skewed the result, but I don't think we were. Actually, it seems quite logical that these metrics are related: the longer the code is, the more complex it can be in terms of tests, loops, etc.; a longer piece of code simply has more chances of containing more cycles :)

Oh well, I'll keep writing about this, especially since I expect to get results pretty soon...

Tuesday, June 10 2008

My talk at SAW: Automated Evaluation of source code analyzer output

It has been some time since I last posted on my blog... well, I've been busy, especially with the end of SATE, and, oh well, I had a vacation :)

Anyway, at the next Static Analysis Workshop this Thursday, we're gonna talk about the SATE experiment and the observations/results we got from it. I am then gonna talk about a tool I wrote to check whether a reported weakness is a false positive: this is the Automated Evaluation.

The main idea of the Automated Evaluation is to gather some information from the source code and, under some assumptions, try to reach a conclusion on the correctness of the piece of code. For the reasoning behind this tool, my approach had to be radically different from a classical SCA; otherwise it would have been like creating a new SCA, which would obviously have been useless. The scope of this automated evaluation is limited to buffer overflows, and it can only work for proving false positives!

So basically, I read the source code from the reported sink back to the possible sources and grab the actions that may affect the variables that play a role in the code.

These actions include:

  • Allocation of a destination buffer
  • Computing the size of the source buffer(s)
  • Test for NULL
  • Test that involves the size of the buffers...
  • ... and some others

Then, each time one of these actions is detected, the tool increments a global false-positiveness score for the reported weakness. We then only have to set a threshold to decide what level of correctness we want; this is really tied to the source code and how the program is developed.
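
The scoring idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual Automated Evaluation tool: the regular expressions, the weights, the threshold of 3 and the function names are all assumptions, and the sketch scans a whole snippet instead of walking from the sink back to the sources.

```python
import re

# Hypothetical action patterns and weights (assumptions of this sketch,
# not the patterns used by the real tool).
ACTIONS = [
    ("allocates a destination buffer",
     re.compile(r"\b(?:malloc|calloc|realloc)\s*\("), 1),
    ("computes the size of a source buffer",
     re.compile(r"\b(?:strlen|sizeof)\b"), 1),
    ("tests for NULL",
     re.compile(r"[!=]=\s*NULL\b|\bNULL\s*[!=]="), 1),
    ("tests a buffer size",
     re.compile(r"\bif\s*\(.*\b(?:strlen|sizeof)\b.*[<>]"), 2),
]


def score_weakness(code):
    """Return (score, names of detected actions) for the code around a sink."""
    found = [(name, weight) for name, pattern, weight in ACTIONS
             if pattern.search(code)]
    return sum(weight for _, weight in found), [name for name, _ in found]


def is_probable_false_positive(code, threshold=3):
    """A report is flagged as a likely false positive once the score reaches the threshold."""
    score, _ = score_weakness(code)
    return score >= threshold


# Hypothetical code around a reported strcpy sink.
GUARDED = """
char *dst = malloc(strlen(src) + 1);
if (dst == NULL) return;
strcpy(dst, src);
"""
UNGUARDED = "strcpy(dst, src);"

print(score_weakness(GUARDED), is_probable_false_positive(GUARDED))
print(score_weakness(UNGUARDED), is_probable_false_positive(UNGUARDED))
```

Raising the threshold makes the sketch more conservative about dismissing a report, which is the knob described above; how high it should be depends on how much the code base can be trusted.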

Even though this evaluation method is not perfect, it was well adapted to the C test cases we had in SATE 2008, since the overall code quality was good. We can even say that the software was well written; it was then okay to make some assumptions about the code, such as:

  • If the size of the destination buffer is computed with the size of the source buffer, the size is good (basically: no off-by-one)

Also, the tool itself needs some information about the source code, since it uses regular expressions to match the "actions"...



That's it for the quick explanation; here are the slides: SAW: Automated Evaluation of SCA output

Friday, May 16 2008

Yet another study on code quality: A Tale of Four Kernels

If, like me, you are interested in code quality and in the general conclusions one can draw from code quality studies, I really recommend reading this paper: A Tale of Four Kernels by Diomidis Spinellis, ICSE '08: Proceedings of the 30th International Conference on Software Engineering

I just want to quote a part of the author's conclusion:

Therefore, the most we can read from the overall balance of marks is that open source development approaches do not produce software of markedly higher quality than proprietary software development.

The only problem with this statement is that the metrics he used were not weighted by their importance for "code quality" (if that means something). Therefore, the comparison between the Windows Research Kernel and Linux seems a little bit awkward to me. Anyway, this is a very interesting paper about code quality, with lots of interesting ideas from the author of CScout.

Wednesday, May 14 2008

Static Analysis Tool Exposition is over

Yeah, that's sad and also a relief: SATE is over. Today we released the last stage of the evaluation (basically, the evaluation with some corrections based on comments from the participants). Even though I would have preferred more feedback from participants on our evaluation, especially to increase its quality, I still think SATE is a good thing and will be an interesting resource for lots of researchers. This is, as far as I know, the only exhaustive resource on the subject (wild source code + weaknesses).

What do I want to do and see next? Since we have accumulated lots of data with the tool reports (raw weaknesses) and the evaluations (I really want to thank the MITRE guys, especially Steve Christey and Bob Schmeichel, for their help), I'm looking forward to doing some data analysis and trying to extract some limited results from it.

Anyway, this was overall a good experience. I did my first real code review, mostly on lighttpd, dspace, mvnform and naim, and I think I know way more about how to detect vulnerabilities. I have also been asking myself how to rate vulnerabilities such as Cross-Site Scripting (hopefully, I will release the little document I wrote about it). And I learned so much about how people write code while trying to understand the design, the code, etc. of the applications.

Also, hopefully, I will be able to release the website I developed to handle the weaknesses from different tools. It is, I think, interesting if you are working with more than one assessor: you can submit evaluations and comments, merge weaknesses, etc. through a web interface. Even though it needs improvements (it was done in less than 2 weeks), I think it would be an interesting piece of software for people who are dealing with tons of weaknesses. Another interesting point is that we (at NIST) may open that website to everybody so that new evaluations can be made, in order to increase the quality of the data we currently have.

Oh well, it seems like the journey is really close to its end. It was sometimes such a good time, and at other times such consuming work. We've been dealing with some fifty thousand weaknesses, dozens of tool reports, and almost ten test cases... I will keep you posted about the next decisions we are gonna make with SATE, and I hope that lots of people will get the most they can out of this "exposition".

Friday, February 29 2008

NIST SATE step 3 completed: test cases information release

This evening at work, Vadim and I were exhausted after days of work, but we were smiling. Smiling and happy because we knew that step 3 of SATE was pretty much done. Step 3 is when all the participants send their output to us. Even if we know that we will have a hard time coming up with the master reference list for each test case that we selected for SATE 2008, we know that this is interesting data for the SwA community and especially for SCA studies.

Today, we can finally tell which test cases we selected for SATE 2008. First of all, we have 2 different tracks: C and Java. For the Java track, we decided to look more into web applications. We then have:

And for the C track we selected:

  • Nagios: host, service and network monitoring with web interface (using CGI)
  • Lighttpd: web server
  • Naim: console instant messenger

You may have lots of comments on why these, and I am totally ready to answer your questions. Just to let you know, during the selection phase we reviewed 50+ different applications. For each application, we had to scan it with tools and do some manual review, our main goal being to find at least one exploitable vulnerability. Concerning the type of test cases themselves, the constraint was to have real exploitable vulnerabilities in real applications, which basically means not the kind of test cases that we have in our SRD.

Just as a reminder, the next important dates for SATE 2008 are:

  • April 15: we distribute to the participants our master reference list, the list of real weaknesses found by the participants
  • June: comparison of all the participants' results; the participants get all the reports submitted at SATE 2008
  • December: all the data and reports are public

Tuesday, February 19 2008

Code review tools: the missing link (so far)

First of all, I do not consider myself a pen-tester, so maybe you will find these ideas irrelevant, stupid or useless... I have done some pen-testing though, whether for friends, for fun (yeah, it's a good way to learn) or for profit (well, it was kinda part of my job for SATE 2008), so I'm not that much of a n00b, but I am not an expert in pen-testing or code review. When I do some at work, I have the chance to use commercial tools (I say it's a chance because there is a real benefit in using such tools). In fact, tools are good, way better than me: they can find thousands of vulnerabilities in minutes... I cannot; I need way more time. But here is a little feedback for vendors from me, as a user of the tools.

The tools are amazing at finding some defects, telling you that something doesn't look good to them and giving you a stack of 42 function calls. Eh! It's part of the job to examine this bunch of functions in order to see why the tool reported this as a vulnerability. Examining the functions means looking at how the data is transformed/transported from one point to another. And I cannot tell you how painful it is to do that for the dozens of reported vulnerabilities where the correctness of the tool is not obvious (at least for me).

While talking about that with Vadim today, I thought of a tool that would be awesome for a code reviewer, to facilitate the "correctness tests". The idea is really simple and maybe the tool already exists (if so, please give me a link!): what if you had a kind of debugger where you could select the point where you want to start the dynamic evaluation of a piece of code (the Entry Point) and the point where you want to stop and see the result (the Break Point)? What is the difference with a typical debugger? The possibility of doing this in relation with the source code. In the interface of the source code analyzer, I would select the entry point where I want to start my dynamic analysis, and the break point. I would launch the dynamic evaluation, which would bring the program to the state of the entry point (maybe by asking how to get there... there are often multiple paths to one branch of the code). Then I would make the modifications I want (trying to bypass some filters with some weird strings, for example), and the dynamic engine would run the piece of code up to the Break Point, where I would look at the result.

What I just described, a step-by-step modification of the values, is a really narrow view of such a combination of static and dynamic analysis. We could also have information on the privilege state of the current user for a web application, be able to replay requests easily à la web app scanners, etc.

I know that building such a tool is doable. Hard, but definitely doable. So far, the toughest point I see is being able to arrive at a given state of the program. You would need to do binary coverage, look at the branches to take, record these and map the records to the source code. Once you're done with that, you're ready to modify the parameters and look at the results. Yes, the main difference with a debugger is arriving at a given state referenced by a function call. But wouldn't this help you to figure out the correctness of a given piece of code?

Thursday, February 14 2008

SATE ready to go + weaknesses walker + Shmoo + 100

SATE 2008 starts tomorrow: the registered participants will be able to get the test cases associated with the tracks they want to participate in. They will have until the 29th of February to send in their tool reports. We are all pretty excited here before the start. It was a real rush to find test cases that we think are good for such an event...

Anyway, this is just a quick note to release a Python script which is definitely SATE-oriented. The idea is simply to convert the output of some free tools into the SATE XML format. The script handles Flawfinder, ITS4 and RATS. It can also look up a product and version in the NVD in order to retrieve the known vulnerabilities.

You can download the weaknesses walker script as a zip file or just the Python script (you will need wwwCall for the NVD scraping part; wwwCall is also included in the zip).

Example of how to use ww with Flawfinder:

./ww.py --tool flawfinder --file myproject.out.xml --format sate /home/romain/myproject

or for the NVD scraper:

./ww.py --vdb winamp 5.2 --file winamp_5.2.nvd.xml

For the next version of ww, I may add the possibility to play with the SATE XML format itself, such as merging the results of different tools with a comparison of the reports, or even just combining the reports of multiple tools...

Also, if you are coming to downtown DC this weekend for ShmooCon or even BlackHat DC 2008 and you wanna have a beer, just drop me a mail. I wasn't able to find a ticket for Shmoo so I will not go, but I will meet with dre and marcin from ts/sci security... so if you are around, just tell me; I would be happy to meet more sec. people.

The last thing is that this is my 100th post!

Tuesday, January 22 2008

PHP Source Code Analyzer

Months ago, I was talking about, and doing some small tests with, the PHP source code security analyzers that I was able to find on the web.

I was able to quickly test the new Fortify SCA 5.0, which now handles PHP applications. I can tell you that I am really excited about this tool. First of all, it beats by far all the tools I've tested previously (for PHP), which is fair since it's a commercial tool.

But what I'm really excited about now is that I will be able to run more tests on my test suites, compare with my security metrics & basic security analyzer, look at the behavior of SCA tools when the source code is obfuscated, and so on. You're on the right track, Fortify; now open an API and I will be able to make a hybrid tool...

Since I also plan to test real PHP applications with both testing approaches (static/dynamic), I'd like to see the differences in application coverage, vulnerability finding and false-positive rates (yeah, the last one is obvious, but still interesting).

I'm also glad to see that vendors are taking PHP seriously as a language, and not only as something for script kiddies.

Wednesday, December 5 2007

Static Analysis Framework: PHP-Ast/Oracle

In my previous blog post, I talked briefly about PHP-Ast/Oracle, a static analysis framework for PHP source code. I am developing it in order to play with source code and security. The goal of the framework is to be able to perform different types of operations on PHP source code. I am releasing this tool as is because I think people may be interested in it... Anyway, I learned a lot doing this.

PHP-Ast/Oracle is developed in C++ and the tool has been developed mainly for:

How it works

The source code repository is divided in 2 parts:

  • php-ast is the converter from PHP to XML
  • php-oracle is the actual engine

php-oracle takes an XML file as input, which is the output of php-ast. In the SVN there are some Python scripts I used to combine the 2 tools (they may be outdated, i.e. they may not work with the current php-oracle).

How I think you could use php-oracle

I do not intend to make a clean build with an executable, etc.; I just provide the source code. I decided to give only the source code because I don't want to spend too much time on creating clean software; it's only research-oriented stuff. Furthermore, there is not much documentation in the source code (the advantage of being alone to develop such a tool), so only really interested people will download this! I can then help them if they have questions about how it works, etc.

Getting the source code

You can download the source here: php-ast-oracle.zip

And the trac repository has more documentation about what the framework actually does: http://trac2.assembla.com/php-ast

Development

The tool is in perpetual development. I don't want to turn it into real software, but I think people can use it to perform security analysis, compute stuff, make code transformations and so on.

Thursday, July 12 2007

Secure Programming with Static Analysis

I've just received this book and looked it over quickly; it seems to be a must-have!
I really suggest you buy this book if you are a developer!