Keyword - SATE

Tuesday, June 30 2009

NIST Static Analysis Tool Exposition special publication released

The NIST SAMATE project conducted the first Static Analysis Tool Exposition (SATE) in 2008 to advance research in static analysis tools that find security defects in source code. The main goals of SATE were to enable empirical research based on large test sets and to encourage improvement and speed adoption of tools. The exposition was planned to be an annual event.

SATE 2008 was one of my last projects at NIST. I really enjoyed working on it from the beginning. It was challenging, especially because we had to create so many artifacts to make the tools report weaknesses the same way, integrate them all together, and give assessors ways to make meaningful reviews.

In a nutshell, we selected 6 different open-source programs (3 in C, 3 in Java) and asked tool vendors to run their tools on these test cases. Vendors were allowed to customize their tool if it provided such a capability. Fortify was the only vendor who created a custom rule (to help the tool with a validation routine for MVNForum). Our goal was then to combine all the results and analyze them to provide information on the correctness of the tools.

If you are interested, you can download the SATE data and the NIST SATE Special Publication.

Thanks to the whole SAMATE team for this effort, and especially Vadim Okun and Paul E. Black.

For more information, you can reach the SATE page at NIST.

Tuesday, September 23 2008

Last week at NIST

All good things come to an end... and it is time for me to leave NIST. I will be a security consultant at Cigital, Inc.

I've been working at NIST for two and a half years as a Guest Researcher in the SAMATE Project. I originally came to NIST to do mostly statistical analysis or so, but that changed a lot! I started by building the SAMATE Reference Dataset website, and this is how I started to learn about "security", by working with flawed source code. This was very obscure to me (I guess like for every computer scientist specialized in applied mathematics) and I learned a lot about weaknesses, vulnerabilities, "how to find them?", scanners etc.

My first real security-related work was on the Web Application Security Scanner Specification, and then on designing a way to test web application scanners:

  • test suite with seeded vulnerabilities
  • checking the types of attacks
  • trying to explain the tools' false negatives by monitoring what the scanner did and where it went in the application at a logical level, such as "did the tool log in successfully? did it generate a couple of errors? did it try many times?"

The goal of this three-component analysis is to really understand what the tool is doing: if it didn't find a particular vulnerability, why?

One of the best moments I had at NIST was when we did the Static Analysis Tool Exposition. I was one of the organizers and, from the beginning, it was a real challenge: choosing good test cases, criteria to evaluate the reports, etc. Of course, SATE 2008 was not perfect and we made many mistakes, but at least we tried, we had some results and we learned a lot. I have high hopes for the next SATE, even though it is really challenging in many respects:

  1. Not making people think/act like this is a competition (we sometimes see people claiming they won SATE 2008, but... well, there would be many things to say to them)
  2. Having strong evaluation criteria (I guess this is challenging every time human assessment is part of the game)
  3. Solving the problem of how to present data to the evaluators. We couldn't have the GUI of the tools etc., so our analysis (as evaluators) was really limited and we sometimes had to guess what exact weakness was being reported
  4. and finally, having more resources and help for evaluating the weaknesses reported by the tools (47k this year, one month to evaluate...)

Oh well, I will of course continue to follow what the SAMATE team is doing, even though I will be away and busy with other interesting stuff, and I'm really looking forward to seeing the results of the study we are currently running on function-wise weakness characterization.

But for now, it's time for me to take some vacation: going back to France for almost one month, getting my work visa etc.

Tuesday, June 10 2008

My talk at SAW: Automated Evaluation of source code analyzer output

It has been some time since I last posted on my blog... well, I've been busy, especially with the end of SATE, and oh well! I had a vacation :)

Anyway, at the next Static Analysis Workshop this Thursday, we're gonna talk about the SATE experiment and the observations/results we could get from it. I am then gonna talk about a tool I wrote to check whether a reported weakness is a false positive: this is the Automated Evaluation.

The main idea of the Automated Evaluation is to gather some information from the source code and, under some assumptions, try to draw a conclusion on the correctness of the piece of code. My approach had to be radically different from a classical source code analyzer (SCA); otherwise it would have been like creating a new SCA, which would obviously have been useless. This automated evaluation is limited to buffer overflows, and it can only be used to show that a report is a false positive!

So basically, I read the source code from the reported sink back to the possible sources and grab the actions that may affect the variables that play a role in the code.

These actions are like:

  • Allocation of a destination buffer
  • Computing the size of the source buffer(s)
  • Test for NULL
  • Test that involves the size of the buffers...
  • ... and some others

Then, each time one of these actions is detected, the tool increments a global false-positiveness score for the reported weakness. We then only have to set a threshold to choose the level of correctness we want; the right value is really tied to the source code and how the program is developed.

Even though this evaluation method is not perfect, it was well suited to the C test cases we had in SATE 2008, since the overall code quality was good. We can even say that the software was well written; it was then okay to make some assumptions about the code, such as:

  • If the size of the destination buffer is computed with the size of the source buffer, the size is good (basically: no off-by-one)

Also, the tool itself needs some information about the source code, since it uses regular expressions to match the "actions"...
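
To make the scoring idea concrete, here is a minimal Python sketch. It is not the tool itself: the regular expressions, the weights and the threshold are all made up for illustration, and it simply scans the lines that precede the sink instead of really walking back from the sink to each source.

```python
import re

# Hypothetical "actions": name -> (pattern, weight).
# None of these patterns or weights come from the real tool.
ACTIONS = {
    "allocates_destination": (re.compile(r"\b(?:malloc|calloc|realloc)\s*\("), 1),
    "computes_source_size": (re.compile(r"\b(?:strlen|sizeof)\b"), 1),
    "tests_for_null": (re.compile(r"[!=]=\s*NULL\b"), 1),
    "tests_buffer_size": (
        re.compile(r"\bif\s*\(.*\b(?:strlen|sizeof|len|size)\b.*[<>]"),
        2,
    ),
}


def false_positive_score(lines_before_sink):
    """Add up the weight of every action seen in the code before the sink."""
    score = 0
    found = []
    for name, (pattern, weight) in ACTIONS.items():
        if any(pattern.search(line) for line in lines_before_sink):
            score += weight
            found.append(name)
    return score, found


def looks_like_false_positive(lines_before_sink, threshold=3):
    """Compare the score to a threshold that has to be tuned per code base."""
    score, _ = false_positive_score(lines_before_sink)
    return score >= threshold


# Made-up C snippets: one careful copy, one careless copy.
guarded = [
    "if (src == NULL) return -1;",
    "len = strlen(src);",
    "dst = malloc(len + 1);",
    "strcpy(dst, src);",
]
unguarded = [
    "char dst[16];",
    "strcpy(dst, src);",
]

print(false_positive_score(guarded))
print(looks_like_false_positive(unguarded))
```

A higher threshold means fewer reports get dismissed as false positives, which is the safer choice when the code base is less carefully written.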



That's it for the quick explanation, and here are the slides: SAW: Automated Evaluation of SCA output

Wednesday, May 14 2008

Static Analysis Tool Exposition is over

Yeah, that's sad and also a relief: SATE is over. Today we actually released the last stage of the evaluation (basically, the evaluation with some corrections based on comments from the participants). Even though I would have preferred more feedback from participants on our evaluation, especially to increase its quality, I still think SATE is a good thing and will be an interesting resource for lots of researchers. This is, as far as I know, the only exhaustive resource on the subject (wild source code + weaknesses).

What do I want to do and see next? Since we have accumulated lots of data with the tool reports (raw weaknesses) and the evaluations (I really want to thank the folks at MITRE, especially Steve Christey and Bob Schmeichel, for their help), I'm looking forward to doing some data analysis and trying to extract some limited results from it.

Anyway, this was overall a good experience. I did my first real code review, mostly on lighttpd, dspace, mvnforum and naim, and I think I know way more about how to detect vulnerabilities. I have also been asking myself how to rate vulnerabilities such as Cross-Site Scripting (hopefully, I will release the little document I wrote about it). And I learned so much about how people write code by trying to understand the design, the code etc. of these applications.

Also, hopefully, I will be able to release the website I developed to handle the weaknesses from different tools. It is, I think, interesting if you are working with more than one assessor. You can submit evaluations and comments, merge weaknesses etc. through a web interface. Even though it needs improvements (it was built in less than 2 weeks), I think it would be an interesting piece of software for people who are dealing with tons of weaknesses. Another interesting point is that we (at NIST) may open that website to everybody so that new evaluations can increase the quality of the data we currently have.

Oh well, it seems like a journey is really close to its end. Sometimes it was such a good time, and at other times such consuming work. We've been dealing with fifty thousand weaknesses, dozens of tool reports, and almost ten test cases... I will keep you posted about the next decisions we are gonna make about SATE, and I hope that lots of people will get the most out of this "exposition".

Friday, February 29 2008

NIST SATE step 3 completed: test cases information release

This evening at work, Vadim and I were exhausted after days of work, but we were smiling. Smiling and happy because we knew that step 3 of SATE was pretty much done. Step 3 is when all the participants send their output to us. Even if we know that we will have a hard time coming up with the master reference list for each test case we selected for SATE 2008, we know that this is interesting data for the SwA community and especially for SCA studies.

Today, we can finally tell which test cases we selected for SATE 2008. First of all, we have 2 different tracks: C language and Java language. For the Java track, we decided to look more into web applications. We then have:

And for the C track we selected:

  • Nagios: host, service and network monitoring with web interface (using CGI)
  • Lighttpd: web server
  • Naim: console instant messenger

You may have lots of comments on why these, and I am totally ready to answer your questions. Just to let you know, during the selection phase we reviewed 50+ different applications. For each application, we had to scan it using tools and do some manual review, and our main goal was to find at least one exploitable vulnerability. Concerning the type of test cases themselves, the constraint was to have real exploitable vulnerabilities in real applications, which basically means not the kind of test cases that we have in our SRD.

Just as a reminder, the next important dates for SATE 2008 are:

  • April 15, we are distributing to the participants our master reference list, the list of real weaknesses found by the participants
  • June, comparison of all the participants' results, the participants get all the reports submitted at SATE 2008
  • December, all the data and reports are public

Thursday, February 14 2008

SATE ready to go + weaknesses walker + Shmoo + 100

SATE 2008 starts tomorrow: the registered participants will be able to get the test cases associated with the tracks they want to participate in. They will have until the 29th of February to send in the reports from their tools. We are all pretty excited here before the start. It was a real rush to find test cases that we think are good for such an event...

Anyway, this is just a quick note to release a Python script which is definitely SATE oriented. The idea is simply to convert the output of some free tools into the SATE XML format. The script handles Flawfinder, ITS4 and RATS. It can also look up a product and version in the NVD in order to retrieve the known vulnerabilities.

You can download the weaknesses walker script as a zip file or just the Python script (you will need wwwCall for the NVD scraping part; wwwCall is also included in the zip).

Example of how to use ww with flawfinder:

./ww.py --tool flawfinder --file myproject.out.xml --format sate /home/romain/myproject

or for the NVD scraper:

./ww.py --vdb winamp 5.2 --file winamp_5.2.nvd.xml
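
For readers curious about what such a conversion involves, here is a rough sketch of the general idea. It is not ww's code. It assumes flawfinder hits printed one per line in the form "file:line:  [level] (category) name:  message", and the XML element names are hypothetical placeholders, not the actual SATE schema.

```python
import re
import xml.etree.ElementTree as ET

# Assumed shape of one flawfinder hit printed on a single line:
#   path/to/file.c:42:  [4] (buffer) strcpy:  message text
HIT = re.compile(
    r"^(?P<path>[^:]+):(?P<line>\d+):\s+\[(?P<level>\d)\]\s+"
    r"\((?P<category>[^)]+)\)\s+(?P<name>\w+):\s*(?P<message>.*)$"
)


def hits_to_xml(report_lines, tool_name="flawfinder"):
    """Build an XML tree with one <weakness> element per recognized hit.

    The element and attribute names are placeholders for illustration,
    not the real SATE XML format.
    """
    root = ET.Element("report", tool=tool_name)
    for text in report_lines:
        match = HIT.match(text.strip())
        if match is None:
            continue  # skip banners, summaries and blank lines
        next_id = len(root) + 1
        weakness = ET.SubElement(root, "weakness", id=str(next_id))
        ET.SubElement(weakness, "name").text = (
            f"{match['category']}: {match['name']}"
        )
        ET.SubElement(
            weakness, "location", path=match["path"], line=match["line"]
        )
        ET.SubElement(weakness, "severity").text = match["level"]
        ET.SubElement(weakness, "output").text = match["message"]
    return root


# Made-up report lines, only to exercise the parser.
sample = [
    "Examining src/net.c",
    "src/net.c:42:  [4] (buffer) strcpy:  Does not check for buffer overflows.",
    "src/net.c:77:  [2] (buffer) memcpy:  Does not check for buffer overflows.",
]

print(ET.tostring(hits_to_xml(sample), encoding="unicode"))
```

Keeping the parsing in one regular expression per tool means that adding ITS4 or RATS would mostly be a matter of adding another pattern that fills in the same fields.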

For the next version of ww, I may add the possibility to play with the SATE XML format itself, such as merging the results of different tools and comparing their reports, or even just combining the reports of multiple tools...
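
As an illustration of what merging could look like, here is a small sketch that groups findings from several tools by file and line. Everything in it is hypothetical (function names, data layout, sample findings), and matching on the exact line is a simplification: two tools often report the same weakness a few lines apart.

```python
from collections import defaultdict


def merge_reports(reports):
    """Group findings from several tools by (path, line).

    `reports` maps a tool name to a list of (path, line, weakness_name)
    tuples. The result maps each location to {tool: weakness_name}.
    """
    merged = defaultdict(dict)
    for tool, findings in reports.items():
        for path, line, name in findings:
            merged[(path, line)][tool] = name
    return dict(merged)


def agreement(merged, min_tools=2):
    """Return the sorted locations reported by at least `min_tools` tools."""
    return sorted(
        location
        for location, by_tool in merged.items()
        if len(by_tool) >= min_tools
    )


# Made-up findings, only to show the data layout.
reports = {
    "flawfinder": [("src/net.c", 42, "strcpy"), ("src/ui.c", 10, "sprintf")],
    "rats": [("src/net.c", 42, "strcpy"), ("src/log.c", 5, "fprintf")],
}

merged = merge_reports(reports)
print(agreement(merged))
```

Locations flagged by several tools are a natural place for an assessor to start, while the ones reported by a single tool show where the tools disagree.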

Also, if you are coming to downtown DC this weekend for ShmooCon or even BlackHat DC 2008 and you wanna have a beer, just drop me a mail. I wasn't able to find a ticket for Shmoo so I will not go, but I will meet with dre and marcin from ts/sci security... so if you are around, just tell me, I would be happy to meet more sec. people

The last thing is that this is my post number 100!
