Subscribe to the RSS feed

Tuesday, June 10 2008

My talk at SAW: Automated Evaluation of source code analyzer output

It has been some time since I haven't post on my blog... well, I've been busy especially with the end of SATE, and oh well! had vacation :)

Anyway, at the next Static Analysis Workshop this Thursday, we're gonna talk about the SATE experiment and the observations/results we could get from this. I am then gonna talk about a tool I wrote in order to probe if a reported weakness is a false-positive: this is the Automated Evaluation.

The main idea of the Automated Evaluation, is to get some information on the source code and, under some assumptions, try to make a conclusion on the correctness of the piece of code. Behind all the reasoning from that particular tool, my approach had to be radically different than a classical SCA otherwise this would have been like creating a new SCA and this would have been obviously useless. The context of this automated evaluation is limited to the buffer overflows and this can only work for proving false-positive only!

So basically, I am reading the source code from the reported sink to the possibles sources and grabbing the actions that possibly affect the variable which have a role in the code.

These actions are like:

  • Allocation of a destination buffer
  • Computing the size of the source buffer(s)
  • Test for NULL
  • Test that involves the size of the buffers...
  • ... and some others

Then, once these actions are detected, the tool increments a global score of false-positiveness to this reported weakness. We then only have to set a threshold in order to know what correctness we want to have; this is really tied to the source code and how the program is developed.

Even though this evaluation method is not perfect, this was adapted to the C test cases we had in SATE 2008 since the global code quality was good. We can even say that the software were well written; it was then okay to make some assumption on the code such as:

  • If the size of the destination buffer is computed with the size of the source buffer, the size is good (basically: no off-by-one)

Also, the tool itself needs some information on the source code such since it uses regular expression to match the "actions"...



Here we are for a quick explanation and here are the slides: SAW: Automated Evaluation of SCA output

Tuesday, January 22 2008

PHP Source Code Analyzer

Months ago, I was talking about and doing some small tests with the php source code security analyzer that I was able to find on the web.

I was able to quickly test the new Fortify SCA 5.0 which is handling PHP application now. I can tell you that I am really exciting about this tool. First of all, it beats from far all the tools I've tested previously (for PHP), which is fair since it's a commercial tool.

But what I'm really excited about now is that I will be able to make more tests on my test suites, compare with my security metrics & basic security analyzer, looking at the behavior of SCA tools when the source code is obfuscated, and so on. You're on the good track Fortify, now, open an API and I will be able to make an hybrid tool...

Since I also have some plan of testing real PHP applications with both testing approaches (static/dynamic), I'd like to see the difference of application coverage, vulnerability finding and false-positive rates (yeah, the last one is obvious, but still interesting).

I'm also glad to see that vendors are taking PHP as a serious language and not only for script kiddies.

Wednesday, December 5 2007

Static Analysis Framework: PHP-Ast/Oracle

In my previous blog post, I talked briefly about PHP-Ast/Oracle a PHP source code static analysis framework. I am developing it in order to play with source code and security. The goal of that framework is to be able to perform different type of operations on a PHP source code. I am releasing this tool as it is because I think people may be interested with this... Anyway, I learned a lot doing this.

PHP-Ast/Oracle is developed in C++ and the tool has been developed mainly for:

How it works

The source code repository is divided in 2 parts:

  • php-ast is the converter from PHP to XML
  • php-oracle is the actual engine

php-oracle get a XML file as input which is the output of php-ast. In the SVN there are some python scripts I used in order to combine the 2 tools (they may be outdated i.e. doesn't work with the current php-oracle).

How I think you could use php-oracle

I do not attend to make a clean build with an executable etc. I just provide source code. I decided to give only the source code because I don't want to spend too much time on creating a clean software, it's only research oriented stuff. Furthermore, there is not much documentation in the source code (advantages of being alone to develop such a tool) and then, only really interested people will download this! I can then help them if they have some question about how it works etc.

Getting the source code

You can download the source here: php-ast-oracle.zip

And the trac repository has more documentation about what the framework actually does: http://trac2.assembla.com/php-ast

Development

The tool is in perpetual development, I don't want to create a real software from that, but I think people can use it to perform security analysis, compute stuff, make code transformation and so on.

Sunday, December 2 2007

Yet another study oriented release

I've been working a couple of months on a project named php-ast/oracle. I am opening the source of the project today because I think that people may be interested in such a code. Roughly, php-ast/oracle is able to get/transform information on a php source code, I used it for: creating real obfuscations (control-flow, data-flow), implementing security metrics, writing a converter from php to c++ for static analysis purpose and some other stuff such as variables flow etc.. You can have more information here: http://trac2.assembla.com/php-ast. I may post about this project later don't have much time now...

But this news is only for releasing a script I used a lot this last weeks; a PHP preprocessor. I've been using this preprocessor in order to clean the crappy PHP code we can found in the wild... in order to use php-ast/oracle correctly for calculating security metrics and so on.

The preprocessor is actually doing 3 things:

  • Simplifying the strings (keeping only the php variables in the strings -- really important for keeping the AST small with SQL queries and so on, because the strings could be evaluated in PHP, the AST would need to tokenize the strings)
  • Removing comments and HTML
  • Resolving the file inclusions (not for dynamic variable inclusion of course, but it's working with define names and static names)

The preprocessor is available here: preproc.zip

Friday, September 7 2007

Acunetix is releasing a free XSS scanner

I really think it's a good thing do open the XSS scanning like this, definitely a good point Acunetix. What I don't really like though is the commercial points here. They are actually releasing the demo version with XSS scanning free for all websites (all other scanning are then limited to their test websites - which you shouldn't care about for any vendors).

Anyway, good point Acunetix! I wish lots of commercial will release some free tools or even their own little tools (SPI has a lot of good ones!)

Wednesday, August 29 2007

I now understand why it's difficult!

Okay, I know for the halting problem etc. Some theoretical stuff... But now that I'm working on one, I have to say:

Damn! That so complicated to do a source code scanner!

The dataflow is a real pain in the ass, and we know that it's impossible to have a real and full dataflow. But well, we need to do some. The dataflow is more complicated theoretically but what about the control flow? No really easier! I mean... that's easier but there are so many things to understand, so many patterns to recognize in order to build the model of the source code... And I'm not even talking about inter procedural stuff, multi-file source code etc.

So, I'd like to apologize to "I don't remember who are these people" but some source code scanners are good :) Well... for the moment! I'm really waiting for to see more high-tech stuff and AI in these kind of programs...

Anyway, I'm currently building a core engine working on a AST tree generated by yaxx (XML version). I have two short terms targets:

  • Real Obfuscation (from one source code to an equivalent with a different control flow... yes, not only rename the variables, functions, classes etc.)
  • A variable tracer (tool for pen-tester: $_GET['foo'] -> ($foo <- htmlentities()) -> echo or this kind of stack...)

Tuesday, July 10 2007

Website functionalities coverage

Coverage is a tool written in Python which allows you to track what functionalities/web pages are reached on your website. I use this tool for in my Web Apps Scanner evaluation methodology in order to know if the web apps scanner was able to scan every pages, every functionalities of my test apps.

Anyway, this tool is pretty easy to use even if it requires a MySQL database to store the EntryPoints of the application. Basically, you setup the database, you insert the entry points into your code and you run the python script which will generate an HTML report with SVG graphs, reporting the coverage of your application.

Here is a report example

Installation

1/ Database

The database design I used for storing the needed information is the following:

CREATE TABLE `coverage` (
`CoverageID` int(32) NOT NULL auto_increment,
`Apps` varchar(128) character set utf8 collate utf8_unicode_ci NOT NULL,
`Date` date NOT NULL,
`EntryPoint` varchar(255) character set utf8 collate utf8_unicode_ci NOT NULL,
`Origin` varchar(255) character set utf8 collate utf8_unicode_ci NOT NULL,
PRIMARY KEY  (`CoverageID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1;
  • Apps: name of the covered application
  • Date: time when the entry point is reached
  • EntryPoint: Name of the entry point with a special format:


** File Reached:
Touch_ + Name of the file with extension, example, Touch_Index.Php, Touch_Search.Php etc.

** Functionality Reached:
Name of the functionality + _ + Name of the file with extension, example, this sequence of entry points of the page Login.php of a given application:

  1. Touch_Login.Php : Enter the page Login.Php
  2. Username_Password_Login.Php : The username and the password are feed
  3. Call_Function_Login.Php : Call the function login()
  4. Call_Function_Succeed_Login.Php : The function login succeed
  5. Call_Function_Error_Login.Php : The function login reported an error
  • Origin: the origin string is the concatenation of the md5 of the HTTP_USER_AGENT a pipe and the date; this ID + date is used to be sure to study the same user.
<?php
// ...
$origin = md5($_SERVER['HTTP_USER_AGENT']). '|' . date("j-m-y H:i");
?>

2/ In the code

So, you will need to add, in your apps code, lots of entry points. I made a PHP source code to do that more easily:

<?php
class Coverage{
 private $coverage_id = false;
 private $coverage = null;
 function __construct() {
  $this->coverage_id = true;
  $this->coverage = mysql_connect('192.168.1.3:3306', 'test', 'test');
  mysql_select_db("test_collect");
 }
 function send($entryPoint){
  if ($this->coverage) {
   $origin = "";
   $origin .= md5($_SERVER['HTTP_USER_AGENT']);
   $origin .= ('|' . date("j-m-y H:i"));
   $entryPoint = mysql_real_escape_string($entryPoint);
   mysql_query("INSERT INTO coverage VALUES(NULL,'BankApp',NOW(),'$entryPoint','$origin')");
  }
 }
};
	
$coverage = new Coverage();
function register_EntryPoint($entryPoint) {
 global $coverage, $supportCodeCoverage;
 if ($supportCodeCoverage) {
  $coverage->send($entryPoint);
 }
}
?>

Insert this code in a header or something and call:

register_EntryPoint('Touch_MyFile.Php');

etc. in your code where you have functional difference.

Run the tool

To run the tool, you need to have:

  • Python + MySQLdb (the python MySQL API)
  • The date (in SQL format) you want to cover; for now, it's only one day
  • The Origin ID of the user (the MD5(HTTP_USER_AGENT)), basically, you will look at this in the database, or get it by your code etc.


example:

$ python coverage.py 2007-06-28 41942da0293d0b8afcfab4c2d10c2401
$ python coverage.py 2007-04-12

The script must be in the same directory of your files for now... you can download the archive here: coverage.zip

Wednesday, June 20 2007

PHP Source Code Security Scanners: Pixy

I already talked about source code scanners for PHP, and even run a simple test between SWAAT and PHP-SAT. Today, a new toy has been released: Pixy, so I decided to make it pass the test. The first test is really basic, having a quite small php source code with a bunch of possible faults: tests.php

So, you find the output of the tool here: out.pixy.result.txt

I first have to say that it's normal that the tool doesn't catch the header injection stuff, os command injection etc. it doesn't claim to do that. Pixy claims to find the Cross-Site Scripting and the SQL Injection. On that point, I would say pretty good job guy!
The tool catch all the possible Cross-Site Scripting in the echo functions, doesn't warn for the persistent XSS (line 34, the bad html injection would be inserted into the SQL database, if there is no output validation, there are Persistent XSS).

Even better on the SQL Injection where it found every thing I tagged as true-positive.

To conclude, I will definitely keep an eye on this tool which looks promising to me, I will also continue working on the PHP-SAT security configuration in order to make a solid vulnerability disclosure system.

Wednesday, May 30 2007

Such a noisy thing with SWAAT

In one of the last post, I made a comparison between two PHP Source Code Security Analyzers: SWAAT and PHP-SAT. The results was close to say that SWAAT was really better than PHP-SAT.
I started working on the configuration of PHP-SAT and it looks to be quite powerful (well, after talking with Eric Bouwers, I'm waiting for the next release) and I think I will be able to have good results with combining a security oriented configuration and some additional bugpatterns.
On the other hand, SWAAT is really limited for now as example, I've made a simple php script with only SQL queries inside: every lines are highlighted as flawed (and with a MEDIUM level)!! This is simply stupid and they would better don't report anything than doing that... just tell that you don't support SQL Injection for now... Anyway, SWAAT is for me the tool to keep an eye on, I will try to develop some features on it, especially for XSS detection and SQL Injection findings...

Saturday, May 26 2007

RegFuzzer: Test your regular expression filter

Here we go, I release the first shot of a tool I start writing months ago... The goal of that tool is to find some strings that are valids and which pass your regular expressions filters. Basically, it was designed for testing IDS regexp.
The tool is not finish yet, I have lots of work to do on this, especially the attack strings dictionary; currently there is only some client-side string patterns.
You can download the tool here: RegFuzzer

For using the tool, you need to enter the regular expression to test into the XML input file, and launch the tool like:

python regFuzzer.py -f input.xml

This will produce an HTML file as output. As I said before, this is the first release the goal is much more to show that tool and see if the idea is interesting for you; if so, I may work more on this. Don't hesitate to drop me a line about the tool if you have some comments.

Thursday, May 24 2007

PHP Source Code Security Scanners basic test

For quite a long time now, I've been playing with lots of different black-box tools: commercial or not, mine or not. Months ago, I developed Crystal, a plugin for Grabber which does the link between the black-box engine in Grabber and a PHP Source Code Security Analyzer: PHP-SAT . At the time, it was the only advanced PHP SCSA I could find on the web, so I used it without really testings I admit.
That's for the story, few days ago, on #webappsec (irc.freenode.org), Larholm told me about SWAAT a new (at least, for me) PHP SCSA (and not only PHP actually). At the time, I didn't have time to try it; but today, I took the time to compare PHP-SAT and SWAAT with a test which can be view as a quite-exhaustive-basic-flaw-checker (it means that there is maybe 6 different vulnerabilities with variants and false positive/true positive check implementation).
You can see the PHP test file here: tests.php

The result of the two runs can be find here: php-sat-test-output.phps and swaat-output.html
How to read the reports:

  • SWAAT: HTML file with table for each type of vulnerabilities, it will report multiple lines (each line is a vulnerability). If there is a /* fase */ in that line, then, this is a false positive.
  • PHP-SAT: PHP-SAT takes the PHP source code and transform it by adding some information. For the vulnerability report you will have to look for the Malicious Code Vulnerability (MCV). Other report are more quality oriented.


I will not spend time to explain the difference of the tools but the tools don't really have the same goal (even if we can use them for the same utilization). Well, with the default configuration of both tools, SWAAT is really better! But as for many Source Code Security Analyzers, the configuration is really important, so I would mitigate my conclusion on these tools, I really need to dive into the configuration of that two tools and redo the tests.

Wednesday, February 7 2007

How you should design a test suite for Web Apps Scanners

If you have ever think about using a web application scanner for testing the security of your website, you certainly made a choice: Which web apps scanner should I buy/use ?

In this post, I will not tell you what is the better black box tester for whatever kind of web application.

The web applications may be very different, the tools are different and thus they could have different efficiency (i think it's non countable noun)... If you read this, you probably know that I am talking about scanners such as WebInspect, AppScan, Acunetix, Hailstorm, Pantera, Grabber... In the following sections, I will explain a main idea that should be used for testing such a tool.

A test suite for our tools is a website, this website has typically vulnerabilities; you can see this kind of website by watchfire, spi but also WebGoat, SiteGenerator or others. But all of these websites are not realistic and do not consider that the vulnerabilities may exist in different instances: variants.
If you don't know what is a variant check at the XSS Cheat Sheet (RSnake) or at the Attack Patterns (Sean Barnum). A variant is what the hacker use to perform his exploit.
A simple example for XSS is: Let's say you protect your website against XSS by checking the <script> tag; this is not perfect and not good because there is some way to insert other type of XSS strings (onmouseover).

Here comes the concept of Level Of Defense. If you are a developer you think about filters, if you are an attacker you think about variants of vulnerability and attack patterns. The level of defense of a website is the strength of its filters again a given vulnerability.

For a SQL Injection you can have multiple type of filters... Here is a possible list of levels for the SQL Injection:

  • Level 0: Show SQL errors / No input filtering
  • Level 1: Hide SQL errors / No input filtering
  • Level 2: Typecasting (integer, string etc.)
  • Level 3: Escaping input strings
  • Level 4: Restricted accounts...
  • ...


In the concept of the level of defense, it's important to not that depending of the type of vulnerability (weakness, failure...) the level n-1 is also performed in the level n or the level n is stronger (for the same variants) than the level n-1 (for instance, for Weak Hash Function it's not possible but using SHA-1 instead of MD5 is a level of defense higher).

A Key point: When you are implementing a level of defense for a vulnerbility, you must be sure that your implementation does the whole thing for that type of filter. For example, if you are escaping the HTML entities, you need to do all not only '<', '>' and in the next LoD escaping ' and ".

Why is the level of defense better than a simple system with vulnerabilities?

With the level of defense, you can calibrate a type of website which may be close to yours; you can construct a test suite with your kind of level of defense and see how the tool detect the vulnerabilities when the LoD increase. It is also a good way to know the state of the art of the tools for detecting vulnerabilities...

The idea was developed to create a test suite in order to evaluate web apps scanners; in this test suite we can select the current type of vulnerability and its level of defense (the hardness to break):


Thursday, January 11 2007

Virtualization for free!

Thanks to RSnake, I have now a free version of VMWare:


This is definitely a good stuff. In a previous post, I explained that I have to create different architectures for different kind of test suites; this will be a lot easier with virtualization.

Wednesday, January 10 2007

Test Suites for Web Application Scanners

For a while, I've been working on a test suite for evaluating web application scanners. Now I have a test suite (PHP/MySQL/AJAX) with a bunch of variable vulnerabilities:


But there is a problem for a full evaluation. Web Application are not only a simple schema of scripts and databases and complex relation, there is also server configuration, infrastructure, different type of databases etc. Thus, I really have to create different test suites for a good coverage of what web apps could be.
I plan to use:

  • Ruby On Rails framework
  • ASP.NET/MS SQL based application
  • JSP application


This should cover the differnt type of application but I still have to think about server types, architectures,multiple databases etc.

Saturday, December 16 2006

Another tool: Acunetix

I've just found another commercial web apps scanner: "Acunetix Web Vulnerability Scanner". You can reach it here: http://www.acunetix.com

I usually do not quote or report some tools, but I tried the trial version which seems to be nice, but not more than WebInspect or AppScan. But while watching the tiny tools in the distribution, I saw an interesting file: RSnake_XSS cheat sheet.xml
This tool has a Fuzzer which performs some attacks from the XSS cheat sheet from http://ha.ckers.org and this is very interesting (this cheat sheet is something like the most relevant XSS attack dictionary).

- page 1 of 2

http://rgaucher.info/bot