I <3 Bots!

Keyword - security


Saturday, February 21 2009

SHA-3 reference implementations buffer overflows

Fortify just posted a nice blog post about the audit they did on several of the reference implementations competing to become the next NIST SHA-3.

They did not release much information on their findings: only one is described. I would really have liked to see how powerful the analysis had to be (if it was powerful at all) to find these problems.

It would also be nice to see other tool vendors, such as Grammatech, Klocwork, Coverity, etc., do the same, and then start another competition ;)

I'd really like to emphasize the conclusions of Fortify's blog post:

Reference implementations don't disappear, they serve as a starting point for future implementations or are used directly. A bug in the RSA reference implementation was responsible for vulnerabilities in OpenSSL and two separate SSH implementations. They can also be used to design hardware implementations, using buffer sizes to decide how much silicon should be used.

The other consideration is speed, which will be a factor in the choice of algorithm. The fix for the MD6 buffer issues was to double the size of a buffer, which could degrade the performance. On the other hand, memory leaks could slow an implementation. A correct implementation is an accurate implementation.

Wednesday, September 10 2008

PyQt and WebKit integration: unexpected limitation [fixed]

For those who don't know Qt, it is a huge and mature framework for developing GUIs and more on different platforms (read: multi-platform). I have already done some development using Qt and C++ (especially when I was working at the GERAD).

Since Marcin and I wanted to have a look at some technologies that involve a browser, I decided to look at Qt and its almost-fresh WebKit integration.

Integrating WebKit into a framework like Qt lets developers embed in their application, supposedly easily, a browser that supports the basic web technologies: HTML, CSS and JavaScript (it seems that Flash is going to be supported soon, and anyway, one can write one's own plugin to interact with specific content).

And indeed it is easy... I used PyQt to develop a very simple prototype and see what we can do with this new technology. As I already know Python and Qt, it was easy for me to get started and be kinda effective. So, after a few hours of work, documentation reading and trying to understand why and how the Python version of Qt was doing such and such thing compared to the C++ version, I got a working browser that allows dynamic JavaScript injection through a console, viewing the source, and a simple encoding converter (click on the image to see the full screen-shot):



At this point, I was actually very excited: less than 500 lines of Python to create that... It seemed worth a few more days of work to turn it into a useful tool: the Swiss Army Knife of the Pen-Test.

My next logical step was to extend the tool with tamper-data-like capabilities (e.g. being able to hijack the HTTP request and then tamper with the GET/POST data).

And here come the problems... it's apparently not possible to get the current request and then the reply when using the WebKit widget in Qt (QWebView). I tried to use a delegate QNetworkAccessManager in order to override the POST/GET requests, since this object is used to set the proxies etc., but nothing... I think they just didn't open up this possibility for some reason.

Oh well, I have stopped developing this prototype for now and will try to contact Qt experts/developers to figure out whether there is another way to do it. I thought of one solution, which would be to have my own HTTP manager using QHttp to do the request, get the response etc. and then send the content to the browser; this would be great in a webapps scanner, but for the use I had in mind, it would create huge limitations for user interaction and especially for Ajax applications. So, the prototype stays here until I find a solution or Qt opens up the network management under the QWebView widget...


Fixed:

An update to let you know that I actually fixed the problem. It was really stupid of me: I should really check whether a method is virtual or not before overriding it :/ shame on me!

So now, I have firefox/tamper-data/firebug in one tool :)

Sunday, August 10 2008

Why the "line of code" is indeed a good metric

When I first learned about source code metrics, I was amazed that people used lines of code to compare software. To me, it showed a lack of imagination.

At the beginning of the week, I started a small and quick experiment: extracting metrics from the SATE 2008 test cases. This experiment focuses on function-wise properties, so I have to extract a handful of metrics for each function:

  • McCabe's cyclomatic complexity, which measures code complexity; this is indeed a good metric for estimating how difficult a given piece of code will be for a human to understand (very important for security-related problems)
  • Lines of code
  • Lines of comments
  • Number of local variables
  • Number of parameters (which represents the coupling between the function and the whole program)
  • Number of function calls
  • Number of functions that are ``sources''
  • Number of functions that are ``sinks''
  • Number of C standard functions (obviously, only for C test cases)
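To give an idea of what extracting such function-wise metrics looks like, here is a minimal sketch. It is not the extractor used for the SATE test cases (those are C and Java); as an assumption for illustration, it analyzes Python source with the standard `ast` module, and the name `function_metrics` and the list of counted node types are hypothetical simplifications.

```python
import ast

# Node types counted as one extra decision point each (a simplification).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def function_metrics(source):
    """Return {function name: metrics dict} for every function in `source`."""
    results = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        complexity = 1  # a function without branches has exactly one path
        calls = 0
        for child in ast.walk(node):
            if isinstance(child, _BRANCH_NODES):
                complexity += 1
            elif isinstance(child, ast.BoolOp):
                # "a and b and c" adds two short-circuit decisions
                complexity += len(child.values) - 1
            elif isinstance(child, ast.Call):
                calls += 1
        results[node.name] = {
            "cyclomatic": complexity,
            "lines": node.end_lineno - node.lineno + 1,
            "parameters": len(node.args.args),
            "calls": calls,
        }
    return results


# Made-up sample function, only there to exercise the extractor.
SAMPLE = '''
def check(user, password):
    if not user or not password:
        return False
    for attempt in range(3):
        if verify(user, password):
            return True
    return False
'''
metrics = function_metrics(SAMPLE)
```

Counting sources and sinks would need a list of function names to look for in the calls, which is left out here.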

At first, the line of code metric was implemented because it's an easy one to compute and it also gives an important value if we want to normalize the other metrics. We also decided to introduce the numbers of ``sources/sinks'' in order to study input validation weaknesses later on...

Anyway, after running some statistics on the output, I was amazed to observe that the Pearson correlation coefficient between McCabe and Line of Code was never less than 0.90, which is a very strong positive linear correlation. (I have to say, though, that there are huge limitations in the parsers we are using to extract the information; for instance, the C is not pre-processed.) This result only holds for the C test cases; the average correlation observed in the Java test cases is around 0.60...
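For reference, this is how a Pearson coefficient can be computed from two lists of per-function measurements. It is only a sketch: the numbers below are made up for illustration and are not SATE 2008 data, and the function name `pearson` is my own.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-function measurements (NOT taken from the SATE test cases).
lines_of_code = [12, 30, 45, 80, 150]
cyclomatic = [2, 5, 6, 12, 21]
r = pearson(lines_of_code, cyclomatic)
```

A value of r close to 1 means the two metrics grow together almost linearly; r squared gives the share of variance the two metrics have in common.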

Of course, further statistical analysis will be necessary to conclude anything on this subject. If we were unlucky with the test case selection, that could have biased the result, but I don't think we were. Actually, it seems quite logical that these metrics are related: the longer the code is, the more complex it can be in terms of tests, loops etc.; there is indeed more chance that longer code contains more cycles :)

Oh well, I'll keep writing about this, especially since I expect to get results pretty soon...

Friday, July 18 2008

Scalp: apache log based attack analyzer

I started a project some time ago to parse Apache log files and detect attacks in them. The attack recognition is based on the PHP-IDS filters.

The first release is written in Python (http://code.google.com/p/apache-scalp/downloads/list), but I have started (well, almost finished) a faster multi-threaded C++ version in order to handle bigger log files.

The main project page is reachable here: http://code.google.com/p/apache-scalp

Scalp the apache log! - http://code.google.com/p/apache-scalp
usage:  ./scalp.py [--log|-l log_file] [--filters|-f filter_file]
                   [--period time-frame] [OPTIONS] [--attack a1,a2,..,an]
   --log       |-l:  the apache log file './access_log' by default
   --filters   |-f:  the filter file     './default_filter.xml' by default
   --exhaustive|-e:  will report all type of attacks detected and not stop
                     at the first found
   --period    |-p:  the period must be specified in the same format as in
                     the Apache logs using * as wild-card
                     ex: 04/Apr/2008:15:45;*/Mai/2008
                     if not specified at the end, the max or min are taken
   --html      |-h:  generate an HTML output
   --xml       |-x:  generate an XML output
   --text      |-t:  generate a simple text output (default)
   --except    |-c:  generate a file that contains the non examined logs due 
                     to the main regular expression; ill-formed Apache log etc.
   --attack    |-a:  specify the list of attacks to look for
                     list: xss, sqli, csrf, dos, dt, spam, id, ref, lfi
                     the list of attacks should not contains spaces and be comma
                     separated
                     ex: xss,sqli,lfi,ref
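The general idea (match each log line against a main regular expression, then run the requested URL through a set of attack filters) can be sketched as follows. This is not the actual Scalp code: the `LOG_LINE` expression, the three patterns in `FILTERS` and the `analyze` function are assumptions made for illustration, and the patterns are far simpler than the real PHP-IDS rules.

```python
import re
from urllib.parse import unquote_plus

# Start of a common/combined log line: host, identity, user, [date],
# "METHOD url protocol", status code.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+)[^"]*" (?P<status>\d{3})'
)

# Illustrative patterns only; these are NOT the PHP-IDS filters.
FILTERS = {
    "xss": re.compile(r"<\s*script|javascript:", re.IGNORECASE),
    "sqli": re.compile(r"union\s+select|'\s*or\s+'?1'?\s*=\s*'?1", re.IGNORECASE),
    "dt": re.compile(r"\.\./"),
}


def analyze(line, exhaustive=False):
    """Return the attack types found in a log line, or None if it is ill-formed."""
    match = LOG_LINE.match(line)
    if match is None:
        return None  # the kind of line --except would write to a file
    url = unquote_plus(match.group("url"))
    found = []
    for name, pattern in FILTERS.items():
        if pattern.search(url):
            found.append(name)
            if not exhaustive:
                break  # stop at the first attack type, like the default mode
    return found


line = ('127.0.0.1 - - [04/Apr/2008:15:45:02 +0200] '
        '"GET /index.php?q=%3Cscript%3Ealert(1)%3C/script%3E HTTP/1.1" 200 512')
result = analyze(line)
```

The `exhaustive` flag mirrors the --exhaustive option above: without it, the scan of a line stops at the first filter that matches.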

Saturday, December 8 2007

"My Security Planet"

I love iGoogle. I have a couple of widgets and my RSS feeds in it... That's actually the problem: I had too many feeds, so I decided to create my own "planet" in order to have just one feed with all the security blogs I'm reading.

You can reach it here if you have the same tastes as me: http://rgaucher.info/planet

I know that planet-security is pretty much doing the same, but I don't like the interface of this website and it doesn't have all the feeds I'm following...

Sunday, December 2 2007

Yet another study oriented release

I've been working for a couple of months on a project named php-ast/oracle. I am opening the source of the project today because I think people may be interested in such code. Roughly, php-ast/oracle is able to extract and transform information about PHP source code. I have used it for: creating real obfuscations (control-flow, data-flow), implementing security metrics, writing a converter from PHP to C++ for static analysis purposes, and some other stuff such as variable flow. You can find more information here: http://trac2.assembla.com/php-ast. I may post about this project later; I don't have much time now...

But this news is only about releasing a script I have used a lot these last weeks: a PHP preprocessor. I've been using this preprocessor to clean up the crappy PHP code we can find in the wild... so that php-ast/oracle can be used correctly for calculating security metrics and so on.

The preprocessor actually does 3 things:

  • Simplifying the strings (keeping only the PHP variables in the strings; this is really important for keeping the AST small with SQL queries and so on because, since strings can be evaluated in PHP, the AST would otherwise need to tokenize them)
  • Removing comments and HTML
  • Resolving file inclusions (not for dynamic variable inclusions of course, but it works with define names and static names)

The preprocessor is available here: preproc.zip

Wednesday, November 21 2007

The new grabber

Grabber was a nice project. The main goal for me was to learn stuff around web application security/scanners; I didn't really know much before I started this project. But now that I've been playing with web apps scanners for more than 10 months, I need to create a new one and go deeper into heuristics, browser integration and AI.

Grabber was in fact more a spider+fuzzer than anything else... Not a good web apps scanner at all. As for the analysis engine... it's kinda stupid: no JavaScript execution, just simple heuristics for parsing and Levenshtein distances ;)
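For readers who don't know it, the Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. Below is a minimal sketch of the classic dynamic-programming computation; it is not Grabber's code, and how Grabber used the distance (for instance, comparing a normal response with the response obtained after an injection) is only my assumption here.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (each edit operation costs 1)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


distance = levenshtein("kitten", "sitting")
```

Only two rows of the table are kept in memory, which matters when the strings are whole HTML pages.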

Anyway, I decided to start this project over. It's not gonna be a bunch of Python scripts anymore; I am gonna use Qt/C++ extensively. The idea of this project is to be pen-tester oriented and open: I want to create a kind of wrapper around WebKit (specifically using QtWebKit) and a spider as core utilities, and then build on them with plugins. The plugins should be either in C++ or JavaScript (QtScript actually). So far, we are 3 guys thinking about this project: we haven't started yet but we are open to every contribution; the project will of course be free and GPL'd.

I am just posting this to get some comments or suggestions about what a web apps scanner should do... Feel free to comment/mail...

Monday, November 12 2007

Interoperability and web application scanners

Talking about web application security scanners, we all have the same problem: false positives. It's a problem that cannot really be solved by the testing methodology itself (since it relies on pattern detection). So, the idea I started talking about on #webappsec today is a common format for exchanging information between tools.

Ideally, this would work like this:

  1. Tool A is scanning a website.
  2. It exports some information in a given format: out-tool-a.xml
  3. Tool B is able to understand out-tool-a.xml and take it as an input
  4. Tool B would then be able to verify the results/false positives of Tool A by scanning with the information in out-tool-a.xml
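The steps above can be sketched as a round trip through an XML document. No such common format exists yet, so every element and attribute name below (`scan`, `finding`, `url`, `parameter`, `payload`) is hypothetical, as are the two function names.

```python
import xml.etree.ElementTree as ET


def export_findings(findings):
    """Tool A side: serialize a list of finding dicts to an XML string."""
    root = ET.Element("scan", tool="tool-a")
    for finding in findings:
        node = ET.SubElement(root, "finding", type=finding["type"])
        ET.SubElement(node, "url").text = finding["url"]
        ET.SubElement(node, "parameter").text = finding["parameter"]
        ET.SubElement(node, "payload").text = finding["payload"]
    return ET.tostring(root, encoding="unicode")


def import_findings(xml_text):
    """Tool B side: read the findings back so that each one can be re-tested."""
    root = ET.fromstring(xml_text)
    return [
        {
            "type": node.get("type"),
            "url": node.findtext("url"),
            "parameter": node.findtext("parameter"),
            "payload": node.findtext("payload"),
        }
        for node in root.iter("finding")
    ]


# A made-up finding, as Tool A might report it.
findings = [{
    "type": "xss",
    "url": "http://example.com/search.php",
    "parameter": "q",
    "payload": "<script>alert(1)</script>",
}]
document = export_findings(findings)  # content of out-tool-a.xml
```

With the URL, parameter and payload in hand, Tool B can replay exactly the request that Tool A flagged and decide for itself whether the finding is a false positive.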

I really think that would be helpful somehow, at least for open-source tools. I'm gonna try to implement this for the next release of Grabber.

Tuesday, October 16 2007

Stuck at data-flow? Do box-modeling!

Since yesterday, I have been working on a data-flow problem. I need to model a function, and in principle I should run the whole data-flow process on it. Well, that takes kinda long if I have to do it for all functions, and above all I would never use much of the information generated by analyzing the tree associated with the function (local variables etc.). So what's the point of doing that? None.

I was stuck at this point and couldn't find a good way to model a function (entry parameters, global calls etc.), so I thought of treating it like a crystal ball: I can see what it is, but it's kinda blurry :) I am now modeling a function as inputs and outputs, only in terms of its interactions with functions and global variables. This way, I should be able to see the possible effects of the given function on the system. Hope it's gonna work well!

Wednesday, October 10 2007

Working around security metrics...

I'm not gonna write a long entry about Security Metrics, but since I've been working on this for a couple of weeks now, I have some thoughts. Evaluating the security of source code is actually pretty hard. Even if I'm sure there are lots of source code security metrics out there, they are often (I guess) hard to compute. Basically, you need to know lots of things about the source code, and then you need an engine working on the AST, data-flow etc.

This is what I've been building for a couple of months: an engine that works on XML ASTs generated by yaxx (this is the same engine that I use to do source code modifications, obfuscations, etc.).

With Vadim Okun, we had the idea of computing the "size" of the security in source code. The idea is pretty simple, and we are aware that it is limited to implementation flaws and not design flaws for now. The "size" of the security is the number of inputs going to sinks.

The inputs have to be taken in the broad sense: they are in fact all the variables that are derived from direct inputs. Here is a simple example of variable diffusion:

$a = $_GET['foo'];
$b = htmlentities($a);
echo $b;

We are here counting $a and $b since $b is a modification of $a, which is a direct input. We use the same method for all possible modifications (concatenation, cast, etc.).

Once we know these variables, we count the ones that go to sinks. The sinks are a list of functions such as 'echo', 'mysql_query', 'fopen', and so on. Our list of sinks comes directly from the PHP-SAT project. In the previous example, the metric result is 1 since there is only one sink, 'echo', that a derived input goes to.
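Here is a minimal sketch of that counting, written in Python rather than on top of the real AST engine. It is full of assumptions: the program is represented as a flat list of `(kind, name, used_variables)` tuples instead of an AST, only straight-line code is handled, the `DIRECT_INPUTS` and `SINKS` sets are tiny illustrative samples (not the PHP-SAT list), and the metric is counted as the number of sink calls receiving a derived input.

```python
# Illustrative samples only; the real sink list comes from PHP-SAT.
DIRECT_INPUTS = {"$_GET", "$_POST", "$_COOKIE", "$_REQUEST"}
SINKS = {"echo", "mysql_query", "fopen"}


def security_size(statements):
    """Return (variables derived from inputs, number of sink calls they reach).

    Each statement is a tuple (kind, name, used_variables) where kind is
    "assign" (name is the assigned variable) or "call" (name is the function).
    """
    derived = set()
    reached_sinks = 0
    for kind, name, used in statements:
        uses_input = any(v in DIRECT_INPUTS or v in derived for v in used)
        if kind == "assign" and uses_input:
            derived.add(name)  # any modification of an input is still an input
        elif kind == "call" and name in SINKS and uses_input:
            reached_sinks += 1
    return derived, reached_sinks


# The three-line PHP example above, encoded by hand.
program = [
    ("assign", "$a", ["$_GET"]),  # $a = $_GET['foo'];
    ("assign", "$b", ["$a"]),     # $b = htmlentities($a);
    ("call", "echo", ["$b"]),     # echo $b;
]
derived, size = security_size(program)
```

Note that htmlentities does not remove $b from the derived set: as in the example, the metric counts the flow itself, not whether it is properly sanitized.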

And here we are: this is a fairly simple (in the idea, not the implementation) way to evaluate the possible security problems that you can have in your source code. We are going to try and evaluate this metric on different open source projects (wordpress, joomla, mediawiki etc.). I'm sure this is really incomplete: first because we are only counting the security problems that come from inputs, but also because it really depends on the programmer (and their style of programming).

Another example is available here: smetric.pdf

Next Improvements

For the revised version, the first addition would be to count the output validation problems. But for that purpose, I need a stronger data-flow analysis which would analyze inside function definitions (not done yet). Then, I will be able to trace everything coming from supposedly secure sources (databases, resources, local files, etc.) to sinks. Maybe the weight of such flows would be different from that of the first kind (input to sink)...

Tuesday, September 4 2007

Source Code Obfuscation

Source Code Obfuscation is actually a powerful tool for testers, whether you use it to obfuscate your bytecode (Java, .NET etc.) or to increase the complexity of your current source code.

Working at SAMATE, we are also playing with, tweaking, testing and stressing source code analyzers. And now you see the relation: I'm writing a source code obfuscator in order to increase the complexity of our test cases and see if the tools are still doing well.

Thus, I was able (with good documentation, and yaxx) to create one. It currently only adds control flow complexity (and of course renames classes, functions and variables).

Some words on obfuscation

You may have heard about obfuscation in the sense of making the code unreadable for users. This is not what I'm interested in. I want to modify the actual source code, adding some information to it, some tests... I need the outputs of the original program and the obfuscated one to be the same, otherwise we cannot consider the two source codes as being equivalent.

So for example if I do:

if (var == 0) { 
  echo 0;
}

I will have the same behavior with this source code:

x = some_value;
if (var == 0 or x*x < 0) {
  echo 0;
}

Even though they have the same output (x*x < 0 can never be true for a real number, so the added condition changes nothing), the second one is more complicated since it adds one more test.
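The equivalence of the two snippets can be checked mechanically. The sketch below is a Python transcription made for illustration (the names `original` and `obfuscated` are mine), with the echo replaced by collecting the output in a list so that both versions can be compared.

```python
def original(var):
    """Transcription of the first snippet: echo 0 when var is 0."""
    output = []
    if var == 0:
        output.append(0)
    return output


def obfuscated(var, x):
    """Same behavior with an extra test that can never succeed."""
    output = []
    # x * x is never negative for a real number, so this branch of the
    # "or" is dead; it only adds a path for the analyzer to consider.
    if var == 0 or x * x < 0:
        output.append(0)
    return output


# Compare both versions on a small grid of values.
same = all(
    original(var) == obfuscated(var, x)
    for var in range(-3, 4)
    for x in range(-5, 6)
)
```

Python integers never overflow, so the predicate is safe here; in a language with fixed-width integer arithmetic, the extra condition would have to be chosen with overflow in mind.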

Test case wise example

To see what it does exactly, take this original source code

<?php
	$b = 0;
	$c = "Salut";
	$len = strlen($c);

	function fct($a) {
		return $a . "_1\n";
	}
	
	class T {
		function foo() {
			echo "test\n";
		}
	}
	echo fct(0);
	echo T::foo();
?>

...and choose one of my control flow obfuscating patterns

<?php
	class rand_class_name {
		function rand_func_name_2($rand_name_1) {
			return $rand_name_1 + 1;
		}
	}

	function rand_func_name_1($rand_name_2) {
		return $rand_name_2 + 1;
	}
	
	if (rand_func_name_1(0) > 0 && rand_class_name::rand_func_name_2(0)) {
		$enter_the_new_statement;
	}
?>

to get this result:

<?php
	function HXvE5Plwxp0RSoQM ( $ZMfP98Az96Rq67j6 ) {
		return $ZMfP98Az96Rq67j6 + 1 ;
	}
	class TF03COvMuzXRQcCK {
		function Ltghf3a0McCI8RaZ ( $V309os5vQo15ak9b ) {
			return $V309os5vQo15ak9b + 1 ;
		}
	}
	$b = 0 ;
	$c = "Salut" ;
	$len = strlen ( $c ) ;
	function fct ( $a ) {
		return $a . "_1\n" ;
	}
	class T {
		function foo ( ) {
			echo "test\n" ;
		}
	}
	if ( HXvE5Plwxp0RSoQM ( 0 ) > 0 && TF03COvMuzXRQcCK :: Ltghf3a0McCI8RaZ ( 0 ) ) {
		echo fct ( 0 ) ;
	}
	if ( HXvE5Plwxp0RSoQM ( 0 ) > 0 && TF03COvMuzXRQcCK :: Ltghf3a0McCI8RaZ ( 0 ) ) {
		echo T :: foo ( ) ;
	}

?>

How it actually works

First of all, the engine works only on the Abstract Syntax Tree (AST) in order to do powerful manipulation and code refactoring. The idea is to take a couple of transformation patterns (the second source code is in fact a complicated one) and fit these patterns to the original source code.

The patterns are meta code. You can see that they are written in PHP using names such as $rand_name_1; this means that the engine will generate one unique name for each of them and substitute it before the actual refactoring.

Selecting what I want to obfuscate is not a real problem, but for now I only select the top-level statements and apply the whole modification to each of them.
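To make the mechanism concrete, here is a small sketch of the same idea using Python's own `ast` module instead of yaxx and PHP. It is an analogy under several assumptions: the pattern is reduced to one helper function and one guard, the generated names are passed in as fixed parameters (`func`, `arg`) instead of being random, only top-level expression statements are wrapped (like the echo statements in the result above), and `ast.unparse` requires Python 3.9 or later.

```python
import ast
import contextlib
import io

# Meta code for the pattern: {func} and {arg} play the role of
# rand_func_name_1 and $rand_name_2 in the PHP pattern.
PATTERN = '''
def {func}({arg}):
    return {arg} + 1
'''


def obfuscate(source, func="opaque_1", arg="value_1"):
    """Wrap each top-level expression statement in an always-true guard."""
    tree = ast.parse(source)
    helper = ast.parse(PATTERN.format(func=func, arg=arg)).body
    new_body = []
    for stmt in tree.body:
        if isinstance(stmt, ast.Expr):
            # Build a fresh guard node for every wrapped statement.
            guard = ast.parse(f"{func}(0) > 0", mode="eval").body
            stmt = ast.If(test=guard, body=[stmt], orelse=[])
        new_body.append(stmt)
    tree.body = helper + new_body
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


def run(source):
    """Execute source and return what it printed, to compare behaviors."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


SOURCE = '''
b = 0
def fct(a):
    return str(a) + "_1"
print(fct(0))
print("test")
'''
result = obfuscate(SOURCE)
```

Since the guard `opaque_1(0) > 0` is always true, `run(SOURCE)` and `run(result)` print the same text, which is exactly the equivalence requirement stated earlier.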

A little schema explaining roughly how it works is available here: schema_obfuscation.png

What's next

The applied control flow obfuscating pattern is one of the many I have for now (many more to come), and I guess this is kinda promising; lots of interesting studies should come now.

Currently the tool is only for PHP, but I should make it general by using my own AST node names, and then be able to do code transformation on C, C++, Java etc.

There is no release of the tool (written in C++) right now, I will wait until it's more than correct and clean. I also need to do data obfuscation (using indirections etc.). The program will of course be public and free for everybody when it's gonna be ready.

Wednesday, June 20 2007

PHP Source Code Security Scanners: Pixy

I already talked about source code scanners for PHP, and even ran a simple test between SWAAT and PHP-SAT. Today, a new toy has been released: Pixy, so I decided to put it through the test. The first test is really basic: a fairly small PHP source file with a bunch of possible faults: tests.php

You can find the output of the tool here: out.pixy.result.txt

I first have to say that it's normal that the tool doesn't catch the header injection stuff, OS command injection etc.; it doesn't claim to do that. Pixy claims to find Cross-Site Scripting and SQL Injection. On that point, I would say pretty good job guys!
The tool catches all the possible Cross-Site Scripting in the echo functions, but doesn't warn about the persistent XSS (line 34: the bad HTML injection would be inserted into the SQL database; if there is no output validation, this is a persistent XSS).

Even better on SQL Injection, where it found everything I tagged as true-positive.

To conclude, I will definitely keep an eye on this tool, which looks promising to me. I will also continue working on the PHP-SAT security configuration in order to build a solid vulnerability disclosure system.

Monday, June 18 2007

Safe Browsing API by Google

Google has just released the so-called "Safe Browsing API", which lets anybody check whether a given URL is known as a phishing website or a malware-infested page. This service is already working with Firefox.

Sunday, June 17 2007

How to make people realize that web app vulnerabilities are important?

Like most expatriates, I keep up with the news from my country (France) by watching news websites; mostly I watch France 24, which claims to be the French CNN... Anyway, I was watching some videos and, at the end, as I do on some of the websites I visit (depending on whether I have time etc.), I looked at how it works, whether it has vulnerabilities, etc. Of course it has some. I will not detail them here because I haven't told them yet, but you can very easily find XSS. What's different from other websites? Nothing, except that they publish information, so people trust them.
There are several types of websites, but I would say that their behavior fits into 3 different categories:

  • They give information: news, tv, radio etc.
  • They store personal information: webmails, commerce, forums etc.
  • The others: blog, personal/companies websites etc.

While XSS'ing that website, I thought that being able to change information could have a huge impact (we saw that with the story of Apple and the wrong news on Engadget...). Of course, everybody reading this blog is aware of this, but I'm pretty sure that most other people just think that vulnerabilities are used to get information, not to store it.
So, nothing much here; I just thought about how a simple SQL Injection, Permanent XSS, File Inclusion or even information/credentials disclosure could have a huge impact on the World :/

To conclude, I would say that the security/integrity of information websites and others is, as christian1 said months ago, the responsibility of these companies! They must understand that without really strict management of their security, their information could be stolen or replaced by bad people, and they must be held responsible for that since they are making a lot of money from it...

Thursday, May 24 2007

PHP Source Code Security Scanners basic test

For quite a long time now, I've been playing with lots of different black-box tools: commercial or not, mine or not. Months ago, I developed Crystal, a plugin for Grabber which makes the link between the black-box engine in Grabber and a PHP Source Code Security Analyzer: PHP-SAT. At the time, it was the only advanced PHP SCSA I could find on the web, so I used it without really testing it, I admit.
So much for the backstory. A few days ago, on #webappsec (irc.freenode.org), Larholm told me about SWAAT, a new (at least for me) PHP SCSA (and not only PHP, actually). At the time, I didn't have time to try it; but today, I took the time to compare PHP-SAT and SWAAT with a test which can be viewed as a quite-exhaustive-basic-flaw-checker (meaning there are maybe 6 different vulnerabilities, with variants and false positive/true positive check implementations).
You can see the PHP test file here: tests.php

The results of the two runs can be found here: php-sat-test-output.phps and swaat-output.html
How to read the reports:

  • SWAAT: an HTML file with a table for each type of vulnerability; it reports multiple lines (each line is a vulnerability). If there is a /* fase */ in that line, then it is a false positive.
  • PHP-SAT: PHP-SAT takes the PHP source code and transforms it by adding some information. For the vulnerability report, you will have to look for the Malicious Code Vulnerability (MCV). The other reports are more quality oriented.


I will not spend time explaining the differences between the tools, but they don't really have the same goal (even if we can use them for the same purpose). Well, with the default configuration of both tools, SWAAT is really better! But as with many Source Code Security Analyzers, the configuration is really important, so I would temper my conclusion on these tools: I really need to dive into the configuration of both tools and redo the tests.
