Working around security metrics...
By Romain Wednesday, October 10 2007 - 18:31 UTC - Discussion - Permalink
By Romain Wednesday, October 10 2007 - 18:31 UTC - Discussion - Permalink
I'm not gonna write a long entry about Security Metrics, but since I've been working on this for a couple of weeks now, I have some thoughts. Evaluating the security of a source code is actually pretty hard. Even if I'm sure there is a lot of source code security metrics out there, it's often (I guess) hard to compute. Basically, you will need to know lots of things about the source code then, you need an engine working on the AST , data-flow etc.
This is what I've done for a couple of months, an engine which is working on XML AST, generated by yaxx (this is the same engine that I use to do source code modifications, obfuscations, etc.).
With Vadim Okun, we had the idea of computing the "size" of the security in a source code. The idea is pretty simple and we are aware that this is limited to implementation flaws and not design flaws for now. The "size" of the security is the number of inputs going to sinks.
The inputs have to be taken in the large sense, these are in fact all the variable that are derivate from direct inputs. Here is a simple example of the variable diffusion:
$a = $_GET['foo']; $b = htmlentities($a); echo $b;
We are here counting $a and $b since $b is a modification of $a which is a direct input. We are using the same methodologies for all possible modification (concatenation, cast, etc.).
Once we know these variables, we are counting the ones that are going to sinks. The sinks are a list of function such as 'echo', 'mysql_query', 'fopen', and so on. Our list of sinks is directly coming from the PHP-SAT project. In the previous example, the metric result is 1 since there is only one sink 'echo' where a derivate input is going to.
And here we are, this is a fairly simple (in the idea, not the implementation) way to evaluate the possible security problems that you can have in your source code. We are going to try and evaluate this metric on different open source project (wordpress, joomla, mediawiki etc.). I'm sure this is really incomplete: first because we are only counting the security problems that are coming from inputs but also because it really depends on the programmer (his style of programming).
An other example is available here: smetric.pdf
For the revised version, the first add would be to count the output validation problems. But for that purpose, I need a stronger data-flow analysis which would analyze in function definitions (not done yet). Then, I will be able to trace everything coming from supposed secure sources (databases, resources, local files, etc.) to sinks. Maybe the weight of such flows would be different than the first one (input to sink)...
Comments