
Why the "line of code" is indeed a good metric

When I first learned about source code metrics, I was amazed that people were using the line of code to compare software. To me, it showed a lack of imagination.

At the beginning of the week, I started a small and quick experiment: extracting metrics from the SATE 2008 test cases. This experiment focuses on function-level properties, so I had to extract a couple of metrics for each function.

The line of code was implemented first because it's an easy one to compute, and it also gives an important value if we want to normalize the other metrics. We also decided to introduce the number of "source/sinks" for studying input validation weaknesses later on...
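Just to illustrate the kind of per-function counting involved, here is a minimal sketch in Python. It is not the actual extraction tooling (which would need a real C parser and pre-processing); the function_metrics helper and the sample snippet are made up, and the McCabe estimate is just "decision points + 1".

```python
import re

# Hypothetical, simplified metric extraction for one C function body.
# The real tooling would need a proper parser (and pre-processed C);
# this only illustrates the two counts discussed above.
DECISION_POINTS = re.compile(r"\b(if|for|while|case)\b|&&|\|\|")

def function_metrics(body: str) -> dict:
    """Count non-blank lines and a crude McCabe estimate for a function body."""
    lines = [l for l in body.splitlines()
             if l.strip() and not l.strip().startswith("//")]
    decisions = sum(len(DECISION_POINTS.findall(l)) for l in lines)
    # Cyclomatic complexity is roughly the number of decision points + 1
    return {"loc": len(lines), "mccabe": decisions + 1}

example = """
int check(int x) {
    if (x > 0 && x < 10) {
        for (int i = 0; i < x; i++) { /* work */ }
        return 1;
    }
    return 0;
}
"""
print(function_metrics(example))  # {'loc': 7, 'mccabe': 4}
```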

Anyway, after running some statistics on the output results, I was amazed to observe that the correlation between McCabe complexity and line of code was never less than 0.90 (which can be read as a correlation rate of about 90%). I have to say, though, that there are huge limitations in the parsers we use to extract the information; for instance, the C code is not pre-processed, etc. Also, this result only holds for the C test cases; the average observed correlation in the Java test cases is around 0.60...
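For anyone who wants to reproduce this kind of check, here is a small sketch (Python with numpy, and entirely made-up numbers rather than the SATE data) of computing the Pearson correlation coefficient between the two per-function metrics:

```python
import numpy as np

# Entirely made-up per-function values, only to show the computation;
# the real numbers come from the SATE 2008 extraction described above.
loc    = np.array([12.0, 40.0, 7.0, 95.0, 23.0, 60.0])
mccabe = np.array([ 2.0,  9.0, 1.0, 21.0,  4.0, 13.0])

# Pearson correlation coefficient; np.corrcoef returns the 2x2 matrix.
r = np.corrcoef(loc, mccabe)[0, 1]
print(f"Pearson r between LOC and McCabe: {r:.2f}")
```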

Of course, further statistical analysis will be necessary to conclude anything on this subject. If we were unlucky with the test case selection, that could have been a source of the problem, but I don't think we were. Actually, it seems quite logical that these metrics are related: the longer the code is, the more tests, loops, etc. it can contain, so there is indeed a better chance that a longer piece of code contains more cycles :)

Oh well, I'll keep writing about this, especially since I expect to get results pretty soon...
