Previously, in the Python version of Grabber, I used a BFS crawler. That is good for scanning all the code (as long as the parsers are not too dumb). The problem with this kind of crawler is that it is totally inefficient: the problems are not spread evenly across a site, so crawling everything in discovery order wastes time on pages that matter little.

Starting from this assumption, I tried to rate what is actually important and what evidence suggests that a page matters from a security testing point of view. So the architecture of the crawler is simply based on a priority queue, and the priority is, for now, based on obvious reasoning which may be wrong: the script I prefer to test is the one whose form is submitted via POST and whose action goes over HTTPS (and so on for the rest). Lower values get tested first, which gives something like this:

  priority <- 30
  If Form Then
    priority <- 10
    If Method = POST Then
      priority <- 5
  Else If Anchor Then
    If GET Variables Then // e.g. index.php?foo=plop, as opposed to index.php
      priority <- 20
  If HTTPS Communication for {Form action or Anchor URL} Then
    priority <- priority / 2

This is fairly incomplete work and kind of dumb, but at least it's unbiased for a set of URLs.
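
For the curious, here is a minimal Python sketch of how the rating above could feed a priority queue. It is not the actual Grabber code: `Target`, `priority` and `Frontier` are names I made up for the illustration, anything that is not a form is treated as an anchor, and "has GET variables" is approximated by "the URL has a query string".

```python
import heapq
from dataclasses import dataclass
from itertools import count
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Target:
    """Hypothetical record for something the parser found on a page."""
    url: str                # form action or anchor href
    is_form: bool = False   # False means "anchor" in this sketch
    method: str = "GET"     # only meaningful for forms


def priority(target: Target) -> float:
    """Rate a target following the pseudocode; lower means 'test sooner'."""
    parts = urlsplit(target.url)
    score = 30.0
    if target.is_form:
        score = 10.0
        if target.method.upper() == "POST":
            score = 5.0
    elif parts.query:       # anchor with GET variables, e.g. index.php?foo=plop
        score = 20.0
    if parts.scheme == "https":
        score /= 2
    return score


class Frontier:
    """Priority queue of targets; each distinct target is queued only once."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._order = count()   # tie-breaker: equal scores pop in insertion order

    def push(self, target: Target) -> None:
        if target in self._seen:
            return
        self._seen.add(target)
        heapq.heappush(self._heap, (priority(target), next(self._order), target))

    def pop(self) -> Target:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    frontier = Frontier()
    frontier.push(Target("http://example.com/index.php"))
    frontier.push(Target("http://example.com/index.php?foo=plop"))
    frontier.push(Target("https://example.com/login.php", is_form=True, method="POST"))
    while frontier:
        target = frontier.pop()
        print(priority(target), target.url)
```

With these three made-up URLs, the HTTPS POST form comes out first, then the anchor with GET variables, then the plain anchor. The counter in the heap entries is only there so that two targets with the same score never have to be compared with each other.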