Feet in the Soup
By Romain Tuesday, November 7 2006 - 10:31 UTC - Tools - Permalink
Since I'm working on a Web Apps Scanner, I wrote some scripts to automate part of the vulnerability detection. This work would have been a pain without Beautiful Soup.
This library is simply amazing. Here is an example that retrieves every link on a web page:
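import urllib
from BeautifulSoup import BeautifulSoup
# Download the page and parse it
htmlContent = urllib.urlopen("http://rgaucher.info/").read()
soup = BeautifulSoup(htmlContent)

# Print the target of every link, skipping anchors that have no href
for a in soup.fetch('a'):
    href = a.get('href')
    if href:
        print href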
And because it only does HTML/XML parsing, it's quite easy to combine it with the standard modules that deal with cookies, proxies, etc. (cookielib & urllib)!
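To illustrate that separation, here is a minimal sketch of the fetching side only, not the actual scripts from the scanner. It assumes Python 3, where cookielib is named http.cookiejar and the opener machinery lives in urllib.request. The helper name make_opener and the proxy address in the usage note are hypothetical, chosen for illustration.

```python
import http.cookiejar
import urllib.request


def make_opener(proxy_url=None):
    """Build a URL opener that keeps cookies and optionally goes through a proxy.

    make_opener is a hypothetical helper name, not part of any library.
    Returns the opener together with the cookie jar it fills.
    """
    # The jar collects cookies set by responses and sends them back later
    jar = http.cookiejar.CookieJar()
    handlers = [urllib.request.HTTPCookieProcessor(jar)]

    if proxy_url:
        # Route both HTTP and HTTPS requests through the given proxy
        handlers.append(
            urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
        )

    return urllib.request.build_opener(*handlers), jar
```

With something like this, opener.open("http://rgaucher.info/").read() would return the page content, which can then be handed to Beautiful Soup exactly as in the link example, for instance after calling make_opener("http://127.0.0.1:8080") to go through a local proxy. The parser never needs to know about the cookies or the proxy.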