Scraping Contracts, Digging Dfid

In an article on the Guardian Data Blog, Claire Provost outlined how the recent furore over consultancy spending by the Department for International Development (Dfid) should not be about turning the aid tap off but about making aid work for the recipient country. One way to promote development in recipient countries is to untie aid, allowing companies and consultancies in developing countries to win contracts for work at home. That, in turn, grows local industry and promotes local expertise.

To look at this angle I scraped the Dfid contracts from Contracts Finder and looked at which contracts were won by UK companies. The article has the data, but ScraperWiki has the code for any of you interested in digging up contracts.

First, scrape all the links to the individual contracts from the search results page. Here is the one for Dfid. Click on “Copy” to get your own and change the “search_page” variable to the URL of your search. To get every URL in one pass, change the page size in the URL so that all the results appear on a single page.
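If you would rather see the idea before opening the scraper, here is a minimal sketch of the link-collecting step. It is not the ScraperWiki code: it uses Python's built-in html.parser instead of a scraping library so that it runs without any installs, and the sample HTML, the example.org address and the "/contract/" marker are all made up for illustration. Check the real search results page to see what its contract links look like.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs of <a href> links whose href contains a marker."""

    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url = base_url
        self.marker = marker
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href:
            url = urljoin(self.base_url, href)  # resolve relative links
            if url not in self.links:           # skip duplicates, keep order
                self.links.append(url)


def contract_links(html, base_url, marker="/contract/"):
    """Return the contract links found in one page of search results."""
    parser = LinkCollector(base_url, marker)
    parser.feed(html)
    return parser.links


# Hypothetical search results page; a real one would be downloaded
# from the URL held in search_page.
search_page = "https://example.org/search?pagesize=100"
sample_html = """
<ul>
  <li><a href="/contract/101">Contract A</a></li>
  <li><a href="/contract/102">Contract B</a></li>
  <li><a href="/help">Help</a></li>
  <li><a href="/contract/101">Contract A again</a></li>
</ul>
"""

links = contract_links(sample_html, search_page)
for link in links:
    print(link)
```

In a real run you would download the page at search_page and pass its HTML to contract_links in place of sample_html.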

Next, feed the URLs collected by your search results scraper into a second scraper, which visits each contract page, fetches the HTML and pulls out the necessary information. Here is the one for Dfid. I have used the HTML scraping library BeautifulSoup. You can find the documentation here.
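To give a feel for the second step, here is a rough sketch of pulling fields out of a contract page and checking whether the supplier is in the UK. Again, this is not the Dfid scraper itself: it swaps BeautifulSoup for Python's built-in html.parser so it runs as is, and it assumes each page lists its details as label/value pairs in a definition list. The "Supplier country" label and the sample pages are invented, so adapt the names to whatever the real pages contain.

```python
from html.parser import HTMLParser


class ContractDetails(HTMLParser):
    """Read <dt>label</dt><dd>value</dd> pairs into a dictionary."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._tag = None    # "dt" or "dd" while we are inside one
        self._label = None  # most recent <dt> text awaiting its <dd>
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag
            self._text = []

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        # Collapse runs of whitespace and trim the ends.
        text = " ".join("".join(self._text).split())
        if tag == "dt":
            self._label = text
        elif self._label is not None:
            self.record[self._label] = text
            self._label = None
        self._tag = None


def parse_contract(html):
    """Return the label/value pairs found in one contract page."""
    parser = ContractDetails()
    parser.feed(html)
    return parser.record


def is_uk_supplier(record):
    """True if the (hypothetical) 'Supplier country' field names the UK."""
    country = record.get("Supplier country", "").lower()
    return country in {"uk", "united kingdom"}


# Hypothetical contract pages standing in for the downloaded HTML.
sample_pages = [
    "<dl><dt>Title</dt><dd>Road survey</dd>"
    "<dt>Supplier country</dt><dd>United Kingdom</dd></dl>",
    "<dl><dt>Title</dt><dd>Water study</dd>"
    "<dt>Supplier country</dt><dd> Kenya </dd></dl>",
]

records = [parse_contract(page) for page in sample_pages]
uk_count = sum(is_uk_supplier(record) for record in records)
print(uk_count, "of", len(records), "contracts went to UK suppliers")
```

With BeautifulSoup the parsing class shrinks to a few lines, but the shape of the job stays the same: one dictionary per contract page, then a count over the list.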

So take a look at the source, take a look at the code and take a look at the documentation. Open data and open up the news.
