3

I am trying to scrape some data from a website, with very little success. Basically, there is a route overlaid on Google Maps, and whenever you mouse over specific sections of the map (about 200 in all), it fetches 7 fields from a database and displays them on screen. Doing a single map manually would take about 30 minutes and be quite imprecise. There are about 10,000 map routes I want to scrape, so doing it manually is not realistic. Is there a relatively straightforward way of automating this process?

The Music
  • 31
  • 2

1 Answer

3

I've been building web scrapers for over 5 years, and I have to say that a web scraper is rarely "relatively straightforward" simply because of how idiosyncratic each website is. It usually takes a minimum of 10 hours per site to code your scraper if you know what you're doing.

Whenever you have to interact with the page that you're scraping, I recommend Selenium. It's open source and works with most major languages, including Python, Java, and Scala. Moving a mouse around is possible, but I think in your case it might be easier to directly call the JavaScript that is triggered by the mouse movements. Your web scraper would iterate over all of the hoverable HTML elements and call the on-hover JavaScript on each element. However, the devil is in the details, and you're going to need to post a lot of questions on Stack Overflow and do a lot of Googling before you get to a final solution.
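To make the "iterate and trigger the hover handler" idea concrete, here is a rough Python sketch. It is only an illustration under several assumptions: that the hoverable sections are real DOM elements reachable by a CSS selector (map overlays are sometimes drawn on a canvas instead, in which case this will not work as-is), that the 7 fields land in a single panel element, and that the panel updates immediately (a real page that fetches from a database would probably need an explicit wait). The selectors `.route-segment` and `#segment-info`, the URL, and the function names are all hypothetical.

```python
# JavaScript that fires a synthetic 'mouseover' event on the element
# Selenium passes in as arguments[0].
FIRE_MOUSEOVER = """
arguments[0].dispatchEvent(
    new MouseEvent('mouseover', {bubbles: true, cancelable: true, view: window})
);
"""


def scrape_hover_fields(driver, hover_selector, panel_selector):
    """Trigger the hover event on each matching element and collect the panel text.

    `driver` is expected to behave like a Selenium WebDriver, i.e. to provide
    find_elements, find_element, and execute_script.
    """
    rows = []
    # "css selector" is the string value behind Selenium's By.CSS_SELECTOR.
    for element in driver.find_elements("css selector", hover_selector):
        driver.execute_script(FIRE_MOUSEOVER, element)
        # Assumption: the page writes the fetched fields into one panel element.
        panel = driver.find_element("css selector", panel_selector)
        rows.append(panel.text)
    return rows


def main():
    # Imported here so the helper above does not depend on Selenium being installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get("https://example.com/route/1")  # hypothetical route URL
        for row in scrape_hover_fields(driver, ".route-segment", "#segment-info"):
            print(row)
    finally:
        driver.quit()
```

Calling `main()` requires Selenium and a Chrome driver to be installed. For 10,000 routes you would loop over the route URLs and write each list of rows to a file rather than printing it.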

I've heard that Google has an API for Google Maps. That might be a lot easier.

Ryan Zotti
  • 4,209
  • 3
  • 21
  • 33