Web Crawling: Scrapy and Elasticsearch
You might think that web crawling and scraping is only for search engines like Google and Bing. But a lot of companies use it for other purposes: price comparison, financial risk information and portals all need a way to get their data, and at least sometimes that way is to retrieve it from a public website. Besides these cases, where the data is not in your hands, scraping can also make sense when the data is already aggregated somewhere. For intranet and portal search engines it can be easier to just scrape the frontend instead of building data import facilities for different, sometimes even old, systems.

The Example
In this post we are looking at a rather artificial example: crawling the meetup.com page for recent meetups to make them available for search. Why artificial? Because meetup.com has an API that provides all the data in a more convenient way. But imagine there is no other way and we would like to build a custom search on this information, possibly adding other event sites as well.
This is a part of the Search Meetup Karlsruhe page that displays the recent meetups.
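To make the idea of scraping such a listing concrete, here is a minimal sketch of pulling meetup entries out of a page and turning them into documents that a search engine could index. It deliberately uses Python's standard-library html.parser instead of Scrapy so that it runs without any installation. The HTML snippet, the class names `event` and `date`, and the field names `title`, `url` and `date` are all invented for illustration; the real meetup.com markup is not shown here and will differ.

```python
from html.parser import HTMLParser
import json

# Hypothetical markup: an assumption for this sketch, not the real meetup.com page.
SAMPLE_HTML = """
<ul>
  <li class="event"><a href="/events/1/">Example meetup A</a><span class="date">2000-01-01</span></li>
  <li class="event"><a href="/events/2/">Example meetup B</a><span class="date">2000-02-01</span></li>
</ul>
"""


class MeetupParser(HTMLParser):
    """Collects one dict per <li class="event"> element."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._current = None  # event currently being filled, if any
        self._field = None    # field that incoming text belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "event":
            self._current = {"title": "", "url": "", "date": ""}
        elif self._current is not None and tag == "a":
            self._current["url"] = attrs.get("href", "")
            self._field = "title"
        elif (self._current is not None and tag == "span"
              and attrs.get("class") == "date"):
            self._field = "date"

    def handle_data(self, data):
        # Only keep text that sits inside a link or a date span of an event.
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None
        elif tag == "li" and self._current is not None:
            self.events.append(self._current)
            self._current = None


def extract_events(html):
    """Return a list of event dicts found in the given HTML string."""
    parser = MeetupParser()
    parser.feed(html)
    parser.close()
    return parser.events


events = extract_events(SAMPLE_HTML)
print(json.dumps(events, indent=2))
```

Each resulting dict is a flat JSON document, which is the shape a search index typically expects. A crawling framework adds what this sketch leaves out: fetching pages, following links and scheduling requests.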
See on blog.florian-hopf.de