The client ran an e-commerce business and wanted to generate leads from bol.com, a webshop in the Netherlands. Specifically, they wanted to scrape the details of every seller that met certain criteria, such as the number of reviews, the number of SKUs, and the rating.
Opening and checking every seller profile manually would be nearly impossible because of the huge number of sellers on bol.com.
To meet this need, we developed a tool called Bol Scraper, which automates the whole process of going through the pages of the e-commerce website and extracting the seller details the client needs. Bol Scraper is a GUI-based tool, so once it is delivered, even a user without much technical knowledge can operate it and change the parameters used to filter sellers (such as the number of reviews, SKUs, and rating). Through the UI, the user can either select a single category to scrape or scrape all categories at once.
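The filtering step described above can be illustrated with a minimal sketch. The names `Seller`, `FilterCriteria`, and the field names are hypothetical, and treating each parameter as a minimum threshold is an assumption; the actual tool may compare values differently.

```python
from dataclasses import dataclass


@dataclass
class Seller:
    """Scraped seller details (hypothetical field names)."""
    name: str
    review_count: int
    sku_count: int
    rating: float


@dataclass
class FilterCriteria:
    """User-adjustable parameters, assumed here to be minimum thresholds."""
    min_reviews: int = 0
    min_skus: int = 0
    min_rating: float = 0.0

    def matches(self, seller: Seller) -> bool:
        # A seller qualifies only if it meets every threshold.
        return (
            seller.review_count >= self.min_reviews
            and seller.sku_count >= self.min_skus
            and seller.rating >= self.min_rating
        )


def filter_sellers(sellers, criteria):
    """Return only the sellers that satisfy all criteria."""
    return [s for s in sellers if criteria.matches(s)]
```

For example, `filter_sellers(sellers, FilterCriteria(min_reviews=100, min_skus=50, min_rating=8.0))` would keep only sellers that reach all three thresholds. Keeping the criteria in one small object makes it straightforward to bind each field to an input in the UI.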
We use Scrapy, a Python-based framework, to crawl the pages of the e-commerce website. We also integrated several extensions to avoid being blocked by the bol.com servers, which can happen when too many requests are made in a short period of time.
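The specific extensions are not named here, so the following is only a sketch of how request rate is commonly kept in check using Scrapy's built-in settings. The setting names are real Scrapy settings; the dictionary name `THROTTLE_SETTINGS` and all values are illustrative assumptions, not the project's actual configuration.

```python
# Illustrative values only (assumptions); tune them for the target site.
THROTTLE_SETTINGS = {
    # Wait between requests to the same site, with random jitter.
    "DOWNLOAD_DELAY": 1.0,
    "RANDOMIZE_DOWNLOAD_DELAY": True,
    # Limit parallel requests per domain.
    "CONCURRENT_REQUESTS_PER_DOMAIN": 4,
    # AutoThrottle adapts the delay to the observed server latency.
    "AUTOTHROTTLE_ENABLED": True,
    "AUTOTHROTTLE_START_DELAY": 1.0,
    "AUTOTHROTTLE_MAX_DELAY": 30.0,
    "AUTOTHROTTLE_TARGET_CONCURRENCY": 2.0,
    # Retry responses that typically signal rate limiting or overload.
    "RETRY_ENABLED": True,
    "RETRY_TIMES": 3,
    "RETRY_HTTP_CODES": [429, 500, 502, 503, 504],
}
```

A dictionary like this can be assigned to a spider's `custom_settings` class attribute or merged into the project's `settings.py`.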
As sellers are scraped, those that meet all the criteria are shown in real time in a table in the UI. The user can export the data scraped so far to a CSV file at any point during the scraping process.
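Exporting at any point amounts to writing out a snapshot of the rows collected so far. The sketch below uses Python's standard `csv` module; the function name `export_sellers` and the column names in `FIELDNAMES` are hypothetical, not taken from the tool.

```python
import csv
import io
from typing import Iterable, Mapping, TextIO

# Hypothetical column names for the exported file.
FIELDNAMES = ["name", "review_count", "sku_count", "rating"]


def export_sellers(rows: Iterable[Mapping[str, object]], out: TextIO) -> int:
    """Write the given seller rows as CSV to `out`; return the row count."""
    # extrasaction="ignore" drops any keys that are not in FIELDNAMES.
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Usage: copy the current results first, so scraping can continue
# to append rows while the snapshot is being written.
scraped_so_far = [
    {"name": "Example Shop", "review_count": 150, "sku_count": 60, "rating": 8.5},
]
buffer = io.StringIO()
export_sellers(list(scraped_so_far), buffer)
```

When writing to a real file instead of a `StringIO` buffer, open it with `newline=""`, as the `csv` module documentation recommends.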
Using this scraper, we were able to scrape more than 1000 subcategories from bol.com.
- Thousands of pages can be scraped at once, allowing the client to gather data in a shorter time.
- The scraper can be used for lead generation, helping the client find and approach sellers that match their changing requirements.