Introduction
With data-scraper you can scrape websites without inspecting web elements or parsing HTML yourself with tools such as Beautiful Soup.
With just URLs as input you get JSON data as output.
First you train the scraper on a particular website, then you run it.
It uses Selenium to automate the browser. The built-in functions described below, train and run, are straightforward to use.
Installation/Usage:
You can find this package on PyPI.
Command to install: pip install data-scraper
Import data-scraper
from data_scraper import *
Train
It takes the URLs of two similar pages to train the scraper and returns a dict with two keys:
status: True or False
id: id of the scraper
Here is the code:
- scraper.train(link1, link2)
- Parameters
link1 (str) – First link to train the scraper on
link2 (str) – A link similar to link1
- Returns
{"status": True, "id": "Adfef343JDJSDJ"}
- Return type
dict
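A minimal sketch of handling the training result, assuming that a status of False means training did not succeed (the original only says the value is True or False). The helper name train_or_raise and the example URLs are hypothetical and not part of the package. The train function is passed in as an argument, so the snippet relies only on the call shape and return dict documented above.

```python
from typing import Callable, Dict


def train_or_raise(train_fn: Callable[[str, str], Dict], link1: str, link2: str) -> str:
    """Call a train function and return the scraper id.

    train_fn is expected to behave like scraper.train as documented above:
    it takes two URLs and returns a dict with "status" and "id" keys.
    Assumption: a falsy "status" is treated as a failed training run.
    """
    result = train_fn(link1, link2)
    if not result.get("status"):
        raise RuntimeError(f"Training failed for {link1} and {link2}")
    return result["id"]


# Hypothetical usage with the package (placeholder URLs):
# from data_scraper import *
# scraper_id = train_or_raise(scraper.train,
#                             "https://example.com/item/1",
#                             "https://example.com/item/2")
```

Keeping the returned id is the important part, since it is the value that run expects later.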
Run
It runs a trained scraper on a link and returns the scraped data.
Here is the code:
- scraper.run(link1, id='Adfef343JDJSDJ')
- Parameters
link1 (str) – Link to scrape data from
id (str) – id of the scraper, as returned by training
- Returns
{"data1": "data1", …}
- Return type
dict
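A minimal sketch of running one trained scraper over several pages and serializing the results as JSON. The helper name scrape_all and the example URLs are hypothetical. The run function is passed in as an argument, so the snippet relies only on the documented call shape run(link, id=...) returning a dict.

```python
import json
from typing import Callable, Dict, Iterable


def scrape_all(run_fn: Callable[..., Dict], links: Iterable[str], scraper_id: str) -> Dict[str, Dict]:
    """Run a trained scraper on each link and collect the results keyed by URL.

    run_fn is expected to behave like scraper.run as documented above:
    it takes a link and an id keyword argument and returns a dict.
    """
    results = {}
    for link in links:
        results[link] = run_fn(link, id=scraper_id)
    return results


def to_json(results: Dict[str, Dict]) -> str:
    """Serialize the collected results to an indented JSON string."""
    return json.dumps(results, indent=2)


# Hypothetical usage with the package (placeholder URLs):
# from data_scraper import *
# pages = ["https://example.com/item/3", "https://example.com/item/4"]
# print(to_json(scrape_all(scraper.run, pages, "Adfef343JDJSDJ")))
```

The pages passed to run should have the same layout as the two pages used for training, since the scraper was trained on that structure.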