Introduction

With data-scraper you can scrape any website without inspecting web elements or parsing HTML with Beautiful Soup or similar tools. You give it URLs as input and get JSON data as output. First you train the scraper for a particular website, then you run it.

It uses Selenium for browser automation, and its built-in functions are easy to use.

Installation/Usage:

You can find this package on PyPI (see here).

Install it with: pip install data-scraper

Import data-scraper

from data_scraper import *

Train

It takes the URLs of two similar pages to train the scraper and returns a dict with two keys:

status: True or False

id: id of the scraper

Here is the code:

scraper.train(link1, link2)

Parameters

link1 (str) – First link to train the scraper on
link2 (str) – Link to a page similar to link1

Returns

{"status": True, "id": "Adfef343JDJSDJ"}

Return type

dict
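
The returned dict can be checked before the id is reused. Below is a minimal sketch that relies only on the response shape documented above. The helper name get_scraper_id is hypothetical and not part of data-scraper, and treating a False status as a failed training is an assumption. The documented example response stands in for a real call so the snippet runs offline.

```python
def get_scraper_id(response):
    """Return the scraper id from a train() response, or raise if status is falsy.

    Hypothetical helper, not part of data-scraper. It relies only on the
    documented response shape: {"status": True or False, "id": "<scraper id>"}.
    Assumption: a False status means training did not succeed.
    """
    if not response.get("status"):
        raise RuntimeError("Training failed: %r" % (response,))
    return response["id"]


# With the real package this would be:
#   from data_scraper import *
#   response = scraper.train(link1, link2)
# Here the documented example response is used as a stand-in.
response = {"status": True, "id": "Adfef343JDJSDJ"}
scraper_id = get_scraper_id(response)
print(scraper_id)
```

Keeping the check in one place means the id is only passed on when training reported a True status.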

Run

It runs the trained scraper on a link and returns the scraped data.

Here is the code:

scraper.run(link1, id='Adfef343JDJSDJ')

Parameters

link1 (str) – Link of the page to scrape
id (str) – id of the scraper returned by training

Returns

{"data1": "data1", …}

Return type

dict
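
Since a scraper is trained once and then run by id, the same id can be reused for several similar pages. The sketch below shows one way to do that, assuming only the run(link, id=...) call documented above. The helper scrape_all, the FakeScraper stand-in, and the example.com links are hypothetical and exist only so the snippet runs without a browser or network access.

```python
def scrape_all(scraper, links, scraper_id):
    """Run a trained scraper over several links and collect the results.

    Hypothetical helper, not part of data-scraper. `scraper` is any object
    exposing run(link, id=...) as documented above.
    """
    results = {}
    for link in links:
        results[link] = scraper.run(link, id=scraper_id)
    return results


class FakeScraper:
    """Stand-in for the real scraper so the sketch runs offline (assumption)."""

    def run(self, link, id):
        # Mirrors the documented return shape; the values are placeholders.
        return {"data1": "data1"}


# Placeholder links; with the real package, pass the imported `scraper`
# object instead of FakeScraper().
links = ["https://example.com/page1", "https://example.com/page2"]
results = scrape_all(FakeScraper(), links, scraper_id="Adfef343JDJSDJ")
for link, data in results.items():
    print(link, data)
```

Passing the scraper object in as a parameter keeps the loop independent of how the package is imported.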