Extracting data from the Internet with Scrapy

Israël Hallé

Wednesday 7 March 2018 from 10:00 to 10:45

Talk in English - UK at ConFoo Montreal 2018
Track Name: Fontaine E
Short URL: https://joind.in/talk/31f65

While exposing data to developers through APIs is becoming more common, most of the data on the internet is still available only as raw HTML, often buried in seemingly chaotic tags. This talk is a quick introduction for data scientists on how to politely extract data from a website and store it in a structured database with the Python library Scrapy, and on how to extend Scrapy to fit their specific needs.
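
As a rough illustration of the "polite extraction" and "structured database" ideas in the abstract, here is a minimal sketch that is not taken from the talk itself. It relies on the fact that a Scrapy item pipeline is a plain Python class exposing `open_spider`, `process_item`, and `close_spider`. The setting names in `POLITE_SETTINGS` are real Scrapy settings, but their values, the `SQLitePipeline` class name, the `pages` table, the `items.db` path, and the `url`/`title` item fields are all assumptions made for the example.

```python
import sqlite3

# Settings that keep a crawl polite. The keys are real Scrapy setting names;
# the values are illustrative assumptions, not recommendations from the talk.
POLITE_SETTINGS = {
    "ROBOTSTXT_OBEY": True,        # honour the site's robots.txt
    "DOWNLOAD_DELAY": 1.0,         # seconds to wait between requests to a site
    "AUTOTHROTTLE_ENABLED": True,  # slow down when the server responds slowly
    "USER_AGENT": "example-bot (+https://example.com/contact)",  # hypothetical
}


class SQLitePipeline:
    """Item pipeline that writes each scraped item to a SQLite table."""

    def __init__(self, db_path="items.db"):  # hypothetical default path
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Scrapy calls this once when the spider starts.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT)"
        )

    def process_item(self, item, spider):
        # Scrapy calls this for every item the spider yields.
        # Re-scraping the same URL overwrites the earlier row.
        self.conn.execute(
            "INSERT OR REPLACE INTO pages (url, title) VALUES (?, ?)",
            (item["url"], item["title"]),
        )
        self.conn.commit()
        return item  # pass the item on to any later pipeline

    def close_spider(self, spider):
        # Scrapy calls this once when the spider finishes.
        self.conn.close()
```

In a Scrapy project, a pipeline like this would typically be enabled through the `ITEM_PIPELINES` setting, for example `{"myproject.pipelines.SQLitePipeline": 300}`, where `myproject.pipelines` is a hypothetical module path. Writing pipelines of this kind is one of the ways Scrapy can be extended to fit specific needs.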
