Aspirations for the Web are evolving from rich user interfaces and collaboration to open-linked data and more semantic content. Regardless of the data’s origin, a common set of skills is needed to retrieve and parse it for use in other applications. This presentation covers the process of extracting and manipulating data from a variety of sources, from communicating with Web services to developing crawlers and scrapers for more traditional acquisition.

Comments

An extremely thorough dive into HTTP, from the spec up through sockets/streams and the PHP libraries for dealing with markup. Slightly dry, but that's the nature of the topic.

Great talk. We do a lot of web scraping, and this is what I would like my devs to know from the start.

The only suggestion I would make (difficult given time and wifi issues) would be some live demos to see web scraping in action.

Great coverage of HTTP. It gives a lot to consider for web scraping services.

Nice talk. Anyone interested in web services should see/hear this first. I have personally used cURL for scraping, but this presentation showed me other methods that I need to read up on and try implementing.

If it's OK with Matt, I would like to send a copy of this to a couple of my friends as a starting reference for writing web services.

Probably the most directly applicable session I've been to. I would have liked to see some demos, but the session was too short for that to work well. Next time, make it a three-hour tutorial.