Author Chapagain, Anish, author.

Title Hands-on web scraping with Python : extract quality data from the web using effective Python techniques / Anish Chapagain. [O'Reilly electronic resource]

Edition Second edition.
Publication Info. Birmingham, UK : Packt Publishing Ltd., 2023.
Description 1 online resource (324 pages) : illustrations
Bibliography Includes bibliographical references and index.
Summary Web scraping is a powerful tool for extracting data from the web, but it can be daunting for those without a technical background. Designed for novices, this book will help you grasp the fundamentals of web scraping and Python programming, even if you have no prior experience. Adopting a practical, hands-on approach, this updated edition of Hands-On Web Scraping with Python uses real-world examples and exercises to explain key concepts. Starting with an introduction to web scraping fundamentals and Python programming, you'll cover a range of scraping techniques, including requests, lxml, pyquery, Scrapy, and Beautiful Soup. You'll also get to grips with advanced topics such as secure web handling, web APIs, Selenium for web scraping, PDF extraction, regex, data analysis, EDA reports, visualization, and machine learning. This book emphasizes the importance of learning by doing. Each chapter integrates examples that demonstrate practical techniques and related skills. By the end of this book, you'll be equipped with the skills to extract data from websites, a solid understanding of web scraping and Python programming, and the confidence to use these skills in your projects for analysis, visualization, and information discovery.
Contents Cover -- Title page -- Copyright and Credits -- Contributors -- Table of Contents -- Preface -- Part 1: Python and Web Scraping -- Chapter 1: Web Scraping Fundamentals -- Technical requirements -- What is web scraping? -- Understanding the latest web technologies -- HTTP -- HTML -- XML -- JavaScript -- CSS -- Data-finding techniques used in web pages -- HTML source page -- Developer tools -- Summary -- Further reading -- Chapter 2: Python Programming for Data and Web -- Technical requirements -- Why Python (for web scraping)? -- Accessing the WWW with Python -- Setting things up --
Creating a virtual environment -- Installing libraries -- Loading URLs -- URL handling and operations -- requests -- Python library -- Implementing HTTP methods -- GET -- POST -- Summary -- Further reading -- Part 2: Beginning Web Scraping -- Chapter 3: Searching and Processing Web Documents -- Technical requirements -- Introducing XPath and CSS selectors to process markup documents -- The Document Object Model (DOM) -- XPath -- CSS selectors -- Using web browser DevTools to access web content -- HTML elements and DOM navigation -- XPath and CSS selectors using DevTools --
Scraping using lxml -- a Python library -- lxml by example -- Web scraping using lxml -- Parsing robots.txt and sitemap.xml -- The robots.txt file -- Sitemaps -- Summary -- Further reading -- Chapter 4: Scraping Using PyQuery, a jQuery-Like Library for Python -- Technical requirements -- PyQuery overview -- Introducing jQuery -- Exploring PyQuery -- Installing PyQuery -- Loading a web URL -- Element traversing, attributes, and pseudo-classes -- Iterating using PyQuery -- Web scraping using PyQuery -- Example 1 -- scraping book details -- Example 2 -- sitemap to CSV --
Example 3 -- scraping quotes with author details -- Summary -- Further reading -- Chapter 5: Scraping the Web with Scrapy and Beautiful Soup -- Technical requirements -- Web parsing using Python -- Introducing Beautiful Soup -- Installing Beautiful Soup -- Exploring Beautiful Soup -- Web scraping using Beautiful Soup -- Web scraping using Scrapy -- Setting up a project -- Creating an item -- Implementing the spider -- Exporting data -- Deploying a web crawler -- Summary -- Further reading -- Part 3: Advanced Scraping Concepts -- Chapter 6: Working with the Secure Web -- Technical requirements --
Exploring secure web content -- Form processing -- Cookies and sessions -- User authentication -- HTML processing using Python -- User authentication and cookies -- Using proxies -- Summary -- Further reading -- Chapter 7: Data Extraction Using Web APIs -- Technical requirements -- Introduction to web APIs -- Types of API -- Benefits of web APIs -- Data formats and patterns in APIs -- Example 1 -- sunrise and sunset -- Example 2 -- GitHub emojis -- Example 3 -- Open Library -- Web scraping using APIs -- Example 1 -- holidays from the US calendar -- Example 2 -- Open Library book details --
Example 3 -- US cities and time zones
Subject Data mining.
Python (Computer program language)
Exploration de données (Informatique)
Python (Langage de programmation)
Other Form: Print version: 1837636214 9781837636211 (OCoLC)1393525761
ISBN 9781837638512 electronic book
1837638519 electronic book