Scraping News Articles

Overview

Web scraping is a software technique for extracting information from websites. It mostly focuses on transforming unstructured data on the web (HTML) into structured data (a database or spreadsheet).
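As a rough illustration of that HTML-to-structured-data step, here is a minimal sketch using only the Python standard library. The markup in `SAMPLE_HTML`, the `ArticleParser` class, and the `title`/`summary` field names are all hypothetical and chosen for the example; they are not taken from this project's code or from any real news site.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup: real news pages will use different tags and classes.
SAMPLE_HTML = """
<div class="article"><h2>First headline</h2><p>First summary.</p></div>
<div class="article"><h2>Second headline</h2><p>Second summary.</p></div>
"""


class ArticleParser(HTMLParser):
    """Collect one {'title', 'summary'} row per <div class="article">."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "div" and ("class", "article") in attrs:
            # A new article block starts a new structured row.
            self.rows.append({"title": "", "summary": ""})
        elif tag == "h2" and self.rows:
            self._field = "title"
        elif tag == "p" and self.rows:
            self._field = "summary"

    def handle_endtag(self, tag):
        if tag in ("h2", "p"):
            self._field = None

    def handle_data(self, data):
        # Only keep text that sits inside a field we are tracking.
        if self._field and self.rows:
            self.rows[-1][self._field] += data.strip()


def html_to_rows(html):
    """Parse HTML text into a list of dict rows."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows as CSV text (spreadsheet-friendly)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "summary"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = html_to_rows(SAMPLE_HTML)
csv_text = rows_to_csv(rows)
```

In a real scraper the HTML would come from an HTTP response rather than an inline string, and the tag and class checks would need to match the target site's actual markup.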

Analysis

This project scrapes 500 Hindi news articles from the Jagaran Newspaper website.

Commit message style

  • Docs: For documentation changes/updates
  • Gather: For wrangling process - reading/gathering
  • Assess: For wrangling process - assessing
  • Clean: For wrangling process - cleaning quality and tidiness issues; may include test code too
  • Viz: For visualization
  • Refactor: For refactoring existing code
  • Chore: For package manager changes
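
The prefixes above could be checked automatically, for example in a commit hook. The following is a minimal sketch that assumes messages take the form `Prefix: description`; that exact format and the helper name `has_valid_prefix` are assumptions for illustration, not something this repository defines.

```python
# Prefixes taken from the list above.
ALLOWED_PREFIXES = ("Docs", "Gather", "Assess", "Clean", "Viz", "Refactor", "Chore")


def has_valid_prefix(message):
    """Return True if message looks like '<Prefix>: <non-empty description>'.

    Assumes a colon separates the prefix from the description.
    """
    prefix, sep, rest = message.partition(":")
    return bool(sep) and prefix in ALLOWED_PREFIXES and bool(rest.strip())


print(has_valid_prefix("Gather: fetch article links"))
print(has_valid_prefix("fix typo"))
```

Under these assumptions the first call accepts the message and the second rejects it because it has no recognized prefix.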