Python Article Scraping: newspaper3k Usage Notes

#20190717 update: Python Web Scraping Study Notes (5): Scraping News with newspaper3k

Install newspaper3k

pip install newspaper3k

Fetch the article URLs found on the CNN site


>>> import newspaper
>>> cnn_paper = newspaper.build('http://cnn.com')
>>> for article in cnn_paper.articles:
...     print(article.url)
...
http://www.cnn.com/2013/11/27/justice/tucson-arizona-captive-girls/
http://www.cnn.com/2013/12/11/us/texas-teen-dwi-wreck/index.html
...
...
...
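
The loop above prints every URL the source exposes. As a minimal sketch of how the same listing could be capped and de-duplicated, the helper names `collect_article_urls` and `list_site_urls` below are hypothetical and not part of newspaper3k. The sketch also assumes you want `memoize_articles=False`, which asks `newspaper.build` not to skip articles it cached on an earlier run.

```python
def collect_article_urls(source, limit=10):
    """Return up to `limit` unique article URLs, preserving their order.

    `source` is any object with an `.articles` list whose items have a
    `.url` attribute, such as the object returned by newspaper.build().
    """
    seen = set()
    urls = []
    for article in source.articles:
        if len(urls) >= limit:
            break
        url = article.url
        # Skip empty URLs and URLs we have already collected.
        if not url or url in seen:
            continue
        seen.add(url)
        urls.append(url)
    return urls


def list_site_urls(site_url, limit=10):
    """Build a source for `site_url` and return some of its article URLs.

    Hypothetical convenience wrapper; it needs network access.
    """
    import newspaper  # imported lazily; requires `pip install newspaper3k`

    # memoize_articles=False: do not skip articles seen on a previous run.
    source = newspaper.build(site_url, memoize_articles=False)
    return collect_article_urls(source, limit=limit)
```

Calling `list_site_urls('http://cnn.com', limit=5)` should return at most five URLs. The actual results depend on what the site serves at the time.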

Fetch the article category URLs

>>> for category in cnn_paper.category_urls():
...     print(category)
...
http://lifestyle.cnn.com
http://cnn.com/world
http://tech.cnn.com
...
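
The category list mixes subdomains (`lifestyle.cnn.com`) with paths on the main domain (`cnn.com/world`). As a rough sketch of how those URLs could be organized, the hypothetical helper `group_by_host` below groups them by hostname using only the standard library. It takes a plain list of strings, so it can be fed the result of `cnn_paper.category_urls()` or, as here, the three sample URLs shown above.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_host(urls):
    """Group URLs by their lowercase hostname, keeping input order."""
    groups = defaultdict(list)
    for url in urls:
        host = urlparse(url).netloc.lower()
        groups[host].append(url)
    return dict(groups)


# Sample input copied from the category output above.
sample = [
    "http://lifestyle.cnn.com",
    "http://cnn.com/world",
    "http://tech.cnn.com",
]

for host, members in group_by_host(sample).items():
    print(host, members)
```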

Written by

Machine Learning / Deep Learning / Python / Flutter cakeresume.com/yanwei-liu
