Python Web Scraping Study Notes (2): Selenium Automation + Katalon Recorder

Yanwei Liu
17 min read · Dec 24, 2018

Update 2021/06/17: added a method for cracking Captcha verification codes

Update 2021/01/24: how to use Selenium together with pandas to extract HTML table data:

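from io import StringIO

import pandas as pd
import time
from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
model = "DCD710S2"  # product model number to search for
base_url = "https://www.lowes.com/search?searchTerm=" + model
driver.get(base_url)
time.sleep(3)  # crude fixed wait so the page has time to render
html = driver.page_source
driver.quit()  # the browser is no longer needed once the HTML is captured

soup = BeautifulSoup(html, "html.parser")
# Select the <div id="collapseSpecs"> element that holds the tables to extract
div = soup.select_one("div#collapseSpecs")
if div is None:
    raise SystemExit("div#collapseSpecs not found; the page layout may have changed")
# read_html returns one DataFrame per <table>; newer pandas expects a file-like object
tables = pd.read_html(StringIO(str(div)))
# Stack the first two tables into a single DataFrame
result = pd.concat(tables[:2], ignore_index=True)
print(result)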
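
The parsing half of the script above can be tried without launching a browser. The following is a minimal sketch under stated assumptions, not the author's code: `SAMPLE_HTML` is a made-up stand-in for `driver.page_source` (it is not real Lowe's markup), and `extract_tables` is a hypothetical helper name introduced only for illustration. It assumes pandas has an HTML parser backend such as lxml available.

```python
from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for driver.page_source; not real Lowe's markup.
SAMPLE_HTML = """
<html><body>
<div id="collapseSpecs">
  <table>
    <tr><th>Spec</th><th>Value</th></tr>
    <tr><td>Voltage</td><td>12V</td></tr>
  </table>
  <table>
    <tr><th>Spec</th><th>Value</th></tr>
    <tr><td>Weight</td><td>2.4 lb</td></tr>
  </table>
</div>
</body></html>
"""


def extract_tables(html: str, selector: str) -> pd.DataFrame:
    """Stack every <table> under the first element matching `selector`."""
    node = BeautifulSoup(html, "html.parser").select_one(selector)
    if node is None:
        raise ValueError(f"no element matches {selector!r}")
    # One DataFrame per <table>; StringIO avoids passing literal HTML as a string
    tables = pd.read_html(StringIO(str(node)))
    return pd.concat(tables, ignore_index=True)


result = extract_tables(SAMPLE_HTML, "div#collapseSpecs")
print(result)
```

Separating the parsing into a function that takes an HTML string makes it possible to check the selector and the table handling against a saved page before involving Selenium at all.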

Update 2021/01/24: running Selenium on a server with no GUI (using aiForge as an example):
Adapted from this reference article

# Install chromium-browser
sudo apt-get install chromium-browser
sudo dpkg --configure -a
sudo apt-get install -f
chromium-browser --version  # note the version number; the matching driver is downloaded next
# Install Selenium and xvfb
pip install selenium
sudo apt-get install xvfb
# Download chromedriver and extract it to /usr/bin
wget…
