
Python Web Scraping Study Notes (2): Selenium Automation + Katalon Recorder

Yanwei Liu

Dec 24, 2018

Update 2021/06/17: how to solve Captcha verification codes

How to Solve and Bypass the Captcha Checks Common on Web Pages, Using the 2Captcha API as an Example

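2Captcha is a very powerful CAPTCHA recognition service. In everyday use, logging in to a website (for example, the AWS account sign-in page) may bring up a prompt that requires typing a verification code by hand. Some codes are simply a combination of English letters and digits, but others are extremely complex, with distorted fonts and colors that often leave the user…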

yanwei-liu.medium.com

Update 2021/01/24: how to use Selenium together with pandas to extract HTML table data:

Extracting Table data using Selenium and Python into pandas dataframe

so I have done data extract from a table using library BeautifulSoup with code below: if soup.find("table"…

stackoverflow.com

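from io import StringIO
import time

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
search_term = "DCD710S2"
base_url = "https://www.lowes.com/search?searchTerm=" + search_term
driver.get(base_url)
time.sleep(3)  # give the page time to render its JavaScript content
html = driver.page_source
driver.quit()  # close the browser once the HTML has been captured

# Parse the rendered HTML and select the specifications container
soup = BeautifulSoup(html, "html.parser")
div = soup.select_one("div#collapseSpecs")

# read_html returns one DataFrame per <table> found inside the container;
# StringIO avoids the deprecation warning for passing literal HTML strings
tables = pd.read_html(StringIO(str(div)))
result = pd.concat([tables[0], tables[1]], ignore_index=True)
print(result)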
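The snippet above can also be wrapped into reusable functions. The following is only a minimal sketch under stated assumptions: the helper names `build_search_url` and `fetch_spec_tables` are hypothetical, the page is assumed to still expose `div#collapseSpecs`, and an explicit `WebDriverWait` is swapped in for the fixed `time.sleep(3)`. Unlike the snippet, it concatenates every table found in the container rather than only the first two.

```python
from io import StringIO
from urllib.parse import urlencode

SEARCH_URL = "https://www.lowes.com/search"  # same endpoint as the snippet above


def build_search_url(term):
    """Return the search URL for a term, encoding spaces and special characters."""
    return SEARCH_URL + "?" + urlencode({"searchTerm": term})


def fetch_spec_tables(driver, term, timeout=10):
    """Load the search page and return the spec tables as one DataFrame.

    Hypothetical helper: assumes the page still contains div#collapseSpecs.
    """
    # Imported lazily so build_search_url works without Selenium or pandas installed.
    import pandas as pd
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(build_search_url(term))
    # Wait until the specs container is present instead of sleeping a fixed time.
    div = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "div#collapseSpecs"))
    )
    # outerHTML is the container's markup; read_html yields one DataFrame per table.
    tables = pd.read_html(StringIO(div.get_attribute("outerHTML")))
    return pd.concat(tables, ignore_index=True)


print(build_search_url("DCD710S2"))
```

With a running driver, `fetch_spec_tables(driver, "DCD710S2")` would play the role of the script above; building the URL with `urlencode` also keeps search terms containing spaces valid.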

Update 2021/01/24: running Selenium on a server without a GUI (using aiForge as an example):
Adapted from this reference, with modifications.

# Install chromium-browser
sudo apt-get install chromium-browser
sudo dpkg --configure -a
sudo apt-get install -f
chromium-browser --version  # note the version number; the matching driver is downloaded later

# Install Selenium and xvfb
pip install selenium
sudo apt-get install xvfb

# Download chromedriver and extract it to /usr/bin
wget…
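The steps above install xvfb, which gives Chrome a virtual display. As an alternative technique (not the one these notes describe), Chrome can run in its own headless mode, which needs no display at all. The following is a rough sketch assuming Selenium 4 and a chromedriver binary at `/usr/bin/chromedriver`; that path is a guess based on the "extract it to /usr/bin" step, and the function names are hypothetical.

```python
def headless_chrome_args(window_size=(1920, 1080)):
    """Return Chrome switches commonly used on a server without a display."""
    width, height = window_size
    return [
        "--headless",               # render pages without opening a window
        "--no-sandbox",             # often needed when running as root in a container
        "--disable-dev-shm-usage",  # avoid crashes when /dev/shm is small
        f"--window-size={width},{height}",
    ]


def make_headless_driver(driver_path="/usr/bin/chromedriver"):
    """Start Chrome through Selenium 4 with the switches above.

    driver_path is an assumption, not a value taken from the notes.
    """
    # Imported lazily so headless_chrome_args can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for arg in headless_chrome_args():
        options.add_argument(arg)
    return webdriver.Chrome(service=Service(driver_path), options=options)


print(headless_chrome_args())
```

Calling `make_headless_driver()` would return a driver usable like the `webdriver.Chrome()` instance in the pandas example; whether headless mode or xvfb suits a given site better depends on how that site treats headless browsers.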
