Extracting Text from a PDF with Python: PyPDF2 Usage Notes

Aug 9, 2018

Install PyPDF2
pip install pypdf2

Import PyPDF2 and pprint

import PyPDF2
import pprint

Put the PDF file on the X: drive, name it 123.pdf, and open it with Python

File = open('X:\\123.pdf', 'rb')

r: read mode
b: binary mode
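
The effect of the 'rb' mode can be checked with a small sketch. It is not part of the original notes: the helper name looks_like_pdf and the temporary demo file are made up for illustration. It relies on the fact that PDF files begin with the bytes %PDF-, and it uses a with statement so the file is closed automatically.

```python
import os
import tempfile


def looks_like_pdf(path):
    """Return True if the file at `path` starts with the PDF signature."""
    # 'rb' = read + binary: read() returns bytes, not str.
    with open(path, "rb") as f:  # the file is closed automatically on exit
        header = f.read(5)
    return header == b"%PDF-"


# Demo on a throwaway file rather than the X:\123.pdf path used above.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.pdf")
with open(demo_path, "wb") as f:
    f.write(b"%PDF-1.4\n")

print(looks_like_pdf(demo_path))
```

Calling looks_like_pdf('X:\\123.pdf') before handing the file to PyPDF2 is one cheap way to catch a wrong path or a mislabeled file early.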

Create a PdfFileReader object

PDF = PyPDF2.PdfFileReader(File)

Print all the text in the PDF file
for page in PDF.pages:
    pprint.pprint(page.extractText())
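
If the text is needed as a single string rather than printed page by page, the loop can be wrapped in a few helpers. This is a minimal sketch, not the original author's code: the function names page_text, collect_text, and read_pdf_text are hypothetical. It also takes into account that later PyPDF2 releases (3.0 and up) renamed PdfFileReader to PdfReader and extractText() to extract_text(), so it tries the newer names first and falls back to the ones used in this post.

```python
def page_text(page):
    """Extract text from one page object, whichever method name it offers."""
    # Newer releases name the method extract_text(); the 2018-era API
    # used in this post names it extractText().
    extract = getattr(page, "extract_text", None) or getattr(page, "extractText")
    return extract() or ""


def collect_text(pages):
    """Join the text of all pages, labelling each with its 1-based number."""
    parts = []
    for number, page in enumerate(pages, start=1):
        parts.append(f"--- page {number} ---\n{page_text(page)}")
    return "\n".join(parts)


def read_pdf_text(path):
    """Open `path` and return its text (requires PyPDF2 to be installed)."""
    import PyPDF2  # imported here so the helpers above work without it

    with open(path, "rb") as f:
        # PdfReader is the newer class name; fall back to PdfFileReader.
        reader_cls = getattr(PyPDF2, "PdfReader", None) or PyPDF2.PdfFileReader
        # Extract inside the with block: pages are read from the open file.
        return collect_text(reader_cls(f).pages)
```

With PyPDF2 installed, print(read_pdf_text('X:\\123.pdf')) would print the same text as the loop above, with a marker line before each page. How clean the text is depends on how the PDF was produced; scanned pages without a text layer yield empty strings.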

Written by Yanwei Liu
