【Posted】:2020-05-18 14:53:18
【Question】:
I am trying to extract text from a PDF with PyPDF2 and translate it with TextBlob.
import PyPDF2 as pdf
from docx import Document
from textblob import TextBlob

Arquivo = 'teste.pdf'
lgout = input('\nPara qual língua traduzir? ex: pt, en, es: ')
lgin = input('\nQual língua é o documento? ex: pt, en, es: ')

with open(Arquivo, mode='rb') as f:
    reader = pdf.PdfFileReader(f)
    npages = int(reader.numPages) -1
    ret = 0
    while ret <= npages:
        page = reader.getPage(ret)
        pagext = str(page.extractText())
        blob = TextBlob(pagext)
        text_trans = (blob.translate(from_lang=lgin,to = lgout))
        doc = Document()
        doc.add_paragraph(str(text_trans))
        doc.save('Doc teste' + str(ret) + '.docx')
        ret +=1
    else:
        print("Documento convertido")
But when I run the script, I get this error:
Traceback (most recent call last):
  File "/Users/Pedrovhz/Desktop/Estudos/Python/Python Translator/tradutor_pdf.py", line 18, in <module>
    text_trans = (blob.translate(from_lang=lginout,to = lgoutpu))
  File "/anaconda3/lib/python3.7/site-packages/textblob/blob.py", line 547, in translate
    from_lang=from_lang, to_lang=to))
  File "/anaconda3/lib/python3.7/site-packages/textblob/translate.py", line 61, in translate
    self._validate_translation(source, result)
  File "/anaconda3/lib/python3.7/site-packages/textblob/translate.py", line 85, in _validate_translation
    raise NotTranslated('Translation API returned the input string unchanged.')
textblob.exceptions.NotTranslated: Translation API returned the input string unchanged.
I don't know what I'm doing wrong. Thanks for the help!
【Discussion】:
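The traceback says the translation step got back the same string it sent. Two plausible causes (guesses, not confirmed by the post) are a page whose extracted text is empty or whitespace, and a source language equal to the target language. A minimal sketch of a guard around the translation call follows. The names `translate_page`, `translate`, and `not_translated_exc` are hypothetical, and the translator is passed in as a plain callable so the sketch runs without network access or TextBlob installed.

```python
def translate_page(text, translate, from_lang, to_lang, not_translated_exc):
    """Return the translated text, or the original text when translation
    is skipped or reports that nothing changed.

    `translate` is a hypothetical stand-in: any callable that takes
    (text, from_lang, to_lang) and returns the translated text.
    `not_translated_exc` is the exception type that callable raises when
    the result equals the input.
    """
    stripped = text.strip()

    # Assumption: a page can yield empty or whitespace-only text
    # (for example a scanned page), so there is nothing to translate.
    if not stripped:
        return text

    # Assumption: asking for the same source and target language
    # would only give the input back.
    if from_lang == to_lang:
        return text

    try:
        return str(translate(stripped, from_lang, to_lang))
    except not_translated_exc:
        # Keep the original text instead of aborting the whole document.
        return text
```

With the script from the question this might be called as `translate_page(pagext, lambda t, src, dst: TextBlob(t).translate(from_lang=src, to=dst), lgin, lgout, NotTranslated)`, with `NotTranslated` imported from `textblob.exceptions` as shown in the traceback. That assumes a TextBlob version that still provides `translate()`.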
Tags: python anaconda pypdf2 textblob