【Question title】: Export PDF to csv using python (tabula)
【Posted】: 2021-03-16 13:44:05
【Question description】:

When I export a PDF file to csv, I get this error: writeheader() takes 1 positional argument but 2 were given

from tabula import read_pdf
from tabulate import tabulate
import csv

df = read_pdf("asd.pdf")
print(df)


with open('ddd.csv', "w", newline="") as file:
    columns = ['specialty ',"name",'number_of_seats','Total_seats,' "document_type", "concent"]
    writer = csv.DictWriter(file, fieldnames=columns)
    writer.writeheader(df)
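
A note on the error above: `csv.DictWriter.writeheader()` takes no arguments; it writes the header from `fieldnames`, and the data rows go through `writerow()` / `writerows()`. The following is a minimal sketch of that pattern using only the standard library. The `rows` list and the `write_rows` helper are hypothetical stand-ins for the table extracted from the PDF, and the column list assumes the stray comma inside `'Total_seats,'` and the trailing space in `'specialty '` were typos.

```python
import csv
import io

# Column names from the question, assuming 'Total_seats' and 'document_type'
# were meant to be two separate fields (the original comma sat inside the string).
columns = ["specialty", "name", "number_of_seats",
           "Total_seats", "document_type", "concent"]

# Hypothetical stand-in data; in the question this would come from the PDF table.
rows = [
    {"specialty": "Physics", "name": "A", "number_of_seats": 10,
     "Total_seats": 25, "document_type": "original", "concent": "yes"},
]


def write_rows(file, fieldnames, rows):
    """Write a header line followed by one CSV line per dict in rows."""
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()    # no arguments: the header is taken from fieldnames
    writer.writerows(rows)  # the data is written in a separate call


# An in-memory buffer is used here; open('ddd.csv', "w", newline="") works the same way.
buffer = io.StringIO(newline="")
write_rows(buffer, columns, rows)
csv_text = buffer.getvalue()
```

If the data is already a pandas DataFrame, as it is with tabula, `df.to_csv("ddd.csv", index=False)` avoids the `csv` module entirely. Note that recent tabula-py versions appear to return a list of DataFrames from `read_pdf` by default, so `df[0]` may be needed first.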

【Question discussion】:

    Tags: python pdf tabula-py


    【Solution 1】:

    Code copied from http://theautomatic.net/2019/05/24/3-ways-to-scrape-tables-from-pdfs-with-python/, which has more details...

    import tabula
     
    file = "http://lab.fs.uni-lj.si/lasin/wp/IMIT_files/neural/doc/seminar8.pdf"
     
    # tables = tabula.read_pdf(file, pages="all", multiple_tables=True)

    # convert the tables on the first page only (pages defaults to 1)
    tabula.convert_into(file, "output.csv", output_format="csv")

    # convert the tables on every page; this overwrites output.csv from the call above
    tabula.convert_into(file, "output.csv", output_format="csv", pages="all")
    

    【Discussion】:

    • When I run the code I get an error: build_options() got an unexpected keyword argument 'all'