【Posted】: 2017-11-17 11:33:44
【Question】:
I have tested HTML-to-PDF conversion with both iTextSharp and iText7. iTextSharp takes 3 minutes to create 10,000 PDFs, whereas iText7 takes 17 minutes. Because iText7 is the newer version, I decided to use it for commercial purposes, but its performance is much worse. How can I improve the performance of HTML-to-PDF conversion in iText7?
Test with iText7:
For i As Integer = 0 To 10000
    HTML = ReadFile '=> Read HTML file from particular location
    'HTML = Replace(HTML) => To Replace the content dynamically
    Dim writer As PdfWriter
    Dim array() As Byte = System.Text.Encoding.ASCII.GetBytes("a")
    writer = New PdfWriter(FileName, New WriterProperties().SetStandardEncryption(array, array, EncryptionConstants.ALLOW_PRINTING,
        EncryptionConstants.ENCRYPTION_AES_256))
    HtmlConverter.ConvertToPdf(HTML, writer)
Next
Test with iTextSharp:
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports iTextSharp.pdfa
Imports System.IO
Imports iTextSharp.text.html.simpleparser
Imports System.Text
Imports iTextSharp.tool.xml.html
Imports iTextSharp.tool.xml
Imports iTextSharp.tool.xml.pipeline.html
For i As Integer = 0 To 10000
    HTML = ReadFile '=> Read HTML file from particular location
    'HTML = Replace(HTML) => To Replace the content dynamically
    Dim bPDF As Byte()
    Dim ms As New MemoryStream
    Dim doc As Document
    doc = New Document(PageSize.A4, 25, 25, 25, 25)
    Dim txtReader As New StringReader(HTML)
    Dim oPdfWriter As PdfWriter
    oPdfWriter = PdfWriter.GetInstance(doc, ms)
    oPdfWriter.SetEncryption(iTextSharp.text.pdf.PdfWriter.ENCRYPTION_AES_128, "q", "a", 2)
    Dim htmlWorker As New HTMLWorker(doc)
    doc.Open()
    htmlWorker.StartDocument()
    htmlWorker.Parse(txtReader)
    htmlWorker.EndDocument()
    htmlWorker.Close()
    doc.Close()
    bPDF = ms.ToArray()
    ' "mm" is minutes; the original "MM" would have repeated the month
    Dim FileName As String = "D:\ItextSharp_" & Now.ToString("ddMMyyyyHHmmssffffff") & ".pdf"
    File.WriteAllBytes(FileName, bPDF)
Next
Function ReadFile() As String
    Dim stringReader As String = ""
    ' Using disposes the reader so the file handle is released after every call
    Using objReader As New System.IO.StreamReader("D:\AS1-Revamp\TestHTML\test.html")
        Do While objReader.Peek() <> -1
            stringReader = stringReader & objReader.ReadLine() & vbNewLine
        Loop
    End Using
    Return stringReader
End Function
I used the code above to test performance. Compared with iTextSharp, iText7 takes more time to write the PDF files to the given path.
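To make the two timings easier to compare, the measurement could be isolated from the file reading. The following is only a minimal sketch, not part of my actual test: `TimeConversions` and the `work` delegate are hypothetical names, the HTML string is a stand-in for the template, and the delegate body is a placeholder where the iText7 or iTextSharp conversion call would go.

```vbnet
Imports System
Imports System.Diagnostics

Module TimingSketch
    ' Hypothetical helper: runs `convert` `iterations` times on the same
    ' HTML string and returns the total elapsed time.
    Function TimeConversions(html As String, iterations As Integer,
                             convert As Action(Of String, Integer)) As TimeSpan
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 0 To iterations - 1
            convert(html, i)
        Next
        sw.Stop()
        Return sw.Elapsed
    End Function

    Sub Main()
        ' Assumption: in the real test the template would be read once,
        ' before the timed loop, instead of on every iteration.
        Dim html As String = "<p>Details</p>"

        ' Placeholder work: the real conversion call would replace this body.
        Dim work As Action(Of String, Integer) =
            Sub(h As String, i As Integer)
                Dim length As Integer = h.Length
            End Sub

        Dim iterations As Integer = 100
        Dim elapsed As TimeSpan = TimeConversions(html, iterations, work)
        Console.WriteLine("{0} iterations took {1} ms", iterations, elapsed.TotalMilliseconds)
    End Sub
End Module
```

Running the same harness once per library, with the same HTML string, would show how much of the difference comes from the conversion itself rather than from reading the file.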
Edit: HTML copied and pasted from my other question:
Following up on my question "iText7 Performance Issue Compared With iTextSharp", I have sent the HTML file to Mr. Amedee Van Gasse. Please tell me how to improve the performance of iText7.
<div id = "headerdiv" style="width:540px; float:left; background:#ededed; padding:30px; overflow:hidden;">
<br>
<br>
<br>
<div>
<img border='0' src='D:\AS1-Revamp\TestHTML\newlog.bmp' width='100' height='40'>
</div>
<p style="color:Red;align=center;" > Details</p>
<br>
<br>
<table >
<tr border='0'>
<td bgcolor='Green'>
<font size="3" color="white">
SDetails
</font>
</td>
</td>
</tr>
<tr border='0'>
<td>
<div id="dvKYC">
<table border='1'>
<tr>
<td><#lsName#></td>
<td>No:<#lsno#></td>
</tr>
<tr border='1'>
<td width=500><#lsAddess#></td>
<td></td>
</tr>
<tr>
<td><#lsContacts#></td>
<td> </td>
</tr>
</table>
</div>
</td>
</tr>
</table>
<br>
<div >
<table >
<tr border='0'>
<td bgcolor='Green'>
<font size="3" color="white">
Status
</font>
</td>
</td>
</tr>
</table>
<table style="width:100%;">
<tr bgcolor=gray >
<td style="width:30%;text-align: left; font-weight: bold;">UUH </td>
<td style="width:20%;text-align: left; font-weight: bold;">PN</td>
<td style="width:20%;text-align: left; font-weight: bold;">KC </td>
<td style="width:20%;text-align: left; font-weight: bold;">CC</td>
</tr>
<tr>
<td style"width:200px;"><#lsHs#></td>
<td ><#lsPN#></td>
<td><#lsKC#></td>
<td><#lsCC#></td>
</tr>
</table>
</div>
<div >
<table >
<tr border='0'>
<td bgcolor='Green'>
<font size="3" color="white">
STD
</font>
</td>
</td>
</tr>
</table>
<##TT##>
</div>
After applying the following code, I get two errors on ConverterProperties:
1. setCreateAcroForm is not a member of iText.Html2pdf.ConverterProperties
2. setOutlineHandler is not a member of iText.Html2pdf.ConverterProperties
Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
    Dim converterProperties As ConverterProperties = New ConverterProperties
    With converterProperties
        .SetBaseUri(".")
        .setCreateAcroForm(False)
        .SetCssApplierFactory(New DefaultCssApplierFactory())
        .SetFontProvider(New DefaultFontProvider())
        .SetMediaDeviceDescription(MediaDeviceDescription.CreateDefault())
        .setOutlineHandler(New OutlineHandler())
        .SetTagWorkerFactory(New DefaultTagWorkerFactory())
    End With
    Dim HTML = ReadFile("Input_Template")
    For i = 0 To 10000
        LicenseKey.LoadLicenseFile("C:\iText7\itextkey-0.xml")
        Dim PDF = "E:\iText\testpdf " & i & ".pdf"
        Dim m As New MemoryStream
        Dim writer As PdfWriter
        Dim array() As Byte = System.Text.Encoding.ASCII.GetBytes("a")
        writer = New PdfWriter(PDF, New WriterProperties().SetStandardEncryption(array, array, EncryptionConstants.ALLOW_PRINTING,
            EncryptionConstants.ENCRYPTION_AES_256))
        HtmlConverter.ConvertToPdf(HTML, writer, converterProperties)
    Next
End Sub
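If I comment out those two lines and run the program, the converter line (HtmlConverter.ConvertToPdf(HTML, writer, converterProperties)) throws an error.
The error is: "Pdf indirect object belongs to other PDF document. Copy object to current pdf document."
This error occurs because the ConverterProperties object is created outside the loop. If I move all the properties inside the loop, it works fine, but is that the right approach in terms of performance?
Please help me resolve these three errors.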
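【Comments】:
-
How can I provide the HTML? I don't know how.
-
I only used HTMLWorker in iTextSharp, but that is not the issue. I am asking how to improve performance in iText7.
-
The question you linked, stackoverflow.com/q/44514437/766786, has no solution.
-
If I ask another question, how will this one proceed?
-
OK then, sir. You have already edited it. Please, I would like a solution.
Tags: vb.net pdf itext performance-testing itext7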