【Posted】: 2016-08-13 09:45:39
【Question】:
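As a new user of pdfbox 2.0.2 (https://github.com/apache/pdfbox/tree/2.0.2), I want to get all the stroked lines of a page (PDPage), for example the column and row borders of a table, so I created the following class:
package org.apache.pdfbox.rendering;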

import java.awt.geom.GeneralPath;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;

import org.apache.commons.io.IOUtils;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.rendering.PageDrawer;
import org.apache.pdfbox.rendering.PageDrawerParameters;

public class LineCatcher {
    private PageDrawer pageDrawer;
    private PDDocument document;
    private PDFRenderer pdfRenderer;
    private PDPage page;

    public LineCatcher(URI pdfSrcURI) throws IllegalArgumentException,
            MalformedURLException, IOException {
        this.document = PDDocument.load(IOUtils.toByteArray(pdfSrcURI));
        this.pdfRenderer = new PDFRenderer(this.document);
    }

    public GeneralPath getLinePath(int pageIndex) throws IOException {
        this.page = this.document.getPage(pageIndex);
        PageDrawerParameters parameters = new PageDrawerParameters(this.pdfRenderer, this.page);
        this.pageDrawer = new PageDrawer(parameters);
        this.pageDrawer.processPage(this.page); // catches exception here
        return this.pageDrawer.getLinePath();
    }
}
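As I understand it, to get the line path of a page, the page has to be processed first, so I call the processPage method on the line marked with the comment "catches exception here". Unexpectedly, a NullPointerException is thrown on that line. The exception information is as follows: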
java.lang.NullPointerException
at org.apache.pdfbox.rendering.PageDrawer.fillPath(PageDrawer.java:599)
at org.apache.pdfbox.contentstream.operator.graphics.FillNonZeroRule.process(FillNonZeroRule.java:36)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:815)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:472)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:446)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
at org.apache.pdfbox.rendering.LineCatcher.getLinePath(LineCatcher.java:33)
at org.apache.pdfbox.rendering.TestLineCatcher.testGetLinePath(TestLineCatcher.java:21)
Can anyone give me some advice on my logic or help me debug the code? Thanks in advance.
【Comments】:
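- That is definitely wrong... getLinePath() returns the current line path while the page is being processed. After each fill or stroke it is reset to empty. It is not what you think it is, i.e. a path containing all the lines of the page. I'll see if I can come up with something better, e.g. catching the stroke operators.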
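To illustrate the point of the comment above, here is a minimal sketch of why reading the path after processing gives nothing useful, and of what "catching the stroke operators" could look like. The class StrokeCollector and all of its methods are hypothetical. They only model the described behaviour with java.awt.geom and are not part of PDFBox. In PDFBox 2.0.x the analogous hook would presumably be an overridden strokePath() in a PDFGraphicsStreamEngine subclass; that wiring is an assumption and is not shown here.

```java
import java.awt.geom.GeneralPath;
import java.awt.geom.Rectangle2D;

/**
 * Stand-alone model of a path-painting engine. This is NOT the PDFBox API.
 * It mimics the behaviour described in the comment: the current path is
 * cleared after every stroke or fill, so stroked segments have to be
 * captured at the moment the stroke operator is handled.
 */
class StrokeCollector {
    // Path under construction; emptied after each painting operator.
    private final GeneralPath currentPath = new GeneralPath();
    // Everything that has been stroked so far.
    private final GeneralPath strokedPaths = new GeneralPath();

    void moveTo(float x, float y) { currentPath.moveTo(x, y); }

    void lineTo(float x, float y) { currentPath.lineTo(x, y); }

    /** Stroke operator: keep a copy of the segments, then reset the path. */
    void strokePath() {
        strokedPaths.append(currentPath, false);
        currentPath.reset();
    }

    /** Fill operator: not a stroked line, so the path is only reset. */
    void fillPath() { currentPath.reset(); }

    GeneralPath getCurrentPath() { return currentPath; }

    GeneralPath getStrokedPaths() { return strokedPaths; }

    public static void main(String[] args) {
        StrokeCollector c = new StrokeCollector();
        // Two stroked table borders.
        c.moveTo(0, 0); c.lineTo(100, 0); c.strokePath();
        c.moveTo(0, 0); c.lineTo(0, 50); c.strokePath();
        // A filled shape, which is deliberately not collected.
        c.moveTo(10, 10); c.lineTo(20, 20); c.lineTo(10, 20); c.fillPath();

        // The current path is empty once processing is done ...
        System.out.println("current path empty: "
                + (c.getCurrentPath().getCurrentPoint() == null));
        // ... while the collector still holds both borders.
        Rectangle2D bounds = c.getStrokedPaths().getBounds2D();
        System.out.println("stroked bounds: "
                + bounds.getWidth() + " x " + bounds.getHeight());
    }
}
```

The design choice in this sketch is to copy the segments inside the stroke handler instead of asking for the path once the whole page has been processed, because by then the current path has already been reset.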