【Question title】: Convert .doc with images to .html using xdocreport
【Posted】: 2016-10-27 15:54:26
【Question】:

I am using the following code to convert a doc to HTML:

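import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

// xdocreport 1.x package names (the org.apache namespace mentioned in the discussion);
// 2.x moved them to fr.opensagres.poi.xwpf.converter.*
import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.core.FileURIResolver;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class Convertion {

    private static final String docName = "This is a test page.docx";
    private static final String outputlFolderPath = "C://";
    String htmlNamePath = "docHtml1.html";
    String zipName = "_tmp.zip";
    static File docFile = new File(outputlFolderPath + docName);
    File zipFile = new File(zipName);

    public void ConvertWordToHtml() {
        // Folder for extracted images. The original appended the InputStream itself,
        // which inserts its toString() value instead of the document name.
        File imageFolder = new File("target/images/" + docName);
        // try-with-resources closes both streams even if the conversion fails
        try (InputStream doc = new FileInputStream(docFile);
             OutputStream out = new FileOutputStream(new File(htmlPath()))) {
            XWPFDocument document = new XWPFDocument(doc);
            XHTMLOptions options = XHTMLOptions.create();
            options.setExtractor(new FileImageExtractor(imageFolder));
            options.URIResolver(new FileURIResolver(imageFolder));
            XHTMLConverter.getInstance().convert(document, out, options);
        } catch (Exception ex) {
            ex.printStackTrace(); // report failures instead of silently swallowing them
        }
    }

    public static void main(String[] args) {
        Convertion cwoWord = new Convertion();
        cwoWord.ConvertWordToHtml();
    }

    public String htmlPath() {
        return outputlFolderPath + htmlNamePath;
    }

    public String zipPath() {
        // d:/_tmp.zip
        return outputlFolderPath + zipName;
    }
}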

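The code above converts the doc to HTML fine. The problem appears when I try to convert a doc file that contains shapes, such as a circle (as shown in the screenshot): in that case the shapes are not displayed in the HTML file.

Please help me preserve the shapes when converting from doc to HTML. Thanks in advance.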

【Discussion】:

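  • Have you looked at stackoverflow.com/questions/37745615/…? In general, note that this feature is not provided by Apache POI but by github.com/opensagres/xdocreport; unfortunately they use the org.apache namespace, which they should not!
  • @centic: Thanks for the reply. I have looked at the given link. I can convert the images inside the document, but not the shapes attached above.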
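
The distinction drawn in the comment above (pictures convert, shapes do not) can be checked from the file itself. A .docx is a zip archive in which embedded pictures are stored as entries under word/media/, while drawn shapes such as a circle are typically vector markup inside word/document.xml, so an image extractor has no picture file to hand over. A minimal sketch using only java.util.zip follows; `DocxMediaLister`, `mediaEntries` and `sampleArchive` are hypothetical names, and the sample archive is a stand-in, not a valid Word file:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class DocxMediaLister {

    /** Returns the names of all file entries under word/media/ in a .docx (zip) stream. */
    static List<String> mediaEntries(InputStream docx) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory() && entry.getName().startsWith("word/media/")) {
                    names.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Builds a tiny in-memory stand-in archive with one XML part and one picture part. */
    static byte[] sampleArchive() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write("<w:document/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("word/media/image1.png"));
            zip.write(new byte[] {1, 2, 3});
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Passing a `FileInputStream` for the real document to `mediaEntries` lists the pictures the converter can extract; if the circle does not appear there, it is not stored as a picture.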

Tags: java apache apache-poi xdocreport


【Solution 1】:

You can embed the images in the HTML with the following code:

Base64ImageExtractor imageExtractor = new Base64ImageExtractor();
options.setExtractor(imageExtractor);
options.URIResolver(imageExtractor);

Base64ImageExtractor looks like this:

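import java.io.IOException;

// xdocreport 1.x package names; 2.x uses fr.opensagres.poi.xwpf.converter.core.*
import org.apache.poi.xwpf.converter.core.IImageExtractor;
import org.apache.poi.xwpf.converter.core.IURIResolver;

public class Base64ImageExtractor implements IImageExtractor, IURIResolver {

    // Data URI prefix with an empty media type
    private static final String EMBED_IMG_SRC_PREFIX = "data:;base64,";

    // Bytes of the most recently extracted image
    private byte[] picture;

    @Override
    public void extract(String imagePath, byte[] imageData) throws IOException {
        this.picture = imageData;
    }

    // Returns the last extracted image as a data URI, ignoring the requested uri.
    @Override
    public String resolve(String uri) {
        // Base64Utility is not defined in this answer; java.util.Base64.getEncoder()
        // .encodeToString(picture) is a standard-library equivalent on Java 8+.
        StringBuilder sb = new StringBuilder(picture.length + EMBED_IMG_SRC_PREFIX.length())
                .append(EMBED_IMG_SRC_PREFIX)
                .append(Base64Utility.encode(picture));
        return sb.toString();
    }
}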
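
The extractor above keeps only the most recent image, so it relies on the converter calling extract and resolve in strict alternation. Below is a minimal sketch of a variant that stores every image by path and derives a media type from the file extension. It assumes the path passed to extract equals the URI later passed to resolve, which should be verified against the xdocreport version in use. `DataUriImageStore` and `mimeTypeFor` are hypothetical names, and the class deliberately does not implement the xdocreport interfaces so that it compiles on its own with java.util.Base64:

```java
import java.util.Base64;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class DataUriImageStore {

    // Image bytes keyed by the path the converter reported for them
    private final Map<String, byte[]> images = new HashMap<>();

    /** Remembers the bytes of one extracted image under its path. */
    void extract(String imagePath, byte[] imageData) {
        images.put(imagePath, imageData);
    }

    /** Returns a data URI for a known image, or the unchanged uri if it was never extracted. */
    String resolve(String uri) {
        byte[] data = images.get(uri);
        if (data == null) {
            return uri;
        }
        return "data:" + mimeTypeFor(uri) + ";base64,"
                + Base64.getEncoder().encodeToString(data);
    }

    /** Guesses a media type from the file extension (small illustrative table only). */
    static String mimeTypeFor(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".png")) {
            return "image/png";
        }
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) {
            return "image/jpeg";
        }
        if (lower.endsWith(".gif")) {
            return "image/gif";
        }
        return "application/octet-stream";
    }
}
```

To plug this into the converter, the class would additionally need to declare `implements IImageExtractor, IURIResolver` and be passed to both `options.setExtractor(...)` and `options.URIResolver(...)`, as in the answer's snippet.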

