【Posted】: 2012-01-29 01:20:31
【Question】:
How can I convert a doc or docx file to HTML in Java? Using Apache POI I can convert doc to HTML, but not docx. Could you show me sample code? The code below works for doc but not for docx.
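import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocumentCore;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.hwpf.converter.WordToHtmlUtils;
import org.w3c.dom.Document;

// Load the binary .doc file; `stream` is an InputStream opened on it.
HWPFDocumentCore wordDocument = WordToHtmlUtils.loadDoc(stream);
WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
        DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
wordToHtmlConverter.processDocument(wordDocument);
Document htmlDocument = wordToHtmlConverter.getDocument();
ByteArrayOutputStream out = new ByteArrayOutputStream();
DOMSource domSource = new DOMSource(htmlDocument);
StreamResult streamResult = new StreamResult(out);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer serializer = tf.newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
serializer.setOutputProperty(OutputKeys.METHOD, "html");
serializer.transform(domSource, streamResult);
out.close();
// Decode with the same charset used for serialization, not the platform default.
String result = new String(out.toByteArray(), StandardCharsets.UTF_8);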
【Comments】:
-
You can do this with docx4j; see the example: github.com/plutext/docx4j/blob/master/src/samples/docx4j/org/…
-
@user960567, I ran into the same problem. Did you find a solution?
-
@jnrdn0011 Search for Office Open XML.
-
Thanks, I finally found a solution.
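
For readers following the Office Open XML hint: a .docx file is a ZIP archive whose main body lives in word/document.xml, which is why the HWPF-based code for binary .doc files cannot read it. The following is only a minimal sketch of that idea using the JDK alone, not the approach taken by Apache POI or docx4j. The class name DocxToHtmlSketch and its convert method are hypothetical, and the sketch keeps paragraph text only (no styles, tables layout, images, or lists).

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: extracts paragraph text from a .docx and wraps it in <p> tags.
final class DocxToHtmlSketch {
    // WordprocessingML main namespace used inside word/document.xml.
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Reads a .docx stream and returns bare-bones HTML, one <p> per w:p element. */
    static String convert(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return toHtml(zip);
                }
            }
        }
        throw new IOException("word/document.xml not found; not a .docx file?");
    }

    private static String toHtml(InputStream documentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the *NS lookups below
        Document xml = factory.newDocumentBuilder().parse(documentXml);

        StringBuilder html = new StringBuilder("<html><body>\n");
        NodeList paragraphs = xml.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element paragraph = (Element) paragraphs.item(i);
            // Concatenate every w:t text run inside this paragraph.
            NodeList texts = paragraph.getElementsByTagNameNS(W_NS, "t");
            StringBuilder line = new StringBuilder();
            for (int j = 0; j < texts.getLength(); j++) {
                line.append(texts.item(j).getTextContent());
            }
            html.append("<p>").append(escape(line.toString())).append("</p>\n");
        }
        return html.append("</body></html>").toString();
    }

    // Escapes the characters that would otherwise be read as HTML markup.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Calling DocxToHtmlSketch.convert(new FileInputStream("input.docx")) (the file name is a placeholder) returns the HTML as a String. For anything beyond plain text, a library such as docx4j, as suggested above, is likely the more practical route.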
Tags: java spring-mvc apache-poi