【Question title】: Why is formatting lost when copying from a PDF?
【Posted】: 2017-01-11 16:16:36
【Question description】:

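I am using the Summernote editor, and formatting is lost when I copy and paste from a PDF document.

The same problem occurs in other editors, such as Google Docs and OneDrive documents.

MS Word, however, preserves the formatting when I copy and paste from a PDF document.

Does anyone understand how MS Word does this, given that the clipboard does not hold any HTML tags when copying from a PDF?

I inspected the clipboard while pasting. It showed the following result, which contains only div tags.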

<div>This is Heading1 Text</div><div>This is a regular paragraph with the default style of Normal. This is a regular paragraph with the default style of Normal. This is a regular paragraph with the default style of Normal. This is a regular paragraph with the default style of Normal. This is a regular paragraph with the default style of Normal.</div><div>This is a Defined Block Style Called BlockStyleTest</div><div>This is more Normal text.</div><div>This is Heading 2 text</div><div>This is more Normal text. This is bold, this is italic, and this is bold italic. This is normal. This is in a defined inline style called InlineStyle. This is normal. This is red text. This is normal.</div><div>This block is centered.</div><div>This is left-aligned.</div><div> First item of bulleted list.</div><div> Second item of bulleted list.</div><div>Second paragraph of second item of bulleted list.</div><div> Third item of bulleted list.</div><div>o First item of third item’s nested list</div><div>o Second item of third item’s nested list</div><div> Fourth and final item of main bulleted list.</div><div>This is Normal text.</div><div>1. First item of numbered list.</div><div>2. Second item of numbered list.</div><div>Second paragraph of second item of numbered list.</div><div>3. 
Third item of numbered list.</div><div>Here is a BMP picture:</div><div>Here is a JPEG picture:</div><div>Here is a PNG picture:</div><div>Here is a table:</div><div>New York Boston Detroit</div><div>Baseball Mets Yankees Red Sox Tigers</div><div>Hockey Rangers Islanders Bruins Red Wings</div><div>Football Giants Jets Patriots Lions</div><div>Here is an embedded Excel spreadsheet:</div><div>pre- post- pre- postdogs</div><div>1234.43 0.33 354.30 777.00</div><div>cats 432.00 -432.20 654.45 333.00</div><div>turkeys 3.30 4.66 34.65 132.10</div><div>fish 52.55 55.33 37.88 31.50</div><div>total 1722.28 -371.88 1081.28 1273.60</div><div>2001 2002</div><div>https://en.wikipedia.org/wiki/United_States</div><div>This is more Underlined text.</div><div>This is more Strikethrough text.</div><div>Test superscript text. This is superscript texts.</div><div>Test subscript text. This is subscript texts.</div><div>Here are some special characters -!”&amp;’(*)+’./:;?_ÈÓ 12\</div><div>This concludes our test.</div>

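【Comments】:

  • Your question is all over the place. First, what content is being copied from the PDF? Second, to clarify, are you saying that when PDF text is copied and then pasted into Summernote, Google Docs, etc., it loses the HTML tags (or some other "formatting") that exist in the copied selection in the PDF document? Third, when you say you checked the clipboard, is that the actual clipboard data or the data as pasted into Word?
  • Sorry for the confusion. 1. Tables, headings, colored text, etc. 2. Yes. 3. I mean the actual clipboard, which I access via JavaScript (e.originalEvent.clipboardData.getData('text/plain')).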
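
To make the third point concrete, here is a minimal sketch of how one might list every flavor a paste event offers instead of reading only 'text/plain'. The helper name `describeClipboard` is hypothetical and not part of Summernote; which flavors actually appear depends on the PDF viewer and the browser, and that is not established in this thread.

```javascript
// Hypothetical helper: collect every flavor that a paste event exposes.
// `clipboardData` is expected to look like a DataTransfer object,
// i.e. it has a `types` list and a `getData(type)` method.
function describeClipboard(clipboardData) {
  const flavors = {};
  for (const type of Array.from(clipboardData.types || [])) {
    // getData returns an empty string for flavors it cannot serialize.
    flavors[type] = clipboardData.getData(type);
  }
  return flavors;
}

// Browser-only wiring; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  document.addEventListener('paste', (event) => {
    console.log(describeClipboard(event.clipboardData));
  });
}
```

With a jQuery-wrapped event, as in the comment above, the object to pass would be `e.originalEvent.clipboardData`.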

Tags: javascript pdf clipboard onedrive summernote


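【Solution 1】:

PDF is derived from PostScript, and you seem to be running into problems copying and pasting from it into "Summernote" and similar editors. MS Word has apparently been made to work with PostScript over the years, so presumably there is information that Word uses to recognize the content as PostScript and handle the paste correctly.

By the way, there is much more to PostScript than meets the eye.

Background information: https://youtu.be/48tFB_sjHgY https://youtu.be/guXgBe2wvEA https://youtu.be/-cFOsAzigyQ https://youtu.be/S_NXz7I5dQc

Workaround: You could try opening the PDF directly in Word, then copying and pasting from there into Summernote. (I have not verified that this works.)

Conclusion: I don't see any way to solve your problem, because it depends entirely on the receiving program, which in your case is Summernote and the other editors.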
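
As an illustration of what "depends on the receiving program" can mean in practice, the sketch below shows a receiver choosing the richest flavor it understands from whatever the clipboard offers. The function name and the preference order are assumptions made for this example; nothing here is Summernote's or Word's actual behavior, and whether a PDF viewer offers anything beyond plain text is not confirmed in this thread.

```javascript
// Hypothetical preference order: richest flavor first.
const PREFERRED_FLAVORS = ['text/html', 'text/rtf', 'text/plain'];

// Return the first preferred flavor that the clipboard offers, or null.
// `types` is any iterable of MIME-type strings, such as clipboardData.types.
function pickPreferredFlavor(types, preferred = PREFERRED_FLAVORS) {
  const offered = new Set(Array.from(types));
  return preferred.find((type) => offered.has(type)) || null;
}
```

A receiver that only ever asks for 'text/plain' would never see the formatting, even if a richer flavor were present.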

【Comments】:

  • Thank you very much.