Word OpenXML replace token text: answers

【Question title】: Word OpenXML replace token text
【Posted】: 2013-07-29 08:49:19
【Question】:

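I am using OpenXML to modify Word templates. These templates contain simple tokens that are identified by certain delimiter characters (currently double chevrons, « and », character codes 171 and 187).

I want to replace these tokens with my own text, which may span multiple lines (for example, text read from a database).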

【Comments】:

【Solution 1】:

First you need to open the template:

        //read file into memory
        byte[] docByteArray = File.ReadAllBytes(templateName);
        using (MemoryStream ms = new MemoryStream())
        {
            //write file to memory stream
            ms.Write(docByteArray, 0, docByteArray.Length);

            //replace the tokens in the in-memory document
            ReplaceText(ms);

            //reset stream
            ms.Seek(0L, SeekOrigin.Begin);

            //save output
            using (FileStream outputStream = File.Create(docName))
                ms.CopyTo(outputStream);
        }

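The simple approach of searching the raw inner XML of the main document part is the quickest, but it does not allow you to insert multi-line text, and it gives you no basis for extending to more complex changes.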

using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(ms, true))
{
     string docText = null;
     //read the entire document into a text
     using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
         docText = sr.ReadToEnd();

     //replace the text; strings are immutable, so keep the returned value
     docText = docText.Replace(oldString, myNewString);

     //write the text back
     using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
         sw.Write(docText);
}
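
One caveat with this raw-string approach, offered as an observation rather than something the answer states: the replacement is written straight into the part's XML, so characters such as `&` or `<` in the new text would need escaping first. A minimal sketch, assuming a hypothetical helper named `ReplaceTokenInXml` and using `SecurityElement.Escape` from the base class library:

```csharp
using System.Security;

public static class RawXmlReplace
{
    // Hypothetical helper (not part of the original answer): replaces a token
    // inside raw document XML with text that has been XML-escaped first.
    public static string ReplaceTokenInXml(string docXml, string token, string replacement)
    {
        // Escape &, <, >, " and ' so the result remains well-formed XML.
        string safe = SecurityElement.Escape(replacement);

        // String.Replace returns a new string; the input is left unchanged.
        return docXml.Replace(token, safe);
    }
}
```

This only protects the XML from breaking. Line breaks in the replacement are still not turned into Word breaks, which is the limitation described above.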

Instead, you need to work with the elements and the document structure:

        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(ms, true))
        {
            //get all the text elements
            IEnumerable<Text> texts = wordDoc.MainDocumentPart.Document.Body.Descendants<Text>();
            //filter them to the ones that contain the token; ToList() takes a snapshot
            //so the tree can be modified safely inside the loop
            var tokenTexts = texts.Where(t => t.Text.Contains(oldString)).ToList();

            foreach (var token in tokenTexts)
            {
                //get the parent element
                var parent = token.Parent;
                //deep clone this Text element
                var newToken = token.CloneNode(true);

                //split the text into an array using a regex of all line terminators
                var lines = Regex.Split(myNewString, "\r\n|\r|\n");

                //set the cloned text element to the first line
                //(this replaces the element's whole text, not only the token)
                ((Text) newToken).Text = lines[0];
                //if more than one line
                for (int i = 1; i < lines.Length; i++)
                {
                    //append a break to the parent
                    parent.AppendChild<Break>(new Break());
                    //then append the next line
                    parent.AppendChild<Text>(new Text(lines[i]));
                }

                //insert it after the token element
                token.InsertAfterSelf(newToken);
                //remove the token element
                token.Remove();
            }

            wordDoc.MainDocumentPart.Document.Save();
        }

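Basically, you find the Text element (Word is built from Paragraphs of Runs of Text), clone it, change it (inserting new Break and Text elements if required), add the clone after the original token Text element, and finally remove the original token Text element.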
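
The same steps can also be sketched as a variation that edits the Text element in place instead of cloning it. This is an assumption-laden sketch, not the answer's code: `TokenReplacer` and `ReplaceToken` are hypothetical names, only the token substring is substituted (surrounding text is kept), and each Break and Text is inserted directly after the previous element so the lines stay in order even if the run has other children.

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TokenReplacer
{
    // Hypothetical helper: replaces `token` inside every Text element of `body`,
    // turning line terminators in `replacement` into Word line breaks.
    public static void ReplaceToken(Body body, string token, string replacement)
    {
        // Snapshot the matches first, because the loop modifies the tree.
        var tokenTexts = body.Descendants<Text>()
                             .Where(t => t.Text.Contains(token))
                             .ToList();

        foreach (var text in tokenTexts)
        {
            // Substitute only the token, then split on any line terminator.
            string[] lines = Regex.Split(text.Text.Replace(token, replacement), "\r\n|\r|\n");

            // The first line stays in the existing element (formatting is inherited).
            text.Text = lines[0];
            text.Space = SpaceProcessingModeValues.Preserve;

            // Each further line becomes a Break followed by a new Text,
            // inserted right after the previously placed element.
            OpenXmlElement last = text;
            for (int i = 1; i < lines.Length; i++)
            {
                last = last.InsertAfterSelf(new Break());
                last = last.InsertAfterSelf(
                    new Text(lines[i]) { Space = SpaceProcessingModeValues.Preserve });
            }
        }
    }
}
```

It could be called as `TokenReplacer.ReplaceToken(wordDoc.MainDocumentPart.Document.Body, oldString, myNewString);` before saving the document. Like the code above, it only finds a token that sits entirely within a single Text element.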

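【Comments】:

I forgot to mention that both approaches inherit the formatting of the token text. The latter approach, however, also lets you change that formatting, because you can navigate the element tree.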
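
As an illustration of changing the formatting through the element tree, here is a minimal sketch that makes the run containing a replaced Text element bold. The names `RunFormatting` and `MakeBold` are hypothetical, and it assumes the Text element's parent is a Run.

```csharp
using DocumentFormat.OpenXml.Wordprocessing;

public static class RunFormatting
{
    // Hypothetical helper: turns on bold for the run that owns `text`.
    public static void MakeBold(Text text)
    {
        // Formatting lives on the parent Run, not on the Text element itself.
        var run = text.Parent as Run;
        if (run == null)
            return; // assumption violated: the Text is not directly inside a Run

        // Create the run properties if the run has none yet.
        if (run.RunProperties == null)
            run.RunProperties = new RunProperties();

        // An empty Bold element switches bold on for the whole run.
        run.RunProperties.Bold = new Bold();
    }
}
```

Because the properties belong to the run, this affects all text in that run, not just the replaced token.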