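[Posted]: 2019-02-19 08:37:39
[Question]:
I am developing a Windows Forms application that takes a robot program generated by other software and modifies it. The modification process is as follows:
- StreamReader.ReadLine() is used to parse the file line by line.
- Regular expressions are used to search the file for specific keywords. If there is a match, the matched string is copied to another string and replaced with new lines of robot code.
The modified code is kept in a string and finally written to a new file.
All of the matched strings obtained with Regex are also collected in a string and finally written to a new file.
I have been able to do this successfully: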
    private void Form1_Load(object sender, EventArgs e)
    {
        string NextLine = null;
        string CurrLine = null;
        string MoveL_Pos_Data = null;
        string MoveL_Ref_Data = null;
        string MoveLFull = null;
        string ModCode = null;
        string TAB = "\t";
        string NewLine = "\r\n";
        string SavePath = null;
        string ExtCode_1 = null;
        string ExtCode_2 = null;
        string ExtCallMod = null;
        int MatchCount = 0;
        int NumRoutines = 0;
        try
        {
            // Ask the user for the location of the source file.
            // Display an OpenFileDialog so the user can select a MOD file.
            OpenFileDialog openFileDialog1 = new OpenFileDialog
            {
                Filter = "MOD Files|*.mod",
                Title = "Select an ABB RAPID MOD File"
            };
            // Show the dialog.
            // If the user clicked OK in the dialog and
            // a .MOD file was selected, open it.
            if (openFileDialog1.ShowDialog() == System.Windows.Forms.DialogResult.OK)
            {
                // Open the selected file for reading.
                using (StreamReader sr = new StreamReader(openFileDialog1.FileName))
                {
                    // Define regular expressions to search for Extr calls and MoveL instructions
                    Regex Extr_Ex = new Regex(@"\bExtr\(-?\d*.\d*\);", RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Multiline);
                    Regex MoveL_Ex = new Regex(@"\bMoveL\s+(.*)(z\d.*)", RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Multiline);
                    Match MoveLString = null;
                    while (sr.Peek() >= 0)
                    {
                        CurrLine = sr.ReadLine();
                        //Console.WriteLine(sr.ReadLine());
                        // Check if the line is a match
                        if (Extr_Ex.IsMatch(CurrLine))
                        {
                            // Keep a count of total matches
                            MatchCount++;
                            // Save Extr calls in a string
                            ExtCode_1 += NewLine + TAB + TAB + Extr_Ex.Match(CurrLine).ToString();
                            // Read next line (always a MoveL) to get Pos data for TriggL
                            NextLine = sr.ReadLine();
                            //Console.WriteLine(NextLine);
                            if (MoveL_Ex.IsMatch(NextLine))
                            {
                                // Next line contains MoveL
                                // Get matched string
                                MoveLString = MoveL_Ex.Match(NextLine);
                                GroupCollection group = MoveLString.Groups;
                                MoveL_Pos_Data = group[1].Value.ToString();
                                MoveL_Ref_Data = group[2].Value.ToString();
                                MoveLFull = MoveL_Pos_Data + MoveL_Ref_Data;
                            }
                            // Replace Extr with the following commands
                            ModCode += NewLine + TAB + TAB + "TriggL " + MoveL_Pos_Data + "extr," + MoveL_Ref_Data;
                            ModCode += NewLine + TAB + TAB + "WaitDI DI1_1,1;";
                            ModCode += NewLine + TAB + TAB + "MoveL " + MoveLFull;
                            ModCode += NewLine + TAB + TAB + "Reset DO1_1;";
                            //break;
                        }
                        else
                        {
                            // No Extr match: copy the line through unchanged
                            ModCode += "\r\n" + CurrLine;
                        }
                    }
                    Console.WriteLine($"Total Matches: {MatchCount}");
                }
            }
            // Write modified code into a new output file
            string SaveDirectoryPath = Path.GetDirectoryName(openFileDialog1.FileName);
            string ModName = Path.GetFileNameWithoutExtension(openFileDialog1.FileName);
            SavePath = SaveDirectoryPath + @"\" + ModName + "_rev.mod";
            File.WriteAllText(SavePath, ModCode);
            // Write Extr matches into a new output file
            // Prepare module
            ExtCallMod = "MODULE ExtruderCalls";
            // All Extr calls in one routine
            // Prepare routines
            ExtCallMod += NewLine + NewLine + TAB + "PROC Prg_ExtCall"; // + 1;
            ExtCallMod += ExtCode_1;
            ExtCallMod += NewLine + NewLine + TAB + "ENDPROC";
            ExtCallMod += NewLine + NewLine;
            ExtCallMod += "ENDMODULE";
            // Write to file
            string ExtCallSavePath = SaveDirectoryPath + @"\ExtrCalls.mod";
            File.WriteAllText(ExtCallSavePath, ExtCallMod);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
    }
}
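Although this lets me achieve what I want, the process is very slow. Since I am new to C# programming, I suspect the slowness comes from copying the original file contents into a string rather than replacing the content in place (I am not sure whether the content of the original file can be replaced directly). For an input file of 20,000 lines, the whole process takes a little over 5 minutes.
I used to get the following error: Message=Managed Debugging Assistant 'ContextSwitchDeadlock' : 'The CLR has been unable to transition from COM context 0xb27138 to COM context 0xb27080 for 60 seconds. The thread that owns the destination context/apartment is most likely either doing a non pumping wait or processing a very long running operation without pumping Windows messages. This situation generally has a negative performance impact and may even lead to the application becoming non responsive or memory usage accumulating continually over time. To avoid this problem, all single threaded apartment (STA) threads should use pumping wait primitives (such as CoWaitForMultipleHandles) and routinely pump messages during long running operations.'
I was able to get past it by disabling the 'ContextSwitchDeadlock' setting in the debugger settings. This is probably not best practice.
Can anyone help me improve the performance of my code?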
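The ContextSwitchDeadlock warning typically appears when the UI thread is kept busy for a long time. Below is a minimal sketch of moving the file processing onto a thread-pool thread, assuming the per-line rewrite can be factored into a method that takes a path and returns the new text. `BackgroundRewrite`, `RewriteFile` and `RewriteFileAsync` are hypothetical names, and the loop body is only a stand-in for the real regex logic:

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class BackgroundRewrite
{
    // Hypothetical stand-in for the per-line rewrite: it copies every line
    // through unchanged, prefixing each with CRLF as the original loop does.
    public static string RewriteFile(string path)
    {
        var output = new StringBuilder();
        foreach (string line in File.ReadLines(path))
        {
            output.Append("\r\n").Append(line);
        }
        return output.ToString();
    }

    // Runs the rewrite on a thread-pool thread so that the calling (UI)
    // thread stays free to pump Windows messages while the work runs.
    public static Task<string> RewriteFileAsync(string path)
    {
        return Task.Run(() => RewriteFile(path));
    }
}
```

In the form, the handler could be declared as `private async void Form1_Load(object sender, EventArgs e)` and the result obtained with `await BackgroundRewrite.RewriteFileAsync(path)`, so that the dialog and any text-box updates stay on the UI thread. This keeps the window responsive, but it does not by itself make the processing faster.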
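EDIT: I found that the robot controller has a limit on the number of lines in a MOD file (the output file). The maximum allowed number of lines is 32768. I came up with logic to split the contents of the string builder into separate output files, as follows: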
// Split modCodeBuilder into separate strings based on final size
const int maxSize = 32500;
string result = modCodeBuilder.ToString();
string[] splitResult = result.Split(new string[] { "\r\n" }, StringSplitOptions.None);
// Set up the destination directory to be the same as the source directory
string destDir = Path.GetDirectoryName(fileNames[0]);
for (int count = 0; ; count++)
{
    // Get the next batch of text by skipping the amount
    // we've taken so far and then taking up to maxSize lines.
    string modName = $"PrgMOD_{count + 1}";
    string procName = $"Prg_{count + 1}()";
    // Index of the first source line of this batch
    int src_start_index = count * maxSize;
    if (src_start_index >= splitResult.Length) break; // Exit loop when there's no text left to take
    // Take at most maxSize lines, and never read past the end of the source array
    int dataLength = Math.Min(maxSize, splitResult.Length - src_start_index);
    // Size the buffer per batch so the last file holds no stale or empty entries
    string[] splitModCode = new string[dataLength];
    Array.Copy(splitResult, src_start_index, splitModCode, 0, dataLength);
    string finalModCode = String.Join("\r\n", splitModCode);
    string batch = String.Concat("MODULE ", modName, "\r\n\r\n\tPROC ", procName, "\r\n", finalModCode, "\r\n\r\n\tENDPROC\r\n\r\nENDMODULE");
    //if (batch.Length == 0) break;
    // Generate file name based on count
    string fileName = $"ABB_R3DP_{count + 1}.mod";
    // Write our file text
    File.WriteAllText(Path.Combine(destDir, fileName), batch);
    // Write status to output textbox
    TxtOutput.AppendText("\r\n");
    TxtOutput.AppendText("\r\n");
    TxtOutput.AppendText($"Modified MOD File: {fileName} is generated successfully! It is saved to location: {Path.Combine(destDir, fileName)}");
}
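One way to keep the batch arithmetic easy to check is to separate it from the file writing. The following is a minimal sketch under that assumption; `LineBatcher`, `Split` and `WrapAsModule` are hypothetical names introduced here for illustration, and the module layout simply mirrors the strings used in the loop above:

```csharp
using System;
using System.Collections.Generic;

public static class LineBatcher
{
    // Splits 'lines' into consecutive batches of at most 'maxLines' entries.
    // The final batch may be shorter; an empty input yields no batches.
    public static List<string[]> Split(string[] lines, int maxLines)
    {
        if (lines == null) throw new ArgumentNullException(nameof(lines));
        if (maxLines <= 0) throw new ArgumentOutOfRangeException(nameof(maxLines));

        var batches = new List<string[]>();
        for (int start = 0; start < lines.Length; start += maxLines)
        {
            // Never copy past the end of the source array.
            int length = Math.Min(maxLines, lines.Length - start);
            var batch = new string[length];
            Array.Copy(lines, start, batch, 0, length);
            batches.Add(batch);
        }
        return batches;
    }

    // Wraps one batch in a MODULE/PROC shell; 'index' is the 1-based file number.
    public static string WrapAsModule(string[] batch, int index)
    {
        string body = string.Join("\r\n", batch);
        return $"MODULE PrgMOD_{index}\r\n\r\n\tPROC Prg_{index}()\r\n{body}\r\n\r\n\tENDPROC\r\n\r\nENDMODULE";
    }
}
```

Each element returned by `Split` can then be passed to `WrapAsModule` and written with `File.WriteAllText`. Note that the wrapper itself adds several lines per file, which is presumably why the batch size is kept somewhat below the controller's 32768-line limit.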
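[Comments]:
-
@Gauravsa Can you explain why these lines are the bottleneck and how to improve them? As it stands, your answer does not answer my question.
-
Strings are immutable. Every time you make a change to a string, you are actually creating a new string, allocating memory, and copying the data from the existing string to the new one.
-
Here is a good read on the subject: jonskeet.uk/csharp/stringbuilder.html
-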
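Personally, I would use two threads for writing and one thread for reading, so that the files can be written while the input is being read. Secondly, you can find out which step is the bottleneck by printing the number of ticks taken by each series of steps. Also focus on the regex matching.
-
Avoid using IsMatch when you plan to use the result. Use Matches directly, so you don't double the regex execution. Profile your specific regex with Compiled: extremely simple expressions can actually run faster without it. Use StringBuilder.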
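The suggestions in the comments above can be sketched roughly as follows: accumulate output in a `StringBuilder`, run the regex once per line and check `Match.Success`, and time each stage with a `Stopwatch`. This is only an illustration under those assumptions. `ExtrScanner`, `CollectExtrCalls` and `Time` are hypothetical names, the pattern is copied unchanged from the question, and `RegexOptions.Compiled` is left off here only to keep the sketch simple:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class ExtrScanner
{
    // Pattern taken as-is from the question; created once and reused.
    private static readonly Regex ExtrEx =
        new Regex(@"\bExtr\(-?\d*.\d*\);", RegexOptions.IgnoreCase);

    // Collects every Extr call, one per output line, indented with two tabs.
    public static string CollectExtrCalls(IEnumerable<string> lines, out int matchCount)
    {
        var calls = new StringBuilder();
        matchCount = 0;
        foreach (string line in lines)
        {
            // A single Match call replaces the IsMatch + Match pair.
            Match m = ExtrEx.Match(line);
            if (!m.Success) continue;

            matchCount++;
            // Append mutates the builder's buffer instead of copying the whole string.
            calls.Append("\r\n\t\t").Append(m.Value);
        }
        return calls.ToString();
    }

    // Measures how long an arbitrary step takes, to help locate the bottleneck.
    public static TimeSpan Time(Action step)
    {
        Stopwatch watch = Stopwatch.StartNew();
        step();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Wrapping the read, match and write stages separately in `Time` would show which one dominates before any further restructuring, such as adding threads, is attempted.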
Tags: c# regex streamreader