Why is a GOTO loop much slower than a FOR loop and additionally dependent on the power supply? Answers

【Question title】: Why is a GOTO loop much slower than a FOR loop and depends additionally on power supply?
【Posted】: 2019-04-21 03:00:01
【Question】:

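I wrote a small batch file to convert a file containing eight 16-bit hexadecimal values per line into a CSV file with eight decimal values per line.

The input data file is the captured output of the ADC values of an embedded device. The device sends packets of eight hexadecimal values in ASCII, terminated with carriage return and line feed, over RS-232 to a PC, where they are simply captured to a file. One line of the input data file looks like this: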

000A002D0044008B0125018C01F40237

The corresponding line in the CSV file is:

10,45,68,139,293,396,500,567

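The batch file worked, but it took several minutes to complete the conversion, which shocked me. I expected the Windows command processor to need a few seconds for a task that a console application written in C or C++ could do in a few milliseconds. An execution time of several minutes for a data file of less than 512 KiB was definitely not what I expected.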
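For reference, the conversion itself is tiny. The following is a minimal Python sketch of it, not part of the original batch solution; the function name `convert_line` is hypothetical and the input is assumed to consist only of 4-digit hexadecimal groups, as in the sample line above.

```python
def convert_line(hex_line: str) -> str:
    """Convert one line of 4-digit hexadecimal values to comma-separated decimals."""
    hex_line = hex_line.strip()  # drop the trailing carriage return / line feed
    if len(hex_line) % 4 != 0:
        raise ValueError("line length is not a multiple of four hex digits")
    # Slice the line into 4-character groups and parse each group as base 16.
    values = [int(hex_line[i:i + 4], 16) for i in range(0, len(hex_line), 4)]
    return ",".join(str(value) for value in values)


if __name__ == "__main__":
    # Sample line from the question.
    print(convert_line("000A002D0044008B0125018C01F40237"))
```

Run on the sample line from the question, this prints `10,45,68,139,293,396,500,567`, the CSV line shown above.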

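So I looked into this further and created a batch file that uses four different methods to create a CSV file with decimal values from a data file with hexadecimal values.

The complete batch file for testing the four methods is appended below, together with my test results.

I understand that the first two methods, which use a subroutine, are much slower than the last two, which do the conversion in one loop and write to the CSV file either on each loop iteration or once at the end of the FOR loop. Calling a subroutine makes cmd.exe perform several additional steps, and summed over thousands of subroutine calls these take a lot of time.

But I really don't understand why the first method, which uses a GOTO loop, is about six times slower than the FOR loop with almost the same two command lines.

Batch file code of method 1: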

@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "DelDataFile="
set "DataFile=%TEMP%\HexValues.dat"
if not exist "%DataFile%" set "DelDataFile=1" & echo 000A002D0044008B0125018C01F40237>"%DataFile%"

for /F "usebackq delims=" %%I in ("%DataFile%") do call :ConvertLine "%%I"

if defined DelDataFile del "%DataFile%"
endlocal
goto :EOF

:ConvertLine
set "DataLine=%~1"
set "AllValues="
set "StartColumn=0"
:NextValue
set /A Value=0x!DataLine:~%StartColumn%,4!
set "AllValues=%AllValues%,%Value%"
set /A StartColumn+=4
if not %StartColumn% == 32 goto NextValue
echo %AllValues:~1%
goto :EOF

Batch file code of method 2:

@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "DelDataFile="
set "DataFile=%TEMP%\HexValues.dat"
if not exist "%DataFile%" set "DelDataFile=1" & echo 000A002D0044008B0125018C01F40237>"%DataFile%"

for /F "usebackq delims=" %%I in ("%DataFile%") do call :LineConvert "%%I"

if defined DelDataFile del "%DataFile%"
endlocal
goto :EOF

:LineConvert
set "DataLine=%~1"
set "AllValues="
for /L %%J in (0,4,28) do (
    set /A Value=0x!DataLine:~%%J,4!
    set "AllValues=!AllValues!,!Value!"
)
echo !AllValues:~1!
goto :EOF

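While running the tests to find out the reason myself, I also discovered that method 1 takes 5 to 10 seconds longer on a PC running on battery than with the power supply plugged in.

The question:

What is the reason for the much slower execution of the GOTO loop used by method 1 compared with the FOR loop used by method 2, and why does method 1 depend on the kind of power supply of the PC?

Here is the whole batch file used for comparing the different methods: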

@echo off
setlocal EnableExtensions EnableDelayedExpansion
cls
set "TestRuns=5"
set "DelDataFile="
set "DataFile=%TEMP%\HexValues.dat"
if exist "%DataFile%" goto InitMethod1

set "DelDataFile=1"
echo Creating data file which takes some seconds, please wait ...
setlocal
set "HexDigits=0123456789ABCDEF"
set "DataLine="
(for /L %%I in (0,1,32767) do (
    set /A "Digit1=(%%I >> 12) %% 16"
    set /A "Digit2=(%%I >> 8) %% 16"
    set /A "Digit3=(%%I >> 4) %% 16"
    set /A "Digit4=%%I %% 16"
    set "HexValue="
    for %%J in (!Digit1! !Digit2! !Digit3! !Digit4!) do set "HexValue=!HexValue!!HexDigits:~%%J,1!"
    set "DataLine=!DataLine!!HexValue!"
    set /A "ValuesPerLine=%%I %% 8"
    if !ValuesPerLine! == 7 (
        echo !DataLine!
        set "DataLine="
    )
))>"%DataFile%"
endlocal
echo/


:InitMethod1
call :MethodInit 1
:RunMethod1
set /A TestRun+=1
set "CSV_File=%TEMP%\Values%Method%_%TestRun%.csv"
del "%CSV_File%" 2>nul
call :GetTime StartTime

for /F "usebackq delims=" %%I in ("%DataFile%") do call :ConvertLine "%%I"

call :OutputTime
if %TestRun% LSS %TestRuns% goto RunMethod1
call :MethodResults
goto InitMethod2

:ConvertLine
set "DataLine=%~1"
set "AllValues="
set "StartColumn=0"
:NextValue
set /A Value=0x!DataLine:~%StartColumn%,4!
set "AllValues=%AllValues%,%Value%"
set /A StartColumn+=4
if not %StartColumn% == 32 goto NextValue
>>"%CSV_File%" echo %AllValues:~1%
goto :EOF


:InitMethod2
call :MethodInit 2
:RunMethod2
set /A TestRun+=1
set "CSV_File=%TEMP%\Values%Method%_%TestRun%.csv"
del "%CSV_File%" 2>nul
call :GetTime StartTime

for /F "usebackq delims=" %%I in ("%DataFile%") do call :LineConvert "%%I"

call :OutputTime
if %TestRun% LSS %TestRuns% goto RunMethod2
call :MethodResults
goto InitMethod3

:LineConvert
set "DataLine=%~1"
set "AllValues="
for /L %%J in (0,4,28) do (
    set /A Value=0x!DataLine:~%%J,4!
    set "AllValues=!AllValues!,!Value!"
)
echo !AllValues:~1!>>"%CSV_File%"
goto :EOF


:InitMethod3
call :MethodInit 3
:RunMethod3
set /A TestRun+=1
set "CSV_File=%TEMP%\Values%Method%_%TestRun%.csv"
del "%CSV_File%" 2>nul
call :GetTime StartTime

for /F "usebackq delims=" %%I in ("%DataFile%") do (
    set "DataLine=%%I"
    set "AllValues="
    for /L %%J in (0,4,28) do (
        set /A Value=0x!DataLine:~%%J,4!
        set "AllValues=!AllValues!,!Value!"
    )
    echo !AllValues:~1!>>"%CSV_File%"
)

call :OutputTime
if %TestRun% LSS %TestRuns% goto RunMethod3
call :MethodResults
goto InitMethod4


:InitMethod4
call :MethodInit 4
:RunMethod4
set /A TestRun+=1
set "CSV_File=%TEMP%\Values%Method%_%TestRun%.csv"
del "%CSV_File%" 2>nul
call :GetTime StartTime

(for /F "usebackq delims=" %%I in ("%DataFile%") do (
    set "DataLine=%%I"
    set "AllValues="
    for /L %%J in (0,4,28) do (
        set /A Value=0x!DataLine:~%%J,4!
        set "AllValues=!AllValues!,!Value!"
    )
    echo !AllValues:~1!
))>>"%CSV_File%"

call :OutputTime
if %TestRun% LSS %TestRuns% goto RunMethod4
call :MethodResults
goto EndBatch


:GetTime
for /F "tokens=2 delims==." %%I in ('%SystemRoot%\System32\wbem\wmic.exe OS GET LocalDateTime /VALUE') do set "%1=%%I"
goto :EOF

:MethodInit
set "Method=%1"
echo Test runs with method %Method%
echo -----------------------
echo/
set "TestRun=0"
set "TotalTime=0"
goto :EOF

:MethodResults
set /A AverageTime=TotalTime / TestRun
echo Method %Method% total time: %TotalTime% seconds
echo Method %Method% average time: %AverageTime% seconds
echo/
goto :EOF

:OutputTime
call :GetTime EndTime
set /A StartTime=(1%StartTime:~8,2% - 100) * 3600 + (1%StartTime:~10,2% - 100) * 60 + 1%StartTime:~12,2% - 100
set /A EndTime=(1%EndTime:~8,2% - 100) * 3600 + (1%EndTime:~10,2% - 100) * 60 + 1%EndTime:~12,2% - 100
set /A DiffTime=EndTime - StartTime
set /A TotalTime+=DiffTime
echo Method %Method% run %TestRun% time: %DiffTime% seconds
goto :EOF


:EndBatch
if defined DelDataFile del "%DataFile%"
del /Q "%TEMP%\Values?_*.csv"
endlocal

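It first creates the data file with incrementing hexadecimal values in the folder for temporary files, which already takes some seconds. Comment out the command line if defined DelDataFile del "%DataFile%" near the end of this batch file to keep that file, in case you run the batch file several times or are interested in the file.

Then it runs each of the four methods five times, reading the hexadecimal values from the data file and writing the decimal values to a CSV file, and prints the test results to the console, i.e. to the handle STDOUT.

Finally it deletes all CSV files it created in the folder for temporary files, which all have the same content. Comment out the command line del /Q "%TEMP%\Values?_*.csv" to keep those CSV files if you are interested in them.

I executed this batch file four times on two notebooks.

Here are the results of the first run on a notebook with an Intel Core Duo P8400 at 2.26 GHz, 2 GiB RAM and a 7200 rpm hard disk, running Windows XP x86 with the power supply plugged in: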

Test runs with method 1
-----------------------

Method 1 run 1 time: 51 seconds
Method 1 run 2 time: 51 seconds
Method 1 run 3 time: 51 seconds
Method 1 run 4 time: 52 seconds
Method 1 run 5 time: 51 seconds
Method 1 total time: 256 seconds
Method 1 average time: 51 seconds

Test runs with method 2
-----------------------

Method 2 run 1 time: 9 seconds
Method 2 run 2 time: 9 seconds
Method 2 run 3 time: 9 seconds
Method 2 run 4 time: 8 seconds
Method 2 run 5 time: 9 seconds
Method 2 total time: 44 seconds
Method 2 average time: 9 seconds

Test runs with method 3
-----------------------

Method 3 run 1 time: 3 seconds
Method 3 run 2 time: 3 seconds
Method 3 run 3 time: 4 seconds
Method 3 run 4 time: 3 seconds
Method 3 run 5 time: 3 seconds
Method 3 total time: 16 seconds
Method 3 average time: 3 seconds

Test runs with method 4
-----------------------

Method 4 run 1 time: 3 seconds
Method 4 run 2 time: 2 seconds
Method 4 run 3 time: 2 seconds
Method 4 run 4 time: 2 seconds
Method 4 run 5 time: 2 seconds
Method 4 total time: 11 seconds
Method 4 average time: 2 seconds

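Method 2 is 5.67 times faster than method 1. Methods 3 and 4 are faster still than method 2, but that is what I expected. Most of the 2 and 3 seconds needed by methods 3 and 4 come from the WMIC command used to get the local date and time in a region-independent format.

Here are the results of the second run on the same computer. The only difference from the first run is that the PC was running on a fully charged battery: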

Test runs with method 1
-----------------------

Method 1 run 1 time: 63 seconds
Method 1 run 2 time: 61 seconds
Method 1 run 3 time: 61 seconds
Method 1 run 4 time: 61 seconds
Method 1 run 5 time: 61 seconds
Method 1 total time: 307 seconds
Method 1 average time: 61 seconds

Test runs with method 2
-----------------------

Method 2 run 1 time: 11 seconds
Method 2 run 2 time: 10 seconds
Method 2 run 3 time: 10 seconds
Method 2 run 4 time: 10 seconds
Method 2 run 5 time: 10 seconds
Method 2 total time: 51 seconds
Method 2 average time: 10 seconds

Test runs with method 3
-----------------------

Method 3 run 1 time: 3 seconds
Method 3 run 2 time: 4 seconds
Method 3 run 3 time: 3 seconds
Method 3 run 4 time: 4 seconds
Method 3 run 5 time: 3 seconds
Method 3 total time: 17 seconds
Method 3 average time: 3 seconds

Test runs with method 4
-----------------------

Method 4 run 1 time: 2 seconds
Method 4 run 2 time: 2 seconds
Method 4 run 3 time: 2 seconds
Method 4 run 4 time: 2 seconds
Method 4 run 5 time: 2 seconds
Method 4 total time: 10 seconds
Method 4 average time: 2 seconds

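As can be seen, the processing time increases only a little for methods 2 to 4. But the processing time of method 1 increases by 10 seconds, so this solution is now about 6.10 times slower than method 2. I have no idea why the processing time of method 1 depends on the kind of power supply.

Here are the results of the third run, the first on a second notebook with an Intel Core Duo T9600 at 2.80 GHz, 4 GiB RAM and an SSD, running Windows 7 x64 with the power supply plugged in: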

Test runs with method 1
-----------------------

Method 1 run 1 time: 91 seconds
Method 1 run 2 time: 88 seconds
Method 1 run 3 time: 77 seconds
Method 1 run 4 time: 77 seconds
Method 1 run 5 time: 78 seconds
Method 1 total time: 411 seconds
Method 1 average time: 82 seconds

Test runs with method 2
-----------------------

Method 2 run 1 time: 11 seconds
Method 2 run 2 time: 16 seconds
Method 2 run 3 time: 16 seconds
Method 2 run 4 time: 14 seconds
Method 2 run 5 time: 16 seconds
Method 2 total time: 73 seconds
Method 2 average time: 14 seconds

Test runs with method 3
-----------------------

Method 3 run 1 time: 6 seconds
Method 3 run 2 time: 4 seconds
Method 3 run 3 time: 4 seconds
Method 3 run 4 time: 4 seconds
Method 3 run 5 time: 6 seconds
Method 3 total time: 24 seconds
Method 3 average time: 4 seconds

Test runs with method 4
-----------------------

Method 4 run 1 time: 4 seconds
Method 4 run 2 time: 3 seconds
Method 4 run 3 time: 5 seconds
Method 4 run 4 time: 4 seconds
Method 4 run 5 time: 4 seconds
Method 4 total time: 20 seconds
Method 4 average time: 4 seconds

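Interestingly, the batch file takes more time on Windows 7 x64 with the more powerful hardware than on Windows XP x86. But even more interesting for me is that method 2 is 5.86 times faster than method 1 merely because a FOR loop is used instead of a GOTO loop.

For completeness, here are the results of the fourth run on the same computer as the third run, the difference being that the PC was running on a fully charged battery: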

Test runs with method 1
-----------------------

Method 1 run 1 time: 97 seconds
Method 1 run 2 time: 91 seconds
Method 1 run 3 time: 90 seconds
Method 1 run 4 time: 81 seconds
Method 1 run 5 time: 77 seconds
Method 1 total time: 436 seconds
Method 1 average time: 87 seconds

Test runs with method 2
-----------------------

Method 2 run 1 time: 12 seconds
Method 2 run 2 time: 16 seconds
Method 2 run 3 time: 17 seconds
Method 2 run 4 time: 16 seconds
Method 2 run 5 time: 13 seconds
Method 2 total time: 74 seconds
Method 2 average time: 14 seconds

Test runs with method 3
-----------------------

Method 3 run 1 time: 6 seconds
Method 3 run 2 time: 6 seconds
Method 3 run 3 time: 5 seconds
Method 3 run 4 time: 5 seconds
Method 3 run 5 time: 5 seconds
Method 3 total time: 27 seconds
Method 3 average time: 5 seconds

Test runs with method 4
-----------------------

Method 4 run 1 time: 4 seconds
Method 4 run 2 time: 4 seconds
Method 4 run 3 time: 5 seconds
Method 4 run 4 time: 4 seconds
Method 4 run 5 time: 4 seconds
Method 4 total time: 21 seconds
Method 4 average time: 4 seconds

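The execution times of methods 3 and 4 do not differ much from the third run with the power supply plugged in. But the execution time of method 1 increases by about 5 seconds, so method 1 is 6.21 times slower than method 2.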
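The speed ratios quoted for the four runs follow directly from the printed average times of methods 1 and 2. A small Python sketch that reproduces the arithmetic (the dictionary keys and the helper name `slowdown` are only illustrative labels):

```python
# Average times in seconds of (method 1, method 2), taken from the four runs above.
AVERAGES = {
    "Windows XP x86, power supply": (51, 9),
    "Windows XP x86, battery": (61, 10),
    "Windows 7 x64, power supply": (82, 14),
    "Windows 7 x64, battery": (87, 14),
}


def slowdown(method1_seconds: int, method2_seconds: int) -> float:
    """Return how many times slower method 1 is than method 2."""
    return method1_seconds / method2_seconds


if __name__ == "__main__":
    for run, (method1, method2) in AVERAGES.items():
        print(f"{run}: method 1 is {slowdown(method1, method2):.2f} times slower")
```

Because the batch file measures only whole seconds, these ratios are rough and should not be read as precise to two decimal places.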

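I would really like to know why method 1 is so much slower than method 2 and additionally depends on the kind of power supply.

Thanks to the Windows file cache, the hard disk activity LED flashed only rarely during all test runs.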

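【Question comments】:

Because the FOR command and all lines enclosed in it are read from the file once, and the fully parsed code is then executed several times from memory. A GOTO, on the other hand, just transfers execution, so every line in the loop is executed in the usual batch file way: open the file, read a line, close the file, parse the line, execute the line...
Put a :top label at the top of the full script, set a counter variable there and exit at 32767. Add goto :bottom. At the bottom add :bottom and goto :top. It takes 25 seconds to execute, because every line of the whole script has to be read 32767 times while searching downwards for the :bottom label.
@Aacini What you wrote above is absolutely right. It can be seen in the file system accesses recorded by Sysinternals Process Monitor during batch file execution. The solution that processes the data in HexValues.dat without GOTO and CALL makes no file access to HexValues.dat or to the batch file while processing all the lines, whereas with the solutions using GOTO and CALL the batch file is read line by line extremely often. So a FOR loop with delayed expansion is definitely better than using GOTO or CALL.

Tags: windows batch-file cmd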

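【Answer 1】:

Thank you for your answers. It looks like you are right.

The reason is that the label referenced by GOTO is first searched for downwards in the batch file, from the current line to the end of the file, and then from the top of the file downwards.

I first verified this with a batch file with the following lines and a file size of 941 bytes: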

@echo off
setlocal EnableExtensions EnableDelayedExpansion
for /F "tokens=2 delims==." %%I in ('%SystemRoot%\System32\wbem\wmic.exe OS GET LocalDateTime /VALUE') do set "StartTime=%%I"
for /F "usebackq delims=" %%I in ("%TEMP%\HexValues.dat") do call :ConvertLine "%%I"
for /F "tokens=2 delims==." %%I in ('%SystemRoot%\System32\wbem\wmic.exe OS GET LocalDateTime /VALUE') do set "EndTime=%%I"
set /A StartTime=(1%StartTime:~8,2% - 100) * 3600 + (1%StartTime:~10,2% - 100) * 60 + 1%StartTime:~12,2% - 100
set /A EndTime=(1%EndTime:~8,2% - 100) * 3600 + (1%EndTime:~10,2% - 100) * 60 + 1%EndTime:~12,2% - 100
set /A DiffTime=EndTime - StartTime
echo Time: %DiffTime% seconds
endlocal
goto :EOF

:ConvertLine
set "DataLine=%~1"
set "AllValues="
set "StartColumn=0"
:NextValue
set /A Value=0x!DataLine:~%StartColumn%,4!
set "AllValues=%AllValues%,%Value%"
set /A StartColumn+=4
if not %StartColumn% == 32 goto NextValue
goto :EOF

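On the notebook with Windows XP, this batch file needed 34 seconds for the task that the larger test batch file completed in 51 seconds.

Then I created a copy of this batch file and inserted, between goto :EOF and :ConvertLine, a block of 250 lines that all contain the same string: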

rem comment line of no interest

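This batch file with 9193 bytes needed 64 seconds to finish exactly the same task.

So searching for a label that is only four lines up is definitely the reason for the much longer time of method 1 compared with method 2. And method 2 is slower than methods 3 and 4 mainly for the same reason.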
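The two measurements allow a back-of-the-envelope estimate of what one scanned line costs. The Python sketch below is only a rough calculation under stated assumptions: the data file is the one generated by the test batch file in the question (32768 values, eight per line), each data line causes one CALL and seven GOTO label searches, and each of those searches passes the 250 inserted lines exactly once. None of these assumptions was measured in this answer.

```python
# Assumption: HexValues.dat as generated by the test batch file in the question.
DATA_LINES = 32768 // 8          # 4096 lines with eight values each
CALLS_PER_LINE = 1               # call :ConvertLine
GOTOS_PER_LINE = 7               # goto NextValue, taken for StartColumn 4 to 28
EXTRA_LINES = 250                # inserted "rem" lines between goto :EOF and :ConvertLine

# Measured on the Windows XP notebook: 34 s without and 64 s with the extra lines.
SECONDS_SHORT_FILE = 34
SECONDS_PADDED_FILE = 64

label_searches = DATA_LINES * (CALLS_PER_LINE + GOTOS_PER_LINE)
extra_lines_scanned = label_searches * EXTRA_LINES
microseconds_per_line = (
    (SECONDS_PADDED_FILE - SECONDS_SHORT_FILE) / extra_lines_scanned * 1_000_000
)

if __name__ == "__main__":
    print(f"label searches:         {label_searches}")
    print(f"extra lines scanned:    {extra_lines_scanned}")
    print(f"estimated cost per line: {microseconds_per_line:.2f} microseconds")
```

Under these assumptions the extra 30 seconds are spread over about 8.2 million additional line scans, i.e. a few microseconds per scanned line. The timings have a resolution of one second, so this is an order-of-magnitude figure only.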

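But I still have not found out why the second batch file with 9193 bytes needs 72 seconds instead of 64 seconds on the notebook when it runs on battery instead of the plugged-in power supply. The batch file and the data file are loaded in the cache. The batch file produces no output while running. I have configured the power options so that maximum performance is also used when running on battery. Scanning for the label in the batch file is obviously slower on battery than with the power supply plugged in, although the hard disk is not really accessed during batch file execution, only the CPU core, the CPU cache and the RAM.

I also tried batch code that uses the region-dependent TIME environment variable instead of the WMIC command to get a region-independent date/time. On my PC %TIME% expands to the German time format HH:MM:SS.ms.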

@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "StartTime=%TIME%"
for /F "usebackq delims=" %%I in ("%TEMP%\HexValues.dat") do call :ConvertLine "%%I"
set "EndTime=%TIME%"
set /A StartTime=(1%StartTime:~0,2% - 100) * 3600 + (1%StartTime:~3,2% - 100) * 60 + 1%StartTime:~6,2% - 100
set /A EndTime=(1%EndTime:~0,2% - 100) * 3600 + (1%EndTime:~3,2% - 100) * 60 + 1%EndTime:~6,2% - 100
set /A DiffTime=EndTime - StartTime
echo Time: %DiffTime%
endlocal
goto :EOF

:ConvertLine
set "DataLine=%~1"
set "AllValues="
set "StartColumn=0"
:NextValue
set /A Value=0x!DataLine:~%StartColumn%,4!
set "AllValues=%AllValues%,%Value%"
set /A StartColumn+=4
if not %StartColumn% == 32 goto NextValue
goto :EOF

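This batch file finished in 30 seconds on Windows XP x86 with the power supply plugged in and the HDD ST980411ASG with 7200 rpm (a magnetic hard disk). The same batch file, run next on battery, took 37 seconds on the same PC.

On the PC with Windows 7 x64 and a Samsung SSD 850 EVO (a solid-state disk) it took 72 seconds with the power supply plugged in and 77 seconds on battery. I only unplugged the power supply between the test runs and changed nothing else. The PC was not connected to any network, WLAN was switched off by hardware switch, Bluetooth was disabled in the BIOS, and the antivirus application was disabled during execution (except Windows Defender on Windows 7 x64).

I ran this batch file again and again on the PC with Windows 7 x64. Eventually, with the fan spinning almost continuously during execution, the execution time became constant at 72 seconds, independent of whether the power supply was plugged in.

For comparison I also executed this batch file: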

@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "StartTime=%TIME%"
for /F "usebackq delims=" %%I in ("%TEMP%\HexValues.dat") do (
    set "DataLine=%%I"
    set "AllValues="
    for /L %%J in (0,4,28) do (
        set /A Value=0x!DataLine:~%%J,4!
        set "AllValues=!AllValues!,!Value!"
    )
)
set "EndTime=%TIME%"
set /A StartTime=(1%StartTime:~0,2% - 100) * 3600 + (1%StartTime:~3,2% - 100) * 60 + 1%StartTime:~6,2% - 100
set /A EndTime=(1%EndTime:~0,2% - 100) * 3600 + (1%EndTime:~3,2% - 100) * 60 + 1%EndTime:~6,2% - 100
set /A DiffTime=EndTime - StartTime
echo Time: %DiffTime%
endlocal

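It needs 1 or 2 seconds to finish on the PC running Windows XP: most often 1 second with the power supply plugged in and most often 2 seconds on battery. I would also need to take the fractions of a second into account to be more precise for this fast-finishing solution. On the PC with Windows 7 x64 the execution time is between 4 and 5 seconds, with the same tendency towards the shorter time with the power supply plugged in as on the other notebook with Windows XP.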
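One way to get sub-second resolution is to include the fractional part of the two captured %TIME% strings in the difference. The following Python sketch only illustrates that calculation; it assumes the strings have the shape HH:MM:SS followed by a dot or comma and two digits (hundredths of a second), possibly with a leading space instead of a leading zero, and the function names are hypothetical.

```python
import re

CENTISECONDS_PER_DAY = 24 * 60 * 60 * 100


def time_to_centiseconds(time_text: str) -> int:
    """Convert a %TIME%-like string such as '12:34:56,78' to hundredths of a second."""
    match = re.fullmatch(r"\s*(\d{1,2}):(\d{2}):(\d{2})[.,](\d{2})\s*", time_text)
    if match is None:
        raise ValueError(f"unexpected time format: {time_text!r}")
    hours, minutes, seconds, centiseconds = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 100 + centiseconds


def elapsed_centiseconds(start_text: str, end_text: str) -> int:
    """Return the elapsed time, allowing for one wrap around midnight."""
    difference = time_to_centiseconds(end_text) - time_to_centiseconds(start_text)
    if difference < 0:
        difference += CENTISECONDS_PER_DAY
    return difference


if __name__ == "__main__":
    elapsed = elapsed_centiseconds("12:00:00,00", "12:00:01,50")
    print(f"{elapsed // 100}.{elapsed % 100:02d} seconds")
```

Python's `int()` has no problem with values such as `08` or `09`. In the batch files above the same values are prefixed with `1` and reduced by 100 so that `set /A` does not interpret them as invalid octal numbers.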

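The hard disk activity LED does not blink differently on either computer compared with not running a batch file at all. And I cannot hear any different noise from the HDD on Windows XP while running the batch file that takes about 30 seconds, or longer on battery, to finish.

But with Process Monitor I can see on both PCs that the batch file itself is opened, read and closed permanently while running the batch file that uses the GOTO loop, whereas there is nearly no batch file access with the most optimized version that uses FOR loops.

And cmd.exe really does read the batch file line by line again and again with the GOTO method. This can also be seen in Process Monitor from the many ReadFile accesses with increasing Offset values that are identical to the offsets of the beginning of each line in the batch file. The execution time increases dramatically while Process Monitor records the file system accesses, because more than 2 million events are recorded and more than 500,000 of them are displayed.

Furthermore, Process Monitor shows that with the optimized FOR-loop-only version cmd.exe reads the batch file lines up to the end of the FOR loop, then reads the whole of HexValues.dat once with just one ReadFile access, takes 5 seconds (on Windows 7 x64) to complete the conversion from hexadecimal to decimal without any file system access, and then reads the remaining lines of the batch file to finish its execution. Process Monitor recorded only about 50,000 events, of which fewer than 100 were displayed.

I suppose that Intel SpeedStep technology being enabled in the BIOS is the reason for the increased execution time on battery of a batch file with a GOTO whose label is above the current command line. In my opinion this would also explain why running the second batch file posted in this answer again and again on Windows 7 x64 finally results in a constant execution time independent of whether the power supply is plugged in: with one core constantly running at 100% load, Intel SpeedStep eventually raises the CPU performance to the maximum even on battery.

Conclusion:

The GOTO solution of method 1 is much slower than all other methods because, according to the information given by Aacini and others and verified by analysis with Process Monitor, cmd.exe does the following on reaching goto NextValue: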

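Open the batch file for reading and query the standard information of the file to check whether the batch file has changed since the last access. The other steps below do not query the standard information.
Read line by line from the batch file, processing each line to find the line with :NextValue, without success up to the end of the file.
Rewind to the top of the file on reaching the end of the file.
Now read line by line from the top of the batch file, processing each line to find the line with :NextValue.
Close the batch file after finding the line :NextValue, the offset of the next command line to process now being known.
Open the batch file again.
Read the line set /A Value=0x!DataLine:~%StartColumn%,4!.
Close the batch file again.
Process the command line read.
Open the batch file again.
Read the next line from the batch file, set "AllValues=%AllValues%,%Value%".
Close the batch file.
Process the command line read.
Open the batch file again.
Read the next line from the batch file, set /A StartColumn+=4.
Close the batch file.
Process the command line read.
Open the batch file again.
Read the next line from the batch file, if not %StartColumn% == 32 goto NextValue.
Close the batch file.
Process the command line read.
Continue with the first step if the condition is true, i.e. StartColumn is not 32.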
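The label search in the steps above can be modeled in a few lines of Python. This is a simplified sketch, not a description of cmd.exe internals: it treats the batch file as a list of lines, starts scanning on the line after the GOTO, wraps around at the end of the file, and counts how many lines are read until the label is found. The function name and the shortened `SCRIPT` list (the timing and FOR /F lines of the real test file are collapsed into one placeholder line) are assumptions made for illustration.

```python
GOTO_LINE = "if not %StartColumn% == 32 goto NextValue"

# Shortened model of the test batch file; only the layout around the labels matters.
SCRIPT = [
    "@echo off",
    "setlocal EnableExtensions EnableDelayedExpansion",
    "rem timing and FOR /F lines of the test file are omitted in this model",
    "goto :EOF",
    "",
    ":ConvertLine",
    'set "DataLine=%~1"',
    'set "AllValues="',
    'set "StartColumn=0"',
    ":NextValue",
    "set /A Value=0x!DataLine:~%StartColumn%,4!",
    'set "AllValues=%AllValues%,%Value%"',
    "set /A StartColumn+=4",
    GOTO_LINE,
    "goto :EOF",
]


def lines_read_for_goto(script_lines, goto_index, label):
    """Return (label_index, lines_read) for a GOTO executed on line goto_index."""
    wanted = ":" + label.lower()
    total = len(script_lines)
    for lines_read in range(1, total + 1):
        # Scan downwards from the line after the GOTO and wrap around at the end.
        index = (goto_index + lines_read) % total
        words = script_lines[index].split()
        if words and words[0].lower() == wanted:
            return index, lines_read
    raise LookupError(f"label {label!r} not found")


if __name__ == "__main__":
    padded = SCRIPT[:5] + ["rem comment line of no interest"] * 250 + SCRIPT[5:]
    for name, script in (("short file", SCRIPT), ("padded file", padded)):
        _, lines_read = lines_read_for_goto(script, script.index(GOTO_LINE), "NextValue")
        print(f"{name}: {lines_read} lines read per goto NextValue")
```

In this model every `goto NextValue` reads almost the whole file although the label is only four lines above the GOTO, and the 250 inserted comment lines add 250 line reads to every single jump.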

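All those batch file open/read/close operations take a few milliseconds even without accessing the storage medium. Most likely (my assumption) the time is spent accessing the DRAM on the motherboard into which the file is loaded, because the whole batch file content is not loaded into the CPU's internal cache.

So using a FOR loop in the subroutine, as method 2 does, already reduces the number of batch file access operations dramatically. None of the 21 steps, which involve many more file access operations for reading the batch file line by line than are explicitly listed in the individual steps, has to be done while processing the hexadecimal values in the current line read from HexValues.dat.

All lines in HexValues.dat can be processed by cmd.exe without any access to the batch file when the entire conversion is done in one FOR loop, as methods 3 and 4 do. And finally, even more file system accesses can be saved by outputting all lines of the CSV to STDOUT (a buffer in memory) and writing them to the CSV file just once, as method 4 does, which reduces the total time needed for this value conversion task once more compared with method 3.

A command line with goto LABEL or call :LABEL, where LABEL is above the current line or many lines below it, should be avoided in a larger batch file for anything executed in a loop with hundreds or thousands of iterations. In such cases a complex FOR loop, which is not easy to understand for people who are not experts in batch file coding, is better than a nicely readable solution using GOTO or CALL. In other words, it is always advisable to use just a FOR loop for something that is done very often in a loop.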

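【Comments】:

Interesting that the power supply affects performance even with identical battery mode settings. Unfortunately my notebook is broken, so I cannot run any experiments at the moment, but I guess Windows still behaves differently on battery than on mains power, even when all settings are the same...
I think you are wrong with these phrases: "The batch file and the data file are loaded in the cache" and "although the hard disk is not really accessed during batch file execution". Where did you get these points from? As far as I know, batch file execution repeatedly opens/reads/closes the batch file from disk for each executed line... I also disagree that a solution using GOTO or CALL is a "nicely readable solution" compared with a complex FOR construct... ;)
Most disks have an internal cache. So I think it is safe to assume that a batch file (if not too large) effectively runs from the Windows cache or at least from the disk cache (as long as there is no other heavy I/O).
@Aacini I am very sure that the hard disk itself is not accessed during batch file execution. I extended my answer after running more tests and using Sysinternals Process Monitor. The file accesses are all done on the cached file in memory. But interestingly, when running the batch file solution with GOTO again and again, after more than 10 executions it makes no difference on Windows 7 x64 whether it runs on mains power or on battery. In any case, the main batch file is a nice performance test script. I also plan to run it on other PCs with Windows 10 x64.
IMHO the central point causing this problem is the open/read/close file method used to run batch files, which is done just once with FOR but repeated many times with GOTO, as I first mentioned in my comment. The fact that the GOTO command searches for the label in downward order just causes more lines to be processed in a large file. Yet in your extensive explanation of the problem you did not even mention my comment on this point...

【Answer 2】:

As already mentioned, GOTO and CALL search for the next matching label from the current file position to the end of the file and then from the beginning of the file to the current file position.
This behavior has some useful effects, because you don't need to worry about giving labels different names in different functions.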

:myFunc1
<some code>
goto :someLabel  -- this goto's to the next :someLabel  

:someLabel

:myFunc2
<some code>
goto :someLabel  -- this goto's to the next :someLabel  

:someLabel

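But when you build loops, FOR has a big speed advantage, because the complete FOR parenthesis block is read only once and parsed up to phase 2.
The parsed block is in the cmd cache, so no further disk reads are needed and the tokenization has already been done.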
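The search order described in this answer also explains why the duplicate :someLabel labels in the example do not conflict. The Python sketch below models only that search order (from the line after the GOTO to the end of the file, then from the top); the function name `resolve_label` and the `rem some code` placeholder lines are hypothetical and stand in for the example's function bodies.

```python
def resolve_label(script_lines, current_index, label):
    """Return the index of the label line a GOTO on current_index would reach, or None."""
    wanted = ":" + label.lower()
    total = len(script_lines)
    # First the lines below the GOTO, then the lines from the top down to the GOTO.
    search_order = list(range(current_index + 1, total)) + list(range(current_index + 1))
    for index in search_order:
        if script_lines[index].strip().lower() == wanted:
            return index
    return None


SCRIPT = [
    ":myFunc1",
    "rem some code",
    "goto :someLabel",   # index 2
    ":someLabel",        # index 3
    ":myFunc2",
    "rem some code",
    "goto :someLabel",   # index 6
    ":someLabel",        # index 7
]

if __name__ == "__main__":
    print(resolve_label(SCRIPT, 2, "someLabel"))  # GOTO inside myFunc1
    print(resolve_label(SCRIPT, 6, "someLabel"))  # GOTO inside myFunc2
```

In this model each GOTO reaches the label directly below it (indices 3 and 7), so the two functions can reuse the same label name. A search that starts below the last :someLabel would wrap around and find the first one.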

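【Comments】:

I suggest using a label like :@F somewhere to define a repeated label that is only used to jump forward a few lines via GOTO @F. This resembles MASM's anonymous label feature (used in the same way) and indicates (by convention) that such a label may occur multiple times and that this is not an error.

【Answer 3】:

According to this analysis of the interpreter, FOR variables are expanded in phase 4, so the interpreter knows immediately how many times and with which values the command is to be executed. In contrast, each GOTO is interpreted in phase 7 and requires the file to be rescanned for the label every time, which explains the perceived time difference.

【Comments】:

【Answer 4】:

This is due to the way goto works. Unlike compiled programming languages, where a goto is translated to a fixed address during compilation, batch has to search the file for the label on every goto.

It searches the file downwards from the goto line, and if the label is not found, it continues the search from the beginning of the file.

【Comments】:

【Answer 5】:

If I remember correctly, the GOTO LABEL command in a batch file actually scans the rest of the text for the label, and if it does not find it, it restarts from the top.

That is a fairly expensive operation, and I think CALL does the same.

So if you care about performance, you should try to avoid these constructs. Or, even better, don't do this kind of work in batch files.

【Comments】: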