fwrite UTF-8 without BOM: answers

【Question title】: fwrite UTF-8 without BOM
【Posted】: 2014-02-11 21:17:47
【Question description】:

This writes data to a UTF-8 encoded file (with a BOM):

function writeStringToFile($file, $string){
  $f=fopen($file, "wb");
  $string="\xEF\xBB\xBF".$string; // UTF-8
  fputs($f, $string);
  fclose($f);
}

How can I write UTF-8 encoded data without a BOM?

Thanks.

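Edit: screenshot from Notepad++ showing the encoding:

【Comments】:

Just don't add the BOM (\xEF\xBB\xBF)?
I think that would be plain ANSI...
@ihtus: A file without a BOM is just a sequence of bytes. How they are interpreted depends on the software that processes them.
"screenshot from Notepad++ showing the encoding:" --- now open the file in a hex editor.
@ihtus: I'm not so sure.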
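
As a rough alternative to opening the file in a hex editor, the first bytes can be inspected from PHP itself. This is only a sketch: `hasUtf8Bom` is a hypothetical helper name, not something from the thread, and the demo writes to a temporary file created just for illustration.

```php
<?php
// Hypothetical helper (not from the thread): reports whether a file
// begins with the three UTF-8 BOM bytes EF BB BF.
function hasUtf8Bom(string $path): bool
{
    $f = fopen($path, "rb");
    if ($f === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $head = fread($f, 3); // shorter than 3 bytes if the file is tiny
    fclose($f);
    return $head === "\xEF\xBB\xBF";
}

// Demo on a temporary file (assumed writable temp directory).
$demo = tempnam(sys_get_temp_dir(), "bom");
file_put_contents($demo, "\xEF\xBB\xBF" . "hello");
echo bin2hex(substr(file_get_contents($demo), 0, 3)), "\n"; // hex dump of the first 3 bytes
var_dump(hasUtf8Bom($demo));
unlink($demo);
```

If the helper returns false, the file holds only the bytes of the string itself, and whether an editor labels it "ANSI" or "UTF-8 without BOM" is a guess made by that editor.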

Tags: php byte-order-mark

【Solution 1】:

function writeStringToFile($file, $string){
    $f=fopen($file, "wb");
    // $file="\xEF\xBB\xBF".$file; // UTF-8 <-- this is UTF8 BOM
    fputs($f, $string);
    fclose($f);
}

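【Discussion】:

To be precise: the $file variable isn't even used :-)
Oh! True =) Time to go to sleep xD I think there should be something like $string="\xEF\xBB\xBF".$string;
OK, then I just need to cut off the first 3 bytes.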
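
Cutting off the first 3 bytes is only safe when those bytes really are the BOM, so a check is worth adding. A minimal sketch, assuming the data is already in a string; `stripUtf8Bom` and `writeStringToFileNoBom` are hypothetical names introduced here, not part of the thread:

```php
<?php
// Hypothetical helper: drop a leading UTF-8 BOM (EF BB BF) if present,
// otherwise return the string unchanged.
function stripUtf8Bom(string $s): string
{
    if (strncmp($s, "\xEF\xBB\xBF", 3) === 0) {
        return substr($s, 3);
    }
    return $s;
}

// Hypothetical wrapper: write the string as UTF-8 without a BOM.
function writeStringToFileNoBom(string $file, string $string): void
{
    if (file_put_contents($file, stripUtf8Bom($string)) === false) {
        throw new RuntimeException("Cannot write $file");
    }
}
```

Only a BOM at the very start is removed. The same three bytes appearing later in the string are left alone, since there they encode an ordinary U+FEFF character rather than a signature.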

【Solution 2】:

The only way I found is to remove the BOM after the file has been created.

<?php
// change the pathname to the target file from which you want to remove the BOM
$pathname = "./test.txt";

$file_handler = fopen($pathname, "r");
$contents = fread($file_handler, filesize($pathname));
fclose($file_handler);

for ($i = 0; $i < 3; $i++){
    $bytes[$i] = ord(substr($contents, $i, 1));
}

if ($bytes[0] == 0xef && $bytes[1] == 0xbb && $bytes[2] == 0xbf){
    $file_handler = fopen($pathname, "w");
    fwrite($file_handler, substr($contents, 3));
    fclose($file_handler);
    printf("%s BOM removed. \n", $pathname);
}
?>

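【Discussion】:

【Solution 3】:

The byte string "\xEF\xBB\xBF" is the signature of the «UTF-8 with BOM» format.

If you have a string in this format and want to write it to a file as «plain» UTF-8, you have to remove these bytes. This can be done in several ways, for example with preg_replace: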

function writeStringToFileUTF8($file, $string){
  // strip a leading UTF-8 BOM, if present (anchored so later bytes are untouched)
  $string = preg_replace("`^\xEF\xBB\xBF`", "", $string);
  // this is equivalent to fopen(w)/fputs()/fclose()
  file_put_contents($file, $string);
}

【Discussion】: