How could I parse gettext .mo files in PHP4 without relying on setlocale/locales at all?

【Question title】: How could I parse gettext .mo files in PHP4 without relying on setlocale/locales at all?
【Posted】: 2009-12-22 21:08:46
【Question】:

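I have opened a couple of related threads, but this is the one direct question I'm looking for an answer to. My framework will use Zend_Translate if the PHP version is 5; otherwise I have to mimic that functionality for PHP 4.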

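It seems that pretty much every gettext implementation relies on setlocale or on the locales installed on the system. I know there are a lot of inconsistencies across systems, which is why I don't want to depend on either.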

I've tried a couple of times to get the textdomain, bindtextdomain and gettext functions to work, but I've always needed to call setlocale.

By the way, all the .mo files are UTF-8.

Tags: php localization gettext php4

【Solution 1】:

Here's some reusable code for parsing MO files in PHP, based on Zend_Translate_Adapter_Gettext:

<?php

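// Note: this class uses PHP 5 syntax (visibility keywords, exceptions).
// On PHP 4, replace "private" with "var", drop "public", and report
// errors through return values instead of exceptions.
class MoParser {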

    private $_bigEndian   = false;
    private $_file        = false;
    private $_data        = array();

    private function _readMOData($bytes)
    {
        if ($this->_bigEndian === false) {
            return unpack('V' . $bytes, fread($this->_file, 4 * $bytes));
        } else {
            return unpack('N' . $bytes, fread($this->_file, 4 * $bytes));
        }
    }

    public function loadTranslationData($filename, $locale)
    {
        $this->_data      = array();
        $this->_bigEndian = false;
        $this->_file      = @fopen($filename, 'rb');
        if (!$this->_file) throw new Exception('Error opening translation file \'' . $filename . '\'.');
        if (@filesize($filename) < 10) {
            fclose($this->_file); // do not leak the handle on failure
            throw new Exception('\'' . $filename . '\' is not a gettext file');
        }

        // get Endian
        $input = $this->_readMOData(1);
        if (strtolower(substr(dechex($input[1]), -8)) == "950412de") {
            $this->_bigEndian = false;
        } else if (strtolower(substr(dechex($input[1]), -8)) == "de120495") {
            $this->_bigEndian = true;
        } else {
            fclose($this->_file); // do not leak the handle on failure
            throw new Exception('\'' . $filename . '\' is not a gettext file');
        }
        // read revision - not supported for now
        $input = $this->_readMOData(1);

        // number of strings in the catalog
        $input = $this->_readMOData(1);
        $total = $input[1];

        // offset of the table of original strings
        $input = $this->_readMOData(1);
        $OOffset = $input[1];

        // offset of the table of translated strings
        $input = $this->_readMOData(1);
        $TOffset = $input[1];

        // read both tables; each entry is a (length, offset) pair
        fseek($this->_file, $OOffset);
        $origtemp = $this->_readMOData(2 * $total);
        fseek($this->_file, $TOffset);
        $transtemp = $this->_readMOData(2 * $total);

        for($count = 0; $count < $total; ++$count) {
            if ($origtemp[$count * 2 + 1] != 0) {
                fseek($this->_file, $origtemp[$count * 2 + 2]);
                $original = @fread($this->_file, $origtemp[$count * 2 + 1]);
                $original = explode("\0", $original);
            } else {
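                // reset the whole array so plural parts left over from a
                // previous entry do not leak into this one
                $original = array('');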
            }

            if ($transtemp[$count * 2 + 1] != 0) {
                fseek($this->_file, $transtemp[$count * 2 + 2]);
                $translate = fread($this->_file, $transtemp[$count * 2 + 1]);
                $translate = explode("\0", $translate);
                if ((count($original) > 1) && (count($translate) > 1)) {
                    $this->_data[$locale][$original[0]] = $translate;
                    array_shift($original);
                    foreach ($original as $orig) {
                        $this->_data[$locale][$orig] = '';
                    }
                } else {
                    $this->_data[$locale][$original[0]] = $translate[0];
                }
            }
        }

        // drop the catalog header entry (empty msgid): it holds metadata,
        // not a translation
        unset($this->_data[$locale]['']);

        fclose($this->_file);
        return $this->_data;
    }

}
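
Since the question targets PHP 4, here is a minimal sketch of the same idea written without PHP 5 only syntax. It is not part of the answer above: `mo_parse_string` and `mo_build_sample` are hypothetical names, the parser works on a string already read from disk, and plural entries are left NUL-separated rather than split. The layout it assumes is the GNU MO header (magic number, revision, string count, offset of the original table, offset of the translation table), followed by tables of (length, offset) pairs.

```php
<?php
// Hypothetical helper: parse the contents of a .mo file held in a string.
// Returns an array of original => translation, or false on a bad header.
function mo_parse_string($bin)
{
    if (strlen($bin) < 20) {
        return false; // too short to hold the header words read below
    }
    // The magic number 0x950412de is stored in the file's own byte order.
    $head = substr($bin, 0, 4);
    if ($head === "\xde\x12\x04\x95") {
        $fmt = 'V'; // little-endian file
    } elseif ($head === "\x95\x04\x12\xde") {
        $fmt = 'N'; // big-endian file
    } else {
        return false;
    }
    // Words after the magic: revision, string count, offset of the
    // original table, offset of the translation table.
    $h = unpack($fmt . '4', substr($bin, 4, 16));
    $count    = $h[2];
    $origOff  = $h[3];
    $transOff = $h[4];

    $out = array();
    for ($i = 0; $i < $count; $i++) {
        // Each table entry is a (length, offset) pair of 32-bit words.
        $o = unpack($fmt . '2', substr($bin, $origOff + 8 * $i, 8));
        $t = unpack($fmt . '2', substr($bin, $transOff + 8 * $i, 8));
        if ($o[1] == 0) {
            continue; // empty msgid: catalog header, not a translation
        }
        $out[substr($bin, $o[2], $o[1])] = substr($bin, $t[2], $t[1]);
    }
    return $out;
}

// Hypothetical fixture builder, only so the sketch can run without a real
// .mo file. Keys must already be in the order they should be stored.
function mo_build_sample($pairs)
{
    $n = count($pairs);
    $idsStart = 28 + 16 * $n; // strings begin right after both tables
    $ids = '';
    $strs = '';
    foreach ($pairs as $id => $str) {
        $ids  .= $id . "\0";
        $strs .= $str . "\0";
    }
    $strsStart = $idsStart + strlen($ids);
    $otab = '';
    $ttab = '';
    $io = 0;
    $it = 0;
    foreach ($pairs as $id => $str) {
        $otab .= pack('VV', strlen($id), $idsStart + $io);
        $ttab .= pack('VV', strlen($str), $strsStart + $it);
        $io += strlen($id) + 1;
        $it += strlen($str) + 1;
    }
    // 7-word header: magic, revision, count, both table offsets, and an
    // empty hash table (size 0).
    $header = pack('V7', 0x950412de, 0, $n, 28, 28 + 8 * $n, 0, $idsStart);
    return $header . $otab . $ttab . $ids . $strs;
}

$bin = mo_build_sample(array(
    ''      => "Content-Type: text/plain; charset=UTF-8\n",
    'Hello' => 'Bonjour',
));
$catalog = mo_parse_string($bin);
echo $catalog['Hello'], "\n";
```

With a real file, the string would come from reading the .mo file in binary mode (for example `fread` on a handle opened with `'rb'`). No gettext function and no setlocale call is involved at any point.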

【Solution 2】:

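OK, I basically ended up writing a .mo file parser based on Zend's Gettext adapter. As far as I know, gettext depends heavily on the system locale, so parsing the .mo files by hand avoids the odd locale problems that come with setlocale. I also plan to parse the Zend Locale data, which is provided as XML files.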
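
To illustrate why hand-parsed catalogs sidestep setlocale, here is a small sketch of a lookup over already-parsed data. The `$catalogs` array and the `mo_translate` function are hypothetical, not taken from Zend or from the parser in this thread. The point is only that the locale becomes a plain array key chosen by the application, so nothing depends on which locales the operating system has installed.

```php
<?php
// Hypothetical data, shaped like the output of a hand-written .mo parser:
// locale name => (original string => translated string).
$catalogs = array(
    'fr' => array('Hello' => 'Bonjour'),
    'de' => array('Hello' => 'Hallo'),
);

// Hypothetical lookup helper. Falls back to the original string when the
// locale or the message is missing, which mirrors what gettext() does.
function mo_translate($catalogs, $locale, $msgid)
{
    if (isset($catalogs[$locale][$msgid]) && $catalogs[$locale][$msgid] !== '') {
        return $catalogs[$locale][$msgid];
    }
    return $msgid;
}

echo mo_translate($catalogs, 'fr', 'Hello'), "\n";
echo mo_translate($catalogs, 'fr', 'Goodbye'), "\n";
```

The first call prints the French translation; the second has no entry and prints the original string unchanged.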

【Comments】:

+1, that's what I ended up doing as well. I'll post the code here.