【问题标题】:PHPExcel 1.8.0 memory exhausted on load, chunk read, and iteratorPHPExcel 1.8.0 内存在加载、块读取和迭代器时耗尽
【发布时间】:2015-05-20 19:50:56
【问题描述】:

我已经将我的php配置设置为上传12800M文件,无限大小文件,上传时间无限进行测试,但我一直卡在这个常见的PHPExcel致命内存耗尽错误。我收到以下常见错误消息:

致命错误:第 1220 行 .../Worksheet.php 中允许的内存大小为 536870912 字节已用尽(尝试分配 32 字节)。

当我使用块读取过滤器或迭代器时,当 .xlsx 文件被进一步读取到行中时,内存使用量会增加,而不是像我从 PHPExcel 开发人员那里看到的那样保持不变。

我使用的是 PHPExcel 1.8.0。我可能会尝试旧版本,因为我已经读过读取大文件的性能更好。我开始定期加载文件并将其读入数组,使用示例中的块读取过滤,并在此 URL 处使用迭代器:PHPExcel - memory leak when I go through all rows。我认为它是 1.8.0 或更早版本都没有关系。

<?php

/** Error reporting */
error_reporting(E_ALL);
ini_set('display_errors', TRUE);
ini_set('display_startup_errors', TRUE);
date_default_timezone_set('America/Los_Angeles');

define('EOL',(PHP_SAPI == 'cli') ? PHP_EOL : '<br />');

/** Include PHPExcel_IOFactory */
require_once dirname(__FILE__) . '/../Classes/PHPExcel/IOFactory.php';


if (!file_exists("Test.xlsx")) {
    exit("Please check if Test.xlsx exists first.\n");
}

echo date('H:i:s') , " Load workbook from Excel5 file" , EOL;
$callStartTime = microtime(true);

$objPHPExcel = PHPExcel_IOFactory::load("Order_Short.xls");

$callEndTime = microtime(true);
$callTime = $callEndTime - $callStartTime;

echo 'Call time to load Workbook was ' , sprintf('%.4f',$callTime) , " seconds" , EOL;
// Echo memory usage
echo date('H:i:s') , ' Current memory usage: ' , (memory_get_usage(true) / 1024 / 1024) , " MB" , EOL;


//http://runnable.com/Uot2A2l8VxsUAAAR/read-a-simple-2007-xlsx-excel-file-for-php

//  Read your Excel workbook
$inputFileName="Drybar Client Data for Tableau Test.xlsx";
$table = "mt_company2_project2_table144_raw";

try {
    echo date('H:i:s') , " Load workbook from Excel5 file" , EOL;
    $callStartTime = microtime(true);

    $inputFileType = PHPExcel_IOFactory::identify($inputFileName);
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objPHPExcel = $objReader->load($inputFileName);

    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;

    echo 'Call time to load Workbook was ' , sprintf('%.4f',$callTime) , " seconds\r\n" , EOL;
    // Echo memory usage
    echo date('H:i:s') , ' Current memory usage: ' , (memory_get_usage(true) / 1024 / 1024) , " MB" , EOL;
} catch (Exception $e) {
    die('Error loading file "' . pathinfo($inputFileName, PATHINFO_BASENAME)
        . '": ' . $e->getMessage());
}

// Get worksheet dimensions
$sheet = $objPHPExcel->getSheet(0);

// http://asantillan.com/php-excel-import-to-mysql-using-phpexcel/
$worksheetTitle = $sheet->getTitle();

$highestRow = $sheet->getHighestRow();
$highestColumn = $sheet->getHighestColumn();
$highestColumnIndex = PHPExcel_Cell::columnIndexFromString($highestColumn);

// Calculationg Columns
$nrColumns = ord($highestColumn) - 64;

echo "File ".$worksheetTitle." has ";
echo $nrColumns . ' columns';
echo ' x ' . $highestRow . ' rows.<br />';

//  Loop through each row of the worksheet in turn
for ($row = 1; $row <= $highestRow; $row++) {
    //  Read a row of data into an array
    $rowData = $sheet->rangeToArray('A' . $row . ':' . $highestColumn . $row,
                                    NULL, FALSE, FALSE);

    var_dump($rowData);

    foreach ($rowData[0] as $k => $v) {
        echo "Row: " . $row . "- Col: " . ($k + 1) . " = " . $v . "<br />";
    }

}

我正在包括从 PHPExcel 阅读器示例 #12 修改的块读取过滤器,它仍然给我同样的致命内存耗尽,因为它的内存使用量仍在增加,因为它进一步向下读取行?

<?php

error_reporting(E_ALL);
set_time_limit(0);

define('EOL',(PHP_SAPI == 'cli') ? PHP_EOL : '<br />');

date_default_timezone_set('America/Los_Angeles');

/**  Set Include path to point at the PHPExcel Classes folder  **/
set_include_path(get_include_path() . PATH_SEPARATOR . '../../../Classes/');

/**  Include PHPExcel_IOFactory  **/
include 'PHPExcel/IOFactory.php';

$inputFileName = 'Test.xlsx';


/**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter  */
class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $_startRow = 0;

    private $_endRow = 0;

    /**  Set the list of rows that we want to read  */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow    = $startRow;
        $this->_endRow      = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        //  Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}

$inputFileType = PHPExcel_IOFactory::identify($inputFileName);

$callStartTime = microtime(true);

echo 'Loading file ',pathinfo($inputFileName,PATHINFO_BASENAME),' using IOFactory with a defined reader type of ',$inputFileType,'<br />';

// Call time

$callEndTime = microtime(true);
$callTime = $callEndTime - $callStartTime;
echo 'Call time to read Workbook was ' , sprintf('%.4f',$callTime) , " seconds" , EOL;

// Echo memory usage
echo date('H:i:s') , ' Current memory usage: ' , (memory_get_usage(true) / 1024 / 1024) , " MB" , EOL;

/**  Create a new Reader of the type defined in $inputFileType  **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);

echo '<hr />';


/**  Define how many rows we want to read for each "chunk"  **/
$chunkSize = 100;
/**  Create a new Instance of our Read Filter  **/
$chunkFilter = new chunkReadFilter();

/**  Tell the Reader that we want to use the Read Filter that we've Instantiated  **/
$objReader->setReadFilter($chunkFilter);
/**  Loop to read our worksheet in "chunk size" blocks  **/
for ($startRow = 2; $startRow <= 26000; $startRow += $chunkSize) {
    echo 'Loading WorkSheet using configurable filter for headings row 1 and for rows ',$startRow,' to ',($startRow+$chunkSize-1),'<br />';
    /**  Tell the Read Filter, the limits on which rows we want to read this iteration  **/
    $chunkFilter->setRows($startRow,$chunkSize);
    /**  Load only the rows that match our filter from $inputFileName to a PHPExcel Object  **/
    $objPHPExcel = $objReader->load($inputFileName);

    //  Do some processing here

    $sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);

    echo '<br /><br />';

    // Call time

    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;
    echo 'Call time to read Workbook was ' , sprintf('%.4f',$callTime) , " seconds" , EOL;

    // Echo memory usage
    echo date('H:i:s') , ' Current memory usage: ' , (memory_get_usage(true) / 1024 / 1024) , " MB" , EOL;

    // Echo memory peak usage
    echo date('H:i:s') , " Peak memory usage: " , (memory_get_peak_usage(true) / 1024 / 1024) , " MB" , EOL;
    echo '<hr />';
}

?>
<body>
</html>

【问题讨论】:

  • 在您发布的代码中没有使用分块,所以我无法对此发表评论;但是你看过单元缓存吗?
  • 代码哪里出错了?它是否加载了文件(一个块或其他)?还是在加载期间?
  • 对不起,我在第二个代码 sn-p 中包含了分块。我看过单元缓存。我注意到这个致命的内存耗尽错误的任何改进。第一个代码在加载大于 2MB 的文件时失败。
  • 您应该在下一次块迭代之前在循环中处理它之后断开工作表并取消设置当前 PHPExcel 对象,如文档中所述.....请注意,分块不会消除所有内存使用,但它确实减少了它
  • 我已经尝试实现 $objPHPExcel->disconnectWorksheets();未设置($objPHPExcel,$sheetData);在我的 for 循环迭代之后,但我仍然得到完全相同的致命内存耗尽错误......

标签: php excel memory


【解决方案1】:

你确实有一个大问题:

$sheetData = $objPHPExcel->getActiveSheet()->toArray(null,true,true,true);

无论您是否加载了块,它将尝试为工作表的整个大小构建一个数组...... toArray() 根据您正在加载的文件使用电子表格的预期大小,而不是您正在加载的过滤后的单元格集

尝试使用 rangeToArray() 仅获取您通过块加载的单元格范围

$sheetData = $objPHPExcel->getActiveSheet()
    ->rangeToArray(
        'A'.$startRow.':'.$objPHPExcel->getActiveSheet()->getHighestColumn().($startRow+$chunkSize-1),
        null,
        true,
        true,
        true
    );

即便如此,在内存中构建 PHP 数组也会占用大量内存;如果您的代码可以一次处理一行工作表数据而不是填充一个大数组,那么您的代码将大大减少内存消耗

【讨论】:

  • 使用 rangToArray 而不是 toArray,我仍然遇到致命的内存耗尽错误。似乎 rangeToArray 遇到了与 toArray 相同的问题。
  • 您要求的范围是多少? 在内存中构建 PHP 数组会占用大量内存 与在内存中构建大型数组相比,一次处理一行工作表数据要好得多
  • 奇怪,我在 LAMP 堆栈上运行它而不是在我使用的 WAMP 堆栈上运行它,在 LAMP 堆栈上没有问题。我可能遇到了 Windows 操作系统内存管理问题。
猜你喜欢
  • 1970-01-01
  • 1970-01-01
  • 2020-06-25
  • 2011-04-02
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
  • 2015-11-24
相关资源
最近更新 更多