[Posted]: 2014-05-23 16:28:28
[Question]:
I'm getting a memory exhausted error when I shouldn't be using much memory at all!
The application runs on Windows 8 Server / IIS 8 / PHP 5.5 / CodeIgniter / MS SQL Server.
The errors are as follows:
[23-May-2014 10:56:57 America/New_York] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 1992 bytes) in C:\inetpub\wwwroot\application\models\DW_import.php on line 112
[23-May-2014 11:07:34 America/New_York] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 2438 bytes) in C:\inetpub\wwwroot\application\models\DW_import.php on line 113
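The script looks in a directory for several different CSV files to import into the database. Keep in mind that the import files are huge, some with up to 4 GB of data. As far as I can tell, no variable keeps accumulating data in a way that could cause this. The script being run (a model; the controller has no view, only this model) is as follows: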
DW_import.php
<?php
class dw_import extends CI_Model {
public function import(){
global $file,$errLogFile,$logFile,$tableName, $fieldList, $file, $count, $line, $query;
$this->load->database(); // init db connection
// map file types to database tables
$fileToDBArr = array(
'Customers' => 'customer',
'Customers_Historical' => 'customer_historical',
'Orders' => 'order',
'Customer_AR_Aggs' => 'customer_ar_aging_agg'
);
// extend timeout of this script
ini_set('max_execution_time', 3600);
// error handler to log errors and continue processing
function myErrorHandler($errno,$errstr,$errfile,$errline){
global $file,$errLogFile,$logFile,$tableName, $fieldList, $file, $count, $line, $query;
// error - store in DB
//echo "<br>[$errno $errstr $errfile $errline $tableName $file $count] $errLogFile<br>";
$err = "#$errno $errstr $errfile on line $errline :: Table $tableName File $file Row# $count Headers: $fieldList Data: $line";
echo $err;
file_put_contents($errLogFile,$err,FILE_APPEND);
};
set_error_handler("myErrorHandler");
// set temp error log file
$errLogFile = "C:/Data_Updates/logs/general." . date('YmdHis') . ".errLog";
// loop thru file types
foreach($fileToDBArr as $fileType=>$table){
// get the files for this import type
$fileArr = glob('C:/Data_Updates/'.$fileType.'.*');
sort($fileArr,SORT_STRING); // sort so earlier files (by date in file name) will process first
// loop thru files found
foreach($fileArr as $file){
// set log file paths specific to this import file
$errLogFile = str_replace('Data_Updates/','Data_Updates/logs/',$file) . "." . date('YmdHis') . ".errLog";
$logFile = str_replace('Data_Updates/','Data_Updates/logs/',$file) . "." . date('YmdHis') . ".log";
file_put_contents($logFile,"---BEGIN---",FILE_APPEND); // log
// lets get the file type and translate it into a table name
preg_match('/C:\/Data_Updates\/([^\.]+)/',$file,$matches);
$fileType = $matches[1];
$tableName = $fileToDBArr[$fileType];
// lets get the first row as a field list
$fp = fopen($file,'r');
//$fieldList = str_replace('"','',fgets($fp));
// counters to track status
$count = 0;
$startPoint = 0;
// see if continuation, set startPoint to last row imported from file
$query = "SELECT max(import_line) as maxline FROM $tableName WHERE import_file = '" . addslashes($file) . "'";
$result = $this->db->query($query);
foreach($result->result() as $row) $startPoint = $row->maxline+1; // set the startPoint if this is continuation
file_put_contents($logFile,"\nstartPoint $startPoint",FILE_APPEND); // log
// loop thru file lines
while (!feof($fp)) {
$line = fgets($fp);
// reformat those pesky dates from m/d/y to y-m-d
$line = preg_replace('/, ?(\d{1,2})\/(\d{1,2})\/(\d{4})/',',${3}-${1}-${2}',$line);
if(!$count){
// header row - set aside to use for column headers on insert statements
$fieldList = str_replace('"','',$line);
file_put_contents($logFile,"\nHeaders: $fieldList",FILE_APPEND); // log
} elseif($count >= $startPoint && trim($line)) {
// data row - insert into DB
$lineArr = str_getcsv($line); // turn this CSV line into an array
// build the insert query
$query = "INSERT INTO $tableName ($fieldList,import_date,import_file,import_line)
VALUES (";
foreach($lineArr as $k=>$v) $query .= ($v !== '') ? "'".addslashes(utf8_encode($v))."'," : " NULL,";
$query .= "now(),'" . addslashes($file). "',$count)
ON DUPLICATE KEY UPDATE ";
foreach(explode(',',$fieldList) as $k=>$v) $query .= "\n$v=" . (($lineArr[$k] !== '') ? "\"" . addslashes(utf8_encode($lineArr[$k])) . "\"" : "NULL") . ", ";
$query .= "import_date = now(),import_file='" . addslashes($file) . "',import_line = $count ";
if(!$this->db->query($query)) {
trigger_error('db error ' . $this->db->_error_number() . ' ' . $this->db->_error_message());
$status = 'error ';
} else {
$status = 'success ';
};
file_put_contents($logFile,"row: $count status: $status data: $line",FILE_APPEND); // log'
} else {
// skipped - this row was already imported from this file
// removed log to speed up
file_put_contents($logFile,"row: $count status: SKIPPED data: $line",FILE_APPEND); // log
}; // if $count
$count++;
}; // while $fp
fclose($fp);
// file complete - move file to archive
rename($file,str_replace('Data_Updates/','Data_Updates/archive/',$file));
file_put_contents($logFile,"-- END --",FILE_APPEND); // log
}; // each $fileArr
}; // each $fileToDBArr
} // end import function
} // end class
?>
Any help would be greatly appreciated!
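To make the intent of the loop concrete, here is a minimal sketch of the same line-by-line read written as a generator, so that only the current line is held in memory. It is not the model's actual code: `readCsvLines()` and the temporary sample file are hypothetical, and the database insert is left out entirely.

```php
<?php
// Sketch: yield one parsed CSV line at a time, keyed by its line number.
// readCsvLines() is a hypothetical helper, not part of the original model.
function readCsvLines($path)
{
    $fp = fopen($path, 'r');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        $lineNo = 0;
        while (($line = fgets($fp)) !== false) {
            // same m/d/y to y-m-d rewrite as in the model above
            $line = preg_replace('/, ?(\d{1,2})\/(\d{1,2})\/(\d{4})/', ',${3}-${1}-${2}', $line);
            $line = rtrim($line, "\r\n");
            if (trim($line) !== '') {
                yield $lineNo => str_getcsv($line, ',', '"', '\\');
            }
            $lineNo++; // blank lines still count, so numbers match the file
        }
    } finally {
        fclose($fp);
    }
}

// Usage with a small throwaway file (sample data is made up).
$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp, "id,name,created\n1,\"Smith, J\",5/23/2014\n\n2,Jones,12/1/2013\n");

$rows = array();
foreach (readCsvLines($tmp) as $n => $fields) {
    $rows[$n] = $fields; // collected here only for the demo; a real import would insert and discard
}
unlink($tmp);
print_r($rows);
```

Generators need PHP 5.5 or later, which matches the version mentioned above.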
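******** EDIT
Following suggestions from several people, I made some changes. They only affect the "data row - insert into DB" part of the loop logic. You can see that I added logging to track memory_get_peak_usage(), and added unset() and clearstatcache(). Some log data follows the code: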
file_put_contents($logFile,memory_get_peak_usage() . " line 1 \n\r",FILE_APPEND);
// data row - insert into DB
if(isset($lineArr)) unset($lineArr);
file_put_contents($logFile,memory_get_peak_usage() . " line 1.1 \n\r",FILE_APPEND);
$lineArr = str_getcsv($line); // turn this CSV line into an array
// build the insert query
file_put_contents($logFile,memory_get_peak_usage() . " line 2 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);
if(isset($query)) unset($query);
file_put_contents($logFile,memory_get_peak_usage() . " line 2.1 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);
$query = "INSERT INTO $tableName ($fieldList,import_date,import_file,import_line)
VALUES (";
file_put_contents($logFile,memory_get_peak_usage() . " line 2.2 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);
foreach($lineArr as $k=>$v) $query .= ($v !== '') ? "'".addslashes(utf8_encode($v))."'," : " NULL,";
$query .= "now(),'" . addslashes($file). "',$count)
ON DUPLICATE KEY UPDATE ";
file_put_contents($logFile,memory_get_peak_usage() . " line 2.3 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);
foreach(explode(',',$fieldList) as $k=>$v) $query .= "\n$v=" . (($lineArr[$k] !== '') ? "\"" . addslashes(utf8_encode($lineArr[$k])) . "\"" : "NULL") . ", ";
file_put_contents($logFile,memory_get_peak_usage() . " line 2.4 lineArr size: " . strlen(implode(',',$lineArr)) . "\n\r",FILE_APPEND);
$query .= "import_date = now(),import_file='" . addslashes($file) . "',import_line = $count ";
file_put_contents($logFile,memory_get_peak_usage() . " line 3 query size: " . strlen($query) . "\n\r",FILE_APPEND);
if(!$this->db->query($query)) {
trigger_error('db error ' . $this->db->_error_number() . ' ' . $this->db->_error_message());
$status = 'error ';
} else {
$status = 'success ';
};
clearstatcache();
Log data (the leftmost number is the result of the memory_get_peak_usage() call):
2724960 line 1.1
2724960 line 2 lineArr size: 194
2724960 line 2.1 lineArr size: 194
2724960 line 2.2 lineArr size: 194
2724960 line 2.3 lineArr size: 194
2727392 line 2.4 lineArr size: 194
2727392 line 3 query size: 2346
2727392 line 1
2727392 line 1.1
2727392 line 2 lineArr size: 194
2727392 line 2.1 lineArr size: 194
2727392 line 2.2 lineArr size: 194
2727392 line 2.3 lineArr size: 194
2729944 line 2.4 lineArr size: 194
2729944 line 3 query size: 2346
2729944 line 1
2729944 line 1.1
2729944 line 2 lineArr size: 194
2729944 line 2.1 lineArr size: 194
2729944 line 2.2 lineArr size: 194
2729944 line 2.3 lineArr size: 194
2732448 line 2.4 lineArr size: 194
2732448 line 3 query size: 2346
2732448 line 1.1
2732448 line 2 lineArr size: 194
2732448 line 2.1 lineArr size: 194
2732448 line 2.2 lineArr size: 194
2732448 line 2.3 lineArr size: 194
2735088 line 2.4 lineArr size: 194
2735088 line 3 query size: 2346
Note that memory is still growing between log points 2.3 and 2.4, that is, at the following line of code:
foreach(explode(',',$fieldList) as $k=>$v) $query .= "\n$v=" . (($lineArr[$k] !== '') ? "\"" . addslashes(utf8_encode($lineArr[$k])) . "\"" : "NULL") . ", ";
Any ideas?
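One caveat when reading the numbers above: memory_get_peak_usage() reports a high-water mark, so it never goes down and cannot show whether unset() released anything. The sketch below is not part of the import script. It uses an arbitrary string of about 1 MB and logs memory_get_usage() alongside the peak to show the difference.

```php
<?php
// Sketch: current usage vs. peak usage around an allocation and unset().
$before = memory_get_usage();

$big = str_repeat('x', 1024 * 1024); // allocate roughly 1 MB
$during = memory_get_usage();
$peakDuring = memory_get_peak_usage();

unset($big); // release the string
$after = memory_get_usage();
$peakAfter = memory_get_peak_usage();

printf("current: before=%d during=%d after=%d\n", $before, $during, $after);
printf("peak:    during=%d after=%d\n", $peakDuring, $peakAfter);
```

The current figure drops after unset() while the peak stays where it was, so logging memory_get_usage() at the same points would probably be more informative for locating where memory is retained.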
[Comments]:
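- You could scatter memory_get_peak_usage() readings throughout the code to see where usage goes up.
- Thanks @halfer. I did find 2 places where memory increases but shouldn't: #1) $lineArr = str_getcsv($line); and #2) $query = "INSERT INTO $tableName ($fieldList,....". This is in a loop, but in both cases I would expect the variable to simply overwrite itself. Instead, it looks like memory keeps accumulating with each iteration. I checked the variables and they do overwrite the previous values (so the content does not get bigger with each iteration). Still stumped...
- What database engine are you using - PDO? You may be able to discard the $query objects after using them. (If you could edit your updates into your question and delete your comments, that would keep the current state of the question in one place.)
- Also, try unsetting these variables after you have used them in the loop.
- Ah, I did try unsetting once I had identified the problem variables, and it seemed to work for #1 but not for #2.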
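A possible reason why unset($query) has no effect, offered as an assumption rather than something the question confirms: if the database layer keeps its own copy of every SQL string it runs, unsetting the caller's variable frees nothing. As far as I know, CodeIgniter's database driver has a save_queries setting that, when enabled, appends each executed statement to $this->db->queries. The per-row growth in the log (about 2.4 to 2.6 KB) is also close to the logged query size of 2346 bytes. The sketch below only simulates that behaviour with a hypothetical QueryLoggingDb class; it does not use CodeIgniter or a real database.

```php
<?php
// Hypothetical stand-in for a database layer that records every SQL string
// it runs; it only mimics a save_queries-style log and talks to no database.
class QueryLoggingDb
{
    public $save_queries = true;
    public $queries = array();

    public function query($sql)
    {
        if ($this->save_queries) {
            $this->queries[] = $sql; // each string stays referenced here
        }
        return true;
    }
}

// Run $rows fake inserts of about 2 KB each and return the growth in bytes.
function memoryGrowth(QueryLoggingDb $db, $rows)
{
    $start = memory_get_usage();
    for ($i = 0; $i < $rows; $i++) {
        $query = 'INSERT /* fake */ ' . str_repeat('x', 2300) . $i;
        $db->query($query);
        unset($query); // drops the caller's reference only
    }
    return memory_get_usage() - $start;
}

$logging = new QueryLoggingDb();
$withLog = memoryGrowth($logging, 1000);

$quiet = new QueryLoggingDb();
$quiet->save_queries = false;
$withoutLog = memoryGrowth($quiet, 1000);

printf("with log: %d bytes, without log: %d bytes\n", $withLog, $withoutLog);
```

If this is what is happening in the import, setting $this->db->save_queries = FALSE; before the loop would be the thing to try.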
Tags: php sql-server codeigniter windows-8.1 iis-8