【Question title】: multipart/form-data with PHP cURL
【Posted】: 2013-09-07 18:06:44
【Question】:

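I'm using the OCR service at i2ocr.com to convert images to text.

I need to automate this in my project, so I'm using PHP to fetch the text of an image.

On the OCR site, the POST data is sent as multipart/form-data, like this: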

-----------------------------32642708628732\r\n
Content-Disposition: form-data; name="i2ocr_options"\r\n
\r\n
url\r\n
-----------------------------32642708628732\r\n
Content-Disposition: form-data; name="i2ocr_uploadedfile"\r\n
\r\n
\r\n
-----------------------------32642708628732\r\n
Content-Disposition: form-data; name="i2ocr_url"\r\n
\r\n
http://www.murraydata.co.uk/wp-content/uploads/2013/02/ocr-font-500x220.jpg\r\n
-----------------------------32642708628732\r\n
Content-Disposition: form-data; name="i2ocr_languages"\r\n
\r\n
gb,eng\r\n
-----------------------------32642708628732--\r\n
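
A body of this shape can be generated from a plain array of fields. The following is a minimal sketch, not code from the original post: `build_multipart_body` and the boundary value `EXAMPLEBOUNDARY` are hypothetical names chosen for illustration, and the helper handles text fields only (no file parts). Each part begins with `--` plus the boundary, and the final delimiter carries an extra trailing `--`.

```php
<?php
// Hypothetical helper (not part of the original post): builds a
// multipart/form-data body for simple text fields only (no file parts).
function build_multipart_body(array $fields, string $boundary): string
{
    $body = '';
    foreach ($fields as $name => $value) {
        // Each part starts with "--" followed by the boundary string.
        $body .= "--{$boundary}\r\n";
        $body .= "Content-Disposition: form-data; name=\"{$name}\"\r\n";
        // A blank line separates the part headers from the part value.
        $body .= "\r\n";
        $body .= $value . "\r\n";
    }
    // The closing delimiter carries an extra trailing "--".
    $body .= "--{$boundary}--\r\n";
    return $body;
}

// Assumed example boundary; a real one must not occur inside any field value.
$boundary = 'EXAMPLEBOUNDARY';
$fields = array(
    'i2ocr_options'   => 'url',
    'i2ocr_languages' => 'gb,eng',
);
echo build_multipart_body($fields, $boundary);
```

The boundary passed here has to be the same string announced in the request's `Content-Type: multipart/form-data; boundary=...` header, otherwise the server cannot split the body into parts.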

This is the PHP I'm using:

$ch = curl_init();
$dt = array();
$dt['i2ocr_options'] = 'url';
$dt['i2ocr_uploadedfile'] = '';
$dt['i2ocr_url'] = 'http://www.murraydata.co.uk/wp-content/uploads/2013/02/ocr-font-500x220.jpg';
$dt['i2ocr_languages'] = 'gb,eng';


    curl_setopt($ch, CURLOPT_URL,"http://www.i2ocr.com/process_form");    
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; rv:23.0) Gecko/20100101 Firefox/23.0");
    curl_setopt($ch,CURLOPT_ENCODING,"gzip,deflate");
    curl_setopt($ch, CURLOPT_HTTPHEADER, Array("Content-Type: multipart/form-data; boundary=---------------------------32642708628732"));
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_REFERER, "http://www.i2ocr.com/");
    curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, "$dt");
    $html=curl_exec($ch);

    print_r($html);

This code doesn't produce any errors, but I don't get any output either.

I need help getting the output of this cURL request.

【Comments on the question】:

    Tags: php curl ocr


    【Solution 1】:

    Like this:

    <?php
    // realpath() returns false when cookie.txt does not exist yet,
    // so build the cookie file path directly instead.
    const COOKIE_FILE = __DIR__ . '/cookie.txt';

    // Send a GET request on the shared handle and return the response body.
    function get($url, $refer, $ch)
    {
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HTTPGET, true);
        curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE);  // write cookies here
        curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE); // read cookies from here
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; U; Linux i586; de; rv:5.0) Gecko/20100101 Firefox/5.0');
        curl_setopt($ch, CURLOPT_REFERER, $refer);
        return curl_exec($ch);
    }

    // Send a POST request with the given body on the shared handle and return the response body.
    function post($url, $refer, $parametros, $ch)
    {
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $parametros);
        curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE);
        curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE_FILE);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; U; Linux i586; de; rv:5.0) Gecko/20100101 Firefox/5.0');
        curl_setopt($ch, CURLOPT_REFERER, $refer);
        return curl_exec($ch);
    }

    function hazlo()
    {
        $ch = curl_init();

        // STEP 1. Visit the home page first to pick up its cookies.
        get("http://www.i2ocr.com/", "http://www.i2ocr.com/", $ch);

        // STEP 2. Build an array with the POST fields.
        $data = array(
            'i2ocr_options' => 'url',
            'i2ocr_uploadedfile' => '',
            'i2ocr_url' => 'http://www.murraydata.co.uk/wp-content/uploads/2013/02/ocr-font-500x220.jpg',
            'i2ocr_languages' => 'gb,eng'
        );
        // http_build_query() URL-encodes the fields into a single string.
        $data2 = http_build_query($data);

        // STEP 3. Send the encoded fields as a POST request and print the response.
        echo post("http://www.i2ocr.com/process_form", "http://www.i2ocr.com/", $data2, $ch);
    }
    hazlo();
    ?>
    

    Use "view source" on the response HTML and you can see the text of the image (sorry for my English). Works 100% :)
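
Two details may explain the empty output in the question. Interpolating an array into a string, as in `"$dt"`, yields the literal text `Array`, so the form fields were never sent. And `curl_exec()` returns `false` on a transport failure without printing anything, so the cause stays hidden unless `curl_error()` is checked. Note also that `http_build_query()` sends the fields as `application/x-www-form-urlencoded`, not multipart. If the server does insist on multipart/form-data, passing the array itself to `CURLOPT_POSTFIELDS` makes cURL build the body and the boundary header on its own. The sketch below assumes the cURL extension is loaded; `post_multipart` is a hypothetical helper name, not part of the original answer, and it omits the cookie handling shown above.

```php
<?php
// Hypothetical helper (not from the original answer): POST $fields as
// multipart/form-data and return the response body, or throw on a transport error.
function post_multipart(string $url, array $fields): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    // Passing an array (not a string) makes cURL encode the body as
    // multipart/form-data and set the Content-Type header, boundary included.
    curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

    $response = curl_exec($ch);
    if ($response === false) {
        // curl_exec() returns false on failure; curl_error() describes the cause.
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $response;
}

// Usage (needs network access; the field names are the ones from the question):
// echo post_multipart('http://www.i2ocr.com/process_form', array(
//     'i2ocr_options'      => 'url',
//     'i2ocr_uploadedfile' => '',
//     'i2ocr_url'          => 'http://www.murraydata.co.uk/wp-content/uploads/2013/02/ocr-font-500x220.jpg',
//     'i2ocr_languages'    => 'gb,eng',
// ));
```

With this approach there is no need to set a `Content-Type` header by hand; a manually supplied boundary that does not match the generated body is an easy way to get a request the server cannot parse.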

    【Discussion】:
