【Question title】: Submit movie rating to imdb.com with curl
【Posted】: 2014-04-05 23:32:26
【Question】:

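I'm trying to use curl to log in to imdb.com and then submit a movie rating. I know this violates their terms of service, but I'm not building an application or anything, just a small script for personal use. I'm new to curl, but I got the login part working using information from Stack Overflow. After logging in, I set the curl URL to http://www.imdb.com/ratings/_ajax/title, because that is where ratings are submitted. However, when I execute the curl request, no rating is submitted. I don't know how to troubleshoot this, so I'm hoping someone can point me in the right direction. This is what I have so far: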

// options
$username           = 'username';
$password           = 'password';
$url_login          = "https://secure.imdb.com/register-imdb/login"; 
$url_rating         = "http://www.imdb.com/ratings/_ajax/title";
$headers[]          = "Accept: */*";
$headers[]          = "Connection: Keep-Alive";
$headers[]          = "Content-Type: application/x-www-form-urlencoded";
$cookie_file_path   = dirname(__FILE__)."/cookies.txt";
$agent              = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36";


// get login page
$ch = curl_init(); 

// basic curl options for all requests
curl_setopt($ch, CURLOPT_HTTPHEADER,  $headers);
curl_setopt($ch, CURLOPT_HEADER,  0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);         
curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path); 
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path); 
curl_setopt($ch, CURLOPT_VERBOSE, 1);
// log
$verbose = fopen("loginfetch.txt", 'a+');
curl_setopt($ch, CURLOPT_STDERR, $verbose);

// set first URL
curl_setopt($ch, CURLOPT_URL, $url_login);

// execute session to get cookies and required form inputs
$return = curl_exec($ch); 

// close connection
curl_close($ch);

//echo $return;


// grab the hidden inputs from the form required to login
$fields = getFormFields($return);
$fields['login'] = $username;
$fields['password'] = $password;

// set postfields using what we extracted from the form
$postfields = http_build_query($fields); 

// post to login page
$ch = curl_init(); 

// set post options
curl_setopt($ch, CURLOPT_HTTPHEADER,  $headers);
curl_setopt($ch, CURLOPT_HEADER,  0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);         
curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path); 
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields); 
curl_setopt($ch, CURLOPT_URL, $url_login);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
// log
$verbose = fopen("loginpost.txt", 'a+');
curl_setopt($ch, CURLOPT_STDERR, $verbose);

// perform login
$return = curl_exec($ch);  

// close connection
curl_close($ch);

//echo $return; 


//submit rating

$data['tconst'] = 'tt1709143';
$data['rating'] = '5';
$data['tracking_tag'] = 'title-maindetails';

$post = http_build_query($data); 

// post to submit page
$ch = curl_init(); 

curl_setopt($ch, CURLOPT_HEADER,  0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);         
curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path); 
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path); 
curl_setopt($ch, CURLOPT_POST, 1);                                                                                            
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);                                                                  
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);  
curl_setopt($ch, CURLOPT_URL, $url_rating);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
// log
$verbose = fopen("ratingsubmit.txt", 'a+');
curl_setopt($ch, CURLOPT_STDERR, $verbose);

$return = curl_exec($ch);

// close connection
curl_close($ch);

//echo $return;

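With logging enabled, I get three logs. The third one records the request to the IMDb rating URL, which is the one that fails, as you can see in the log below: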

* About to connect() to www.imdb.com port 80 (#0)
*   Trying 72.21.203.211... * connected
* Connected to www.imdb.com (72.21.203.211) port 80 (#0)
> POST /ratings/_ajax/title HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36
Host: www.imdb.com
Cookie: cs=ECxJtiNrhA/m+SuIKh15AweBbbqgkVqM8BkNuqOS/TKzsu7Z84JeKeCRXRoA0U26oKcq7CWRbbqj9TkMh9HN2eCRWyxAGW26oKdbraCRbbqgsW26oJFt+uDBHYqg==; cache=BCYqeti-w3RKC8bV21R-BwArPk4ILOkGu0T6E1oB5KGihmddDp_kluyca1x7QLflsfnEZ9smi6EZc2uHo7eY5FZeXfG4EQ97tKKFR8VhyAW4d4Q; id=BCYiSjnWuGQc8HDlo5OAY8cDzxQyS5nHJqLgwq_9yI08DAjTU5l0CeOXL8dUvE28QUv1MNlBQ0MD5jEzs8OuhUVQKukg_AtlD58ORFostzT-mCzLCuv8a_mOFztCRGX7V3rpONDCl_xyKHAEj2JLSnWHI8VbKrpes93j5xsgNtdgeU0oYH3s93XMeRVWOM06V1Lg; session-id-time=1551966508; session-id=357-4286508-9576651; uu=BCYvAfd_f2bQLnYdtpdRlYkDth4AKSl6zlKVXzyzSLlagoM-bH3kvZe3FLFOj_KmoWbEkh-dRXiPZZtStWC72Dbsd6jCQiNnXDAyxc-_vmzg5yiJLuwbKVF6nICv9xuwCV_Gn-_Ek8gqTujYDQPdgIWR2Y3aXArES1RzXoqX1pA9jkZ1EkWFkVKNaukvSqxPQRJhE50xfMNMwaUJLJ8SLA1WRsIVLqp873yNvZf7ecyLd4hgmC7AxdbfzPtDCdwgaelx
Accept: */*
Connection: Keep-Alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 56

< HTTP/1.1 400 Bad Request
< Date: Sat, 08 Mar 2014 13:48:30 GMT
< Server: Server
< X-Frame-Options: SAMEORIGIN
< Content-Type: text/html;charset=UTF-8
< Content-Language: en-US
< Vary: Accept-Encoding,User-Agent
* Replaced cookie cache="BCYs4XPUvL_p2AL_pctQP7qEdwB9nBAXcIkiNxRZlqHtp9VjHkCy-GzvEIqsHCBHjjuGdWIyzZb1%0D%0Aip5WAl_SmYCtFg%0D%0A" for domain imdb.com, path /, expire 3541770158
< Set-Cookie: cache=BCYs4XPUvL_p2AL_pctQP7qEdwB9nBAXcIkiNxRZlqHtp9VjHkCy-GzvEIqsHCBHjjuGdWIyzZb1%0D%0Aip5WAl_SmYCtFg%0D%0A; Domain=.imdb.com; Expires=Thu, 26-Mar-2082 17:02:38 GMT; Path=/
< P3P: policyref="http://i.imdb.com/images/p3p.xml",CP="CAO DSP LAW CUR ADM IVAo IVDo CONo OTPo OUR DELi PUBi OTRi BUS PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA HEA PRE LOC GOV OTC "
< Cneonction: close
< Transfer-Encoding: chunked
< 
* Connection #0 to host www.imdb.com left intact
* Closing connection #0
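
The line that matters in a log like this is the response status (`400 Bad Request` here). As a minimal sketch, assuming the verbose logs are read back as plain strings, the status codes can be pulled out with a regular expression. The function name `statusCodesFromVerboseLog` is hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper: collect the HTTP status codes from a CURLOPT_VERBOSE log.
// Response header lines in such a log start with "< ".
function statusCodesFromVerboseLog(string $log): array
{
    preg_match_all('/^< HTTP\/\d(?:\.\d)? (\d{3})/m', $log, $matches);
    return array_map('intval', $matches[1]);
}

// Shortened sample based on the log above.
$sample = "> POST /ratings/_ajax/title HTTP/1.1\n"
        . "Host: www.imdb.com\n"
        . "\n"
        . "< HTTP/1.1 400 Bad Request\n"
        . "< Server: Server\n";

$codes = statusCodesFromVerboseLog($sample);
foreach ($codes as $code) {
    if ($code >= 400) {
        echo "Request failed with HTTP $code\n";
    }
}
```

Running the same check over each of the three log files shows which request is the first to be rejected.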

【Comments】:

    Tags: php curl login


    【Answer 1】:

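    I figured out what was wrong. A string that IMDb expects to be submitted along with the movie ID and the rating was missing. It is called "auth", and it is present on the movie's title page. So I added a function that finds the auth string and passes it along when submitting the rating to IMDb. No more errors.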
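
Taken on its own, the fix is: read the `data-auth` attribute from the title page, then add it to the POST body. The sketch below illustrates that under some assumptions: the HTML fragment and the token value are made up, the real page markup may differ, and `extractAuth` is a hypothetical name. Unlike the full script, it returns `null` when nothing is found so the script can stop before sending a request that would be rejected.

```php
<?php
// Hypothetical helper: return the data-auth value from a title page, or null if absent.
function extractAuth(string $html): ?string
{
    if (preg_match('/data-auth="([^"]*)"/i', $html, $m)) {
        // Decode in case the attribute value contains HTML entities.
        return html_entity_decode($m[1], ENT_QUOTES);
    }
    return null;
}

// Made-up fragment standing in for the fetched title page.
$html = '<div class="rating-list" data-auth="EXAMPLE_TOKEN" data-id="tt1800241"></div>';

$auth = extractAuth($html);
if ($auth === null) {
    // Stop here instead of posting a request that will be rejected.
    exit("No data-auth attribute found; is the session logged in?\n");
}

$post = http_build_query(array(
    'tconst'       => 'tt1800241',
    'rating'       => '7',
    'auth'         => $auth,
    'tracking_tag' => 'title-maindetails',
));

echo $post, "\n";
```

The resulting string is what would be handed to `CURLOPT_POSTFIELDS` for the rating request.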

    In case anyone is interested, here is the whole (working) thing:

    // options
    $username           = 'username';
    $password           = 'password';
    $url_login          = "https://secure.imdb.com/register-imdb/login"; 
    $url_rating         = "http://www.imdb.com/ratings/_ajax/title";
    $movie_id           = "tt1800241";
    $url_movie          = "http://www.imdb.com/title/" . $movie_id;
    $data['tconst']     = $movie_id;
    $data['rating']     = '7';
    $data['tracking_tag'] = 'title-maindetails';
    $headers[]          = "Accept: */*";
    $headers[]          = "Connection: Keep-Alive";
    $headers[]          = "Content-Type: application/x-www-form-urlencoded";
    $cookie_file_path   = dirname(__FILE__)."/cookies.txt";
    $agent              = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36";
    
    
    /**
        Step 1: get login page and cookies
    **/
    
    $ch = curl_init(); 
    
    // basic curl options for all requests
    curl_setopt($ch, CURLOPT_HTTPHEADER,  $headers);
    curl_setopt($ch, CURLOPT_HEADER,  0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);         
    curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path); 
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path); 
    // log
    curl_setopt($ch, CURLOPT_VERBOSE, 1);
    $verbose = fopen("loginfetch.txt", 'a+');
    curl_setopt($ch, CURLOPT_STDERR, $verbose);
    
    // set URL
    curl_setopt($ch, CURLOPT_URL, $url_login);
    
    // execute session to get cookies and required form inputs
    $return = curl_exec($ch); 
    
    // close connection
    curl_close($ch);
    
    //echo $return;
    
    /** 
        Step 2: post login credentials
    **/
    
    // grab the hidden inputs from the form required to login
    $fields = getFormFields($return);
    $fields['login'] = $username;
    $fields['password'] = $password;
    
    // set postfields using what we extracted from the form
    $postfields = http_build_query($fields); 
    
    // post to login page
    $ch = curl_init(); 
    
    // set post options
    curl_setopt($ch, CURLOPT_HTTPHEADER,  $headers);
    curl_setopt($ch, CURLOPT_HEADER,  0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);         
    curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path); 
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path); 
    curl_setopt($ch, CURLOPT_POST, 1); 
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields); 
    // log
    curl_setopt($ch, CURLOPT_VERBOSE, 1);
    $verbose = fopen("loginpost.txt", 'a+');
    curl_setopt($ch, CURLOPT_STDERR, $verbose);
    
    // set URL
    curl_setopt($ch, CURLOPT_URL, $url_login);
    
    // perform login
    $return = curl_exec($ch);  
    
    // close connection
    curl_close($ch);
    
    //echo $return; 
    
    /**
        Step 3: get Auth string from movie page
    **/
    
    $ch = curl_init(); 
    
    // basic curl options for all requests
    curl_setopt($ch, CURLOPT_HTTPHEADER,  $headers);
    curl_setopt($ch, CURLOPT_HEADER,  0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);         
    curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path); 
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path); 
    // log
    curl_setopt($ch, CURLOPT_VERBOSE, 1);
    $verbose = fopen("authfetch.txt", 'a+');
    curl_setopt($ch, CURLOPT_STDERR, $verbose);
    
    // set URL
    curl_setopt($ch, CURLOPT_URL, $url_movie);
    
    // execute session
    $return_auth = curl_exec($ch); 
    
    // close connection
    curl_close($ch);
    
    //echo $return_auth;
    
    /**
        Step 4: submit rating
    **/
    
    $data['auth'] = getAuth($return_auth);
    
    $post = http_build_query($data); 
    
    // post to submit page
    $ch = curl_init(); 
    
    curl_setopt($ch, CURLOPT_HEADER,  0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);         
    curl_setopt($ch, CURLOPT_USERAGENT, $agent); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path); 
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path); 
    curl_setopt($ch, CURLOPT_POST, 1);                                                                                            
    curl_setopt($ch, CURLOPT_POSTFIELDS, $post);                                                                  
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    // log
    curl_setopt($ch, CURLOPT_VERBOSE, 1);
    $verbose = fopen("ratingsubmit.txt", 'a+');
    curl_setopt($ch, CURLOPT_STDERR, $verbose);
    
    // set URL
    curl_setopt($ch, CURLOPT_URL, $url_rating);
    
    // execute session
    $return = curl_exec($ch);
    
    // close connection
    curl_close($ch);
    
    //echo $return;
    
    
    
    function getFormFields($data)
    {
        // Return the login form's input fields, or an empty array if no form is
        // found, so the caller can still add keys to the result.
        if (preg_match('/(<form method="post.*?<\/form>)/is', $data, $matches)) {
            return getInputs($matches[1]);
        }

        return array();
    }
    
    function getInputs($form)
    {
        $inputs = array();
    
        $elements = preg_match_all('/(<input[^>]+>)/is', $form, $matches);
    
        if ($elements > 0) {
            for($i = 0; $i < $elements; $i++) {
                $el = preg_replace('/\s{2,}/', ' ', $matches[1][$i]);
    
                if (preg_match('/name=(?:["\'])?([^"\'\s]*)/i', $el, $name)) {
                    $name  = $name[1];
                    $value = '';
    
                    if (preg_match('/value=(?:["\'])?([^"\'\s]*)/i', $el, $value)) {
                        $value = $value[1];
                    }
    
                    $inputs[$name] = $value;
                }
            }
        }
    
        return $inputs;
    }
    
    // when submitting a rating to IMDb you also need to send an 'auth' string which we grab from the rating-list div on the movie details page
    function getAuth($data)
    {
        if (preg_match('/data-auth="(.*?)"/is', $data, $matches)) {
            $auth = $matches[1];
    
            return $auth;
        } else {
            return('Auth string not found.');
        }
    }
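
The four requests above repeat nearly the same block of `curl_setopt` calls. As a rough sketch of one way to factor that out, the shared options can be built once as an array and applied with `curl_setopt_array`. The names `buildCurlOptions` and `fetch` are hypothetical. This version also leaves out the verbose logging and the two `CURLOPT_SSL_VERIFY*` lines, so certificate verification stays at curl's defaults; that is a choice made for this sketch, not something the original script does.

```php
<?php
// Hypothetical helper: build the option set shared by all four requests.
// Passing a POST body switches the request from GET to POST.
function buildCurlOptions($url, array $headers, $agent, $cookieFile, $postBody = null)
{
    $options = array(
        CURLOPT_URL            => $url,
        CURLOPT_HTTPHEADER     => $headers,
        CURLOPT_USERAGENT      => $agent,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are read from this file
        CURLOPT_COOKIEJAR      => $cookieFile, // and written back when the handle is closed
    );
    if ($postBody !== null) {
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = $postBody;
    }
    return $options;
}

// Hypothetical wrapper: run one request and return the body, or false on failure.
function fetch(array $options)
{
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}

$options = buildCurlOptions(
    'http://www.imdb.com/ratings/_ajax/title',
    array('Accept: */*'),
    'Mozilla/5.0',
    __DIR__ . '/cookies.txt',
    'tconst=tt1800241&rating=7'
);
// $response = fetch($options); // uncomment to actually send the request
```

With a helper like this, each of the four steps reduces to one `buildCurlOptions` call followed by one `fetch` call.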
    

    【Comments】:

      【Answer 2】:

      Your cookies are empty when you submit the request. You need to close the curl handle after every curl_exec:

      curl_close($ch);
      

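      This writes the cookies to the file.

      So in your code you need to close the handle three times. Make sure you initialize curl again after each close, and point it at the cookie file every time.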

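      【Comments】:

      • I updated my script as you suggested, closing and re-initializing the curl handle (see the updated post above), but it made no difference: the rating still isn't submitted after I log in successfully. Am I doing it right?
      • The code looks OK. You may be doing something wrong elsewhere. Run the script again with curl_setopt($ch, CURLOPT_VERBOSE, 1); enabled for all of the requests. It will let you see the request sent and the response received during each call.
      • Can we see the verbose output of the third one?
      • I added the log to the original post.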