【Question title】: How to get inner HTML using Simple HTML DOM [duplicate]
【Posted】: 2016-02-22 04:37:09
【Question】:

I have some code that fetches the HTML table of tariff results from this site: http://www.jne.co.id/tarif.php

I tried to write this code on my localhost:

<?php include_once("simple_html_dom.php");


if($_POST){

    //extract data from the post
    //set POST variables
    $url = 'http://www.jne.co.id/tarif.php';
    $fields = array(
        'origin_code' => urlencode($_POST['origin_code']),
        'dest_code' => urlencode($_POST['dest_code']),
        'weight' => urlencode($_POST['weight']),
        'g-recaptcha-response' => urlencode($_POST['g-recaptcha-response']),
    );

    $fields_string = '';
    //url-ify the data for the POST
    foreach($fields as $key=>$value) { $fields_string .= $key.'='.$value.'&'; }
    // rtrim() returns the trimmed string, so the result has to be assigned back
    $fields_string = rtrim($fields_string, '&');

    //open connection
    $ch = curl_init();

    //set the url, enable POST, set the POST data
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, true); // CURLOPT_POST is an on/off switch, not a field count
    curl_setopt($ch, CURLOPT_POSTFIELDS, $fields_string);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    //execute post
    $result = curl_exec($ch);   

    $string = $result;//htmlentities($result);

    //close connection
    curl_close($ch);

    // Create a DOM object
    $html_base = new simple_html_dom();
    // Load HTML from a string
    $html_base->load($string);

    $str = $html_base->find('.tracking');
    echo $str;

    $html_base->clear(); 
    unset($html_base);
}
?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
</head>

<body>
<form class="check validate" action='' method='post' style='float:none;max-height:none;'>
<div class="row">
<label>
<input type='hidden' name='origin_code' value="CGK10000" class='required'>
<input type="text" name='origin_label' placeholder="Origin" class='autocomplete required' data-url='lib/origin.php' data-origin="JAKARTA" value="JAKARTA">
<br>Origin Shipment
</label>
<label>
<input type='hidden' name='dest_code' value="BDO10000" class='required'>
<input type="text" name='dest_label' placeholder="Destination"     class='autocomplete required' data-url='lib/dest.php' data-origin="BANDUNG" value="BANDUNG">
<br>Destination Shipment
</label>
<label>
<input name="weight" type="text" placeholder="1" style="width:30px; text-align:center" value="1" class='required number'>
<br>Weight(Kg)
</label>
<script src='https://www.google.com/recaptcha/api.js'></script>
<div class="g-recaptcha" data-sitekey="6LeyjxETAAAAAE5qSotpy40_cG31GyRm-VSBQaWU" style="margin:10px 0px;"></div>
<button class="btn red">check</button>
</div>
</form>
</body>
</html>

I need to get the HTML of the table with the class "table_style tracking"; please see this image:

Image

Edit: I only need the HTML of the <table> with the class tracking, not the text inside it. The result should be HTML like this:

<table width="100%" border="0" class="table_style tracking">
<thead>
<tr>
<td>Nama Layanan</td>
<td align="center">Jenis Kiriman</td>
<td>Tarif </td>
<td>ETD(Estimates Days) </td>
</tr>
</thead>
<tbody><tr>
<td>OKE</td>
<td align="center">Dokumen / Paket</td>
<td>Rp. 10.000</td>
<td>2 - 3 D</td>
</tr><tr>
<td>REG</td>
<td align="center">Dokumen / Paket</td>
<td>Rp. 11.000</td>
<td>1 - 2 D</td>
</tr><tr>
<td>YES</td>
<td align="center">Dokumen / Paket</td>
<td>Rp. 22.000</td>
<td>1 - 1 D</td>
</tr><tr>
<td>SPS</td>
<td align="center">Dokumen / Paket</td>
<td>Rp. 403.000</td>
<td> </td>
</tr></tbody>
</table>

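【Comments】:

  • Wow, you can bypass reCAPTCHA
  • I don't know, but I think so. I just post g-recaptcha-response with curl and I get the result. :D
  • This might be of some use if you don't use a DOM approach, although DOM is probably the best way: stackoverflow.com/questions/34834038/…

Tags: php html curl simple-html-dom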


【Solution 1】:

Try this:

http://sourceforge.net/projects/simplehtmldom/

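For example:

include_once 'simple_html_dom.php';

$items = str_get_html($html); // parse an HTML string into a DOM object
// passing an index makes find() return a single element instead of an array
$country = $items->find('.category-label span', 0)->text(); // first element by index 0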
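
Applied to the table from the question, the same idea might look like the sketch below. It is only a sketch under a few assumptions: simple_html_dom.php sits next to the script (as in the question), and the hard-coded `$result` string is a hypothetical stand-in for the real cURL response. The point it illustrates is that `find()` with an index returns one element whose `outertext` holds the whole `<table>` markup, while `innertext` holds only what is between the tags.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory, as in the question.
include_once 'simple_html_dom.php';

// Hypothetical stand-in for the cURL response ($result in the question).
$result = '<html><body><div id="wrap">'
        . '<table width="100%" border="0" class="table_style tracking">'
        . '<tbody><tr><td>OKE</td><td>Rp. 10.000</td></tr></tbody>'
        . '</table></div></body></html>';

$html_base = str_get_html($result);

// With an index, find() returns one element (or null) rather than an array.
$table = $html_base ? $html_base->find('table.tracking', 0) : null;

if ($table !== null) {
    $table_html = $table->outertext; // the <table> tag plus everything inside it
    $inner_html = $table->innertext; // only the markup between <table> and </table>
    echo $table_html;
} else {
    echo 'No table with class "tracking" was found.';
}

if ($html_base) {
    $html_base->clear(); // release the parser's memory, as the question does
}
```

Checking for null before reading `outertext` matters here because the remote page may return an error page or a changed layout, in which case no matching table exists.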

【Comments】:
