Extract info from HTML with Guzzle: answers

【Question title】: Extract info from HTML with Guzzle
【Posted】: 2014-12-02 14:02:49
【Question】:

I'm trying to extract the vehicle ID with this code:

    $client = new Client();
    $request = $client->get('http://www.truck1.eu/_TEN_auto_1522980_Truck_Chassis_MAN_TGA_18_320_BL_Platou_9_80m_lang_manuelles_Getriebe_Euro_4_Motor.html',  ['allow_redirects' => false]);

    $html = $request->getBody(true);

    $crawler = new Crawler();
    $crawler->addContent($html);
    print $crawler->filterXPath('//*[@id="content"]/div/div[2]/table/tbody/tr[2]/td')->text();

But for some reason I can't get it to work. I'm using Guzzle and Symfony's DomCrawler.

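【Comments】:

Tags: symfony xpath web-crawler guzzle

【Answer 1】:

Try this XPath, which selects the td that sits next to the th containing the "Vehicle ID" label (and avoids some unnecessary dependencies on ancestor elements):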

//td[preceding-sibling::th = 'Vehicle ID']
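
As a minimal sketch of why the label-based expression is more robust, the snippet below runs both kinds of XPath against a small hand-written table. The sample markup is hypothetical (the real page's structure is not shown in the question), and PHP's built-in DOMDocument and DOMXPath are used in place of DomCrawler so the example has no dependencies. One likely cause of the original failure is that browsers insert tbody into the DOM even when the raw HTML does not contain it, so a path copied from browser dev tools may match nothing in the downloaded source.

```php
<?php
// Hypothetical sample markup: an assumption standing in for the real page,
// written without <tbody>, as raw HTML often is.
$html = <<<HTML
<html><body><div id="content">
<table>
<tr><th>Make</th><td>MAN</td></tr>
<tr><th>Vehicle ID</th><td>1522980</td></tr>
</table>
</div></body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Positional path that includes tbody, similar in spirit to the one in the
// question; it is expected to match nothing because the source has no tbody.
$withTbody = $xpath->query('//*[@id="content"]/table/tbody/tr[2]/td');

// Label-based path from the answer: any td that has a preceding sibling th
// whose text is exactly "Vehicle ID".
$byLabel = $xpath->query("//td[preceding-sibling::th = 'Vehicle ID']");

$tbodyMatches = $withTbody->length;
$vehicleId = $byLabel->length > 0 ? trim($byLabel->item(0)->textContent) : null;

echo "tbody path matches: {$tbodyMatches}\n";
echo "vehicle id: {$vehicleId}\n";
```

With DomCrawler, the same expression can be passed to `filterXPath()`. Note that the comparison is an exact string match, so it would fail if the label in the real page carries extra whitespace or a trailing colon; `contains(., 'Vehicle ID')` on the th is a looser alternative in that case.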

【Comments】: