Scraping information to WordPress

【Question title】: Scraping information to wordpress
【Posted】: 2016-03-29 16:31:19
【Question】:

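I am building a website for a car dealer. He posts his offers on a large nationwide car sales website and wants the same offers to appear on his company website. To avoid entering everything twice, I decided to pull the car information from the nationwide website and put it on the company website.

The company website is built on WordPress. I am not very familiar with PHP.

Is there a plugin for this kind of job? Or could I achieve it with something like node.js?

I imagine the process would look like this: a script scans the nationwide website, and if there are new cars, it opens each listing, reads the information inside, and puts that information on the company website so it can be displayed.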

【Comments】:

You need curl and DOMDocument, unless the source site offers a usable API.
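
A minimal sketch of the curl plus DOMDocument approach this comment suggests. The function names `fetch_html` and `extract_by_class`, the timeout, and the user agent string are illustrative assumptions and do not come from the thread. `$tag` and `$class` are assumed to be trusted literals, since they are placed into the XPath expression unescaped.

```php
<?php
// Hypothetical helper: download a page with curl and return its body ('' on failure).
function fetch_html(string $url): string {
    $ch = curl_init($url);
    if ($ch === false) {
        return '';
    }
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,                   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,                   // follow redirects
        CURLOPT_TIMEOUT        => 15,                     // assumed timeout, in seconds
        CURLOPT_USERAGENT      => 'dealer-site-sync/0.1', // assumed identifier
    ]);
    $body = curl_exec($ch);
    return is_string($body) ? $body : '';
}

// Hypothetical helper: return the trimmed text of every <$tag> element whose
// class attribute contains $class as one of its space-separated tokens.
function extract_by_class(string $html, string $tag, string $class): array {
    if (trim($html) === '') {
        return [];
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parse errors silently
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the earlier setting

    $xpath = new DOMXPath($doc);
    $query = sprintf(
        "//%s[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $tag,
        $class
    );
    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}
```

A call such as `extract_by_class(fetch_html($url), 'div', 'item')` would then return the text of each matching block. Matching on class tokens rather than the whole attribute means an element with `class="item featured"` is still found.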

Tags: php node.js wordpress web-scraping

【Solution 1】:

Take a look at this:

// Return the inner markup of a DOM node by serializing each of its children.
function get_inner_html( $node ) {
    $innerHTML = '';
    foreach ( $node->childNodes as $child ) {
        $innerHTML .= $child->ownerDocument->saveXML( $child );
    }

    return $innerHTML;
}

// Fetch $link and wrap the contents of every <$element class="$class"> in a <table>.
function get_html_table( $link, $element, $class ) {
    $html = file_get_contents( $link ); // fetch the HTML returned by the URL
    if ( empty( $html ) ) { // the request failed or returned nothing
        return '';
    }

    $poke_doc = new DOMDocument();
    libxml_use_internal_errors( true ); // collect libxml errors instead of emitting warnings
    $poke_doc->loadHTML( $html );
    libxml_clear_errors(); // discard the errors caused by sloppy HTML

    $poke_xpath = new DOMXPath( $poke_doc );
    // Matches only elements whose class attribute is exactly $class.
    $poke_type = $poke_xpath->query( "//" . $element . "[@class='" . $class . "']" );

    $table = "<table>";
    foreach ( $poke_type as $type ) {
        $table .= get_inner_html( $type );
    }
    $table .= "</table>";

    return $table;
}

// First parameter: the link; second: the DOM element to scrape; third: the class of that element.
echo get_html_table( 'https://www.linkname.com', 'div', 'kv' );

In your case: get_html_table('http://auto.plius.lt/litechnija/skelbimai/zemes-ukio-misko-technika-manipuliatoriai/zemes-ukio-savaeige-technika','div','item');
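
The question also asks how to get only the new cars onto the WordPress site, which the code above does not cover. One possible sketch follows. It assumes each listing has already been reduced to an array with `url`, `title`, and `html` keys, and that the source URL is stored under a post meta key named `source_url`. Both are hypothetical choices, as are the function names. `get_posts`, `get_post_meta`, and `wp_insert_post` are WordPress functions, so `import_listings` only works when it runs inside WordPress, for example from a small plugin or a scheduled task.

```php
<?php
// Hypothetical: keep only the listings whose URL has not been imported yet.
function filter_new_listings(array $listings, array $knownUrls): array {
    $known = array_flip($knownUrls); // URL => index, for fast lookups
    return array_values(array_filter(
        $listings,
        fn(array $listing): bool => !isset($known[$listing['url']])
    ));
}

// Hypothetical: map one listing to the argument array that wp_insert_post() accepts.
function build_post_args(array $listing): array {
    return [
        'post_title'   => $listing['title'],
        'post_content' => $listing['html'],
        'post_status'  => 'publish',
        'post_type'    => 'post',
        'meta_input'   => ['source_url' => $listing['url']], // remember where it came from
    ];
}

// Inside WordPress only: create a post for every listing not imported before.
function import_listings(array $listings): array {
    $existingIds = get_posts([
        'post_type'   => 'post',
        'post_status' => 'any',
        'numberposts' => -1,
        'meta_key'    => 'source_url', // only posts created by an earlier import
        'fields'      => 'ids',
    ]);
    $knownUrls = array_map(
        fn($id) => get_post_meta($id, 'source_url', true),
        $existingIds
    );

    $createdIds = [];
    foreach (filter_new_listings($listings, $knownUrls) as $listing) {
        $createdIds[] = wp_insert_post(build_post_args($listing));
    }
    return $createdIds;
}
```

Keeping the comparison and the argument mapping in plain functions makes them easy to check outside WordPress. Only `import_listings` depends on the WordPress runtime.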

【Comments】: