【Question title】: Parse content from inside div ignoring span
【Posted】: 2015-05-06 05:27:01
【Question】:

I am trying to get the story copy (the body text of the article).

$url = 'http://www.myfoxchicago.com/story/28987351/cook-county-jail-guards-sick-calls-tripled-on-night-of-big-fight';

//$content = file_get_contents($url);

$content = '<div id="WNStoryBody" class=""><span id="WNStoryDateline">CHICAGO (FOX 32 News) -
            </span><p>The Mayweather-Pacquiao...</p></div>';

preg_match_all('<div id=\"WNStoryBody\" class=\"\">
  <span id=\"WNStoryDateline\">CHICAGO (FOX 32 News) -
            <\/span>(.*?)<\/div>', $content, $matches);

print_r($matches);

The desired output is:

<p>The Mayweather-Pacquiao boxing match was billed as the fight of the century, and it may have contributed to a massive number of correctional officers calling in sick at the Cook County Jail this past weekend.</p><p>Between 7:00 AM Saturday and 3:00 PM Sunday, 637 correctional officers called in sick out of the 3100 employees scheduled to work.</p><p>Sheriff Tom Dart is beyond frustrated.</p><p>“This was triple what the normal call in would be,” Dart said.</p><p>Between 20 and 25 percent of the staff on each of the four shifts in question took a sick day.</p><p>“We are not, didn't just fall off the truck, there was a big boxing match, a lot of people were talking about it, did that play into this?” Dart said. “We'd be naïve not to realize that certain big events seem to coincide with people not showing up for work at times here,” he said.</p><p>But Teamsters Local 700, which represents the correctional officers, believes those accusations are a low blow to guards who took a sick day that they needed.</p><p>“I don't believe that the officers called in to watch the fight,” said Dennis Andrews, Business Agent for Teamsters Local 700.</p><p>Andrews attributes the sick call to coincidence and stress.</p><p>“The stress level on the officers is off the charts. 
So I think this is a culmination of the building up for the past several weeks of all the inmate fights, the inmates attacking staff,” Andrews said.</p><p>Sheriff Dart said what happened this weekend was horribly unfair to good employees and taxpayers.</p><p>“We have pled with people over and over again, listen, there's all sorts of mechanisms here so that you can take care of family members, people who are sick, but you can't just call in sick because you want a day off,” Dart said.</p><p>The sheriff said it could be hard to hand down discipline directly as a result of the sick calls, but he does monitor sick day abuse and it could impact the ability of employees to be promoted.</p><p>The Teamsters don't believe anyone should face discipline.</p><p>“If an officer has sick time and he called in sick, that's his earned time to used, there can't be any discipline,” Andrews said. “I don't believe anybody used it as a vacation day.”</p>

【Comments】:

    Tags: php html preg-match-all


    【Solution 1】:

    preg_match is not a robust way to parse HTML; you can use DOMDocument instead:

    $doc = new DOMDocument();
    $doc->loadHTML($content);
    
    // locate the story container by its id
    $div = $doc->getElementById('WNStoryBody');
    
    // loadHTML() expects a string, so serialize the <div> node first
    $ptag = new DOMDocument();
    $ptag->loadHTML($doc->saveHTML($div));
    
    $all_ptags = $ptag->getElementsByTagName('p');
    // $all_ptags is a DOMNodeList of <p> elements, not an array
    // loop over it with foreach and call $ptag->saveHTML($p) on each node to get its markup as a string
    
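    To get the desired output as a single string, the paragraphs can be serialized one by one. The following is only a sketch under a few assumptions: `story_paragraphs()` is a hypothetical helper name that does not appear in the answer, the sample markup is the shortened snippet from the question, and the libxml flags are there just to silence warnings about imperfect markup.

```php
<?php
// Sample markup copied from the question (the story text is shortened there too).
$content = '<div id="WNStoryBody" class=""><span id="WNStoryDateline">CHICAGO (FOX 32 News) -
            </span><p>The Mayweather-Pacquiao...</p></div>';

// Hypothetical helper: returns the markup of every <p> inside the element with the given id.
function story_paragraphs(string $html, string $id): string
{
    $doc = new DOMDocument();
    // LIBXML_NOERROR / LIBXML_NOWARNING keep libxml quiet about sloppy real-world HTML.
    $doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

    $div = $doc->getElementById($id);
    if ($div === null) {
        return '';                      // container not found
    }

    $out = '';
    // DOMElement::getElementsByTagName() only searches inside the container,
    // so the dateline <span> is skipped without any string replacement.
    foreach ($div->getElementsByTagName('p') as $p) {
        $out .= $doc->saveHTML($p);     // serialize one <p> node, tags included
    }
    return $out;
}

echo story_paragraphs($content, 'WNStoryBody'), PHP_EOL;
```

    For the sample markup this should print the `<p>` element on its own, without the dateline span. Calling `getElementsByTagName()` directly on the `<div>` element avoids the second DOMDocument; whether that is preferable is a matter of taste.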

    【Comments】:

    • I think $all_ptags = $dom->getElementsByTagName('p'); should be: $all_ptags = $ptag->getElementsByTagName('p');
    • Oops, that was a mistake!
    • Thanks for your help with this! :)
    【Solution 2】:

    Use DOMDocument for this:

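    $dom = new DOMDocument;
    $dom->loadHTML($content);
    // getElementById() returns a single DOMElement (or null), not a list to loop over
    $element = $dom->getElementById('WNStoryBody');
    // nodeValue holds text only, so serialize the element to markup before stripping the <span> tags
    print str_replace(array("<span id=\"WNStoryDateline\">", "</span>"), "", $dom->saveHTML($element));
    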

    You may have to extend the str_replace or turn it into a preg_replace.

    Use Viral's answer instead; it is the better approach because you don't have to replace anything.
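    A preg_replace variant might look like the sketch below. It assumes the span's opening tag is written exactly as in the question and that no other span is nested inside it; the `$markup` and `$clean` variable names are illustrative only.

```php
<?php
// Serialized story container, as in the question's sample.
$markup = '<div id="WNStoryBody" class=""><span id="WNStoryDateline">CHICAGO (FOX 32 News) -
            </span><p>The Mayweather-Pacquiao...</p></div>';

// Remove the dateline <span> together with its text.
// The s modifier lets . match the newline inside the span; .*? keeps the match non-greedy.
$clean = preg_replace('~<span id="WNStoryDateline">.*?</span>~s', '', $markup);

echo $clean, PHP_EOL;
// prints: <div id="WNStoryBody" class=""><p>The Mayweather-Pacquiao...</p></div>
```

    Unlike stripping only the tags, this also drops the dateline text, but it stays sensitive to any change in how the span is written.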

    【Comments】:
