Remove everything else from an image link but keep src (answers)

【Question title】: Remove everything else from an image link but keep src
【Posted】: 2016-06-24 22:37:02
【Question description】:

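I'm trying to remove some attributes from an image tag, but my code only removes the attribute's name and leaves the rest (its value) behind.

I have an image tag like this: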

<img class="aligncenter size-full wp-image-sd174" src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg" alt="alt title" srcset="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 700w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 241w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 624w" sizes="(max-width: 700px) 100vw, 700px" height="870" width="700">

I want to remove everything except <img src="image path">

I tried the code below, but it only removes the attribute's name, e.g. srcset.

$html = '<img class="aligncenter size-full wp-image-sd174" src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg" alt="alt title" srcset="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 700w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 241w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 624w" sizes="(max-width: 700px) 100vw, 700px" height="870" width="700">';

$one = preg_replace('#(<img.+?)srcset=(["\']?)\d*\2(.*?/?>)#i', '$1$3', $html);
$two= preg_replace('#(<img.+?)sizes=(["\']?)\d*\2(.*?/?>)#i', '$1$3', $one);

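【Question comments】:

I would probably do it the other way around: just grab the src attribute and build a new img tag.
How can I do this for all the images in an HTML string?
If your string contains more than a single image tag, you probably shouldn't be using a regex. HTML really needs to be parsed; take a look at DOMDocument.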
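
The "other way around" idea from the first comment could look roughly like the sketch below, which also covers every image in a string. The helper name rebuildImagesFromSrc() is made up for illustration, and the sketch assumes quoted src values and no ">" inside attribute values. For anything messier, the DOMDocument route from the last comment is safer.

```php
<?php
// Hypothetical helper (not from the thread): rebuild every <img> tag
// in a string from its src attribute alone.
function rebuildImagesFromSrc(string $html): string
{
    return preg_replace_callback(
        '/<img\b[^>]*>/i',
        function (array $m): string {
            // Find a quoted src value. The lookbehind keeps data-src= from
            // matching; srcset= never matches because "set" follows "src".
            if (preg_match('/(?<![\w-])src\s*=\s*(["\'])(.*?)\1/i', $m[0], $src)) {
                // Reuse the original quote character and value verbatim.
                return '<img src=' . $src[1] . $src[2] . $src[1] . '>';
            }
            return $m[0]; // no quoted src found: leave the tag untouched
        },
        $html
    );
}

echo rebuildImagesFromSrc('<p><img class="a" src="a.jpg" srcset="a.jpg 2x"> and <img data-src="x.jpg" alt="b" src="b.png" /></p>'), PHP_EOL;
// <p><img src="a.jpg"> and <img src="b.png"></p>
```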

Tags: php regex preg-replace preg-match-all

【Solution 1】:

Try this:

$html = preg_replace("/(<img\\s)[^>]*(src=\\S+)[^>]*(\\/?>)/i", "$1$2$3", $html);

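Rather than removing the unwanted attributes, it extracts the src attribute together with the opening and closing of the image tag.

It should work for any number of <img> tags in your HTML.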
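
A quick check of that claim, using made-up sample markup. The second half is an assumed tweak, not part of the answer: because `[^>]*` is greedy, the original pattern settles on the last `src=` in the tag, so a trailing `data-src` attribute wins. Requiring whitespace right before `src=` avoids that.

```php
<?php
$html = '<p><img class="a" src="one.jpg" alt="x"> text <img src="two.jpg" width="10" /></p>';

// The pattern from the answer, unchanged.
$clean = preg_replace("/(<img\\s)[^>]*(src=\\S+)[^>]*(\\/?>)/i", "$1$2$3", $html);
echo $clean, PHP_EOL;
// <p><img src="one.jpg"> text <img src="two.jpg"></p>

// Lazy-loading markup trips the original pattern up.
$lazy = '<img src="real.jpg" data-src="lazy.jpg">';
$loose = preg_replace("/(<img\\s)[^>]*(src=\\S+)[^>]*(\\/?>)/i", "$1$2$3", $lazy);
echo $loose, PHP_EOL;
// <img src="lazy.jpg">

// Assumed variant: the lookbehind demands whitespace immediately before "src=".
$strict = preg_replace('/(<img\s)[^>]*(?<=\s)(src=\S+)[^>]*(\/?>)/i', '$1$2$3', $lazy);
echo $strict, PHP_EOL;
// <img src="real.jpg">
```

Both patterns still rely on `\S+`, so a src value containing a space would be cut short.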

【Discussion】:

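【Solution 2】:

I suggest the following approach.

Since every attribute has to be separated by a space, you can split the tag into its attributes with a simple explode() call, then iterate over them to find the one you need and build a clean image tag.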

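function cleanImage($html) {
    $output = '';
    // Split the tag on spaces; assumes the src value itself contains no spaces.
    $image_components = explode(' ', $html);
    foreach ($image_components as $component) {
        if (substr($component, 0, 4) == 'src=') {
            // rtrim() drops a trailing ">" or "/>" when src is the last attribute.
            $output = '<img ' . rtrim($component, '/>') . '>';
            break;
        }
    }
    return $output;
}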


$html = '<img class="aligncenter size-full wp-image-sd174" src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg" alt="alt title" srcset="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 700w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 241w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 624w" sizes="(max-width: 700px) 100vw, 700px" height="870" width="700">';

$image = cleanImage($html);
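
The explode(' ') split only sees single spaces, so a tag whose attributes are separated by tabs or newlines would slip through. A minimal sketch of a whitespace-tolerant variant follows. The name cleanImageTag() is hypothetical, and the sketch still assumes a single tag whose src value is quoted and contains no whitespace.

```php
<?php
// Hypothetical variant of cleanImage(): splits on any run of whitespace
// and matches the attribute name case-insensitively.
function cleanImageTag(string $tag): string
{
    foreach (preg_split('/\s+/', trim($tag)) as $part) {
        if (stripos($part, 'src=') === 0) {
            // Drop a trailing ">" or "/>" stuck to the last attribute.
            return '<img ' . rtrim($part, '/>') . '>';
        }
    }
    return ''; // no src attribute found
}

echo cleanImageTag("<img alt=\"x\"\n\tsrc=\"a.jpg\">"), PHP_EOL;
// <img src="a.jpg">
```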

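【Discussion】:

【Solution 3】:

You can use the DOM extension to manipulate the HTML structure properly.

A regex may be fine in very simple cases, but it won't be a complete solution, no matter how elaborate it looks.

Stripping all <img> attributes except src: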

$html = '<img class="aligncenter size-full wp-image-sd174" src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg" alt="alt title" srcset="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 700w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 241w, http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg 624w" sizes="(max-width: 700px) 100vw, 700px" height="870" width="700">';

echo stripImageAttributes($html);

Output:

<img src="http://www.blahblah.com/wp-content/uploads/2016/06/07d333r.jpg">

Definition of stripImageAttributes():

(It is intended to process HTML fragments, not complete documents.)

/** 
 * @param string $html
 * @return string 
 */ 
function stripImageAttributes($html)
{
    // init document
    $doc = new DOMDocument();
    $doc->loadHTML('<!doctype html><html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head><body>' . $html . '</body></html>');

    // init xpath
    $xpath = new DOMXPath($doc);

    // process images
    $body = $xpath->query('/html/body')->item(0);

    foreach ($xpath->query('//img', $body) as $image) {
        // collect first, remove afterwards: the attribute map is live
        $toRemove = [];

        foreach ($image->attributes as $attr) {
            if ('src' !== $attr->name) {
                $toRemove[] = $attr;
            }
        }

        if ($toRemove) {
            foreach ($toRemove as $attr) {
                $image->removeAttribute($attr->name);
            }
        }
    }

    // convert the document back to a HTML string
    $html = '';
    foreach ($body->childNodes as $node) {
        $html .= $doc->saveHTML($node);
    }

    return $html;
}
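
If more than src has to survive (alt, for instance), the same DOM approach can take a whitelist. The sketch below is an assumed generalisation of stripImageAttributes(), not part of the answer: the name keepImageAttributes() and its $keep parameter are made up. It also silences the warnings libxml raises for markup it does not recognise.

```php
<?php
// Hypothetical generalisation: keep only the attributes named in $keep.
function keepImageAttributes(string $html, array $keep = ['src']): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<!doctype html><html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $image) {
        // Copy the names first; the attribute map changes as entries are removed.
        $names = [];
        foreach ($image->attributes as $attr) {
            $names[] = $attr->name;
        }
        foreach ($names as $name) {
            if (!in_array($name, $keep, true)) {
                $image->removeAttribute($name);
            }
        }
    }

    // Serialise only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $node) {
        $out .= $doc->saveHTML($node);
    }
    return $out;
}

echo keepImageAttributes('<img class="c" src="a.jpg" alt="A" width="5">', ['src', 'alt']), PHP_EOL;
// <img src="a.jpg" alt="A">
```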

【Discussion】: