How to handle ?_escaped_fragment_= for AJAX crawlers? Answers

【Question title】: How to handle ?_escaped_fragment_= for AJAX crawlers?
【Posted】: 2013-09-26 13:14:46
【Question description】:

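I am trying to make an AJAX-based website SEO-friendly. As recommended in tutorials on the web, I added "pretty" href attributes to my links: <a href="#!site=contact" data-id="contact" class="navlink">контакт</a>, and, inside the div where content is loaded with AJAX by default, I added a PHP script for crawlers: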

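// Build a whitelist of page names from the files in ./pages/
$files = glob('./pages/*.php');

foreach ($files as &$file) {
    // Strip the leading "./pages/" (8 characters) and the trailing ".php" (4 characters)
    $file = substr($file, 8, -4);
}
unset($file); // break the reference left over from the foreach

// Include the requested page only if it is on the whitelist
if (isset($_GET['site'])) {
    if (in_array($_GET['site'], $files, true)) {
        include "./pages/" . $_GET['site'] . ".php";
    }
}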

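I have a feeling that I first need to strip the _escaped_fragment_= part from (...)/index.php?_escaped_fragment_=site=about, because otherwise the script won't be able to GET the site value from the URL. Am I right?

But in any case, how do I know that the crawler transforms pretty links (those with #!) into ugly links (containing ?_escaped_fragment_=)? I've been told that this happens automatically and that I don't need to provide the mapping, but Fetch as Googlebot gives me no information about what happens to the URL.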

【Question comments】:

Tags: php ajax seo

【Solution 1】:

Googlebot requests the ?_escaped_fragment_= URLs automatically.

So for www.example.com/index.php#!site=about, Googlebot will request: www.example.com/index.php?_escaped_fragment_=site=about
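To make that rewrite concrete, here is a minimal sketch of the mapping in PHP. The function name toEscapedFragmentUrl is made up for illustration, and the escaping is deliberately simplified to the characters most likely to appear in a fragment (the specification lists more). The crawler does this on its side, so you do not need such code on your server; it only shows which URL to expect.

```php
<?php
// Hypothetical helper (not part of any library): mimics how a crawler that
// follows the AJAX crawling scheme rewrites a "#!" URL.
function toEscapedFragmentUrl(string $prettyUrl): string
{
    $pos = strpos($prettyUrl, '#!');
    if ($pos === false) {
        return $prettyUrl; // no hashbang, nothing to rewrite
    }

    $base     = substr($prettyUrl, 0, $pos);
    $fragment = substr($prettyUrl, $pos + 2);

    // Simplified escaping: only a handful of special characters are handled here.
    // strtr() with an array does not re-scan its own output, so "%" is not double-escaped.
    $escaped = strtr($fragment, [
        '%' => '%25',
        '#' => '%23',
        '&' => '%26',
        '+' => '%2B',
        ' ' => '%20',
    ]);

    // Append to an existing query string if the URL already has one
    $separator = (strpos($base, '?') === false) ? '?' : '&';

    return $base . $separator . '_escaped_fragment_=' . $escaped;
}

echo toEscapedFragmentUrl('www.example.com/index.php#!site=about'), "\n";
// prints www.example.com/index.php?_escaped_fragment_=site=about
```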

On the PHP side you will get $_GET['_escaped_fragment_'] = "site=about"

If you want to get the value of "site", you need to do something like this:

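// $files is the whitelist of page names built in the question's code
if (isset($_GET['_escaped_fragment_'])) {
    // "site=about" becomes ["site", "about"]; the limit of 2 keeps any later "=" inside the value
    $escaped = explode("=", $_GET['_escaped_fragment_'], 2);
    if (isset($escaped[1]) && in_array($escaped[1], $files)) {
        include "./pages/" . $escaped[1] . ".php";
    }
}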
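If the fragment may carry more than one key=value pair (for example site=about&lang=en), splitting on "=" is no longer enough. A possible variant, sketched here under the assumption of PHP 7.1 or later, uses parse_str() to read the fragment like a query string. The function name pageFromEscapedFragment and the $allowedPages list are hypothetical; in the question's code the whitelist would be $files.

```php
<?php
// Hypothetical helper: returns the whitelisted page name carried in
// _escaped_fragment_, or null if there is none or it is not allowed.
function pageFromEscapedFragment(array $query, array $allowedPages): ?string
{
    if (!isset($query['_escaped_fragment_']) || !is_string($query['_escaped_fragment_'])) {
        return null; // regular visitor, or a malformed parameter
    }

    // Parse "site=about&lang=en" the same way PHP parses a query string
    parse_str($query['_escaped_fragment_'], $params);

    $site = $params['site'] ?? null;

    // Strict whitelist check, so the value can never point outside ./pages/
    if (is_string($site) && in_array($site, $allowedPages, true)) {
        return $site;
    }

    return null;
}

// Example with made-up input; a real script would pass $_GET and $files
$allowed = ['about', 'contact'];
var_dump(pageFromEscapedFragment(['_escaped_fragment_' => 'site=about&lang=en'], $allowed)); // string(5) "about"
var_dump(pageFromEscapedFragment(['_escaped_fragment_' => 'site=../etc/passwd'], $allowed)); // NULL
```

The returned name can then be used in the include exactly as in the snippet above, e.g. include "./pages/" . $page . ".php"; when the result is not null.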

See the documentation:

https://developers.google.com/webmasters/ajax-crawling/docs/specification

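【Comments】:

I was afraid it would work that way, and by the time you replied I had already finished rewriting my site to a version without site= ;) Anyway, thanks for clearing up my doubts!
If you want Googlebot to crawl AJAX pages that have no hash in their URL, you can also add <meta name="fragment" content="!"> to the head of your page.
My site shows this on every page that contains a contact form. I am submitting the form with Ajax. What should I do? How can I remove #! and ?_escaped_fragment from the URLs? These URLs only show up when I use the A1 Sitemap Generator tool. Is this a problem from an SEO point of view? Please help.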