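【Posted】: 2018-06-22 22:16:34
【Question】:
Java and AngularJS on Google App Engine.
As for why: although I'm sure most crawlers can parse JavaScript sites, mine isn't being fully parsed, so my AngularJS site is not indexed properly. I've created a static version of the site and want to conditionally redirect to it based on the user agent. This works for every URL except the root of my site, i.e. localhost:8080 with or without a trailing slash.
I think this is because the url-pattern for Tuckey's UrlRewriteFilter in my web.xml is /*, so the filter isn't triggered when there is no trailing slash? I have tried changing it, though. I've tried everything I can think of: changing the servlet version to 3.0, using a welcome file, using an empty string for the url-pattern, and so on.
Thanks for your help.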
urlrewrite.xml:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE urlrewrite PUBLIC "-//tuckey.org//DTD UrlRewrite 4.0//EN"
"http://www.tuckey.org/res/dtds/urlrewrite4.0.dtd">
<urlrewrite use-query-string="true">
<rule>
<condition name="user-agent">
facebookexternalhit/[0-9]|facebook|Googlebot|Googlebot-Mobile|
Mediapartners-Google|AdsBot(.*)|AdSense(.*)|(.*)AdsBot|(.*)AdSense|
Googlebot-Image|Googlebot-Video|Googlebot(.*)|
FacebookExternalHit/[0-9]|Mediapartners-Google|AdsBot-Google
|facebookexternalhit/1.0|FacebookExternalHit/1.1|
FacebookExternalHit/1.0|facebookexternalhit/1.1|Facebot|Twitter|Twitterbot|Pinterest
</condition>
<from>^/(.*)$</from>
<to>/staticview.jsp</to>
</rule>
</urlrewrite>
web.xml:
<web-app version="2.5" xmlns="http://java.sun.com/xml/ns/javaee"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd">
<filter>
<filter-name>UrlRewriteFilter</filter-name>
<filter-class>org.tuckey.web.filters.urlrewrite.UrlRewriteFilter</filter-class>
</filter>
<filter-mapping>
<filter-name>UrlRewriteFilter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
<filter-mapping>
<filter-name>appstats</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
<filter>
<filter-name>appstats</filter-name>
<filter-class>com.google.appengine.tools.appstats.AppstatsFilter</filter-class>
<init-param>
<param-name>calculateRpcCosts</param-name>
<param-value>true</param-value>
</init-param>
</filter>
<servlet>
<servlet-name>appstats</servlet-name>
<servlet-class>com.google.appengine.tools.appstats.AppstatsServlet</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>appstats</servlet-name>
<url-pattern>/appstats/*</url-pattern>
</servlet-mapping>
<security-constraint>
<web-resource-collection>
<web-resource-name>appstats</web-resource-name>
<url-pattern>/appstats/*</url-pattern>
</web-resource-collection>
<auth-constraint>
<role-name>admin</role-name>
</auth-constraint>
</security-constraint>
<servlet>
<servlet-name>rss</servlet-name>
<servlet-class>com.byron.common.controller.RSSServlet</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>rss</servlet-name>
<url-pattern>/rss</url-pattern>
</servlet-mapping>
<servlet>
<servlet-name>rssfull</servlet-name>
<servlet-class>com.byron.common.controller.FullRSSServlet</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>rssfull</servlet-name>
<url-pattern>/rssfull</url-pattern>
</servlet-mapping>
<servlet>
<servlet-name>sitemap</servlet-name>
<servlet-class>com.byron.common.controller.SitemapServlet</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>sitemap</servlet-name>
<url-pattern>/sitemap</url-pattern>
</servlet-mapping>
<servlet>
<servlet-name>Jersey REST Service</servlet-name>
<servlet-class>com.sun.jersey.spi.container.servlet.ServletContainer</servlet-class>
<init-param>
<param-name>com.sun.jersey.config.feature.DisableWADL</param-name>
<param-value>true</param-value>
</init-param>
<!--
Please try to declare your resource classes statically in your Application implementation as
follows in order to minimize the startup time of your application.
-->
<init-param>
<param-name>javax.ws.rs.Application</param-name>
<param-value>com.byron.common.controller.Resources</param-value>
</init-param>
<load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
<servlet-name>Jersey REST Service</servlet-name>
<url-pattern>/rest/*</url-pattern>
</servlet-mapping>
</web-app>
【Discussion】:
-
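As an update: no solution yet, but some progress. Again, my concern is that when I visit the root of the site, the request doesn't actually pass through Tuckey's UrlRewriteFilter. Adding a welcome file to web.xml gave me mixed results.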
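<welcome-file-list> <welcome-file>posts</welcome-file> </welcome-file-list>
With the user agent "google", the root is redirected to my static version by way of "posts". This is the only way I've managed to get root -> static. However, "posts" isn't a real endpoint, so without the agent I just get "403 (Forbidden)" from the root, and the whole thing seems roundabout anyway.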
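One variation that may be worth testing, offered as a sketch rather than a confirmed fix: a filter-mapping with no dispatcher elements applies only to direct client requests. If the container resolves the root to a welcome file through an internal forward, the filter would be skipped for that step. The fragment below extends the existing UrlRewriteFilter mapping to cover forwarded requests as well. Whether App Engine actually handles the root this way is an assumption, not something verified here.

```xml
<!-- Sketch only: same mapping as in the question, plus explicit dispatcher types. -->
<!-- Assumption: the root request reaches the welcome file via an internal forward. -->
<filter-mapping>
    <filter-name>UrlRewriteFilter</filter-name>
    <url-pattern>/*</url-pattern>
    <!-- REQUEST is the default when no dispatcher element is given -->
    <dispatcher>REQUEST</dispatcher>
    <!-- FORWARD also runs the filter on container-internal forwards -->
    <dispatcher>FORWARD</dispatcher>
</filter-mapping>
```

This would replace the existing UrlRewriteFilter filter-mapping rather than being added alongside it. The dispatcher element is available from servlet 2.4 onward, so it should be valid in the version 2.5 descriptor shown in the question.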
Tags: java web.xml servlet-filters tuckey-urlrewrite-filter