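【Posted】:2012-01-17 21:48:46
【Question】:
I am trying to use HtmlUnit to test whether my GWT website loads correctly.
Unfortunately, the page I get back appears to be incomplete: it is missing content that is visible when I open the same page in a normal browser.
Here is the unit test that produces this output: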
WebClient webClient = new WebClient();
webClient.setThrowExceptionOnScriptError(false);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.waitForBackgroundJavaScript(30000);
HtmlPage page = webClient.getPage("http://www.ozdroid.com/#!BLOG/2010/10/12/How_to_Make_Google_AppEngine_Applications_Ajax_Crawlable");
System.out.println(page.asXml());
webClient.closeAllWindows();
Does anyone know what I can do to fix this and get the site's complete HTML?
Edit
Here is the updated output returned by page.asXml(), which is clearly incomplete:
<?xml version="1.0" encoding="ISO-8859-1"?>
<html xmlns:fb="http://www.facebook.com/2008/fbml">
<head>
<meta http-equiv=" content-type="">
<head>
<meta name="google-site-verification" content="_KCG8ec0LvgmXjnBAikAog0knc7jAbIGCu8Cmu2hsCI"/>
<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7"/>
<link rel="shortcut icon" href="favicon.ico"/>
<link rel="icon" type="image/gif" href="favicon.gif"/>
<title>
OzDroid - Enterprise Solutions for Android | Laser Barcode
scanners | RFID | Handheld Computers | Rugged PDA's and Mobile Phones
</title>
<script type="text/javascript">
//<![CDATA[
var _gaq = _gaq || [];
//]]>
</script>
<script type="text/javascript" language="javascript" src="ozdroid/ozdroid.nocache.js">
</script>
<script defer="defer">
//<![CDATA[
ozdroid.onInjectionDone('ozdroid')
//]]>
</script>
<script src="http://www.google-analytics.com/ga.js" type="text/javascript">
</script>
</head>
<body>
<!-- OPTIONAL: include this if you want history support --> <iframe src="javascript:''" id="__gwt_historyFrame" style="position: absolute; width: 0; height: 0; border: 0">
</iframe>
<noscript>
<div
style="width: 22em; position: absolute; left: 50%; margin-left: -11em; color: red; background-color: white; border: 1px solid red; padding: 4px; font-family: sans-serif">
<p>Welcome, to the website of OzDroid, we sell and distribute rugged Android
handheld computers, pda's and mobile phones. These devices can be equipped
with options including 1D and 2D laser barcode scanners, RFID, wifi,
bluetooth and cameras.</p>
<p> In the near future, we also
will be supplying logistics software for the same.
</p>
<p>As this site contains dynamic content that relies on javascript,
<b>your web browser must have JavaScript enabled</b> in order for this site to
display correctly.
</p></div>
</noscript>
<div id="fb-root">
</div>
<!-- Production --> <script src="http://connect.facebook.net/en_GB/all.js">
</script>
</body>
</html>
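Thanks
【Comments】:
-
GWT applications are rich JavaScript applications, not just static web pages. The HTML markup will not contain the source for everything you see when you load the page in a browser; most of it is loaded by JavaScript.
-
This could be an HtmlUnit bug. Why not ask there?
-
@NickJohnson I am using HtmlUnit precisely so that I can see the fully rendered page.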
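One thing that may be worth checking, offered as a guess rather than a confirmed fix: in the test above, waitForBackgroundJavaScript(30000) is called before getPage(...), so at that point there may be no background JavaScript to wait for. Calling it after getPage, and then polling until some expected text shows up in page.asXml(), might give the GWT module time to render. The sketch below shows only the polling part in plain Java so that it runs without HtmlUnit. RenderWait and waitUntil are hypothetical names, and the HtmlUnit calls in the comments assume the same API as in the question.

```java
import java.util.function.BooleanSupplier;

public class RenderWait {

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a page whose content only appears after about 200 ms.
        final long readyAt = System.currentTimeMillis() + 200;
        boolean rendered = waitUntil(() -> System.currentTimeMillis() >= readyAt, 2000, 50);
        System.out.println("rendered = " + rendered);

        // With HtmlUnit the same idea might look like this (assumed usage, not tested here;
        // "expected text" is a placeholder for something only the rendered page contains):
        //   HtmlPage page = webClient.getPage(url);
        //   webClient.waitForBackgroundJavaScript(30000);   // after getPage, not before
        //   boolean ok = waitUntil(() -> page.asXml().contains("expected text"), 30000, 500);
    }
}
```

If the condition never becomes true within the timeout, waitUntil returns false, which at least separates "the script never rendered anything" from "the test read the page too early".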