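【Posted】: 2017-11-19 07:06:18
【Question】:
I can get the HTML source of a page with the code below. But when I try it with https://marriott.medallia.com/sso/marriott/homepage.do?v=bnAaQvo3*lVHsqtnwluPh_CMCsIHyFkti&alreftoken=6d0d31c7eb7583b964d0ecb89b55e12b
the page URL changes dynamically, and when I view the source of the page that is generated next, the HTML body contains only the following: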
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>IdP Selection</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" type="text/css" href="style.min.css">
</head>
<body>
<div id="app-container" class="app-container"></div>
<script>
AppContext = {
idps: '[{"entityId":"MI-PROD-SAML2-IDP-MEDALLIA","name":"Marriott International (any associate w/ EID)"},{"entityId":"https://identity.starwoodhotels.com","name":"Starwood Hotels"}]'
};
</script>
<script src="main.min.js"></script>
</body>
</html>
When I inspect the generated radio buttons, I can see their HTML elements in the Elements tab of the browser's developer tools.
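For reference, the static source above does already contain the provider list: `AppContext.idps` is a JSON string, and `main.min.js` presumably builds the radio buttons from it at run time (an assumption, since that script is minified and not shown). A minimal JavaScript sketch of reading that string directly, with the value copied from the page source and the helper name `listIdps` being hypothetical:

```javascript
// Value copied verbatim from the AppContext block in the page source above.
var AppContext = {
  idps: '[{"entityId":"MI-PROD-SAML2-IDP-MEDALLIA","name":"Marriott International (any associate w/ EID)"},{"entityId":"https://identity.starwoodhotels.com","name":"Starwood Hotels"}]'
};

// Hypothetical helper: parse the JSON string into an array of {entityId, name} objects.
function listIdps(context) {
  return JSON.parse(context.idps);
}

var idps = listIdps(AppContext);
for (var i = 0; i < idps.length; i++) {
  console.log(idps[i].entityId + " -> " + idps[i].name);
}
```

This only shows that the identity provider names are available without the rendered elements; it does not reproduce whatever markup `main.min.js` actually generates.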
My C# code is as follows:
public Form1()
{
    InitializeComponent();
    this.webBrowser1.ObjectForScripting = new MyScript();
}

private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    webBrowser1.Navigate("javascript: window.external.CallServerSideCode();");
}

[ComVisible(true)]
public class MyScript
{
    public void CallServerSideCode()
    {
        var doc = ((Form1)Application.OpenForms[0]).webBrowser1.Document;
        var renderedHtml = doc.GetElementsByTagName("HTML")[0].OuterHtml;
        var marelement = doc.GetElementById("MI-PROD-SAML2-IDP-MEDALLIA");
        HtmlElementCollection eCollections = doc.GetElementsByTagName("HTML");
        string strDoc = eCollections[0].OuterHtml;
    }
}
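【Discussion】:
-
Does your code throw a specific exception when it runs?
-
The code runs fine, but I can't get the generated elements at run time. var marelement = doc.GetElementById("MI-PROD-SAML2-IDP-MEDALLIA"); comes back null here. :(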
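One possible reason for the null result, which the thread does not confirm, is that the lookup runs before the page's script has finished inserting the radio buttons. A common workaround is to poll for the element instead of checking once. The sketch below is written in ES5-style JavaScript with callbacks; the helper name `pollFor`, the interval, and the retry count are all hypothetical, and it also assumes, as the question does, that the generated radio button uses the entityId as its id:

```javascript
// Hypothetical helper: call find() every intervalMs until it returns a
// non-null value or maxTries attempts have been made, then call onDone once.
function pollFor(find, intervalMs, maxTries, onDone) {
  var tries = 0;
  function attempt() {
    tries++;
    var found = find();
    if (found !== null && found !== undefined) {
      onDone(found, tries);
    } else if (tries >= maxTries) {
      onDone(null, tries);
    } else {
      setTimeout(attempt, intervalMs);
    }
  }
  attempt();
}

// In the page, find would be something like:
//   function () { return document.getElementById("MI-PROD-SAML2-IDP-MEDALLIA"); }
// Stand-in for demonstration: the "element" only appears on the third check.
var checks = 0;
function fakeFind() {
  checks++;
  return checks >= 3 ? { id: "MI-PROD-SAML2-IDP-MEDALLIA" } : null;
}

pollFor(fakeFind, 10, 20, function (el, tries) {
  console.log(el ? "found after " + tries + " tries" : "gave up");
});
```

The same idea could be applied on the C# side by repeating the `GetElementById` call on a timer rather than once from `DocumentCompleted`; whether that is enough for this particular page is untested.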
Tags: javascript c# html winforms web-scraping