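[Posted]: 2018-10-28 04:36:13
[Question]:
I'm still a beginner, but I can read simple HTML structures.
However, on https://stockrow.com/AAPL/financials/income/annual I tried to pull the data into Excel using XMLHttpRequest, and the source I get back is missing the important table that contains all the key data. When I inspect the page in the browser, I can see the whole HTML structure.
This is the source I get back: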
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="apple-touch-icon-precomposed" sizes="57x57"
href="/favicons/apple-touch-icon-57x57.png" />
<link rel="apple-touch-icon-precomposed" sizes="114x114"
href="/favicons/apple-touch-icon-114x114.png" />
<link rel="apple-touch-icon-precomposed" sizes="72x72"
href="/favicons/apple-touch-icon-72x72.png" />
<link rel="apple-touch-icon-precomposed" sizes="144x144"
href="/favicons/apple-touch-icon-144x144.png" />
<link rel="apple-touch-icon-precomposed" sizes="60x60"
href="/favicons/apple-touch-icon-60x60.png" />
<link rel="apple-touch-icon-precomposed" sizes="120x120"
href="/favicons/apple-touch-icon-120x120.png" />
<link rel="apple-touch-icon-precomposed" sizes="76x76"
href="/favicons/apple-touch-icon-76x76.png" />
<link rel="apple-touch-icon-precomposed" sizes="152x152"
href="/favicons/apple-touch-icon-152x152.png" />
<link rel="icon" type="image/png" href="/favicons/favicon-196x196.png"
sizes="196x196" />
<link rel="icon" type="image/png" href="/favicons/favicon-96x96.png"
sizes="96x96" />
<link rel="icon" type="image/png" href="/favicons/favicon-32x32.png"
sizes="32x32" />
<link rel="icon" type="image/png" href="/favicons/favicon-16x16.png"
sizes="16x16" />
<link rel="icon" type="image/png" href="/favicons/favicon-128.png"
sizes="128x128" />
<meta name="application-name" content="stockrow.com"/>
<meta name="msapplication-TileColor" content="#FFFFFF" />
<meta name="msapplication-TileImage" content="/favicons/mstile-144x144.png"
/>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<link href="https://code.cdn.mozilla.net/fonts/fira.css" rel="stylesheet" type="text/css" />
<script src="https://www.google.com/recaptcha/api.js"></script>
<script src="https://cdn.ravenjs.com/3.15.0/raven.min.js"></script>
<script>Raven.config('https://3ce523a8252c436f83c6fc423b340c0a@sentry.io/144901').install()</script>
<meta name="csrf-param" content="authenticity_token" />
<link rel="stylesheet" media="screen" href="/packs/stockrow-aa9c6f09f554179248530de2e33baa9b.css" />
<script src="/packs/stockrow-a35b20c51d525016f7c7.js"></script>
<script async id="_ck_381101" src="https://forms.convertkit.com/381101?v=7"></script>
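I don't know how to solve this, so I thought I'd try Stack Overflow.
[Comments]:
-
The site's content is probably loaded mostly by JavaScript, so it is not in the initial HTML. Read up on web scraping tools.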
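One way to check this before reaching for a scraping tool is to count the tags in the raw response: if it has script tags but no table, the table is most likely built in the browser after the page loads. Below is a minimal sketch in Python using only the standard library. The `sample` string is a hypothetical, shortened stand-in for the response shown in the question (the `<body>` and `<div id="root">` parts are assumed, not taken from the real page), and `TagCounter` and `count_tags` are illustrative names.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how often each opening tag appears in a piece of static HTML."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tags(html_text):
    """Return a dict mapping tag name to number of occurrences."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


# Hypothetical sample resembling the response in the question:
# scripts are present, but there is no <table> in the static HTML.
sample = """<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<script src="/packs/stockrow-a35b20c51d525016f7c7.js"></script>
</head>
<body><div id="root"></div></body>
</html>"""

counts = count_tags(sample)
print("tables:", counts.get("table", 0), "scripts:", counts.get("script", 0))
```

In practice the text passed to `count_tags` would be the body of the HTTP response rather than a hard-coded string. A table count of zero alongside one or more scripts supports the comment above: the data is not in the initial HTML, so a plain HTTP request cannot see it.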