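【Posted】: 2021-04-26 05:23:10
【Question】:
I am building a web scraper to extract stock information and save it to a database. My plan is to fetch only the company name and the prices (latest price, closing price, YCP, etc.) and store them as objects.
URL (view source): https://www.dsebd.org/latest_share_price_scroll_l.php If needed, start from line 5460.
Here I need to loop over each tr first and then pull td[3-7] from each one.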
<div class="table-responsive inner-scroll">
<table class='table table-bordered background-white shares-table fixedHeader'>
<thead>
<tr>
<th width="4%">#</th>
<th width="12%">TRADING CODE</th>
<th width="12%">LTP*</th>
<th width="12%">HIGH</th>
<th width="12%">LOW</th>
<th width="12%">CLOSEP*</th>
<th width="12%">YCP*</th>
<th width="12%">CHANGE</th>
<th width="12%">TRADE</th>
<th width="12%">VALUE (mn)</th>
<th width="12%">VOLUME</th>
</tr>
</thead>
<tbody>
<tr>
<td width="4%">1</td>
<td width="15%">
<a href="displayCompany.php?name=1JANATAMF" class='ab1'>
1JANATAMF </a>
</td>
<td width="10%">6.3</td>
<td width="10%">6.7</td>
<td width="12%">6.3</td>
<td width="11%">6.5</td>
<td width="12%">6.6</td>
<td width="12%" style="color: red">-0.3</td>
<td width="11%">218</td>
<td width="11%">11.593</td>
<td width="11%">1,771,986</td>
</tr>
</tbody>
<tr>
<td width="4%">2</td>
<td width="15%">
<a href="displayCompany.php?name=1STPRIMFMF" class='ab1'>
1STPRIMFMF </a>
</td>
<td width="10%">20.2</td>
<td width="10%">21.9</td>
<td width="12%">20</td>
<td width="11%">20.2</td>
<td width="12%">21.3</td>
<td width="12%" style="color: red">-1.1</td>
<td width="11%">420</td>
<td width="11%">16.914</td>
<td width="11%">815,552</td>
</tr>
</tbody>... More stocks
Here is my code.
public Worker(ILogger<Worker> logger, IParseService parseService)
{
    _logger = logger;
    _parseService = parseService;
    _url = "https://www.dsebd.org/latest_share_price_scroll_l.php";
}

protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested)
    {
        var HtmlDoc = GetHtml(_url);
        var mainNode = HtmlDoc.DocumentNode.SelectSingleNode("//div[@class='table-responsive inner-scroll']/table[contains(@class, 'table table-bordered background-white shares-table fixedHeader')]").ChildNodes;
        foreach (var nodes in mainNode)
        {
            //Code to get the info
        }
    }
}
Thank you for reading my question; any help is greatly appreciated.
【Discussion】:
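One possible way to fill in the loop body, offered as a sketch rather than a definitive answer. It assumes the HtmlAgilityPack NuGet package is referenced and that `GetHtml` in the question returns an `HtmlDocument`. The `StockQuote` class, the `SharePriceParser` helper, and their member names are hypothetical and introduced only for illustration. Instead of walking `ChildNodes` (which also yields whitespace text nodes and the `thead`), it selects every `tr` that contains `td` cells, so the header row is skipped and rows that sit outside the `tbody` in the pasted markup are still found. Cell positions follow the header order shown in the question (`cells[1]` is TRADING CODE, `cells[2]` to `cells[6]` are LTP, HIGH, LOW, CLOSEP, YCP).

```csharp
using System.Collections.Generic;
using System.Globalization;
using HtmlAgilityPack;

// Hypothetical container; the class and property names are illustrative only.
public class StockQuote
{
    public string TradingCode { get; set; }
    public decimal? Ltp { get; set; }
    public decimal? High { get; set; }
    public decimal? Low { get; set; }
    public decimal? CloseP { get; set; }
    public decimal? Ycp { get; set; }
}

// Hypothetical helper, not part of the code in the question.
public static class SharePriceParser
{
    // Every <tr> under the shares table that has at least one <td> child.
    // Header rows only contain <th>, so they do not match.
    private const string RowXPath =
        "//div[@class='table-responsive inner-scroll']" +
        "/table[contains(@class, 'shares-table')]//tr[td]";

    public static List<StockQuote> Parse(HtmlDocument doc)
    {
        var quotes = new List<StockQuote>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var rows = doc.DocumentNode.SelectNodes(RowXPath);
        if (rows == null) return quotes;

        foreach (var row in rows)
        {
            var cells = row.SelectNodes("./td");
            if (cells == null || cells.Count < 7) continue; // skip incomplete rows

            quotes.Add(new StockQuote
            {
                TradingCode = HtmlEntity.DeEntitize(cells[1].InnerText).Trim(),
                Ltp = ToDecimal(cells[2]),
                High = ToDecimal(cells[3]),
                Low = ToDecimal(cells[4]),
                CloseP = ToDecimal(cells[5]),
                Ycp = ToDecimal(cells[6])
            });
        }

        return quotes;
    }

    // Parses cell text such as "6.3" or "1,771,986"; returns null if it is not a number.
    private static decimal? ToDecimal(HtmlNode cell)
    {
        var text = HtmlEntity.DeEntitize(cell.InnerText).Trim();
        return decimal.TryParse(text, NumberStyles.Number,
                                CultureInfo.InvariantCulture, out var value)
            ? value
            : (decimal?)null;
    }
}
```

Inside `ExecuteAsync` this could be called as `var quotes = SharePriceParser.Parse(HtmlDoc);` in place of the `foreach` over `ChildNodes`. If the live page contains more than one table matching the XPath, the expression would need to be narrowed further.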
Tags: c# web-scraping html-table html-parsing html-agility-pack