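【Posted】: 2021-10-31 14:58:32
【Question】:
I'm using Cheerio to scrape a few websites for a project. This code fetches data from a website and pushes it into several different arrays.
The problem is that I can't seem to isolate the price information. Here is my NodeJS code: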
// Gets all available keyboards from mykeyboard.eu (first page only) //
app.get('/data', (req, res) => {
  const MyKeyboardEU = 'https://mykeyboard.eu/catalogue/category/mechanical-keyboards_3/?selected_facets=num_in_stock_exact%3A%5B1+TO+%2A%5D';
  const MKEUResults = [];
  const MKEUThumbs = [];
  const MKEUPrice = [];
  // Gets in-stock results from mykeyboard.eu //
  Axios.get(MyKeyboardEU)
    .then((response) => {
      let $ = cheerio.load(response.data);
      let keyboards = $('.thumbnail');
      let price = $('.price_color').children;
      // Pushes keyboard names + thumbnail links to respective arrays //
      for (var i = 0; i < keyboards.length; i++) {
        MKEUResults.push(keyboards[i].attribs.alt);
        MKEUThumbs.push(keyboards[i].attribs.src);
        console.log(price[i]);
      }
      // Maps array into single object for consumption on frontend //
      let arr = MKEUResults.map((res, idx) => {
        return {'name': res, "img": MKEUThumbs[idx]}
      });
      res.send(arr);
    })
    .catch((err) => res.send(err));
});
This is what console.log(price[i]) outputs:
<ref *1> [
keeb-finder-server-1 | Node {
keeb-finder-server-1 | type: 'text',
keeb-finder-server-1 | data: '€179.00',
keeb-finder-server-1 | parent: Node {
keeb-finder-server-1 | type: 'tag',
keeb-finder-server-1 | name: 'p',
keeb-finder-server-1 | namespace: 'http://www.w3.org/1999/xhtml',
keeb-finder-server-1 | attribs: [Object: null prototype],
keeb-finder-server-1 | 'x-attribsNamespace': [Object: null prototype],
keeb-finder-server-1 | 'x-attribsPrefix': [Object: null prototype],
keeb-finder-server-1 | children: [Circular *1],
keeb-finder-server-1 | parent: [Node],
keeb-finder-server-1 | prev: [Node],
keeb-finder-server-1 | next: [Node]
keeb-finder-server-1 | },
keeb-finder-server-1 | prev: null,
keeb-finder-server-1 | next: null
keeb-finder-server-1 | }
keeb-finder-server-1 | ]
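Reading the dump above: the logged array holds a single text node whose data field is the price string, and whose parent is the p tag. As a minimal sketch in plain JavaScript, the hand-built objects below only mimic that logged shape (they are hypothetical stand-ins, not real Cheerio nodes, and the second price is invented for illustration). The helper collectText is likewise a hypothetical name; it joins the data of every text node found beneath an element.

```javascript
// Hypothetical stand-ins that mimic the shape of the logged nodes.
// Real Cheerio nodes carry more fields (parent, prev, next, attribs, ...).
const priceElements = [
  { type: 'tag', name: 'p', children: [{ type: 'text', data: '€179.00' }] },
  { type: 'tag', name: 'p', children: [{ type: 'text', data: '€95.50' }] },
];

// Concatenate the data of every text node at or beneath the given node.
function collectText(node) {
  if (node.type === 'text') return node.data;
  return (node.children || []).map(collectText).join('');
}

// One trimmed price string per element.
const prices = priceElements.map((el) => collectText(el).trim());
console.log(prices);
```

Walking the children recursively, rather than reading children[0].data directly, keeps the sketch working if the price text is ever wrapped in a nested tag.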
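For the record, it prints several of these messages, one for each item on the site. I just want to get the data component of all these responses.
I'm sure I've missed something fairly simple while reading the documentation, but I can't seem to get it to work.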
【Comments】:
Tags: javascript node.js web-scraping cheerio