【Title】: Unable to scrape text from a div tag using Cheerio.js in Node?
【Posted】: 2018-11-20 16:03:57
【Question】:

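I am trying to learn web scraping and how to use Cheerio.js to scrape text from DOM elements. Here is my problem: I have a div tag with another div tag inside it, and inside the second div tag I have a <p> tag. I want to extract the text from that <p> tag. How can I do this?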

var express = require('express');
var fs = require('fs');
var request = require('request');
var cheerio = require('cheerio');
var app     = express();

app.get('/scrape', function(req, res){

    //All the web scraping magic will happen here

    var url = 'https://ihub.co.ke/jobs';

    // The structure of our request call
    // The first parameter is our URL
    // The callback function takes 3 parameters, an error, response status code and the html

    request(url, function(error, response, html){

        // First we'll check to make sure no errors occurred when making the request

        if(!error){
            // Next, we'll utilize the cheerio library on the returned html which will essentially give us jQuery functionality

            var $ = cheerio.load(html);
            console.log('Html data',$);

            // Finally, we'll define the variables we're going to capture

            var title;
            var json = { title : ""};


            thing = $(".container-fluid-post-job-link").text();


            console.log('Thing',thing);


        }
    })


})

app.listen('8081')

console.log('Magic happens on port 8081');

HTML
<div class="container-fluid jobs-board">
<div class="container-fluid post-job-link">
<p>Advertise a Job Vacancy for KES 1,500 for 2 months.</p>
<p><a href="/myjobs" class="btn btn-outline-primary btn-md btn-block">Post Job</a></p>
</div>


//I want to extract text from the 2 <p> tags that are inside the <div> tag which has a class = container-fluid post-job-link

【Comments】:

    Tags: javascript jquery html node.js cheerio


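    【Solution 1】:

    You need the correct query selector. The div has two separate classes, container-fluid and post-job-link, so chain them as .container-fluid.post-job-link instead of writing .container-fluid-post-job-link, which looks for a single class with that hyphenated name. The following examples may help: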

    thing = $(".container-fluid.post-job-link").text();
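
To see why the two selectors behave differently, here is a minimal dependency-free sketch of how a compound class selector is matched. It is not Cheerio's implementation; `matchesClassSelector` is a hypothetical helper written only for illustration. The `class` attribute is a whitespace-separated list of tokens, so `.container-fluid.post-job-link` requires both tokens, while `.container-fluid-post-job-link` asks for one token that the element does not have.

```javascript
// Hypothetical helper, for illustration only (not part of Cheerio).
// Checks whether a class attribute value satisfies a compound class
// selector such as ".a.b" (every listed class must be present).
function matchesClassSelector(classAttr, selector) {
    // The class attribute is a whitespace-separated list of tokens
    var tokens = classAttr.trim().split(/\s+/);
    // ".a.b" -> ["a", "b"]
    var required = selector.split('.').filter(Boolean);
    return required.every(function (name) {
        return tokens.indexOf(name) !== -1;
    });
}

// Class attribute of the inner div from the question's HTML
var classAttr = 'container-fluid post-job-link';

// Two classes chained without a space: both tokens are present
console.log(matchesClassSelector(classAttr, '.container-fluid.post-job-link')); // true
// One hyphenated class name: no such single token exists
console.log(matchesClassSelector(classAttr, '.container-fluid-post-job-link')); // false
```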
    

    Or, to get the text of one specific element:

    thing = $(".container-fluid.post-job-link").find('a').first().text();
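
Because `.text()` on the container concatenates the text of every descendant, the two paragraphs come back as a single string separated by whitespace. Below is a minimal sketch of splitting that string into one entry per paragraph. The `raw` value is an assumption about what `.text()` returns for the HTML in the question, and `splitLines` is a hypothetical helper, not a Cheerio function.

```javascript
// Hypothetical helper: split concatenated text into trimmed, non-empty lines
function splitLines(text) {
    return text
        .split('\n')
        .map(function (line) { return line.trim(); })
        .filter(function (line) { return line.length > 0; });
}

// Assumed result of $(".container-fluid.post-job-link").text()
// for the HTML snippet in the question
var raw = '\nAdvertise a Job Vacancy for KES 1,500 for 2 months.\nPost Job\n';

var lines = splitLines(raw);
console.log(lines);
```

With Cheerio itself, iterating over `.find('p')` and calling `.text()` on each matched element should give the same per-paragraph result without any string splitting.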
    

    【Comments】:
