【Question title】: Getting the PDF blob from a URL and inserting it into Drive directly using the puppeteer library and fetch
【Posted】: 2019-01-15 18:05:40
【Question】:

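I'm trying to use puppeteer to log in to a website and "download" a PDF directly to my Drive. I have managed to reach the PDF page with puppeteer, and I tried (among other attempts) using fetch with the cookies to get a blob to send to Drive. I can't post the login details here, but it would be great if you could help me find the error (or errors) in the code! Currently it goes to the page before the PDF, grabs the link, fetches it with the cookies, and inserts a PDF into Drive, but the PDF is corrupted and 0 KB in size.

I have tried setRequestInterception, getPdf (from puppeteer), using a buffer, and a few other things I found while researching.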

 //Page before pdfPage. Here I got the link: urlPdf
 //await page.goto(urlPdf); 
 //await page.waitForNavigation();
 //const htmlPdf = await page.content();

 const cookies = await page.cookies()
 const opts = {
    headers: {
        cookie: cookies
    }
};

 let blob = await fetch(urlPdf,opts).then(r => r.blob());
 console.log("got the blob")
 // upload file in specific folder

 var file ;
  console.log("driveApi upload reached")
  function blobToFile(req){
    file = req.body.blob
    //A Blob() is almost a File() - it's just missing the two properties below which we will add
    file.lastModifiedDate = new Date();
    file.name = 'teste.pdf'; //req.body.word;
    return file;
  }


var folderId = myFolderId;
var fileMetadata = {
  'name': 'teste.pdf',
  parents: [folderId]
};
var media = {
  mimeType: 'application/pdf',
  body: file
};
drive.files.create({
  auth: jwToken,
  resource: fileMetadata,
  media: media,
  fields: 'id'
}, function(err, file) {
  if (err) {
    // Handle error
    console.error(err);
  } else {
    console.log('File Id: ', file.data.id);
  }
});

【Comments】:

    Tags: javascript node.js google-drive-api puppeteer


    【Solution 1】:

    I tried many things, but the final solution was posted here:

    Puppeteer - How can I get the current page (application/pdf) as a buffer or file?

    const fs = require('fs');

    await page.setRequestInterception(true);
    
    page.on('request', async request => {
        if (request.url().indexOf('exibirFat.do') > 0) { // True only on the PDF page (in my case, of course)
          const options = {
            encoding: null, // return the body as a Buffer instead of a string
            method: request.method(),
            uri: request.url(),
            body: request.postData(),
            headers: request.headers()
          };
          /* add the cookies */
          const cookies = await page.cookies();
          options.headers.Cookie = cookies.map(ck => ck.name + '=' + ck.value).join(';');
          /* resend the request */
          const buffer = await request_client(options); // PDF Buffer
          const filename = 'file.pdf';
          fs.writeFileSync(filename, buffer); // Save file
          // The PDF was fetched separately, so drop the browser's own request
          // instead of leaving it pending.
          await request.abort();
       } else {
          request.continue();
       }
    });
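
    The `Cookie` header set in the code above has to be a single string of `name=value` pairs, while `page.cookies()` returns an array of objects. Passing that array straight into `headers.cookie`, as the question's code does, is therefore unlikely to send a usable header. A minimal sketch of the conversion as a standalone function follows; the name `toCookieHeader` is hypothetical and not part of Puppeteer, and the cookie values are made up for illustration:

```javascript
// Hypothetical helper (not a Puppeteer API): turns the array returned by
// page.cookies() into the value of a Cookie request header.
function toCookieHeader(cookies) {
  return cookies.map(ck => `${ck.name}=${ck.value}`).join('; ');
}

// Example with hand-written cookie objects shaped like Puppeteer's.
const header = toCookieHeader([
  { name: 'JSESSIONID', value: 'abc123' },
  { name: 'lang', value: 'pt' },
]);
console.log(header); // JSESSIONID=abc123; lang=pt
```

    The same string can be used with fetch, for example `fetch(urlPdf, { headers: { cookie: toCookieHeader(cookies) } })`.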
    
    

    This solution requires: const request_client = require('request-promise-native');
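
    The question's goal was to put the PDF into Drive rather than on disk. Once the PDF is available as a Buffer, it can be wrapped in a stream and passed as the `media.body` of `drive.files.create`. The following is only a sketch under assumptions: `uploadPdfBuffer` and `bufferToStream` are hypothetical helper names, `drive` is assumed to be a googleapis Drive v3 client, and `auth` is assumed to be the JWT client from the question (`jwToken`).

```javascript
const { PassThrough } = require('stream');

// Wrap a Buffer in a readable stream so it can be used as an upload body.
function bufferToStream(buffer) {
  const stream = new PassThrough();
  stream.end(buffer);
  return stream;
}

// Hypothetical helper: `drive` is assumed to be a googleapis Drive v3 client
// and `auth` the JWT client used in the question.
async function uploadPdfBuffer(drive, auth, buffer, name, folderId) {
  const res = await drive.files.create({
    auth,
    resource: { name, parents: [folderId] },
    media: { mimeType: 'application/pdf', body: bufferToStream(buffer) },
    fields: 'id',
  });
  return res.data.id; // id of the file created in Drive
}
```

    Inside the request handler this could replace the `fs.writeFileSync` call, for example `await uploadPdfBuffer(drive, jwToken, buffer, 'teste.pdf', myFolderId)`.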

    【Comments】:
