【Question title】: Content-script extracting data from HTML?
【Posted】: 2020-07-17 14:23:28
【Question】:

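I am trying to use a content script to parse and extract email data from my Gmail account. Because Gmail uses dynamically generated DOM paths, the paths change on every reload and the script fails. Here is my code: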

function extractData() {
    var email = document.querySelector("//*[@id=":1u"]/div[2]/table/tbody/tr/td[2]/table[1]/tbody/tr/td/table/tbody/tr[3]/td/table[2]/tbody/tr/td[2]/div/span[3]/span/a");
    email = email.textContent;
    var amount = document.querySelector("#\\:2k > div:nth-child(2) > table > tbody > tr > td:nth-child(2) > table:nth-child(1) > tbody > tr > td > table > tbody > tr:nth-child(3) > td > table:nth-child(2) > tbody > tr > td:nth-child(2) > div > span:nth-child(4)");
    amount = amount.textContent;
    const regex = /(\$[0-9,]+(\.[0-9]{2})?)/;
    amount = amount.match(regex);
    amount = amount[0].replace('$', '');

    var date = document.querySelector("#\\:2k > div:nth-child(2) > table > tbody > tr > td:nth-child(2) > table:nth-child(1) > tbody > tr > td > table > tbody > tr:nth-child(3) > td > table:nth-child(1) > tbody > tr > td:nth-child(4) > span > span:nth-child(1)");
    date = date.textContent;
    date = date.split(' ');
    date = date[0];

    console.log(email + amount + date);

} 

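How can I get around this? I think using a regex to extract the relevant data from the HTML might be an answer, but regex is beyond my skill level. The data I need to extract looks like this: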

You received a payment of {$10.00} USD from {NameHere} ({emailHere})

I need to extract the data between the curly braces.

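【Comments】:

  • @wOxxOm I'm not sure how that would let me find and select the new DOM id, such as @id=":1u", since the DOM ids are generated dynamically here?
  • Ah, use simpler hand-written selectors based on human-readable attributes, e.g. [data-message-id] [email]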
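
The attribute-based selector suggested in the last comment could look roughly like the sketch below. The helper name findSenderEmail is hypothetical, and the assumption that the message container carries a data-message-id attribute and the sender element an email attribute comes only from that comment; Gmail's markup may differ or change, so the function returns null instead of throwing when nothing matches.

```javascript
// Hypothetical helper: look up the sender address through attribute
// selectors instead of generated ids such as ":1u".
// Assumption (taken from the comment above): the message container has a
// data-message-id attribute and the sender element has an email attribute.
function findSenderEmail(root) {
    const el = root.querySelector('[data-message-id] [email]');
    // Return null rather than throwing when the markup does not match.
    return el ? el.getAttribute('email') : null;
}

// In a content script, the page's document serves as the root.
if (typeof document !== 'undefined') {
    console.log(findSenderEmail(document));
}
```

Passing the root as a parameter keeps the lookup easy to try against a single message element rather than the whole page.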

Tags: javascript regex firefox-addon-webextensions content-script


【Solution 1】:

If the email text always has the structure you showed, you can do this:

const text = "You received a payment of $10.00 USD from Some Person (mail@mail.com)";
const regex = /You received a payment of (.*) USD from (.*) \((.*)\)/;
const matches = text.match(regex);

const amount = matches[1];
const name = matches[2];
const email = matches[3];

console.log(amount, name, email);
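
A slightly more defensive variant of the snippet above, as a sketch: String.prototype.match returns null when the text does not have the expected shape, so reading matches[1] would throw. The helper name parsePayment and the named groups are illustrative choices, not part of the original answer. Unlike the snippet above, this pattern keeps the dollar sign out of the captured amount, which mirrors the replace('$', '') step in the question.

```javascript
// Hypothetical helper: parse one payment sentence, or return null if the
// text does not have the expected shape.
function parsePayment(text) {
    const regex = /You received a payment of \$(?<amount>[0-9,]+(?:\.[0-9]{2})?) USD from (?<name>.+?) \((?<email>[^()\s]+)\)/;
    const match = text.match(regex);
    if (match === null) {
        return null; // no payment sentence found
    }
    // Named groups are exposed on match.groups.
    const { amount, name, email } = match.groups;
    return { amount, name, email };
}

console.log(parsePayment("You received a payment of $10.00 USD from Some Person (mail@mail.com)"));
console.log(parsePayment("Unrelated text"));
```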

If the text varies, you can find the individual parts this way (though I don't recommend it, especially for names):

let text = "You received a payment of $10.00 USD from Some Person (mail@mail.com). But somemail@hotmail.com also sent $500.00 and then $15 more.";

const priceRegex = /\$\d+\.?\d*/g;
const nameRegex = /([A-Z][a-z]+ [A-Z][a-z]+)/g;
const emailRegex = /([a-zA-Z0-9._-]+@[a-zA-Z0-9]+\.[a-zA-Z0-9._-]{2,4}(\.[a-zA-Z0-9._-]+)?)/g;

console.log( text.match(priceRegex) );
console.log( text.match(nameRegex) );
console.log( text.match(emailRegex) );
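
The independent patterns above return every price, name, and email in the text without tying them to each other. If several notices share the sentence structure from the question, one alternative is to keep the anchored pattern and collect all occurrences with matchAll. This is a sketch only: extractPayments is a hypothetical helper, and feeding it document.body.innerText in a content script assumes the notice appears as plain text in the rendered message.

```javascript
// Hypothetical helper: collect every payment notice found in a block of text.
function extractPayments(text) {
    const regex = /You received a payment of (\$[0-9,]+(?:\.[0-9]{2})?) USD from (.+?) \(([^()\s]+)\)/g;
    // matchAll requires the g flag and yields one match per notice.
    return Array.from(text.matchAll(regex), (m) => ({
        amount: m[1],
        name: m[2],
        email: m[3],
    }));
}

// Sample input; in a content script this could be document.body.innerText.
const sample = "You received a payment of $10.00 USD from Some Person (mail@mail.com). " +
    "You received a payment of $1,250.50 USD from Other Person (other@example.com)";
console.log(extractPayments(sample));
```

Because each result comes from a single match, the amount, name, and email in one object always belong to the same notice.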
