【Question title】:Mysql regexp tag HTML
【Posted】:2021-09-30 03:03:01
【Question】:

I have HTML stored in the Message_html column of my database:

Dear CUSTOMER, <br />
<br />
Please be advised that the following document has been moved:<br />
Document number: D4D4D4D4D4D<br />
<br />
<table border="1">
    <th>Data</th>
    <th>Movimento</th>
    <th>Documento</th>
    <tr>
        <td>22/07/2021 15:35</td>
        <td>Juntada de contrarrazões</td>
        <td><a href="ver.aspx">ALERT - REPRESENTATIONS</a></td>
    </tr>
    <tr>
        <td>22/07/2021 15:38</td>
        <td>Juntada de certidão</td>
        <td><a href="ver.aspx">SUCCESS - CERTIFICATE</a></td>
    </tr>
    <tr>
        <td>22/07/2021 15:39</td>
        <td>Juntada de alvará</td>
        <td><a href="ver.aspx">NOTICE - PERMIT</a></td>
    </tr>
</table>
<br />
<br />
If you are no longer interested in receiving the push, access the link:: <a href="push.aspx">Exit</a><br />
<br />
<b>ATTENTION: this email is generated in an automated way, please do not reply.</b>

I need to check whether one of the table cells contains the word CERTIFICATE:

<td><a href="ver.aspx">CERTIFICATE</a></td>

The regular expression I am using in MySQL:

SELECT REGEXP_INSTR('<td><a href="ver.aspx">SUCCESS - CERTIFICATE</a></td>', '>[^<td><a*]*CERTIFICATE*[</a></td>]') AS verify;

    REGEXP_INSTR(k.Message_html, '>[^<td><a*]*CERTIFICATE*[</a></td>]')

This query finds no records:

SELECT *
FROM table AS k
WHERE REGEXP_INSTR(k.Message_html, CONCAT('>[^<td><a*]*', 'CERTIFICATE', '*[</a></td>]')) > 0;

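The regular expression does not find the word in the table, even though the word CERTIFICATE is present.

Contents of the words table: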

word_id word
1 SUBJECT
2 DECISION
3 ORDER
4 SENTENCE
5 PETITION
6 CERTIFICATE
7 AMENDMENT TO THE INITIAL PETITION
8 NOTIFICATION - NOTIFICATION
9 EXTRACT
10 PETITION - PETITION
11 NOTIFICATION
12 MANIFESTATION
13 OTHER PARTS
14 REPRESENTATIONS

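【Comments】:

  • What do you mean by "Cannot find record"? That is not a MySQL error message.
  • Whether a query returns any rows depends on the WHERE clause, not on the SELECT clause.
  • [^<td><a*] does not do what you think it does. It matches a single character that is not one of <, t, d, >, a, or *.
  • I have added the query where I used it.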
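
The point about bracket expressions in the comments above can be checked directly. The following is a minimal sketch, assuming MySQL 8.0 or later (where REGEXP_LIKE() is available); the column aliases are made up for illustration.

```sql
-- A bracket expression matches exactly ONE character.
-- [^<td><a*] means "any single character except <, t, d, >, a or *";
-- it does not mean "anything other than the text <td><a".
SELECT
  REGEXP_LIKE('x',    '^[^<td><a*]$') AS single_other_char,  -- 1: 'x' is not in the set
  REGEXP_LIKE('t',    '^[^<td><a*]$') AS single_listed_char, -- 0: 't' is in the set
  REGEXP_LIKE('<td>', '^[^<td><a*]$') AS whole_tag;          -- 0: four characters, the class matches only one
```

The same applies to the trailing [</a></td>] in the question's pattern: it matches one character out of <, /, a, >, t, d rather than the closing tags as a sequence.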

Tags: mysql regex regexp-replace


【Solution 1】:

If you are only testing whether a string matches a regular expression, there is no need to use REGEXP_INSTR(). Use RLIKE:

SELECT *
FROM YourTable AS k
WHERE k.Message_html RLIKE '<td><a [^<]*CERTIFICATE'

[^<]* matches any run of characters that does not contain the start of another tag, so the pattern matches when the <a> element contains CERTIFICATE in its text.
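
To see how [^<]* behaves, here is a small sketch run against literal strings instead of a table. The first two literals come from the question's HTML; the third, with a nested <b> tag, is a made-up case added to show where the pattern stops.

```sql
SELECT
  -- CERTIFICATE appears in the link text before any further '<': match
  '<td><a href="ver.aspx">SUCCESS - CERTIFICATE</a></td>'
      RLIKE '<td><a [^<]*CERTIFICATE' AS in_link_text,   -- 1
  -- a different link text: no match
  '<td><a href="ver.aspx">NOTICE - PERMIT</a></td>'
      RLIKE '<td><a [^<]*CERTIFICATE' AS other_text,     -- 0
  -- hypothetical markup: the '<' of <b> ends the [^<]* run before CERTIFICATE
  '<td><a href="ver.aspx"><b>CERTIFICATE</b></a></td>'
      RLIKE '<td><a [^<]*CERTIFICATE' AS nested_tag;     -- 0
```

Because [^<]* also consumes the attributes of the <a> tag, the pattern would likewise match if the word appeared inside the href value, which may or may not be what you want.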

To join with the words table:

SELECT * 
FROM Table1 AS k 
JOIN words p ON k.Message_html RLIKE CONCAT('<td><a [^<]*', p.word);
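
The join can be tried without creating any tables. The sketch below uses inline derived tables as stand-ins for Table1 and words; the three sample words are taken from the question's word list, and everything else is an assumption for illustration.

```sql
SELECT p.word_id, p.word
FROM (
  -- stand-in for Table1: one message row
  SELECT '<td><a href="ver.aspx">SUCCESS - CERTIFICATE</a></td>' AS Message_html
) AS k
JOIN (
  -- stand-in for the words table
  SELECT 6 AS word_id, 'CERTIFICATE' AS word
  UNION ALL SELECT 3, 'ORDER'
  UNION ALL SELECT 14, 'REPRESENTATIONS'
) AS p
  ON k.Message_html RLIKE CONCAT('<td><a [^<]*', p.word);
-- returns the single row (6, 'CERTIFICATE')
```

Two things to keep in mind with this approach: a message that contains several listed words produces one result row per matching word, and each word is interpolated into the pattern as a regular expression, so a word containing metacharacters such as ( or + would need escaping first.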


【Solution 2】:

This will find your CERTIFICATE:

SELECT REGEXP_INSTR('<td><a href="ver.aspx">SUCCESS - CERTIFICATE</a></td>'
, '(<td><a href="ver.aspx">).*CERTIFICATE.*(</a></td>)') AS verify;

    SELECT REGEXP_INSTR('Dear CUSTOMER, <br />
<br />
Please be advised that the following document has been moved:<br />
Document number: D4D4D4D4D4D<br />
<br />
<table border="1">
    <th>Data</th>
    <th>Movimento</th>
    <th>Documento</th>
    <tr>
        <td>22/07/2021 15:35</td>
        <td>Juntada de contrarrazões</td>
        <td><a href="ver.aspx">ALERT - REPRESENTATIONS</a></td>
    </tr>
    <tr>
        <td>22/07/2021 15:38</td>
        <td>Juntada de certidão</td>
        <td><a href="ver.aspx">SUCCESS - CERTIFICATE</a></td>
    </tr>
    <tr>
        <td>22/07/2021 15:39</td>
        <td>Juntada de alvará</td>
        <td><a href="ver.aspx">NOTICE - PERMIT</a></td>
    </tr>
</table>
<br />
<br />
If you are no longer interested in receiving the push, access the link:: <a href="push.aspx">Exit</a><br />
<br />
<b>ATTENTION: this email is generated in an automated way, please do not reply.</b>', '(<td><a href="ver.aspx">).*CERTIFICATE.*(</a></td>)') AS verify;
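
For reading the results of the queries above: REGEXP_INSTR() returns the 1-based position at which the match starts, 0 when there is no match, and NULL when either argument is NULL. A minimal sketch with short made-up strings:

```sql
SELECT
  REGEXP_INSTR('abc CERTIFICATE', 'CERTIFICATE') AS found_at,  -- 5: the match starts at character 5
  REGEXP_INSTR('abc PERMIT',      'CERTIFICATE') AS not_found, -- 0: no match
  REGEXP_INSTR(NULL,              'CERTIFICATE') AS null_in;   -- NULL
```

In a WHERE clause this means a row qualifies when the result is greater than 0, as in WHERE REGEXP_INSTR(k.Message_html, '...') > 0.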

