【Question title】:Searching for keywords from a table, in a generic text
【Posted】:2018-04-20 14:24:36
【Question】:

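I created a table that holds a list of keywords together with a code that identifies synonyms, i.e. all keywords with the same code are treated as the same keyword.

 varchar | varchar                    | tsvector
---------+----------------------------+-------------------------------------
 C1000   | AI                         | 'ai':1
 C1000   | Artificial intelligence    | 'artifici':1 'intellig':2
 C1001   | Algorithms                 | 'algorithm':1
 C1002   | Software Design            | 'design':2 'softwar':1
 C1003   | ui design                  | 'design':2 'ui':1
 C1003   | User interface design      | 'design':3 'interfac':2 'user':1
 C1003   | user interface engineering | 'engin':3 'interfac':2 'user':1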

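I want to build a query that returns the list of keyword codes found in a given text.

For example, the following text (just a sample) should return the array [C1001,C1003]: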

A good ui design starts from a good algorithm design, for this you need a good user interface engineering.

Is there a way to do this with a Postgres query or a custom function?

【Comments】:

    Tags: postgresql full-text-search keyword keyword-search tsvector


    【Answer 1】:

    You could use a Naive Bayes classifier, a widely used algorithm for text classification. Learn more from here

    【Comments】:

      【Answer 2】:

      You can convert the text to a tsvector and each keyword to a tsquery, then check whether the vector matches the query:

      => \d codes 
      Column  |       Type        | Modifiers 
      ---------+-------------------+-----------
      code    | character varying | 
      keyword | character varying | 
      
      => select * from codes ;
       code  |          keyword           
      -------+----------------------------
      C1000 | AI
      C1000 | Artificial intelligence
      C1001 | Algorithms
      C1002 | Software Design
      C1003 | ui design
      C1003 | User interface design
      C1003 | user interface engineering
      (7 rows)
      
      => select distinct code from codes where to_tsvector('A good ui design starts from a good algorithm design, for this you need a good user interface engineering.') @@ plainto_tsquery(keyword);
      code  
      -------
      C1001
      C1003
      (2 rows)
      
      => select array_agg(distinct code) from codes where to_tsvector('A good ui design starts from a good algorithm design, for this you need a good user interface engineering.') @@ plainto_tsquery(keyword);
      array_agg   
      ---------------
      {C1001,C1003}
      (1 row)
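
Since the question also asks about a custom function, the query above could be wrapped as follows. This is only a sketch under stated assumptions: the function name `find_codes`, the parameter name `doc`, and the explicit `'english'` configuration are not from the original post. It also swaps `plainto_tsquery` for `phraseto_tsquery` (available from PostgreSQL 9.6), because `plainto_tsquery` only requires that all lexemes of a keyword occur somewhere in the text, whereas `phraseto_tsquery` requires them to appear next to each other in order.

```sql
-- Hypothetical helper: find_codes, doc and the 'english' configuration are assumptions.
-- Returns the distinct codes whose keyword occurs as a phrase in the given text,
-- or NULL when no keyword matches (array_agg over zero rows yields NULL).
CREATE OR REPLACE FUNCTION find_codes(doc text)
RETURNS varchar[]
LANGUAGE sql
STABLE
AS $$
    SELECT array_agg(DISTINCT code)
    FROM codes
    WHERE to_tsvector('english', doc) @@ phraseto_tsquery('english', keyword);
$$;

-- Usage with the sample sentence from the question:
SELECT find_codes('A good ui design starts from a good algorithm design, for this you need a good user interface engineering.');
```

If looser matching is preferred, where the words of a keyword may appear anywhere in the text, `phraseto_tsquery` can be replaced with `plainto_tsquery` as in the original query.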
      

      【Comments】:
