【Question Title】: Joining two tables based on column regular expression matching
【Posted】: 2019-09-06 07:00:11
【Question Description】:

AIM: Join two tables where a column of one table contains a word listed in a column of the other table.

Table 1

tibble::tribble(
  ~ORDER_ID,                                                         ~SUPERNAME_WITH_MODS, ~QUANTITY,
          1,                           "Mods, , 2 Regular Fries, 2 Regular Fries (Mods),",         0,
          2,     "Tomatoes, Tomatoes (Toppings), Toppings, , Lettuce, Lettuce (Toppings)",         0,
          3, "Chicken, Dirty Chicken Cheeseburger, Dirty Chicken Cheeseburger (Chicken),",         0,
          4,  "Garlic & Buttermilk Mayo Dip Pot, Garlic & Buttermilk Mayo Dip Pot (Dips)",         0,
          5,                            "Garlic Bread Pizza, , Verdure, Verdure (Pizza),",         0,
          6,   "Skinny Fries (Salt) - Large, Skinny Fries (Salt) - Large (Add On Sides),",         0,
          7,                    "CYOSalad Veg, , Green Beans, Green Beans (CYOSalad Veg)",         0,
          8,                     "Little Five Guys Style, Little Five Guys Style (Fries)",         0,
          9,                          "Chicken de Volaille (Mains), Chips, Chips (Sides)",         0,
         10,                                     "Modifiers, Medium, Medium (Modifiers),",         0
  )

Table 2:

tibble::tribble(
  ~ingredient, ~contributor,
       "beef",       "beef",
      "chili",       "beef",
     "chilli",       "beef",
       "pork",       "pork",
      "bacon",       "pork",
    "chicken",    "chicken"
  )

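Expected result

Join Table 1 and Table 2 on the SUPERNAME_WITH_MODS column of Table 1 wherever it contains any word from the ingredient column of Table 2. Note that if there is no match, it should return NULL.

I also want to stress that it must match whole words only, in any case (upper or lower).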

tibble::tribble(
  ~ORDER_ID,                                                         ~SUPERNAME_WITH_MODS, ~QUANTITY, ~ingredient, ~contributor,
          1,                           "Mods, , 2 Regular Fries, 2 Regular Fries (Mods),",         0,      "NULL",       "NULL",
          2,     "Tomatoes, Tomatoes (Toppings), Toppings, , Lettuce, Lettuce (Toppings)",         0,      "NULL",       "NULL",
          3, "Chicken, Dirty Chicken Cheeseburger, Dirty Chicken Cheeseburger (Chicken),",         0,   "chicken",    "chicken",
          4,  "Garlic & Buttermilk Mayo Dip Pot, Garlic & Buttermilk Mayo Dip Pot (Dips)",         0,      "NULL",       "NULL",
          5,                            "Garlic Bread Pizza, , Verdure, Verdure (Pizza),",         0,      "NULL",       "NULL",
          6,   "Skinny Fries (Salt) - Large, Skinny Fries (Salt) - Large (Add On Sides),",         0,      "NULL",       "NULL",
          7,                    "CYOSalad Veg, , Green Beans, Green Beans (CYOSalad Veg)",         0,      "NULL",       "NULL",
          8,                     "Little Five Guys Style, Little Five Guys Style (Fries)",         0,      "NULL",       "NULL",
          9,                          "Chicken de Volaille (Mains), Chips, Chips (Sides)",         0,      "NULL",       "NULL",
         10,                                     "Modifiers, Medium, Medium (Modifiers),",         0,      "NULL",       "NULL"
  )

Question: I know this will be a left join, but I am not sure what I should write in the ON part of the join.

【Question Comments】:

Tags: sql regex ansi-sql snowflake-cloud-data-platform


【Solution 1】:

I'm not sure how you want to handle the case where there are multiple matches. If needed, you can add a GROUP BY clause.

SELECT * FROM t1 LEFT JOIN t2 ON CONTAINS(LOWER(t1.supername_with_mods), LOWER(t2.ingredient));
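
CONTAINS tests for a substring, so on its own it would also match an ingredient embedded in a longer word, which the question wants to exclude. The following is a minimal sketch of a whole-word, case-insensitive left join, run against an in-memory SQLite database from Python rather than against Snowflake. The `word_match` function is a hypothetical helper registered only for this illustration, and the rows with ORDER_ID 11 and 12 are invented to show a multi-match case and a substring-only case. The second query shows one possible way to collapse multiple matches with GROUP BY; picking `MIN(contributor)` is an arbitrary choice, not something the question specifies.

```python
import re
import sqlite3


def word_match(text, word):
    """Return 1 if `word` occurs in `text` as a whole word, ignoring case."""
    if text is None or word is None:
        return 0
    # \b marks a word boundary; re.escape keeps the ingredient literal.
    pattern = r"\b" + re.escape(word) + r"\b"
    return 1 if re.search(pattern, text, flags=re.IGNORECASE) else 0


conn = sqlite3.connect(":memory:")
# Hypothetical helper exposed to SQL; it is not a built-in function.
conn.create_function("word_match", 2, word_match)

conn.executescript("""
    CREATE TABLE t1 (order_id INTEGER, supername_with_mods TEXT, quantity INTEGER);
    CREATE TABLE t2 (ingredient TEXT, contributor TEXT);
""")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", [
    (1, "Mods, , 2 Regular Fries, 2 Regular Fries (Mods),", 0),
    (3, "Chicken, Dirty Chicken Cheeseburger, Dirty Chicken Cheeseburger (Chicken),", 0),
    (11, "Chilli Beef Burger", 0),  # invented row: two ingredients match
    (12, "Beefy Fries", 0),         # invented row: "beef" is only a substring
])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [
    ("beef", "beef"), ("chili", "beef"), ("chilli", "beef"),
    ("pork", "pork"), ("bacon", "pork"), ("chicken", "chicken"),
])

# Left join: unmatched orders keep NULL in the t2 columns.
rows = conn.execute("""
    SELECT t1.order_id, t2.ingredient, t2.contributor
    FROM t1
    LEFT JOIN t2 ON word_match(t1.supername_with_mods, t2.ingredient)
    ORDER BY t1.order_id, t2.ingredient
""").fetchall()

# One row per order: GROUP BY collapses multiple matches.
grouped = conn.execute("""
    SELECT t1.order_id, MIN(t2.contributor) AS contributor
    FROM t1
    LEFT JOIN t2 ON word_match(t1.supername_with_mods, t2.ingredient)
    GROUP BY t1.order_id
    ORDER BY t1.order_id
""").fetchall()

for row in rows:
    print(row)
for row in grouped:
    print(row)
```

In this sketch, order 11 produces two rows in the first query (one per matching ingredient) and a single row in the grouped query, while orders 1 and 12 come back with NULL in the joined columns.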

【Comments】:
