[Posted]: 2021-12-12 12:31:15
[Problem description]:
I want to split a string that contains accents using a query that has none.
Here is my current code:
const sanitizer = (text: string): string => {
  return text
    .normalize("NFD")
    .replace(/\p{Diacritic}/gu, "")
    .toLowerCase();
};
const splitter = (text: string, query: string): string[] => {
  const regexWithQuery = new RegExp(`(${query})|(${sanitizer(query)})`, "gi");
  return text.split(regexWithQuery).filter((value) => value);
};
Here is the test file:
import { splitter } from "@/utils/arrayHelpers";
describe("arrayHelpers", () => {
  describe("splitter", () => {
    const cases = [
      {
        text: "pepe dominguez",
        query: "pepe",
        expectedArray: ["pepe", " dominguez"],
      },
      {
        text: "pépé dominguez",
        query: "pepe",
        expectedArray: ["pépé", " dominguez"],
      },
      {
        text: "pepe dominguez",
        query: "pépé",
        expectedArray: ["pepe", " dominguez"],
      },
      {
        text: "pepe dominguez",
        query: "pe",
        expectedArray: ["pe", " pe", " dominguez"],
      },
      {
        text: "pepe DOMINGUEZ",
        query: "DOMINGUEZ",
        expectedArray: ["pepe ", "DOMINGUEZ"],
      },
    ];
    it.each(cases)(
      "should return an array of strings with 2 elements [pepe, dominguez]",
      ({ text, query, expectedArray }) => {
        // When I call the splitter function
        const textSplitted = splitter(text, query);
        // Then I must have an array of two elements
        expect(textSplitted).toStrictEqual(expectedArray);
      }
    );
  });
});
The problem is the second case:
{
  text: "pépé dominguez",
  query: "pepe",
  expectedArray: ["pépé", " dominguez"],
}
The sanitized form of the query pepe is still pepe, so neither alternative in the regex occurs in pépé dominguez.
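To make the failure concrete, here is a minimal sketch that reuses the sanitizer from the question. The names `patternSource`, `matchesOriginal` and `matchesSanitized` are illustrative only and are not part of the original code.

```typescript
// Same sanitizer as in the question: decompose, drop combining marks, lowercase.
const sanitizer = (text: string): string =>
  text.normalize("NFD").replace(/\p{Diacritic}/gu, "").toLowerCase();

const query = "pepe";
const text = "pépé dominguez";

// Both alternatives are built from the query, so both are accent-free.
const pattern = new RegExp(`(${query})|(${sanitizer(query)})`, "gi");

// The two alternatives are identical.
const patternSource = pattern.source; // "(pepe)|(pepe)"

// The accented text contains no plain "pepe", so split() finds nothing to split on.
const matchesOriginal = pattern.test(text); // false

// The same pattern does match once the text itself is sanitized,
// but then the accents are gone from the result.
const matchesSanitized = new RegExp(pattern.source, "gi").test(sanitizer(text)); // true
```

The accents have to be removed from the text for matching, yet the returned pieces have to come from the untouched text.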
I don't know how to make the splitter function return ["pépé", " dominguez"] in this case.
I want the pieces of the original text in the result, not the sanitized text.
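One possible direction, offered as a sketch rather than a definitive answer: search in a sanitized copy of the text, but keep a table that maps every position in that copy back to an offset in the original, then slice the original at the mapped offsets. The helper `foldWithOffsets` and its return shape are hypothetical names introduced here. The sketch also assumes an ES2015+ target and uses `indexOf` instead of a `RegExp`, so the query needs no escaping.

```typescript
// Same folding as the question's sanitizer: decompose, drop combining marks, lowercase.
const sanitizer = (text: string): string =>
  text.normalize("NFD").replace(/\p{Diacritic}/gu, "").toLowerCase();

// Hypothetical helper: folds `text` one code point at a time and records, for each
// UTF-16 unit of the folded string, the offset in `text` that produced it.
const foldWithOffsets = (text: string): { folded: string; offsets: number[] } => {
  let folded = "";
  const offsets: number[] = [];
  let position = 0;
  for (const char of text) {
    const foldedChar = sanitizer(char);
    for (let i = 0; i < foldedChar.length; i++) offsets.push(position);
    folded += foldedChar;
    position += char.length;
  }
  offsets.push(text.length); // sentinel: a match ending at the very end maps to text.length
  return { folded, offsets };
};

const splitter = (text: string, query: string): string[] => {
  const needle = sanitizer(query);
  if (needle === "") return text === "" ? [] : [text];

  const { folded, offsets } = foldWithOffsets(text);
  const parts: string[] = [];
  let cursor = 0; // offset in the original text of the first character not yet emitted
  let from = 0; // offset in the folded text where the next search starts

  for (let hit = folded.indexOf(needle, from); hit !== -1; hit = folded.indexOf(needle, from)) {
    const start = offsets[hit];
    const end = offsets[hit + needle.length];
    if (start > cursor) parts.push(text.slice(cursor, start)); // text before the match
    if (end > start) parts.push(text.slice(start, end)); // the match, accents intact
    cursor = end;
    from = hit + needle.length;
  }
  if (cursor < text.length) parts.push(text.slice(cursor)); // trailing text
  return parts;
};
```

With this sketch, `splitter("pépé dominguez", "pepe")` yields `["pépé", " dominguez"]`, and `splitter("ééé", "e")` yields `["é", "é", "é"]`. For the fourth test case it yields `["pe", "pe", " dominguez"]`. The test file expects `" pe"` as the second element, which does not occur at that position in the text, so that expectation may be a typo.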
[Discussion]:
-
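Usually you don't remove the diacritics; you replace the accented letters with other letters, e.g. .replace('é', 'e'). stackoverflow.com/questions/286921/…
-
I think the sanitizer function already does that job. But I don't want to sanitize the result.
-
What would you do with text: "ééé"?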
Tags: javascript regex