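For best performance, if the strings may be long and you need to support all Unicode characters, use Set<Integer> and retainAll(), where the integer values are Unicode code points.
In Java 8, that can be done with the following code: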
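// Imports required by the snippets in this answer.
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

private static int countDistinctCommonChars(String s1, String s2) {
    Set<Integer> set1 = s1.codePoints().boxed().collect(Collectors.toSet());
    Set<Integer> set2 = s2.codePoints().boxed().collect(Collectors.toSet());
    set1.retainAll(set2); // keep only the code points that also occur in s2
    return set1.size();
}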
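To illustrate what codePoints() is doing in the method above, here is a minimal sketch of the same count written as a plain loop, without streams. The class name CodePointLoopSketch and the helper distinctCodePoints are made up for this illustration and are not part of the original answer.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class name, used only for this illustration.
class CodePointLoopSketch {

    // Collects the distinct Unicode code points of s without using streams.
    static Set<Integer> distinctCodePoints(String s) {
        Set<Integer> result = new HashSet<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);     // reads a whole code point, even if it spans two chars
            result.add(cp);
            i += Character.charCount(cp);  // advance by 1 char, or by 2 for a surrogate pair
        }
        return result;
    }

    // Counts the distinct code points that occur in both strings.
    static int countDistinctCommonChars(String s1, String s2) {
        Set<Integer> set1 = distinctCodePoints(s1);
        set1.retainAll(distinctCodePoints(s2)); // set intersection, in place
        return set1.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinctCommonChars("hello", "lend"));                 // 2
        System.out.println(countDistinctCommonChars("mississippi", "expressionless")); // 3
    }
}
```

Building each set takes one pass over its string, and retainAll() against a hash-based set is roughly linear in the set size, which is why this approach should hold up for long inputs.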
If you would like the common characters themselves returned, you can do it like this:
private static String getDistinctCommonChars(String s1, String s2) {
    Set<Integer> set1 = s1.codePoints().boxed().collect(Collectors.toSet());
    Set<Integer> set2 = s2.codePoints().boxed().collect(Collectors.toSet());
    set1.retainAll(set2); // keep only the code points that also occur in s2
    int[] codePoints = set1.stream().mapToInt(Integer::intValue).toArray();
    Arrays.sort(codePoints); // sort so the result has a predictable order
    return new String(codePoints, 0, codePoints.length); // build a String from code points
}
Test
public static void main(String[] args) {
    test("hello", "lend");
    test("lend", "hello");
    test("mississippi", "expressionless");
    test("expressionless", "comprehensible");
    test("????", "?????"); // Extended, i.e. 2 chars per code point
}
private static void test(String s1, String s2) {
    System.out.printf("Found %d (\"%s\") common chars between \"%s\" and \"%s\"%n",
                      countDistinctCommonChars(s1, s2),
                      getDistinctCommonChars(s1, s2),
                      s1, s2);
}
Output
Found 2 ("el") common chars between "hello" and "lend"
Found 2 ("el") common chars between "lend" and "hello"
Found 3 ("ips") common chars between "mississippi" and "expressionless"
Found 8 ("eilnoprs") common chars between "expressionless" and "comprehensible"
Found 2 ("??") common chars between "????" and "?????"
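Note that the last test uses Unicode characters from the 'Domino Tiles' Unicode block (U+1F030 to U+1F09F), i.e. characters that are stored as surrogate pairs in a Java string. These characters are rendered as ? in this listing.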
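The effect of surrogate pairs can be sketched with strings built from explicit code points. The inputs below are hypothetical Domino Tiles code points chosen for illustration; they are not necessarily the strings used in the test above, and the class and helper names are invented for this example.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class SurrogatePairSketch {

    // Builds a String from explicit Unicode code points.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    // Distinct code points of s, collected into a HashSet so the result is certainly mutable.
    static Set<Integer> distinctCodePoints(String s) {
        return s.codePoints().boxed().collect(Collectors.toCollection(HashSet::new));
    }

    // Distinct UTF-16 char values of s; surrogate halves are counted separately.
    static Set<Integer> distinctChars(String s) {
        return s.chars().boxed().collect(Collectors.toCollection(HashSet::new));
    }

    public static void main(String[] args) {
        // Assumed inputs: 4 and 5 domino tiles, two of which (U+1F031, U+1F033) are shared.
        String s1 = fromCodePoints(0x1F030, 0x1F031, 0x1F032, 0x1F033);
        String s2 = fromCodePoints(0x1F031, 0x1F033, 0x1F035, 0x1F037, 0x1F039);

        System.out.println(s1.length());                       // 8: two chars per code point
        System.out.println(s1.codePointCount(0, s1.length())); // 4: actual characters

        Set<Integer> common = distinctCodePoints(s1);
        common.retainAll(distinctCodePoints(s2));
        System.out.println(common.size());                     // 2

        // Comparing char values instead also matches the shared high surrogate 0xD83C.
        Set<Integer> commonChars = distinctChars(s1);
        commonChars.retainAll(distinctChars(s2));
        System.out.println(commonChars.size());                // 3
    }
}
```

With these inputs, comparing char values reports three matches because every tile shares the same high surrogate, while comparing code points reports the two tiles that are actually common.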