Find the maximum number of characters that two strings have in common

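【Question title】: Find maximum number of characters that both strings have in common
【Posted】: 2014-02-13 08:57:25
【Question】:

Find the maximum number of characters that two strings have in common. Characters are case-sensitive, i.e. lowercase and uppercase characters are considered different.

Here is my code: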

#include <iostream>
#include <string>
using namespace std;

int main() {
    std::string a, b;
    int number_cases = 0, count = 0;
    cin >> number_cases;
    while (number_cases != 0) {
        cin >> a;
        cin >> b;
        for (int i = 0; i < a.size(); i++) {
            for (int j = 0; j < b.size(); j++) {
                if (a[i] == b[j]) {
                    count++;
                    b[j] = '#';
                    break;
                }
            }
        }
        cout << count << endl;
        count = 0;
        --number_cases;
    }
}

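But it takes more than 1 second to run, and I need it to finish in 1 second or less. Are there any optimization tips?

【Question comments】:

This is better suited to CodeReview.
For starters, please define "have in common". Depending on the meaning, the simplest solution is to sort the strings and then compare them. (How long are the strings? Even though it is O(n^2), I would expect your code to finish almost instantly for strings of normal length.)
Is this from the currently running programming contest, codechef.com/FEB14/problems/LCPESY? You mention 1 sec. I don't think this should be answered.

Tags: c++ string algorithm stl comparison

【Solution 1】:

Just sort them and use set_intersection: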

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>

int main()
{
    std::string s1 = "Hello";
    std::string s2 = "World";

    std::sort(begin(s1), end(s1));
    std::sort(begin(s2), end(s2));

    std::string s3;
    std::set_intersection(begin(s1), end(s1), begin(s2), end(s2), std::back_inserter(s3));
    std::cout << s3.size() << ":" << s3;
}


Note: if you are only interested in the distinct overlapping characters, you can run std::unique on s3.
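A minimal sketch of what that note might look like in practice. The function name count_distinct_common is hypothetical and not part of the answer. It relies on the fact that the output of std::set_intersection is sorted, so equal characters are adjacent and std::unique followed by erase leaves one copy of each.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: counts the distinct characters present in both strings.
// The strings are taken by value because the function sorts its own copies.
std::size_t count_distinct_common(std::string s1, std::string s2)
{
    std::sort(s1.begin(), s1.end());
    std::sort(s2.begin(), s2.end());

    std::string common;
    std::set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(),
                          std::back_inserter(common));

    // The intersection is sorted, so duplicates sit next to each other.
    // std::unique only moves them to the end; erase actually drops them.
    common.erase(std::unique(common.begin(), common.end()), common.end());
    return common.size();
}
```

Note that std::unique on its own does not shrink the string, which is why the erase call is needed before taking the size.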

【Comments】:

Is it case-sensitive?
Yes (but that can be changed by passing an appropriate predicate)
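One way the predicate idea could be sketched, assuming a case-insensitive match is wanted. The names less_ignoring_case and count_common_ignoring_case are hypothetical. The same comparator has to be passed to both std::sort calls and to std::set_intersection, because the algorithm requires its inputs to be sorted by the ordering it compares with.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical comparator: orders characters by their lowercase form,
// so 'L' and 'l' are treated as equivalent.
bool less_ignoring_case(char lhs, char rhs)
{
    return std::tolower(static_cast<unsigned char>(lhs)) <
           std::tolower(static_cast<unsigned char>(rhs));
}

// Hypothetical helper: counts common characters (with duplicates),
// ignoring case. Strings are taken by value because they are sorted here.
std::size_t count_common_ignoring_case(std::string s1, std::string s2)
{
    std::sort(s1.begin(), s1.end(), less_ignoring_case);
    std::sort(s2.begin(), s2.end(), less_ignoring_case);

    std::string common;
    std::set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(),
                          std::back_inserter(common), less_ignoring_case);
    return common.size();
}
```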

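【Solution 2】:

Assume there are only 256 possible character values. We can scan each string once and store the count of each character in two arrays: int[] arrayA for string A and int[] arrayB for string B.

Finally, iterate over arrayA and arrayB from index 0 to 255, adding to the result:

result += Min(arrayA[i], arrayB[i]);

The time complexity is O(n + m + 256) = O(n + m), where n and m are the lengths of strings A and B.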
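The steps above could be sketched in C++ roughly as follows. This assumes 8-bit characters, and the function name count_common is hypothetical. Each character is converted to unsigned char before indexing, since plain char may be signed and would otherwise produce negative indices.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: counts common characters (with duplicates)
// using one frequency table per string.
std::size_t count_common(const std::string& a, const std::string& b)
{
    std::array<std::size_t, 256> countA{};  // value-initialised to zero
    std::array<std::size_t, 256> countB{};

    for (char c : a) {
        ++countA[static_cast<unsigned char>(c)];
    }
    for (char c : b) {
        ++countB[static_cast<unsigned char>(c)];
    }

    // A character shared k times contributes min(countA, countB) matches.
    std::size_t result = 0;
    for (std::size_t i = 0; i < countA.size(); ++i) {
        result += std::min(countA[i], countB[i]);
    }
    return result;
}
```

For "Hello" and "World" this gives 2: one 'l' (two in the first string, one in the second) and one 'o'.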

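【Comments】:

【Solution 3】:

I'm not sure what you mean by "have in common", but for the first definition that comes to mind, the simplest solution is probably just to use two bool arrays: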

bool inA[256] = {};
for ( auto current = a.begin(), end = a.end(); current != end; ++ current ) {
    inA[static_cast<unsigned char>(*current)] = true;
}
bool inB[256] = {};
for ( auto current = b.begin(), end = b.end(); current != end; ++ current ) {
    inB[static_cast<unsigned char>(*current)] = true;
}
int count = 0;
for (int i = 0; i != 256; ++ i ) {
    if ( inA[i] && inB[i] ) {
        ++ count;
    }
}

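But of course, this does something completely different from what your code does. If you are looking for the longest common subsequence, or for all common subsequences, you will need a different algorithm.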
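For comparison, a minimal sketch of the longest common subsequence case, using the standard dynamic-programming recurrence. The answer does not give this code, and the name lcs_length is hypothetical. It runs in O(n * m) time and keeps only two rows of the table.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: length of the longest common subsequence of a and b.
std::size_t lcs_length(const std::string& a, const std::string& b)
{
    // prev[j] holds the LCS length of a[0..i-1) and b[0..j),
    // curr[j] is being filled in for a[0..i) and b[0..j).
    std::vector<std::size_t> prev(b.size() + 1, 0);
    std::vector<std::size_t> curr(b.size() + 1, 0);

    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            if (a[i - 1] == b[j - 1]) {
                curr[j] = prev[j - 1] + 1;                 // extend a match
            } else {
                curr[j] = std::max(prev[j], curr[j - 1]);  // skip one character
            }
        }
        std::swap(prev, curr);  // the finished row becomes the previous row
    }
    return prev[b.size()];
}
```

Unlike the counting approaches, order matters here: "Hello" and "World" share both 'l' and 'o', but in opposite order, so the longest common subsequence has length 1.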

【Comments】:

【Solution 4】:

The following may help:

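#include <algorithm>  // std::min
#include <cstddef>    // std::size_t
#include <string>

// Counts characters common to both strings, counting duplicates.
std::size_t count_letter_in_common_with_dup(const std::string& s1, const std::string& s2)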
{
    std::size_t present1[256] = {};
    std::size_t present2[256] = {};

    for (unsigned char c : s1) {
        ++present1[c];
    }
    for (unsigned char c : s2) {
        ++present2[c];
    }
    std::size_t res = 0;
    for (int i = 0; i != 256; ++i) {
        res += std::min(present1[i], present2[i]);
    }
    return res;
}

【Comments】: