How can I optimize my code from O(n^2) to O(n log n)?

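【Question title】: How can I optimize my code from O(n^2) to O(n log n)?
【Posted】: 2021-06-18 05:01:43
【Question description】:

Given an array of numbers, arrange them so that the concatenation forms the largest possible value. For example, if the given numbers are {54, 546, 548, 60}, the arrangement 6054854654 gives the largest value. If the given numbers are {1, 34, 3, 98, 9, 76, 45, 4}, the arrangement 998764543431 gives the largest value.

The function declaration provided is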

string printLargest(vector<string> &arr)

The solution I wrote is below.

string printLargest(vector<string> &arr) {
    for (int i=0; i<arr.size()-1; i++) {
        for (int j=i+1; j<arr.size(); j++) {
            string y = arr[i] + arr.at(j);
            string z = arr[j] + arr[i];
            if (z>y) swap(arr[j], arr[i]);
        }
    }
    string y="";
    for(string x:arr) y +=x;
    return y;
}

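The online judge reports "Time Limit Exceeded. Please optimize your code and submit again." I believe my solution takes O(n^2). The expected time complexity is O(N log N). How can I optimize my code?

【Question comments】:

Use a standard sorting algorithm? For example std::sort.
Algorithmically, wouldn't sorting the "highest" numbers first produce the highest number? As in sorting by their string representation, "99" > "989", and so on.
Sort the strings in descending order, but you still need brute force to choose between strings that share a prefix, for example 54,546 -> 54654 and 54,543 -> 54543. You may run into similar cases in combination with the next prefix...
You can't analyze the algorithmic complexity of this problem solely through the prism of the vector's size. The size of the integers contained in the vector matters too.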
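
The shared-prefix case raised in the comments can be checked directly. The following is a minimal sketch, not code from the thread: the helper names `join`, `naive_descending` and `by_concatenation` are hypothetical. It contrasts a plain descending string sort with the pairwise rule from the question (a goes before b when a+b > b+a).

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: joins the strings in their current order.
std::string join(const std::vector<std::string>& parts) {
    std::string out;
    for (const auto& p : parts) out += p;
    return out;
}

// Naive idea from the comments: sort the strings in plain descending order.
std::string naive_descending(std::vector<std::string> v) {
    std::sort(v.begin(), v.end(), std::greater<std::string>());
    return join(v);
}

// Pairwise rule from the question: a goes before b when a+b > b+a.
std::string by_concatenation(std::vector<std::string> v) {
    std::sort(v.begin(), v.end(),
              [](const std::string& a, const std::string& b) { return a + b > b + a; });
    return join(v);
}
```

For {"54", "546"} both functions return "54654", but for {"54", "543"} the plain descending sort returns "54354" while the concatenation rule returns the larger "54543".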

Tags: c++ arrays string algorithm

【Solution 1】:

This can be done using std::map.


#include <map>
#include <string>
#include <vector>

// Comparison class that lets std::map order its std::string keys.
// Keys such as "5" and "55" satisfy a+b == b+a; the tie-break keeps them
// distinct, otherwise the map would merge them into a single key.
struct classcomp {
    bool operator()(const std::string &a, const std::string &b) const {
        const std::string ab = a + b, ba = b + a;
        return ab != ba ? ab > ba : a < b;
    }
};

std::string printLargest(std::vector<std::string> &arr) {
    std::map<std::string, unsigned int, classcomp> orderedString;  // string -> number of occurrences in the vector
    for (const auto &s : arr) {  // n iterations
        ++orderedString[s];      // O(log(n)) comparisons per lookup/insert: std::map is a balanced tree
    }
    std::string r;
    for (const auto &entry : orderedString) {  // works because the comparator puts the string that must come first at the front
        for (unsigned j = 0; j < entry.second; j++) {  // n appends in total across both loops
            r += entry.first;
        }
    }
    return r;
}

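The overall complexity is O(m n log(n)), where m is the maximum length of a string in the vector and n is the size of the vector itself: each of the O(n log(n)) key comparisons builds and compares two strings of length up to 2m.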
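
One detail worth illustrating: under the rule a+b > b+a, strings such as "5" and "55" compare as equivalent, and a container with unique keys treats equivalent keys as one key. The sketch below is an alternative under that assumption, not the answer's own code: it uses std::multiset, which keeps equivalent elements separately, so no occurrence counting is needed. The names `ConcatGreater` and `printLargestMultiset` are hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Orders strings so that the one that should come first in the result sorts first.
struct ConcatGreater {
    bool operator()(const std::string& a, const std::string& b) const {
        return a + b > b + a;
    }
};

// Hypothetical variant of printLargest built on std::multiset.
std::string printLargestMultiset(const std::vector<std::string>& arr) {
    // multiset keeps equivalent keys (e.g. "5" and "55") as separate elements.
    std::multiset<std::string, ConcatGreater> ordered(arr.begin(), arr.end());
    std::string r;
    for (const auto& s : ordered) r += s;  // iteration follows the comparator's order
    return r;
}
```

With the inputs from the question, {"54", "546", "548", "60"} yields "6054854654", and {"5", "55"} yields "555". The complexity is the same O(m n log(n)) as for the map version.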

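【Comments】:

【Solution 2】:

Solving this by hand, what I would do is pick the number that starts with the highest digit in the sequence, right? In other words, I would manually sort the numbers by that criterion and then append them.
As long as you can describe the criterion, this is nothing more than a sorting algorithm in which the criterion is a custom comparator.

So basically, in the end, the code looks something like this: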

#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

inline bool better_digits(const std::string& a, const std::string& b);

std::string print_largest(std::vector<std::string> data)
{
    std::sort(data.begin(), data.end(), better_digits); // sort
    std::string result = std::accumulate(data.begin(), data.end(), std::string{}); // append
    return result;
}

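In other words, I did the same thing you already did, but with a better sorting algorithm, simply trusting that std::sort is efficient (it is). No need to reinvent the wheel.

Note: std::accumulate (declared in <numeric>) is available in every C++ standard, but only since C++20 does it move the accumulator instead of copying it, which matters when appending strings. Before C++20, simply append in a loop as you do in your own code.
Also, I removed the reference from the input so that the function has no side effects, but if modifying the input is allowed, by all means keep the reference.

All that remains is to define better_digits. For that we can use what you already did, and what TUI lover also used: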

inline bool better_digits(const std::string& a, const std::string& b)
{
    return a + b > b + a; // a goes first when "a then b" forms the larger number
}

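Note that I did not benchmark my variant against TUI lover's. That would be quite interesting to see. I posted mine because I find it more readable, but TUI lover's variant may well be more efficient. (Both are Θ(n log n), but the constant factor matters too.)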
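
Since the constant factor matters, one possible refinement is a comparator that decides a+b > b+a without allocating the two concatenated strings on every call. This is a sketch under that assumption and has not been benchmarked here; the names `concat_greater` and `largest_number` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns true when a+b > b+a, comparing character by character
// instead of building the two concatenated strings.
bool concat_greater(const std::string& a, const std::string& b) {
    const std::size_t n = a.size() + b.size();
    for (std::size_t i = 0; i < n; ++i) {
        // i-th character of a+b and of b+a, respectively.
        const unsigned char x = i < a.size() ? a[i] : b[i - a.size()];
        const unsigned char y = i < b.size() ? b[i] : a[i - b.size()];
        if (x != y) return x > y;
    }
    return false;  // a+b == b+a, so neither string has to go first
}

// Sorts a copy of the input with the allocation-free comparator and appends in a loop.
std::string largest_number(std::vector<std::string> data) {
    std::sort(data.begin(), data.end(), concat_greater);
    std::string result;
    for (const auto& s : data) result += s;
    return result;
}
```

The asymptotic cost is unchanged (each comparison still inspects up to a.size() + b.size() characters); only the temporary strings are avoided.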

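【Comments】:

Have you tried the cases 54,546 -> 54654 and 54,543 -> 54543?
@Spektre Isn't that handled automatically by a+b > b+a? And therefore already handled by the code in the question?
I haven't tested your solution, so I'm not sure... it is just a test case that breaks solutions based on naive sorting... your solution might also suffer from a follow-on prefix problem (if one exists) that would not be handled at all.
I used map because I don't fully trust std::sort. I still think your solution will most likely end up being faster.
@TUIlover It may even depend on the compiler, or on the STL version.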
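
The doubt about prefix cases can be tested empirically on small inputs by comparing the sorted result against an exhaustive search over all orderings. The following is a minimal sketch of such a cross-check; `brute_force_largest` and `sorted_largest` are hypothetical names, and the brute force is only practical for a handful of strings because it tries n! permutations.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Tries every ordering and keeps the largest concatenation.
std::string brute_force_largest(std::vector<std::string> v) {
    std::sort(v.begin(), v.end());  // next_permutation enumerates everything from the sorted order
    std::string best;
    do {
        std::string candidate;
        for (const auto& s : v) candidate += s;
        // Every candidate has the same length, so string order equals numeric order.
        if (candidate > best) best = candidate;
    } while (std::next_permutation(v.begin(), v.end()));
    return best;
}

// The sort-based approach with the a+b > b+a comparator.
std::string sorted_largest(std::vector<std::string> v) {
    std::sort(v.begin(), v.end(),
              [](const std::string& a, const std::string& b) { return a + b > b + a; });
    std::string out;
    for (const auto& s : v) out += s;
    return out;
}
```

For the two cases from the comments, both functions agree: {"54", "546"} gives "54654" and {"54", "543"} gives "54543".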

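【Solution 3】:

Here is an algorithm in O(ns), where n is the length of the array and s is the maximum length of a string. It uses a trie-like approach. Instead of padding with spaces or zeros, it pads with fake digits.

If the numbers 544 and 54 appear in the same group, 54 is treated as equivalent to 545 and should come first (we pad the last position of 54 with a fake 5).

Comparing [5, 554, 45]:

Round 1 (most significant digit): split into [5, 554], [45]
Round 2: [5, 554], [45]
Round 3: [5], [554], [45] (because 5 is padded with fake 5s)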


    
    # Fake padding digits: FAKE_DIGITS[d] stands in for the real digit d.
    FAKE_DIGITS = ['a','b','c','d','e','f','g','h','i','j']
    # Bucket order used at every position, highest first.
    DIGITS = ['9','j','8','i','7','h','6','g','5','f','4','e','3','d','2','c','1','b','a','0']

    def pad_with_fake(L, maxlen):
        # Pad every string to maxlen with the fake version of its leading digit.
        outL = []
        for x in L:
            while len(x) < maxlen:
                x = x + FAKE_DIGITS[int(x[0])]
            outL.append(x)
        return outL

    def arrange_number_at_digit_position(pos, inL):
        # Split one group into sub-groups by the character at index pos, in DIGITS order.
        outL = []
        for digit in DIGITS:
            subL = [x for x in inL if x[pos] == digit]
            if subL:
                outL.append(subL)
        return outL

    def arrange(inL):
        # Refine the grouping one position at a time; outLs[i + 1] is the grouping after round i.
        maxlen = max(len(l) for l in inL)
        outLs = [[inL]]
        for i in range(maxlen):
            outL = []
            for l in outLs[i]:
                outL = outL + arrange_number_at_digit_position(i, l)
            outLs.append(outL)
        return outLs

    def main():
        inL = [559, 5, 55, 59, 549, 544, 54]
        L = [str(l) for l in inL]
        L = pad_with_fake(L, max(len(s) for s in L))
        outLs = arrange(L)
        for l in outLs:
            print(l)
        final_string = ""
        for l in outLs[-1]:
            final_string = final_string + "|" + l[0]
        for fake in FAKE_DIGITS:  # strip the padding again
            final_string = final_string.replace(fake, "")
        print("input", inL, "--> output", final_string)

    main()

Example output

[['559', '5ff', '55f', '59f', '549', '544', '54f']]
[['559', '5ff', '55f', '59f', '549', '544', '54f']]
[['59f'], ['559', '55f'], ['5ff'], ['549', '544', '54f']]
[['59f'], ['559'], ['55f'], ['5ff'], ['549'], ['54f'], ['544']]
input [559, 5, 55, 59, 549, 544, 54] --> output |59|559|55|5|549|54|544

【Comments】:

【Solution 4】:

You can try a faster sorting algorithm, such as merge sort, which is O(n log n).
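
The merge sort suggestion could be sketched as follows, assuming the same a+b > b+a rule as the question's code. This is an illustrative hand-written version, not code from the answer; `goes_first`, `merge_sort` and `largest_by_merge_sort` are hypothetical names, and in practice std::sort or std::stable_sort with the same comparator does the same job.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// a goes before b when "a then b" forms the larger number.
bool goes_first(const std::string& a, const std::string& b) { return a + b > b + a; }

// Sorts v[lo, hi) with a top-down merge sort, using tmp as scratch space.
void merge_sort(std::vector<std::string>& v, std::vector<std::string>& tmp,
                std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;  // zero or one element: already sorted
    const std::size_t mid = lo + (hi - lo) / 2;
    merge_sort(v, tmp, lo, mid);
    merge_sort(v, tmp, mid, hi);
    std::size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        // Take from the right half only when it strictly goes first (keeps the sort stable).
        if (goes_first(v[j], v[i])) tmp[k++] = std::move(v[j++]);
        else tmp[k++] = std::move(v[i++]);
    }
    while (i < mid) tmp[k++] = std::move(v[i++]);
    while (j < hi) tmp[k++] = std::move(v[j++]);
    for (k = lo; k < hi; ++k) v[k] = std::move(tmp[k]);  // copy the merged run back
}

// Sorts a copy of the input and concatenates the result.
std::string largest_by_merge_sort(std::vector<std::string> v) {
    std::vector<std::string> tmp(v.size());
    merge_sort(v, tmp, 0, v.size());
    std::string out;
    for (const auto& s : v) out += s;
    return out;
}
```

This performs O(n log n) comparisons, each of which costs time proportional to the combined length of the two strings being compared.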

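Alternatively, by using a trie, this problem can be solved in O(s), where s is the total number of characters across all strings.

【Comments】:

I don't think O(n) is possible... simple sorting does not lead to a valid answer. O(n.log(n)) might be possible, but I think it is more like O(m^2 + n.log(n)) where m < n ... and/or a lot of heuristics.
You can try inserting all the strings into a trie and then doing a tree traversal. Insertion takes time proportional to the number of characters in all the strings. Traversal takes time proportional to the number of nodes in the trie. That gives O(n) complexity.
That only speeds up the sorting, and the sorting is just the "minor" part of the problem.
Could you elaborate on the O(n) solution using a trie? What exactly is the trie topology? How would you get the answer from it? Have you also accounted for the complexity of building the trie (since this is an online judge, it will be measured along with the rest of the code)?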