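C++ Dynamic Dispatch without Virtual Functions: Answers

【Question title】: C++ Dynamic Dispatch without Virtual Functions
【Posted】: 2011-06-09 06:11:16
【Question】:

I have some legacy code that uses a kind field, rather than virtual functions, to do dynamic dispatch. It looks something like this: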

// Base struct shared by all subtypes
// Plain-old data; can't use virtual functions
struct POD
{
    int kind;

    int GetFoo();
    int GetBar();
    int GetBaz();
    int GetXyzzy();
};

enum Kind { Kind_Derived1, Kind_Derived2, Kind_Derived3 /* , ... */ };

struct Derived1: POD
{
    Derived1() { kind = Kind_Derived1; }  // a base-class member cannot be named in the initializer list

    int GetFoo();
    int GetBar();
    int GetBaz();
    int GetXyzzy();

    // ... plus other type-specific data and function members ...
};

struct Derived2: POD
{
    Derived2() { kind = Kind_Derived2; }  // a base-class member cannot be named in the initializer list

    int GetFoo();
    int GetBar();
    int GetBaz();
    int GetXyzzy();

    // ... plus other type-specific data and function members ...
};

struct Derived3: POD
{
    Derived3() { kind = Kind_Derived3; }  // a base-class member cannot be named in the initializer list

    int GetFoo();
    int GetBar();
    int GetBaz();
    int GetXyzzy();

    // ... plus other type-specific data and function members ...
};

// ... and so on for other derived classes ...

The function members of the POD class are then implemented like this:

int POD::GetFoo()
{
    // Call kind-specific function
    switch (kind)
    {
    case Kind_Derived1:
        {
        Derived1 *pDerived1 = static_cast<Derived1*>(this);
        return pDerived1->GetFoo();
        }
    case Kind_Derived2:
        {
        Derived2 *pDerived2 = static_cast<Derived2*>(this);
        return pDerived2->GetFoo();
        }
    case Kind_Derived3:
        {
        Derived3 *pDerived3 = static_cast<Derived3*>(this);
        return pDerived3->GetFoo();
        }

    // ... and so on for other derived classes ...

    default:
        throw UnknownKindException(kind, "GetFoo");
    }
}

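POD::GetBar(), POD::GetBaz(), POD::GetXyzzy(), and the other members are implemented similarly.

This example is simplified. The actual code has about a dozen different POD subtypes and a couple dozen methods. New subtypes of POD and new methods are added pretty frequently, so every time we do that, we have to update all these switch statements.

The typical way to handle this would be to declare the function members virtual in the POD class, but we can't do that because the objects reside in shared memory. There is a lot of code that depends on these structs being plain old data, so even if I could figure out some way to have virtual functions in shared-memory objects, I wouldn't want to do that.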
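
The plain-old-data constraint described above can be checked at compile time. What follows is a minimal sketch, assuming a C++11 compiler. `WithVirtual` and the helper `is_plain_data` are hypothetical names introduced only for illustration, and the inline member bodies are placeholders; none of this comes from the legacy code.

```cpp
#include <type_traits>

// Stand-in for the base struct in the question: a kind tag plus
// non-virtual member functions only.
struct POD
{
    int kind;
    int GetFoo() { return 0; }   // placeholder body for this sketch
};

// Hypothetical variant with a virtual function, for comparison only.
struct WithVirtual
{
    int kind;
    virtual int GetFoo() { return 0; }
};

// True when T has no hidden vptr, has a predictable layout, and can be
// copied bytewise. These are necessary properties for the scheme in the
// question, not a complete proof of shared-memory safety.
template <class T>
constexpr bool is_plain_data()
{
    return std::is_standard_layout<T>::value
        && std::is_trivially_copyable<T>::value
        && !std::is_polymorphic<T>::value;
}

static_assert(is_plain_data<POD>(), "POD must stay plain old data");
static_assert(!is_plain_data<WithVirtual>(), "a virtual member makes the type polymorphic");
static_assert(sizeof(POD) == sizeof(int), "non-virtual members add no per-object bytes");
```

Placing checks like these next to the struct definition would turn an accidental `virtual` into a build failure rather than a crash in another process.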

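So I'm looking for suggestions on the best way to clean this up, so that all the knowledge of how to call the subtype methods is centralized in one place rather than scattered among a couple dozen switch statements in a couple dozen functions.

What occurs to me is that I could create some sort of adapter class that wraps a POD and uses templates to minimize the redundancy. But before I start down that path, I'd like to know how others have dealt with this.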

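【Question comments】:

You said a lot of code depends on this class. Can you add fields to it, or does the structure need to stay the same?
The structure should stay mostly the same. We have a whole bunch of these in shared memory and are already running up against memory size limits.
Do all of the processes have exactly the same version of the library?
Yes, they will all have the same version of the library.

Tags: c++ refactoring metaprogramming template-meta-programming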

【Solution 1】:

You can use a jump table. This is what most virtual dispatch looks like under the hood, and you can construct it manually.

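// One thunk per subtype: downcast, then call the subtype's GetFoo().
template<typename T> int get_derived_foo(POD* ptr) {
    return static_cast<T*>(ptr)->GetFoo();
}
// Jump table indexed by kind; the order must match the Kind enum.
int (*funcs[])(POD*) = {
    get_derived_foo<Derived1>,
    get_derived_foo<Derived2>,
    get_derived_foo<Derived3>
};
int POD::GetFoo() {
    return funcs[kind](this);
}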

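That is just a short example.

What exactly are the limitations of being in shared memory? I realise I don't know enough here. Does it mean that I can't use pointers, because someone in another process will try to use those pointers?

You could use a string map, where each process gets its own copy of the map. You would have to pass it in to GetFoo() so that it can find it.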

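#include <functional>
#include <map>

struct POD {
    int kind;
    // Each process passes in its own map from kind to callable.
    int GetFoo(std::map<int, std::function<int()>>& ref) {
        return ref[kind]();
    }
};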

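Edit: Of course, you don't have to use a string here; you could use an int. I only used it as an example, and I should change it back. In fact, this solution is pretty flexible, but the important thing is to make a copy of the process-specific data, such as function pointers, and then pass it in.

【Comments】:

Objects in shared memory are used by multiple processes. The virtual function pointer would not be valid in all of them.
I have not looked at the details, but in general the table is referenced by a pointer in each object. If a thread has the vtable for type X at address Y, it will store Y in the vptr field of the object. There is no guarantee that two different processes will see the vtable at the same logical address, even if it is stored at exactly the same hardware address in the system. In that case, the other thread would try to use the vtable at the wrong address and die.
Nice, but it would be best to make sure you generate the array correctly. Preprocessor macros can do that.
If the type of kind can change a little, you could hash typeid(T).name().
This isn't a bad solution for the general problem, but in my case the large number of types and methods means there would be a lot of code to initialize the arrays or maps, and adding new types means updating all of that initialization code. I'm not sure that is much better than what I'm already dealing with.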
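
The macro suggestion in the comments above could look roughly like the sketch below, which also speaks to the concern about initialization code: the subtype list lives in one macro and each method costs one line. This is an illustration under stated assumptions, not the code anyone in this thread used. The names `POD_KINDS`, `MAKE_ENUM`, `TABLE_ENTRY`, `DEFINE_DISPATCH`, `PodFn`, `Kind_Count` and `demo` are hypothetical, the return values are made up, and `std::out_of_range` stands in for the question's `UnknownKindException`, whose definition is not shown.

```cpp
#include <cassert>
#include <stdexcept>

struct POD
{
    int kind;
    int GetFoo();
    int GetBar();
};

// The one place that lists every subtype. Adding a kind adds a line here.
#define POD_KINDS(X, arg) \
    X(Derived1, arg)      \
    X(Derived2, arg)

// Generate the Kind enum from the same list so the two cannot drift apart.
#define MAKE_ENUM(T, unused) Kind_##T,
enum Kind { POD_KINDS(MAKE_ENUM, 0) Kind_Count };
#undef MAKE_ENUM

struct Derived1 : POD
{
    Derived1() { kind = Kind_Derived1; }
    int GetFoo() { return 10; }
    int GetBar() { return 11; }
};

struct Derived2 : POD
{
    Derived2() { kind = Kind_Derived2; }
    int GetFoo() { return 20; }
    int GetBar() { return 21; }
};

typedef int (*PodFn)(POD*);

// For one method: a thunk template, a table with one entry per kind,
// and the POD forwarder that indexes the table.
#define TABLE_ENTRY(T, Method) &Thunk_##Method<T>,
#define DEFINE_DISPATCH(Method)                                  \
    template <class T> int Thunk_##Method(POD* p)                \
    { return static_cast<T*>(p)->Method(); }                     \
    static const PodFn Table_##Method[] = {                      \
        POD_KINDS(TABLE_ENTRY, Method) };                        \
    int POD::Method()                                            \
    {                                                            \
        if (kind < 0 || kind >= Kind_Count)                      \
            throw std::out_of_range("unknown POD kind");         \
        return Table_##Method[kind](this);                       \
    }

// Adding a method adds one line here.
DEFINE_DISPATCH(GetFoo)
DEFINE_DISPATCH(GetBar)

// Usage sketch: calls through POD* reach the subtype's implementation.
void demo()
{
    Derived1 d1;
    Derived2 d2;
    POD* p = &d1;
    assert(p->GetFoo() == 10);
    p = &d2;
    assert(p->GetBar() == 21);
}
```

The tables are ordinary static data in each process's own image, so nothing process-specific is stored in the shared objects. The trade-off is that macro-generated code is harder to step through in a debugger.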

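【Solution 2】:

You can try the Curiously recurring template pattern. It is a bit complicated, but it is helpful when you cannot use pure virtual functions.

【Comments】:

【Solution 3】:

Here is an approach that uses virtual methods to implement the jump table, without requiring the Pod class or the derived classes to actually have virtual functions.

The objective is to simplify adding and removing methods across many classes.

To add a method, you add it to Pod using a clear and common pattern, add a pure virtual function to PodInterface, and add a forwarding function to PodFuncs, again using a clear and common pattern.

Derived classes only need a file-static initialisation object to set things up; otherwise they look pretty much like they already do.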

// Pod header

#include <boost/shared_ptr.hpp>
enum Kind { Kind_Derived1, Kind_Derived2, Kind_Derived3 /* , ... */ };

struct Pod
{
    int kind;

    int GetFoo();
    int GetBar();
    int GetBaz();
};

struct PodInterface
{
    virtual ~PodInterface();

    virtual int GetFoo(Pod* p) const = 0;
    virtual int GetBar(Pod* p) const = 0;
    virtual int GetBaz(Pod* p) const = 0;

    static void
    do_init(
            boost::shared_ptr<PodInterface const> const& p,
            int kind);
};

template<class T> struct PodFuncs : public PodInterface
{
    struct Init
    {
        Init(int kind)
        {
            boost::shared_ptr<PodInterface> t(new PodFuncs);
            PodInterface::do_init(t, kind);
        }
    };

    ~PodFuncs() { }

    int GetFoo(Pod* p) const { return static_cast<T*>(p)->GetFoo(); }
    int GetBar(Pod* p) const { return static_cast<T*>(p)->GetBar(); }
    int GetBaz(Pod* p) const { return static_cast<T*>(p)->GetBaz(); }
};


//
// Pod Implementation
//

#include <map>

typedef std::map<int, boost::shared_ptr<PodInterface const> > FuncMap;

static FuncMap& get_funcmap()
{
    // Replace with other approach for static initialisation order as appropriate.
    static FuncMap s_funcmap;
    return s_funcmap;
}

//
// struct Pod methods
//

int Pod::GetFoo()
{
    return get_funcmap()[kind]->GetFoo(this);
}

//
// struct PodInterface methods, in the same file as s_funcmap
//

PodInterface::~PodInterface()
{
}

void
PodInterface::do_init(
        boost::shared_ptr<PodInterface const> const& p,
        int kind)
{
    // Could do checking for duplicates here.
    get_funcmap()[kind] = p;
}

//
// Derived1
//

struct Derived1 : Pod
{
    Derived1() { kind = Kind_Derived1; }

    int GetFoo();
    int GetBar();
    int GetBaz();

    // Whatever else.
};

//
// Derived1 implementation
//

static const PodFuncs<Derived1>::Init s_interface_init(Kind_Derived1);

int Derived1::GetFoo() { /* Implement */ }
int Derived1::GetBar() { /* Implement */ }
int Derived1::GetBaz() { /* Implement */ }

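【Comments】:

【Solution 4】:

Here is the template metaprogramming path I am going down now. Here is what I like about it:

Adding support for a new kind only requires updating LAST_KIND and adding a new KindTraits specialization.
Adding a new function follows a simple pattern.
Functions can be specialized for particular kinds if necessary.
If I screw anything up, I can expect compile-time errors and warnings rather than mysterious run-time misbehavior.

There are a couple of concerns:

POD's implementation now depends on the interfaces of all the derived classes. (This is already true of the existing implementation, so I'm not worried about it, but it is a bit of a smell.)
I'm counting on the compiler to be smart enough to generate code that is roughly equivalent to the switch-based code.
Many C++ programmers will scratch their heads when they see this.

Here's the code: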

// Declare first and last kinds
const int FIRST_KIND = Kind_Derived1;
const int LAST_KIND = Kind_Derived3;

// Provide a compile-time mapping from a kind code to a subtype
template <int KIND>
struct KindTraits
{
    typedef void Subtype;
};
template <> struct KindTraits<Kind_Derived1> { typedef Derived1 Subtype; };
template <> struct KindTraits<Kind_Derived2> { typedef Derived2 Subtype; };
template <> struct KindTraits<Kind_Derived3> { typedef Derived3 Subtype; };

// If kind matches, then do the appropriate typecast and return result;
// otherwise, try the next kind.
template <int KIND>
int GetFooForKind(POD *pod)
{
    if (pod->kind == KIND)
        return static_cast<typename KindTraits<KIND>::Subtype*>(pod)->GetFoo();
    else
        return GetFooForKind<KIND + 1>(pod);  // try the next kind
}

// Specialization for LAST_KIND+1 
template <> int GetFooForKind<LAST_KIND + 1>(POD *pod)
{
    // kind didn't match anything in FIRST_KIND..LAST_KIND
    throw UnknownKindException(pod->kind, "GetFoo");
}

// Now POD's function members can be implemented like this:

int POD::GetFoo()
{
    return GetFooForKind<FIRST_KIND>(this);
}
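
In the code above, the recursive search is written once per method (GetFooForKind, and presumably a matching function for each of the others). One possible way to write the search only once is to pass the call in as a small functor. The sketch below is an assumption-laden variation, not the author's code: `Dispatch`, `Apply`, `CallGetFoo`, `CallGetBar` and `demo` are hypothetical names, the return values are made up, the primary `KindTraits` template is deliberately left undefined instead of mapping to `void`, and `std::out_of_range` stands in for `UnknownKindException`.

```cpp
#include <cassert>
#include <stdexcept>

enum Kind { Kind_Derived1, Kind_Derived2 };
const int FIRST_KIND = Kind_Derived1;
const int LAST_KIND  = Kind_Derived2;

struct POD
{
    int kind;
    int GetFoo();
    int GetBar();
};

struct Derived1 : POD
{
    Derived1() { kind = Kind_Derived1; }
    int GetFoo() { return 1; }
    int GetBar() { return 2; }
};

struct Derived2 : POD
{
    Derived2() { kind = Kind_Derived2; }
    int GetFoo() { return 3; }
    int GetBar() { return 4; }
};

// Compile-time map from kind code to subtype. No default is provided in
// this sketch, so an unmapped kind in the range fails to compile.
template <int KIND> struct KindTraits;
template <> struct KindTraits<Kind_Derived1> { typedef Derived1 Subtype; };
template <> struct KindTraits<Kind_Derived2> { typedef Derived2 Subtype; };

// The kind search, written once and shared by every method.
template <int KIND>
struct Dispatch
{
    template <class Op>
    static int Apply(POD* pod, Op op)
    {
        typedef typename KindTraits<KIND>::Subtype Subtype;
        if (pod->kind == KIND)
            return op(static_cast<Subtype*>(pod));
        return Dispatch<KIND + 1>::Apply(pod, op);  // try the next kind
    }
};

// Terminating case: kind matched nothing in FIRST_KIND..LAST_KIND.
template <>
struct Dispatch<LAST_KIND + 1>
{
    template <class Op>
    static int Apply(POD*, Op)
    {
        throw std::out_of_range("unknown POD kind");
    }
};

// One tiny functor per method says which member to call.
struct CallGetFoo { template <class T> int operator()(T* p) const { return p->GetFoo(); } };
struct CallGetBar { template <class T> int operator()(T* p) const { return p->GetBar(); } };

int POD::GetFoo() { return Dispatch<FIRST_KIND>::Apply(this, CallGetFoo()); }
int POD::GetBar() { return Dispatch<FIRST_KIND>::Apply(this, CallGetBar()); }

// Usage sketch: the static type is POD*, the call reaches the subtype.
void demo()
{
    Derived2 d2;
    POD* p = &d2;
    assert(p->GetFoo() == 3);
    assert(p->GetBar() == 4);
}
```

With this shape, a new method needs one functor and one forwarding line, and a new kind still needs only a `KindTraits` specialization and an updated `LAST_KIND`.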

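【Comments】:

You missed the chance to throw an UnkindException! This approach requires that the POD implementation has seen the definition of every derived type. Given the current state of the code you may not care, but the requirement is there.
I've added a note about the dependencies to my "concerns" section.

【Solution 5】:

Here is an example using the Curiously recurring template pattern. It may suit your needs if you know more at compile time.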

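#include <iostream>

template<class DerivedType>
struct POD
{
    // Each forwarder resolves at compile time to DerivedType's own member.
    // If DerivedType does not define the member, the call recurses forever.
    int GetFoo()
    {
        return static_cast<DerivedType*>(this)->GetFoo();
    }
    int GetBar()
    {
        return static_cast<DerivedType*>(this)->GetBar();
    }
    int GetBaz()
    {
        return static_cast<DerivedType*>(this)->GetBaz();
    }
    int GetXyzzy()
    {
        return static_cast<DerivedType*>(this)->GetXyzzy();
    }
};

struct Derived1 : public POD<Derived1>
{
    int GetFoo()
    {
        return 1;
    }
    //define all implementations
};

struct Derived2 : public POD<Derived2>
{
    //define all implementations

};

int main()
{
    Derived1 d1;
    std::cout << d1.GetFoo() << std::endl;
    // Point at the existing object rather than using new, so nothing is
    // deleted through a base pointer that has no virtual destructor.
    POD<Derived1> *p = &d1;
    std::cout << p->GetFoo() << std::endl;
    return 0;
}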

【Comments】:

But what do I do if I start with a POD * that actually points to a subtype, and I want to call GetFoo()?
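
The comment above points at the limit of pure CRTP: POD<Derived1> and POD<Derived2> are unrelated types, so there is no common POD * to start from. One conceivable middle ground keeps a non-template POD base with its kind field, and uses a CRTP layer only to stamp the kind and supply the downcasting thunk. The sketch below is hypothetical throughout: `PodOf`, `FooThunk`, `fooTable`, `Kind_Count` and `demo` are invented names and the return values are made up. It also assumes kind always holds a valid enumerator, so it omits the bounds check.

```cpp
#include <cassert>

enum Kind { Kind_Derived1, Kind_Derived2, Kind_Count };

// Non-template base, so a plain POD* can refer to any subtype.
struct POD
{
    int kind;
    int GetFoo();
};

// CRTP layer: sets the kind tag and provides the thunk for its subtype.
template <class Derived, int KIND>
struct PodOf : POD
{
    PodOf() { kind = KIND; }
    static int FooThunk(POD* p) { return static_cast<Derived*>(p)->GetFoo(); }
};

struct Derived1 : PodOf<Derived1, Kind_Derived1>
{
    int GetFoo() { return 1; }
};

struct Derived2 : PodOf<Derived2, Kind_Derived2>
{
    int GetFoo() { return 2; }
};

// The run-time step that CRTP alone cannot provide: kind -> thunk.
// Entries must be listed in the same order as the Kind enum.
static int (*const fooTable[Kind_Count])(POD*) = {
    &Derived1::FooThunk,
    &Derived2::FooThunk
};

int POD::GetFoo() { return fooTable[kind](this); }

// Usage sketch: start from a POD* and still reach the subtype.
void demo()
{
    Derived2 d2;
    POD* p = &d2;
    assert(p->GetFoo() == 2);
}
```

The CRTP layer removes the repeated constructor boilerplate, but the table itself still has to be kept in step with the enum by hand.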

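【Solution 6】:

Expanding on the solution you ended up with, the following resolves the mapping to the derived functions at program initialization: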

#include <typeinfo>
#include <iostream>
#include <functional>
#include <vector>

enum Kind
{
    Kind_First,
    Kind_Derived1 = Kind_First,
    Kind_Derived2,
    Kind_Total
};

struct POD
{
    size_t kind;

    int GetFoo();
    int GetBar();
};

struct VTable
{
    std::function<int(POD*)> GetFoo;
    std::function<int(POD*)> GetBar;
};

template<int KIND>
struct KindTraits
{
    typedef POD KindType;
};

template<int KIND>
void InitRegistry(std::vector<VTable> &t)
{
    typedef typename KindTraits<KIND>::KindType KindType;

    size_t i = KIND;
    t[i].GetFoo = [](POD *p) -> int {
        return static_cast<KindType*>(p)->GetFoo();
    };
    t[i].GetBar = [](POD *p) -> int {
        return static_cast<KindType*>(p)->GetBar();
    };

    InitRegistry<KIND+1>(t);
}
template<>
void InitRegistry<Kind_Total>(std::vector<VTable> &t)
{
}

struct Registry
{
    std::vector<VTable> table;

    Registry()
    {
        table.resize(Kind_Total);
        InitRegistry<Kind_First>(table);
    }
};

Registry reg;

int POD::GetFoo() { return reg.table[kind].GetFoo(this); }
int POD::GetBar() { return reg.table[kind].GetBar(this); }

struct Derived1 : POD
{
    Derived1() { kind = Kind_Derived1; }

    int GetFoo() { return 0; }
    int GetBar() { return 1; }
};
template<> struct KindTraits<Kind_Derived1> { typedef Derived1 KindType; };

struct Derived2 : POD
{
    Derived2() { kind = Kind_Derived2; }

    int GetFoo() { return 2; }
    int GetBar() { return 3; }
};
template<> struct KindTraits<Kind_Derived2> { typedef Derived2 KindType; };

int main()
{
    Derived1 d1;
    Derived2 d2;
    POD *p;

    p = static_cast<POD*>(&d1);
    std::cout << p->GetFoo() << '\n';
    p = static_cast<POD*>(&d2);
    std::cout << p->GetBar() << '\n';
}

【Comments】: