Matrix 矩阵产品操作 OpenMDAO答案

【问题标题】：Matrix Matrix product operations OpenMDAOMatrix 矩阵产品操作 OpenMDAO
【发布时间】：2021-07-16 12:49:24
【问题描述】：

openmdao 或 dymos 是否具有类似于 MatrixVectorProductComp 的向量化矩阵-矩阵乘积组件？

我想对形状为 (nn,3,3) 的矩阵 A 乘以形状为 (nn,3,3) 的矩阵 B 进行矩阵乘法，输出是形状 ( nn,3,3)。

nn=2 的 2D 示例可能是：

A = np.array([[[1, 3], [0, 1]],[[-1, 7], [0, 1]]])
B = np.array([[[4, 1],  [2, 2]],[[2, 1],  [2, 3]]])

For the first node:
C[0,:,:] = np.matmul(A[0,:,:], B[0,:,:]) 
[[10  7]
 [ 2  2]]
And second Node
C[1,:,:] = np.matmul(A[1,:,:], B[1,:,:])
[[12 20]
 [ 2  3]]

有一个名为 MatrixVectorProductComp 的组件，它将一个矩阵乘以一个具有指定向量维度/节点的向量。我试图使用 np.einsum('nij,njk->nik', A, B) 对组件进行扩展以进行矩阵运算并提供其部分。但是，它在尝试 assert_check 其偏导数时崩溃（检查偏导返回一个空 {}）：

nn = 5 
model = om.Group()
ivc = om.IndepVarComp()
ivc.add_output(name='A', shape=(nn, 3, 3))
ivc.add_output(name='B', shape=(nn, 3, 3))

model.add_subsystem(name='ivc',
                           subsys=ivc,
                           promotes_outputs=['A', 'B'])

model.add_subsystem(name='mat_vec_product_comp',
                      subsys=MatrixMatrixProductComp(vec_size=nn))

model.connect('A', 'mat_vec_product_comp.A')
model.connect('B', 'mat_vec_product_comp.B')

p = om.Problem(model)
p.setup(force_alloc_complex=True)

p['A'] = np.random.rand(nn, 3, 3)
p['B'] = np.random.rand(nn, 3, 3)

p.run_model()
cpd = p.check_partials(compact_print=False, method='cs')
assert_check_partials(cpd, atol=1.0E-6, rtol=1.0E-6)

Traceback（最近一次调用最后一次）：文件 "C:/tools/OpenMDAO/openmdao/components/matrix_matrix_product_comp.py", 第 294 行，在 assert_check_partials(cpd, atol=1.0E-6, rtol=1.0E-6) 文件“c:\tools\dymos\dymos\utils\testing_utils.py”，第 12 行，在 assert_check_partials assert len(data) >= 1, "No check partials data found. Is " \ AssertionError: No check partials data found.是 dymos.options['include_check_partials'] 设置为 True？

MatrixMatrixProductComp 代码：

"""Definition of the Matrix Matrix Product Component."""


import numpy as np
import scipy.linalg as spla

from openmdao.core.explicitcomponent import ExplicitComponent


class MatrixMatrixProductComp(ExplicitComponent):
    """
    Computes a vectorized matrix-vector product.

    math::
        b = np.dot(A, B)

    where A is of shape (vec_size, n, m)
          B is of shape (vec_size, m,p)
          b is of shape (vec_size, n,p)

    if vec_size > 1 and

    where A is of shape (n, m)
          B is of shape (m,p)
          b is of shape (n,p)

    otherwise.

    The size of vectors x and b is determined by the number of rows in m at each point.

    Attributes
    ----------
    _products : list
        Cache the data provided during `add_product`
        so everything can be saved until setup is called.
    """

    def __init__(self, **kwargs):
        """
        Initialize the Matrix Vector Product component.

        Parameters
        ----------
        **kwargs : dict of keyword arguments
            Keyword arguments that will be mapped into the Component options.
        """
        super().__init__(**kwargs)

        self._products = []

        opt = self.options
        self.add_product(b_name=opt['b_name'], A_name=opt['A_name'], B_name=opt['B_name'],
                         b_units=opt['b_units'], A_units=opt['A_units'], B_units=opt['B_units'],
                         vec_size=opt['vec_size'], A_shape=opt['A_shape'], B_shape=opt['B_shape'])

        self._no_check_partials = False

    def initialize(self):
        """
        Declare options.
        """
        self.options.declare('vec_size', types=int, default=1,
                             desc='The number of points at which the matrix vector product '
                                  'is to be computed')
        self.options.declare('A_name', types=str, default='A',
                             desc='The variable name for the matrix.')
        self.options.declare('A_shape', types=tuple, default=(3, 3),
                             desc='The shape of the input matrix at a single point.')
        self.options.declare('A_units', types=str, default=None, allow_none=True,
                             desc='The units of the input matrix.')
        # self.options.declare('A_transpose', types=bool, default=False, allow_none=True,
        #                      desc='If true, matrix is transposed first')
        self.options.declare('B_name', types=str, default='B',
                             desc='The name of the input vector.')
        self.options.declare('B_shape', types=tuple, default=(3, 3),
                             desc='The shape of the input matrix at a single point.')
        self.options.declare('B_units', types=str, default=None, allow_none=True,
                             desc='The units of the input vector.')
        self.options.declare('b_name', types=str, default='b',
                             desc='The variable name of the output vector.')
        self.options.declare('b_units', types=str, default=None, allow_none=True,
                             desc='The units of the output vector.')

    def add_product(self, b_name, A_name='A', B_name='B', A_units=None, B_units=None, b_units=None,
                    vec_size=1, A_shape=(3, 3), B_shape=(3, 3)):
        """
        Add a new output product to the matrix vector product component.

        Parameters
        ----------
        A_name : str
            The name of the matrix input.
        B_name : str
            The name of the matrix input.
        b_name : str
            The name of the vector product output.
        A_units : str or None
            The units of the input matrix.
        B_units : str or None
            The units of the input matrix.
        b_units : str or None
            The units of the output matrix.
        vec_size : int
            The number of points at which the matrix vector product
            should be computed simultaneously.
        A_shape : tuple of (int, int)
            The shape of the matrix at each point.
            The first element also specifies the size of the output at each point.
            The second element specifies the size of the input vector at each point.
            For example, if vec_size=10 and shape is (5, 3), then
            the input matrix will have a shape of (10, 5, 3),
            the input vector will have a shape of (10, 3), and
            the output vector will have shape of (10, 5).
        B_shape : tuple of (int, int)
            The shape of the matrix at each point.
            The first element also specifies the size of the output at each point.
            The second element specifies the size of the input vector at each point.
            For example, if vec_size=10 and shape is (5, 3), then
            the input matrix will have a shape of (10, 5, 3),
            the input vector will have a shape of (10, 3), and
            the output vector will have shape of (10, 5).
        """
        self._products.append({
            'A_name': A_name,
            'B_name': B_name,
            'b_name': b_name,
            'A_units': A_units,
            'B_units': B_units,
            'b_units': b_units,
            'A_shape': A_shape,
            'B_shape': B_shape,
            'vec_size': vec_size
        })

        # add inputs and outputs for all products
        if self._static_mode:
            var_rel2meta = self._static_var_rel2meta
            var_rel_names = self._static_var_rel_names
        else:
            var_rel2meta = self._var_rel2meta
            var_rel_names = self._var_rel_names

        n_rows_A, n_cols_A = A_shape
        n_rows_B, n_cols_B = B_shape

        A_shape = (vec_size, n_rows_A, n_cols_A)
        B_shape = (vec_size, n_rows_B, n_cols_B)
        b_shape = (vec_size, n_rows_A, n_cols_B) if vec_size > 1 else (n_rows_A, n_cols_B)

        if b_name not in var_rel2meta:
            self.add_output(name=b_name, shape=b_shape, units=b_units)
        elif b_name in var_rel_names['input']:
            raise NameError(f"{self.msginfo}: '{b_name}' specified as an output, "
                            "but it has already been defined as an input.")
        else:
            raise NameError(f"{self.msginfo}: Multiple definition of output '{b_name}'.")

        if A_name not in var_rel2meta:
            self.add_input(name=A_name, shape=A_shape, units=A_units)
        elif A_name in var_rel_names['output']:
            raise NameError(f"{self.msginfo}: '{A_name}' specified as an input, "
                            "but it has already been defined as an output.")
        else:
            meta = var_rel2meta[A_name]
            if vec_size != meta['shape'][0]:
                raise ValueError(f"{self.msginfo}: Conflicting vec_size={B_shape[0]} "
                                 f"specified for matrix '{A_name}', which has already "
                                 f"been defined with vec_size={meta['shape'][0]}.")

            elif (n_rows_A, n_cols_A) != meta['shape'][1:]:
                raise ValueError(f"{self.msginfo}: Conflicting shape {A_shape[1:]} specified "
                                 f"for matrix '{A_name}', which has already been defined "
                                 f"with shape {meta['shape'][1:]}.")

            elif A_units != meta['units']:
                raise ValueError(f"{self.msginfo}: Conflicting units '{A_units}' specified "
                                 f"for matrix '{A_name}', which has already been defined "
                                 f"with units '{meta['units']}'.")

        if B_name not in var_rel2meta:
            self.add_input(name=B_name, shape=B_shape, units=B_units)
        elif B_name in var_rel_names['output']:
            raise NameError(f"{self.msginfo}: '{B_name}' specified as an input, "
                            "but it has already been defined as an output.")
        else:
            meta = var_rel2meta[B_name]
            if vec_size != meta['shape'][0]:
                raise ValueError(f"{self.msginfo}: Conflicting vec_size={B_shape[0]} "
                                 f"specified for vector '{B_name}', which has already "
                                 f"been defined with vec_size={meta['shape'][0]}.")

            elif n_cols_A != meta['shape'][1]:
                raise ValueError(f"{self.msginfo}: Matrix shape {A_shape[1:]} is incompatible "
                                 f"with vector '{B_name}', which has already been defined "
                                 f"with {meta['shape'][1]} column(s).")

            elif B_units != meta['units']:
                raise ValueError(f"{self.msginfo}: Bonflicting units '{B_units}' specified "
                                 f"for vector '{B_name}', which has already been defined "
                                 f"with units '{meta['units']}'.")

        # Make a dummy version of A so we can figure out the nonzero indices
        A = np.ones(A_shape)
        B = np.ones(B_shape)
        # print(A.shape)
        bd_A = spla.block_diag(*A)
        # print(bd_A.shape)
        bd_B = spla.block_diag(*B)
        B_repeat = np.repeat(B, A.shape[1], axis=0)
        A_repeat = np.repeat(A, B.shape[1], axis=0)
        # print(A_repeat.shape)
        bd_B_repeat = spla.block_diag(*B_repeat)
        bd_A_repeat = spla.block_diag(*A_repeat)
        # print(bd_A_repeat.shape)
        db_dB_rows, db_dB_cols = np.nonzero(bd_A_repeat)
        db_dA_rows, db_dA_cols = np.nonzero(bd_B_repeat)

        # db_dA_rows = np.arange()
        self.declare_partials(of=b_name, wrt=A_name,
                              rows=db_dA_rows, cols=db_dA_cols)
        self.declare_partials(of=b_name, wrt=B_name,
                              rows=db_dB_rows, cols=db_dB_cols)

    def compute(self, inputs, outputs):
        """
        Compute the matrix vector product of inputs `A` and `x` using np.einsum.

        Parameters
        ----------
        inputs : Vector
            unscaled, dimensional input variables read via inputs[key]
        outputs : Vector
            unscaled, dimensional output variables read via outputs[key]
        """
        for product in self._products:
            A = inputs[product['A_name']]
            B = inputs[product['B_name']]
            outputs[product['b_name']][...] = np.einsum('nij,njk->nik', A, B)

    def compute_partials(self, inputs, partials):
        """
        Compute the sparse partials for the matrix vector product w.r.t. the inputs.

        Parameters
        ----------
        inputs : Vector
            unscaled, dimensional input variables read via inputs[key]
        partials : Jacobian
            sub-jac components written to partials[output_name, input_name]
        """
        for product in self._products:
            A_name = product['A_name']
            B_name = product['B_name']
            b_name = product['b_name']

            A = inputs[A_name]
            B = inputs[B_name]

            # Use the following for sparse partials
            partials[b_name, A_name] = np.repeat(B, A.shape[1], axis=0).ravel()
            partials[b_name, B_name] = np.repeat(A, B.shape[1], axis=0).ravel()

if __name__ == "__main__":
    import openmdao.api as om
    import dymos as dm
    from openmdao.utils.assert_utils import assert_near_equal
    from dymos.utils.testing_utils import assert_check_partials

    nn = 5

    model = om.Group()
    ivc = om.IndepVarComp()
    ivc.add_output(name='A', shape=(nn, 3, 3))
    ivc.add_output(name='B', shape=(nn, 3, 3))

    model.add_subsystem(name='ivc',
                               subsys=ivc,
                               promotes_outputs=['A', 'B'])

    model.add_subsystem(name='MatrixMatrixProductComp',
                          subsys=MatrixMatrixProductComp(vec_size=nn))

    model.connect('A', 'MatrixMatrixProductComp.A')
    model.connect('B', 'MatrixMatrixProductComp.B')

    p = om.Problem(model)
    p.setup(force_alloc_complex=True)

    p['A'] = np.random.rand(nn, 3, 3)
    p['B'] = np.random.rand(nn, 3, 3)

    p.run_model()
    cpd = p.check_partials(compact_print=False, method='cs')
    assert_check_partials(cpd, atol=1.0E-6, rtol=1.0E-6)

【问题讨论】：

你的问题有点笼统。我不确定向量化矩阵矩阵乘积是什么意思。您是在问是否可以多个两个矩阵？或者在一个输入中有一个额外的维度，所以你将一个矩阵向量乘以另一个矩阵（或矩阵向量）？能否请您至少提供一个您想要实现的产品类型的 numpy 示例。
嗨贾斯汀。我为原来的配方道歉。我用一个例子编辑了它。我也在创建一种组件来处理它，但是在验证它的部分时遇到了问题（包括问题），所以我想知道是否可能已经有一个组件可以做到这一点而我错过了。

标签： matrix-multiplication openmdao

【解决方案1】：

您没有将代码提供给您的 MatrixMatrixProductComp，但我可以对发生的事情做出有根据的猜测，因为您在 OpenMDAO 的标准库中引用了 MatrixVectorProductComp。

对于 OpenMDAO 标准库中的所有组件，开发人员已经实现并验证了衍生产品。因此，没有必要让这些组件在 99.9999% 的时间内出现在 check_partials 的输出中。它只会混淆用户需要查看的输出。

在这些组件中打开了一个隐藏的、未记录的标志 _no_check_partials，以便 check_partials 跳过它们。现在我已经在互联网上大肆宣传这个隐藏的标志了:) ...here is where its assigned in the MatrixVectorProductComp Dymos 库也使用了此功能，因此 Dymos 组件不会将其 ODE 组件和组的用户 check_partials 输出弄乱。

同样，Dymos 这样做是因为这些部分已经过仔细验证！还有 special code in Dymos that flips that flag off 以便内部测试可以验证 Dymos 组件部分。

很可能您只需将该属性设置为 False，您将获得一些可在测试中使用的 check_partials 输出。我要小心地强调，这个标志只能非常谨慎地使用。错误的 partials 会破坏优化，如果您打开了该标志，那么您可能根本不会在 check_partials 中获得该组件的报告。

在您的情况下，您因模型中没有其他组件而被烧毁，因此出现空字典错误。一般来说，我不建议任何用户代码使用这个标志。

【讨论】：

嗨贾斯汀，感谢您指出！我能够检查部分内容，并且正如预期的那样，我没有正确提供它们。做一个捷径（只是推导我现在需要分析的矩阵乘法）并且可能会在以后的阶段解决它。如果它对某人有用，我在最初的帖子中添加了它，但请注意，compute_partials 例程无法正常工作。