【Question title】: Converting an RPy2 ListVector to a Python dictionary
【Posted】: 2023-10-12 21:20:01
【Question description】:

The natural Python equivalent of a named list in R is a dictionary, but RPy2 gives you a ListVector object.

import rpy2.robjects as robjects

a = robjects.r('list(foo="barbat", fizz=123)')

At this point, a is a ListVector object.

<ListVector - Python:0x108f92a28 / R:0x7febcba86ff0>
[StrVector, FloatVector]
  foo: <class 'rpy2.robjects.vectors.StrVector'>
  <StrVector - Python:0x108f92638 / R:0x7febce0ae0d8>
[str]
  fizz: <class 'rpy2.robjects.vectors.FloatVector'>
  <FloatVector - Python:0x10ac38fc8 / R:0x7febce0ae108>
[123.000000]

What I want is something I can treat like an ordinary Python dictionary. My temporary workaround is this:

def as_dict(vector):
    """Convert an RPy2 ListVector to a Python dict"""
    result = {}
    for i, name in enumerate(vector.names):
        if isinstance(vector[i], robjects.ListVector):
            result[name] = as_dict(vector[i])
        elif len(vector[i]) == 1:
            result[name] = vector[i][0]
        else:
            result[name] = vector[i]
    return result

as_dict(a)
{'foo': 'barbat', 'fizz': 123.0}

b = robjects.r('list(foo=list(bar=1, bat=c("one","two")), fizz=c(123,345))')
as_dict(b)
{'fizz': <FloatVector - Python:0x108f7e950 / R:0x7febcba86b90>
 [123.000000, 345.000000],
 'foo': {'bar': 1.0, 'bat': <StrVector - Python:0x108f7edd0 / R:0x7febcba86ea0>
  [str, str]}}

So the question is: is there a better way, or something built into RPy2 that I should be using?

【Question discussion】:

    Tags: python rpy2


    【Solution 1】:

    I don't think getting an R vector into a dictionary has to be that involved. How about this:

    In [290]:
    
    dict(zip(a.names, list(a)))
    Out[290]:
    {'fizz': <FloatVector - Python:0x08AD50A8 / R:0x10A67DE8>
    [123.000000],
     'foo': <StrVector - Python:0x08AD5030 / R:0x10B72458>
    ['barbat']}
    In [291]:
    
    dict(zip(a.names, map(list,list(a))))
    Out[291]:
    {'fizz': [123.0], 'foo': ['barbat']}
    

    Of course, if you don't mind using pandas, it is even simpler. The result will hold numpy.array values rather than list values, but that is fine in most cases:

    In [294]:
    
    import pandas.rpy.common as com
    com.convert_robj(a)
    Out[294]:
    {'fizz': [123.0], 'foo': array(['barbat'], dtype=object)}
    

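    【Discussion】:

    • Nice! The dict(zip(... approach doesn't handle nested lists (my second example above), but for simple cases it is much more concise. I like it.
    • Is this also safe to use with a DataFrame?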
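The nesting limitation raised in the first comment can be handled by recursing into each element. The following is a minimal sketch, not an rpy2 API: it relies only on a `names` attribute and on iteration, and it uses a hypothetical stand-in class `FakeRVector` so that it runs without R. It assumes unnamed vectors expose `names` as `None`; rpy2 reports R's `NULL` for missing names instead, so the check would need adapting there.

```python
class FakeRVector(list):
    """Hypothetical stand-in for an rpy2 vector: a list plus optional names."""

    def __init__(self, values, names=None):
        super().__init__(values)
        self.names = names


def to_python(obj):
    """Recursively convert a names-carrying vector into dicts and lists."""
    if not isinstance(obj, FakeRVector):
        return obj  # plain Python scalar: end of the recursion
    if obj.names is None:
        return [to_python(elt) for elt in obj]  # unnamed vector -> list
    # Named vector -> dict, converting each element in turn
    return {name: to_python(elt) for name, elt in zip(obj.names, obj)}


# Mirrors list(foo=list(bar=1, bat=c("one","two")), fizz=c(123,345))
b = FakeRVector(
    [
        FakeRVector(
            [FakeRVector([1.0]), FakeRVector(["one", "two"])],
            names=["bar", "bat"],
        ),
        FakeRVector([123.0, 345.0]),
    ],
    names=["foo", "fizz"],
)
result = to_python(b)
print(result)
```

Unlike the question's `as_dict`, this sketch keeps length-one vectors as one-element lists, which makes the output shape independent of the data.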
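    【Solution 2】:

    I had the same problem with a deeply nested structure of different rpy2 vector types. I couldn't find a direct answer anywhere, so here is my solution. Building on CT Zhu's answer, I came up with the following code, which recursively converts the complete structure to Python types.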

    from collections import OrderedDict
    
    import numpy as np
    
    from rpy2.robjects.vectors import DataFrame, FloatVector, IntVector, StrVector, ListVector, Matrix
    
    
    def recurse_r_tree(data):
        """
        step through an R object recursively and convert the types to python types as appropriate. 
        Leaves will be converted to e.g. numpy arrays or lists as appropriate and the whole tree to a dictionary.
        """
        r_dict_types = [DataFrame, ListVector]
        r_array_types = [FloatVector, IntVector, Matrix]
        r_list_types = [StrVector]
        if type(data) in r_dict_types:
            return OrderedDict(zip(data.names, [recurse_r_tree(elt) for elt in data]))
        elif type(data) in r_list_types:
            return [recurse_r_tree(elt) for elt in data]
        elif type(data) in r_array_types:
            return np.array(data)
        else:
            if hasattr(data, "rclass"):  # An unsupported r class
                raise KeyError('Could not proceed, type {} is not defined. '
                               'To add support for this type, just add it to the imports '
                               'and to the appropriate type list above'.format(type(data)))
            else:
                return data  # We reached the end of recursion
    

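    【Discussion】:

    • This works very well. Could you tell us how R matrices can also be handled in the conversion?
    • @moha I modified the code above to include R matrices. Please give me feedback. I also changed the error message to help with adding support for more R types.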
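A note on the type checks in `recurse_r_tree`: `type(data) in r_array_types` matches exact classes only, so an instance of a subclass of a listed class falls through to the error branch, whereas `isinstance` with a tuple of classes also accepts subclasses. The sketch below illustrates the difference with hypothetical stand-in classes (`BaseVector`, `SpecialVector`). They are not rpy2 classes, and whether a given rpy2 object is a subclass of a listed type depends on the installed version.

```python
class BaseVector:
    """Hypothetical stand-in for a class that appears in a type list."""


class SpecialVector(BaseVector):
    """Hypothetical subclass that is not itself in the type list."""


array_types = (BaseVector,)
obj = SpecialVector()

exact_match = type(obj) in array_types         # exact class comparison only
subclass_match = isinstance(obj, array_types)  # also accepts subclasses

print(exact_match, subclass_match)
```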
    【Solution 3】:

    A simple R list to a Python dictionary:

    >>> import rpy2.robjects as robjects
    >>> a = robjects.r('list(foo="barbat", fizz=123)')
    >>> d = { key : a.rx2(key)[0] for key in a.names }
    >>> d
    {'foo': 'barbat', 'fizz': 123.0}
    

    Arbitrary R objects to Python objects, using RJSONIO for JSON serialization in R and the json module for deserialization in Python:

    On the R server: install.packages("RJSONIO", dependencies = TRUE)

    >>> import json
    >>> import rpy2.robjects as robjects
    >>> robjects.r("library(RJSONIO)")
    <StrVector - Python:0x300b8c0 / R:0x3fbccb0>
    [str, str, str, ..., str, str, str]
    >>> rjson = robjects.r(' toJSON( list(foo="barbat", fizz=123, lst=list(33,"bb")) )  ')
    >>> pyobj = json.loads( rjson[0] )
    >>> pyobj
    {u'lst': [33, u'bb'], u'foo': u'barbat', u'fizz': 123}
    >>> pyobj['lst']
    [33, u'bb']
    >>> pyobj['lst'][0]
    33
    >>> pyobj['lst'][1]
    u'bb'
    >>> rjson = robjects.r(' toJSON( list(foo="barbat", fizz=123, lst=list( key1=33,key2="bb")) )  ')
    >>> pyobj = json.loads( rjson[0] )
    >>> pyobj
    {u'lst': {u'key2': u'bb', u'key1': 33}, u'foo': u'barbat', u'fizz': 123}
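
The decoding half of this approach needs only the standard library. As a sketch that runs without R, the snippet below replaces the `StrVector` returned by `toJSON` with a one-element Python list; the JSON literal is an assumed example of what R might return, not captured output. Under Python 3 the decoded strings are ordinary `str` objects, so the `u''` prefixes shown in the transcript do not appear.

```python
import json

# Stand-in for the one-element StrVector returned by robjects.r('toJSON(...)');
# the JSON text is an assumed example, not captured from R.
rjson = ['{"foo": "barbat", "fizz": 123, "lst": {"key1": 33, "key2": "bb"}}']

pyobj = json.loads(rjson[0])  # nested JSON objects become nested dicts
print(pyobj["lst"]["key2"])
```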
    

    【Discussion】:

    • Clever use of JSON as an intermediate format.
    【Solution 4】:

    A simple function that converts a nested R named list into a nested Python dictionary:

    import numpy as np
    from rpy2.rinterface import NULL  # rpy2's representation of R's NULL


    def rext(r):
        """
        Return an R named list as a Python dictionary.
        """
        # In case `r` is not a named list
        try:
            # No more names, just return the value!
            if r.names == NULL:
                # If more than one value, return numpy array (or list)
                if len(list(r)) > 1:
                    return np.array(r)
                # Just one value, return the value
                else:
                    return list(r)[0]
            # Create dictionary to hold named list as key-value
            dic = {}
            for n in list(r.names):
                dic[n] = rext(r[r.names.index(n)])
            return dic
        # Uh-oh, `r` is not a named list, just return `r` as is
        except Exception:
            return r
    

    【Discussion】:

      【Solution 5】:

      With a recent version of pandas, this can also be done:

      import rpy2.robjects as robjects
      a = robjects.r('list(foo="barbat", fizz=123)')
      
      from rpy2.robjects import pandas2ri
      print(pandas2ri.ri2py(a.names))
      temp = pandas2ri.ri2py(a)
      print(temp[0])
      print(temp[1])
      

      【Discussion】:

      • After the code above, what do you get from type(temp)? I still see temp as rpy2.robjects.vectors.ListVector.
      【Solution 6】:

      Here is my function for converting an rpy2 ListVector to a Python dict; it can handle nested lists:

      import rpy2.robjects as ro
      from rpy2.robjects import pandas2ri
      
      def r_list_to_py_dict(r_list):
          converted = {}
          for name in r_list.names:
              val = r_list.rx(name)[0]
              if isinstance(val, ro.vectors.DataFrame):
                  converted[name] = pandas2ri.ri2py_dataframe(val)
              elif isinstance(val, ro.vectors.ListVector):
                  converted[name] = r_list_to_py_dict(val)
              elif isinstance(val, ro.vectors.FloatVector) or isinstance(val, ro.vectors.StrVector):
                  if len(val) == 1:
                      converted[name] = val[0]
                  else:
                      converted[name] = list(val)
              else: # single value
                  converted[name] = val
          return converted
      

      【Discussion】:

        【Solution 7】:

        You can also do the following:

        dict(a.items())
        

        Out:

        {'foo': R object with classes: ('character',) mapped to:
         ['barbat'], 'fizz': R object with classes: ('numeric',) mapped to:
         [123.000000]}
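
The values in this dictionary are still R vector objects. If plain Python values are wanted, each value can be passed through `list` in a dict comprehension. A minimal sketch, using a list of `(name, values)` tuples as a stand-in for what `a.items()` yields so that it runs without R:

```python
# Stand-in for a.items(): (name, vector) pairs, with tuples in place of R vectors
items = [("foo", ("barbat",)), ("fizz", (123.0,))]

# Convert every value to a plain list while building the dict
d = {name: list(values) for name, values in items}
print(d)
```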
        

        【Discussion】:
