【Question title】: How to convert Index into list? [duplicate]
【Posted】: 2016-09-07 19:54:58
【Question description】:

My index:

Index([u'Newal', u'Saraswati Khera', u'Tohana'], dtype='object')

I have to convert it into a list in the following format:

['Newal','SaraswatiKhera','Tohana']

【Question comments】:

    Tags: python python-2.7 pandas


    【Solution 1】:

    You can use tolist or list:

    print df.index.tolist()
    
    print list(df.index)
    

    But the fastest solution is to take the NumPy array from values and call tolist on it (thanks EdChum):

    print df.index.values.tolist()
    

    Sample:

    import pandas as pd
    
    idx = pd.Index([u'Newal', u'Saraswati Khera', u'Tohana'])
    print idx
    Index([u'Newal', u'Saraswati Khera', u'Tohana'], dtype='object')
    
    print idx.tolist()
    [u'Newal', u'Saraswati Khera', u'Tohana']
    
    print list(idx)
    [u'Newal', u'Saraswati Khera', u'Tohana']
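
The sample above is written for Python 2, as the question's tags indicate. As a minimal sketch of the same three conversions on Python 3 (where every str is already unicode, so no u prefix appears), assuming a reasonably recent pandas; the variable names as_list_1 to as_list_3 are only illustrative:

```python
import pandas as pd

# Same index as in the sample above, written with Python 3 string literals.
idx = pd.Index(['Newal', 'Saraswati Khera', 'Tohana'])

as_list_1 = idx.tolist()         # Index method
as_list_2 = list(idx)            # built-in list() iterates over the Index
as_list_3 = idx.values.tolist()  # goes through the underlying array first

print(as_list_1)
# ['Newal', 'Saraswati Khera', 'Tohana']
```

All three calls return an ordinary Python list of str, so they are interchangeable for the purpose of the question.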
    

    If you need to encode to UTF-8:

    print [x.encode('UTF8') for x in idx.tolist()]
    ['Newal', 'Saraswati Khera', 'Tohana']
    

    Another solution:

    print [str(x) for x in idx.tolist()]
    ['Newal', 'Saraswati Khera', 'Tohana']
    

    But it fails if the unicode strings contain characters outside the ASCII range.
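
To illustrate that failure mode: on Python 2, str(x) on a unicode object implicitly encodes with the default ASCII codec. The sketch below reproduces the effect on Python 3 by calling encode('ascii') explicitly. The accented name in the index is made up for the demonstration and is not from the question.

```python
import pandas as pd

# Hypothetical index: the last label contains a non-ASCII character.
idx = pd.Index(['Newal', 'Saraswati Khera', 'Tohāna'])

# Encoding to UTF-8 works for any unicode string.
utf8_labels = [x.encode('utf-8') for x in idx.tolist()]

# Encoding to ASCII (what Python 2's str() does implicitly) raises an error.
try:
    ascii_labels = [x.encode('ascii') for x in idx.tolist()]
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(ascii_ok)
# False
```

This is why the explicit UTF-8 encoding is the safer choice on Python 2 when the labels may contain non-ASCII characters.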

    Timings:

    import pandas as pd
    import numpy as np
    
    #random dataframe
    np.random.seed(1)
    df = pd.DataFrame(np.random.randint(10, size=(3,3)))
    df.columns = list('ABC')
    df.index = [u'Newal', u'Saraswati Khera', u'Tohana']
    print df
    
    print df.index
    Index([u'Newal', u'Saraswati Khera', u'Tohana'], dtype='object')
    
    print df.index.tolist()
    [u'Newal', u'Saraswati Khera', u'Tohana']
    
    print list(df.index)
    [u'Newal', u'Saraswati Khera', u'Tohana']
    
    print df.index.values.tolist()
    [u'Newal', u'Saraswati Khera', u'Tohana']
    
    
    In [90]: %timeit list(df.index)
    The slowest run took 37.42 times longer than the fastest. This could mean that an intermediate result is being cached 
    100000 loops, best of 3: 2.18 µs per loop
    
    In [91]: %timeit df.index.tolist()
    The slowest run took 22.33 times longer than the fastest. This could mean that an intermediate result is being cached 
    1000000 loops, best of 3: 1.75 µs per loop
    
    In [92]: %timeit df.index.values.tolist()
    The slowest run took 62.72 times longer than the fastest. This could mean that an intermediate result is being cached 
    1000000 loops, best of 3: 787 ns per loop
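
The %timeit magic above only works inside IPython. A rough way to repeat the comparison in a plain script is the standard-library timeit module. This is a sketch only: the small DataFrame is a stand-in for the one above, the candidates and results names are illustrative, the lambda wrappers add a little call overhead, and the numbers will vary with the machine and the pandas version.

```python
import timeit

import pandas as pd

# Small stand-in DataFrame with the same index labels as in the question.
df = pd.DataFrame({'A': [1, 2, 3]},
                  index=['Newal', 'Saraswati Khera', 'Tohana'])

# The three conversions being compared.
candidates = {
    'list(df.index)': lambda: list(df.index),
    'df.index.tolist()': lambda: df.index.tolist(),
    'df.index.values.tolist()': lambda: df.index.values.tolist(),
}

number = 10000  # calls per timing run
results = {}
for label, func in candidates.items():
    # Take the best of 5 runs and report the time of a single call.
    best = min(timeit.repeat(func, number=number, repeat=5))
    results[label] = best / number
    print('{}: {:.0f} ns per call'.format(label, results[label] * 1e9))
```

For an index this small the differences are tiny in absolute terms, so it is worth measuring on your own data before choosing one form for performance reasons.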
    

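    【Discussion】:

    • Nice summary. Note that running it the print way takes longer (about 3.5 times longer according to %timeit).
    • Good idea, I have added it.
    • If performance is critical, use the underlying np array: df.index.values.tolist() will be faster than the other methods.
    • Thanks EdChum, I have added it to the solution.
    • I found no noticeable difference between df.index.tolist() and df.index.values.tolist().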