How to "pretty print" a python pandas DatetimeIndex: answers

【Question title】: How to "pretty print" a python pandas DatetimeIndex
【Posted】: 2015-02-07 21:40:48
【Question】:

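I am new to pandas and still amazed by what it can do, although sometimes also by how it does it ;-)

I managed to write a small script that reports the number of missing values encountered in a time series, either for each month or for each year of the series. Below is the code, using some dummy data for demonstration.

If I print the returned results (print(cnty) or print(cntm)), everything looks fine, except that I would like to format the datetime values of the index according to the resolution of my data. That is, I would like to see 2000 1000 10 15 instead of 2000-12-31 1000 10 15 for the annual output, and 2000-01 744 10 15 for the monthly output. Is there an easy way to do this in pandas, or do I have to loop over the results and convert them to "plain" python before printing? Note: I do not know in advance how many data columns I have, so anything that relies on a fixed format string per row will not work for me.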

import numpy as np
import pandas as pd
import datetime as dt


def make_data():
    """Make up some bogus data where we know the number of missing values"""
    time = np.array([dt.datetime(2000,1,1)+dt.timedelta(hours=i)
                     for i in range(1000)])
    wd = np.arange(0.,1000.,1.)
    ws = wd*0.2
    wd[[2,3,4,8,9,22,25,33,99,324]] = -99.9   # 10 missing values
    ws[[2,3,4,10,11,12,565,644,645,646,647,648,666,667,669]]  =-99.9 # 15 missing values
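    # list() is needed on Python 3, where zip returns an iterator;
    # the datetime objects are stored in an object-dtype field
    data = np.array(list(zip(time,wd,ws)), dtype=[('time', object),
                                                  ('wd', 'f4'), ('ws', 'f4')])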
    return data


def count_miss(data):
    time = data['time']
    dff = pd.DataFrame(data, index=time)
    # two options for setting missing values:
    # 1) replace everything less or equal -99
    for c in dff.columns:
        if c == 'time':
            continue  # timestamps cannot be compared with a float
        dff.loc[dff[c] <= -99., c] = np.nan
    # 2) alternative: if you know the exact value to be replaced
    # you can use the DataFrame replace method:
##    dff.replace(-99.9, np.nan, inplace=True)

    # add the time variable as data column
    dff['time'] = time
    # count missing values
    # the print expressions will print date labels and the total number of values
    # in the time column plus the number of missing values for all other columns
    # annually:
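    # resample(how='count') was removed from pandas; call .count() on the resampler
    # (pandas 2.2+ also renames the 'A' alias to 'YE')
    cnty = dff.resample('A', closed='right', label='right').count()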
    for c in cnty.columns:
        if c != 'time':
            cnty[c] = cnty['time']-cnty[c]
    # monthly:
    # same change as above (pandas 2.2+ renames the 'M' alias to 'ME')
    cntm = dff.resample('M', closed='right', label='right').count()
    for c in cntm.columns:
        if c != 'time':
            cntm[c] = cntm['time']-cntm[c]
    return cnty, cntm

if __name__ == "__main__":
    data = make_data()
    cnty, cntm = count_miss(data)

Final note: DatetimeIndex does have a format method, but unfortunately there is no explanation of how to use it.

【Comments】:

Tags: python datetime pandas format

【Answer 1】:

The format method of a DatetimeIndex works much like strftime on a datetime.datetime object.

This means you can use the format strings listed here: http://www.tutorialspoint.com/python/time_strftime.htm
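As a quick illustration of those format strings, here is a minimal sketch using only the standard library. The timestamp and the variable names are made up for the example; the directives (%Y, %m, %H, %M) are standard strftime codes.

```python
import datetime

# Hypothetical timestamp, chosen to resemble the year-end label in the question.
stamp = datetime.datetime(2000, 12, 31, 23, 0)

year_only = stamp.strftime('%Y')       # four-digit year
year_month = stamp.strftime('%Y-%m')   # year and zero-padded month
clock = stamp.strftime('%H:%M')        # 24-hour time

print(year_only, year_month, clock)
```

The same directive strings can be reused unchanged when formatting pandas timestamps.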

The trick is that you have to pass a function to the formatter kwarg of the format method. It looks like this (just an example, unrelated to your code):

import pandas
# DatetimeIndex(start=..., periods=...) is no longer accepted; date_range builds the same index
dt = pandas.date_range(start='2014-02-01', periods=10, freq='10min')
dt.format(formatter=lambda x: x.strftime('%Y    %m    %d  %H:%M.%S'))

Output:

['2014    02    01  00:00.00',
 '2014    02    01  00:10.00',
 '2014    02    01  00:20.00',
 '2014    02    01  00:30.00',
 '2014    02    01  00:40.00',
 '2014    02    01  00:50.00',
 '2014    02    01  01:00.00',
 '2014    02    01  01:10.00',
 '2014    02    01  01:20.00',
 '2014    02    01  01:30.00']
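To connect this back to the question, the sketch below relabels the index of a count table so that it prints at monthly or yearly resolution, whatever the number of data columns. It uses DatetimeIndex.strftime instead of format(formatter=...); as far as I know, recent pandas releases deprecate Index.format, so strftime may be the safer choice there. The small cntm frame is a hand-built stand-in for the question's monthly result, and relabel is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Hand-built stand-in for the question's monthly counts (cntm).
cntm = pd.DataFrame(
    {"time": [744, 256], "wd": [10, 0], "ws": [15, 0]},
    index=pd.to_datetime(["2000-01-31", "2000-02-29"]),
)


def relabel(frame, fmt):
    """Return a copy of frame whose DatetimeIndex is replaced by formatted strings."""
    out = frame.copy()
    out.index = frame.index.strftime(fmt)  # Index of plain strings
    return out


monthly = relabel(cntm, "%Y-%m")  # labels such as 2000-01
yearly = relabel(cntm, "%Y")      # labels such as 2000
print(monthly)
```

Because only the index is replaced, the data columns are untouched and no per-row format string is needed. The relabelled index holds strings, so it is best applied to a copy just before printing.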

【Comments】:

【Answer 2】:

It depends on how pretty you want it to be, but for most use cases it is as simple as:

print(date[0]) (where date is your DatetimeIndex variable.)

You will get output like this:

2019-04-26 12:00:00
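For completeness, a minimal runnable version of this idea might look as follows. The two dates are invented for the example; each element of a DatetimeIndex is a pandas Timestamp, which also has its own strftime method if a coarser label is wanted.

```python
import pandas as pd

# Hypothetical two-element DatetimeIndex.
date = pd.to_datetime(["2019-04-26 12:00:00", "2019-04-27 12:00:00"])

first = date[0]                   # a pandas Timestamp
as_text = str(first)              # the text that print(first) shows
short = first.strftime("%Y-%m")   # coarser label for a single element

print(first)
print(short)
```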

【Comments】: