【Question Title】: R iterate over a dictionary into a function for multiple values
【Posted】: 2022-01-01 01:09:46
【Question Description】:

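I am trying to write a loop (or whatever is most efficient) that iterates over a series of calendars in R (or Python!) and lists all holidays. Ideally it would list all business days, but I may have to do this in two parts, since I also want weekends flagged. The goal is a data frame that looks like this: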

Country | ISO Code (if available) | Dates
United States of America| US| 12.24.2020
United States of America| US| 12.25.2020
United States of America| US| 01.01.2021
United Kingdom| UK| 12.24.2020
United Kingdom| UK| 12.25.2020
United Kingdom| UK| 01.01.2021

What I have so far:

require("lattice")
require("reticulate")
require("RcppQuantuccia")
require("tidyverse")
require("tidytable")

fun_Holidays <- function(cal) {
    setCalendar(cal)
    getHolidays(as.Date("2019-01-01"), as.Date("2030-12-31"))
}
cal_dic <- data.table(calendar=calendars)
as.list(cal_dic)

cal_dic is a list of all the calendars available in RcppQuantuccia, but if I run:

fun_Holidays(cal_dic)

all I get is an error (because the call does not iterate over the list):

ERROR: Error in setCalendar(cal): Expecting a single string value: [type=list; extent=1].

I also tried this in Python with the holidays package and got further, but the ISO codes are not attached correctly:

import holidays
import numpy as np
import pandas as pd

all_holidays = []
country_list = ['Angola','Argentina','Aruba','Australia','Austria','Bangladesh','Belarus','Belgium','Botswana','Brazil',
                'Bulgaria','Burundi','Canada','Chile','China','Colombia','Croatia','Curacao','Czechia','Denmark','Djibouti','DominicanRepublic',
                'Egypt','England','Estonia','Finland','France','Georgia','Germany','Greece','Honduras','HongKong','Hungary','Iceland','India','Ireland','IsleOfMan',
                'Israel','Italy','Jamaica','Japan','Kenya','Korea','Latvia','Lesotho','Lithuania','Luxembourg','Malaysia','Malawi','Mexico','Morocco','Mozambique','Netherlands',
                'Namibia','NewZealand','Nicaragua','Nigeria','NorthernIreland','Norway','Paraguay','Peru','Poland','Portugal','PortugalExt','Romania','Russia','SaudiArabia','Scotland',
                'Serbia','Singapore','Slovakia','Slovenia','SouthAfrica','Spain','Swaziland','Sweden','Switzerland','Turkey','Ukraine','UnitedArabEmirates','UnitedKingdom',
                'UnitedStates','Venezuela','Vietnam','Wales','Zambia','Zimbabwe']
    

for country in country_list:
    for holiday in holidays.CountryHoliday(country, years = np.arange(2018,2030,1)).items():
        # `code` is never assigned in this snippet, so every row reuses whatever value it already holds
        all_holidays.append({'date' : holiday[0], 'holiday' : holiday[1], 'country': country, 'code': code})
all_holidays = pd.DataFrame(all_holidays)
all_holidays

       date        holiday                       country   code
0      2018-09-17  Dia do Herói Nacional         Angola    NZ
1      2018-01-01  Ano novo                      Angola    NZ
2      2018-03-30  Sexta-feira Santa             Angola    NZ
3      2018-02-13  Carnaval                      Angola    NZ
4      2018-02-04  Dia do Início da Luta Armada  Angola    NZ
...    ...         ...                           ...       ...
14386  2029-08-15  Zimbabwe Heroes' Day          Zimbabwe  NZ
14387  2029-08-13  Defense Forces Day            Zimbabwe  NZ
14388  2029-12-22  Unity Day                     Zimbabwe  NZ
14389  2029-12-25  Christmas Day                 Zimbabwe  NZ
14390  2029-12-26  Boxing Day                    Zimbabwe  NZ
14391 rows × 4 columns
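
Every row above carries the same code (NZ), which suggests the code is not being looked up per country inside the loop. A minimal sketch of one way to attach it, using only the standard library: `ISO_CODES`, `build_rows` and the sample data are hypothetical names introduced for illustration, and the sample dictionary stands in for what the holidays package would return for each country.

```python
from datetime import date

# Hypothetical name-to-code mapping; it would need an entry for every name in country_list.
ISO_CODES = {"UnitedStates": "US", "UnitedKingdom": "GB", "NewZealand": "NZ"}


def build_rows(holidays_by_country, iso_codes):
    """Flatten {country: {date: name}} into one row per holiday, each with its own country code."""
    rows = []
    for country, holiday_map in holidays_by_country.items():
        code = iso_codes.get(country)  # looked up once per country; None if the name is unknown
        for day, name in sorted(holiday_map.items()):
            rows.append({"date": day, "holiday": name, "country": country, "code": code})
    return rows


# Stand-in data (assumption): in the loop above this would come from
# holidays.CountryHoliday(country, years=...).items()
sample = {
    "UnitedStates": {date(2020, 12, 25): "Christmas Day", date(2021, 1, 1): "New Year's Day"},
    "UnitedKingdom": {date(2020, 12, 25): "Christmas Day"},
}

rows = build_rows(sample, ISO_CODES)
for row in rows:
    print(row["date"], row["country"], row["code"])
```

The list of dictionaries can be passed to `pd.DataFrame` exactly as in the attempt above; the only change is that `code` is derived from `country` on each pass through the outer loop.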

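I find it odd that there is no master list of holidays by date and country available as a CSV to help with time series work, but maybe that's just me! :)

Thanks!

Edit: I have also been looking at: https://workalendar.github.io/workalendar/

It covers the largest list of countries, but it is harder to use than holidays. If anyone has a way to pull a "master calendar" out of workalendar, that would be awesome!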

【Comments】:

    Tags: python r loops calendar


    【Solution 1】:

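    Use lapply to get a list of dates for each value in calendars. setCalendar expects a single string, so the function has to be called once per calendar name rather than once with the whole list.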

    library(RcppQuantuccia)
    
    fun_Holidays <- function(cal) {
      setCalendar(cal)
      getHolidays(as.Date("2019-01-01"), as.Date("2030-12-31"))
    }
    
    
    lapply(calendars, fun_Holidays)
    

    To build a single data frame containing the country name and the dates, you can use:

    do.call(rbind, lapply(calendars, function(x) {
      dates <- fun_Holidays(x)
      if(length(dates))
        data.frame(country = x, dates)
    })) -> result
    
    head(result)
    
    #  country      dates
    #1  TARGET 2019-01-01
    #2  TARGET 2019-04-19
    #3  TARGET 2019-04-22
    #4  TARGET 2019-05-01
    #5  TARGET 2019-12-25
    #6  TARGET 2019-12-26
    

    Or with purrr:

    purrr::map_df(calendars, function(x) {
      dates <- fun_Holidays(x)
      if(length(dates))
        data.frame(country = x, dates)
    }) -> result
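
For the weekend flag mentioned in the question, one option is to generate every calendar day in the range and mark it against the holiday dates already collected. The following is a minimal sketch in Python using only the standard library; `label_days` and the two sample holiday dates are hypothetical, and it assumes a Saturday/Sunday weekend, which does not hold for every country.

```python
from datetime import date, timedelta


def label_days(start, end, holiday_dates):
    """Return one row per calendar day in [start, end] with weekend, holiday and business-day flags."""
    rows = []
    day = start
    while day <= end:
        is_weekend = day.weekday() >= 5  # Monday is 0, so 5 and 6 are Saturday and Sunday
        is_holiday = day in holiday_dates
        rows.append({
            "date": day,
            "is_weekend": is_weekend,
            "is_holiday": is_holiday,
            "is_business_day": not (is_weekend or is_holiday),
        })
        day += timedelta(days=1)
    return rows


# Sample holiday dates for one calendar (assumption); in practice these would be
# the dates returned for that calendar by the holiday lookup.
us_holidays = {date(2020, 12, 25), date(2021, 1, 1)}

calendar_rows = label_days(date(2020, 12, 24), date(2021, 1, 3), us_holidays)
business_days = [row["date"] for row in calendar_rows if row["is_business_day"]]
print(business_days)
```

Keeping the weekend and holiday flags as separate columns means the same table can serve both purposes: filtering on `is_holiday` gives the holiday list, and filtering on `is_business_day` gives the working days.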
    

    【Discussion】:
