【Posted】: 2018-05-20 15:20:32
【Question】:
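Good day everyone,
I have not been able to complete this challenging task, so I am looking for an elegant way to:
- apply an adaptable method, such as a loop, to each row element in "zone"
- extract multiple substrings, row by row, from "country_name", grouped by the "zone" element
- store each row's substrings as index values to look up in df2
- match those index values against the data frame df2
- compute the total population and add it to df1 as a new column (mutate)
The essential challenge is that the method must not be hard-coded to any specific element in the data frames.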
The first data frame:
# define the vectors first, then build the data frame from them
zone <- c("M", "N", "O")
country_name <- c("The USA, Canada & Mexico are part of North America", "Canada like Australia is a Commonwealth member", "The UK is still finalizing its exit plans from the EU")
df1 <- data.frame(zone, country_name, stringsAsFactors = FALSE)
The second data frame:
# one row per (zone, country) pair; the population units are not stated
zonal_region <- c("M", "M", "M", "N", "N", "N", "O", "O", "O")
country <- c("USA", "Canada", "Mexico", "Canada", "Australia", "UK", "Australia", "UK", "Canada")
population <- c(323.4, 36.29, 127.5, 36.29, 24.13, 65.64, 24.13, 65.64, 36.29)
df2 <- data.frame(zonal_region, country, population, stringsAsFactors = FALSE)
This is what my final output should look like:
# total_population sums, per zone, the df2 countries named in country_name
zone <- c("M", "N", "O")
country_name <- c("The USA, Canada & Mexico are part of North America", "Canada like Australia is a Commonwealth member", "The UK is still finalizing its exit plans from the EU")
total_population <- c(487.19, 60.42, 65.64)
df3 <- data.frame(zone, country_name, total_population, stringsAsFactors = FALSE)
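I am having trouble extracting the multiple substrings and using them to look up their values in df2 for a given zone.
If anyone can solve this, I would greatly appreciate it.
Thanks!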
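
The steps listed in the question can be sketched in base R as follows. This is only one possible approach, not an accepted answer from the thread. The helper name zone_population is hypothetical. The sketch assumes that country names in df2 appear verbatim in the sentence as whole words and contain no regex metacharacters.

```r
# Data from the question; stringsAsFactors = FALSE keeps text as character.
df1 <- data.frame(
  zone = c("M", "N", "O"),
  country_name = c(
    "The USA, Canada & Mexico are part of North America",
    "Canada like Australia is a Commonwealth member",
    "The UK is still finalizing its exit plans from the EU"
  ),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  zonal_region = c("M", "M", "M", "N", "N", "N", "O", "O", "O"),
  country = c("USA", "Canada", "Mexico", "Canada", "Australia", "UK",
              "Australia", "UK", "Canada"),
  population = c(323.4, 36.29, 127.5, 36.29, 24.13, 65.64, 24.13, 65.64, 36.29),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one zone and one sentence, sum the population of
# every df2 country in that zone whose name occurs in the sentence.
zone_population <- function(zone, sentence, lookup) {
  # Restrict the lookup table to the rows of the current zone.
  candidates <- lookup[lookup$zonal_region == zone, ]
  # \\b word boundaries ensure a name only matches as a whole word.
  mentioned <- vapply(
    candidates$country,
    function(name) grepl(paste0("\\b", name, "\\b"), sentence, perl = TRUE),
    logical(1)
  )
  sum(candidates$population[mentioned])
}

# Apply the helper row by row; nothing is hard-coded to a zone or country.
df3 <- df1
df3$total_population <- mapply(
  zone_population, df1$zone, df1$country_name,
  MoreArgs = list(lookup = df2),
  USE.NAMES = FALSE
)
print(df3)
```

Because the candidate names are taken from df2 for each zone at run time, adding zones or countries to either data frame should need no change to the code. With the sample data the totals come out as 487.19, 60.42 and 65.64, matching the desired df3.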
【Discussion】:
Tags: r