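【Posted】: 2015-05-22 17:04:43
【Problem description】:
I have a co-authorship dataset covering about 3k authors. It has Sender and Receiver (or Source and Target) columns, plus a column holding the journal name and publication year. When two authors have more than one article in common, the articles are listed in a single row, separated by commas. I want to split such rows into multiple rows, one per article. The data.frame is in my GitHub repository.
For example: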
HALL M,DE JONG GF, "['GRAEFE DR 2008 INTERNATIONAL MIGRATION REVIEW', 'HALL M 2010 SOCIAL SCIENCE RESEARCH']"
I need to split the last column like this:
HALL M,DE JONG GF, GRAEFE DR 2008 INTERNATIONAL MIGRATION REVIEW
HALL M,DE JONG GF, HALL M 2010 SOCIAL SCIENCE RESEARCH
I have heard that I need to write a simple loop in R, but I don't know what it should look like.
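For reference, the split described above may not need an explicit loop. Below is a minimal base R sketch, assuming every entry in the last column follows the `['...', '...']` pattern shown in the example, and borrowing the column names Source, Target and ayjid from the dput output in this question. The two-row `temp` data frame is only a toy stand-in for the real data.

```r
# Toy stand-in for the real data: one row with two articles, one with a single article
temp <- data.frame(
  Source = c("HALL M", "HUMPHREY CR"),
  Target = c("DE JONG GF", "SELL RR"),
  ayjid  = c("['GRAEFE DR 2008 INTERNATIONAL MIGRATION REVIEW', 'HALL M 2010 SOCIAL SCIENCE RESEARCH']",
             "['HUMPHREY CR 1977 RURAL SOCIOLOGY']"),
  stringsAsFactors = FALSE
)

# Strip the leading "['" and the trailing "']" from each entry
clean <- gsub("^\\['|'\\]$", "", temp$ayjid)

# Split on the "', '" separator between articles; fixed = TRUE treats it literally
parts <- strsplit(clean, "', '", fixed = TRUE)

# Repeat each original row once per article, then fill in one article per row
n <- lengths(parts)
long <- temp[rep(seq_len(nrow(temp)), n), ]
long$ayjid <- unlist(parts)
rownames(long) <- NULL

print(long)
```

All other columns are carried along by the row repetition, so the same steps should apply unchanged to the full data frame. Note that `lengths()` requires R 3.2.0 or later; on older versions `sapply(parts, length)` does the same job.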
Edit: here is my data as input, first 20 rows:
> dput(head(temp,n=20))
structure(list(Source = c("HUMPHREY CR", "HUMPHREY CR", "HUMPHREY CR",
"SELL RR", "SELL RR", "SELL RR", "GARDNER RW", "GARDNER RW",
"GARDNER RW", "GARDNER RW", "GARDNER RW", "GARDNER RW", "GARDNER RW",
"GARDNER RW", "FAWCETT JT", "FAWCETT JT", "FAWCETT JT", "FAWCETT JT",
"FAWCETT JT", "FAWCETT JT"), Target = c("SELL RR", "GILLASPY RT",
"KROUT JA", "GILLASPY RT", "KROUT JA", "DEJONG GF", "FAWCETT JT",
"ARNOLD F", "CARINO BV", "ROOT BD", "DEJONG G", "ABAD RG", "DEJONG GF",
"BOUVIER LF", "ARNOLD F", "PARK IH", "CARINO BV", "ROOT BD",
"DEJONG G", "ABAD RG"), Type = c("Undirected", "Undirected",
"Undirected", "Undirected", "Undirected", "Undirected", "Undirected",
"Undirected", "Undirected", "Undirected", "Undirected", "Undirected",
"Undirected", "Undirected", "Undirected", "Undirected", "Undirected",
"Undirected", "Undirected", "Undirected"), Id = c(2386L, 2385L,
2384L, 3635L, 3634L, 3636L, 401L, 397L, 398L, 399L, 403L, 396L,
400L, 402L, 598L, 602L, 601L, 604L, 605L, 597L), Label = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA), Weight = c(1, 1, 1, 1, 1, 1, 3, 2, 2, 1, 1, 2, 2,
1, 3, 1, 2, 1, 1, 2), ayjid = c("['HUMPHREY CR 1977 RURAL SOCIOLOGY']",
"['HUMPHREY CR 1977 RURAL SOCIOLOGY']", "['HUMPHREY CR 1977 RURAL SOCIOLOGY']",
"['HUMPHREY CR 1977 RURAL SOCIOLOGY']", "['HUMPHREY CR 1977 RURAL SOCIOLOGY']",
"['SELL RR 1978 JOURNAL OF POPULATION']", "['DEJONG GF 1983 INTERNATIONAL MIGRATION REVIEW', 'DEJONG G 1986 POPULATION AND ENVIRONMENT', 'FAWCETT JT 1994 POPULATION AND ENVIRONMENT']",
"['DEJONG GF 1983 INTERNATIONAL MIGRATION REVIEW', 'GARDNER RW 1986 POPULATION AND ENVIRONMENT']",
"['DEJONG GF 1983 INTERNATIONAL MIGRATION REVIEW', 'GARDNER RW 1986 POPULATION AND ENVIRONMENT']",
"['DEJONG G 1986 POPULATION AND ENVIRONMENT']", "['DEJONG G 1986 POPULATION AND ENVIRONMENT']",
"['DEJONG GF 1983 INTERNATIONAL MIGRATION REVIEW', 'DEJONG G 1986 POPULATION AND ENVIRONMENT']",
"['DEJONG GF 1983 INTERNATIONAL MIGRATION REVIEW', 'GARDNER RW 1986 POPULATION AND ENVIRONMENT']",
"['BOUVIER LF 1986 POPULATION BULLETIN']", "['DEJONG GF 1983 INTERNATIONAL MIGRATION REVIEW', 'ARNOLD F 1989 INTERNATIONAL MIGRATION REVIEW', 'FAWCETT JT 1987 INTERNATIONAL MIGRATION REVIEW']",
"['ARNOLD F 1989 INTERNATIONAL MIGRATION REVIEW']", "['DEJONG GF 1983 INTERNATIONAL MIGRATION REVIEW', 'ARNOLD F 1989 INTERNATIONAL MIGRATION REVIEW']",
"['DEJONG G 1986 POPULATION AND ENVIRONMENT']", "['DEJONG G 1986 POPULATION AND ENVIRONMENT']",
"['DEJONG GF 1983 INTERNATIONAL MIGRATION REVIEW', 'DEJONG G 1986 POPULATION AND ENVIRONMENT']"
)), .Names = c("Source", "Target", "Type", "Id", "Label", "Weight",
"ayjid"), row.names = c(NA, 20L), class = "data.frame")
【Discussion】:
-
Can you post your data.frame?
-
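No. No. No. Put the data.frame in your question, so that all the information is in one place! Have you considered that some people cannot reach your link because of a firewall? If your data.frame is large, just post a representative subset!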
-
Thanks for your comments, I have edited my question. Actually, my data.frame is not too big, so I hope someone can open it. )
-
Same thing again. No, no ;) I mean copy and paste the output of your dput, rather than putting it somewhere that may not be accessible!