Loop through Twitter followers with the rtweet package in R

【Question title】: Loop through Twitter followers with rtweet package in R
【Posted】: 2017-04-06 21:28:09
【Question】:

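I have a list of Twitter IDs that used a specific hashtag, and I am now trying to build a network graph to see whom they follow. Using the brand-new rtweet package, my idea is to call get_friends for each user_id and end up with a two-column table: userids | following.

The problem is that I end up with one column instead of two. Here is what I did, based on similar questions: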

#this is where the ids list comes from
head(ids)
user_id             freq
2953382183           291
2832407758           178
522476436            149
773707421579677696   117
1296286704           113
773555423970529280   113

#for each user_id, get_friends() show me who the user is following
userids <- ids[1,1]
following <- get_friends(userids)
head(following)
               ids
         540219772
757699150507020288
        2392165598
         628569910
         576547113
         181996651

#NOW I'LL TRY TO FILL A NEW DATA FRAME FOR EACH "user_id" WITH ALL FOLLOWING "ids"

#initializing an empty data frame
final <- data.frame(userids = character(), following =character())

totalusers <- nrow(ids) #ids is a data frame where I got all `user_id`
userids <- NULL
following <- NULL
df <- NULL

for (i in 1:totalusers)
{
userids[i] <- ids[i,1]
following <- get_friends(userids[i]) #get_friends returns a data frame, by package default
df[i] <- data.frame(userids[i], following)
final <- rbind(final, df[i])
}

Does anyone know how I can append the following variable to this data frame? Many thanks.

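【Comments】:

You should probably read The R Inferno on growing objects. What you want to do is index into the correct rows and columns, rather than creating a data frame in each iteration or, similarly, calling rbind in each iteration.
Thanks a lot, @shayaa. For now I will edit the post with the data frame solution, while I look into a more efficient way to do this.
No problem. Providing a minimal dataset for testing the code, along with the expected result, is the accepted standard. If you don't, you will often get downvoted. Also, you can post an answer to your own question; there is no need to leave it as an edit.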

Tags: r loops twitter

【Solution 1】:

The following code works, although it may not be the most efficient approach for large datasets.

for (i in 1:totalusers)
{
userids[i] <- ids[i,1]
following <- get_friends(userids[i])
final <- rbind(final, data.frame(userids=userids[i], following=following))
}

I ended up with this:

userids                    ids
2953382183           540219772
2953382183  757699150507020288
2953382183          2392165598
2953382183           628569910
2953382183           576547113
2953382183           181996651
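
As a rough sketch of the advice in the comments (avoid calling rbind inside the loop), the per-user pieces can be collected in a preallocated list and bound once at the end. To keep the example runnable without Twitter API access, `fake_get_friends` is a hypothetical stand-in for `get_friends`; it is assumed to return a one-column data frame named `ids`, as in the output shown in the question. IDs are kept as character strings because values such as 757699150507020288 are too large to be stored exactly as doubles.

```r
# Hypothetical stand-in for rtweet::get_friends(): returns a one-column
# data frame of friend ids so the sketch runs without API access.
fake_get_friends <- function(user) {
  friends <- list(
    "2953382183" = c("540219772", "757699150507020288", "2392165598"),
    "2832407758" = c("628569910", "576547113"),
    "522476436"  = character(0)  # a user who follows nobody
  )
  data.frame(ids = friends[[user]], stringsAsFactors = FALSE)
}

user_ids <- c("2953382183", "2832407758", "522476436")

# Preallocate one list slot per user instead of growing a data frame.
pieces <- vector("list", length(user_ids))
for (i in seq_along(user_ids)) {
  following <- fake_get_friends(user_ids[i])
  pieces[[i]] <- data.frame(
    userids   = rep(user_ids[i], nrow(following)),
    following = following$ids,
    stringsAsFactors = FALSE
  )
}

# Bind all pieces in a single call.
final <- do.call(rbind, pieces)
```

With real data, `fake_get_friends` would be replaced by `get_friends`, subject to whatever column names the installed rtweet version returns.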

【Comments】:

【Solution 2】:

For a given set of ids (ids), you can do the following:

library(rtweet)
library(plyr)
ids<-c("156562085","808676983","847366544183050240")#the users id
list_of_friends<-lapply(ids,get_friends)#get all the friends' ids per each user id
names(list_of_friends)<-ids
list_of_friends2<-lapply(list_of_friends,function(y) dim(y)[1])#get the number of friends 
df1<-ldply(list_of_friends2, data.frame)#transform the data into data.frame
names(df1)<-c("user_id","following")

df1 yields:

             user_id         following
1           156562085           339
2           808676983          1066
3  847366544183050240             0

In addition, to produce the edge list:

f1<-function(x){
  return(cbind(rep(names(list_of_friends[x]),dim(list_of_friends[[x]])[1]),
               list_of_friends[[x]]))
}
l1<-lapply(names(list_of_friends),f1)
df2<-ldply(l1,data.frame)
names(df2)<-c("user_id","friend_id")

df2 yields:

  user_id          friend_id
1    156562085           26787673
2    156562085           18139619
3    156562085           23827692
                [...]
1403 808676983           19397785
1404 808676983           50393960
1405 808676983           113419517

If you sum the values of the following column in df1, you get 1405, which agrees with nrow(df2). I believe df2 is what you wanted in the first place.
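
The same two tables can probably also be built in base R without plyr. The sketch below is only an illustration: `list_of_friends` is filled with a small made-up sample instead of real `get_friends` results, and each element is assumed to be a data frame with a single `ids` column. It builds the per-user counts, then the edge list, and repeats the consistency check described above.

```r
# Made-up sample standing in for lapply(ids, get_friends); each element is
# assumed to be a one-column data frame named "ids".
list_of_friends <- list(
  "156562085" = data.frame(ids = c("26787673", "18139619", "23827692"),
                           stringsAsFactors = FALSE),
  "808676983" = data.frame(ids = c("19397785", "50393960"),
                           stringsAsFactors = FALSE),
  "847366544183050240" = data.frame(ids = character(0),
                                    stringsAsFactors = FALSE)
)

# Number of friends per user (the df1 table).
counts <- vapply(list_of_friends, nrow, integer(1))
df1 <- data.frame(user_id = names(counts), following = unname(counts),
                  stringsAsFactors = FALSE)

# Edge list (the df2 table): repeat each user id once per friend.
df2 <- data.frame(
  user_id   = rep(names(list_of_friends), times = counts),
  friend_id = as.character(unlist(lapply(list_of_friends, `[[`, "ids"),
                                  use.names = FALSE)),
  stringsAsFactors = FALSE
)

# The counts and the edge list should agree.
stopifnot(sum(df1$following) == nrow(df2))
```

Repeating the user ids with `rep(..., times = counts)` avoids binding one data frame per user, which may matter for long id lists.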

【Comments】: