【问题标题】:Converting Twitter Stream JSON to R dataframe将 Twitter 流 JSON 转换为 R 数据帧
【发布时间】:2012-08-12 22:38:27
【问题描述】:

我有一个 JSON 格式的文本文件,我使用来自 link 的代码下载了该文件。它看起来像这样:

"{\"contributors\":null,\"coordinates\":null,\"in_reply_to_user_id\":null,\"truncated\":false,\"text\":\"'ello miss...?\",\"entities\":{\"hashtags\":[],\"urls\":[],\"user_mentions\":[]},\"place\":null,\"id_str\":\"234702954853199872\",\"favorited\":false,\"geo\":null,\"source\":\"\u003Ca href=\\"http:\/\/www.tweetdeck.com\\" rel=\\"nofollow\\"\u003ETweetDeck\u003C\/a\u003E\",\"retweet_count\":0,\"in_reply_to_status_id_str\":null,\"in_reply_to_screen_name\":null,\"created_at\":\"Sun Aug 12 17:28:39 +0000 2012\",\"in_reply_to_user_id_str\":null,\"user\":{\"show_all_inline_media\":false,\"lang\":\"en\",\"friends_count\":145,\"profile_sidebar_border_color\":\"181A1E\",\"location\":\"Chile\",\"profile_background_image_url_https\":\"https:\/\/si0.twimg.com\/images\/themes\/theme9\/bg.gif\",\"id_str\":\"76862348\",\"listed_count\":5,\"profile_use_background_image\":true,\"profile_image_url_https\":\"https:\/\/si0.twimg.com\/profile_images\/1547896482\/egotuiter_normal.png\",\"description\":\"Geek. Amante de la m\u00fasica. Ateo. Nortino. Estudio Derecho en la U. de Chile. Nadie lee mi blog. Cobreloino. Afortunadamente ya no uso lentes :D\",\"follow_request_sent\":null,\"following\":null,\"profile_text_color\":\"666666\",\"default_profile\":false,\"profile_background_image_url\":\"http:\/\/a0.twimg.com\/images\/themes\/theme9\/bg.gif\",\"followers_count\":181,\"is_translator\":false,\"time_zone\":\"Buenos Aires\",\"profile_link_color\":\"2FC2EF\",\"protected\":false,\"created_at\":\"Thu Sep 24 05:11:20 +0000 2009\",\"profile_background_color\":\"1A1B1F\",\"name\":\"Camilo Rojas\",\"default_profile_image\":false,\"contributors_enabled\":false,\"statuses_count\":36450,\"geo_enabled\":false,\"notifications\":null,\"profile_background_tile\":false,\"url\":\"http:\/\/unnombreingenioso.blogspot.com\",\"profile_image_url\":\"http:\/\/a0.twimg.com\/profile_images\/1547896482\/egotuiter_normal.png\",\"screen_name\":\"rojas_altazor\",\"id\":76862348,\"verified\":false,\"utc_offset\":-10800,\"favourites_count\":42,\"profile_sidebar_fill_color\":\"252429\"},\"retweeted\":false,\"in_reply_to_status_id\":null,\"id\":234702954853199872}
"
"{\"contributors\":null,\"coordinates\":null,\"in_reply_to_user_id\":null,\"truncated\":false,\"text\":\"VALENTINA CORTANTE!\",\"entities\":{\"hashtags\":[],\"urls\":[],\"user_mentions\":[]},\"place\":null,\"id_str\":\"234702954861580288\",\"favorited\":false,\"geo\":null,\"source\":\"web\",\"retweet_count\":0,\"in_reply_to_status_id_str\":null,\"in_reply_to_screen_name\":null,\"created_at\":\"Sun Aug 12 17:28:39 +0000 2012\",\"in_reply_to_user_id_str\":null,\"user\":{\"show_all_inline_media\":false,\"lang\":\"es\",\"friends_count\":251,\"profile_sidebar_border_color\":\"CC3366\",\"location\":\"\",\"profile_background_image_url_https\":\"https:\/\/si0.twimg.com\/images\/themes\/theme11\/bg.gif\",\"id_str\":\"292675295\",\"listed_count\":0,\"profile_use_background_image\":true,\"profile_image_url_https\":\"https:\/\/si0.twimg.com\/profile_images\/2494400748\/7zd2z67iclw88rc0q8sr_normal.jpeg\",\"description\":\"Soy hermosaaaa y valen me ama\u2665\",\"follow_request_sent\":null,\"following\":null,\"profile_text_color\":\"362720\",\"default_profile\":false,\"profile_background_image_url\":\"http:\/\/a0.twimg.com\/images\/themes\/theme11\/bg.gif\",\"followers_count\":399,\"is_translator\":false,\"time_zone\":\"Santiago\",\"profile_link_color\":\"B40B43\",\"protected\":false,\"created_at\":\"Wed May 04 01:35:06 +0000 2011\",\"profile_background_color\":\"FF6699\",\"name\":\"Valen me amodora\u2020\",\"default_profile_image\":false,\"contributors_enabled\":false,\"statuses_count\":2643,\"geo_enabled\":false,\"notifications\":null,\"profile_background_tile\":true,\"url\":null,\"profile_image_url\":\"http:\/\/a0.twimg.com\/profile_images\/2494400748\/7zd2z67iclw88rc0q8sr_normal.jpeg\",\"screen_name\":\"CamiCelie\",\"id\":292675295,\"verified\":false,\"utc_offset\":-14400,\"favourites_count\":2,\"profile_sidebar_fill_color\":\"E5507E\"},\"retweeted\":false,\"in_reply_to_status_id\":null,\"id\":234702954861580288}
"

我很好奇如何将单个片段提取到 R 数据框中。例如,理想情况下,这将转到这样的数据框:

screen_name        text
rojas_altazor       'ello miss...?
CamiCelie           VALENTINA CORTANTE!
...                  ...

这是我第一次尝试使用 JSON 文件,如有任何建议,我们将不胜感激。

【问题讨论】:

    标签: json r rcurl


    【解决方案1】:

    首先,使用 rjson 库,我们拉入流 - 在这种情况下,我使用的是一个用户的时间线。

    library(rjson)
    timeline=fromJSON(file="http://api.twitter.com/1/statuses/user_timeline.json?screen_name=douglbutt&count=100")
    

    这给了我们一个列表 - 我总是很难将分层列表分解成整齐的表格,所以我要逐个字段地做。

    timelist=unlist(timeline)
    
    tweets=timelist[names(timelist)=="text"]
    names=timelist[names(timelist)=="user.screen_name"]
    times=timelist[names(timelist)=="created_at"]
    
    tweet.frame=data.frame(tweets,names,times)
    

    所以我们应该将它们全部放在 tweet.frame 中。如果您想查看未列出数据中的可用字段,请尝试:

    unique(names(timelist))
    

    【讨论】:

    • 给我Warning message: In readLines(file) : incomplete final line found on 'http://api.twitter.com/1/statuses/user_timeline.json?screen_name=douglbutt&count=100' 。关于为什么的任何想法?
    【解决方案2】:
    twit<-c("{\"contributors\":null,\"coordinates\":null,\"in_reply_to_user_id\":null,\"truncated\":false,\"text\":\"'ello miss...?\",\"entities\":{\"hashtags\":[],\"urls\":[],\"user_mentions\":[]},\"place\":null,\"id_str\":\"234702954853199872\",\"favorited\":false,\"geo\":null,\"source\":\"\\u003Ca href=\\\"http:\\/\\/www.tweetdeck.com\\\" rel=\\\"nofollow\\\"\\u003ETweetDeck\\u003C\\/a\\u003E\",\"retweet_count\":0,\"in_reply_to_status_id_str\":null,\"in_reply_to_screen_name\":null,\"created_at\":\"Sun Aug 12 17:28:39 +0000 2012\",\"in_reply_to_user_id_str\":null,\"user\":{\"show_all_inline_media\":false,\"lang\":\"en\",\"friends_count\":145,\"profile_sidebar_border_color\":\"181A1E\",\"location\":\"Chile\",\"profile_background_image_url_https\":\"https:\\/\\/si0.twimg.com\\/images\\/themes\\/theme9\\/bg.gif\",\"id_str\":\"76862348\",\"listed_count\":5,\"profile_use_background_image\":true,\"profile_image_url_https\":\"https:\\/\\/si0.twimg.com\\/profile_images\\/1547896482\\/egotuiter_normal.png\",\"description\":\"Geek. Amante de la m\\u00fasica. Ateo. Nortino. Estudio Derecho en la U. de Chile. Nadie lee mi blog. Cobreloino. Afortunadamente ya no uso lentes :D\",\"follow_request_sent\":null,\"following\":null,\"profile_text_color\":\"666666\",\"default_profile\":false,\"profile_background_image_url\":\"http:\\/\\/a0.twimg.com\\/images\\/themes\\/theme9\\/bg.gif\",\"followers_count\":181,\"is_translator\":false,\"time_zone\":\"Buenos Aires\",\"profile_link_color\":\"2FC2EF\",\"protected\":false,\"created_at\":\"Thu Sep 24 05:11:20 +0000 2009\",\"profile_background_color\":\"1A1B1F\",\"name\":\"Camilo Rojas\",\"default_profile_image\":false,\"contributors_enabled\":false,\"statuses_count\":36450,\"geo_enabled\":false,\"notifications\":null,\"profile_background_tile\":false,\"url\":\"http:\\/\\/unnombreingenioso.blogspot.com\",\"profile_image_url\":\"http:\\/\\/a0.twimg.com\\/profile_images\\/1547896482\\/egotuiter_normal.png\",\"screen_name\":\"rojas_altazor\",\"id\":76862348,\"verified\":false,\"utc_offset\":-10800,\"favourites_count\":42,\"profile_sidebar_fill_color\":\"252429\"},\"retweeted\":false,\"in_reply_to_status_id\":null,\"id\":234702954853199872}", 
    "", "{\"contributors\":null,\"coordinates\":null,\"in_reply_to_user_id\":null,\"truncated\":false,\"text\":\"VALENTINA CORTANTE!\",\"entities\":{\"hashtags\":[],\"urls\":[],\"user_mentions\":[]},\"place\":null,\"id_str\":\"234702954861580288\",\"favorited\":false,\"geo\":null,\"source\":\"web\",\"retweet_count\":0,\"in_reply_to_status_id_str\":null,\"in_reply_to_screen_name\":null,\"created_at\":\"Sun Aug 12 17:28:39 +0000 2012\",\"in_reply_to_user_id_str\":null,\"user\":{\"show_all_inline_media\":false,\"lang\":\"es\",\"friends_count\":251,\"profile_sidebar_border_color\":\"CC3366\",\"location\":\"\",\"profile_background_image_url_https\":\"https:\\/\\/si0.twimg.com\\/images\\/themes\\/theme11\\/bg.gif\",\"id_str\":\"292675295\",\"listed_count\":0,\"profile_use_background_image\":true,\"profile_image_url_https\":\"https:\\/\\/si0.twimg.com\\/profile_images\\/2494400748\\/7zd2z67iclw88rc0q8sr_normal.jpeg\",\"description\":\"Soy hermosaaaa y valen me ama\\u2665\",\"follow_request_sent\":null,\"following\":null,\"profile_text_color\":\"362720\",\"default_profile\":false,\"profile_background_image_url\":\"http:\\/\\/a0.twimg.com\\/images\\/themes\\/theme11\\/bg.gif\",\"followers_count\":399,\"is_translator\":false,\"time_zone\":\"Santiago\",\"profile_link_color\":\"B40B43\",\"protected\":false,\"created_at\":\"Wed May 04 01:35:06 +0000 2011\",\"profile_background_color\":\"FF6699\",\"name\":\"Valen me amodora\\u2020\",\"default_profile_image\":false,\"contributors_enabled\":false,\"statuses_count\":2643,\"geo_enabled\":false,\"notifications\":null,\"profile_background_tile\":true,\"url\":null,\"profile_image_url\":\"http:\\/\\/a0.twimg.com\\/profile_images\\/2494400748\\/7zd2z67iclw88rc0q8sr_normal.jpeg\",\"screen_name\":\"CamiCelie\",\"id\":292675295,\"verified\":false,\"utc_offset\":-14400,\"favourites_count\":2,\"profile_sidebar_fill_color\":\"E5507E\"},\"retweeted\":false,\"in_reply_to_status_id\":null,\"id\":234702954861580288}", 
    "")
    
    
    library(RJSONIO)
    
    twit1<-fromJSON(twit[1])
    
    > twit1$user$screen_name
    [1] "rojas_altazor"
    
    > twit1$text
    [1] "'ello miss...?"
    
    > twit2<-fromJSON(twit[3])
    > twit2$user$screen_name
    [1] "CamiCelie"
    > twit2$text
    [1] "VALENTINA CORTANTE!"
    

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 2018-08-19
      • 1970-01-01
      • 1970-01-01
      • 2017-12-13
      • 2018-03-27
      • 1970-01-01
      • 2019-01-16
      • 1970-01-01
      相关资源
      最近更新 更多