【Question title】: Sorting dataframe with 1 column
【Posted】: 2022-01-17 11:08:21
【Question description】:

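I have a data frame of names with a single column. I have tried several variations of order(), and I also converted it to a list and tried sort() in a few different ways, with no luck.

Below is the dput() for reference: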

> dput(names.ordered)
structure(list(Directors = c("Darabont, Frank", "Nolan, Christopher", 
"Lumet, Sidney", "Spielberg, Steven", "Jackson, Peter", "Tarantino, Quentin", 
"Leone, Sergio", "Fincher, David", "Zemeckis, Robert", "Kershner, Irvin", 
"Wachowski, Lana", "Scorsese, Martin", "Forman, Milos", "Kurosawa, Akira", 
"Demme, Jonathan", "Meirelles, Fernando", "Benigni, Roberto", 
"Capra, Frank", "Lucas, George", "Miyazaki, Hayao", "Besson, Luc", 
"Kobayashi, Masaki", "Polanski, Roman", "Cameron, James", "Singer, Bryan", 
"Hitchcock, Alfred", "Allers, Roger", "Chaplin, Charles", "Kaye, Tony", 
"Takahata, Isao", "Chazelle, Damien", "Scott, Ridley", "Nakache, Olivier", 
"Curtiz, Michael", "Tornatore, Giuseppe", "Kubrick, Stanley", 
"Wilder, Billy", "Stanton, Andrew", "Russo, Anthony", "Persichetti, Bob", 
"Chan-Wook, Park", "Phillips, Todd", "Shinkai, Makoto", "Unkrich, Lee", 
"Labaki, Nadine", "Petersen, Wolfgang", "Hirani, Rajkumar", "Lasseter, John", 
"Mendes, Sam", "Gibson, Mel", "Kail, Thomas", "Marquand, Richard", 
"Klimov, Elem", "Lang, Fritz", "Khan, Aamir", "Welles, Orson", 
"Vinterberg, Thomas", "Aronofsky, Darren", "Donen, Stanley", 
"Gondry, Michel", "Lean, David", "Tiwari, Nitesh", "Villeneuve, Denis", 
"Zeller, Florian", "Farhadi, Asghar", "Ray, Satyajit", "Ritchie, Guy", 
"Jeunet, Jean-Pierre", "Mulligan, Robert", "Docter, Pete", "Mann, Michael", 
"Hanson, Curtis", "McTiernan, John", "Gnanavel, T.J.", "Farrelly, Peter", 
"Hirschbiegel, Oliver", "Gilliam, Terry", "Eastwood, Clint", 
"Majidi, Majid", "Kramer, Stanley", "Sturges, John", "Huston, John", 
"Howard, Ron", "Coen, Ethan", "Carpenter, John", "Bergman, Ingmar", 
"McDonagh, Martin", "Pablos, Sergio", "Lynch, David", "Weir, Peter", 
"Reed, Carol", "McTeigue, James", "Boyle, Danny", "Coen, Joel", 
"O'Connor, Gavin", "Fleming, Victor", "Ozu, Yasujirô", "Kazan, Elia", 
"Irmak, Cagan", "Szifron, Damián", "Tarkovsky, Andrei", "Cimino, Michael", 
"Costa-Gavras, Costa-Gavras,", "Anderson, Wes", "Keaton, Buster", 
"Bruckman, Clyde", "Linklater, Richard", "Elliot, Adam", "Sheridan, Jim", 
"Abrahamson, Lenny", "Raghavan, Sriram", "Mangold, James", "McQueen, Steve", 
"Lubitsch, Ernst", "DeBlois, Dean", "Miller, George", "Wyler, William", 
"Yates, David", "Clouzot, Henri-Georges", "Reiner, Rob", "Kashyap, Anurag", 
"Rosenberg, Stuart", "Hallström, Lasse", "Kassovitz, Mathieu", 
"Truffaut, François", "Yamada, Naoko", "Stone, Oliver", "McCarthy, Tom", 
"Jones, Terry", "George, Terry", "Turgul, Yavuz", "Wong, Kar-Wai", 
"Penn, Sean", "Anno, Hideaki", "Pontecorvo, Gillo", "Fellini, Federico", 
"Wenders, Wim", "Kieslowski, Krzysztof", "Kumar, Ram", "Coppola, Francis Ford", 
"Joon Ho, Bong", "von Donnersmarck, Florian Henckel", "Van Sant, Gus", 
"De Sica, Vittorio", "Hill, George Roy", "De Palma, Brian", "Mankiewicz, Joseph L.", 
"Anderson, Paul Thomas", "del Toro, Guillermo", "Campanella, Juan José", 
"Shyamalan, M. Night", "Dreyer, Carl Theodor", "Avildsen, John G.", 
"Iñárritu, Alejandro G.")), row.names = c(NA, -154L), class = "data.frame")

A few things I have already tried either return an error or produce no result:

> names.ordered <- names.ordered[order(names.ordered$Directors)]
Error in `[.data.frame`(names.ordered, order(names.ordered$Directors)) : 
  undefined columns selected

> names.ordered <- names.ordered[order(1)] 

#after converting to list
> names.ordered <- sort(names.ordered)
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic

【Comments】:

    Tags: r sorting


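    【Solution 1】:

    Even if the data frame contains only one column, you still need to specify which column to sort by.

    If you want to keep the original order of names.ordered, use order() to create an index: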

    idx <- order(names.ordered$Directors)
    head(names.ordered)
               Directors
    1    Darabont, Frank
    2 Nolan, Christopher
    3      Lumet, Sidney
    4  Spielberg, Steven
    5     Jackson, Peter
    6 Tarantino, Quentin
    head(names.ordered[idx, ])
    # [1] "Abrahamson, Lenny"     "Allers, Roger"         "Anderson, Paul Thomas" "Anderson, Wes"         "Anno, Hideaki"         "Aronofsky, Darren" 
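The index approach can be sketched on a smaller scale. This is a minimal example that assumes a hypothetical three-row data frame called `directors` (not the asker's full data); it only illustrates that `order()` returns positions and leaves the original object untouched.

```r
# Hypothetical three-row stand-in for the asker's data frame
directors <- data.frame(
  Directors = c("Nolan, Christopher", "Darabont, Frank", "Lumet, Sidney"),
  stringsAsFactors = FALSE
)

# order() returns the row positions that would sort the column
idx <- order(directors$Directors)

# Use the index to view the names in sorted order
sorted_view <- directors$Directors[idx]

# The same idea works in reverse with decreasing = TRUE
reverse_view <- directors$Directors[order(directors$Directors, decreasing = TRUE)]

# The original data frame still has its original row order
first_original <- directors$Directors[1]
```

Because `idx` is just an integer vector, it can be stored and reused without modifying `directors`.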
    

    If you want to rearrange names.ordered itself, use sort():

    names.ordered$Directors <- sort(names.ordered$Directors)
    head(names.ordered$Directors)
    # [1] "Abrahamson, Lenny"     "Allers, Roger"         "Anderson, Paul Thomas" "Anderson, Wes"         "Anno, Hideaki"         "Aronofsky, Darren"    
    tail(names.ordered$Directors)
    # [1] "Wong, Kar-Wai"    "Wyler, William"   "Yamada, Naoko"    "Yates, David"     "Zeller, Florian"  "Zemeckis, Robert"
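One caveat is worth sketching: overwriting a column with `sort()` is safe here only because the data frame has a single column. The example below assumes a hypothetical `films` data frame with an added `Year` column (purely illustrative, not part of the question) to show how sorting one column in place breaks the row alignment, whereas indexing whole rows with `order()` keeps it.

```r
# Hypothetical two-column data frame; the Year column is an assumption
# added only to illustrate row alignment
films <- data.frame(
  Directors = c("Nolan, Christopher", "Darabont, Frank", "Lumet, Sidney"),
  Year = c(2008L, 1994L, 1957L),
  stringsAsFactors = FALSE
)

# Sorting a single column in place: Year stays where it was,
# so each name is now paired with the wrong year
broken <- films
broken$Directors <- sort(broken$Directors)

# Reordering whole rows keeps each name with its own year
aligned <- films[order(films$Directors), ]
```

With more than one column, the `order()` form is usually the safer choice.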
    

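    【Comments】:

      【Solution 2】:

      I think your main problem is that you were trying to reorder the columns. The syntax for extracting elements from a data frame is either x[i, j, ..., drop = TRUE] or x[j], where i refers to rows and j to columns. Note the comma, which is always required when referring to rows. Because you did not use a comma, R assumed you meant x[j] and that you wanted to reorder the columns. So put the order() call before the comma to sort by rows.

      Inside the order() call, just pass the vector whose ordering should be used to rearrange the data frame.

      A small complication is that you have only one column, which causes the result to be coerced to the lowest possible dimension (a vector in this case). To avoid this, there is the argument drop = FALSE: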
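The difference between the two indexing forms can be reproduced in a small sketch. It assumes a hypothetical three-row data frame `directors` and uses `tryCatch()` to capture the error message rather than stopping.

```r
# Hypothetical three-row stand-in for the asker's data frame
directors <- data.frame(
  Directors = c("Nolan, Christopher", "Darabont, Frank", "Lumet, Sidney"),
  stringsAsFactors = FALSE
)

idx <- order(directors$Directors)  # row positions in sorted order

# No comma: idx is treated as column numbers, but only column 1 exists
no_comma <- tryCatch(directors[idx], error = function(e) conditionMessage(e))

# With a comma: idx selects rows
with_comma <- directors[idx, ]

# order(1) is just 1, so this selects column 1 and changes nothing
unchanged <- directors[order(1)]
```

This matches both symptoms in the question: the "undefined columns selected" error, and the second attempt that ran without error but did not reorder anything.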

      names_ordered <- names.ordered[order(names.ordered$Directors), , drop = FALSE]
      
      head(names_ordered)
      #                 Directors
      # 110     Abrahamson, Lenny
      # 27          Allers, Roger
      # 148 Anderson, Paul Thomas
      # 104         Anderson, Wes
      # 134         Anno, Hideaki
      # 58      Aronofsky, Darren
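The effect of `drop` on the result type can be checked directly. The sketch below again assumes a hypothetical three-row `directors` data frame; resetting the row names at the end is an optional extra step, not something the answer requires.

```r
# Hypothetical three-row stand-in for the asker's data frame
directors <- data.frame(
  Directors = c("Nolan, Christopher", "Darabont, Frank", "Lumet, Sidney"),
  stringsAsFactors = FALSE
)

idx <- order(directors$Directors)

# Default behaviour: a single-column result collapses to a character vector
as_vector <- directors[idx, ]

# drop = FALSE keeps the data frame structure; row names follow the old positions
as_frame <- directors[idx, , drop = FALSE]

# Optional: renumber the rows 1..n after sorting
renumbered <- as_frame
rownames(renumbered) <- NULL
```

The sorted data frame keeps the original row names (here 2, 3, 1), which can be useful for tracing rows back to their original positions.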
      

      【Comments】:
