Factorize all columns by their levels, with the number of times each level occurs in the attributes of my data set: answers

【Question title】: Factorize all columns by their levels, with how many times they occur in the attributes of my data set
【Posted】: 2025-12-07 14:30:01
【Question description】:

This is my data set. I want to fully factorize it and count the occurrences of each level of every attribute in the file. This is my code:

    library(dplyr)

    # read the file
    h_Data <- read.csv(file.choose())
    # keep only the University attribute
    h_Data <- h_Data$University

    # count the occurrences of each level
    h_DataDF <- data.frame(h_Data)
    h_dataLevels <- h_DataDF %>%
      group_by(h_Data) %>%
      summarise(no_rows = length(h_Data))
    h_dataLevels

    # number of missing values
    h_DataMissing <- sum(is.na(h_Data))
    h_DataMissing

    # percentage of each level
    h_DataPer <- prop.table(table(h_Data)) * 100

    # combine everything into one table
    h_DataTable <- data.frame(levels_data = h_dataLevels, levels_perc = h_DataPer, missing_data = h_DataMissing)
    h_DataTable

I want a summary like this:

    levels_University  no.of_timesLevels  Percentage_of_Level  MissingAttributes
    IBA                4                  57.14                0
    KU                 1                  14.28                0
    UIT                2                  28.57                0

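【Question discussion】:

Please make this question reproducible. That includes sample code (listing any non-base R packages), sample data (e.g., dput(head(x))), and the expected output. References: *.com/questions/5963269, *.com/help/mcve, and *.com/tags/r/info. Since you mention a "file", consider including its first "n" lines, where "n" balances relevance, sufficiency, and compactness.
First of all, the title should be a very short summary of the problem, not the question itself...

Tags: r machine-learning deep-learning analytics data-mining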

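【Solution 1】:

Without some sample data and the desired output it is hard to know exactly what you want, but here is some code that takes a data frame and, for every column that is a factor, returns a data frame listing the number of observations for each factor level.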

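    ## packages: dplyr for %>%, purrr for map()/compact(), forcats for fct_count()
    library(dplyr)
    library(purrr)
    library(forcats)

    ## dummy data
    ## stringsAsFactors = TRUE is needed in R >= 4.0, where strings
    ## are no longer converted to factors by default
    df <- data.frame(Sex = c("m", "f", "m", "f"),
                     department = c("bs", "el", "bs", "se"),
                     numbers = c(1, 2, 3, 4),
                     stringsAsFactors = TRUE)

    ## function that takes a column of data
    ## and returns factor counts if the column is a factor
    countFactors <- function(col) {
      if (is.factor(col)) {
        fct_count(col)
      } else {
        NULL
      }
    }

    ## use purrr::map to iterate through the columns of the
    ## data frame and apply the function; compact() drops the
    ## NULL results from non-factor columns
    df %>%
      map(~ countFactors(.)) %>%
      compact()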
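
If you also want the percentage of each level and the number of missing values, as in the table the question asks for, something along these lines might work. This is only a sketch in base R: `summariseLevels`, the `example` data frame, and the output column names are made up for illustration, and I am assuming percentages should be computed over non-missing values (which is what `prop.table(table(x))` does in the question).

```r
## Hypothetical helper: summarise one factor column as
## level / count / percentage / number of missing values.
summariseLevels <- function(col) {
  counts <- table(col)  # NA values are excluded by default
  data.frame(
    level      = names(counts),
    n          = as.vector(counts),
    percentage = round(100 * as.vector(counts) / sum(counts), 2),
    missing    = sum(is.na(col))  # the same value is repeated on every row
  )
}

## Illustrative data (not from the question), with one missing value
example <- data.frame(
  department = c("bs", "el", "bs", "se", NA),
  numbers    = 1:5,
  stringsAsFactors = TRUE
)

## Keep only the factor columns, then summarise each one;
## the result is a named list with one data frame per factor column
factor_cols <- Filter(is.factor, example)
result <- lapply(factor_cols, summariseLevels)
result$department
```

If the percentages should be relative to all rows, including the missing ones, divide by `length(col)` instead of `sum(counts)`.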

【Discussion】: