[Source Code] DSciBox: A Data Science Toolbox with Tools for Data Preprocessing, Classification, Regression, and Clustering

Currently available functions:

PREPROCESSING

Principal Component Analysis

pca = dsb_preprocessing.PCA(n_components)

pca = pca.fit(X)

Xt = pca.transform(X)
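
The listing does not show how the fit/transform pair works internally. As a rough, toolbox-independent sketch, PCA is commonly computed by centering the data and taking an SVD. The snippet below uses only base MATLAB and assumes rows of X are observations; the toy data and the variable names (W, explained) are illustrative and not part of DSciBox.

```matlab
% Illustrative PCA via SVD in base MATLAB (not DSciBox's implementation).
% Assumes rows of X are observations and columns are features.
X = [2 0; 0 2; 4 4; 6 6; 8 10];      % toy data: 5 observations x 2 features
n_components = 1;                    % number of components to keep

mu = mean(X, 1);                     % per-feature mean
Xc = X - mu;                         % center (implicit expansion, R2016b or later)
[~, S, V] = svd(Xc, 'econ');         % columns of V are the principal directions
W = V(:, 1:n_components);            % projection matrix
Xt = Xc * W;                         % projected data, 5 x n_components

% Fraction of total variance captured by the retained components
s2 = diag(S).^2;
explained = sum(s2(1:n_components)) / sum(s2);
```

Keeping mu and W around is what makes it possible to apply the same projection to new data later, which is presumably the role of the fitted pca object above.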

Min-Max Normalizer

scaler = dsb_preprocessing.MinMaxNormalizer()

scaler = scaler.fit(X)

Xt = scaler.transform(X)
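
For reference, min-max normalization rescales each feature to the interval [0, 1] using the minimum and maximum seen during fitting. A minimal base-MATLAB sketch follows; it is an assumption about what the scaler computes, not code taken from the toolbox, and the guard for constant columns is an illustrative choice.

```matlab
% Illustrative min-max scaling per column (not DSciBox's implementation).
X = [1 10; 2 20; 3 40];              % toy data: 3 observations x 2 features

xmin = min(X, [], 1);                % per-feature minimum ("fit" step)
xmax = max(X, [], 1);                % per-feature maximum ("fit" step)
span = xmax - xmin;
span(span == 0) = 1;                 % avoid division by zero for constant columns

Xt = (X - xmin) ./ span;             % "transform" step, each column now in [0, 1]
```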

Quantile Binning Transformation

b = dsb_preprocessing.Binning(n_bins)

b = b.fit(X)

Xt = b.transform(X)
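
Quantile (equal-frequency) binning assigns each value to one of n_bins groups holding roughly the same number of observations. One simple way to get that effect, sketched here in base MATLAB for a single feature, is to bin by rank. This is an assumed approach for illustration; the toolbox may instead compute explicit cut points, and ties are not treated specially here.

```matlab
% Illustrative rank-based equal-frequency binning (not DSciBox's implementation).
x = [7; 1; 5; 3; 9; 11; 13; 15];     % one feature, 8 observations
n_bins = 4;

n = numel(x);
[~, order] = sort(x);                % order(i) is the index of the i-th smallest value
ranks = zeros(n, 1);
ranks(order) = (1:n)';               % rank of each observation, 1..n (ties broken by position)

bins = ceil(ranks * n_bins / n);     % bin index in 1..n_bins, about n/n_bins values per bin
```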

Feature Selection Based on Information Gain

ig = dsb_preprocessing.InformationGain(n_features)

ig = ig.fit(X,Y)

Xr = ig.feature_selection(X)
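
Information gain scores a feature by how much knowing it reduces the entropy of the labels: IG(Y, x) = H(Y) - sum over values v of P(x = v) * H(Y | x = v). The sketch below computes it for one discrete feature in base MATLAB. The helper H and the toy data are illustrative; how DSciBox ranks features or handles continuous inputs is not stated in the listing.

```matlab
% Illustrative information gain for one discrete feature (not DSciBox's implementation).
x = [1; 1; 2; 2; 3; 3];              % discrete feature values
y = [0; 0; 1; 1; 0; 1];              % class labels

% Shannon entropy (in bits) of a label vector
H = @(v) -sum(arrayfun(@(c) mean(v == c) * log2(mean(v == c)), unique(v)));

ig = H(y);                           % start from the entropy of the labels
vals = unique(x);
for i = 1:numel(vals)
    mask = (x == vals(i));
    ig = ig - mean(mask) * H(y(mask));   % subtract the weighted conditional entropy
end
```

Here H(y) is 1 bit, only the group x = 3 remains mixed, and ig comes out to 2/3 bit. Selecting the n_features columns with the highest score would then give a reduced matrix like Xr.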

UTILITIES

Simple or Stratified Random Sampling

[X,Xnew,Y,Ynew] = dsb_utilities.data_sampling(X,Y,0.30,'stratified')
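
Stratified sampling draws the same fraction from every class so that the split preserves class proportions. The base-MATLAB sketch below assumes the 0.30 in the call above is the held-out fraction, which the listing does not state; the variable names and the rounding rule are illustrative.

```matlab
% Illustrative stratified hold-out split (not DSciBox's implementation).
rng(0);                              % fixed seed so the split is reproducible
X = (1:10)';                         % toy features
Y = [1; 1; 1; 1; 1; 1; 2; 2; 2; 2];  % 6 observations of class 1, 4 of class 2
test_frac = 0.30;                    % assumed meaning: fraction held out per class

is_test = false(size(Y));
classes = unique(Y);
for i = 1:numel(classes)
    idx = find(Y == classes(i));             % members of this class
    idx = idx(randperm(numel(idx)));         % shuffle within the class
    n_test = round(test_frac * numel(idx));  % per-class hold-out size
    is_test(idx(1:n_test)) = true;
end

Xtrain = X(~is_test, :);  Ytrain = Y(~is_test);
Xtest  = X(is_test, :);   Ytest  = Y(is_test);
```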

Cross Validation

accuracy = cross_validation(mdl,X,Y,k)
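
In k-fold cross validation the data is split into k folds, each fold serves once as the test set, and the per-fold accuracies are averaged. The sketch below shows that loop in base MATLAB, assuming k in the call above is the number of folds. A nearest-centroid rule stands in for mdl purely for illustration; it is not one of the toolbox's models.

```matlab
% Illustrative k-fold cross validation (not DSciBox's implementation).
rng(1);                                      % reproducible toy data and folds
X = [randn(20, 2); randn(20, 2) + 4];        % two well-separated clusters
Y = [ones(20, 1); 2 * ones(20, 1)];
k = 5;

n = size(X, 1);
fold = mod(randperm(n)', k) + 1;             % random fold label 1..k per observation
acc = zeros(k, 1);
for f = 1:k
    te = (fold == f);                        % test mask for this fold
    tr = ~te;                                % training mask
    % Stand-in model: assign each test point to the nearest class centroid
    c1 = mean(X(tr & Y == 1, :), 1);
    c2 = mean(X(tr & Y == 2, :), 1);
    d1 = sum((X(te, :) - c1).^2, 2);
    d2 = sum((X(te, :) - c2).^2, 2);
    Ypred = 1 + (d2 < d1);                   % class 2 where centroid 2 is closer
    acc(f) = mean(Ypred == Y(te));           % accuracy = fraction of correct predictions
end
accuracy = mean(acc);                        % average over the k folds
```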

Accuracy Classification Score

accuracy = dsb_utilities.accuracy_score(Ynew,Ypred)

Information Entropy

e = dsb_utilities.entropy(Y)

Quantile Analysis

Q = dsb_utilities.quantile(X,[0.25 0.50 0.75])

CLASSIFICATION

k-Nearest Neighbors

mdl = dsb_predictors.kNNeighbors(k,p)

mdl = mdl.fit(X,Y)

Ypred = mdl.predict(Xnew)

[indices,distances] = mdl.find(Xnew)
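
A k-nearest-neighbors classifier predicts the majority label among the k training points closest to a query. The sketch below assumes that p in the constructor is the order of a Minkowski distance (p = 2 being Euclidean), which the listing does not confirm. It uses base MATLAB only and is not the toolbox's code.

```matlab
% Illustrative k-NN classification with Minkowski distance (not DSciBox's implementation).
X = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];  % training features
Y = [1; 1; 1; 2; 2; 2];              % training labels
Xnew = [0.5 0.5; 5.5 5.5];           % query points
k = 3;                               % number of neighbors
p = 2;                               % assumed: Minkowski order (2 = Euclidean)

Ypred = zeros(size(Xnew, 1), 1);
for i = 1:size(Xnew, 1)
    d = sum(abs(X - Xnew(i, :)).^p, 2).^(1/p);   % distance to every training point
    [~, order] = sort(d);                        % nearest first
    Ypred(i) = mode(Y(order(1:k)));              % majority vote among the k nearest
end
```

The sorted indices and distances computed inside the loop are the kind of information a neighbor lookup such as mdl.find would be expected to return.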

Naive Bayes

mdl = dsb_predictors.NaiveBayes('gaussian')

mdl = mdl.fit(X,Y)

Ypred = mdl.predict(Xnew)

[Ysorted,probabilities] = mdl.find(Xnew(1,:))
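
Gaussian naive Bayes models each feature within a class as an independent normal distribution and ranks classes by posterior probability. The base-MATLAB sketch below scores a single query point; reading the output pair above as "classes sorted by posterior, with their probabilities" is an assumption, and none of this is taken from the toolbox source.

```matlab
% Illustrative Gaussian naive Bayes for one query point (not DSciBox's implementation).
X = [1.0 2.1; 1.2 1.9; 0.8 2.0; 3.0 4.1; 3.2 3.9; 2.8 4.0];
Y = [1; 1; 1; 2; 2; 2];
xnew = [1.1 2.0];                            % query point (row vector)

classes = unique(Y);
logpost = zeros(numel(classes), 1);
for c = 1:numel(classes)
    Xc = X(Y == classes(c), :);
    mu = mean(Xc, 1);                        % per-feature mean for this class
    s2 = var(Xc, 0, 1);                      % per-feature variance for this class
    prior = size(Xc, 1) / size(X, 1);        % class prior from class frequency
    % Sum of per-feature Gaussian log-densities (independence assumption)
    loglik = -0.5 * sum(log(2 * pi * s2) + (xnew - mu).^2 ./ s2);
    logpost(c) = log(prior) + loglik;
end

post = exp(logpost - max(logpost));          % subtract the max for numerical stability
post = post / sum(post);                     % normalize to probabilities
[probabilities, order] = sort(post, 'descend');
Ysorted = classes(order);                    % most probable class first
```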

Decision Tree

mdl = dsb_predictors.DTree()

mdl = mdl.fit(X,Y)

Ypred = mdl.predict(Xnew)

REGRESSION

Linear Regression

reg = dsb_predictors.LinearRegression()

reg = reg.fit(X,Y)

Ypred = reg.predict(Xnew)
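
Ordinary least squares is the usual technique behind a fit/predict pair like this one. A minimal base-MATLAB sketch follows, assuming an intercept term is included; whether DSciBox adds one automatically is not stated in the listing.

```matlab
% Illustrative ordinary least squares with an intercept (not DSciBox's implementation).
X = [1; 2; 3; 4];                            % one feature, 4 observations
Y = [3; 5; 7; 9];                            % generated from y = 2x + 1

A = [ones(size(X, 1), 1) X];                 % design matrix with an intercept column
beta = A \ Y;                                % least-squares solution: [intercept; slope]

Xnew = [5; 6];
Ypred = [ones(size(Xnew, 1), 1) Xnew] * beta;   % predictions for the new inputs
```

The backslash operator solves the least-squares problem directly, which is generally preferable to forming inv(A'*A) explicitly.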

CLUSTERING

k-Means

mdl = dsb_descriptors.kMeans(k)

mdl = mdl.fit(X)

Ypred = mdl.predict(Xnew)
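
k-means is typically run as Lloyd's algorithm: assign each point to its nearest centroid, recompute the centroids, and repeat until nothing changes. The base-MATLAB sketch below uses fixed initial centroids so the result is deterministic; that choice, the iteration cap, and the variable names are illustrative and not taken from the toolbox.

```matlab
% Illustrative k-means via Lloyd's algorithm (not DSciBox's implementation).
X = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
k = 2;
C = X([1 4], :);                     % fixed initial centroids (random init is more usual)

for it = 1:100                       % iteration cap as a safety net
    % Assignment step: squared distance from every point to every centroid
    D = zeros(size(X, 1), k);
    for j = 1:k
        D(:, j) = sum((X - C(j, :)).^2, 2);
    end
    [~, labels] = min(D, [], 2);

    % Update step: move each centroid to the mean of its assigned points
    Cnew = C;
    for j = 1:k
        if any(labels == j)
            Cnew(j, :) = mean(X(labels == j, :), 1);
        end
    end

    if isequal(Cnew, C), break; end  % converged: centroids stopped moving
    C = Cnew;
end
```

Predicting for new points then amounts to repeating the assignment step against the final centroids C.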
