The toolbox currently provides the following functions:
PREPROCESSING
Principal Component Analysis
pca = dsb_preprocessing.PCA(n_components)
pca = pca.fit(X)
Xt = pca.transform(X)
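For readers who want to see what a fit/transform pair like the one above typically computes, here is a minimal sketch of PCA in base MATLAB. It is not the toolbox's implementation. The SVD-based approach, the synthetic data, and the variable names `mu`, `W`, and `explained` are assumptions made for illustration.

```matlab
% Minimal PCA sketch in base MATLAB (illustrative, not dsb_preprocessing.PCA).
rng(0);                                        % reproducible synthetic data
X = randn(100, 3) * [2 0 0; 0 1 0; 0 0 0.1];   % 100 samples, 3 features
n_components = 2;                              % number of components to keep

% "fit": learn the mean and the principal axes from X
mu = mean(X, 1);                               % per-feature mean (1 x 3)
Xc = X - mu;                                   % center the data (implicit expansion, R2016b+)
[~, S, V] = svd(Xc, 'econ');                   % columns of V are the principal axes
W = V(:, 1:n_components);                      % projection matrix (features x components)

% "transform": project the centered data onto the retained axes
Xt = Xc * W;                                   % samples x components

% share of total variance carried by each component
explained = diag(S).^2 / sum(diag(S).^2);
```

New data would be projected with the stored `mu` and `W`, that is `(Xnew - mu) * W`, so that training and test samples share the same coordinate system.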
Min-Max Normalizer
scaler = dsb_preprocessing.MinMaxNormalizer()
scaler = scaler.fit(X)
Xt = scaler.transform(X)
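As a rough illustration of min-max normalization, the sketch below rescales each feature to the range [0, 1] in base MATLAB. It is a sketch under assumptions, not the toolbox's code: the target range, the handling of constant columns, and the names `xmin`, `xmax`, and `span` are hypothetical.

```matlab
% Minimal min-max normalization sketch (illustrative, not dsb_preprocessing.MinMaxNormalizer).
X = [1 10; 2 20; 4 40];          % 3 samples, 2 features

% "fit": record the per-feature minimum and maximum
xmin = min(X, [], 1);            % 1 x 2 row of column minima
xmax = max(X, [], 1);            % 1 x 2 row of column maxima
span = xmax - xmin;              % per-feature range
span(span == 0) = 1;             % assumption: leave constant columns at 0 instead of dividing by zero

% "transform": rescale every feature to [0, 1]
Xt = (X - xmin) ./ span;         % implicit expansion, R2016b+
```

Keeping `xmin` and `span` from the training data and reusing them on new samples avoids leaking test-set statistics into the scaling.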
Quantile Binning Transformation
b = dsb_preprocessing.Binning(n_bins)
b = b.fit(X)
Xt = b.transform(X)
Feature Selection Based on Information Gain
ig = dsb_preprocessing.InformationGain(n_features)
ig = ig.fit(X,Y)
Xr = ig.feature_selection(X)
UTILITIES
Simple or Stratified Random Sampling
[X,Xnew,Y,Ynew] = dsb_utilities.data_sampling(X,Y,0.30,'stratified')
Cross Validation
accuracy = cross_validation(mdl,X,Y,k)
Accuracy Classification Score
accuracy = dsb_utilities.accuracy_score(Ynew,Ypred)
Information Entropy
e = dsb_utilities.entropy(Y)
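The information entropy of a label vector can be sketched in a few lines of base MATLAB. This is an illustration only: the base-2 logarithm (entropy in bits) is an assumption, and the toolbox function may use a different base or input convention.

```matlab
% Minimal Shannon entropy sketch for a label vector (illustrative, not dsb_utilities.entropy).
Y = [1 1 1 2 2 3 3 3]';          % class labels, one per sample

[~, ~, idx] = unique(Y);         % map each label to an index 1..C
counts = accumarray(idx, 1);     % number of samples in each class
p = counts / numel(Y);           % empirical class probabilities
e = -sum(p .* log2(p));          % entropy in bits (assumed base 2)
```

Because `unique` only returns labels that actually occur, every entry of `p` is positive and the `log2(0)` case never arises.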
Quantile Analysis
Q = dsb_utilities.quantile(X,[0.25 0.50 0.75])
CLASSIFICATION
k-Nearest Neighbors
mdl = dsb_predictors.kNNeighbors(k,p)
mdl = mdl.fit(X,Y)
Ypred = mdl.predict(Xnew)
[indices,distances] = mdl.find(Xnew)
Naive Bayes
mdl = dsb_predictors.NaiveBayes('gaussian')
mdl = mdl.fit(X,Y)
Ypred = mdl.predict(Xnew)
[Ysorted,probabilities] = mdl.find(Xnew(1,:))
Decision Tree
mdl = dsb_predictors.DTree()
mdl = mdl.fit(X,Y)
Ypred = mdl.predict(Xnew)
REGRESSION
Linear Regression
reg = dsb_predictors.LinearRegression()
reg = reg.fit(X,Y)
Ypred = reg.predict(Xnew)
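The fit/predict pattern above can be illustrated with ordinary least squares in base MATLAB. This is a sketch under assumptions rather than the toolbox's method: the intercept column, the backslash solver, and the synthetic noise-free data are choices made for the example.

```matlab
% Minimal ordinary least squares sketch (illustrative, not dsb_predictors.LinearRegression).
rng(0);                                  % reproducible synthetic data
X = rand(50, 2);                         % 50 samples, 2 features
Y = 1 + 2*X(:,1) - 3*X(:,2);             % noise-free target with known coefficients

% "fit": solve the least squares problem with an intercept column
A = [ones(size(X, 1), 1) X];             % design matrix (samples x 3)
beta = A \ Y;                            % [intercept; slope_1; slope_2]

% "predict": apply the fitted coefficients to new samples
Xnew = [0.5 0.5];
Ypred = [ones(size(Xnew, 1), 1) Xnew] * beta;
```

The backslash operator solves the least squares problem through a QR factorization, which is generally better conditioned than forming the normal equations explicitly.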
CLUSTERING
k-Means
mdl = dsb_descriptors.kMeans(k)
mdl = mdl.fit(X)
Ypred = mdl.predict(Xnew)