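The data used in this example can be downloaded from this link (extraction code: 3y90), and a description of the data is available on this web page. The "model tuning" part of this example follows the steps from this blog post.
Data preprocessing
- Import the data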
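import pandas as pd
import numpy as np
# sklearn.cross_validation was removed in scikit-learn 0.20;
# train_test_split now lives in sklearn.model_selection
from sklearn.model_selection import train_test_split

### Load data
data = pd.read_csv('data/loan/Train.csv', encoding="ISO-8859-1")

### Split the data to train and test sets
# 70% of the rows go to train; stratify keeps the share of each
# 'Disbursed' class the same in both sets
train, test = train_test_split(data, train_size=0.7, random_state=123, stratify=data['Disbursed'])

### Check number of nulls in each feature column
nulls_per_column = train.isnull().sum()
print(nulls_per_column)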
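To make the effect of the `stratify` argument concrete, here is a minimal sketch of the idea behind a stratified split, written with the standard library only. It is not scikit-learn's implementation: the helpers `stratified_split` and `positive_rate` and the toy rows are hypothetical and exist only for illustration. The toy data assumes a rare positive class, which is the situation where stratifying matters most.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, train_size=0.7, seed=123):
    """Split rows so each label keeps roughly the same share in both parts.

    Hypothetical helper for illustration, not the scikit-learn algorithm.
    """
    rng = random.Random(seed)

    # Group the rows by their label value
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    train, test = [], []
    for label_rows in by_label.values():
        # Shuffle a copy of each group, then cut it at the same ratio
        shuffled = label_rows[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_size)
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


def positive_rate(rows, label_key):
    """Fraction of rows whose label equals 1."""
    return sum(row[label_key] for row in rows) / len(rows)


# Toy data (an assumption, not the loan dataset): 20 positives in 1000 rows
toy_rows = [{"id": i, "Disbursed": 1 if i < 20 else 0} for i in range(1000)]
toy_train, toy_test = stratified_split(toy_rows, "Disbursed")

print(len(toy_train), len(toy_test))
print(positive_rate(toy_train, "Disbursed"), positive_rate(toy_test, "Disbursed"))
```

Because each class is cut separately at the same ratio, the positive rate in both parts matches the rate in the full toy data. A plain random split gives no such guarantee when one class is rare.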