【Question Title】: "LightGBMError: Do not support special JSON characters in feature name"
【Posted】: 2023-01-26 12:13:48
【Question Description】:

My X data is a time-series pandas DataFrame. I use tsfresh to extract features from X and then try to apply LightGBM to classify the data as 0 (bad) or 1 (good), but it raises an error. The columns of my X data are:

Index(['0__ratio_beyond_r_sigma__r_1',
       '0__change_quantiles__f_agg_"mean"__isabs_True__qh_0.8__ql_0.0',
       '0__cwt_coefficients__coeff_1__w_20__widths_(2, 5, 10, 20)',
       '0__cwt_coefficients__coeff_1__w_10__widths_(2, 5, 10, 20)',
       '0__change_quantiles__f_agg_"var"__isabs_False__qh_0.8__ql_0.0',
       '0__change_quantiles__f_agg_"mean"__isabs_True__qh_0.4__ql_0.0',
       '0__change_quantiles__f_agg_"mean"__isabs_True__qh_0.8__ql_0.6',
       '0__change_quantiles__f_agg_"mean"__isabs_False__qh_0.4__ql_0.0',
       '0__fft_coefficient__attr_"real"__coeff_3',
       '0__change_quantiles__f_agg_"mean"__isabs_True__qh_1.0__ql_0.0',
       ...
       '0__quantile__q_0.4', '0__fft_coefficient__attr_"imag"__coeff_39',
       '0__large_standard_deviation__r_0.2',
       '0__cwt_coefficients__coeff_13__w_10__widths_(2, 5, 10, 20)',
       '0__fourier_entropy__bins_10',
       '0__fft_coefficient__attr_"angle"__coeff_9',
       '0__fft_coefficient__attr_"imag"__coeff_17',
       '0__fft_coefficient__attr_"angle"__coeff_92', '0__maximum',
       '0__fft_coefficient__attr_"imag"__coeff_32'],
      dtype='object', length=225)

My code is:

import lightgbm as lgb
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Wrap the training data in a LightGBM Dataset
d_train = lgb.Dataset(X_train, label=y_train)

lgbm_params = {'learning_rate': 0.05,
               'boosting_type': 'dart',
               'objective': 'binary',
               'metric': ['auc', 'binary_logloss'],
               'num_leaves': 100,
               'max_depth': 10}

# Train for 50 boosting rounds
clf = lgb.train(lgbm_params, d_train, 50)

# predict() returns probabilities for the binary objective; threshold them at 0.5
y_pred_lgbm = (clf.predict(X_test) >= 0.5).astype(int)

cm_lgbm = confusion_matrix(y_test, y_pred_lgbm)
sns.heatmap(cm_lgbm, annot=True)

I tried the following code to change my column names, but it did not work:

import re
X = X.rename(columns = lambda u:re.sub('[^A-Za-z0-9_]+', '', u))

After applying that rename, the columns look like this:

Index(['0__ratio_beyond_r_sigma__r_1',
       '0__change_quantiles__f_agg_mean__isabs_True__qh_08__ql_00',
       '0__cwt_coefficients__coeff_1__w_20__widths_251020',
       '0__cwt_coefficients__coeff_1__w_10__widths_251020',
       '0__change_quantiles__f_agg_var__isabs_False__qh_08__ql_00',
       '0__change_quantiles__f_agg_mean__isabs_True__qh_04__ql_00',
       '0__change_quantiles__f_agg_mean__isabs_True__qh_08__ql_06',
       '0__change_quantiles__f_agg_mean__isabs_False__qh_04__ql_00',
       '0__fft_coefficient__attr_real__coeff_3',
       '0__change_quantiles__f_agg_mean__isabs_True__qh_10__ql_00',
       ...
       '0__quantile__q_04', '0__fft_coefficient__attr_imag__coeff_39',
       '0__large_standard_deviation__r_02',
       '0__cwt_coefficients__coeff_13__w_10__widths_251020',
       '0__fourier_entropy__bins_10',
       '0__fft_coefficient__attr_angle__coeff_9',
       '0__fft_coefficient__attr_imag__coeff_17',
       '0__fft_coefficient__attr_angle__coeff_92', '0__maximum',
       '0__fft_coefficient__attr_imag__coeff_32'],
      dtype='object', length=225)

What should I do to get rid of this error?
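One thing worth checking: the rename in the question is applied to `X`, while the training code uses `X_train` and `X_test`. If those were split off before the rename, they would still carry the original tsfresh names. The sketch below is one possible way to clean the names; `sanitize_feature_names` is a hypothetical helper (not part of LightGBM, tsfresh, or the original post). It replaces disallowed characters with `_` instead of deleting them, and it adds a suffix when two cleaned names collide, since stripping characters can map different features (for example `q_0.4` and `q_04`) to the same name.

```python
import re

def sanitize_feature_names(columns):
    """Hypothetical helper: replace each run of characters outside
    [A-Za-z0-9_] with '_' and keep the resulting names unique."""
    used = set()
    cleaned = []
    for name in columns:
        base = re.sub(r'[^A-Za-z0-9_]+', '_', str(name))
        candidate, suffix = base, 1
        # Append a suffix until the cleaned name no longer collides
        while candidate in used:
            candidate = f'{base}_dup{suffix}'
            suffix += 1
        used.add(candidate)
        cleaned.append(candidate)
    return cleaned

names = ['0__quantile__q_0.4',
         '0__quantile__q_04',
         '0__fft_coefficient__attr_"imag"__coeff_39']
cleaned_names = sanitize_feature_names(names)
print(cleaned_names)

# Assumed usage with the DataFrames from the question, applied to the
# frames that are actually passed to LightGBM:
# X_train.columns = sanitize_feature_names(X_train.columns)
# X_test.columns = sanitize_feature_names(X_test.columns)
```

Applying the same function to both frames should keep the train and test column names aligned, provided both frames have the same columns in the same order.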

【Comments】:

    Tags: machine-learning time-series lightgbm


    【Solution 1】:

    You cannot put symbols such as '_' in the column names; otherwise lgb will report this kind of error.
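To see which column names are likely to trigger the error, a quick check along these lines may help. This is only a sketch: the character set in `FORBIDDEN_CHARS` is an assumption based on my reading of recent LightGBM releases (the JSON special characters `"`, `,`, `:`, `[`, `]`, `{`, `}`), and `find_offending_columns` is a hypothetical helper, not a LightGBM function. Under that assumption, underscores on their own would not be flagged, whereas the quotes and commas that tsfresh puts into names would be.

```python
# Assumed set of characters rejected in feature names (not taken from the post)
FORBIDDEN_CHARS = set('",:[]{}')

def find_offending_columns(columns):
    """Hypothetical helper: map each offending column name to the sorted
    list of assumed-forbidden characters it contains."""
    offending = {}
    for name in columns:
        bad = sorted(set(str(name)) & FORBIDDEN_CHARS)
        if bad:
            offending[name] = bad
    return offending

sample = [
    '0__ratio_beyond_r_sigma__r_1',
    '0__change_quantiles__f_agg_"mean"__isabs_True__qh_0.8__ql_0.0',
    '0__cwt_coefficients__coeff_1__w_20__widths_(2, 5, 10, 20)',
]
report = find_offending_columns(sample)
for name, bad in report.items():
    print(name, bad)

# Assumed usage with the DataFrame from the question:
# print(find_offending_columns(X_train.columns))
```

An empty result for the frame passed to `lgb.Dataset` would suggest the names are no longer the cause.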

    【Comments】:
