Python Normalization results in 1s instead of range 0-1

【Question title】: Python Normalization results in 1s instead of range 0-1
【Posted】: 2020-12-23 16:04:49
【Question】:

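I am trying to normalize my data with sklearn's preprocessing.normalize in Python, but every result ends up as 1 instead of falling in the range [0-1]. I guess it is a simple mistake, but I am new to Python. The difference is most visible in the average, which clearly shows that something is way off!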

Here is sample code that reproduces the problem:

import numpy as np
import pandas as pd
from sklearn import preprocessing

tmp = np.random.randint(0, 100, 1000) 

tmp_st = preprocessing.normalize(tmp.reshape(-1, 1))
print('min: ' + str(min(tmp_st)) + 
      ' | max: ' + str(max(tmp_st)) + 
      ' | avg: ' + str(sum(tmp_st) / len(tmp_st)) + 
      ' - min org: ' + str(min(tmp)) + 
      ' | max org: ' + str(max(tmp)) + 
      ' | avg org: ' + str(sum(tmp) / len(tmp)))
# min: 0.0 | max: 1.0 | avg: 0.99 - min org: 0 | max org: 99 | avg org: 50.156

I also tried doing it in a data frame:

df_tmp = pd.DataFrame({'tmp': tmp})
df_tmp['tmp_st'] = preprocessing.normalize(df_tmp[['tmp']])
print('min: ' + str(min(df_tmp['tmp_st'])) + 
      ' | max: ' + str(max(df_tmp['tmp_st'])) + 
      ' | avg: ' + str(sum(df_tmp['tmp_st']) / len(df_tmp['tmp_st'])) + 
      ' - min org: ' + str(min(df_tmp['tmp'])) + 
      ' | max org: ' + str(max(df_tmp['tmp'])) + 
      ' | avg org: ' + str(sum(df_tmp['tmp']) / len(df_tmp['tmp'])))
# min: 0.0 | max: 1.0 | avg: 0.99 - min org: 0 | max org: 99 | avg org: 50.156

【Comments】:

Tags: python scikit-learn normalization

【Solution 1】:

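By default, normalize rescales each row independently. But you are giving it a column vector, so each row holds only one value! With the default L2 norm, a row holding a single value x becomes x / |x|, so every nonzero entry turns into 1 and only zeros stay 0, which is why the average comes out at about 0.99. Try adding the axis=0 keyword argument to normalize by column instead.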
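
A minimal sketch of the difference between the two axes. The three-value column below is made up for illustration and is not the asker's data; it only shows what normalize does with its default L2 norm.

```python
import numpy as np
from sklearn import preprocessing

# Small column vector standing in for the question's data (illustrative values).
col = np.array([0.0, 3.0, 4.0]).reshape(-1, 1)

# Default axis=1: each row is divided by its own L2 norm.
# A one-element row x becomes x / |x|, so positive values turn into 1
# and an all-zero row is left at 0.
by_row = preprocessing.normalize(col)

# axis=0: the whole column is divided by its L2 norm, sqrt(0 + 9 + 16) = 5.
by_col = preprocessing.normalize(col, axis=0)

print(by_row.ravel())  # values 0, 1, 1
print(by_col.ravel())  # values 0, 0.6, 0.8
```

Note that with axis=0 the column ends up with unit length, which is not the same as spanning the range [0, 1].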

【Comments】:

I already tried axis = 0, but the results are still strange: min: [0.] | max: [0.05490838] | avg: [0.02733495] - min org: 0 | max org: 99 | avg org: 49.285. With axis = 1 it is the same as before: min: 0.0 | max: 1.0 | avg: 0.991 - min org: 0 | max org: 99 | avg org: 49.2851
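
The small maximum reported with axis = 0 is what unit-norm scaling would be expected to give: the column is divided by its L2 norm, so the largest entry is roughly 99 / sqrt(sum of squares), not 1. If the goal is to map the minimum to 0 and the maximum to 1 (an assumption about what the asker wants), sklearn's MinMaxScaler does that. A sketch under that assumption, with a fixed seed chosen here only for reproducibility:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, normalize

rng = np.random.default_rng(0)  # seed is an arbitrary choice for this sketch
tmp = rng.integers(0, 100, 1000).reshape(-1, 1).astype(float)

# axis=0 divides the column by its L2 norm, so the maximum equals
# tmp.max() / sqrt(sum(tmp**2)), which is small for 1000 values.
unit_norm = normalize(tmp, axis=0)
manual = tmp / np.sqrt((tmp ** 2).sum())

# Min-max scaling: (x - min) / (max - min), applied per column.
scaled = MinMaxScaler().fit_transform(tmp)

print('unit norm max:', unit_norm.max())
print('min-max range:', scaled.min(), scaled.max())
```

MinMaxScaler expects a 2D array, so the reshape(-1, 1) from the question is still needed for a single column.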