【Question title】: UnimplementedError: Cast string to float is not supported. ML with tensorflow
【Posted】: 2020-12-14 12:44:54
【Question】:

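I am trying to build a model with tensorflow to predict loan applications. Here is the link to download the dataset. I am using a LinearClassifier model for the predictions. The dataset is available on Kaggle and can be downloaded easily. TensorFlow version: 2.3.1. Python version: 3.7.3.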

import pandas as pd
import tensorflow as tf

train_df = pd.read_csv('Loan_train.csv')
test_df = pd.read_csv('Loan_test.csv')
train_df.head()

df = pd.concat([train_df, test_df])

df = df.fillna(method='ffill')
df = df.dropna()


CATEGORICAL_COLUMNS = ['Gender', 'Married', 'Education', 'Self_Employed','Dependents', 'Property_Area']
NUMERIC_COLUMNS = ['ApplicantIncome', 'CoapplicantIncome', 'LoanAmount', 'Loan_Amount_Term', 'Credit_History']

feature_columns = []
for feature_name in CATEGORICAL_COLUMNS:
    vocabulary = df[feature_name].unique()  # gets a list of all unique values from given feature column
    feature_columns.append(tf.feature_column.categorical_column_with_vocabulary_list(feature_name, vocabulary))

for feature_name in NUMERIC_COLUMNS:
    feature_columns.append(tf.feature_column.numeric_column(feature_name, dtype=tf.float64))

print(feature_columns)

# Split data into train and test datasets
from sklearn.model_selection import train_test_split
columns = CATEGORICAL_COLUMNS + NUMERIC_COLUMNS

X = df[columns]
Y = df['Loan_Status']
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.15, random_state=0)


def make_input_fn(data_df, label_df, num_epochs=10, shuffle=True, batch_size=32):
    def input_function():  # inner function, this will be returned
        ds = tf.data.Dataset.from_tensor_slices((dict(data_df), label_df))  # create tf.data.Dataset object with data and its label
        if shuffle:
            ds = ds.shuffle(1000)  # randomize order of data
        ds = ds.batch(batch_size).repeat(num_epochs)  # split dataset into batches of 32 and repeat process for number of epochs
        return ds  # return a batch of the dataset
    return input_function  # return a function object for use

train_input_fn = make_input_fn(X_train, y_train)  # here we will call the input_function that was returned to us to get a dataset object we can feed to the model
test_input_fn = make_input_fn(X_test, y_test, num_epochs=1, shuffle=False)


classifier = tf.estimator.LinearClassifier(
    feature_columns=feature_columns,
    # The model must choose between 2 classes.
    n_classes=2)

# training model
classifier.train(train_input_fn) 

Error message

UnimplementedError: Cast string to float is not supported
     [[node head/losses/Cast (defined at /home/dipeshkusrai/.local/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/head/binary_class_head.py:258) ]]

【Comments】:

  • Can you check this line >> column(feature_name, dtype=tf.float64)). I suspect credithistory is a string.
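
One way to follow up on this comment is to list which columns pandas does not treat as numeric. The sketch below uses a tiny made-up DataFrame in place of the real loan data (the values are illustrative assumptions, not taken from the dataset); the same check could be run on `df` from the question.

```python
import pandas as pd

# Made-up stand-in for the loan data; the real values are not shown in the question.
df = pd.DataFrame({
    "Credit_History": [1.0, 0.0, 1.0],
    "LoanAmount": [128.0, 66.0, 120.0],
    "Loan_Status": ["Y", "N", "Y"],
})

# Collect every column that pandas does not consider numeric.
non_numeric = [c for c in df.columns if not pd.api.types.is_numeric_dtype(df[c])]
print(non_numeric)
```

Any column that appears in this list and is also passed to `numeric_column`, or used as the label, is a candidate for the string-to-float cast failure.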

Tags: python pandas tensorflow


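【Solution 1】:

It looks like the column names in your input dataset are strings while the values in those columns are floats. Try skipping the first row with skiprows=1 when reading the CSV into a DataFrame.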
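
The traceback in the question points at `head/losses/Cast` in `binary_class_head.py`, which is the step where the estimator casts the labels to float, so the string values may be in the label column rather than in a feature. Below is a minimal sketch of converting the labels to integers before building the input function. It assumes `Loan_Status` holds the strings 'Y' and 'N'; the question does not show the actual values, so treat that mapping as hypothetical.

```python
import pandas as pd

# Hypothetical label values; the question does not show what Loan_Status contains.
labels = pd.Series(["Y", "N", "Y", "N"], name="Loan_Status")

# Map each string class to an integer id so the loss can cast it to float.
label_map = {"N": 0, "Y": 1}
y = labels.map(label_map).astype("int64")

print(y.tolist())
```

In the question's code this would correspond to something like `Y = df['Loan_Status'].map(label_map)` before the train/test split. Alternatively, `tf.estimator.LinearClassifier` accepts a `label_vocabulary` argument (for example `label_vocabulary=['N', 'Y']`), which should let string labels be passed through unchanged.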
