[Posted]: 2017-02-08 22:37:54
[Question]:
I want to compute the cosine similarity between two vectors stored in a file, in the following format:
first_vector 1 2 3
second_vector 1 3 5
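... Each line is just the name of the vector followed by its elements, separated by single spaces. I have defined a function that should take each line as a separate list and then compute the similarity. My problem is that I don't know how to convert the two lines into two lists.
Here is my code: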
import math

def cosine_sim(vector1, vector2):
    # Accumulate the squared norms and the dot product in one pass.
    sum_of_x, sum_of_y, sum_of_xy = 0, 0, 0
    for i in range(len(vector1)):
        x = vector1[i]
        y = vector2[i]
        sum_of_x += x * x
        sum_of_y += y * y
        sum_of_xy += x * y
    return sum_of_xy / math.sqrt(sum_of_x * sum_of_y)

myfile = open("vectors", "r")
v1 = '#This should read the first line vector which is 1 2 3'
v2 = '#This should read the second line vector which is 1 3 5'
print("The similarity is", cosine_sim(v1, v2))
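One possible way to fill in the two placeholder lines is sketched below. It assumes every non-blank line has the form `name value value ...` with whitespace-separated numeric values, as in the sample above. The helper names `parse_vector_line` and `read_vectors` are made up for this illustration and are not part of the original code.

```python
import math


def parse_vector_line(line):
    """Split a line such as 'first_vector 1 2 3' into (name, [1.0, 2.0, 3.0])."""
    # The first token is the name; the remaining tokens are the elements.
    name, *values = line.split()
    return name, [float(v) for v in values]


def read_vectors(path):
    """Return a list of (name, values) pairs, one per non-blank line of the file."""
    with open(path) as f:
        return [parse_vector_line(line) for line in f if line.strip()]


def cosine_sim(vector1, vector2):
    """Cosine similarity of two equal-length lists of numbers."""
    dot = sum(x * y for x, y in zip(vector1, vector2))
    norm1 = math.sqrt(sum(x * x for x in vector1))
    norm2 = math.sqrt(sum(y * y for y in vector2))
    return dot / (norm1 * norm2)
```

With the file from the question, something like `(_, v1), (_, v2) = read_vectors("vectors")[:2]` followed by `print("The similarity is", cosine_sim(v1, v2))` should then work on real lists instead of strings. Converting with `float` rather than `int` is a choice made here so that non-integer elements are also accepted.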
[Discussion]:
Tags: python list file-io cosine-similarity