Creating Network Graphs: Answers

【Question title】: Creating Network Graphs
【Posted】: 2016-03-06 18:44:00
【Question description】:

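My sample dataset, in CSV format, looks like this.

The undirected graph has 90 nodes, labeled with the numbers {10, 11, 12, ..., 99}, and its weighted edges are defined as follows.

[Sample data]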

node1 node2 weight
23     89    34.9  (i.e. there is an edge between node 23 and 89 with weight 34.9)
75     14    28.5
so on....

I want to represent it as a network. What is an effective way to do this (e.g. Gephi, networkx, etc.)? The thickness of each edge should represent its weight.

【Question discussion】:

Tags: graph social-networking

【Solution 1】:

With networkx, you can add edges that carry attributes:

import networkx as nx
G = nx.Graph()
G.add_edge(23, 89, weight=34.9)
G.add_edge(75, 14, weight=28.5)
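
For more than a handful of edges, typing each add_edge call by hand gets tedious. A minimal sketch, assuming the file is whitespace-separated with one header line as in the sample data (the helper name read_weighted_edges and the inline sample text are hypothetical), collects the rows as (node1, node2, weight) tuples using only the standard library:

```python
import io

def read_weighted_edges(handle):
    """Return a list of (node1, node2, weight) tuples from an open text file.

    Assumes whitespace-separated columns with one header line, as in the
    sample data; anything after the third column on a line is ignored.
    """
    edges = []
    next(handle, None)  # skip the "node1 node2 weight" header line
    for line in handle:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or incomplete lines
        edges.append((int(parts[0]), int(parts[1]), float(parts[2])))
    return edges

# Inline stand-in for the real file so the sketch runs on its own.
sample = io.StringIO("node1 node2 weight\n23 89 34.9\n75 14 28.5\n")
edges = read_weighted_edges(sample)
print(edges)
```

The resulting list can be passed to G.add_weighted_edges_from(edges), which stores the third element of each tuple under the weight attribute.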

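【Discussion】:

【Solution 2】:

If your CSV file is large, I recommend using pandas for the I/O part of the task. networkx has a useful method for interfacing with pandas called from_pandas_dataframe (later networkx releases renamed it from_pandas_edgelist). Assuming your data is a CSV in the format above, this command should work for you: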

# read_csv has no `columns` argument; header=0 with names= replaces the file's header row
df = pd.read_csv('path/to/file.csv', header=0, names=['node1', 'node2', 'weight'])

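For demonstration, though, I will use 10 random edges within your requirements (you do not need to import numpy; I only use it to generate random numbers):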

import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

#Generate Random edges and weights
import numpy as np
np.random.seed(0) # for reproducibility

w = np.random.rand(10) # weights 0-1
node1 = np.random.randint(10,19, (10))  # node ids 10-18 for the demo (upper bound 19 is exclusive)
node2 = np.random.randint(10,19, (10))
df = pd.DataFrame({'node1': node1, 'node2': node2, 'weight': w}, index=range(10))

Everything in the previous block should produce the same kind of result as your pd.read_csv command, namely this DataFrame, df:

   node1  node2    weight
0     16     13  0.548814
1     17     15  0.715189
2     17     10  0.602763
3     18     12  0.544883
4     11     13  0.423655
5     15     18  0.645894
6     18     11  0.437587
7     14     13  0.891773
8     13     13  0.963663
9     10     13  0.383442

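Use from_pandas_dataframe to initialize a MultiGraph. This assumes that a pair of nodes may be connected by more than one edge (the OP does not say). To use this method, you have to make a simple change to the networkx source code in the convert_matrix.py file, implemented here (it is a simple bug).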

MG = nx.from_pandas_dataframe(df,
                              'node1',
                              'node2',
                              edge_attr='weight',
                              create_using=nx.MultiGraph()
                              )

This generates your MultiGraph, which you can visualize with draw:

positions = nx.spring_layout(MG) # saves the positions of the nodes on the visualization
# pass positions and set hold=True
nx.draw(MG, pos=positions, hold=True, with_labels=True, node_size=1000, font_size=16)

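In detail: positions is a dictionary in which each node is a key and the value is that node's position on the plot. I explain below why we store positions. The generic draw call draws your MultiGraph instance MG with its nodes at the specified positions. As you can see, however, the edges all have the same width: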

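But you have everything you need to add the weights. First, put the weights into a list called weights. By iterating over every edge with edges (using a list comprehension), we can extract the weights. I chose to multiply by 5 because that looked cleanest: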

weights = [data['weight'] * 5 for _, _, data in MG.edges(data=True)]

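Finally we use draw_networkx_edges, which draws only the edges of the graph (no nodes). Because we have the positions of the nodes and we set hold=True, we can draw the weighted edges on top of the previous visualization.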

nx.draw_networkx_edges(MG, pos=positions, width=weights) #width can be array of floats

You can see that the edge (14, 13) has the heaviest line, and it has the largest value in the DataFrame df (apart from the self-loop (13, 13)).
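
The fixed factor of 5 suits weights between 0 and 1, but the weights in the question (such as 34.9) would give very thick lines. One alternative, sketched here under the assumption that a linear rescale is acceptable (scale_widths and its min_width and max_width defaults are hypothetical names and values, not part of networkx), maps any weight range onto a chosen range of line widths:

```python
def scale_widths(weights, min_width=0.5, max_width=5.0):
    """Linearly map weights onto the interval [min_width, max_width]."""
    weights = list(weights)
    if not weights:
        return []
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights are equal: use the midpoint width for every edge.
        return [(min_width + max_width) / 2.0] * len(weights)
    span = hi - lo
    return [min_width + (w - lo) / span * (max_width - min_width) for w in weights]

# The lightest edge gets min_width, the heaviest gets max_width.
widths = scale_widths([34.9, 28.5, 12.0])
print(widths)
```

The returned list keeps the input order, so it can be passed as width= to nx.draw_networkx_edges in place of weights, provided the weights were read in the order given by MG.edges(data=True).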

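【Discussion】:

With nx.MultiGraph() I get this error: TypeError: unhashable type: 'dict'
It should work if you made the change described in the paragraph before that code block. See also another link to the SO question and the GH issue. In addition, if you remove the create_using argument entirely it will work, provided you know that your graph is a Graph rather than a MultiGraph.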
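
Whether create_using=nx.MultiGraph() is needed depends on whether any pair of nodes is connected more than once. A rough check, assuming the edges are available as (node1, node2) pairs and the graph is undirected (has_parallel_edges is a hypothetical helper, not a networkx function):

```python
from collections import Counter

def has_parallel_edges(pairs):
    """Return True if any unordered node pair appears more than once."""
    # frozenset makes (75, 14) and (14, 75) count as the same undirected edge;
    # a self-loop such as (13, 13) becomes frozenset({13}).
    counts = Counter(frozenset(pair) for pair in pairs)
    return any(n > 1 for n in counts.values())

# Node pairs from the demo DataFrame in Solution 2: no pair repeats.
demo_pairs = [(16, 13), (17, 15), (17, 10), (18, 12), (11, 13),
              (15, 18), (18, 11), (14, 13), (13, 13), (10, 13)]
demo_result = has_parallel_edges(demo_pairs)

# The same two nodes listed in both orders count as a repeated edge.
reversed_result = has_parallel_edges([(75, 14), (14, 75)])
print(demo_result, reversed_result)
```

If the check returns False, a plain nx.Graph can hold every edge. If it returns True, a Graph would keep only one edge per node pair, with the attributes of the last one added.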

【Solution 3】:

If you are on Linux, and assuming your CSV file looks like this (for example):

23;89;3.49
23;14;1.29
75;14;2.85
14;75;2.9
75;23;0.9
23;27;4.9

you can use this program:

import os

def build_G(csv_file):

    # graph stored as a dict of dicts: g[origin][destination] = weight
    g = {}

    # read the whole csv file
    with open(csv_file, 'r') as f:
        cont = f.read()

    # parse one "origin;destination;weight" record per non-empty line
    for line in cont.split('\n'):
        if line != '':

            fields = line.split(';')

            # build origin node (dict.has_key no longer exists in Python 3)
            if fields[0] not in g:
                g[fields[0]] = {}

            # build destination node
            if fields[1] not in g:
                g[fields[1]] = {}

            # build edge origin>destination (the first weight seen is kept)
            if fields[1] not in g[fields[0]]:
                g[fields[0]][fields[1]] = float(fields[2])

    return g

def main():

    # filename
    csv_file = "mynode.csv"

    # build graph
    G = build_G(csv_file)

    # G is now a python dict
    # G={'27': {}, '75': {'14': 2.85, '23': 0.9}, '89': {}, '14': {'75': 2.9}, '23': {'27': 4.9, '89': 3.49, '14': 1.29}}

    # write the graph to a text file in DOT format
    with open('dotgraph.txt', 'w') as f:
        f.write('digraph G {\nnode [width=.3,height=.3,shape=octagon,style=filled,color=skyblue];\noverlap="false";\nrankdir="LR";\n')

        for i in G:
            for j in G[i]:
                # get weight
                weight = G[i][j]
                # dir=none hides the arrowhead; penwidth sets the line thickness
                s = '      ' + i
                s += ' -> ' + j + ' [dir=none,label="' + str(weight) + '",penwidth=' + str(weight) + ',color=black]'
                s += ';\n'
                f.write(s)

        f.write('}')

    # generate graph image from graph text file (needs the Graphviz dot binary)
    os.system("dot -Tjpg -omyImage.jpg dotgraph.txt")

main()

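I had previously been looking for an efficient way to build complex graphs, and this is the simplest approach I found (with no Python module dependencies).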

This is the resulting image of the undirected graph (using dir=none):
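
As an alternative to adding dir=none to every edge of a digraph, the DOT language also has an undirected form: a graph block whose edges are written with --. A minimal sketch, assuming the edges are already available as (node1, node2, weight) tuples (to_dot and its styling are illustrative choices, not taken from the program above):

```python
def to_dot(edges, name="G"):
    """Build DOT source for an undirected weighted graph.

    edges: iterable of (node1, node2, weight) tuples. The weight is used
    both as the edge label and as penwidth (the line thickness).
    """
    lines = ["graph %s {" % name]
    for u, v, weight in edges:
        lines.append('    "%s" -- "%s" [label="%s", penwidth=%s];' % (u, v, weight, weight))
    lines.append("}")
    return "\n".join(lines) + "\n"

dot_source = to_dot([("23", "89", 3.49), ("75", "14", 2.85)])
print(dot_source)
```

Written to a file, this text can be rendered with the same dot command as above. Weights as large as the 34.9 in the question would produce very thick lines, so it may be worth scaling them before using them as penwidth.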

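【Discussion】:

If the dot binary is not present on your system, you can install it from a terminal with sudo apt-get install graphviz
@Stefani Thanks! My graph is undirected; how do I remove the direction?
@user1659936 You're welcome. You need to add dir=none when building the edge, so replace s += ' -> ' + j + ' [label="' + str(G[i][j]) + '",penwidth='+str(weight)+',color=black]' with s += ' -> ' + j + ' [dir=none,label="' + str(G[i][j]) + '",penwidth='+str(weight)+',color=black]' to remove the direction.
@user1659936: I changed the code to fit your needs. Regards
With the code above, my dotgraph.txt file is (seemingly) generated correctly on Windows, but I cannot find the .png output. How can I get this to work on Windows and not only on Linux?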
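
One possible explanation, offered as a guess rather than a diagnosis: the program asks dot for a JPEG named myImage.jpg, not a .png, and os.system does not raise an error when the dot executable is missing from PATH. A sketch that requests PNG output and fails loudly when Graphviz cannot be found (dot_command, render_dot and the file names are hypothetical; it assumes Graphviz is installed and its bin directory is on PATH):

```python
import shutil
import subprocess

def dot_command(source_path, output_path, fmt="png"):
    """Return the argument list for rendering a DOT file with Graphviz."""
    return ["dot", "-T" + fmt, "-o", output_path, source_path]

def render_dot(source_path, output_path, fmt="png"):
    """Run dot, raising an error instead of failing silently."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz 'dot' was not found on PATH")
    # check=True raises CalledProcessError if dot exits with an error.
    subprocess.run(dot_command(source_path, output_path, fmt), check=True)

command = dot_command("dotgraph.txt", "myImage.png")
print(command)
```

Calling render_dot("dotgraph.txt", "myImage.png") would then write the image into the current working directory, which is where to look for it.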

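【Solution 4】:

You should edit the header row at the top of the CSV file and add a Type column, as follows:

Source Target Type       Weight
23     89     Undirected 34.9  (i.e. there is an edge between node 23 and 89 with weight 34.9)
75     14     Undirected 28.5
and so on...

After that you can import the CSV file into Gephi to draw the graph with the thickness of each edge representing its weight.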
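
Editing a large file by hand is error-prone, so the conversion can be scripted. A sketch using only the standard library, assuming the edges are (node1, node2, weight) tuples and that the Source, Target, Type and Weight column names shown above are what the Gephi importer expects (to_gephi_csv is a hypothetical helper):

```python
import csv
import io

def to_gephi_csv(edges):
    """Return CSV text with Source, Target, Type, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for source, target, weight in edges:
        # Every edge is marked Undirected, matching the graph in the question.
        writer.writerow([source, target, "Undirected", weight])
    return buffer.getvalue()

gephi_text = to_gephi_csv([(23, 89, 34.9), (75, 14, 28.5)])
print(gephi_text)
```

Saving gephi_text to a .csv file gives a comma-separated edge list with one row per edge.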

【Discussion】: