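【Question title】: Calculate clustering for each node in graph
【Posted】: 2020-10-18 17:30:08
【Question】:

Can anyone help me calculate the clustering of each node in a graph without using Python libraries? The general formula is 2.0 * E / (V * (V - 1)). Code (not working correctly):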

def clustering():
    clust = []
    print(vertexDegree)
    E = len(list1)

    for i in vertexDegree:
        if i <= 1:
            clust.append(0)
        else:
            clust.append(2.0 * E / (i * (i - 1)))

    vertex = 1
    for i in clust:
        print("Vertex ", vertex, "have clustering: ", i)
        vertex += 1
        print(clust)

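list1 is the list of connected node pairs (edges): [[1, 2], [3, 5], [2, 4]]

E is the number of all connections (edges).

V is the number of possible connections (edges) between neighbours (nodes).

The graph is represented by a dictionary, {1: [2], 2: [1, 4], 3: [5], 4: [2], 5: [3]}; vertexDegree is computed from it and stored in a list: [1, 2, 1, 1, 1]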

【Comments】:

    Tags: python graph


    【Answer 1】:

    The Networkx code, adapted to work with a dictionary-based graph.

    Networkx code: How to calculate clustering coefficient of each node in the graph in Python using Networkx
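For reference, the per-node quantity usually called the local clustering coefficient can be written as below. The symbols $k_i$ (number of neighbours of node $i$) and $e_i$ (number of edges running between those neighbours) are notation introduced here for illustration; they do not appear in the question.

```latex
C_i = \frac{2\, e_i}{k_i\,(k_i - 1)} \quad \text{for } k_i \ge 2,
\qquad
C_i = 0 \quad \text{for } k_i < 2 \ \text{(the convention used in the code on this page)}
```

Read this way, the numerator is counted separately for each node, among that node's neighbours only, whereas the snippet in the question plugs the total edge count `len(list1)` into every node's formula.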

    Code

    def clustering(data, undirected = True):
      """
        Computes the clustering coefficient of each node.
        data is either a list of edges or a dictionary
        mapping each node to its neighbours.
      """
      def has_edge(n1, n2):
        """ Helper function
            True if n1 has an edge to n2
        """
        # An undirected graph lists each edge under both
        # endpoints, so looking up n2 among n1's neighbours
        # is enough; for a directed graph this tests n1 -> n2.
        return n2 in g.get(n1, [])
    
      def edges_to_dic(edges, undirected = True):
        " Generates graph dictionary from edges "
        res = {}
        for v1, v2 in edges:
          res.setdefault(v1, set())
          res[v1].add(v2)
          if undirected:
            res.setdefault(v2, set())
            res[v2].add(v1)
        return res
    
      if isinstance(data, list):
        # list of edges
        g = edges_to_dic(data, undirected)
      else:
        g = data
    
      result = {}
      for node in g:
        # Iterate over nodes of g
        neighbours = g[node]
        n_neighbors = len(neighbours)
        n_links = 0
        
        if n_neighbors > 1:
          for node1 in neighbours:
            for node2 in neighbours:
              if has_edge(node1,node2):
                n_links += 1
    
          n_links /= 2 #because n_links is calculated twice
          result[node] = 2*n_links/(n_neighbors*(n_neighbors-1))
        else:
          result[node] = 0
    
      return result
    

    Usage

    With a dictionary

    g = {1: [2], 2: [1, 4], 3: [5], 4: [2], 5: [3]}
    
    print(clustering(g))
    # Output: {1: 0, 2: 0.0, 3: 0, 4: 0, 5: 0}
    

    With an edge list

    edges = [[1, 2], [3, 5], [2, 4]]
    print(clustering(edges))
    # Output: {1: 0, 2: 0.0, 3: 0, 4: 0, 5: 0}
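The two usages above describe the same graph. As a minimal sketch (not part of the answer's code; the names `graph` and `degrees` are chosen here for illustration), the edge list can be turned into the dictionary and the degree list quoted in the question roughly like this:

```python
# Edge list from the question.
edges = [[1, 2], [3, 5], [2, 4]]

# Build an undirected adjacency dictionary: each edge is
# recorded under both of its endpoints.
graph = {}
for a, b in edges:
    graph.setdefault(a, []).append(b)
    graph.setdefault(b, []).append(a)

# Sort nodes and neighbour lists so the result is easy to compare.
graph = {node: sorted(graph[node]) for node in sorted(graph)}

# The degree of a node is simply the number of its neighbours.
degrees = [len(graph[node]) for node in graph]

print(graph)
print(degrees)
```

Both representations can then be passed to a function that accepts either form, as the answer's `clustering` does.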
    

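    Explanation

    All coefficients are zero because:

    1. Nodes 1, 3, 4 and 5 have only one neighbour, so their coefficient is zero.
    2. Node 2 has neighbours [1, 4], but these neighbours are not connected to each other, so its coefficient is zero.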
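The two points above can be checked by counting, for each node, the links that run between its neighbours. This is a small sketch, assuming an undirected graph without self-loops; `links_between_neighbours` is a hypothetical helper, not part of the answer's code:

```python
from itertools import combinations

# Graph from the question, as an adjacency dictionary.
g = {1: [2], 2: [1, 4], 3: [5], 4: [2], 5: [3]}

def links_between_neighbours(graph, node):
    """Count edges whose two endpoints are both neighbours of node."""
    # combinations() yields each unordered pair of neighbours once.
    return sum(1 for a, b in combinations(graph[node], 2) if b in graph[a])

for node in g:
    k = len(g[node])
    links = links_between_neighbours(g, node)
    print("node", node, "neighbours:", k, "links among them:", links)
```

Only node 2 has two neighbours, and no edge joins 1 and 4, so every node ends up with a numerator of zero.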

    Tests (Source)

    def show_results(edges, text):
      " Show results for set of test cases "
      print(text)
      print('\tUndirected: ', clustering(edges))
      print('\tDirected', clustering(edges, undirected = False))
    
    
    tests = [
      [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]], # C = 1
      [[1, 2], [1, 3], [1, 4], [3, 4]],                 # C = 1/3
      [[1, 2], [1, 3], [1, 4]]                          # C = 0
    ]
    
    texts = [
      "Case C = 1",
      "Case C = 1/3",
      "Case C = 0"
    ]
    for edges, text in zip(tests, texts):
      show_results(edges, text)
    

    Output

    Case C = 1
        Undirected:  {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}
        Directed {1: 0.5, 2: 0.5, 3: 0}
    Case C = 1/3
        Undirected:  {1: 0.3333333333333333, 2: 0, 3: 1.0, 4: 1.0}
        Directed {1: 0.16666666666666666, 3: 0}
    Case C = 0
        Undirected:  {1: 0.0, 2: 0, 3: 0, 4: 0}
        Directed {1: 0.0}
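As a cross-check of the undirected results, the same coefficient can be sketched with set intersections instead of nested loops. This is an alternative formulation, not the answer's code; `clustering_by_intersection` is a hypothetical name, and the sketch assumes an undirected edge list without self-loops:

```python
def clustering_by_intersection(edges):
    """Local clustering coefficient per node of an undirected edge list."""
    # Adjacency sets, with each edge stored under both endpoints.
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    result = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            # Fewer than two neighbours: no pair to connect.
            result[node] = 0.0
            continue
        # adj[n] & nbrs are the neighbours of n that are also
        # neighbours of node; every link is seen from both ends.
        links = sum(len(adj[n] & nbrs) for n in nbrs) // 2
        result[node] = 2 * links / (k * (k - 1))
    return result

# Second test case from the answer (expected C = 1/3 for node 1).
print(clustering_by_intersection([[1, 2], [1, 3], [1, 4], [3, 4]]))
```

Set intersection keeps the inner work proportional to the smaller neighbour set, which may matter on larger graphs; for the small test cases here either approach is fine.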
    

    Test graphs

    【Comments】:
