【Question title】: Visualize Yes/No tree using Graphviz
【Posted】: 2020-06-30 09:45:03
【Question description】:

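I have symptom-diagnosis questionnaire data in the following format (Python): a list of path dictionaries. This is an example of one symptom diagnosis, with an initial symptom (A) followed by the follow-up questions (B, F, C, D and E in the data below).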

from collections import OrderedDict

qa = [OrderedDict([('A', 1), ('B', 1), ('F', 1), ('C', 1), ('D', 1), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 1), ('C', 1), ('D', 1), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 1), ('C', 1), ('D', 0), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 1), ('C', 1), ('D', 0), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 1), ('C', 0), ('D', 1), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 1), ('C', 0), ('D', 1), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 1), ('C', 0), ('D', 0), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 1), ('C', 0), ('D', 0), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 0), ('E', 1), ('D', 1), ('C', 1)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 0), ('E', 1), ('D', 1), ('C', 0)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 0), ('E', 1), ('D', 0), ('C', 1)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 0), ('E', 1), ('D', 0), ('C', 0)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 0), ('E', 0), ('D', 1), ('C', 1)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 0), ('E', 0), ('D', 1), ('C', 0)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 0), ('E', 0), ('D', 0), ('C', 1)]),
 OrderedDict([('A', 1), ('B', 1), ('F', 0), ('E', 0), ('D', 0), ('C', 0)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 1), ('C', 1), ('D', 1), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 1), ('C', 1), ('D', 1), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 1), ('C', 1), ('D', 0), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 1), ('C', 1), ('D', 0), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 1), ('C', 0), ('D', 1), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 1), ('C', 0), ('D', 1), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 1), ('C', 0), ('D', 0), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 1), ('C', 0), ('D', 0), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 0), ('C', 1), ('D', 1), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 0), ('C', 1), ('D', 1), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 0), ('C', 1), ('D', 0), ('E', 1)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 0), ('C', 1), ('D', 0), ('E', 0)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 0), ('C', 0), ('E', 1), ('D', 1)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 0), ('C', 0), ('E', 1), ('D', 0)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 0), ('C', 0), ('E', 0), ('D', 1)]),
 OrderedDict([('A', 1), ('B', 0), ('F', 0), ('C', 0), ('E', 0), ('D', 0)])]
    

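Here 1 = Yes and 0 = No.

I want to draw the diagnosis as a decision tree, where every node splits into a "Yes" edge and a "No" edge, each leading to the next node, and so on.

Because the edges are keyed on node pairs, I merged "Yes" and "No" into a single edge label whenever both answers occur for the same question. Using graphviz: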

import pandas as pd
from graphviz import Digraph

name = 'test'  # graph name; it was not defined in the original snippet
u = Digraph(name, strict=True, filename='blabla', format='png', node_attr={'color': 'mediumpurple1', 'style': 'filled'})
u.attr(size='16,16')
answer_map = ['No', 'Yes']  # index 0 = No, index 1 = Yes
nodes = []
edges = []
for path in qa:
    # Node name = question plus its position in the path, e.g. 'A_1'
    questions = [f'{j}_{lev}' for lev, j in enumerate(path.keys(), 1)]
    # ':' separates a node name from a port in Graphviz, so replace it
    questions = [w.replace(':', '_') for w in questions]
    answers = [answer_map[item] for item in path.values()]
    for i in range(len(questions)-1):
        #u.edge(questions[i], questions[i+1],label=answers[i])
        nodes.append((questions[i], questions[i+1]))
        edges.append(answers[i])
d = {'nodes': nodes, 'edges': edges}
df_graph = pd.DataFrame(d).drop_duplicates()
# Join the labels of edges that connect the same pair of nodes, e.g. 'Yes,No'
df_graph_joined = df_graph.groupby('nodes')['edges'].apply(','.join).reset_index()

for row in df_graph_joined.itertuples():
    u.edge(row.nodes[0], row.nodes[1], label=row.edges)
u.render()

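However, as you can see, the diagnosis paths cannot be told apart. I would like to split the tree at every "Yes"/"No" junction, so that each diagnosis path can be read off the tree. How can I do that?

I want it to look like this: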

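【Comments on the question】:

  • I notice there is no question here. It would help if you a) showed the work you have done so far and b) provided a diagram showing the result you want (hand-drawn is fine).
  • @sroush Thanks, I have updated the question and added the code and the plot.
  • Please add the drawing you are trying to achieve, e.g. by manually modifying the drawing you provided.
  • @Jens I added a drawing of the tree. Thanks.

Tags: python tree graphviz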


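【Solution 1】:

To split the tree at each answer, the nodes need distinct names. I suggest naming each node after the full path that leads to it. For example, for this OrderedDict: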

('A', 1), ('B', 1), ('F', 1), ('C', 1), ('D', 1), ('E', 1)

you could use these node names:

root, root-A1, root-A1-B1, root-A1-B1-F1, root-A1-B1-F1-C1, root-A1-B1-F1-C1-D1

In this example:

  • root stands for A
  • root-A1 stands for B, reached via the path A --> 1
  • root-A1-B1 stands for F, reached via the path A --> 1 --> B --> 1
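
The naming scheme in the list above can be sketched for a single path as follows. This is only a minimal illustration: `path_node_ids` is a hypothetical helper name, not something from the answer or from the graphviz package, and it only builds the names without drawing anything.

```python
from collections import OrderedDict


def path_node_ids(path, root="root"):
    """Return one (node_id, question) pair per question in `path`.

    A node id is the chain of answers that leads to the node, so the same
    question reached through different answers gets a different id.
    (Hypothetical helper, for illustration only.)
    """
    ids = []
    prefix = root
    for question, answer in path.items():
        ids.append((prefix, question))
        # Extend the prefix with this question and its answer (1 = Yes, 0 = No)
        prefix += "-{}{}".format(question, answer)
    return ids


path = OrderedDict([("A", 1), ("B", 1), ("F", 1), ("C", 1), ("D", 1), ("E", 1)])
node_ids = path_node_ids(path)
for node_id, question in node_ids:
    print(node_id, "->", question)
```

Two paths that agree on their first answers produce the same prefixes, so they share those nodes; they get separate nodes from the first answer on which they differ.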

Here is a full example:

from collections import OrderedDict
from graphviz import Digraph

# New node names: each node is named after the full answer path leading to it.
# `qa` is the list of OrderedDicts from the question.
qa_tree = []
for path in qa:
    prefix = 'root'
    path_tree = OrderedDict()
    for key, value in path.items():
        key_tree = prefix
        prefix += '-{}{}'.format(key, value)
        path_tree[key_tree] = {'value': value, 'name': key}
    qa_tree.append(path_tree)

name = 'test'
u = Digraph(name, strict=True, filename='blabla', format='png', node_attr={'color': 'mediumpurple1', 'style': 'filled'})
u.attr(size='16,16')
answer_map = ['No', 'Yes']  # index 0 = No, index 1 = Yes

for path in qa_tree:
    questions = list(path.keys())  # unique node names
    answers = [answer_map[item.get('value')] for item in path.values()]
    names = [item.get('name') for item in path.values()]  # labels to display
    for i in range(len(questions)-1):
        # The label shows the question; the unique node name keeps the branches apart
        u.node(questions[i], label=names[i])
        u.node(questions[i+1], label=names[i+1])
        u.edge(questions[i], questions[i+1], label=answers[i])
u.render()
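
If the graphviz Python package is not at hand, the same idea can be sketched by writing the DOT text directly. This is a sketch under assumptions, not the answer's own method: `paths_to_dot` and `ANSWER_LABELS` are hypothetical names, and `sample` is a made-up pair of shortened paths rather than the full `qa` list.

```python
from collections import OrderedDict

ANSWER_LABELS = ["No", "Yes"]  # index 0 = No, index 1 = Yes, as in the question


def paths_to_dot(paths, root="root"):
    """Build DOT source with one node per distinct answer prefix.

    Hypothetical helper for illustration; it uses only the standard library.
    """
    nodes = {}  # node id -> question shown as the label
    edges = {}  # (source id, target id) -> "Yes" or "No"
    for path in paths:
        prefix = root
        previous = None  # (node id, answer) of the question asked just before
        for question, answer in path.items():
            nodes[prefix] = question
            if previous is not None:
                edges[(previous[0], prefix)] = ANSWER_LABELS[previous[1]]
            previous = (prefix, answer)
            prefix += "-{}{}".format(question, answer)
    lines = ["digraph tree {"]
    for node_id, label in nodes.items():
        lines.append('    "{}" [label="{}"];'.format(node_id, label))
    for (src, dst), label in edges.items():
        lines.append('    "{}" -> "{}" [label="{}"];'.format(src, dst, label))
    lines.append("}")
    return "\n".join(lines)


# Two shortened paths that share question A and diverge on the answer to B.
sample = [
    OrderedDict([("A", 1), ("B", 1), ("F", 1)]),
    OrderedDict([("A", 1), ("B", 0), ("F", 0)]),
]
dot_source = paths_to_dot(sample)
print(dot_source)
```

The printed text can be saved to a file and rendered with the Graphviz command line, for example `dot -Tpng tree.dot -o tree.png`. As in the code above, no edge is drawn for the answer to the last question of a path, since there is no following node for it to point to.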

【Comments on the answer】:
