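[Posted]: 2025-12-08 06:10:01
[Question]:
I want to generate a diagram that visualizes the structure of an XML file.
I created a list of nodes to represent the XML file.
Each node contains 3 strings: the XML tag, the attributes, and the content.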
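For concreteness, here is a minimal sketch of how such a node list could be built with the standard-library `xml.etree.ElementTree` module. The `SAMPLE` string is a shortened, hypothetical excerpt of the file in this question, and `build_node_list` is an illustrative name, not necessarily how the list was actually created.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened excerpt standing in for the full XML file.
SAMPLE = """<entry db="genbank">
  <accession>AC116785</accession>
  <keywords>
    <keyword>HTG</keyword>
    <keyword>HTGS_PHASE2</keyword>
    <keyword>HTGS_DRAFT</keyword>
  </keywords>
</entry>"""


def build_node_list(xml_text):
    """Return one (tag, attributes, content) string triple per element."""
    root = ET.fromstring(xml_text)
    nodes = []
    # iter() walks the root and all descendants depth-first, in document order.
    for elem in root.iter():
        attributes = " ".join(f'{k}="{v}"' for k, v in elem.attrib.items())
        content = (elem.text or "").strip()  # whitespace-only text becomes ""
        nodes.append((elem.tag, attributes, content))
    return nodes


nodes = build_node_list(SAMPLE)
for node in nodes:
    print(node)
```

Note that a flat list like this records the order of the elements but not which element is the parent of which, so the parent/child links have to be captured separately before a tree can be drawn.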
The XML file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<entry db="genbank">
<data id="AC116785" length="132912" molecule="DNA" data_class="linear" division="HTG" date="08-JUL-2002" />
<definition>
<description>Mus musculus clone RP24-146B1, WORKING DRAFT SEQUENCE, 10 ordered pieces.</description>
</definition>
<accession>AC116785</accession>
<version>
<version_number>AC116785.3</version_number>
<gi>21703640</gi>
</version>
<keywords>
<keyword>HTG</keyword>
<keyword>HTGS_PHASE2</keyword>
<keyword>HTGS_DRAFT</keyword>
<keyword>HTGS_FULLTOP</keyword>
</keywords>
<source>
<abbreviation>house mouse.</abbreviation>
<organism>
<name>Mus musculus</name>
<taxonomy>
<class>Eukaryota</class>
<class>Metazoa</class>
<class>Chordata</class>
<class>Craniata</class>
<class>Vertebrata</class>
<class>Euteleostomi</class>
<class>Mammalia</class>
<class>Eutheria</class>
<class>Rodentia</class>
<class>Sciurognathi</class>
<class>Muridae</class>
<class>Murinae</class>
<class>Mus</class>
</taxonomy>
</organism>
</source>
<references>
<reference number="1" from="1" to="132912">
<authors>
<author>Birren,B.</author>
</authors>
<title>Mus musculus, clone RP24-146B1</title>
<journal>
<location>Unpublished</location>
</journal>
</reference>
<reference number="2" from="1" to="132912">
<authors>
<author>Birren,B.</author>
</authors>
<title>Direct Submission</title>
<journal>
<submission>02-APR-2002</submission>
<department>Whitehead Institute/MIT Center for Genome Research, 320 Charles Street, Cambridge, MA 02141, USA</department>
</journal>
</reference>
<reference number="3" from="1" to="132912">
<authors>
<author>Birren,B.</author>
</authors>
<title>Direct Submission</title>
<journal>
<submission>08-JUL-2002</submission>
<department>Whitehead Institute/MIT Center for Genome Research, 320 Charles Street, Cambridge, MA 02141, USA</department>
</journal>
</reference>
</references>
<comment>
<replaced>
<date>Jul 8, 2002</date>
<gi>21700645</gi>
</replaced>
<information title="All repeats were identified using RepeatMasker">Smit, A.F.A. , Green, P. (1996-1997)http://ftp.genome.washington.edu/RM/RepeatMasker.html</information>
<information title="Center">Whitehead Institute/ MIT Center for Genome Research</information>
<information title="Center code">WIBR</information>
<information title="Web site">http://www-seq.wi.mit.edu</information>
<information title="Contact">sequence_submissions@genome.wi.mit.edu</information>
<information title="Center project name">L25104</information>
<information title="Center clone name">146_B_1</information>
<information title="Sequencing vector">Plasmid; n/a; 100% of reads</information>
<information title="Chemistry">Dye-terminator Big Dye; 100% of reads</information>
<information title="Assembly program">Phrap; version 0.960731</information>
<information title="Consensus quality">130058 bases at least Q40</information>
<information title="Consensus quality">131186 bases at least Q30</information>
<information title="Consensus quality">131595 bases at least Q20</information>
<information title="Insert size">142000; agarose-fp</information>
<information title="Insert size">132012; sum-of-contigs</information>
<information title="Quality coverage">6.9 in Q20 bases; agarose-fp</information>
<information title="Quality coverage">7.5 in Q20 bases; sum-of-contigs</information>
<information title="NOTE">This is a 'working draft' sequence. It currently consists of 10 contigs. Gaps between the contigsare represented as runs of N. The order of the piecesis believed to be correct as given, however the sizesof the gaps between them are based on estimates that haveprovided by the submittor.This sequence will be replacedby the finished sequence as soon as it is available andthe accession number will be preserved.</information>
<information title="1 1178">contig of 1178 bp in length</information>
<information title="1179 1278">gap of 100 bp</information>
<information title="1279 2835">contig of 1557 bp in length</information>
<information title="2836 2935">gap of 100 bp</information>
<information title="2936 5385">contig of 2450 bp in length</information>
<information title="5386 5485">gap of 100 bp</information>
<information title="5486 8192">contig of 2707 bp in length</information>
<information title="8193 8292">gap of 100 bp</information>
<information title="8293 10488">contig of 2196 bp in length</information>
<information title="10489 10588">gap of 100 bp</information>
<information title="10589 12801">contig of 2213 bp in length</information>
<information title="12802 12901">gap of 100 bp</information>
<information title="12902 18716">contig of 5815 bp in length</information>
<information title="18717 18816">gap of 100 bp</information>
<information title="18817 34793">contig of 15977 bp in length</information>
<information title="34794 34893">gap of 100 bp</information>
<information title="34894 51004">contig of 16111 bp in length</information>
<information title="51005 51104">gap of 100 bp</information>
<information title="51105 132912">contig of 81808 bp in length.</information>
</comment>
<features>
<sequence_feature type="source">
<location>1..132912</location>
<qualifer type="db_xref">taxon:10090</qualifer>
<qualifer type="clone">RP24-146B1</qualifer>
<qualifer type="clone_lib">RPCI-24 Male Mouse BAC</qualifer>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>1..1178</location>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>1279..2835</location>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>2936..5385</location>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>5486..8192</location>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>8293..10488</location>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>10589..12801</location>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>12902..18716</location>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>18817..34793</location>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>34894..51004</location>
</sequence_feature>
<sequence_feature type="misc_feature">
<location>51105..132912</location>
</sequence_feature>
</features>
<base_count num_a="43599" num_c="24512" num_g="23668" num_t="40195" num_others="938" />
<sequence>mhkkiciigagaaglvsakhaikqgyqvdifeqtdqvggtwvysektgchsslykvmktn
lpkeamlfqdepfrdelpsfmshehvleylnefskdfpiqfsstvnevkrendlwkvlie
snsetitrfydvvfvcnghffeplnpyqnsyfkgklihshdyrraehytgknvvivgagp
sgiditlqiaqtanhvtliskkatypvlpesvqqmatnvksvdehgvvtdegdhvpadvi
ivctgyvfkfpfldssliqlkyndrmvsplyehlchvdypttlffiglplgtitfplfev
qvkyalsliagkgklpsddveirnfedarlqgllnpasfhviieeqweymkklakmggfe
ewnymetikklygyimterkknvigykmvnfelttdssdfklltirvdfnddvawiirfa
ypi</sequence>
</entry>
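I want to generate a tree diagram with the Plotly and igraph libraries by iterating over the node list.
I used this site (here) as a reference.
The elements in my XML file have a variable number of children. However, the example given only shows how to build a tree with a fixed number of children (2 per node).
Looking at the igraph tutorial site (here), I see a similar example that also uses only 2 children per node.
How should I generate a tree with a variable number of children, like the one in my XML file?
I have been stuck on this for a long time, and any help would be greatly appreciated!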
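One possible direction, offered as a sketch rather than a definitive answer: instead of generating a fixed-arity tree, walk the XML once and record an explicit list of (parent, child) index pairs. An edge list places no limit on how many children a node has. The code below uses only the standard library; `SAMPLE` is a shortened, hypothetical excerpt of the file above, and `build_tree` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened excerpt standing in for the full XML file.
SAMPLE = """<entry db="genbank">
  <accession>AC116785</accession>
  <keywords>
    <keyword>HTG</keyword>
    <keyword>HTGS_PHASE2</keyword>
    <keyword>HTGS_DRAFT</keyword>
  </keywords>
</entry>"""


def build_tree(xml_text):
    """Return (labels, edges) for the element tree.

    labels[i] is the tag of node i; edges holds (parent_index, child_index)
    pairs, so each node may have any number of children.
    """
    root = ET.fromstring(xml_text)
    labels, edges = [], []

    def visit(elem, parent_index):
        index = len(labels)          # next free node id
        labels.append(elem.tag)
        if parent_index is not None:
            edges.append((parent_index, index))
        for child in elem:           # zero, one, or many children
            visit(child, index)

    visit(root, None)
    return labels, edges


labels, edges = build_tree(SAMPLE)
print(labels)
print(edges)
```

Assuming python-igraph is installed, an edge list of this shape can be passed to `igraph.Graph(edges=edges)` in place of a fixed-arity generator such as `Graph.Tree`, and `layout_reingold_tilford(root=[0])` should then give per-vertex coordinates that can be plotted as Plotly scatter traces, with `labels` used as the hover text.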
[Discussion]: