Read formatted data from part of a file fast (Gmsh mesh format): answers

【Question title】: Read formatted data from part of a file fast (Gmsh mesh format)
【Posted】: 2017-01-13 18:50:44
【Question description】:

I maintain a little Python package that converts between different formats used for mesh representation.

These files can grow quite large, so when reading them with Python it is important to do it efficiently.

One of the most widely used formats is msh from Gmsh. Unfortunately, its data layout is arguably not the best. An example file:

$MeshFormat
2.2 0 8
$EndMeshFormat
$Nodes
8
1 -0.5 -0.5 -0.5
2  0.5 -0.5 -0.5
3 -0.5  0.5 -0.5
4  0.5  0.5 -0.5
5 -0.5 -0.5  0.5
6  0.5 -0.5  0.5
7 -0.5  0.5  0.5
8  0.5  0.5  0.5
$EndNodes
$Elements
2
1 4 2 1 11 1 2 3 5
2 4 2 1 11 2 5 6 8
$EndElements

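For $Nodes:

The first number (8) is the number of nodes to follow.

In each node line, the first number is the index (not actually needed by the rest of the format, ugh), followed by the three spatial coordinates.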
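To make the $Nodes layout concrete, here is a minimal sketch that parses the node block of the example file from an in-memory string. It is only an illustration, not the package's actual reader; the names `nodes_text`, `table`, and `node_ids` are made up for this example, and it relies on every node line holding exactly four numbers.

```python
import numpy

# The $Nodes body from the example file: a count line, then one line per node.
nodes_text = """8
1 -0.5 -0.5 -0.5
2  0.5 -0.5 -0.5
3 -0.5  0.5 -0.5
4  0.5  0.5 -0.5
5 -0.5 -0.5  0.5
6  0.5 -0.5  0.5
7 -0.5  0.5  0.5
8  0.5  0.5  0.5
"""

# Separate the count line from the node lines.
count_line, body = nodes_text.split("\n", 1)
num_nodes = int(count_line)

# Every node line has four numbers (index, x, y, z), so the whole block
# can be converted in one call and reshaped into a table.
table = numpy.array(body.split(), dtype=float).reshape(num_nodes, 4)
node_ids = table[:, 0].astype(int)  # leading index column
points = table[:, 1:]               # x, y, z coordinates
```

Because the row width is fixed, no per-line Python loop is needed for this section.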

So far, I haven't come up with anything better than islices in a for loop, which is slow.

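from itertools import islice
import numpy

# The first line is the number of nodes
line = next(islice(f, 1))
num_nodes = int(line)
# Each of the next num_nodes lines holds: index x y z
points = numpy.empty((num_nodes, 3))
for k, line in enumerate(islice(f, num_nodes)):
    # Drop the leading index and keep the three coordinates
    points[k, :] = numpy.array(line.split(), dtype=float)[1:]
# The line after the node block closes the section
line = next(islice(f, 1))
assert line.strip() == '$EndNodes'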

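For $Elements:

The first number (2) is the number of elements to follow.

In each element line, the first number is the index, followed by an enumeration for the element type (4 is for tetrahedra). Then follows the number of integer tags for this element (here 2 in each case, namely 1 and 11). Corresponding to the element type, the last few entries in the line are the $Node indices that form the element; for a tetrahedron, these are the last four entries.

Since the number of tags can vary from element to element (i.e., from line to line), just like the element type and the number of node indices, each line may contain a different number of integers.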
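The element line layout described above can be sketched as follows. The helper name `parse_element_line` is hypothetical, and the sketch handles a single ASCII line only; it shows the field order, not a fast reader.

```python
def parse_element_line(line):
    """Split one ASCII element line into (index, type, tags, node indices)."""
    values = [int(v) for v in line.split()]
    # The first three integers are fixed: index, element type, number of tags.
    index, elem_type, num_tags = values[:3]
    # The tags follow, and whatever remains are the node indices.
    tags = values[3:3 + num_tags]
    node_ids = values[3 + num_tags:]
    return index, elem_type, tags, node_ids


# First element line of the example file.
parsed = parse_element_line("1 4 2 1 11 1 2 3 5")
```

For the first element line of the example, this yields index 1, type 4, tags [1, 11], and node indices [1, 2, 3, 5].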

For both $Nodes and $Elements, any help with reading this data quickly is appreciated.

【Question discussion】:

Tags: python numpy io mesh

【Solution 1】:

Here is a somewhat odd implementation based on NumPy:

import numpy

f = open('foo.msh')
f.readline() # '$MeshFormat\n'
f.readline() # '2.2 0 8\n'
f.readline() # '$EndMeshFormat\n'
f.readline() # '$Nodes\n'
n_nodes = int(f.readline()) # '8\n'
nodes = numpy.fromfile(f, count=n_nodes*4, sep=" ").reshape((n_nodes, 4))
# array([[ 1. , -0.5, -0.5, -0.5],
#        [ 2. ,  0.5, -0.5, -0.5],
#        [ 3. , -0.5,  0.5, -0.5],
#        [ 4. ,  0.5,  0.5, -0.5],
#        [ 5. , -0.5, -0.5,  0.5],
#        [ 6. ,  0.5, -0.5,  0.5],
#        [ 7. , -0.5,  0.5,  0.5],
#        [ 8. ,  0.5,  0.5,  0.5]])
f.readline() # '$EndNodes\n'
f.readline() # '$Elements\n'
n_elems = int(f.readline()) # '2\n'
elems = numpy.fromfile(f, sep=" ")[:-1] # $EndElements read as -1
# This array must be reshaped based on the element type(s)
# array([  1.,   4.,   2.,   1.,  11.,   1.,   2.,   3.,   5.,   2.,   4.,
#          2.,   1.,  11.,   2.,   5.,   6.,   8.])
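The final reshaping step is left open above. One possible way to do it, sketched here under stated assumptions, is to walk the flat array element by element and group the node indices by element type. The function name `split_elements` and the partial `NODES_PER_TYPE` table are introduced for illustration only; the table lists node counts for just a few first-order Gmsh element types and would need extending for real meshes.

```python
import numpy

# Nodes per element for a few Gmsh element types (partial, illustrative table):
# 1 = line, 2 = triangle, 3 = quadrangle, 4 = tetrahedron, 5 = hexahedron, 15 = point.
NODES_PER_TYPE = {1: 2, 2: 3, 3: 4, 4: 4, 5: 8, 15: 1}


def split_elements(flat, n_elems):
    """Group node indices from a flat element array by element type."""
    flat = numpy.asarray(flat).astype(int)
    cells = {}
    pos = 0
    for _ in range(n_elems):
        # Layout per element: index, type, number of tags, tags..., nodes...
        elem_type = int(flat[pos + 1])
        num_tags = int(flat[pos + 2])
        start = pos + 3 + num_tags
        stop = start + NODES_PER_TYPE[elem_type]
        cells.setdefault(elem_type, []).append(flat[start:stop])
        pos = stop
    # One 2D integer array per element type.
    return {t: numpy.array(rows) for t, rows in cells.items()}


# The flat array shown above for the example file.
flat = [1, 4, 2, 1, 11, 1, 2, 3, 5, 2, 4, 2, 1, 11, 2, 5, 6, 8]
cells = split_elements(flat, 2)
```

This still loops in Python once per element, so it trades some speed for handling variable-length records; the bulk text-to-number conversion stays inside NumPy.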

【Discussion】:

【Solution 2】:

Why not use the gmsh Python API from the Gmsh SDK? For example, use the file explore.py (found in the SDK tarball at gmsh--Linux64/share/doc/gmsh/demos/api/explore.py) to read your example, which I named test.msh.

$ python explore.py test.msh

Output:

Info    : No current model available: creating one
Info    : Reading 'test.msh'...
Info    : 8 vertices
Info    : 2 elements
Info    : Done reading 'test.msh'
6 mesh nodes and 2 mesh elements on entity (3, 11) Discrete volume
 - Element type: Tetrahedron 4, order 1
   with 4 nodes in param coord:  [0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 1.]

The nodes and elements are stored as numpy arrays.
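For readers who only have the pip package and not explore.py, a minimal sketch of calling the API directly might look like this. It assumes the `gmsh` module is importable; the wrapper name `read_msh_with_gmsh` is hypothetical, and this is not a copy of what explore.py does.

```python
def read_msh_with_gmsh(path):
    """Read a mesh file through the gmsh Python API and return its arrays."""
    import gmsh  # assumption: installed via `pip install gmsh` or the SDK

    gmsh.initialize()
    try:
        gmsh.open(path)
        # Node tags, flat xyz coordinates, and parametric coordinates (unused).
        node_tags, coords, _ = gmsh.model.mesh.getNodes()
        # Per element type: the type id, the element tags, and flat node tags.
        elem_types, elem_tags, elem_node_tags = gmsh.model.mesh.getElements()
    finally:
        gmsh.finalize()
    # Coordinates come back flat as x0, y0, z0, x1, ...; reshape to one row per node.
    points = coords.reshape(-1, 3)
    return node_tags, points, elem_types, elem_tags, elem_node_tags
```

A call such as `read_msh_with_gmsh("test.msh")` would then return the node and element data without any hand-written parsing.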

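【Discussion】:

Hey, I installed gmsh (pip install gmsh), but I don't see explore.py. @Mitch, could you update your post to show everything from the import statement through the call?
@SBFRF I updated my answer. The file is in the source tarball, not in the Python package provided via pip.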