【Question title】: Speed up Numpy Meshgrid Command
【Posted】: 2013-08-14 13:32:47
【Question】:

I am using NumPy to generate a meshgrid, and it takes a lot of memory and quite a bit of time.

xi, yi = np.meshgrid(xi, yi)

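I am generating a meshgrid at the same resolution as the underlying site map image, which is sometimes 3000 pixels across. It sometimes uses several gigabytes of memory and takes 10-15 seconds or more while it is being written to the page file.

My question is: can I speed this up without upgrading the server? Here is a full copy of my application's source code.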

def generateContours(date_collected, substance_name, well_arr, site_id, sitemap_id, image, title_wildcard='', label_over_well=False, crop_contours=False, groundwater_contours=False, flow_lines=False, site_image_alpha=1, status_token=""):
    #create empty arrays to fill up!
    x_values = []
    y_values = []
    z_values = []

    #iterate over wells and fill the arrays with well data
    for well in well_arr:
        x_values.append(well['xpos'])
        y_values.append(well['ypos'])
        z_values.append(well['value'])

    #initialize numpy array as required for interpolation functions
    # np.float was removed in NumPy 1.24; np.float64 is the same type
    x = np.array(x_values, dtype=np.float64)
    y = np.array(y_values, dtype=np.float64)
    z = np.array(z_values, dtype=np.float64)

    #create a list of x, y coordinate tuples
    points = zip(x, y)

    #create a grid on which to interpolate data
    start_time = time.time()
    xi, yi = np.linspace(0, image['width'], image['width']), np.linspace(0, image['height'], image['height'])

    xi, yi = np.meshgrid(xi, yi)

    #interpolate the data with the matplotlib.mlab griddata function (http://matplotlib.org/api/mlab_api.html#matplotlib.mlab.griddata)
    zi = griddata(x, y, z, xi, yi, interp='nn')

    #create a matplotlib figure and adjust the width and heights to output contours to a resolution very close to the original sitemap
    fig = plt.figure(figsize=(image['width']/72, image['height']/72))

    #create a single subplot, just takes over the whole figure if only one is specified
    ax = fig.add_subplot(111, frameon=False, xticks=[], yticks=[])

    #read the database image and save to a temporary variable
    im = Image.open(image['tmpfile'])

    #place the sitemap image on top of the figure
    ax.imshow(im, origin='upper', alpha=site_image_alpha)

    #figure out a good linewidth
    if image['width'] > 2000:
        linewidth = 3
    else:
        linewidth = 2

    #create the contours (options here http://cl.ly/2X0c311V2y01)
    kwargs = {}
    if groundwater_contours:
        kwargs['colors'] = 'b'

    CS = plt.contour(xi, yi, zi, linewidths=linewidth, **kwargs)
    for key, value in enumerate(CS.levels):
        if value == 0:
            CS.collections[key].remove()

    #add a streamplot
    if flow_lines:
        dy, dx = np.gradient(zi)
        plt.streamplot(xi, yi, dx, dy, color='c', density=1, arrowsize=3, arrowstyle='<-')

    #add labels to well locations
    label_kwargs = {}
    if label_over_well is True:
        label_kwargs['manual'] = points

    plt.clabel(CS, CS.levels[1::1], inline=5, fontsize=math.floor(image['width']/100), fmt="%.1f", **label_kwargs)

    #add scatterplot to show where well data was read
    scatter_size = math.floor(image['width']/20)
    plt.scatter(x, y, s=scatter_size, c='k', facecolors='none', marker=(5, 1))

    try:
        site_name = db_session.query(Sites).filter_by(site_id=site_id).first().title
    except:
        site_name = "Site Map #%i" % site_id

    sitemap = SiteMaps.query.get(sitemap_id)
    if sitemap.title != 'Sitemap':
        sitemap_wildcard = " - " + sitemap.title
    else:
        sitemap_wildcard = ""

    if title_wildcard != '':
        filename_wildcard = "-" + slugify(title_wildcard)
        title_wildcard = " - " + title_wildcard
    else:
        filename_wildcard = ""
        title_wildcard = ""

    #add descriptive title to the top of the contours
    title_font_size = math.floor(image['width']/72)
    plt.title(parseDate(date_collected) + " - " + site_name + " " + substance_name + " Contour" + sitemap_wildcard + title_wildcard, fontsize=title_font_size)

    #generate a unique filename and save to a temp directory
    filename = slugify(site_name) + str(int(time.time())) + filename_wildcard + ".pdf"
    temp_dir = tempfile.gettempdir()
    tempFileObj = temp_dir + "/" + filename
    savefig(tempFileObj)  # bbox_inches='tight' tightens the white border

    #clears the matplotlib memory
    clf()

    #send the temporary file to the user
    resp = make_response(send_file(tempFileObj, mimetype='application/pdf', as_attachment=True, attachment_filename=filename))

    #set the users status token for javascript workaround to check if file is done being generated
    resp.set_cookie('status_token', status_token)

    return resp

【Comments】:

    Tags: python numpy matplotlib contour


    【Answer 1】:

    How about xi, yi = np.meshgrid(xi, yi, copy=False)? That way it only returns views of the original arrays instead of copying all the data.
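
A minimal sketch of what `copy=False` changes, assuming a 3000 x 2000 pixel image (the sizes are illustrative, not taken from the question). It shows that the default call allocates a full 2D array per coordinate, while the `copy=False` outputs are broadcast views onto the 1D inputs, recognizable by a stride of 0 along the repeated axis:

```python
import numpy as np

# Illustrative image size; the question mentions dimensions around 3000 pixels.
width, height = 3000, 2000

x = np.linspace(0, width, width)
y = np.linspace(0, height, height)

# Default behaviour: each output is a full (height, width) copy.
xx_copy, yy_copy = np.meshgrid(x, y)

# copy=False: the outputs are broadcast views onto x and y.
xx_view, yy_view = np.meshgrid(x, y, copy=False)

print(xx_copy.shape, xx_view.shape)        # both are (height, width)
print(xx_copy.nbytes / 1e6, "MB allocated per copied grid")
print(np.shares_memory(xx_view, x))        # the view reuses x's buffer
print(xx_view.strides, yy_view.strides)    # a 0 stride marks the repeated axis
```

Because the views alias the 1D inputs, it is safest to treat them as read-only and to take an explicit copy if the coordinates have to be modified later.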

    【Comments】:

    • Keep in mind that this will not work if you need to edit the coordinates later.

    【Answer 2】:

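    It looks like you may not need to pass xi and yi through meshgrid at all. Check the docstrings of the functions where you use xi and yi. Many accept (or even expect) 1D arrays.

    For example: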

    In [33]: x
    Out[33]: array([0, 0, 0, 1, 1, 1, 2, 2, 2])
    
    In [34]: y
    Out[34]: array([0, 1, 2, 0, 1, 2, 0, 1, 2])
    
    In [35]: z
    Out[35]: array([0, 1, 4, 1, 2, 5, 2, 3, 6])
    
    In [36]: xi
    Out[36]: array([ 0. ,  0.5,  1. ,  1.5,  2. ])
    
    In [37]: yi
    Out[37]: 
    array([ 0.        ,  0.33333333,  0.66666667,  1.        ,  1.33333333,
            1.66666667,  2.        ])
    
    In [38]: zi = griddata(x, y, z, xi, yi)
    
    In [39]: zi
    Out[39]: 
    array([[ 0.        ,  0.5       ,  1.        ,  1.5       ,  2.        ],
           [ 0.33333333,  0.83333333,  1.33333333,  1.83333333,  2.33333333],
           [ 0.66666667,  1.16666667,  1.66666667,  2.16666667,  2.66666667],
           [ 1.        ,  1.61111111,  2.        ,  2.61111111,  3.        ],
           [ 2.        ,  2.5       ,  3.        ,  3.5       ,  4.        ],
           [ 3.        ,  3.5       ,  4.        ,  4.5       ,  5.        ],
           [ 4.        ,  4.5       ,  5.        ,  5.5       ,  6.        ]])
    
    
    In [40]: plt.contour(xi, yi, zi)
    Out[40]: <matplotlib.contour.QuadContourSet instance at 0x3ba03b0>
    

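      【Answer 3】:

      If meshgrid is what is slowing you down, don't call it. According to the griddata docs:

      xi and yi must describe a regular grid, can be either 1D or 2D, but must be monotonically increasing.

      So the call to griddata should work just as well if you skip the call to meshgrid and do the following: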

      xi = np.linspace(0, image['width'], image['width'])
      yi = np.linspace(0, image['height'], image['height'])
      zi = griddata(x, y, z, xi, yi, interp='nn')
      

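      That said, if your x and y vectors are large, the actual interpolation, i.e. the call to griddata, may take quite a while, because Delaunay triangulation is a computationally intensive operation. Are you sure your performance problem comes from meshgrid and not from griddata?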
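
One way to check is to time each step separately. A minimal sketch, assuming a hypothetical helper named `timed` (it is not part of NumPy or Matplotlib) and an illustrative 2000 x 2000 grid; the same wrapper can be put around the `griddata` call in the application:

```python
import time
import numpy as np

def timed(label, func, *args, **kwargs):
    """Hypothetical helper: call func, print the elapsed wall time, return both."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f} s")
    return result, elapsed

# Illustrative grid size; substitute image['width'] and image['height'].
width, height = 2000, 2000
xi = np.linspace(0, width, width)
yi = np.linspace(0, height, height)

(grid_x, grid_y), meshgrid_seconds = timed("meshgrid", np.meshgrid, xi, yi)

# In the application, the interpolation can be wrapped the same way, e.g.
# zi, griddata_seconds = timed("griddata", griddata, x, y, z, xi, yi, interp='nn')
```

Comparing the two printed durations shows which call dominates before any effort is spent optimizing the wrong one.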

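      【Comments】:

      • You are absolutely right. I put timers around each command: the meshgrid command took 0.2 seconds, while the griddata command took 4 seconds. Any suggestions on how to improve the speed?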