"Faking" axis ticks and labels with Matplotlib - Answers

【Question title】: "Faking" axis ticks and labels with Matplotlib
【Posted】: 2015-10-14 14:17:47
【Question】:

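I am trying to create a figure in which every dimension of a multidimensional data set is plotted against every other dimension in a grid of subplots. This is what I have so far: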

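The x dimension is determined by the subplot column and the y dimension by the row. Where the two dimensions are equal, a 1d histogram is drawn with density on the y-axis; otherwise a 2d histogram is drawn with density mapped to color. When creating each subplot, I share its x-axis with the first plot in that column (using the sharex argument of Figure.add_subplot). The y-axes are shared in the same way, except for the 1d histograms.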

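This works well for keeping the axis scales the same, but you can see the problem in the top-left corner. Since most axes are identical along each row and column, I only show ticks along the bottom and left edges of the figure. The problem is that the top-left subplot has a different y scale from the rest of its row.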

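I would like to take the ticks that belong to the y-axis of the other subplots in that row and apply them to the top-left subplot, without changing that subplot's y limits. Taking the y labels from the second subplot in the row and setting them on the first one works, but actually changing the tick positions does not, because the axis limits differ. Short of explicitly converting points from one plot's scale to the other, I don't know how to set tick positions in a relative way.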

Edit: since someone asked, here is a basic version of the code used to generate this:

import numpy as np
from scipy.stats import gaussian_kde

def matrix_plot(figure, data, limits, labels):
    """
    Args:
        figure: matplotlib Figure
        data: numpy.ndarray, points/observations in rows
        limits: list of (min, max) values for axis limits
        labels: list of labels for each dimension
    """

    # Number of dimensions (data columns)
    ndim = data.shape[1]

    # Create KDE objects
    density = [ gaussian_kde(data[:,dim]) for dim in range(ndim) ]

    # Keep track of subplots
    plots = np.empty((ndim, ndim), dtype=object)

    # Loop through dimensions twice
    # dim1 goes by column
    for dim1 in range(ndim):
        # dim2 goes by row
        for dim2 in range(ndim):

            # Index of plot
            i = dim2 * ndim + dim1 + 1

            # Share x-axis with plot at top of column
            # Share y-axis with plot at beginning of row, unless that
            #    plot or current plot is a 1d plot
            kwargs = dict()
            if dim2 > 0:
                kwargs['sharex'] = plots[0][dim1]
                if dim1 > 0 and dim1 != dim2:
                    kwargs['sharey'] = plots[dim2][0]
            elif dim1 > 1:
                kwargs['sharey'] = plots[dim2][1]

            # Create new subplot
            # Pass in shared axis arguments with **kwargs
            plot = figure.add_subplot(ndim, ndim, i, **kwargs)
            plots[dim2][dim1] = plot

            # 1d density plot
            if dim1 == dim2:

                # Space to plot over
                # (on the diagonal dim1 == dim2, so dim1 indexes this dimension)
                x = np.linspace(limits[dim1][0], limits[dim1][1], 100)

                # Plot filled region
                plot.set_xlim(limits[dim1])
                plot.fill_between(x, density[dim1].evaluate(x))

            # 2d density plot
            else:

                # Make histogram
                h, xedges, yedges = np.histogram2d(data[:,dim1],
                    data[:,dim2], range=[limits[dim1], limits[dim2]],
                    bins=250)

                # Set zero bins to NaN to make empty regions of
                #   plot transparent
                h[h == 0] = np.nan

                # Plot without grid
                plot.imshow(h.T, origin='lower',
                    extent=np.concatenate((limits[dim1], limits[dim2])),
                    aspect='auto')
                plot.grid(False)

            # Hide ticks and labels except on the figure edges
            # (booleans rather than 'on'/'off' strings, which current
            #  Matplotlib no longer interprets as switches)
            plot.tick_params(axis='both', which='both', left=False,
                right=False, bottom=False, top=False, labelleft=False,
                labelbottom=False)
            if dim1 == 0:
                plot.tick_params(axis='y', left=True, labelleft=True)
                plot.set_ylabel(labels[dim2])
            if dim2 == ndim - 1:
                plot.tick_params(axis='x', bottom=True, labelbottom=True)
                plot.set_xlabel(labels[dim1])

    # Tight layout, applied once after all subplots exist
    figure.tight_layout(pad=.1, h_pad=0, w_pad=0)

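When I try to copy the tick positions and labels from the y-axis of the second plot in the first row to the first plot, I get the following result: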

plots[0][0].set_yticks(plots[0][1].get_yticks())
plots[0][0].set_yticklabels(plots[0][1].get_yticklabels())

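Notice how it places the ticks at absolute positions on a scale much larger than that of the density plot. The axis range expands to show the ticks, so the actual density curve is squashed against the bottom. In addition, the labels do not show up.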
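
The squashed curve follows from the fact that tick positions are data coordinates: as far as I understand the documented behavior, set_yticks widens the view limits when needed so that every tick is visible. A minimal sketch of that effect, using made-up stand-in axes (ax_density and ax_hist are hypothetical names, not taken from the code above):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig, (ax_density, ax_hist) = plt.subplots(1, 2)

# Stand-in for the top-left 1d density plot: y values stay well below 1
x = np.linspace(-3, 3, 100)
ax_density.fill_between(x, np.exp(-x**2 / 2) / np.sqrt(2 * np.pi))
ylim_before = ax_density.get_ylim()

# Stand-in for the neighbouring 2d plot: its y-axis spans the data range
ax_hist.set_ylim(-3, 3)
ax_hist.set_yticks([-2, 0, 2])

# Copy the tick positions verbatim; they are data coordinates of ax_hist
ax_density.set_yticks(ax_hist.get_yticks())
ylim_after = ax_density.get_ylim()

print("y limits before:", ylim_before)
print("y limits after: ", ylim_after)
```

Comparing the two printed limits shows the density axis being stretched to cover the copied ticks, which is why the ticks have to be converted to the density plot's own scale first.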

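【Comments】:

Maybe you can fake it with set_yticklabels and set_yticks
What code did you use to generate this? It would help to know how you did it. Also, this isn't really an answer to your question, but you could try scatter_matrix from pandas - it makes exactly this kind of plot and should handle the limits correctly.
I have edited the question with sample code and with the result of copying the ticks from the 2d histogram to the top-left density plot.
seaborn PairGrids maintain the ticks and grid lines: stanford.edu/~mwaskom/software/seaborn/tutorial/…

Tags: python matplotlib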

【Solution 1】:

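Thanks to Ajean's comment pointing me to the scatter_matrix function in the pandas package, which does more or less what I am trying to do here. I looked at the source code on GitHub and found the part where they "fix" the axis of the top-left plot so that it corresponds to the row's shared y-axis rather than to the density axis: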

if len(df.columns) > 1:
    lim1 = boundaries_list[0]
    locs = axes[0][1].yaxis.get_majorticklocs()
    locs = locs[(lim1[0] <= locs) & (locs <= lim1[1])]
    adj = (locs - lim1[0]) / (lim1[1] - lim1[0])

    lim0 = axes[0][0].get_ylim()
    adj = adj * (lim0[1] - lim0[0]) + lim0[0]
    axes[0][0].yaxis.set_ticks(adj)

    if np.all(locs == locs.astype(int)):
        # if all ticks are int
        locs = locs.astype(int)
    axes[0][0].yaxis.set_ticklabels(locs)

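Unfortunately, it looks like what I feared: there is no more elegant way to do this than manually converting the tick positions from one range to the other. Here is my version, placed after the double loop: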

# Check there are more plots in the row, just in case
if ndim > 1:
    # Get tick locations from 2nd plot in first row
    ticks = np.asarray(plots[0][1].yaxis.get_majorticklocs())

    # Throw out the ones that aren't within the limit
    # (Copied from pandas code, but probably not necessary)
    ticks = ticks[(ticks >= limits[0][0]) & (ticks <= limits[0][1])]

    # Scale ticks to range of [0, 1] (relative to axis limits)
    ticks_scaled = (ticks - limits[0][0]) / (limits[0][1] - limits[0][0])

    # Y limits of top-left density plot (was automatically determined
    #       by matplotlib)
    dlim = plots[0][0].get_ylim()

    # Set the ticks scaled to the plot's own y-axis
    plots[0][0].set_yticks((ticks_scaled * (dlim[1] - dlim[0])) + dlim[0])

    # Set tick labels to their original positions on the 2d plot
    plots[0][0].set_yticklabels(ticks)

This gives the result I was looking for.
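
For reuse, the same rescaling can be pulled out into a small function. This is only a sketch under stated assumptions: rescale_ticks is a hypothetical helper name, both axes are assumed to be linear, and the numbers in the example are made up for illustration.

```python
import numpy as np


def rescale_ticks(ticks, src_lim, dst_lim):
    """Map tick values from the range src_lim onto the range dst_lim.

    Returns (positions, labels): positions are expressed in dst_lim
    coordinates, labels are the original tick values to display.
    """
    ticks = np.asarray(ticks, dtype=float)
    src_min, src_max = src_lim
    dst_min, dst_max = dst_lim

    # Keep only the ticks that are visible on the source axis
    ticks = ticks[(ticks >= src_min) & (ticks <= src_max)]

    # Relative position in [0, 1], then stretch onto the target range
    fraction = (ticks - src_min) / (src_max - src_min)
    positions = dst_min + fraction * (dst_max - dst_min)
    return positions, ticks


# Made-up example: the row's shared y-axis spans (-4, 4), while the
# density plot was autoscaled to (0, 0.5)
positions, labels = rescale_ticks([-6, -4, -2, 0, 2, 4, 6], (-4, 4), (0, 0.5))
print(positions)
print(labels)

# Hypothetical use with the axes from the question:
# plots[0][0].set_yticks(positions)
# plots[0][0].set_yticklabels(labels)
```

The ticks at -6 and 6 are dropped because they lie outside the source limits; the remaining ones are spread evenly over the density plot's own range while keeping their original values as labels.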
