【Question title】: Scatter plot over boxplot using Matlab
【Posted】: 2015-09-16 15:04:50
【Question】:

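I have plotted a simple boxplot of a vector y (1xN) using Matlab. I used several grouping variables: x1, x2, x3.

x1 (1xN) represents the length (0.5, 1, 2 or 3).

x2 (1xN) represents the gauge (26 or 30).

x3 (1xN cell array) represents the name of the vendor.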

close all; clc;

N = 1000;


% measurement values: they represent some kind of
% electrical characteristic of a cable.
y = randn(N,1);

% each measured cable can have a length of 1m, 2m, or 3m:
x1 = randi(3,N,1);

% each measured cable has a gauge of 1awg or 2awg:
x2 = randi(2,N,1);

% each cable can be produced by a different vendor, for instance 'SONY' or
% 'YAMAHA'

x3 = cell(N,1);

for ii = 1:N
   if mod(ii,3) == 0
       x3{ii} = 'SONY';
   else
       x3{ii} = 'YAMAHA';
   end
end

figure(1)
boxplot(y,{x1,x2,x3});

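I would like to draw a scatter plot over this boxplot to show the individual y values that each box is built from, but I could not find a function that groups the values the way boxplot does.

The closest thing I found is the following function, but it accepts only one grouping variable.

Any help?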

【Comments】:

    Tags: matlab scatter-plot boxplot scatter


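    【Solution 1】:

    The box of a boxplot spans the interquartile range (IQR), from the lower to the upper quartile. The data between the box and the outliers is every point that lies within 1.5*IQR of the lower or upper quartile. You can filter the data manually to pick out these subsets.

    For example: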

    % data generation 
    data=randn(100,3);
    
    %% 
    datas=sort(data);
    datainbox=datas(ceil(end/4)+1:floor(end*3/4),:);
    
    [n1 n2]=size(datainbox);
    
    figure(1);clf
    boxplot(data); hold on
    plot(ones(n1,1)*[1 2 3],datainbox,'k.')
    
    %% 
    % All datapoints coincide now horizontally. Consider adding a little random
    % horizontal play to make them not coincide:
    
    figure(2);clf
    boxplot(data); hold on
    plot(ones(n1,1)*[1 2 3]+.4*(rand(n1,n2)-.5),datainbox,'k.')
    
    %%
    % If you want to add all data between boxes and outliers too, do something like:
    
    dataoutbox=datas([1:ceil(end/4) floor(end*3/4)+1:end],:);
    n3=size(dataoutbox,1);
    % calculate quartiles
    dataq=quantile(data,[.25 .5 .75]);
    % calculate range between box and outliers = between 1.5*IQR from quartiles
    dataiqr=iqr(data);
    datar=[dataq(1,:)-dataiqr*1.5;dataq(3,:)+dataiqr*1.5];
    dataoutbox(dataoutbox<ones(n3,1)*datar(1,:)|dataoutbox>ones(n3,1)*datar(2,:))=nan;
    
    figure(3);clf
    boxplot(data); hold on
    plot(ones(n1,1)*[1 2 3]+.4*(rand(n1,n2)-.5),datainbox,'k.')
    plot(ones(n3,1)*[1 2 3]+.4*(rand(n3,n2)-.5),dataoutbox,'.','color',[1 1 1]*.5)
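
    The same filtering idea can be applied per group rather than per column. The following is only a minimal sketch, assuming an integer group index `g` is already available (here it is random, hypothetical data) and that `quantile` uses the same quartile definition as `boxplot`. The names `inBox`, `inWhisker` and `isOutlier` are illustrative and do not come from the answer above.

```matlab
% Hypothetical data: one measurement vector and one integer group index.
N = 300;
y = randn(N,1);
g = randi(4,N,1);            % group index 1..4 (assumed to be known already)

K = max(g);
inBox     = false(N,1);      % points between the lower and upper quartile
inWhisker = false(N,1);      % points outside the box but within 1.5*IQR of it
for k = 1:K
    idx = (g == k);
    q   = quantile(y(idx), [0.25 0.75]);   % lower and upper quartile of group k
    w   = 1.5 * (q(2) - q(1));             % whisker reach: 1.5*IQR
    inBox(idx)     = y(idx) >= q(1) & y(idx) <= q(2);
    inWhisker(idx) = ~inBox(idx) & y(idx) >= q(1) - w & y(idx) <= q(2) + w;
end
isOutlier = ~inBox & ~inWhisker;           % everything else

figure;
boxplot(y, g); hold on
xj = g + 0.3*(rand(N,1) - 0.5);            % horizontal jitter around each box
plot(xj(inBox), y(inBox), 'k.');
plot(xj(inWhisker), y(inWhisker), '.', 'Color', [1 1 1]*0.5);
hold off
```

    Because `g` is numeric, `boxplot` draws the boxes at x positions 1..K in increasing order of `g`, so `g` itself can be used as the x coordinate of the scattered points.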
    

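    【Comments】:

    • Thanks for the effort, but as you can see from the sample code I just added, I use several grouping vectors, some of which are strings.
    【Solution 2】: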

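    I found a simple solution:

    I edited the signature of the 'boxplot' function so that, in addition to 'h', it also returns 'groupIndexByPoint':

    function [h,groupIndexByPoint] = boxplot(varargin)

    groupIndexByPoint is an internal variable used by 'boxplot'. It holds, for every data point, the index of the group (box) that the point belongs to.

    Now just add 4 lines to the original code: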

    N = 1000;
    
    % measurement values: they represent some kind of
    % electrical characteristic of a cable.
    y = randn(N,1);
    
    % each measured cable can have a length of 1m, 2m, or 3m:
    x1 = randi(3,N,1);
    
    % each measured cable has a gauge of 1awg or 2awg:
    x2 = randi(2,N,1);
    
    % each cable can be produced by a different vendor, for instance 'SONY' or
    % 'YAMAHA'
    
    x3 = cell(N,1);
    
    for ii = 1:N
       if mod(ii,3) == 0
           x3{ii} = 'SONY';
       else
           x3{ii} = 'YAMAHA';
       end
    end
    
    figure(1);
    hold on;
    [h,groups] = boxplot(y,{x1,x2,x3});
    scattering_factor = 0.3;
    scaterring_vector = (rand(N,1)-0.5)*scattering_factor;
    groups_scattered = groups + scaterring_vector;
    plot(groups_scattered,y,'.g');
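
    If modifying the toolbox file is not an option, a similar per-point group index can be built outside of `boxplot`. The following is only a sketch of one possible approach, not the method used above: it combines the three grouping variables into a single numeric index with `unique(...,'rows')` and passes that index to `boxplot`. The variable names (`vendors`, `combos`, `g`, `labels`) and the label format are illustrative assumptions.

```matlab
N = 1000;
y  = randn(N,1);                 % measurement values
x1 = randi(3,N,1);               % length
x2 = randi(2,N,1);               % gauge
x3 = repmat({'YAMAHA'},N,1);     % vendor names
x3(3:3:N) = {'SONY'};

% Map vendor names to integers, then index every distinct
% (length, gauge, vendor) combination that occurs in the data.
[vendors,~,i3] = unique(x3);
[combos,~,g]   = unique([x1 x2 i3],'rows');   % g(i) = group index of point i
K = size(combos,1);

% One text label per combination, in the same order as the group index.
labels = cell(K,1);
for k = 1:K
    labels{k} = sprintf('%g,%g,%s', combos(k,1), combos(k,2), vendors{combos(k,3)});
end

figure;
boxplot(y, g, 'Labels', labels);  % numeric g: boxes sit at x = 1..K
hold on;
scattering_factor = 0.3;
jitter = (rand(N,1) - 0.5)*scattering_factor;
plot(g + jitter, y, '.g');
hold off;
```

    The box order here follows the sorted rows of `combos`, which may differ from the order `boxplot` chooses when it is given `{x1,x2,x3}` directly, so the labels are generated from `combos` to keep boxes and points consistent.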
    

    【Comments】:
