【Question title】: How to calculate mean, median, mode and range from a set of numbers
【Posted】: 2011-05-10 15:40:12
【Question description】:

Are there any functions (as part of a math library) that calculate the mean, median, mode and range from a set of numbers?

【Comments】:

    Tags: java math probability


    【Solution 1】:

    Yes, there do seem to be third-party libraries (there is nothing in Java Math). Two that came up are:

    http://opsresearch.com/app/

    http://www.iro.umontreal.ca/~simardr/ssj/indexe.html

    However, it is actually not that hard to write your own methods to calculate the mean, median, mode and range.

    Mean

    public static double mean(double[] m) {
        double sum = 0;
        for (int i = 0; i < m.length; i++) {
            sum += m[i];
        }
        return sum / m.length;
    }
    

    Median

    // the array double[] m MUST BE SORTED
    public static double median(double[] m) {
        int middle = m.length/2;
        if (m.length%2 == 1) {
            return m[middle];
        } else {
            return (m[middle-1] + m[middle]) / 2.0;
        }
    }
    

    Mode

    public static int mode(int a[]) {
        int maxValue = 0, maxCount = 0; // locals must be initialized before use, or this does not compile
    
        for (int i = 0; i < a.length; ++i) {
            int count = 0;
            for (int j = 0; j < a.length; ++j) {
                if (a[j] == a[i]) ++count;
            }
            if (count > maxCount) {
                maxCount = count;
                maxValue = a[i];
            }
        }
    
        return maxValue;
    }
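
Range is the one statistic in the list without a method above. A minimal sketch in the same style, assuming an unsorted `double[]` and treating an empty array as an error (the class name `RangeSketch` is made up for illustration):

```java
final class RangeSketch {
    // Range = largest value minus smallest value; the input need not be sorted.
    static double range(double[] m) {
        if (m.length == 0) {
            throw new IllegalArgumentException("range of an empty array is undefined");
        }
        double min = m[0];
        double max = m[0];
        for (int i = 1; i < m.length; i++) {
            if (m[i] < min) min = m[i];
            if (m[i] > max) max = m[i];
        }
        return max - min;
    }

    public static void main(String[] args) {
        double[] data = {3, 1, 5, 2, 4};
        System.out.println(range(data)); // prints 4.0
    }
}
```

If the array is already sorted for the median, the range is simply `m[m.length - 1] - m[0]`.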
    

    UPDATE

    As Neelesh Salpe pointed out, the above does not cater for multi-modal collections. We can fix this quite easily:

    public static List<Integer> mode(final int[] numbers) {
        final List<Integer> modes = new ArrayList<Integer>();
        final Map<Integer, Integer> countMap = new HashMap<Integer, Integer>();
    
        int max = -1;
    
        for (final int n : numbers) {
            int count = 0;
    
            if (countMap.containsKey(n)) {
                count = countMap.get(n) + 1;
            } else {
                count = 1;
            }
    
            countMap.put(n, count);
    
            if (count > max) {
                max = count;
            }
        }
    
        for (final Map.Entry<Integer, Integer> tuple : countMap.entrySet()) {
            if (tuple.getValue() == max) {
                modes.add(tuple.getKey());
            }
        }
    
        return modes;
    }
    

    ADDITION

    If you are using Java 8 or higher, you can also determine the modes like this:

    public static List<Integer> getModes(final List<Integer> numbers) {
        final Map<Integer, Long> countFrequencies = numbers.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    
        final long maxFrequency = countFrequencies.values().stream()
                .mapToLong(count -> count)
                .max().orElse(-1);
    
        return countFrequencies.entrySet().stream()
                .filter(tuple -> tuple.getValue() == maxFrequency)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
    

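    【Discussion】:

    • Thanks, but I would prefer to use something out of the box if possible.
    • This class has problems if you have a very large array or have to calculate the values on the fly. It can be written without arrays for mean and standard deviation; not sure about median and mode.
    • The MODE algorithm does not account for cases with multiple modes (bimodal, trimodal, etc.), which occur when more than one number appears maxCount times. With that in mind, it should return an array rather than a single int value.
    • As I mentioned in a comment on Adeel's answer, sorting the entire array to get the median is quite inefficient.
    • @NeeleshSalpe - Thanks for pointing that out. Updated my answer.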
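
On the comment about very large arrays or values arriving on the fly: here is a minimal sketch of a running mean and standard deviation that never stores the data, using Welford's update. The class `RunningStats` and its method names are made up for illustration and are not from any library; it computes the population standard deviation (dividing by n).

```java
final class RunningStats {
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // running sum of squared deviations from the mean

    // Welford's update: fold one value into the running mean and m2.
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    long getCount() {
        return count;
    }

    // Returns 0 when nothing has been added yet.
    double getMean() {
        return count == 0 ? 0.0 : mean;
    }

    // Population standard deviation: sqrt(sum of squared deviations / n).
    double getStandardDeviation() {
        return count == 0 ? 0.0 : Math.sqrt(m2 / count);
    }
}
```

Call `add` once per value as it arrives and read the getters at any point. As the comment says, median and mode are a different matter: computing them exactly generally needs the values (or their counts) to be kept.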
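    【Solution 2】:

    Check out commons math from apache. There is quite a lot in there.

    【Discussion】:

    • See the comments on Adeel's answer: Apache Commons Math seems to use a rather inefficient median algorithm.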
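
For anyone wondering what a median without a full sort could look like, here is a rough sketch using quickselect with a random pivot. This is one well-known selection technique, not what Apache Commons Math or any answer in this thread does; the class name `QuickselectMedian` is made up. It runs in linear time on average, but this simple partition can degrade towards quadratic time on inputs with many equal values.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

final class QuickselectMedian {
    // Median via selection on a copy; the caller's array is left untouched.
    static double median(double[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("median of an empty array is undefined");
        }
        double[] a = Arrays.copyOf(data, data.length);
        int mid = a.length / 2;
        double upper = select(a, mid);
        if (a.length % 2 == 1) {
            return upper;
        }
        // After select(a, mid), everything left of mid is <= a[mid],
        // so the lower middle value is the maximum of that left part.
        double lower = a[0];
        for (int i = 1; i < mid; i++) {
            lower = Math.max(lower, a[i]);
        }
        return (lower + upper) / 2.0;
    }

    // Returns the k-th smallest element (0-based), partially reordering a.
    private static double select(double[] a, int k) {
        int lo = 0, hi = a.length - 1;
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == k) break;
            if (p < k) lo = p + 1; else hi = p - 1;
        }
        return a[k];
    }

    // Lomuto partition around a random pivot; returns the pivot's final index.
    private static int partition(double[] a, int lo, int hi) {
        swap(a, ThreadLocalRandom.current().nextInt(lo, hi + 1), hi);
        double pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store++);
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(double[] a, int i, int j) {
        double t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

For small arrays the copy-and-sort approach is simpler and usually fast enough; selection only pays off when the input is large.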
    【Solution 3】:
    import java.util.*;
    import java.util.Map.Entry;

    public class Mode {
        public static void main(String[] args) {
            int[] unsortedArr = new int[] { 3, 1, 5, 2, 4, 1, 3, 4, 3, 2, 1, 3, 4, 1 ,-1,-1,-1,-1,-1};
            Map<Integer, Integer> countMap = new HashMap<Integer, Integer>();
    
            for (int i = 0; i < unsortedArr.length; i++) {
                Integer value = countMap.get(unsortedArr[i]);
    
                if (value == null) {
                    countMap.put(unsortedArr[i], 1); // first occurrence counts as 1, not 0
                } else {
                    int intval = value.intValue();
                    intval++;
                    countMap.put(unsortedArr[i], intval);
                }
            }
    
            System.out.println(countMap.toString());
    
            int max = getMaxFreq(countMap.values());
            List<Integer> modes = new ArrayList<Integer>();
    
            for (Entry<Integer, Integer> entry : countMap.entrySet()) {
                int value = entry.getValue();
                if (value == max)
                    modes.add(entry.getKey());
            }
            System.out.println(modes);
        }
    
        public static int getMaxFreq(Collection<Integer> valueSet) {
            int max = 0;
            boolean setFirstTime = false;
    
            for (Integer integer : valueSet) {
    
                if (!setFirstTime) {
                    max = integer;
                    setFirstTime = true;
                }
                if (max < integer) {
                    max = integer;
                }
            }
            return max;
        }
    }
    

    Test data

    Modes {1,3} for { 3, 1, 5, 2, 4, 1, 3, 4, 3, 2, 1, 3, 4, 1 };
    Modes {-1} for { 3, 1, 5, 2, 4, 1, 3, 4, 3, 2, 1, 3, 4, 1 ,-1,-1,-1,-1,-1};

    【Discussion】:

      【Solution 4】:
          public static Set<Double> getMode(double[] data) {
              if (data.length == 0) {
                  return new TreeSet<>();
              }
              TreeMap<Double, Integer> map = new TreeMap<>(); //Map Keys are array values and Map Values are how many times each key appears in the array
              for (int index = 0; index != data.length; ++index) {
                  double value = data[index];
                  if (!map.containsKey(value)) {
                      map.put(value, 1); //first time, put one
                  }
                  else {
                      map.put(value, map.get(value) + 1); //seen it again increment count
                  }
              }
              Set<Double> modes = new TreeSet<>(); //result set of modes, min to max sorted
              int maxCount = 1;
              Iterator<Integer> modeApperance = map.values().iterator();
              while (modeApperance.hasNext()) {
                  maxCount = Math.max(maxCount, modeApperance.next()); //go through all the value counts
              }
              for (double key : map.keySet()) {
                  if (map.get(key) == maxCount) { //if this key's value is max
                      modes.add(key); //get it
                  }
              }
              return modes;
          }

          //std dev function for good measure
          public static double getStandardDeviation(double[] data) {
              final double mean = getMean(data);
              double sum = 0;
              for (int index = 0; index != data.length; ++index) {
                  sum += Math.pow(Math.abs(mean - data[index]), 2);
              }
              return Math.sqrt(sum / data.length);
          }

          public static double getMean(double[] data) {
              if (data.length == 0) {
                  return 0;
              }
              double sum = 0.0;
              for (int index = 0; index != data.length; ++index) {
                  sum += data[index];
              }
              return sum / data.length;
          }

          //by creating a copy array and sorting it, this function can take any data.
          public static double getMedian(double[] data) {
              double[] copy = Arrays.copyOf(data, data.length);
              Arrays.sort(copy);
              return (copy.length % 2 != 0) ? copy[copy.length / 2] : (copy[copy.length / 2] + copy[(copy.length / 2) - 1]) / 2;
          }
      

      【Discussion】:

        【Solution 5】:

        If you only care about unimodal distributions, consider something like this.

        public static Optional<Integer> mode(Stream<Integer> stream) {
            Map<Integer, Long> frequencies = stream
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        
            return frequencies.entrySet().stream()
                .max(Comparator.comparingLong(Map.Entry::getValue))
                .map(Map.Entry::getKey);
        }
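
One thing to be aware of: when several values tie for the highest count, the method above returns whichever tied entry the map iteration happens to favor, so the result is not specified. A sketch of a variant with a deterministic tie-break, assuming you want the smallest tied value (the class name `SingleMode` and the comparator names are made up for illustration):

```java
import java.util.Comparator;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SingleMode {
    static Optional<Integer> mode(Stream<Integer> stream) {
        Map<Integer, Long> frequencies = stream
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // Primary order: higher count is "greater".
        Comparator<Map.Entry<Integer, Long>> byCount =
            Comparator.comparingLong(e -> e.getValue());
        // Tie-break: the smaller key is "greater", so max() picks it.
        Comparator<Map.Entry<Integer, Long>> smallerKeyWins =
            (a, b) -> Integer.compare(b.getKey(), a.getKey());

        return frequencies.entrySet().stream()
            .max(byCount.thenComparing(smallerKeyWins))
            .map(Map.Entry::getKey);
    }
}
```

With this, `SingleMode.mode(Stream.of(3, 1, 3, 1, 2))` yields `Optional[1]`, and an empty stream yields `Optional.empty`.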
        

        【Discussion】:

          【Solution 6】:

          As Nico Huysamen already pointed out, finding multiple modes in Java 1.8 can alternatively be done as follows.

          import java.util.ArrayList;
          import java.util.List;
          import java.util.HashMap;
          import java.util.Map;
          
          public static void mode(List<Integer> numArr) {
              Map<Integer, Integer> freq = new HashMap<Integer, Integer>();
              Map<Integer, List<Integer>> mode = new HashMap<Integer, List<Integer>>();
          
              int modeFreq = 1; //record the highest frequency
              for(int x=0; x<numArr.size(); x++) { //1st for loop to record mode
                  Integer curr = numArr.get(x); //O(1)
                  freq.merge(curr, 1, (a, b) -> a + b); //increment the frequency for existing element, O(1)
                  int currFreq = freq.get(curr); //get frequency for current element, O(1)
          
                  //lazy instantiate a list if no existing list, then
                  //record mapping of frequency to element (frequency, element), overall O(1)
                  mode.computeIfAbsent(currFreq, k -> new ArrayList<>()).add(curr);
          
                  if(modeFreq < currFreq) modeFreq = currFreq; //update highest frequency
              }
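              //an empty input has no entry for modeFreq, so fall back to an empty list instead of hitting a NullPointerException
              mode.getOrDefault(modeFreq, new ArrayList<Integer>()).forEach(x -> System.out.println("Mode = " + x)); //2nd loop to print the result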
          }
          

          Happy coding!

          【Discussion】:

            【Solution 7】:

            Here is the complete and optimized code in Java 8

            import java.io.*;
            import java.util.*;
            
            public class Solution {
            
            public static void main(String[] args) {
            
                /*Take input from user*/
                Scanner sc = new Scanner(System.in);
            
                int n =0;
                n = sc.nextInt();
                
                int arr[] = new int[n];
                
                //////////////mean code starts here//////////////////
                int sum = 0;
                for(int i=0;i<n; i++)
                {
                     arr[i] = sc.nextInt();
                     sum += arr[i]; 
                }
                System.out.println((double)sum/n); 
                //////////////mean code ends here//////////////////
            
            
                //////////////median code starts here//////////////////
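                Arrays.sort(arr);
                int val = arr.length/2;
                //odd length: the middle element; even length: the average of the two middle elements
                System.out.println(arr.length % 2 == 1 ? arr[val] : (arr[val]+arr[val-1])/2.0);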
                //////////////median code ends here//////////////////
            
            
                //////////////mode code starts here//////////////////
                int maxValue=0;
                int maxCount=0;
            
                for(int i=0; i<n; ++i)
                {
                    int count=0;
            
                    for(int j=0; j<n; ++j)
                    {
                        if(arr[j] == arr[i])
                        {
                            ++count;
                        }
            
                        if(count > maxCount)
                        {
                            maxCount = count;
                            maxValue = arr[i];
                        }
                    }
                } 
                System.out.println(maxValue);
               //////////////mode code ends here//////////////////
            
              }
            
            }
            

            【Discussion】:
