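【Question title】: Resample, aggregate, and interpolate TimeSeries trend data
【Posted】: 2012-01-30 04:11:38
【Question description】:

While analyzing energy demand and consumption data, I am having trouble resampling and interpolating time series trend data.

Example data set: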

timestamp                value kWh
------------------       ---------
12/19/2011 5:43:21 PM    79178
12/19/2011 5:58:21 PM    79179.88
12/19/2011 6:13:21 PM    79182.13
12/19/2011 6:28:21 PM    79183.88
12/19/2011 6:43:21 PM    79185.63

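From these observations, I would like to aggregate the values over a period, with the frequency set to one unit of time.

For example, at hourly intervals, filling any gaps of missing data: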

timestamp                value (approx)
------------------       ---------
12/19/2011 5:00:00 PM    79173
12/19/2011 6:00:00 PM    79179
12/19/2011 7:00:00 PM    79186

For a linear algorithm, it seems I would take the time difference and multiply the value by that factor.

TimeSpan ts = current - previous;

Double factor = ts.TotalMinutes / period;

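The values and timestamps can then be calculated from the factor.

With so much information available, I am not sure why it is so hard to find the most elegant approach.

Perhaps first: is there an open-source analysis library that could be recommended?

Any suggestions for a programmatic approach? Ideally C#, or possibly SQL?

Alternatively, are there any similar questions (with answers) I could be pointed to?

【Question discussion】:

    Tags: c# sql .net time-series


    【Solution 1】:

    Probably something like this: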

    -- MySQL syntax: DATE_FORMAT(date, format)
    SELECT DATE_FORMAT(timestamp, '%Y-%m-%d %H') AS day_hour, AVG(value) AS approx FROM table GROUP BY day_hour
    

    Which database engine are you using?
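
For readers who want the same hourly grouping in C#, here is a minimal LINQ sketch. It is not part of the original answer; the class name `HourlyAverage` and method name `ByHour` are made up for illustration, and the samples are assumed to be (timestamp, value) pairs. Like the SQL above, it only averages the samples that fall within each hour; it does not interpolate or fill empty hours.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HourlyAverage
{
    // Groups (timestamp, value) samples by the hour they fall in and averages each group.
    // Hours that contain no samples produce no output row.
    public static List<KeyValuePair<DateTime, double>> ByHour(
        IEnumerable<KeyValuePair<DateTime, double>> samples)
    {
        return samples
            // Truncate each timestamp to the start of its hour to form the group key.
            .GroupBy(s => new DateTime(s.Key.Year, s.Key.Month, s.Key.Day, s.Key.Hour, 0, 0))
            .OrderBy(g => g.Key)
            .Select(g => new KeyValuePair<DateTime, double>(g.Key, g.Average(s => s.Value)))
            .ToList();
    }

    public static void Main()
    {
        // The five samples from the question.
        var samples = new List<KeyValuePair<DateTime, double>>
        {
            new KeyValuePair<DateTime, double>(new DateTime(2011, 12, 19, 17, 43, 21), 79178),
            new KeyValuePair<DateTime, double>(new DateTime(2011, 12, 19, 17, 58, 21), 79179.88),
            new KeyValuePair<DateTime, double>(new DateTime(2011, 12, 19, 18, 13, 21), 79182.13),
            new KeyValuePair<DateTime, double>(new DateTime(2011, 12, 19, 18, 28, 21), 79183.88),
            new KeyValuePair<DateTime, double>(new DateTime(2011, 12, 19, 18, 43, 21), 79185.63),
        };

        foreach (var bucket in ByHour(samples))
        {
            Console.WriteLine("{0:yyyy-MM-dd HH}  {1}", bucket.Key, bucket.Value);
        }
    }
}
```

Because the question's values look like a cumulative meter reading, an hourly average may be less useful than an interpolated reading at each full hour.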

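    【Discussion】:

    • MS SQL Server 2008 Express. This is very close to what I need; however, I would prefer a C# implementation.
    【Solution 2】:

    For what you are doing, it appears that you are declaring the TimeSpan incorrectly, for starters: TimeSpan ts = (TimeSpan)(current - previous); also make sure that current and previous are of type DateTime.

    If you want to look at calculating or rolling up values, I would look at the TotalHours property. Here is an example you can look at if you like; it checks whether the LastWrite / Modified time is within the last 24 hours: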

    if (((TimeSpan)(DateTime.Now - fiUpdateFileFile.LastWriteTime)).TotalHours < 24){}
    

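    I know this is different from your case, but it gives you an idea of how to use TotalHours.

    【Discussion】:

      【Solution 3】:

      By using the time ticks that are used internally to represent DateTimes, you get the most accurate values possible. Since these ticks do not restart from zero at midnight, you will not run into problems at day boundaries.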

      // Sample times and full hour
      DateTime lastSampleTimeBeforeFullHour = new DateTime(2011, 12, 19, 17, 58, 21);
      DateTime firstSampleTimeAfterFullHour = new DateTime(2011, 12, 19, 18, 13, 21);
      DateTime fullHour = new DateTime(2011, 12, 19, 18, 00, 00);
      
      // Times as ticks (most accurate time unit)
      long t0 = lastSampleTimeBeforeFullHour.Ticks;
      long t1 = firstSampleTimeAfterFullHour.Ticks;
      long tf = fullHour.Ticks;
      
      // Energy samples
      double e0 = 79179.88; // kWh before full hour
      double e1 = 79182.13; // kWh after full hour
      double ef; // interpolated energy at full hour
      
      ef = e0 + (tf - t0) * (e1 - e0) / (t1 - t0); // ==> 79180.1275 kWh
      

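      Explanation of the formula
      In geometry, similar triangles are triangles that have the same shape but different sizes. The formula above is based on the fact that the ratio of any two sides of one triangle is the same for the corresponding sides of a similar triangle.

      If you have a triangle A B C and a similar triangle a b c, then A : B = a : b. The equality of two ratios is called a proportion.

      We can apply this proportionality rule to our problem: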

      (e1 - e0) / (t1 - t0) = (ef - e0) / (tf - t0)
      --- large triangle --   --- small triangle --
      

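      【Discussion】:

      • Phenomenal - this is a great foundation - thanks!
      • Great post - very useful
      【Solution 4】:

      I have written a LINQ function to interpolate and normalize time series data so that it can be aggregated/merged.

      The resampling function is shown below. I have written a short article about this technique on Code Project.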

      using System;
      using System.Collections.Generic;

      // The function is an extension method, so it must be defined in a static class.
      public static class ResampleExt
      {
          // Resample an input time series and create a new time series between two 
          // particular dates sampled at a specified time interval.
          public static IEnumerable<OutputDataT> Resample<InputValueT, OutputDataT>(
      
              // Input time series to be resampled.
              this IEnumerable<InputValueT> source,
      
              // Start date of the new time series.
              DateTime startDate,
      
              // Date at which the new time series will have ended.
              DateTime endDate,
      
              // The time interval between samples.
              TimeSpan resampleInterval,
      
              // Function that selects a date/time value from an input data point.
              Func<InputValueT, DateTime> dateSelector,
      
              // Interpolation function that produces a new interpolated data point
              // at a particular time between two input data points.
              Func<DateTime, InputValueT, InputValueT, double, OutputDataT> interpolator
          )
          {
              // ... argument checking omitted ...
      
              //
              // Manually enumerate the input time series...
              // This is manual because the first data point must be treated specially.
              //
              var e = source.GetEnumerator();
              if (e.MoveNext())
              {
                  // Initialize working date to the start date, this variable will be used to 
                  // walk forward in time towards the end date.
                  var workingDate = startDate;
      
                  // Extract the first data point from the input time series.
                  var firstDataPoint = e.Current;
      
                  // Extract the first data point's date using the date selector.
                  var firstDate = dateSelector(firstDataPoint);
      
                  // Loop forward in time until we reach either the date of the first
                  // data point or the end date, whichever comes first.
                  while (workingDate < endDate && workingDate <= firstDate)
                  {
                      // Until we reach the date of the first data point,
                      // use the interpolation function to generate an output
                      // data point from the first data point.
                      yield return interpolator(workingDate, firstDataPoint, firstDataPoint, 0);
      
                      // Walk forward in time by the specified time period.
                      workingDate += resampleInterval; 
                  }
      
                  //
                  // Setup current data point... we will now loop over input data points and 
                  // interpolate between the current and next data points.
                  //
                  var curDataPoint = firstDataPoint;
                  var curDate = firstDate;
      
                  //
                  // After we have reached the first data point, loop over remaining input data points until
                  // either the input data points have been exhausted or we have reached the end date.
                  //
                  while (workingDate < endDate && e.MoveNext())
                  {
                      // Extract the next data point from the input time series.
                      var nextDataPoint = e.Current;
      
                      // Extract the next data point's date using the date selector.
                      var nextDate = dateSelector(nextDataPoint);
      
                      // Calculate the time span between the dates of the current and next data points.
                      var timeSpan = nextDate - curDate;
      
                      // Loop forward in time until we have moved beyond the date of the next data point.
                      while (workingDate <= endDate && workingDate < nextDate)
                      {
                          // The time span from the current date to the working date.
                          var curTimeSpan = workingDate - curDate; 
      
                          // The time between the dates as a percentage (a 0-1 value).
                          var timePct = curTimeSpan.TotalSeconds / timeSpan.TotalSeconds; 
      
                          // Interpolate an output data point at the particular time between 
                          // the current and next data points.
                          yield return interpolator(workingDate, curDataPoint, nextDataPoint, timePct);
      
                          // Walk forward in time by the specified time period.
                          workingDate += resampleInterval; 
                      }
      
                      // Swap the next data point into the current data point so we can move on and continue
                      // the interpolation with each subsequent data point assuming the role of 
                      // 'next data point' in the next iteration of this loop.
                      curDataPoint = nextDataPoint;
                      curDate = nextDate;
                  }
      
                  // Finally loop forward in time until we reach the end date.
                  while (workingDate < endDate)
                  {
                      // Interpolate an output data point generated from the last data point.
                      yield return interpolator(workingDate, curDataPoint, curDataPoint, 1);
      
                      // Walk forward in time by the specified time period.
                      workingDate += resampleInterval; 
                  }
              }
          }
      }
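
The `interpolator` parameter is left to the caller. As a sketch of what one might look like for the energy readings in the question, the block below defines a hypothetical `Reading` type and a linear `Lerp` method with the parameter order the `Resample` signature expects (time, current point, next point, 0-1 fraction). Neither name comes from the original answer.

```csharp
using System;

// Hypothetical data point type for illustration; not part of the original answer.
public class Reading
{
    public DateTime Timestamp;
    public double KWh;
}

public static class ReadingInterpolation
{
    // Linear interpolation between two readings.
    // timePct is 0 at 'current' and 1 at 'next'.
    public static Reading Lerp(DateTime time, Reading current, Reading next, double timePct)
    {
        return new Reading
        {
            Timestamp = time,
            KWh = current.KWh + (next.KWh - current.KWh) * timePct
        };
    }

    public static void Main()
    {
        var before = new Reading { Timestamp = new DateTime(2011, 12, 19, 17, 58, 21), KWh = 79179.88 };
        var after = new Reading { Timestamp = new DateTime(2011, 12, 19, 18, 13, 21), KWh = 79182.13 };
        var fullHour = new DateTime(2011, 12, 19, 18, 0, 0);

        // The same 0-1 fraction that Resample computes as timePct.
        double timePct = (fullHour - before.Timestamp).TotalSeconds
                       / (after.Timestamp - before.Timestamp).TotalSeconds;

        // The method group converts to the delegate type of the interpolator parameter.
        Func<DateTime, Reading, Reading, double, Reading> interpolator = Lerp;
        Reading atFullHour = interpolator(fullHour, before, after, timePct);

        Console.WriteLine(atFullHour.KWh); // approximately 79180.1275
    }
}
```

With the `ResampleExt` class above in scope, a call along the lines of `readings.Resample<Reading, Reading>(start, end, TimeSpan.FromHours(1), r => r.Timestamp, ReadingInterpolation.Lerp)` should yield one interpolated reading per hour. Before the first and after the last input point, `Resample` passes the same data point twice, so `Lerp` simply holds that point's value.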
      

      【Discussion】:
