【Question title】: Removing extra whitespace from generated HTML in MVC
【Posted】: 2010-10-25 17:26:30
【Question】:

I have an MVC application view that generates a fairly large HTML table of values (>20MB).

I am compressing the view in the controller with a compression filter:

 using System.IO.Compression;
 using System.Web;
 using System.Web.Mvc;

 internal class CompressFilter : ActionFilterAttribute
 {
     public override void OnActionExecuting(ActionExecutingContext filterContext)
     {
         HttpRequestBase request = filterContext.HttpContext.Request;
         string acceptEncoding = request.Headers["Accept-Encoding"];
         // Nothing to do if the client did not advertise any encodings.
         if (string.IsNullOrEmpty(acceptEncoding))
             return;
         acceptEncoding = acceptEncoding.ToUpperInvariant();
         HttpResponseBase response = filterContext.HttpContext.Response;
         // Prefer gzip, fall back to deflate; the wrapped filter compresses output as it is written.
         if (acceptEncoding.Contains("GZIP"))
         {
             response.AppendHeader("Content-encoding", "gzip");
             response.Filter = new GZipStream(response.Filter, CompressionMode.Compress);
         }
         else if (acceptEncoding.Contains("DEFLATE"))
         {
             response.AppendHeader("Content-encoding", "deflate");
             response.Filter = new DeflateStream(response.Filter, CompressionMode.Compress);
         }
     }
 }

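Is there a way to strip the (fairly large amount of) redundant whitespace generated in the view before I run the compression filter, to reduce both the compression workload and the size?

EDIT: I got it working using the WhiteSpaceFilter technique suggested by Womp below.

For interest, here are the results as measured with Firebug:

1) No compression, no whitespace strip - 21MB, 2.59 minutes
2) GZIP compression, no whitespace strip - 2MB, 17.59s
3) GZIP compression, with whitespace strip - 558kB, 12.77s

So it is certainly worth it.

【Comments】:

  • Interesting results, thanks for posting.
  • I know this is old, but would you mind posting the complete code?

Tags: html asp.net-mvc compression whitespace http-compression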


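【Solution 1】:

This guy wrote a neat little whitespace compactor that simply runs a fast block copy of the bytes through a regex to strip out blocks of whitespace. He wrote it as an HTTP module, but you could lift the seven workhorse lines out of it and drop them into your function.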
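To make the idea concrete, here is a minimal sketch of regex-based whitespace stripping on a string. It is not the code from the linked module: the class name `WhitespaceStripper` and the pattern `>\s+<` are assumptions chosen for illustration. Note that removing all whitespace between tags can change rendering for inline elements and for `<pre>` content, so treat it as a starting point only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not taken from the linked HTTP module.
public static class WhitespaceStripper
{
    // Assumed pattern: any run of whitespace between a '>' and the next '<'.
    private static readonly Regex BetweenTags = new Regex(@">\s+<", RegexOptions.Compiled);

    public static string Strip(string html)
    {
        return BetweenTags.Replace(html, "><");
    }

    public static void Main()
    {
        string html = "<table>\n    <tr>\n        <td>1</td>\n    </tr>\n</table>";
        Console.WriteLine(Strip(html)); // prints <table><tr><td>1</td></tr></table>
    }
}
```

Inside a response filter the same `Strip` call would be applied to the decoded text of each write before re-encoding it.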

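【Comments】:

  • Just a warning to people: if you read the comments at that link, the solution has some flaws.
  • RegEx is not "fast". See my measurements at stackoverflow.com/a/15014794/141172
【Solution 2】:

Whitespace can be removed at compile time by extending Razor. This eliminates the runtime cost of stripping whitespace from the generated HTML, which by my measurements is quite significant: on a high-end i7, trimming a 100KB document with the RegEx-based code found on Stack Overflow took as much as 88 ms.

An implementation of the compile-time solution for MVC 3 and MVC 4 is available here:

Meleze.Web

The solution is described at

http://cestdumeleze.net/blog/2011/minifying-the-html-with-asp-net-mvc-and-razor/

(but use the GitHub code or the NuGet DLL, because the code in the blog post only covers MVC 3).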
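If you want to check the runtime cost on your own hardware, a rough timing harness along the following lines may help. This is only a sketch: the sample document is synthetic, the class and method names are made up for illustration, and the pattern is borrowed from another answer in this thread as a stand-in, since the exact code behind the 88 ms figure is not shown here.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical benchmark helper; names and sample data are illustrative only.
public static class StripBenchmark
{
    // Builds an indented table of roughly rows * 40 characters.
    public static string BuildSampleHtml(int rows)
    {
        var sb = new StringBuilder("<table>\n");
        for (int i = 0; i < rows; i++)
        {
            sb.Append("    <tr>\n        <td>").Append(i).Append("</td>\n    </tr>\n");
        }
        return sb.Append("</table>").ToString();
    }

    // Returns the average time per Replace call over the given number of iterations.
    public static TimeSpan Measure(Regex regex, string html, int iterations)
    {
        regex.Replace(html, string.Empty); // warm-up run, not timed
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.Replace(html, string.Empty);
        }
        watch.Stop();
        return TimeSpan.FromTicks(watch.Elapsed.Ticks / iterations);
    }

    public static void Main()
    {
        string html = BuildSampleHtml(2500); // roughly 100 KB
        var regex = new Regex(@"(?<=\s)\s+(?![^<>]*</pre>)");
        TimeSpan perCall = Measure(regex, html, 20);
        Console.WriteLine("{0} chars, {1:F2} ms per call", html.Length, perCall.TotalMilliseconds);
    }
}
```

The numbers will vary with the pattern, the document and the machine, so compare them against your own page rather than against the figure quoted above.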

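【Comments】:

  • @MarcinHabuszewski: The first link is not dead. I just clicked it and it loads. Perhaps a temporary problem at github.com?
  • I see you misread my comment. I wrote that the second link is dead. As for the first, I meant that even if it were dead (it isn't), it would still be worth something, because it contains the name of the tool you suggest using, although that is the only place in the answer where the name appears.
【Solution 3】:

Here is a VB.NET version of the whitespace filter attribute that I use in my projects: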

#Region "Imports"

    Imports System.IO
    Imports System.Text
    Imports System.Text.RegularExpressions
    Imports System.Web.Mvc

#End Region

Namespace MyCompany.Web.Mvc.Extensions.ActionFilters

    ''' <summary>
    ''' WhitespaceFilter attribute
    ''' </summary>
    Public NotInheritable Class WhitespaceFilterAttribute
        Inherits ActionFilterAttribute

        ''' <summary>
        ''' Called when action executing.   
        ''' </summary>
        ''' <param name="filterContext">The filter context.</param>
        ''' <remarks></remarks>
        Public Overrides Sub OnActionExecuting(filterContext As ActionExecutingContext)

                filterContext.HttpContext.Response.Filter = New WhitespaceFilterStream(filterContext.HttpContext.Response.Filter)

        End Sub

    #Region "Whitespace stream filter"

            ''' <summary>
            ''' Whitespace stream filter
            ''' </summary>
            Private Class WhitespaceFilterStream
                Inherits Stream

    #Region "Declarations"

                ' Member vars.
                Private Shared regexPattern As New Regex("(?<=[^])\t{2,}|(?<=[>])\s{2,}(?=[<])|(?<=[>])\s{2,11}(?=[<])|(?=[\n])\s{2,}")
                ' Property vars.
                Private sinkStreamValue As Stream
                Private positionValue As Long

    #End Region

    #Region "Constructor(s)"

                ''' <summary>
                ''' Constructor to create a new object.
                ''' </summary>
                ''' <param name="sink"></param>
                ''' <remarks></remarks>
                Public Sub New(sink As Stream)

                    Me.sinkStreamValue = sink

                End Sub

    #End Region

    #Region "Properties"

                ''' <summary>
                ''' Gets the CanRead value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides ReadOnly Property CanRead() As Boolean
                    Get
                        Return True
                    End Get
                End Property

                ''' <summary>
                ''' Gets the CanSeek value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides ReadOnly Property CanSeek() As Boolean
                    Get
                        Return True
                    End Get
                End Property

                ''' <summary>
                ''' Gets the CanWrite value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides ReadOnly Property CanWrite() As Boolean
                    Get
                        Return True
                    End Get
                End Property

                ''' <summary>
                ''' Get Length value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides ReadOnly Property Length() As Long
                    Get
                        Return 0
                    End Get
                End Property

                ''' <summary>
                ''' Get or sets Position value.
                ''' </summary>
                ''' <value></value>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides Property Position() As Long
                    Get
                        Return Me.positionValue
                    End Get
                    Set(value As Long)
                        Me.positionValue = value
                    End Set
                End Property

    #End Region

    #Region "Stream Overrides Methods"

                ''' <summary>
                ''' Stream object Close method.
                ''' </summary>
                ''' <remarks></remarks>
                Public Overrides Sub Close()

                    Me.sinkStreamValue.Close()

                End Sub

                ''' <summary>
                ''' Stream object Flush method.
                ''' </summary>
                ''' <remarks></remarks>
                Public Overrides Sub Flush()

                    Me.sinkStreamValue.Flush()

                End Sub

                ''' <summary>
                ''' Stream object Read method.
                ''' </summary>
                ''' <param name="buffer"></param>
                ''' <param name="offset"></param>
                ''' <param name="count"></param>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides Function Read(buffer As Byte(), offset As Integer, count As Integer) As Integer

                    Return Me.sinkStreamValue.Read(buffer, offset, count)

                End Function

                ''' <summary>
                ''' Stream object Seek method.
                ''' </summary>
                ''' <param name="offset"></param>
                ''' <param name="origin"></param>
                ''' <returns></returns>
                ''' <remarks></remarks>
                Public Overrides Function Seek(offset As Long, origin As SeekOrigin) As Long

                    Return Me.sinkStreamValue.Seek(offset, origin)

                End Function

                ''' <summary>
                ''' Stream object SetLength method.
                ''' </summary>
                ''' <param name="value"></param>
                ''' <remarks></remarks>
                Public Overrides Sub SetLength(value As Long)

                    Me.sinkStreamValue.SetLength(value)

                End Sub

                ''' <summary>
                ''' Stream object Write method.
                ''' </summary>
                ''' <param name="bufferBytes"></param>
                ''' <param name="offset"></param>
                ''' <param name="count"></param>
                ''' <remarks></remarks>
                Public Overrides Sub Write(bufferBytes As Byte(), offset As Integer, count As Integer)

                    ' Decode only the bytes that belong to this write (offset/count), not the whole buffer.
                    Dim html As String = Encoding.Default.GetString(bufferBytes, offset, count)

                    html = regexPattern.Replace(html, String.Empty)
                    ' Encode once and forward the stripped output to the wrapped stream.
                    Dim outBytes As Byte() = Encoding.Default.GetBytes(html)
                    Me.sinkStreamValue.Write(outBytes, 0, outBytes.Length)

                End Sub

    #End Region

            End Class

    #End Region

        End Class

    End Namespace

In Global.asax.vb:

Shared Sub RegisterGlobalFilters(ByVal filters As GlobalFilterCollection)

    With filters
        ' Standard MVC filters
        .Add(New HandleErrorAttribute())
        ' MyCompany MVC filters
        .Add(New CompressionFilterAttribute)
        .Add(New WhitespaceFilterAttribute)
    End With

End Sub

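【Comments】:

  • I searched the internet for hours for an answer explaining how to compress and strip whitespace at the same time, and this is the only one that actually works! I also learned that adding these attributes as global filters does the job for every page without needing a BaseController. Thanks @Ed Degagne. One thing I changed is the Write method: var html = Encoding.UTF8.GetString(buffer, offset, count); var reg = new Regex(@"(?]*)"); html = reg.Replace(html, string.Empty); buffer = Encoding.UTF8.GetBytes(html); _base.Write(buffer, 0, buffer.Length);
  • @DavidLétourneau Can you explain why you changed it? Thanks!!
  • Sure, I had some HTML rendering problems. For example, input buttons were not displayed correctly, along with a few other HTML components. That's the only reason =)
  • Awesome. . . . .
【Solution 4】:

@womp has already suggested a good approach, but that module is outdated. I had been using it, but it turned out not to be the best way. Here is the question I asked about it:

Remove white space from entire Html but inside pre with regular expressions

Here is how I do it: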

public class RemoveWhitespacesAttribute : ActionFilterAttribute {

    public override void OnActionExecuted(ActionExecutedContext filterContext) {

        var response = filterContext.HttpContext.Response;

        //Temp fix. I am not sure what causes this but ContentType is coming as text/html
        if (filterContext.HttpContext.Request.RawUrl != "/sitemap.xml") {

            if (response.ContentType == "text/html" && response.Filter != null) {
                response.Filter = new HelperClass(response.Filter);
            }
        }
    }

    private class HelperClass : Stream {

        private System.IO.Stream Base;

        public HelperClass(System.IO.Stream ResponseStream) {

            if (ResponseStream == null)
                throw new ArgumentNullException("ResponseStream");
            this.Base = ResponseStream;
        }

        StringBuilder s = new StringBuilder();

        public override void Write(byte[] buffer, int offset, int count) {

            string HTML = Encoding.UTF8.GetString(buffer, offset, count);

            //Thanks to Qtax
            //https://stackoverflow.com/questions/8762993/remove-white-space-from-entire-html-but-inside-pre-with-regular-expressions
            Regex reg = new Regex(@"(?<=\s)\s+(?![^<>]*</pre>)");
            HTML = reg.Replace(HTML, string.Empty);

            buffer = System.Text.Encoding.UTF8.GetBytes(HTML);
            this.Base.Write(buffer, 0, buffer.Length);
        }

        #region Other Members

        public override int Read(byte[] buffer, int offset, int count) {

            throw new NotSupportedException();
        }

        public override bool CanRead{ get { return false; } }

        public override bool CanSeek{ get { return false; } }

        public override bool CanWrite{ get { return true; } }

        public override long Length{ get { throw new NotSupportedException(); } }

        public override long Position {

            get { throw new NotSupportedException(); }
            set { throw new NotSupportedException(); }
        }

        public override void Flush() {

            Base.Flush();
        }

        public override long Seek(long offset, SeekOrigin origin) {

            throw new NotSupportedException();
        }

        public override void SetLength(long value) {

            throw new NotSupportedException();
        }

        #endregion
    }

}

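【Comments】:

  • I ran a benchmark this morning. On my setup (a high-end i7), processing a fairly large HTML file of ca. 78KB with this filter took 250 ms. I also noticed that for large files the filter is called multiple times... which means this could fail badly if the file is sliced in the wrong place. In theory there is a better way: removing the extra whitespace at compile time, but the only solution I have seen so far failed in my environment. github.com/meleze/Meleze.Web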
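One way to sidestep the slicing problem raised in that comment is to collect every write and run the filter once over the complete document. The sketch below assumes that approach; `BufferingFilterStream` is a hypothetical name, UTF-8 output is assumed, and the demo uses a `MemoryStream` in place of `Response.Filter` so it can run outside ASP.NET. The trade-off is memory: the whole response is held in RAM until the stream is closed, which matters for a 20MB page.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: buffers all writes and filters once, when the stream is disposed.
public sealed class BufferingFilterStream : Stream
{
    private readonly Stream _sink;
    private readonly Func<string, string> _filter;
    private readonly MemoryStream _pending = new MemoryStream();
    private bool _done;

    public BufferingFilterStream(Stream sink, Func<string, string> filter)
    {
        if (sink == null) throw new ArgumentNullException("sink");
        if (filter == null) throw new ArgumentNullException("filter");
        _sink = sink;
        _filter = filter;
    }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    // Only collect bytes here: a chunk boundary may fall inside a whitespace run
    // or inside a multi-byte character, so nothing is decoded yet.
    public override void Write(byte[] buffer, int offset, int count)
    {
        _pending.Write(buffer, offset, count);
    }

    // Deliberately a no-op: the complete document is not known until disposal.
    public override void Flush() { }

    protected override void Dispose(bool disposing)
    {
        if (disposing && !_done)
        {
            _done = true;
            string html = Encoding.UTF8.GetString(_pending.ToArray());
            byte[] output = Encoding.UTF8.GetBytes(_filter(html));
            _sink.Write(output, 0, output.Length);
            _sink.Flush(); // the sink is left open for the caller to close
            _pending.Dispose();
        }
        base.Dispose(disposing);
    }

    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }

    public static void Main()
    {
        var sink = new MemoryStream();
        var regex = new Regex(@"(?<=\s)\s+(?![^<>]*</pre>)"); // pattern from the answer above
        using (var filter = new BufferingFilterStream(sink, s => regex.Replace(s, string.Empty)))
        {
            // The whitespace run is split across two writes on purpose.
            byte[] part1 = Encoding.UTF8.GetBytes("<p>a</p>  ");
            byte[] part2 = Encoding.UTF8.GetBytes("   <p>b</p>");
            filter.Write(part1, 0, part1.Length);
            filter.Write(part2, 0, part2.Length);
        }
        Console.WriteLine(Encoding.UTF8.GetString(sink.ToArray())); // prints <p>a</p> <p>b</p>
    }
}
```

Filtering each write separately would leave two spaces between the paragraphs in this example, because the second chunk starts with whitespace that has nothing before it to match the lookbehind.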
【Solution 5】:
#region Stream filter
class StringFilterStream : Stream
{
  private Stream _sink;
  private Func<string, string> _filter;

  public StringFilterStream(Stream sink, Func<string, string> filter) {
    _sink = sink;
    _filter = filter;
  }

  #region Mixin Properties/Methods
  public override bool CanRead { get { return true; } }
  public override bool CanSeek { get { return true; } }
  public override bool CanWrite { get { return true; } }
  public override void Flush() { _sink.Flush(); }
  public override long Length { get { return 0; } }
  private long _position;
  public override long Position {
    get { return _position; }
    set { _position = value; }
  }
  public override int Read(byte[] buffer, int offset, int count) {
    return _sink.Read(buffer, offset, count);
  }
  public override long Seek(long offset, SeekOrigin origin) {
    return _sink.Seek(offset, origin);
  }
  public override void SetLength(long value) {
    _sink.SetLength(value);
  }
  public override void Close() {
    _sink.Close();
  }
  #endregion

  public override void Write(byte[] buffer, int offset, int count) {
    // intercept the data and convert to string (only the offset/count slice of this write)
    string s = Encoding.Default.GetString(buffer, offset, count);

    // apply the filter
    s = _filter(s);

    // write the data back to stream
    byte[] outdata = Encoding.Default.GetBytes(s);
    _sink.Write(outdata, 0, outdata.GetLength(0));
  }
}
#endregion

public enum WebWhitespaceFilterContentType
{
  Xml = 0, Css = 1, Javascript = 2
}
public class WebWhitespaceFilterAttribute : ActionFilterAttribute
{
  private WebWhitespaceFilterContentType _contentType;

  public WebWhitespaceFilterAttribute() {
    _contentType = WebWhitespaceFilterContentType.Xml;
  }
  public WebWhitespaceFilterAttribute(WebWhitespaceFilterContentType contentType) {
    _contentType = contentType;
  }

  public override void OnActionExecuting(ActionExecutingContext filterContext) {

    var request = filterContext.HttpContext.Request;
    var response = filterContext.HttpContext.Response;

    switch (_contentType) {
      case WebWhitespaceFilterContentType.Xml:

        response.Filter = new StringFilterStream(response.Filter, s => {
          s = Regex.Replace(s, @"\s+", " ");
          s = Regex.Replace(s, @"\s*\n\s*", "\n");
          s = Regex.Replace(s, @"\s*\>\s*\<\s*", "><");
          // single-line doctype must be preserved
          var firstEndBracketPosition = s.IndexOf(">");
          if (firstEndBracketPosition >= 0) {
            s = s.Remove(firstEndBracketPosition, 1);
            s = s.Insert(firstEndBracketPosition, ">\n");
          }
          return s;
        });
        break;

      case WebWhitespaceFilterContentType.Css:
      case WebWhitespaceFilterContentType.Javascript:

        response.Filter = new StringFilterStream(response.Filter, s => {
          s = Regex.Replace(s, @"/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/", "");
          s = Regex.Replace(s, @"\s+", " ");
          s = Regex.Replace(s, @"\s*{\s*", "{");
          s = Regex.Replace(s, @"\s*}\s*", "}");
          s = Regex.Replace(s, @"\s*;\s*", ";");
          return s;
        });
        break;
    }
  }
}

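【Comments】:

  • Some low-quality code. 1) \s+ already matches newlines, so why the \s*\n\s* on the next line? 2) It should work in OnResultExecuted rather than in the OnActionExecuting method, because by then we know the content type set by the controller. 3) It breaks if // line comments are used, and … elements and …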
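Point 1 of that comment is easy to check in isolation. The short sketch below (class and method names are made up for illustration) applies the first replacement from the answer and then tests whether the second pattern can still match anything.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical check, written only to illustrate point 1 of the comment.
public static class RedundantRegexCheck
{
    // First replacement from the answer: every whitespace run becomes one space.
    public static string Collapse(string s)
    {
        return Regex.Replace(s, @"\s+", " ");
    }

    // Second pattern from the answer: it requires a literal newline.
    public static bool SecondPatternMatches(string s)
    {
        return Regex.IsMatch(s, @"\s*\n\s*");
    }

    public static void Main()
    {
        string collapsed = Collapse("<a>\n   <b>x</b>\r\n</a>");
        Console.WriteLine(collapsed);                       // prints <a> <b>x</b> </a>
        Console.WriteLine(SecondPatternMatches(collapsed)); // prints False
    }
}
```

Since `\s` includes `\n`, no newline survives the first replacement, so the second one never has anything to do.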
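【Solution 6】:

Whitespace compresses very well; I don't think removing it will save you much.

If possible, I would suggest trying to offload some of the HTML to the client, using JavaScript to rebuild the repeated content.

【Comments】:

  • I completely agree... I thought whitespace would all but disappear under GZIP. I'm surprised to see that an extra layer of "whitespace removal" has any effect on the size compared with GZIP alone.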
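Whether stripping still pays off after GZIP depends on the markup, so it is worth measuring on your own output. The following sketch compares the four sizes for a synthetic indented table; the table generator, the class name and the `>\s+<` pattern are assumptions for illustration, and the result for a real page may differ from the figures reported in the question.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical measurement helper; the sample table is synthetic.
public static class GzipSizeComparison
{
    // Returns the size in bytes of the GZIP-compressed UTF-8 encoding of the text.
    public static long GzipLength(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            // leaveOpen keeps the MemoryStream usable after the GZipStream is disposed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.Length;
        }
    }

    // Builds an indented two-column table with varying cell values.
    public static string BuildTable(int rows)
    {
        var sb = new StringBuilder("<table>\n");
        for (int i = 0; i < rows; i++)
        {
            sb.Append("    <tr>\n        <td>").Append(i)
              .Append("</td>\n        <td>").Append(i * 7)
              .Append("</td>\n    </tr>\n");
        }
        return sb.Append("</table>").ToString();
    }

    public static void Main()
    {
        string indented = BuildTable(5000);
        string stripped = Regex.Replace(indented, @">\s+<", "><");
        Console.WriteLine("indented: {0} bytes raw, {1} bytes gzipped",
            Encoding.UTF8.GetByteCount(indented), GzipLength(indented));
        Console.WriteLine("stripped: {0} bytes raw, {1} bytes gzipped",
            Encoding.UTF8.GetByteCount(stripped), GzipLength(stripped));
    }
}
```

Running it against a saved copy of the real page, instead of `BuildTable`, gives the comparison that matters for your application.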
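【Solution 7】:

If you are returning JSON from the view, it is already minified and should not contain any whitespace or CR/LF. You should use paging to avoid sending that much data to the browser at once.

【Comments】:

  • That depends on which JSON library you use and how it is configured.
  • This should be a comment rather than an answer.
【Solution 8】:

I would say that if your view generates more than 20MB of data, you may want to look into different ways of displaying it, perhaps paging?

【Comments】:

  • Due to the particular nature of the application, that is unfortunately not possible.
  • Doesn't the browser choke on parsing something that huge?
  • No. It seems fine under IE6/7/8, Safari and Firefox.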