Pretty-printing assertEqual() for HTML strings: answers

【Question title】: Pretty print assertEqual() for HTML strings
【Posted】: 2011-11-04 09:03:03
【Question】:

I want to compare two strings that contain HTML in a Python unittest.

Is there a way to output the result in a human-friendly (diff-like) form?

【Question comments】:

Django has had assertHTMLEqual since version 1.4: docs.djangoproject.com/en/dev/topics/testing/…

Tags: python html unit-testing testing

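【Solution 1】:

A simple approach is to strip the whitespace from the HTML and split it into a list of lines. Python 2.7's unittest (or the backported unittest2) then reports a human-readable diff between the two lists.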

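import re
import unittest

def split_html(html):
    # Strip surrounding whitespace, then split on newlines together with
    # any indentation around them.
    return re.split(r'\s*\n\s*', html.strip())

class RenderHtmlTest(unittest.TestCase):
    def test_render_html(self):
        expected = ['<div>', '...', '</div>']
        got = split_html(render_html())  # render_html() is the code under test
        self.assertEqual(expected, got)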

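When I write a test for code that already works, I usually start with expected = [], insert self.maxDiff = None before the assertion, and let the test fail once. The expected list can then be copied and pasted from the test output.

You may need to adjust how whitespace is stripped, depending on what your HTML looks like.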
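
The fail-once workflow could look roughly like the sketch below. It is only an illustration: `render_html()` is a hypothetical stand-in for the function under test, and `first_run_message()` is a helper name made up here to capture the message unittest would print on that first, deliberately failing run.

```python
import re
import unittest


def split_html(html):
    # Drop leading/trailing whitespace, then split on newlines together with
    # the indentation around them.
    return re.split(r'\s*\n\s*', html.strip())


def render_html():
    # Hypothetical stand-in for the real function under test.
    return "<div>\n    <p>hello</p>\n</div>\n"


def first_run_message(got):
    """Return the failure text produced by comparing an empty list with `got`."""
    case = unittest.TestCase()
    case.maxDiff = None  # do not truncate the diff
    try:
        case.assertEqual([], got)
    except AssertionError as exc:
        return str(exc)
    return ''  # `got` was empty, so there is nothing to copy


if __name__ == '__main__':
    # The printed message contains the list to paste into `expected`.
    print(first_run_message(split_html(render_html())))
```

Once the list has been pasted into `expected`, the same assertion serves as a regression test, and any later change shows up as a per-line list diff.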

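【Comments】:

【Solution 2】:

A few years ago I submitted a patch to do this. The patch was rejected, but you can still view it on the Python bug list.

I doubt you want to hack your unittest.py to apply the patch (if it even still works after all this time), but here is a function that shortens two strings to a manageable size while still keeping at least part of the difference. As long as you don't need a complete diff, this may be what you want: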

def shortdiff(x,y):
    '''shortdiff(x,y)

    Compare strings x and y and display differences.
    If the strings are too long, shorten them to fit
    in one line, while still keeping at least some difference.
    '''
    import difflib
    LINELEN = 79
    def limit(s):
        if len(s) > LINELEN:
            return s[:LINELEN-3] + '...'
        return s

    def firstdiff(s, t):
        span = 1000
        for pos in range(0, max(len(s), len(t)), span):
            if s[pos:pos+span] != t[pos:pos+span]:
                for index in range(pos, pos+span):
                    if s[index:index+1] != t[index:index+1]:
                        return index

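    left = LINELEN // 4  # integer division, because left is used as a slice index
    index = firstdiff(x, y) or 0  # firstdiff() returns None when x == y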
    if index > left + 7:
        x = x[:left] + '...' + x[index-4:index+LINELEN]
        y = y[:left] + '...' + y[index-4:index+LINELEN]
    else:
        x, y = x[:LINELEN+1], y[:LINELEN+1]
        left = 0

    cruncher = difflib.SequenceMatcher(None)
    xtags = ytags = ""
    cruncher.set_seqs(x, y)
    editchars = { 'replace': ('^', '^'),
                  'delete': ('-', ''),
                  'insert': ('', '+'),
                  'equal': (' ',' ') }
    for tag, xi1, xi2, yj1, yj2 in cruncher.get_opcodes():
        lx, ly = xi2 - xi1, yj2 - yj1
        edits = editchars[tag]
        xtags += edits[0] * lx
        ytags += edits[1] * ly

    # Include ellipsis in edits line.
    if left:
        xtags = xtags[:left] + '...' + xtags[left+3:]
        ytags = ytags[:left] + '...' + ytags[left+3:]

    diffs = [ x, xtags, y, ytags ]
    if max([len(s) for s in diffs]) < LINELEN:
        return '\n'.join(diffs)

    diffs = [ limit(s) for s in diffs ]
    return '\n'.join(diffs)
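
The marker lines in shortdiff() come from difflib.SequenceMatcher.get_opcodes(). That one step, taken on its own, might be sketched as follows; `edit_markers` is a name chosen for this illustration and is not part of the rejected patch.

```python
import difflib


def edit_markers(x, y):
    """Return (x_marks, y_marks), one marker character per character of x and y.

    '^' marks a replaced character, '-' a character only in x,
    '+' a character only in y, and ' ' an unchanged character.
    """
    marks = {
        'replace': ('^', '^'),
        'delete': ('-', ''),
        'insert': ('', '+'),
        'equal': (' ', ' '),
    }
    x_marks = y_marks = ''
    matcher = difflib.SequenceMatcher(None, x, y)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Each opcode covers x[i1:i2] and y[j1:j2]; repeat the marker so the
        # marker line stays aligned with the string it annotates.
        x_marks += marks[tag][0] * (i2 - i1)
        y_marks += marks[tag][1] * (j2 - j1)
    return x_marks, y_marks


if __name__ == '__main__':
    old, new = '<p>a</p>', '<p>b</p>'
    old_marks, new_marks = edit_markers(old, new)
    print('\n'.join([old, old_marks, new, new_marks]))
```

Printing each string directly above its marker line gives the same four-line layout that shortdiff() returns, just without the truncation to one terminal line.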

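【Comments】:

【Solution 3】:

This may be a rather "verbose" solution. You can add a new "equality function" for a user-defined type (for example HTMLString), which you have to define first: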

class HTMLString(str):
    pass

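Now you have to define a type equality function:

def assertHTMLStringEqual(first, second, msg=None):
    # unittest calls type equality functions with msg as a keyword argument.
    if first != second:
        message = ... # TODO here: format your message, e.g. a diff
        raise AssertionError(message)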

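All you have to do is format the message to your liking. You can also use a method of your specific TestCase as the type equality function. This gives you more options for formatting the message, since unittest.TestCase does that a lot itself.

Now you have to register this equality function in your unittest.TestCase:

...
def __init__(self, *args, **kwargs):
    # TestCase.__init__ takes the test method name, so pass the arguments on.
    unittest.TestCase.__init__(self, *args, **kwargs)
    self.addTypeEqualityFunc(HTMLString, assertHTMLStringEqual)

The same for a method:

...
def __init__(self, *args, **kwargs):
    unittest.TestCase.__init__(self, *args, **kwargs)
    # A string is resolved to the method of that name on the test case.
    self.addTypeEqualityFunc(HTMLString, 'assertHTMLStringEqual')

Now you can use it in your tests: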

def test_something(self):
    htmlstring1 = HTMLString(...)
    htmlstring2 = HTMLString(...)
    self.assertEqual(htmlstring1, htmlstring2)

This should work on Python 2.7.
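
Put together, the pieces might look like the sketch below. It is written for Python 3 and fills the "format your message" step with difflib.unified_diff, which is one possible choice and not something the answer prescribes. The class name `HTMLTestCase` is made up for the example.

```python
import difflib
import unittest


class HTMLString(str):
    """Marker type: assertEqual() picks the custom comparison for it."""


class HTMLTestCase(unittest.TestCase):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Used only when both arguments of assertEqual() are exactly HTMLString.
        self.addTypeEqualityFunc(HTMLString, 'assertHTMLStringEqual')

    def assertHTMLStringEqual(self, first, second, msg=None):
        if first == second:
            return
        # Build a line-based unified diff as the failure message.
        diff = '\n'.join(difflib.unified_diff(
            first.splitlines(), second.splitlines(),
            'first', 'second', lineterm=''))
        raise self.failureException(msg or 'HTML differs:\n' + diff)


class ExampleTest(HTMLTestCase):
    def test_same_markup(self):
        self.assertEqual(HTMLString('<p>a</p>'), HTMLString('<p>a</p>'))


if __name__ == '__main__':
    unittest.main()
```

Note that the custom function is skipped if either argument is a plain str, so both sides have to be wrapped in HTMLString.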

【Comments】:

【Solution 4】:

I (the person who asked this question) now use BeautifulSoup:

import logging

def assertEqualHTML(string1, string2, file1='', file2=''):
    u'''
    Compare two unicode strings containing HTML.
    A human-friendly diff goes to logging.error() if they
    are not equal, and an exception is raised.

    Written for Python 2 and BeautifulSoup 3.
    '''
    from BeautifulSoup import BeautifulSoup as bs
    import difflib

    def short(mystr):
        # Truncate long strings so the error message stays readable.
        limit = 20
        if len(mystr) > limit:
            return mystr[:limit]
        return mystr

    p = []
    for mystr, file in [(string1, file1), (string2, file2)]:
        if not isinstance(mystr, unicode):
            raise Exception(u'string is not unicode: %r %s' % (short(mystr), file))
        # Pretty-print first so that the diff is computed line by line.
        soup = bs(mystr)
        pretty = soup.prettify()
        p.append(pretty)
    if p[0] != p[1]:
        for line in difflib.unified_diff(p[0].splitlines(), p[1].splitlines(), fromfile=file1, tofile=file2):
            logging.error(line)
        raise Exception('Not equal %s %s' % (file1, file2))
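
For readers on Python 3 without BeautifulSoup installed, the same normalize-then-diff idea can be approximated with the standard library alone. The sketch below swaps soup.prettify() for a small html.parser.HTMLParser subclass. It is a rough substitute rather than an equivalent: it drops comments, does not normalize attribute order or quoting, and lets indentation drift after void elements such as <br>. The names `pretty_lines` and `html_diff` are made up here.

```python
import difflib
from html.parser import HTMLParser


class _OneNodePerLine(HTMLParser):
    """Collect every tag and every non-blank text node as one indented line."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self.depth = 0

    def handle_starttag(self, tag, attrs):
        self.lines.append('  ' * self.depth + self.get_starttag_text())
        self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> do not open a new nesting level.
        self.lines.append('  ' * self.depth + self.get_starttag_text())

    def handle_endtag(self, tag):
        self.depth = max(self.depth - 1, 0)
        self.lines.append('  ' * self.depth + '</%s>' % tag)

    def handle_data(self, data):
        text = ' '.join(data.split())  # collapse runs of whitespace
        if text:
            self.lines.append('  ' * self.depth + text)


def pretty_lines(html):
    parser = _OneNodePerLine()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.lines


def html_diff(first, second):
    """Return a unified diff of the normalized markup, or '' if it matches."""
    return '\n'.join(difflib.unified_diff(
        pretty_lines(first), pretty_lines(second),
        'first', 'second', lineterm=''))


if __name__ == '__main__':
    print(html_diff('<div><p>a</p></div>', '<div>\n  <p>b</p>\n</div>'))
```

Because whitespace-only text between tags is ignored, two documents that differ only in indentation or line breaks produce an empty diff.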

【Comments】: