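【Question title】: Where is the circular reference?
【Posted】: 2025-04-21 11:20:02
【Question description】:

I am writing some Python classes that I want to encode as JSON. When I try to jsonify my objects, I get an error mentioning a "circular reference". I think I understand what a circular reference is, but I can't find one anywhere in my code.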

Relationships between the objects (has-a / is-a)

  • Signup has a
  • Registrant has an
  • Address

Code (Python):

class Address:
    def __init__(self, address1, address2, city, state, zip):
        self.address1 = address1
        self.address2 = address2
        self.city = city
        self.state = state
        self.zip = zip

class Signup:
    def __init__(self, registrant, classId, date, time, paid, seatCost, notes, className, seats, groupId, agentName, agentCompany):
        self.registrant = registrant
        self.classId = classId
        self.date = date
        self.time = time
        self.paid = paid
        self.seatCost = seatCost
        self.notes = notes
        self.className = className
        self.seats = seats
        self.groupId = groupId
        self.agentName = agentName
        self.agentCompany = agentCompany

class Registrant:
    def __init__(self, firstName, lastName, address, phone, email):
        self.firstName = firstName
        self.lastName = lastName
        self.address = address
        self.phone = phone
        self.email = email

def scrape(br):
    signups = []

    soup = libStuff.getSoup(br, 'http://thepaintmixer.com/admin/viewdailysignups.php')

    table = soup.find(id='Calendar')
    rows = table.find_all('tr')
    rowNumber = 0
    for row in rows:
        if rowNumber == 0:
            rowNumber = rowNumber + 1
            continue
        cells = row.find_all('td')
        cellNumber = 0
        for cell in cells:
            if cellNumber == 0:
                try:
                    firstName = cell.contents[0]
                except IndexError:
                    firstName = None
            elif cellNumber == 1:
                try:
                    lastName = cell.contents[0]
                except IndexError:
                    lastName = None
            elif cellNumber == 2:
                try:
                    address1 = cell.contents[0]
                except IndexError:
                    address1 = None
            elif cellNumber == 3:
                try:
                    address2 = cell.contents[0]
                except IndexError:
                    address2 = None
            elif cellNumber == 4:
                try:
                    city = cell.contents[0]
                except IndexError:
                    city = None
            elif cellNumber == 5:
                try:
                    state = cell.contents[0]
                except IndexError:
                    state = None
            elif cellNumber == 6:
                try:
                    zip = cell.contents[0]
                except IndexError:
                    zip = None
            elif cellNumber == 7:
                try:
                    phone = cell.contents[0]
                except IndexError:
                    phone = None
            elif cellNumber == 8:
                try:
                    email = cell.contents[0]
                except IndexError:
                    email = None
            elif cellNumber == 9:
                try:
                    classId = cell.contents[0]
                except IndexError:
                    classId = None
            elif cellNumber == 10:
                try:
                    date = cell.contents[0]
                except IndexError:
                    date = None
            elif cellNumber == 11:
                try:
                    time = cell.contents[0]
                except IndexError:
                    time = None
            elif cellNumber == 12:
                try:
                    paid = cell.contents[0]
                except IndexError:
                    paid = None
            elif cellNumber == 13:
                try:
                    seatCost = cell.contents[0]
                except IndexError:
                    seatCost = None
            elif cellNumber == 14:
                try:
                    notes = cell.contents[0]
                except IndexError:
                    notes = None
            elif cellNumber == 15:
                try:
                    className = cell.contents[0]
                except IndexError:
                    className = None
            elif cellNumber == 16:
                try:
                    seats = cell.contents[0]
                except IndexError:
                    seats = None
            elif cellNumber == 17:
                try:
                    groupId = cell.contents[0]
                except IndexError:
                    groupId = None
            elif cellNumber == 18:
                try:
                    agentName = cell.contents[0]
                except IndexError:
                    agentName = None
            elif cellNumber == 19:
                try:
                    agentCompany = cell.contents[0]
                except IndexError:
                    agentCompany = None
            cellNumber = cellNumber + 1

        address = Address(address1, address2, city, state, zip)
        registrant = Registrant(firstName, lastName, address, phone, email)
        signup = Signup(registrant, classId, date, time, paid, seatCost, notes, className, seats, groupId, agentName, agentCompany)
        signups.append(signup)
    return signups
#I then call json.dumps() on that returned list
json.dumps(scrape(br), default=lambda o: o.__dict__)

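Are my constructors messed up? Am I passing in something I shouldn't be?

【Discussion】:

  • You have only shown the classes, not what you do with them. Please show a simple example of how you use these classes and what error results.
  • @BrenBarn OK, but it's a lot of code
  • Then make a smaller, simpler example that demonstrates the problem.
  • The thing is, I don't know where the problem is, so I don't know what I can cut.
  • @macsj200 Consider using collections.namedtuple to represent your data. All that copy/paste really complicates your search for a solution and really keeps us from helping you.

Tags: python json object serialization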


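【Solution 1】:

The likely cause is that cell.contents[0] returns a complex BeautifulSoup object rather than plain text. BeautifulSoup objects know about their parent, their siblings, the parser class, their attributes, and other objects that may be shared or circular.

This happens when a <td> element contains inner HTML, which is common in tables (for example, a table entry might be bold or italic).

One possible fix is to use BeautifulSoup's .text so that you get only the text rather than the inner BeautifulSoup elements: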
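
To make the failure mode concrete without depending on BeautifulSoup, here is a minimal sketch using only the standard library. The Node class is a hypothetical stand-in for a parse-tree element that keeps a reference to its parent; it is not BeautifulSoup's actual class, only an illustration of the same parent/child back-reference pattern.

```python
import json


class Node:
    """Hypothetical stand-in for a parse-tree element that knows its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


row = Node("tr")
cell = Node("td", parent=row)  # cell -> row -> children -> cell: a cycle


def try_dump(obj):
    """Serialize with the same default= trick as the question; report any failure."""
    try:
        return json.dumps(obj, default=lambda o: o.__dict__)
    except ValueError as exc:
        return "ValueError: {}".format(exc)


result = try_dump(cell)
print(result)
```

Because default= returns each object's __dict__, the encoder follows cell.parent to the row, then row.children back to the same cell, and gives up with a ValueError instead of recursing forever.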

columns = [col.text for col in row.find_all('td')]

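FWIW, here is a simple diagnostic technique for seeing what is actually going on. Just modify the default function passed to json.dumps() so that it prints what it receives: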

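import pprint

def view_dict(obj):
    # Print the type and attributes of every object json.dumps() cannot serialize itself
    print('--------------')
    print('Type:', obj.__class__)
    d = obj.__dict__
    pprint.pprint(d)
    return d

json.dumps(scrape(br), default=view_dict)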

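The circular reference should show up right away. Hopefully this solves the mystery (because otherwise your code looks fine and does not explicitly create any circular references).

【Discussion】: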
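
A related option, sketched here as an assumption rather than as part of the original answer, is a stricter default function that only expands classes you own and raises TypeError for anything else. The error then names the unexpected type instead of reporting a circular reference. The Address class is trimmed down for brevity, and Foreign and ALLOWED are hypothetical names introduced only for this illustration.

```python
import json


class Address:
    def __init__(self, city, zip_code):
        self.city = city
        self.zip_code = zip_code


class Foreign:
    """Hypothetical stand-in for a third-party object, e.g. a parsed HTML element."""

    def __init__(self):
        self.parent = self  # back-reference, as a tree node might have


# Only these classes may be expanded into dictionaries.
ALLOWED = (Address,)


def strict_default(obj):
    if isinstance(obj, ALLOWED):
        return vars(obj)
    raise TypeError("unexpected {} in JSON payload".format(type(obj).__name__))


ok = json.dumps(Address("Austin", "78701"), default=strict_default)

try:
    json.dumps(Address(Foreign(), "78701"), default=strict_default)
    error = None
except TypeError as exc:
    error = str(exc)

print(ok)
print(error)
```

The trade-off is that every new class has to be added to ALLOWED, but a stray non-text value is reported by type as soon as the encoder reaches it.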

    【Solution 2】:

    I couldn't find the error, so I refactored to use a namedtuple (credit @metatoaster). The refactoring fixed the problem.

    def scrape(br):
        signups = []
    
        soup = libStuff.getSoup(br, 'http://thepaintmixer.com/admin/viewdailysignups.php')
    
        table = soup.find(id='Calendar')
        rows = table.find_all('tr')
        rowNumber = 0
        for row in rows:
            if rowNumber == 0:
                rowNumber = rowNumber + 1
                continue
            cells = row.find_all('td')
            cells = [cell.string if cell.string is not None else '' for cell in cells]
            signup = Signup(*cells)
            signups.append(signup)
        return signups
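
The Signup namedtuple used in the refactored scrape() is not shown in this answer. The following is a minimal sketch of what such a definition might look like and how it behaves with json.dumps(); the three field names and the sample row are assumptions chosen for brevity, not the full column list from the question.

```python
import json
from collections import namedtuple

# Hypothetical, shortened field list; the real table has 20 columns.
Signup = namedtuple("Signup", ["firstName", "lastName", "classId"])

row = ["Ann", "Lee", "42"]  # plain strings, as produced by cell.string
signup = Signup(*row)

# A namedtuple is a tuple, so json.dumps() emits a JSON array by default.
as_list = json.dumps(signup)

# _asdict() keeps the field names and yields a JSON object instead.
as_obj = json.dumps(signup._asdict())

print(as_list)
print(as_obj)
```

Since every field holds a plain string, no default= function is needed, and there is no object graph for the encoder to loop through.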
    

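    【Discussion】:

    • You should at least run the diagnostic from the other answer to confirm what the actual cause was. That way this SO question and its answers will be useful to others.
    • FWIW, the actual fix is probably the .string part. The other changes are nice code improvements but have nothing to do with your circular reference.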