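[Posted]: 2014-12-12 21:46:15
[Question]:
I'm working on a script that converts a CSV file to JSON, but I keep getting the error below and can't figure out why. The indentation looks correct to me, so I'm not sure where to look next. The traceback and the code follow: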
Traceback (most recent call last):
  File "/home/uwp/widgets/contentFreshness/freshmap.py", line 308, in <module>
    main()
  File "/home/uwp/widgets/contentFreshness/freshmap.py", line 303, in main
    mySite.writeJSONFile(options)
  File "/home/uwp/widgets/contentFreshness/freshmap.py", line 247, in writeJSONFile
    outputFile.write('"' + str(dateOfCrawl) + '"' )
NameError: global name 'dateOfCrawl' is not defined
Code:
class Site:
    dateOfCrawl = 0;
    def __init__(self,csvFilePath):
        self.pageList = [] # ordered list of page IDs
        self.pageData={} # dictionary of individual page dictionaries, indexed on page ID
        self.titleDict = { } # dictionary of unique titles
        self.buildPageData(csvFilePath)
        self.homePageId=self.pageList[0] # only use of site.pageList
        self.depth=0
    def buildPageData(self,csvFilePath):
        global dateOfCrawl
        # read data from CSV file, build a dictionary of page data, including list of children, in order
        lines = csv.reader(open(csvFilePath, "rb"))
        for line in lines:
            pageURL=line[0]
            pageURL=re.sub('\/\Z', '',pageURL) # remove any trailing slash
            self.pageData[pageURL]={}
            self.pageData[pageURL]["URL"]=pageURL
            self.pageData[pageURL]["Title"]=self.cleanTitle(line[1],pageURL)
            # when taking the home page and chop its url the parent will be http:/
            # which should be avoided by setting it to ''
            parent = chopPath(pageURL)
            if(parent == 'http:/'):
                parent=''
                dateOfCrawl = line[2]
            self.pageData[pageURL]["Parent"]= parent
            self.pageData[pageURL]["Modified"]=line[2]
            self.pageData[pageURL]["Children"]=[]
        list = self.pageData.keys()
        # sort IDs before attempting to match children
        self.pageList = self.pageData.keys()
        self.pageList.sort()
        lineCount = 0
        for pageURL in self.pageList:
            # record page as child of its parent (parents must already be in the list!)
            parentURL=self.pageData[pageURL]["Parent"]
            if (lineCount > 0):
                while( self.pageData.has_key(parentURL)== False):
                    if(parentURL == ''):
                        sys.exit(pageURL + " has no parent at " + parentURL)
                    parentURL = chopPath(parentURL)
                self.pageData[parentURL]["Children"].append(pageURL)
            lineCount+=1
        self.pageCount=lineCount
    def writeJSONFile(self,options):
        global dateOfCrawl
        outputFile = options ["outputFile"]
        #see http://code.google.com/intl/en/apis/visualization/documentation/reference.html#DataTable
        outputFile.write('[')
        outputFile.write('"' + str(dateOfCrawl) + '"' )
        self.homePage.toJSON(options)
        outputFile.write(']')
        outputFile.close()
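[Comments]:
-
Please read stackoverflow.com/help/mcve and provide the full traceback. You are asking people to read 200+ lines of code to help you; help yourself first.
-
Sorry, I thought I had included it! I know it's a lot of code, but it may be necessary to understand the problem.
-
It almost never is. Please read ericlippert.com/2014/03/05/how-to-debug-small-programs, and spend more time narrowing the problem down in future.
-
Will do. Apologies again for my ignorance, and thank you for the guidance.
-
I have trimmed your post down to only the necessary parts.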
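
A reduced version of the problem, along the lines the comments ask for, might look like the sketch below. It assumes that the assignment `dateOfCrawl = line[2]` sits inside the `if(parent == 'http:/')` block (the indentation in the post is not certain), so the global name is only created if that condition is true for at least one row. The class attribute `dateOfCrawl = 0` is a separate binding (`Site.dateOfCrawl`), not a module-level global, so it does not prevent the NameError. `BrokenSite`, `FixedSite`, `crawlDate`, and the sample rows are hypothetical names used only for illustration; each row is a (parent, modified) pair standing in for the values derived from the CSV.

```python
class BrokenSite(object):
    dateOfCrawl = 0  # class attribute, NOT a module-level global

    def buildPageData(self, rows):
        global dateOfCrawl
        for parent, modified in rows:
            if parent == 'http:/':
                # The global is created only if this branch ever runs.
                dateOfCrawl = modified

    def crawlDate(self):
        global dateOfCrawl
        # Raises NameError when the branch above never ran.
        return str(dateOfCrawl)


class FixedSite(object):
    def __init__(self, rows):
        self.dateOfCrawl = 0  # instance attribute with a default value
        self.buildPageData(rows)

    def buildPageData(self, rows):
        for parent, modified in rows:
            if parent == 'http:/':
                self.dateOfCrawl = modified

    def crawlDate(self):
        return str(self.dateOfCrawl)


# Made-up sample data: no row has 'http:/' as its parent.
rows = [("http://example.com", "2014-12-01")]

broken = BrokenSite()
broken.buildPageData(rows)
try:
    broken.crawlDate()
    broken_failed = False
except NameError:
    broken_failed = True

fixed = FixedSite(rows)
fixed_date = fixed.crawlDate()  # falls back to the default instead of failing
```

Keeping the value on `self` removes the need for `global` in both methods and guarantees a default when no row matches. Whether a missing home-page row should instead be treated as an error is a separate design choice that depends on the input data.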