Find duplicates in a field then update another field with a Y or N with Python (ArcGIS)

【Question title】: Find duplicates in a field then update another field with a Y or N with Python (ArcGIS)
【Posted】: 2015-03-07 14:13:22
【Question】:

I am trying to create a Python script that will flag duplicate records in a point shapefile (possibly more than 5000 records) with a Y or N. Something like this:

xyCombine | duplicate
E836814.148873 N814378.125749 |
E836815.033548 N814377.614688 |
E836818.016542 N814371.411850 |

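I would like to process the field xyCombine to find duplicates and update another field (duplicate) with a Y or N depending on whether the record is a duplicate. The desired result is: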

xyCombine | duplicate
E836814.148873 N814378.125749 | Y
E836815.033548 N814377.614688 | Y
E836818.016542 N814371.411850 | N

Here is my attempt:

# Process: Searches xyCombine field for any duplicates
duplicateCount = 0
inShapefile = pointsShapefile
fieldName = "xyCombine"
shpFRows = arcpy.UpdateCursor(inShapefile)
shpFRow = shpFRows.next()
fieldList = []
while shpFRow:
    if shpFRow.isNull(fieldName) == False and len(str(shpFRow.getValue(fieldName)).strip()) > 1:
            fieldList.append(shpFRow.getValue(fieldName))
    shpFRow = shpFRows.next()
duplicateList = [x for x, y in collections.Counter(fieldList).items() if y > 1]
print duplicateList
selectFile = pointsShapefile
selectFields = ('xyCombine','dupCHK')
shpFRows = arcpy.UpdateCursor(selectFile,selectFields)
shpFRow1 = shpFRows.next()
while shpFRow1:
    if shpFRow1.isNull(fieldName) == False and len(str(shpFRow1.getValue(fieldName)).strip()) > 1:
        for row in duplicateList:
            if shpFRow1.getValue(fieldName) == row:
                duplicate += 1
                row[1] = "Y"
            else:
                row[1] = "N"
            cursor.updateRow(row)
        shpFRow1 = shpFRows.next()
if duplicateCount > 0:
    print ""
    print "*** "+str(duplicate)+" duplicated points. ***"
    print ""

If I leave out

    row[1] = "Y"
else:
    row[1] = "N"
cursor.updateRow(row)

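the script runs correctly and prints the total number of duplicates, but it does not update the duplicate field with a Y or N value. That update matters because the field will feed a CSV error report later in the script.

However, when I include it, I receive the following error message: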

Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32

[u'E836814.148873 N814378.125749', u'E836815.033548 N814377.614688', u'E836818.016542 N814371.41185']

Traceback (most recent call last):
  File "C:\ Duplicate Points Check\Python Scripts\DuplicatePointsCheck_TEST1.py", line 458, in DuplicatePointsCheck()
  File "C:\ Duplicate Points Check\Python Scripts\DuplicatePointsCheck_TEST1.py", line 94, in DuplicatePointsCheck
    row[1] = "N"
TypeError: 'unicode' object does not support item assignment
>>>

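I understand that there are tools in ArcGIS that could offer a solution through the Field Calculator. However, I would like to strengthen my understanding of Python, as I am still very new to it. I apologize if this question has been asked before, but I have searched the internet and the only results I found were about locating and deleting duplicate records. If any of you could point me in the right direction, that would be very helpful. Thank you in advance.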

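【Comments】:

The records in the sample file are sorted. Are the records in your real file sorted as well?
@gboffi: No, the records in the real file are not sorted. I was just trying to make my question clear.
Cross-posted at gis.stackexchange.com/q/129860/115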

Tags: python python-2.7 arcgis

【Solution 1】:

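There isn't enough information to be sure, but it looks like you are using ArcGIS 10.1 or later. If so, you seem to be trying to use the new data access version of UpdateCursor, but are actually calling the Pre

I haven't used ArcGIS 10.0 recently, but from the documentation it appears the syntax has changed. The ArcGIS documentation lists this method for assigning a value to a field with UpdateCursor: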

for row in cursor:
    # field2 will be equal to field1 multiplied by 3.0
    row.setValue(field2, row.getValue(field1) * 3.0)
    cursor.updateRow(row)

You appear to be using the data access syntax, shown below, again from the ArcGIS 10.2 documentation:

with arcpy.da.UpdateCursor(fc, fields) as cursor:
    # Update the field used in Buffer so the distance is based on road
    # type. Road type is either 1, 2, 3 or 4. Distance is in meters.
    for row in cursor:
        # Update the BUFFER_DISTANCE field to be 100 times the
        # ROAD_TYPE field.
        row[1] = row[0] * 100
        cursor.updateRow(row)

Make sure you create the cursor with arcpy.da.UpdateCursor; I expect that will solve your problem.
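A minimal sketch of how the two passes might fit together with list-style rows, under stated assumptions. `flag_duplicates` and `FakeCursor` are hypothetical names that do not appear in the question, and `FakeCursor` is only a stand-in that mimics the iterate-then-`updateRow` pattern shown above so that the logic can be run without ArcGIS. The sample rows reuse coordinate strings from the question, with one repeated purely for illustration. The sketch also sidesteps the TypeError in the question, which arises because `for row in duplicateList` binds `row` to a unicode string, so `row[1] = "N"` tries to assign into a string rather than into a cursor row.

```python
import collections


class FakeCursor(object):
    """Hypothetical stand-in for an update cursor (NOT part of arcpy).

    Iterating yields each row as a list; updateRow() writes the row back.
    """

    def __init__(self, table):
        self.table = table
        self._index = None

    def __iter__(self):
        for i, stored in enumerate(self.table):
            self._index = i
            yield list(stored)  # hand out a copy, as a cursor row would be

    def updateRow(self, row):
        self.table[self._index] = list(row)


def flag_duplicates(make_cursor):
    """Set row[1] to "Y" when row[0] occurs more than once, else "N".

    make_cursor is a callable returning a fresh cursor over
    (value field, flag field). Returns the number of rows flagged "Y".
    """
    # Pass 1: count every non-null, non-blank value.
    counts = collections.Counter(
        row[0] for row in make_cursor()
        if row[0] is not None and row[0].strip()
    )
    # Pass 2: look each row's own value up in the counts and write the flag.
    flagged = 0
    cursor = make_cursor()
    for row in cursor:
        if counts.get(row[0], 0) > 1:
            row[1] = "Y"
            flagged += 1
        else:
            row[1] = "N"
        cursor.updateRow(row)
    return flagged


# Illustrative data only: the first and third values are identical.
table = [
    ["E836814.148873 N814378.125749", ""],
    ["E836815.033548 N814377.614688", ""],
    ["E836814.148873 N814378.125749", ""],
]
flagged = flag_duplicates(lambda: FakeCursor(table))
```

With arcpy, the callable would presumably wrap `arcpy.da.UpdateCursor` with the two fields from the question (`xyCombine`, `dupCHK`); that part is untested here. Looking each row up in a `Counter` also avoids the inner loop over the duplicate list, which would otherwise overwrite the flag once per duplicate value.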

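【Comments】:

I am using ArcGIS 10.1. Also, I replaced arcpy.UpdateCursor with arcpy.da.UpdateCursor and received the following error message: if shpFRow.isNull(fieldName) == False and len(str(shpFRow.getValue(fieldName)).strip()) > 1: AttributeError: 'list' object has no attribute 'isNull'
Again, I can't be 100% sure, but the row objects returned by the old arcpy.UpdateCursor function appear to differ from those returned by arcpy.da.UpdateCursor. Per the documentation, the next method "Returns the next row as a tuple. The order of fields will be returned in the order they were specified when the cursor was created." So the isNull method will not work on a row. You have to refer to the value as shpFRow[0]. You will also need some other way to test for nulls.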
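To make the last point concrete, here is a small sketch of a null test for list-style rows. It assumes that a null field arrives as Python `None` in a data access cursor row, and `has_value` is a hypothetical helper name rather than anything from arcpy. It keeps the question's "more than one character after stripping" threshold, and the sample rows are illustrative only.

```python
def has_value(value, min_length=2):
    """Return True when value is not null and has at least
    min_length characters once surrounding whitespace is stripped."""
    if value is None:  # assumption: nulls show up as None in the row
        return False
    return len(str(value).strip()) >= min_length


# Rows indexed by position, as (xyCombine, dupCHK).
rows = [
    ["E836814.148873 N814378.125749", "N"],
    [None, "N"],
    ["  ", "N"],
]
usable = [row[0] for row in rows if has_value(row[0])]
```

A check like `has_value(shpFRow[0])` could then stand in for the `isNull(...)`/`getValue(...)` condition from the question.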