DynamoDB: how to query by sort key only? (Answers)

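【Question title】: dynamodb how to query by sort key only?
【Posted】: 2017-02-21 15:26:30
【Question description】:

I wrote some Python code and want to query DynamoDB data by the sort key. I remember being able to do this successfully with the following code: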

 table.query(KeyConditionExpression=Key('event_status').eq(event_status))

My table structure:

primary key: event_id
sort key: event_status

【Question comments】:

I believe this can be done by creating a Global Secondary Index in the form of an inverted index.

Tags: amazon-web-services amazon-dynamodb boto3

【Solution 1】:

You must create a global secondary index (GSI) on the sort key in order to query by it alone.
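For a table that already exists, the index can be added without recreating the table. The following is only a minimal sketch, assuming the question's attribute names, string-typed keys, an on-demand table, and a hypothetical index name `status-index`. The helper just builds the arguments for the DynamoDB client's `update_table` call, so it runs without AWS access:

```python
def build_add_gsi_request(table_name, index_name="status-index"):
    """Build keyword arguments for the DynamoDB client's update_table call.

    Assumptions (not from the question): both keys are strings, the table
    uses on-demand billing, and the index name is a placeholder.
    """
    return {
        "TableName": table_name,
        # Every attribute used in the new index's key schema must be declared.
        "AttributeDefinitions": [
            {"AttributeName": "event_status", "AttributeType": "S"},
            {"AttributeName": "event_id", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    # Inverted index: the table's sort key becomes the partition key.
                    "KeySchema": [
                        {"AttributeName": "event_status", "KeyType": "HASH"},
                        {"AttributeName": "event_id", "KeyType": "RANGE"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                }
            }
        ],
    }


request = build_add_gsi_request("your_table")
# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   boto3.client("dynamodb").update_table(**request)
```

On a table with provisioned capacity, the `Create` entry would also need a `ProvisionedThroughput` setting.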

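【Comments】:

This is not correct: you need to create a GSI, because you cannot query by a sort key alone. An LSI only adds an alternative sort key under the same partition key.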

【Solution 2】:

If you want to fetch data from DynamoDB without supplying the hash key attribute value, you should use the Scan API.

Example:

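import json
from decimal import Decimal

import boto3
from boto3.dynamodb.conditions import Attr


class DecimalEncoder(json.JSONEncoder):
    # DynamoDB returns numbers as Decimal, which json cannot serialize by default
    def default(self, o):
        if isinstance(o, Decimal):
            return float(o)
        return super().default(o)


table = boto3.resource('dynamodb').Table('your_table')  # placeholder table name
fe = Attr('event_status').eq("new")

response = table.scan(FilterExpression=fe)
for i in response['Items']:
    print(json.dumps(i, cls=DecimalEncoder))

# Each call reads at most 1 MB; keep paging while LastEvaluatedKey is present
while 'LastEvaluatedKey' in response:
    response = table.scan(
        FilterExpression=fe,
        ExclusiveStartKey=response['LastEvaluatedKey'],
    )
    for i in response['Items']:
        print(json.dumps(i, cls=DecimalEncoder))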

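【Comments】:

Note that a scan is not the same as a query. A scan walks over every item in the table, so it is far less efficient than a query. If you really need to query event_status efficiently, you should consider creating a GSI on that field.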

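【Solution 3】:

If you don't want to scan (and you probably shouldn't), you need to create a GSI (Global Secondary Index) for this and set event_status as the GSI partition key (GSIPK).

So your table configuration would be: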

import boto3

dynamodb = boto3.resource("dynamodb")

table = dynamodb.create_table(
    TableName="your_table",
    KeySchema=[
        {"AttributeName": "event_id", "KeyType": "HASH"},  # Partition key
        {"AttributeName": "event_status", "KeyType": "RANGE"},  # Sort key
    ],
    # Each key attribute is declared exactly once, even if the table and the GSI both use it
    AttributeDefinitions=[
        {"AttributeName": "event_id", "AttributeType": "S"},
        {"AttributeName": "event_status", "AttributeType": "S"},
    ],
    GlobalSecondaryIndexes=[
        {
            "IndexName": "gsiIndex",
            # Inverted index: the table's sort key is the GSI partition key
            "KeySchema": [
                {"AttributeName": "event_status", "KeyType": "HASH"},
                {"AttributeName": "event_id", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        },
    ],
    BillingMode="PAY_PER_REQUEST",
)

Note that GSIs can be expensive; if you don't need all attributes in the index, you may want to change the ProjectionType.
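To illustrate the projection note, here is a sketch of the same index definition with a narrower projection. `KEYS_ONLY` and `INCLUDE` are the alternatives to `ALL`; the helper name and the attribute `created_at` are hypothetical and only serve the example:

```python
def gsi_definition(projection_type="KEYS_ONLY", include_attributes=None):
    """Return one entry for the GlobalSecondaryIndexes list of create_table."""
    projection = {"ProjectionType": projection_type}
    if projection_type == "INCLUDE":
        # INCLUDE copies the listed non-key attributes in addition to the keys.
        if not include_attributes:
            raise ValueError("INCLUDE needs at least one non-key attribute")
        projection["NonKeyAttributes"] = list(include_attributes)
    return {
        "IndexName": "gsiIndex",
        "KeySchema": [
            {"AttributeName": "event_status", "KeyType": "HASH"},
            {"AttributeName": "event_id", "KeyType": "RANGE"},
        ],
        "Projection": projection,
    }


keys_only = gsi_definition()
with_extra = gsi_definition("INCLUDE", ["created_at"])  # hypothetical attribute
```

A query against a `KEYS_ONLY` index returns only the index and table key attributes (here `event_status` and `event_id`), so anything else needs a follow-up read from the base table.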

Now you can query by the partition key (pk):

table.query(KeyConditionExpression=Key('event_id').eq(event_id))

Or by the GSI partition key, which is your table's sort key (sk):

table.query(
    IndexName="gsiIndex",
    KeyConditionExpression=Key("event_status").eq(event_status),
)
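A single `query` call returns at most 1 MB of data, so a status with many items needs paging. The following is a rough sketch rather than a definitive implementation: `query_all_by_status` and `FakeTable` are hypothetical names, the fake stands in for a boto3 Table so the example runs without AWS, and the key condition is written in string form instead of `Key(...)`:

```python
def query_all_by_status(table, event_status, index_name="gsiIndex"):
    """Yield every item with the given event_status from the GSI.

    `table` is expected to behave like a boto3 Table resource.
    """
    kwargs = {
        "IndexName": index_name,
        # String form of the condition; Key("event_status").eq(...) also works.
        "KeyConditionExpression": "event_status = :status",
        "ExpressionAttributeValues": {":status": event_status},
    }
    while True:
        response = table.query(**kwargs)
        yield from response.get("Items", [])
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            break
        # Resume the next page where the previous one stopped.
        kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Stand-in for a boto3 Table that replays canned response pages."""

    def __init__(self, pages):
        self._pages = pages
        self.calls = []

    def query(self, **kwargs):
        self.calls.append(kwargs)
        return self._pages[len(self.calls) - 1]


fake = FakeTable([
    {
        "Items": [{"event_id": "1", "event_status": "new"}],
        "LastEvaluatedKey": {"event_id": "1", "event_status": "new"},
    },
    {"Items": [{"event_id": "2", "event_status": "new"}]},
])
items = list(query_all_by_status(fake, "new"))
```

With a real table, `query_all_by_status(table, "new")` would be used the same way.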

【Comments】:

【Solution 4】:

By using a FilterExpression, we can scan the table and filter on the sort key.

Note: here LastUpdated is the sort key.

Example:

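import boto3
from boto3.dynamodb.conditions import Attr

from_date = "fromdate"  # placeholder lower bound
to_date = "todate"  # placeholder upper bound

dynamodb = boto3.resource('dynamodb', region_name='ap-south-1')
table = dynamodb.Table("your-tablename")
response = table.scan(
    FilterExpression=Attr('LastUpdated').between(from_date, to_date)
)
result = response['Items']  # first page only; check LastEvaluatedKey for more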

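【Comments】:

Be careful: this only filters the data returned by the original query, which is at most 1 MB per call. From the docs: "A filter expression is applied after a Query finishes, but before the results are returned."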
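One way to see this effect is to compare how many items were read with how many survived the filter. The sketch below sums the `ScannedCount` field of each scan response while following `LastEvaluatedKey`; `scan_with_stats` and `FakeScanTable` are hypothetical names, and the fake replays canned pages so the example runs without AWS:

```python
def scan_with_stats(table, **scan_kwargs):
    """Scan all pages; return (matching items, total items read)."""
    kwargs = dict(scan_kwargs)
    items, scanned = [], 0
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        # ScannedCount counts items read before the filter was applied.
        scanned += response.get("ScannedCount", 0)
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            return items, scanned
        kwargs["ExclusiveStartKey"] = last_key


class FakeScanTable:
    """Stand-in for a boto3 Table that replays canned scan pages."""

    def __init__(self, pages):
        self._pages = iter(pages)

    def scan(self, **kwargs):
        return next(self._pages)


fake = FakeScanTable([
    # A page can match nothing and still signal that more data follows.
    {"Items": [], "ScannedCount": 500, "LastEvaluatedKey": {"id": "x"}},
    {"Items": [{"id": "y", "LastUpdated": "2017-02-01"}], "ScannedCount": 300},
])
matches, read = scan_with_stats(
    fake,
    FilterExpression="LastUpdated BETWEEN :lo AND :hi",
    ExpressionAttributeValues={":lo": "2017-01-01", ":hi": "2017-03-01"},
)
```

Stopping after the first page here would have returned nothing, even though a matching item exists further into the table.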

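【Solution 5】:

The sort key exists to refine a query within a single partition: items are grouped by partition key and ordered by sort key inside that partition. It is therefore not possible to search by the sort key alone, without the partition key, unless a global secondary index is defined on the sort key.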
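A toy in-memory model can make this concrete. It is not how DynamoDB is implemented, only an illustration of the access pattern: items live under their partition key, sorted by sort key, so a lookup with the partition key touches one partition, while a lookup by sort key alone has to visit all of them. All names and data here are made up for the example:

```python
from bisect import bisect_left, bisect_right

# Toy model: partition key -> list of (sort_key, item), sorted by sort key.
partitions = {
    "evt-1": [("done", {"event_id": "evt-1", "event_status": "done"})],
    "evt-2": [("new", {"event_id": "evt-2", "event_status": "new"})],
    "evt-3": [
        ("done", {"event_id": "evt-3", "event_status": "done"}),
        ("new", {"event_id": "evt-3", "event_status": "new"}),
    ],
}


def query(partition_key, sort_key):
    """Jump to one partition, then binary-search its sorted sort keys."""
    rows = partitions.get(partition_key, [])
    keys = [key for key, _ in rows]
    lo, hi = bisect_left(keys, sort_key), bisect_right(keys, sort_key)
    return [item for _, item in rows[lo:hi]]


def scan_by_sort_key(sort_key):
    """Without a partition key, every row in every partition is visited."""
    matches, visited = [], 0
    for rows in partitions.values():
        for key, item in rows:
            visited += 1
            if key == sort_key:
                matches.append(item)
    return matches, visited


one = query("evt-3", "new")
found, visited = scan_by_sort_key("new")
```

A GSI keyed on the sort key amounts to maintaining a second copy of this structure with the roles of the two keys swapped.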

【Comments】:

【Solution 6】:

Queries in DynamoDB can only be made using the primary key:

dynamodb = boto3.resource('dynamodb', region_name='your region name')
table = dynamodb.Table('database-dev')
response = table.query(KeyConditionExpression=Key('your primary key').eq("condition"))

But if your primary key contains both a hash key and a sort key, then we can query using the hash key like this:

dynamodb = boto3.resource('dynamodb', region_name='your region name')
table = dynamodb.Table('database-dev')
response = table.query(KeyConditionExpression=Key('your hash key').eq("condition"))

【Comments】:

These two code blocks are identical! Doesn't the text of your answer suggest they should be different?
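What the answer intended is not stated, but the usual difference is this: with a composite primary key, the equality condition on the hash key is mandatory and a condition on the sort key is optional. A sketch under that assumption, using the question's key names and a hypothetical helper `build_query_kwargs` that writes the condition in string form:

```python
def build_query_kwargs(hash_value, sort_prefix=None):
    """Build Table.query arguments for a table keyed on event_id + event_status."""
    kwargs = {
        # The hash key must always be matched with equality.
        "KeyConditionExpression": "event_id = :id",
        "ExpressionAttributeValues": {":id": hash_value},
    }
    if sort_prefix is not None:
        # Optional: narrow the result within the partition by sort key.
        kwargs["KeyConditionExpression"] += (
            " AND begins_with(event_status, :prefix)"
        )
        kwargs["ExpressionAttributeValues"][":prefix"] = sort_prefix
    return kwargs


hash_only = build_query_kwargs("evt-1")
hash_and_sort = build_query_kwargs("evt-1", "new")
# With a real boto3 Table: table.query(**hash_and_sort)
```

The same condition can be written with condition objects as `Key('event_id').eq(...) & Key('event_status').begins_with(...)`.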