Google Cloud Datastore: storing only unique entities

【Question title】: Google cloud datastore only store unique entity
【Posted】: 2017-05-13 03:54:43
【Question description】:

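I am trying to learn NoSQL with Google Datastore, but I have run into a problem with uniqueness.

Consider an e-commerce store that has categories and products.

You do not want two products with the same SKU in the database.

So I insert an entity with this JSON:

{"sku": 1234, "product_name": "Test product"}

It shows up with two fields. But I can do the same thing again, and then I have two or more identical products.

How do you avoid this? Can you make the sku field unique?

Do I need to run a query before inserting?

The same problem comes up with categories. Should I just use one entity for all my categories and build the structure inside my JSON?

What is good common practice here?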

【Question comments】:

Tags: database google-app-engine transactions google-cloud-datastore google-cloud-platform

【Solution 1】:

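Create a new kind called "sku". When you create a new product, do a transactional insert of both the product entity and the sku entity.

For example, say you want to add a new product with the kind name product and the id abc:

"product/abc" = {"sku": 1234, "product_name": "Test product"}

To ensure uniqueness of the property "sku", always insert an entity with the kind name sku and an id equal to the property's value:

"sku/1234" = {"created": "2017-05-11"}

The example entity above has a property for the creation date. That is just something optional I added as part of the example.

Now, as long as you insert both of these as part of the same transaction, you ensure that the "sku" property has a unique value. This works because:

An insert (as opposed to an upsert) fails if a sku entity for that number already exists.
The transaction makes writing the product entity (with its sku value) and writing the sku entity atomic, so if the sku is not unique, the sku entity write fails and causes the product entity write to fail as well.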
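
To make the two guarantees concrete, here is a minimal sketch of the idea. It is a toy in-memory model, not the Datastore client API: `ToyStore`, `insert_all`, `AlreadyExists` and `create_product` are hypothetical names invented for this illustration. In a real application the same pattern would be expressed with your client library's transaction and insert operations.

```python
class AlreadyExists(Exception):
    """Raised when an insert targets a key that is already in use."""


class ToyStore:
    """In-memory stand-in for an entity store (hypothetical, not the Datastore API)."""

    def __init__(self):
        self._entities = {}  # maps (kind, id) -> dict of properties

    def insert_all(self, entities):
        """Insert every entity or none: fail if any key already exists."""
        for key in entities:
            if key in self._entities:
                raise AlreadyExists(key)
        # No conflicts were found, so commit all writes together.
        self._entities.update(entities)

    def get(self, kind, id_):
        return self._entities.get((kind, id_))


def create_product(store, product_id, sku, name):
    """Write the product and its sku marker entity as one atomic unit."""
    store.insert_all({
        ("product", product_id): {"sku": sku, "product_name": name},
        ("sku", sku): {"created": "2017-05-11"},
    })


store = ToyStore()
create_product(store, "abc", 1234, "Test product")

try:
    # Different product id, same SKU: the sku/1234 insert conflicts.
    create_product(store, "xyz", 1234, "Duplicate SKU")
    duplicate_rejected = False
except AlreadyExists:
    duplicate_rejected = True
```

Because the conflict on "sku/1234" aborts the whole batch, "product/xyz" is never written, which is exactly the behaviour the transaction is meant to provide.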

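【Comments】:

This is a great tip and it answers the main point of the question, but is there any benefit to this approach over simply making the SKU the entity ID?
Yes. Using the SKU as the entity ID relies on assumptions that the question does not answer. Depending on those assumptions, it could easily be as simple as using the SKU as the id/name. Different assumptions that make this the better solution: 1) You do not want the SKU to be immutable. 2) The primary key of the product is defined in an external system that does not use the SKU. 3) There are multiple SKUs per product, and you want the SKU property to be a repeated property (for example, Cloud Datastore itself has multiple SKUs). 4) You want to be able to reserve or blacklist SKUs without creating a product entity.

【Solution 2】: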

【解决方案2】：

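Instead of storing "sku" as a property, you can use it as the entity's "id" (if it is numeric) or "name" (if it is a string). It is then guaranteed to be unique, because it becomes part of the entity key, and keys are unique.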
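
One caveat worth illustrating: a unique key prevents duplicates, but a plain write to an existing key replaces the stored entity rather than failing. The sketch below models that with an ordinary dict. The `upsert` and `insert` helpers are hypothetical names for this illustration only and are not the Datastore API.

```python
# Toy model of a keyed store: a plain dict, not the Datastore API.
entities = {}


def upsert(kind, id_or_name, properties):
    """Write an entity under its key, overwriting any existing one."""
    entities[(kind, id_or_name)] = properties


def insert(kind, id_or_name, properties):
    """Write an entity only if its key is unused; otherwise raise KeyError."""
    key = (kind, id_or_name)
    if key in entities:
        raise KeyError("entity already exists: %r" % (key,))
    entities[key] = properties


upsert("product", 1234, {"product_name": "Test product"})
# Same key again: the entity is replaced, so no duplicate appears.
upsert("product", 1234, {"product_name": "Renamed product"})

try:
    insert("product", 1234, {"product_name": "Another product"})
    insert_rejected = False
except KeyError:
    insert_rejected = True
```

So if a second product with the same SKU should be rejected rather than silently overwrite the first, use an insert-style write or check for the key inside a transaction.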

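【Comments】:

【Solution 3】:

Data modeling is a big topic, but in my opinion there are two approaches you can take. What follows is fairly basic and specific to your question, and should give you some ideas.

Approach 1: store the reference as a property

Think of it as a product that contains product variants...

This approach is similar to what you would do in the RDBMS world. You create the products separately, and each product variant holds a reference to its product, much like a foreign key in a relational database. So the product variant entity gets a new property that refers to the product it belongs to. That product property actually holds the key of an entity of the Product kind. If that sounds confusing, here is how you can break it down. I will use Python as an example: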

# Legacy App Engine NDB import; with the newer Cloud NDB library the
# import is `from google.cloud import ndb` and calls run inside a client context.
from google.appengine.ext import ndb

# product model
class Product(ndb.Model):
    name = ndb.StringProperty()

# product variant model
class ProductVariant(ndb.Model):
    name = ndb.StringProperty()
    price = ndb.IntegerProperty()
    # Key of the Product entity this variant belongs to.
    product = ndb.KeyProperty(kind=Product)

hugoboss = Product(name="Hugo Boss", key=ndb.Key(Product, 'hugoboss'))
gap = Product(name="Gap", key=ndb.Key(Product, 'gap'))

pants1 = ProductVariant(name="Black pants", price=300, product=hugoboss.key)
pants2 = ProductVariant(name="Grey pants", price=200, product=hugoboss.key)
tshirt = ProductVariant(name="White graphic tshirt", price=10, product=gap.key)

pants1.put()
pants2.put()
tshirt.put()

# For example, fetch all variants whose product is hugoboss.
for pants in ProductVariant.query(ProductVariant.product == hugoboss.key).fetch(10):
    print(pants.name)

# You should get something like:
# Black pants
# Grey pants

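Approach 2: the product inside the key

To get the most out of this, you need to understand how Bigtable (Datastore is built on top of Bigtable) sorts row keys and how to arrange your data around that. If you want to dig deeper, there is a great paper: Bigtable: A Distributed Storage System for Structured Data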

# Legacy App Engine NDB import; with the newer Cloud NDB library the
# import is `from google.cloud import ndb` and calls run inside a client context.
from google.appengine.ext import ndb

# product model
class Product(ndb.Model):
    name = ndb.StringProperty()

# product variant model
class ProductVariant(ndb.Model):
    name = ndb.StringProperty()
    price = ndb.IntegerProperty()

hugoboss = ndb.Key(Product, 'hugoboss')
gap = ndb.Key(Product, 'gap')

Product(name="Hugo Boss", key=hugoboss).put()
Product(name="Gap", key=gap).put()

# The parent key becomes part of each variant's own key.
pants1 = ProductVariant(name="Black pants", price=300, parent=hugoboss)
pants2 = ProductVariant(name="Grey pants", price=200, parent=hugoboss)
tshirt = ProductVariant(name="White graphic tshirt", price=10, parent=gap)

pants1.put()
pants2.put()
tshirt.put()

# For example, fetch all variants stored under the hugoboss product.
for pants in ProductVariant.query(ancestor=hugoboss).fetch(10):
    print(pants.name)

# You should get something like:
# Black pants
# Grey pants
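
To give a rough intuition for why sorted keys matter here, the following is a toy sketch, assuming keys are stored in sorted order by their full path. It is an in-memory model, not the ndb API. The `rows` list, the `descendants` helper and the variant names `pants1`, `pants2` and `tshirt` are hypothetical (real ndb would assign numeric ids to the variants above).

```python
import bisect

# Toy model: each key is a path of (kind, name) pairs. Keeping the paths
# sorted places every child directly after its parent.
rows = sorted([
    (("Product", "gap"),),
    (("Product", "gap"), ("ProductVariant", "tshirt")),
    (("Product", "hugoboss"),),
    (("Product", "hugoboss"), ("ProductVariant", "pants1")),
    (("Product", "hugoboss"), ("ProductVariant", "pants2")),
])


def descendants(sorted_keys, ancestor):
    """Return the keys below `ancestor` with one contiguous range scan."""
    # Position of the first key that sorts after the ancestor itself.
    start = bisect.bisect_right(sorted_keys, ancestor)
    result = []
    for key in sorted_keys[start:]:
        if key[:len(ancestor)] != ancestor:
            break  # left the ancestor's contiguous range
        result.append(key)
    return result


hugoboss = (("Product", "hugoboss"),)
children = descendants(rows, hugoboss)
child_names = [key[-1][1] for key in children]
```

Since all children of hugoboss sit next to each other in key order, the lookup only reads that one slice and never touches the gap rows.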

The second approach is powerful! I hope this helps.

【Comments】: