【Question title】: MongoDB .NET driver and text search
【Posted】: 2017-04-16 08:39:34
【Question】:

I am using this MongoDB driver: https://mongodb.github.io/mongo-csharp-driver/ and I want to search using a text index that (I think) was created on all text fields, as follows:

{
    "_fts" : "text",
    "_ftsx" : 1
}
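
For context: the key document above is how MongoDB reports any text index, and the fields it actually covers are listed under the index's weights. If the index still needs to be created, a minimal sketch of building a text index over all string fields from the C# driver might look like this. It assumes a 2.x driver; the class and method names (TextIndexSetup, BuildWildcardTextIndexModel, EnsureWildcardTextIndex) are made up for illustration.

```csharp
using MongoDB.Driver;

public static class TextIndexSetup
{
    // Builds the index model only; nothing is sent to the server here.
    // "$**" is MongoDB's wildcard specifier: index every string field as text.
    public static CreateIndexModel<T> BuildWildcardTextIndexModel<T>()
    {
        var keys = Builders<T>.IndexKeys.Text("$**");
        return new CreateIndexModel<T>(keys);
    }

    // Sends the createIndexes command and returns the index name reported by the server.
    // A collection can have only one text index, so this fails if a different one exists.
    public static string EnsureWildcardTextIndex<T>(IMongoCollection<T> collection)
    {
        return collection.Indexes.CreateOne(BuildWildcardTextIndexModel<T>());
    }
}
```

Calling TextIndexSetup.EnsureWildcardTextIndex(aCollection) once at startup should be enough, since creating an index that already exists with the same definition is a no-op.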

I filter the data with LINQ queries, for example:

MongoClient client = new MongoClient(_mongoConnectionString);
IMongoDatabase mongoDatabase = client.GetDatabase(DatabaseName);
var aCollection = mongoDatabase.GetCollection<MyTypeSerializable>(CollectionName);

IMongoQueryable<MyTypeSerializable> queryable = aCollection.AsQueryable()
                .Where(e=> e.Field == 1);
var result = queryable.ToList();

How can I do a text search with this approach?

【Question comments】:

    Tags: mongodb linq


    【Answer 1】:

    While searching for a solution I found the FilterDefinition<T>.Inject() extension method. So we can go one step further and create another extension on IMongoQueryable<T>:

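    using MongoDB.Driver;
    using MongoDB.Driver.Linq;

    public static class MongoQueryableFullTextExtensions
    {
        // Adds a $text search to a LINQ query by injecting a builder-made filter into Where().
        public static IMongoQueryable<T> WhereText<T>(this IMongoQueryable<T> query, string search)
        {
            var filter = Builders<T>.Filter.Text(search);
            return query.Where(_ => filter.Inject());
        }
    }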
    

    And use it like this:

    IMongoDatabase database = GetMyDatabase();
    
    var results = database
        .GetCollection<Blog>("Blogs")
        .AsQueryable()
        .WhereText("stackoverflow")
        .Take(10)
        .ToArray();
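
Since the injected filter is just another boolean in the predicate, it can presumably be combined with ordinary LINQ conditions in the same Where call. A rough sketch, assuming a 2.x driver whose LINQ provider supports Inject(); the Blog members and the BlogQueries helper are hypothetical:

```csharp
using MongoDB.Driver;
using MongoDB.Driver.Linq;

// Hypothetical document type for illustration only.
public class Blog
{
    public string Id { get; set; }
    public string Title { get; set; }
    public int Rating { get; set; }
}

public static class BlogQueries
{
    // Builds (but does not execute) a query that mixes a $text filter
    // with a normal LINQ comparison in one Where call.
    public static IMongoQueryable<Blog> SearchHighlyRated(IMongoQueryable<Blog> blogs, string search)
    {
        var textFilter = Builders<Blog>.Filter.Text(search);
        return blogs.Where(b => textFilter.Inject() && b.Rating >= 4);
    }
}
```

One thing to keep in mind: MongoDB only accepts $text in the first stage of an aggregation pipeline, so the text condition should go in a Where placed directly after AsQueryable(), before any Select, OrderBy, or Take.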
    

    Hope this helps someone :)

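    【Comments】:

    • This is exactly what I needed! The same approach (use a builder, then inject the result into the Where call) also lets you build any other kind of query, such as geospatial queries.
    • Oh my, this is great! I was actually trying to keep a reference to the MongoCollection in my repository just for text search. Now I can keep only the IMongoQueryable reference and carry on.
    • I know this was 4 years ago, but thanks mate, this is great.
    【Answer 2】: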

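    Looking at PredicateTranslator in the C# MongoDB driver, no expression is translated into a text query. So you will not be able to get a text query out of a LINQ query.

    However, you can try doing the text search with Builders<T>: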

    MongoClient client = new MongoClient(_mongoConnectionString);
    IMongoDatabase mongoDatabase = client.GetDatabase(DatabaseName);
    var aCollection = mongoDatabase.GetCollection<MyTypeSerializable>(CollectionName);
    
    var cursor = await aCollection.FindAsync(Builders<MyTypeSerializable>.Filter.Text("search"));
    
    var results = await cursor.ToListAsync();
    

    Details on the text filter are here: https://docs.mongodb.com/manual/reference/operator/query/text/

    【Comments】:

    • Looks legit, I'll check and come back, thanks.
    • Could you tell me how to combine this approach with a LINQ-based filter? I could do cursor.ToEnumerable().Where(e => e.Field == 1); would that only be "materialized" after the final .ToList()?
    • Not sure whether LINQ can be mixed and matched with Mongo filters. Try: var builder = Builders<MyModel>.Filter; var filter = builder.And( builder.Text("search"), builder.Eq(x => x.Field, 1) );
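
On the question in the comments: cursor.ToEnumerable().Where(...) is plain LINQ to Objects, so the Field check would run on the client while the documents stream in, not on the server. The combined filter from the last comment keeps both conditions server-side. A sketch of that suggestion written out in full, assuming a 2.x driver; MyModel's members and the CombinedSearch helper are placeholders:

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

// Placeholder document type; only Field comes from the thread.
public class MyModel
{
    public ObjectId Id { get; set; }
    public int Field { get; set; }
    public string Body { get; set; }
}

public static class CombinedSearch
{
    // $text search AND an equality check, both evaluated by the server.
    public static FilterDefinition<MyModel> BuildFilter(string search, int fieldValue)
    {
        var builder = Builders<MyModel>.Filter;
        return builder.And(
            builder.Text(search),
            builder.Eq(x => x.Field, fieldValue));
    }

    // Executes the query; requires a reachable server and a text index on the collection.
    public static List<MyModel> Run(IMongoCollection<MyModel> collection, string search, int fieldValue)
    {
        return collection.Find(BuildFilter(search, fieldValue)).Limit(10).ToList();
    }
}
```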
    【Answer 3】:

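    You can modify the MongoDB driver source code. Let me explain:

    1. Note that PredicateTranslator does not translate LINQ expressions into a "$text" query. The FilterDefinitionBuilder class does have a Text() method, but PredicateTranslator has no way of knowing that an entity class property has a text search index.
    2. You have to mark the entity class property (the one used as a condition in the predicate) with an attribute. The attribute signals that the property has a full-text search index.
    3. From then on, the PredicateTranslator class can tell from this attribute that the property has a full-text search index.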

    Let me show you some code:

    1. Create an attribute in the MongoDB.Bson project, as shown below:

      [AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
      public class BsonFullTextSearchAttribute : Attribute { }

    2. Put the BsonFullTextSearchAttribute attribute on your entity class property, as shown below:

      public class History 
      {
          [MongoDB.Bson.Serialization.Attributes.BsonFullTextSearch]
          public string ObjectJSON { get; set; }
      }
      
    3. In MongoDB.Driver.Linq.Translators.QueryableTranslator.cs:

      • Add a field that holds the entity class type taken from the expression, as shown below:

        private Type _sourceObjectTypeInExpression;
        
      • Add a method that gets the entity class type, as follows:

        private void GetObjectType(Expression node)
        {
            if (node.Type != null && node.Type.GenericTypeArguments != null && node.Type.GenericTypeArguments.Length > 0)
            {
                this._sourceObjectTypeInExpression = node.Type.GenericTypeArguments[0]; 
            }
        }
        
      • Replace the "public static QueryableTranslation Translate()" method as follows:

        public static QueryableTranslation Translate(Expression node, IBsonSerializerRegistry serializerRegistry, ExpressionTranslationOptions translationOptions)
        {
            var translator = new QueryableTranslator(serializerRegistry, translationOptions);
            translator.GetObjectType(node);
            translator.Translate(node);

            var outputType = translator._outputSerializer.ValueType;
            var modelType = typeof(AggregateQueryableExecutionModel<>).MakeGenericType(outputType);
            var modelTypeInfo = modelType.GetTypeInfo();
            var outputSerializerInterfaceType = typeof(IBsonSerializer<>).MakeGenericType(new[] { outputType });
            var constructorParameterTypes = new Type[] { typeof(IEnumerable<BsonDocument>), outputSerializerInterfaceType };
            var constructorInfo = modelTypeInfo.GetConstructors(BindingFlags.Instance | BindingFlags.NonPublic)
                .Where(c => c.GetParameters().Select(p => p.ParameterType).SequenceEqual(constructorParameterTypes))
                .Single();
            var constructorParameters = new object[] { translator._stages, translator._outputSerializer };
            var model = (QueryableExecutionModel)constructorInfo.Invoke(constructorParameters);

            return new QueryableTranslation(model, translator._resultTransformer);
        }
        
      • In the TranslateWhere() method, pass the "_sourceObjectTypeInExpression" field to the static PredicateTranslator.Translate() method:

        var predicateValue = PredicateTranslator.Translate(node.Predicate, _serializerRegistry, this._sourceObjectTypeInExpression);
        

    4. In MongoDB.Driver.Linq.Translators.PredicateTranslator.cs:

        - Add a field: "private Type sourceObjectTypeInExpression = null;"

        - Replace the constructor as shown below (there must be only one constructor):
            private PredicateTranslator(Type _sourceObjectTypeInExpression)
            {
                this.sourceObjectTypeInExpression = _sourceObjectTypeInExpression;
            }
        
        - Replace the function "public static BsonDocument Translate(Expression node, IBsonSerializerRegistry serializerRegistry)" as shown below:
            public static BsonDocument Translate(Expression node, IBsonSerializerRegistry serializerRegistry, Type sourceObjectTypeInExpression)
            {
                var translator = new PredicateTranslator(sourceObjectTypeInExpression);
                node = FieldExpressionFlattener.FlattenFields(node);
                return translator.Translate(node)
                    .Render(serializerRegistry.GetSerializer<BsonDocument>(), serializerRegistry);
            }
        
        - Add these lines for the reflection cache:
            #region FullTextSearch
            // Cache: entity type name -> names of members marked with BsonFullTextSearchAttribute.
            // A static readonly initializer is already thread-safe, so no manual locking is needed.
            private static readonly ConcurrentDictionary<string, List<string>> FullTextSearchObjectCache =
                new ConcurrentDictionary<string, List<string>>();
        
            private bool IsFullTextSearchProp(Type entityType, string propName)
            {
                bool retVal = false;
                string entityName = entityType.Name;
        
                this.SetObject2FullTextSearchObjectCache(entityType);
                if (FullTextSearchObjectCache.ContainsKey(entityName))
                {
                    List<string> x = FullTextSearchObjectCache[entityName];
                    retVal = x.Any(p => p == propName);
                }
        
                return retVal;
            }
        
            private void SetObject2FullTextSearchObjectCache(Type entityType)
            {
                string entityName = entityType.Name;
        
                if (!FullTextSearchObjectCache.ContainsKey(entityName))
                {
                    List<string> retVal = new List<string>();
        
                    PropertyInfo[] currentProperties = entityType.GetProperties(BindingFlags.Public | BindingFlags.Instance);
                    foreach (PropertyInfo tmp in currentProperties)
                    {
                        var attributes = tmp.GetCustomAttributes();
                        BsonFullTextSearchAttribute x = (BsonFullTextSearchAttribute)attributes.FirstOrDefault(a => typeof(BsonFullTextSearchAttribute) == a.GetType());
                        if (x != null)
                        {
                            retVal.Add(tmp.Name);
                        }
                    }
        
                    FieldInfo[] currentFields = entityType.GetFields(BindingFlags.Public | BindingFlags.Instance);
                    foreach (FieldInfo tmp in currentFields)
                    {
                        var attributes = tmp.GetCustomAttributes();
                        BsonFullTextSearchAttribute x = (BsonFullTextSearchAttribute)attributes.FirstOrDefault(a => typeof(BsonFullTextSearchAttribute) == a.GetType());
                        if (x != null)
                        {
                            retVal.Add(tmp.Name);
                        }
                    }
        
                    FullTextSearchObjectCache.AddOrUpdate(entityName, retVal, (k, v) => v);
                }
            }
            #endregion
        
        - Replace the "switch (operatorType)" switch in the "private FilterDefinition<BsonDocument> TranslateComparison(Expression variableExpression, ExpressionType operatorType, ConstantExpression constantExpression)" function as shown below:
            bool isFullTextSearchProp = this.IsFullTextSearchProp(this.sourceObjectTypeInExpression, fieldExpression.FieldName);
            switch (operatorType)
            {
                case ExpressionType.Equal:
                    if (!isFullTextSearchProp)
                    {
                        return __builder.Eq(fieldExpression.FieldName, serializedValue);
                    }
                    else
                    {
                        return __builder.Text(serializedValue.ToString());
                    }
                case ExpressionType.GreaterThan: return __builder.Gt(fieldExpression.FieldName, serializedValue);
                case ExpressionType.GreaterThanOrEqual: return __builder.Gte(fieldExpression.FieldName, serializedValue);
                case ExpressionType.LessThan: return __builder.Lt(fieldExpression.FieldName, serializedValue);
                case ExpressionType.LessThanOrEqual: return __builder.Lte(fieldExpression.FieldName, serializedValue);
                case ExpressionType.NotEqual:
                    if (!isFullTextSearchProp)
                    {
                        return __builder.Ne(fieldExpression.FieldName, serializedValue);
                    }
                    else
                    {
                        throw new ApplicationException(string.Format("Cannot use \"NotEqual\" on FullTextSearch property: \"{0}\"", fieldExpression.FieldName));
                    }
            }
        
        - Replace the "switch (methodCallExpression.Method.Name)" switch in the "private FilterDefinition<BsonDocument> TranslateStringQuery(MethodCallExpression methodCallExpression)" function as shown below:
            bool isFullTextSearchProp = this.IsFullTextSearchProp(this.sourceObjectTypeInExpression, tmpFieldExpression.FieldName);
            var pattern = Regex.Escape((string)constantExpression.Value);
            if (!isFullTextSearchProp)
            {
                switch (methodCallExpression.Method.Name)
                {
                    case "Contains": pattern = ".*" + pattern + ".*"; break;
                    case "EndsWith": pattern = ".*" + pattern; break;
                    case "StartsWith": pattern = pattern + ".*"; break; // query optimizer will use index for rooted regular expressions
                    default: return null;
                }
        
                var caseInsensitive = false;
                MethodCallExpression stringMethodCallExpression;
                while ((stringMethodCallExpression = stringExpression as MethodCallExpression) != null)
                {
                    var trimStart = false;
                    var trimEnd = false;
                    Expression trimCharsExpression = null;
                    switch (stringMethodCallExpression.Method.Name)
                    {
                        case "ToLower":
                        case "ToLowerInvariant":
                        case "ToUpper":
                        case "ToUpperInvariant":
                            caseInsensitive = true;
                            break;
                        case "Trim":
                            trimStart = true;
                            trimEnd = true;
                            trimCharsExpression = stringMethodCallExpression.Arguments.FirstOrDefault();
                            break;
                        case "TrimEnd":
                            trimEnd = true;
                            trimCharsExpression = stringMethodCallExpression.Arguments.First();
                            break;
                        case "TrimStart":
                            trimStart = true;
                            trimCharsExpression = stringMethodCallExpression.Arguments.First();
                            break;
                        default:
                            return null;
                    }
        
                    if (trimStart || trimEnd)
                    {
                        var trimCharsPattern = GetTrimCharsPattern(trimCharsExpression);
                        if (trimCharsPattern == null)
                        {
                            return null;
                        }
        
                        if (trimStart)
                        {
                            pattern = trimCharsPattern + pattern;
                        }
                        if (trimEnd)
                        {
                            pattern = pattern + trimCharsPattern;
                        }
                    }
        
                    stringExpression = stringMethodCallExpression.Object;
                }
        
                pattern = "^" + pattern + "$";
                if (pattern.StartsWith("^.*"))
                {
                    pattern = pattern.Substring(3);
                }
                if (pattern.EndsWith(".*$"))
                {
                    pattern = pattern.Substring(0, pattern.Length - 3);
                }
        
                var fieldExpression = GetFieldExpression(stringExpression);
                var options = caseInsensitive ? "is" : "s";
                return __builder.Regex(fieldExpression.FieldName, new BsonRegularExpression(pattern, options));
            }
            else
            {
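                // Pass the raw search string: "pattern" is regex-escaped, which would corrupt the $text search terms.
                return __builder.Text((string)constantExpression.Value);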
            }
        

    【Comments】:

      【Answer 4】:

      How about:

      IMongoQueryable<MyTypeSerializable> queryable = aCollection
      .AsQueryable()
      .Where(e=> e.Field.Contains("term"));
      

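      【Comments】:

      • Are you sure this uses the text index?
      • That depends on two factors: the implementation of the LINQ provider and the query processor on the MongoDB side. To be sure, you would have to trace that chain.
      • Looking at the source code above, it does not use MongoDB's text query.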
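
To make the difference concrete: going by the TranslateStringQuery code quoted in Answer 3, a string Contains() call ends up as a regular expression on a single field, whereas a text search is a top-level $text operator. A small sketch that builds both query shapes by hand, purely for comparison; the QueryShapes helper is made up and is not part of the driver:

```csharp
using System.Text.RegularExpressions;
using MongoDB.Bson;

public static class QueryShapes
{
    // Roughly what e.Field.Contains("term") is translated to: { Field: /term/s }.
    // An unanchored regex like this cannot make use of a text index.
    public static BsonDocument RegexShape(string fieldName, string term)
    {
        return new BsonDocument(fieldName, new BsonRegularExpression(Regex.Escape(term), "s"));
    }

    // What Builders<T>.Filter.Text("term") stands for: { $text: { $search: "term" } }.
    // This is the only query form that is served by the text index.
    public static BsonDocument TextShape(string term)
    {
        return new BsonDocument("$text", new BsonDocument("$search", term));
    }
}
```

So the Contains() version still returns matching documents, but as a substring match rather than a word-based, stemmed text search.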