Searching in the Milvus database

灵海之森 · Published 2023-11-17 14:09:08

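I. Vector similarity search
A vector similarity search in Milvus computes the distance between the query vector and the vectors in the collection using the specified similarity metric, and returns the most similar results. By adding a boolean expression that filters on scalar fields or the primary key field, you can turn it into a hybrid search.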
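
To make "compute the distance and return the most similar results" concrete, here is a minimal brute-force sketch in plain Python. It is not how Milvus works internally (Milvus uses indexes to avoid scanning every vector), and the values Milvus reports for L2 may be scaled differently from this textbook formula (for instance by omitting the square root), although the ranking is the same. The names `l2_distance`, `brute_force_search`, and the toy data are hypothetical.

```python
import math

def l2_distance(a, b):
    """Textbook Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_search(query, vectors, limit):
    """Return (primary_key, distance) pairs for the `limit` nearest vectors.

    `vectors` maps a primary key to its vector. Results are sorted
    nearest first, because a smaller L2 distance means more similar.
    """
    scored = [(pk, l2_distance(query, vec)) for pk, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:limit]

# Hypothetical toy collection: primary key -> 2-dimensional vector.
book_intro = {
    1: [0.1, 0.2],
    2: [0.9, 0.8],
    3: [0.15, 0.25],
    4: [0.5, 0.5],
}

top2 = brute_force_search([0.1, 0.2], book_intro, limit=2)
print(top2)
```

The query vector is identical to the vector stored under key 1, so that entry comes back first with a distance of zero, followed by its closest neighbour.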

1. Load the collection
A collection must be loaded into memory before it can be searched.

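from pymilvus import Collection
# Assumes a connection to Milvus is already open, e.g. via pymilvus.connections.connect().
collection = Collection("book")      # Get an existing collection.
collection.load()                    # Load it into memory.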

2. Prepare the search parameters
Choose search parameters that fit your search scenario.

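search_params = {
    "metric_type": "L2",          # similarity metric
    "offset": 0,                  # number of leading results to skip
    "ignore_growing": False,      # whether to skip growing segments during the search
    "params": {"nprobe": 10},     # suitable for an IVF_FLAT index
}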

The optional keys in params and their meanings are as follows:

nprobe: The number of cluster units to search. Available only when index_type is IVF_FLAT, IVF_SQ8, or IVF_PQ. The value should not exceed the nlist specified when the index was built.

ef: The search scope. Available only when index_type is HNSW. The value should be within the range from top_k to 32768.

radius: Used for range search. Marks the boundary of the least similar vectors to return.

range_filter: Used for range search together with radius. Restricts the results to vectors whose similarity to the query vector falls within a specific range.
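
The constraints listed above can be checked before a search is sent. The sketch below is a hypothetical helper (`check_search_params` is not part of pymilvus) that encodes only the rules stated in this section. It accepts nprobe equal to nlist, which is an assumption and simply means every cluster is scanned.

```python
def check_search_params(index_type, params, top_k, nlist=None):
    """Return a list of problems found in `params`; an empty list means OK.

    `nlist` is the value used when the index was built. If it is unknown,
    pass None and the nprobe upper bound is not checked.
    """
    problems = []
    ivf_types = {"IVF_FLAT", "IVF_SQ8", "IVF_PQ"}

    if "nprobe" in params:
        nprobe = params["nprobe"]
        if index_type not in ivf_types:
            problems.append("nprobe only applies to IVF_FLAT, IVF_SQ8, or IVF_PQ")
        elif nprobe < 1:
            problems.append("nprobe must be at least 1")
        elif nlist is not None and nprobe > nlist:
            # Assumption: nprobe == nlist is allowed, nprobe > nlist is not.
            problems.append("nprobe should not exceed nlist")

    if "ef" in params:
        ef = params["ef"]
        if index_type != "HNSW":
            problems.append("ef only applies to HNSW")
        elif not (top_k <= ef <= 32768):
            problems.append("ef should be within [top_k, 32768]")

    return problems

ok = check_search_params("IVF_FLAT", {"nprobe": 10}, top_k=10, nlist=128)
bad = check_search_params("HNSW", {"ef": 5}, top_k=10)
print(ok)
print(bad)
```

The first call passes because nprobe is within the number of clusters. The second is flagged because ef is smaller than top_k.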

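3. Run the vector search

# Use the collection object's search method to run the vector search
results = collection.search(
    data=[[0.1, 0.2]],  # query vector(s)
    anns_field="book_intro",  # vector field to search
    param=search_params,  # search parameters
    limit=10,  # maximum number of results to return
    expr=None,  # boolean filter expression (none here)
    output_fields=['title'],  # fields to return with each hit
    consistency_level="Strong"  # consistency level
)

# IDs of the most similar entities for the first query vector
results[0].ids

# Distances of those entities to the query vector
results[0].distances

# The first (best) hit
hit = results[0][0]

# Value of the 'title' field of that hit
hit.entity.get('title')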

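II. Hybrid search

A hybrid search is a vector search with attribute filtering. A boolean expression on scalar fields or the primary key field narrows the search scope first.
1. Load the collection (same as in section I)
2. Run the hybrid vector search
This is the same search call as before, with a boolean expression added.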

search_param = {
  "data": [[0.1, 0.2]],
  "anns_field": "book_intro",
  "param": {"metric_type": "L2", "params": {"nprobe": 10}, "offset": 0},
  "limit": 10,
  "expr": "word_count <= 11000",
}
res = collection.search(**search_param)
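
The "filter first, then rank by distance" idea can be sketched in plain Python. This is only an illustration of the semantics, not Milvus's implementation: Milvus evaluates the expr string itself, whereas here a Python callable stands in for it. The helper names and the toy rows are hypothetical.

```python
def squared_l2(a, b):
    """Squared Euclidean distance; enough for ranking purposes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def hybrid_search(query, rows, predicate, limit):
    """Keep rows that satisfy `predicate`, then return the ids of the
    `limit` rows whose vectors are nearest to `query`."""
    candidates = [row for row in rows if predicate(row)]
    candidates.sort(key=lambda row: squared_l2(query, row["book_intro"]))
    return [row["book_id"] for row in candidates[:limit]]

# Hypothetical toy rows with a scalar field and a vector field.
rows = [
    {"book_id": 1, "word_count": 10000, "book_intro": [0.9, 0.9]},
    {"book_id": 2, "word_count": 12000, "book_intro": [0.1, 0.2]},  # nearest, but filtered out
    {"book_id": 3, "word_count": 11000, "book_intro": [0.2, 0.2]},
    {"book_id": 4, "word_count": 500, "book_intro": [0.5, 0.5]},
]

# Stand-in for the expression "word_count <= 11000".
ids = hybrid_search(
    [0.1, 0.2],
    rows,
    predicate=lambda row: row["word_count"] <= 11000,
    limit=2,
)
print(ids)
```

Book 2 has exactly the query vector but never appears in the result, because its word_count fails the filter before any distance is considered.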

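3. Check the search results

assert len(res) == 1  # one query vector was sent, so one result set comes back
hits = res[0]
assert len(hits) <= 10  # at most `limit` hits; the exact count depends on the data
print(f"- Total hits: {len(hits)}, hits ids: {hits.ids} ")
print(f"- Top1 hit id: {hits[0].id}, distance: {hits[0].distance}, score: {hits[0].score} ")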

III. Range search
1. Load the collection (same as in section I)
2. Define the range search parameters
With the L2 metric:

param = {
    # use `L2` as the metric to calculate the distance
    "metric_type": "L2",
    "params": {
        # radius: only vectors whose distance to the query vector is smaller than 1.0 are returned
        "radius": 1.0,
        # range_filter: of those, only vectors with a distance greater than or equal to 0.8 are kept
        "range_filter": 0.8
    }
}

With the inner product (IP) metric:

param = {
    # use `IP` as the metric to calculate the distance
    "metric_type": "IP",
    "params": {
        # radius: only vectors with an inner product greater than 0.8 are returned
        "radius": 0.8,
        # range_filter: of those, only vectors with an inner product less than or equal to 1.0 are kept
        "range_filter": 1.0
    }
}
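
The two parameter sets differ because the metrics point in opposite directions: with L2 a smaller value means more similar, with IP a larger value does. The sketch below spells out which values fall inside the range in each case. The helper `in_range` is hypothetical, and the treatment of the exact boundary values (which side is inclusive) is an assumption based on one reading of the Milvus documentation, so verify it against your Milvus version.

```python
def in_range(metric_type, distance, radius, range_filter):
    """Return True if `distance` would fall inside the searched range.

    Assumed boundary rules:
      L2: range_filter <= distance < radius   (smaller is more similar)
      IP: radius < distance <= range_filter   (larger is more similar)
    """
    if metric_type == "L2":
        return range_filter <= distance < radius
    if metric_type == "IP":
        return radius < distance <= range_filter
    raise ValueError(f"unsupported metric_type: {metric_type}")

# L2 with radius=1.0 and range_filter=0.8 keeps distances in [0.8, 1.0).
print(in_range("L2", 0.9, radius=1.0, range_filter=0.8))
print(in_range("L2", 0.5, radius=1.0, range_filter=0.8))

# IP with radius=0.8 and range_filter=1.0 keeps values in (0.8, 1.0].
print(in_range("IP", 0.9, radius=0.8, range_filter=1.0))
print(in_range("IP", 1.2, radius=0.8, range_filter=1.0))
```

In both cases radius marks the least similar side of the range and range_filter the most similar side.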

3. Run the range search

res = collection.search(
    data=[[0.3785311281681061,0.2960498034954071]], # query vector
    anns_field='book_intro', # vector field name
    param=param, # search parameters defined in step 2
    limit=5 # number of results to return
)

print(res)
