[Bug]: multi-col top-k unexpected #33146

Closed
1 task done
JackTan25 opened this issue May 19, 2024 · 30 comments
Labels: kind/bug (Issues or changes related a bug), triage/needs-information (Indicates an issue needs more information in order to work on it.)

Comments

@JackTan25

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.4.1
- Deployment: standalone
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): CentOS7
- CPU/Memory: 12c 128G
- GPU: 
- Others:

Current Behavior

A multi-column (multi-vector) top-k search with k = 50 returns only ten results.

Expected Behavior

The search should return 50 results.

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

@JackTan25 JackTan25 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 19, 2024
@JackTan25
Author

unexpected_top_k.zip

@JackTan25
Author

JackTan25 commented May 19, 2024

Steps to reproduce:

  1. Download and unzip https://github.com/milvus-io/milvus/files/15369214/unexpected_top_k.zip
  2. python3 small_data/data_load_small_data.py # load data
  3. python3 small_data/milvus_small_multi_vector.py # run multi-top-k test

cc @yanliang567, please check milvus.txt for the query results.

@yanliang567
Contributor

I could not reproduce the issue on the latest Milvus 2.4.2 build; I got 50 results as expected.
image
Could you please retry?
/assign @JackTan25
/unassign

@yanliang567 yanliang567 added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 20, 2024
@JackTan25
Author

@yanliang567 Hi, did you modify my code? Were there any errors?

@yanliang567
Contributor

> @yanliang567 Hi, did you modify my code? Were there any errors?

I changed only the Milvus server IP.

@JackTan25
Author

My version is milvusdb/milvus:v2.4.0-rc.1. @yanliang567

@JackTan25
Author

JackTan25 commented May 20, 2024

Were there any changes to Milvus multi-vector search between these two versions?

@JackTan25
Author

version: '3.5'

services:
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    healthcheck:
      test: ["CMD", "etcdctl", "endpoint", "health"]
      interval: 30s
      timeout: 20s
      retries: 3

  minio:
    container_name: milvus-minio
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    ports:
      - "9001:9001"
      - "9000:9000"
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/minio:/minio_data
    command: minio server /minio_data --console-address ":9001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3

  standalone:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.4.1
    command: ["milvus", "run", "standalone"]
    security_opt:
    - seccomp:unconfined
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/milvus:/var/lib/milvus
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
      interval: 30s
      start_period: 90s
      timeout: 20s
      retries: 3
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - "etcd"
      - "minio"

networks:
  default:
    name: milvus

This is my docker-compose.yaml file. Is there any problem with it? I previously upgraded Milvus from 2.3.5 to 2.4.1, so I ran:

docker compose down
docker compose up -d

and I got the warning below:

WARN[0000] /home/tanboyu/cpp_workspace/milvus/docker-compose.yml: `version` is obsolete 

I'm not sure whether this could cause the issue. @yanliang567

@xiaofan-luan
Contributor

@JackTan25
If you can share the scripts you used to reproduce this issue, we can double-check it.

@JackTan25
Author

wget https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh
bash standalone_embed.sh start

It seems I can only get 2.4.0; newer versions can't be retrieved.

@xiaofan-luan
Contributor

@yanliang567

Can we use this script to reproduce the issue on 2.4.2?

@yanliang567
Contributor

> @yanliang567
>
> Can we use this script to reproduce the issue on 2.4.2?

Not reproduced on the 2.4.2 build; please retry.

@JackTan25
Author

I can't pull the newest 2.4.2; I can only get 2.4.1 or 2.4.0. I can reproduce the issue on both 2.4.1 and 2.4.0. Can you try these two versions?

@yanliang567
Contributor

Try updating the Milvus tag to 2.4-20240515-b2d83d33.

@JackTan25
Author

> Try updating the Milvus tag to 2.4-20240515-b2d83d33.

OK, let me check it.

@JackTan25
Author

JackTan25 commented May 20, 2024

This is my Milvus version: milvusdb/milvus:2.4-20240515-b2d83d33. I still get only 10 results, which is very strange.

@JackTan25
Author

Hi, can I see the distances of your 50 results? I get distances of the form 0.xxxx, which is very strange.

["['id: 115882, distance: 1.0, entity: {}', 'id: 712697, distance: 0.7491630911827087, entity: {}', 'id: 255111, distance: 0.746990442276001, entity: {}', 'id: 735007, distance: 0.7439651489257812, entity: {}', 'id: 425115, distance: 0.7413135766983032, entity: {}', 'id: 81369, distance: 0.7407634258270264, entity: {}', 'id: 714808, distance: 0.7384312152862549, entity: {}', 'id: 464140, distance: 0.7377769947052002, entity: {}', 'id: 93407, distance: 0.733290433883667, entity: {}', 'id: 265376, distance: 0.7315851449966431, entity: {}']"]
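A quick way to double-check the count is to parse the printed hit strings. The sketch below is only an illustration: it copies the ten hit strings from the output above, and the helper names (`parse_hits`, `is_non_increasing`) are made up for this example and are not part of pymilvus. It confirms that exactly ten hits came back and that the reported values decrease from 1.0. How those values were computed is not established in this thread.

```python
import re

# The ten hit strings printed above, copied verbatim.
HITS = [
    "id: 115882, distance: 1.0, entity: {}",
    "id: 712697, distance: 0.7491630911827087, entity: {}",
    "id: 255111, distance: 0.746990442276001, entity: {}",
    "id: 735007, distance: 0.7439651489257812, entity: {}",
    "id: 425115, distance: 0.7413135766983032, entity: {}",
    "id: 81369, distance: 0.7407634258270264, entity: {}",
    "id: 714808, distance: 0.7384312152862549, entity: {}",
    "id: 464140, distance: 0.7377769947052002, entity: {}",
    "id: 93407, distance: 0.733290433883667, entity: {}",
    "id: 265376, distance: 0.7315851449966431, entity: {}",
]

# Hypothetical helper pattern: matches "id: <int>, distance: <float>".
HIT_RE = re.compile(r"id: (\d+), distance: ([0-9.eE+-]+)")


def parse_hits(lines):
    """Return (id, distance) pairs parsed from printed hit strings."""
    pairs = []
    for line in lines:
        match = HIT_RE.search(line)
        if match is None:
            raise ValueError(f"unrecognized hit line: {line!r}")
        pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


def is_non_increasing(values):
    """True if every value is >= the one that follows it."""
    return all(a >= b for a, b in zip(values, values[1:]))


hits = parse_hits(HITS)
distances = [distance for _, distance in hits]
print(len(hits))                      # number of hits actually returned
print(is_non_increasing(distances))   # are the values sorted high to low?
```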

@JackTan25
Author

JackTan25 commented May 20, 2024

I reinstalled Milvus as follows:

## after downloading, I changed the tag to 2.4-20240515-b2d83d33
wget https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh
bash standalone_embed.sh start

@yanliang567
Contributor

res[0].ids
[667, 608, 620, 111, 339, 7, 372, 567, 971, 183, 709, 177, 853, 357, 738, 904, 580, 47, 497, 309, 141, 153, 723, 673, 134, 302, 829, 181, 566, 299, 494, 555, 610, 703, 449, 506, 322, 161, 296, 632, 742, 469, 649, 521, 232, 346, 959, 317, 324, 157]
res[0].distances
[0.5353243350982666, 0.5318502187728882, 0.5216019153594971, 0.5177504420280457, 0.5133214592933655, 0.49992430210113525, 0.4973728656768799, 0.48816734552383423, 0.4849900007247925, 0.4736632704734802, 0.4726385474205017, 0.47195154428482056, 0.47125494480133057, 0.47053390741348267, 0.4699331521987915, 0.4690789580345154, 0.4685344099998474, 0.46435511112213135, 0.46267831325531006, 0.45821017026901245, 0.4578573703765869, 0.45749562978744507, 0.45422059297561646, 0.45378345251083374, 0.45277297496795654, 0.4502769112586975, 0.4501721262931824, 0.4498910903930664, 0.44975870847702026, 0.44871199131011963, 0.4486366808414459, 0.4457487463951111, 0.4427493214607239, 0.4417317509651184, 0.43927425146102905, 0.43822145462036133, 0.436090350151062, 0.4359341263771057, 0.435463547706604, 0.43288862705230713, 0.4318164587020874, 0.43161338567733765, 0.4314650893211365, 0.43105268478393555, 0.43021082878112793, 0.4291290044784546, 0.4284785985946655, 0.4277331233024597, 0.42639073729515076, 0.42496371269226074]

@JackTan25
Author

Does Milvus transform the L2 distance internally? The dataset's L2 distances should not be 0.0xx, because every dimension of the ssnap and bigann columns is a uint8 value.
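The reasoning behind this question can be sketched with plain arithmetic. Assuming the stored vectors really hold raw integer values in the uint8 range, the squared L2 distance between any two of them is an integer, so two distinct vectors are at least 1 apart and a value strictly between 0 and 1 cannot be a raw L2 distance. The vectors and the function name below are made up for illustration; they are not taken from the dataset or from Milvus.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length integer vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Two made-up 4-dimensional vectors with uint8-range components.
a = [0, 10, 255, 3]
b = [1, 10, 250, 3]

print(squared_l2(a, b))  # 1 + 0 + 25 + 0 = 26
print(squared_l2(a, a))  # identical vectors are the only way to get 0
```

Under that assumption, fractional values such as 0.53 would have to come from something other than the raw L2 distance, for example a different metric, normalized data, or a fused score. The thread does not say which.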

@JackTan25
Author

JackTan25 commented May 20, 2024

I only get 667, 608, 620, 111, 339, 7, 372, 567, 971, 183.
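These ten ids can be compared with the 50 ids posted earlier in the thread by @yanliang567. The sketch below copies both lists and checks whether the short list is a prefix of the long one; `is_prefix` is a hypothetical helper written for this example. A match is consistent with the same ranking being cut off after ten hits rather than a different ranking being produced, although the thread does not confirm where such a cut-off would happen.

```python
# The 50 ids reported by @yanliang567 (res[0].ids), copied from the thread.
FULL_IDS = [
    667, 608, 620, 111, 339, 7, 372, 567, 971, 183,
    709, 177, 853, 357, 738, 904, 580, 47, 497, 309,
    141, 153, 723, 673, 134, 302, 829, 181, 566, 299,
    494, 555, 610, 703, 449, 506, 322, 161, 296, 632,
    742, 469, 649, 521, 232, 346, 959, 317, 324, 157,
]

# The ten ids reported by @JackTan25.
PARTIAL_IDS = [667, 608, 620, 111, 339, 7, 372, 567, 971, 183]


def is_prefix(short, full):
    """True if `short` equals the first len(short) items of `full`."""
    return full[:len(short)] == short


print(len(FULL_IDS), len(PARTIAL_IDS))
print(is_prefix(PARTIAL_IDS, FULL_IDS))
```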

@yanliang567
Contributor

Did you run on a completely new deployment or on an instance that was upgraded from 2.3?
Please provide the Milvus logs for investigation. For Milvus installed with docker-compose, you can use docker-compose logs > milvus.log to export the logs.

@JackTan25
Author

JackTan25 commented May 20, 2024

I ran bash standalone_embed.sh stop and bash standalone_embed.sh delete, then ran wget https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh and bash standalone_embed.sh start. Does that matter? I followed https://milvus.io/docs/install_standalone-docker.md

I used plain Docker rather than Docker Compose.

@JackTan25
Author

image

@yanliang567
Contributor

That's okay, as long as you can export the logs.

@JackTan25
Author

JackTan25 commented May 20, 2024

$ docker logs c3b1521e50ea > milvus.log
milvus.log
Here is the log. cc @yanliang567

@JackTan25
Author

JackTan25 commented May 20, 2024

Can you tell from the log what the problem is?

@JackTan25
Author

milvus.log
I extracted the logs related to the query here; it has only 1200 rows.

@JackTan25
Author

JackTan25 commented May 20, 2024

I suspect it is the environment. I'm in standalone mode, and my environment is CentOS7 with Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz 12-core processors and 128GB of RAM. How about yours? cc @yanliang567

@JackTan25
Author

JackTan25 commented May 21, 2024

We should refer to this file: https://github.com/milvus-io/pymilvus/blob/master/examples/hybrid_search.py. Let's close this one.
