Issue with bulk ingestion with few Query nodes hitting maximum memory limit #28682

If shard_num = 4, the collection has 4 "data channels", and 4 query nodes act as "leaders", each consuming the data from one channel.
When you insert data, the proxy node hashes each "reviewer_id" to an integer value. The hash value modulo 4 determines which channel the entity belongs to.
If all the values of "reviewer_id" are the same, every entity hashes to the same value and all the data lands on the same channel. That means only one channel is busy while the others are idle.
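
To make the skew concrete, here is a rough sketch of the hash-then-modulo routing described above. It is only an illustration: `zlib.crc32` is a stand-in hash chosen for this example (the actual hash function Milvus uses is not stated here), and `channel_for` and `channel_load` are hypothetical helper names, not Milvus APIs.

```python
import zlib
from collections import Counter

SHARD_NUM = 4  # matches shard_num = 4 from the explanation above


def channel_for(reviewer_id: str, shard_num: int = SHARD_NUM) -> int:
    """Map a key to a channel index via hash-then-modulo.

    zlib.crc32 is an assumed stand-in hash, not Milvus's real one.
    """
    return zlib.crc32(reviewer_id.encode("utf-8")) % shard_num


def channel_load(reviewer_ids, shard_num: int = SHARD_NUM):
    """Count how many entities each channel would receive."""
    counts = Counter(channel_for(r, shard_num) for r in reviewer_ids)
    return [counts.get(ch, 0) for ch in range(shard_num)]


# Every entity carries the same reviewer_id: one channel takes everything.
same = channel_load(["reviewer-42"] * 1000)

# Distinct reviewer_id values: the load is spread over more than one channel.
varied = channel_load([f"reviewer-{i}" for i in range(1000)])

print("identical ids:", same)
print("distinct ids: ", varied)
```

With identical ids, one entry of the first list holds all 1000 entities and the rest are zero, which is the "one channel busy, others idle" situation described above.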

Replies: 2 comments 11 replies

yhmo Nov 23, 2023
Collaborator

Answer selected by kdabbir
2 participants