Skip to content

Add missing similarity field to KnnQuery and missing fields to KnnSea… #2224

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

pstrsr
Copy link
Contributor

@pstrsr pstrsr commented Aug 3, 2023

@cla-checker-service
Copy link

cla-checker-service bot commented Aug 3, 2023

💚 CLA has been signed

@pstrsr
Copy link
Contributor Author

pstrsr commented Aug 4, 2023

@ezimuel I signed the Contributor Agreement on Tuesday, but the check still failed. Does it take a while for it to be available? Since then I have signed it 2 more times, just to be sure nothing went wrong.

Could you help me getting the cla-check to pass?

@flobernd
Copy link
Member

flobernd commented Aug 4, 2023

@AlmostFamiliar You have to use the mail that is assigned to your commit. I can see you signed with 2 mails, but none of them equals the commit mail 🙂

@pstrsr
Copy link
Contributor Author

pstrsr commented Aug 4, 2023

Makes sense. I got a bit confused, because the document asked for the GitHub username (it gets automatically swapped with the mail from the step before and I did not notice that).

Regardless. Thanks for the help, I put this other mail as well now 👍

{
"description": "The required minimum similarity for a vector to be considered a match",
"name": "similarity",
"required": true,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

Are we sure similarity is a required parameter? I believe it is a requirement for the similarity score when provided but not a required parameter for the knn query, per https://www.elastic.co/guide/en/elasticsearch/reference/8.9/knn-search.html#_search_knn_with_expected_similarity.

Copy link
Contributor Author

@pstrsr pstrsr Aug 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, that was a mistake on my part, thank you for catching that.

I added a commit that fixes this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I click on resolve conversation, or leave this as is for now?

@@ -25416,7 +25416,7 @@
{
"description": "The query vector",
"name": "query_vector",
"required": true,
"required": false,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to enforce that one of query_vector or query_vector_builder are defined through TS and/or the spec? The way the type would work if we were to merge this would inform the user that both fields are optional

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of. I just copied this over from the KnnQuery Interface, because I saw it was out of date and I thought I might as well fix that clearly out of date part of the code.

I could just remove this part from the MR and this can be fixed another time. The Java Client only needs the KnnQuery interface part, as far as I can tell. That way the similarity field would work and people could use it. Should I do that?

Copy link
Contributor Author

@pstrsr pstrsr Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mshameti I have created a simplified version of this MR, so that people can work with vector search in elasticsearch until there is more clarity on this issue.
See here:
#2261

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mshameti

Is there a way to enforce that one of query_vector or query_vector_builder are defined through TS and/or the spec?

We can do this:

export type KnnQuery = {
  /** The name of the vector field to search against */
  field: Field
  /** The final number of nearest neighbors to return as top hits */
  k: integer
  /** The number of nearest neighbor candidates to consider per shard */
  num_candidates: integer
  /** The minimum similarity required for a document to be considered a match */
  similarity?: float
  /** Boost value to apply to kNN scores */
  boost?: float
  /** Filters for the kNN search query */
  filter?: QueryContainer | QueryContainer[]
} & (
  {
    /** The query vector */
    query_vector: QueryVector
  } & {
    /** The query vector builder. You must provide a query_vector_builder or query_vector, but not both. */
    query_vector_builder: QueryVectorBuilder
  }
)

But we will get this error:

Error: Unhandled node kind: IntersectionType

/** Filters for the kNN search query */
filter?: QueryContainer | QueryContainer[]
/** The minimum similarity for a vector to be considered a match */
similarity?: double

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pstrsr hello, according to https://www.elastic.co/guide/en/elasticsearch/reference/8.10/knn-search-api.html the _knn_search API doesn't have similarity parameter.

I also want to see similarity parameter published: #2293 . Because of that I suggest to revert changes to this file completely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it seems that I confused the APIs. Reverted the changes.

@pstrsr pstrsr force-pushed the knn-query-similarity branch from 44fd54f to 246a17a Compare September 21, 2023 05:51
@pstrsr
Copy link
Contributor Author

pstrsr commented Sep 28, 2023

Closing because #2261 got merged

@pstrsr pstrsr closed this Sep 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants