-
Notifications
You must be signed in to change notification settings - Fork 455
add dot support in DSL indentifier #3989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
add dot support in DSL indentifier
@trinity-1686a can you take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
your changes seems to work fine for a json like
{
"abc.def": 123
}
but not for
{
"abc": {
"def": 123
}
}
(in that case, hash_attribute
search for the literal key abc.def
and finds nothing)
We want to support both use cases (probably with slightly different syntax, like having abc\.def
to search for the literal key abc.def
, and abc.def
for nested keys).
Do you want to implement the whole thing (that is, a bit more change to the dsl, and teach hash_attribute
how to handle nested keys), or leave things mostly as is and keep the issue open for now? (in the 2nd case, I'd still prefer for the dsl to want .
for not nested keys to be escaped, so it's not a breaking change when support for nested key is implemented)
@trinity-1686a Good catch. |
We do not have the choice, users send data with arbitrary keys. I agree with @trinity-1686a on the syntax:
|
Okay, let's use abc\.def to search for the literal key and abc.def for nested keys |
(I took the liberty of editing your message, the \ got eaten my markdown and made the message unclear) |
refactor hash_attribute to hash nested keys
add test case
minor bug fix add more test cases
@trinity-1686a @fmassot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are on the indexing critical path here so we need to be extra careful when doing extra allocation. This make this issue an interesting problem! Can you review your code and try to remove all allocations?
@JerryKwan: you could start by adding a benchmark to monitor the performance (thanks @guilload for the suggestion!). There is a benchmark for the doc mapper, we could probably use it. |
optimize routing expression add benchmark test for routing expression
@fmassot @trinity-1686a |
@JerryKwan can you share the bench results before/after? |
@fmassot
And this is the one after changes
|
quickwit/quickwit-doc-mapper/benches/routing_expression_bench.rs
Outdated
Show resolved
Hide resolved
@trinity-1686a Thanks for reviewing, I will refactor the codes asap, thank you |
hi @JerryKwan, it looks like you haven't had time to make progress on this PR. Would you mind if I take over? |
@trinity-1686a |
no problem 😉 |
add dot support in DSL indentifier
Description
In the previous implementation, the identifier only supports { a..z | A..Z | 0..9 | _ }, and does not support set partition_key on key in json field and key with dot
How was this PR tested?
Closes:#3380