Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix map field's prefix not being set in marqo__field_types when partial updating + add api tests #1144

Open
wants to merge 17 commits into
base: mainline
Choose a base branch
from

Conversation

adityabharadwaj198
Copy link
Member

@adityabharadwaj198 adityabharadwaj198 commented Mar 12, 2025

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
  1. Fixed a bug where when we insert a map using partial updates, the map prefix in a flattened map field name was not getting set in marqo__field_types
  2. Adds some API tests in correspondence with the set of manual tests that we were running
  • What is the current behavior? (You can also link to an open issue here)

  • wrt the bug - currently if a map is inserted via a partial update (it must not exist from before), the marqo__field_types will consist of int_map.key1, int_map.key2 but no int_map. This doesn't happen when inserting maps via add_docs and is critical to prevent type conversions.

  • What is the new behavior (if this is a feature change)?

  • if a map is inserted via a partial update (it must not exist from before), the marqo__field_types will consist of int_map.key1: int_map_entry, int_map.key2: int_map_entry & int_map: int_map_entry.

  • Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)

  • No

  • Have unit tests been run against this PR? (Has there also been any additional testing?)

  • Yes

  • Related Python client changes (link commit/PR here)

  • NA

  • Related documentation changes (link commit/PR here)

  • Other information:

  • Please check if the PR fulfills these requirements

  • The commit message follows our guidelines
  • Tests for the changes have been added (for bug fixes/features)
  • Docs have been added / updated (for bug fixes / features)

…fields, adding new maps with docs resulted in map type issues & later didn't allow us to properly update maps in subsequent calls.

Also added more robust testing where we are now testing the field_type metadata & running every test through 3 different docs
…n_uuid is updated everytime a map update happens.
mostly added more test cases and fixed logging in some existing ones
…n-uuid

# Conflicts:
#	src/marqo/core/semi_structured_vespa_index/semi_structured_vespa_index.py
#	tests/integ_tests/core/document/test_partial_update_semi_structured.py
@adityabharadwaj198 adityabharadwaj198 changed the title Aditya/partial updates add more api tests Fix map field's prefix not being set in marqo__field_types when partial updating + add api tests Mar 13, 2025
Copy link
Collaborator

@farshidz farshidz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a test that covers the bug you're fixing in this PR

# Handle creating update statements for the map field name.
if "." in field_name:
# For fields like "map1.key1", extract the map name "map1"
map_name = field_name.split(".", 1)[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we have this logic in a function that we could reuse?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

introduced a method for this.

@@ -453,12 +459,17 @@ def _create_update_statement_for_updating_numeric_and_numeric_map_field(
if (original_field_name not in numeric_field_map and
original_doc.fixed_fields.field_types.get(original_field_name) in (MarqoFieldTypes.INT_MAP.value, MarqoFieldTypes.FLOAT_MAP.value)):

map_name = original_field_name.split(".", 1)[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have a func rather than repeat the split logic. Probably even a func to detect if it's a map (contains .)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

introduced a method for this.

self.assertEqual(get_docs_response['new_float_map.e'], 2.718)

def test_add_new_tensor_fields_with_update_documents_api(self):
"""Test that adding new tensor fields, string arrays, and strings via update_documents fails.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in general you're putting too many things in your tests. This one for example looks like at least three separate tests. In general if you have a lot of asserts, you're probably testing too many things at once.

Also print statement in test is unusual and not good practice usually. If there's an issue, raise an error and have the string as its message

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've broken down some tests that I could see fit. Please let me know if you think any other tests can be split also.

Deleted all print statements. Some are still there in the tests with threads and can help in debugging. let me know if you want me to delete them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants