
Error saving: Unable to create attribute (object header message is too large) #2117

Open
leo8163 opened this issue Feb 17, 2025 · 3 comments
Labels: bug (Something isn't working)


leo8163 commented Feb 17, 2025

I'm getting an error when trying to save my SLEAP project as .slp after running inference: "An error occurred when attempting to save: Unable to create attribute (object header message is too large). Try saving your project with a different filename or in a different format". I saved my project with a shorter filename but still got the same error. Exporting my data as CSV worked, but I need the project saved in the .slp format to keep the inference results.

On the Anaconda prompt, it says "RuntimeError: unable to create attribute (object header message is too large)." I tried to re-run it but still got the same issue. I am currently doing this with 150 videos at once, which I have run through SLEAP in the past with no issue saving. As a test, I made another SLEAP project with just one video (and my normal-length filename), ran inference with the same model, and it saved without any issues.

Not sure if an update is needed, if I need to run fewer videos at once, or something else. Please advise!

[Two screenshots attached: the error dialog and the terminal output.]
leo8163 added the bug label Feb 17, 2025

roomrys commented Feb 18, 2025

Hi @leo8163,

From your description, it sounds like an error with how much data is being written to a single SLP file. But I have a feeling there is more we can do to handle storing a project this large... (see action items).

Action Items (to help us)

To help us pin down the issue, are you able to include more information from your terminal? This would let us see where in the SLEAP code the error occurs (i.e., in saving videos/provenance/labels). A few lines (~5) before you see Traceback should be sufficient.

Work-around (as you've already discovered)

For training a model, it is a good idea to sparsely annotate frames from across all videos. For running inference on all frames, however, splitting the .slp files by video (or into smaller video sets) would be the work-around here.
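A rough sketch of that per-video approach (not an official SLEAP recipe; the model folder and video paths below are placeholders):

import subprocess
from pathlib import Path

model = "models/my_model"  # placeholder: your trained model folder
videos = sorted(Path("videos").glob("*.mp4"))  # placeholder video set

for video in videos:
    out = video.with_suffix(".predictions.slp")
    # One small predictions file per video instead of one huge project file.
    subprocess.run(
        ["sleap-track", str(video), "-m", model, "-o", str(out)],
        check=True,
    )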

Thanks,
Liezl

Related (External) Posts:


leo8163 commented Feb 20, 2025

Hi @roomrys, thank you for your prompt response! This is the only image I have from the terminal. Unfortunately, I have closed out of Anaconda and am not able to scroll back further than the Traceback. I might have to rerun that whole file to see if I get the same error. I have since tried running half the number of videos (I started with 153 and am now doing 88), and the SLEAP file seems to be saving the inference like normal, so perhaps it is a file-size issue. I have attached the image I was referring to earlier below. Again, thank you for your help!

    'start_timestamp': '2025-02-16 14:39:34.826657',
    'finish_timestamp': '2025-02-16 14:40:16.825106'
}

Saved output: C:/SLEAP/Retrieval Project AIM/Retrieval G1 Test\predictions\Ret_G1_MF_Test_v16_021625_AIM.slp.250216_143931.predictions.slp

Process return code: 0
Traceback (most recent call last):
    File "C:\Users\eisch\anaconda3\envs\sleap\lib\site-packages\sleap\gui\commands.py", line 1078, in _try_save
        Labels.save_file(labels=labels, filename=filename, as_format=extension)
    File "C:\Users\eisch\anaconda3\envs\sleap\lib\site-packages\sleap\io\dataset.py", line 1997, in save_file
        write(filename, labels, *args, **kwargs)
    File "C:\Users\eisch\anaconda3\envs\sleap\lib\site-packages\sleap\io\format\main.py", line 162, in write
        return disp.write(filename, source_object, *args, **kwargs)
    File "C:\Users\eisch\anaconda3\envs\sleap\lib\site-packages\sleap\io\format\dispatch.py", line 79, in write
        return adaptor.write(filename, source_object, *args, **kwargs)
    File "C:\Users\eisch\anaconda3\envs\sleap\lib\site-packages\sleap\io\format\hdf5.py", line 341, in write
        meta_group.attrs["json"] = np.string_(json_dumps(d))
    File "h5py\_objects.pyx", Line 54, in h5py._objects.with_phil.wrapper
    File "h5py\_objects.pyx", Line 55, in h5py._objects.with_phil.wrapper
    File "C:\Users\eisch\anaconda3\envs\sleap\lib\site-packages\h5py\_hl\attrs.py", line 103, in __setitem__
        self.create(name, data=value)
    File "C:\Users\eisch\anaconda3\envs\sleap\lib\site-packages\h5py\_hl\attrs.py", line 197, in create
        attr = h5a.create(self._id, self._e(tempname), htype, space)
    File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
    File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
    File "h5py\h5a.pyx", line 50, in h5py.h5a.create
RuntimeError: Unable to create attribute (object header message is too large)

roomrys changed the title from "Error when attempting to save file after inference" to "Unable to create attribute (object header message is too large)" Feb 20, 2025
roomrys changed the title from "Unable to create attribute (object header message is too large)" to "Error saving: Unable to create attribute (object header message is too large)" Feb 20, 2025

roomrys commented Feb 20, 2025

That is perfect for us, thank you. I'll begin my analysis of what we can do to save larger projects (and will keep updating this comment).

Analysis

Starting point

All .slp and .pkg.slp files (excluding .analysis.slp files) are serialized from complex SLEAP objects to primitive types via the LabelsV1Adaptor.write method. This is the line of interest inside this method where things fail:

# Output the dict to JSON
meta_group.attrs["json"] = np.string_(json_dumps(d))

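The RuntimeError itself comes from an HDF5 limitation: attributes are stored in the object header, which caps a single header message at roughly 64 KiB (with h5py's default file settings and a fixed-length string like np.string_ produces). A minimal reproduction, independent of SLEAP; the filename is arbitrary:

import h5py
import numpy as np

with h5py.File("repro.h5", "w") as f:
    meta_group = f.require_group("metadata")
    # A small fixed-length string fits inline in the object header.
    meta_group.attrs["ok"] = np.string_("x" * 1024)
    try:
        # ~100 KiB exceeds the ~64 KiB object header message limit.
        meta_group.attrs["json"] = np.string_("x" * 100_000)
    except RuntimeError as e:
        print(e)  # Unable to create attribute (object header message is too large)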

Now, what is d and why is it so big - too big?

Initial d

The initial d contains everything but the labeled frame data.

d is initially defined at the start of the LabelsV1Adaptor.write method as:

# Serialize all the meta-data to JSON.
d = labels.to_dict(skip_labels=True)

where the Labels.to_dict method returns the following dict:

sleap/io/dataset.py, lines 1931-1946 (commit 6531447):

# Serialize the skeletons, videos, and labels
dicts = {
    "version": LABELS_JSON_FILE_VERSION,
    "skeletons": skeleton_cattr.unstructure(self.skeletons),
    "nodes": cattr.unstructure(self.nodes),
    "videos": Video.cattr().unstructure(self.videos),
    "tracks": track_cattr.unstructure(self.tracks),
    "suggestions": label_cattr.unstructure(self.suggestions),
    "negative_anchors": label_cattr.unstructure(self.negative_anchors),
    "provenance": label_cattr.unstructure(self.provenance),
}
if not skip_labels:
    dicts["labels"] = label_cattr.unstructure(self.labeled_frames)
return dicts

This skips the actual LabeledFrame data - which is good news, since that would easily make our metadata WAY too large. Some other keys that could scale with project size are "suggestions" (all the SuggestionFrames from the Labeling Suggestions), "tracks" (all the Track objects in the project - used or unused), and "videos" (all the Videos in the project).
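As a hypothetical diagnostic (assuming a SLEAP environment; "project.slp" is a placeholder path), we can measure how much JSON each metadata key actually contributes:

import json
from sleap import Labels

labels = Labels.load_file("project.slp")  # placeholder path
d = labels.to_dict(skip_labels=True)
for key, value in d.items():
    # default=str guards against any non-primitive left after unstructuring.
    print(f"{key}: {len(json.dumps(value, default=str))} bytes of JSON")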

(Lack of) Modifications based on save_frame_data

There are no modifications when saving an .slp to another .slp.

In the next few lines, we modify d based on save_frame_data. However, we know that save_frame_data is False here because we call Labels.save_file (which then calls this LabelsV1Adaptor.write method) from this SaveProjectAs command:

sleap/gui/commands.py, lines 1071-1079 (commit 6531447):

class SaveProjectAs(AppCommand):
    @staticmethod
    def _try_save(context, labels: Labels, filename: str):
        """Helper function which attempts save and handles errors."""
        success = False
        try:
            extension = (PurePath(filename).suffix)[1:]
            extension = None if (extension == "slp") else extension
            Labels.save_file(labels=labels, filename=filename, as_format=extension)

where we opt not to pass in save_frame_data (this is only passed in as True when saving a .pkg.slp file). Hence, we use a default value of
save_frame_data: bool = False,

and instead of entering the
if save_frame_data:

condition, we enter:
else:
    # Include the source video metadata if this was a package.
    new_videos = []
    for video in labels.videos:
        if hasattr(video.backend, "_source_video"):
            new_videos.append(video.backend._source_video)
        else:
            new_videos.append(video)
    d["videos"] = Video.cattr().unstructure(new_videos)

which has no effect on our analysis (see #462 for the reasoning behind this condition).

Modifications based on append

We seem to remove all the scale-with-dataset keys here.

Next, we do (or don't do) some appending - potentially rewriting the entire dictionary d. Following the calls from Labels.save_file in the aforementioned SaveProjectAs command, we see that there is no append keyword argument passed in, so we use the default value of append: bool = False:

https://github.com/talmolab/sleap/blob/653144760882aa3f5c7f1199366d45aba0bccaed/sleap/io/format/hdf5.py#L269

Thus, we skip the condition:

if append and "json" in meta_group.attrs:

and instead enter:
if not append:
    # These items are stored in separate lists because the metadata
    # group got to be too big.
    for key in ("videos", "tracks", "suggestions"):
        # Convert for saving in hdf5 dataset
        data = [np.string_(json_dumps(item)) for item in d[key]]
        hdf5_key = f"{key}_json"
        # Save in its own dataset (e.g., videos_json)
        f.create_dataset(hdf5_key, data=data, maxshape=(None,))
        # Clear from dict since we don't want to save this in attribute
        d[key] = []

Hmm... here we seem to remove exactly the keys that we said before would scale with the project:

    Some other keys that could scale with project size are "suggestions" (all the SuggestionFrames from the Labeling Suggestions), "tracks" (all the Track objects in the project - used or unused), and "videos" (all the Videos in the project).

This leaves us scratching our heads, since the very next line is where we run into the error.
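For reference, this is what the split buys us: each entry becomes its own JSON string in a resizable dataset, so none of these keys is subject to the attribute limit. A small sketch of reading one of them back (the filename is a placeholder):

import json
import h5py

with h5py.File("project.slp", "r") as f:  # placeholder path
    # Each element of videos_json is an independent JSON string.
    videos = [json.loads(s) for s in f["videos_json"][:]]
print(f"{len(videos)} videos stored outside the metadata attribute")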

Is it possible we enter append accidentally? No.

if append and "json" in meta_group.attrs:

... say, if we (mis)read it as if (append and "json") in meta_group.attrs?
No.

In [1]: append = False

In [2]: append and "json"
Out[2]: False

In [3]: import h5py

In [4]: filename = "test.slp"

In [5]: with h5py.File(filename, "a") as f:
   ...:     meta_group = f.require_group("metadata")
   ...:     meta_group.attrs["format_id"] = 1.2
   ...:     if append and "json" in meta_group.attrs:
   ...:         print("True")
   ...: 

In [6]: with h5py.File(filename, "a") as f:
   ...:     meta_group = f.require_group("metadata")
   ...:     meta_group.attrs["format_id"] = 1.2
   ...:     if not (append and "json" in meta_group.attrs):
   ...:         print("True")
   ...: 
True

In [7]: with h5py.File(filename, "a") as f:
   ...:     meta_group = f.require_group("metadata")
   ...:     meta_group.attrs["format_id"] = 1.2
   ...:     if (append and "json") in meta_group.attrs:
   ...:         print("True")
   ...: 
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[7], line 4
      2 meta_group = f.require_group("metadata")
      3 meta_group.attrs["format_id"] = 1.2
----> 4 if (append and "json") in meta_group.attrs:
      5     print("True")

File h5py/_objects.pyx:54, in h5py._objects.with_phil.wrapper()

File h5py/_objects.pyx:55, in h5py._objects.with_phil.wrapper()

File ~/micromamba/envs/sa0/lib/python3.9/site-packages/h5py/_hl/attrs.py:282, in AttributeManager.__contains__(self, name)
    279 @with_phil
    280 def __contains__(self, name):
    281     """ Determine if an attribute exists, by name. """
--> 282     return h5a.exists(self._id, self._e(name))

File ~/micromamba/envs/sa0/lib/python3.9/site-packages/h5py/_hl/base.py:200, in CommonStateObject._e(self, name, lcpl)
    198 else:
    199     try:
--> 200         name = name.encode('ascii')
    201         coding = h5t.CSET_ASCII
    202     except UnicodeEncodeError:

AttributeError: 'bool' object has no attribute 'encode'

What keys (other than "suggestions", "tracks", or "videos") could be too large?

Still processing...
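In the meantime, here is a hypothetical follow-up check that mirrors the write path's clearing step and then measures what remains (again assuming a SLEAP environment and a placeholder path):

import json
from sleap import Labels

d = Labels.load_file("project.slp").to_dict(skip_labels=True)  # placeholder path
for key in ("videos", "tracks", "suggestions"):
    d[key] = []  # mirror LabelsV1Adaptor.write with append=False
sizes = {key: len(json.dumps(value, default=str)) for key, value in d.items()}
total = len(json.dumps(d, default=str))
print(f"total metadata JSON after clearing: {total} bytes")
print("exceeds ~64 KiB header limit:", total > 64 * 1024)
for key, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
    print(f"{key}: {size} bytes")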
