Bug in `<video>` placeholder detection #5768

Closed
1 task done
cqray1990 opened this issue Oct 21, 2024 · 7 comments
Labels
solved This problem has been already solved

Comments


cqray1990 commented Oct 21, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

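The content contains <video>, but my data has nothing to do with video, which causes this error:
[rank0]: ValueError: len(videos) is less than the number of <video> tokens.

Does this mean that content can never contain strings such as <image> or <video>? That seems less than ideal.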

    while VIDEO_PLACEHOLDER in content:
        if num_video_tokens >= len(video_grid_thw):
            raise ValueError("`len(videos)` is less than the number of {} tokens.".format(VIDEO_PLACEHOLDER))

        content = content.replace(
            VIDEO_PLACEHOLDER,
            "<|vision_start|>{}<|vision_end|>".format(
                self.video_token * (video_grid_thw[num_video_tokens].prod() // merge_length)
            ),
            1,
        )
        num_video_tokens += 1
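
The failure can be reproduced without any model. The sketch below mirrors the quoted loop as a standalone function; it is not the library's code. The name `expand_video_placeholders`, the plain integer list standing in for `video_grid_thw[i].prod() // merge_length`, the `<|video_pad|>` token string, and the sample sentence are all assumptions made for illustration.

```python
VIDEO_PLACEHOLDER = "<video>"  # the tag the quoted loop searches for


def expand_video_placeholders(content, video_token_counts, video_token="<|video_pad|>"):
    """Replace each placeholder with vision tokens, mirroring the quoted loop.

    video_token_counts is a hypothetical stand-in for
    video_grid_thw[i].prod() // merge_length (one int per supplied video).
    """
    num_video_tokens = 0
    while VIDEO_PLACEHOLDER in content:
        # Every occurrence of the tag must be backed by one supplied video.
        if num_video_tokens >= len(video_token_counts):
            raise ValueError(
                "`len(videos)` is less than the number of {} tokens.".format(VIDEO_PLACEHOLDER)
            )
        content = content.replace(
            VIDEO_PLACEHOLDER,
            "<|vision_start|>{}<|vision_end|>".format(
                video_token * video_token_counts[num_video_tokens]
            ),
            1,  # replace only the first remaining occurrence
        )
        num_video_tokens += 1
    return content


# Hypothetical text-only sample: it mentions the HTML tag but has no video attached.
text_only = "HTML5 adds the <video> element."
try:
    expand_video_placeholders(text_only, video_token_counts=[])
except ValueError as err:
    print(err)
```

Because the loop matches on the raw substring, it cannot tell a real placeholder from a literal mention of the tag in ordinary text, which is the behavior reported here.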

Data sample:
"from": "gpt",
"value": "Certainly! Here are simple examples for each point to help illustrate the concepts:\n\n1. Design trends:\na. Responsive design: A website that adjusts its layout for smartphones, tablets, and desktop computers.\nb. Minimalism: A website with a simple color scheme, limited images, and plenty of whitespace.\nc. Typography: Using a clear and legible font like Arial or Helvetica.\nd. UX/UI design: A shopping website with an easy-to-use search bar and a smooth checkout process.\ne. Dark mode: Twitter or YouTube offering a dark background with light text for nighttime viewing.\n2. Best practices:\na. Accessibility: Providing alt text for images so screen readers can describe them to visually impaired users.\nb. Mobile-friendliness: A website that loads quickly and is easy to navigate on a smartphone.\nc. Fast loading times: A news website that displays articles and images without making users wait.\nd. Clear navigation: A company website with a menu that clearly lists "About Us," "Products," and "Contact Us."\ne. SEO: A blog that uses relevant keywords and phrases to rank higher in Google search results.\n3. Technologies and tools:\na. HTML5: Using the

Reproduction

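The content contains <video>, but my data has nothing to do with video, which causes this error:
[rank0]: ValueError: len(videos) is less than the number of <video> tokens.

Does this mean that content can never contain strings such as <image> or <video>? That seems less than ideal.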


Expected behavior

No response

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Oct 21, 2024
hiyouga (Owner) commented Oct 22, 2024

After updating the code, set the environment variable VIDEO_PLACEHOLDER="<reserved>".
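
A minimal sketch of what this setting presumably does, assuming the updated code reads the placeholder from the environment and falls back to `<video>`. The thread does not show that code, so the helper names `resolve_video_placeholder` and `count_placeholders` and the sample sentence are hypothetical.

```python
import os


def resolve_video_placeholder(environ=os.environ):
    # Assumption: the tag comes from the environment and defaults to "<video>".
    return environ.get("VIDEO_PLACEHOLDER", "<video>")


def count_placeholders(content, environ=os.environ):
    # Number of substrings that would be treated as video placeholders.
    return content.count(resolve_video_placeholder(environ))


text = "HTML5 adds the <video> element."  # hypothetical text-only sample

# Default tag: the literal mention is counted as a placeholder.
print(count_placeholders(text, {}))  # 1

# Overridden tag: "<video>" is now ordinary text.
print(count_placeholders(text, {"VIDEO_PLACEHOLDER": "<reserved>"}))  # 0
```

Under this assumption the override only helps when no sample in the dataset needs a real `<video>` placeholder, since the same tag applies to every sample.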

@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Oct 22, 2024
cqray1990 (Author) commented Oct 22, 2024

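After updating the code, set the environment variable #VIDEO_PLACEHOLDER=""

After updating the code, what should I do if the dataset is mixed? It contains both video samples and ordinary conversation data. If I set VIDEO_PLACEHOLDER="", the <video> tag in the content of the video samples is ignored as well. How should this be handled? @hiyouga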

hiyouga (Owner) commented Oct 22, 2024

Does your data include videos or not?

cqray1990 (Author) commented Oct 22, 2024

Does your data include videos or not?

It is mixed; both kinds of data are present. @hiyouga

hiyouga (Owner) commented Oct 22, 2024

Only one video tag can be used; preprocess the dataset beforehand.
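
One way to read this advice for a mixed dataset is to rewrite the real placeholders to a tag that never occurs in ordinary text, then point `VIDEO_PLACEHOLDER` at that tag. The sketch below is an assumption-laden illustration, not a confirmed recipe: the `conversations` and `videos` field names, the tag `<|my_video|>`, the helper `retag_sample`, and the sample contents are hypothetical, and it assumes the first `len(videos)` occurrences of `<video>` in a sample are the real placeholders.

```python
import copy

DEFAULT_TAG = "<video>"
UNIQUE_TAG = "<|my_video|>"  # hypothetical; choose a tag absent from all text


def retag_sample(sample, new_tag=UNIQUE_TAG):
    """Return a copy in which real video placeholders use new_tag.

    Samples without videos are returned unchanged, so a literal "<video>"
    in their text is no longer mistaken for a placeholder.
    """
    sample = copy.deepcopy(sample)
    remaining = len(sample.get("videos") or [])
    for turn in sample.get("conversations", []):
        if remaining == 0:
            break
        # Heuristic: the earliest occurrences are the real placeholders.
        hits = min(turn["value"].count(DEFAULT_TAG), remaining)
        turn["value"] = turn["value"].replace(DEFAULT_TAG, new_tag, hits)
        remaining -= hits
    return sample


video_sample = {
    "conversations": [{"from": "human", "value": "<video>What happens here?"}],
    "videos": ["clip.mp4"],
}
text_sample = {
    "conversations": [{"from": "gpt", "value": "HTML5 adds the <video> element."}],
}

print(retag_sample(video_sample)["conversations"][0]["value"])
print(retag_sample(text_sample)["conversations"][0]["value"])
```

After such a pass, training would presumably be launched with `VIDEO_PLACEHOLDER` set to the chosen tag, so that only video samples contain it.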

MengHao666 commented

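After updating the code, set the environment variable VIDEO_PLACEHOLDER="<reserved>"

Fine-tuning qwen2-vl with the latest version of the code reports the same error. What is the point of setting the environment variable this way?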

MengHao666 commented

Does your data include videos or not?

It is mixed; both kinds of data are present. @hiyouga

Has your problem been solved?

1587causalai pushed a commit to 1587causalai/llama_factory that referenced this issue Feb 18, 2025
3 participants