Skip to content

Feature/images #72

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

PhantomReactor
Copy link

@PhantomReactor PhantomReactor commented Apr 25, 2025

InShot_20250429_212847545.mp4

@PhantomReactor PhantomReactor changed the title Feature/images [WIP] Feature/images Apr 25, 2025
@PhantomReactor
Copy link
Author

PhantomReactor commented Apr 25, 2025

Tested openai and anthropic. Need to fix test cases and refactor

@@ -38,7 +39,7 @@ func (e *AgentEvent) Response() message.Message {
}

type Service interface {
Run(ctx context.Context, sessionID string, content string) (<-chan AgentEvent, error)
Run(ctx context.Context, sessionID string, content string, attachments [][]byte) (<-chan AgentEvent, error)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Run(ctx context.Context, sessionID string, content string, attachments [][]byte) (<-chan AgentEvent, error)
Run(ctx context.Context, sessionID string, content string, attachments ...[]byte) (<-chan AgentEvent, error)

Just a small thing maybe this looks better so that you don't have to use nil when you don't have attachments

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, you're right. will change

@kujtimiihoxha
Copy link
Collaborator

@PhantomReactor this is great 🙌

@AdityaGadhvi
Copy link

keep up the great work!

@PhantomReactor
Copy link
Author

hey @kujtimiihoxha, somethng's wrong with gemini models or am i just dumb? can't get them working

@PhantomReactor
Copy link
Author

hey @kujtimiihoxha, somethng's wrong with gemini models or am i just dumb? can't get them working

nevermind, saw your comments in other discussions

@kujtimiihoxha
Copy link
Collaborator

@PhantomReactor yeah that provides is messed up I did not test it enough, something is definitely wrong with it

@PhantomReactor
Copy link
Author

@PhantomReactor yeah that provides is messed up I did not test it enough, something is definitely wrong with it

okay. will test it from my end after this

Copy link
Collaborator

@kujtimiihoxha kujtimiihoxha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is not ready but saw a couple of things that maybe can be improved

@@ -95,6 +100,11 @@ type appModel struct {

showInitDialog bool
initDialog dialog.InitDialogCmp

editingMode bool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this came from pulling from main this is removed now.

@@ -157,7 +156,7 @@ func (a *agent) err(err error) AgentEvent {
}
}

func (a *agent) Run(ctx context.Context, sessionID string, content string) (<-chan AgentEvent, error) {
func (a *agent) Run(ctx context.Context, sessionID string, content string, attachmentPaths string, attachments ...[]byte) (<-chan AgentEvent, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we instead create a type here for attachment something like

type Attachment struct {
   Path string
   Data []byte
}

And than also update the message.BinaryContent to add the Path attribute, this way we do not need to add a new attribute to the message itself and it keeps the code a bit cleaner?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Didn't push my latest changes, that should address the attachment comment. Will push it when I get back
  2. Any reason we need path in binary content?

Also, I was trying to add image previews to the messages but they just keep aligning diagonally breaking the ui. I will try to fix it but if it takes time let's merge it to master without previews. Will keep you updated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. What is attachmentPaths used for, is it a list of paths for attachments that the message has? I might have misunderstood it, my thought was to store the paths in the binary content so that you don't have to add a new field to the message itself and you know which path represents which data.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am saving attachment paths in a new db column in messages table. We can use it to fetch all the attachments(assuming the path isn't broken) when retrieving message history and display them on messages on chat page. As I moved path to attachment type I don't see any reason to attach them to binary content. Let me know if I am missing something

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought was to remove the attachmentPaths from the message and instead add the Path in the BinaryContent instead, this way you have the path and the data (even if the path is broken you have the data the user initially uploaded) and you than you can use myMsg.BinaryContent() to get all binary content including all their paths and you can still display them in the chat page. The main reason to do it this way is to have 1 place we store related image data. Does this explain my thinking better? let me know if I can share more context.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I had the same thought initially but we will have to create a new table to store binary content. I decided not to considering the length of base64 strings and the size of the encoded attachments.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm what if instead we just update https://github.com/PhantomReactor/opencode/blob/feature/images/internal/message/content.go#L70 to be a string and store the base64 ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will be storing all the attachments base64 in db which is not a good idea. Also deserialising them from db would be slow.

Also I understand the downside to storing paths in db is they could be broken if user deletes files but I feel the second one is a better for performance

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I get your concern🤔, I was thinking since this is all local and the DB is per project it would not be a big issue but maybe you are right... in that case maybe we just update the BinaryContent model so that it just stores the Path and nothing else and than you use that to load the images in the history... We might need to just ignore paths that are now broken...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's right. i am saving it in attachment_paths column in messages table

@kujtimiihoxha
Copy link
Collaborator

Closes #40

@kujtimiihoxha
Copy link
Collaborator

I am also adding a new ImageInput flag in the models attributes so you can check that to know if you should include the binary messages or not.

This is in an unrelated PR just wanted to let you know.
Screenshot 2025-04-27 at 15 17 55

@kujtimiihoxha
Copy link
Collaborator

@PhantomReactor I merged fixes for the gemini provider it should work as expected now.

I am really exited for this PR thanks so much for working on this you are awesome 🙌

@PhantomReactor
Copy link
Author

thanks man, i too am excited to work on this. thanks for making this tool

@kujtimiihoxha
Copy link
Collaborator

@PhantomReactor another thing you could consider is pushing the preview stuff to a follow up pr and just include the paths for now. Another reason that might be a good thing is that rendering the images might make it slow to render if you have many...

@PhantomReactor
Copy link
Author

PhantomReactor commented Apr 28, 2025

It looks like bubbletea doesn't support images. There's an open issue for image support in bubbletea. Images sort of render but the ui just breaks after that. I will just add ASCII previews for file picker and messages will just have paths. Will push the latest changes today.

@PhantomReactor PhantomReactor changed the title [WIP] Feature/images Feature/images Apr 29, 2025
@kujtimiihoxha
Copy link
Collaborator

@PhantomReactor this is sooo amazing man, I think a couple of small things.

  • There is a small issue with the border background not being correct.
  • I think when you want to navigate we should allow arrow keys also
  • space or enter should go inside of folders
  • we should maybe not show the types of files we don't support at all
  • another thing I don't know is should we default the directory to cwd and allow them to move up if they want to.

@kujtimiihoxha kujtimiihoxha self-requested a review April 30, 2025 10:49
@PhantomReactor
Copy link
Author

  • Will check 1st and 4th points
  • I agree with the second and third points. Should be addressed in last commit
  • I think most users wouldn't have images in their projects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants