Questions from dev.to article (MCP alternative) #89

Open
samchon opened this issue Mar 10, 2025 · 41 comments

@samchon
Contributor

samchon commented Mar 10, 2025

Question from @ryoppippi after reading https://dev.to/samchon/i-made-mcp-model-context-protocol-alternative-solution-for-openai-and-all-other-llms-that-is-i7f


Hey, I read your blog post and found it really interesting, but I have a few questions and points I’d like to discuss:

From your architecture diagram, it seems Agentica is primarily a library—a wrapper for APIs like OpenAI and Gemini combined with typia to smooth out protocol differences. Since Agentica is a library (not a protocol) and MCP is designed as a general-purpose protocol (supporting multiple languages, tools, and even future support for Gemini/llama), isn’t comparing the two a bit like comparing apples to oranges? How do you think they should be compared, and on what criteria?

When comparing Agentica to other popular libraries like vercel-ai (which is also a library), what are the specific advantages of Agentica? Could you share concrete use cases or examples where Agentica clearly outperforms these alternatives?

Your post claims that using Agentica with a local LLM is cheaper than using MCP with Claude. However, couldn’t a combination of goose and MCP also enable local LLM usage, potentially achieving similar cost savings? What makes Agentica’s approach cost-wise a clear win?

You mentioned that “gpt-4o-mini (of 8b parameters) is taking about 70% of type level mistakes.” In my experience, I haven’t seen such a high error rate when using structured output. Do you have specific experimental data, evaluation metrics, or additional evidence that support this claim?

Lastly, while you highlight typia’s ability to auto-generate logic and Swagger documentation from TypeScript types—improving error detection and safety—many developers already use tools like vercel-ai, zod, and zod-to-json-schema. Could you elaborate on what makes typia’s benefits (e.g., improved safety or more accurate error detection) more compelling compared to these existing solutions?

Overall, I’m trying to understand the specific comparative advantages of Agentica. Is it the improved developer experience (DX), greater versatility, or cost efficiency? Given that options like MCP or vercel-ai + zod remain robust alternatives, what exactly sets Agentica apart?

I am going to bed now, so please understand that my reply will be late. Sorry.

Looking forward to your insights on these points!

@samchon samchon added the enhancement New feature or request label Mar 10, 2025
@samchon samchon added this to WrtnLabs Mar 10, 2025
@samchon
Contributor Author

samchon commented Mar 10, 2025

Good point.

The blog post is an experiment to find out which topics are worth writing about before the homepage is completed.

MCP was designed on the assumption that it can be used with models other than Anthropic Claude, but in practice it is used almost only with Claude, so I used somewhat exaggerated expressions. I faced similar criticism on Reddit, although some people there also responded that MCP is not actually used with anything other than Claude.

However, we need to reconsider the promotional phrasing so that it can ride on MCP's popularity without drawing excessive criticism. Below are a few more exaggerated phrases that I am considering along similar lines.

  • From today, all TypeScript developers are AI developers
  • Your backend is wrong (targeting the Java/Spring ecosystem)
  • Only Swagger is important for the new era's backend

@samchon
Contributor Author

samchon commented Mar 10, 2025

And you mentioned the accuracy of tool calling. That's a good point. Once the homepage opens, we should not say "it fails about 70% of the time" based on experience alone; we should state it with specific experimental figures.

Among the figures we currently have, there is a DTO schema (IShoppingSale.ICreate) with a 0% success rate on the first attempt: it fails every time, with gpt-4o as well as gpt-4o-mini.

This is the use case for creating products in a shopping mall, and it succeeds immediately after going through validation feedback once or twice.
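
The validation feedback loop mentioned here can be sketched roughly as follows. This is a minimal, self-contained illustration and not Agentica's or typia's actual implementation: `ProductInput`, `validateProduct`, `fakeLlm`, and `callWithFeedback` are hypothetical names, the price value is invented, and the "LLM" is a stub that makes a type-level mistake on its first attempt and corrects it once it receives the validation errors.

```typescript
// Hypothetical argument type for a "create product" function.
interface ProductInput {
  name: string;
  price: number;
}

// One validation error: where it happened and what was expected.
interface ValidationError {
  path: string;
  expected: string;
  value: unknown;
}

// Hand-written validator standing in for a generated one.
function validateProduct(input: unknown): ValidationError[] {
  const errors: ValidationError[] = [];
  const obj = (typeof input === "object" && input !== null ? input : {}) as Record<string, unknown>;
  if (typeof obj.name !== "string")
    errors.push({ path: "$input.name", expected: "string", value: obj.name });
  if (typeof obj.price !== "number")
    errors.push({ path: "$input.price", expected: "number", value: obj.price });
  return errors;
}

// Stub LLM: the first answer has a type-level mistake (price as a string);
// once it is given validation errors, it returns corrected arguments.
function fakeLlm(feedback: ValidationError[]): unknown {
  return feedback.length === 0
    ? { name: "Surface Pro 9", price: "1000" }
    : { name: "Surface Pro 9", price: 1000 };
}

// Ask for arguments, validate them, and feed the errors back until they pass.
// `attempts` follows the retry-count convention: 1 = success without feedback.
function callWithFeedback(maxAttempts: number): { attempts: number; value: ProductInput } {
  let feedback: ValidationError[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const candidate = fakeLlm(feedback);
    const errors = validateProduct(candidate);
    if (errors.length === 0)
      return { attempts: attempt, value: candidate as ProductInput };
    feedback = errors; // the next attempt sees exactly what was wrong
  }
  throw new Error(`arguments still invalid after ${maxAttempts} attempts`);
}

const result = callWithFeedback(3);
console.log(result.attempts, result.value);
```

The point of the sketch is that the validator's error list, not a hand-written prompt, is what drives the correction on the next attempt.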

Retry count
  - 1: success without validation feedback
  - 2: success after one validation feedback
[
  2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
  2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2,
  2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2,
  2, 3, 2, 2, 2, 2, 2, 2, 2, 3, 2, 3,
  2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2,
  2, 2, 2, 2, 3, 2, 3, 2, 3, 2, 2, 2,
  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
  2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 2, 2,
  2, 2, 2, 2
]
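
To make these figures easier to read, the retry counts can be tallied with a few lines of TypeScript. The array is copied from the list above; `tally` is just an illustrative helper name.

```typescript
// Retry counts copied from the experiment output above.
const retryCounts: number[] = [
  2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
  2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2,
  2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2,
  2, 3, 2, 2, 2, 2, 2, 2, 2, 3, 2, 3,
  2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2,
  2, 2, 2, 2, 3, 2, 3, 2, 3, 2, 2, 2,
  2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
  2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 2, 2,
  2, 2, 2, 2,
];

// Count how many runs finished with each retry count.
function tally(values: number[]): Record<number, number> {
  const counts: Record<number, number> = {};
  for (const v of values) counts[v] = (counts[v] || 0) + 1;
  return counts;
}

const counts = tally(retryCounts);
for (const key of Object.keys(counts).sort()) {
  console.log(`retry count ${key}: ${counts[Number(key)]} of ${retryCounts.length} runs`);
}
```

With the data as listed, none of the 100 runs has a retry count of 1 (success without validation feedback); 87 runs have a count of 2 and 13 have a count of 3.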

@samchon
Contributor Author

samchon commented Mar 10, 2025

And I think the strengths of agentica and typia/nestia over other tools such as zod are "purity" and "mass production".

Zod works for hand-writing a schema for one or two structured outputs, but it cannot take an entire TypeScript class as input the way agentica does. In addition, zod has a serious problem: as the number of input functions grows and the schemas become complex, its heavy generic type operations can make the IDE unusable.

And in an environment where hundreds of APIs are mass-produced, such as a backend server, the most efficient way is to use the OpenAPI document as it is. Zod offers little help here.

In any case, I think the greatest advantage of Agentica is that an existing TypeScript class or backend server can be turned into an AI chatbot without much effort, while keeping its implementation as it is.

@samchon
Contributor Author

samchon commented Mar 10, 2025

And although I used the wording "MCP replacement" as part of a social experiment to get Agentica noticed, as you know, it is not the right expression.

MCP still cannot use the existing backend ecosystem as it is and needs additional development, and its own ecosystem is still thin and not very productive; nevertheless, it is famous. Agentica can use the existing ecosystem as it is; nevertheless, it is not famous.

Therefore, in the long term, we should support MCP. We should also add MCP support to nestia, so that MCP servers can be built directly within the existing NestJS ecosystem, and integrate with them.

@samchon samchon changed the title Questions from dev.to article Questions from dev.to article (MCP alternative) Mar 10, 2025
@kakasoo
Contributor

kakasoo commented Mar 10, 2025

If Swagger had been properly written, many LLM providers would not have needed to define their own protocols or schemas for function calling. Instead, they would have focused on making it possible to simply plug in a Swagger document.
(In a way, this might have been fortunate for us, as most companies do not invest much energy into documentation, which has created an opportunity for us.)

However, that was not the case, and these companies ended up establishing their own protocols.
I believe their thought process went something like this:

  1. Why don't developers write documentation properly?
  2. Would they do it better if easier tools, such as annotations, decorators, or documentation libraries, were provided?
  3. Instead of relying on that approach, wouldn’t it be more stable to enforce the definition of function calling schemas at the code level?

I believe the MCP protocol was introduced as an alternative solution to this problem.
However, I don’t think this method can be a perfect solution.

As you know, the number of servers following the MCP protocol is increasing, but I still don’t find this approach to be particularly elegant.
The reason is that if this method were truly effective, we would have seen an improvement in the accuracy of Swagger documentation long ago.

The reason developers don't properly document their code is that documentation is a tedious and undesirable task.
I’m not sure about foreign companies, but posts on Reddit frequently mention that documentation is often poorly maintained.
In South Korea, it’s common for newly hired interns to be assigned all the documentation work—it's widely known as a task that most developers avoid.

I think the reason the MCP protocol is gaining traction is not necessarily because it makes documentation easier
(even though it is written at the code level)
but rather because massive companies worth billions of dollars are leading the market.

In the end, this hasn’t made documentation more universal;
rather, it has turned documentation into something that only high-paid developers are expected to do.

However, as you know, we have utilized Typia and ensured stability through the TypeScript compiler.

I believe our advantage is not just about type safety but also about productivity and cost-efficiency.
Instead of requiring developers who specifically understand and follow the MCP protocol,
we can develop agents using just frontend developers and Node.js backend developers.

This is a clear advantage in terms of accessibility and ease of development.

@kakasoo
Contributor

kakasoo commented Mar 10, 2025

Additionally, I feel that the solution we have implemented is closer to a more fundamental programming approach.
Other agents tend to over-rely on prompts, which gradually makes development more difficult.

Many of them claim to be building general-purpose agents (as do we),
but in reality, I believe that creating a truly universal agent based solely on prompts is almost impossible.

We cannot define an agent as something that is simply good at everything.
If we want to create an IT expert, then they would need to be skilled in:

  • Design
  • Development
  • Planning
  • Sales
  • ...and everything else

This would require an overwhelming amount of prompts.
Such an approach can never be truly general-purpose, nor would it be economically viable in terms of token consumption.

Therefore, I believe that a universal approach should follow principles similar to those in TDD (Test-Driven Development):

  • Proper design is essential.
  • The structure should be measurable and well-defined.
  • It should not rely on uncertainty like prompts do.

Even if the value of AI lies in its ability to handle uncertainty,
this does not change the fact that structured design is a better foundation.

@kakasoo
Contributor

kakasoo commented Mar 10, 2025

Therefore, our approach is that even when we write prompts, we write them within functions (tools).
If we want to define an IT expert, we need to define the tools that the IT developer can work with,
and the only prompts we use are the appropriate descriptions for those tools.

@ryoppippi
Member

ryoppippi commented Mar 10, 2025

You have written a lot of things, but they are all very long.
I have been given the task of writing an academic paper, but your explanations are too long!
What is the CLAIM of Agentica?

On dev.to and in this thread, you have mentioned the following characteristics:

  • Better than MCP.
  • Better accuracy of parameter extraction.
  • Can be done cheaply.

But if you can't say in one sentence what you are essentially trying to solve as a claim, then you can't claim novelty.
What is your idea of a CLAIM in AGENTICA?

@kakasoo @samchon

@samchon
Contributor Author

samchon commented Mar 10, 2025

How about productivity or efficiency?

@ryoppippi
Member

ryoppippi commented Mar 10, 2025

Supplementary information.
What is CLAIM?

  • DNA has a spiral structure
  • The earth revolves around the sun.
  • If you analyse the type, you can generate code for validation (typia)

A claim can be expressed in a few words, and it can be settled with a yes or a no. It is not a description of a research area. For example,

  • I want to do research on deep learning
  • I want to build applications using LLMs

etc. are not CLAIMs.

@kakasoo
Contributor

kakasoo commented Mar 10, 2025

There was a lot I wanted to say, so the explanation was long. I think it's a validation of parameters.

@ryoppippi
Member

CLAIM is a hypothesis-testable one-line statement.

@sunrabbit123
Collaborator

> Overall, I’m trying to understand the specific comparative advantages of Agentica. Is it the improved developer experience (DX), greater versatility, or cost efficiency? Given that options like MCP or vercel-ai + zod remain robust alternatives, what exactly sets Agentica apart?

I believe that this difference explains why developers use compiled languages.

Why did people transition from JavaScript to TypeScript?
You could explicitly define types using JsDoc, so why switch to TypeScript?

Why do developers of Python, a famous interpreted language, seek linters and type checking?

The answer to all these questions starts with the idea that developers might just be monkeys.
Anything written by humans—or even by LLMs—can be incorrect.
That’s why we use programs to enforce constraints and provide feedback.

For this reason, I believe typia is better than zod and provides a better developer experience (DX) in terms of typing.

With zod, even when an object already exists, you need to create a new schema to validate it.
In contrast, typia reuses what already exists.

Developers experienced in modular programming understand the importance of reuse.
Of course, once a zod schema is created, it can also be reused.
However, the kind of reuse we’re discussing here is the reuse of TypeScript code itself.

That’s why I believe typia is the superior choice.


My claim is:
"Agentica is a library that integrates with existing servers or TypeScript apps, allowing you to create agents in just a few lines of code."

Because of this, it needs to be simple and straightforward.
Since it integrates with existing servers, it should work seamlessly with Swagger.
And because it connects effortlessly with TypeScript applications, it ensures smooth integration.

To enable creation in just a few lines, it must be declarative.

I don’t know how others may view this, but this is my perspective, and it’s the mindset I bring to development.

@ryoppippi
Member

ryoppippi commented Mar 10, 2025

Thanks @sunrabbit123 . I'll reply

> I believe that this difference explains why developers use compiled languages.
>
> Why did people transition from JavaScript to TypeScript?
> You could explicitly define types using JsDoc, so why switch to TypeScript?
>
> Why do developers of Python, a famous interpreted language, seek linters and type checking?
>
> The answer to all these questions starts with the idea that developers might just be monkeys.
> Anything written by humans, or even by LLMs, can be incorrect.
> That’s why we use programs to enforce constraints and provide feedback.

Agree, that's why we need type-safe languages.

> For this reason, I believe typia is better than zod and provides a better developer experience (DX) in terms of typing.
>
> With zod, even when an object already exists, you need to create a new schema to validate it.
> In contrast, typia reuses what already exists.

The claim ‘typia is better than zod’ is not derived from the claim ‘Typing can ensure the safety and quality of the code’. There is a leap in the argument.

Both are validation libraries that ensure type safety.

I understand that typia's DX is better than zod's, but we need to separate that discussion.

> "Agentica is a library that integrates with existing servers or TypeScript apps, allowing you to create agents in just a few lines of code."

This does not seem like a bad claim, but I consider it to have no substance.

This statement describes the characteristics of AGENTICA and does not represent the core idea. I would like you to read the definition of CLAIM again.

I also have some thoughts on the statement itself, which I will write down.

> that integrates with existing servers or TypeScript apps

A description for function calling still has to be written, so the effort is not entirely zero.

> in just a few lines of code

This is exactly what can be achieved with zod + the Vercel AI SDK, so this is a weak feature. Function call definitions are rarely used in multiple places, so there is little difference between writing a type definition once and writing a zod definition once.

It sounds harsh, but if you don't have a well-defined CLAIM, your thesis, product and presentation will be off-axis.

@ryoppippi
Member

I have a vague idea of the answer in my own way. Nevertheless, I would like to know how you have understood and worked with Agentica over the past few months.
I believe this is not only for my own understanding, but also a necessary process for keeping the research and development of Agentica and AgentOS from becoming blurred!

@kakasoo
Contributor

kakasoo commented Mar 10, 2025

I guess I wasn't used to "claiming". I'm constantly thinking about it, so please understand if the answer is slow.

A compiler-based Agent automates API validation for greater consistency, productivity, and efficiency, making it a superior choice

How about this?

@ryoppippi
Member

The statement does capture a value proposition, but it’s more of a marketing assertion than a falsifiable claim. It says that a compiler-based Agent "automates API validation for greater consistency, productivity, and efficiency, making it a superior choice," but it doesn’t specify measurable criteria to test those benefits. In an academic context, a claim should be a concise, testable hypothesis—something you can empirically verify or refute.

If you want to mention performance or DX, instead of saying it's "a superior choice," you might aim for a claim like:

"A compiler-based Agent reduces API validation errors by X% compared to manual validation methods, leading to measurable improvements in development consistency and efficiency."

This version provides a clear benchmark and outcome that can be tested. The original statement is a good starting point for discussion, but it needs refinement to meet the standards of a falsifiable academic claim.

@ryoppippi
Member

ryoppippi commented Mar 10, 2025

My answer for the question "what is the claim of Agentica" is

Well-documented type declarations form a schema for function calling.

This statement captures the core idea behind Agentica. All the various features we've discussed—improved DX, lower costs, enhanced parameter extraction accuracy—are interesting implementation details and methods. However, these are just means to an end, not the fundamental claim. In essence, the power of our approach lies not in the additional functionalities, but in the idea that proper type declarations can automatically generate function calling schemas, simplifying and unifying the process. This core insight distinguishes our work from simply integrating with existing tools, and it is what should be the focus of our claim.
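
As a rough illustration of that claim, here is a documented TypeScript type next to the kind of function-calling schema it implies. The schema below is written by hand purely for illustration; in the approach discussed here a compiler-based tool such as typia would derive it from the type, and its real output format may differ. `ICreateArticle`, `createArticleSchema`, and all field names are hypothetical.

```typescript
/**
 * Arguments for creating an article.
 */
interface ICreateArticle {
  /** Title shown in the article list. */
  title: string;
  /** Main content, written in markdown. */
  body: string;
  /** Optional tags used for search. */
  tags?: string[];
}

// Minimal JSON-schema shape, just enough for this example.
interface JsonSchema {
  type: "object" | "string" | "array";
  description?: string;
  properties?: Record<string, JsonSchema>;
  items?: JsonSchema;
  required?: string[];
}

// Hand-written counterpart of the type above: names, types, optionality and
// doc comments carry over one-to-one, which is what makes derivation possible.
const createArticleSchema: { name: string; description: string; parameters: JsonSchema } = {
  name: "createArticle",
  description: "Arguments for creating an article.",
  parameters: {
    type: "object",
    properties: {
      title: { type: "string", description: "Title shown in the article list." },
      body: { type: "string", description: "Main content, written in markdown." },
      tags: {
        type: "array",
        items: { type: "string" },
        description: "Optional tags used for search.",
      },
    },
    required: ["title", "body"],
  },
};

// A value of the documented type ...
const example: ICreateArticle = { title: "Hello", body: "# Hello" };

// ... satisfies the `required` list of the schema written from it.
const missing = (createArticleSchema.parameters.required ?? []).filter(
  (key) => !(key in example),
);
console.log(missing.length === 0 ? "all required fields present" : missing);
```

Every piece of the schema (property names, types, optionality, descriptions) is already present in the type declaration and its comments, which is why the documentation quality of the type matters.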

@ryoppippi
Member

ryoppippi commented Mar 10, 2025

The benefits like lower costs or serving as an MCP alternative naturally follow from this central concept, but they do not define the core innovation.
We need to focus on that specific feature.

I am against blurring the axis and hype.
The context of an academic paper and the context of marketing are different again and should not be confused.

In addition, if you want to prove that DX can be more efficient, you should include specific figures, graphs or code in your text.

@sunrabbit123
Collaborator

> Both are validation libraries to ensure type-safe.

Yes, I agree and understand
but I think it's a matter of the degree of combination between the validation object and the actual implementation code.

> It sounds harsh, but if you don't have a well-defined CLAIM, your thesis, product and presentation will be off-axis.

I think so too, but I don't think I've thought deeply about the product. This situation seems fortunate.

@ryoppippi
Member

ryoppippi commented Mar 10, 2025

Until I was trained in academia, I tended to be conscious of processes, methods and implementation, and often worked without being clear about what the issue was or what the goal was. This led to an increase in rework, which was inefficient. I want people to think things through before they start work in order to gain sufficient competitiveness.

Now,

> Well-documented type declarations form a schema for function calling.

is a claim that works. You can build on it, and you can eliminate superfluous claims.

For example, I haven't received an answer yet,

  • MCP is a protocol.
  • Agentica is a library.

Such a comparison would not have emerged if the claim had been established.

@ryoppippi
Member

I respect the development skills of wrtnlab, but we need to talk about these things.

@ryoppippi
Member

Going a little further, before getting to Agentica itself, it would be good to deepen the discussion on why schema-driven development is better.
There are several possible reasons, for example:

  • No need to maintain both type and implementation, as in zod.
  • Schema generation is compatible with LLMs.
  • Definitions alone are easier to read than a mixture of implementation and definition.

etc.

This also needs to be established as an AgentOS-wide philosophy.

@ryoppippi
Member

Same as https://github.com/wrtnlabs/autoview/
It is easy to say 'How', but is hard to say 'WHAT' and 'WHY'

@kakasoo
Contributor

kakasoo commented Mar 10, 2025

Thank you so much. I think it's a relief to be able to talk about this.

> why schema driven development is better before Agentica.

Separation of Concerns: Isolating schema definitions from business logic allows teams to update implementations without affecting the interface, thereby enhancing maintainability.

@kakasoo
Contributor

kakasoo commented Mar 10, 2025

When integration is schema-based, it is more likely to be compatible across different programming languages.

@kakasoo
Contributor

kakasoo commented Mar 10, 2025

It is advantageous for maintenance because it enables integration into a new codebase through schema without modifying well-functioning existing code.

@luke0408
Contributor

Thank you for the in-depth discussion on Agentica's advantages—especially in terms of type safety and developer productivity—the differences from alternatives like MCP, and how to clearly articulate the product’s core value!

I've gained a lot of information from your insights!

Due to the extensive discussions, I personally organized a comparison focusing on Zod’s schema-based approach versus Agentica’s compiler-based type reuse paradigm to better understand these deep insights.
(I hope my summary will be of some help in the future.)

| Comparison Criteria | Zod (Schema-based) | Agentica (Compiler-based Type Reuse) |
| --- | --- | --- |
| Type Definition Approach | Developers must explicitly create schemas. Separate schema definitions allow fine-grained control over data structures and validation logic. | Automatically generates schemas and validation logic by reusing existing TypeScript types. Eliminates redundancy and maintains consistency with a single type definition. |
| Flexibility & Control | Well-suited for implementing complex, conditional validations and custom logic. Allows developers to manage validation logic with fine control. | Focuses on automation to maximize productivity and efficiency. May offer less granular control in highly customized scenarios. |
| Code Duplication & Maintainability | Separate definitions of types and schemas can lead to redundancy. Potential for inconsistencies between schemas and actual types. | Manages both schema and validation logic with a single type definition. Significantly reduces code duplication and inconsistencies, thereby enhancing maintainability. |
| Developer Experience (DX) | Requires an initial investment in explicit schema creation. Managing complex schemas can reduce developer experience. | Easily integrates with existing servers or TypeScript apps with just a few lines of code. Offers high productivity and efficiency, particularly in large-scale API automation environments. |
| Performance & IDE Support | Complex generic usage might slow down IDE performance and add overhead to the development environment. The more schemas defined, the greater the potential performance concerns. | Provides relatively stable IDE and performance support due to compiler-based validation. Leverages the existing type system to reduce overhead. |
| Ecosystem & Extensibility | Widely adopted with strong community support, comprehensive documentation, and robust extensibility. | Closely integrated with the TypeScript ecosystem, offering advantages in large-scale systems and automation. Currently in a relatively early stage regarding awareness and community support. |

Additionally, I learned a great deal from the process of clarifying the claim and defining the objective as @ryoppippi mentioned. Even if I may not be able to contribute directly, I wanted to leave a note expressing my gratitude for all this knowledge.

(I'm not very good at English, so I borrowed the power of GPT to write this. Please kindly overlook any slight mistranslations.)

@ryoppippi
Member

@luke0408 Thank you, Luke! This is a helpful matrix.
Actually, I do understand the difference between zod and typia (Agentica)!

By the way, I still don't get why we compare Agentica with MCP...

I feel that the article could be misleading, and it might be better to reconsider its publication.

@samchon
Contributor Author

samchon commented Mar 11, 2025

As I wrote above, I titled the article "MCP alternative" to ride the wave of MCP's fame. I have no intention of actually opposing MCP, and I do not want the library to go in that direction.

@kakasoo
Contributor

kakasoo commented Mar 11, 2025

@luke0408 Thank you for your interest in our library.

@luke0408
Contributor

Before discussing the comparison with MCP, I reflected on how Agentica's underlying technologies support the claim "Well-documented type declarations form a schema for function calling." Here’s my thought process:

Q. How can well-documented type declarations form a schema for function calling?
A. When type declarations are well-documented, they inherently define the role of a validator. This means that LLM function calling gains reliability through the validator represented by the type itself.

  • Claim: Detailed type declarations can serve as validation in themselves.
  • Implemented Technology: typia

Q. But Swagger is not an actual type; it's documentation. Can documentation serve as a type?
A. API documentation can be represented in various forms, varying with the information the author wishes to convey. However, the most crucial parts of an API—such as the request, URL, and response—reside in specific elements. The ability to create a common list of these essential pieces of information means that the API can be standardized. And once standardized, it can be expressed as a type.

  • Claim: API documentation can be standardized into a specific pattern.
  • Implemented Technology: Swagger/OpenAPI

In summary, Agentica can implement the claim "Well-documented type declarations form a schema for function calling" because API documentation, being standardizable into a specific pattern, can be written as types. These types then act as powerful validators.
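
The step from standardized API documentation to a function-calling schema can be sketched as below. This is a simplified illustration under stated assumptions, not the conversion that Agentica or `@samchon/openapi` actually performs: `MiniDocument`, `MiniOperation`, `FunctionSchema`, and `toFunctionSchemas` are hypothetical names, the sample endpoint is invented, and only a tiny subset of OpenAPI (JSON request bodies) is handled.

```typescript
// A deliberately tiny subset of an OpenAPI document (assumption: real
// documents carry much more, e.g. path parameters, security, responses).
interface MiniSchema {
  type: string;
  description?: string;
  properties?: Record<string, MiniSchema>;
  required?: string[];
}
interface MiniOperation {
  operationId: string;
  summary?: string;
  description?: string;
  requestBody?: { content: { "application/json": { schema: MiniSchema } } };
}
interface MiniDocument {
  paths: Record<string, Record<string, MiniOperation>>;
}

// One callable function per operation: a name, a description, and parameters.
interface FunctionSchema {
  name: string;
  description: string;
  method: string;
  path: string;
  parameters: MiniSchema;
}

// Walk every path/method pair and emit one function schema per operation.
function toFunctionSchemas(doc: MiniDocument): FunctionSchema[] {
  const output: FunctionSchema[] = [];
  for (const path of Object.keys(doc.paths)) {
    const methods = doc.paths[path];
    for (const method of Object.keys(methods)) {
      const op = methods[method];
      output.push({
        name: op.operationId,
        description: op.description || op.summary || "",
        method,
        path,
        parameters: op.requestBody
          ? op.requestBody.content["application/json"].schema
          : { type: "object", properties: {} },
      });
    }
  }
  return output;
}

// Hypothetical one-endpoint document.
const saleBody: MiniSchema = {
  type: "object",
  properties: { title: { type: "string", description: "Sale title" } },
  required: ["title"],
};
const doc: MiniDocument = {
  paths: {
    "/sales": {
      post: {
        operationId: "createSale",
        summary: "Create a sale",
        requestBody: { content: { "application/json": { schema: saleBody } } },
      },
    },
  },
};

const schemas = toFunctionSchemas(doc);
console.log(JSON.stringify(schemas, null, 2));
```

Because the request, URL, and response live in fixed places in the document, the conversion is a mechanical walk over the structure, which is the sense in which standardized documentation can be treated as a type.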


Returning to the latest discussions with @ryoppippi and @samchon, I believe that by examining the rationale behind the paradigm proposed by MCP and explaining the "difference in direction," we can clarify the justification for Agentica's approach. This isn’t about proving one side right or wrong—it’s about establishing a solid narrative for what Agentica aims to achieve.

@kakasoo kakasoo added question Further information is requested documentation Improvements or additions to documentation labels Mar 11, 2025
@sunrabbit123 sunrabbit123 pinned this issue Mar 11, 2025
@ryoppippi
Member

ryoppippi commented Mar 12, 2025

> This is the use case for creating products in a shopping mall, and it succeeds immediately after going through validation feedback once or twice.

samchon/openapi@master/examples/function-calling/prompts/microsoft-surface-pro-9.md
samchon/shopping-backend@master/src/api/structures/shoppings/sales

So is there any reproduction code? I can see the JSON schema, but I cannot see the actual code you ran with the gpt-4 models.
I need this for reproduction.
@samchon

I'm asking a lot of questions here, but I'm not trying to undermine your work. Your work deserves respect.

We need to examine what we want to solve with this project, how we have solved it, and whether it really works better than the existing solutions. Reproducibility is necessary for that. Therefore, we would like to know if there is any code that you have actually run. Currently the schema and text are there, but the prompts, for example, are not.

@ryoppippi
Member

I'm sure that a lot of time and effort has gone into this work. However, effectiveness needs to be well demonstrated.

@sunrabbit123
Collaborator

You can reproduce it in Agentica/packages/chat:
https://github.com/wrtnlabs/agentica/tree/main/packages/chat

Ref: #83 (comment)

However, if you need reproduction code at the Typia level, you must write the code using Typia and the Shopping SDK.

@danpe

danpe commented Mar 29, 2025

First of all, it was a pleasure reading your whole discussion. @samchon amazing work, @ryoppippi amazing points.
After playing with Agentica over the past week, the claim that I wish Agentica would fulfill is:
Agentica transforms any OpenAPI specification into a smart, task-executing agent — no manual coding or infrastructure needed.

Unfortunately, this claim is not there yet. I was able to run and play with the existing connectors developed by your team at @wrtnlabs, but I couldn't get a good experience when plugging in external OpenAPI docs; it usually fails around authorization.

So it is still not a plug-and-play experience, but I feel the project is not too far from that, and if it works, I think that's huge news for the world.

Would love to be kept in the loop or maybe even join forces here ✊

@ryoppippi
Member

Hi! @danpe ! Thank you for your comment! This would cheer the contributors up!!!

If you have any trouble with some OpenAPI, could you provide a reproduction repository? It helps to solve the bug!!!

@samchon
Contributor Author

samchon commented Mar 29, 2025

Sorry for the bad experience, and thanks for the early adoption, @danpe.

If it failed at authorization, that is because @agentica needs pre-authorization for OpenAPI function calling.
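
One way to read "pre-authorization" is that the credentials are attached to the HTTP connection before the agent makes any function call. The sketch below illustrates only that idea. It uses a local `ConnectionLike` type modeled on `IHttpConnection` from `@samchon/openapi`, on the assumption that the real type accepts a `host` and optional `headers`; `obtainToken`, `createAuthorizedConnection`, the fixed token, and the `Bearer` scheme are hypothetical, so check the target API's own authentication rules.

```typescript
// Local stand-in modeled on `IHttpConnection` from `@samchon/openapi`
// (assumption: `host` plus optional `headers`; check the real type).
interface ConnectionLike {
  host: string;
  headers?: Record<string, string>;
}

// Hypothetical login step. A real implementation would call the target
// API's own login or token endpoint; here it only returns a fixed value.
function obtainToken(): string {
  return "example-token";
}

// Build the connection *before* the agent is created, so every function
// call made through it already carries the credentials.
function createAuthorizedConnection(host: string): ConnectionLike {
  const token = obtainToken();
  return {
    host,
    headers: { Authorization: `Bearer ${token}` },
  };
}

const connection = createAuthorizedConnection("https://api.example.com");
console.log(connection.headers);
```

The resulting object would then be passed as the `connection` of an HTTP controller, in the same place where a connection with only `host` is used.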

@danpe

danpe commented Mar 29, 2025

@samchon That makes much more sense now. Looking at the example, though, I think it is missing OpenApi.convert(swaggerDoc) for the fetched document?

And if I understand correctly, implementing a new "Swagger Agent" would just require importing that agent and writing a custom function for the authorization?

Here's my current temporary code, which I was hoping would support any given Swagger document (but I see that custom auth needs to be written for each):

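      import fs from "fs";
      import readline from "readline/promises";
      import axios from "axios";
      import OpenAI from "openai";
      import { Agentica } from "@agentica/core";
      import { HttpLlm, IHttpConnection, OpenApi } from "@samchon/openapi";

      // Stand-in for the `prompt` helper, which the original snippet used but did
      // not show: ask a question on the terminal and resolve with the answer.
      async function prompt(question: string): Promise<string> {
        const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
        try {
          return await rl.question(question);
        } finally {
          rl.close();
        }
      }

      // The enclosing function was cut off in the original; this name is a placeholder.
      async function createOpenApiAgent() {
        console.log("Initializing OpenAPI agent...");

        // Get Swagger/OpenAPI file path or URL
        const swaggerPath = await prompt("Enter Swagger/OpenAPI JSON file path or URL: ");
        console.log(`Loading Swagger document from: ${swaggerPath}`);

        let swaggerDoc;
        if (swaggerPath.startsWith("http")) {
          // Fetch from URL
          const response = await axios.get(swaggerPath);
          swaggerDoc = response.data;
        } else {
          // Load from file
          swaggerDoc = JSON.parse(fs.readFileSync(swaggerPath, "utf8"));
        }

        const apiBaseUrl = await prompt("Enter API base URL (e.g., https://api.example.com): ");

        // Create a simple HTTP connection using the IHttpConnection interface.
        // No credentials are attached, so APIs that require authorization will reject calls.
        const connection: IHttpConnection = {
          host: apiBaseUrl,
        };

        // Assumption: the OpenAI client was created like this outside the original snippet.
        const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

        // NOTE: The Agentica API may have been updated since the example.
        // The example shows controllers being passed directly, but the current API
        // might require additional wrapper properties.
        const agent = new Agentica({
          model: "chatgpt",
          vendor: {
            api: openai,
            model: "gpt-4o-mini",
          },
          controllers: [
            {
              protocol: "http",
              name: "OpenAPI Connector",
              application: HttpLlm.application({
                model: "chatgpt",
                // Upgrade the loaded Swagger/OpenAPI document to the emended OpenAPI format
                document: OpenApi.convert(swaggerDoc),
              }),
              connection,
            },
          ],
        });
        console.log("Agentica agent created with OpenAPI service and API helper.");
        return agent;
      }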

@samchon
Contributor Author

samchon commented Mar 29, 2025

@danpe Correct, and you can try out the Swagger chatbot in advance from here.

By the way, our framework is not v1 yet, and the playground has not yet been optimized with the selector plugin feature (not completed yet). If your Swagger file is huge, LLM token consumption can be larger than expected, so please use the link below only for demonstration.

https://wrtnlabs.io/agentica/playground/uploader/

@samchon
Contributor Author

samchon commented Mar 29, 2025

@wrtnlabs/laboratory we need to describe the Authorization part clearly in the document


6 participants