Update Inference specification for Hugging Face's completion and chat completion tasks #4383

Open: wants to merge 1 commit into main

Jan-Kazlouski-elastic
This PR updates the specification to reflect the changes made in elastic/elasticsearch#127254:

Extended Task Support:

  • Added completion and chat_completion tasks to the list of supported Hugging Face tasks.

Model Requirements for Chat Tasks:

  • Updated documentation to describe specific requirements for using chat_completion and completion tasks, including model compatibility with the OpenAI API format and usage guidelines for serverless vs. dedicated endpoints.
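
As a rough illustration of what "OpenAI API format" means for a chat request, the sketch below builds a message list in that shape. The helper name `build_chat_request` and the set of allowed roles are assumptions made for this example; the fields the endpoint actually accepts should be checked against the specification changed in this PR.

```python
import json

# Roles commonly used in the OpenAI chat format.
# Treating this set as exhaustive is an assumption of this sketch.
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def build_chat_request(messages):
    """Return a request body in the OpenAI-style chat format.

    `messages` is a list of (role, content) pairs. Hypothetical helper,
    not part of any Elasticsearch or Hugging Face client.
    """
    body = {"messages": []}
    for role, content in messages:
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        body["messages"].append({"role": role, "content": content})
    return body


request_body = build_chat_request([
    ("system", "You are a concise assistant."),
    ("user", "What is Elasticsearch?"),
])
print(json.dumps(request_body, indent=2))
```

A model that cannot consume a body of this shape would likely not meet the compatibility requirement described above.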

New Configuration Parameters:

  • Introduced optional model_id field in Hugging Face service settings, applicable to completion and chat_completion tasks.
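
A minimal sketch of how the optional `model_id` might appear in the service settings when creating an endpoint. The helper `hugging_face_endpoint`, the placeholder URL, key, and model name are all hypothetical, and rejecting `model_id` for other task types is an illustrative choice rather than confirmed server behavior.

```python
# Task types this PR names as accepting model_id.
CHAT_TASKS = {"completion", "chat_completion"}


def hugging_face_endpoint(task_type, inference_id, url, api_key, model_id=None):
    """Build (path, body) for a create-endpoint request. Hypothetical helper."""
    service_settings = {"api_key": api_key, "url": url}
    if model_id is not None:
        # Assumption: treat model_id on other task types as a caller error.
        if task_type not in CHAT_TASKS:
            raise ValueError(
                "model_id applies only to completion and chat_completion"
            )
        service_settings["model_id"] = model_id
    path = f"_inference/{task_type}/{inference_id}"
    body = {"service": "hugging_face", "service_settings": service_settings}
    return path, body


path, body = hugging_face_endpoint(
    task_type="chat_completion",
    inference_id="my-hf-chat",                # placeholder identifier
    url="https://example.invalid/endpoint",   # placeholder endpoint URL
    api_key="<hugging-face-token>",           # placeholder secret
    model_id="my-model",                      # placeholder model name
)
print(path)
print(body)
```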

Rate Limit Clarifications:

  • Updated rate_limit documentation to clarify default behavior and guidance for tuning based on deployment specifics.
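
To make the tuning guidance concrete, here is a small sketch that adds an explicit `rate_limit` to an endpoint definition. The helper `with_rate_limit` is hypothetical, and the value 600 is an arbitrary example, not a recommendation; leaving `rate_limit` out is assumed to keep the service default.

```python
import copy


def with_rate_limit(body, requests_per_minute):
    """Return a copy of `body` with an explicit rate limit. Hypothetical helper."""
    if not isinstance(requests_per_minute, int) or requests_per_minute <= 0:
        raise ValueError("requests_per_minute must be a positive integer")
    updated = copy.deepcopy(body)  # leave the caller's dict untouched
    settings = updated.setdefault("service_settings", {})
    settings["rate_limit"] = {"requests_per_minute": requests_per_minute}
    return updated


base = {
    "service": "hugging_face",
    "service_settings": {
        "api_key": "<hugging-face-token>",          # placeholder secret
        "url": "https://example.invalid/endpoint",  # placeholder endpoint URL
    },
}
tuned = with_rate_limit(base, 600)
print(tuned["service_settings"]["rate_limit"])
```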

Documentation Fixes:

  • Corrected typos in existing text_embedding request examples.

Additional actions

  • Signed the CLA

  • Executed make contrib

Contributor

Below are the validation results for the APIs you have changed.

| API | Status | Request | Response |
| --- | --- | --- | --- |
| inference.chat_completion_unified | | Missing test | Missing test |
| inference.completion | | Missing test | Missing test |
| inference.delete | | Missing test | Missing test |
| inference.get | 🟢 | 1/1 | 1/1 |
| inference.inference | | Missing test | Missing test |
| inference.put_alibabacloud | | Missing test | Missing test |
| inference.put_amazonbedrock | | Missing test | Missing test |
| inference.put_anthropic | | Missing test | Missing test |
| inference.put_azureaistudio | | Missing test | Missing test |
| inference.put_azureopenai | | Missing test | Missing test |
| inference.put_cohere | | Missing test | Missing test |
| inference.put_elasticsearch | | Missing test | Missing test |
| inference.put_elser | | Missing test | Missing test |
| inference.put_googleaistudio | | Missing test | Missing test |
| inference.put_googlevertexai | | Missing test | Missing test |
| inference.put_hugging_face | | Missing test | Missing test |
| inference.put_jinaai | | Missing test | Missing test |
| inference.put_mistral | | Missing test | Missing test |
| inference.put_openai | | Missing test | Missing test |
| inference.put_voyageai | | Missing test | Missing test |
| inference.put_watsonx | | Missing test | Missing test |
| inference.put | | Missing test | Missing test |
| inference.rerank | | Missing test | Missing test |
| inference.sparse_embedding | | Missing test | Missing test |
| inference.stream_completion | | Missing test | Missing test |
| inference.text_embedding | | Missing test | Missing test |
| inference.update | | Missing test | Missing test |

You can validate these APIs yourself by using the make validate target.
