Hugging Face Inference Endpoint (Dedicated)
This document describes how to integrate Hugging Face models through Hugging Face's Inference Endpoint (Dedicated) service. For information about integrating with the Inference API (Serverless), please refer to the corresponding documentation.
Prerequisites
To use models provided by Hugging Face, you need a Hugging Face API key. You can get one by signing up at Hugging Face.
Required credentials:
- HUGGING_FACE_API_KEY: Your Hugging Face API key.
- HUGGING_INFERENCE_ENDPOINT_URL: The URL of your dedicated Hugging Face Inference Endpoint.
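As a minimal sketch, the two required credentials can be read from environment variables before configuring the integration. The helper function below (`load_hf_credentials` is a hypothetical name, not part of TaskingAI) simply validates that both values are present:

```python
import os

def load_hf_credentials():
    """Read the two required credentials from the environment.

    Hypothetical helper for illustration only: fails fast with a
    clear error if either variable is missing or empty.
    """
    api_key = os.environ.get("HUGGING_FACE_API_KEY")
    endpoint_url = os.environ.get("HUGGING_INFERENCE_ENDPOINT_URL")
    missing = [
        name
        for name, value in [
            ("HUGGING_FACE_API_KEY", api_key),
            ("HUGGING_INFERENCE_ENDPOINT_URL", endpoint_url),
        ]
        if not value
    ]
    if missing:
        raise RuntimeError(f"Missing credentials: {', '.join(missing)}")
    return api_key, endpoint_url
```

Failing early on missing credentials gives a clearer error message than a later authentication failure against the endpoint.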
Supported Models:
NOTE: Only models configured for the 'Sentence Embeddings' task on your Hugging Face Inference Endpoint (Dedicated) backend are supported by this integration.
Wildcard
- Model schema id: hugging_face_inference_endpoint/wildcard
Since Hugging Face is a platform that hosts thousands of models, TaskingAI provides a wildcard model that can integrate any eligible model on Hugging Face. Currently, the eligible models are sentence-transformers models available through the Hugging Face Inference Endpoint (Dedicated) service, with the model's task set to Sentence Embeddings on the Inference Endpoint backend.

To integrate a specific model, pass its model id to the Provider Model Id parameter, for example sentence-transformers/paraphrase-MiniLM-L6-v2.
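Under the hood, a dedicated endpoint configured for sentence embeddings accepts authenticated POST requests with a JSON body of the form `{"inputs": [...]}`. The sketch below shows roughly what such a request looks like; it is an illustration of the endpoint convention, not TaskingAI's internal implementation, and the `build_embedding_request` helper name is an assumption:

```python
import json
import urllib.request

def build_embedding_request(endpoint_url, api_key, texts):
    """Build an authenticated POST request for a dedicated endpoint
    running the Sentence Embeddings task.

    The payload shape ({"inputs": [...]}) follows the common
    Inference Endpoint convention; the response is typically one
    embedding vector per input text.
    """
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending the request requires a live dedicated endpoint, e.g.:
# req = build_embedding_request(endpoint_url, api_key, ["hello world"])
# with urllib.request.urlopen(req) as resp:
#     embeddings = json.loads(resp.read())
```

Note that requests are authenticated with the same HUGGING_FACE_API_KEY listed under the required credentials, sent as a Bearer token.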