Using TaskingAI-Inference
Choose the right endpoint
TaskingAI-Inference currently supports two model types: Chat Completion and Text Embedding. There is a separate endpoint for each model type.
Chat Completion
Chat completion API takes a prompt as input and generates a response based on the prompt. The response can be a single sentence or a paragraph. In TaskingAI-Inference, more complicated chat completion features are also supported including function calls and stateful conversation. For more information, please refer to Chat Completion.
Chat completion endpoint: /v1/chat_completion
Text Embedding
Text embedding API takes a single string or a string list as input, and the service will generate a list of vector embeddings for each input text. For more information, please refer to Text Embedding.
Embedding endpoint: /v1/text_embedding