Jina

Jina embedding models are designed for a wide range of applications and cover language, code, and multimodal representation.

Prerequisites

To use Jina models, you need the following credentials. You can obtain them by signing up on Jina's website.

Required credentials:

  • JINA_API_KEY: Your Jina API key for authentication.
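
As a minimal sketch, the key can be read from the environment and checked before any request is made. The helper names `load_jina_api_key` and `auth_headers` are hypothetical, and the Bearer-style header is an assumption; check how your client or integration expects the key to be passed.

```python
import os


def load_jina_api_key(env=os.environ):
    """Return the Jina API key from the given mapping, or raise if it is unset."""
    key = env.get("JINA_API_KEY", "").strip()
    if not key:
        raise RuntimeError("JINA_API_KEY is not set")
    return key


def auth_headers(api_key):
    """Build HTTP headers carrying the key (Bearer scheme is an assumption)."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing early on a missing key tends to give a clearer error than an authentication failure returned later by the service.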

Supported Models

Jina Embeddings v2 Small EN

  • Model schema id: jina/jina-embeddings-v2-small-en

The jina-embeddings-v2-small-en model is an English, BERT-based embedding model. It was pre-trained on the C4 dataset and then further trained on over 400 million sentence pairs to improve performance on long documents across a variety of domains.

  • Embedding size: 512
  • Input token limit: 8192
  • Max batch size: 500
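
Because a single request accepts at most 500 inputs, larger collections need to be split before embedding. The following is a minimal sketch of that splitting step only; the helper name `batched` is hypothetical, and sending each batch to the service is left to whatever client you use.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 500  # max batch size listed for this model


def batched(texts: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])


# Example: 1201 documents are split into batches of 500, 500, and 201.
documents = [f"document {i}" for i in range(1201)]
batch_sizes = [len(batch) for batch in batched(documents)]
```

Note that this only enforces the batch size. Each input must still fit within the 8192-token limit, which can only be checked with the model's tokenizer.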

Jina Embeddings v2-base-code

  • Model schema id: jina/jina-embeddings-v2-base-code

The jina-embeddings-v2-base-code model is a multilingual embedding model covering English and 30 programming languages. It has 137 million parameters and was trained on over 150 million coding Q&A and source code pairs, which makes it well suited to technical Q&A and code search on sequences of up to 8192 tokens.

  • Embedding size: 768
  • Input token limit: 8192
  • Max batch size: 500

Jina Embeddings v2-base Multilingual Series

All models in this series share the following properties:

  • Embedding size: 768
  • Input token limit: 8192
  • Max batch size: 500

Jina Embeddings v2-base de

  • Model schema id: jina/jina-embeddings-v2-base-de

Jina Embeddings v2-base en

  • Model schema id: jina/jina-embeddings-v2-base-en

Jina Embeddings v2-base zh

  • Model schema id: jina/jina-embeddings-v2-base-zh
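
The properties listed on this page can be collected into a small lookup table, for example to confirm that a returned vector has the expected length before storing it. This is an illustrative sketch: `ModelSpec`, `MODEL_SPECS`, and `check_embedding` are hypothetical names, and only the schema ids and sizes come from the lists above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelSpec:
    embedding_size: int
    input_token_limit: int = 8192  # same for every model on this page
    max_batch_size: int = 500      # same for every model on this page


MODEL_SPECS = {
    "jina/jina-embeddings-v2-small-en": ModelSpec(embedding_size=512),
    "jina/jina-embeddings-v2-base-code": ModelSpec(embedding_size=768),
    "jina/jina-embeddings-v2-base-de": ModelSpec(embedding_size=768),
    "jina/jina-embeddings-v2-base-en": ModelSpec(embedding_size=768),
    "jina/jina-embeddings-v2-base-zh": ModelSpec(embedding_size=768),
}


def check_embedding(model_id: str, vector: Sequence[float]) -> ModelSpec:
    """Raise if `vector` does not match the embedding size listed for `model_id`."""
    spec = MODEL_SPECS[model_id]  # KeyError for ids not listed on this page
    if len(vector) != spec.embedding_size:
        raise ValueError(
            f"{model_id}: expected {spec.embedding_size} dimensions, got {len(vector)}"
        )
    return spec
```

A check like this is useful when a vector index is created with a fixed dimension, since mixing 512-dimensional and 768-dimensional vectors in one index would otherwise fail later.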