
Manage Collections

A collection is the smallest unit in the TaskingAI retrieval system that has its own independent vector storage index. It aggregates and manages a set of data for retrieval purposes.

Characteristics:

  • Independent Indexing: Each collection is indexed separately, allowing for organized and efficient data retrieval.
  • Capacity Management: Collections have a defined capacity, indicating the maximum number of text chunks they can store.
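The two characteristics above can be illustrated with a minimal in-memory sketch. The `Collection` class, `add_chunk` method, and `CapacityExceededError` are hypothetical names invented for this illustration; they are not part of TaskingAI, and how the platform actually reports a full collection is not covered here.

```python
from dataclasses import dataclass, field


class CapacityExceededError(Exception):
    """Raised when adding a chunk would exceed the collection's capacity."""


@dataclass
class Collection:
    # Hypothetical in-memory stand-in, not the TaskingAI data model.
    name: str
    capacity: int  # maximum number of text chunks the collection can store
    chunks: list[str] = field(default_factory=list)

    def add_chunk(self, text: str) -> None:
        # Refuse new chunks once the defined capacity is reached.
        if len(self.chunks) >= self.capacity:
            raise CapacityExceededError(
                f"{self.name!r} is full ({self.capacity} chunks)"
            )
        self.chunks.append(text)


# Each collection keeps its own chunks, so filling one does not affect another.
faq = Collection(name="faq", capacity=2)
manuals = Collection(name="manuals", capacity=1000)
faq.add_chunk("How do I reset my password?")
faq.add_chunk("Where can I find my invoices?")
```

In this sketch a third call to `faq.add_chunk(...)` would raise `CapacityExceededError`, while `manuals` remains empty and unaffected.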


Collection Types:

Two types of collections are currently available: Text Collection and QA Collection. A Text Collection is the most fundamental type and holds textual information, while a QA Collection is a specialized type optimized for question-answer pairs.

Create a Collection

To create a collection from the TaskingAI dashboard, follow these steps:

  1. Navigate to the Project page, then go to the Retrieval tab.
  2. Click the new collection button.
  3. Select the collection type you want to create.
  4. Enter the required information, then click the confirm button.

The capacity parameter is the maximum number of text chunks the collection can hold. The embedding_model parameter is the model used to embed text chunks into vectors and to retrieve chunks at query time.

NOTE: Give the collection a meaningful name and description. These may affect the performance of your retrieval system.

WARNING: Once a collection is created, you cannot change its embedding_model or capacity. Make sure these settings are correct before creating the collection.
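Because embedding_model and capacity are fixed at creation time, it can help to validate them up front and treat them as read-only in your own tooling. The sketch below models that with a frozen dataclass. `FixedCollectionSettings` and the model name are hypothetical placeholders, not TaskingAI identifiers.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FixedCollectionSettings:
    # Hypothetical container for the two settings that cannot change later.
    embedding_model: str
    capacity: int

    def __post_init__(self) -> None:
        # Validate before creation, since there is no way to fix this afterwards.
        if not self.embedding_model:
            raise ValueError("embedding_model must be set")
        if self.capacity <= 0:
            raise ValueError("capacity must be a positive number of chunks")


settings = FixedCollectionSettings(
    embedding_model="example-embedding-model",  # placeholder, not a real model ID
    capacity=1000,
)

try:
    settings.capacity = 2000  # mirrors the dashboard rule: no changes after creation
except FrozenInstanceError:
    print("capacity cannot be changed after creation")
```

If different settings are needed later, the equivalent in this model is to build a new `FixedCollectionSettings` object, just as a new collection would have to be created in the dashboard.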


Test a Collection

After the collection and its chunks are created, select the Query Testing option under actions. You can then test the collection by querying it with a text input and adjusting the retrieval configs.

Supported retrieval configs include:

  1. Top K: The maximum number of chunks to be returned from the retrieval task.
  2. Max Tokens: The maximum number of tokens to be returned from the retrieval task.
  3. Relevance Score Threshold: The minimum similarity score a chunk must reach to be returned from the retrieval task. The expected value ranges from 0.0 (extremely low relevance) to 1.0 (perfect equivalence).
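To make the interaction between the three configs above concrete, here is a minimal sketch of how they could be applied to a list of scored chunks. `ScoredChunk` and `apply_retrieval_configs` are hypothetical names, and the order of checks (rank by score, then stop at the first limit that is hit) is an assumption; the source does not specify how TaskingAI combines these limits internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredChunk:
    text: str
    score: float      # similarity score, from 0.0 to 1.0
    num_tokens: int   # token count, assumed to be precomputed


def apply_retrieval_configs(
    chunks: list[ScoredChunk],
    top_k: int,
    max_tokens: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> list[ScoredChunk]:
    """Return the highest-scoring chunks that satisfy all configured limits."""
    ranked = sorted(chunks, key=lambda c: c.score, reverse=True)
    selected: list[ScoredChunk] = []
    used_tokens = 0
    for chunk in ranked:
        if len(selected) >= top_k:
            break  # Top K reached
        if score_threshold is not None and chunk.score < score_threshold:
            break  # ranked by score, so every remaining chunk is also too low
        if max_tokens is not None and used_tokens + chunk.num_tokens > max_tokens:
            break  # assumption: stop once the token budget would be exceeded
        selected.append(chunk)
        used_tokens += chunk.num_tokens
    return selected


candidates = [
    ScoredChunk("Invoices are emailed monthly.", 0.74, 6),
    ScoredChunk("Refunds are processed within 5 days.", 0.91, 8),
    ScoredChunk("Our office is closed on holidays.", 0.32, 7),
]
results = apply_retrieval_configs(
    candidates, top_k=3, max_tokens=20, score_threshold=0.5
)
```

With these sample values, `results` holds the two chunks scoring 0.91 and 0.74; the third is dropped because it falls below the 0.5 threshold. Lowering `max_tokens` to 10 would leave only the first chunk.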


Delete a Collection

  1. Navigate to the Project page, then go to the Retrieval tab.
  2. Select the collection that you want to delete, then click the delete button.
  3. Follow the instructions in the pop-up window to confirm the deletion.

Deletion permanently removes the specified collection from the project, along with all records and chunks associated with it.

WARNING: Be cautious when deleting a collection, especially when it is already associated with an assistant. Deletion may cause the assistant to encounter unexpected errors when generating response messages.
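One way to act on this warning is to check which assistants still reference a collection before deleting it. The sketch below assumes you keep your own record of assistants and the collection IDs they use; the dictionary layout, the `collection_ids` key, and the IDs shown are hypothetical and do not reflect the TaskingAI data format.

```python
def assistants_using_collection(assistants: list[dict], collection_id: str) -> list[str]:
    """Return the names of assistants that reference the given collection."""
    return [
        a["name"]
        for a in assistants
        if collection_id in a.get("collection_ids", [])
    ]


# Hypothetical project records; replace with however you track your assistants.
assistants = [
    {"name": "support-bot", "collection_ids": ["col_faq", "col_manuals"]},
    {"name": "sales-bot", "collection_ids": ["col_pricing"]},
]

blockers = assistants_using_collection(assistants, "col_faq")
if blockers:
    print("Detach the collection from these assistants first:", ", ".join(blockers))
```

An empty result suggests the collection is safe to delete as far as your own records show.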