Ask a question (RAG)

curl --request POST \
  --url https://aigmented.io/api/v1/collections/{id}/ask \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "question": "What does the agreement say about liability?",
  "model": "gpt-4o-mini",
  "mode": "fast",
  "stream": false,
  "top_k": 10,
  "current_only": true
}
'

{
  "answer": "<string>",
  "sources": [
    {
      "document_id": "<string>",
      "file_name": "<string>",
      "page": 123,
      "chunk_index": 123,
      "score": 123,
      "content_preview": "<string>"
    }
  ],
  "model": "<string>",
  "tokens_used": {
    "llm_prompt": 123,
    "llm_completion": 123,
    "embedding": 123,
    "model_id": "<string>"
  }
}

Authorizations

Authorization

string

header

required

Pass your API key as a Bearer token. Example: Authorization: Bearer sk-xxxxxxxxxxxx
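As a rough illustration of the header described above, the following Python sketch builds (but does not send) a request using only the standard library. The helper name `build_ask_request` is hypothetical and not part of any official client; the base URL is copied from the curl example.

```python
import json
import urllib.request

BASE_URL = "https://aigmented.io/api/v1"  # taken from the curl example


def build_ask_request(collection_id: int, api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST request for the ask endpoint without sending it.

    `build_ask_request` is an illustrative helper, not an official API.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/collections/{collection_id}/ask",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The API key travels as a Bearer token in the Authorization header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the response.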

Path Parameters

id

integer

required

Collection ID

Body

application/json

question

string

required

The question to answer

Example:

"What does the document say about data retention?"

model

string

default:gpt-4o-mini

LLM model identifier to use for answering

Example:

"gpt-4o"

mode

enum<string>

default:fast

Answer mode. fast uses fewer retrieved chunks for a quicker response; full retrieves more context for a thorough answer.

Available options: fast, full

stream

boolean

default:false

If true, the response is streamed as Server-Sent Events (SSE). Each event has a type field: delta (text chunk), done (final metadata), or error.
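The sketch below shows one way a client might consume such a stream in Python. It assumes each event's `data:` payload is a JSON object carrying the `type` field, and that `delta` events hold their text under a `content` key; neither detail is confirmed by this page, so adjust the key names to the actual payload. The function names are illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per SSE event.

    Assumption: every event's `data:` payload is a JSON object.
    Lines that are not `data:` fields (comments, `event:`, `id:`) are ignored.
    """
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):  # SSE strips one leading space
                value = value[1:]
            data_parts.append(value)
        elif line == "" and data_parts:
            # A blank line terminates the current event.
            yield json.loads("\n".join(data_parts))
            data_parts = []


def collect_answer(events: Iterable[dict]) -> tuple[str, dict]:
    """Join delta text and return it with the final `done` event."""
    text: list[str] = []
    meta: dict = {}
    for event in events:
        kind = event.get("type")
        if kind == "delta":
            text.append(event.get("content", ""))  # "content" key is an assumption
        elif kind == "done":
            meta = event
            break
        elif kind == "error":
            raise RuntimeError(f"stream error: {event}")
    return "".join(text), meta
```

`iter_sse_events` accepts any iterable of text lines, so it can be fed decoded lines from an HTTP response or a list of strings in a test.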

top_k

integer

default:10

Number of knowledge chunks to retrieve before generating the answer

Required range: 1 <= x <= 50

current_only

boolean

default:false

Restrict retrieval to the most current document versions only

chat_history

object[]

Optional prior conversation turns for multi-turn context

Example:

[
  {
    "role": "user",
    "content": "Summarise the document."
  },
  {
    "role": "assistant",
    "content": "The document covers..."
  }
]
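A minimal Python sketch of carrying context between turns, assuming the client keeps the history itself and resends it with each follow-up question. Both helper names are hypothetical, and the `user` / `assistant` roles are the only ones shown in the example above.

```python
def record_exchange(history: list[dict], question: str, answer: str) -> list[dict]:
    """Return a new history with one user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
    ]


def follow_up_body(history: list[dict], question: str) -> dict:
    """Build a request body for a follow-up question.

    chat_history is optional, so it is omitted when there are no prior turns.
    """
    body: dict = {"question": question}
    if history:
        body["chat_history"] = history
    return body
```

Returning a new list rather than mutating the old one keeps earlier request bodies unchanged if they are logged or retried.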

filters

object

Optional metadata filters to scope retrieval
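The body fields above could be assembled and checked client-side before sending, as in this Python sketch. The function name `build_ask_body` is illustrative; the defaults and the `top_k` range mirror the values documented on this page, and the shape of `filters` is not specified here, so it is passed through unchanged.

```python
from typing import Optional

VALID_MODES = ("fast", "full")


def build_ask_body(
    question: str,
    *,
    model: str = "gpt-4o-mini",
    mode: str = "fast",
    stream: bool = False,
    top_k: int = 10,
    current_only: bool = False,
    chat_history: Optional[list] = None,
    filters: Optional[dict] = None,
) -> dict:
    """Validate the documented constraints and return a JSON-ready body."""
    if not question:
        raise ValueError("question is required")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    if not 1 <= top_k <= 50:
        raise ValueError("top_k must satisfy 1 <= top_k <= 50")
    body = {
        "question": question,
        "model": model,
        "mode": mode,
        "stream": stream,
        "top_k": top_k,
        "current_only": current_only,
    }
    # Optional fields are only included when the caller supplies them.
    if chat_history is not None:
        body["chat_history"] = chat_history
    if filters is not None:
        body["filters"] = filters
    return body
```

Validating locally is only a convenience; the server remains the authority on which values it accepts.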

Response

Answer generated successfully. When stream=false, returns a JSON body. When stream=true, returns an SSE stream (Content-Type: text/event-stream).

answer

string

The LLM-generated answer

sources

object[]

Knowledge chunks used to generate the answer

model

string

Model used to generate the answer

tokens_used

object

Token usage breakdown
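A small Python sketch of reading the non-streaming response, using the field names from the response example at the top of this page. The helper names are hypothetical, and treating the three token counts as additive is an assumption rather than documented behavior.

```python
def total_tokens(response: dict) -> int:
    """Sum the numeric counters in tokens_used (assumed to be additive)."""
    usage = response.get("tokens_used", {})
    return sum(usage.get(key, 0) for key in ("llm_prompt", "llm_completion", "embedding"))


def cite_sources(response: dict) -> list[str]:
    """Format each source chunk as a short human-readable citation."""
    return [
        f'{src["file_name"]} (page {src["page"]}, chunk {src["chunk_index"]})'
        for src in response.get("sources", [])
    ]
```

For example, the citations can be printed beneath `answer` so a reader can trace each claim back to a document page.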
