POST /api/v1/collections/{id}/ask
Ask a question (RAG)
curl --request POST \
  --url https://app.aigmented.com/api/v1/collections/{id}/ask \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "question": "What does the agreement say about liability?",
  "model": "gpt-4o-mini",
  "mode": "fast",
  "stream": false,
  "top_k": 10,
  "current_only": true
}
'
{
  "answer": "<string>",
  "sources": [
    {
      "document_id": "<string>",
      "file_name": "<string>",
      "page": 123,
      "chunk_index": 123,
      "score": 123,
      "content_preview": "<string>"
    }
  ],
  "model": "<string>",
  "tokens_used": {
    "llm_prompt": 123,
    "llm_completion": 123,
    "embedding": 123,
    "model_id": "<string>"
  }
}
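The request and response shapes above can be exercised from Python with the standard library alone. This is a minimal sketch, not an official client: the `ask`/`build_ask_payload` helper names, the timeout, and the placeholder key are illustrative assumptions; only the URL, headers, and body fields come from this page.

```python
# Sketch of a non-streaming /ask call. Helper names and the 60s timeout
# are assumptions for illustration; the endpoint, headers, and body
# fields mirror the documentation above.
import json
import urllib.request

BASE_URL = "https://app.aigmented.com/api/v1"

def build_ask_payload(question, model="gpt-4o-mini", mode="fast",
                      top_k=10, current_only=False, stream=False):
    """Assemble the JSON body; defaults mirror the documented schema."""
    return {
        "question": question,
        "model": model,
        "mode": mode,
        "stream": stream,
        "top_k": top_k,
        "current_only": current_only,
    }

def ask(api_key, collection_id, question, **kwargs):
    """POST the question and return the parsed JSON answer body."""
    req = urllib.request.Request(
        f"{BASE_URL}/collections/{collection_id}/ask",
        data=json.dumps(build_ask_payload(question, **kwargs)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    body = ask("sk-xxxxxxxxxxxx", 42,
               "What does the agreement say about liability?")
    print(body["answer"])
```

On success the returned dict has the `answer`, `sources`, `model`, and `tokens_used` keys shown in the example response above.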

Authorizations

Authorization
string
header
required

Pass your API key as a Bearer token. Example: Authorization: Bearer sk-xxxxxxxxxxxx

Path Parameters

id
integer
required

Collection ID

Body

application/json
question
string
required

The question to answer

Example:

"What does the document say about data retention?"

model
string
default:gpt-4o-mini

Identifier of the LLM used to generate the answer

Example:

"gpt-4o"

mode
enum<string>
default:fast

Answer mode. fast uses fewer retrieved chunks for a quicker response; full retrieves more context for a thorough answer.

Available options:
fast,
full
stream
boolean
default:false

If true, the response is streamed as Server-Sent Events (SSE). Each event has a type field: delta (text chunk), done (final metadata), or error.
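A streaming response can be consumed by splitting the SSE body into messages and parsing each `data:` payload. The sketch below assumes each event is a JSON object with the documented `type` field; the `text` field on `delta` events is an assumption about the payload shape, not confirmed by this page.

```python
# Sketch of an SSE consumer for stream=true responses. The "text" field
# on delta events is an assumed payload shape; only the type values
# ("delta", "done", "error") are documented.
import json

def parse_sse_events(raw):
    """Split a raw SSE body into parsed JSON events.

    SSE messages are separated by blank lines; each message carries one
    or more `data:` lines whose concatenation is a JSON object.
    """
    events = []
    for message in raw.strip().split("\n\n"):
        data_lines = [line[5:].strip() for line in message.splitlines()
                      if line.startswith("data:")]
        if data_lines:
            events.append(json.loads("\n".join(data_lines)))
    return events

def collect_answer(raw):
    """Concatenate the text chunks of delta events into one string."""
    return "".join(e.get("text", "") for e in parse_sse_events(raw)
                   if e.get("type") == "delta")
```

An `error` event should be surfaced to the caller rather than concatenated; `done` carries the final metadata.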

top_k
integer
default:10

Number of knowledge chunks to retrieve before generating the answer

Required range: 1 <= x <= 50
current_only
boolean
default:false

Restrict retrieval to the most current document versions only

chat_history
object[]

Optional prior conversation turns for multi-turn context

Example:

[
  {
    "role": "user",
    "content": "Summarise the document."
  },
  {
    "role": "assistant",
    "content": "The document covers..."
  }
]
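Prior turns are plain role/content pairs, so attaching them to a request body is mechanical. A small sketch, with hypothetical helper names not part of the API:

```python
# Sketch of building chat_history entries for a multi-turn request.
# make_turn / with_history are hypothetical helpers, not API names.
def make_turn(role, content):
    """One chat_history entry; role must be "user" or "assistant"."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unexpected role: {role}")
    return {"role": role, "content": content}

def with_history(payload, turns):
    """Return a copy of the request body with chat_history attached."""
    return {**payload, "chat_history": [make_turn(r, c) for r, c in turns]}
```

For example, a follow-up question would carry the earlier exchange so the model can resolve references like "it" or "that clause".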
filters
object

Optional metadata filters to scope retrieval

Response

Answer generated successfully. When stream=false, returns a JSON body. When stream=true, returns an SSE stream (Content-Type: text/event-stream).

answer
string

The LLM-generated answer

sources
object[]

Knowledge chunks used to generate the answer

model
string

Model used to generate the answer

tokens_used
object

Token usage breakdown
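For quota tracking, the numeric counters in the breakdown can be summed (field names are taken from the schema above; `model_id` is a string and is excluded; whether LLM and embedding tokens should be billed together is an assumption).

```python
# Sketch: sum the numeric counters of a tokens_used object.
# Field names come from the documented schema; combining LLM and
# embedding tokens into one total is an application-level choice.
def total_tokens(tokens_used):
    """Return llm_prompt + llm_completion + embedding."""
    return (tokens_used.get("llm_prompt", 0)
            + tokens_used.get("llm_completion", 0)
            + tokens_used.get("embedding", 0))
```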