Search
Search returns matching knowledge cards ranked by relevance. No LLM is involved: this is pure vector search with optional reranking.
curl -X POST https://aigmented.io/api/v1/collections/49/search \
-H "Authorization: Bearer sk-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "payment terms",
"top_k": 5,
"rerank": true
}'
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| top_k | integer | 10 | Number of results (1-100) |
| rerank | boolean | true | Apply cross-encoder reranking |
| current_only | boolean | true | Only return current document versions |
| filters | object | null | Metadata filters (see below) |
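As a rough illustration of the parameters above, the following Python sketch builds the request body with the documented defaults and posts it using only the standard library. The helper names `build_search_payload` and `search` are hypothetical, and because the shape of the search response is not described in this section, the sketch simply returns the parsed JSON unchanged.

```python
import json
import urllib.request

# Base URL taken from the curl example; adjust if your deployment differs.
BASE_URL = "https://aigmented.io/api/v1"


def build_search_payload(query, top_k=10, rerank=True, current_only=True, filters=None):
    """Build the JSON body for a search request (hypothetical helper)."""
    # The parameter table documents top_k as 1-100; check it before sending.
    if not 1 <= top_k <= 100:
        raise ValueError("top_k must be between 1 and 100")
    payload = {
        "query": query,
        "top_k": top_k,
        "rerank": rerank,
        "current_only": current_only,
    }
    # filters defaults to null, so it is left out unless provided.
    if filters is not None:
        payload["filters"] = filters
    return payload


def search(collection_id, api_key, **params):
    """POST a search request and return the parsed JSON response as-is."""
    request = urllib.request.Request(
        f"{BASE_URL}/collections/{collection_id}/search",
        data=json.dumps(build_search_payload(**params)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `search(49, "sk-YOUR_API_KEY", query="payment terms", top_k=5)` would mirror the curl example.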
Filters
Narrow results by metadata:
{
"query": "payment terms",
"filters": {
"knowledge_type": "procedure",
"document_id": "contract-2024.pdf"
}
}
| Filter | Description |
|---|---|
| knowledge_type | Card type (e.g. knowledge, procedure, reference) |
| document_id | Restrict to cards from a specific document |
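One way to assemble the filters object in Python is sketched below. The helper `build_filters` is hypothetical; it only knows the two filter keys listed in the table and returns `None` when nothing is set, which matches the documented `null` default.

```python
def build_filters(knowledge_type=None, document_id=None):
    """Return a filters dict with only the keys that were set, or None."""
    filters = {}
    if knowledge_type is not None:
        filters["knowledge_type"] = knowledge_type
    if document_id is not None:
        filters["document_id"] = document_id
    # An empty dict is collapsed to None so the caller can omit the field.
    return filters or None


body = {"query": "payment terms"}
filters = build_filters(knowledge_type="procedure", document_id="contract-2024.pdf")
if filters is not None:
    body["filters"] = filters
```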
Ask
Ask a question and get an AI-generated answer grounded in your knowledge base.
curl -X POST https://aigmented.io/api/v1/collections/49/ask \
-H "Authorization: Bearer sk-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"question": "What are the payment terms?",
"mode": "fast"
}'
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| question | string | required | Your question |
| model | string | auto | LLM model to use |
| mode | string | "full" | "fast" (direct search → LLM) or "full" (pipeline with agent escalation) |
| stream | boolean | false | Stream response as SSE events |
| top_k | integer | 10 | Number of knowledge cards to retrieve |
| current_only | boolean | true | Only use current document versions |
| chat_history | array | [] | Previous conversation turns for multi-turn |
| filters | object | null | Same filters as search |
Modes
fast: Retrieves cards, sends them to the LLM, and returns the answer. Deterministic and quick. Filters and current_only are fully supported.
full: Uses the full pipeline with smart routing and agent escalation. It may take longer but handles complex questions better. Note: filters and current_only are not supported in full mode.
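Because the default mode is "full" while filters and current_only only work in "fast", it can help to catch the mismatch before sending a request. The sketch below is one possible client-side guard. The helper name `build_ask_payload` is hypothetical, and raising an error is an assumption: this section does not say whether the server ignores or rejects these fields in full mode.

```python
def build_ask_payload(question, mode="full", filters=None, current_only=None,
                      top_k=10, stream=False, chat_history=None):
    """Build the JSON body for an ask request (hypothetical helper)."""
    if mode not in ("fast", "full"):
        raise ValueError('mode must be "fast" or "full"')
    # Assumption: fail early rather than send fields that full mode does not support.
    if mode == "full" and (filters is not None or current_only is not None):
        raise ValueError("filters and current_only are only supported in fast mode")
    payload = {"question": question, "mode": mode, "top_k": top_k, "stream": stream}
    # Optional fields are included only when the caller set them explicitly.
    if filters is not None:
        payload["filters"] = filters
    if current_only is not None:
        payload["current_only"] = current_only
    if chat_history:
        payload["chat_history"] = chat_history
    return payload


# Filtered question: fast mode must be requested explicitly.
payload = build_ask_payload(
    "What are the payment terms?",
    mode="fast",
    filters={"knowledge_type": "procedure"},
)
```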
Response
{
"answer": "The payment terms state that...",
"sources": [
{ "card_id": "abc123", "title": "Payment Terms Section", "score": 0.95 }
],
"model": "google/gemini-3-flash-preview",
"tokens_used": {
"llm_prompt": 2100,
"llm_completion": 350,
"embedding": 0,
"model_id": "google/gemini-3-flash-preview"
}
}
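A short Python sketch of reading this response follows. It assumes the body has already been parsed into a dict with the fields shown above; the helper name `summarize_answer` and the idea of summing the three token counters into one total are illustrative choices, not part of the API.

```python
def summarize_answer(response):
    """Extract the answer, source titles, and a total token count."""
    usage = response.get("tokens_used", {})
    # Sum only the numeric counters shown in the example; model_id is a string.
    total_tokens = (
        usage.get("llm_prompt", 0)
        + usage.get("llm_completion", 0)
        + usage.get("embedding", 0)
    )
    return {
        "answer": response.get("answer", ""),
        "source_titles": [source["title"] for source in response.get("sources", [])],
        "model": response.get("model"),
        "total_tokens": total_tokens,
    }


example = {
    "answer": "The payment terms state that...",
    "sources": [{"card_id": "abc123", "title": "Payment Terms Section", "score": 0.95}],
    "model": "google/gemini-3-flash-preview",
    "tokens_used": {
        "llm_prompt": 2100,
        "llm_completion": 350,
        "embedding": 0,
        "model_id": "google/gemini-3-flash-preview",
    },
}
summary = summarize_answer(example)
```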
Streaming
Set stream: true to receive Server-Sent Events:
curl -N -X POST https://aigmented.io/api/v1/collections/49/ask \
-H "Authorization: Bearer sk-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"question": "What are the payment terms?",
"stream": true
}'
Events:
data: {"type": "status", "data": {"message": "Searching for information..."}}
data: {"type": "answer_chunk", "data": {"content": "The payment terms"}}
data: {"type": "answer_chunk", "data": {"content": " state that invoices..."}}
data: {"type": "source", "data": {"card_id": "abc123", "title": "Payment Terms", "score": 0.95}}
data: {"type": "done", "data": {"total_sources": 3, "model": "google/gemini-3-flash-preview", "tokens": {"prompt_tokens": 2100, "completion_tokens": 350}}}
| Event | Description |
|---|---|
| status | Processing status update |
| answer_chunk | Partial answer text (stream as it arrives) |
| source | Source knowledge card used for the answer |
| done | Stream complete, includes token usage and model info |
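The event types above could be consumed along the lines of this Python sketch. It assumes each event arrives as a single text line beginning with `data:` followed by a JSON object, as in the sample events; the function names `parse_sse_lines` and `collect_answer` are hypothetical, and bytes read from an HTTP response would need decoding to `str` first.

```python
import json


def parse_sse_lines(lines):
    """Yield one decoded event dict per 'data:' line, skipping anything else."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields are ignored
        yield json.loads(line[len("data:"):].strip())


def collect_answer(events):
    """Join answer chunks, gather sources, and stop at the done event."""
    chunks, sources, done = [], [], None
    for event in events:
        kind = event.get("type")
        data = event.get("data", {})
        if kind == "answer_chunk":
            chunks.append(data.get("content", ""))
        elif kind == "source":
            sources.append(data)
        elif kind == "done":
            done = data
            break
        # status events carry progress messages only and are not collected here
    return "".join(chunks), sources, done
```

Passing an iterable of decoded lines through `collect_answer(parse_sse_lines(lines))` returns the full answer text, the list of source cards, and the payload of the done event (or `None` if the stream ended without one).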
Multi-turn conversations
Pass previous turns in chat_history for follow-up questions:
{
"question": "Can you elaborate on point 3?",
"chat_history": [
["What are the payment terms?", "The payment terms state that..."]
]
}
Each entry is a [question, answer] pair.
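A minimal Python sketch of keeping that history on the client side is shown below. The helper names `append_turn` and `build_follow_up` are hypothetical; the only thing taken from the API is the `[question, answer]` pair format.

```python
def append_turn(history, question, answer):
    """Return a new history list with one more [question, answer] pair."""
    return list(history) + [[question, answer]]


def build_follow_up(question, history):
    """Build an ask body that carries the previous turns in chat_history."""
    return {
        "question": question,
        "chat_history": [list(turn) for turn in history],
    }


history = []
history = append_turn(
    history,
    "What are the payment terms?",
    "The payment terms state that...",
)
follow_up = build_follow_up("Can you elaborate on point 3?", history)
```

After each response, appending the new question and its answer to `history` keeps the next request in step with the conversation.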