Search

Search returns matching knowledge cards ranked by relevance. No LLM is involved: this is pure vector search with optional cross-encoder reranking.
curl -X POST https://aigmented.io/api/v1/collections/49/search \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "payment terms",
    "top_k": 5,
    "rerank": true
  }'

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| top_k | integer | 10 | Number of results (1-100) |
| rerank | boolean | true | Apply cross-encoder reranking |
| current_only | boolean | true | Only return current document versions |
| filters | object | null | Metadata filters (see below) |
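
The request body can be assembled programmatically. The sketch below builds the search payload and validates `top_k` against the documented 1-100 range before sending; the helper name and the client-side validation are illustrative, not part of the API.

```python
import json

API_BASE = "https://aigmented.io/api/v1"  # base URL from the examples above

def build_search_payload(query, top_k=10, rerank=True, current_only=True, filters=None):
    """Build the JSON body for POST /collections/{id}/search.

    Mirrors the parameter table: query is required, top_k must be 1-100.
    (Illustrative helper; the server performs its own validation.)
    """
    if not query:
        raise ValueError("query is required")
    if not 1 <= top_k <= 100:
        raise ValueError("top_k must be between 1 and 100")
    payload = {
        "query": query,
        "top_k": top_k,
        "rerank": rerank,
        "current_only": current_only,
    }
    if filters is not None:
        payload["filters"] = filters
    return payload

# Serialize for use as the request body, e.g. with curl -d or an HTTP client
body = json.dumps(build_search_payload("payment terms", top_k=5))
```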

Filters

Narrow results by metadata:
{
  "query": "payment terms",
  "filters": {
    "knowledge_type": "procedure",
    "document_id": "contract-2024.pdf"
  }
}
| Filter | Description |
|---|---|
| knowledge_type | Card type (e.g. knowledge, procedure, reference) |
| document_id | Restrict to cards from a specific document |

Ask

Ask a question and get an AI-generated answer grounded in your knowledge base.
curl -X POST https://aigmented.io/api/v1/collections/49/ask \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What are the payment terms?",
    "mode": "fast"
  }'

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| question | string | auto | Your question (required) |
| model | string | auto | LLM model to use |
| mode | string | "full" | "fast" (direct search → LLM) or "full" (pipeline with agent escalation) |
| stream | boolean | false | Stream response as SSE events |
| top_k | integer | 10 | Number of knowledge cards to retrieve |
| current_only | boolean | true | Only use current document versions |
| chat_history | array | [] | Previous conversation turns for multi-turn |
| filters | object | null | Same filters as search |

Modes

fast — Retrieves cards, sends them to the LLM, and returns the answer. Deterministic and quick. filters and current_only are fully supported.

full — Uses the full pipeline with smart routing and agent escalation. May take longer but handles complex questions better. Note: filters and current_only are not supported in full mode.
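
Because "full" mode does not accept filters or current_only, a client can strip those fields before sending. A minimal sketch (the helper name is illustrative, not part of the API):

```python
def prepare_ask_payload(question, mode="full", filters=None, current_only=True, **extra):
    """Assemble an /ask request body, including filters and current_only
    only in "fast" mode, per the mode notes above.
    """
    payload = {"question": question, "mode": mode, **extra}
    if mode == "fast":
        payload["current_only"] = current_only
        if filters is not None:
            payload["filters"] = filters
    return payload
```

For example, `prepare_ask_payload("What are the payment terms?", mode="fast", filters={"knowledge_type": "procedure"})` keeps the filter, while the same call with `mode="full"` drops it.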

Response

{
  "answer": "The payment terms state that...",
  "sources": [
    { "card_id": "abc123", "title": "Payment Terms Section", "score": 0.95 }
  ],
  "model": "google/gemini-3-flash-preview",
  "tokens_used": {
    "llm_prompt": 2100,
    "llm_completion": 350,
    "embedding": 0,
    "model_id": "google/gemini-3-flash-preview"
  }
}
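
A client might pick the top-scoring source and total the LLM token usage from this response. A sketch using the example payload above (the helper is illustrative, not part of the API):

```python
# Example response shape from the docs above
response = {
    "answer": "The payment terms state that...",
    "sources": [
        {"card_id": "abc123", "title": "Payment Terms Section", "score": 0.95}
    ],
    "model": "google/gemini-3-flash-preview",
    "tokens_used": {
        "llm_prompt": 2100,
        "llm_completion": 350,
        "embedding": 0,
        "model_id": "google/gemini-3-flash-preview",
    },
}

def total_llm_tokens(resp):
    """Sum prompt and completion tokens from the tokens_used object."""
    t = resp["tokens_used"]
    return t["llm_prompt"] + t["llm_completion"]

# Highest-scoring source backing the answer
top_source = max(response["sources"], key=lambda s: s["score"])
```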

Streaming

Set stream: true to receive Server-Sent Events:
curl -N -X POST https://aigmented.io/api/v1/collections/49/ask \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What are the payment terms?",
    "stream": true
  }'
Events:
data: {"type": "status", "data": {"message": "Szukam informacji..."}}

data: {"type": "answer_chunk", "data": {"content": "The payment terms"}}

data: {"type": "answer_chunk", "data": {"content": " state that invoices..."}}

data: {"type": "source", "data": {"card_id": "abc123", "title": "Payment Terms", "score": 0.95}}

data: {"type": "done", "data": {"total_sources": 3, "model": "google/gemini-3-flash-preview", "tokens": {"prompt_tokens": 2100, "completion_tokens": 350}}}
| Event | Description |
|---|---|
| status | Processing status update |
| answer_chunk | Partial answer text (stream as it arrives) |
| source | Source knowledge card used for the answer |
| done | Stream complete; includes token usage and model info |
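
A consumer can parse these `data:` lines into the final answer and its sources. A minimal sketch following the event shapes shown above (the function name is illustrative):

```python
import json

def parse_sse_lines(lines):
    """Parse 'data: {...}' lines from the /ask SSE stream.

    Concatenates answer_chunk events into the answer, collects source
    events, and captures the done event's metadata.
    """
    answer_parts, sources, meta = [], [], {}
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines between events
        event = json.loads(line[len("data: "):])
        etype, data = event["type"], event["data"]
        if etype == "answer_chunk":
            answer_parts.append(data["content"])
        elif etype == "source":
            sources.append(data)
        elif etype == "done":
            meta = data
        # "status" events are progress updates; ignored here
    return "".join(answer_parts), sources, meta
```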

Multi-turn conversations

Pass previous turns in chat_history for follow-up questions:
{
  "question": "Can you elaborate on point 3?",
  "chat_history": [
    ["What are the payment terms?", "The payment terms state that..."]
  ]
}
Each entry is a [question, answer] pair.
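
Building the history incrementally might look like the sketch below, which appends each completed turn as a [question, answer] pair (the helper names are illustrative, not part of the API):

```python
def with_history(question, turns):
    """Build a multi-turn /ask request body from prior (question, answer) turns."""
    return {
        "question": question,
        "chat_history": [[q, a] for q, a in turns],
    }

# After each answer arrives, record the turn for the next request
turns = []
turns.append(("What are the payment terms?", "The payment terms state that..."))
body = with_history("Can you elaborate on point 3?", turns)
```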