Search

Search returns matching knowledge cards ranked by relevance. No LLM is involved: this is pure vector search with optional cross-encoder reranking.
curl -X POST https://aigmented.io/api/v1/collections/49/search \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "payment terms",
    "top_k": 5,
    "rerank": true
  }'

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| top_k | integer | 10 | Number of results (1-100) |
| rerank | boolean | true | Apply cross-encoder reranking |
| current_only | boolean | true | Only return current document versions |
| filters | object | null | Metadata filters (see below) |
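
The request body can be assembled programmatically. The sketch below builds the search payload and validates `top_k` against the documented 1-100 range before sending; the helper name and the client-side validation are illustrative, not part of the API.

```python
import json

API_BASE = "https://aigmented.io/api/v1"  # base URL from the examples above

def build_search_payload(query, top_k=10, rerank=True, current_only=True, filters=None):
    """Build the JSON body for POST /collections/{id}/search.

    Mirrors the parameter table: query is required, top_k must be 1-100.
    (Illustrative helper; the server performs its own validation.)
    """
    if not query:
        raise ValueError("query is required")
    if not 1 <= top_k <= 100:
        raise ValueError("top_k must be between 1 and 100")
    payload = {
        "query": query,
        "top_k": top_k,
        "rerank": rerank,
        "current_only": current_only,
    }
    if filters is not None:
        payload["filters"] = filters
    return payload

# Serialize for use as the request body, e.g. with curl -d or an HTTP client
body = json.dumps(build_search_payload("payment terms", top_k=5))
```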

Filters

Narrow results by metadata:
{
  "query": "payment terms",
  "filters": {
    "knowledge_type": "procedure",
    "document_id": "contract-2024.pdf"
  }
}
| Filter | Description |
|---|---|
| knowledge_type | Card type (e.g. knowledge, procedure, reference) |
| document_id | Restrict to cards from a specific document |

Ask

Ask a question and get an AI-generated answer grounded in your knowledge base.
curl -X POST https://aigmented.io/api/v1/collections/49/ask \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What are the payment terms?",
    "mode": "fast"
  }'

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| question | string | auto | Your question (required) |
| model | string | auto | LLM model to use |
| mode | string | "full" | "fast" (direct search → LLM) or "full" (pipeline with agent escalation) |
| stream | boolean | false | Stream response as SSE events |
| top_k | integer | 10 | Number of knowledge cards to retrieve |
| current_only | boolean | true | Only use current document versions |
| chat_history | array | [] | Previous conversation turns for multi-turn |
| filters | object | null | Same filters as search |

Modes

fast — Retrieves cards, sends them to the LLM, and returns the answer. Deterministic and quick. filters and current_only are fully supported.

full — Uses the full pipeline with smart routing and agent escalation. May take longer but handles complex questions better. Note: filters and current_only are not supported in full mode.
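
Because "full" mode does not accept filters or current_only, a client can strip those fields before sending. A minimal sketch (the helper name is illustrative, not part of the API):

```python
def prepare_ask_payload(question, mode="full", filters=None, current_only=True, **extra):
    """Assemble an /ask request body, including filters and current_only
    only in "fast" mode, per the mode notes above.
    """
    payload = {"question": question, "mode": mode, **extra}
    if mode == "fast":
        payload["current_only"] = current_only
        if filters is not None:
            payload["filters"] = filters
    return payload
```

For example, `prepare_ask_payload("What are the payment terms?", mode="fast", filters={"knowledge_type": "procedure"})` keeps the filter, while the same call with `mode="full"` drops it.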

Response

{
  "answer": "The payment terms state that...",
  "sources": [
    { "card_id": "abc123", "title": "Payment Terms Section", "score": 0.95 }
  ],
  "model": "google/gemini-3-flash-preview",
  "tokens_used": {
    "llm_prompt": 2100,
    "llm_completion": 350,
    "embedding": 0,
    "model_id": "google/gemini-3-flash-preview"
  }
}
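
A client might pick the top-scoring source and total the LLM token usage from this response. A sketch using the example payload above (the helper is illustrative, not part of the API):

```python
# Example response shape from the docs above
response = {
    "answer": "The payment terms state that...",
    "sources": [
        {"card_id": "abc123", "title": "Payment Terms Section", "score": 0.95}
    ],
    "model": "google/gemini-3-flash-preview",
    "tokens_used": {
        "llm_prompt": 2100,
        "llm_completion": 350,
        "embedding": 0,
        "model_id": "google/gemini-3-flash-preview",
    },
}

def total_llm_tokens(resp):
    """Sum prompt and completion tokens from the tokens_used object."""
    t = resp["tokens_used"]
    return t["llm_prompt"] + t["llm_completion"]

# Highest-scoring source backing the answer
top_source = max(response["sources"], key=lambda s: s["score"])
```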

Streaming

Set stream: true to receive Server-Sent Events:
curl -N -X POST https://aigmented.io/api/v1/collections/49/ask \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What are the payment terms?",
    "stream": true
  }'
Events:
data: {"type": "status", "data": {"message": "Szukam informacji..."}}

data: {"type": "answer_chunk", "data": {"content": "The payment terms"}}

data: {"type": "answer_chunk", "data": {"content": " state that invoices..."}}

data: {"type": "source", "data": {"card_id": "abc123", "title": "Payment Terms", "score": 0.95}}

data: {"type": "done", "data": {"total_sources": 3, "model": "google/gemini-3-flash-preview", "tokens": {"prompt_tokens": 2100, "completion_tokens": 350}}}
| Event | Description |
|---|---|
| status | Processing status update |
| answer_chunk | Partial answer text (stream as it arrives) |
| source | Source knowledge card used for the answer |
| done | Stream complete; includes token usage and model info |
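
A consumer can parse these `data:` lines into the final answer and its sources. A minimal sketch following the event shapes shown above (the function name is illustrative):

```python
import json

def parse_sse_lines(lines):
    """Parse 'data: {...}' lines from the /ask SSE stream.

    Concatenates answer_chunk events into the answer, collects source
    events, and captures the done event's metadata.
    """
    answer_parts, sources, meta = [], [], {}
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines between events
        event = json.loads(line[len("data: "):])
        etype, data = event["type"], event["data"]
        if etype == "answer_chunk":
            answer_parts.append(data["content"])
        elif etype == "source":
            sources.append(data)
        elif etype == "done":
            meta = data
        # "status" events are progress updates; ignored here
    return "".join(answer_parts), sources, meta
```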

Multi-turn conversations

Pass previous turns in chat_history for follow-up questions:
{
  "question": "Can you elaborate on point 3?",
  "chat_history": [
    ["What are the payment terms?", "The payment terms state that..."]
  ]
}
Each entry is a [question, answer] pair.
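
Building the history incrementally might look like the sketch below, which appends each completed turn as a [question, answer] pair (the helper names are illustrative, not part of the API):

```python
def with_history(question, turns):
    """Build a multi-turn /ask request body from prior (question, answer) turns."""
    return {
        "question": question,
        "chat_history": [[q, a] for q, a in turns],
    }

# After each answer arrives, record the turn for the next request
turns = []
turns.append(("What are the payment terms?", "The payment terms state that..."))
body = with_history("Can you elaborate on point 3?", turns)
```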