Multi-model agent harness with retrieval-augmented generation, fine-tuned intent classification, multi-turn session memory, evaluation pipelines, guardrails, hallucination mitigation, and observability.
What Powers This Agent
| Component | Stack |
|---|---|
| RAG Pipeline | Qdrant + Voyage embed + reranker |
| Multi-Model Routing | 2 LLMs + fine-tuned classifier |
| Safety Guards | Bedrock + hallucination judge |
| Eval Pipeline | RAGAS + DeepEval, 7 suites |
| Session Memory | Redis + semantic cache |
| Chaos Engineering | Fault injection + circuit breakers |
| Observability | Langfuse traces + dashboards |
| AWS Infra | ECS Fargate + Terraform + WAF |
| Auth & Access | JWT tokens + API keys |
| Rate Limiting | Per-user sliding window |
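
As an illustration of the per-user sliding window listed above, here is a minimal in-memory sketch. It is not the service's actual implementation: the class name `SlidingWindowLimiter`, its parameters, and the choice of an in-process deque are all assumptions made for the example. A deployed version would more likely keep the timestamps in a shared store so that every instance sees the same counts.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `limit` requests per user in the last `window_s` seconds."""

    def __init__(self, limit, window_s, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id):
        now = self.clock()
        hits = self._hits[user_id]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # rejected requests are not recorded
        hits.append(now)
        return True
```

Each user gets an independent window, so one heavy user cannot exhaust another user's quota. Unlike a fixed window that resets on a clock boundary, the window here always covers the most recent `window_s` seconds.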
Model Routing
| Model | Task |
|---|---|
| gpt-4o-mini | Generation, clarification, hallucination judge |
| gpt-5-nano | Context summarization |
| ft:gpt-4.1-nano | Intent classification (97% accuracy) |
| voyage-4-lite | Embedding (1024 dims) |
| rerank-2.5-lite | Reranking (top-3 from top-10) |
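
The table above amounts to a task-to-model lookup, which could be sketched as follows. The task keys, the `model_for` helper, and the `top_k_after_rerank` helper are hypothetical names chosen for this example; only the model identifiers and the top-3-from-top-10 cutoff come from the table. The fine-tuned model ID is kept in the abbreviated form shown there.

```python
# Task -> model mapping transcribed from the table; the task keys are assumed names.
MODEL_FOR_TASK = {
    "generation": "gpt-4o-mini",
    "clarification": "gpt-4o-mini",
    "hallucination_judge": "gpt-4o-mini",
    "context_summarization": "gpt-5-nano",
    "intent_classification": "ft:gpt-4.1-nano",
    "embedding": "voyage-4-lite",
    "reranking": "rerank-2.5-lite",
}


def model_for(task):
    """Return the model configured for a task, failing loudly on unknown tasks."""
    try:
        return MODEL_FOR_TASK[task]
    except KeyError:
        raise ValueError(f"no model configured for task {task!r}") from None


def top_k_after_rerank(scored_docs, k=3):
    """Keep the k highest-scoring (doc, score) pairs from reranker output."""
    return sorted(scored_docs, key=lambda pair: pair[1], reverse=True)[:k]
```

For example, ten retrieved candidates scored by the reranker would be passed to `top_k_after_rerank` and only the three best would reach the generation prompt.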