AI research workflow, shipped as production infrastructure
This demo runs on AWS ECS Fargate behind an Application Load Balancer; it is packaged as a Docker image, provisioned with Terraform, deployed via GitHub Actions, and wired to Groq for cloud LLM inference.
A user submits a standard or deep research query from the live web app.
FastAPI orchestrates planning, tool use, memory, and background job execution.
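The orchestration loop can be sketched as plan, then tool calls, then memory writes. Everything below is a hypothetical stand-in for illustration: the `plan` helper, the `TOOLS` table, and the list-backed memory are assumptions, not the app's real code (a real planner would itself call the LLM).

```python
def plan(query: str) -> list[str]:
    """Break a query into tool-sized steps (hypothetical static planner)."""
    return [f"web_search:{query}", f"summarize:{query}"]

# Stand-in tool registry; real tools would hit search APIs, fetch pages, etc.
TOOLS = {
    "web_search": lambda arg: f"results for {arg}",
    "summarize": lambda arg: f"summary of {arg}",
}

def orchestrate(query: str, memory: list[str]) -> str:
    """Run each planned step, appending intermediate results to memory."""
    for step in plan(query):
        tool, arg = step.split(":", 1)
        memory.append(TOOLS[tool](arg))
    return memory[-1]  # final step's output is the answer
```

The same shape scales to a real agent loop: replace the static plan with an LLM call and the lambdas with actual tool implementations.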
Requests flow through the ALB into ECS Fargate tasks, with health checks and observability built in.
Local development can use Ollama, while cloud deployment uses Groq-backed inference.
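One common way to support both backends is to select an OpenAI-compatible endpoint from the environment, since Ollama and Groq each expose one. A minimal sketch, assuming hypothetical env-var names and illustrative default model names:

```python
import os

def inference_config() -> dict:
    """Pick the LLM backend from the environment (env-var names are assumptions)."""
    if os.getenv("LLM_PROVIDER", "groq") == "ollama":
        return {
            "base_url": "http://localhost:11434/v1",  # Ollama's default local port
            "api_key": "ollama",                      # Ollama ignores the key
            "model": os.getenv("OLLAMA_MODEL", "llama3.1"),
        }
    return {
        "base_url": "https://api.groq.com/openai/v1",  # Groq's OpenAI-compatible endpoint
        "api_key": os.environ["GROQ_API_KEY"],
        "model": os.getenv("GROQ_MODEL", "llama-3.3-70b-versatile"),
    }
```

Because both endpoints speak the same API shape, the rest of the app can use a single client pointed at whichever `base_url` this returns.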
Use standard mode for interactive streaming responses, or switch to deep mode to run the query as a background job.
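The two modes can be sketched as a single dispatch function. This is a self-contained illustration, not the app's real code: the in-memory `JOBS` store and thread-based worker are stand-ins (a real deployment would stream tokens in standard mode and back deep jobs with a durable queue).

```python
import threading
import uuid

JOBS: dict = {}  # in-memory job store; production would use something durable

def run_research(query: str) -> str:
    """Stand-in for the actual pipeline (planning, tools, memory)."""
    return f"report for: {query}"

def submit(query: str, mode: str = "standard") -> dict:
    """Standard mode answers inline; deep mode runs as a background job."""
    if mode == "standard":
        return {"result": run_research(query)}
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "running", "result": None}

    def worker():
        JOBS[job_id] = {"status": "done", "result": run_research(query)}

    threading.Thread(target=worker, daemon=True).start()
    return {"job_id": job_id}  # caller polls JOBS for completion
```

Standard mode returns the result in the response; deep mode returns a job id immediately and lets the client poll for the finished report.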