Agent Internals

Deep dive into evaluation results, execution traces, tool performance, and multi-agent architecture

Overall Score: 92.6/100
Tool Calls Evaluated: 55
Destinations Tested: 5
Avg Tool Latency: <1ms

Agent Evaluation Scores

Each agent is evaluated across 5 destinations: Tokyo, Paris, Bangkok, New York, and Bali

Tool-Level Results

Individual tool call evaluations with pass/fail checks
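A pass/fail evaluation of a single tool call could be sketched roughly as follows. This is a minimal illustration, not the project's evaluation harness: `ToolEvaluation`, `evaluate_tool_call`, the check names, and the stand-in `fake_search_flights` tool (with its made-up return fields) are all hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolEvaluation:
    """Result of evaluating one tool call (hypothetical structure)."""
    tool: str
    latency_ms: float
    checks: dict[str, bool] = field(default_factory=dict)

    @property
    def passed(self) -> bool:
        # A call passes only if every individual check passed.
        return all(self.checks.values())

    @property
    def score(self) -> float:
        # Share of passed checks, scaled to 0-100.
        if not self.checks:
            return 0.0
        return 100.0 * sum(self.checks.values()) / len(self.checks)


def evaluate_tool_call(
    tool: Callable[..., Any],
    name: str,
    params: dict[str, Any],
    checks: dict[str, Callable[[Any], bool]],
) -> ToolEvaluation:
    """Run the tool once, time it, and apply each check to its result."""
    start = time.perf_counter()
    result = tool(**params)
    latency_ms = (time.perf_counter() - start) * 1000.0
    outcomes = {label: bool(check(result)) for label, check in checks.items()}
    return ToolEvaluation(tool=name, latency_ms=latency_ms, checks=outcomes)


def fake_search_flights(origin: str, destination: str) -> list[dict]:
    """Stand-in tool; the real search_flights output format is not shown in the source."""
    return [{"airline": "Example Air", "price": 420.0,
             "booking_link": "https://example.com/book"}]


# Assumed checks, chosen to mirror the tool description (pricing, booking links).
FLIGHT_CHECKS = {
    "returns_results": lambda r: len(r) > 0,
    "has_price": lambda r: all(f["price"] > 0 for f in r),
    "has_booking_link": lambda r: all(
        f["booking_link"].startswith("https://") for f in r
    ),
}

evaluation = evaluate_tool_call(
    fake_search_flights,
    "search_flights",
    {"origin": "London", "destination": "Tokyo"},
    FLIGHT_CHECKS,
)
```

Averaging `score` over all evaluated calls would be one plausible way to arrive at an overall figure, though the source does not say how its score is computed.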

Agent Execution Trace

Watch the multi-agent workflow execute step by step

Tool Inspector

Interactive tool testing — run any agent tool with custom parameters

search_flights LIVE

Search flights between cities with airline data, pricing, and booking links

search_hotels LIVE

Find real hotels with ratings, amenities, and booking links

search_activities LIVE

Curated local experiences, attractions, and restaurants

calculate_trip_budget LIVE

Smart budget allocation with real city cost data

Multi-Agent Architecture

LangGraph-powered sequential pipeline — direct tool calls for speed, LLM only for final compilation

🌤️ Weather: direct tool call, Open-Meteo API (free)
  ↓
✈️ Flight: direct tool call, route pricing
🏨 Hotel: direct tool call, 30+ cities
🎯 Activity: direct tool call, curated DB
(No LLM needed: instant, free, reliable)
  ↓
💰 Budget: direct tool call, cost optimization
  ↓
📋 Itinerary: LLM compilation (Groq/Gemini, free), with no-LLM fallback
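The "LLM compilation with no-LLM fallback" step can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the shape of the `data` dictionary, and the `llm` callable (any function that takes a prompt string and returns text) are hypothetical and not taken from the project.

```python
from typing import Callable, Optional


def compile_without_llm(data: dict) -> str:
    """Template-based fallback: format the collected data as plain text."""
    lines = [f"Trip to {data['destination']}"]
    for section in ("weather", "flights", "hotels", "activities"):
        items = data.get(section) or []
        lines.append(f"{section.title()}: {len(items)} option(s)")
        lines.extend(f"  - {item}" for item in items)
    budget = data.get("budget")
    if budget:
        lines.append(f"Budget: {budget}")
    return "\n".join(lines)


def compile_itinerary(
    data: dict, llm: Optional[Callable[[str], str]] = None
) -> str:
    """Use the LLM when one is available and working; otherwise use the template."""
    if llm is not None:
        try:
            prompt = (
                "Write a day-by-day itinerary from this data:\n"
                + compile_without_llm(data)
            )
            text = llm(prompt)
            if text and text.strip():
                return text
        except Exception:
            # Any provider error (rate limit, network, missing key) falls
            # through to the deterministic template below.
            pass
    return compile_without_llm(data)


def broken_llm(prompt: str) -> str:
    """Simulates a provider failure."""
    raise RuntimeError("rate limited")


trip = {
    "destination": "Tokyo",
    "flights": ["Example Air, $420"],
    "hotels": ["Example Inn, 4.5 stars"],
    "budget": {"total": 2000},
}

fallback_text = compile_itinerary(trip, llm=broken_llm)
llm_text = compile_itinerary(trip, llm=lambda prompt: "Day 1: arrive in Tokyo")
```

Because the upstream agents produce structured data without an LLM, the fallback only loses the narrative polish, not the underlying results.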

State Machine Flow

START → Sequential pipeline (no conditional routing needed)
├─ weather_agent → direct call: get_weather_forecast + get_best_travel_months (no LLM)
├─ flight_agent → direct call: search_flights + compare_flight_prices (no LLM)
├─ hotel_agent → direct call: search_hotels (no LLM)
├─ activity_agent → direct call: search_activities + get_restaurant_recommendations (no LLM)
├─ budget_agent → direct call: calculate_budget + optimize_budget + get_currency_info (no LLM)
├─ itinerary_agent → LLM compiles all data (or no-LLM fallback)
└─ END → return itinerary

Shared State (AgentState):
messages[]      ← append-only chat history
travel_request  ← user input (destination, dates, budget)
flights[]       ← Flight Agent writes
hotels[]        ← Hotel Agent writes
activities[]    ← Activity Agent writes
weather[]       ← Weather Agent writes
budget{}        ← Budget Agent writes
itinerary       ← Itinerary Agent writes (final output)
agent_logs[]    ← all agents append execution trace
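The flow and shared state above can be approximated in plain Python. This is a dependency-free sketch and deliberately does not use the LangGraph API: the `merge` helper stands in for what a graph framework's state reducers would do, and the agent bodies are placeholders rather than the real tool calls. `make_agent`, `merge`, `run_pipeline`, and the placeholder values are hypothetical.

```python
from typing import Callable, TypedDict, cast


class AgentState(TypedDict, total=False):
    messages: list        # append-only chat history
    travel_request: dict  # user input (destination, dates, budget)
    flights: list
    hotels: list
    activities: list
    weather: list
    budget: dict
    itinerary: str        # final output
    agent_logs: list      # append-only execution trace


# Fields that accumulate across agents instead of being overwritten.
APPEND_ONLY = {"messages", "agent_logs"}


def merge(state: AgentState, update: dict) -> AgentState:
    """Apply one agent's partial update to the shared state."""
    merged = dict(state)
    for key, value in update.items():
        if key in APPEND_ONLY:
            merged[key] = list(merged.get(key, [])) + list(value)
        else:
            merged[key] = value
    return cast(AgentState, merged)


def make_agent(name: str, key: str,
               produce: Callable[[AgentState], object]) -> Callable[[AgentState], dict]:
    """Build an agent that writes one state field and logs that it ran."""
    def agent(state: AgentState) -> dict:
        return {key: produce(state), "agent_logs": [f"{name}: wrote {key}"]}
    return agent


# Placeholder producers; the real agents call the tools named in the flow above.
PIPELINE = [
    make_agent("weather_agent", "weather", lambda s: ["placeholder forecast"]),
    make_agent("flight_agent", "flights", lambda s: ["placeholder flight"]),
    make_agent("hotel_agent", "hotels", lambda s: ["placeholder hotel"]),
    make_agent("activity_agent", "activities", lambda s: ["placeholder activity"]),
    make_agent("budget_agent", "budget", lambda s: {"total": 0}),
    make_agent(
        "itinerary_agent",
        "itinerary",
        lambda s: (
            f"Itinerary for {s['travel_request']['destination']}: "
            f"{len(s['flights'])} flight option(s), {len(s['hotels'])} hotel option(s)"
        ),
    ),
]


def run_pipeline(request: dict) -> AgentState:
    """Run every agent in fixed order; there is no conditional routing."""
    state: AgentState = {"travel_request": request, "messages": [], "agent_logs": []}
    for agent in PIPELINE:
        state = merge(state, agent(state))
    return state


final_state = run_pipeline({"destination": "Bali"})
```

Each agent returns only the fields it owns, which keeps the "who writes what" contract from the shared-state listing explicit and lets the itinerary agent read everything written before it.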

LLM Provider Comparison

Provider  Model                    Cost                     Speed
Groq      llama-3.3-70b-versatile  FREE                     ⚡ Blazing fast
Google    gemini-2.0-flash         FREE                     ⚡ Fast
OpenAI    gpt-4o-mini              ~$0.15/1M input tokens   ⚡ Fast