VoyageAI - Agent Internals & Evaluation

Overall Score: 92.6/100

Tool Calls Evaluated

Destinations Tested

Avg Tool Latency: <1ms

Agent Evaluation Scores

Each agent evaluated across 5 destinations: Tokyo, Paris, Bangkok, New York, Bali

Tool-Level Results

Individual tool call evaluations with pass/fail checks
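
One way a tool-level evaluation with pass/fail checks could be structured is sketched below. The `evaluate_tool_call` helper, the check names, the stub tool, and the scoring rule (percentage of checks passed) are all assumptions made for illustration; the page does not state how its scores or latencies are actually computed.

```python
import time
from typing import Any, Callable

# Hypothetical check type: receives the tool result, returns True on pass.
Check = Callable[[Any], bool]

def evaluate_tool_call(
    tool: Callable[..., Any],
    kwargs: dict,
    checks: dict[str, Check],
) -> dict:
    """Run one tool call, time it, and apply each named pass/fail check."""
    start = time.perf_counter()
    result = tool(**kwargs)
    latency_ms = (time.perf_counter() - start) * 1000.0

    outcomes = {name: bool(check(result)) for name, check in checks.items()}
    passed = sum(outcomes.values())
    # Assumed scoring rule: share of checks passed, scaled to 0-100.
    score = 100.0 * passed / len(outcomes) if outcomes else 0.0
    return {"checks": outcomes, "score": score, "latency_ms": latency_ms}

# Stub standing in for a real tool such as search_hotels (result shape assumed).
def fake_search_hotels(city: str) -> list[dict]:
    return [{"name": "Example Inn", "rating": 4.2, "city": city}]

report = evaluate_tool_call(
    fake_search_hotels,
    {"city": "Tokyo"},
    {
        "returns_results": lambda r: len(r) > 0,
        "has_rating": lambda r: all("rating" in h for h in r),
        "has_booking_link": lambda r: all("booking_url" in h for h in r),
    },
)
```

Here the stub omits a booking link on purpose, so one of the three checks fails and the per-call report records both the individual outcomes and the aggregate score.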

Agent Execution Trace

Watch the multi-agent workflow execute step by step

Tool Inspector

Interactive tool testing — run any agent tool with custom parameters

search_flights LIVE

Search flights between cities with airline data, pricing, and booking links

search_hotels LIVE

Find real hotels with ratings, amenities, and booking links

search_activities LIVE

Curated local experiences, attractions, and restaurants

calculate_trip_budget LIVE

Smart budget allocation with real city cost data

Multi-Agent Architecture

LangGraph-powered sequential pipeline — direct tool calls for speed, LLM only for final compilation

🌤️ Weather

Direct tool call — Open-Meteo API (free)

✈️ Flight

Direct tool call — route pricing

🏨 Hotel

Direct tool call — 30+ cities

🎯 Activity

Direct tool call — curated DB

No LLM needed — instant, free, reliable

💰 Budget

Direct tool call — cost optimization

📋 Itinerary

LLM compilation (Groq/Gemini free) — with no-LLM fallback
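
The "LLM compilation with no-LLM fallback" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `compile_itinerary`, `compile_without_llm`, the template wording, and the shape of the `data` dict are all hypothetical, and the LLM is modeled as a plain callable rather than a real Groq or Gemini client.

```python
from typing import Callable, Optional

def compile_without_llm(data: dict) -> str:
    """Template-based fallback: format the collected data with no model call."""
    lines = [f"Trip to {data['destination']}"]
    for section in ("weather", "flights", "hotels", "activities"):
        items = data.get(section, [])
        lines.append(f"{section.title()}: {len(items)} option(s)")
    return "\n".join(lines)

def compile_itinerary(
    data: dict,
    llm: Optional[Callable[[str], str]] = None,
) -> str:
    """Use the LLM when one is configured and succeeds; otherwise fall back."""
    if llm is not None:
        try:
            return llm(f"Write an itinerary from this data: {data}")
        except Exception:
            # Any provider error (rate limit, network failure) triggers the fallback.
            pass
    return compile_without_llm(data)

# Simulated provider failure to exercise the fallback path.
def broken_llm(prompt: str) -> str:
    raise RuntimeError("simulated provider outage")

trip = {"destination": "Paris", "hotels": [{"name": "Example Hotel"}]}
text = compile_itinerary(trip, llm=broken_llm)
```

Because the five upstream agents produce structured data without a model, the final step can always return something usable even when no provider key is configured or the provider call fails.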

State Machine Flow

START → Sequential pipeline (no conditional routing needed)
  ├─ weather_agent → direct call: get_weather_forecast + get_best_travel_months (no LLM)
  ├─ flight_agent  → direct call: search_flights + compare_flight_prices (no LLM)
  ├─ hotel_agent   → direct call: search_hotels (no LLM)
  ├─ activity_agent → direct call: search_activities + get_restaurant_recommendations (no LLM)
  ├─ budget_agent  → direct call: calculate_budget + optimize_budget + get_currency_info (no LLM)
  ├─ itinerary_agent → LLM compiles all data (or no-LLM fallback)
  └─ END → return itinerary

Shared State (AgentState):
  messages[]     ← append-only chat history
  travel_request ← user input (destination, dates, budget)
  flights[]      ← Flight Agent writes
  hotels[]       ← Hotel Agent writes
  activities[]   ← Activity Agent writes
  weather[]      ← Weather Agent writes
  budget{}       ← Budget Agent writes
  itinerary      ← Itinerary Agent writes (final output)
  agent_logs[]   ← all agents append execution trace
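
The flow and shared state above can be approximated in plain Python as shown below. This sketch deliberately does not use LangGraph itself; it only mirrors the fixed node order and the rule that each agent writes its own key and appends to `agent_logs`. The stub agents, their return values, and the `run_pipeline` helper are assumptions for illustration.

```python
from typing import Any, Callable, TypedDict

class AgentState(TypedDict, total=False):
    messages: list[str]             # append-only chat history
    travel_request: dict[str, Any]  # destination, dates, budget
    flights: list[dict]
    hotels: list[dict]
    activities: list[dict]
    weather: list[dict]
    budget: dict[str, Any]
    itinerary: str                  # final output
    agent_logs: list[str]           # execution trace

Node = Callable[[AgentState], AgentState]

def make_stub_agent(name: str, key: str, value: Any) -> Node:
    """Build a placeholder node that writes one state key and logs its name."""
    def node(state: AgentState) -> AgentState:
        new_state = dict(state)  # copy so earlier states are not mutated
        new_state[key] = value
        new_state["agent_logs"] = list(state.get("agent_logs", [])) + [name]
        return new_state  # type: ignore[return-value]
    return node

# Fixed order, no conditional routing: each node runs exactly once.
PIPELINE: list[Node] = [
    make_stub_agent("weather_agent", "weather", [{"summary": "stub"}]),
    make_stub_agent("flight_agent", "flights", [{"airline": "stub"}]),
    make_stub_agent("hotel_agent", "hotels", [{"name": "stub"}]),
    make_stub_agent("activity_agent", "activities", [{"title": "stub"}]),
    make_stub_agent("budget_agent", "budget", {"total": 0}),
    make_stub_agent("itinerary_agent", "itinerary", "stub itinerary"),
]

def run_pipeline(state: AgentState) -> AgentState:
    for node in PIPELINE:
        state = node(state)
    return state

final = run_pipeline({"travel_request": {"destination": "Bali"}, "agent_logs": []})
```

In LangGraph the same ordering would typically be declared by adding each function as a node on a `StateGraph` and connecting consecutive nodes with edges from `START` to `END`; the plain loop here is only meant to make the data flow visible.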
                    

LLM Provider Comparison

Provider	Model	Cost	Speed
Groq	llama-3.3-70b-versatile	FREE	⚡ Blazing fast
Google	gemini-2.0-flash	FREE	⚡ Fast
OpenAI	gpt-4o-mini	~$0.15/1M input tokens	⚡ Fast