Models
The model is the reasoning engine behind every agent. Choosing the right model for each task — and knowing when to switch — is one of the highest-leverage decisions in an AI system.
The Model Landscape
Three categories, each with distinct trade-offs.
Frontier Models
The most capable commercial models — Claude, GPT-4, Gemini. Best reasoning, longest context, most expensive. Accessed via API, so data leaves your infrastructure.
- Highest capability
- Kept up to date by the provider
- Pay-per-token pricing
Open-Source Models
Llama, Mistral, Qwen, DeepSeek: freely available, self-hostable, increasingly competitive. Full control over deployment and no per-token fees; you pay for infrastructure and operations instead.
- Full data control
- No vendor lock-in
- Requires GPU infrastructure
Fine-Tuned Models
Base models trained further on your specific data and tasks. Higher accuracy for domain-specific work, smaller models that punch above their weight.
- Domain-specific accuracy
- Smaller, faster, cheaper
- Requires training data & expertise
Value Pathways
Strategic value from understanding the model layer.
Task-Specific Routing
Not every task needs a frontier model. Route simple extraction to a fast model, complex reasoning to a capable one, and domain-specific tasks to a fine-tuned specialist. Done well, this keeps output quality comparable at a fraction of the cost.
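A minimal sketch of the routing idea, assuming tasks arrive already labelled by kind. The tier names (`fast-small-model`, `frontier-model`, `fine-tuned-specialist`) and the `Task` type are hypothetical placeholders, not real model identifiers.

```python
from dataclasses import dataclass
from enum import Enum


class TaskKind(Enum):
    EXTRACTION = "extraction"   # simple, high-volume work
    REASONING = "reasoning"     # complex, multi-step work
    DOMAIN = "domain"           # work covered by a fine-tuned specialist


# Hypothetical tier names; substitute whatever your stack actually runs.
ROUTES = {
    TaskKind.EXTRACTION: "fast-small-model",
    TaskKind.REASONING: "frontier-model",
    TaskKind.DOMAIN: "fine-tuned-specialist",
}


@dataclass
class Task:
    kind: TaskKind
    prompt: str


def route(task: Task) -> str:
    """Return the name of the model tier that should handle this task."""
    return ROUTES[task.kind]
```

In practice the hard part is assigning `kind` reliably; a lookup table like this only helps once that classification exists.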
Fine-Tuning
When a general model is almost good enough but not quite, fine-tuning can bridge the gap. A smaller model trained on your specific task can outperform a much larger general model on that task, at a fraction of the inference cost.
Cost Structure
Understanding the model layer lets you design cost-predictable systems. Fixed-cost self-hosted models for high-volume work, pay-per-token APIs for occasional complex tasks.
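The trade-off between fixed-cost and pay-per-token comes down to a break-even volume. The sketch below shows the arithmetic; the function names are illustrative, and the figures in the usage note are invented placeholders, not real prices.

```python
def api_monthly_cost(tokens_per_month: int, price_per_million_tokens: float) -> float:
    """Pay-per-token cost for a given monthly volume."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def break_even_tokens(fixed_monthly_cost: float, price_per_million_tokens: float) -> float:
    """Monthly token volume above which a fixed-cost deployment is cheaper.

    Assumes the self-hosted cost is flat regardless of volume, which ignores
    scaling hardware up under load.
    """
    return fixed_monthly_cost / price_per_million_tokens * 1_000_000
```

For example, with an assumed fixed cost of 3,000 per month and an assumed API price of 10 per million tokens, `break_even_tokens(3000.0, 10.0)` gives 300 million tokens per month. Below that volume the API is cheaper; above it, self-hosting is.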
Vendor Independence
Building on open-source models and abstracted inference layers means you're never locked into a single provider. If pricing changes or a better model appears, you switch without rebuilding.
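One way to picture an abstracted inference layer is an interface that application code depends on, with each provider behind its own adapter. This is a sketch under that assumption: `InferenceBackend`, both backend classes, and `summarise` are hypothetical names, and the backends return canned strings rather than calling any real service.

```python
from typing import Protocol


class InferenceBackend(Protocol):
    """The only surface application code is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class HostedApiBackend:
    """Stand-in for a provider SDK adapter; no real API is called here."""

    def complete(self, prompt: str) -> str:
        return f"[hosted] {prompt}"


class SelfHostedBackend:
    """Stand-in for a locally served model adapter."""

    def complete(self, prompt: str) -> str:
        return f"[self-hosted] {prompt}"


def summarise(backend: InferenceBackend, text: str) -> str:
    # Application logic never names a provider, only the interface.
    return backend.complete(f"Summarise: {text}")
```

Swapping providers then means passing a different backend to `summarise`, leaving the calling code unchanged.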
Security Postures
How this works across different deployment models and security requirements.
Access frontier models via API — always up to date, zero maintenance. Best for general-purpose tasks where data sensitivity allows it.
Deploy open-source models on your own infrastructure. Choose models optimised for your specific tasks and fine-tune on your data.
Run models in fully isolated environments. No telemetry, no external dependencies. Required for classified workloads.
Use frontier APIs for non-sensitive tasks, self-hosted models for proprietary data. Route based on classification rules.
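The classification-based routing in the last posture above can be sketched as a single rule. The sensitivity levels, the threshold, and the target names are all assumptions for illustration; a real policy would come from your own data classification scheme.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    PROPRIETARY = 2


# Assumed policy: only public data may leave your infrastructure.
MAX_SENSITIVITY_FOR_API = Sensitivity.PUBLIC


def choose_deployment(sensitivity: Sensitivity) -> str:
    """Pick a deployment target from the data's classification."""
    if sensitivity <= MAX_SENSITIVITY_FOR_API:
        return "frontier-api"
    return "self-hosted"
```

Keeping the threshold in one named constant makes the policy easy to audit and to tighten later.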
Need help choosing the right models?
I evaluate models against your specific requirements — not benchmarks — and design systems that can adapt as the landscape evolves.