A custom AI build in 2026 costs less than people think on inference, and more than they think on everything else.
The token-price obsession is misleading. You can run a heavy production pipeline for less than $300 a month in LLM costs and still ship a $40k build, because the LLM call is roughly 5% of the total work. The other 95% is design, integration, evaluation, monitoring, and the maintenance you will need next quarter.
This is the actual breakdown. Where the money goes, what gets skipped, and what you should plan for before you sign.
The realistic floor
A useful, production-grade custom AI build in 2026 starts at $15k. Below that you are buying a demo.
A demo runs in a notebook, talks to one LLM, has no auth, no monitoring, no eval, and breaks the first time the data shifts. Useful for proving an idea works. Not useful for running a business on.
A production-grade build has at minimum:
- A specific, scoped outcome (1 to 3 use cases, no more)
- An integration with at least one real system (CRM, CMS, support tool)
- A prompt library with version control
- An eval rubric that runs automatically (see eval loops for AI content)
- Basic monitoring with alerts on failure modes
- A handoff document the in-house team can maintain
- Deployment as something more durable than a script on someone’s laptop
Even a tight version of this is 60 to 120 hours of senior engineering work. At $200 to $400 per hour, the raw arithmetic spans $12k to $48k, which in practice lands at $15k to $40k. The cheap end exists; the cheap-and-shippable end starts at $15k.
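A quick sketch of the arithmetic behind that range. The function name is illustrative, not from any library. Multiplying the extremes gives bounds slightly wider than the $15k to $40k quoted here, which is best read as the realistic middle of the range.

```python
def cost_range(hours_low, hours_high, rate_low, rate_high):
    """Return the (min, max) cost for an hours range billed at a rate range."""
    return hours_low * rate_low, hours_high * rate_high


# 60 to 120 hours at $200 to $400 per hour, the figures used in this section.
low, high = cost_range(60, 120, 200, 400)
print(f"${low:,} to ${high:,}")  # $12,000 to $48,000
```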
Where the money goes
For a typical $25k single-pipeline build, here is how a real budget looks.
Design and scoping — $5k (20%)
Most underestimated. The work: write the brief, define the eval rubric, draw the data flow, identify the integration touchpoints, write the acceptance criteria. About 25 to 40 hours over 2 weeks.
If a vendor wants to skip this and “just start building,” refuse. The build always costs 30 to 50% more when the scope is unclear.
Integration — $8k (32%)
The single largest line item. Real systems are messy. The CRM has 14 custom fields the docs do not mention. Auth tokens rotate every 90 days. The webhook fires twice on retries. Rate limits trigger at 9pm Pacific because that is when EU traffic peaks.
About 40 to 60 hours per integration, more if the system is poorly documented or legacy. A clean Stripe-style API integration is 8 hours; a SAP integration is 80.
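The double-firing webhook is a good example of where those hours go. One common answer is to make the handler idempotent. The sketch below is a minimal version under stated assumptions: the `event_id` field, the class name, and the in-memory set are all hypothetical, and a real deployment would likely use a shared store with an expiry instead.

```python
import hashlib
import json


class WebhookDeduper:
    """Remember delivery keys so a retried webhook is processed only once."""

    def __init__(self):
        # Assumption: in-memory only; production would use a shared store with a TTL.
        self._seen = set()

    def key_for(self, payload: dict) -> str:
        # Prefer an explicit event id (hypothetical field); else hash the body.
        if "event_id" in payload:
            return str(payload["event_id"])
        body = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(body).hexdigest()

    def handle(self, payload: dict, process) -> bool:
        """Run process(payload) unless this delivery was already handled."""
        key = self.key_for(payload)
        if key in self._seen:
            return False
        process(payload)
        # Mark as seen only after success, so a failed run can still be retried.
        self._seen.add(key)
        return True


handled = []
deduper = WebhookDeduper()
event = {"event_id": "evt_1", "type": "deal.updated"}
first = deduper.handle(event, handled.append)
second = deduper.handle(event, handled.append)  # simulated retry of the same delivery
print(first, second, len(handled))  # True False 1
```

Marking the key only after `process` succeeds is a deliberate choice: a crash mid-processing leaves the event eligible for the sender's retry.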
Prompt and pipeline design — $5k (20%)
The work LLM tutorials cover. System prompt design, few-shot examples, chain-of-thought structure, output parsing, error handling, retry logic. 20 to 30 hours.
This is where vendors over-bill. A clean prompt library should not take 100 hours. If it does, something else is wrong (usually evals are missing).
Evals and monitoring — $4k (16%)
The most-skipped budget line. The work: define the rubric, build the test set, wire up automatic eval runs, set up alerting, build a small dashboard. 16 to 24 hours.
Skipping this saves $4k up front and costs $20k six months later when the system has been hallucinating for a quarter and nobody noticed. See eval loops for AI content.
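For a sense of scale, an automatic eval run does not need much machinery to start. This is a minimal sketch, not a full harness: the three checks, the 0.9 threshold, and the sample outputs are all made up for illustration, and a real rubric would be specific to the use case.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    passes: Callable[[str], bool]


def run_evals(outputs, checks, threshold=0.9):
    """Score outputs against every check; flag checks whose pass rate is below threshold."""
    report = {}
    for check in checks:
        passed = sum(1 for text in outputs if check.passes(text))
        rate = passed / len(outputs) if outputs else 0.0
        report[check.name] = {"pass_rate": rate, "alert": rate < threshold}
    return report


# Hypothetical rubric: each check is a cheap, deterministic test on one output.
checks = [
    Check("non_empty", lambda text: bool(text.strip())),
    Check("under_500_chars", lambda text: len(text) <= 500),
    Check("cites_source", lambda text: "[source:" in text),
]

# Hypothetical test set of model outputs.
outputs = [
    "Reset the token from Settings. [source: kb-12]",
    "Contact support for billing changes.",
    "",
]

report = run_evals(outputs, checks)
for name, result in report.items():
    flag = "ALERT" if result["alert"] else "ok"
    print(f"{name:<16} {result['pass_rate']:.0%} {flag}")
```

Run on a schedule and wired to an alert channel, even a loop this small would catch a quarter of silent failures much earlier.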
Deploy and handoff — $3k (12%)
Containerization or serverless setup, env var management, secrets, basic CI, handoff document. 12 to 16 hours.
LLM API spend during build — under $500
The whole project’s worth of API calls while you are iterating. People expect this to be a five-figure number; in 2026 it is usually under $500. At current rates (Claude Sonnet 4.6 at $3/$15 per M tokens; GPT-5.4 at $2.50/$10; Haiku 4.5 and Gemini Flash under $1 per M), even heavy testing barely shows up.
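To sanity-check a vendor quote against this breakdown, the five paid line items can be tallied in a few lines. The numbers are the ones from this section; the variable names are illustrative.

```python
budget = {
    "Design and scoping": 5_000,
    "Integration": 8_000,
    "Prompt and pipeline design": 5_000,
    "Evals and monitoring": 4_000,
    "Deploy and handoff": 3_000,
}

total = sum(budget.values())
shares = {item: round(100 * cost / total) for item, cost in budget.items()}

for item, cost in budget.items():
    print(f"{item:<28} ${cost:>6,}  {shares[item]:>3}%")
print(f"{'Total':<28} ${total:>6,}")

# API spend during the build, capped at the $500 figure above, as a share of the total.
api_share = 500 / total
print(f"LLM API during build: at most {api_share:.0%} on top")
```

Swap in the numbers from a real quote and compare the shares. A quote where evals and monitoring round to zero is the one to question.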
Ongoing run-cost — separate budget
The above is the build. The run-cost is on top, monthly:
- LLM API: $50 to $400/mo for most apps (at the heavy end, a RAG app on Sonnet 4.6 running ~200M input / 30M output tokens per month lands around $1,050/mo standard, $150-$300 with caching, per Anthropic pricing references Q2 2026)
- Monitoring and observability: $50 to $200/mo
- Maintenance engineering: $2k to $8k/mo
Annualize: about 15 to 30% of the build cost per year, ongoing. Budget for this before you sign. Teams that skip this step get hit with a surprise quarterly bill they cannot defend to their CFO. See SEO budget allocation 2026 for the related conversation on tracking spend.
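The standard-rate Sonnet figure in the run-cost list is plain multiplication and worth being able to redo for your own volumes. A minimal sketch, using the per-million-token prices quoted earlier in this article; the function name is illustrative, and caching or batch discounts are deliberately left out.

```python
def monthly_llm_cost(input_mtok, output_mtok, input_price, output_price):
    """Monthly cost in dollars; volumes in millions of tokens, prices per million tokens."""
    return input_mtok * input_price + output_mtok * output_price


# ~200M input / 30M output tokens per month, as in the RAG example above.
sonnet = monthly_llm_cost(200, 30, 3.00, 15.00)   # Claude Sonnet 4.6 at $3/$15
gpt = monthly_llm_cost(200, 30, 2.50, 10.00)      # GPT-5.4 at $2.50/$10

print(f"Sonnet 4.6: ${sonnet:,.0f}/mo")  # Sonnet 4.6: $1,050/mo
print(f"GPT-5.4:    ${gpt:,.0f}/mo")     # GPT-5.4:    $800/mo
```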
The three scope tiers
To set expectations cleanly, three brackets cover almost every custom AI build I see in 2026.
Tight scope: $15k to $40k
One use case. One integration. 4 to 6 weeks. One operator running the build.
Example projects:
- A content brief generator that pulls from your CRM and outputs to a Notion database
- A sales call summarizer that posts to Slack and tags deals in Pipedrive
- A support ticket classifier that routes to the right queue in Intercom
- A custom RAG over your internal docs, deployed as a Slack bot
Roughly 60 to 120 hours of work. Good fit for an independent senior consultant; see hiring an AI consultant.
Medium scope: $60k to $150k
3 to 5 integrations. Custom UI or dashboard. 8 to 14 weeks. Often 2 people: a senior engineer plus a designer or junior implementer.
Example projects:
- An internal copilot for sales reps that searches Salesforce, Gong calls, and Notion, with a custom UI
- A multi-step content pipeline (brief → draft → eval → publish) with admin dashboard
- A pricing recommendation engine that reads inventory + competitor data + historical pricing
- A lead qualification system with multi-channel integration
About 300 to 700 hours total. Outside the comfortable range for one independent. Fit for a senior solo engineer with junior support, or a boutique agency.
Enterprise scope: $150k to $500k+
5+ integrations. Multi-team rollout. Compliance (SOC 2, HIPAA). Custom infrastructure. 4 to 8 months.
Example projects:
- A regulated-industry AI assistant with full audit logging
- A multi-tenant AI feature inside an existing SaaS product
- A company-wide knowledge layer across 8+ data sources
- A custom model fine-tune + deployment pipeline
About 1,000+ hours, multiple engineers, plus PM, security, and ops capacity. This is agency territory, not solo. See hiring an AI consultant for when agencies are the right call.
The 4 hidden costs
Things that surprise first-time buyers. Plan for them.
1. Data prep. The data is never as clean as people remember. A “we have all the call transcripts” assertion turns into “we have call transcripts for 60% of calls, in 3 different formats, and the metadata schema changed in Q2 2024.” Plan 10 to 25% of the build budget for data prep, especially for RAG projects.
2. Approval cycles. Internal stakeholders who were not in the kickoff want changes once the build is mid-flight. Budget contingency hours. The fastest projects have one decision-maker; the slowest have three.
3. Eval calibration. Your first eval rubric is wrong. You need 2 to 3 rounds of calibration where you actually run the rubric against real outputs and tune. Budget 8 to 16 hours over the build.
4. Failure mode handling. Real systems hit edge cases. Empty inputs, rate limits, model timeouts, integration outages. Robust error handling is 15 to 20% of the integration work and is the line item most often cut from quotes.
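To make the fourth item concrete: most of that error-handling work is variations on two patterns, guarding bad inputs and retrying transient failures. The sketch below shows both under stated assumptions. `TransientError`, `call_with_retry`, and `summarize` are hypothetical names, and the `llm_call` argument stands in for whatever client function the build actually uses.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (rate limit, timeout, outage)."""


def call_with_retry(fn, *, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on TransientError wait base_delay * 2**n plus jitter, then retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of retries: let monitoring see the failure
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))


def summarize(text, llm_call, **retry_opts):
    """Guard the empty-input edge case before spending an API call."""
    if not text or not text.strip():
        return None
    return call_with_retry(lambda: llm_call(text), **retry_opts)
```

The `sleep` parameter exists so tests can pass a no-op and run instantly. Re-raising after the last attempt is intentional: a swallowed failure is exactly the kind of thing monitoring never sees.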
What you should not pay for
A few lines that should not be on your quote.
“AI strategy consulting” priced separately from build. If the consultant cannot bake strategy into a fixed-price build, they are charging for thinking. See hiring an AI consultant for the framing.
“Custom LLM training” for general use cases. Fine-tuning a model for a generic task is almost never worth it in 2026. Prompt engineering and RAG beat fine-tuning for 90% of business use cases at 1/10th the cost.
Per-seat licensing for the build itself. Custom builds should be owned by the buyer, not leased. If the consultant wants to charge a recurring license on your custom build, find another consultant.
“AI insurance” or “model monitoring license” priced above $500/mo. Real monitoring is cheap. Vendors love to bundle this as a $2k/mo add-on because they know buyers do not know better. See the run-cost section above.
A real anonymized quote
This is the actual line-item quote from a project I shipped in late 2025. Series B SaaS, RAG over support tickets to power a customer-facing help search.
- Discovery + eval rubric (2 weeks): $7,000
- Integration with Zendesk + Postgres (3 weeks): $11,000
- Prompt design + RAG pipeline (2 weeks): $6,000
- Eval calibration + edge case handling (1 week): $3,500
- Deploy (Vercel + Postgres + monitoring): $2,500
- Handoff documentation: $1,000
Total build: $31,000. Timeline: 9 weeks. Ongoing run-cost: about $280/mo for LLM API, $150/mo for monitoring, $4k/mo retainer for the first 3 months (then dropping to $1.5k/mo).
Year 1 total: $31k build + ~$8k run-cost + pro-rated retainer = roughly $52k.
That is a realistic budget for a build at the upper end of the tight-scope bracket. Most projects I see priced under $20k are either tiny scope or under-quoted (and will overrun).
What to do tomorrow
If you are scoping a custom AI build, here is the 30-minute exercise that pays for itself.
- Write the outcome in one sentence. If you cannot, you are not ready to build.
- List the integrations. Count them.
- Sketch the data flow. Note where the data lives.
- Identify the eval criterion: how will you know it works?
- Estimate volume: runs per week, peak load.
With those five pieces, any senior consultant can quote a fixed price in 90 minutes. Without them, every quote is a guess and every guess goes over. The clarity is worth more than any tool choice you will make in the project.
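If it helps to keep the exercise honest, the five pieces fit in a small structure that reports what is still blank. This is only a sketch: the class, its field names, and the sample values are illustrative, not a standard format.

```python
from dataclasses import dataclass, field


@dataclass
class BuildBrief:
    outcome: str = ""                                   # one sentence
    integrations: list = field(default_factory=list)    # systems to connect
    data_sources: list = field(default_factory=list)    # where the data lives
    eval_criterion: str = ""                            # how you will know it works
    runs_per_week: int = 0
    peak_runs_per_hour: int = 0

    def missing(self):
        """Return the names of the pieces that are still blank."""
        gaps = []
        if not self.outcome.strip():
            gaps.append("outcome")
        if not self.integrations:
            gaps.append("integrations")
        if not self.data_sources:
            gaps.append("data_sources")
        if not self.eval_criterion.strip():
            gaps.append("eval_criterion")
        if self.runs_per_week <= 0 or self.peak_runs_per_hour <= 0:
            gaps.append("volume")
        return gaps


brief = BuildBrief(
    outcome="Classify support tickets and route them to the right Intercom queue.",
    integrations=["Intercom"],
)
print(len(brief.integrations), brief.missing())
```

An empty list from `missing()` is the signal that the brief is ready to send out for quotes.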
When a custom build makes sense
- Workflow with proprietary data. RAG over your sales calls, internal docs, customer support history. No SaaS can match this depth.
- High-volume, well-defined task. Content briefs, sales call summaries, lead qualification. At 1,000+ runs per month, ROI is clear.
- Integration too specific for SaaS. Your CRM, your inventory system, your domain-specific ontology. Build is the only option.
When it does not
- Vague exploratory projects. If you cannot describe the outcome in one sentence, you are not ready to build. Buy a SaaS or do a 2-week prototype first.
- One-off tasks. Manual still wins for less than ~50 runs per month. Build economics need volume.
- Mature SaaS category. Customer support, content marketing, sales enablement. Mature tools beat custom builds 9 times out of 10.