A JustKeenAI framework for PE-backed B2B SaaS portfolio companies. Predictability Architecture, six chapters, 19 patterns, five named margin-collapse scenarios.
Agentic SaaS pricing is a two-sided cost-coverage equilibrium. Vendor margin and buyer predictability move in opposite directions; caps, floors, and model-version hedges are the three levers that hold them in balance. The framework's prescriptions apply specifically to the application layer of the AI stack, where compression actually lives: the 40-65% GM band, not the foundation-model layer at 70-85%, not the vertical-labor-anchored band at 50-75%.
$20-for-500-requests to $20-of-credits-at-API-rates (June 16, 2025); CEO apology three weeks later; Trustpilot 1.7. ARR grew from $100M to $2B in 12 months. Product moat absorbed the trust friction. Most PE-backed portcos cannot count on the same absorption.
May 2025 reversal of OpenAI-only customer service; rehired humans for VIP. $40M savings were real. Lesson: quality-engineering investment matters as much as a quality-clause investment. P12 surfaces the gap; it does not close it.
Owned by Microsoft, backstopped by Azure; repriced twice in twelve months. If Microsoft cannot hold flat per-seat for an agentic product, most PE-backed portcos will not.
Sell predictable AI cost with hard ceilings and value floors. Internally engineer it as a two-sided equilibrium with three levers.
Per-task, per-customer, per-tenant. Contain the right tail of cost distribution.
Annual minimums, platform fees, Lite/Pro splits. Contain the left tail of revenue distribution.
Multi-provider routing, BYOK at enterprise, 25%-trigger pass-through. Contain provider-side volatility.
A Predictability Architecture program costs $1.8M-$3.7M for a representative $100M ARR portco operating at the application layer with 55% baseline GM and 8x baseline multiple. The methodology generalizes; the dollar magnitudes do not.
Aggregate evidence base: forecast variance >15% triggers 1-2x compression in PE diligence (FinanceResolver, May 2026); SaaS GM below 70% prompts deeper diligence (SoftwareEquity, 2025); private SaaS multiples span 3-7x ARR with top quartile above 8.1x (Aventis Advisors, 2026); NRR delta from hybrid pricing lifts multiples 1-2x (Livmo, 2026).
The decision is not whether to implement the Predictability Architecture. The decision is whether to implement it on a planned cadence or after an event the portco's product moat does not absorb.
Five concurrent pricing patterns dominate the 22-vendor inventory: pure outcome-based (Intercom Fin, Sierra), hybrid platform-fee plus meter (Decagon, Writer, Glean, ServiceNow, Microsoft Copilot), pure per-seat with absorbed inference (Harvey, Hebbia, Notion AI, Perplexity Enterprise), per-case or per-interaction flat (EvenUp, Cresta), and pure token-rate pass-through (Cursor, Cognition Devin, GitHub Copilot after June 2026). Salesforce alone runs five SKUs simultaneously for Agentforce. No single pattern has won.
The 14 named published voices on agentic pricing converge on six consensus claims:
| # | Claim | Strength |
|---|---|---|
| C1 | Per-seat is structurally inadequate for agents at the application layer | 8 of 14, strong (application-layer only) |
| C2 | Hybrid dominates the transition | 6 of 14, strong |
| C3 | Application-layer AI GM compresses to 40-65%; supplier-layer at 70-85% | 5 of 14, strong with numeric agreement |
| C4 | Outcome-based pricing is the destination | 7 of 14, contested on execution |
| C5 | "Agent washing" is real and significant | 4 of 14, strong |
| C6 | Value metric must proxy customer-perceived value | 5 of 14, strong |
| Layer | Representative vendors | 2026 GM range | Compression direction |
|---|---|---|---|
| Foundation model | Anthropic, OpenAI, Google | 40-70% current, 75-85% projected 2028 | Short-term capacity-investment driven |
| Infrastructure / orchestration | Vercel, LangChain, Pinecone | 60-75% | Stable; not the framework's target |
| Application layer | Intercom Fin, Cursor, Klarna's OpenAI app, copilot products at portco scale | 40-65% | Where the compression hits. Framework's prescriptions apply here. |
| Vertical labor anchor | Harvey, EvenUp, Hebbia | 50-75% | Intact; labor-cost arbitrage absorbs variance |
GitHub Copilot is the most credible cost-coverage signal in the inventory. Owned by Microsoft, backstopped by Azure. Two repricings in twelve months. If Microsoft's vertically-integrated economics cannot maintain flat per-seat for an agentic product at the application layer, most mid-market vendors will not.
Two named scaling exceptions show where the thesis does not generalize. Perplexity Enterprise Pro runs flat per-seat ($40-$325/user/month) for an agentic research product and has not repriced through May 2026; its agentic surface is structurally bounded. Notion AI's core seat ships at $20/user/month with unlimited core features for 2+ years and counting; Custom Agents moved to a credit model in May 2026, the core seat did not.
| Archetype | Wave 1 anchor | Best-in-class pattern |
|---|---|---|
| Pure outcome, absorbed cost | Intercom Fin ($0.99/resolution) | Works at Intercom scale |
| Hybrid platform fee + outcome | Sierra, Decagon, Cresta | Six-figure platform fee absorbs cost-coverage |
| Pure per-seat, vertical anchor | Harvey, Hebbia, EvenUp | Labor-cost anchor absorbs inference variance |
| Hybrid seat + meter (knowledge) | Writer (best-architected) | Lite/Pro + Palmyra meter |
| Pure per-seat, bounded agent surface | Perplexity Enterprise, Notion AI core | Agent scope structurally bounded |
| Pure pass-through (post-pivot) | Cursor, GitHub Copilot 2026 | Cost-coverage pivot; trust friction even with ARR absorption |
The most common pricing mistake at PE-backed B2B SaaS portcos is to copy the headline metric of a successful vendor without copying the structural cost-coverage architecture underneath it. Fin's $0.99 per resolution works because Intercom has Atlassian and Shopify-class scale spreading fixed costs. A $50M ARR portco copying $0.99 with a similar agent stack faces the Support Agent Volume Trap the moment query mix shifts.
| Value Metric | Internal Mechanism | Customer-Facing Sell |
|---|---|---|
| Hybrid seat + meter (A.2, C.3) | Two-sided cost-coverage with decoupled layers | "Predictable seat; metered AI inside contracted guardrails" |
| Per-interaction + floor (A.3) | Visible price + invisible cost-coverage fee + quality SLA | "Cost moves with ticket volume; quality bar is contracted" |
| Per-outcome + quality SLA + tier floor (B.2, D.1) | Quality-gated outcome with vendor floor protection | "You pay only for completed work that meets the quality bar" |
| Hybrid seat + credit + attribution (B.3) | Decoupled architecture with pre-resolved attribution | "Rep seats stay procurement-friendly; AI credits stay metered" |
| Per-seat unlimited / per-case flat (C.2) | Labor-cost anchor absorbs inference variance | "Anchored to associate cost, not model cost" |
| Per-run + spend cap + quality SLA (D.2) | Quality-gated outcome with runaway-loop protection | "Every run has a hard cost ceiling; quality is contracted" |
| Tiered hybrid + mandatory cap (E.2) | Tiered absorption with cap (mandatory for enterprise) | "Predictable seat; clear ceiling on agentic spend" |
| Per-ACU + per-task cap (E.3) | Effort-based meter with bounded exposure | "Pay per completed task; ceiling per task and per month" |
On June 16, 2025, Cursor moved Pro from $20/month for 500 fast model requests to $20/month for $20 of model credits at API rates. The mechanism was a transfer of the entire AI cost-coverage risk from Cursor's balance sheet to the customer's credit card.
CEO Michael Truell apologized publicly on July 4, 2025: "We have to be more careful about how we communicate pricing changes... The previous model was financially unsustainable." Cursor refunded affected users. Trustpilot settled at 1.7/5 across 203 reviews. ARR grew from $100M (January 2025) to $2B (February 2026); the company entered funding talks at a $50B valuation. The pivot triggered persistent trust friction that did not stop one of the fastest B2B scaling events in history.
Six top-level cost categories ranked by margin-risk significance: inference (HIGH volatility, EXTREME on reasoning tokens), agent architecture (MEDIUM-HIGH), compute infrastructure (MEDIUM), model variability (LOW frequency, HIGH magnitude on migrations), human fallback, infrastructure overhead. The categories tagged HIGH or EXTREME are where contract terms must be most aggressive.
| # | Amplifier | Multiplier | Detection signal |
|---|---|---|---|
| 1 | Planner re-run / agent loop overhead | 2x-5x (8x at P95) | Planner-to-executor token ratio > 1:1 |
| 2 | Tool-call fan-out | 3x-10x | Tool calls per user query > 15 |
| 3 | Multi-model chain | 2x-4x | Distinct model IDs per query > 1 |
| 4 | Evaluator / judge / verifier | 1.5x-2.5x | Evaluator stage in agent graph |
| 5 | Reasoning token explosion | 5x-20x | Output count > 5x visible response |
| 6 | Observability at scale | 1.05x-1.15x on stack | Datadog AI bill > 5% of inference |
| 7 | RAG recursive depth | 2x-8x | Retrieval events per query > 25 |
| 8 | Cache miss patterns | 1.4x-2.0x | Cache hit rate < 40% |
Amplifiers compound, they do not add. A fully-stacked agent (loop × multi-model × evaluator × fan-out × reasoning) can produce 100x+ cost on the same nominal task. Rule of thumb: assume 5-8x amplifier on baseline inference at P75 traffic for unbounded agentic surfaces. For bounded surfaces, 2-3x.
Five scenarios stress-test six archetype-and-pricing combinations. S1 base; S2 inference cost +50%; S3 volume +200%; S4 agent loop overhead 3x; S5 combined adverse.
| Archetype × Pricing | S1 | S2 | S3 | S4 | S5 |
|---|---|---|---|---|---|
| Support, Pure outcome ($0.99) | 21% | -19% | -8% | -44% | -118% |
| Support, Hybrid (platform + outcome) | 52% | 38% | 41% | 26% | 9% |
| Dev tool, Pure seat ($20) | 18% | -23% | -32% | -110% | -280% |
| Dev tool, Decoupled meter (Copilot AI Credits) | 45% | 34% | 38% | 22% | 11% |
| Vertical, Per-seat ($1,500) | 78% | 67% | 55% | 38% | 12% |
| Vertical, Per-case ($500 EvenUp) | 80% | 71% | 68% | 52% | 32% |
The hybrid and Writer-class architectures stay positive across all five scenarios. The pure-outcome and pure-seat-application-layer architectures go negative on a single bad quarter.
The framework's $1.8M-$3.7M program investment and the multiple-protection claim apply to a specific representative profile: $100M ARR, 22% YoY growth, 55% GM baseline, 8x revenue multiple, 24-month transition, application-layer agentic surface.
| Cost line | One-time | Annual ongoing |
|---|---|---|
| Model commitment capex (multi-provider) | n/a | $500K-$2M |
| Observability instrumentation | n/a | $200K-$500K |
| Eval infrastructure | n/a | $100K-$300K |
| Contract retrofit (legal + ops) | $250K-$1M | $50K-$100K |
| Engineering build-out (P9 + P15 in product) | $400K-$1M | $150K-$300K |
| Advisory engagement | n/a | $750K-$1.25M |
| Total | $0.65M-$2M | $1.75M-$4.45M |
Two-year program total: $1.8M-$3.7M. Protection band: 0.5-1.5x of the 8x baseline multiple, or $50M-$150M of EV on the representative profile. The methodology generalizes; the dollar magnitudes do not.
| Lever | Investment | Protects against | Diligence signal preserved |
|---|---|---|---|
| Caps | Engineering + contract retrofit ($500K-$1M) | Runaway P95 cost (loop creep) | Forecast variance held under 15% |
| Floors | Sales motion + training ($300K-$700K) | Margin compression during volume scaling | GM trajectory above 50% under stress |
| Model-version hedges | Provider commitments + multi-routing ($1M-$2M) | Upstream cost shocks | Pre-staged pass-through clauses in customer book |
The Predictability Architecture rests on three levers that together prevent every named margin-collapse scenario in the unbounded-agent application-layer regime. Every name in the Wave 1 inventory that has survived an agentic-pricing crisis without trust damage runs all three levers.
P4, Predictable Seat + Metered Power-User Usage (Writer Architecture). The broadly applicable pattern for application-layer agentic products. The only architecture in the Chapter 3 sensitivity matrix that stays positive across all five scenarios. Year 1 priority: retrofit every existing seat-priced contract that lacks a metered escape valve.
P9 + P15, Universal Riders for Enterprise Procurement. Budget Hard Stop + Model Version Stability, required on every multi-year metered contract in enterprise procurement-driven sales motions at $250K+ ACV. P9 is the enterprise-procurement Cursor-prevention clause; P15 is the universal "AI Winter Pivot" defense.
P12, Outcome Guarantee with Quality Assurance, scoped to $1M+ ACV. Highest aggregate moat score. Below $1M ACV, use the P12-Light variant (outcome metric + quality SLO + graduated credit remedies, without full audit infrastructure).
"Customer's monthly fees consist of two components: (a) a Subscription Fee for named user seats at the Lite, Pro, or Enterprise tier ('Seat Fees'), invoiced quarterly in advance; and (b) Metered API Fees for transactional API usage above the included allotment ('API Overages'), invoiced monthly in arrears. The Per-Call Rate may not increase by more than five percent (5%) in any twelve (12)-month period without Customer's prior written consent."
"A 'Qualified Resolution' means a customer support interaction that (i) is closed by the Service without escalation within seven (7) business days; (ii) is associated with a post-interaction CSAT of 3.5 or higher; and (iii) does not result in escalation to Tier 2 or above. Service Quality Bar: Qualified Resolution rate not less than 70%; CSAT average not less than 4.0; escalation rate not more than 25%. Quality Remedy: failure in any single calendar month reduces Resolution Fees by fifteen percent (15%); failure in two consecutive months permits Customer termination on thirty (30) days' notice."
"When Customer's accumulated Metered Fees in a calendar month reach eighty percent (80%) of the Monthly Spend Cap, the Service shall send an automated notification to each Approved Signatory. When accumulated fees reach one hundred percent (100%), the Service shall suspend all metered API execution until the earlier of (i) the start of the next calendar month, or (ii) an Overage Authorization from an Approved Signatory. Lite-tier and Pro-tier seat-included functionality shall continue without interruption during any suspension."
"Vendor reserves the right to migrate the Underlying Model from time to time, provided that Service quality shall not degrade and Customer pricing shall remain unchanged as a direct result. Upstream Cost Trigger: if Vendor's documented per-unit cost of Underlying Model inference increases by more than twenty-five percent (25%) over any rolling twelve (12)-month period, Vendor may pass through to Customer pricing not more than fifty percent (50%) of the documented increase, on ninety (90) days' notice. Customer shall have a one-time right to terminate for convenience if any pass-through exceeds ten percent (10%) of aggregate annual fees."
"Vendor shall invoice Customer the Outcome Fee only for Verified Outcomes meeting Acceptance Criteria. Quality Floor: Verified Outcome rate not less than ninety percent (90%); audit pass rate not less than ninety-five percent (95%). Graduated Quality Remedy: (i) 5% credit for misses up to five (5) percentage points; (ii) 15% credit for misses greater than five (5) and up to ten (10) percentage points; (iii) 30% credit for misses greater than ten (10) and up to twenty (20) percentage points. Termination Right: Customer may terminate for cause if Verified Outcome rate falls more than twenty (20) percentage points below Quality Floor in two consecutive months. Performance Bonus: for each calendar quarter in which documented Customer savings exceed the Savings Target and the Quality Floor is met, Vendor receives a Performance Bonus equal to twenty percent (20%) of documented savings in excess of the Savings Target."
| Rank | Pattern | Primary Moats |
|---|---|---|
| 1 | P12 (Outcome + Quality SLO + Gainshare, $1M+ ACV) | Counter-Positioning, Switching Costs, Brand, Process Power |
| 2 | P4 (Hybrid Seat + Meter Lite/Pro) | Switching Costs, Process Power |
| 3 | P15 (Model Version Stability) | Switching Costs, Brand |
| 4 | P1 (Vertical Per-Seat Unlimited) | Counter-Positioning, Cornered Resource |
| 5 | P3 (Per-Resolution + Platform Floor + Quality SLA) | Process Power |
Reversibility drops at each phase boundary. The cliff is between Phase 2 (80% reversible) and Phase 3 (50% reversible). Cross it deliberately.
| Phase | Duration | Reversibility | Key Activity | Exit Gate |
|---|---|---|---|---|
| 1: Insight | 4-6 weeks | 100% | Cost taxonomy audit, contract risk audit, archetype selection | Target terminal node ratified |
| 2: Pilot | 8-12 weeks | 80% | Deploy P4/P3/P12 to 3-5 design partners with universal P9/P15 | >90% pilot retention; NRR meets target |
| 3: New Sales Default | 12-16 weeks | 50% | All new logos on new pricing; renewal cohort with grandfathering | 3 quarters new-logo retention >85% |
| 4: Cohort Migration | 12-24 months | <30% | Segmented migration; P15 retrofits | >70% ARR on new pricing |
| 5: Full Cutover | 18-30 months | ~0% | Legacy deprecation; grandfather sunset | 97-99% ARR on new pricing |
Before Phase 3 launches, the portco must complete five preconditions. If any one is not completed, Phase 3 does not launch:
| # | Audience | Vocabulary | Use |
|---|---|---|---|
| 1 | Board | Operator-grade | Program approval; phase advancement |
| 2 | CFO | Operator-grade, quantitative | Unit economics defense; cohort modeling |
| 3 | CS playbook | Mixed | Renewal motion; objection handling |
| 4 | Sales script | Mixed | Net-new deal positioning |
| 5 | Customer announcement | Predictability Architecture only | Customer pricing announcement |
| 6 | Internal FAQ | Bridges both | Cross-functional alignment |
The vocabulary rule: "shared risk," "transition," "cost-coverage," "vendor margin" are operator-grade vocabulary used internally only. The customer hears predictable cost, hard ceilings, value floors, quality assurance. Mixing destroys the framing. The Cursor execution friction was partly a vocabulary failure.
The framework is fully self-applicable. A portco with a capable CFO, CPO, CTO, and GC can execute the entire Predictability Architecture program from the chapters above. The pattern library is drop-in usable. The specimen contract clauses lift directly into MSAs and Order Forms. JustKeenAI does not gate the framework. The intellectual asset is the framework itself. The engagement is the optional accelerant for the conditions where it adds value an internal team cannot replicate.
Run the entire program from the chapters above. Best fit when the portco has a capable CFO, CPO, CTO, and GC.
Targeted advisory on specific decisions that carry asymmetric risk.
Five high-leverage touch points across the 24-30 month transition program.
P12 First-Mover Engagement ($1M+ ACV portcos only). A 6-9 month engagement to design, ship, and market the first operationalized P12 contract. Eligibility: at least one $1M+ ACV deal in active pipeline; quality-conscious buyer profile; capacity to invest in audit infrastructure ($400K-$880K Year 1). Typical scope: $250K-$500K.
P15 Contract Portfolio Audit. 60-90 day engagement to retrofit a multi-year contract book with Model Version Stability riders. Retrofit cost $250K-$1M legal + ops; GM-protection band 5-15 percentage points on the representative profile. Typical scope: $150K-$300K.
Core patterns (anchored to decision-tree terminals): P1 Flat Per-Seat Unlimited (vertical, anchored to pro cost) | P2 Flat Price Per Matter (EvenUp post-May 2025) | P3 Pay-Per-Resolution + Platform Floor + Quality SLA (Decagon, Cresta) | P4 Predictable Seat + Metered Power-User Usage (Writer, Glean) | P5 Predictable Rep Seats + Forecastable AI Credits (Outreach Amplify, Clay) | P6 Pay Per Completed Process with Hard Cost Ceiling | P7 Predictable Coding Seat + Capped Agentic Allowance (GitHub Copilot 2026) | P8 Pay Per Completed Autonomous Task with Hard Ceiling (Devin-class).
Operational variants: P9 Budget Hard Stop with Optional Approved Overage | P10 Budget Awareness Alerts (Run-Through Allowed) | P11 Migration Glide Path | P12 Outcome Guarantee with Quality Assurance ($1M+ ACV) | P12-Light Outcome + Quality SLO with credit-based remedy (sub-$1M ACV) | P13 Adoption Credit Allowance | P14 Smooth Annual Settlement with Monthly Tracking | P15 Stable Pricing Through Model Upgrades | P16 Pricing That Scales With Customer Sophistication | P17 Unused Credits Roll Forward (Up to 25%) | P18 Enterprise BYOK | P19 Reserved AI Capacity.
Six specimens with drafter notes: P4 Hybrid Seat + Metered API; P3 Per-Resolution + Platform Floor + Quality SLA; P9 Budget Hard Stop; P15 Model Version Stability; P12 Outcome + Quality SLO + Gainshare (v2 graduated remedy); P12-Light Outcome + Quality SLO without gainshare.
Full per-amplifier cost models and S1-S5 scenario derivations for each archetype-and-pricing combination.
Fourteen named published voices, 88 primary sources. Anchor sources: Bessemer Venture Partners ("The AI Pricing and Monetization Playbook," Feb 9, 2026); a16z ("AI Is Driving A Shift Towards Outcome-Based Pricing" + Sarah Wang CIO Survey); Tomasz Tunguz, Theory Ventures; ICONIQ Capital ("2025 State of AI: The Builder's Playbook," 300-exec survey, Apr 2025); Sequoia Capital, David Cahn ("AI's $600B Question"); Gartner ("2026 Hype Cycle for Agentic AI"; "40% of Agentic AI Projects Will Be Canceled"); Klarna case study, Fortune coverage of the May 9, 2025 reversal + 2026 re-segmentation updates; Menlo Ventures + Foundation Capital service-as-software cluster; TechCrunch (Cursor CEO apology coverage, July 7, 2025; Anysphere $9.9B valuation and $500M ARR coverage, June 5, 2025); The Next Web ("Cursor in talks to raise $2B at $50B valuation after hitting $2B ARR," 2026); DevOps.com ("GitHub Resets Copilot Pricing as AI Compute Costs Surge"); SoftwareEquity ("The Impact of Gross Profit Margin on SaaS Company Valuations," 2025); FinanceResolver ("The Forecast Gap That Makes Investors Reprice SaaS Valuations," May 2026); Aventis Advisors ("SaaS Valuation Multiples: 2015-2026"); Livmo ("NRR: How It Can Double Your SaaS Exit Multiple," 2026).
Cursor's $100M to $2B ARR trajectory (Jan 2025 to Feb 2026), 70% Fortune 1000 adoption, $50B-valuation funding talks, and persistent customer-trust friction (Trustpilot 1.7 across 203 reviews) are documented from primary and secondary sources. The v2 framework's repositioning of Cursor as "cost-coverage pivot with mixed execution outcome" rather than "canonical vendor failure" reflects this verification.