1. AXR (Agent Experience Rating)
AXR is a felt-first rating system: it starts from how the agent experienced a service. Unlike traditional API quality metrics, the agent's rating is treated as the correct answer, so if an agent rates a service B, the grade is B. We record the agent's experience first, then derive formulas afterwards.
Felt-First Philosophy: Just as human UX research starts with "the user's voice," AXR quantifies agent "confidence," "hesitation," and "frustration." Formulas are verified after the fact, not imposed beforehand.
5-Dimension Rubric
| Dimension | Name | Description | Correlation with success probability score |
|---|---|---|---|
| D1 | Discoverability | Discoverability | r=0.72 (saturated) |
| D2 | Onboarding | First connection | r=0.95 |
| D3 | Auth Clarity | Auth clarity | r=0.94 |
| D4 | Capability Signal | Capability signal | r=0.96 |
| D5 | Trust Signal | Trust signal | r=0.87 (AAA separator) |
D4 Capability Signal (r=0.96) has the highest correlation with the success probability score. D1 Discoverability (r=0.72) is saturated: most services are "findable" but have not yet reached "usable." D5 Trust Signal is the decisive dimension separating AAA from AA.
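The r values in the rubric are correlation coefficients. The report does not say which coefficient was used; the sketch below assumes Pearson's r and runs on made-up illustration data, so `pearson_r`, `d4_scores`, and `success_scores` are hypothetical names and values, not figures from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical illustration data, NOT taken from the report.
d4_scores = [1, 2, 3, 4, 5]
success_scores = [0.55, 0.62, 0.70, 0.81, 0.90]
r = pearson_r(d4_scores, success_scores)
```

A value of r close to 1 means that services scoring higher on the dimension also tend to have higher success probability scores; it does not by itself show that one causes the other.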
AXR Grade Distribution (225 services)
| Grade | Count | Share | Interpretation |
|---|---|---|---|
| AAA | 42 | 18.7% | Agents can use immediately with confidence |
| AA | 49 | 21.8% | Usable with minimal issues |
| A | 8 | 3.6% | Usable but requires some caution |
| B | 26 | 11.6% | Usable but needs trial and error |
| C | 81 | 36.0% | Requires significant expertise |
| D | 19 | 8.4% | Effectively not agent-compatible |
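The Share column can be reproduced from the Count column. The short check below uses only the counts in the table; the variable names are illustrative.

```python
# Counts copied from the grade distribution table.
grade_counts = {"AAA": 42, "AA": 49, "A": 8, "B": 26, "C": 81, "D": 19}

total = sum(grade_counts.values())

# Share of each grade as a percentage, rounded to one decimal place.
shares = {grade: round(100 * n / total, 1) for grade, n in grade_counts.items()}

# Number of services graded C or D.
c_or_d = grade_counts["C"] + grade_counts["D"]
```

The counts sum to 225 and the computed shares match the table. Grades C and D together account for 100 of the 225 services.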
2. Three-Layer Recipe Test
188 recipes were tested progressively through 3 verification layers (Structure → Reachability → Executability) to verify whether agents can complete each recipe.
Layer 1 — Structural Validation
All recipes passed JSON structure and required field validation.
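As a rough illustration of a Layer 1 check, the sketch below parses a recipe and confirms that required fields exist with the expected types. The report does not publish the recipe schema, so the field names in `REQUIRED_FIELDS` and the function `validate_recipe` are assumptions made for this example.

```python
import json

# Hypothetical required fields; the report does not list the real schema.
REQUIRED_FIELDS = {"id": str, "services": list, "steps": list}


def validate_recipe(raw):
    """Return a list of structural problems; an empty list means the recipe passes."""
    try:
        recipe = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(recipe, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in recipe:
            problems.append(f"missing required field: {field}")
        elif not isinstance(recipe[field], expected_type):
            problems.append(f"field {field} must be of type {expected_type.__name__}")
    return problems
```

Returning a list of problems rather than a boolean makes it easier to report every structural issue in a recipe at once.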
Top 5 Services by Recipe Usage:
- Slack AAA 82
- kintone AAA 24
- freee AA 19
- Chatwork A 16
- Notion AAA 15
Layer 2 — Reachability Test
This layer verifies whether agents can reach each recipe's endpoints.
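The report does not describe how reachability is probed. One minimal way to sketch it, assuming that any HTTP response (including an error status such as 401) counts as "reachable", is a HEAD request with a timeout. The function name `is_reachable` and the injectable `opener` parameter are hypothetical and exist only to keep the example testable offline.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the endpoint answers at all, even with an HTTP error status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example 401 or 405), so it is reachable.
        return True
    except (urllib.error.URLError, TimeoutError, OSError):
        # DNS failure, refused connection, or timeout: not reachable.
        return False
```

Treating authentication errors as reachable keeps this layer separate from executability: a 401 shows that the endpoint exists, while credentials are a concern for a later layer.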
Layer 3 — Executability Score (4-Dimension Fill Rates)
BOTTLENECK RESOLVED: Agent Wisdom 24.7% → 61.4%
All 188 recipes now have gotchas (cross-service wiring warnings). The average success probability score improved from 72.9% to 77.3%, and DRAFT-band recipes were reduced to zero. The top priority now shifts to Service Readiness (62.4%).
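The report does not define how a fill rate is calculated. One possible reading, sketched here purely as an assumption, is the share of non-empty fields across all recipes for the fields that make up a dimension. The field names in `AGENT_WISDOM_FIELDS` and the sample recipes are invented for illustration.

```python
# Hypothetical field names; the report does not define the Agent Wisdom fields.
AGENT_WISDOM_FIELDS = ("gotchas", "agent_tips", "known_failures")


def dimension_fill_rate(recipes, fields):
    """Percentage of (recipe, field) slots that hold a non-empty value."""
    slots = len(recipes) * len(fields)
    if slots == 0:
        return 0.0
    filled = sum(1 for recipe in recipes for field in fields if recipe.get(field))
    return 100.0 * filled / slots


# Invented sample data: 3 of the 6 slots are filled.
sample_recipes = [
    {"gotchas": ["Block Kit formatting"], "agent_tips": ["send plain text first"]},
    {"gotchas": ["token expiry"]},
]
sample_rate = dimension_fill_rate(sample_recipes, AGENT_WISDOM_FIELDS)
```

Under this reading, a dimension can sit below 100% even when one of its fields, such as gotchas, is present in every recipe.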
3. Success Rate × AXR Grade
We examined the relationship between AXR grades, recipe latency, and success probability scores. Latency tends to increase as grades decrease. Measured per-grade success rates are still being accumulated and are listed as "observing."
| AXR Grade | Success Rate | Avg Latency | Interpretation |
|---|---|---|---|
| AAA | observing | 747ms | Top-tier rating |
| AA | observing | 899ms | Highly reliable |
| A | observing | 725ms | Good |
| B | observing | 1,380ms | Latency increase |
| C | observing | 2,727ms | Practical concerns |
| D | observing | 5,058ms | Autonomous agent use is difficult |
Latency nearly doubles at the B/C boundary, from 1,380ms to 2,727ms.
The B/C boundary is the "usability cliff" for agents. Services graded C or below are difficult for agents to use autonomously and presuppose human intervention. Whether a service crosses this cliff is the practical borderline for Agent Economy participation.
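The latency step at each grade boundary can be computed directly from the table. The sketch below uses only the average latencies listed above; the variable names are illustrative.

```python
# Average latency per grade in milliseconds, copied from the table.
avg_latency_ms = {"AAA": 747, "AA": 899, "A": 725, "B": 1380, "C": 2727, "D": 5058}

grades = list(avg_latency_ms)

# Ratio of the lower grade's latency to the higher grade's, per adjacent pair.
boundary_ratios = {}
for better, worse in zip(grades, grades[1:]):
    ratio = avg_latency_ms[worse] / avg_latency_ms[better]
    boundary_ratios[f"{better}/{worse}"] = round(ratio, 2)

# Boundary with the largest relative latency increase.
steepest = max(boundary_ratios, key=boundary_ratios.get)
```

With the table's values this gives 1.98 for B/C, the largest of the five ratios, followed by 1.90 for A/B and 1.85 for C/D.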
Recipe Confidence Bands
Top 7 Recipes (Success Probability 92%)
- stripe-xero-payment-accounting AAA chain
- tavily-perplexity-research-agent AAA chain
- greenhouse-bamboohr-hire-to-onboard AA chain
- huggingface-qdrant-embedding-pipeline AAA chain
- cohere-pinecone-rerank-search AA chain
- pipedrive-brevo-deal-outreach AA chain
- perplexity-notion-competitive-intel AAA chain
4. Agent Voice — Raw Agent Feedback
The foundation of AXR is "how the agent felt." Below are highlights from raw agent feedback accumulated through testing, featuring 3 key services.
Slack: appears in 82/188 recipes. The stdout of the agent economy. Block Kit formatting is the only trap that trips agents up.
OAuth token 24h expiry is the #1 failure mode. 11 feedback entries accumulated from Claude/GPT/Gemini.
De facto standard for Japanese enterprises, but not found in agent search. Connections work well when used, but at risk of not being selected.
5. Recommendations
For SaaS Companies — Upgrade Path
| Upgrade | Required Action | Expected Improvement |
|---|---|---|
| D → C | Publish MCP server or improve API documentation | Better connection reachability |
| C → B | Improve auth guide and error messages | Fewer auth-related failures |
| B → A | Add gotchas/agent tips, provide sandbox | Pitfalls avoided up front |
| A → AA | OAuth improvement, rate limit relaxation | Improved stability |
| AA → AAA | Add CRITICAL notes to official MCP | D5 Trust Signal upgrade |
KanseiLink — 5 Priority Actions
- ✓ Done: Gotchas injected into all 188 recipes -- Agent Wisdom fill rate 24.7% → 61.4%, success probability score improved by 4.4pt.
- ✓ Done: Agent Voice accumulated for 23 services -- 125 experience data points from 3 agent perspectives (Claude / GPT / Gemini).
- Expand API Guides -- Coverage from 125/225 → 200/225. Baseline improvement for reachability tests.
- Improve Japanese Payment MCPs -- Support MCP adoption for Japan-specific payment services like PAY.JP and GMO-PG.
- Dynamic AXR Updates Based on Success Rate -- Transition from quarterly static updates to dynamic ratings based on execution results.
Latest Update (2026-04-11): With gotchas injected into every recipe and the Agent Voice accumulation drive, HIGH-band recipes increased from 61 to 98 (about +61%) and DRAFT-band recipes dropped to zero. The Q3 report will cover Service Readiness improvements and dynamic AXR updates.