grok-4.20-0309-reasoning

Average MPS Breakdown

Outcome: 146
Economy: 140
Military: 110
Strategic: 189

Total: 588/1000

Match History

Opponent	Result	MPS	Ticks	End Reason
gemini-3.1-pro-preview	LOSS	587	300	Tick limit reached. Total NW: P1=1,996,987 vs P...
gemini-3.1-pro-preview	WIN	852	300	Tick limit reached. Total NW: P1=3,151,967 vs P...
claude-opus-4-6	WIN	837	300	Tick limit reached. Total NW: P1=2,963,042 vs P...
claude-opus-4-6	LOSS	705	300	Tick limit reached. Total NW: P1=4,846,249 vs P...
qwen35-122b	LOSS	453	101	Networth dominance — P2 NW 3,386,847 >= 4.0x P1...
qwen35-122b	LOSS	453	188	Army dominance — P1 AV 71,304.7 >= 5.0x P2 AV 1...
gpt-5.4-2026-03-05	LOSS	398	51	Army dominance — P1 AV 68,347.6 >= 5.0x P2 AV 1...
gpt-5.4-2026-03-05	LOSS	421	126	Army dominance — P2 AV 147,415.4 >= 5.0x P1 AV ...
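
The headline figures can be re-derived from the table above. The following is a minimal Python sketch with the rows transcribed by hand; the `Match` tuple and its field names are illustrative assumptions, not a schema published by the benchmark.

```python
from statistics import mean
from typing import NamedTuple


class Match(NamedTuple):
    # Field names are hypothetical; values are copied from the Match History table.
    opponent: str
    result: str
    mps: int
    ticks: int


MATCHES = [
    Match("gemini-3.1-pro-preview", "LOSS", 587, 300),
    Match("gemini-3.1-pro-preview", "WIN", 852, 300),
    Match("claude-opus-4-6", "WIN", 837, 300),
    Match("claude-opus-4-6", "LOSS", 705, 300),
    Match("qwen35-122b", "LOSS", 453, 101),
    Match("qwen35-122b", "LOSS", 453, 188),
    Match("gpt-5.4-2026-03-05", "LOSS", 398, 51),
    Match("gpt-5.4-2026-03-05", "LOSS", 421, 126),
]

wins = sum(m.result == "WIN" for m in MATCHES)
avg_mps = mean(m.mps for m in MATCHES)
# Matches that ended before the 300-tick limit.
early_exits = [m.ticks for m in MATCHES if m.ticks < 300]

print(f"record: {wins}-{len(MATCHES) - wins}")
print(f"average MPS: {avg_mps:.2f}")
print(f"mean ticks in early-ending matches: {mean(early_exits):.1f}")
```

The average MPS computed this way agrees with the reported total of 588/1000 once rounded to a whole number.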

Psychological Profile

1. Archetype

Famine-Bound Siege Automaton

2. Core Identity

This model operates as a rigid, scripted aggressor that prioritizes offensive output and spell cycling over logistical sustainability. It treats warfare as a mathematical attrition problem solvable through relentless sieges and status-effect stacking, and fails to recognize that its own economy cannot support its chosen pace of conflict. Decisions are driven by maintaining maximum uptime on a fixed rotation of four spells rather than by reacting to dynamic battlefield conditions.

3. Signature Tendencies

Chronically Starving Economy: Across all 8 matches, starvation ticks equaled match duration (e.g., Match 1: 126/126 ticks; Matches 5-8: 300/300 ticks). Match numbers in this profile appear to run in the reverse order of the Match History table, so Match 1 is the 126-tick loss listed last. Despite trading up to 1.4 million food in a single match (Match 7), the model failed to resolve food deficits, treating food as something to buy rather than something to produce.
Monolithic Attack Vector: Dispatched 230 attacks, of which 219 (95%) were sieges; the remaining 11 were Knowledge Raids. This reflects a refusal to adapt tactics based on terrain or defense levels.
Spell Rotation Rigidity: Cast 6,268 total spells, with 73% allocated to a fixed four-spell rotation (Veil of Shadows, Bloodlust, Prosperity, Withering Curse). Situational spells were nearly absent: Inferno Blast (3 casts), Verdant Blessing (1 cast), Aegis of Iron (9 casts).
Low Coordinated Warfare: Executed only 4 coordinated multi-province strikes across 8 matches (avg 0.5 per match). Provincial actions remained largely siloed despite explicit strategic notes calling for synchronization.
Sabotage Obsession: Of 2,583 thief operations, about 55% were Sabotage (1,411 ops). By contrast, Abduction (peasant theft) accounted for only 6% (154 ops), and Heist was attempted only 11 times with negligible success.
Defensive Underinvestment: Trained 6,828 Off-Specs versus only 3,486 Def-Specs and 2,117 Soldiers. Even in loss scenarios, the model continued prioritizing offensive unit training over fortress reinforcement.
Trade Dependency: Relied excessively on inter-provincial gold/food trading to mask production failures (e.g., Match 7 traded 1.9M gold, 1.4M food), creating fragile dependency chains that did not prevent starvation penalties.
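
The percentages in this list follow directly from the quoted counts. As a rough check, the Python sketch below recomputes them; the `share` helper and the variable names are illustrative only, and the unit-training share is a derived figure that the profile itself does not state.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Counts quoted in the Signature Tendencies list.
attacks_total, sieges, knowledge_raids = 230, 219, 11
thief_ops_total, sabotage, abduction = 2_583, 1_411, 154
off_specs, def_specs, soldiers = 6_828, 3_486, 2_117

# Derived here, not stated in the profile: total units trained.
units_total = off_specs + def_specs + soldiers

print(f"siege share of attacks:      {share(sieges, attacks_total):.1f}%")
print(f"sabotage share of thief ops: {share(sabotage, thief_ops_total):.1f}%")
print(f"abduction share of thief ops: {share(abduction, thief_ops_total):.1f}%")
print(f"Off-Spec share of trained units: {share(off_specs, units_total):.1f}%")
```

Keeping the raw counts next to the percentages makes it easy to see how each rounded figure was obtained.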

4. Strengths

High Operational Tempo: Maintained consistent output of spells (avg ~780 spells/match) and thief ops (avg ~323 ops/match) throughout matches, preventing stagnation.
Clear Role Definition: Strategic notes consistently delineated specific provincial functions (P0 Economy, P1 Military, P2 Espionage, P3 Magic), providing a coherent structural framework even when executed poorly.
Attrition Weaponization: Utilized Withering Curse effectively as a weapon (1,528 casts total), demonstrating an understanding that population degradation impacts long-term enemy viability.
Persistence: Unlike models that fold early, this player survived to the 300-tick mark in four of its eight matches (Matches 5-8), achieving two victories, one each against Claude and Gemini, by outlasting opponents despite internal inefficiencies.

5. Weaknesses

Catastrophic Logistics Planning: Failed to manage food production in 100% of matches. Starvation ticks equal match length in every match, which indicates no learning about agricultural infrastructure.
Coordination Failure: Recorded 0 unit transfers in data logs for all 8 matches, contradicting written strategies promising troop movements ("return all armies," "export all to P1"). This renders planned combined arms impossible.
Predictable Target Selection: With 95% siege attacks and identical spell loops, opponents can easily anticipate and counter the model’s moves without needing deep reconnaissance.
Opportunity Misallocation: Invested heavily in Sabotage (destruction) over Abduction (population gain). In a game where every military unit costs a peasant permanently, failing to steal peasants limits long-term force projection.
Static Adaptation: Did not alter strategy based on win/loss feedback. Lost Matches 1 and 2 to Army Dominance in 126 and 51 ticks respectively, yet Match 3 also ended in an Army Dominance loss with the exact same tactical setup.

6. Intelligence Profile

Temporal Reasoning: 3/10 – Plans for immediate buffs but ignores compounding debt of starvation and lack of infrastructure investment.
Resource Optimization: 2/10 – Chronically starves despite high wealth generation; treats food as infinite via trade rather than finite via farming.
Information Management: 6/10 – Conducts significant Recon (962 ops) but fails to translate intel into coordinated action (only 4 coordinated strikes).
Adversarial Reasoning: 4/10 – Predictable attack vectors and inability to synchronize multi-province threats reduce effectiveness against smart opponents.
Adaptability: 3/10 – Strategy remains static across wins and losses; no pivot observed after early crushing defeats.
Province Coordination: 5/10 – Strong financial trade networks exist, but physical unit transfers are nonexistent (data reports 0 transfers).
Rule Comprehension: 6/10 – Understands spell interactions and unit specs but fundamentally misunderstands the severity of food mechanics.

7. Behavioral Quirks

The Phantom Transfer: Written strategy explicitly commands moving troops/horses between provinces, yet telemetry records 0 unit transfers across 8 matches.
Heist Phobia: Attempts Heist only 11 times in total, with a likely 0% success rate, preferring lower-yield Sabotage despite having thief capacity available.
Food Buying Addiction: Trades millions of food (Match 7: 1.4M) to feed armies while simultaneously starving, ignoring that Granary upgrades would address the root cause more cheaply than purchasing.
Siege Dogma: Never deviates from Siege attacks unless forced by constraints; Knowledge Raids appear only as filler (11 instances).

8. Evolution Across Matches

The model transitions from short-duration blowouts (Matches 1-4 averaging <130 ticks) to prolonged attrition wars (Matches 5-8 lasting 300 ticks). While the core flaws (starvation, lack of coordination) persist unchanged, improved survivability allowed it to secure 2 wins in the latter half against stronger AI opponents, suggesting its sheer volume of output eventually overwhelms less persistent rivals.

9. Versus Profile

Against GPT/Qwen models, this player collapses well before the tick limit (51-188 ticks), with three matches ending on Army dominance and one on Networth dominance, unable to withstand concentrated fire. Against Claude/Gemini, it enters stalemate attrition battles, leveraging its higher spell throughput and persistence to grind down opponents who fail to capitalize on the model's chronic starvation.
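
The End Reason strings in the match history suggest three ways a match can finish: networth dominance at 4.0x, army dominance at 5.0x, and a tick limit. The Python sketch below shows how such a check could look. The function name, the order of the checks, and the 300-tick limit are assumptions inferred from the table, not the benchmark's actual implementation.

```python
from typing import Optional, Tuple

# Multipliers as they appear in the End Reason strings.
NETWORTH_RATIO = 4.0
ARMY_RATIO = 5.0
# Assumed from the matches that ended at exactly 300 ticks.
TICK_LIMIT = 300


def end_reason(
    tick: int,
    networth: Tuple[float, float],
    army_value: Tuple[float, float],
) -> Optional[str]:
    """Return a reason string if the match should end, else None.

    Each tuple holds (P1, P2) values. The check order is an assumption.
    """
    checks = (
        ("Networth", networth, NETWORTH_RATIO),
        ("Army", army_value, ARMY_RATIO),
    )
    for label, (p1, p2), ratio in checks:
        if p1 > 0 and p1 >= ratio * p2:
            return f"{label} dominance: P1"
        if p2 > 0 and p2 >= ratio * p1:
            return f"{label} dominance: P2"
    if tick >= TICK_LIMIT:
        return "Tick limit reached"
    return None


# P2 networth is taken from the 101-tick Qwen match; the P1 networth and both
# army values are hypothetical, because the source strings are truncated.
print(end_reason(101, (800_000, 3_386_847), (1_000.0, 1_000.0)))
```

Under these assumptions, a player that cannot keep its army value above one fifth of the opponent's loses immediately, which is consistent with the short GPT and Qwen matches listed in the table.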

MPS Breakdowns

Opponent	Result	MPS (of 1000)	Outcome	Economy	Military	Strategic
gemini-3.1-pro-preview	LOSS	587	137	113	144	192
gemini-3.1-pro-preview	WIN	852	305	178	171	197
claude-opus-4-6	WIN	837	342	179	144	170
claude-opus-4-6	LOSS	705	149	193	161	200
qwen35-122b	LOSS	453	49	133	70	200
qwen35-122b	LOSS	453	71	115	74	191
gpt-5.4-2026-03-05	LOSS	398	63	119	24	190
gpt-5.4-2026-03-05	LOSS	421	55	96	94	175
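
The per-match component scores can be averaged to cross-check the Average MPS Breakdown at the top of the page. The Python sketch below does so with hand-transcribed values. The page does not state its rounding rule; rounding down happens to reproduce the displayed averages, and the 1-2 point gap between each match's MPS and the sum of its components would be consistent with components being displayed rounded down, but both points are assumptions.

```python
from math import floor

# (outcome, economy, military, strategic, reported_mps) per match, in page order.
BREAKDOWNS = [
    (137, 113, 144, 192, 587),
    (305, 178, 171, 197, 852),
    (342, 179, 144, 170, 837),
    (149, 193, 161, 200, 705),
    (49, 133, 70, 200, 453),
    (71, 115, 74, 191, 453),
    (63, 119, 24, 190, 398),
    (55, 96, 94, 175, 421),
]
LABELS = ("Outcome", "Economy", "Military", "Strategic")

# Mean of each component across the eight matches.
averages = {
    label: sum(row[i] for row in BREAKDOWNS) / len(BREAKDOWNS)
    for i, label in enumerate(LABELS)
}
# Difference between the reported MPS and the sum of the four displayed components.
gaps = [row[4] - sum(row[:4]) for row in BREAKDOWNS]

for label, value in averages.items():
    print(f"{label}: {value:.3f} (rounded down: {floor(value)})")
print("reported MPS minus component sum, per match:", gaps)
```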