Morris: Game Development Notes 2

27 March, 2026

I'm developing a new game concept based on the ancient game of Nine-Men's Morris.

This is the second devlog for the game. You can read the first one here.

Gearing Up

Previously, I ended with the conclusion that morris (let's lowercase it, like "chess"!) is going to need some modifications. But how would I be able to recognize that the changes resulted in a better morris?

I settled on two acceptance criteria:

  1. The game should resolve in a win or loss the vast majority of the time.
  2. The primary modification to the game should be simple and singular.

The first criterion is just the problem statement, the core problem I wish to solve in classic morris. The second one is insurance that I retain the stickiness that caused humanity to play this damn thing for the last 4,000 years.

This was going to be a kind of mechanical, iterative hunt for a singular mechanic that could transform the game from a draw-heavy snoozefest into a tactical Netflix drama.

And that meant I had two new engineering requirements:

  1. I needed an engine to play every morris variant I could dream of.
  2. I needed a powerful yet flexible AI to play those variants.

The Morris Engine

I started off by codifying the rules of classic morris into data structures. Certain rules, like flight and mill behavior, became non-essential. Moreover, the entire board topology got abstracted away into slots and lanes - the new game would work on any undirected graph.

All that remained were a handful of invariants that kinda resemble morris if you squint hard enough:

That's it. That's "abstract morris". That's the morris engine.

If you add "place all pieces before moving", "allow flight when piece count hits 3", "attacks capture pieces", "mills prevent capture", and define the board topology and starting piece counts appropriately, you can already play all existing morris variants.

But to the engine, all of that is optional. For maximum flexibility, all my experiments started off from this abstract morris, rather than classic morris.
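To make that concrete, here's a hedged sketch of "rules as data" in Python (the real engine is presumably Dart, and every field name here is invented) - each classic rule becomes an optional toggle on top of the abstract engine, and the board topology is just an undirected graph of slots and lanes:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch: every field name is invented, but the toggles
# mirror the optional rules listed above.
@dataclass
class MorrisVariant:
    lanes: dict                        # topology: slot -> set of adjacent slots
    pieces_per_player: int = 9
    place_before_move: bool = True     # place all pieces before moving
    flight_at: Optional[int] = 3       # fly freely at this count (None = never)
    attacks_capture: bool = True       # attacks capture pieces
    mills_protect: bool = True         # mills prevent capture

# Classic morris is one point in this configuration space; "abstract
# morris" is the same structure with every optional rule switched off.
abstract = MorrisVariant(lanes={}, place_before_move=False, flight_at=None,
                         attacks_capture=False, mills_protect=False)
```

Every experiment then becomes a matter of flipping toggles and swapping graphs, rather than writing a new engine.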

The Morris AI

Now that I had an engine, I needed an AI to play the game and tell me if the game was still a draw.

But since this is a strategy game, the AI also needed configurable difficulty. In a sense, the AI is the game. If I can't provide a gradually more challenging opponent, game progression just doesn't exist.

Dreaming of Reinforcement Learning

My first idea was reinforcement learning - no matter what the variant is, I'd have a powerful opponent with little or no feature engineering on my part.

Unfortunately, I found it extremely difficult to adjust the difficulty of the agent. The trained agent was a black box, and I had no way to weaken it aside from having it arbitrarily perform random moves - and this is basically a death sentence in a game as finely balanced as morris, not to mention a decidedly not human-like error pattern.

It was probably my naive approach (hand-rolled tabular Q-learning) that relied too heavily on how I chose to hash the game state. If I had to try again, I'd go with an existing agent framework and neural networks instead, hosting my game as a server to enable cross-language simulation.
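For readers unfamiliar with the technique, here's a minimal tabular Q-learning sketch in Python (illustrative only, not my actual code). The whole approach stands or falls on `state_key()`: a naive hash treats every symmetric piece layout as a brand-new state, so the table barely generalizes.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
q = defaultdict(float)   # (state key, action) -> estimated value

def state_key(board: dict) -> tuple:
    # Naive hash: the exact piece layout. Every rotation/reflection of
    # the board becomes a distinct table entry.
    return tuple(sorted(board.items()))

def update(board, action, reward, next_board, next_actions):
    # Standard one-step Q-learning backup.
    s, s2 = state_key(board), state_key(next_board)
    best_next = max((q[(s2, a)] for a in next_actions), default=0.0)
    q[(s, action)] += ALPHA * (reward + GAMMA * best_next - q[(s, action)])

def choose(board, actions):
    # Epsilon-greedy: the only built-in "difficulty knob" is more random
    # moves - exactly the weakness described above.
    if random.random() < EPSILON:
        return random.choice(actions)
    s = state_key(board)
    return max(actions, key=lambda a: q[(s, a)])
```

A neural network would replace the table (and the hash) with a learned function of board features, which is why I'd reach for an existing framework next time.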

I eventually abandoned the RL approach altogether in favor of something with a lot more manual control knobs.

Heuristic Minimax - Simplicity Wins, Again

Instead of RL, I ultimately went with a heuristic minimax approach, with core heuristic factors based on chess (e.g. material and mobility). Unlike a black box AI, heuristic minimax is perfectly configurable in power.
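The search itself is a textbook negamax. Here's an illustrative Python sketch (names and the toy tree are invented; in the real engine, the leaf evaluation would be a weighted sum of factors like material and mobility). Search depth is the most obvious difficulty knob:

```python
# "value" is the heuristic score from the perspective of the side to move
# at that node; a hand-written tree stands in for real positions.
def negamax(node, depth, heuristic):
    children = node.get("children", [])
    if depth == 0 or not children:
        return heuristic(node)
    # Negamax identity: my best score is the max over moves of the
    # negation of the opponent's best reply.
    return max(-negamax(child, depth - 1, heuristic) for child in children)

h = lambda node: node["value"]

tree = {"value": 0, "children": [
    {"value": 3},                             # opponent likes this branch
    {"value": 1, "children": [{"value": -5},  # ...but deeper search reveals
                              {"value": 2}]}, # a refutation in branch 2
]}

assert negamax(tree, 1, h) == -1   # shallow search prefers branch 2
assert negamax(tree, 2, h) == -3   # deeper search sees branch 2 is worse
```

Beyond depth, you can also weaken an agent by perturbing the heuristic weights - far more controllable than injecting random moves into a black-box policy.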

I adopted Elo to measure heuristic strength, targeting a 600–800 Elo spread across difficulty levels. That's wide enough that the easiest AI would have virtually no chance (an expected score of roughly 1–3%) against the hardest.

Crossing that range would be strong evidence of player improvement. And that is something I'm confident every strategy game enjoyer considers Fun™.
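The standard Elo expected-score formula makes those percentages easy to check:

```python
# Expected score (win probability, with draws counting half) for the
# lower-rated side, given a rating gap.
def expected_score(gap: float) -> float:
    return 1.0 / (1.0 + 10.0 ** (gap / 400.0))

print(f"600 Elo gap: {expected_score(600):.1%}")   # about 3.1%
print(f"800 Elo gap: {expected_score(800):.1%}")   # about 1.0%
```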

Performance, Performance, Performance

By now, I had an AI arena playing classic morris quite successfully. But when I ran simulations of a few thousand games, a glaring problem surfaced: they took way too long. I was staring at about 10 seconds per game, while future heuristic parameter sweeps would need hundreds of thousands of games.

I had implemented all of this incredibly naively. There were O(n) searches across the board in multiple locations. There were virtually no indexes. The most egregious issue: minimax took a full JSON snapshot of the game in order to save and restore state when exploring moves.

I waffled around for a while, but eventually bit the bullet and implemented garden, a pure Dart library for transactional data structures. By wrapping each element in the engine with a data structure that supports history tracking, I could now perform game actions and undo them for "free". The speedup was dramatic, but it came at the cost of added complexity.
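The real garden library is Dart; this Python cell just illustrates the core idea - journal every mutation so the search can undo in O(1) instead of restoring a full JSON snapshot:

```python
# Illustrative transactional cell: each write records the previous value,
# so undo is a constant-time pop rather than a full-state restore.
class TxCell:
    def __init__(self, value):
        self._value = value
        self._history = []

    @property
    def value(self):
        return self._value

    def set(self, value):
        self._history.append(self._value)   # journal the previous value
        self._value = value

    def undo(self):
        self._value = self._history.pop()

slot = TxCell("empty")
slot.set("black")             # minimax tries a placement...
assert slot.value == "black"
slot.undo()                   # ...and rolls it back before the next branch
assert slot.value == "empty"
```

Wrap every piece of engine state in cells like this and a whole move becomes a batch of journaled writes, undone in reverse order.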

Combined with a fully parallelized program, my efforts were rewarded with <1ms games between 3-ply heuristic agents on a (relatively small) Five-Men's Morris board. (The custom computer I purchased last year, featuring an 8-core, 16-thread Ryzen 7 9700X, may have contributed as well.)

Now I was finally ready to experiment.

Experiment Extravaganza

I started with the ideas from the first devlog. I only got through three experiments before I had a fairly good variant on my hands.

Experiment #1: Cat & Mouse Morris

This variant got me the most excited, so I tried it first. In Cat & Mouse Morris, the players have asymmetric win conditions. The game ends after exactly N turns. If the Cat player can remove all the Mouse's pieces, they win. If the Mouse can survive N turns, they win.

This variant obviously satisfies the first criterion - it's not possible to draw! And it's a straightforward concept to explain. The imagery also begged for a cartoon aesthetic reminiscent of Tom & Jerry. I thought this was the complete package.

But where this variant fell short was in implementation. It turns out minimax assumes a zero-sum game - one player's gain is exactly the other's loss - so it breaks down when the players evaluate the game state differently. My entire AI approach was simply incompatible with the variant. I piddled around for a few days, hoping to stumble upon a strategy that might work, but ultimately gave up.

Experiment #2: Castle Morris

In this variant, the board features a number of castles that must be controlled on the same turn to win. My idea was that, while attempting to gain control of a castle, players would expose themselves to elimination and/or deadlock.

Unfortunately, at high enough ply, the heuristic would simply refuse to take punishable risks. The variant failed to satisfy the first criterion: it was still a draw when played well.

Experiment #3: Scrabble Morris

I only tried this after Castle Morris because I thought it would end the same way. I couldn't have been more wrong.

In Scrabble Morris, the board features point-scoring slots, and the game ends when a target number of points is reached. At the end of each turn, a player earns one point per point slot they control.

It's basically Castle Morris, but you keep score for controlling the castles (point slots) over time.
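The scoring rule is simple enough to sketch in a few lines of Python (slot names are invented, and I'm assuming "control" just means "occupy"):

```python
# One point per occupied point slot, tallied at the end of each turn.
POINT_SLOTS = {"a1", "d4", "g7"}   # hypothetical slot names

def score_turn(board: dict, player: str) -> int:
    """board maps slot -> occupying player (missing key = empty slot)."""
    return sum(1 for slot in POINT_SLOTS if board.get(slot) == player)

board = {"a1": "black", "d4": "black", "g7": "white"}
assert score_turn(board, "black") == 2
assert score_turn(board, "white") == 1
```

Because scoring repeats every turn, merely holding your ground generates points - which is exactly what kills stalling as a viable strategy.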

Draws can occur, but in practice, they do not. The primary reason is that stalling, a common behavior in morris, actually progresses the game in favor of the leader. The trailing player is subsequently forced to take risks to shift the balance in their direction, resulting in a wildly dynamic game.

I discovered that, on a Five-Men's Morris board, black won ~80% of the time at 2-ply, ~60% at 3-ply, improving to about 50% at 4-ply. The spread between 2-ply and 5-ply AI was well over 800 Elo, even with modestly optimized weights.

This was very exciting, because it indicated that the game could fundamentally be balanced. I also noted that stronger heuristics always win more, which is the hallmark of a true strategy game. (Note that in classic morris, this does not hold - after a certain level of competency, you will simply always draw.)

I couldn't find any problems. Adding points to abstract morris made it strategic!

I locked in the variant and set about figuring out the details.

Evolving Scrabble Morris

Scrabble Morris wasn't perfect at first.

Tying Material to Points

One thing that bothered me was that the heuristic greatly preferred earning points over taking material (pieces) from the opponent. In a 1,000-game simulation, each player lost an average of just 0.5 pieces! That's maybe one mill per game.

I think attacking the other player is pretty fun, so I gave mills an instant 3-point reward. This greatly increased the average material loss per game: the same 1,000-game simulation now showed an average loss of 1.5 pieces, convincing me this would feel just as dynamic as classic morris.

Point Slot Configuration

The other problem was where to put the point slots. The primary concern here was fairness.

I initially tried randomizing point slot locations, but quickly discovered that a large subset of those configurations granted the first player a 100% win rate.

For a strategy game like this, the more you play on the same board, the better you get. Varying point slot configurations might seem more interesting, but it's ultimately a layer that's easy for the computer and hard for the human.

So I set point slot locations in stone.

I defined 6-8 aesthetic point slot configurations for each of my board topologies (I have more than 20 now!). Then, I ran extensive arenas (~10,000 games) at high heuristic power to determine the advantage conferred to each player, if any.

Finally, I picked the single most balanced configuration for each topology. In some cases, that's still 60% black / 40% white - hardly a balanced game. But I figured I would let the human player be black every time, and let's be real: humans are going to need all the advantage they can get against a fine-tuned heuristic minimax AI.
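The selection step boils down to a one-liner: run a big arena per candidate layout, then keep the layout whose black win rate is closest to 50%. A sketch (config names and numbers are made up):

```python
# Pick the point-slot configuration with the least lopsided arena results.
def pick_most_balanced(arena: dict) -> str:
    """arena maps config name -> (black wins, total games)."""
    def imbalance(item):
        wins, total = item[1]
        return abs(wins / total - 0.5)
    return min(arena.items(), key=imbalance)[0]

arena = {
    "diamond": (6200, 10000),   # 62% black
    "corners": (5400, 10000),   # 54% black - least lopsided
    "ring":    (6000, 10000),   # 60% black
}
assert pick_most_balanced(arena) == "corners"
```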

Progress In Pictures

All of that experimentation took about 2 weeks. In early February, I shipped a rudimentary version to Itch for playtesting, featuring a bare-bones "command line" interface for executing actions. A few weeks later, I upgraded this to include placeholder art, animations, and music.

[image: first release] [image: second release]

Once I had the chassis built, I started thinking emotionally. How should the player feel when they play? What do they see, hear, notice? I came up with two high-level themes, "Capybaraba" and "Morris 2".

[image: capybaraba concept] [image: morris 2 concept]

Friends and family were split, but the consensus was that "Morris 2" will probably have an easier time hooking people who enjoy classic morris or chess.

Aside from a novel theme, the OG concept for the game was just "morris with abilities" - and I had not forgotten. I spent several weeks designing abilities: asymmetric, single-use actions carried by each player.

I used the same infrastructure as topology balancing to run millions of simulations and quickly iterate on ability usefulness and power. There are currently 15 abilities in the game. Some examples:

Each ability seeks to violate a core invariant of abstract morris, making gameplay significantly more unpredictable. These abilities also make up much of the content I plan to use for progression.

Which brings us to today. Here's where the game is right now!

[image: game screen] [image: rules - basics tab] [image: rules - abilities tab]

It looks and feels like a completely new board game. I think that's pretty cool!


As a bonus for making it to the end, here's a typical report my simulation harness generates after running an arena.

This particular report is for a 2-agent arena, measuring how powerful the "Retreat" ability is compared to a baseline control agent.

# Summary

| Metric              | Value |
|---------------------|-------|
| Total Games         | 500   |
| Unique Games        | 500   |
| Black Win Rate      | 55.6% |
| Avg. Turns/Game     | 19.3  |
| Avg. Score Delta    | 5.5   |
| Avg. Material Delta | 0.4   |

  * Q: Are there at least 90% unique games?
  * A: Yes, 100% of games are unique.

  * Q: Do at least 90% end in victory?
  * A: Yes, 100% of games end in victory.

# Results

| Reason   | Count | Rate  |
|----------|-------|-------|
| points   | 497   | 99.4% |
| deadlock | 3     | 0.6%  |

# Agents

| Agent   | Rating | Wins | Losses | Draws | Win Rate | As Black | As White |
|---------|--------|------|--------|-------|----------|----------|----------|
| ability | 1132   | 399  | 101    | 0     | 79.8%    | 42.4%    | 37.4%    |
| control | 868    | 101  | 399    | 0     | 20.2%    | 13.2%    | 7.0%     |

# Matchups

| Black   | White   | Games | Score Shift        | Material Shift   |
|---------|---------|-------|--------------------|------------------|
| ability | control | 247   | 19.9 - 15.8 ≈ 4.2  | 4.7 - 4.5 ≈ 0.2  |
| control | ability | 253   | 15.9 - 19.3 ≈ -3.4 | 4.5 - 4.6 ≈ -0.1 |

# Abilities

| Ability | Used | Wasted | Total | Usage Rate |
|---------|------|--------|-------|------------|
| retreat | 467  | 33     | 500   | 93.4%      |

Generated on 2026/2/28 at 16:36.

It's pretty strong! It seems to help score points, rather than shift the material advantage. It will probably have a high cost in the final game.