Aegis, Raivo, and Ente look the most promising from what I've read. Any other recommendations, or thoughts on those three in particular?
TY
The current generation of frontier LLMs can't make puzzles that get much more interesting than hot-vs-big-vs-fast. New inferences keep circling a small pool of concepts unless the prompting has a way to get the LLM into new territories of language. Puzzlemaking needs graph traversals.
I made 20 levels algorithmically because I work with a huge semantic graph with over 100M edges, built from manual lexicography and millions of LLM inferences (various models). I keep exploring what can emerge from this graph. The puzzles are randomly selected; reload to see others.
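To make the graph-traversal idea concrete, here is a minimal sketch of one possible approach: a random walk over a tiny word graph that starts at a seed word and ends a few hops away, so the start and end are related but not obviously so. This is not the actual pipeline behind the levels. The toy graph, `random_walk`, and `pick_puzzle_words` are all hypothetical names and data made up for illustration.

```python
import random

# Toy stand-in for a semantic graph: word -> related words.
# All entries are hypothetical; the real graph has over 100M edges.
TOY_GRAPH = {
    "hot": ["fire", "spicy", "big"],
    "big": ["hot", "fast", "whale"],
    "fast": ["big", "river"],
    "fire": ["hot", "forge"],
    "spicy": ["hot", "pepper"],
    "whale": ["big", "ocean"],
    "river": ["fast", "ocean"],
    "forge": ["fire", "anvil"],
    "pepper": ["spicy"],
    "ocean": ["whale", "river"],
    "anvil": ["forge"],
}


def random_walk(graph, start, steps, rng):
    """Walk up to `steps` edges from `start`, never revisiting a node."""
    path = [start]
    seen = {start}
    for _ in range(steps):
        options = [n for n in graph.get(path[-1], []) if n not in seen]
        if not options:
            break  # dead end: stop early
        nxt = rng.choice(options)
        path.append(nxt)
        seen.add(nxt)
    return path


def pick_puzzle_words(graph, seed_word, steps=4, rng=None):
    """Return (start, end, path): two words linked by a short hidden chain."""
    rng = rng or random.Random()
    path = random_walk(graph, seed_word, steps, rng)
    return path[0], path[-1], path


if __name__ == "__main__":
    start, end, path = pick_puzzle_words(TOY_GRAPH, "hot")
    print(f"Connect '{start}' to '{end}' (hidden chain: {' -> '.join(path)})")
```

The point of walking instead of asking a model directly is that the walk can leave the small pool of concepts near the seed word; a longer `steps` value gives more distant, harder pairs.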
The front-end was built with Claude Code.
Maybe someday I'll make this into a mobile game and increase the complexity and peril. If you are a gamedev, feel free to dissect it and borrow any parts.