Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.
150k sounds like a lot. I do have to wonder what the program does exactly to see if that’s warranted, but it sounds bloated.
yes 'println("a very important and useful line of code");' >> main.c
in under a second! As such, this is high productivity! /s
Do you remember such a time or company? I have been developing professionally since the early 1990s (and as a hobbyist before then), and this "truth" was already a meme back then.
I'm sure it happened, but I'm not sure it was ever as widespread as this legend would make it sound.
But, there were decades of programmers programming before I started, so maybe it just predated even me.
> They devised a form that each engineer was required to submit every Friday, which included a field for the number of lines of code that were written that week.
Why would I ever want a language with fewer capabilities?
https://genius.com/Jorge-luis-borges-on-exactitude-in-scienc...
Just in case not, consider whether the short function

    def is_even(x):
        return (x % 2) == 0

handles a wider range of input conditions than the higher-LOC function

    def is_even(x):
        if x == 0:
            return True
        if x == 2:
            return True
        if x == 4:
            return True
        ...
        return False

And it confuses Claude.
This way of running tests is also what Rails does, and AFAIK Django too. Tests are isolated and can be run in random order. Actually, Rails randomizes the order, so if there are tests that for any reason depend on the order of execution, they will eventually fail. To help debug those cases, it prints the seed, which can be used to rerun those tests deterministically, including the calls to methods returning random values.
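To make the seed idea concrete, here is a minimal sketch in Python. It is not how Rails or any real test runner implements it; `run_in_random_order` and the three toy tests are hypothetical names made up for illustration. The point is only that a printed seed is enough to replay the exact same order.

```python
import random


def run_in_random_order(tests, seed=None):
    """Run test callables in a shuffled order.

    Returns the seed and the order actually used, so a failing run
    can be replayed. Hypothetical helper, not a real framework API.
    """
    if seed is None:
        seed = random.randrange(2**32)
    rng = random.Random(seed)  # private RNG, independent of global state
    order = list(tests)
    rng.shuffle(order)
    for test in order:
        test()
    return seed, [t.__name__ for t in order]


def test_a():
    assert 1 + 1 == 2


def test_b():
    assert "x".upper() == "X"


def test_c():
    assert sorted([2, 1]) == [1, 2]


# First run: random order, seed is reported for later debugging.
seed, first_order = run_in_random_order([test_a, test_b, test_c])
print(f"Randomized with seed {seed}")

# Replay: passing the same seed reproduces the same order.
_, replay_order = run_in_random_order([test_a, test_b, test_c], seed=seed)
```

A real runner would presumably also seed the random values used inside the tests; this sketch only covers the ordering.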
I thought that this is how all test frameworks work in 2026.
I've never had this problem.
That's easier said than done. Simple example: API that returns a count of all users in the database. The obvious correct implementation that will work would be just to `select count(*) from users`. But if some other test touches users table beforehand, it won't work. There is no uuid to latch onto here.
That could run on developer machines but maybe it runs only on a CI server and developers run only unit tests.
As for assertions, it’s not that hard to think of a better way to check if you made an insertion or not into the db without writing “assert user_count() == 0”
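One possible reading of "a better way", sketched with Python's built-in `sqlite3`: assert on the change in the count, or on the specific row, rather than on an absolute count. The `users` schema and the `create_user` / `user_count` helpers are assumptions for the example, not anything from the article.

```python
import sqlite3


def user_count(conn):
    """Return the number of rows in the (assumed) users table."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def create_user(conn, name):
    """Insert one user; stands in for the code under test."""
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Simulate a row left behind by some other test.
create_user(conn, "leftover")

# Assert on the delta instead of an absolute count.
before = user_count(conn)
create_user(conn, "alice")
after = user_count(conn)
assert after - before == 1

# Or assert on the specific row that should now exist.
row = conn.execute(
    "SELECT name FROM users WHERE name = ?", ("alice",)
).fetchone()
assert row == ("alice",)
```

This does not help when the thing under test is itself the "count all users" endpoint, or when other tests write concurrently to the same table; for those cases, per-test transactions are probably still needed.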
Six of one, half a dozen of the other.
At the point where you have a phoenix project in dev, you're already exposing an http endpoint, so the infra to not have to do a full on "attach to the VM and do RPCs" is nice, and you just pull tidewave in as a single dependency, instead of downloading a bunch of scripts, etc.
I've been tweaking my skills to avoid nested cases, better use of with/do to control flow, good contexts, etc.
What does your workflow look like?
We don't 100% AI it but this very much matches our experience, especially the bits about defensiveness.
Going to do some testing this week to see if a better agents file can't improve some of the author's testing struggles.
im doing some heavy duty shit, almost everything is routed through a custom CQRS-style events table [0] before rollup into the db tables (the events are sequentially hashed for lab notebook integrity). editing is done through a custom implementation of quill js's delta OT [1]. 100% of my tests are async.
I've never once run into the ecto issues mentioned.
I haven't had issues with genservers (but i have none* in my project).
claude knows oban really well. Honestly I was always afraid to use oban until claude just suggesting "let's use oban" gave me the courage. I'll be sending Parker and Shannon a first check when the startup's check comes in.
article is absolutely spot on on everything else. I think at this point what I've built in a month-ish would have taken me years to build out by myself.
biggest annoyance is the over-defensiveness mentioned, and that Claude keeps trying to use Jason instead of JSON. Also, Claude has some bad habits around aliases that it does even though it's pretty explicitly mentioned in CLAUDE.md, other annoying things like doing `case functioncall() do nil -> ... end` instead of `if var = functioncall() do else`
*none that are written, except liveviews, and one ETS table cache.
[0] CQRS library: https://hexdocs.pm/spector/Spector.html
[1] Quill impl: https://hexdocs.pm/otzel/Otzel.html
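For anyone wondering what "sequentially hashed" events can look like, here is a generic hash-chain sketch in Python. This is an assumption about the general technique, not the API or storage format of the Spector library linked above; `GENESIS`, `event_hash`, `append_event`, and `verify` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first event


def event_hash(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log, payload):
    """Append an event whose hash covers the previous event's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev,
                "hash": event_hash(prev, payload)})


def verify(log):
    """Recompute the chain; any edited or reordered event breaks it."""
    prev = GENESIS
    for event in log:
        if event["prev"] != prev:
            return False
        if event["hash"] != event_hash(prev, event["payload"]):
            return False
        prev = event["hash"]
    return True


log = []
append_event(log, {"type": "note_created", "id": 1})
append_event(log, {"type": "note_edited", "id": 1, "text": "pH 7.2"})
assert verify(log)

# Tampering with an earlier event is detected on verification.
log[0]["payload"]["id"] = 2
assert not verify(log)
```

Because each hash depends on the one before it, rewriting history means recomputing every later hash, which is what makes this useful for notebook-style integrity.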
The new generation of code assistants is great. But when I dogmatically try to let only the AI work on a project, it usually fails and shoots itself in its proverbial foot.
If this is indeed 100% vibe coded, then there is some magic I would love to learn!
Overall my process is, define a broad spec, including architecture. Heavy usage of standard libraries and frameworks is very helpful, also typed languages. Create skills according to your needs, and use MCP to give CC a feedback mechanism, playwright is a must for web development.
After the environment and initial seed are in place in the form of a clear spec, it's a process of iteration via conversation. My sessions tend to go "Let's implement X, plan it", CC offers a few routes, I pick what makes most sense, or on occasion I need to explain the route I want to take. After the feature is implemented we go into a cleanup phase: we check if anything might be getting out of hand, recheck security stuff, and create tests. Repeat. Pick small battles instead of huge features. I'm doing quite a lot of hand-holding at the moment, saying a lot of "no", but the process is on another level from what I was doing before, and the speed at which I can get features out is insane.
Then on average your velocity is little better than if you just did it all by hand.
What I'd really like to see though is experiments on whether you can few shot prompt an AI to in-context-learn a new language with any level of success.
It's certainly helpful, but has a tendency to go for very non idiomatic patterns (like using exceptions for control flow).
Plus, it has issues which I assume are the effect of reinforcement learning - it struggles with letting things crash and tends to silence things that should never fail silently.
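A small illustration of the "silencing" habit, in Python rather than Elixir so it stays in the same language as the snippet earlier in the thread. Both functions are made-up examples, not code from the article.

```python
import json


def parse_config_silently(text):
    """Over-defensive style: swallow the error and return a default.

    The caller can no longer tell a broken config from an empty one.
    """
    try:
        return json.loads(text)
    except Exception:
        return {}


def parse_config(text):
    """Let-it-crash style: invalid input raises, so whoever is
    responsible (caller, supervisor, test) sees the failure."""
    return json.loads(text)


bad = "{not json"

# The silent version hides the problem entirely.
assert parse_config_silently(bad) == {}

# The plain version fails loudly and says what went wrong.
try:
    parse_config(bad)
except json.JSONDecodeError as exc:
    print(f"failed loudly: {exc}")
```

In Elixir the equivalent would presumably be a pattern match that raises and a supervisor that restarts the process, instead of a catch-all rescue.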
It tends to always write Java even if it's Elixir. Usage rules help: https://hexdocs.pm/usage_rules/readme.html
The SOTA models do a great job for all of them, but if I had to rank the capabilities for each language it would look like this:
JavaScript, Julia > Elixir > Python > C++
That's just a sample size of one, but I suspect that for all but the most esoteric programming languages there is more than enough code in the training data.
They could've been sorted with precise context injection of claude.md files and/or dedicated subagents, no?
My experience using Claude suggests you should spend a good amount of time scaffolding its instructions in documents it can follow and refer to if you don't want it to end in the same loops over and over.
Author hasn't written on whether this was tried.
- Silently closes the tab, and makes a remark to avoid given software at any cost.
- https://github.com/agoodway/.claude/blob/main/skills/elixir-...
- https://github.com/agoodway/.claude/blob/main/agents/elixir-...
- https://github.com/agoodway/.claude/blob/main/agents/elixir-...
Getting pretty good results so far.
An ERP is practically an OS.
It now has
- pluggable modules with a core system
- Users/Roles/ACLs/etc.
- an event system (IE so we can roll up Sales Order journal entries into the G/L)
- G/L, SO, AR, AP
- rollback/retries on transactions
i haven't written a line of code
What if it doesn't? What if LLMs just stay at mostly the same level of usefulness they are at now, but the costs continue to rise as subsidization wears off?
Is it still worth it? Maybe, but not worth abandoning having actual knowledge of what you’re doing.
Anyone can sell the future.