Hacker News Clone

embedding-shapeMay 24, 2026, 2:24 PM

I'm not sure you need a "DeepSeek native coding agent" to take advantage of DeepSeek's cache. Yesterday, as the Codex quota usage issue still wasn't solved for me, I wrote a tiny little bridge so I could use DeepSeek V4 Pro via Codex, and it seems nearly everything I did was cached as far as I can tell: https://i.imgur.com/7eKn6wN.png (2026-05-23 Input (Cache hit): 39,123,200 tokens, Input (Cache miss): 1,692,286 tokens). The bridge is doing nothing special, it just massages the DeepSeek API shape into what Codex expects, with nothing particular about caching at all.

Besides being even better at the caching, I'm not sure what benefits you'd get compared to just firing up OpenCode with the DeepSeek API yourself, it'll similarly do caching for sure and also "talks directly to api.deepseek.com" if that matters, and you'll get a much more mature harness.

3ulerMay 24, 2026, 3:14 PM

Opencode has really bad cache stability issues that they seem uninterested in fixing at the moment.

datheryMay 24, 2026, 4:44 PM

The OpenCode devs talk about this on Twitter a lot, e.g. https://xcancel.com/thdxr/status/2048268697790300343

> tool call pruning breaks cache and people will tell you this is horrible and expensive

> except i looked at some anthropic data and real user behavior ends up with better cache hits and 30% less spend

> even this is needs to be analyzed further, it's just not simple

> for openai data it's inverted! cache hit ratio is actually better [sic: I think he meant worse based on the screenshot] with tool call pruning turned on

> but the net $ saved is only 5%

> kimi is a funny one - it has better cache hits with pruning on...but is also more expensive!

There was also another thread recently where he discussed that pruning improves user experience (models are smarter with less context) but I can't find it.

This can also be disabled in the config: https://opencode.ai/docs/config/#compaction

soerxpsoMay 24, 2026, 7:36 PM

My understanding of caching with most models/providers is that a prefix substring of the context has to be reused for a cache hit, but not necessarily the whole entire context window. So if you prune tool calls from the history, you're going to get one cache miss on the newly-pruned history, and then you're going to be getting cache hits on every subsequent turn, with a lower number of input tokens. If you prune subsequent tool calls after that, you would still get a cache hit for the already-pruned portion of the context, just not the full context.
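
A minimal sketch of that prefix behaviour, assuming the cache reuses whatever leading part of the request matches the previous one. Each list item stands in for a block of tokens, and `cached_prefix_len` and the message names are hypothetical, not any provider's API:

```python
from itertools import takewhile


def cached_prefix_len(previous: list[str], current: list[str]) -> int:
    """Count the leading items two requests share (the reusable prefix)."""
    same = takewhile(lambda pair: pair[0] == pair[1], zip(previous, current))
    return sum(1 for _ in same)


# Turn 1: system prompt, user message, a large tool result, assistant reply.
turn1 = ["sys", "user1", "tool_result_A", "assistant1"]
# Turn 2 only appends, so all of turn 1 is a prefix of turn 2.
turn2 = turn1 + ["user2", "assistant2"]
# Turn 3 prunes the old tool result: reuse stops where the history was edited.
turn3 = ["sys", "user1", "assistant1", "user2", "assistant2", "user3"]
# Turn 4 appends to the pruned history: the pruned prefix is reusable again.
turn4 = turn3 + ["assistant3", "user4"]

print(cached_prefix_len(turn1, turn2))  # 4
print(cached_prefix_len(turn2, turn3))  # 2
print(cached_prefix_len(turn3, turn4))  # 6
```

The single drop at turn 3 is the one-off miss described above; every later append-only turn reuses the shorter, pruned prefix.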

__natty__May 24, 2026, 8:05 PM

So it makes sense to first send the stable prompt, then reasoning and file contents, then a tool-call summary, with the actual tool calls at the very end?

leemooreMay 24, 2026, 11:26 PM

The way you do this (and the way opencode does it) is you do most of your pruning in more recent history. Last I looked at opencode, they start pruning tool call results after 2 full agentic turns. So you probably don't get quite as good hits on cache for the most recent 1-5% of your turns, but after that everything else caches fine and those tool calls that likely aren't relevant to your session anymore are gone.

awoimbeeMay 25, 2026, 7:43 AM

You didn't quote the interesting part:

> our implementation is it only prunes calls from > 3 user messages ago, if context is > 40K, and only if there's at least 20K tokens to be removed
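
The quoted rule can be sketched roughly like this. It is only an illustration of the three thresholds as stated in the tweet, not OpenCode's actual code; `Message`, `prunable_tool_indices` and the parameter names are made up for the example:

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str    # "user", "assistant", or "tool"
    tokens: int  # token count of this message


def prunable_tool_indices(
    history: list[Message],
    min_user_age: int = 3,       # prune only calls from > 3 user messages ago
    min_context: int = 40_000,   # ... and only if the context is > 40K tokens
    min_savings: int = 20_000,   # ... and only if >= 20K tokens would be removed
) -> list[int]:
    """Return indices of tool messages the quoted rule would prune."""
    if sum(m.tokens for m in history) <= min_context:
        return []
    users_seen = 0  # user messages that come after the current position
    candidates = []
    for i in range(len(history) - 1, -1, -1):
        msg = history[i]
        if msg.role == "user":
            users_seen += 1
        elif msg.role == "tool" and users_seen > min_user_age:
            candidates.append(i)
    if sum(history[i].tokens for i in candidates) < min_savings:
        return []
    return sorted(candidates)


history = [
    Message("user", 500), Message("tool", 30_000), Message("assistant", 1_000),
    Message("user", 500), Message("tool", 5_000), Message("assistant", 1_000),
    Message("user", 500), Message("assistant", 1_000),
    Message("user", 500), Message("assistant", 1_000),
    Message("user", 500), Message("tool", 8_000),
]
print(prunable_tool_indices(history))  # [1]
```

Only the oldest tool result qualifies here: the other two are within the last three user messages, so the recent part of the history stays untouched.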

Seems reasonable to me and explains why I can have long sessions (way longer than with zed agents) while still hitting cache. Opencode is just missing per-provider TTL.

arthurcolleMay 25, 2026, 9:03 AM

I found that keeping current context utilization at 18% of total context length was best for minimizing spend, across all models with 400k context length or more

hirako2000May 24, 2026, 7:10 PM

They are. Empirical evidence on my side. Because attention is sparse across the context. It's not truly treating a million tokens the way it treats a fraction of that count. For performance.

huqedatoMay 24, 2026, 4:25 PM

I can't confirm this. Having utilized Opencode for a large project over the past 10 months, with multiple models and agents, we've never run into such 'cache stability issues'.

estebarbMay 24, 2026, 11:25 PM

I'm not sure that is really the case, or relevant in practice. I have been using OpenCode with DeepSeek lately (regular coding). For instance, today I got 120 million input tokens hitting cache, vs just 2.59 million missing cache.

ctxcMay 25, 2026, 3:19 AM

Reads like a LOT of tokens to me. What does your usage/workflow look like? I'm v curious because although I do use Claude Code, my token counts aren't nearly as high

I want to know if I'm missing something cool!

mordaeMay 25, 2026, 8:23 AM

Not OP, but I routinely load 150k tokens into context. A full sub-package to work on, select other files in the monorepo, e.g. front-end visualization and back-end data loader. Then work some 150k tokens, then start again.

At the end, cache hit rate is like 99.5% if Novita is not having issues.

For official DeepSeek API, 99.9% or something.

Custom harness that never compacts or otherwise doctors the history.

ctxcMay 25, 2026, 12:18 PM

Those numbers make sense to me...120 million input tokens is like 120 sessions of hitting the full context limit, which seems like a lot to me though

metalspotMay 24, 2026, 6:40 PM

I am getting 98.6% cache hit ratio on deepseek-v4-flash with opencode

bobkbMay 24, 2026, 7:17 PM

That’s impressive!

On sheer performance, is it comparable to Opus?

stavrosMay 24, 2026, 10:41 PM

Here are my stats (from DeepSeek directly, with a script I wrote). The prices are what equivalent Sonnet usage would have cost, the actual amount I paid was $10. On performance, DeepSeek V4 Pro is comparable to Sonnet for me.

    ./cost.py amount-2026-5.csv 0.3 3.75 15
    input_cache_hit_tokens: 472,971,520 tokens -> $141.8915
    input_cache_miss_tokens: 13,299,013 tokens -> $49.8713
    output_tokens: 3,334,962 tokens -> $50.0244
    cache hit rate: 97.27% (472,971,520/486,270,533)
    cache miss rate: 2.73% (13,299,013/486,270,533)
    total: $241.7872

All of this usage was with an OpenCode subagent exclusively.
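
The arithmetic behind that listing can be reproduced with a short sketch. This is not the `cost.py` script itself (its CSV handling isn't shown); it assumes the three arguments are USD per million tokens for cache-hit input, cache-miss input and output, which matches the printed figures:

```python
PER_MILLION = 1_000_000


def equivalent_cost(tokens: dict[str, int], usd_per_million: dict[str, float]) -> dict[str, float]:
    """Cost per token category, given prices in USD per million tokens."""
    return {kind: count * usd_per_million[kind] / PER_MILLION for kind, count in tokens.items()}


# Token counts copied from the listing above.
tokens = {
    "input_cache_hit_tokens": 472_971_520,
    "input_cache_miss_tokens": 13_299_013,
    "output_tokens": 3_334_962,
}
# Assumed meaning of the arguments "0.3 3.75 15".
prices = {
    "input_cache_hit_tokens": 0.3,
    "input_cache_miss_tokens": 3.75,
    "output_tokens": 15.0,
}

costs = equivalent_cost(tokens, prices)
total = sum(costs.values())
input_total = tokens["input_cache_hit_tokens"] + tokens["input_cache_miss_tokens"]
hit_rate = tokens["input_cache_hit_tokens"] / input_total

for kind, cost in costs.items():
    print(f"{kind}: {tokens[kind]:,} tokens -> ${cost:.4f}")
print(f"cache hit rate: {hit_rate:.2%}")
print(f"total: ${total:.4f}")
```

Note that the hit rate here is hits over hits plus misses on the input side only; output tokens are priced but not part of the ratio.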

c0rruptbytesMay 25, 2026, 2:41 AM

[flagged]

upcoming-sesameMay 24, 2026, 7:44 PM

out of curiosity, how do you measure cache hit rate in opencode?

malikNFMay 24, 2026, 8:40 PM

opencode stats

luguMay 24, 2026, 10:17 PM

So the calculation is:

Total input tokens = input + cache read + cache write

Cache hit rate = cache read / total input tokens

That is 71% in my very limited use of opencode.

hackernows_testMay 24, 2026, 10:53 PM

The first

magicalhippoMay 25, 2026, 5:41 AM

What I noticed when using OpenCode with llama.cpp, was that the default host RAM prompt cache size in llama.cpp was way too small for say 128k Qwen3.6 27B.

The default is just 8GB and a full 128k context for the dense model can take most of that. So then comes an agent and causes eviction and subsequent cache miss.

Bumped the cache size (--cache-ram, or -cram for short, IIRC) up to 48GB and had much better results.

verdvermMay 25, 2026, 5:07 PM

There are some that are specific to certain models like qwen/gemma

I switched to vLLM and those went away. Need to look at my opencode config and adjust some others based on things I see here

BombthecatMay 24, 2026, 3:49 PM

[flagged]

tontintonMay 24, 2026, 6:10 PM

Yep, exactly my thoughts. I went and looked at the code for the deepseek provider in my coding agent, and basically all of what the author wrote there is implemented... http://github.com/tontinton/maki for the curious

himata4113May 24, 2026, 2:29 PM

this appears to be native to the terminal, as in, there's no special application that runs or wraps an agent inside a tui. So basically instead of commands you type plain english?

embedding-shapeMay 24, 2026, 2:31 PM

> this appears to be native to the terminal, as in, there's no special application that runs or wraps an agent inside a tui

Same with codex? codex-rs at least, is a TUI as well, it does run an "app-server" in the background, that the TUI actually interacts with, but that's just an implementation detail. Also makes it easy to hook in your own programs to fire off codex "headless" sessions even without the TUI.

agrippanuxMay 24, 2026, 6:23 PM

This website seems to have been generated by Codex - I asked Codex to create an HTML overview of a feature for my team and it made an overly produced monstrosity - complete with the same large stat boxes that were for the most part devoid of meaningful information - using the same font, colors, layout, hero section, etc. It was also terrible on mobile just like this is.

In the end I had Claude produce a one-page html file that was 95% of the way there and it took minor editing to clearly explain the intent of the feature.

port11May 24, 2026, 8:08 PM

A lot of LLM-driven design now looks like this. I don’t understand how people don’t find the pairings with a heavily italicised serif ugly. You also can’t read much of the page on mobile, because the code example keeps shifting the content around.

Now, that is overly critical, I’m sure their heart is in the right place. But a simpler website would do :)

gizajobMay 25, 2026, 7:07 AM

Yeah such amazing tech used to produce a tediously unreadable website with great flair.

krm01May 24, 2026, 8:26 PM

It’s sad to see companies not spending a bit more on design. Sure, AI will help you get something decent out fast. But there’s a threshold where design becomes an indicator of trust. Especially for B2B software tailored to large corps. Good design, character, adds directly to the bottom line.

schaeferMay 25, 2026, 12:22 PM

> It’s sad to see companies…

The article is about an open source agent harness, Reasonix, that is built to leverage the DeepSeek native api.

There’s no company here. No design budget. These people are graciously sharing a project they made in their free time.

darkmatriarchMay 25, 2026, 4:15 PM

You're right, but I find as a solo engineer it's still important to check the frontends I create on mobile

port11May 25, 2026, 9:06 PM

I agree. I didn’t mean to be too critical. But if they’d made something simpler, I think it would save them tokens and end up more likely to convince their target audience of developers.

(The series of ‘motherfucking websites’ comes to mind, they were all very readable and simple, even if satire.)

easygenesMay 25, 2026, 6:40 AM

Claude Opus 4.7 defaults to exactly this design language for a lot of "just make me a rich html presentation page" requests without further specification.

locknitpickerMay 24, 2026, 7:49 PM

> In the end I had Claude produce a one-page html file that was 95% of the way there and it took minor editing to clearly explain the intent of the feature.

That doesn't say much about any model though. For starters, any software engineer can tell you that leaving out features can drastically simplify any project.

ritonlajoieMay 25, 2026, 12:50 AM

strange, I got the same design with claude design, same fonts, same title designs with the strange character etc...

jbellisMay 24, 2026, 8:10 PM

As someone who has been writing harnesses for a year: the people at opencode etc aren't stupid, when they decide to break the prefix cache [usually partially] it's always because they've tested it and it gives better results overall.

If you think that dsv4 behaves differently enough from the aggregate of other models, submit a PR with a patch to special case that to your harness of choice with evidence. Just blindly assuming "append only all the time because cache" is a waste of everyone's time.

anon373839May 25, 2026, 4:51 AM

Are there any learning resources you'd recommend on writing harnesses? I'm interested in doing a non-coding one, but not really sure where to start.

jbellisMay 25, 2026, 8:05 AM

Generically, I would say, just start building it and ask your favorite coding agent for advice when you get stuck. This is the first technology that can teach you how to use it! (But do ask a model with a recent knowledge cutoff, i.e. not gemini.)

sams99May 25, 2026, 9:58 PM

My agent wrote a pile of very interesting articles at wasnotwas.com. I have been a bit quiet there for a bit, but it covers lots of areas that are very interesting to harness builders (albeit less interesting to the general public)

schaeferMay 25, 2026, 12:43 PM

> As someone who has been writing harnesses for a year…

Your agent harness, brokk, looks great. I’m going to try it this morning.

phrotomaMay 25, 2026, 10:33 AM

Is "harness" in this context ~= "agent"?

abustamamMay 25, 2026, 12:14 PM

I've understood harness to be the software that runs the agent (OpenCode, Pi, Claude Code)

furyofantaresMay 25, 2026, 7:21 PM

I think agent = harness + model.

d-faultMay 25, 2026, 6:58 AM

[flagged]

prakashsunilMay 25, 2026, 2:03 AM

[flagged]

skeledrewMay 24, 2026, 2:30 PM

Not a fan of that page. The animated typing and resulting continuous resize of the example keeps moving the content beneath it down and up. Such bad UX.

embedding-shapeMay 24, 2026, 2:54 PM

Agents or no agents, people still need to test their websites on different resolutions or at least window width, but seems this is becoming a lost art.

mirekrusinMay 24, 2026, 3:33 PM

Yeah, doesn’t look designed for people who want to read it beyond animated typing animation.

m4rkuskkMay 24, 2026, 4:42 PM

Claude design AI slop.

stirayMay 24, 2026, 7:58 PM

If only the author understood that some people want a single, self-contained binary that doesn't take up half the computer's memory, and would rather write it in Rust or Go.

unshavedyakMay 24, 2026, 3:35 PM

It's pretty funny, i'm a $200/m Claude subscriber and i've had little need to use anything else. However the more Claude has been restricting my workflow (notably around the recent IDE/-p usage change) the more i've been wanting to go elsewhere.

I'm concerned since i really want SOTA reasoning, but DeepSeek still has me interested.

declan_robertsMay 24, 2026, 2:47 PM

I love the focus on cache hit efficiency. Hats off to the DeepSeek team for creating a great product that maximizes cost efficiency for the user.

ruckfoolMay 26, 2026, 3:04 PM

The prevalence of chinese training datasets has likely biased the model incorrectly. The model is still intelligent enough to flag this as a gap in its own abilities.

Here is an example of Reasonix resuming a feature in a project managed by Codex thus far.

> ● reasoning · 107 tok · 2 ¶ · 3.2s v4-flash ↳ The search engine is still completely broken - returning Chinese dictionary entries, Kick streamer profiles,… c. It's not just broken for ** content, it's broken for everything. The error from earlier said to switch … ⋯ 2 lines elided I need to be honest with the user that `web_search` is fundamentally non-functional regardless of what query I try. Let me clearly communicate this and suggest alternatives.

‹ reply 198 tok · 92 t/s v4-flash I'm sorry — I've now tried multiple queries across multiple attempts this session, and the web_search tool is fundamentally broken regardless of query phrasing. Every single query returns either Chinese dictionary entries, Kick streamers, YouTube homepages, or Swiss restaurants. Nothing even remotely related to the search terms.

The error message from the tool itself says:

  "switch engine with /search-engine bing|searxng|metaso|tavily|perplexity|exa"

This is a user-side setting — I can't change it with any tool I have. Could you run this in chat?

/search-engine bing

That would switch from the current broken Chinese-language engine to Bing (or one of the others like perplexity ). Once that's done, I'll batch-search all 322 centers efficiently — the web_search tool is the fastest way to do this since I can search multiple names in parallel.

mmaunderMay 24, 2026, 3:22 PM

Unusable thanks to the top animation pushing the rest of the site down repeatedly as you’re trying to read.

busymom0May 24, 2026, 8:26 PM

The layout of the entire page is horrendous on mobile too. Looks like a huge wide site where content is only in a tiny column on left side.

schaeferMay 24, 2026, 3:14 PM

Okay, I'm curious.

From the FAQ, I see:

>Can I point it at a self-hosted / private DeepSeek endpoint?

>Yes. Since 0.30 we accept non-standard key prefixes for self-hosted DeepSeek endpoints. Just point `baseUrl` at your internal address — the loop, cache strategy, and tool protocol are unchanged.

But my question is: If I use Reasonix to talk to a deepseek endpoint through openrouter, am I still getting the cache-hit benefits of this agent harness?

csunoserMay 24, 2026, 3:30 PM

Yes*. At least from my limited usage of deepseek-flash for a few billion tokens on openrouter, the cache-hit rate is >95%. And I simply used the claude code harness pointed at the openrouter anthropic compatible endpoint with no fluff.

port11May 24, 2026, 8:16 PM

Did you get proper tool use? Some CC-driven models seem to get a bit off when it comes to MCP usage. For example: I really struggled to get Kimi to use Serena, which I think ended up costing too many tokens.

schaeferMay 24, 2026, 3:40 PM

thank you!

thomasfromcdnjsMay 25, 2026, 12:38 AM

I would wonder that too, I'm only a novice openrouter user, but I do notice it reroutes my same-model requests to different providers.

Maybe users reporting otherwise are just looking at their client reports which wouldn't be able to tell the difference.

Lapel2742May 25, 2026, 3:38 AM

Look into Openrouter's provider routing.

quotemstrMay 24, 2026, 3:57 PM

> no reordering, no marker-based compaction

Is this really the behavior you want? Yes, doing tool-result clearing and such will blow your cache, but if you do it only occasionally, it's still likely a win. Yes, cache hits are good, but not so good that it's okay to be profligate with context to preserve those precious, precious KVs.

JSR_FDEDMay 25, 2026, 6:04 AM

Maybe the first problem this tool can tackle is creating a better web page? Content continually shifting, super annoying.

singiamtelMay 24, 2026, 3:39 PM

I would've liked benchmarks against other harnesses showing the caching performance

HavocMay 24, 2026, 9:30 PM

Just checked the stats on my opencode/DS usage...looks like 70%ish hit rate.

Pretty shaky datapoint though...don't use it as primary model

AlifatiskMay 24, 2026, 4:54 PM

Are there benchmarks or measurements that offer comparisons between different harnesses?

edg5000May 25, 2026, 4:29 AM

Side note: In DeepSeek API docs they mention that coding clients automatically are assigned the highest thinking effort, despite any settings. This is what I suspected when using OpenCode with V4; it keeps reasoning in very long cycles, this felt like a flaw in the model. May just be a weird API thing.

Overall I find their API design and docs so messy. It's a shame, since it's the main entrypoint to using their service.

danborn26May 24, 2026, 4:44 PM

High caching rates for coding agents can drastically reduce latency and API costs. I am curious to see how the caching strategy handles context invalidation across multiple files.

xcjsamMay 24, 2026, 4:48 PM

[flagged]

naaqqMay 25, 2026, 6:19 AM

I don't think it's helpful, you can already get a 99+% cache hit on claude code, just change the api settings to deepseek. I would like to use an agent built by deepseek itself using deepseek models. Deepseek should make their own agent based on their model, just like OpenAI and Anthropic.

m00dyMay 25, 2026, 6:22 AM

same here, using claude code on deepseekv4. just burnt 24.1M input hit and 170k cache miss.

trollbridgeMay 24, 2026, 10:36 PM

Well folks here we have it: DeepSeek’s brand is now strong enough people want to jump on their brand recognition.

ricardobeatMay 24, 2026, 4:31 PM

> The loop is append-only, engineered around DeepSeek's byte-stable prefix cache — long sessions hold 90%+ cache hit and input-token cost collapses to ~1/5. Terminal-first, leave it running.

AI marketing slop. This is how all models and coding harnesses work, isn't it?

The author claims (in another AI-written post):

> LangChain — along with every generic agent framework I checked — rebuilds the prompt every turn. Timestamps get injected. History gets reordered. Tool schemas re-serialize with different whitespace.

I haven't touched LangChain in a long, long time, but don't think any of the current harnesses, Claude Code, Pi, Crush, OpenCode etc do that except if you change configuration? Keeping the context stable for caching is a very basic principle and not a wild innovation.

This posing as DeepSeek-specific is also a mystery.
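
For the "tool schemas re-serialize with different whitespace" part of the quoted claim, a minimal sketch of how a harness can avoid it, assuming schemas are plain JSON-compatible dicts. `canonical_json` and the example schema are made up for illustration:

```python
import json


def canonical_json(obj) -> bytes:
    """Serialize with sorted keys and fixed separators so equal data gives equal bytes."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


# The same tool schema built twice with different key insertion order.
schema_a = {
    "name": "read_file",
    "parameters": {"type": "object", "properties": {"path": {"type": "string"}}},
}
schema_b = {
    "parameters": {"properties": {"path": {"type": "string"}}, "type": "object"},
    "name": "read_file",
}

# Default dumps follows insertion order, so the text (and the prompt prefix) differs.
print(json.dumps(schema_a) == json.dumps(schema_b))        # False
# Canonical form is byte-identical, which keeps a prefix cache valid.
print(canonical_json(schema_a) == canonical_json(schema_b))  # True
```

This is standard-library behaviour rather than anything harness-specific, which fits the point that stable serialization is a basic practice.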

mkrdMay 25, 2026, 7:35 AM

God, I wish there was a code harness I don’t have to install a JavaScript runtime for

polski-gMay 26, 2026, 9:15 AM

Crush. But not many plugins.

mark_l_watsonMay 24, 2026, 8:36 PM

I tried it and the text input area was black with a dark font. I checked the documentation, and asked DeepSeek v4, Claude, and Gemini for help with the fonts/style and nothing works except to run in a terminal with a dark theme. Crazy. None of the devs on the project use a light theme?

miavMay 24, 2026, 9:00 PM

I agree that this is an issue, but.. no, they probably don’t. Light themes are very rarely used.

jofzarMay 25, 2026, 1:44 AM

I understand why, but I didn't even think of light themed terminals till now...

pkulakMay 24, 2026, 4:23 PM

Doesn't Pi Agent do exactly this? Assuming "append only" means they do some kind of compaction as well.

storusMay 24, 2026, 5:42 PM

Can it instruct DeepSeek during an LLM call to start removing old tool calls from the context instead of waiting for the LLM call to finish if the context size approaches DeepSeek's dumb zone? Claude Code can't do that, /compact can only happen after the LLM call; it's often preferable to start cleaning up context during an LLM call, especially when tool calls are huge like reading markdown files; implementation-wise all that is needed is to start removing earliest <tool call start> ... <tool call end> and replacing them just with some log entry stating this tool call was already performed, then re-running KV cache prefill (so the "online" compaction would get 0.5s latency hit every time it's performed). That way one can read 1000 files in one LLM call.

canadiantimMay 24, 2026, 2:29 PM

So what's the best low-cost coding agent these days? Kimi 2.6? Qwen's latest closed model? Composer 2.5? DeepSeek?

hirako2000May 24, 2026, 2:32 PM

Good timing given the cost spike across other frontier models.

notjesMay 24, 2026, 2:53 PM

Good thing DS just made their discount permanent. https://x.com/deepseek_ai/status/2057854261699195173

imageticMay 24, 2026, 4:29 PM

https://shittycodingagent.ai

mi_lkMay 24, 2026, 5:47 PM

Not sure about the story but it would be funny if pi folks actually own this domain.

chuckadamsMay 24, 2026, 6:01 PM

They do. That's Pi's old name.

pehejeMay 24, 2026, 7:27 PM

having issues with truncated output from deepseek v4 pro through openrouter via pi-harness on ptyxis-terminal using ubuntu

trying reasonix with direct api..

pehejeMay 24, 2026, 7:28 PM

first impression: the tui flickers a lot, unpleasant. very laggy to write in.

chabesMay 24, 2026, 4:30 PM

Aka pi.dev

theanonymousoneMay 24, 2026, 2:39 PM

Isn't caching a server-side thing? How does the agent affect it, significantly at least?

embedding-shapeMay 24, 2026, 2:52 PM

Say you put the current time down to the second in the system prompt, which is the message that goes in front of the entire conversation, then basically nothing will be cached, every agent turn needs to ingest the entire session over and over. Contrast to not doing that, and the backend can leverage caching all the way up to the latest message, as nothing until then changed.
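
A small sketch of that effect, comparing prompts character by character. Real caches work on tokens and provider-specific block sizes, so this is only an approximation; `build_prompt`, `shared_prefix_chars` and the prompt text are invented for the example:

```python
import os


def build_prompt(system: str, messages: list[str]) -> str:
    """Join the system prompt and the conversation into one request string."""
    return "\n".join([system, *messages])


def shared_prefix_chars(a: str, b: str) -> int:
    """Length of the longest common leading substring of two requests."""
    return len(os.path.commonprefix([a, b]))


history = ["user: list the files", "assistant: src/, tests/"]
next_history = history + ["user: open src/"]

# Volatile system prompt: the timestamp changes between turns.
volatile_a = build_prompt("You are a coding agent. Now: 2026-05-24T14:52:01", history)
volatile_b = build_prompt("You are a coding agent. Now: 2026-05-24T14:52:09", next_history)

# Stable system prompt: the previous request is a strict prefix of the next one.
stable_a = build_prompt("You are a coding agent.", history)
stable_b = build_prompt("You are a coding agent.", next_history)

# With the timestamp, reuse ends inside the system prompt, before any history.
print(shared_prefix_chars(volatile_a, volatile_b) < len(volatile_a))  # True
# Without it, the whole previous request can be reused.
print(shared_prefix_chars(stable_a, stable_b) == len(stable_a))       # True
```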

esperentMay 24, 2026, 3:13 PM

Surely other agent CLIs are not dumb enough to invalidate cache on every turn over something so obvious?

chillfoxMay 24, 2026, 3:35 PM

I don't think any of the agents break caching on every turn, but they might do things like include the current list of files, or the available tools depending on plan/build mode... or lots of other things that break caching multiple times during a session.

brookstMay 24, 2026, 4:00 PM

Probably not that exactly, but there is a tradeoff between effectiveness of the prompt and cache hit rate. If putting the user’s datetime in the middle of the prompt scores higher on evals but worsens cache hits, versus at the end of the prompt where it’s cache friendly but may not be as effective, what do you do?

This is still art as much as science and the different harnesses take different approaches.

embedding-shapeMay 24, 2026, 3:32 PM

Obviously not, most agents properly keep previous messages unchanged, at least the major ones whose source I've been digging into. Also, everything would get so much slower that even developers creating their own agents would quickly notice how much slower theirs is if they fuck this up.

nawitusMay 25, 2026, 5:53 AM

That's not necessarily true, you can have multiple cache points, see e.g. https://platform.claude.com/docs/en/build-with-claude/prompt...

theanonymousoneMay 24, 2026, 4:20 PM

Yes, of course you can destroy it. But how far can you "improve", beyond decent "common sense" behaviour.

wg0May 24, 2026, 6:21 PM

Performance is horrible when you type but caching is magical.

Extremely pro-consumer tool. I have been hammering it hard with 97% cache utilization and barely $0.03 spent for me constantly exploring a codebase.

snqbMay 24, 2026, 7:12 PM

Deepseek API caches very efficiently itself. I use it heavily via pi agent, and a lot of times I get 99%+ caching for longer sessions.

Have you tried using Deepseek API via other agents? This project tbh looks like S-tier slop

wg0May 24, 2026, 8:25 PM

I have used it with OpenCode and was good enough.

sergiotapiaMay 24, 2026, 2:47 PM

What AI model did you use for the website design? This is the second one I see with the exact same font and color scheme. Just curious because Claude models lean towards purples for example. Thank you!

pcwelderMay 24, 2026, 3:26 PM

Opus 4.7 selects such palette and motifs by default. Might even be first iteration of claude design.

franga2000May 24, 2026, 2:55 PM

This design still screams Claude to me, but a newer version than what you're thinking of. At some point they added a markdown file that tells it to use obviously AI designs like lots of blue/purple and gradients. Since then, this is its new style.

FergusArgyllMay 24, 2026, 3:44 PM

Frontend design skill by Anthropic specifically says not to use purple. I'd be surprised if it still uses purple. Have you seen that recently?

sheepscreekMay 24, 2026, 3:07 PM

DeepSeek v4 perhaps?

hebetudeMay 24, 2026, 3:56 PM

Wow the UI looks exactly like what I vibe coded yesterday. What a coincidence

huqedatoMay 24, 2026, 4:26 PM

It's obvious why...

perseusaiMay 25, 2026, 2:46 PM

This is a nice companion to the token saving context app I made. Even has the same Claude Design site, which I think looks awesome! Even though something is cheap, the concepts that make using Deepseek more efficient can surely be applied elsewhere. Cool stuff!

yanhangyhyMay 25, 2026, 3:28 AM

In the open-source contributors section, when you see a lot of anime or cartoon avatars, you know most of the devs are Chinese.

nextaccounticMay 24, 2026, 6:15 PM

> Tool-call repair

> Tool arguments the model produces occasionally have JSON typos, unclosed quotes, or shape mismatches. Reasonix runs a schema-aware repair pass before dispatch so malformed args still execute.

So Deepseek API doesn't have a structured output option where you give a grammar and the model promises the output will follow this grammar?

Or it does, but it's buggy?
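
For a sense of what such a repair pass might involve, here is a rough sketch that only handles truncated JSON (an unclosed string or unclosed brackets). It is not Reasonix's implementation and is not schema-aware; `repair_json_args` is a hypothetical name:

```python
import json


def repair_json_args(raw: str) -> dict:
    """Best-effort repair: close an unterminated string, then close open brackets."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    closers = []        # closing brackets still owed, innermost last
    in_string = False
    escaped = False
    for ch in raw:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            closers.append("}" if ch == "{" else "]")
        elif ch in "}]" and closers:
            closers.pop()
    repaired = raw + ('"' if in_string else "") + "".join(reversed(closers))
    # Still raises json.JSONDecodeError if the damage is beyond this sketch.
    return json.loads(repaired)


print(repair_json_args('{"path": "src/main.py", "lines": [1, 20'))
print(repair_json_args('{"cmd": "ls -la'))
```

A schema-aware version would additionally check the parsed arguments against the tool's declared parameter types, which this sketch leaves out.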

jedisct1May 24, 2026, 9:56 PM

It's probably good, and the best for Deepseek models, but do we really need one harness per model?

danborn26May 25, 2026, 10:37 AM

The caching strategy here looks really solid for keeping API costs down. Curious how it handles state invalidation when the agent context gets too large though.

yaloginMay 24, 2026, 4:16 PM

Can someone give me an ELI5 version of what this is? It really sounds useful to Claude subscribers.

Is this improving the cache hit and hence overall efficiency of coding workflows?

Does it also let me host a local llm (deepseek)? What are model min requirements for this?

timcobbMay 24, 2026, 4:24 PM

You can also ask Claude and get an immediate answer, the power is yours

SalgatMay 24, 2026, 6:30 PM

Certainly you realize that these comments exist for more than a single person right? You expect potentially hundreds of viewers to each burn through AI tokens instead of just getting a direct and relevant answer here? This has the same vibe as the old forum posts where the only response was a "google it".

carterschonwaldMay 24, 2026, 6:59 PM

i cant find anything substantiated in the code that actually differentiates it from any other harness.

my fork of oh my pi that i have a lot of experiments in, is literally designed to only work well with models that have decent reasoning levels, like deep seek models. check it out!

https://github.com/cartazio/oh-punkin-pi/blob/main/scripts/b... — thats the install script for after clone

fair warning: tis my dog food test bed as i build even fancier stuff

m101May 24, 2026, 5:15 PM

For those of you that use deepseek v4 occasionally, what harness do you use it with? I’m only familiar with claude code and codex.

Any comments on what you can or cannot rely on it for relative to cc and codex would be appreciated too!

eikenberryMay 24, 2026, 7:00 PM

Maybe check out Goose. It is the standard agent harness being developed by The Linux Foundation under the AAIF. Under active development and the implementation seems to have a good leg up on the other popular agents.

https://github.com/aaif-goose/goose

https://goose-docs.ai/

nsonhaMay 24, 2026, 10:49 PM

I see their name mentioned everywhere along with Aider, presumably for being among the first agents, but I've never met anyone that actually uses them.

droidjjMay 24, 2026, 5:47 PM

Check out pi.dev. OpenCode is a nice batteries-included Claude Code replacement, but I’m in love with the extensibility of Pi.

chuckadamsMay 24, 2026, 6:10 PM

Any Pi extensions you'd specifically recommend? I'm just starting out with Pi, but I've had mixed results with extensions. I'm using Pi with gemma4 26b locally, so anything that's friendly to small local models would be appreciated. I think the only extension I'm using right now is pi-total-recall.

gck1May 24, 2026, 7:02 PM

I think pi wants you to write your own extensions, adapted to your needs.

I haven't had a need for any extensions though. Maybe subagents, but I solved that with tmux. For all the rest, I just use "skills".

nikolayMay 24, 2026, 10:33 PM

This is not an agent by DeepSeek, so the title is misleading.

tylerdurden91May 25, 2026, 7:01 AM

Given the number of supply chain attacks via npm, maybe the recommended approach to use should be pnpm instead of npx.

fouricMay 24, 2026, 4:34 PM

I don't think it's particularly effective to create a new coding agent when there's existing open-source agents (especially extremely extensible ones like Pi) that already optimize for cache hits, have far larger communities, and work for providers other than Deepseek.

I specifically use multiple different models and providers, so this wouldn't be useful for me.

And it contributes to the problem of each person vibe-coding their own, incompatible, half-baked tool in a space, instead of contributing to a small set of tools and expanding them.

It'd be better to just extend an existing tool.

cloudengineer94May 24, 2026, 10:48 PM

Quite interesting being Terminal based and the AI skills staying within a file of its own.

Will give a go and see how cache behaves

mmarcantMay 24, 2026, 6:22 PM

"byte-stable prefix cache" -- give us your codebase in a way that's even EASIER for us to train on.

hmokiguessMay 24, 2026, 4:33 PM

Click on the download page, it's hilarious. It has a lot of information about the "smart probe" on the download and it's a realtime probe you can rerun.

That's the pinnacle of AI slop over engineered garbage in my opinion. All of that information is noise.

singingtodayMay 24, 2026, 6:09 PM

That site does not render correctly on my android. Lots of text on the right breaking the reactive layout.

ElenaDaibunnyMay 25, 2026, 9:10 AM

The caching strategy is doing most of the heavy lifting here cost-wise.
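
For a rough sense of scale, here is a back-of-the-envelope sketch using the token counts embedding-shape posted upthread (39,123,200 cache-hit and 1,692,286 cache-miss input tokens), which works out to a hit ratio of roughly 96%. The two prices below are hypothetical placeholders chosen only to illustrate the arithmetic, not DeepSeek's actual rates, and the function names are made up for this example.

```python
# Token counts quoted upthread by embedding-shape (2026-05-23 usage screenshot).
CACHE_HIT_TOKENS = 39_123_200
CACHE_MISS_TOKENS = 1_692_286

# ASSUMPTION: placeholder prices in USD per million input tokens.
# These are NOT real DeepSeek rates; substitute the current price list.
PRICE_HIT_PER_M = 0.10
PRICE_MISS_PER_M = 1.00


def hit_ratio(hit: int, miss: int) -> float:
    """Fraction of input tokens that were served from the prefix cache."""
    total = hit + miss
    return hit / total if total else 0.0


def input_cost(hit: int, miss: int, price_hit: float, price_miss: float) -> float:
    """Input-token cost in USD, given per-million-token prices."""
    return (hit * price_hit + miss * price_miss) / 1_000_000


ratio = hit_ratio(CACHE_HIT_TOKENS, CACHE_MISS_TOKENS)
with_cache = input_cost(
    CACHE_HIT_TOKENS, CACHE_MISS_TOKENS, PRICE_HIT_PER_M, PRICE_MISS_PER_M
)
# Same traffic if every input token had been billed as a cache miss.
without_cache = input_cost(
    0, CACHE_HIT_TOKENS + CACHE_MISS_TOKENS, PRICE_HIT_PER_M, PRICE_MISS_PER_M
)

print(f"hit ratio:          {ratio:.2%}")
print(f"input cost (cache): ${with_cache:.2f}")
print(f"input cost (none):  ${without_cache:.2f}")
```

With any price list where cached input is much cheaper than uncached input, the hit ratio dominates the input bill, which is presumably why a stable prompt prefix matters so much here.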

arikrahmanMay 24, 2026, 7:11 PM

Saw nix suffixed and was excited a new dotfiles was about to hit the market.

tw1984May 25, 2026, 4:44 AM

deepseek is building an official coding harness, why would anyone waste time on such 3rd party toy when official one is coming probably in weeks?

am17anMay 24, 2026, 4:31 PM

This Claude front end skill is now soon to be slop.

auggieroseMay 24, 2026, 5:22 PM

Oh, I was wondering why all new websites look shitty in the same way.

aratahikaru5May 25, 2026, 2:40 PM

Not a maintainer, but I've fixed some of the really jarring issues on desktop (mobile needs a complete overhaul though). IMO it's not that bad, and it gets the job done.

Any feedback on how to make it less "shitty"? I feel like doing some vibe coding tonight.

ricardobeatMay 24, 2026, 4:34 PM

Already is. Every new website looks exactly the same.

adi_kurianMay 24, 2026, 7:19 PM

There is an uncanny valley effect to websites where FE is created in full via an AI.

These sites have the immediate scent of 'high design', with errors that no 'high designer' would dare make.

The italics give me nausea. Text promoted with orange fill is seemingly random. There is no thought behind the combination of art and copy. Random smattering of Title Case and Sentence case and lower case. A lack of commitment to a full stop Widowed H1s. H1s with random spaces .

At the same time, if I hammer CMD - to 25%, it looks fancy. Perhaps nobody gives a fuck.

That said, I'm excited to try this tool!

HfuffzehnMay 24, 2026, 4:50 PM

This is really tickling the conspiracy theorist part of my brain.

"Independent open-source project · not affiliated with DeepSeek" "Reasonix only targets DeepSeek because..." "Why DeepSeek only? Can I swap to Claude / GPT? It's a design choice, not a limitation"

The lady doth protest too much, methinks?

Nicely timed, shortly after the announcement that the rebate is being made permanent.

Could just be Chinese devs trying to help western devs with some software and a western facing marketing campaign to raise awareness. Could be DeepSeek astroturfing. Could be "someone" in China trying to get more access to western data.

Who knows?

andaiMay 24, 2026, 4:56 PM

But Claude made the website?

AlifatiskMay 24, 2026, 5:07 PM

What conclusion are you drawing from that?

andaiMay 24, 2026, 7:58 PM

If Deepseek can't even make a static site, why would I want to use it for anything else? (Not saying it can't, just that it's a weird choice to present your Deepseek-oriented product.)

AlifatiskMay 24, 2026, 9:19 PM

I see your point, but as we know, devs from Google and OpenAI regularly use Claude Code because of its edge in frontend. I think using another model to build your own thing is a pragmatic engineering decision, not a sign of failure.

sunaookamiMay 24, 2026, 7:36 PM

Another day, another vibeslopped "product" on the front page of hacker news with over 200 points. When will you guys learn?

ankitwarbheMay 24, 2026, 5:09 PM

Did you create it yourself?

AlifatiskMay 24, 2026, 5:10 PM

No.

WhereIsTheTruthMay 24, 2026, 6:10 PM

Y'all should not be writing js/ts/slop/npm based clis anymore

It's the agentic era, pick a better option

Just stop

fHrMay 24, 2026, 8:28 PM

yep codex opensource rust cli clears this night and day long

AlifatiskMay 24, 2026, 6:35 PM

What's that option?

treexsMay 24, 2026, 11:31 PM

codex generated sites are so easy to spot lmao

DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost
