Dev Systems
Show HN: SiMM – Distributed KV Cache for the Long-Context and Agent Era
We built SiMM because LLM context lengths are growing much faster than GPU memory. With long Chain-of-Thought reasoning and multi-turn agents, prompts are getting much longer. According to OpenRouter’s State of AI 2025, average context length has grown about 4× in the past year.
This creates two problems in inference systems:
• Slow TTFT: long contexts make prefill expensive
• High GPU memory cost: KV cache quickly exhausts HBM
Instead of recomputing long prompts or keeping all KV cache in GPU mem
Ask HN: Why do the majority of vibecoded projects fail?
This was recently asked on Reddit in the context of a hypothetical "Slack killer" WS chat app that is built as a proof of concept in 20 minutes and works on localhost, with the "failure" attributed to AI being unable to build scalable distributed systems. The discussion was not very deep and was mostly jokes.
Here's what I think: AI can build scalable, distributed systems too; you just have to know what you're doing.
To be slightly more substantial: there's a lot of "implicit" deci
Show HN: Replacing $50k manual forensic audits with a deterministic .py engine
I’m a software architect, and I recently built Exit Protocol (https://exitprotocols.com), an automated forensic accounting engine for high-conflict litigation.
Problem:
If you get divorced and need to prove that a specific $250k in a heavily commingled joint bank account is your "separate property" (e.g., from a pre-marital startup exit), the burden of proof is strictly mathematical. Historically, this meant paying a forensic CPA $500/hour to dump years of blurry bank PDF
Becoming a Forest Civilisation
A forest is not one tree. It is many different things growing in the same place, competing, cooperating, dying, regrowing. No single species runs it. It holds together because the diversity itself is the structure.
Human civilisations tend toward monocultures. One way of thinking crowds out the others - not because anyone chooses it, but because successful ideas spread. This has always happened. It is happening faster now.
A forest civilisation resists this deliberately. Not by fighting successful
Show HN: AgentLog – a lightweight event bus for AI agents using JSONL logs
I’ve been experimenting with infrastructure for multi-agent systems.
I built a small project called AgentLog.
The core idea is very simple: topics are just append-only JSONL files.
Agents publish events over HTTP and subscribe to streams using SSE.
The system is intentionally single-node and minimal for now.
Future ideas I’m exploring:
- replayable agent workflows
- tracing reasoning across agents
- visualizing event timelines
- distributed/federated agent logs
Curious if others building agent sy
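The "topics are append-only JSONL files" idea can be illustrated with a minimal sketch. This is not AgentLog's code: the function names `publish` and `read_from`, the `<topic>.jsonl` file layout, and the line-index offset are assumptions made for illustration, and the HTTP and SSE layers are left out.

```python
import json
import os

def publish(topic_dir, topic, event):
    """Append one event as a single JSON line to the topic's file (hypothetical layout)."""
    path = os.path.join(topic_dir, f"{topic}.jsonl")
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")

def read_from(topic_dir, topic, offset=0):
    """Return all events whose line index is >= offset; a missing topic yields []."""
    path = os.path.join(topic_dir, f"{topic}.jsonl")
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for i, line in enumerate(f) if i >= offset]
```

Because the file is only ever appended to, a subscriber that remembers its line offset can replay a topic from any point, which is what makes the "replayable agent workflows" idea plausible on top of this structure.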
A lock-free state management engine for DualHeart Financial Association
We just finished an internal overhaul of the data synchronization layer at DualHeart Financial Association. The goal was to replace our traditional mutex-heavy architecture with a lock-free, event-driven state machine.
Key technical wins for DualHeart Financial Association:
- Throughput: Achieved a 3x increase in message processing by eliminating lock contention.
- Architecture: Implemented a minimalist, high-end schema that maps directly to memory-mapped files for near-instant persistence.
- Low Latency:
Tell HN: AI tools are making me lose interest in CS fundamentals
With powerful AI coding assistants, I sometimes feel less motivated to study deep computer science topics like distributed systems and algorithms. AI can generate solutions quickly, which makes the effort of learning the fundamentals feel less urgent.
For those who have been in the industry longer, why do you think it’s still important to stay strong in CS fundamentals?
Show HN: I built a 38K-line Rust CLI using 3 AI models as my engineering team
Hi HN,
I was getting incredibly frustrated with the current state of AI agent skill managers (like skills.sh). They rely on heavy Node.js runtimes just to manage a few skills, symlinks break constantly across environments, and there's no real determinism.
I wanted a tool that treats agent skills like Terraform treats infrastructure: Config-as-Code, purely deterministic, and zero-dependency.
So I built eden-skills, a single ~10MB Rust binary based on Tokio that uses a skills.toml to lock every
Ask HN: I built an AI-native codebase framework–could you evaluate it?
I built this open-source project and would really appreciate technical feedback from people here:
https://github.com/xodn348/ai-native
The goal is to make AI-assisted development more reliable through clearer project structure, explicit contracts, and a verification workflow.
I made this because applying these patterns from scratch in every project was repetitive and hard to maintain, so I wanted a reusable framework.
If you have time, I’d love your evaluation on:
1. What is useful
Show HN: GladeKit – AI agent for Unity game development
We’ve been building GladeKit, an AI agent for Unity game development that works directly inside the editor. Instead of just generating code, it reads the actual project state (scene hierarchy, scripts, selected objects, component values, installed packages, and compiler errors) and then takes actions inside Unity.
What it does today:
- 150+ native Unity actions (create prefabs, wire components, configure Animator Controllers, build UI, set up physics, lighting, and NavMeshes)
- Reads real editor c
Workflow to build context for coding agents
Here’s the workflow my team and I have found works best with coding agents:
- Plan: Write a plan in markdown. Edit this. Iterate. The plan isn’t a throwaway note. It tracks status as work progresses (draft -> in-development -> in-review -> completed), versions with git alongside the code, and serves as the single source of truth. When the agent later implements, it reads this document. When we review the work, we compare against it.
- Diagram: Have the agent enrich the plan with architect
Show HN: Mozzie – a local desktop orchestrator for Codex, Claude Code and Gemini
Mozzie started as a tool I built for my own workflow.
I like working on multiple things at once, but most development tools split the workflow across different places: tickets live in issue trackers, execution happens in terminals, and context gets lost between them. I wanted my work items and their context right next to the place where the work actually happens.
Mozzie is a local desktop workspace where each work item can spawn its own terminal or coding agent. The idea is that tasks become the p
Show HN: I built an AI comic generator from scratch using only natural language
I spent ~4 hours building AIComicBuilder [1], a full-stack AI comic/drama generation platform,
without writing a single line of code manually. This is a writeup of my "vibe coding" workflow
using Claude Code. The app lets you: input a script → AI generates screenplay → character analysis → storyboard
generation (with first/last frame images) → video generation. Supports OpenAI-compatible,
Gemini, and Bytedance Seedance APIs for text/image/video models, conf
Show HN: From Claude Code to OpenCode – My Evolution in Vibe AI Engineering
I’ve spent the last few months iterating on my "Vibe Coding" workflow, moving away from closed-box solutions toward a more transparent, multi-provider stack. I documented the transition from Codex and Claude Code to an open-source setup using OpenCode and opencode serve.
Cursor -> Claude Code -> OpenCode -> OpenCode + OpenCode-Manager -> Codex + Tmux + Tailscale -> OpenCode Serve + Tailscale.
Key takeaways from the journey:
The
Toolpack SDK, an Open Source TypeScript SDK for Building AI-Powered Applications
Just released Toolpack SDK, a completely open-source unified TypeScript SDK for AI development.
If you've worked with multiple LLM providers, you know the pain: each has different APIs, different tool formats, different quirks.
Toolpack SDK gives you a single interface across OpenAI, Anthropic, Gemini, and Ollama.
It comes with 77 built-in tools for file ops, git, databases, web scraping, code analysis, and shell commands. You can also create and integrate your own custom tools.
The workflow en
Show HN: Mesa – A collaborative canvas IDE built for agent-first development
Hi HN - I'm Ryan, a product designer who codes, and I built Mesa. Current IDEs feel wrong for the type of development being done now - the focus is still on files.
Mesa puts the focus on the full workflow: your agent, terminal, browser, and files all live as equal nodes on a canvas with full multiplayer support. (Think Figma, but for code.)
I was tired of the overhead of switching windows, tabs, and terminals across multiple projects. Inspired by TouchDesigner and Factorio, I wanted something mo
Patch Me If You Can: AI Codemods for Secure-by-Default Android Apps
Even seemingly simple engineering tasks, like updating an API, can become monumental undertakings when you’re dealing with millions of lines of code and thousands of engineers, especially if the changes are security-related. Nowhere is this more apparent than in mobile security, where a single class of vulnerability can be replicated across hundreds of call sites scattered throughout a sprawling, multi-app codebase serving billions of users.
Meta’s Product Security team has develope
Resume tokens and last-event IDs for LLM streaming: How they work and what they cost to build
When an AI response reaches token 150 and the connection drops, most implementations have one answer: start over. The user re-prompts, you pay for the same tokens twice, and the experience breaks.
Resume tokens and last-event IDs are the mechanism that prevents this. They make streams addressable: every message gets an identifier, clients track their position, and reconnections pick up from exactly where they left off. The concept is straightforward. The production scope is not: storage design,
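The core mechanism described in this excerpt (every message gets an identifier, the client tracks its position, and a reconnect resumes after the last ID seen) can be sketched in a few lines. This is an in-memory illustration only: the class name `ResumableStream` and its methods are hypothetical, and the storage, expiry, and transport concerns the article goes on to mention are deliberately omitted.

```python
class ResumableStream:
    """In-memory buffer of stream events addressed by increasing integer IDs (a sketch)."""

    def __init__(self):
        # Each entry is (event_id, data); IDs start at 1 and increase by 1.
        self._events = []

    def append(self, data):
        """Store one chunk and return the ID the client should remember."""
        event_id = len(self._events) + 1
        self._events.append((event_id, data))
        return event_id

    def replay_after(self, last_event_id=0):
        """Return every event with an ID greater than last_event_id.

        Because IDs are 1-based and contiguous, the event with ID k sits at
        index k - 1, so everything after ID k starts at index k.
        """
        return self._events[last_event_id:]
```

A client that received events 1 and 2 before the connection dropped would reconnect with `replay_after(2)` and receive only what it missed. With Server-Sent Events, the position is normally carried in the `Last-Event-ID` request header that the browser sends on reconnect.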
Cost per outcome: measuring the real economics of AI workflows
Hi HN, I’m the technical founder of botanu (https://www.botanu.ai).
I started building this after repeatedly running into the same problem on AI teams: we could see total LLM spend, but we couldn’t answer a simple question:
“What did one successful outcome actually cost?”
In real systems, a single business event often requires multiple attempts before it succeeds: retries, fallbacks, tool calls, escalations, async workers, etc. Most tooling measures individual model calls or sometimes a
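One way to make the "cost of one successful outcome" question concrete is to group every attempt by the business event it belongs to and divide total spend by the number of events that eventually succeeded. The sketch below assumes exactly that definition; it is not botanu's method, and the record fields `event_id`, `cost_usd`, and `ok` are hypothetical.

```python
from collections import defaultdict

def cost_per_successful_outcome(attempts):
    """Total spend across all attempts divided by the number of events that succeeded.

    Each attempt is a dict with hypothetical keys: event_id, cost_usd, ok.
    Returns None when no event succeeded, since the ratio is undefined.
    """
    cost_by_event = defaultdict(float)
    succeeded = set()
    for attempt in attempts:
        # Retries and fallbacks all count toward the event they served.
        cost_by_event[attempt["event_id"]] += attempt["cost_usd"]
        if attempt["ok"]:
            succeeded.add(attempt["event_id"])
    if not succeeded:
        return None
    return sum(cost_by_event.values()) / len(succeeded)
```

Under this definition, spend on events that never succeed is still charged to the successful ones, so a workflow with cheap individual calls but many failed attempts shows up as expensive per outcome.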
Show HN: Pacto – OCI-distributed contracts for cloud-native services
Author here.
I work as a platform engineer and kept running into the same problem: services are described in fragments across different tools.
APIs live in OpenAPI specs.
Deployment assumptions end up in Helm values.
Runtime details are hidden in Kubernetes manifests.
Configuration lives in environment variables.
Dependencies are often documented in READMEs or tribal knowledge.
There is no single machine-readable contract that describes how a service actually behaves operationally.
So I started buil