Dev Systems
Show HN: Collabmem – a memory system for long-term collaboration with AI
Hello HN! I built collabmem, a simple memory system for long-term collaboration between humans and AI assistants. It's easy to install; just ask Claude Code: Install the long-term collaboration memory system by cloning https://github.com/visionscaper/collabmem to a temporary location and following the instructions in it.
To collaborate with AI over weeks, months, or even years, there needs to be a shared conceptual understanding of:
- History (episodic memory): wh
Show HN: 500k+ events/sec transformations for ClickHouse ingestion
Hi HN! We are Ashish and Armend, founders of GlassFlow. Over the last year, we worked with teams running high-throughput pipelines into self-hosted ClickHouse, mostly for observability and real-time analytics. A question that came up repeatedly was:
What happens when throughput grows? Usually, things work fine at 10k events/sec, but we started seeing backpressure and errors at >100k. When the throughput per pipeline stops scaling, adding more CPU/memory doesn’t help because often part
Mythos, Glasswing, and the hardware disclosure problem nobody is discussing
Coverage of Anthropic's Claude Mythos Preview and Project Glasswing has focused almost entirely on software vulnerabilities. That is where the demos are and where controlled release maps cleanly onto existing disclosure practice. I have not seen anyone engage with the next obvious question: what happens when a Mythos-class model is given detailed hardware architecture documentation and asked to do a security-oriented review? My intuition is the hardware case is meaningfully worse, for reaso
Show HN: OS Megakernel that matches M5 Max Tok/w at 2x the Throughput on RTX 3090
Hey there, we fused all 24 layers of Qwen3.5-0.8B (a hybrid DeltaNet + Attention model) into a single CUDA kernel launch and made it open-source for everyone to try. On an RTX 3090 power-limited to 220W:
- 411 tok/s vs 229 tok/s on M5 Max (1.8x)
- 1.87 tok/J, beating M5 Max efficiency
- 1.55x faster decode than llama.cpp on the same GPU
- 3.4x faster prefill
The RTX 3090 launched in 2020. Everyone calls it power-hungry. It isn't; the software is.
The conventional wisdom NVID
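As a quick sanity check, the headline ratios can be recomputed from the figures quoted in the bullets. This is only a sketch: it assumes the efficiency number is decode throughput divided by the 220 W power limit, which the excerpt does not state, and the function names are made up for illustration.

```python
# Figures quoted in the post.
RTX_3090_TOK_PER_S = 411.0
M5_MAX_TOK_PER_S = 229.0
RTX_3090_POWER_LIMIT_W = 220.0


def speedup(fast: float, slow: float) -> float:
    """Throughput ratio between two systems."""
    return fast / slow


def tokens_per_joule(tok_per_s: float, watts: float) -> float:
    """Energy efficiency: 1 W = 1 J/s, so tok/s divided by W gives tok/J.

    Assumption: the card draws its full power limit while decoding.
    """
    return tok_per_s / watts


throughput_ratio = speedup(RTX_3090_TOK_PER_S, M5_MAX_TOK_PER_S)
efficiency = tokens_per_joule(RTX_3090_TOK_PER_S, RTX_3090_POWER_LIMIT_W)

print(f"throughput ratio: {throughput_ratio:.2f}x")  # 1.79x, i.e. the quoted 1.8x
print(f"efficiency: {efficiency:.2f} tok/J")         # 1.87 tok/J
```

Under that assumption the quoted 1.8x and 1.87 tok/J are consistent with each other; the "2x" in the title is a rounder figure than the bullets support.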
Show HN: Build queryable packs for AI agents from videos, podcasts, and files
Hi, This started from a pretty personal use case. There was this very technical person I follow who would go live on YouTube from time to time. He has a ton of experience, and would casually drop really good insights about software architecture, engineering tradeoffs, and just general "you only learn this after years" kind of stuff. He also posts shorter clips, but I wanted something else: I wanted that knowledge to be always there, queryable whenever I needed it. At the same time, I was
Show HN: Druids – Build your own software factory
Hi HN! Druids (https://github.com/fulcrumresearch/druids) is an open-source library for structuring and running multi-agent coding workflows. Druids makes it easy to do this by abstracting away all the VM infrastructure, agent provisioning, and communication. You can watch our demo video here (https://www.youtube.com/watch?v=EVJqW-tvSy4) to see what it looks like.
At a high level:
- Users can write Python programs that define what roles the agents take on and how
Advancing to Agentic AI with Azure NetApp Files VS Code Extension v1.2.0
Abstract: The Azure NetApp Files VS Code Extension v1.2.0 introduces a major leap toward agentic, AI‑informed cloud operations with the debut of agentic volume scanning. Moving beyond tr
Escaping the Fork: How Meta Modernized WebRTC Across 50+ Use Cases
At Meta, WebRTC powers real-time audio and video across various platforms. But forking a large open-source project like WebRTC within our monorepo presents unique challenges – over time, an internal fork can drift behind upstream, cutting itself off from community upgrades. We’re sharing how we escaped this “forking trap” – from building a dual-stack architecture that enabled safe A/B testing across 50+ use cases, to the workflows that now keep us continuously upgraded wit
Show HN: PromptJuggler – A dev env and runner for prompts, workflows, agents
Backstory: At work I had to build an AI pipeline to run millions of prompts. At first I just put the prompts into string constants and integrated directly with the API, chaining one run onto the output of another, but it quickly became a maintenance nightmare. Iterating on prompts, testing them over datasets, and experimenting with different chaining did not fit into the regular SDLC, and running them at our scale was quite difficult, as most of the time is spent waiting for the API response while holding o
Show HN: Splice CAD – Wiring and cable assembly CAD with an agentic assist
Still working on Splice CAD. The two most significant updates focus on handling design complexity and automating with agentic workflows. Previous Show HN here: https://news.ycombinator.com/item?id=44978140
1. Projects:
This workflow is more amenable to system-level modeling. It overlaps somewhat with the original workflow but adds improvements such as:
- Complex Routing: Model topologies with branchpoints.
- Mating Relationships: Represent physical mating (e.g., a receptacle into a
Show HN: jmux – tmux-based development environment for humans and coding agents
I've been a tmux user for years. When I started running 5-10 Claude Code sessions in parallel, I tried the tools that are out there: Conductor, cmux, the GUI orchestrators. None of them felt right. They either wanted me to leave tmux entirely for a 100MB+ Electron app with its own editor and Git workflow, or they were thin wrappers that didn't solve the actual problem: I need to parallelize my entire development environment (agents, editors, servers, logs) and keep track of all of it. S
Show HN: Secure SDLC Agents for Claude and Cursor (MCP)
Hey HN, I have been using Claude Code and Cursor lately and, as we all know, they write code incredibly fast, but I have noticed a few times that they can introduce the same security flaws. For example, if you ask the LLM to build a file upload feature, you will get working code in minutes, but it will almost always miss magic-byte validation or leave you vulnerable to SVG XSS. The LLM optimizes for code that compiles, not code that is secure. To fix this for my own workflow, I made a set of 8 security-foc
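For readers unfamiliar with the term, magic-byte validation means checking a file's leading bytes against known format signatures instead of trusting its extension or declared content type. The sketch below is a generic illustration of that idea, not the author's agents or their output; the function names and the allow-list are assumptions chosen for the example.

```python
from typing import Optional

# Well-known file signatures ("magic bytes") for a few binary formats.
MAGIC_SIGNATURES = {
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "image/jpeg": (b"\xff\xd8\xff",),
    "image/gif": (b"GIF87a", b"GIF89a"),
    "application/pdf": (b"%PDF-",),
}


def sniff_type(data: bytes) -> Optional[str]:
    """Return the MIME type whose signature matches the leading bytes, else None."""
    for mime, signatures in MAGIC_SIGNATURES.items():
        if any(data.startswith(sig) for sig in signatures):
            return mime
    return None


def is_upload_allowed(data: bytes, claimed_type: str) -> bool:
    """Accept an upload only if its content matches an allow-listed type.

    SVG is text-based XML that can carry scripts, so it has no entry in
    the allow-list and is always rejected by this check.
    """
    detected = sniff_type(data)
    return detected is not None and detected == claimed_type
```

A PNG payload declared as `image/png` passes, while an SVG (whatever type it claims) or a GIF declared as PNG is rejected. A real upload path would usually layer further checks on top, such as size limits and re-encoding images.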
ACID vs BASE Explained | Why Distributed Systems Break Under Load (System Design Deep Dive)
ACID vs BASE — Why Money Disappears in Distributed Systems An e-commerce system crashed mid-transaction. Customers ...
Designing Reliable Health Check Endpoints for IIS Behind Azure Application Gateway
Why Health Probes Matter in Azure Application Gateway
Azure Application Gateway relies entirely on health probes to determine whether backend instances should receive traffic.
If a probe:
- Receives a non‑200 response
- Times out
- Gets redirected
- Requires authentication
…the backend is marked Unhealthy, and traffic is stopped, resulting in user-facing errors. A healthy IIS application does not automatically mean a healthy Application Gateway backend.
Failure Flow: How a Misconfigured Health Probe L
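The failure conditions listed in this excerpt can be modeled as one small decision function. This is a sketch of the rule as the excerpt states it, not Application Gateway's actual implementation: `ProbeResult` and `backend_is_healthy` are hypothetical names, and real probe matching depends on the gateway's probe configuration, which is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Outcome of one probe request (hypothetical structure for illustration)."""
    status_code: Optional[int]  # None when no response arrived
    timed_out: bool = False


def backend_is_healthy(result: ProbeResult) -> bool:
    """Healthy only for a timely, direct 200 response.

    Redirects (3xx) and auth challenges (401/403) are non-200 responses,
    so they fail the same check as any other error status.
    """
    if result.timed_out or result.status_code is None:
        return False
    return result.status_code == 200
```

The practical consequence is that a health endpoint should answer anonymously and directly: an application that redirects the probe path to HTTPS or to a login page looks unhealthy under this rule even while it serves users correctly.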
Build a multi-tenant configuration system with tagged storage patterns
In modern microservices architectures, configuration management remains one of the most challenging operational concerns. Two gaps emerge as organizations scale: handling tenant metadata that changes faster than cache TTL allows, and scaling the metadata service itself without creating a performance bottleneck. Traditional caching strategies force an uncomfortable trade-off: either accept stale tenant context (risking incorrect data isolation or feature flags), or implement aggressive cache inva
Trust But Canary: Configuration Safety at Scale
As AI increases developer speed and productivity, it also increases the need for safeguards. On this episode of the Meta Tech Podcast, Pascal Hartig sits down with Ishwari and Joe from Meta’s Configurations team to discuss how Meta makes config rollouts safe at scale. Listen in to learn about canarying and progressive rollouts, the health checks and monitoring signals used to catch regressions early, and how incident reviews focus on improving systems rather than blaming people. They also tal
Ably Python SDK v3: realtime for Python, built for AI
Python dominates AI development. It's where teams build their agents, orchestration layers, and the backend systems that turn LLM calls into products people actually use. Over the past year, those systems have matured rapidly. What used to live in notebooks and prototypes is now running in production, serving real users with real expectations around reliability and performance. That maturity brings infrastructure requirements. Tokens need to stream in order. Sessions need to survive refreshes, re
Laid Off from Oracle (OCI). Looking for Software Roles (USA)
10+ years of experience working on distributed backend systems (Java). Founding engineer at an early-stage cybersecurity startup. Worked on a tier 1 service in Oracle Cloud Infrastructure (OCI) that handled ~295 million requests/operations. Scaled services for a Series B startup.
Show HN: Composer – AI architect / MCP for software architecture diagrams
Hi everyone! I built Composer, a tool that turns your ideas into architecture diagrams. You can also use MCP to turn your EXISTING codebase into a visual diagram! It connects to all possible tools using MCP (Claude Code, Codex, OpenCode, etc.). The goal was to make system design easier and to be able to draw out what I wanted to make before I started, or to explain it to others. It's currently live at usecomposer.com and free! I’d love feedback on whether it feels useful for real projects and where
Unlock efficient model deployment: Simplified Inference Operator setup on Amazon SageMaker HyperPod
Amazon SageMaker HyperPod offers an end-to-end experience supporting the full lifecycle of AI development—from interactive experimentation and training to inference and post-training workflows. The SageMaker HyperPod Inference Operator is a Kubernetes controller that manages the deployment and lifecycle of models on HyperPod clusters, offering flexible deployment interfaces (kubectl, Python SDK, SageMaker Studio UI, or HyperPod CLI), advanced autoscaling with dynamic resource allocation, and com