Dev Systems
Show HN: TLA+ Process Studio
Disclaimer: This was made with LLMs. I made this tool to help understand large business processes that can be modelled as a single state machine. The core loop is to walk stakeholders through discussing each step, adding comments, and iterating with an LLM of their choice to generate the TLA+ syntax on the left. Users can click through the green state nodes to see how things work visually. You can see some sample state machines in the dropdown in the top left. The power would come
Ask HN: What's your go-to queue system?
Hello, I am building a new product and need a robust queue system, but I'm not sure which product to choose. I've worked at Amazon before, and AWS tools are usually the default go-to, but in my own time, for small projects, I've used things like https://github.com/hibiken/asynq for basic tasks. My concern with the above is that it's still in "early" development; from their README:
```
Status: The library relatively stable and is currently undergoin
```
A 30 Year OG Application Developer Available
https://www.youtube.com/watch?v=DACtpW9Q-hc That link is the #1 Architectural Interior Design software used by the top firms in the world. What you are looking at is the result of pure architectural discipline. To handle massive global budgets, complex subcontractor workflows, and real-time synchronization with massive enterprise CAD suites, I engineered a closed-loop, self-aware data object model. The data objects carry their own application logic and database schemas. It is a zer
Show HN: eBook to audiobook narration with realistic AI voices
For a while I've wanted to try out the new AI voices for long-form narration, but everything I found required a subscription that didn't justify my limited usage. I came across the open Kokoro model [0] and the voices are very good -- good enough to listen to for hours without the fatigue I got from legacy, robotic TTS voices. The model is 82m parameters and designed to run fast, but I still struggled to get reasonable times from CPU inference on my 12-core laptop. I thought a cloud-ba
Stop vs disconnect - why canceling AI streaming is harder than it looks
You add a stop button to your AI chat app: a customer support agent, a coding assistant, a research tool the user can steer mid-task. A user clicks it mid-response. The frontend stops rendering. Then you check your backend logs and realize the underlying generation is still running, and you’re still paying for every token. This is not a bug. The Vercel AI SDK docs document it explicitly: in a resumable stream setup, calling stop() only closes the current HTTP connection and should not cancel the
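The distinction the excerpt draws can be illustrated with a minimal, language-agnostic sketch. It is written in Python with asyncio purely for illustration and is not the Vercel AI SDK API; `generate`, `run`, and the `stop` event are hypothetical names. It assumes the backend generation loop only halts when it receives an explicit cancel signal, so a client that merely disconnects leaves it running to completion.

```python
import asyncio


async def generate(n_tokens: int, produced: list, stop: asyncio.Event) -> None:
    """Hypothetical stand-in for a backend generation loop."""
    for i in range(n_tokens):
        if stop.is_set():           # explicit cancel signal from the client
            return
        produced.append(f"tok{i}")  # every token produced here is billed
        await asyncio.sleep(0)      # yield control, as a real stream would


async def run(client_action: str) -> int:
    produced: list = []
    stop = asyncio.Event()
    task = asyncio.create_task(generate(10, produced, stop))
    for _ in range(3):              # let a few tokens stream out first
        await asyncio.sleep(0)
    if client_action == "stop":
        stop.set()                  # the cancel reaches the generator
    # "disconnect": the client goes away, but nothing tells the generator
    await task
    return len(produced)


def demo() -> tuple:
    return asyncio.run(run("disconnect")), asyncio.run(run("stop"))
```

In this sketch a disconnect still produces all 10 tokens, while a stop that is propagated to the generator ends the loop early.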
Agentic apps that go beyond chat
You are planning a trip with an AI assistant on your laptop. You are chatting with the agent, and as you progress it is dropping pins on a map, building a day-by-day itinerary, adding up a budget, and streaming its reasoning as it goes. The state of your interactive session is a combination of the chat history, the synthetic UI constructed by the agent during that process, and structured state, the itinerary, arising from the decisions you each make. Building such an app has challenges beyond get
How to become an AI infrastructure engineer?
Hi, I currently work on a GenAI platform for one of the largest local industrial companies. My daily work mostly involves building inference infrastructure on top of a 48x H200 GPU, Kubernetes and vLLM. Hence, I'd say it's 80% SRE and 20% software engineering when it comes to building request routing and internal control planes. Although I have a background in backend engineering rather than ML research or low-level GPU programming, I am trying to understand what I need to learn to becom
Ask HN: Switching from backend development to graphics programming
I love computers. I wrote my first program in Borland C++ when I was 11. By chance, at 15 I managed to get a job where I built some HTML pages and later did some PHP programming. Making websites wasn't as fun as making games in C++. Overall, PHP didn't seem as fun as C++. I made all of my lab projects at the university in C++ with Qt, wxWidgets, or bare WinAPI. In fact, I improved the university's internal testing system at our Math/Physics faculty. I got a privilege to do la
Show HN: Supaqueue – Node.js background job queue (no Redis needed)
Hi everyone, I have been using BullMQ for most of my background-job work, but lately I have been working on some smaller-scale apps where a full-blown Redis setup with a separate worker process would have been overkill. That is why I built a lightweight, in-memory Node.js background job queue. It comes with a Bull/BullMQ-style API, concurrency control, schedulers, job retention and much more. It has zero dependencies and is fully typesafe. Use this when you need a simple, performa
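For readers unfamiliar with the idea, an in-process job queue with concurrency control can be sketched in a few lines. The sketch below is in Python with asyncio, not Node.js, and it is not Supaqueue's API; `InMemoryQueue`, `add`, `drain`, and `peak` are hypothetical names chosen for illustration. It assumes jobs are coroutines and uses a semaphore to cap how many run at once.

```python
import asyncio


class InMemoryQueue:
    """Hypothetical in-process job queue with a concurrency cap."""

    def __init__(self, concurrency: int) -> None:
        self._sem = asyncio.Semaphore(concurrency)
        self._tasks: list = []
        self.running = 0
        self.peak = 0  # highest number of jobs observed running at once

    def add(self, job, *args) -> None:
        # Schedule immediately; the semaphore decides when the job starts.
        self._tasks.append(asyncio.create_task(self._run(job, *args)))

    async def _run(self, job, *args):
        async with self._sem:
            self.running += 1
            self.peak = max(self.peak, self.running)
            try:
                return await job(*args)
            finally:
                self.running -= 1

    async def drain(self) -> list:
        # Wait for every queued job; results come back in submission order.
        return await asyncio.gather(*self._tasks)


async def main():
    queue = InMemoryQueue(concurrency=2)

    async def double(x: int) -> int:
        await asyncio.sleep(0.01)  # simulate some work
        return x * 2

    for i in range(5):
        queue.add(double, i)
    results = await queue.drain()
    return results, queue.peak
```

Everything lives in the memory of one process, so queued jobs are lost if that process exits; that trade-off is what removes the need for Redis and a separate worker.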
Introducing AI Transport v0.3.0
Last week we introduced AI Transport v0.2.0 and made one idea the centre of the design: the session is the channel. Every input, output, and lifecycle event for an AI conversation is just a message published to an Ably channel, which is what makes a session durable, multi-party, and resumable. In v0.3.0, we added first-class support for presence and LiveObjects to AI sessions, allowing you and your agent to see who's online and update shared state in real time. We made the codec interface declara
How Meta Engineered Ultra-Narrow Batteries for AI Glasses
Smart glasses like the Ray-Ban Meta and Oakley Meta Vanguards need to pack enough energy to power features like cameras, speakers, AI workloads, and even a display. But it all has to fit into the glasses’ temple arms. So how do you place a battery with enough power to run a pair of smart glasses all day into a form factor narrower than an adult’s pinky finger? You have to rethink how batteries are made. In episode 86 of the Meta Tech Podcast, host Pascal Hartig sat down with Karthik and Myuran, t
Nobody trusted our internal dashboards. Now they live in code
How we used AI to fix a data trust problem, and built a governed reporting system the whole company can contribute to. We audited our skills library a few months ago and found twelve dashboards hiding in it. Not dashboards. Skills that built dashboards. Someone needed a view of some data, asked Claude to put it together, got a long HTML page out of it, and then wrapped the whole thing in a skill so others could run it again. Twelve times over, by different people, for different questions. This is w
See Anthropic Orchestrate the Narrative
tl;dr FOSS is the biggest threat to the largest new economic sector, so everything that economic sector does should be viewed through the lens of trying to kill it. I occasionally see articles and sentiments along the lines of, Anthropic is or is not, "scare mongering to boost the perceived cultural impact of their AI/ML tools; a sort of underhanded advertisement". If your job is to defend Anthropic online, it's a good angle to fight from. It's a viral subject with no prac
Toward More Controllable AI Video Editing: An Early Research Exploration at Netflix
By Zhuoning Yuan, Ta-Ying Cheng, Benjamin Klein, Bahareh Azarnoush. Introduction: At Netflix, we build technology to help storytellers bring their creative visions to life and to help members discover the stories they love. To connect stories with diverse audiences around the world, we produce promotional assets, including trailers, teasers, and social short‑form videos, that build on and elevate the original footage. Through close collaboration with the teams crafting these assets, we identified a r
How Netflix Simplified Batch Compute with Kueue
By Alvin Bao, Alex Petrov, Jennifer Lai, Aidan Sherr, and Samartha Chandrashekar. As a part of the journey to transition Netflix’s compute infrastructure to be more Kubernetes-native, we have leaned into incorporating components from the Kubernetes ecosystem into our container platform Titus. One example of this is our use of Kueue, a cloud-native job queueing system for batch workloads, which has largely replaced the custom queuing and scheduling logic in our homegrown managed batch solution Comp
Secure multi-tenant RAG with Amazon Bedrock and Verified Permissions
Large organizations building internal generative AI applications face a recurring challenge: controlling which teams or departments can access which documents, without duplicating infrastructure for each group. Within a single tenant, employees from a specific department should only access material assigned to that department. However, executives, with a wider span of control, will require access to material across multiple departments. Retrieval Augmented Generation (RAG) is one of several comp
Modernizing financial analytics with Amazon SageMaker Unified Studio
Avanse Financial Services is one of India’s leading education loan providers. Their Data Engineering Team had built a data lake on AWS using Amazon Simple Storage Service (Amazon S3), Amazon Athena, and AWS Glue for data ingestion and processing. However, their analytics and reporting layer ran on an external analytics application that wasn’t integrated with AWS. Data had to be copied from Amazon S3 into this external application before analysts could run any report, its license consumed a signi
Architecting AI-powered resilience framework on AWS
When your production system goes down, you often discover the hard way that your resilience testing missed critical dependencies. Building an AI-powered resilience framework on AWS helps you find those weaknesses before your customers do. Your systems don’t fail because your infrastructure isn’t resilient. They fail because resilience is assumed, not proven. Every deployment introduces new dependencies, every configuration change creates untested paths, and every gap between design intent and ru
Adopting AV1 for Real-Time Communication (RTC) at Scale
Adopting AV1 for real-time communication at Meta has been a multi-year effort spanning codec selection, device eligibility, rate control, and error resilience. We’re sharing the technical and operational challenges while deploying AV1 and expanding coverage, and how we addressed them for real-time communication. We’re presenting several technologies for improving AV1 call quality, including rate control and error resilience. The AV1 video codec, first standardized by AOMedia in 2018, has rapidly ev
Vercel AI SDK in production: when DefaultChatTransport needs a session layer
You've built an AI chat app on the Vercel AI SDK. It works in development. The model responds, the stream comes through, and the UI updates cleanly. Then you ship to production, and the transport layer starts showing its edges. Most of these failures are quiet: things that work in demos and break in ways that are hard to pin down until you know where to look. They share a common cause: DefaultChatTransport is built for HTTP, and HTTP has structural properties that some production requirements exc