Open for one new engagement — 2026 · Independent · Remote

Data2Dialog

An applied AI lab.

We design, build, and evaluate intelligent systems — the kind that have to hold up on Tuesday, not just in a demo. Four research areas, a handful of clients, one bar: we only take work where the AI question is the main question.

Four places we go deep.

These are the topics we pursue whether or not a client is paying us to. They shape the work we take on and the things we refuse to ship.

I

Agent reliability

Tool drift, long-horizon state, handoff protocols, sandboxing — the instrumentation that makes silent failure visible. Most agent systems break quietly. We build the plumbing that catches it.

II

Evaluation infrastructure

Production-grade evals sampled from real traffic, scored on the outcome you would lose sleep over, cheap enough to run on every PR. Benchmarks tell models apart; we help you tell versions of your own system apart.

III

Interfaces & trust

Copilots, inline assistants, inspectable agents. We obsess over what happens when the model is wrong — how fast users notice, how cheaply they correct, and what the product learns from the correction.

IV

Advisory & architecture

Model selection, retrieval design, eval audits, buy-vs-build calls. We tell you what we would do and why, with the reasoning visible enough that you can disagree with a specific step rather than only with the answer.


Three kinds of engagement.

Pick the shape that matches where you are. We can move between them as the work evolves.


Things we’re building.

The lab also ships its own software. Small, opinionated tools that scratch our own itches first. Some are live; some are still finding their shape.

Shipping soon

HiringSignal

Real hiring managers, not recruiters. Ranks LinkedIn hiring posts by who’s actually doing the hiring — founders, EMs, CTOs — and drops the rest. Vertex AI Search behind a rule-based scorer.

Open source

Altassian

An open-source Confluence alternative. A wiki platform for teams to create, organize, and collaborate on documentation — without the enterprise tax. Django + SvelteKit, self-hostable.

In development

Project Pager

A phone-first, cross-agent surface for AI coding agents. One inbox for every Claude Code, Codex, and Cursor session you have running — with rich rendering of diffs, tool calls, and approvals.

Concept

AgentPass

1Password for AI agents. A credential broker that lets agents log in to the services they need, without ever giving them your passwords. Designed around the principle that agents should hold capabilities, not secrets.
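
To make "capabilities, not secrets" concrete, here is a minimal sketch of the idea. It is not AgentPass code: the `Broker` class, its method names, and the in-memory vault are all hypothetical, and the "service call" is a stand-in string rather than a real login. The point is only that the agent holds a short-lived, service-scoped token while the broker alone touches the password.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Hypothetical credential broker: keeps real secrets, issues scoped tokens."""
    _vault: dict = field(default_factory=dict)   # service name -> real secret
    _grants: dict = field(default_factory=dict)  # token -> (service, expiry time)

    def store(self, service: str, secret: str) -> None:
        # Called by the human owner, never by the agent.
        self._vault[service] = secret

    def grant(self, service: str, ttl_seconds: float = 60.0) -> str:
        # Issue a random token that is valid for one service and a limited time.
        if service not in self._vault:
            raise KeyError(service)
        token = secrets.token_urlsafe(16)
        self._grants[token] = (service, time.monotonic() + ttl_seconds)
        return token

    def act(self, token: str, service: str, action: str) -> str:
        # The agent presents its token; the broker checks scope and expiry.
        granted = self._grants.get(token)
        if granted is None:
            raise PermissionError("unknown capability")
        granted_service, expiry = granted
        if granted_service != service:
            raise PermissionError("capability not valid for this service")
        if time.monotonic() > expiry:
            del self._grants[token]
            raise PermissionError("capability expired")
        secret = self._vault[service]
        # Stand-in for an authenticated request; the secret is used here
        # and never returned to the caller.
        return f"{action} on {service}: ok (authenticated={bool(secret)})"


broker = Broker()
broker.store("mail", "s3cret-password")       # owner side
capability = broker.grant("mail")             # what the agent receives
result = broker.act(capability, "mail", "list_inbox")
```

The agent only ever sees `capability` and `result`; revoking access means deleting a grant, not rotating a password.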


Working on something interesting?
We should talk.

Start a conversation