Open for one new engagement — 2026 · Independent · Remote

Data2Dialog

An applied AI lab.

We design, build, and evaluate intelligent systems — the kind that have to hold up on Tuesday, not just in a demo. Four research areas, a handful of clients, one bar: we only take work where the AI question is the main question.

Start a conversation · See what we work on

01 / Research areas

Four places we go deep.

These are the topics we pursue whether or not a client is paying us to. They shape the work we take on and the things we refuse to ship.

Agent reliability

Tool drift, long-horizon state, handoff protocols, sandboxing — the instrumentation that makes silent failure visible. Most agent systems break quietly. We build the plumbing that catches it.

Evaluation infrastructure

Production-grade evals sampled from real traffic, scored on the outcome you would lose sleep over, cheap enough to run on every PR. Benchmarks discriminate between models; we help you discriminate between versions.

Interfaces & trust

Copilots, inline assistants, inspectable agents. We obsess over what happens when the model is wrong — how fast users notice, how cheaply they correct, and what the product learns from the correction.

Advisory & architecture

Model selection, retrieval design, eval audits, buy-vs-build calls. We tell you what we would do and why — with the reasoning visible enough that you can disagree with the step, not the answer.

See the full treatment

02 / What we take on

Three kinds of engagement.

Pick the shape that matches where you are. We can move between them as the work evolves.

01 · Spike

A focused answer.

Is this feature even possible? What would it cost? We take the sharpest AI question you have and come back with working code and a real recommendation.

~3–5 calendar weeks · Flat fee · Outcome: prototype + memo

02 · Build

Take it to production.

We embed alongside your engineers and build the AI layer of your product with them. Eval harness from week one, weekly demos, clean handoff. You own the code at the end; we stay long enough to watch it run.

~8–16 calendar weeks · Outcome: shipped feature

03 · Advisory

A second set of eyes.

Retained advisory for teams already building. Architecture reviews, eval design, hiring help, weekly calls. For when the internal answer is “we think so.”

Monthly · Outcome: sharper decisions

Read the full menu

03 / Products

Things we’re building.

The lab also ships its own software. Small, opinionated tools that scratch our own itches first. Some are live; some are still finding their shape.

Shipping soon

HiringSignal

Real hiring managers, not recruiters. Ranks LinkedIn hiring posts by who’s actually doing the hiring — founders, EMs, CTOs — and drops the rest. Vertex AI Search behind a rule-based scorer.
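
For the curious, a rule-based scorer of this kind can be sketched in a few lines. This is a minimal illustration, not HiringSignal's actual code: the keyword lists, the weights, and the `Post`, `score`, and `rank` names are all assumptions made up for the example, and the retrieval step (Vertex AI Search) is left out entirely.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical rules: the weights and keywords are illustrative assumptions.
HIRING_MANAGER_RULES = {
    "founder": 3,
    "cto": 3,
    "engineering manager": 2,
}
RECRUITER_KEYWORDS = ("recruiter", "talent acquisition", "sourcer")


@dataclass
class Post:
    author_title: str
    text: str


def _has_keyword(keyword: str, title: str) -> bool:
    # Whole-word match, so "cto" does not fire inside "director".
    return re.search(rf"\b{re.escape(keyword)}\b", title) is not None


def score(post: Post) -> int:
    """Return 0 for recruiters and unknown roles, else the best matching weight."""
    title = post.author_title.lower()
    if any(_has_keyword(k, title) for k in RECRUITER_KEYWORDS):
        return 0
    weights = [w for k, w in HIRING_MANAGER_RULES.items() if _has_keyword(k, title)]
    return max(weights, default=0)


def rank(posts: list[Post]) -> list[Post]:
    """Drop zero-score posts and return the rest, highest score first."""
    kept = [p for p in posts if score(p) > 0]
    return sorted(kept, key=score, reverse=True)
```

Matching on whole words rather than substrings is the one design choice worth noting: short titles like "CTO" otherwise match inside unrelated words.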

Open source

Altassian

An open-source Confluence alternative. A wiki platform for teams to create, organize, and collaborate on documentation — without the enterprise tax. Django + SvelteKit, self-hostable.

In development

Project Pager

A phone-first, cross-agent surface for AI coding agents. One inbox for every Claude Code, Codex, and Cursor session you have running — with rich rendering of diffs, tool calls, and approvals.

Concept

AgentPass

1Password for AI agents. A credential broker that lets agents log in to the services they need, without ever giving them your passwords. Designed around the principle that agents should hold capabilities, not secrets.
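
To make "capabilities, not secrets" concrete, here is a toy sketch of the idea. AgentPass is still a concept, so nothing below is its real design: the `Broker` and `Capability` classes, the per-service connector functions, and the expiry policy are all assumptions for illustration. The agent only ever holds a short-lived token scoped to one service; the password stays inside the broker.

```python
from __future__ import annotations

import secrets
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Capability:
    """What the agent holds: an opaque token scoped to one service, with an expiry."""
    token: str
    service: str
    expires_at: float


class Broker:
    """Hypothetical broker: owns the secrets and performs logins on the agent's behalf."""

    def __init__(self, vault: dict[str, str], connectors: dict[str, Callable[[str], str]]):
        self._vault = vault            # service -> password, never handed out
        self._connectors = connectors  # service -> function(password) -> session handle
        self._issued: dict[str, Capability] = {}

    def grant(self, service: str, ttl_seconds: float = 60.0) -> Capability:
        if service not in self._vault:
            raise KeyError(f"no credential stored for {service!r}")
        cap = Capability(secrets.token_urlsafe(16), service, time.monotonic() + ttl_seconds)
        self._issued[cap.token] = cap
        return cap

    def login(self, token: str) -> str:
        cap = self._issued.get(token)
        if cap is None or time.monotonic() >= cap.expires_at:
            raise PermissionError("capability missing or expired")
        # The broker uses the secret itself; the agent gets only the session handle.
        return self._connectors[cap.service](self._vault[cap.service])
```

An agent would call `broker.login(cap.token)` and receive a session handle. Revoking access then means expiring or deleting a token, not rotating a password.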

See all products

Working on something interesting?
We should talk.

Start a conversation