Blog

Working notes from inside the AI training industry

Rates, rubrics, red-teaming, and what frontier labs actually pay for in 2026. Plain reads — no fluff, no hype, just what we wish we'd known when we started hiring experts directly for AI training work.

Technical · 9 min read

Agentic evaluations: what frontier labs need from evaluators in 2026

Half the briefs landing on our roster aren't "pick A or B" anymore — they're 40-step model trajectories with tool calls, browser actions, and stack traces. Here's what agentic eval actually looks like, what it pays, and which evaluator skills transfer.

May 22, 2026 · David Park

Playbook · 6 min read

How to read a rubric before you accept a brief

The highest-leverage thing an evaluator does isn't the work; it's choosing which briefs to accept. This five-minute pre-acceptance read catches the briefs that will waste your time on appeals and identifies the ones that pay cleanly.

May 18, 2026 · Sophia Reyes

Industry · 8 min read

How AI training pay actually works in 2026

Most articles quote a single hourly rate. In reality, pay falls into three distinct tiers: crowd workers at $8–25, specialists at $30–60, and credentialed experts at $75–150 per hour. Here's how the tiers actually break down, and what moves you between them.

May 16, 2026 · Elena Lange

Playbook · 11 min read

Red-teaming LLMs: a working guide for new evaluators

Six attack categories, sample probes for each, and the unspoken rules that separate a $25/hr crowd reviewer from a $90/hr safety specialist. Written for people who can already prompt a model fluently but are new to adversarial work.

May 14, 2026 · David Park

Careers · 7 min read

Why doctors, lawyers, and engineers earn the most as AI evaluators

The expert tier exists because frontier labs need ground truth, not consensus. If you can answer "is this medical advice safe?" or "does this contract clause survive challenge?", you're not a crowd worker — you're a regulator the model trains against.

May 10, 2026 · Sophia Reyes

Technical · 10 min read

RLHF, DPO, GRPO: the alphabet soup of preference data, demystified

What each method asks of a human rater, why the rubrics differ, and how to spot which one you're actually being paid to label for. A working guide for evaluators who want to read the room before they accept a brief.

May 6, 2026 · David Park

Get matched to work that pays for what you actually know.

Join the OBG roster and earn from a single employer — us. We publish the briefs, run the matching, and pay you every Friday. You handle the work.
