Mercor

Mercor is a marketplace connecting domain experts to remote, paid AI roles and providing AI labs and enterprises with expert-created frontier datasets, benchmarks, and evaluation environments.

Est. 2022

San Francisco, CA, United States

Category: Talent Platforms

Focus Areas: AI, Data

Industries: Artificial Intelligence, Clinical Healthcare

HC Score: #5 in Data, #10 in AI

About Mercor

Profile not yet claimed

Mercor is a talent marketplace that connects top-tier experts with remote, paid AI roles and projects, positioning itself as a way for professionals to “shape the future of AI.” The platform offers role-based opportunities across high-skill domains such as medicine, law, finance, consulting, and software engineering, and highlights regular payouts and competitive hourly pay for expert work.

For AI labs and enterprises, Mercor provides “frontier data for frontier AI” by mobilizing subject-matter experts to create specialized datasets, benchmarks, and evaluation environments. The company states it develops benchmarks, evaluation environments, and large-scale human datasets, and offers data, evals, and post-training work designed to drive improvements in advanced reasoning, long-horizon planning, tool use, and safe behavior under uncertainty. Mercor also publishes benchmark families including APEX (AI Productivity Index), APEX-Agents, and ACE (AI Consumer Index), with associated artifacts like papers, datasets, code, and sample tasks. The company positions its work at the cutting edge of AI evaluation and data creation, and claims usage by leading AI labs and major public-company enterprises.

As an employer, Mercor emphasizes high-velocity, in-person collaboration from its San Francisco headquarters, and describes itself as profitable, Series C, and valued at $10 billion. It provides benefits for US full-time employees including equity, food stipend, housing support, relocation assistance, fitness membership, unlimited time off, 401(k), parental leave, and wellness services.

Solution Details

Focus Areas

AI, Data

Industries

Artificial Intelligence, Clinical Healthcare, Computer software

Key Features

AI Interviewing, Bi-weekly Pay, Comp Benchmarking

Compare Top Ranked Alternatives

Top ranked solutions in Data

Gigged.AI

Glasgow, United Kingdom

The enterprise platform for on-demand tech talent and human + AI orchestration

Human Cloud Verification ensures that the listed end customer is verified. It's used across kudos, customers, and business cases, and performed by Human Cloud. Think about it like a background check.

Gig Talent is redefining how top talent partners with organizations to accelerate transformation, strengthen culture, build capabilities, and drive performance.

Indianapolis, United States

Global crowdsourcing platform for developers and designers.

New York, United States

Tribe AI builds custom, production-ready AI and GenAI solutions for enterprises, combining embedded delivery teams with a vetted network of AI engineers and deep partnerships with OpenAI, Anthropic, and AWS.

Your lane to world-class tech talent - Nearshore developers with 85% retention and 75 NPS

Historical Performance

Tracking the performance of the solution based on what's most important to you

Business Case

$100M+ Revenue Was Unlocked by Compressing Decision Cycles From Days to Hours

Mercor scaled from fewer than a dozen active client projects to managing hundreds of projects while growing rapidly in headcount. The company had no data team and lacked a central analytics platform, collaborative dashboards, or reliable access to key operational metrics. Teams pulled raw data via VPN into AWS and relied on spreadsheets and a few technical people for custom reports. For a business operating on hour-to-hour timelines, these delays risked millions in lost revenue.

Mercor made a single analytics platform the foundation for Ops, Finance, Sourcing, and Sales, connecting data from its warehouse and operational sources like Google Sheets, Airtable, and the Mercor platform. The company rolled out self-serve reporting so non-technical users could build dashboards without needing SQL or Python. Notebook-based AI assistance removed the reporting bottleneck and enabled teams to iterate on metrics and views in real time.

Operations used dashboards to monitor project health across hundreds of customer engagements. Decision cycles were compressed from days to hours, enabling faster action on throughput, efficiency, quality, and revenue metrics. Over the past year, improved execution and velocity expanded capacity to take on more projects, which unlocked over $100M in revenue. Dashboards were created in hours rather than days, and the operations team tracked 60+ metrics per project across hundreds of active projects. Mercor also reported zero enterprise customer churn.

Key Results

$100M+ revenue unlocked over the past year
Decision cycles reduced from days to hours
60+ metrics tracked per project

Feb 20, 2026

Self Reported

Business Case

Pass@1 Nearly Doubled With 874 Expert-Labeled Tasks and 1 Training Epoch

Mercor needed to prove that a small amount of expert-labeled data could materially improve real-world agent performance on long-horizon, professional tasks. The goal was to drive measurable gains on the APEX-Agents benchmark, which tested day-to-day work across investment banking, management consulting, and corporate law. A key risk in this low-data setting was wasting scarce expert effort on data that would not transfer to the hardest benchmark tasks.

Mercor partnered with Applied Compute to post-train an open-source model using an expert-labeled dev set. Mercor supplied a dev set of 874 tasks split across 50 unique “worlds,” and none of the tasks or worlds appeared in the APEX-Agents benchmark. Applied Compute deployed its proprietary long-horizon RL stack and ran single-epoch training with no SFT warmup, no filtering, and no task or rubric modifications. The team evaluated performance on the full APEX-Agents benchmark (n=480) using Pass@1, Pass@3, and mean criteria passed, starting from a GLM 4.6 baseline.

The post-trained model outperformed the baseline across all metrics using just 874 expert-labeled tasks, with the largest gains in corporate law. With fewer than 1,000 high-quality data points, Pass@1 and mean score nearly doubled on APEX-Agents. On the corporate law evaluations, Pass@1 tripled. The baseline GLM 4.6 model scored 3.8% Pass@1 and 12.1% mean score prior to post-training, and the training trendline remained near-linear, indicating additional data would likely continue yielding gains.
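The business case does not define how Pass@1, Pass@3, and mean criteria passed are computed. As a rough illustration only, the sketch below assumes that each task is run three times, that each run is scored against a rubric of pass/fail criteria, and that a run "passes" only when every criterion is met. The task names, the data, and all function names are hypothetical, and the actual APEX-Agents scoring may differ.

```python
# Hypothetical data: task id -> three independent runs, each run being a
# list of per-criterion pass flags from that task's rubric.
runs_by_task = {
    "task_a": [[True, True], [True, False], [True, True]],
    "task_b": [[False, True], [False, False], [True, True]],
    "task_c": [[False, False], [True, False], [False, True]],
    "task_d": [[True, True, True], [True, True, True], [True, True, False]],
}


def avg(values):
    """Arithmetic mean of an iterable of numbers or booleans."""
    values = list(values)
    return sum(values) / len(values)


def run_passes(run):
    """Assumption: a run counts as a pass only if every criterion is met."""
    return all(run)


def pass_at_1(results):
    """Assumed definition: average single-run success rate across tasks."""
    return avg(avg(run_passes(r) for r in runs) for runs in results.values())


def pass_at_k(results, k):
    """Assumed definition: share of tasks with at least one pass in the first k runs."""
    return avg(any(run_passes(r) for r in runs[:k]) for runs in results.values())


def mean_criteria_passed(results):
    """Assumed definition: average fraction of rubric criteria met per run, averaged across tasks."""
    return avg(avg(avg(r) for r in runs) for runs in results.values())


print(f"Pass@1: {pass_at_1(runs_by_task):.3f}")
print(f"Pass@3: {pass_at_k(runs_by_task, 3):.3f}")
print(f"Mean criteria passed: {mean_criteria_passed(runs_by_task):.3f}")
```

Under these assumptions, the mean criteria passed is always at least as high as Pass@1, because a run can meet most of its criteria without passing outright. That is consistent with the reported baseline, where the mean score (12.1%) sits well above Pass@1 (3.8%).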

Key Results

874 expert-labeled tasks used for post-training
50 unique “worlds” in the dev set
480 tasks in the full APEX-Agents benchmark

Feb 20, 2026

Self Reported

Business Case

Reduced Expert Sourcing from Weeks to Hours

Sixtyfour AI

The customer needed to identify domain experts capable of generating problems that stumped current AI models like GPT-4 for next-generation AI training. Their prior sourcing process took weeks per search. It often returned candidates who appeared qualified but lacked actual expertise. They partnered with Sixtyfour AI to improve expert discovery and validation. Recursive enrichment agents traversed academic publications, co-authors, conference presentations, and specialized forums. This built comprehensive expertise profiles to surface qualified domain experts for AI labs. The customer reduced the time required to deliver qualified domain experts from weeks to hours. The process supported sourcing across multiple specializations, including rare genetic dermatology, investment banking, and competitive programming. It improved confidence that sourced experts had the depth required to produce AI-training problems that challenged GPT-4-level models.

Key Results