Welcome to the Brave New World of Work




Human Cloud Verification ensures that the listed end customer is verified. It's used across kudos, customers, and business cases, and performed by Human Cloud. Think about it like a background check.




Mercor is a marketplace connecting domain experts to remote, paid AI roles and providing AI labs and enterprises with expert-created frontier datasets, benchmarks, and evaluation environments.
Mercor is a talent marketplace that connects top-tier experts with remote, paid AI roles and projects, positioning itself as a way for professionals to “shape the future of AI.” The platform offers role-based opportunities across high-skill domains such as medicine, law, finance, consulting, and software engineering, and highlights regular payouts and competitive hourly pay for expert work.

For AI labs and enterprises, Mercor provides “frontier data for frontier AI” by mobilizing subject-matter experts to create specialized datasets, benchmarks, and evaluation environments. The company states it develops benchmarks, evaluation environments, and large-scale human datasets, and offers data, evals, and post-training work designed to drive improvements in advanced reasoning, long-horizon planning, tool use, and safe behavior under uncertainty. Mercor also publishes benchmark families including APEX (AI Productivity Index), APEX-Agents, and ACE (AI Consumer Index), with associated artifacts like papers, datasets, code, and sample tasks. The company positions its work at the cutting edge of AI evaluation and data creation, and claims usage by leading AI labs and major public-company enterprises.

As an employer, Mercor emphasizes high-velocity, in-person collaboration from its San Francisco headquarters, and describes itself as profitable, Series C, and valued at $10 billion. It provides benefits for US full-time employees including equity, food stipend, housing support, relocation assistance, fitness membership, unlimited time off, 401(k), parental leave, and wellness services.







Benchmark family assessing frontier model capability on economically valuable professional tasks (APEX), long-horizon agent tasks (APEX-Agents), and consumer activities (ACE), with supporting blog/paper/data/code/sample tasks.
Benchmarking
Leaderboards
Open Tooling
Marketplace for professionals to find top-tier, remote AI roles matched to their expertise, with listed hourly pay ranges and ongoing work opportunities.
Remote Roles
Hourly Pay
AI Interviewing
Large-scale expert data creation to fuel AI breakthroughs, including specialized annotations and datasets across many domains for model training and post-training.
Expert Annotations
Post-training Data
Domain Coverage
Reinforcement learning environments built by creating realistic data-rich worlds, implementing tools/applications for agents, and creating rigorous tasks and verifiers.
Task Verifiers
Tool Simulation
Data-rich Worlds
Top ranked solutions in Data

Mercor scaled from fewer than a dozen active client projects to managing hundreds of projects while growing rapidly in headcount. The company had no data team and lacked a central analytics platform, collaborative dashboards, or reliable access to key operational metrics. Teams pulled raw data via VPN into AWS and relied on spreadsheets and a few technical people for custom reports. For a business operating on hour-to-hour timelines, these delays risked millions in lost revenue.

Mercor made a single analytics platform the foundation for Ops, Finance, Sourcing, and Sales, connecting data from its warehouse and operational sources like Google Sheets, Airtable, and the Mercor platform. The company rolled out self-serve reporting so non-technical users could build dashboards without needing SQL or Python. Notebook-based AI assistance removed the reporting bottleneck and enabled teams to iterate on metrics and views in real time.

Operations used dashboards to monitor project health across hundreds of customer engagements. Decision cycles were compressed from days to hours, enabling faster action on throughput, efficiency, quality, and revenue metrics. Over the past year, improved execution and velocity expanded capacity to take on more projects, which unlocked over $100M in revenue. Dashboards were created in hours rather than days, and the operations team tracked 60+ metrics per project across hundreds of active projects. Mercor also reported zero enterprise customer churn.

Mercor needed to prove that a small amount of expert-labeled data could materially improve real-world agent performance on long-horizon, professional tasks. The goal was to drive measurable gains on the APEX-Agents benchmark, which tested day-to-day work across investment banking, management consulting, and corporate law. A key risk in this low-data setting was wasting scarce expert effort on data that would not transfer to the hardest benchmark tasks.

Mercor partnered with Applied Compute to post-train an open-source model using an expert-labeled dev set. Mercor supplied a dev set of 874 tasks split across 50 unique “worlds,” and none of the tasks or worlds appeared in the APEX-Agents benchmark. Applied Compute deployed its proprietary long-horizon RL stack and ran single-epoch training with no SFT warmup, no filtering, and no task or rubric modifications. The team evaluated performance on the full APEX-Agents benchmark (n=480) using Pass@1, Pass@3, and mean criteria passed, starting from a GLM 4.6 baseline.

The post-trained model outperformed the baseline across all metrics using just 874 expert-labeled tasks, with the largest gains in corporate law. With fewer than 1,000 high-quality data points, Pass@1 and mean score nearly doubled on APEX-Agents. On the corporate law evaluations, Pass@1 tripled. The baseline GLM 4.6 model scored 3.8% Pass@1 and 12.1% mean score prior to post-training, and the training trendline remained near-linear, indicating additional data would likely continue yielding gains.
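
The case study names Pass@1, Pass@3, and mean criteria passed but does not define them. As a rough illustration only, the sketch below assumes a common reading: a run "passes" a task when it meets every rubric criterion, Pass@k is the share of tasks where at least one of the first k runs passes, and mean criteria passed is the average share of criteria met per run. The data shape, function names, and example values are hypothetical and are not taken from Mercor's or Applied Compute's evaluation code.

```python
from typing import Dict, List

# Hypothetical data shape: task_id -> list of runs; each run is a non-empty
# list of booleans, one per rubric criterion (True = criterion passed).
Runs = Dict[str, List[List[bool]]]


def pass_at_k(results: Runs, k: int) -> float:
    """Share of tasks where at least one of the first k runs meets every criterion."""
    solved = 0
    for runs in results.values():
        if any(all(run) for run in runs[:k]):
            solved += 1
    return solved / len(results)


def mean_criteria_passed(results: Runs) -> float:
    """Average, over every run of every task, of the share of criteria that run met."""
    shares = [sum(run) / len(run) for runs in results.values() for run in runs]
    return sum(shares) / len(shares)


# Made-up example: two tasks, three runs each, two criteria per task.
example: Runs = {
    "task_a": [[True, True], [True, False], [True, True]],
    "task_b": [[False, True], [True, True], [False, False]],
}

print(pass_at_k(example, 1))
print(pass_at_k(example, 3))
print(mean_criteria_passed(example))
```

Under these assumed definitions, Pass@3 can never be lower than Pass@1, and mean criteria passed gives partial credit for runs that meet some but not all criteria, which may be why a mean score can sit well above Pass@1 on the same benchmark.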

I have had the pleasure of working on several Mercor projects, and my experience has been outstanding. The only area for improvement would be the response time regarding evaluations.

I have worked on multiple AI training platforms, and Mercor stands out. The work is well-organized, communication is clear, and the team is responsive. Highly recommend.

Just wrapped up my second contract with Mercor. Their professionalism cuts through immediately -- seamless workflows, clear communication, and a genuine respect for expert talent.

My experience with Mercor has been exceptional. I have been participating since July 2024 and the compensation is strong and well above average on an hourly basis.

Working in the area of AI training can lead you to companies that pay an absolute pittance. Mercor is by far the best of all similar companies I've worked for. Can't rate it highly enough.

Mercor consistently followed up with our team to make sure we were having a good experience. Their site is easy to use, engineers responsive, and their vetting, extensive. A must have for anyone building a business with engineering load.

Interventional radiology fellow
After five or six years of training, you get a little sick of it. You’re working nonstop, but financially you’re still barely treading water.

Interventional radiology fellow
I love procedures. I love reading imaging. What I don’t love is spending hours pre-rounding, protocoling, checking labs, and sitting behind a computer.

Computer Information Systems Expert
I like solving problems. When I see something that could be more efficient, more automated, I have to figure out how to fix it.

Computer Information Systems Expert
You do it not because you need it, but because you want to see if you can do it.

Computer Information Systems Expert
Since leaving my last job, I actually get to linger a little bit longer and just enjoy the beach. Which is nice.

Computer Information Systems Expert
My typical day was Zoom, Slack, communication, and it would be perpetual. At Mercor, I check what needs to be done, complete it, and move on. I don't need to schedule a Zoom call. It's almost been liberating.

Computer Information Systems Expert
I always thought they just scraped data from books and social media. I didn't know they used real people with domain expertise to train these models.

Computer Information Systems Expert
As someone who was in IT 15 years ago, I did not have AI to ask these questions to. I'd want to make sure that junior CIS admins asking AI about IT or computer information systems are getting accurate, authentic answers.

Computer Information Systems Expert
It was dynamically asking me questions about my resume and how it pertained to what they were looking for.

Computer Information Systems Expert
Sometimes you get an interview and then you get ghosted. The job market is not amazing right now. It is pretty tough.

Computer Information Systems Expert
I used to apply my technical expertise by helping end users and customers. Now I'm helping train an AI model in the same topics I'd be talking through with those people. I never really thought I'd be doing that.
Named in the "Foundation Builders" category among Bloomberg's top AI startups.
Bloomberg placed Mercor in its "Foundation Builders" category among the top 24 AI startups to watch.
Fortune profiles the three co-founders, including CEO Brendan Foody, who became the world's youngest self-made billionaires at age 22.
All three Mercor co-founders (Brendan Foody, Adarsh Hiremath, Surya Midha) named to Forbes 30 Under 30 in AI.
Big Think analysis of Mercor's rapid growth from founding to $10B valuation in under three years.
Deep dive into Mercor's business model of connecting AI labs with domain experts who provide proprietary knowledge for model training.
CNBC reports on Mercor's $350M Series C at a $10B valuation, noting that its 30,000+ contractors are collectively paid over $1.5M per day.
TechCrunch covers Mercor's $350M Series C led by Felicis, with Benchmark, General Catalyst, and Robinhood Ventures participating. Valuation quintupled from $2B to $10B in eight months.

Sixtyfour AI
The customer needed to identify domain experts capable of generating problems that stumped current AI models like GPT-4 for next-generation AI training. Their prior sourcing process took weeks per search. It often returned candidates who appeared qualified but lacked actual expertise. They partnered with Sixtyfour AI to improve expert discovery and validation. Recursive enrichment agents traversed academic publications, co-authors, conference presentations, and specialized forums. This built comprehensive expertise profiles to surface qualified domain experts for AI labs. The customer reduced the time required to deliver qualified domain experts from weeks to hours. The process supported sourcing across multiple specializations, including rare genetic dermatology, investment banking, and competitive programming. It improved confidence that sourced experts had the depth required to produce AI-training problems that challenged GPT-4-level models.
TechCrunch covers Mercor's $100M Series B led by Felicis, with participation from General Catalyst, DST Global, Benchmark, and Menlo Ventures.
CNBC covers Mercor's Series B funding and $2B valuation, highlighting 50% month-over-month revenue increases and partnerships with top 5 AI labs including OpenAI.

















