Expert Prompt Curators for Advanced AI Evaluation Dataset

Remote, USA Full-time

This description is a summary of our understanding of the job description.

Role Description

Mercor is collaborating with a leading AI research lab to develop a next-generation evaluation dataset for frontier AI models. We are seeking experts with advanced domain knowledge across diverse fields to design extremely challenging prompts that existing AI systems cannot solve without internet search or browsing capabilities. The goal is to create a benchmark dataset that pushes the limits of current AI reasoning and retrieval. This is a short-term research engagement with significant impact on AI evaluation.

Key Responsibilities

Create original, expert-level prompts that require tool use (e.g., search, browse, or code execution).
Ensure prompts are objective, self-contained, and yield clear, unambiguous answers.
Test prompts against advanced AI models and document failures and successes.
Provide reasoning steps and solutions for each prompt.
Classify prompts into subject domains for dataset organization.
Collaborate with reviewers for expert validation and prompt refinement.

Qualifications

Advanced academic or professional expertise in a specialized subject (STEM, law, finance, history, cultural studies, etc.).
Strong ability to design precise, high-difficulty questions requiring deep knowledge and external references.
Experience in academic research, benchmarking, or test question design preferred.
Attention to detail and ability to provide concise reasoning explanations.
Familiarity with AI models and their limitations is a plus.

Requirements

Remote and asynchronous; set your own hours.
Expected commitment: approximately 10–20 hours per week.
Project duration: approximately 2 months, with possible extensions based on dataset needs.
Opportunity to contribute to high-impact AI safety and evaluation research.

Compensation & Contract Terms

Competitive hourly compensation based on expertise.
Independent contractor engagement.
Payments for services rendered are processed weekly via Stripe Connect.

Application Process

Submit your resume or CV highlighting your subject matter expertise.
Complete a brief questionnaire about your background and areas of specialization.
Selected applicants may be asked to draft a short test prompt.
You'll receive follow-up within a few days regarding next steps.
