Data Scientist

Remote, USA Full-time
Role Description

We're seeking a data-driven analyst to conduct comprehensive failure analysis on AI agent performance across finance-sector tasks. You'll identify patterns, root causes, and systemic issues in our evaluation framework by analyzing task performance across multiple dimensions (task types, file types, criteria, etc.).

Statistical Failure Analysis: Identify patterns in AI agent failures across task components (prompts, rubrics, templates, file types, tags)
Root Cause Analysis: Determine whether failures stem from task design, rubric clarity, file complexity, or agent limitations
Dimension Analysis: Analyze performance variations across finance sub-domains, file types, and task categories
Reporting & Visualization: Create dashboards and reports highlighting failure clusters, edge cases, and improvement opportunities
Quality Framework: Recommend improvements to task design, rubric structure, and evaluation criteria based on statistical findings
Stakeholder Communication: Present insights to data labeling experts and technical teams

Qualifications

Statistical Expertise: Strong foundation in statistical analysis, hypothesis testing, and pattern recognition
Programming: Proficiency in Python (pandas, scipy, matplotlib/seaborn) or R for data analysis
Data Analysis: Experience with exploratory data analysis and creating actionable insights from complex datasets
AI/ML Familiarity: Understanding of LLM evaluation methods and quality metrics
Tools: Comfortable working with Excel, data visualization tools (Tableau/Looker), and SQL

Requirements

Experience with AI/ML model evaluation or quality assurance
Background in finance or willingness to learn finance domain concepts
Experience with multi-dimensional failure analysis
Familiarity with benchmark datasets and evaluation frameworks
2-4 years of relevant experience