Job Description
- Design and implement self-contained evaluation tasks, including prompts, supporting files, and detailed grading rubrics, to assess AI performance on practical computer-based workflows.
- Define clear, unambiguous written criteria for successful and unsuccessful task completion across diverse administrative and workflow scenarios.
- Meticulously observe and document AI agent behaviors, producing crisp, precise summaries and reports in high-quality English.
- Iterate and refine evaluation tasks and rubrics based on feedback and team collaboration to ensure robust benchmarking methodologies.
- Collaborate with the customer's team to share insights and help drive continuous improvement in AI evaluation techniques.
- Have prior experience in roles emphasizing written precision and structured thinking, such as paralegal, executive assistant, junior analyst, librarian, document archival specialist, rese...