We're looking for AI Agent Evaluation Analysts to review evaluation tasks and scenarios for logic, completeness, and realism, and help define clear expected behaviors for AI agents. No coding background is required, but you must be curious, intellectually rigorous, and capable of evaluating complex systems.
Requirements
- Excellent analytical thinking: Can reason about complex systems, scenarios, and logical implications.
- Strong attention to detail: Can spot contradictions, ambiguities, and vague requirements.
- Familiarity with structured data formats: Can read JSON/YAML, though not necessarily write it.
- Ability to assess scenarios holistically: What's missing, what's unrealistic, what might break?
- Good communication and clear writing in English: Can document findings so that others can act on them.
Benefits
- Get paid for your expertise, with rates that can go up to $55/hour depending on your skills, experience, and project needs.
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
- Influence how future AI models understand and communicate in your field of expertise.