We are seeking highly skilled and culturally aware Red Teaming / Prompt Writers to join our AI Safety and Evaluation program. In this role, you will design, test, and refine prompts that challenge AI models on cultural relevance, sensitivity, and contextual accuracy. Your work will directly contribute to identifying cultural blind spots, potential biases, and region-specific issues in AI responses.
This position is ideal for individuals with strong writing, research, and critical analysis skills, and a deep understanding of cultural norms, practices, and sensitivities across regions.
- Prompt Writing & Scenario Design
  - Create diverse and challenging prompts to test AI models on cultural awareness, sensitivity, and adaptability.
  - Develop scenarios that highlight cultural nuances, idiomatic expressions, traditions, festivals, rituals, and social issues.
- Red Teaming & Evaluation
  - Simulate real-world user behaviour to surface failures in model responses.
  - Identify cultural insensitivity, stereotyping, misrepresentation, or harmful outputs.
  - Tag and classify violations (e.g., cultural inaccuracies, omission of key context, or safety triggers with cultural dimensions).
- Cultural Research & Documentation
  - Conduct research on cultural practices, linguistic variations, and regional norms to inform prompt design.
  - Maintain structured documentation of prompts, responses, and evaluation results for model improvement.
- Collaboration & Reporting
  - Work with AI researchers, linguists, and safety experts to refine test cases.
  - Provide clear, structured feedback and recommendations for mitigation strategies.
Required Qualifications
- Proven experience in prompt writing, content evaluation, linguistics, or cultural studies.
- Strong knowledge of regional and cultural practices (expertise in one or more languages or regions preferred).
- Excellent writing skills with the ability to frame unbiased, clear, and context-rich prompts.
- Analytical mindset with the ability to detect subtle cultural inaccuracies or omissions.
- Comfort with structured evaluation frameworks and documentation.