Company
International SOS is looking to recruit a Data Analysis Intern for 6 months to support an upcoming project: an internal initiative to conduct an in-depth analysis of high-cost medical conditions using proprietary health data sets from the past year. The project will support strategic decision-making by identifying cost drivers and patterns in medical claims.
The Role
- Work under the close supervision of the project leadership manager.
- Build and evaluate an NLP/LLM-assisted pipeline to automatically map messy medical descriptions/claims text to ICD-10 (simplified), with measurable accuracy and human-in-the-loop validation.
- Understand and label data for the NLP/LLM approach, in collaboration with an International SOS health expert who reviews the results. Review and code health and claims records using the simplified ICD-10 coding system.
- Extract and organize relevant data to link claimed costs to ICD-coded health files.
- Perform data analysis to identify trends and insights related to high-cost medical conditions.
- Ensure strict compliance with the organization's confidentiality and privacy policies.
- Prepare reports and dashboards using tools such as Excel and Power BI.
- Communicate findings clearly in English (written and verbal).
Why Join Us
- Gain hands-on experience in health data analytics within a global organization.
- Work on a meaningful project impacting healthcare cost management.
- Gain exposure to advanced tools and real-world data sets.
Requirements
- Undergraduate currently pursuing, or recent graduate of, a degree in Data Analytics, Health Information Management, Statistics, or a related field.
- Strong data analysis skills.
- Proficiency in Python (pandas) and ability to build a small reproducible NLP pipeline.
- Basic knowledge of NLP/LLMs (text classification, embeddings, prompt design).
- Familiarity with medical terminology (clinical training not required).
- Ability to apply ICD-10 (simplified version) coding to health records.
- Proficiency in Excel, Power BI, and email communication tools.
- Experience with RAG / vector search and/or tools such as spaCy or transformers is an added advantage.
- Familiarity with ICD-10 mapping or medical NLP preferred.