- Develop, construct, test, and maintain scalable data systems, pipelines, and architectures (data lakes, warehouses).
- Collect raw data from various sources, clean it, and transform it into usable formats (ETL/ELT), as in the sketch below.
- Work with data scientists, analysts, and business stakeholders to understand data requirements and deliver solutions.
- Improve existing data frameworks, monitor workflows, and troubleshoot issues.
- Implement data governance, validation, and security measures.
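Purely as an illustration of the ETL/ELT responsibility above, here is a minimal sketch of an extract-clean-load step in Python, assuming pandas and a local SQLite file as stand-ins for a real source and warehouse; the file, column, and table names are invented for the example and are not part of the role.

```python
# Minimal ETL sketch: extract from a CSV, clean/transform, load into SQLite.
# All names (raw_orders.csv, warehouse.db, "orders") are illustrative assumptions.
import sqlite3

import pandas as pd


def run_etl(csv_path: str = "raw_orders.csv", db_path: str = "warehouse.db") -> None:
    # Extract: read raw data from a source file.
    raw = pd.read_csv(csv_path)

    # Transform: normalise column names, drop incomplete rows, cast types.
    clean = raw.copy()
    clean.columns = [c.strip().lower() for c in clean.columns]
    clean = clean.dropna(subset=["order_id", "amount"])
    clean["amount"] = clean["amount"].astype(float)

    # Load: write the cleaned data into a warehouse-style table.
    with sqlite3.connect(db_path) as conn:
        clean.to_sql("orders", conn, if_exists="replace", index=False)


if __name__ == "__main__":
    run_etl()
```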
Qualifications & Skills:
Education:
Polytechnic Diploma or bachelor's degree in Computer Science, Data Analytics, Business Intelligence, or any related field.
Experience:
- Programming: Strong Python and SQL skills are fundamental, with Java/Scala useful for big data.
- Databases: Experience with relational (PostgreSQL, MySQL) and NoSQL (MongoDB) databases.
- Cloud Platforms: Proficiency in major cloud providers (AWS, Azure, GCP) and their data services (S3, Redshift, EMR, etc.).
- Big Data: Working knowledge of tools like Apache Spark, Kafka, and Hadoop.
- ETL/ELT: Designing and implementing data extraction, transformation, and loading processes.
- Data Warehousing: Concepts and tools for storing large datasets (e.g., Redshift, Snowflake).
- Orchestration: Tools like Apache Airflow for scheduling and managing workflows (see the sketch after this list).
- Data Modeling: Designing efficient database and data warehouse schemas.
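As a rough illustration of the ETL and orchestration items above, the sketch below wires a small extract-transform-load flow into an Apache Airflow DAG. It assumes Airflow 2.4+ and its TaskFlow API; the DAG name, schedule, and placeholder data are invented for the example.

```python
# Minimal sketch of a daily ETL workflow orchestrated with Airflow's TaskFlow API.
# Assumes Airflow 2.4+ (the `schedule` argument); data and names are placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_sales_etl():
    @task
    def extract() -> list:
        # Pull raw records from a source system (placeholder data here).
        return [{"order_id": 1, "amount": "19.90"}, {"order_id": 2, "amount": None}]

    @task
    def transform(rows: list) -> list:
        # Clean and normalise: drop rows with missing amounts, cast types.
        return [
            {"order_id": r["order_id"], "amount": float(r["amount"])}
            for r in rows
            if r["amount"] is not None
        ]

    @task
    def load(rows: list) -> None:
        # Load into the warehouse; in practice this would use a database hook.
        print(f"Loading {len(rows)} cleaned rows")

    load(transform(extract()))


daily_sales_etl()
```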
Technical Skills:
- Programming: Python, Java, SQL (essential).
- Databases: Relational (SQL) and NoSQL databases.
- Big Data Tools: Hadoop, Spark (often).
- Cloud Platforms: AWS, Azure, GCP (increasingly common).
- ETL/ELT Tools: Expertise in data integration tools.
- Software Engineering: Strong understanding of data structures and algorithms.
Soft Skills:
- Strong analytical and problem-solving abilities.
- Good communication skills for collaborating with non-technical teams.
- Eagerness to learn and adapt in a fast-paced environment.
- Openness to learning new software and BI tools in the market.
Nice to Have:
- Knowledge of GitHub and GitLab.
- Knowledge of SDLC and DDLC.
- Understanding of Agile/Scrum methodologies.