Senior Systems Engineer - L3 Operations (Data Analytics & AI) (Ref 26210a)

5-7 Years
SGD 5,500 - 6,500 per month
  • Posted a day ago

Job Description

Responsibilities

. Monitor and maintain production data pipelines to ensure 99.9% uptime and optimal performance

. Implement comprehensive logging, alerting, and monitoring systems using application monitoring tools

. Perform regular health checks on performance, job execution times, and resource utilization to identify and resolve bottlenecks proactively

. Manage incident response procedures for pipeline failures, including root cause analysis, resolution, and post-incident reviews

. Establish and maintain disaster recovery procedures and backup strategies for critical data assets within the Databricks environment

. Conduct regular performance tuning of Spark jobs and Databricks cluster configurations to optimize cost and execution efficiency

. Maintain comprehensive documentation for operational procedures, runbooks, and troubleshooting guides

. Coordinate scheduled maintenance windows and system upgrades with minimal business impact

. Manage user access controls, workspace configurations, and security policies within application environments

Requirements

. Degree in Computer Science or Computer Engineering

. Minimum 5 years of working experience in system operations, compliance, and management

. Hands-on project experience specifically with the AWS platform (primary requirement), in cloud operations or cloud architecture

. Must hold an AWS cloud certification

. Proficiency in the Databricks platform, including workspace management, cluster configuration, and job orchestration

. Strong expertise in Apache Spark within the Databricks environment, including Spark SQL, DataFrames, and RDDs

. In-depth understanding of data warehouse concepts, data profiling, data verification, and advanced analytics techniques

. Strong knowledge of monitoring, incident management, and cloud cost control

. Technology stack experience:

  . Databricks

  . AWS cloud services and architecture

  . IDMC (Informatica Data Management Cloud)

  . Tableau for data visualization

  . Oracle Database management

  . MLOps practices within the Databricks environment

  . STATA for statistical analysis (an advantage)

  . Amazon SageMaker integration with Databricks

  . DataRobot platform integration

. Good interpersonal skills with the ability to work with different groups of stakeholders

. Strong problem-solving skills and ability to work independently in a fast-paced environment with minimal supervision

. Excellent communication skills for technical documentation and cross-team collaboration

Licence no: 12C6060

Job ID: 147358993