- 1-year contract, renewable
- Hybrid work arrangement
- Government project
We are seeking a Systems Engineer with strong foundations in cloud infrastructure, DevOps, and system integration, complemented by specialized expertise in AI/ML technologies and video analytics applications.
This is first and foremost a systems engineering role, with AI/ML as a complementary skill set. Core systems engineering competencies such as cloud platforms, DevOps practices, and API integration are mandatory foundations.
Operating at the intersection of systems engineering, edge computing, and AI/ML technologies, this role drives the design, deployment, and operation of intelligent transportation systems for Singapore's Land Transport Authority, ensuring AI-powered solutions are architected with systems engineering rigor and operated with production reliability.
Key Responsibilities
- Design and build application/cloud/edge infrastructure for AI/ML experimentation and prototyping with established frameworks and best practices
- Set up CI/CD pipelines for ML model testing, training, and deployment automation
- Deploy and evaluate AI/ML models on cloud and edge platforms with a focus on system integration and production readiness
- Translate stakeholder needs into actionable system requirements with full traceability
- Integrate emerging AI technologies with existing LTA infrastructure, ensuring seamless interoperability
- Conduct proof-of-concept implementations focusing on deployment feasibility and performance benchmarks
- Automate data pipelines and model training workflows using DevOps and MLOps practices
- Foster cross-functional collaboration among AI/ML engineers, developers, security teams, and operations staff
- Deploy and operate production-ready video analytics systems at scale across LTA infrastructure
- Build and manage edge computing infrastructure with containerization, monitoring, and automated health checks
- Implement comprehensive monitoring, logging, and alerting for AI/ML production systems
- Integrate video analytics solutions with LTA's APIs, databases, message queues, and network systems
- Optimize system performance, GPU allocation, and real-time processing for latency and throughput requirements
- Work with ML platforms to support model development, versioning, and deployment pipelines
- Implement security, compliance, and data governance including PDPA compliance, encryption, and access controls
- Maintain and troubleshoot deployed infrastructure with systematic debugging and root cause analysis
- Monitor system health, implement automated remediation, and respond to production incidents
- Manage model retraining pipelines and version control, ensuring reproducibility and rollback capabilities
- Provide technical support for infrastructure, networking, and AI/ML system performance issues
- Collaborate with operations teams to ensure reliability, uptime SLAs, and operational excellence
Qualifications
- Bachelor's degree in Computer Science, Engineering, Information Systems, or related technical field (required)
- Master's degree in AI/ML, Computer Vision, Computer Science, or related field (preferred)
- AWS certifications: Solutions Architect, Machine Learning Specialty, DevOps Engineer Professional
- Kubernetes certifications: CKA, CKAD
- NVIDIA Deep Learning Institute certifications
- Cloud Platforms: AWS infrastructure and services including EC2, S3, Lambda, VPC, IAM, CloudWatch, ECS/EKS; cloud architecture for compute, storage, networking, and security
- DevOps & Automation: Git version control, CI/CD pipelines, Infrastructure as Code, Docker containerization, Kubernetes orchestration, automated testing and deployment strategies
- Systems Integration: RESTful API design and development, data pipelines and ETL processes, database integration, networking fundamentals, distributed systems and microservices
- Programming: Python, TypeScript, Java, or similar for systems automation and ML applications; scripting for infrastructure management; production-grade code practices including testing, logging, and documentation
- AI/ML Frameworks: PyTorch and TensorFlow for computer vision; OpenCV for image/video processing; object detection and tracking algorithms; model optimization techniques
- Edge AI & IoT: NVIDIA Jetson platform development and optimization; edge computing deployment strategies; IoT camera infrastructure; real-time data processing
- ML Operations: AWS SageMaker and ML services; ML platforms experience; model training, versioning, and deployment pipelines; production model monitoring and maintenance
- Traffic monitoring and analysis: vehicle detection, traffic flow, congestion analysis
- Safety and compliance monitoring: violation detection, construction safety, pedestrian safety
- Experience with transportation, smart city projects, or intelligent transportation systems is a strong plus
- Strong systems-thinking mindset to decompose complex problems into modular solutions
- Excellent communication and stakeholder management skills, including translating technical concepts for non-technical audiences
- Ability to balance technical depth with production delivery and operational reliability
- Proactive, collaborative, and adaptable in fast-evolving technical environments
- Documentation expertise: technical documentation, architecture diagrams, runbooks, API documentation
- Mentoring and knowledge transfer capabilities
- Agile/project management experience
Experience Requirements
- Minimum 3 years of hands-on experience as a Systems Engineer working with cloud infrastructure, DevOps practices, and system integration in production environments
- Proven track record deploying and operating production systems at scale with reliability, monitoring, and incident response
- Strong foundation in system architecture, networking, security, and distributed systems design
- Demonstrated experience applying AI/ML technologies in production including model deployment and lifecycle management
- Experience with edge AI deployment on NVIDIA Jetson or similar platforms is highly valued
- Experience with video analytics, computer vision, or outdoor AI applications is a plus
- Government or public sector project experience with compliance requirements is advantageous
- Portfolio demonstrating successful AI/ML deployment integrated with enterprise/government infrastructure
- Ability to work with non-technical stakeholders and translate complex technical concepts clearly
- Understanding of data privacy, security, and compliance in government contexts
- Problem-solving mindset with ability to work independently and in cross-functional teams
- On-call availability for production infrastructure support on a rotation basis
- Ability to respond to incidents with systematic troubleshooting under pressure