

Responsibilities
• Own end-to-end reliability, availability, performance, and scalability of hybrid and cloud-connected network services.
• Apply Network Reliability Engineering (NRE) and Site Reliability Engineering (SRE) principles to reduce operational toil, improve service resilience, and support business objectives.
• Design, implement, and continuously improve highly available hybrid and cloud network architectures across on-premises data centers and AWS/Azure environments.
• Perform deep technical troubleshooting, root cause analysis, and incident resolution for complex network and security-related issues.
• Build, maintain, and optimize automation and self-healing workflows supporting Day-1, Day-2, and Day-N network operations.
• Develop, maintain, and optimize Ansible playbooks, roles, and inventories to automate network device configuration, deployment, and operational activities across multi-vendor environments.
• Automate routine network tasks including configuration management, backups, compliance checks, firmware upgrades, and provisioning to improve reliability and reduce manual effort.
• Translate network SOPs, runbooks, and BAU activities into reusable Ansible-based Infrastructure-as-Code (IaC) automation workflows.
• Implement event-driven and self-healing automation by integrating Ansible playbooks with monitoring systems, APIs, and ITSM tools for automated remediation and faster incident resolution.
• Ensure network consistency and compliance by validating device states, detecting configuration drift, and enforcing standardized configurations across network infrastructure.
• Integrate infrastructure and network changes into CI/CD pipelines with appropriate testing, validation, deployment, and rollback mechanisms.
• Define, monitor, and improve Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets for network services.
• Partner with cloud, security, application, and platform teams to achieve business KPIs and KRAs.
• Support Cisco ACI fabrics integrated with hybrid connectivity models.
• Manage and support Cisco IOS-XE, Cisco NX-OS, and Cisco Wireless LAN Controllers (WLC) within enterprise and data center environments.
• Design, manage, and troubleshoot AWS networking services including VPCs, routing, security groups, NACLs, Transit Gateway, VPN, and Direct Connect.
• Design, manage, and troubleshoot Azure networking services including VNets, UDRs, NSGs, VPN Gateway, ExpressRoute, and Azure Firewall.
• Support hybrid connectivity solutions including internet, MPLS, site-to-site VPN, client VPN, and secure cloud access.
• Administer and support enterprise firewall platforms including Check Point, Palo Alto, and Cisco Firepower (FTD).
• Utilize AlgoSec for firewall policy lifecycle management and automation across hybrid and cloud environments.
• Manage and support proxy and secure web gateway solutions including Bluecoat ProxySG (SGOS), BMC, and CAS.
• Implement and support cloud-delivered security solutions including Zscaler ZIA, ZPA, and ZTNA for hybrid and remote access environments.
• Design and support resilient application delivery solutions using F5 VELOS chassis and F5 LTM, GTM, and APM modules.
• Monitor hybrid and cloud network services using Datadog and ensure operational visibility and performance.
• Utilize Cisco DNA Center (DNAC) for network assurance, visibility, and automation.
• Support Forescout for network visibility, device profiling, compliance monitoring, and security posture management.
• Build dashboards, alerts, and telemetry solutions using Prometheus and Grafana.
• Integrate GitHub for version control, peer reviews, and change traceability across network and automation initiatives.
• Develop integrations using REST APIs, SDKs, CLI tools, and GUIs to connect network, security, and cloud platforms.
• Develop tools, services, dashboards, and automation solutions using Python and Django.
• Implement Infrastructure as Code (IaC) practices and automated validation workflows.
• Participate in on-call rotations, major incident response activities, CI/CD-driven change management, and collaboration with global support teams.
Requirements
• Bachelor's degree in Computer Science, Information Technology, or equivalent practical experience.
• 7–12 years of experience in Network Engineering, Network Security, Cloud Networking, Network Reliability Engineering (NRE), or Site Reliability Engineering (SRE) roles.
• Strong hands-on experience across hybrid and cloud network environments, including on-premises data centers and AWS/Azure cloud platforms.
• Strong reliability engineering mindset with a focus on automation, scalability, observability, and operational excellence.
• Proven experience operating and supporting production hybrid and cloud network environments.
• Hands-on expertise with Ansible Automation Platform, including playbook development, network automation, and Infrastructure-as-Code practices.
• Strong knowledge of Cisco ACI, Cisco IOS-XE, Cisco NX-OS, and Cisco Wireless LAN Controllers (WLC).
• Practical experience with AWS networking services including VPCs, routing, security groups, NACLs, Transit Gateway, VPN, and Direct Connect.
• Practical experience with Azure networking services including VNets, UDRs, NSGs, VPN Gateway, ExpressRoute, and Azure Firewall.
• Strong understanding of hybrid connectivity models including MPLS, internet connectivity, site-to-site VPN, client VPN, and secure cloud access.
• Hands-on experience with Check Point, Palo Alto, and Cisco Firepower (FTD) firewall platforms.
• Experience using AlgoSec for firewall policy management and automation.
• Expertise with Bluecoat ProxySG (SGOS), BMC, CAS, and secure web gateway technologies.
• Experience with Zscaler ZIA, ZPA, and ZTNA solutions supporting hybrid and remote access models.
• Strong expertise with F5 VELOS, F5 LTM, GTM, and APM technologies.
• Hands-on experience with Datadog, Cisco DNA Center (DNAC), Forescout, Prometheus, and Grafana.
• Strong experience building automation solutions using REST APIs, SDKs, CLI tools, and platform integrations.
• Experience integrating GitHub and building CI/CD pipelines for infrastructure and network automation.
• Strong programming and development experience using Python and Django.
• Ability to implement Infrastructure as Code (IaC) and automated validation frameworks.
• Excellent troubleshooting, analytical, and root cause analysis skills.
• Ability to translate business KRAs into measurable technical reliability objectives.
• Strong documentation, collaboration, stakeholder management, and communication skills.
• Relevant certifications such as CCNP, CCIE, AWS Networking, Azure Networking, PCNSE, CCSA/CCSE, or F5 certifications are highly desirable.
Exasoft is a leading Technology and Talent Solutions company headquartered in Singapore.
Technology and talent are key pillars for the creation of digitally enabled smart industries and will be integral in deciding who emerges as a leader in the new smarter, connected world. Exasoft delivers cross-industry expertise in technology, talent services and skilling to enable digital transformation and accelerate innovation.
Our team comprises consultants knowledgeable in specific fields and industries. We can help you find strong potential employees in most areas of expertise.
Job ID: 150616243
Skills:
GitHub, Prometheus, Grafana, Cisco ACI, Datadog, Django, Ansible, F5 LTM, Azure, Python, AWS, AlgoSec, F5 VELOS, Zscaler ZIA, ZTNA, Cisco IOS-XE, Forescout, Cisco DNA Center, NX-OS