Design and develop operations and maintenance (O&M) platforms; participate in building and continuously improving the O&M system.
Participate in the daily management, operation and maintenance of platform systems, ensuring stability, performance, and security.
Research service architecture, identify potential problems, and formulate system adjustment and optimization plans to improve stability, efficiency, and security posture.
Build, deploy and maintain internal tooling and backend services to monitor on-chain and off-chain activity.
Develop alerting systems and dashboards to track the health of blockchain transactions, internal services, and third-party APIs.
Work closely with developers to build observability and reliability tooling (logging, tracing, metrics, profiling) and integrate security signals into existing monitoring.
Embed security into the CI/CD pipeline, including image scanning, dependency scanning (SCA), SAST/DAST integration, and enforcement of security gates for critical services.
Design and implement secrets management and key management best practices, including KMS/HSM integration, key rotation, access control, and secure handling of credentials, API keys, and certificates.
Harden infrastructure and runtime environments, including Kubernetes, containers, Linux hosts, and cloud accounts (AWS/GCP), following least-privilege, network segmentation, and baseline hardening standards.
Collaborate with Security / Compliance / Custody teams to translate policies (e.g. access control, segregation of duties, audit logging) into technical controls and automation.
Participate in incident response and post-incident review, including triage, log analysis, impact assessment, and implementing long-term preventive measures via automation.
Continuously improve security observability, including anomaly detection on privileged operations, configuration drift, and blockchain-related risk signals.
Perform other related duties as assigned based on evolving operational and security needs.
Requirements
Bachelor's degree or above with a minimum of 3 years of DevOps / SRE / Platform Engineering experience; strong background in computer science or a related field.
Solid experience with Linux, Docker, shell scripting, databases, CI/CD tools, Git, and common deployment/maintenance/optimization practices.
Hands-on experience with AWS services (EC2, RDS, S3, EKS, ECR, etc.); familiarity with Google Cloud or other major cloud providers.
Understanding of Kubernetes or other orchestration frameworks; experience with cluster hardening, RBAC design, and policy engines is a plus.
Familiarity with container technologies and underlying OS concepts (filesystem, namespaces, cgroups); knowledge of Docker internals or source code is a strong plus.
Backend development experience (e.g. Java, Go, Node.js, Python) for building internal tools, automation, and integrations.
Practical experience in at least some of the following DevSecOps areas:
Infrastructure as Code (Terraform/CloudFormation) with security best practices.
System and network hardening, firewall / security group management, zero-trust / least-privilege design.
Centralized logging, SIEM, or security analytics for production systems.
Familiarity with security frameworks or concepts, such as OWASP Top 10, CIS Benchmarks, principle of least privilege, secure key management; experience in regulated / financial / crypto environments is a plus.
Strong communication and collaboration skills; able to work closely with developers, security, and operations teams to deliver secure-by-design solutions.
Able to articulate thoughts, risks, and trade-offs in a respectful and constructive manner.
Ability to work both independently and as part of a team, with a strong sense of ownership, responsibility, and attention to detail.
Comfortable in a fast-paced, high-availability, security-sensitive environment.
Proficiency in English and Chinese for seamless stakeholder communication.