
About ClickHouse
Recognized on the 2025 Forbes Cloud 100 list, ClickHouse is one of the most innovative and fast-growing private cloud companies. With over 2,000 customers and ARR that has more than quadrupled over the past year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads. ClickHouse's incredible momentum was confirmed in its recent $350M Series C financing that included new, tier-one investors, Khosla Ventures, BOND, IVP, Battery Ventures and Bessemer Venture Partners. We're on a mission to transform how companies use data. Come be a part of our journey.
What you will do:
Benchmark system performance and carry out database performance analysis, capacity sizing, and optimization.
Troubleshoot and debug applications and server errors, analyze logs, and triage issues accordingly.
Recommend configuration tuning/optimizations for performance bottlenecks.
Work closely and partner with ClickHouse's core development team, cloud team, and security team to improve the performance of ClickHouse Cloud.
Plan, enable, and drive Chaos initiatives across Engineering teams, based on internal priorities.
Develop, deploy, and manage tools to systematically run chaos experiments and measure their impact.
Enjoy working on, and gaining a deep understanding of, large-scale distributed systems.
Study problems in the software resilience, operational, and delivery spaces.
Extend our entire backend to enable Chaos Engineering techniques in the system.
Observe running systems, then determine and prioritize innovative ways to disrupt them.
About you:
You have 8+ years of relevant software development industry experience building and operating scalable, fault-tolerant, distributed systems.
Software development experience in Go, C/C++, Java, or similar.
Experience with concurrency, multithreading, and the deployment of distributed system architectures.
Experience developing cloud infrastructure services, preferably with Kubernetes.
Experience leading and shipping large scope technical projects in collaboration with multiple experienced engineers.
Expertise with a public cloud provider (AWS, GCP, Azure) and its infrastructure-as-a-service offerings (e.g. EC2).
You have excellent communication skills and the ability to work well within a team and across engineering teams.
You are a strong problem solver and have solid production debugging skills.
You are passionate about efficiency, availability, scalability and data governance.
You thrive in a fast-paced environment and see yourself as a partner to the business, with the shared goal of moving the business forward.
You have a high level of responsibility, ownership, and accountability.
Job ID: 136716063