
• Design and develop highly scalable, real-time systems using Hadoop ecosystem components (Iceberg, Spark, Ozone, Trino, Hive, Ranger, Kafka, Flink, and NiFi).
• Build robust data ingestion and transformation frameworks using Java, Spark, Python, and shell scripting to ingest multimodal data (image, audio, video, unstructured documents) in both batch and real-time modes.
• Develop full-stack applications and internal engineering tools using Python, shell scripting, and modern web frameworks (e.g., Flask, React).
• Collaborate closely with data scientists to operationalize machine learning models using Cloudera Machine Learning (CML).
• Tune and optimize the performance of data applications on Hadoop to ensure efficient resource utilization.
• Experience working with ML platforms such as CML, Spark MLlib, and Python ML libraries (scikit-learn, XGBoost), including model deployment.
Total Experience
10+ yrs
Relevant Experience
6+ yrs
Mandatory skills
• Hadoop ecosystem (Spark, Hive, Kafka, Flink, NiFi, Iceberg, Trino)
• Java, Python, Spark (batch & real-time processing)
• Data ingestion & transformation frameworks
• Performance tuning on Hadoop platforms
• Shell scripting
• Real-time data processing systems
• ML model operationalization (CML / Spark ML)
Job ID: 150686333
Skills:
Java, Python, Shell scripting, Hadoop, Spark, Spark MLlib, Hive, Kafka, Flink, NiFi, Iceberg, Trino, Ozone, Ranger, Flask, React, XGBoost, scikit-learn, Cloudera Machine Learning