
Position: Developers
Exp: 10+ Years
Requirements
Education:
Bachelor's degree/University degree in Computer Science, Engineering, or equivalent experience
Essential:
Adopt an uncompromising attitude when it comes to quality and help raise the bar for both products and team members
Be a team player who communicates effectively and professionally with both internal and external customers
Identify opportunities to improve system performance and availability
Embrace tackling and resolving complex technical design issues
Possess strong analytical, problem-solving, and decision-making skills while exercising good judgment
Ability to work on multiple projects at a time
Be able to work under pressure and manage deadlines or unexpected changes in expectations or requirements
Good communication skills - ability to convey technical information to a non-technical audience
Ability to understand the big picture
Ability to develop long-lasting relationships at all levels
Deep understanding of and experience with the software development life cycle, including Agile-based rapid delivery
Collaborate with business and IT to elicit, analyse, and review business requirements
Facilitate communication between vendors, the project team, business stakeholders, and the internal IT team
Ability to work in a team distributed across multiple locations
Programming Languages: Proficiency in Python for data processing and automation, plus Unix shell scripting.
Big Data Frameworks: Hands-on experience with Apache Spark for distributed data processing and analytics.
Database & Querying: Strong knowledge of SQL for relational databases and Hive for querying large datasets in Hadoop ecosystems.
ETL Development: Expertise in designing and implementing ETL pipelines for data ingestion, transformation, and loading (a PySpark sketch follows this list).
Workflow Orchestration: Familiarity with Control-M or similar scheduling tools for batch job automation and monitoring.
Data Warehousing: Understanding of data modeling and optimization techniques for large-scale data storage and retrieval.
Performance Tuning: Ability to optimize queries and jobs for efficiency and scalability.
Version Control & CI/CD: Experience with Git and deployment pipelines for data engineering workflows.
BI/Analytics Integration: Familiarity with how downstream tools (Power BI/Tableau) consume curated datasets.
Security: IAM, secrets management, encryption at rest/in transit, PII handling.
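As a concrete illustration of the Python/Spark/ETL skills above, here is a minimal PySpark ETL sketch covering ingestion, transformation, and a partitioned load; every path, table, and column name is a hypothetical placeholder, not part of any actual system.

```python
# Minimal PySpark ETL sketch: ingest a raw file, type the columns,
# and load a partitioned, Parquet-backed Hive table.
# All paths, table names, and columns below are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder
         .appName("daily-trades-etl")   # hypothetical job name
         .enableHiveSupport()
         .getOrCreate())

# Ingest: read a raw CSV drop from HDFS (hypothetical landing path).
raw = spark.read.option("header", True).csv("hdfs:///landing/trades/")

# Transform: cast columns and derive a partition key.
trades = (raw
          .withColumn("trade_ts", F.to_timestamp("trade_ts"))
          .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
          .withColumn("trade_date", F.to_date("trade_ts")))

# Load: overwrite the curated table, partitioned by day.
(trades.write
       .mode("overwrite")
       .partitionBy("trade_date")
       .format("parquet")
       .saveAsTable("curated.trades"))  # hypothetical schema.table
```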
Develop robust data ingestion, transformation, and loading processes across batch and near-real-time workflows.
Implement distributed processing with Apache Spark (PySpark/Scala) for large-scale data transformations and analytics.
Create and maintain logical and physical data models (dimensional/star schemas, data vault, or wide tables) optimized for analytics and reporting.
Write optimized SQL and HiveQL queries; manage tables, partitions, and storage formats (see the partitioning sketch after this list).
Schedule and monitor pipelines using Control-M, ensuring SLA adherence and timely delivery.
Tune Spark jobs, SQL/Hive queries, and storage strategies for scalability and cost efficiency (see the tuning sketch after this list).
Implement validation, reconciliation, and lineage using checks, unit tests, and metadata frameworks (see the validation sketch after this list).
Build operational dashboards and alerts; diagnose failures; drive root-cause analysis and remediation.
Maintain clear runbooks, architecture diagrams, data dictionaries, and coding standards.
Apply best practices for data privacy and access control as applicable.
Execute continuous service improvement and process improvement plans.
Prepare unit test cases and work closely with the Testing team during SIT and UAT.
Build packages and migrate code drops through environments.
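For the data-modeling and HiveQL responsibilities above, a partitioning sketch: date-partitioned, ORC-backed storage lets queries that filter on the partition column skip everything else. Schema, table, and column names are hypothetical.

```python
# Partitioning sketch (hypothetical tables): ORC storage with a date
# partition column, plus a query that benefits from partition pruning.
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS curated.fact_payments (
        payment_id  BIGINT,
        customer_id BIGINT,
        amount      DECIMAL(18,2)
    )
    PARTITIONED BY (payment_date DATE)
    STORED AS ORC
""")

# The filter on the partition column means only one day's files are read.
daily_totals = spark.sql("""
    SELECT customer_id, SUM(amount) AS total_amount
    FROM curated.fact_payments
    WHERE payment_date = DATE '2024-01-01'
    GROUP BY customer_id
""")
daily_totals.show()
```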
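A tuning sketch for the performance item: two commonly adjusted Spark settings and a broadcast join that avoids shuffling the large side. The values and table names are illustrative assumptions, not recommendations for any specific workload.

```python
# Tuning sketch: illustrative shuffle-partition and broadcast settings,
# plus an explicit broadcast hint for a small dimension table.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = (SparkSession.builder
         .config("spark.sql.shuffle.partitions", "400")               # example value
         .config("spark.sql.autoBroadcastJoinThreshold", "52428800")  # 50 MiB
         .enableHiveSupport()
         .getOrCreate())

facts = spark.table("curated.fact_payments")   # hypothetical tables
dims = spark.table("curated.dim_customer")

# Broadcasting the small dimension avoids shuffling the large fact table.
joined = facts.join(broadcast(dims), "customer_id")
```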
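A validation sketch for the data-quality item: a row-count reconciliation between staging and curated layers plus a not-null check on a key column. Table and column names are hypothetical; a real implementation would plug into whatever test or metadata framework the team uses.

```python
# Validation sketch (hypothetical tables/columns): count reconciliation
# plus a not-null check on the business key.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

source_count = spark.table("staging.trades").count()
target_count = spark.table("curated.trades").count()
assert source_count == target_count, (
    f"Reconciliation failed: {source_count} source vs {target_count} target rows")

null_keys = spark.table("curated.trades").filter(F.col("trade_id").isNull()).count()
assert null_keys == 0, f"{null_keys} rows have a NULL trade_id"
```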
Key Domains/Skillset:
Languages: Python (PySpark), SQL, Unix Shell Scripting
Frameworks: Spark, Hive, Sqoop
Orchestration: Control-M
Storage & Files: HDFS, Parquet, ORC
Version Control & CI/CD: Git, GitHub/GitLab
Release and Deployment: Aldon, Jenkins
Issue Tracking: Jira
Documentation: Confluence/Wiki
Optional: Qlik Sense, Tableau, or any reporting dashboard tool
If interested, please drop your updated CV at: [Confidential Information]
RIDIK, a subsidiary of CLPS Inc., is part of a leading global information technology consulting and solutions service provider focused on the banking, insurance, and financial services sectors.
As a wholly-owned subsidiary of CLPS Incorporation (Nasdaq: CLPS), we leverage global resources to deliver innovative, tailored solutions across Asia Pacific, North America, and the Middle East.
We have more than 3,000 employees working across 8 countries and 8 development centres. Our development centres are certified to ISO 9001, ISO 27001, and CMMI Level 5. For more information, please visit https://www.clpsglobal.com/.
Job ID: 138847733