Architect end-to-end data integration solutions using Talend Data Integration, Talend Cloud, Talend Big Data, and related components
Design ETL/ELT frameworks, data ingestion pipelines, workflow orchestration, and reusable Talend components
Define architecture blueprints, HLD/LLD documentation, and integration patterns
Lead modernization efforts including migration from legacy ETL tools to Talend
Recommend and justify system sizing (hardware, memory, compute, cluster configurations) based on data volumes, throughput requirements, and SLA expectations
Development & Technical Leadership
Build, review, and optimize complex Talend jobs across batch, real-time, and streaming workloads, drawing on a strong hands-on development background
Build and optimize ETL/ELT pipelines integrating diverse data sources — databases, APIs, flat files, cloud platforms, and streaming systems
Design and implement data integration via REST/SOAP APIs, HTTP connectors, and data services and routes within Talend
Implement complex transformations, data quality rules, profiling, cleansing, deduplication, metadata management, and lineage
Leverage Talend Big Data components to process large-scale datasets on Hadoop, Spark, or cloud-native big data platforms
Work across relational databases (Oracle, SQL Server, MySQL, PostgreSQL) and cloud storage solutions
Provide hands-on guidance to development teams on Talend jobs, best practices, error handling, logging, and scalability
Conduct architecture reviews, performance tuning, and optimization of Talend workloads
Streaming & Real-Time Integration
Design and implement real-time and near-real-time data pipelines using Talend with Kafka, Spark Streaming, or equivalent streaming frameworks
Architect event-driven integration patterns for high-throughput, low-latency data flows
Monitor and tune streaming pipelines for performance, fault tolerance, and reliability
Data Quality & Governance
Define and enforce data quality frameworks — profiling, validation rules, anomaly detection, and exception handling — within Talend pipelines
Ensure alignment with enterprise data governance, security, compliance, and data lineage requirements
Operations & DevOps
Oversee deployment, scheduling, monitoring, and maintenance of Talend jobs
Collaborate with DevOps teams to design CI/CD pipelines for Talend solutions
Troubleshoot production issues and ensure high availability and reliability of ETL workflows
Requirements
Required Skills & Experience
10+ years of overall data integration/ETL experience, including at least 4 years as a Talend architect or senior developer
Strong hands-on development expertise in Talend Data Integration, Talend Cloud, Talend Big Data, and Talend Administration Center — must be able to build, not just design
Proven experience designing HLD/LLD and integration architecture patterns
Demonstrated ability to size and justify infrastructure and platform configurations based on workload profiling and capacity planning
Hands-on experience with streaming technologies — Kafka, Spark Streaming, or equivalent — integrated within Talend pipelines
Experience designing and consuming REST/SOAP APIs, HTTP-based connectors, and Talend Data Services for service-oriented integration patterns
Strong expertise in data quality implementation — profiling, cleansing, validation, deduplication, and exception management within Talend
Proficiency with Talend Big Data components and processing at scale on Hadoop, Spark, or cloud-native equivalents
Experience with legacy ETL migration projects (migrations from Informatica, DataStage, or SSIS to Talend preferred)
Proficiency with relational databases (Oracle, SQL Server, MySQL, PostgreSQL) and cloud platforms (AWS, Azure, or GCP)
Knowledge of CI/CD practices, Git-based version control, and DevOps tooling (Jenkins, GitLab CI, etc.)
Understanding of data governance frameworks, metadata management, and lineage concepts
Good to Have
Talend certification (Architect or Developer level)
Exposure to cloud-native data platforms (Databricks, Snowflake, Redshift, BigQuery)
Familiarity with enterprise data cataloging and governance platforms (Collibra, Alation, etc.)
Experience with containerized deployments (Docker, Kubernetes) for Talend workloads