Data Engineer

Bahwan CyberTek · Chennai, Tamil Nadu, India

Full-time · Senior · Posted 11 days ago

Role description
Pipeline Development & Data Engineering
• Lead end-to-end technical delivery of data engineering solutions aligned with business initiatives and data platform architecture.
• Design, develop, test, and support ingestion and transformation (ETL/ELT) pipelines using Azure Databricks. Configure applications for best performance.
• Develop data processing jobs using PySpark, SQL, and Python with an emphasis on reliability, maintainability, and performance.
• Apply Lakehouse patterns to deliver curated data layers.
• Implement and manage governed data assets using Unity Catalog (organization, access controls, and best practices).

Orchestration & Operational SDLC
• Orchestrate end-to-end workflows using Azure Data Factory (ADF), including scheduling, dependency management, monitoring, and operational support.
• Establish, improve, and implement SDLC processes covering development, testing, and production deployments.
• Apply Agile and DevOps practices. Ensure functional and non-functional platform requirements are met per design and architecture.

Collaboration & Stakeholder Engagement
• Collaborate with the Data/Business Analyst, Testing Engineer, and stakeholders to clarify requirements and ensure delivered outputs meet expectations.
• Collaborate with cross-functional stakeholders, including ZTD Infrastructure, IT, business, and Information Security teams, to evolve technology solutions.
• Engage with technology partners and industry forums to stay abreast of technology trends, and apply them according to business needs.

Engineering Standards & Quality
• Adhere to engineering standards through pull requests, code reviews, documentation, and reusable design patterns.
• Troubleshoot development and pre-production issues and drive continuous improvements to stability and performance.

Other details
Education:
• Bachelor’s degree in Computer Science, Computer Engineering, Data Engineering, or a relevant field.

Required Experience:
• 5–8+ years of experience with a solid understanding of data analytics concepts (data ingestion, data warehouses, data lakes, reporting, etc.).
• 5+ years of experience delivering production ETL pipelines.
• 2+ years of hands-on experience with Databricks (Azure Databricks preferred).
• Experience building, optimizing, and maintaining ETL pipelines using Databricks for large-scale data processing.
• Strong experience with PySpark and SQL; proficiency in Python.
• Strong experience with Azure cloud and data platform services, including Data Factory, Databricks, databases, ETL/ELT concepts, IAM, and security controls.
• Strong understanding of cloud technologies, data management, big data solutions, machine learning, and AI frameworks.
• Experience with ADF for orchestration and operational monitoring.
• Experience implementing CI/CD practices using GitHub Actions.
• Knowledge of, or experience with, DevOps and Agile methodologies.
• Ability to collaborate with cross-functional, infrastructure, and security teams to ensure timely, high-quality delivery.

Preferred Qualifications:
• Experience with Spark/Delta optimization (partitioning, file sizing, performance tuning).
• Experience in healthcare/Life Sciences.
