Data Engineer
Celebal Technologies · Navi Mumbai, Maharashtra, India
Full-time · Senior · Posted 16 days ago
Job Title: Data Engineers
Experience Required: 3-5 Years & 6-10 Years
Location: Navi Mumbai
Duration: Full-time
Positions: Multiple
Job Summary: We are looking for a highly skilled Azure Data Engineer with a strong background in real-time and batch data ingestion and big data processing, particularly using Kafka and Databricks. The ideal candidate will have a deep understanding of streaming architectures, Medallion data models, and performance optimization techniques in cloud environments. This role requires hands-on technical expertise, including live coding during the interview process.
Key Responsibilities
• Design and implement streaming data pipelines integrating Kafka with Databricks using Structured Streaming.
• Architect and maintain Medallion Architecture with well-defined Bronze, Silver, and Gold layers.
• Implement efficient ingestion using Databricks Autoloader for high-throughput data loads.
• Work with large volumes of structured and unstructured data, ensuring high availability and performance.
• Apply performance tuning techniques such as partitioning, caching, and cluster resource optimization.
• Collaborate with cross-functional teams (data scientists, analysts, business users) to build robust data solutions.
• Establish best practices for code versioning, deployment automation, and data governance.
Required Technical Skills:
• Strong expertise in Azure Databricks and Spark Structured Streaming
• 3-8 years of experience in data engineering
• Structured Streaming output modes (append, update, complete)
• Checkpointing and state management
• Experience with Kafka integration for real-time data pipelines
• Deep understanding of Medallion Architecture
• Proficiency with Databricks Autoloader and schema evolution
• Deep understanding of Unity Catalog and foreign catalogs
• Strong knowledge of Spark SQL, Delta Lake, and DataFrames
• Expertise in performance tuning (query optimization, cluster configuration, caching strategies)
• Must have experience with data management strategies
• Excellent grasp of data governance and access management
• Strong in data modelling, data warehousing concepts, and Databricks as a platform
• Solid understanding of window functions