Data Engineer + AI Exposure

Cognizant · Pune, Maharashtra, India

Full-time · Senior · Posted 9 days ago

Role: Data Engineer + AI Exposure

Location: Bangalore

Experience: 7 to 13 Years

Notice: Immediate to 90 days

Job Summary

We are seeking a skilled Data Engineer with AI/ML exposure to design, build, and maintain scalable data pipelines and to support data-driven applications, including AI/ML use cases. The ideal candidate has strong expertise in data engineering tools along with working knowledge of machine learning workflows and cloud-based data platforms.

Key Responsibilities

Data Engineering

Design, develop, and maintain scalable ETL/ELT pipelines
Build and optimize data architectures, data lakes, and data warehouses
Ensure data quality, integrity, and security across systems
Work with structured and unstructured data from various sources

Big Data & Cloud

Develop solutions using tools such as Azure Data Factory / AWS Glue / GCP Dataflow
Work with big data technologies like Spark, Hadoop, or Databricks
Manage data storage solutions including S3, ADLS, BigQuery, Snowflake, or Redshift

AI/ML Exposure

Support machine learning pipelines and data preparation for ML models
Collaborate with Data Scientists to enable feature engineering and model deployment
Work on AI-enabled data solutions (e.g., NLP, recommendation systems, prediction models)
Apply a basic understanding of ML frameworks (experience with Scikit-learn, TensorFlow, or PyTorch is a plus)

Data Modeling & Optimization

Design and implement data models (dimensional & normalized)
Optimize queries and pipelines for efficiency and cost

Collaboration & Governance

Work closely with business teams, analysts, and ML engineers
Implement data governance, lineage, and compliance standards
Document workflows, pipelines, and architectures

Required Skills

Core Data Engineering

Strong SQL and Python skills
Experience with ETL tools and pipeline orchestration (Airflow, ADF, etc.)
Hands-on experience with data warehousing concepts

Big Data Technologies

Apache Spark / PySpark
Hadoop ecosystem (optional but preferred)

Cloud Platforms (any one required)

Hands-on experience with Azure, AWS, or GCP
Familiarity with cloud-native data services

AI/ML Exposure

Experience working with data for ML models
Knowledge of ML lifecycle and data preparation
Exposure to MLOps concepts (bonus)

Preferred Qualifications

Experience with Databricks / Snowflake
Knowledge of API-based data ingestion
Familiarity with CI/CD pipelines
Exposure to real-time streaming (Kafka, Azure Event Hubs, etc.)
Understanding of Generative AI or LLM integrations (added advantage)