Data Engineer + AI Exposure
Cognizant · Pune, Maharashtra, India
Full-time · Senior · Posted 9 days ago
Role: Data Engineer + AI Exposure
Location: Bangalore
Experience: 7 to 13 Years
Notice: Immediate to 90 days
Job Summary
We are seeking a skilled Data Engineer with AI/ML exposure to design, build, and maintain scalable data pipelines and to support data-driven applications, including AI/ML use cases. The ideal candidate has strong expertise in data engineering tools, along with working knowledge of machine learning workflows and cloud-based data platforms.
Key Responsibilities
Data Engineering
Design, develop, and maintain scalable ETL/ELT pipelines
Build and optimize data architectures, data lakes, and data warehouses
Ensure data quality, integrity, and security across systems
Work with structured and unstructured data from various sources
Big Data & Cloud
Develop solutions using tools such as Azure Data Factory / AWS Glue / GCP Dataflow
Work with big data technologies like Spark, Hadoop, or Databricks
Manage data storage solutions such as S3, ADLS, BigQuery, Snowflake, or Redshift
AI/ML Exposure
Support machine learning pipelines and data preparation for ML models
Collaborate with Data Scientists to enable feature engineering and model deployment
Work on AI-enabled data solutions (e.g., NLP, recommendation systems, prediction models)
Basic understanding of ML frameworks (experience with Scikit-learn, TensorFlow, or PyTorch is a plus)
Data Modeling & Optimization
Design and implement data models (dimensional & normalized)
Optimize queries and pipelines for efficiency and cost
Collaboration & Governance
Work closely with business teams, analysts, and ML engineers
Implement data governance, lineage, and compliance standards
Document workflows, pipelines, and architectures
Required Skills
Core Data Engineering
Strong SQL and Python skills
Experience with ETL tools and pipeline orchestration (Airflow, ADF, etc.)
Hands-on experience with data warehousing concepts
Big Data Technologies
Apache Spark / PySpark
Hadoop ecosystem (optional but preferred)
Cloud Platforms (any one required)
Hands-on experience with Azure, AWS, or GCP
Familiarity with cloud-native data services
AI/ML Exposure
Experience working with data for ML models
Knowledge of ML lifecycle and data preparation
Exposure to MLOps concepts (bonus)
Preferred Qualifications
Experience with Databricks / Snowflake
Knowledge of API-based data ingestion
Familiarity with CI/CD pipelines
Exposure to real-time streaming (Kafka, Azure Event Hubs, etc.)
Understanding of Generative AI or LLM integrations (added advantage)