Lead Data Scientist
Eucloid Data Solutions · Chennai, Tamil Nadu, India
Full-time · Senior · Posted 1 month ago
We are looking for a Lead Data Scientist - Vision & Multimodal AI to architect and build next-generation Vision-Language Model (VLM) systems at scale.
The Candidate Will Have Responsibilities Across The Following Functions
Architect And Build RLHF Frameworks
Design end-to-end RLHF pipelines (SFT, reward modelling, PPO/DPO).
Develop scalable human feedback collection systems.
Implement preference modelling and ranking pipelines.
Optimise reward models for multimodal outputs (image + text).
Build automated evaluation frameworks.
Train And Fine-Tune OSS Vision-Language Models
Work with open-source models such as Qwen-VL, Llama, and GPT-OSS.
Pretraining / instruction tuning multimodal models.
Parameter-efficient fine-tuning (LoRA, QLoRA).
Dataset curation and synthetic data generation.
Scaling training on multi-GPU / multi-node clusters.
Optimising for alignment, hallucination reduction, and safety.
Highly Scalable Deployment Of VLM Systems
Design distributed inference pipelines (GPU-optimised).
Model serving using vLLM and Triton Inference Server.
Optimise latency, throughput, and cost.
Implement batching, KV caching, quantisation, and tensor parallelism.
Deploy on Kubernetes-based infrastructure.
Build monitoring for drift, performance, and hallucinations.
Multimodal AI System Design
Architect systems combining OCR, vision encoders, LLMs, and retrieval.
Implement retrieval-augmented multimodal pipelines.
Design evaluation benchmarks for VQA, grounding, and reasoning.
Ensure model safety and guardrails.
Technical Leadership
Lead a team of ML engineers and research scientists.
Define a technical roadmap for multimodal AI.
Review model architectures and code quality.
Collaborate with product and infrastructure teams.
Requirements
6+ years in ML / AI.
2+ years working with large-scale LLM or VLM systems.
Strong hands-on experience building RLHF pipelines (not just using libraries).
Deep PyTorch expertise.
Experience training models with more than 7B parameters.
Experience with distributed training (DeepSpeed, FSDP).
Production-grade deployment experience handling 10k+ QPS workloads.
Strong understanding of transformer architectures.
This Role Requires Deep Expertise In
Architecting and implementing RLHF (Reinforcement Learning from Human Feedback) frameworks.
Training and fine-tuning open-source Vision-Language Models (VLMs).
Deploying and scaling multimodal models to production, serving millions of requests.
This job was posted by Eucloid Careers from Eucloid Data Solutions.