AI Enablement Engineer

Neurealm · Bengaluru, Karnataka, India

Full-time · Senior · Posted 1 month ago

Note: This is an AI Enablement role, not AI Applications Engineering. Candidates must be
comfortable working below the model API, at the level of graph optimisation, fine-tuning, layer-wise
profiling, and even kernel implementation. Full-stack engineers who consume pre-trained models are a
different profile and should not be mapped to this JD.

About the Role
You will work on porting, optimising, and enabling AI/ML models on commercial and custom AI
accelerator platforms, spanning both edge devices and data centre silicon. The work is hands-on, low-level, and
highly specialised.

What You Will Do
Port and onboard AI models to target accelerator hardware
Perform quantisation, pruning, and accuracy/performance trade-off analysis
Perform model surgery and graph-level optimisation (ONNX, TFLite, PyTorch)
Develop low-level kernels and implement custom operators
Profile and benchmark inference performance (latency, throughput, power)

What We Are Looking For
Hands-on experience with at least one AI accelerator toolchain: Qualcomm AI Engine, TI C7x/TDA4x, NVIDIA GPUs, MediaTek APU, or similar
Strong Python; C/C++ for kernel-level work
Working knowledge of ONNX, TFLite, and PyTorch model formats
Understanding of quantisation techniques (INT8, FP16, mixed precision, etc.)
Familiarity with embedded Linux, RTOS, or bare-metal environments

Nice to Have
Experience with graph compilers (TVM, MLIR, or vendor-specific)
Background in DSP or signal processing
Prior semiconductor customer delivery experience