About the Opportunity
We are a fast-scaling technology company in the Enterprise AI / Generative AI sector, building production-grade LLM-driven products and intelligent automation for global clients. We deliver secure, low-latency generative systems that power conversational AI, summarization, code generation, and retrieval-augmented generation (RAG) applications across cloud-native environments.
We are hiring a Senior Generative AI Engineer (7+ years) to own architecture, model development, and production deployment of advanced generative systems. This is a fully remote role for candidates based in India.
Role & Responsibilities
- Lead end-to-end design and delivery of Generative AI/LLM solutions: data ingestion, pre-processing, model training/fine-tuning, evaluation, and scalable inference.
- Develop and productionize transformer-based models (instruction-tuning, LoRA, quantization) using PyTorch/TensorFlow and Hugging Face tooling.
- Architect and implement RAG pipelines integrating vector databases (FAISS/Milvus/Chroma), dense/sparse retrieval, and scalable embedding workflows.
- Optimize inference throughput and latency using ONNX/TorchScript/TensorRT, autoscaling, and cost-efficient deployment patterns on cloud infrastructure.
- Define MLOps best practices: CI/CD for models, containerization, observability, automated retraining, drift detection, and rollout strategies.
- Mentor engineers, conduct code reviews, and collaborate with product & data science to translate research into reliable production systems.
Skills & Qualifications
Must-Have
- 7+ years software/ML engineering experience with significant time on generative/LLM projects.
- Strong proficiency in Python and deep learning frameworks (PyTorch preferred; TensorFlow acceptable).
- Hands-on experience with Hugging Face Transformers, tokenizers, and training/fine-tuning workflows.
- Proven experience building RAG systems and working with vector stores (FAISS, Milvus, Chroma) and embedding pipelines.
- Experience deploying models to production using Docker, Kubernetes, and cloud services (AWS/GCP/Azure).
- Solid software engineering practices: unit testing, CI/CD, code reviews, and monitoring for ML systems.
Preferred
- Experience with model compression/acceleration (quantization, distillation), ONNX, or TensorRT.
- Familiarity with LangChain or similar orchestration frameworks, agentic workflows, and tool-calling patterns.
- Background in prompt engineering and instruction tuning, with exposure to Reinforcement Learning from Human Feedback (RLHF).
- Knowledge of data privacy, secure model serving, and compliance controls for enterprise deployments.
Benefits & Culture Highlights
- Fully remote, India-based role with flexible hours and a results-oriented culture.
- Opportunity to shape product architecture and scale cutting-edge generative AI for enterprise customers.
- Collaborative environment alongside senior ML engineers, data scientists, and product stakeholders, with mentorship and career growth.
To apply, you should be passionate about bringing advanced generative models into production and comfortable with both research-to-production translation and the operational discipline required to run mission-critical AI systems.