This is a remote position.
You may be required to visit the office (in Pune, Mumbai, or Bangalore) once a month, if needed.
Position Overview
We are seeking a talented Generative AI Engineer with deep expertise in FastAPI development to join our team. In this role, you will design, build, and deploy robust backend systems that power cutting-edge generative AI applications. You will work at the intersection of AI model engineering and scalable API development, collaborating with data scientists, ML engineers, and product teams to deliver innovative AI-driven solutions.
Key Responsibilities
• Design, develop, and maintain high-performance backend systems using Python and FastAPI, enabling seamless integration and deployment of generative AI models
• Develop and expose RESTful APIs for AI-powered services, ensuring secure, reliable, and scalable endpoints (see the illustrative sketch after this list)
• Implement prompt engineering, retrieval-augmented generation (RAG), and embedding techniques to enhance AI model performance and relevance
• Integrate vector databases (e.g., Pinecone, ChromaDB) and manage efficient data storage and retrieval for AI applications
• Collaborate with cross-functional teams to prototype, test, and deploy AI solutions aligned with business objectives
• Develop and maintain CI/CD pipelines for continuous integration, testing, and deployment of AI models and APIs
• Monitor, troubleshoot, and optimize backend and AI systems for performance, reliability, and scalability
• Stay current with advancements in generative AI, LLMs, and backend technologies, integrating best practices into the development lifecycle
• Document APIs, workflows, and system architecture for team knowledge sharing and future maintenance
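For context on the kind of work described above, here is a minimal, hypothetical sketch of a FastAPI endpoint serving a retrieval-augmented generation flow. The helper functions, model names, and route are illustrative placeholders only, not a prescribed design; a real service would plug in the team's chosen vector database and LLM client.

# Illustrative sketch only: a minimal FastAPI endpoint for a RAG-style service.
# retrieve_context and generate_answer are hypothetical stand-ins, not a
# specific vendor SDK.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="GenAI Service (sketch)")

class GenerateRequest(BaseModel):
    query: str
    top_k: int = 3  # number of context chunks to retrieve

class GenerateResponse(BaseModel):
    answer: str
    context: list[str]

def retrieve_context(query: str, top_k: int) -> list[str]:
    # Placeholder for a vector-database lookup (e.g. Pinecone or ChromaDB).
    return [f"stub context {i} for: {query}" for i in range(top_k)]

def generate_answer(query: str, context: list[str]) -> str:
    # Placeholder for an LLM call; a real service would build a prompt from
    # the retrieved context and call the model provider here.
    return f"Echo: {query} (grounded on {len(context)} chunks)"

@app.post("/generate", response_model=GenerateResponse)
def generate(req: GenerateRequest) -> GenerateResponse:
    context = retrieve_context(req.query, req.top_k)
    answer = generate_answer(req.query, context)
    return GenerateResponse(answer=answer, context=context)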
Required Skills and Qualifications
• Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
• 5+ years of experience in backend development with strong proficiency in Python and FastAPI
• Proven experience building and deploying generative AI models (LLMs, GANs, VAEs) in production environments
• Solid understanding of prompt engineering, RAG, embeddings, and vector databases
• Experience integrating AI models with RESTful APIs and managing secure API authentication and authorization
• Proficiency with containerization and deployment tools such as Docker and Kubernetes
• Familiarity with SQL and NoSQL databases (e.g., PostgreSQL, MongoDB) for managing conversational or application data
• Experience with cloud platforms (AWS, Azure, GCP) for deploying AI applications at scale
• Strong problem-solving skills and the ability to tackle complex technical challenges
• Excellent communication and teamwork skills