Must have:
Cloud Expertise – Strong understanding of cloud platforms (Azure/AWS) and of AI/ML services and tooling such as Databricks, Azure Cognitive Services, and MLflow.
Infrastructure as Code (IaC) – Hands-on experience with Terraform and IaC orchestration tools such as Terragrunt.
Scripting & Automation – Strong command-line and scripting proficiency in Bash, Python, or equivalent languages.
Containerisation & Orchestration – Expertise in Docker and Kubernetes, and an understanding of how they optimise ML development workflows.
Monitoring & Observability – Experience implementing monitoring and observability for ML-specific use cases.
Collaboration & Communication – Excellent written and verbal communication skills, with the ability to work in collaborative, multi-cultural teams.
Nice to have:
ML Workflow Automation – Experience orchestrating ML pipelines using tools such as Jenkins, GitHub Actions, or dedicated compute environments.
Model & Data Management – Familiarity with model registries, AI agents, Retrieval-Augmented Generation (RAG) techniques, and frameworks such as LangChain and LlamaIndex.
Hands-on experience with Databricks, Azure ML, or SageMaker.
Understanding of security best practices for MLOps, including data privacy & compliance in cloud platforms.
Knowledge of ML frameworks like TensorFlow, PyTorch, or Scikit-learn.
Experience working in complex enterprise environments with strict security and compliance requirements.
Strong networking fundamentals, including configuring and maintaining secure mTLS-based communication between services.
Excellent problem-solving skills and attention to detail.
Exposure to Java or R, beneficial in enterprise AI environments.
Hands-on experience with observability stacks such as Prometheus, Grafana, Splunk, or ELK, and with tuning observability for ML-specific use cases.
Role Responsibilities:
Automate & Optimise AI/ML Infrastructure – Enable scalable, repeatable, and secure AI/ML services for research and development (R&D).
Collaborate Across Teams – Work with ML Engineers, DevOps, and Software teams to design robust ML infrastructure and deployment strategies.
Evaluate & Integrate Emerging Technologies – Continuously assess and integrate MLOps best practices to enhance automation, efficiency, and security.
Monitor & Improve ML Operations – Implement proactive monitoring & alerting solutions to improve system performance, reliability, and operational insights.
Perform Research & Proof-of-Concepts (PoCs) – Conduct research and evaluate new technologies to drive innovation and improve AI/ML development and integration cycles.
Contribute to Internal & External Knowledge Sharing – Document findings, best practices, and PoCs to support broader engineering teams.