Lead AI/ML Ops Engineer, Foundry RnD | Mastercard | Pune, Maharashtra, India | 7 - 10 years

3 months ago

Work From Office

  • Enable scalable, repeatable, and secure AI/ML services for research and development (R&D).
  • Continuously assess and integrate MLOps best practices to enhance automation, efficiency, and security.
  • Implement proactive monitoring & alerting solutions to improve system performance, reliability, and operational insights.
  • Cloud Platforms
  • AI/ML
  • Databricks
  • Azure
  • MLflow
  • Terraform
  • Python
  • Kubernetes
  • GitHub Actions
  • RAG (Retrieval-Augmented Generation)
  • LangChain
  • TensorFlow
  • PyTorch
  • Scikit-learn
  • Java

    Job description & requirements

    Must have

    Cloud Expertise – Strong understanding of cloud platforms (Azure/AWS) and AI/ML components such as Databricks, Azure Cognitive Services, and MLflow.

    Infrastructure as Code (IaC) – Hands-on experience with Terraform, and IaC orchestration tools like Terragrunt.

    Scripting & Automation – Strong command-line proficiency with Bash/Python, or equivalent scripting languages.

    Containerisation & Orchestration – Expertise in Kubernetes/Docker and how they optimise ML development workflows.

    Monitoring & Observability – Experience with monitoring for ML-specific use cases.

    Collaboration & Communication – Excellent written and verbal communication skills, with the ability to work in collaborative, multi-cultural teams.


    Nice to have

    ML Workflow Automation – Experience in ML pipeline orchestration, using tools such as Jenkins, GitHub Actions, or dedicated compute environments.

    Model & Data Management – Familiarity with model registries, AI Agents, Retrieval-Augmented Generation (RAG) techniques, and frameworks like LangChain/LlamaIndex.

    Hands-on experience with Databricks, Azure ML, or SageMaker.

    Understanding of security best practices for MLOps, including data privacy & compliance in cloud platforms.

    Knowledge of ML frameworks like TensorFlow, PyTorch, or Scikit-learn.

    Experience working in complex enterprise environments with strict security and compliance requirements.

    Strong networking fundamentals, including configuring and maintaining secure mTLS-based communication between services.

    Excellent problem-solving skills and attention to detail.

    Exposure to Java or R (optional but beneficial for enterprise AI environments).

    Hands-on experience with observability stacks such as Prometheus, Grafana, Splunk, and ELK, including tuning observability for ML-specific use cases.


    Role Responsibilities:

    Automate & Optimise AI/ML Infrastructure – Enable scalable, repeatable, and secure AI/ML services for research and development (R&D).

    Collaborate Across Teams – Work with ML Engineers, DevOps, and Software teams to design robust ML infrastructure and deployment strategies.

    Evaluate & Integrate Emerging Technologies – Continuously assess and integrate MLOps best practices to enhance automation, efficiency, and security.

    Monitor & Improve ML Operations – Implement proactive monitoring & alerting solutions to improve system performance, reliability, and operational insights.

    Perform Research & Proof-of-Concepts (PoCs) – Conduct research and evaluate new technologies to drive innovation and improve AI/ML development and integration cycles.

    Contribute to Internal & External Knowledge Sharing – Document findings, best practices, and PoCs to support broader engineering teams.

    Experience: 7 - 10 years

    Job Domain/Function: AI/ML

    Job Type: Work From Office

    Employment Type: Full Time

    Number of Position(s): 1

    Educational Qualifications: Bachelor's Degree

    Location: Pune, Maharashtra, India
