Application SRE (Job ID: 207481)

Urgent
Application ends: May 6, 2025

Job Description

Company: Infosys Limited
Location: Bangalore, Karnataka, India
Job Type: Full-Time
Experience: 9-13 Years
Education: Master of Engineering / Master of Technology / Bachelor of Science /
Bachelor of Engineering / BTech (or equivalent)
Expected Salary: ₹22 – ₹30 Lakhs per year
Job Category: Information Technology, Engineering

About the Role:
As a Senior Site Reliability Engineer (Application SRE) at Infosys, you will play a critical
role in supporting our application developers by providing expert guidance on reliability
best practices for both applications and infrastructure.
Your responsibilities include improving the reliability, quality, and time-to-market of our
suite of products and applications through strategic engineering, continuous monitoring,
and proactive issue resolution.
You will define system metrics (SLO/SLI), establish observability mechanisms, and
develop error budgets. In addition, you will design high-availability architectures, drive
a metrics-driven culture, and work closely with solution architects and development
teams to ensure our systems are robust, secure, and highly efficient.

Key Responsibilities:

  • Reliability & Observability:
  • Define suitable metrics (SLO/SLI) and set up observability mechanisms to track system
    performance.
  • Establish error budgets as per defined SLOs and balance feature development speed
    with system reliability.
  • System Architecture & Automation:
  • Design strategies and implement high availability and load balancer-based
    architectures.
  • Optimize automation and develop self-healing capabilities for systems.
  • Operational Support:
  • Provide primary operational support and engineering for products and applications.
  • Manage and participate in on-call incidents (Priority Incidents) and lead root cause
    analysis for issues.
  • Collaboration & Integration:
  • Partner with solution architects and development teams to enhance service reliability
    and performance.
  • Participate in system design, optimize code, and automate operational tasks to reduce
    toil.
  • Performance & Security:
  • Provide solutions for performance management, monitoring, and observability.
  • Improve application security and performance through proactive measures.
  • Process & Best Practices:
  • Define, evangelize, and maintain SRE best practices along with DevSecOps standards.
  • Work on distributed tracing to visualize workflows and analyse issues/incidents.
  • Technical Excellence:
  • Use scripting and programming languages (Python, Ruby, Java, Node.js, etc.) and data
    formats such as JSON to develop and maintain tools.
  • Leverage experience with observability tools (e.g., New Relic, Prometheus, DataDog,
    Splunk) and event correlation tools like BigPanda.
  • Cloud & Infrastructure:
  • Work with cloud platforms (AWS, Azure, Google Cloud) and container orchestration
    tools (Kubernetes, Docker Swarm) to support scalable systems.

Technical and Professional Requirements:

  • Experience:
  • At least 5 years of SRE experience in large-scale programs with a focus on release
    engineering, observability, and reliability.
  • Hands-on project experience in SAP S/4HANA public cloud is not required for this
    role, but relevant SRE project experience is essential.
  • Skills:
  • Proficiency in one or more observability tools (e.g., New Relic, AppDynamics,
    Prometheus, Dynatrace, DataDog, Splunk).
  • Strong experience in scripting or development languages, such as Python, Ruby, Java,
    or Node.js.
  • Experience with CI/CD tooling, Agile methodologies, and configuration management
    tools.
  • Strong knowledge of microservices architecture, cloud cost optimization, and FinOps
    is a plus.
  • Additional:
  • Experience with container orchestration (e.g., Kubernetes, Docker Swarm) and
    infrastructure automation tools (e.g., Terraform, CloudFormation, Ansible, Puppet).
  • Familiarity with ITSM tools such as ServiceNow and knowledge of SQL/NoSQL
    databases.