Site Reliability Engineer Job at TalentOla, Austin, TX

  • TalentOla
  • Austin, TX

Job Description

Job Title: Site Reliability Engineer

Location: Austin, TX (Onsite)

Job Summary

Seasoned Site Reliability Engineer (SRE) with 8+ years of experience supporting complex, large-scale distributed systems. Highly skilled in managing production failures, conducting root cause analysis, and driving effective remediation. Strong communicator with expertise in alerting, monitoring, and release management, complemented by automation proficiency and a keen ability to learn quickly.

This role involves providing 24/7 support as part of the SRE team, ensuring the reliability and performance of mission-critical Java, .NET, and Batch applications deployed across GCP, PCF, and on-premise environments.

Technical Skills:

  • Expertise in large-scale production systems and technologies, for example load balancing, monitoring, distributed systems, microservices, and configuration management.
  • Solid hands-on experience in troubleshooting and fixing application failures, application performance degradation, code issues, cloud platform issues, batch failures, infrastructure failures, database failures, and network failures.
  • Hands-on experience in performing Production deployments using CI/CD and exposure to deployment strategies.
  • Experience troubleshooting Linux/Unix systems.
  • Monitor application, service, and batch availability.
  • Act quickly on application alerts (performance, availability) and batch job failures.
  • Perform the required analysis (Code/Log) and escalate to the Engineering team as required.
  • Initiate and drive the Techlines in case of outages, major incidents, or batch abends, and ensure service restoration as quickly as possible.
  • Effectively handle Incident, Problem, Release, and Change management.
  • Own and deliver the user stories assigned as part of the sprint.
  • The user stories range from application code debugging, issue analysis, code fixes, knowledge base creation, and documentation of SOPs to production deployments, pre- and post-patching/maintenance activities, and service requests.
  • Build monitoring solutions using APM tools like Splunk, AppDynamics, ThousandEyes, ITRS, AppMetrics, Moogsoft, Kafka, etc.
  • Automate day-to-day operational tasks.
  • Take part in the Exit reviews to ensure best practices are followed and the right code is deployed to Production systems.
  • Provide feedback and recommend improvements that make the systems more stable.
  • Strong understanding of networking concepts (TCP/IP, SSL/TLS, IPSec, VPN, etc.), firewalls, and load balancers.
  • Experience in scripting with Shell, PowerShell, or Python.
  • Strong experience working with any cloud-based infrastructure (PCF, GCP, AWS, Azure Cloud, or others).

Job Tags

Full time
