Ushur is transforming the way enterprises communicate and engage with customers. Fueled by consumers’ self-service demands, enterprises are modernizing customer engagement and experience models. Ushur is fast becoming the platform of choice for Customer Experience Automation™, enabling these enterprises to leapfrog their digital-native counterparts and deliver delightful customer and employee experiences. With cutting-edge Conversational AI, Machine Learning and Intelligent Process Automation technologies, Ushur has enabled Fortune 100 enterprises, including some of the world’s best-known brands in the healthcare, insurance, banking and financial services sectors, to automate their customer engagement. Cloud-native, 100% no-code and purely workflow-driven, Ushur empowers citizen developers within business operations teams to build AI-powered, fully automated and omni-channel experiences that digitally transform customer journeys end-to-end.
Designation: Senior Site Reliability Engineer
Location: Bengaluru, India
Experience: 5-8 Years
What You'll Do:
Infrastructure Reliability & Monitoring:
Design, implement, and maintain highly available systems using Kubernetes across cloud environments (AWS, Azure, GCP).
Implement comprehensive monitoring and alerting solutions using Sumo Logic, ensuring real-time visibility into system health and performance.
Automation & Infrastructure as Code (IaC):
Automate infrastructure provisioning and configuration using tools like Terraform and Ansible, ensuring scalability, efficiency, and reproducibility.
Troubleshooting & Incident Management:
Lead troubleshooting efforts to diagnose and resolve complex issues within distributed systems.
Coordinate incident responses, conduct root cause analyses (RCA), and implement solutions to prevent recurrence.
CI/CD Pipeline Management:
Maintain and optimize continuous integration/continuous delivery (CI/CD) pipelines for deploying infrastructure and applications efficiently.
Performance Optimization:
Monitor system performance and proactively optimize infrastructure to handle growing workloads.
Identify performance bottlenecks and implement scalability solutions.
Security & Compliance:
Collaborate with the security team to implement security best practices and ensure compliance with industry regulations.
Collaboration & Mentoring:
Work closely with Development, Product, and Operations teams to streamline processes and improve system reliability.
Mentor junior engineers, providing technical guidance and advocating for SRE best practices.
Networking:
Design and manage secure, efficient network architecture across cloud platforms.
Apply deep knowledge of networking fundamentals, security protocols, and network performance optimization.
Qualifications:
5+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering roles.
Expertise in Sumo Logic for monitoring, alerting, and log aggregation.
Hands-on experience with Kubernetes, managing large-scale containerized applications.
Strong troubleshooting skills in complex, distributed cloud environments.
Proficiency with cloud platforms such as AWS, Azure, or Google Cloud (GCP).
Automation experience with tools like Terraform, Ansible, or other Infrastructure as Code (IaC) technologies.
Deep understanding of Linux/Unix systems, along with scripting languages such as Bash, Python, or Go.
Expertise in CI/CD pipelines and tools such as Jenkins, GitLab, or CircleCI.
Solid knowledge of networking, security protocols, and best practices for cloud environments.
Preferred Skills:
Experience with database technologies such as MySQL, PostgreSQL, or MongoDB.
Familiarity with MLOps and AI/ML infrastructure is a plus.
Knowledge of service mesh technologies like Istio or Linkerd.
Cloud certifications (AWS, Azure, GCP) are highly desirable.
Personal Attributes:
Strong analytical and problem-solving abilities.
Excellent communication and collaboration skills.
Passion for automation and innovation, with a deep sense of ownership over the systems managed.
Ability to lead and mentor, fostering growth within the team.