Allegion is seeking a highly motivated Lead Site Reliability Engineer to lead our SRE team in designing solutions that extend security technology. The ideal candidate has proven expertise in leading the design, development, and deployment of scalable, robust systems using cloud technologies.
Requirements
- Provide technical leadership and mentorship to a team of Site Reliability Engineers, promoting best practices in system architecture, reliability, and cloud security.
- Design, implement, and manage high-availability and fault-tolerant systems using Java, Spring, AWS, and cloud security best practices.
- Work with development teams to ensure that systems are designed with resiliency and security in mind.
- Implement and manage monitoring, alerting, and logging solutions to track performance, availability, and security metrics across infrastructure and applications.
- Troubleshoot production issues related to performance, scaling, and security, resolving them promptly and with minimal impact on users.
- Drive automation initiatives across infrastructure, security, and monitoring tasks, aiming to reduce manual intervention and improve efficiency.
- Collaborate with cross-functional teams to design disaster recovery plans, backup strategies, and business continuity plans.
- Write clean, efficient, and well-documented code that adheres to software development best practices and coding standards.
- Stay current with industry trends, technologies, and best practices in software development, and apply them to improve our software applications.
- Fix application bugs, validate the fixes in lower environments, and promote them to production using CI/CD pipelines.
- Collaborate with software engineering teams to build and deploy applications using best practices in reliability, observability, scalability, and security.
- Develop and implement automation tools and frameworks that streamline operational processes.
- Build dashboards that measure KPIs and SLOs with a single-pane-of-glass mindset.
- Participate in on-call rotations and respond to incidents, ensuring timely resolution and minimal impact on users, thereby meeting SLOs/SLAs.
Benefits
- Health, dental, and vision insurance coverage
- Unlimited Paid Time Off
- 401(k) plan with 6% company match and no vesting period
- Health Savings Accounts
- Flexible Spending Accounts
- Disability Insurance
- Life Insurance
- Tuition Reimbursement
- Voluntary Wellness Program
- Employee Discounts
- Community involvement and opportunities to give back