45 min
Free
What is SRE
SRE stands for Site Reliability Engineers, originally created at Google in 2003 and has become a key part of any development team.
The primary role of SRE is keep a Website or Web Service stable and available. Which will include:
01
System Design
Design a system in a way that takes advantage of best practices and follows principles where ever it may sit.
02
Monitoring
Create a set of Service Level Objectives and install the monitoring to report the needed Service Level Indicators.
03
Oncall
Use the monitoring and AI to raise an alarm before an incident happens without any interruption to service.
How can I help
With my experience working with small and large organisation, I can help you by
Services
Upskilling
I can help upskill your current team with resources specific to your use cases.
-
Runbooks
-
Playbooks
-
1on1 and group coaching
-
Principles and policies
-
Additional documentation
Management
I can work with your company up to 5 days a week for a short period, take on the management responsibilities of the team in the view to find a suitable replacement. That can either be trained internally or hired externally.
Advise and refinement
I can help the management or executive of the company
-
Demo's and presentations on benefits and needs of SRE.
-
Provide simple, understandable definitions to concerned business departments.
-
Define the requirements and responsibilities of an SRE team.
-
Draw up documentation for tools and procedures for a new team.
-
Create job descriptions and interview candidates.
-
Advise and improve on an ongoing basis.
-
Create architecture designs and system diagrams for different stakeholder viewpoints.
-
Regulatory and Governance (ISO27001, SOC2, PCI, HIPAA)