Vacancy title:
Site Reliability Engineer/System Administrator
Jobs at:
ENGIE Energy Access
Deadline of this Job:
Friday, December 22 2023
Summary
Date Posted: Friday, December 08 2023, Base Salary: Not Disclosed
JOB DETAILS:
Job Summary:
We are seeking a talented and experienced System Administrator/Site Reliability Engineer (SRE) to join our dynamic team. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our systems and services. You will collaborate with cross-functional teams to implement and maintain robust infrastructure solutions, focusing on automation, monitoring, and incident response. The ideal candidate is passionate about optimizing and enhancing system reliability, possesses strong problem-solving skills, and is committed to driving excellence in operational practices.
Key Responsibilities:
1. Infrastructure Automation:
o Develop and maintain automation tools and scripts for provisioning, configuration, and deployment.
o Implement infrastructure as code (IaC) practices to ensure consistency and reproducibility.
2. Monitoring and Incident Response:
o Set up and maintain monitoring systems to detect and respond to performance issues and outages.
o Participate in on-call rotations, respond promptly to incidents, troubleshoot issues, and implement solutions to prevent recurrence.
3. Performance Optimization:
o Optimize system performance through continuous analysis and tuning.
4. Reliability Engineering:
o Implement best practices for reliability, such as error budgets, SLIs/SLOs, and blameless post-mortems.
o Work towards minimizing manual intervention through automation.
5. System Administration:
o Manage and maintain server infrastructure, including installation, configuration, and troubleshooting of operating systems.
o Implement and maintain security measures, such as firewalls and intrusion detection systems.
o Perform regular system backups and recovery procedures.
6. Collaboration and Communication:
o Collaborate with cross-functional teams to align infrastructure and operational requirements.
o Provide technical guidance and support to colleagues in areas related to reliability.
Qualifications:
• Bachelor’s degree in Computer Science, Information Technology, or a related field.
• Proven experience as a Site Reliability Engineer or System Administrator.
• Strong Linux and Bash scripting skills.
• Proficiency in cloud platforms (e.g., AWS, Azure, GCP, Linode, DigitalOcean).
• Experience with containers and container orchestration tools (e.g., Kubernetes, Docker, LXD).
• In-depth knowledge of networking, security, and system administration.
• Familiarity with infrastructure as code tools (e.g., Terraform, Ansible).
• Excellent problem-solving and troubleshooting skills.
• Strong communication and collaboration skills.
Preferred Qualifications:
• Experience with CI/CD pipelines and related tools.
• Knowledge of distributed systems and microservices architecture.
• Familiarity with observability tools (e.g., Prometheus, Grafana, ELK stack).
• Familiarity with programming languages (e.g., Python, Ruby).
Job Experience: No Requirements
Work Hours: 8
Experience in Months:
Level of Education: Bachelor Degree
Job application procedure
Interested and qualified? Click here to apply.