Site Reliability Engineer - Bethesda Maryland
Company: DMI (Digital Management, Inc.)
Location: Bethesda, Maryland
Posted On: 05/03/2024
Site Reliability Engineer
Job ID: 2024-26319
Category: AWS Cloud
Location: US-Remote
About DMI
DMI is a leading global provider of digital services working at the intersection of public and private sectors. With broad capabilities across IT managed services, cybersecurity, cloud migration, and application development, DMI provides on-site and remote support to clients within governments, healthcare, financial services, transportation, manufacturing, and other critical infrastructure sectors. DMI has grown to more than 2,100 employees globally and has been continually recognized as a Top Workplace in both regional and national categories.
About the Opportunity
DMI, LLC is seeking a Site Reliability Engineer who will be responsible for monitoring, automating, and improving the reliability, performance, and availability of software systems.
Duties and Responsibilities:
- Run the production environment by monitoring availability and taking a holistic view of system health.
- Build software and systems to manage platform infrastructure and applications.
- Improve reliability, quality, and time-to-market of our suite of software solutions.
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement.
- Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding.
- Anticipate potential problems before they occur and come up with solutions.
- Conduct post-incident reviews.
- Document your work to turn findings into repeatable actions.
- Partner with development teams to improve services through rigorous testing and release procedures.
- Participate in system design consulting, platform management, and capacity planning.
- Create sustainable systems and services through automation and uplifts.
- Balance feature development speed and reliability with well-defined service-level objectives.
Qualifications
Education and Years of Experience:
- Bachelor's degree in Business, Information Technology, Computer Science, Engineering, or a related technical or functional discipline.
- 5+ years of proven experience as a Site Reliability Engineer or in a similar role.
Required and Desired Skills/Certifications:
- Relevant industry certifications.
- Experience building and maintaining CI/CD pipelines for automated deployments.
- Experience with infrastructure-as-code tools such as CloudFormation (preferred) and Terraform.
- Experience working in an Agile environment (e.g., Scrum, Kanban) and knowledge of JIRA/Confluence.
- Programming experience (structured and OOP) using one or more high-level languages.
- Experience scripting in bash/shell.
- Strong unit testing experience.
- Ability to work with cloud-native infrastructures.
- Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
- Ability to collaborate and communicate asynchronously.
Additional Requirements:
- Previous government contracting experience.
- At least two Associate-level AWS certifications (e.g., AWS Certified Solutions Architect - Associate, AWS Certified SysOps Administrator - Associate, AWS Certified Developer - Associate).
Min Citizenship Status Required: U.S. Citizen
Physical Requirements: No physical requirements for this position.
Location: Remote
Working at DMI
DMI is a diverse, prosperous, and rewarding place to work. Being part of the DMI family means we care about your wellbeing. We offer a variety of perks and benefits that help meet various interests and needs, while giving you the opportunity to work directly with several of our award-winning, Fortune 1000 clients. The following categories make up your DMI wellbeing: