Product Reliability Engineer, New Grad - Palo Alto, California
Company: Palantir Technologies
Location: Palo Alto, California
Posted On: 05/04/2024

A World-Changing Company
Palantir builds the world's leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.

The Role
Product Reliability Engineers (PREs) are the driving forces of stability across Palantir's products and help to ensure our products are available 24/7. When something goes wrong, we are the first to respond and are responsible for triaging, troubleshooting, and coordinating the resolution.

Every day at Palantir is different: we're constantly evolving to better respond to customer needs, and as a PRE you will work closely with our engineering and business teams to minimize risks. You are a resourceful, creative, and agile problem solver who is able to work collaboratively and independently to resolve the most difficult and nebulous technical issues. This includes everything from creating product health metrics and automated alerts, to fixing product bugs, streamlining operational tasks, and developing and documenting strategies for responding to incidents.

Whatever the technical root cause of the issue is, you'll play a central and critical role in resolving it - seeking not just a one-time fix, but a permanent solution.

Core Responsibilities
- Develop a deep understanding of Palantir's products and processes.
- Collaborate with customer-facing, product, and infrastructure teams on the development and deployment of scalable, reliable software for our customers.
- Deliver end-to-end improvements to stability by proactively preventing issues via telemetry and automation and directly reducing the need for reactive support.
- Maintain and improve the operational capacity of our production databases, including resolving incidents and streamlining operational workflows.
- Reduce operational overhead and make data-driven decisions about investments in stability and reliability.
- Take part in an on-call rotation responsible for coordinating Palantir's response to critical incidents, ensuring efficient resolution with minimal customer impact.

What We Value
- Excellent problem-solving skills, ability to break down and explain complex concepts, and strong attention to detail.
- Comfort working in a fast-moving environment with dynamic objectives that require creative thinking to address product and customer needs.
- Ability to work independently and make decisions autonomously, as well as collaborate as part of a distributed team with members from our offices across America and Europe.
- Experience coding with Java, Go and/or web technologies (e.g. HTML, CSS, JavaScript, Python/Ruby, Django/Flask/Ruby on Rails, etc.) is a plus.
- Experience with distributed computing systems and/or cloud infrastructures (e.g. Spark, Hadoop, YARN, Kubernetes, AWS, etc.) is a plus.

What We Require
- Background in Computer Science, Engineering, Information Systems, or other technical field.

Our benefits aim to promote health and wellbeing across all areas of Palantirians' lives. We work to continuously improve our offerings and listen to our community as we design and update them. The list below details our available benefits and some of the perks that can be enjoyed as an employee of Palantir Technologies.

Benefits