Staff Software Engineer - Infrastructure
Gantry
Software Engineering, Other Engineering
Remote
Posted on Tuesday, July 11, 2023
Gantry is building product testing and analytics for LLM-powered applications. We’re developing the most reliable, trustworthy way to evaluate LLM apps, and workflows to integrate those evaluations into the product development process. You can think of it like unit testing + Mixpanel for AI app builders. Gantry was founded by Josh Tobin, former OpenAI researcher and co-founder of The Full Stack, and Vicki Cheung, former founding engineer at OpenAI and Compute team lead at Lyft.
We are hiring a Staff-level Engineer to build the infrastructure that powers our product. Since our users are data scientists, ML engineers, and infrastructure engineers, your work will not only help our product scale, but also be a determining factor in how usable and delightful it is.
Responsibilities
- Design and develop scalable and fault-tolerant data pipelines for ingesting, processing, and storing large volumes of data generated by ML models
- Collaborate with data scientists and machine learning engineers to identify and implement the most efficient data storage and retrieval mechanisms for monitoring and analyzing model performance
- Ensure that our systems are reliable, secure, and worthy of our customers' trust
- Build workflows that enable our team and our users to be productive when developing and using our systems
- Mentor and provide guidance to junior members of the data infrastructure team, fostering their growth and ensuring high-quality deliverables
Requirements
- Have 10+ years of experience in data engineering or related roles, with a strong focus on designing and implementing data infrastructure for ML model monitoring or similar applications
- Have a deep understanding of distributed systems, data storage technologies (e.g., Hadoop, Elasticsearch, Cassandra), and data processing frameworks (e.g., Spark, Flink)
- Have experience with cloud platforms such as AWS, Azure, or GCP and their data services (e.g., S3, Redshift, BigQuery)
- Are excited to help companies deploy machine learning faster as part of their core products and services
- Thrive on challenging technical problems, such as large-scale data infrastructure that supports both batch and near-real-time use cases
- Are comfortable managing complexity, making pragmatic trade-offs, and dealing with ambiguity
Gantry Systems, Inc. is an equal opportunity employer. We celebrate diversity, are committed to creating an inclusive environment for all employees, and intend to consider qualified applicants with criminal histories. Most of the team is based in San Francisco, but we are building a remote-friendly company and welcome applicants from anywhere in the US or Canada.