Senior Data Engineer - San Francisco, California
Company: Tavus Inc.
Location: San Francisco, California
Posted On: 05/02/2025
About Us
At Tavus, we're building the human layer of AI. Our mission is to make human-AI interaction as natural as face-to-face interaction, enabling the human touch where it has been previously unscalable. We achieve this through pioneering research in multi-modal AI models for human perception and understanding, combined with state-of-the-art human avatar rendering and communication models. Our models power everything from text-to-video AI avatars to real-time conversational video experiences across industries like healthcare, recruiting, sales, education, and more. By enabling AI to see, hear, and communicate with human-like authenticity, we're creating the foundation for the next generation of AI employees, assistants, and companions.

We're a Series A company backed by top investors, including Sequoia, Y Combinator, and Scale VC. Join us in driving the future of human-AI interaction.

The Role
Data is the foundation of everything we build. We're looking for a Senior Data Engineer who goes beyond building pipelines and cleaning datasets. You'll own our entire data strategy, from sourcing and curating to structuring and optimizing, ensuring our models and products are powered by the highest-quality data possible. You're a true master of your craft, including data sourcing, formatting, labeling, cleaning, and making use of our internal data.

Your Mission
- Be a data visionary - You anticipate data needs not just for today, but for the future. You know how to curate diverse, high-quality datasets to ensure AI models reach their full potential.
- You should have a product-minded approach and clearly understand the bigger picture of our mission and the importance of data to it. You're constantly thinking about what data is missing for our next phase of models.
- Influence AI model training - Your data work will directly impact AI model performance, efficiency, and inference accuracy. You will collaborate closely with ML engineers to optimize datasets for maximum AI effectiveness.
- Own the data, end-to-end - from sourcing to structuring, so it's clean, scalable, and actually useful.
- Be a data hunter - Web scraping, third-party deals, unconventional sources: you'll find, collect, and curate the best multimodal data (text, video, images) to power our models. Manage large-scale data procurement to ensure our models train on the highest-quality information.
- Master video data - AI-generated video has unique challenges, from proper classification and segmentation to structuring it for machine learning training. You will own this challenge and ensure that our video datasets are structured for AI success.
- Optimize labeling & automation - You will own the data labeling process and build automated workflows to make cleaning, labeling, and structuring data as efficient as possible. Work closely with our data annotation teams to ensure high-quality labeled data for ML models.
- Turn internal data into gold - Our own platform is a goldmine of insights. Help us unlock and use it to drive smarter decisions and supercharge growth.
- Speed + precision - Move fast, but don't break data. Every pipeline, dataset, and workflow should be tight, efficient, and built to last.

What We're Looking For