Data Lead
Company Description
Adept is an ML research and product lab building the next frontier of models that can take actions in the digital world. We’ve raised a $65M Series A from Greylock, Addition, and several angel investors, and were recently highlighted by Fortune.
We're looking for passionate team members who are energized by our ambitious mission, excited by a fast-paced startup environment, and eager to learn and collaborate in our San Francisco office.
Check out our ACT demo to see what the future holds.
We are seeking a talented and experienced Data Lead to join our team and help drive the development of our giant neural networks. In this role, you will be responsible for leading the collection, cleaning, and annotation of data from various sources (text, images, and video) across the internet. High-quality data is one of the foundational levers for improving our models and customer experiences. You will work closely with our engineering team to ensure that the data we use to train our models is of the highest quality and diverse enough to accurately represent the real world.
What you'll achieve
- Lead the company's data collection and annotation efforts for giant models
- Develop and maintain processes for collecting and cleaning data from a variety of sources
- Collaborate with the engineering team to ensure that data is of high quality and meets the needs of our models
- Analyze and understand data trends and patterns to inform data collection strategies
- Stay up-to-date with industry best practices and emerging technologies related to data collection and management
Skills you'll need to bring
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field
- Strong understanding of data collection and cleaning techniques, including building large-scale data pipelines and the use of web scraping tools and APIs
- Experience working with large, complex datasets and analyzing data trends and patterns
- Strong problem-solving and communication skills
- Experience with machine learning and AI language models is a plus