Senior Taxonomist

Cohere in Toronto, San Francisco, New York City

Who are we?
We’re a small, diverse team working at the cutting edge of machine learning. At Cohere, our mission is to build machines that understand the world and to make them safely accessible to all. Language is at the crux of this, but it can be difficult and expensive to parse the syntax, semantics, and context that work together to give words meaning. The Cohere platform provides API access to Large Language Models that read billions of web pages and learn to understand the meaning, sentiment, and intent of the words we use with a richness never seen before.

We've raised our Series B, signed a multi-year partnership with Google Cloud, and we are focused on bringing our technology to market. We will partner with customers so they can build natural language understanding and generation into their products with just a few lines of code.

We’re ambitious: we believe our technology will fundamentally transform how industries interact with natural language. And we have the technical chops to back it up: Cohere’s CEO, Aidan Gomez, is a co-author of the groundbreaking paper “Attention Is All You Need” (over 53k citations) and was previously part of Google Brain. Our entire technical team is world-class.

We are focused on creating a diverse and inclusive work environment so that all of our team members can thrive. We welcome kind and brilliant people to our team, from wherever they come.

Why this role?
Large language models are core to Cohere, and the data we collect is core to the language models we train. While the current iteration of LLMs is trained primarily on web text, the next generation of LLMs will rely on human annotation to create custom datasets to further develop the capabilities of these models.

We are looking for a Senior Taxonomist to work closely with engineering and product teams and lead the creation of custom datasets for training specialized models that enable enterprise solutions built on LLMs' cutting-edge capabilities.

This role requires a diverse set of skills and draws on a range of disciplines. We are therefore considering a broad range of backgrounds for this role, including ML, NLP, HCI, software engineering, and relevant linguistic and social sciences.

Please Note: We have offices in Toronto, Palo Alto, and London but embrace being remote-first! There are no restrictions on where you can be located for this role.
If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply! If you consider yourself a thoughtful worker, a lifelong learner, and a kind and playful team member, Cohere is the place for you.

We welcome applicants of all kinds and are committed to providing both an equal opportunity process and work environment. We value and celebrate diversity and strive to create an inclusive work environment for all.

Our Perks:
🤝 An open and inclusive culture and work environment
🧑‍💻 Work closely with a team on the cutting edge of AI research
🍽 Free daily lunch
🦷 Full health and dental benefits, including a separate budget to take care of your mental health
🐣 100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK
🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
🏙 Remote-flexible, with offices in Toronto, Palo Alto, and London, plus a coworking stipend
✈️ 6 weeks of vacation and shared Canada/US/UK holidays

Responsibilities:
    • Collaborate with Science and Product teams to define annotation tasks, coordinate resourcing, and review annotated data for quality
    • Develop and disseminate data labeling best practices learned from building enterprise solutions using LLMs
    • Develop labeled data assets according to annotation guides to train and evaluate LLMs in collaboration with Machine Learning Engineers for real-world use cases
    • Collaborate with centralized data and evaluation teams on specialized collection protocols, UIs, and instructions for diverse and creative human annotation tasks

Qualifications:
    • A BA in Linguistics, Library Science, or a related field (We encourage non-traditional backgrounds to apply!)
    • Experience with ontology development and information domain modeling
    • Experience running and managing human annotation jobs for large-scale data collection
    • Experience with quality control and best practices for human annotation
    • Experience with Jupyter notebooks
    • Experience with SQL, the terminal, and the command line