* Please note, the role is remote and candidates should be based in Poland

The Opportunity

Software Engineering plays a key role in insitro’s approach to rethinking drug development. Our team is responsible for ensuring that our biological data factory’s robots and instruments produce high-quality data, and for optimizing the storage, querying, and ingestion of petabytes of experimental results. On top of this stack, we build the infrastructure that our machine learning engineers and scientists use to train powerful models that solve key problems in the drug development process.

You will work closely with a cross-functional team of scientists, bioengineers, and data scientists to develop data architectures and systems on the high-throughput platforms that enable our scientists to be maximally productive. You will design, implement, and deploy novel methods that use a broad spectrum of data engineering approaches, including techniques at the forefront of the field. You will work as part of a team to rigorously design our data platform, identify key architectural performance improvements, and support ongoing discovery and automation platforms.

Here are some examples of the kinds of projects you can expect to work on:

Design cheminformatic tools and pipelines that enable us to wrangle multi-billion molecule screening libraries.
Build image processing pipelines that transform raw microscopy data into phenotype predictions through our machine learning processes.
Architect data cataloging and provenance tracking solutions that accelerate multidisciplinary scientific teams.

You will be joining the founding team of a biotech startup that has long-term stability thanks to significant funding, yet is still very much in formation. A lot can change in this early and exciting phase, providing many opportunities for significant impact. You will work closely with a very talented team, learn a broad range of skills, and help shape insitro’s culture, strategic direction, and outcomes. Join us, and help make a difference to patients!

About You

BS, MS, or Ph.D. in computer science, statistics, mathematics, physics, engineering, or equivalent practical experience.
Expertise in one or more general-purpose programming languages such as Python, C/C++, or Go. We primarily use Python.
Familiarity with cloud computing services. We use AWS.
Familiarity with database technologies, data pipelines, workflow engines, and distributed computing technologies such as Spark. We primarily use Postgres, redun, and Spark.
Familiarity with web services and application frameworks such as Django and Flask.
Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions.
Proficiency in Linux environments including shell scripting and experience with version control practices and tools such as git.
Passion for making a positive impact through your work.

Nice to Have

Experience with biological data such as DNA sequences, RNAseq, proteomics and microscopy images.
Experience with medium-sized data sets (100TB+)
Experience with the SciPy/PyData ecosystem (numpy, pandas, scipy, dask, etc.)
Demonstrated ability to develop novel data engineering methods that go beyond putting together existing code, and to apply problem-solving skills to complex issues.
Real-world work experience in software development for high-end data processing engines.

Benefits at insitro

Highly competitive salary
Health insurance benefits
Gym allowance
Flexible work schedule
Home office equipment

GDPR

The Controller of your personal data is Insitro, Inc., with offices at 279 East Grand Avenue, South San Francisco, California, United States. Your personal data is processed for the purposes of the current recruitment process. Providing your personal data is voluntary, but its processing and transfer to the United States by or on behalf of Insitro, Inc. is necessary for this purpose. You have the right to access, correct, modify, update, rectify, and request the transfer or deletion of your personal data.

You hereby consent to Insitro, Inc., with offices at 279 East Grand Avenue, South San Francisco, California, United States, retaining and processing your personal data after the current recruitment process is finished, for the purposes of future recruitment processes. You have the right to withdraw this consent at any time by sending a notification to recruiting@insitro.com.

About insitro

insitro is a data-driven drug discovery and development company using machine learning and data at scale to transform the way that drugs are discovered and developed for patients. insitro is developing predictive machine learning models to discover underlying biologic state based on human cohort data and in-house generated cellular data at scale. These predictive models can be brought to bear on key bottlenecks in pharmaceutical R&D to advance novel targets and patient biomarkers, design therapeutics, and inform clinical strategy. insitro is advancing a wholly owned and partnered pipeline of biologic insights and molecules in neuroscience and metabolic diseases. Since formation in mid-2018, insitro has raised over $700 million from top tech, biotech, and crossover investors and from collaborations with pharmaceutical partners. For more information on insitro, please visit the company’s website at www.insitro.com.

Senior/Staff Software Engineer - Scientific Pipelines

Insitro in Poland, Remote
