Machine Learning Scientist, Omics



Software Engineering
South San Francisco, CA, USA
Posted on Wednesday, April 3, 2024

The Opportunity

State-of-the-art technologies that measure multiple cellular aspects are at the heart of insitro's efforts to accelerate drug development. We routinely generate large-scale omics datasets in-house to derive clinically relevant insights, and we also leverage publicly available resources.

We are looking to hire a new team member with expertise in developing ML methods for omics readouts, in domains such as single-cell transcriptomics (e.g., batch correction, trajectory analysis, foundation models), spatial omics (e.g., cell-to-cell communication, spatial deconvolution), or multi-omics (e.g., integration of distinct readouts). Your expertise will help the team navigate the complexities of transcriptomics data. It will also ensure that the tools being developed are accurate and effective, and that analyses are performed with the highest rigor and in line with best practices in the field.

In this role, you will directly impact all target prioritization efforts, advance our understanding of diseases, and aid the development of new treatments. Towards these goals, you will build cutting-edge ML methods that address unmet company needs for in-house generated datasets (with an emphasis on multimodal measurements and data integration) as well as for human cohort data, in order to extract insights about disease mechanisms. You will be part of a cross-functional team of life scientists, data scientists, bioengineers, software engineers, and machine learning scientists who strive to identify therapeutic targets and develop drugs of high efficacy and low toxicity.

You will be joining a vibrant biotech startup that is in a high-growth phase and has long-term stability thanks to significant funding. You will have many opportunities for significant impact. You will work closely with a very talented team, learn a broad range of skills, and help shape insitro's culture, strategic direction, and outcomes. Join us, and help make a difference to patients!

About You

  • Ph.D. in computer science, machine learning, computational biology, systems biology, or a related discipline.
  • Extensive hands-on experience developing ML methods for omics data, especially single-cell RNA-seq data
  • Hands-on experience analyzing single-cell RNA-seq data, in particular familiarity with the scverse ecosystem.
  • Experience with spatial transcriptomics and/or proteomics
  • Some understanding of human physiology or disease biology (e.g. neurosciences, cancer biology).
  • Familiarity with data from CRISPR-based experiments (e.g., Perturb-seq)
  • Strong programming skills in scientific programming languages (i.e., Python)
  • Commitment to writing well-commented code and documentation, and familiarity with coding best practices (e.g., version control, code review)
  • Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions
  • Publication record of meaningful contributions to high-quality work in relevant computational biology, systems biology, life sciences, or biomedical venues
  • Passion for making a difference in the world

Nice to Have

  • Experience working with diverse functional genomic assay data (RNA/DNase/ATAC/ChIP-seq, etc.)
  • Hands-on experience working with microscopy data or similar biomedical or biophysical imaging modalities
  • Understanding of systems biology approaches, including network analysis
  • First-hand experience studying diseases using omics or imaging data
  • Experience with proteomics and/or metabolomics data generated through mass spectrometry
  • Passion for troubleshooting, asking questions, and learning independently
  • Experience with modeling time-series datasets
  • Familiarity with cloud computing services (e.g., AWS or Azure)
  • Demonstrated ability to write software in a team, through industry experience or substantial involvement with open-source projects.
  • Experience building infrastructure for data processing

Compensation & Benefits at insitro

Our target starting salary for successful US-based applicants for this role is $185,000 - $270,000. To determine starting pay, we consider multiple job-related factors including a candidate's skills, education and experience, market demand, business needs, and internal parity. We may also adjust this range in the future based on market data.

This role is eligible for participation in our Annual Performance Bonus Plan (based on company targets by role level and annual company performance) and our Equity Incentive Plan, subject to the terms of those plans and associated policies.

In addition, insitro provides our employees with:

  • 401(k) plan with employer matching for contributions
  • Excellent medical, dental, and vision coverage (insitro pays 100% of premiums for employees on our base plans), as well as mental health and well-being support
  • Open, flexible vacation policy
  • Paid parental leave
  • Quarterly budget for books and online courses for self-development
  • Support to occasionally attend professional conferences that are meaningful to your career growth and development
  • New hire stipend for home office setup
  • Monthly cell phone & internet stipend
  • Access to free onsite baristas and cafe with daily lunch and breakfast
  • Access to free onsite fitness center
  • Commuter benefits


About insitro
insitro is a drug discovery and development company using machine learning (ML) and data at scale to decode biology for transformative medicines. At the core of insitro’s approach is the convergence of in-house generated multi-modal cellular data and high-content phenotypic human cohort data. We rely on these data to develop ML-driven, predictive disease models that uncover underlying biologic state and elucidate critical drivers of disease. These powerful models rely on extensive biological and computational infrastructure and allow insitro to advance novel targets and patient biomarkers, design therapeutics and inform clinical strategy. insitro is advancing a wholly owned and partnered pipeline of insights and therapeutics in neuroscience, oncology and metabolism. Since launching in 2018, insitro has raised over $700 million from top tech, biotech and crossover investors, and from collaborations with pharmaceutical partners. For more information on insitro, please visit www.insitro.com.