This role is based onsite in Redwood City (Redwood Shores), California.
We are seeking an exceptional Proteomics Data Scientist, reporting to the Director of Data Science, to join our Data Science team based in our Redwood City office. This group develops and implements the newest methods for proteomics molecular profiling, leveraging Seer’s proprietary technology platform and the latest generation of mass spectrometers. The driving focus of this team, working closely with the , is to extract value and insights from large-scale datasets produced by Seer technology.
Seer’s new Data Scientist will be empowered to perform exploratory data analysis, data wrangling and statistical analysis, build machine learning models, and design novel algorithms for analyzing proteomics data. The ideal candidate will have a track record of accomplishments applying data science methods to address biological problems in academia or in the life sciences/pharmaceutical/biotechnology industry. The successful candidate is likely to have a background in computer science, bioinformatics, molecular biology, or biochemistry with 3 years of experience in life science data engineering. Significant experience with LCMS proteomics methods, data and algorithms is highly desired. This role requires superior skills in programming languages such as Python or R, as well as a highly collaborative work ethic.
Responsibilities & Goals
The Data Science group is focused on using modern data science tooling for processing and analyzing our proteomics molecular profiling data. Areas of specific responsibility and attention will include:
- Perform exploratory data analysis, data wrangling, statistical analysis and modeling. Use appropriate and succinct methods, reproducible reporting, and visualizations to convey insights to the wider team
- Support proteomics and multi-omics data discovery activities through the design and delivery of novel algorithms, statistical methods, tools and techniques
- Apply novel methods to internal and external data to demonstrate impact
- Analyze, select and iterate on Seer’s computational proteomic data processing pipeline to maximize the identification and quantification of the features present in the raw mass spectrometry data
- Partner with data engineers to ensure all optimizations can scale
- Carefully curate generated data sets, optimizing for traceability and ease of access
- Stay current with developments in data science and internally promote new techniques/ideas
- Provide training and advice to scientists on optimal use of key data and analysis in computational mass spectrometry
Key requirements include:
- Master of Science (or equivalent) in a relevant discipline such as computer science, bioinformatics, molecular biology or biochemistry. Further graduate degree preferred.
- Superior Python or R skills in data analysis, statistics and machine learning. Comfortable in Jupyter/RStudio environments (experience in Domino Data Lab is an advantage)
- Strong understanding of reproducible data analysis methods and best practices in software development.
- Demonstrated understanding of algorithms and statistical methods in proteomics, genomics and/or genetics
- Demonstrated understanding of MS-based proteomics algorithms, including peptide identification, PTM identification, protein inference as well as peptide/protein quantification.
- Familiarity with relational databases and good working knowledge of SQL
- Knowledge of Skyline, OpenMS, MaxQuant, MSFragger and other proteomic computational tools would be a distinct advantage
Apply here: https://grnh.se/3b8889af4us