The long-term goal of the Reinert lab is to bridge the existing gap between advanced results in algorithmic research and their practical application as bioinformatics tools for real-world data. The group pursues this goal by solving well-posed computational problems that arise in biomedical –omics data analysis, mostly for next-generation sequencing (NGS) data and for proteomics data produced by HPLC/MS methods.
The applications of these new data are manifold. In NGS analysis they range from DNA sequencing for reference-guided assembly or ChIP-Seq, through sequencing of RNA transcripts (RNA-Seq), to sequencing and mapping of bisulfite-treated reads. In proteomics, various protocols for producing data for qualitative and quantitative analysis are constantly being developed and need adequate algorithms for processing.
The aim of our groups at FUB and MPI is not solely to develop algorithms and tools for those applications, but rather to dissect them into well-defined algorithmic components. We then generalize those components as much as possible and implement them efficiently and robustly in software libraries. The two most mature projects in the Reinert lab are SeqAn and OpenMS. The OpenMS project was initiated jointly with the University of Tübingen and ETH Zurich and now resides mostly in Tübingen. Both SeqAn and OpenMS are supported by the German Network for Bioinformatics Infrastructure (de.NBI).
Ongoing research themes:
Data structures and parallelisation
In this theme the focus lies on the development and implementation of data structures for bioinformatics analysis that can be used in several of the other themes. The emphasis is on handling collections of genomic data, parallelisation and vectorisation of algorithms, and distributed analysis.
Read mapping and variant detection
In this theme we research methods for finding genomic variation between individuals and a reference sequence. This encompasses methods for global and local read mapping and algorithms and tools for finding structural variations.
(Pan)-Genome comparison and metagenomics
In this theme we research the possibilities arising when we have several genomes, i.e. pangenomes or metagenomes. Our research is focused on practical compression techniques, metagenomic analyses and distributed indices.
Genomic RNA analysis
This research theme deals with RNA analysis. We work on fast structural pairwise and multiple alignment and, more recently, on the analysis of lncRNA in collaboration with Annalisa Marsico.
Proteome analysis with HPLC-MS
Together with the Kohlbacher lab in Tübingen we co-founded OpenMS, the most comprehensive library for proteomics and metabolomics analysis, and we work on algorithmic problems in this field.
Infrastructure research and service
We aim to bridge the gap between algorithms and their implementations on the one hand and experimental researchers on the other. To this end, we maintain bioinformatics infrastructure and deliver practical solutions to our biomedical partners.
Finished projects
MARINA, FP7 EU project
The aim of MARINA is to develop and validate the Risk Management Methods for Nanomaterials. To do this, MARINA will address the four central themes for the Risk Assessment and Management of Nanomaterials: Materials, Exposure, Hazard, and Risk. In MARINA we will develop beyond state-of-the-art referential tools from each of these themes and integrate them into a Risk Management Toolbox and Strategy for both human and environmental health. In our work package we develop algorithms and pipelines to analyse protein, gene and metabolite expression.
Blue Ion, BMWi project
The project is about developing software for ion mobility spectrometers.
Predict-IV, Developing analysis pipelines for omics data (EU)
The overall aim of Predict-IV is to develop strategies to improve the assessment of drug safety in the early stage of development and late discovery phase, by an intelligent combination of non-animal-based test systems, cell biology, mechanistic toxicology and in-silico modelling, in a rapid and cost-effective manner. A better prediction of the safety of an investigational compound in early development will be delivered by applying advances in predictive toxicology (toxicogenomics and metabolomics, prediction of pharmacokinetics and high-content imaging) and modelling.
Genome Comparison, Algorithm Engineering for Efficient Genome Comparison (DFG SPP 1307)
There were two main research goals in the proposed project: (1) To design, analyze, implement and experimentally validate efficient and versatile genome comparison algorithms based upon suitable computation models and (2) to integrate all required algorithmic components in the SeqAn library for biological sequence analysis for the purpose of disseminating the core algorithms and data structures to the bioinformatics and algorithm engineering community.
Data Parallelism Algorithm Engineering for High Throughput Sequencing Data (DFG)
This proposal aimed to respond to the described increase of genomic sequence data with algorithmic approaches that benefit from redundancies across multiple datasets.
Genome Comparison, Algorithm Engineering for Efficient Genome Comparison (DFG SPP 1307)
We plan to further pursue the two main research goals from the originally proposed project: (1) To design, analyze, implement and experimentally validate efficient and versatile genome comparison algorithms based upon suitable computation models and (2) to integrate all required algorithmic components in the SeqAn library for biological sequence analysis for the purpose of disseminating the core algorithms and data structures to the bioinformatics and algorithm engineering community.
Clinical degradomics (BMBF)
Characterisation of disease-relevant peptidases.