Tools

StrainVis: interactive visual strain-level analysis of microbiome data

StrainVis is a Python-based web application for visual analysis and interactive exploration of results produced by the SynTracker pipeline or by other strain-tracking methods based on ANI (Average Nucleotide Identity). StrainVis accepts either SynTracker’s output file ‘synteny_scores_per_region.csv’ or an ANI file obtained by another method, containing either one reference genome or multiple reference genomes (usually one reference genome per species). Accordingly, it presents analyses for each species separately and for multiple species together. A metadata file (matches…

SynTracker: a pipeline to track closely related microbial strains using genome synteny

In the human gut microbiome, specific strains emerge due to within-host evolution and can occasionally be transferred to or from other hosts. Phenotypic variance among such strains can have implications for strain transmission and interaction with the host. Surveilling strains of the same species, within and between individuals, can further our knowledge about the way in which microbial diversity is generated and maintained in host populations. Existing methods to estimate the biological relatedness of similar strains usually rely on…

ResMiCo: Increasing the quality of metagenome-assembled genomes with deep learning

The number of published metagenome assemblies is rapidly growing due to advances in sequencing technologies. However, sequencing errors, variable coverage, repetitive genomic regions, and other factors can produce misassemblies, which are challenging to detect for taxonomically novel genomic data. Assembly errors can affect all downstream analyses of the assemblies. Accuracy for the state of the art in reference-free misassembly prediction does not exceed an AUPRC of 0.57, and it is not clear how well these models generalize to real-world…

EndoR: an R package for interpreting tree ensemble machine learning models

Tree ensemble machine learning models are increasingly used in microbiome science as they are compatible with the compositional, high-dimensional, and sparse structure of sequence-based microbiome data. While such models are often good at predicting phenotypes based on microbiome data, they only yield limited insights into how microbial taxa may be associated. We developed endoR, a method to interpret tree ensemble models. First, endoR simplifies the fitted model into a decision ensemble. Then, it extracts information on the importance of…

Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets

Mapping metagenome reads to reference databases is the standard approach for assessing microbial taxonomic and functional diversity from metagenomic data. However, public reference databases often lack recently generated genomic data such as metagenome-assembled genomes (MAGs), which can limit the sensitivity of read-mapping approaches. We previously developed the Struo pipeline to provide a straightforward method for constructing custom databases; however, the pipeline does not scale well enough to cope with the ever-increasing number of publicly available microbial genomes….

Struo: a pipeline for building custom databases for common metagenome profilers

Taxonomic and functional information from microbial communities can be efficiently obtained by metagenome profiling, which requires databases of genes and genomes to which sequence reads are mapped. However, the databases that accompany metagenome profilers are not updated at a pace that matches the increase in available microbial genomes, and unifying database content across metagenome profiling tools can be cumbersome. To address this, we developed Struo, a modular pipeline that automates the acquisition of genomes from public repositories and the…

Department of Microbiome Science

Max Planck Institute for Biology Tübingen

Max-Planck-Ring 5, 72076 Tübingen, Germany
