The Leibniz Institute for the Analysis of Biodiversity Change is a research museum of the Leibniz Association

BaCoCa – assessment of sequence biases

Quick facts

Project title:

BaCoCa – a heuristic software tool for the parallel assessment of sequence biases in hundreds of gene and taxon partitions

ZFMK Project lead:

Dr. Patrick Kück

Unit:

Algorithmic Development

Description

BaCoCa performs multiple statistical analyses on multiple nucleotide and amino acid sequence alignments. Its results support a detailed, statistically comprehensive evaluation of the data and can help identify sequence biases that lead to incorrect phylogenetic tree reconstructions. The program can handle hundreds of user-specified gene and taxon partitions of a single sequence input file in one run.

BaCoCa is a command-line program written in Perl that runs on Windows PCs, Macs, and Linux systems, so it can be integrated easily into automated pipelines for phylogenomic studies. All results produced by BaCoCa can be passed directly to further analyses with statistical R packages. For example, heat map analyses of taxon-versus-gene matrices can be used to find clusters of genes and/or taxa with similar properties. The calculations are fast and can be run on a normal desktop computer, even for phylogenomic data sets.

The downloadable BaCoCa.zip file contains the executable Perl script (BaCoCa.v1.0beta.pl); the manual (BaCoCa_Manual.pdf), which documents all implemented calculations, program usage, and the result output files; and example input files of empirical nucleotide (BaCoCa_Example_Files_NUC) and amino acid (BaCoCa_Example_Files_AA) supermatrices.

Schematic overview of the BaCoCa workflow. Two kinds of alignment files are recognized as input, along with additional files that define taxon subsets and partitions via the c and p options, respectively. The structure of the results folder and an example of the summary file are shown. With the r option, BaCoCa generates heat maps combined with hierarchical clustering.
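To make the idea of a taxon-versus-gene matrix more concrete, here is a minimal Python sketch. It is not BaCoCa's code (BaCoCa is written in Perl) and does not reproduce its calculations, which are documented in the manual. The toy alignment, the partition names, and the functions `gc_content`, `missing_fraction`, and `taxon_by_gene` are all hypothetical, and GC content and the proportion of missing data are used only as simple stand-ins for per-partition statistics.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical toy alignment: taxon name -> aligned nucleotide sequence.
alignment: Dict[str, str] = {
    "taxon_A": "ATGCGCNNATAT",
    "taxon_B": "ATGCGC--ATGC",
    "taxon_C": "GGGCGCATATAT",
}

# Hypothetical gene partitions as 0-based, end-exclusive column ranges.
partitions: Dict[str, Tuple[int, int]] = {"gene1": (0, 6), "gene2": (6, 12)}

# Assumption: these characters count as gaps or missing data.
MISSING = set("N?-")


def gc_content(seq: str) -> Optional[float]:
    """Fraction of G/C among the determined bases (A, C, G, T)."""
    bases = [c for c in seq.upper() if c in "ACGT"]
    if not bases:
        return None  # no determined bases in this partition
    return sum(c in "GC" for c in bases) / len(bases)


def missing_fraction(seq: str) -> float:
    """Fraction of positions that are gaps or missing data."""
    return sum(c in MISSING for c in seq.upper()) / len(seq)


def taxon_by_gene(
    aln: Dict[str, str],
    parts: Dict[str, Tuple[int, int]],
    stat: Callable[[str], Optional[float]],
) -> Dict[str, Dict[str, Optional[float]]]:
    """Apply one statistic to every (taxon, partition) cell."""
    return {
        taxon: {gene: stat(seq[start:end]) for gene, (start, end) in parts.items()}
        for taxon, seq in aln.items()
    }


if __name__ == "__main__":
    # Print one row per taxon, one column per gene partition.
    for taxon, row in taxon_by_gene(alignment, partitions, gc_content).items():
        print(taxon, row)
```

A matrix of this shape, with one row per taxon and one column per gene, is the kind of input that a heat map with hierarchical clustering can then group by similarity.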

The current version of BaCoCa and the corresponding manual can be downloaded from GitHub:

https://github.com/PatrickKueck/BaCoCa

Location

Team

Dr. Patrick Kück

ZFMK Project lead

Algorithmic Development

External team members

Prof. Dr. Torsten Struck, Professor of Evolutionary Genomics at the University of Oslo and curator of the helminth collection at the Natural History Museum, Oslo. Expertise: bioinformatics, phylogenomics, evolutionary genomics, systematics, molecular evolution, annelid phylogeny

Funding

Contact person

Dr. Patrick Kück

Head of Section

Algorithmic Development

+49 228 9122-404

+49 228 9122-212

P.Kueck [at] leibniz-lib.de

Zoological Research Museum Alexander Koenig
