A Python package for obtaining, parsing and exploring biological taxonomies.
Description
MultiTax is a Python package that provides a standardised set of functions for downloading, parsing, filtering, exploring, translating, converting and writing multiple taxonomies, including GTDB, NCBI, Silva, Greengenes and Open Tree Taxonomy, as well as custom-formatted taxonomies. The main goals are:
- to be fast, intuitive, generalised and easy to use
- to explore different taxonomies using the same set of commands
- to enable integration and compatibility with multiple taxonomies
- to translate between taxonomies (only partially implemented, between NCBI and GTDB)
MultiTax does not link sequence identifiers to taxonomic nodes; it only handles the taxonomy.
Installation
pip
pip install multitax
conda
conda install -c bioconda multitax
local
git clone https://github.com/pirovc/multitax.git
cd multitax
pip install .
API Documentation
https://pirovc.github.io/multitax/
Basic usage examples with GTDB
from multitax import GtdbTx
# Download and parse taxonomy
tax = GtdbTx()
# Get lineage for the Escherichia genus
tax.lineage("g__Escherichia")
# ['1', 'd__Bacteria', 'p__Proteobacteria', 'c__Gammaproteobacteria', 'o__Enterobacterales', 'f__Enterobacteriaceae', 'g__Escherichia']
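
To make the lineage call above more concrete, here is a minimal, self-contained sketch of what a lineage lookup does conceptually: it walks from a node up to the root through a child-to-parent table. This is not MultiTax's implementation and does not use the package. The `PARENTS` table, the `ROOT` constant and the `lineage` helper are hypothetical names introduced for illustration, filled in only with the nodes from the example output above.

```python
# Hypothetical child -> parent table (illustration only, not MultiTax internals).
# The entries are taken from the GTDB lineage shown in the example above.
PARENTS = {
    "g__Escherichia": "f__Enterobacteriaceae",
    "f__Enterobacteriaceae": "o__Enterobacterales",
    "o__Enterobacterales": "c__Gammaproteobacteria",
    "c__Gammaproteobacteria": "p__Proteobacteria",
    "p__Proteobacteria": "d__Bacteria",
    "d__Bacteria": "1",
}
ROOT = "1"  # root node identifier, as in the example output above


def lineage(node, parents=PARENTS, root=ROOT):
    """Return the path from the root down to `node`, root first."""
    if node != root and node not in parents:
        # Assumption for this sketch: unknown nodes give an empty lineage.
        return []
    path = [node]
    # Climb towards the root, one parent at a time.
    while path[-1] != root:
        path.append(parents[path[-1]])
    path.reverse()
    return path


print(lineage("g__Escherichia"))
# ['1', 'd__Bacteria', 'p__Proteobacteria', 'c__Gammaproteobacteria', 'o__Enterobacterales', 'f__Enterobacteriaceae', 'g__Escherichia']
```

Storing only the parent of each node keeps the table small, and any lineage can be rebuilt on demand by following parents until the root is reached.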