Deogen is a variant-effect predictor that aims at the multi-level contextualization of both the target variant and the affected protein. It performs this contextualization by combining different sources of biological information.
Those sources can be roughly divided into variant-oriented and protein-oriented features. The former are evolution-based and aim at predicting the molecular phenotype induced by the variant on the protein, while the latter include information from protein-protein interaction networks, pathway annotation, and the degree of recessiveness and essentiality of the gene.
The method was developed by Daniele Raimondi, Andrea Gazzo, Marianne Rooman, Tom Lenaerts and Wim Vranken and is published in Bioinformatics (doi: 10.1093/bioinformatics/btw094).
NEW: Deogen is now also available as a Docker container!
The Python implementation of the predictor can be downloaded from the repository, or obtained by cloning it with the following command:
git clone https://firstname.lastname@example.org/eddiewrc/deogen.git
A comprehensive installation guide with usage examples is available in the repository. The software has been designed and tested on Linux systems!
Version 1.1 is now available, with some bugs fixed and updated data. If you run into any problem during the installation, I'll be glad to help; just contact me via email.
Requirements/Disclaimer: The code is released under the GNU GPL. The current implementation requires the following software and data:
(more information and installation details can be found in the README in the git repository!)
- scikit-learn (Machine Learning library in Python)
- UniRef100 protein sequence database
- NCBI nr protein sequence database (latest version)
- NCBI BLAST 2.2.28+ (Basic Local Alignment Search Tool)
- CD-HIT 3.1.2 (clustering and comparing protein sequences)
- PROVEAN (homology-based variant-effect predictor)
- JackHmmer from HMMER suite
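Before running the installation, it can help to check which of the tools listed above are already present on the system. The following is only a minimal sketch and is not part of Deogen: the executable names (`psiblast`, `cd-hit`, `provean.sh`, `jackhmmer`) and the helper `find_missing` are assumptions, and they may differ from what the README in the repository expects.

```python
import importlib.util
import shutil

# Assumed executable names; adjust them to match your local installation.
REQUIRED_EXECUTABLES = {
    "NCBI BLAST+": "psiblast",
    "CD-HIT": "cd-hit",
    "PROVEAN": "provean.sh",
    "HMMER (JackHmmer)": "jackhmmer",
}
# Python packages are checked by their import name.
REQUIRED_MODULES = {"scikit-learn": "sklearn"}


def find_missing(executables=REQUIRED_EXECUTABLES, modules=REQUIRED_MODULES,
                 which=shutil.which, find_spec=importlib.util.find_spec):
    """Return the names of the requirements that could not be located.

    `which` and `find_spec` are injectable so the check can be tested
    without the real tools being installed.
    """
    # An executable counts as missing when it is not found on the PATH.
    missing = [name for name, exe in executables.items() if which(exe) is None]
    # A module counts as missing when Python cannot find an import spec for it.
    missing += [name for name, mod in modules.items() if find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing requirements: " + ", ".join(missing))
    else:
        print("All checked requirements were found.")
```

This only checks that the programs and the Python package can be found; it does not verify their versions, and it does not check for the UniRef100 and NCBI nr databases, which have to be downloaded separately.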
To help us provide the best tool possible, please send us bug reports; we are very grateful for them!
To cite Deogen:
Raimondi, D., Gazzo, A. M., Rooman, M., Lenaerts, T., & Vranken, W. F. (2016). Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects. Bioinformatics, btw094.