The pragmatic species concept for Bacteria and Archaea is ultimately based on DNA-DNA hybridization (DDH). While this technique enables the taxonomist, in principle, to estimate the overall similarity between the genomes of two strains, it is tedious, cannot easily be made reproducible between different labs, and cannot be used to incrementally build up a comparative database. Recent technological progress in genome sequencing calls for bioinformatics methods that replace wet-lab DDH with in-silico genome-to-genome comparison.
This web service offers state-of-the-art methods for inferring whole-genome distances that mimic DDH well. These distance functions can also cope with heavily reduced genomes and repetitive sequence regions, and some of them are very robust against missing fractions of genomic information (due to incomplete genome sequencing). Our digitally derived genome-to-genome distances show a better correlation with 16S rRNA gene sequence distances than DDH values do. Thus, this web service can be used for genome-based species delineation.
Once you have obtained complete or incomplete assembled genome sequences, the use is easy: upload your sequence files via our distance calculation form and let our server calculate intergenomic distances for you. These are converted into similarity values analogous to DDH and sent to you via e-mail to support your decision about the relatedness of your novel strain to known type strains.
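To illustrate how a DDH-like similarity value might support a species decision, here is a minimal sketch. It is not the conversion used by this service, which is described in the papers cited below. The linear mapping, the function names, and the assumption that distances lie in [0, 1] are hypothetical; the 70% cutoff is the conventional DDH threshold for species delineation and is not stated on this page.

```python
# Illustrative sketch only: the service's actual distance-to-DDH conversion
# is described in the cited papers and is NOT reproduced here.

# Conventional wet-lab DDH cutoff for species delineation (assumption: the
# same cutoff is applied to the DDH-like similarity value).
DDH_SPECIES_THRESHOLD = 70.0


def distance_to_similarity(distance: float) -> float:
    """Map an intergenomic distance in [0, 1] to a percent similarity.

    Hypothetical linear mapping chosen for illustration only.
    """
    if not 0.0 <= distance <= 1.0:
        raise ValueError("distance must lie in [0, 1]")
    return 100.0 * (1.0 - distance)


def same_species(distance: float,
                 threshold: float = DDH_SPECIES_THRESHOLD) -> bool:
    """Return True if the DDH-like similarity reaches the threshold."""
    return distance_to_similarity(distance) >= threshold


if __name__ == "__main__":
    for d in (0.25, 0.5):
        print(d, distance_to_similarity(d), same_species(d))
```

In practice the reported similarity values would be read from the e-mailed results rather than computed this way; the sketch only shows how such a value could be compared against a threshold.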
This service is provided to the scientific community by Alexander Auch and Markus Göker (DSMZ).
The rationale of the distance calculation and its relation to DDH values are described in the following papers (by using this service you agree to cite at least one of them):
See also the official press release of the DSMZ (in German) and our presentation given at the 3rd joint conference of the DGHM and the VAAM.
The GBDP procedure has previously been introduced in the following studies:
Please feel free to contact us using the address shown below.
Questions, comments, bug reports etc. are welcome.
Neither the uploaded genome sequences nor your e-mail addresses are stored on the server after the calculations have been completed. Your personal data are not collected by us and are not made available to third parties; they are only used by us to calculate general usage statistics. All user data will be deleted after 24 hours.
Your access to and use of the site and its services is at your own risk.
The services are provided "as is", without warranty or condition of any kind.
The authors of this site will have no responsibility for any harm to your computer system, loss of data, or other harm that results from your access to or use of the site or its services.
The authors make no warranty that the site or its services will be available on an uninterrupted, secure, or error-free basis.
Use at your own risk!