GGDC - Genome-To-Genome Distance Calculator

Calculate Distances

Processing a submitted job may take several minutes, depending on the current workload of the server and the size of the job. Once the job has finished, an email containing the results will be sent to the given address. All data belonging to the job will be deleted afterwards. Some statistical data will be stored permanently in order to generate overall usage statistics.

This service is designed for small and medium-sized datasets of at most 15 MB. This limit should be sufficient for all currently sequenced prokaryotic genomes; for example, the largest prokaryotic genome sequenced to date (Sorangium cellulosum) comprises approx. 13 Mbp. If you intend to use the service for larger datasets, please contact the authors.
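Before submitting, the size of the files can be checked locally. The following Python sketch is only an illustration: it assumes that "15 MB" means 15 * 1024 * 1024 bytes and that the limit applies to all files of one job taken together, neither of which is stated on this page. The names MAX_UPLOAD_BYTES, file_sizes and fits_upload_limit are hypothetical.

```python
import os

# Assumption: "15 MB" is read as 15 * 1024 * 1024 bytes and is applied to the
# combined size of all files in one job; the server may count differently.
MAX_UPLOAD_BYTES = 15 * 1024 * 1024


def file_sizes(paths):
    """Return the size in bytes of each file in paths."""
    return [os.path.getsize(path) for path in paths]


def fits_upload_limit(sizes_in_bytes, limit=MAX_UPLOAD_BYTES):
    """Return True if the combined size does not exceed the limit."""
    return sum(sizes_in_bytes) <= limit
```

A call such as fits_upload_limit(file_sizes(["query.fasta", "reference.fasta"])) would then report whether the two (hypothetical) files stay within the assumed limit. Since an uncompressed FASTA file needs roughly one byte per base plus header lines and line breaks, a 13 Mbp genome occupies a little more than 13 million bytes.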

Use of this form is free for academic purposes. For all other uses, please contact the authors.

Form

General settings

Please choose the software for determining high-scoring sequence pairs (HSPs) or maximally unique matches (MUMs). For each program, we use optimized parameter settings as described in our accompanying publication.
If you are interested in other settings, please contact the authors.

Query Genome:

Please enter the name of the query organism (or its Genbank accession ID, or a list of such IDs). The checkbox indicates whether the entry in this field should be treated as a Genbank accession ID (or a list of IDs, separated by blanks) or as a name.
If you enter a name, only letters, numbers, and the '_' character are allowed. The name itself is optional, but a FASTA sequence is required in that case. If you provide the FASTA files yourself, make sure that they contain the complete genome; otherwise only distance function 2 (identities / HSP length) will work well.
If you enter a (list of) Genbank accession ID(s), the corresponding FASTA file will be downloaded automatically and does not need to be provided by the user.
Accession ID
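The rules for this field can be checked before submission. The sketch below is a rough illustration in Python, assuming that "letters" means ASCII letters only; the server-side check may differ. The function names and the IDs in the usage note are made-up placeholders.

```python
import re

# Assumption: only ASCII letters, digits and '_' count as allowed characters.
# The empty string is accepted because the name is optional.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]*")


def is_valid_name(name):
    """Return True if name contains only letters, digits and '_'."""
    return NAME_PATTERN.fullmatch(name) is not None


def split_accession_ids(field):
    """Split a blank-separated list of accession IDs into a list."""
    return field.split()
```

For example, is_valid_name("Sorangium_cellulosum") is accepted, whereas a name containing a blank is rejected. split_accession_ids("ID_A ID_B") yields two separate IDs.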

Please select the FASTA file containing the nucleotide sequence of the query genome.
In case the query organism contains several chromosomes or extra-chromosomal elements like plasmids, a multi-FASTA file can be uploaded as well.
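If the chromosomes and plasmids are stored in separate FASTA files, they can be joined into one multi-FASTA file before uploading. The following Python sketch shows one possible way; it works on the file contents as strings, and the function names are hypothetical.

```python
def merge_fasta(texts):
    """Join the contents of several FASTA files into one multi-FASTA string."""
    parts = []
    for text in texts:
        text = text.strip()
        if not text:
            continue  # skip empty inputs
        if not text.startswith(">"):
            raise ValueError("FASTA content must start with a '>' header line")
        parts.append(text)
    # Each record starts on its own line; the result ends with a newline.
    return "\n".join(parts) + "\n" if parts else ""


def count_records(fasta_text):
    """Count the sequence records, i.e. the header lines starting with '>'."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))
```

Counting the records afterwards is a simple way to confirm that every replicon ended up in the merged file.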

Reference Genomes:

Please use this form to enter Genbank accession IDs of reference genomes. Alternatively, FASTA files can be uploaded (see below).
The sequences belonging to the accession IDs entered here are automatically downloaded from NCBI and included in the set of reference genomes.
Each line should contain only accession IDs belonging to the same organism. If a reference genome consists of more than one chromosome (or plasmid), the corresponding accession IDs have to be provided on the same line, separated by blanks.
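The expected layout of this field (one organism per line, IDs separated by blanks) can be illustrated with a short Python sketch. It only mirrors the description above and is not the parser used by the server; the function name and the IDs in the usage note are made-up placeholders.

```python
def parse_reference_ids(text):
    """Turn the field content into one list of accession IDs per organism."""
    genomes = []
    for line in text.splitlines():
        ids = line.split()  # IDs on one line are separated by blanks
        if ids:  # ignore empty lines
            genomes.append(ids)
    return genomes
```

With the input "ID_A ID_B" on the first line and "ID_C" on the second, the result contains two reference genomes: the first one consisting of two sequences, the second of one.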



Sequence upload

As an alternative to using Genbank sequences (see above), please use this form to upload genomes in FASTA format.
Please make sure that the files to be uploaded contain the complete genome; otherwise only distance function 2 (identities / HSP length) will work well.
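Why distance function 2 is less sensitive to incomplete genomes can be seen from a small Python sketch. It takes the description "identities / HSP length" literally and sums both quantities over all HSPs. Converting this ratio into a distance as 1 minus the ratio is an assumption made here for illustration; please refer to the accompanying publication for the exact definition. The function names and the HSP values in the usage note are hypothetical.

```python
def identity_ratio(hsps):
    """Sum of identical positions divided by the summed HSP length.

    hsps is a list of (identities, hsp_length) pairs.
    """
    total_identities = sum(identities for identities, _ in hsps)
    total_length = sum(length for _, length in hsps)
    if total_length == 0:
        raise ValueError("at least one HSP with non-zero length is required")
    return total_identities / total_length


def distance_2(hsps):
    """Assumed distance form: 1 minus the identity ratio."""
    return 1.0 - identity_ratio(hsps)
```

With the HSPs [(90, 100), (45, 50)] the ratio is 135 / 150 = 0.9. Note that the genome length does not enter the calculation, so missing parts of a genome only remove HSPs from the sums instead of changing a denominator based on the genome size.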

Please enter the name of the reference organism. Alphanumeric characters and the '_' character are allowed.
This field is optional.
Please select the FASTA file containing the nucleotide sequence of the reference genome.
If the reference organism contains several chromosomes or extra-chromosomal elements like plasmids, a multi-FASTA file can be uploaded as well.
Up to 10 reference genomes (entries 1 to 10) can be uploaded with this form.

Personal data:

Please provide your email address. This address will only be used to send the calculated distances.
 

Disclaimer

Your access to and use of the site and its services is at your own risk. The services are provided "as is", without warranty or condition of any kind. The authors of this site will have no responsibility for any harm to your computer system, loss of data, or other harm that results from your access to or use of the site or its services. The authors make no warranty that the site or its services will be available on an uninterrupted, secure, or error-free basis.
Use at your own risk!