This database was built to provide a tool for the quick and accurate identification of dermatophytes using ITS sequences.
In its current version, the database comprises over 200 sequences. About 50% of the sequences belong to strains that are available from the CBS collection, while the remaining sequences were selected from GenBank to cover the extant biodiversity.
The three main components of a DNA barcode are: (1) the full sequence, (2) the unambiguously identified strain/specimen that is deposited in public collections/herbaria, and (3) the trace-files of the sequence. At the moment only a small number of sequences meet the requirements of DNA barcodes by being linked with trace-files. However, trace-files will be provided for all sequences of CBS-strains very soon.
The identification process (search mode) of this database resembles the BLAST tool in GenBank. As in BLAST, the output contains alignments with the most similar sequences plus a distance tree showing the position of the unknown sequence.
To facilitate unambiguous identification, we implemented the following new features:
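As a rough illustration of what "most similar sequences" means, the sketch below ranks reference sequences by percent identity to a query. It is not the search algorithm of this database or of BLAST (which uses heuristic local alignment). It assumes the sequences have already been aligned to equal length, and the function names and example sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions between two pre-aligned sequences.

    Columns where both sequences have a gap ("-") are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def rank_hits(query, references, top=3):
    """Return the `top` (name, identity) pairs, most similar first."""
    scored = [(name, percent_identity(query, seq))
              for name, seq in references.items()]
    return sorted(scored, key=lambda hit: hit[1], reverse=True)[:top]


# Made-up sequences, for illustration only.
references = {
    "ref_a": "ACGTACGT",
    "ref_b": "ACGTACGA",
    "ref_c": "TTGTACGA",
}
hits = rank_hits("ACGTACGT", references, top=2)
```

Here `hits` lists `ref_a` first (identical to the query), followed by `ref_b` (one mismatch in eight positions).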
• threshold values for similarity and distance up to which the identification of a certain species is trustworthy
• validated DNA barcodes (= consensus sequences of all DNA barcodes of the same species ideally including the type barcode) created by using IUPAC codes and “P” = gap/N for the variable positions
• information on the coverage of the distribution area (number of strains of a certain species included in the database / number of countries where these strains were isolated / number of continents to which these countries belong).
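The consensus and coverage ideas in the list above can be sketched as follows. This is a minimal illustration, not the procedure actually used to build the validated barcodes. It assumes the input sequences are already aligned, contain only A, C, G, T, N and "-", and that any column containing a gap or N is written as "P" (one possible reading of the "P" = gap/N convention). The function names and example records are hypothetical.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases seen in a column.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus_barcode(aligned):
    """Collapse equal-length aligned sequences into one consensus string."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*(seq.upper() for seq in aligned)):
        symbols = set(column)
        if symbols & {"-", "N"}:
            # Assumption: a gap or N anywhere in the column is recorded as "P".
            consensus.append("P")
        else:
            # Invariant columns map to the base itself, variable ones
            # to the matching IUPAC ambiguity code.
            consensus.append(IUPAC[frozenset(symbols)])
    return "".join(consensus)


def coverage(records):
    """Count strains, countries and continents.

    `records` is an iterable of (strain, country, continent) tuples.
    """
    strains = {r[0] for r in records}
    countries = {r[1] for r in records}
    continents = {r[2] for r in records}
    return len(strains), len(countries), len(continents)


# Made-up data, for illustration only.
barcode = consensus_barcode(["ACGT-A", "ACGTTA", "ATGTTA"])
summary = coverage([
    ("strain-1", "Netherlands", "Europe"),
    ("strain-2", "Germany", "Europe"),
    ("strain-3", "Japan", "Asia"),
])
```

In this example the second column holds C and T, so it becomes "Y", and the fifth column contains a gap, so it becomes "P". The coverage summary corresponds to the "strains / countries / continents" triple described above.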
The reliability of the identification tool depends strongly on comprehensive sampling. In this database, the common anthropophilic species are well represented and their ITS sequences probably cover the existing diversity. For geophilic species in particular, but also for several zoophilic species, the sampling is incomplete, which leaves the infraspecific variation uncertain. The database will therefore be expanded continually, with emphasis on sequences from underrepresented species and regions of the world. Interaction with clinical networks is essential to this process. We would very much appreciate receiving strains of underrepresented species or strains that cannot be unambiguously identified to species level.
The database presented here is the result of a collaboration between Grit Walther, Vincent Robert and Sybren de Hoog from the CBS, Yvonne Gräser from the Charité, Humboldt University in Germany, and Géraldine Jacon from the BCCM/IHEM collection in Belgium.