TypOn Team <typon@kdbio.inesc-id.pt>
OWL
TypOn - The Microbial Typing Ontology

TypOn provides a comprehensive description of the existing microbial typing methods for the identification of bacterial isolates and their classification. Such a description constitutes a universal format for the exchange of information in the microbial typing field, providing a vehicle for the integration of the numerous disparate online databases. In its current version, TypOn describes the most widely used microbial typing methods, but it is, and always will be, a work in progress given the constant advances in the microbial typing field.
TypOn - The Microbial Typing Ontology 20140606

An Allele is one of multiple forms of a genetic locus: an observable variant of a locus. Most sequence-based typing methods rely on the different allelic forms of one or more genetic loci (see Locus) to provide a discrimination criterion for isolates at the species or subspecies level.

FragmentSizeBased refers to all typing methods in which allele determination is based on fragment sizes rather than on the fragment sequence.
Fragment Size Based

Genotypic characterizes all Typing Information that is based directly on the sequence itself or on its properties (such as sequence length).
Genotypic

Population of microbial cells in pure culture derived from a single colony on an isolation plate and characterized by identification to the species level.
Definition adapted from Tenover, F. C., R. D. Arbeit, R. V. Goering, P. A. Mickelsen, B. E. Murray, D. H. Persing, and B. Swaminathan. 1995. Interpreting chromosomal DNA restriction patterns produced by pulsed-field gel electrophoresis: criteria for bacterial strain typing. J Clin Microbiol 33:2233-9; and from Struelens, M. J. 1996. Consensus guidelines for appropriate use and evaluation of microbial epidemiologic typing systems. Clin Microbiol Infect 2:2-11.

Unique identifier attributed to a repeat sequence as originally defined by Koreen et al., J Clin Microbiol 2004 Feb;42(2):792-9 (http://www.ncbi.nlm.nih.gov/pubmed/14766855).
Kreiswirth id

Locus is a specific location of a gene or DNA sequence on the chromosome. For typing purposes, a locus can be defined by the pair of PCR primers used to amplify the desired sequence. The resulting sequence is usually trimmed by comparing it to a template sequence.
Locus for typing purposes is a position on the genome, usually defined by similarity.
Locus is a named region on a genome.

MLST stands for Multilocus Sequence Typing. It is a sequence-based typing method that traditionally focused on the sequences of seven housekeeping loci to determine the phylogenetic relationships between strains of the same species. Novel Whole Genome Sequencing (WGS) techniques are bringing new approaches to MLST, moving from the traditional 7-locus approach to schemas containing up to thousands of genes (see rMLST, wgMLST, cgMLST).
Maiden MC, Bygraves JA, Feil E et al. (March 1998). "Multilocus sequence typing: A portable approach to the identification of clones within populations of pathogenic microorganisms". Proc. Natl. Acad. Sci. U.S.A. 95 (6): 3140-3145. doi:10.1073/pnas.95.6.3140. PMC 19708. PMID 9501229.
MLST

MLVA stands for Multilocus Variable Number of Tandem Repeats Analysis. This method can be performed in three distinct ways: by running the amplified locus on a gel, by using a capillary electrophoresis apparatus, or by sequencing the resulting amplified fragment. The first two infer the number of repeats at each locus indirectly from the fragment size.
MLVA

MultiLocus characterizes all methods based on the characteristics (length or sequence) of two or more loci.
MultiLocus

Unique identifier of an article as defined by PubMed (www.pubmed.org).
It can be converted to other unique identifiers commonly used in the scientific literature (PMID, PMCID, Manuscript ID, or DOI) with the tool provided at http://www.ncbi.nlm.nih.gov/pmc/pmctopmid/
PubMed id

RepeatProfilePart is the entity that stores the index order of a given SpaRepeat in a given SpaAllele.
Repeat Profile Part

Unique identifier attributed to a repeat sequence as originally defined by Harmsen D et al., J Clin Microbiol 2003 Dec;41(12):5442-8 (http://www.ncbi.nlm.nih.gov/pubmed/14662923) and currently by the SpaServer (http://www.spaserver.ridom.de/).
Ridom id

Short Read Archive (SRA) accession number. Identifies a link to the raw data generated by a sequencing platform. The Sequence Read Archive (SRA) stores raw sequencing data from next-generation sequencing platforms, including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, Helicos Heliscope®, Complete Genomics®, and Pacific Biosciences SMRT®.
sra acession

ST stands for Sequence Type. It is a unique identifier for a unique allelic profile generated by an MLST method. It is most commonly used with the traditional MLST schemas of seven housekeeping genes.
ST

An MLSTAllele is an Allele (see Allele) defined in a Multilocus Sequence Typing Schema. Typically, it corresponds to the sequence of a locus of a housekeeping gene. Each MLSTAllele has an Id that corresponds to a unique Sequence. This Id is obtained by comparing the Sequence with an online database that is the repository for the MLST Schema (see Schema).

A Schema is the enumeration and description of a set of loci used in a multilocus typing methodology.
Schema

SchemaPart is the entity that stores the index order of a given Locus in a Schema.
Schema Part

Sequence represents the DNA sequence of a given locus by storing its nucleotide sequence and/or a pointer to a file where the sequence is stored in a file system.
Sequence

SequenceBased refers to all typing methods based on the DNA sequence content of one or more loci.
Sequence Based

SingleLocus characterizes all methods based on the characteristics (length or sequence) of a single locus.
Single Locus

A SpaAllele corresponds to a specific repeat sequence found as the result of amplifying the locus of the spa gene in Staphylococcus aureus.
Spa Allele

SpaRepeat refers to the sequence of the repeat together with its identifier. A series of SpaRepeats and their order define a given SpaAllele and therefore a spa type.
Spa Repeat

High-level concept that aggregates all the information on the typing methods that have been applied to characterize the isolate.
Typing Information

An MLVAAllele corresponds to the number of repeats found at a specific locus of an MLVA Schema (see Schema).

The relationship between the :Schema and the foaf:Agent that administers it.
administrated by

cgMLST refers to core genome MLST, a variant of MLST where the loci are core genome genes, i.e., genes present in the great majority of strains of a given species.
cgMLST

The relationship between an :Isolate or an :Allele and the foaf:Agent that curates it.
curated by

The date on which the isolate or allele was inserted into a database.
date entered

Raw sequence file.
file

has allele

has locus

has repeat

has repeat profile part

has schema

has schema part

has sequence

has typing

Allele identifier.
id

image file

The index of the respective part.
index

Mathematical algorithm used by the software to assess the fragment sizes of the bands in gel patterns.
interpolation method

is associated to

is from

is of locus

is of taxon

isolate name

isolated at

List of the fragments from the molecular size marker that are used to assess the sizes of the fragments in the isolates' target bands.
marker fragments used

Molecular size marker used in the experiment.
marker used

number of repeats

nucleotice sequence

other name

owned by

String defining the allelic profile, composed of the values of all Allele IDs, in the locus order that was initially defined, separated by '-'. Commonly used in the traditional MLST schemas.
profile

rMLST refers to ribosomal MLST, a variant of the MLST technique that uses the 53 ribosomal protein genes and allows a universal characterization of bacteria from domain to strain.
Jolley KA, Bliss CM, Bennett JS, Bratcher HB, Brehony C, Colles FM, Wimalarathna H, Harrison OB, Sheppard SK, Cody AJ, Maiden MC. Ribosomal multilocus sequence typing: universal characterization of bacteria from domain to strain. Department of Zoology, University of Oxford, Oxford, UK. keith.jolley@zoo.ox.ac.uk.
rMLST

Experimental protocol describing the electrophoretic conditions, including the apparatus and settings used.
run protocol

The date when the sample was collected.
sample collection date

schema name

sent by

sequence size

Unique identifier for a spa gene repeat sequence assigned by http://www.spaserver.ridom.de/ and curated by SeqNet (http://www.seqnet.org/).
spa type

spaTyping is a typing method based on sequencing the polymorphic X region of the protein A gene (spa), present in all strains of Staphylococcus aureus. The X region consists of a variable number of 24-bp repeats flanked by well-conserved regions.
Shopsin B, Gomez M, Montgomery SO, Smith DH, Waddington M, Dodge DE, Bost DA, Riehman M, Naidich S, Kreiswirth BN. Evaluation of protein A gene polymorphic region DNA sequencing for typing of Staphylococcus aureus strains. J Clin Microbiol. 1999 Nov;37(11):3556-63.
Harmsen D, Claus H, Witte W, Rothgänger J, Claus H, Turnwald D, Vogel U. Typing of methicillin-resistant Staphylococcus aureus in a university hospital setting by using novel software for spa repeat determination and database management. J Clin Microbiol. 2003 Dec;41(12):5442-8.
spa Typing

species name

wgMLST refers to whole genome MLST, a variant of MLST where the loci cover the whole genome, i.e., both core genome genes and accessory genes.
wgMLST
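
The profile entry describes the allelic profile as the Allele IDs joined by '-' in the locus order fixed by the Schema, and the ST entry describes a unique identifier for each unique profile. A minimal Python sketch of that idea could look as follows. The locus names, allele IDs, ST table, and function names are hypothetical placeholders chosen for illustration; they do not come from the ontology or from any real MLST database.

```python
from typing import Dict, List, Optional

# Hypothetical seven-locus schema. The list order plays the role of the
# SchemaPart index (the position of each Locus in the Schema).
SCHEMA_LOCI: List[str] = [f"locus{i}" for i in range(1, 8)]


def build_profile(schema_loci: List[str], allele_ids: Dict[str, int]) -> str:
    """Join the allele IDs with '-' in the locus order defined by the schema."""
    missing = [locus for locus in schema_loci if locus not in allele_ids]
    if missing:
        raise KeyError(f"no allele ID for loci: {missing}")
    return "-".join(str(allele_ids[locus]) for locus in schema_loci)


def lookup_st(profile: str, st_table: Dict[str, int]) -> Optional[int]:
    """Return the ST registered for this profile, or None if it is unknown."""
    return st_table.get(profile)


# Hypothetical allele IDs for one isolate (made-up values).
ISOLATE_ALLELES: Dict[str, int] = {
    "locus1": 3, "locus2": 1, "locus3": 4, "locus4": 1,
    "locus5": 5, "locus6": 9, "locus7": 2,
}

# Hypothetical ST table mapping profile strings to ST numbers.
ST_TABLE: Dict[str, int] = {"3-1-4-1-5-9-2": 42}

if __name__ == "__main__":
    profile = build_profile(SCHEMA_LOCI, ISOLATE_ALLELES)
    print(profile)                       # 3-1-4-1-5-9-2
    print(lookup_st(profile, ST_TABLE))  # 42
```

Keeping the schema as an ordered list means the same set of allele IDs always yields the same string, so a profile can be compared or looked up directly.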