!version: $Revision: 1.1 $ !date: Thu May 19 14:54:43 PDT 2005 !saved-by: kareneilbeck !autogenerated-by: DAG-Edit version 1.419 rev 3 ! !Gene Ontology definitions ! term: antisense_primary_transcript id: SO:0000645 definition: The reverse complement of the primary transcript. definition_reference: SO:ke term: antisense_RNA id: SO:0000644 definition: Antisense RNA is RNA that is transcribed from the coding, rather than the template, strand of DNA. It is therefore complementary to mRNA. definition_reference: SO:ke term: ARS id: SO:0000436 definition: A sequence that can autonomously replicate, as a plasmid, when transformed into a bacterial host. definition_reference: SO:ma term: assembly id: SO:0000353 definition: A sequence of nucleotides that has been algorithmically derived from an alignment of two or more different sequences. definition_reference: SO:ma term: assembly_component id: SO:0000143 definition: A region of sequence which may be used to manufacture a longer, assembled sequence. definition_reference: SO:ke term: attenuator id: SO:0000140 definition: A sequence segment located between the promoter and a structural gene that causes partial termination of transcription. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: autocatalytically_spliced_intron id: SO:0000588 definition: A self-spliced intron. definition_reference: SO:ke term: binding_site id: SO:0000409 definition: A region on the surface of a molecule that may interact with another molecule. definition_reference: SO:ke term: branch_site id: SO:0000611 definition: A pyrimidine-rich sequence near the 3' end of an intron to which the 5' end becomes covalently bound during nuclear splicing. The resulting structure resembles a lariat. definition_reference: SO:ke term: cap id: SO:0000581 definition: A structure consisting of a 7-methylguanosine in 5'-5' triphosphate linkage with the first nucleotide of an mRNA.
It is added post-transcriptionally, and is not encoded in the DNA. definition_reference: http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/mbglossary/mbgloss.html term: cDNA_match id: SO:0000689 definition: A match against cDNA sequence. definition_reference: SO:ke term: CDS id: SO:0000316 definition: A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon. definition_reference: SO:ma term: centromere id: SO:0000577 definition: A region of chromosome where the spindle fibers attach during mitosis and meiosis. definition_reference: SO:ke term: chromosomal_structural_element id: SO:0000628 definition: A part of a chromosome that has structural function. definition_reference: SO:ke term: chromosome id: SO:0000340 definition: Structural unit composed of long DNA molecule. definition_reference: http://biotech.icmb.utexas.edu/search/dict-search.mhtml term: clip id: SO:0000303 definition: Part of the primary transcript that is clipped off during processing. definition_reference: SO:ke term: clone id: SO:0000151 definition: A piece of DNA that has been inserted in a vector so that it can be propagated in E. coli or some other organism. definition_reference: http://www.geospiza.com/community/support/glossary/ term: clone_end id: SO:0000103 definition: The end of the clone insert. definition_reference: SO:ke term: clone_start id: SO:0000179 definition: The start of the clone insert. definition_reference: SO:ke term: codon id: SO:0000360 definition: A set of (usually) three nucleotide bases in a DNA or RNA sequence, which together signify a unique amino acid or the termination of translation. definition_reference: http://genomics.phrma.org/lexicon/c.html term: complex_substitution id: SO:1000005 definition: When no simple or well defined DNA mutation event describes the observed DNA change, the keyword "complex" should be used. Usually there are multiple equally plausible explanations for the change. 
definition_reference: http://www.ebi.ac.uk/mutations/recommendations/mutevent.html term: contig id: SO:0000149 definition: A contiguous sequence derived from sequence assembly. Has no gaps, but may contain N's from unavailable bases. definition_reference: SO:ls term: CpG_island id: SO:0000307 definition: Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes. definition_reference: SO:rd term: cross_genome_match id: SO:0000177 definition: A nucleotide match against a sequence from another organism. definition_reference: SO:ma term: databank_entry id: SO:2000061 definition: The sequence referred to by an entry in a databank such as Genbank or SwissProt. definition_reference: SO:ke term: decayed_exon id: SO:0000464 definition: A non-functional descendent of an exon. definition_reference: SO:ke term: deletion id: SO:0000159 definition: The sequence that is deleted. definition_reference: SO:ke term: deletion_junction id: SO:0000687 definition: The space between two bases in a sequence which marks the position where a deletion has occurred. definition_reference: SO:ke term: direct_repeat id: SO:0000314 definition: A repeat where the same sequence is repeated in the same direction. Example: GCTGA-----GCTGA. definition_reference: SO:ke term: dispersed_repeat id: SO:0000658 definition: A repeat that is located at dispersed sites in the genome. definition_reference: SO:ke term: enhancer id: SO:0000165 definition: A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.
definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: enzymatic_RNA id: SO:0000372 definition: A non-coding RNA, usually with a specific secondary structure, that acts to regulate gene expression. definition_reference: SO:ma term: EST id: SO:0000345 definition: Expressed Sequence Tag: The sequence of a single sequencing read from a cDNA clone or PCR product; typically a few hundred base pairs long. definition_reference: http://genomics.phrma.org/lexicon/e.html term: EST_match id: SO:0000668 definition: A match against an EST sequence. definition_reference: SO:ke term: exon id: SO:0000147 definition: A region of the genome that codes for a portion of spliced messenger RNA (SO:0000234); may contain 5'-untranslated region (SO:0000204), all open reading frames (SO:0000236) and 3'-untranslated region (SO:0000205). definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: exon_junction id: SO:0000333 definition: The boundary between two exons in a processed transcript. definition_reference: SO:ke term: experimental_result_region id: SO:0000703 definition: A region of sequence implicated in an experimental result. definition_reference: SO:ke term: expressed_sequence_match id: SO:0000102 definition: A match to an EST or cDNA sequence. definition_reference: SO:ke term: five_prime_UTR id: SO:0000204 definition: A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: flanking_region id: SO:0000239 definition: The DNA sequences extending on either side of a specific locus. definition_reference: http://biotech.icmb.utexas.edu/search/dict-search.mhtml term: gap id: SO:0000730 definition: A gap in the sequence of known length. The unknown bases are filled in with N's.
definition_reference: SO:ke term: gene id: SO:0000704 definition: A locatable region of genomic sequence, corresponding to a unit of inheritance, which is associated with regulatory regions, transcribed regions and/or other functional sequence regions definition_reference: SO:rd term: gene_group id: SO:0005855 definition: A collection of related genes. definition_reference: SO:ma term: gene_group_regulatory_region id: SO:0000752 definition: A kind of regulatory region that regulates a gene_group such as an operon, rather than an individual gene. definition_reference: SO:ke comment: This term was added so that gene_groups may have their own collection of regulators. It was necessary because gene groups are regulated by different kinds of regulatory regions than single genes. term: golden_path id: SO:0000688 definition: A set of subregions selected from sequence contigs which when concatenated form a nonredundant linear sequence. definition_reference: SO:ls term: golden_path_fragment id: SO:0000468 definition: One of the pieces of sequence that make up a golden path. definition_reference: SO:rd term: group_II_intron id: SO:0000603 definition: Group II introns are found in rRNA, tRNA and mRNA of organelles in fungi, plants and protists, and also in mRNA in bacteria. They are large self-splicing ribozymes and have 6 structural domains (usually designated dI to dVI). A subset of group II introns also encode essential splicing proteins in intronic ORFs. The length of these introns can therefore be up to 3kb. Splicing occurs in almost identical fashion to nuclear pre-mRNA splicing with two transesterification steps. The 2' hydroxyl of a bulged adenosine in domain VI attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. Protein machinery is required for splicing in vivo, and long range intron-intron and intron-exon interactions are important for splice site positioning. 
Group II introns are further sub-classified into groups IIA and IIB which differ in splice site consensus, distance of bulged A from 3' splice site, some tertiary interactions, and intronic ORF phylogeny. definition_reference: http://www.sanger.ac.uk/Software/Rfam/browse/index.shtml term: group_I_intron id: SO:0000587 definition: Group I catalytic introns are large self-splicing ribozymes. They catalyse their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms. The core secondary structure consists of 9 paired regions (P1-P9). These fold to essentially two domains, the P4-P6 domain (formed from the stacking of P5, P4, P6 and P6a helices) and the P3-P9 domain (formed from the P8, P3, P7 and P9 helices). Group I catalytic introns often have long ORFs inserted in loop regions. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00028 term: guide_RNA id: SO:0000602 definition: A short 3'-uridylated RNA that can form a perfect duplex (except for the oligoU tail (SO:0000609)) with a stretch of mature edited mRNA. definition_reference: http://www.rna.ucla.edu/index.html term: hammerhead_ribozyme id: SO:0000380 definition: A small catalytic RNA motif that catalyzes a self-cleavage reaction. Its name comes from its secondary structure which resembles a carpenter's hammer. The hammerhead ribozyme is involved in the replication of some viroid and some satellite RNAs. definition_reference: http://rnaworld.bio.ukans.edu/class/RNA/RNA00/RNA_World_3.html term: insertion id: SO:0000667 definition: A region of sequence identified as having been inserted. definition_reference: SO:ke term: insertion_site id: SO:0000366 definition: The junction where an insertion occurred. definition_reference: SO:ke term: insulator id: SO:0000627 definition: Nucleic acid regulatory sequences that limit or oppose the action of ENHANCER ELEMENTS and define the boundary between differentially regulated gene loci.
definition_reference: http://medical.webends.com/kw/Insulator%20Elements term: integrated_virus id: SO:0000113 definition: A viral sequence which has integrated into the host genome. definition_reference: SO:ke term: intergenic_region id: SO:0000605 definition: The region between two known genes. definition_reference: SO:ke term: intron id: SO:0000188 definition: A segment of DNA that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: inversion id: SO:1000036 definition: A continuous nucleotide sequence is inverted in the same position. definition_reference: http://www.ebi.ac.uk/mutations/recommendations/mutevent.html term: inverted_repeat id: SO:0000294 definition: The sequence is complementarily repeated on the opposite strand. Example: GCTGA-----TCAGC. definition_reference: SO:ke term: junction id: SO:0000699 definition: A junction refers to an interbase location of zero in a sequence. definition_reference: SO:ke term: located_sequence_feature id: SO:0000110 definition: A biological feature that can be attributed to a region of biological sequence. definition_reference: SO:ke term: match id: SO:0000343 definition: A region of sequence, aligned to another sequence with some statistical significance, using an algorithm such as BLAST or SIM4. definition_reference: SO:ke term: match_part id: SO:0000039 definition: A part of a match; for example, an HSP from BLAST is a match_part. definition_reference: SO:ke term: match_set id: SO:0000038 definition: A collection of match parts. definition_reference: SO:ke term: mature_peptide id: SO:0000419 definition: The coding sequence for the mature or final peptide or protein product following post-translational modification.
definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html term: methylated_A id: SO:0000161 definition: A methylated adenine. definition_reference: SO:ke term: methylated_base_feature id: SO:0000306 definition: A nucleotide modified by methylation. definition_reference: SO:ke term: methylated_C id: SO:0000114 definition: A methylated deoxy-cytosine. definition_reference: SO:ke term: microsatellite id: SO:0000289 definition: A very short unit sequence of DNA (2 to 4 bp) that is repeated multiple times in tandem. definition_reference: http://www.informatics.jax.org/silver/glossary.shtml term: minisatellite id: SO:0000643 definition: A repetitive sequence spanning 500 to 20,000 base pairs (a repeat unit is 5 - 30 base pairs). definition_reference: http://www.rerf.or.jp/eigo/glossary/minisate.htm term: miRNA id: SO:0000276 definition: Small, ~22-nt, RNA molecule that is the endogenous transcript of a miRNA gene. miRNAs are produced from precursor molecules (SO:0000647) that can form local hairpin structures, which ordinarily are processed (via the Dicer pathway) such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. miRNAs may trigger the cleavage of their target molecules or act as translational repressors. definition_reference: PMID:12592000 term: modified_base_site id: SO:0000305 definition: A modified nucleotide, i.e. a nucleotide other than A, T, C, G or (in RNA) U. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types comment: modified base\\: term: mRNA id: SO:0000234 definition: Messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns. definition_reference: SO:ma comment: mRNA does not contain introns as it is a processed_transcript.\nThe equivalent kind of primary_transcript is protein_coding_primary_transcript (SO:0000120) which may contain introns.
term: ncRNA id: SO:0000655 definition: An RNA sequence that does not encode a protein; rather, the RNA molecule is the gene product. definition_reference: SO:ke comment: ncRNA is a processed_transcript so it may not contain parts such as transcribed_spacer_regions that are removed in the act of processing. For the corresponding primary_transcripts, please see term SO:0000483 nc_primary_transcript. term: nc_primary_transcript id: SO:0000483 definition: A primary transcript that is never translated into a protein. definition_reference: SO:ke term: non_transcribed_region id: SO:0000183 definition: A region of the gene which is not transcribed. definition_reference: SO:ke term: nuclease_sensitive_site id: SO:0000684 definition: A region of nucleotide sequence targeted by a nuclease enzyme. definition_reference: SO:ma term: nucleotide_match id: SO:0000347 definition: A match against a nucleotide sequence. definition_reference: SO:ke term: nucleotide_motif id: SO:0000714 definition: A region of nucleotide sequence corresponding to a known motif. definition_reference: SO:ke term: oligo id: SO:0000696 definition: A short oligonucleotide sequence, of length on the order of tens of bases; either single or double stranded. definition_reference: SO:ma term: operator id: SO:0000057 definition: A regulatory element of an operon to which activators or repressors bind, thereby effecting translation of genes in that operon. definition_reference: SO:ma term: operon id: SO:0000178 definition: A group of contiguous genes transcribed as a single (polycistronic) mRNA from a single regulatory region. definition_reference: SO:ma term: ORF id: SO:0000236 definition: The in-frame interval between the stop codons of a reading frame which, when read as sequential triplets, has the potential of encoding a sequential string of amino acids. TER(NNN)nTER definition_reference: SO:ma definition_reference: SO:rb comment: The definition was modified by Rama.
This term is now basically the same as a CDS. This must be revised. term: origin_of_replication id: SO:0000296 definition: The origin of replication; starting site for duplication of a nucleic acid molecule to give two identical copies. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: origin_of_transfer id: SO:0000724 definition: A region of a DNA molecule where transfer is initiated during the process of conjugation or mobilization. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: PCR_product id: SO:0000006 definition: A region amplified by a PCR reaction. definition_reference: SO:ke term: point_mutation id: SO:1000008 definition: A mutation event where a single DNA nucleotide changes into another nucleotide. definition_reference: http://www.ebi.ac.uk/mutations/recommendations/mutevent.html term: polyA_sequence id: SO:0000610 definition: Sequence of about 100 nucleotides of A added to the 3' end of most eukaryotic mRNAs. definition_reference: SO:ke term: polyA_signal_sequence id: SO:0000551 definition: The recognition sequence necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: polyA_site id: SO:0000553 definition: The site on an RNA transcript to which adenine residues will be added by post-transcriptional polyadenylation. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: polypeptide id: SO:0000104 definition: A sequence of amino acids linked by peptide bonds which may lack appreciable tertiary structure and may not be liable to irreversible denaturation.
definition_reference: SO:ma term: polypyrimidine_tract id: SO:0000612 definition: The polypyrimidine tract is one of the cis-acting sequence elements directing intron removal in pre-mRNA splicing. definition_reference: http://nar.oupjournals.org/cgi/content/full/25/4/888 term: possible_assembly_error id: SO:0000702 definition: A region of sequence where there may have been an error in the assembly. definition_reference: SO:ke term: possible_base_call_error id: SO:0000701 definition: A region of sequence where the validity of the base calling is questionable. definition_reference: SO:ke term: primary_transcript id: SO:0000185 definition: The primary (initial, unprocessed) transcript; includes five_prime_clip (SO:0000555), five_prime_untranslated_region (SO:0000204), open reading frames (SO:0000236), introns (SO:0000188) and three_prime_untranslated_region (three_prime_UTR), and three_prime_clip (SO:0000557). definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: primer id: SO:0000112 definition: A short preexisting polynucleotide chain to which new deoxyribonucleotides can be added by DNA polymerase. definition_reference: http://www.ornl.gov/TechResources/Human_Genome/publicat/primer2001/glossary.html term: processed_transcript id: SO:0000233 definition: A transcript which has undergone processing to remove parts such as introns and transcribed_spacer_regions. definition_reference: SO:ke comment: A processed transcript cannot contain introns. term: promoter id: SO:0000167 definition: The region on a DNA molecule involved in RNA polymerase binding to initiate transcription. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: protein_coding_primary_transcript id: SO:0000120 definition: A primary transcript that, at least in part, encodes one or more proteins.
definition_reference: SO:ke comment: May contain introns. term: protein_match id: SO:0000349 definition: A match against a protein sequence. definition_reference: SO:ke term: pseudogene id: SO:0000336 definition: A sequence that closely resembles a known functional gene, at another locus within a genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their "normal" paralog (SO:0000043) (in which case the pseudogene typically lacks introns and includes a poly(A) tail) or from recombination (SO:0000044) (in which case the pseudogene is typically a tandem duplication of its "normal" paralog). definition_reference: http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html term: pseudogenic_exon id: SO:0000507 definition: The exon of a pseudogene. definition_reference: SO:rb comment: Term added in response to request at SO meeting in August 2004 to allow the detailed annotation of pseudogenes. term: pseudogenic_region id: SO:0000462 definition: A non-functional descendent of a functional entity. definition_reference: SO:cjm term: pseudogenic_transcript id: SO:0000516 definition: A transcript of a pseudogene. definition_reference: SO:rb comment: Term added at the SO meeting in August 2004 to allow more detailed annotation of pseudogenes. term: rasiRNA id: SO:0000454 definition: A small, 17-28-nt, small interfering RNA derived from transcripts of repetitive elements. definition_reference: http://www.developmentalcell.com/content/article/abstract?uid=PIIS1534580703002284 term: read id: SO:0000150 definition: A sequence obtained from a single sequencing experiment. Typically a read is produced when a base calling program interprets information from a chromatogram trace file produced from a sequencing machine.
definition_reference: SO:rd term: reading_frame id: SO:0000717 definition: A nucleic acid sequence that when read as sequential triplets, has the potential of encoding a sequential string of amino acids. It does not contain the start or stop codon. definition_reference: SO:rb comment: This term was added after a request by SGD.\nAugust 2004. Modified after SO meeting in Cambridge to not include start or stop. term: read_pair id: SO:0000007 definition: A pair of sequencing reads in which the two members of the pair are related by originating at either end of a clone insert. definition_reference: SO:ls term: reagent id: SO:0000695 definition: A sequence used in an experiment. definition_reference: SO:ke term: region id: SO:0000001 definition: Continuous sequence. definition_reference: SO:ke term: regulatory_region id: SO:0005836 definition: A DNA sequence that controls the expression of a gene. definition_reference: http://www.genpromag.com/scripts/glossary.asp?LETTER=R term: regulon id: SO:1001284 definition: A group of genes, whether linked as a cluster or not, that respond to a common regulatory signal. definition_reference: ISBN:0198506732 term: remark id: SO:0000700 definition: A comment about the sequence. definition_reference: SO:ke term: repeat_family id: SO:0000187 definition: A group of characterized repeat sequences. definition_reference: SO:ke term: repeat_region id: SO:0000657 definition: A region of sequence containing one or more repeat units. definition_reference: SO:ke term: restriction_fragment id: SO:0000412 definition: Any of the individual polynucleotide sequences produced by digestion of DNA with a restriction endonuclease. definition_reference: http://www.agron.missouri.edu/cgi-bin/sybgw_mdb/mdb3/Term/119 term: RFLP_fragment id: SO:0000193 definition: A polymorphism detectable by the size differences in DNA fragments generated by a restriction enzyme.
definition_reference: PMID:6247908 term: ribosome_entry_site id: SO:0000139 definition: Region in mRNA where ribosome assembles. definition_reference: SO:ke comment: gene\\: term: ribozyme id: SO:0000374 definition: An RNA with catalytic activity. definition_reference: SO:ma term: RNAi_reagent id: SO:0000337 definition: A double stranded RNA duplex, at least 20bp long, used experimentally to inhibit gene function by RNA interference. definition_reference: SO:rd term: RNase_MRP_RNA id: SO:0000385 definition: The RNA molecule essential for the catalytic activity of RNase MRP, an enzymatically active ribonucleoprotein with two distinct roles in eukaryotes. In mitochondria it plays a direct role in the initiation of mitochondrial DNA replication. In the nucleus it is involved in precursor rRNA processing, where it cleaves the internal transcribed spacer 1 between 18S and 5.8S rRNAs. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00030 term: RNase_P_RNA id: SO:0000386 definition: The RNA component of Ribonuclease P (RNase P), a ubiquitous endoribonuclease, found in archaea, bacteria and eukarya as well as chloroplasts and mitochondria. Its best characterised activity is the generation of mature 5 prime ends of tRNAs by cleaving the 5 prime leader elements of precursor-tRNAs. Cellular RNase Ps are ribonucleoproteins. RNA from bacterial RNase Ps retains its catalytic activity in the absence of the protein subunit, i.e. it is a ribozyme. Isolated eukaryotic and archaeal RNase P RNA has not been shown to retain its catalytic function, but is still essential for the catalytic activity of the holoenzyme. Although the archaeal and eukaryotic holoenzymes have a much greater protein content than the bacterial ones, the RNA cores from all the three lineages are homologous. Helices corresponding to P1, P2, P3, P4, and P10/11 are common to all cellular RNase P RNAs. Yet, there is considerable sequence variation, particularly among the eukaryotic RNAs. 
definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00010 term: rRNA id: SO:0000252 definition: RNA that comprises part of a ribosome, and that can provide both structural scaffolding and catalytic activity. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types definition_reference: ISBN:0198506732 term: rRNA_18S id: SO:0000407 definition: 18S_rRNA -A large polynucleotide which functions as a part of the small subunit of the ribosome definition_reference: SO:ke term: rRNA_28S id: SO:0000653 definition: A component of the large ribosomal subunit. definition_reference: SO:ke term: rRNA_5.8S id: SO:0000375 definition: 5.8S ribosomal RNA (5.8S rRNA) is a component of the large subunit of the eukaryotic ribosome. It is transcribed by RNA polymerase I as part of the 45S precursor that also contains 18S and 28S rRNA. Functionally, it is thought that 5.8S rRNA may be involved in ribosome translocation. It is also known to form covalent linkage to the p53 tumour suppressor protein. 5.8S rRNA is also found in archaea. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00002 term: rRNA_5S id: SO:0000652 definition: 5S ribosomal RNA (5S rRNA) is a component of the large ribosomal subunit in both prokaryotes and eukaryotes. In eukaryotes, it is synthesised by RNA polymerase III (the other eukaryotic rRNAs are cleaved from a 45S precursor synthesised by RNA polymerase I). In Xenopus oocytes, it has been shown that fingers 4-7 of the nine-zinc finger transcription factor TFIIIA can bind to the central region of 5S RNA. Thus, in addition to positively regulating 5S rRNA transcription, TFIIIA also stabilises 5S rRNA until it is required for transcription. 
definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00001 term: SAGE_tag id: SO:0000326 definition: A short diagnostic sequence tag, serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts. definition_reference: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7570003&dopt=Abstract term: scRNA id: SO:0000013 definition: Any one of several small cytoplasmic RNA molecules present in the cytoplasm and sometimes nucleus of a eukaryote. definition_reference: http://www.ebi.ac.uk/embl/WebFeat/align/scRNA_s.html term: sequence_difference id: SO:0000413 definition: A region where the sequence differs from that of a specified sequence. definition_reference: SO:ke term: sequence_variant id: SO:0000109 definition: A region of sequence where variation has been observed. definition_reference: SO:ke term: signal_peptide id: SO:0000418 definition: The sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane leader sequence. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html term: silencer id: SO:0000625 definition: Combination of short DNA sequence elements which suppress the transcription of an adjacent gene or genes. definition_reference: http://www.brunel.ac.uk/depts/bio/project/old_hmg/gloss3.htm#s term: siRNA id: SO:0000646 definition: Small RNA molecule that is the product of a longer exogenous or endogenous dsRNA, which is either a bimolecular duplex or very long hairpin, processed (via the Dicer pathway) such that numerous siRNAs accumulate from both strands of the dsRNA. siRNAs trigger the cleavage of their target molecules. definition_reference: PMID:12592000 term: small_regulatory_ncRNA id: SO:0000370 definition: A non-coding RNA, usually with a specific secondary structure, that acts to regulate gene expression.
definition_reference: SO:ma term: snoRNA id: SO:0000275 definition: Small nucleolar RNAs (snoRNAs) are involved in the processing and modification of rRNA in the nucleolus. There are two main classes of snoRNAs: the box C/D class, and the box H/ACA class. U3 snoRNA is a member of the box C/D class. Indeed, the box C/D element is a subset of the six short sequence elements found in all U3 snoRNAs, namely boxes A, A', B, C, C', and D. The U3 snoRNA secondary structure is characterised by a small 5' domain (with boxes A and A'), and a larger 3' domain (with boxes B, C, C', and D), the two domains being linked by a single-stranded hinge. Boxes B and C form the B/C motif, which appears to be exclusive to U3 snoRNAs, and boxes C' and D form the C'/D motif. The latter is functionally similar to the C/D motifs found in other snoRNAs. The 5' domain and the hinge region act as a pre-rRNA-binding domain. The 3' domain has conserved protein-binding sites. Both the box B/C and box C'/D motifs are sufficient for nuclear retention of U3 snoRNA. The box C'/D motif is also necessary for nucleolar localization, stability and hypermethylation of U3 snoRNA. Both box B/C and C'/D motifs are involved in specific protein interactions and are necessary for the rRNA processing functions of U3 snoRNA. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00012 term: SNP id: SO:0000694 definition: SNPs are single base pair positions in genomic DNA at which different sequence alternatives (alleles) exist in normal individuals in some population(s), wherein the least frequent allele has an abundance of 1% or greater. definition_reference: http://www.cgr.ki.se/cgb/groups/brookes/Articles/essence_of_snps_article.pdf term: snRNA id: SO:0000274 definition: Small non-coding RNA in the nucleoplasm. 
A small nuclear RNA molecule involved in pre-mRNA splicing and processing. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types definition_reference: ems:WB definition_reference: PMID:11733745 term: spliceosomal_intron id: SO:0000662 definition: An intron which is spliced by the spliceosome. definition_reference: SO:ke term: splice_acceptor_site id: SO:0000164 definition: The junction between the 3 prime end of an intron and the following exon. definition_reference: http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html term: splice_donor_site id: SO:0000163 definition: The junction between the 3 prime end of an exon and the following intron. definition_reference: http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html term: splice_enhancer id: SO:0000344 definition: Region of a transcript that regulates splicing. definition_reference: SO:ke term: splice_site id: SO:0000162 definition: The position where an intron is excised. definition_reference: SO:ke term: SRP_RNA id: SO:0000590 definition: The signal recognition particle (SRP) is a universally conserved ribonucleoprotein. It is involved in the co-translational targeting of proteins to membranes. The eukaryotic SRP consists of a 300-nucleotide 7S RNA and six proteins: SRPs 72, 68, 54, 19, 14, and 9. Archaeal SRP consists of a 7S RNA and homologues of the eukaryotic SRP19 and SRP54 proteins. In most eubacteria, the SRP consists of a 4.5S RNA and the Ffh protein (a homologue of the eukaryotic SRP54 protein). Eukaryotic and archaeal 7S RNAs have very similar secondary structures, with eight helical elements. These fold into the Alu and S domains, separated by a long linker region. Eubacterial SRP is generally a simpler structure, with the M domain of Ffh bound to a region of the 4.5S RNA that corresponds to helix 8 of the eukaryotic and archaeal SRP S domain. Some Gram-positive bacteria (e.g. Bacillus subtilis), however, have a larger SRP RNA that also has an Alu domain. 
The Alu domain is thought to mediate the peptide chain elongation retardation function of the SRP. The universally conserved helix which interacts with the SRP54/Ffh M domain mediates signal sequence recognition. In eukaryotes and archaea, the SRP19-helix 6 complex is thought to be involved in SRP assembly and stabilizes helix 8 for SRP54 binding. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00017 term: stRNA id: SO:0000649 definition: Non-coding RNAs of about 21 nucleotides in length that regulate temporal development; first discovered in C. elegans. definition_reference: PMID:11081512 term: STS id: SO:0000331 definition: Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known. definition_reference: http://www.biospace.com term: substitution id: SO:1000002 definition: Any change in genomic DNA caused by a single event. definition_reference: http://www.ebi.ac.uk/mutations/recommendations/mutevent.html term: supercontig id: SO:0000148 definition: One or more contigs that have been ordered and oriented using end-read information. Contains gaps that are filled with N's. definition_reference: SO:ls term: tag id: SO:0000324 definition: A nucleotide sequence that may be used to identify a larger sequence. definition_reference: SO:ke term: tandem_repeat id: SO:0000705 definition: Two or more adjacent copies of a DNA sequence. definition_reference: http://www.sci.sdsu.edu/~smaloy/Glossary/T.html term: telomerase_RNA id: SO:0000390 definition: The RNA component of telomerase, a reverse transcriptase that synthesises telomeric DNA. 
definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00025 term: telomere id: SO:0000624 definition: A specific structure at the end of a linear chromosome, required for the integrity and maintenance of the end. definition_reference: SO:ma term: terminator id: SO:0000141 definition: The sequence of DNA located at the end of the transcript that causes RNA polymerase to terminate transcription. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: TF_binding_site id: SO:0000235 definition: A region of a molecule that binds to a transcription factor. definition_reference: SO:ke term: three_prime_UTR id: SO:0000205 definition: A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein. definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types term: tiling_path id: SO:0000472 definition: A set of regions which overlap with minimal polymorphism to form a linear sequence. definition_reference: CJM:SO term: tiling_path_fragment id: SO:0000474 definition: A piece of sequence that makes up a tiling_path (SO:0000472). definition_reference: SO:ke term: transcript id: SO:0000673 definition: An RNA synthesized on a DNA or RNA template by an RNA polymerase. definition_reference: SO:ma term: transcription_end_site id: SO:0000616 definition: The site where transcription ends. definition_reference: SO:ke term: transcription_start_site id: SO:0000315 definition: The site where transcription begins. definition_reference: SO:ke term: transit_peptide id: SO:0000725 definition: The coding sequence for an N-terminal domain of a nuclear-encoded organellar protein: this domain is involved in post-translational import of the protein into the organelle. 
definition_reference: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html#line_types comment: Added to bring SO in line with the EMBL/DDBJ/GenBank feature table. term: translated_nucleotide_match id: SO:0000181 definition: A match against a translated sequence. definition_reference: SO:ke term: transposable_element id: SO:0000101 definition: A transposon or insertion sequence. An element that can insert in a variety of DNA sequences. definition_reference: http://www.sci.sdsu.edu/~smaloy/Glossary/T.html term: transposable_element_insertion_site id: SO:0000368 definition: The junction in a genome where a transposable_element has inserted. definition_reference: SO:ke term: trans_splice_acceptor_site id: SO:0000706 definition: The process that produces mature transcripts by combining exons of independent pre-mRNA molecules. The acceptor site lies on the 3' of these molecules. definition_reference: SO:ke term: tRNA id: SO:0000253 definition: Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure. tRNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). tRNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position. 
definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00005 definition_reference: ISBN:0198506732 term: U11_snRNA id: SO:0000398 definition: U11 snRNA plays a role in splicing of the minor U12-dependent class of eukaryotic nuclear introns. Similar to U1 snRNA in the major class spliceosome, it base pairs to the conserved 5' splice site sequence. definition_reference: PMID:9622129 term: U12_snRNA id: SO:0000399 definition: The U12 small nuclear RNA (snRNA), together with U4atac/U6atac, U5, and U11 snRNAs and associated proteins, forms a spliceosome that cleaves a divergent class of low-abundance pre-mRNA introns. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00007 term: U14_snRNA id: SO:0000403 definition: U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00016 term: U1_snRNA id: SO:0000391 definition: U1 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Its 5' end forms complementary base pairs with the 5' splice junction, thus defining the 5' donor site of an intron. There are significant differences in sequence and secondary structure between metazoan and yeast U1 snRNAs, the latter being much longer (568 nucleotides as compared to 164 nucleotides in human). Nevertheless, secondary structure predictions suggest that all U1 snRNAs share a 'common core' consisting of helices I, II, the proximal region of III, and IV. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00003 term: U2_snRNA id: SO:0000392 definition: U2 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). 
Complementary binding between U2 snRNA (in an area lying towards the 5' end but 3' to hairpin I) and the branchpoint sequence (BPS) of the intron results in the bulging out of an unpaired adenine, on the BPS, which initiates a nucleophilic attack at the intronic 5' splice site, thus starting the first of two transesterification reactions that mediate splicing. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00004 term: U4atac_snRNA id: SO:0000394 definition: An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U6atac_snRNA (SO:0000397). definition_reference: PMID:12409455 term: U4_snRNA id: SO:0000393 definition: U4 small nuclear RNA (U4 snRNA) is a component of the major U2-dependent spliceosome. It forms a duplex with U6, and with each splicing round, it is displaced from U6 (and the spliceosome) in an ATP-dependent manner, allowing U6 to refold and create the active site for splicing catalysis. A recycling process involving protein Prp24 re-anneals U4 and U6. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015 term: U5_snRNA id: SO:0000395 definition: U5 RNA is a component of both types of known spliceosome. The precise function of this molecule is unknown, though it is known that the 5' loop is required for splice site selection and p220 binding, and that both the 3' stem-loop and the Sm site are important for Sm protein binding and cap methylation. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00020 term: U6atac_snRNA id: SO:0000397 definition: An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U4atac_snRNA (SO:0000394). 
definition_reference: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=retrieve&db=pubmed&list_uids=12409455&dopt=Abstract term: U6_snRNA id: SO:0000396 definition: U6 snRNA is a component of the spliceosome which is involved in splicing pre-mRNA. The putative secondary structure consensus base pairing is confined to a short 5' stem loop, but U6 snRNA is thought to form extensive base-pair interactions with U4 snRNA. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015 term: ultracontig id: SO:0000719 definition: An ordered and oriented set of scaffolds based on somewhat weaker sets of inferential evidence such as one set of mate pair reads together with supporting evidence from ESTs or location of markers from SNP or microsatellite maps, or cytogenetic localization of contained markers. definition_reference: FB:WG term: UTR id: SO:0000203 definition: Messenger RNA sequences that are untranslated and lie five prime and three prime to sequences which are translated. definition_reference: SO:ke term: vault_RNA id: SO:0000404 definition: A family of RNAs found as part of the enigmatic vault ribonucleoprotein complex. The complex consists of a major vault protein (MVP), two minor vault proteins (VPARP and TEP1), and several small untranslated RNA molecules. It has been suggested that the vault complex is involved in drug resistance. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00006 term: virtual_sequence id: SO:0000499 definition: A continuous piece of sequence similar to the 'virtual contig' concept of Ensembl. definition_reference: SO:ke term: Y_RNA id: SO:0000405 definition: Y RNAs are components of the Ro ribonucleoprotein particle (Ro RNP), in association with Ro60 and La proteins. The Y RNAs and Ro60 and La proteins are well conserved, but the function of the Ro RNP is not known. In humans the RNA component can be one of four small RNAs: hY1, hY3, hY4 and hY5. 
These small RNAs are predicted to fold into a conserved secondary structure containing three stem structures. The largest of the four, hY1, contains an additional hairpin. definition_reference: http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00019
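!
! Note for consumers of this file: each record above follows the pattern
! term / id / definition / one or more definition_reference / optional comment.
! The sketch below shows one possible way to split such text into records.
! It is a minimal illustration, not part of DAG-Edit or any SO tooling: the
! function name parse_definitions is hypothetical, and it assumes that the
! field keywords never occur as "keyword: " inside a definition body, which
! holds for the records shown here but is not guaranteed in general.

```python
import re

# Field keywords as they appear in this file. Longer alternatives such as
# "definition_reference" are still matched because "definition" alone fails
# on the following ": " check and the regex falls through to the next option.
FIELD_RE = re.compile(r"\b(term|id|definition|definition_reference|comment): ")


def parse_definitions(text):
    """Split 'term: ... id: ... definition: ...' text into a list of dicts.

    Assumption: every record starts with a 'term:' field. Anything before
    the first 'term:' (for example '!' header comments) is ignored.
    """
    records = []
    current = None
    # With one capturing group, re.split returns
    # [preamble, key1, value1, key2, value2, ...].
    parts = FIELD_RE.split(text)
    for key, value in zip(parts[1::2], parts[2::2]):
        value = " ".join(value.split())  # collapse line breaks and extra spaces
        if key == "term":
            current = {"term": value, "definition_reference": []}
            records.append(current)
        elif current is None:
            continue  # field seen before any 'term:'; skip it
        elif key == "definition_reference":
            current["definition_reference"].append(value)  # may repeat
        else:
            current[key] = value
    return records


# Two records copied from this file; snRNA carries three references.
sample = (
    "term: spliceosomal_intron id: SO:0000662 "
    "definition: An intron which is spliced by the spliceosome. "
    "definition_reference: SO:ke "
    "term: snRNA id: SO:0000274 "
    "definition: Small non-coding RNA in the nucleoplasm. A small nuclear RNA "
    "molecule involved in pre-mRNA splicing and processing. "
    "definition_reference: http://www.ebi.ac.uk/embl/Documentation/"
    "FT_definitions/feature_table.html#line_types "
    "definition_reference: ems:WB definition_reference: PMID:11733745"
)

records = parse_definitions(sample)
for record in records:
    print(record["id"], record["term"], len(record["definition_reference"]))
```

! definition_reference is collected into a list because a term may cite
! several sources; all other fields are stored as single strings.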