drfindformat Wiki The master copies of EMBOSS documentation are available at http://emboss.open-bio.org/wiki/Appdocs on the EMBOSS Wiki. Please help by correcting and extending the Wiki pages. Function Find public databases by format Description drfindformat searches the Data Resource Catalogue to find entries with EDAM format terms matching a query string. Algorithm The first search is of the EDAM ontology format namespace, using the term names and their synonynms. All child terms are automatically included in the set of matches inless the -nosubclasses qualifier is used. The -sensitive qualifier also searches the definition strings. The set of EDAM terms are then compared to entries in the Data Resource Catalogue, searching the 'efmt' EDAM format index. Usage Here is a sample session with drfindformat % drfindformat fasta Find public databases by format Data resource output file [drfindformat.drcat]: Go to the output files for this example Command line arguments Find public databases by format Version: EMBOSS:6.4.0.0 Standard (Mandatory) qualifiers: [-query] string List of EDAM data keywords (Any string) [-outfile] outresource [*.drfindformat] Output data resource file name Additional (Optional) qualifiers: (none) Advanced (Unprompted) qualifiers: -sensitive boolean [N] By default, the query keywords are matched against the EDAM term names (and synonyms) only. This option also matches the keywords against the EDAM term definitions and will therefore (typically) report more matches. -[no]subclasses boolean [Y] Extend the query matches to include all terms which are specialisations (EDAM sub-classes) of the matched type. Associated qualifiers: "-outfile" associated qualifiers -odirectory2 string Output directory -oformat2 string Data resource output format General qualifiers: -auto boolean Turn off prompts -stdout boolean Write first file to standard output -filter boolean Read first file from standard input, write first file to standard output -options boolean Prompt for standard and additional values -debug boolean Write debug output to program.dbg -verbose boolean Report some/full command line options -help boolean Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose -warning boolean Report warnings -error boolean Report errors -fatal boolean Report fatal errors -die boolean Report dying program messages -version boolean Report version number and exit Input file format None. Output file format The output is a standard EMBOSS resource file. The results can be output in one of several styles by using the command-line qualifier -oformat xxx, where 'xxx' is replaced by the name of the required format. The available format names are: drcat, basic, wsbasic, list. See: http://emboss.sf.net/docs/themes/ResourceFormats.html for further information on resource formats. Output files for usage example File: drfindformat.drcat ID dbEST Name dbEST database of EST sequences Desc dbEST is a division of GenBank that contains sequence data and other inf ormation on "single-pass" cDNA sequences, or "Expressed Sequence Tags", from a n umber of organisms. URL http://www.ncbi.nlm.nih.gov/dbEST/ Cat Not available Taxon 1 | all EDAMtpc 0000655 | mRNA, EST or cDNA EDAMdat 0000849 | Sequence record EDAMid 0002314 | GI number EDAMid 0001105 | dbEST accession EDAMfmt 0002310 | FASTA-HTML EDAMfmt 0002532 | GenBank-HTML EDAMfmt 0002331 | HTML Xref SP_FT | None Query Sequence record | GenBank-HTML | dbEST accession | http://www.ncbi.nlm. nih.gov/nucest/%s?report=genbank Query Sequence record | HTML {est} | dbEST accession | http://www.ncbi.nlm.ni h.gov/nucest/%s?report=est Query Sequence record | HTML {docsum} | dbEST accession | http://www.ncbi.nlm .nih.gov/nucest/%s?report=docsum Query Sequence record | FASTA-HTML | dbEST accession | http://www.ncbi.nlm.ni h.gov/nucest/%s?report=fasta Query Sequence record | GenBank-HTML | dbEST accession | http://www.ncbi.nlm. nih.gov/nucest/%s?report=genbank Query Sequence record | GenBank-HTML | GI number | http://www.ncbi.nlm.nih.go v/nucest/%s?report=genbank Query Sequence record | HTML {est} | GI number | http://www.ncbi.nlm.nih.gov/ nucest/%s?report=est Query Sequence record | HTML {docsum} | GI number | http://www.ncbi.nlm.nih.g ov/nucest/%s?report=docsum Query Sequence record | FASTA-HTML | GI number | http://www.ncbi.nlm.nih.gov/ nucest/%s?report=fasta Query Sequence record | GenBank-HTML | GI number | http://www.ncbi.nlm.nih.go v/nucest/%s?report=genbank Example dbEST accession | f12345 Example GI number | 706694 ID REDIdb Name RNA editing database (REDIdb) Desc Sequences post-transcriptionally modified by RNA editing from primary da tabases and literature. All editing information such as substitutions, insertion s and deletions occurring in a wide range of organisms is stored. URL http://biologia.unical.it/py_script/overview.html Taxon 1 | all EDAMtpc 0000630 | Gene structure EDAMdat 0002043 | Sequence record lite EDAMdat 0001383 | Sequence alignment (nucleic acid) EDAMid 0002781 | REDIdb ID EDAMfmt 0002310 | FASTA-HTML EDAMfmt 0002331 | HTML Query Sequence record lite {REDIdb entry} | HTML | REDIdb ID | http://biologi a.unical.it/py_script/cgi-bin/retrieve.py?query=%s Query Sequence record lite {REDIdb fasta} | FASTA-HTML | REDIdb ID | http://b iologia.unical.it/py_script/cgi-bin/fasta.py?query=%s Query Sequence alignment (nucleic acid) {REDIdb overview} | HTML | REDIdb ID | http://biologia.unical.it/py_script/cgi-bin/display.py?query=%s Query Sequence alignment (nucleic acid) {REDIdb alignment} | HTML | REDIdb ID | http://biologia.unical.it/py_script/cgi-bin/align.py?query=%s Example REDIdb ID | EDI_000000002 ID UniRef Name Non-redundant reference (UniRef) databases Desc Clustered sets of sequences from UniProt Knowledgebase (including splice variants and isoforms) and selected UniParc records, in order to obtain complet e coverage of sequence space at several resolutions while hiding redundant seque nces (but not their descriptions) from view. URL http://www.uniprot.org/help/uniref Cat Other Taxon 1 | all [Part of this file has been deleted for brevity] Xref SP_FT | None Query Sequence record full | HTML | UniProt accession | http://www.uniprot.or g/uniprot/%s Query Sequence record full | uniprot | UniProt accession | http://www.uniprot .org/uniprot/%s.txt Query Sequence record full | XML | UniProt accession | http://www.uniprot.org /uniprot/%s.xml Query Sequence record full | RDF | UniProt accession | http://www.uniprot.org /uniprot/%s.rdf Query Sequence record full | FASTA format | UniProt accession | http://www.un iprot.org/uniprot/%s.fasta Example UniProt accession | P12345 ID Ensembl Acc DB-0023 Name Ensembl eukaryotic genome annotation project Desc Genome databases for vertebrates and other eukaryotic species. URL http://www.ensembl.org/ Cat Genome annotation databases Taxon 33208 | Metazoa EDAMtpc 0000643 | Genomes EDAMtpc 0002818 | Eukaryote EDAMtpc 0000643 | Genomes EDAMdat 0000849 | Sequence record EDAMdat 0000916 | Gene annotation EDAMid 0001033 | Gene ID (Ensembl) EDAMid 0002725 | Transcript ID (Ensembl) EDAMfmt 0001929 | FASTA format EDAMfmt 0002331 | HTML Xref SP_explicit | Transcript ID (Ensembl);Protein ID (Ensembl);Gene ID (Ense mbl) Xref SP_FT | None Query Gene annotation | HTML | Gene ID (Ensembl) | http://www.ensembl.org/Hom o_sapiens/Gene/Summary?g=%s Query Sequence record | FASTA format | Gene ID (Ensembl);Transcript ID (Ensem bl) | http://www.ensembl.org/Homo_sapiens/Gene/Export?db=core;g=%s1;output=fasta ;r=13:31787617-31871809;strand=feature;t=%s2;time=1244110856.85314;st=cdna;st=co ding;st=peptide;st=utr5;st=utr3;st=exons;st=introns;genomic=unmasked;_format=Tex t Example Gene ID (Ensembl);Transcript ID (Ensembl) | ENSG00000139618;ENST00000380 152 ID UniProtKB/Swiss-Prot IDalt SwissProt Name Universal protein resource knowledge base / Swiss-Prot Desc Section of the UniProt knowledgebase, containing annotated records, whic h include curator-evaluated computational analysis, as well as, information extr acted from the literature URL http://www.uniprot.org Taxon 1 | all EDAMtpc 0000639 | Protein sequences EDAMdat 0002201 | Sequence record full EDAMid 0003021 | UniProt accession EDAMfmt 0001929 | FASTA format EDAMfmt 0002376 | RDF EDAMfmt 0002331 | HTML EDAMfmt 0002332 | XML Xref EMBL_explicit | UniProt accession Query Sequence record full | HTML | UniProt accession | http://www.uniprot.or g/uniprot/%s Query Sequence record full | Text | UniProt accession | http://www.uniprot.or g/uniprot/%s.txt Query Sequence record full | XML | UniProt accession | http://www.uniprot.org /uniprot/%s.xml Query Sequence record full | RDF | UniProt accession | http://www.uniprot.org /uniprot/%s.rdf Query Sequence record full | FASTA format | UniProt accession | http://www.un iprot.org/uniprot/%s.fasta Example UniProt accession | P12345 Data files The Data Resource Catalogue is included in EMBOSS as local database drcat. The EDAM Ontology is included in EMBOSS as local database edam. Notes None. References None. Warnings None. Diagnostic Error Messages None. Exit status It always exits with status 0. Known bugs None. See also Program name Description drfinddata Find public databases by data type drfindid Find public databases by identifier drfindresource Find public databases by resource drget Get data resource entries drtext Get data resource entries complete text edamdef Find EDAM ontology terms by definition edamhasinput Find EDAM ontology terms by has_input relation edamhasoutput Find EDAM ontology terms by has_output relation edamisformat Find EDAM ontology terms by is_format_of relation edamisid Find EDAM ontology terms by is_identifier_of relation edamname Find EDAM ontology terms by name wossdata Finds programs by EDAM data wossinput Finds programs by EDAM input data wossoperation Finds programs by EDAM operation wossoutput Finds programs by EDAM output data wossparam Finds programs by EDAM parameter wosstopic Finds programs by EDAM topic Author(s) Peter Rice European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK Please report all bugs to the EMBOSS bug team (emboss-bug (c) emboss.open-bio.org) not to the original author. History Target users This program is intended to be used by everyone and everything, from naive users to embedded scripts. Comments None