sirna


Wiki

   The master copies of EMBOSS documentation are available at
   http://emboss.open-bio.org/wiki/Appdocs on the EMBOSS Wiki.

   Please help by correcting and extending the Wiki pages.

Function

   Finds siRNA duplexes in mRNA

Description

   sirna finds candidate siRNA duplexes in an mRNA sequence. The output is
   a standard EMBOSS report file, with the siRNAs listed best score first.
   Both the sense and antisense siRNAs are reported 5' to 3'.

Algorithm

for each input sequence:

    find the start position of the CDS in the feature table
    if there is no such CDS, take the -sbegin position as the CDS start

    for each 23 base window along the sequence:

        set the score for this window = 0
        if base 2 of the window is not 'a': ignore this window
        if the window is within 50 bases of the CDS start: ignore this window
        if the window is within 100 bases of the CDS start: score = -2
        measure the %GC of the 20 bases from position 2 to 21 of the window
        for the following %GC values change the score:
                %GC <= 25% (<= 5 bases): ignore this window
                %GC 30% (6 bases): score + 0
                %GC 35% (7 bases): score + 2
                %GC 40% (8 bases): score + 4
                %GC 45% (9 bases): score + 5
                %GC 50% (10 bases): score + 6
                %GC 55% (11 bases): score + 5
                %GC 60% (12 bases): score + 4
                %GC 65% (13 bases): score + 2
                %GC 70% (14 bases): score + 0
                %GC >= 75% (>= 15 bases): ignore this window
        if the window starts with 'AA': score + 3
        if the window does not start with 'AA' and -aa is set: ignore this window
        if the window ends with 'TT': score + 1
        if the window does not end with 'TT' and -tt is set: ignore this window
        if 4 G's in a row are found: ignore this window
        if any base occurs 4 or more times in a row and -nopolybase is set:
            ignore this window
        if -poliii is set and the window is not NARN(17)YNN: ignore this window
        if the score is > 0: store this window for output

    sort the windows found by their score
    output the 23-base windows to the sequence file
    if the 'context' qualifier is specified, output window bases 1 and 2 in
        brackets to the report file
    take the window bases 3 to 21, add 'dTdT', output to the report file
    take the window bases 3 to 21, reverse complement, add 'dTdT', output to
        the report file
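
   The per-window scoring rules above can be sketched in Python. This is
   an illustration only, not the sirna source code: the names
   score_window, GC_SCORE and POLIII are hypothetical, and two details are
   assumptions, namely that "within N bases of the CDS start" is measured
   from the window start position, and that the GGGG and 4-in-a-row checks
   apply to the whole 23-base window.

```python
import re

# Score added for each G+C count among window bases 2..21 (20 bases).
GC_SCORE = {6: 0, 7: 2, 8: 4, 9: 5, 10: 6, 11: 5, 12: 4, 13: 2, 14: 0}

# NARN(17)YNN written as a regular expression (R = a or g, Y = c or t).
POLIII = re.compile(r".a[ag].{17}[ct]..")


def score_window(window, start, cds_start, aa=False, tt=False,
                 polybase=True, poliii=False):
    """Return the score of a 23-base window, or None if it is ignored."""
    w = window.lower()
    if len(w) != 23 or w[1] != "a":
        return None
    offset = start - cds_start      # assumption: measured from window start
    if offset < 50:
        return None
    score = -2 if offset < 100 else 0
    gc = sum(base in "gc" for base in w[1:21])   # bases 2..21
    if gc not in GC_SCORE:                       # <= 25% or >= 75% GC
        return None
    score += GC_SCORE[gc]
    if w.startswith("aa"):
        score += 3
    elif aa:                                     # -aa: 'AA' start required
        return None
    if w.endswith("tt"):
        score += 1
    elif tt:                                     # -tt: 'TT' end required
        return None
    if "gggg" in w:
        return None
    if not polybase and re.search(r"(.)\1{3}", w):   # -nopolybase
        return None
    if poliii and not POLIII.fullmatch(w):           # -poliii
        return None
    return score if score > 0 else None
```

   With the CDS start of 57 from the usage example, the window at position
   308 (aaaagtgagaggtcagactccta) scores 6 for 50% GC plus 3 for the leading
   'AA', which agrees with the score of 9 shown in the example report.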

Usage

   Here is a sample session with sirna


% sirna
Finds siRNA duplexes in mRNA
Input nucleotide sequence(s): tembl:x65923
Output report [x65923.sirna]:
output sequence(s) [x65923.fasta]:



   Example 2

   Show the first two bases of the 23 base target region in brackets.
   These do not form part of the sequence to be ordered, but it is useful
   to see if the 23 base region starts with an 'AA'.


% sirna -context
Finds siRNA duplexes in mRNA
Input nucleotide sequence(s): tembl:x65923
Output report [x65923.sirna]:
output sequence(s) [x65923.fasta]:
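
   As an illustration of how the report columns relate to a 23-base target
   region, the sketch below rebuilds the sense and antisense strings from
   one window, with and without the bracketed context bases. The function
   name duplex is hypothetical and the code is not taken from sirna
   itself; it only follows the steps given in the Algorithm section.

```python
# Complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def duplex(window, context=False):
    """Build (sense, antisense) report strings from a 23-base window."""
    if len(window) != 23:
        raise ValueError("expected a 23-base window")
    rna = window.upper().replace("T", "U")
    core = rna[2:21]                        # window bases 3 to 21 (19 bases)
    sense = core + "dTdT"
    # Reverse complement of the same 19 bases, also written 5' to 3'.
    antisense = core.translate(COMPLEMENT)[::-1] + "dTdT"
    if context:
        # Show bases 1 and 2 in brackets; they are not part of the probe.
        sense = "(" + rna[:2] + ")" + sense
    return sense, antisense
```

   For the top-scoring window of the example, aaaagtgagaggtcagactccta,
   duplex(window, context=True) gives (AA)AAGUGAGAGGUCAGACUCCdTdT and
   GGAGUCUGACCUCUCACUUdTdT, matching the first row of the report.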



Command line arguments

Finds siRNA duplexes in mRNA
Version: EMBOSS:6.4.0.0

   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     Nucleotide sequence(s) filename and optional
                                  format, or reference (input USA)
  [-outfile]           report     [*.sirna] The output is a table of the
                                  forward and reverse parts of the 21 base
                                  siRNA duplex. Both the forward and reverse
                                  sequences are written 5' to 3', ready to be
                                  ordered. The last two bases have been
                                  replaced by 'dTdT'. The starting position of
                                  the 23 base region and the %GC content is
                                  also given. If you wish to see the complete
                                  23 base sequence, then either look at the
                                  sequence in the other output file, or use
                                  the qualifier '-context' which will display
                                  the 23 bases of the forward sequence in this
                                  report with the first two bases in
                                  brackets. These first two bases do not form
                                  part of the siRNA probe to be ordered.
                                  (default -rformat table)
  [-outseq]            seqoutall  [<sequence>.<format>] This is a file of the
                                  sequences of the 23 base regions that the
                                  siRNAs are selected from. You may use it to
                                  do searches of mRNA databases (e.g. REFSEQ)
                                  to confirm that the probes are unique to the
                                  gene you wish to use it on.

   Additional (Optional) qualifiers:
   -poliii             boolean    [N] This option allows you to select only
                                  the 21 base probes that start with a purine
                                  and so can be expressed from Pol III
                                  expression vectors. This is the NARN(17)YNN
                                  pattern that has been suggested by Tuschl et
                                  al.
   -aa                 boolean    [N] This option allows you to select only
                                  those 23 base regions that start with AA. If
                                  this option is not selected then regions
                                  that start with AA will be favoured by
                                  giving them a higher score, but regions that
                                  do not start with AA will also be reported.
   -tt                 boolean    [N] This option allows you to select only
                                  those 23 base regions that end with TT. If
                                  this option is not selected then regions
                                  that end with TT will be favoured by giving
                                  them a higher score, but regions that do not
                                  end with TT will also be reported.
   -[no]polybase       boolean    [Y] If this option is FALSE then only those
                                  23 base regions that have no repeat of 4 or
                                  more of any bases in a row will be reported.
                                  No regions will ever be reported that have
                                  4 or more G's in a row.
   -context            boolean    [N] The output report file gives the
                                  sequences of the 21 base siRNA regions ready
                                  to be ordered. This does not give you an
                                  indication of the 2 bases before the 21
                                  bases. It is often interesting to see which
                                  of the suggested possible probe regions have
                                  an 'AA' in front of them (i.e. it is useful
                                  to see which of the 23 base regions start
                                  with an 'AA'). This option displays the
                                  whole 23 bases of the region with the first
                                  two bases in brackets, e.g. '(AA)' to give
                                  you some context for the probe region. YOU
                                  SHOULD NOT INCLUDE THE TWO BASES IN BRACKETS
                                  WHEN YOU PLACE AN ORDER FOR THE PROBES.

   Advanced (Unprompted) qualifiers: (none)
   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1            integer    Start of each sequence to be used
   -send1              integer    End of each sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-outfile" associated qualifiers
   -rformat2           string     Report format
   -rname2             string     Base file name
   -rextension2        string     File name extension
   -rdirectory2        string     Output directory
   -raccshow2          boolean    Show accession number in the report
   -rdesshow2          boolean    Show description in the report
   -rscoreshow2        boolean    Show the score in the report
   -rstrandshow2       boolean    Show the nucleotide strand in the report
   -rusashow2          boolean    Show the full USA in the report
   -rmaxall2           integer    Maximum total hits to report
   -rmaxseq2           integer    Maximum hits to report for one sequence

   "-outseq" associated qualifiers
   -osformat3          string     Output seq format
   -osextension3       string     File name extension
   -osname3            string     Base file name
   -osdirectory3       string     Output directory
   -osdbname3          string     Database name to add
   -ossingle3          boolean    Separate file for each entry
   -oufo3              string     UFO features
   -offormat3          string     Features format
   -ofname3            string     Features file name
   -ofdirectory3       string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write first file to standard output
   -filter             boolean    Read first file from standard input, write
                                  first file to standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options and exit. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages
   -version            boolean    Report version number and exit
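
   The qualifiers listed above can be combined into a non-interactive
   command line. The sketch below assembles such a command from Python; the
   helper names sirna_command and run_sirna are hypothetical, and actually
   running the command assumes an EMBOSS installation with sirna on the
   PATH. Only qualifiers documented above are used.

```python
import subprocess


def sirna_command(sequence, outfile, outseq, context=False, poliii=False,
                  aa=False, tt=False, polybase=True):
    """Assemble an argument list for a prompt-free sirna run."""
    cmd = ["sirna", "-sequence", sequence, "-outfile", outfile,
           "-outseq", outseq, "-auto"]       # -auto turns off prompts
    if context:
        cmd.append("-context")
    if poliii:
        cmd.append("-poliii")
    if aa:
        cmd.append("-aa")
    if tt:
        cmd.append("-tt")
    if not polybase:
        cmd.append("-nopolybase")            # negated form of -polybase
    return cmd


def run_sirna(*args, **kwargs):
    """Run sirna; raises CalledProcessError if the program fails."""
    return subprocess.run(sirna_command(*args, **kwargs), check=True)
```

   For example, sirna_command("tembl:x65923", "x65923.sirna",
   "x65923.fasta", context=True) corresponds to the second usage example
   with all prompts answered on the command line.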


Input file format

   The input is a standard EMBOSS sequence query (also known as a 'USA').

   Major sequence database sources defined as standard in EMBOSS
   installations include srs:embl, srs:uniprot and ensembl.

   Data can also be read from sequence output in any supported format
   written by an EMBOSS or third-party application.

   The input format can be specified by using the command-line qualifier
   -sformat xxx, where 'xxx' is replaced by the name of the required
   format. The available format names are: gff (gff3), gff2, embl (em),
   genbank (gb, refseq), ddbj, refseqp, pir (nbrf), swissprot (swiss, sw),
   dasgff and debug.

   See: http://emboss.sf.net/docs/themes/SequenceFormats.html for further
   information on sequence formats.

  Input files for usage example

   'tembl:x65923' is a sequence entry in the example nucleic acid database
   'tembl'.

  Database entry: tembl:x65923

ID   X65923; SV 1; linear; mRNA; STD; HUM; 518 BP.
XX
AC   X65923;
XX
DT   13-MAY-1992 (Rel. 31, Created)
DT   18-APR-2005 (Rel. 83, Last updated, Version 11)
XX
DE   H.sapiens fau mRNA
XX
KW   fau gene.
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC   Homo.
XX
RN   [1]
RP   1-518
RA   Michiels L.M.R.;
RT   ;
RL   Submitted (29-APR-1992) to the EMBL/GenBank/DDBJ databases.
RL   L.M.R. Michiels, University of Antwerp, Dept of Biochemistry,
RL   Universiteisplein 1, 2610 Wilrijk, BELGIUM
XX
RN   [2]
RP   1-518
RX   PUBMED; 8395683.
RA   Michiels L., Van der Rauwelaert E., Van Hasselt F., Kas K., Merregaert J.;
RT   "fau cDNA encodes a ubiquitin-like-S30 fusion protein and is expressed as
RT   an antisense sequence in the Finkel-Biskis-Reilly murine sarcoma virus";
RL   Oncogene 8(9):2537-2546(1993).
XX
DR   H-InvDB; HIT000322806.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..518
FT                   /organism="Homo sapiens"
FT                   /chromosome="11q"
FT                   /map="13"
FT                   /mol_type="mRNA"
FT                   /clone_lib="cDNA"
FT                   /clone="pUIA 631"
FT                   /tissue_type="placenta"
FT                   /db_xref="taxon:9606"
FT   misc_feature    57..278
FT                   /note="ubiquitin like part"
FT   CDS             57..458
FT                   /gene="fau"
FT                   /db_xref="GDB:135476"
FT                   /db_xref="GOA:P35544"
FT                   /db_xref="GOA:P62861"
FT                   /db_xref="HGNC:3597"
FT                   /db_xref="InterPro:IPR000626"
FT                   /db_xref="InterPro:IPR006846"
FT                   /db_xref="InterPro:IPR019954"
FT                   /db_xref="InterPro:IPR019955"
FT                   /db_xref="InterPro:IPR019956"
FT                   /db_xref="UniProtKB/Swiss-Prot:P35544"
FT                   /db_xref="UniProtKB/Swiss-Prot:P62861"
FT                   /protein_id="CAA46716.1"
FT                   /translation="MQLFVRAQELHTFEVTGQETVAQIKAHVASLEGIAPEDQVVLLAG
FT                   APLEDEATLGQCGVEALTTLEVAGRMLGGKVHGSLARAGKVRGQTPKVAKQEKKKKKTG
FT                   RAKRRMQYNRRFVNVVPTFGKKKGPNANS"
FT   misc_feature    98..102
FT                   /note="nucleolar localization signal"
FT   misc_feature    279..458
FT                   /note="S30 part"
FT   polyA_signal    484..489
FT   polyA_site      509
XX
SQ   Sequence 518 BP; 125 A; 139 C; 148 G; 106 T; 0 other;
     ttcctctttc tcgactccat cttcgcggta gctgggaccg ccgttcagtc gccaatatgc        60
     agctctttgt ccgcgcccag gagctacaca ccttcgaggt gaccggccag gaaacggtcg       120
     cccagatcaa ggctcatgta gcctcactgg agggcattgc cccggaagat caagtcgtgc       180
     tcctggcagg cgcgcccctg gaggatgagg ccactctggg ccagtgcggg gtggaggccc       240
     tgactaccct ggaagtagca ggccgcatgc ttggaggtaa agttcatggt tccctggccc       300
     gtgctggaaa agtgagaggt cagactccta aggtggccaa acaggagaag aagaagaaga       360
     agacaggtcg ggctaagcgg cggatgcagt acaaccggcg ctttgtcaac gttgtgccca       420
     cctttggcaa gaagaagggc cccaatgcca actcttaagt cttttgtaat tctggctttc       480
     tctaataaaa aagccactta gttcagtcaa aaaaaaaa                               518
//

Output file format

   The output is a standard EMBOSS report file.

   The results can be output in one of several styles by using the
   command-line qualifier -rformat xxx, where 'xxx' is replaced by the
   name of the required format. The available format names are: embl,
   genbank, gff, pir, swiss, dasgff, debug, listfile, dbmotif, diffseq,
   draw, restrict, excel, feattable, motif, nametable, regions, seqtable,
   simple, srs, table, tagseq.

   See: http://emboss.sf.net/docs/themes/ReportFormats.html for further
   information on report formats.

   The default report format for sirna is 'table'.
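
   A report in 'table' format can be read back with a few lines of Python.
   The sketch below is a minimal, hypothetical parser (parse_sirna_table is
   not part of EMBOSS). It assumes each hit occupies a single line of seven
   whitespace-separated fields, as in the file sirna writes, and it skips
   comment, header and blank lines.

```python
def parse_sirna_table(text):
    """Return the hits of a sirna 'table' report as a list of dicts."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 7 fields and begin with the numeric start position.
        if len(fields) != 7 or not fields[0].isdigit():
            continue
        start, end, strand, score, gc, sense, antisense = fields
        hits.append({
            "start": int(start),
            "end": int(end),
            "strand": strand,
            "score": float(score),
            "gc": float(gc),
            "sense": sense,
            "antisense": antisense,
        })
    return hits
```

   The returned list keeps the order of the report, so the first element is
   the best-scoring siRNA.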

  Output files for usage example

  File: x65923.sirna

########################################
# Program: sirna
# Rundate: Fri 15 Jul 2011 12:00:00
# Commandline: sirna
#    -sequence tembl:x65923
# Report_format: table
# Report_file: x65923.sirna
########################################

#=======================================
#
# Sequence: X65923     from: 1   to: 518
# HitCount: 85
#
# CDS region found in feature table starting at 57
#
#=======================================

  Start     End  Strand   Score    GC%             Sense_siRNA         Antisense_siRNA
    308     330       +   9.000   50.0 AAGUGAGAGGUCAGACUCCdTdT GGAGUCUGACCUCUCACUUdTdT
    309     331       +   9.000   50.0 AGUGAGAGGUCAGACUCCUdTdT AGGAGUCUGACCUCUCACUdTdT
    310     332       +   9.000   50.0 GUGAGAGGUCAGACUCCUAdTdT UAGGAGUCUGACCUCUCACdTdT
    351     373       +   9.000   50.0 GAAGAAGAAGACAGGUCGGdTdT CCGACCUGUCUUCUUCUUCdTdT
    166     188       +   8.000   55.0 GAUCAAGUCGUGCUCCUGGdTdT CCAGGAGCACGACUUGAUCdTdT
    279     301       +   8.000   55.0 AGUUCAUGGUUCCCUGGCCdTdT GGCCAGGGAACCAUGAACUdTdT
    330     352       +   8.000   55.0 GGUGGCCAAACAGGAGAAGdTdT CUUCUCCUGUUUGGCCACCdTdT
    354     376       +   8.000   55.0 GAAGAAGACAGGUCGGGCUdTdT AGCCCGACCUGUCUUCUUCdTdT
    357     379       +   8.000   55.0 GAAGACAGGUCGGGCUAAGdTdT CUUAGCCCGACCUGUCUUCdTdT
    393     415       +   8.000   55.0 CCGGCGCUUUGUCAACGUUdTdT AACGUUGACAAAGCGCCGGdTdT
    253     275       +   7.000   60.0 GUAGCAGGCCGCAUGCUUGdTdT CAAGCAUGCGGCCUGCUACdTdT
    280     302       +   7.000   60.0 GUUCAUGGUUCCCUGGCCCdTdT GGGCCAGGGAACCAUGAACdTdT
    339     361       +   7.000   40.0 ACAGGAGAAGAAGAAGAAGdTdT CUUCUUCUUCUUCUCCUGUdTdT
    340     362       +   7.000   40.0 CAGGAGAAGAAGAAGAAGAdTdT UCUUCUUCUUCUUCUCCUGdTdT
    348     370       +   7.000   40.0 GAAGAAGAAGAAGACAGGUdTdT ACCUGUCUUCUUCUUCUUCdTdT
    375     397       +   7.000   60.0 GCGGCGGAUGCAGUACAACdTdT GUUGUACUGCAUCCGCCGCdTdT
    408     430       +   7.000   60.0 CGUUGUGCCCACCUUUGGCdTdT GCCAAAGGUGGGCACAACGdTdT
    429     451       +   7.000   60.0 GAAGAAGGGCCCCAAUGCCdTdT GGCAUUGGGGCCCUUCUUCdTdT
    432     454       +   7.000   60.0 GAAGGGCCCCAAUGCCAACdTdT GUUGGCAUUGGGGCCCUUCdTdT
    435     457       +   7.000   60.0 GGGCCCCAAUGCCAACUCUdTdT AGAGUUGGCAUUGGGGCCCdTdT
    488     510       +   7.000   40.0 AAAGCCACUUAGUUCAGUCdTdT GACUGAACUAAGUGGCUUUdTdT
    489     511       +   7.000   40.0 AAGCCACUUAGUUCAGUCAdTdT UGACUGAACUAAGUGGCUUdTdT
    490     512       +   7.000   40.0 AGCCACUUAGUUCAGUCAAdTdT UUGACUGAACUAAGUGGCUdTdT
    491     513       +   7.000   40.0 GCCACUUAGUUCAGUCAAAdTdT UUUGACUGAACUAAGUGGCdTdT
    129     151       +   6.000   55.0 GGCUCAUGUAGCCUCACUGdTdT CAGUGAGGCUACAUGAGCCdTdT
    165     187       +   6.000   50.0 AGAUCAAGUCGUGCUCCUGdTdT CAGGAGCACGACUUGAUCUdTdT
    278     300       +   6.000   50.0 AAGUUCAUGGUUCCCUGGCdTdT GCCAGGGAACCAUGAACUUdTdT
    314     336       +   6.000   50.0 GAGGUCAGACUCCUAAGGUdTdT ACCUUAGGAGUCUGACCUCdTdT
    321     343       +   6.000   50.0 GACUCCUAAGGUGGCCAAAdTdT UUUGGCCACCUUAGGAGUCdTdT
    323     345       +   6.000   50.0 CUCCUAAGGUGGCCAAACAdTdT UGUUUGGCCACCUUAGGAGdTdT
    329     351       +   6.000   50.0 AGGUGGCCAAACAGGAGAAdTdT UUCUCCUGUUUGGCCACCUdTdT


  [Part of this file has been deleted for brevity]

    374     396       +   5.000   55.0 AGCGGCGGAUGCAGUACAAdTdT UUGUACUGCAUCCGCCGCUdTdT
    383     405       +   5.000   55.0 UGCAGUACAACCGGCGCUUdTdT AAGCGCCGGUUGUACUGCAdTdT
    387     409       +   5.000   55.0 GUACAACCGGCGCUUUGUCdTdT GACAAAGCGCCGGUUGUACdTdT
    390     412       +   5.000   55.0 CAACCGGCGCUUUGUCAACdTdT GUUGACAAAGCGCCGGUUGdTdT
    392     414       +   5.000   55.0 ACCGGCGCUUUGUCAACGUdTdT ACGUUGACAAAGCGCCGGUdTdT
    407     429       +   5.000   55.0 ACGUUGUGCCCACCUUUGGdTdT CCAAAGGUGGGCACAACGUdTdT
    428     450       +   5.000   55.0 AGAAGAAGGGCCCCAAUGCdTdT GCAUUGGGGCCCUUCUUCUdTdT
    431     453       +   5.000   55.0 AGAAGGGCCCCAAUGCCAAdTdT UUGGCAUUGGGGCCCUUCUdTdT
    434     456       +   5.000   60.0 AGGGCCCCAAUGCCAACUCdTdT GAGUUGGCAUUGGGGCCCUdTdT
    444     466       +   5.000   35.0 UGCCAACUCUUAAGUCUUUdTdT AAAGACUUAAGAGUUGGCAdTdT
    487     509       +   5.000   35.0 AAAAGCCACUUAGUUCAGUdTdT ACUGAACUAAGUGGCUUUUdTdT
    123     145       +   4.000   50.0 GAUCAAGGCUCAUGUAGCCdTdT GGCUACAUGAGCCUUGAUCdTdT
    125     147       +   4.000   50.0 UCAAGGCUCAUGUAGCCUCdTdT GAGGCUACAUGAGCCUUGAdTdT
    128     150       +   4.000   50.0 AGGCUCAUGUAGCCUCACUdTdT AGUGAGGCUACAUGAGCCUdTdT
    155     177       +   4.000   50.0 UUGCCCCGGAAGAUCAAGUdTdT ACUUGAUCUUCCGGGGCAAdTdT
    234     256       +   4.000   60.0 GGCCCUGACUACCCUGGAAdTdT UUCCAGGGUAGUCAGGGCCdTdT
    259     281       +   4.000   60.0 GGCCGCAUGCUUGGAGGUAdTdT UACCUCCAAGCAUGCGGCCdTdT
    266     288       +   4.000   40.0 UGCUUGGAGGUAAAGUUCAdTdT UGAACUUUACCUCCAAGCAdTdT
    342     364       +   4.000   40.0 GGAGAAGAAGAAGAAGAAGdTdT CUUCUUCUUCUUCUUCUCCdTdT
    347     369       +   4.000   40.0 AGAAGAAGAAGAAGACAGGdTdT CCUGUCUUCUUCUUCUUCUdTdT
    359     381       +   4.000   60.0 AGACAGGUCGGGCUAAGCGdTdT CGCUUAGCCCGACCUGUCUdTdT
    111     133       +   3.000   55.0 AACGGUCGCCCAGAUCAAGdTdT CUUGAUCUGGGCGACCGUUdTdT
    113     135       +   3.000   65.0 CGGUCGCCCAGAUCAAGGCdTdT GCCUUGAUCUGGGCGACCGdTdT
    172     194       +   3.000   70.0 GUCGUGCUCCUGGCAGGCGdTdT CGCCUGCCAGGAGCACGACdTdT
    443     465       +   3.000   35.0 AUGCCAACUCUUAAGUCUUdTdT AAGACUUAAGAGUUGGCAUdTdT
    456     478       +   3.000   35.0 AGUCUUUUGUAAUUCUGGCdTdT GCCAGAAUUACAAAAGACUdTdT
    468     490       +   3.000   30.0 UUCUGGCUUUCUCUAAUAAdTdT UUAUUAGAGAAAGCCAGAAdTdT
    484     506       +   3.000   30.0 UAAAAAAGCCACUUAGUUCdTdT GAACUAAGUGGCUUUUUUAdTdT
    108     130       +   2.000   60.0 GGAAACGGUCGCCCAGAUCdTdT GAUCUGGGCGACCGUUUCCdTdT
    135     157       +   2.000   60.0 UGUAGCCUCACUGGAGGGCdTdT GCCCUCCAGUGAGGCUACAdTdT
    139     161       +   2.000   60.0 GCCUCACUGGAGGGCAUUGdTdT CAAUGCCCUCCAGUGAGGCdTdT
    150     172       +   2.000   60.0 GGGCAUUGCCCCGGAAGAUdTdT AUCUUCCGGGGCAAUGCCCdTdT
    171     193       +   2.000   65.0 AGUCGUGCUCCUGGCAGGCdTdT GCCUGCCAGGAGCACGACUdTdT
    201     223       +   2.000   65.0 GGAUGAGGCCACUCUGGGCdTdT GCCCAGAGUGGCCUCAUCCdTdT
    204     226       +   2.000   65.0 UGAGGCCACUCUGGGCCAGdTdT CUGGCCCAGAGUGGCCUCAdTdT
    245     267       +   2.000   65.0 CCCUGGAAGUAGCAGGCCGdTdT CGGCCUGCUACUUCCAGGGdTdT
    256     278       +   2.000   65.0 GCAGGCCGCAUGCUUGGAGdTdT CUCCAAGCAUGCGGCCUGCdTdT
    285     307       +   2.000   65.0 UGGUUCCCUGGCCCGUGCUdTdT AGCACGGGCCAGGGAACCAdTdT
    338     360       +   2.000   35.0 AACAGGAGAAGAAGAAGAAdTdT UUCUUCUUCUUCUCCUGUUdTdT
    345     367       +   2.000   35.0 GAAGAAGAAGAAGAAGACAdTdT UGUCUUCUUCUUCUUCUUCdTdT
    486     508       +   2.000   35.0 AAAAAGCCACUUAGUUCAGdTdT CUGAACUAAGUGGCUUUUUdTdT

#---------------------------------------
#---------------------------------------

#---------------------------------------
# Total_sequences: 1
# Total_length: 518
# Reported_sequences: 1
# Reported_hitcount: 85
#---------------------------------------

  File: x65923.fasta

>X65923_308 %GC 50.0 Score 9 H.sapiens fau mRNA
aaaagtgagaggtcagactccta
>X65923_309 %GC 50.0 Score 9 H.sapiens fau mRNA
aaagtgagaggtcagactcctaa
>X65923_310 %GC 50.0 Score 9 H.sapiens fau mRNA
aagtgagaggtcagactcctaag
>X65923_351 %GC 50.0 Score 9 H.sapiens fau mRNA
aagaagaagaagacaggtcgggc
>X65923_166 %GC 55.0 Score 8 H.sapiens fau mRNA
aagatcaagtcgtgctcctggca
>X65923_279 %GC 55.0 Score 8 H.sapiens fau mRNA
aaagttcatggttccctggcccg
>X65923_330 %GC 55.0 Score 8 H.sapiens fau mRNA
aaggtggccaaacaggagaagaa
>X65923_354 %GC 55.0 Score 8 H.sapiens fau mRNA
aagaagaagacaggtcgggctaa
>X65923_357 %GC 55.0 Score 8 H.sapiens fau mRNA
aagaagacaggtcgggctaagcg
>X65923_393 %GC 55.0 Score 8 H.sapiens fau mRNA
aaccggcgctttgtcaacgttgt
>X65923_253 %GC 60.0 Score 7 H.sapiens fau mRNA
aagtagcaggccgcatgcttgga
>X65923_280 %GC 60.0 Score 7 H.sapiens fau mRNA
aagttcatggttccctggcccgt
>X65923_339 %GC 40.0 Score 7 H.sapiens fau mRNA
aaacaggagaagaagaagaagaa
>X65923_340 %GC 40.0 Score 7 H.sapiens fau mRNA
aacaggagaagaagaagaagaag
>X65923_348 %GC 40.0 Score 7 H.sapiens fau mRNA
aagaagaagaagaagacaggtcg
>X65923_375 %GC 60.0 Score 7 H.sapiens fau mRNA
aagcggcggatgcagtacaaccg
>X65923_408 %GC 60.0 Score 7 H.sapiens fau mRNA
aacgttgtgcccacctttggcaa
>X65923_429 %GC 60.0 Score 7 H.sapiens fau mRNA
aagaagaagggccccaatgccaa
>X65923_432 %GC 60.0 Score 7 H.sapiens fau mRNA
aagaagggccccaatgccaactc
>X65923_435 %GC 60.0 Score 7 H.sapiens fau mRNA
aagggccccaatgccaactctta
>X65923_488 %GC 40.0 Score 7 H.sapiens fau mRNA
aaaaagccacttagttcagtcaa
>X65923_489 %GC 40.0 Score 7 H.sapiens fau mRNA
aaaagccacttagttcagtcaaa
>X65923_490 %GC 40.0 Score 7 H.sapiens fau mRNA
aaagccacttagttcagtcaaaa
>X65923_491 %GC 40.0 Score 7 H.sapiens fau mRNA
aagccacttagttcagtcaaaaa
>X65923_129 %GC 55.0 Score 6 H.sapiens fau mRNA
aaggctcatgtagcctcactgga


  [Part of this file has been deleted for brevity]

gaggccctgactaccctggaagt
>X65923_259 %GC 60.0 Score 4 H.sapiens fau mRNA
caggccgcatgcttggaggtaaa
>X65923_266 %GC 40.0 Score 4 H.sapiens fau mRNA
catgcttggaggtaaagttcatg
>X65923_342 %GC 40.0 Score 4 H.sapiens fau mRNA
caggagaagaagaagaagaagac
>X65923_347 %GC 40.0 Score 4 H.sapiens fau mRNA
gaagaagaagaagaagacaggtc
>X65923_359 %GC 60.0 Score 4 H.sapiens fau mRNA
gaagacaggtcgggctaagcggc
>X65923_111 %GC 55.0 Score 3 H.sapiens fau mRNA
gaaacggtcgcccagatcaaggc
>X65923_113 %GC 65.0 Score 3 H.sapiens fau mRNA
aacggtcgcccagatcaaggctc
>X65923_172 %GC 70.0 Score 3 H.sapiens fau mRNA
aagtcgtgctcctggcaggcgcg
>X65923_443 %GC 35.0 Score 3 H.sapiens fau mRNA
caatgccaactcttaagtctttt
>X65923_456 %GC 35.0 Score 3 H.sapiens fau mRNA
taagtcttttgtaattctggctt
>X65923_468 %GC 30.0 Score 3 H.sapiens fau mRNA
aattctggctttctctaataaaa
>X65923_484 %GC 30.0 Score 3 H.sapiens fau mRNA
aataaaaaagccacttagttcag
>X65923_108 %GC 60.0 Score 2 H.sapiens fau mRNA
caggaaacggtcgcccagatcaa
>X65923_135 %GC 60.0 Score 2 H.sapiens fau mRNA
catgtagcctcactggagggcat
>X65923_139 %GC 60.0 Score 2 H.sapiens fau mRNA
tagcctcactggagggcattgcc
>X65923_150 %GC 60.0 Score 2 H.sapiens fau mRNA
gagggcattgccccggaagatca
>X65923_171 %GC 65.0 Score 2 H.sapiens fau mRNA
caagtcgtgctcctggcaggcgc
>X65923_201 %GC 65.0 Score 2 H.sapiens fau mRNA
gaggatgaggccactctgggcca
>X65923_204 %GC 65.0 Score 2 H.sapiens fau mRNA
gatgaggccactctgggccagtg
>X65923_245 %GC 65.0 Score 2 H.sapiens fau mRNA
taccctggaagtagcaggccgca
>X65923_256 %GC 65.0 Score 2 H.sapiens fau mRNA
tagcaggccgcatgcttggaggt
>X65923_285 %GC 65.0 Score 2 H.sapiens fau mRNA
catggttccctggcccgtgctgg
>X65923_338 %GC 35.0 Score 2 H.sapiens fau mRNA
caaacaggagaagaagaagaaga
>X65923_345 %GC 35.0 Score 2 H.sapiens fau mRNA
gagaagaagaagaagaagacagg
>X65923_486 %GC 35.0 Score 2 H.sapiens fau mRNA
taaaaaagccacttagttcagtc
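
   The sequence output file pairs each 23-base region with a header line
   giving the sequence name, the start position, the %GC and the score.
   The following sketch reads those records back into Python. The function
   name parse_outseq and the regular expression are hypothetical; they
   assume FASTA output with headers laid out exactly as in the listing
   above.

```python
import re

# Header layout assumed from the example: >NAME_START %GC n Score n text
HEADER = re.compile(
    r"^>(?P<name>\S+)_(?P<start>\d+) %GC (?P<gc>[\d.]+) "
    r"Score (?P<score>-?\d+)(?: (?P<desc>.*))?$"
)


def parse_outseq(text):
    """Return one dict per FASTA record in a sirna sequence output file."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            match = HEADER.match(line)
            if match is None:
                raise ValueError("unexpected header: " + line)
            records.append({
                "name": match.group("name"),
                "start": int(match.group("start")),
                "gc": float(match.group("gc")),
                "score": int(match.group("score")),
                "description": match.group("desc") or "",
                "window": "",
            })
        elif records:
            # Sequence lines belong to the most recent header.
            records[-1]["window"] += line
    return records
```

   Each record's "window" value is the full 23-base target region, which
   is the sequence suggested above for searching mRNA databases.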

  Output files for usage example 2

  File: x65923.sirna

########################################
# Program: sirna
# Rundate: Fri 15 Jul 2011 12:00:00
# Commandline: sirna
#    -context
#    -sequence tembl:x65923
# Report_format: table
# Report_file: x65923.sirna
########################################

#=======================================
#
# Sequence: X65923     from: 1   to: 518
# HitCount: 85
#
# The forward sense sequence shows the first 2 bases of
# the 23 base region in brackets, this should be ignored
# when ordering siRNA probes.
# CDS region found in feature table starting at 57
#
#=======================================

  Start     End  Strand   Score    GC%                 Sense_siRNA         Antisense_siRNA
    308     330       +   9.000   50.0 (AA)AAGUGAGAGGUCAGACUCCdTdT GGAGUCUGACCUCUCACUUdTdT
    309     331       +   9.000   50.0 (AA)AGUGAGAGGUCAGACUCCUdTdT AGGAGUCUGACCUCUCACUdTdT
    310     332       +   9.000   50.0 (AA)GUGAGAGGUCAGACUCCUAdTdT UAGGAGUCUGACCUCUCACdTdT
    351     373       +   9.000   50.0 (AA)GAAGAAGAAGACAGGUCGGdTdT CCGACCUGUCUUCUUCUUCdTdT
    166     188       +   8.000   55.0 (AA)GAUCAAGUCGUGCUCCUGGdTdT CCAGGAGCACGACUUGAUCdTdT
    279     301       +   8.000   55.0 (AA)AGUUCAUGGUUCCCUGGCCdTdT GGCCAGGGAACCAUGAACUdTdT
    330     352       +   8.000   55.0 (AA)GGUGGCCAAACAGGAGAAGdTdT CUUCUCCUGUUUGGCCACCdTdT
    354     376       +   8.000   55.0 (AA)GAAGAAGACAGGUCGGGCUdTdT AGCCCGACCUGUCUUCUUCdTdT
    357     379       +   8.000   55.0 (AA)GAAGACAGGUCGGGCUAAGdTdT CUUAGCCCGACCUGUCUUCdTdT
    393     415       +   8.000   55.0 (AA)CCGGCGCUUUGUCAACGUUdTdT AACGUUGACAAAGCGCCGGdTdT
    253     275       +   7.000   60.0 (AA)GUAGCAGGCCGCAUGCUUGdTdT CAAGCAUGCGGCCUGCUACdTdT
    280     302       +   7.000   60.0 (AA)GUUCAUGGUUCCCUGGCCCdTdT GGGCCAGGGAACCAUGAACdTdT
    339     361       +   7.000   40.0 (AA)ACAGGAGAAGAAGAAGAAGdTdT CUUCUUCUUCUUCUCCUGUdTdT
    340     362       +   7.000   40.0 (AA)CAGGAGAAGAAGAAGAAGAdTdT UCUUCUUCUUCUUCUCCUGdTdT
    348     370       +   7.000   40.0 (AA)GAAGAAGAAGAAGACAGGUdTdT ACCUGUCUUCUUCUUCUUCdTdT
    375     397       +   7.000   60.0 (AA)GCGGCGGAUGCAGUACAACdTdT GUUGUACUGCAUCCGCCGCdTdT
    408     430       +   7.000   60.0 (AA)CGUUGUGCCCACCUUUGGCdTdT GCCAAAGGUGGGCACAACGdTdT
    429     451       +   7.000   60.0 (AA)GAAGAAGGGCCCCAAUGCCdTdT GGCAUUGGGGCCCUUCUUCdTdT
    432     454       +   7.000   60.0 (AA)GAAGGGCCCCAAUGCCAACdTdT GUUGGCAUUGGGGCCCUUCdTdT
    435     457       +   7.000   60.0 (AA)GGGCCCCAAUGCCAACUCUdTdT AGAGUUGGCAUUGGGGCCCdTdT
    488     510       +   7.000   40.0 (AA)AAAGCCACUUAGUUCAGUCdTdT GACUGAACUAAGUGGCUUUdTdT
    489     511       +   7.000   40.0 (AA)AAGCCACUUAGUUCAGUCAdTdT UGACUGAACUAAGUGGCUUdTdT
    490     512       +   7.000   40.0 (AA)AGCCACUUAGUUCAGUCAAdTdT UUGACUGAACUAAGUGGCUdTdT
    491     513       +   7.000   40.0 (AA)GCCACUUAGUUCAGUCAAAdTdT UUUGACUGAACUAAGUGGCdTdT
    129     151       +   6.000   55.0 (AA)GGCUCAUGUAGCCUCACUGdTdT CAGUGAGGCUACAUGAGCCdTdT
    165     187       +   6.000   50.0 (GA)AGAUCAAGUCGUGCUCCUGdTdT CAGGAGCACGACUUGAUCUdTdT
    278     300       +   6.000   50.0 (UA)AAGUUCAUGGUUCCCUGGCdTdT GCCAGGGAACCAUGAACUUdTdT


  [Part of this file has been deleted for brevity]

    374     396       +   5.000   55.0 (UA)AGCGGCGGAUGCAGUACAAdTdT UUGUACUGCAUCCGCCGCUdTdT
    383     405       +   5.000   55.0 (GA)UGCAGUACAACCGGCGCUUdTdT AAGCGCCGGUUGUACUGCAdTdT
    387     409       +   5.000   55.0 (CA)GUACAACCGGCGCUUUGUCdTdT GACAAAGCGCCGGUUGUACdTdT
    390     412       +   5.000   55.0 (UA)CAACCGGCGCUUUGUCAACdTdT GUUGACAAAGCGCCGGUUGdTdT
    392     414       +   5.000   55.0 (CA)ACCGGCGCUUUGUCAACGUdTdT ACGUUGACAAAGCGCCGGUdTdT
    407     429       +   5.000   55.0 (CA)ACGUUGUGCCCACCUUUGGdTdT CCAAAGGUGGGCACAACGUdTdT
    428     450       +   5.000   55.0 (CA)AGAAGAAGGGCCCCAAUGCdTdT GCAUUGGGGCCCUUCUUCUdTdT
    431     453       +   5.000   55.0 (GA)AGAAGGGCCCCAAUGCCAAdTdT UUGGCAUUGGGGCCCUUCUdTdT
    434     456       +   5.000   60.0 (GA)AGGGCCCCAAUGCCAACUCdTdT GAGUUGGCAUUGGGGCCCUdTdT
    444     466       +   5.000   35.0 (AA)UGCCAACUCUUAAGUCUUUdTdT AAAGACUUAAGAGUUGGCAdTdT
    487     509       +   5.000   35.0 (AA)AAAAGCCACUUAGUUCAGUdTdT ACUGAACUAAGUGGCUUUUdTdT
    123     145       +   4.000   50.0 (CA)GAUCAAGGCUCAUGUAGCCdTdT GGCUACAUGAGCCUUGAUCdTdT
    125     147       +   4.000   50.0 (GA)UCAAGGCUCAUGUAGCCUCdTdT GAGGCUACAUGAGCCUUGAdTdT
    128     150       +   4.000   50.0 (CA)AGGCUCAUGUAGCCUCACUdTdT AGUGAGGCUACAUGAGCCUdTdT
    155     177       +   4.000   50.0 (CA)UUGCCCCGGAAGAUCAAGUdTdT ACUUGAUCUUCCGGGGCAAdTdT
    234     256       +   4.000   60.0 (GA)GGCCCUGACUACCCUGGAAdTdT UUCCAGGGUAGUCAGGGCCdTdT
    259     281       +   4.000   60.0 (CA)GGCCGCAUGCUUGGAGGUAdTdT UACCUCCAAGCAUGCGGCCdTdT
    266     288       +   4.000   40.0 (CA)UGCUUGGAGGUAAAGUUCAdTdT UGAACUUUACCUCCAAGCAdTdT
    342     364       +   4.000   40.0 (CA)GGAGAAGAAGAAGAAGAAGdTdT CUUCUUCUUCUUCUUCUCCdTdT
    347     369       +   4.000   40.0 (GA)AGAAGAAGAAGAAGACAGGdTdT CCUGUCUUCUUCUUCUUCUdTdT
    359     381       +   4.000   60.0 (GA)AGACAGGUCGGGCUAAGCGdTdT CGCUUAGCCCGACCUGUCUdTdT
    111     133       +   3.000   55.0 (GA)AACGGUCGCCCAGAUCAAGdTdT CUUGAUCUGGGCGACCGUUdTdT
    113     135       +   3.000   65.0 (AA)CGGUCGCCCAGAUCAAGGCdTdT GCCUUGAUCUGGGCGACCGdTdT
    172     194       +   3.000   70.0 (AA)GUCGUGCUCCUGGCAGGCGdTdT CGCCUGCCAGGAGCACGACdTdT
    443     465       +   3.000   35.0 (CA)AUGCCAACUCUUAAGUCUUdTdT AAGACUUAAGAGUUGGCAUdTdT
    456     478       +   3.000   35.0 (UA)AGUCUUUUGUAAUUCUGGCdTdT GCCAGAAUUACAAAAGACUdTdT
    468     490       +   3.000   30.0 (AA)UUCUGGCUUUCUCUAAUAAdTdT UUAUUAGAGAAAGCCAGAAdTdT
    484     506       +   3.000   30.0 (AA)UAAAAAAGCCACUUAGUUCdTdT GAACUAAGUGGCUUUUUUAdTdT
    108     130       +   2.000   60.0 (CA)GGAAACGGUCGCCCAGAUCdTdT GAUCUGGGCGACCGUUUCCdTdT
    135     157       +   2.000   60.0 (CA)UGUAGCCUCACUGGAGGGCdTdT GCCCUCCAGUGAGGCUACAdTdT
    139     161       +   2.000   60.0 (UA)GCCUCACUGGAGGGCAUUGdTdT CAAUGCCCUCCAGUGAGGCdTdT
    150     172       +   2.000   60.0 (GA)GGGCAUUGCCCCGGAAGAUdTdT AUCUUCCGGGGCAAUGCCCdTdT
    171     193       +   2.000   65.0 (CA)AGUCGUGCUCCUGGCAGGCdTdT GCCUGCCAGGAGCACGACUdTdT
    201     223       +   2.000   65.0 (GA)GGAUGAGGCCACUCUGGGCdTdT GCCCAGAGUGGCCUCAUCCdTdT
    204     226       +   2.000   65.0 (GA)UGAGGCCACUCUGGGCCAGdTdT CUGGCCCAGAGUGGCCUCAdTdT
    245     267       +   2.000   65.0 (UA)CCCUGGAAGUAGCAGGCCGdTdT CGGCCUGCUACUUCCAGGGdTdT
    256     278       +   2.000   65.0 (UA)GCAGGCCGCAUGCUUGGAGdTdT CUCCAAGCAUGCGGCCUGCdTdT
    285     307       +   2.000   65.0 (CA)UGGUUCCCUGGCCCGUGCUdTdT AGCACGGGCCAGGGAACCAdTdT
    338     360       +   2.000   35.0 (CA)AACAGGAGAAGAAGAAGAAdTdT UUCUUCUUCUUCUCCUGUUdTdT
    345     367       +   2.000   35.0 (GA)GAAGAAGAAGAAGAAGACAdTdT UGUCUUCUUCUUCUUCUUCdTdT
    486     508       +   2.000   35.0 (UA)AAAAAGCCACUUAGUUCAGdTdT CUGAACUAAGUGGCUUUUUdTdT

#---------------------------------------
#---------------------------------------

#---------------------------------------
# Total_sequences: 1
# Total_length: 518
# Reported_sequences: 1
# Reported_hitcount: 85
#---------------------------------------

   The siRNAs are reported in order of best score first.

   sirna reports both the sense and antisense siRNAs as 5' to 3'.

Data files

   None.

Notes

   RNA interference (RNAi) is a phenomenon whereby small interfering RNA
   strands (siRNA) inhibit gene expression at the level of transcription
   or translation of specific genes. RNAi is a defence mechanism against
   viruses and is important in regulating development and genome
   maintenance. siRNAs are double-stranded RNA molecules where one or the
   other strand is strongly complementary to a target RNA strand. Once
   they bind to a target, a nuclease protein guided by the siRNA cleaves
   the target and renders it untranslatable.

   Gene silencing using RNAi has been used to determine the function of
   many genes in Drosophila, C. elegans, and many plant species. The
   knockdown produced by siRNA typically lasts for 7-10 days, and has
   been shown to transfer to daughter cells. Of further note, siRNAs are
   effective at quantities much lower than alternative gene silencing
   methodologies, including antisense and ribozyme-based strategies.

   Due to various mechanisms of antiviral response to long dsRNA, RNAi at
   first proved more difficult to establish in mammalian species. Then,
   Tuschl, Elbashir, and others discovered that RNAi can be elicited very
   effectively by well-defined 21-base duplex RNAs. When these small
   interfering RNA, or siRNA, are added in duplex form with a transfection
   agent to mammalian cell cultures, the 21-base-pair RNA acts in concert
   with cellular components to silence the gene with sequence homology to
   one of the siRNA sequences. Strategies for the design of effective
   siRNA sequences have been recently documented, most notably by Sayda
   Elbashir, Thomas Tuschl, et al.

   Their studies of mammalian RNAi suggest that the most efficient
   gene-silencing effect is achieved using double-stranded siRNA having a
   19-nucleotide complementary region and a 2-nucleotide 3' overhang at
   each end. Current models of the RNAi mechanism suggest that the
   antisense siRNA strand recognizes the specific gene target.

   In gene-specific RNAi, the coding region (CDS) of the mRNA is usually
   targeted. The search for an appropriate target sequence should begin
   50-100 nucleotides downstream of the start codon. UTR-binding proteins
   and/or translation initiation complexes may interfere with the binding
   of the siRNP endonuclease complex. Tuschl, Elbashir et al. say that
   they have successfully used siRNAs targeting the 3' UTR. To avoid
   interference from mRNA regulatory proteins, sequences in the 5'
   untranslated region or near the start codon should not be targeted.

   A set of rules for the design of siRNA has been suggested
   (http://www.mpibpc.gwdg.de/abteilungen/100/105/sirna.html) based on the
   work of Tuschl, Elbashir et al. They suggest searching for the 23-nt
   sequence motif AA(N19)TT (N, any nucleotide) and selecting hits with
   approx. 50% G/C-content (30% to 70% has also worked for them). If no
   suitable sequences are found, the search is extended using the motif
   NA(N21). The sequence of the sense siRNA corresponds to (N19)TT or N21
   (position 3 to 23 of the 23-nt motif), respectively. In the latter
   case, they convert the 3' end of the sense siRNA to TT.

   The rationale for this sequence conversion is to generate a symmetric
   duplex with respect to the sequence composition of the sense and
   antisense 3' overhangs. The antisense siRNA is synthesized as the
   complement to position 1 to 21 of the 23-nt motif. Because position 1
   of the 23-nt motif is not recognized sequence-specifically by the
   antisense siRNA, the 3'-most nucleotide residue of the antisense siRNA
   can be chosen deliberately. However, the penultimate nucleotide of the
   antisense siRNA (complementary to position 2 of the 23-nt motif) should
   always be complementary to the targeted sequence. For simplifying
   chemical synthesis, they always use TT.
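
   As an illustration only (this is not the sirna source code), the
   AA(N19)TT motif search described above could be sketched in Python as
   follows. The function name find_aa_n19_tt is hypothetical, and
   measuring %GC over positions 2 to 21 of the window is an assumption
   borrowed from the Algorithm section rather than from the published
   rules.

```python
import re


def find_aa_n19_tt(mrna):
    """Return (start, window, gc_percent) for every 23-nt AA(N19)TT hit.

    start is 1-based. U is treated as T, so RNA or DNA input both work.
    Hypothetical helper for illustration; not part of EMBOSS.
    """
    seq = mrna.upper().replace("U", "T")
    hits = []
    # A lookahead is used so that overlapping windows are all reported.
    for match in re.finditer(r"(?=(AA[ACGT]{19}TT))", seq):
        window = match.group(1)
        region = window[1:21]  # positions 2 to 21 (assumed %GC region)
        gc_percent = 100.0 * sum(base in "GC" for base in region) / len(region)
        hits.append((match.start() + 1, window, gc_percent))
    return hits
```

   Hits returned by such a function would still need to be filtered on
   the 30% to 70% G/C range mentioned above before being considered.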

   More recently, they preferentially select siRNAs corresponding to the
   target motif NAR(N17)YNN, where R is purine (A, G) and Y is pyrimidine
   (C, U). The respective 21-nt sense and antisense siRNAs therefore begin
   with a purine nucleotide and can also be expressed from pol III
   expression vectors without a change in targeting site; expression of
   RNAs from pol III promoters is only efficient when the first
   transcribed nucleotide is a purine.
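
   A minimal sketch of testing a 23-base window against the NAR(N17)YNN
   motif is given below. The names POL3_MOTIF and matches_pol3_motif are
   hypothetical and the check is an illustration, not the test that
   sirna itself performs.

```python
import re

# N = any base, R = purine (A or G), Y = pyrimidine (C or U).
# 1 + 1 + 1 + 17 + 1 + 2 = 23 bases in total.
POL3_MOTIF = re.compile(r"^[ACGU]A[AG][ACGU]{17}[CU][ACGU]{2}$")


def matches_pol3_motif(window):
    """Return True if a 23-base window fits NAR(N17)YNN.

    T is treated as U so that DNA-style input is accepted.
    """
    rna = window.upper().replace("T", "U")
    return POL3_MOTIF.match(rna) is not None
```

   Position 3 is the first base of the sense siRNA and position 21 pairs
   with the first base of the antisense siRNA, which is why a purine at
   position 3 and a pyrimidine at position 21 give two strands that both
   begin with a purine.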

   They always design siRNAs with symmetric 3' TT overhangs, believing
   that symmetric 3' overhangs help to ensure that the siRNPs are formed
   with approximately equal ratios of sense and antisense target
   RNA-cleaving siRNPs. Please note that the modification of the overhang
   of the sense sequence of the siRNA duplex is not expected to affect
   targeted mRNA recognition, as the antisense siRNA strand guides target
   recognition. In summary, no matter what you do to your overhangs,
   siRNAs should still function to a reasonable extent. However, using TT
   in the 3' overhang will always help your RNA synthesis company to let
   you know when you accidentally order an siRNA sequence 3' to 5' rather
   than in the recommended format of 5' to 3'. sirna reports both the
   sense and antisense siRNAs as 5' to 3'.

   Xeragon.com also suggest that choosing a region of the mRNA with a GC
   content as close as possible to 50% is a more important consideration
   than choosing a target sequence that begins with AA. They also suggest
   that a key consideration in target selection is to avoid having more
   than three guanosines in a row, since poly G sequences can hyperstack
   and form agglomerates that potentially interfere with the siRNA
   silencing mechanism.
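
   The guanosine rule above ("more than three in a row") amounts to
   rejecting any run of four or more G's. A short sketch, using a
   hypothetical function name has_g_run, might look like this:

```python
import re


def has_g_run(seq, min_run=4):
    """Return True if seq contains min_run or more consecutive G's.

    Hypothetical helper for illustration; min_run=4 corresponds to
    "more than three guanosines in a row".
    """
    return re.search("G{%d,}" % min_run, seq.upper()) is not None
```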

   siRNAs appear to effectively silence genes in more than 80% of cases.
   Current data indicate that there are regions of some mRNAs where gene
   silencing does not work. To help ensure that a given target gene is
   silenced, it is advised that at least two target sequences as far apart
   on the gene as possible be chosen.

  Coding region specification

   It's possible (although the evidence is unclear) that regulatory
   protein binding to regions in and near the untranslated 5' region might
   interfere with the RNAi process. Therefore, this program avoids
   choosing siRNA probes from the 5' UTR and from the first 50 bases of
   the coding region. The second 50 bases of the coding region have a
   penalty associated with them to reduce the reporting of possible siRNA
   probes in this region. If the input sequence has a feature table
   specifying a coding region, then this will be used; otherwise you can
   specify the start of the coding region, where this is known, with the
   -sbegin command-line qualifier (which is normally used to specify the
   start of the region of a sequence that should be analysed in all EMBOSS
   programs).

   sirna looks at the feature table of the input mRNA sequence to find
   the coding regions (CDS). It will ignore the 5' UTR and the first 50
   bases of the CDS. It will assign a penalty of 2 points to any siRNA in
   positions 51 to 100 in the CDS. If there is no CDS in the feature
   table, you can specify the CDS by using the command-line qualifier
   -sbegin to indicate where the CDS should start. If there is no CDS in
   the feature table and you do not use the command-line qualifier
   -sbegin, then sirna will assume that the CDS region is not known and
   will look for siRNAs in the whole of the sequence with no penalties
   associated with the location within the sequence.

  All these confusing regions

   There are a lot of references to 23 base regions, 21 base regions, 19
   base regions, etc. in any description of siRNA. Perhaps an example
   with a sequence would be clearer? The 23 base region, in this case
   starting with an AA, might typically look like:
5' AAGUGAGAGGUCAGACUCCUATC

   The sense siRNA is made from the 19 bases of positions 3 to 21 of the
   23 base target region, so:
5'   GUGAGAGGUCAGACUCCUA

   and then typically d(TT) is added, so:
5'   GUGAGAGGUCAGACUCCUAdTdT

   The antisense siRNA sequence is the complement of bases 3 to 21 of the
   target region, so:
5'   GUGAGAGGUCAGACUCCUA sense
3'   CACUCUCCAGUCUGAGGAU antisense 3' -> 5'

   so the antisense sequence that should be ordered with d(TT) added is:
5'   UAGGAGUCUGACCUCUCACdTdT antisense 5' -> 3'
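
   The worked example above can be reproduced with a few lines of Python.
   This is a sketch under the assumptions of the example (a 23-base
   target region, a 19-base core taken from positions 3 to 21, and dTdT
   overhangs on both strands); the names COMPLEMENT and sirna_duplex are
   hypothetical and do not come from the EMBOSS source.

```python
# RNA base pairing used to build the antisense strand.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def sirna_duplex(window23):
    """Return (sense, antisense), both written 5' to 3' with dTdT added.

    window23 is the 23-base target region; T is treated as U.
    Bases other than A, C, G, U/T raise KeyError in this sketch.
    """
    rna = window23.upper().replace("T", "U")
    if len(rna) != 23:
        raise ValueError("expected a 23-base target region")
    core = rna[2:21]  # positions 3 to 21 (1-based), 19 bases
    sense = core + "dTdT"
    # Reverse the core, then complement each base, to read 5' to 3'.
    antisense = "".join(COMPLEMENT[base] for base in reversed(core)) + "dTdT"
    return sense, antisense
```

   Called with the target region AAGUGAGAGGUCAGACUCCUATC, this returns
   the sense and antisense sequences shown in the example above.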

References

    1. Elbashir, S. M., et al. (2001a). Duplexes of 21-nucleotide RNAs
       mediate RNA interference in mammalian cell culture. Nature 411:
       494-498.
    2. Elbashir, S. M., W. Lendeckel and T. Tuschl (2001b). RNA
       interference is mediated by 21 and 22 nt RNAs. Genes & Dev. 15:
       188-200.

Warnings

   It is assumed that the input sequence is mRNA.

Diagnostic Error Messages

   None.

Exit status

   It always exits with status 0.

Known bugs

   None.

See also

   Program name     Description
   banana           Plot bending and curvature data for B-DNA
   btwisted         Calculate the twisting in a B-DNA sequence
   einverted        Finds inverted repeats in nucleotide sequences
   marscan          Finds matrix/scaffold recognition (MRS) signatures in DNA
                    sequences
   trimest          Remove poly-A tails from nucleotide sequences

Author(s)

   Gary Williams formerly at:
   MRC Rosalind Franklin Centre for Genomics Research Wellcome Trust
   Genome Campus, Hinxton, Cambridge, CB10 1SB, UK

   Please report all bugs to the EMBOSS bug team
   (emboss-bug (c) emboss.open-bio.org) not to the original author.

History

   Written (November 2002) - Gary Williams.

Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.

Comments

   None