org.biojavax.bio.seq
Interface RichSequenceHandler

All Known Implementing Classes:
BioSQLRichSequenceHandler, DummyRichSequenceHandler

public interface RichSequenceHandler

An interface for classes that know how to handle subsequence operations. Implementations may be optimized to perform more efficiently under certain conditions. For example, a subsequence operation on a very large BioSQL-backed RichSequence could be carried out without first loading the whole sequence into memory. Implementations of RichSequence should generally delegate symbolAt(int index), subStr(int start, int end), subList(int start, int end) and subSequence(int start, int end) to some implementation of this interface.
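
The delegation pattern described here can be sketched as follows. This is a minimal illustration that uses plain Java strings in place of BioJava symbols so that it stands alone; SubsequenceHandler, InMemoryHandler and DelegatingSequence are hypothetical names and are not part of BioJava.

```java
// Hypothetical, simplified stand-ins; none of these types exist in BioJava.
interface SubsequenceHandler {
    // Coordinates are 1-based and inclusive, as in RichSequenceHandler.
    String subStr(DelegatingSequence seq, int start, int end);

    char symbolAt(DelegatingSequence seq, int index);
}

// Trivial handler that works on a sequence already held in memory.
class InMemoryHandler implements SubsequenceHandler {
    @Override
    public String subStr(DelegatingSequence seq, int start, int end) {
        if (start < 1 || end > seq.length() || start > end) {
            throw new IndexOutOfBoundsException(start + ".." + end);
        }
        // Shift the 1-based inclusive range to Java's 0-based half-open range.
        return seq.residues().substring(start - 1, end);
    }

    @Override
    public char symbolAt(DelegatingSequence seq, int index) {
        return subStr(seq, index, index).charAt(0);
    }
}

// The sequence holds no subsequence logic of its own; it forwards to its handler.
class DelegatingSequence {
    private final String residues;
    private final SubsequenceHandler handler;

    DelegatingSequence(String residues, SubsequenceHandler handler) {
        this.residues = residues;
        this.handler = handler;
    }

    String residues() { return residues; }

    int length() { return residues.length(); }

    String subStr(int start, int end) { return handler.subStr(this, start, end); }

    char symbolAt(int index) { return handler.symbolAt(this, index); }
}
```

Under this arrangement a database-backed handler could implement the same interface and fetch only the requested window, without any change to the sequence class.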

Since:
1.5
Author:
Mark Schreiber, Richard Holland

Method Summary
 void edit(RichSequence seq, Edit edit)
          Apply an edit to the Sequence as specified by the edit object.
 Iterator iterator(RichSequence seq)
          An Iterator over all Symbols in this SymbolList.
 String seqString(RichSequence seq)
          Stringify this Sequence.
 SymbolList subList(RichSequence seq, int start, int end)
          Return a new SymbolList for the symbols start to end inclusive.
 String subStr(RichSequence seq, int start, int end)
          Return a region of this sequence as a String.
 Symbol symbolAt(RichSequence seq, int index)
          Return the symbol at index, counting from 1.
 List toList(RichSequence seq)
          Returns a List of symbols.
 

Method Detail

edit

void edit(RichSequence seq,
          Edit edit)
          throws IndexOutOfBoundsException,
                 IllegalAlphabetException,
                 ChangeVetoException
Apply an edit to the Sequence as specified by the edit object.

Description

All edits can be broken down into a series of operations that change contiguous blocks of the sequence. An Edit represents one of those operations.

When applied, this Edit will replace 'length' symbols, starting at position 'pos', with the SymbolList 'replacement'. This allows insertions (length=0), deletions (replacement=SymbolList.EMPTY_LIST) and replacements (length>=1 and replacement.length()>=1).
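
The replacement rule can be modelled on a plain string. The sketch below is an illustration only: EditSketch and its apply method are hypothetical and are not the BioJava Edit class, and the bounds check is an assumption based on the 1-based coordinates used throughout this interface.

```java
// Hypothetical string-based model of the edit rule; not the BioJava Edit class.
class EditSketch {
    /**
     * Replaces 'length' characters, starting at 1-based position 'pos',
     * with 'replacement' and returns the edited string.
     */
    static String apply(String seq, int pos, int length, String replacement) {
        // Assumed bounds: the edited block must lie within 1..seq.length()+1.
        if (pos < 1 || length < 0 || pos + length > seq.length() + 1) {
            throw new IndexOutOfBoundsException("pos=" + pos + ", length=" + length);
        }
        return seq.substring(0, pos - 1)        // symbols before the block
             + replacement                       // the new symbols
             + seq.substring(pos - 1 + length);  // symbols after the block
    }

    public static void main(String[] args) {
        String seq = "acgtacgtac";
        System.out.println(apply(seq, 4, 5, ""));     // deletion (empty replacement)
        System.out.println(apply(seq, 1, 0, "tt"));   // insertion at the start (length=0)
        System.out.println(apply(seq, 11, 0, "tt"));  // append after the last symbol
        System.out.println(apply(seq, 3, 2, "nnn"));  // replacement of 2 symbols by 3
    }
}
```

Insertion, deletion and replacement all pass through the same three-part concatenation; they differ only in whether length or the replacement is empty.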

Both pos and pos+length should always be valid positions on the SymbolList being edited.

Examples

 RichSequence seq = //code to initialize RichSequence
 System.out.println(seq.seqString());

 // delete 5 bases from position 4
 Edit ed = new Edit(4, 5, SymbolList.EMPTY_LIST);
 seq.edit(ed);
 System.out.println(seq.seqString());

 // delete one base from the start
 ed = new Edit(1, 1, SymbolList.EMPTY_LIST);
 seq.edit(ed);

 // delete one base from the end
 ed = new Edit(seq.length(), 1, SymbolList.EMPTY_LIST);
 seq.edit(ed);
 System.out.println(seq.seqString());

 // overwrite 2 bases from position 3 with "tt"
 ed = new Edit(3, 2, DNATools.createDNA("tt"));
 seq.edit(ed);
 System.out.println(seq.seqString());

 // add 6 bases to the start
 ed = new Edit(1, 0, DNATools.createDNA("aattgg"));
 seq.edit(ed);
 System.out.println(seq.seqString());

 // add 4 bases to the end
 ed = new Edit(seq.length() + 1, 0, DNATools.createDNA("tttt"));
 seq.edit(ed);
 System.out.println(seq.seqString());

 // full edit: replace 2 bases from position 3 with 7 bases
 ed = new Edit(3, 2, DNATools.createDNA("aatagaa"));
 seq.edit(ed);
 System.out.println(seq.seqString());
 

Parameters:
edit - the Edit to perform
Throws:
IndexOutOfBoundsException - if the edit does not lie within the SymbolList
IllegalAlphabetException - if the SymbolList to insert has an incompatible alphabet
ChangeVetoException - if either the SymbolList does not support the edit, or if the change was vetoed

symbolAt

Symbol symbolAt(RichSequence seq,
                int index)
                throws IndexOutOfBoundsException
Return the symbol at index, counting from 1.

Parameters:
index - the offset into this SymbolList
Returns:
the Symbol at that index
Throws:
IndexOutOfBoundsException - if index is less than 1, or greater than the length of the symbol list
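
The 1-based indexing rule can be illustrated on a plain string. OneBasedIndexSketch is a hypothetical helper written for this example and is not part of BioJava.

```java
// Hypothetical helper illustrating 1-based indexing with a bounds check.
class OneBasedIndexSketch {
    /** Returns the character at a 1-based index of the given string. */
    static char symbolAt(String seq, int index) {
        if (index < 1 || index > seq.length()) {
            throw new IndexOutOfBoundsException(
                "index " + index + " is not in 1.." + seq.length());
        }
        return seq.charAt(index - 1); // shift to Java's 0-based indexing
    }
}
```

Index 0, which is valid for Java strings and arrays, is out of bounds under this convention.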

toList

List toList(RichSequence seq)
Returns a List of symbols.

This should be an immutable list of symbols or a copy.

Returns:
a List of Symbols
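
The two options mentioned above, an immutable list or a copy, could be realised along these lines. ToListSketch is a hypothetical class that stores characters instead of Symbols; it only illustrates the difference between the two approaches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder of symbols, modelled here as characters.
class ToListSketch {
    private final List<Character> symbols = new ArrayList<>();

    ToListSketch(String seq) {
        for (char c : seq.toCharArray()) {
            symbols.add(c);
        }
    }

    // Option 1: a read-only view; attempts to modify it throw
    // UnsupportedOperationException.
    List<Character> toImmutableView() {
        return Collections.unmodifiableList(symbols);
    }

    // Option 2: an independent copy; callers may change it without
    // affecting the underlying symbols.
    List<Character> toCopy() {
        return new ArrayList<>(symbols);
    }
}
```

Either option keeps callers from changing the underlying symbols through the returned List.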

subStr

String subStr(RichSequence seq,
              int start,
              int end)
              throws IndexOutOfBoundsException
Return a region of this sequence as a String.

This should use the same rules as seqString.

Parameters:
start - the first symbol to include
end - the last symbol to include
Returns:
the string representation
Throws:
IndexOutOfBoundsException - if either start or end is not within the SymbolList

subList

SymbolList subList(RichSequence seq,
                   int start,
                   int end)
                   throws IndexOutOfBoundsException
Return a new SymbolList for the symbols start to end inclusive.

The resulting SymbolList will count from 1 to (end-start + 1) inclusive, and refer to the symbols start to end of the original sequence.

Parameters:
start - the first symbol of the new SymbolList
end - the last symbol (inclusive) of the new SymbolList
Throws:
IndexOutOfBoundsException - if either start or end is not within the SymbolList
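
The coordinate arithmetic for subList can be sketched with java.util.List. SubListSketch and its methods are hypothetical helpers for illustration; they model the inclusive 1-based range and the mapping from sub-list positions back to the original sequence.

```java
import java.util.List;

// Hypothetical helpers illustrating 1-based, inclusive sub-list coordinates.
class SubListSketch {
    /** Returns a view of positions start..end (1-based, inclusive). */
    static <T> List<T> subList(List<T> symbols, int start, int end) {
        if (start < 1 || end > symbols.size() || start > end) {
            throw new IndexOutOfBoundsException(start + ".." + end);
        }
        // java.util.List uses 0-based, end-exclusive ranges.
        return symbols.subList(start - 1, end);
    }

    /** Maps a 1-based position in the sub-list to its position in the original. */
    static int toOriginal(int start, int subIndex) {
        return start + subIndex - 1;
    }
}
```

The resulting list has end-start+1 elements, so position 1 of the sub-list corresponds to position start of the original and the last position corresponds to end.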

seqString

String seqString(RichSequence seq)
Stringify this Sequence.

It is expected that this will use the symbol's token to render each symbol. It should be parsable back into a SymbolList using the default token parser for this alphabet.

Returns:
a string representation of the symbol list

iterator

Iterator iterator(RichSequence seq)
An Iterator over all Symbols in this SymbolList.

This is an ordered iterator over the Symbols. It cannot be used to edit the underlying symbols.

Returns:
an iterator
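
An ordered, read-only iterator of the kind described above might look like this. ReadOnlySymbolIterator is a hypothetical class that walks over the characters of a string; it is a sketch and not the BioJava implementation.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical read-only iterator over the characters of a string.
class ReadOnlySymbolIterator implements Iterator<Character> {
    private final String seq;
    private int next = 0; // 0-based position of the next character to return

    ReadOnlySymbolIterator(String seq) {
        this.seq = seq;
    }

    @Override
    public boolean hasNext() {
        return next < seq.length();
    }

    @Override
    public Character next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return seq.charAt(next++);
    }

    // Stated explicitly for clarity: the underlying symbols cannot be edited.
    @Override
    public void remove() {
        throw new UnsupportedOperationException("read-only iterator");
    }
}
```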