splitter

Function

Description

splitter splits one or more input sequences into smaller subsequences, which may optionally overlap. Both the subsequence size and the overlap (if any) may be specified. Feature information can optionally be used.
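As an illustration of how size and overlap interact, the following Python sketch computes 1-based, inclusive start and end positions for each subsequence. It is not splitter's source code: the function name split_ranges is hypothetical, and the overlap rule used here (each subsequence runs 'overlap' positions past its nominal end, while start positions advance by 'size') is an assumption made for the example.

```python
def split_ranges(length, size, overlap=0):
    """Return 1-based inclusive (start, end) pairs covering a sequence.

    Assumption for this sketch: starts advance by `size`, and each piece
    is extended by `overlap` positions, truncated at the sequence end.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if overlap < 0:
        raise ValueError("overlap must not be negative")

    ranges = []
    start = 1
    while start <= length:
        # Nominal end is start + size - 1; the overlap extends it further.
        end = min(start + size + overlap - 1, length)
        ranges.append((start, end))
        start += size
    return ranges


if __name__ == "__main__":
    for start, end in split_ranges(100, 40, overlap=10):
        print(start, end)
```

With an overlap of 0 the pieces are contiguous and do not share any positions.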

Usage

Command line arguments


Input File Format

splitter reads one or more nucleotide or protein sequences.

Output File Format

The name of each sub-sequence is the name of the original sequence with '_start-end' appended, where 'start' and 'end' are the start and end positions of the sub-sequence. For example, if the sequence U01317 were split at a size of 50000 with no overlap, the sub-sequences would be named U01317_1-50000 and U01317_50001-73308.
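The naming rule can be sketched in a few lines of Python. This is a minimal illustration for the no-overlap case only, not splitter's own code; the function name subsequence_names is hypothetical, and the length 73308 is taken from the end position in the example above.

```python
def subsequence_names(name, length, size):
    """Build '<name>_<start>-<end>' names for non-overlapping pieces.

    Positions are 1-based and inclusive; the last piece may be shorter.
    """
    if size <= 0:
        raise ValueError("size must be positive")

    names = []
    for start in range(1, length + 1, size):
        end = min(start + size - 1, length)
        names.append(f"{name}_{start}-{end}")
    return names


if __name__ == "__main__":
    for sub_name in subsequence_names("U01317", 73308, 50000):
        print(sub_name)
```

Run as a script, this prints U01317_1-50000 and U01317_50001-73308, matching the names given above.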

Data files

None.

Notes

Splitting a large sequence into smaller sub-sequences for analysis can be useful when a particularly memory- or CPU-intensive application will not run quickly enough, or will not run at all, on the full sequence. This should seldom be necessary in EMBOSS.

By default, splitter writes all the sub-sequences to a single file. In some cases, particularly where non-EMBOSS programs are used, it is necessary to have a single sequence per file. To write the sub-sequences into separate files, use the command-line switch -ossingle.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with status 0.

Known bugs

None.

Author(s)

History

Target users

Comments