EMBOSS: splitter

splitter

Function

Description

splitter splits one or more input sequences into smaller, optionally overlapping, sub-sequences. The sub-sequence size and the overlap, if any, can be specified. Feature information can optionally be used.

Usage

Command line arguments

Input File Format

splitter reads one or more nucleotide or protein sequences.

Output File Format

Each output sequence takes the name of the original sequence with '_start-end' appended, where 'start' and 'end' are the start and end positions of the sub-sequence. For example, if the sequence U01317 were split at a size of 50000 with no overlap, the sub-sequences would be named U01317_1-50000 and U01317_50001-73308.
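The naming scheme can be illustrated with a minimal Python sketch. It is not the splitter source code: the function names are hypothetical, and the handling of overlap (each window is `size` long and successive windows start `size - overlap` positions apart) is an assumption that may differ from what splitter actually does. Only the no-overlap case is confirmed by the example above.

```python
def split_ranges(length, size, overlap=0):
    """Return 1-based inclusive (start, end) pairs covering a sequence.

    Assumption: overlap shortens the step between window starts;
    the real splitter program may treat overlap differently.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be at least 0 and less than size")
    ranges = []
    start = 1
    step = size - overlap
    while start <= length:
        end = min(start + size - 1, length)  # last window may be shorter
        ranges.append((start, end))
        if end == length:
            break
        start += step
    return ranges


def subsequence_names(name, length, size, overlap=0):
    """Build 'name_start-end' labels for each sub-sequence."""
    return [f"{name}_{s}-{e}" for s, e in split_ranges(length, size, overlap)]


# The example from the text: a 73308-base sequence split at size 50000.
print(subsequence_names("U01317", 73308, 50000))
# → ['U01317_1-50000', 'U01317_50001-73308']
```

The positions are 1-based and inclusive, which matches the names shown in the example.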

Data files

None.

Notes

Splitting a large sequence into smaller sub-sequences can be useful when a particularly memory- or CPU-intensive application will not run quickly enough, or at all, on the full sequence. This should seldom be necessary in EMBOSS.

By default, splitter writes all the sub-sequences to a single file. In some cases, particularly where non-EMBOSS programs are used, it is necessary to have a single sequence per file. To write the sub-sequences into separate files, use the command-line switch -ossingle.
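As a rough sketch, a splitter command line that requests one file per sub-sequence could be assembled from Python as shown here. The helper `splitter_command` and the file names are hypothetical. Apart from -ossingle, the qualifiers used (-sequence, -outseq, -size, -overlap) are assumed from the standard EMBOSS splitter options and should be checked against `splitter -help` on the installed version.

```python
def splitter_command(sequence, outseq, size=None, overlap=None, single_files=False):
    """Build an argument list for splitter (hypothetical helper).

    Qualifier names other than -ossingle are assumptions; verify them
    with 'splitter -help' before relying on this.
    """
    cmd = ["splitter", "-sequence", sequence, "-outseq", outseq]
    if size is not None:
        cmd += ["-size", str(size)]
    if overlap is not None:
        cmd += ["-overlap", str(overlap)]
    if single_files:
        cmd.append("-ossingle")  # one sub-sequence per output file
    return cmd


# Hypothetical input and output file names.
cmd = splitter_command("u01317.fasta", "u01317_split.fasta",
                       size=50000, single_files=True)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where EMBOSS is installed.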
