wordmatch

Function

Description

wordmatch finds all regions of identity (exact matches) of at least a specified minimum size between two input sequences. These regions are reported in a standard EMBOSS alignment file and in standard EMBOSS sequence feature files.

Usage

Command line arguments


Input file format

wordmatch reads two sets of nucleotide or protein sequences.

Output file format

By default the output is in 'match' format.

The matches in each set of input sequences are written as feature files.

The normal 'report' header is output. It contains the details of the program run and the input sequences.

The data lines consist of five columns separated by spaces or TAB characters. Each line describes one identical region. The first column is the length of the match. The second column is the name of the first sequence. The third column is the start and end position of the match in the first sequence. The fourth and fifth columns are the name of the second sequence and the start and end position of the match in it.
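The five-column layout described above can be read with a few lines of code. The following is a minimal sketch, not part of EMBOSS. It assumes that positions are written as "start..end" and that header lines begin with "#"; neither detail is stated above, so check them against real output. The names Match, parse_range and parse_match_line are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Match(NamedTuple):
    """One data line: an identical region shared by two sequences."""
    length: int
    name_a: str
    start_a: int
    end_a: int
    name_b: str
    start_b: int
    end_b: int


def parse_range(text: str) -> Tuple[int, int]:
    # Assumption: a position column looks like "1..20".
    start, end = text.split("..")
    return int(start), int(end)


def parse_match_line(line: str) -> Optional[Match]:
    """Parse one data line; return None for blank, header or malformed lines."""
    stripped = line.strip()
    # Assumption: header lines start with "#".
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()  # splits on spaces and TAB characters
    if len(fields) != 5:
        return None
    try:
        length = int(fields[0])
        start_a, end_a = parse_range(fields[2])
        start_b, end_b = parse_range(fields[4])
    except ValueError:
        return None
    return Match(length, fields[1], start_a, end_a, fields[3], start_b, end_b)
```

Calling parse_match_line on each line of an output file and discarding the None results gives a list of Match records that can be filtered or sorted by length.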

Data files

None.

Notes

wordmatch will only report identical regions that are at least as long as the specified wordsize.
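The effect of the wordsize threshold can be illustrated with a small sketch. This is not the EMBOSS implementation, only a simple word-index approach written for illustration. It assumes that each exact match is extended to its maximal length and reported once. The function name exact_matches is hypothetical, and positions are 0-based here.

```python
from typing import Dict, List, Tuple


def exact_matches(a: str, b: str, wordsize: int) -> List[Tuple[int, int, int]]:
    """Return (length, start_a, start_b) for each maximal exact match
    between a and b that is at least wordsize long."""
    if wordsize < 1:
        raise ValueError("wordsize must be at least 1")

    # Index every word of length wordsize in the first sequence.
    index: Dict[str, List[int]] = {}
    for i in range(len(a) - wordsize + 1):
        index.setdefault(a[i:i + wordsize], []).append(i)

    matches = []
    for j in range(len(b) - wordsize + 1):
        for i in index.get(b[j:j + wordsize], ()):
            # If the preceding characters also match, this word hit lies
            # inside a match that started earlier; skip it.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the match to the right as far as the sequences agree.
            length = wordsize
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            matches.append((length, i, j))
    return sorted(matches)
```

In this sketch, exact_matches("AAACCC", "GGACCG", 3) finds the shared region "ACC", while raising wordsize to 4 returns no matches because no identical region is that long.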

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

0 if successful.

Known bugs

None.

Author(s)

History

Completed 27th November 1998.

Target users

Comments