uniqid | /home/u11/umhameld/workspace/birchscripts/uniqid.py
July 18, 2008, Dr. Brian Fristensky, University of Manitoba
uniqid.py - Read a source file and replace each definition line with a unique
identifier. Store the unique ID and original definition line
in a .csv file as a key-value pair.
Synopsis:
uniqid.py [options] -encode sourcein sourceout csvout
uniqid.py [options] -decode textin textout csvin
-encode (default) options begin with a dash; filenames do not
The first three filenames on the command line
are read as sourcein, the original source file;
sourceout, the source file in which each
definition line is replaced with a unique ID;
and csvout, a comma-separated value file containing
the unique identifier and the corresponding
definition line
-decode options begin with a dash; filenames do not
The first three filenames on the command line
are read as textin, any text file containing
unique IDs generated from a previous run using
-encode; textout, the output file in which the
unique ID is replaced by the original name, or
the name plus parts of the definition line; csvin,
the csv file generated by a previous run using
-encode.
-f list_of_fields similar to -f in the Unix cut
command. A comma-separated list of fields to be
written to textout when decoding files.
-s separator separator is a character to use as the separator
when parsing a definition line into fields.
default = " ", a blank space
-nf string string is one or more characters to begin the
unique identifier with which the definition
line is replaced.
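
The encode/decode round trip described by the options above can be sketched as follows. This is a minimal illustration, not the code in uniqid.py. The function names, the zero-padded ID format, the default prefix "ID" and the choice of the first field as the default "name" are all assumptions made for the example. The prefix argument loosely mirrors -nf, fields mirrors -f, and sep mirrors -s.

```python
import csv
import io


def encode(source, prefix="ID"):
    """Replace each FASTA definition line (starting with '>') with an ID.

    Returns (encoded_text, rows), where rows is a list of
    (unique_id, original_definition_line) pairs.
    """
    out_lines = []
    rows = []
    for line in source.splitlines():
        if line.startswith(">"):
            # Hypothetical ID format: prefix plus a zero-padded counter.
            uid = "%s%06d" % (prefix, len(rows) + 1)
            rows.append((uid, line[1:]))
            out_lines.append(">" + uid)
        else:
            out_lines.append(line)
    return "\n".join(out_lines) + "\n", rows


def rows_to_csv(rows):
    """Serialize the (id, definition line) pairs as CSV text."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def decode(text, rows, fields=(1,), sep=" "):
    """Replace each ID in text with selected fields of its definition line.

    fields is a sequence of 1-based field numbers, as in Unix cut;
    field 1 is assumed here to be the sequence name.
    """
    for uid, definition in rows:
        parts = definition.split(sep)
        chosen = [parts[i - 1] for i in fields if 0 < i <= len(parts)]
        text = text.replace(uid, sep.join(chosen))
    return text
```

With this sketch, encoding a two-sequence FASTA file yields IDs ID000001 and ID000002, and decode("ID000002", rows, fields=(1, 3)) returns the first and third fields of the second definition line.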
Idea for more general version of program:
An option lets you input a regular expression that is used for
finding the original ID, rather than just hardwiring FASTA format
into the program. The program will still default to search for FASTA
sequence names, but by employing regular expressions, uniqid.py
can perform substitutions in ANY type of file. Probably not hard
to implement, either.
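
One way the regular-expression idea above might look is sketched here. It is an assumption-laden illustration, not an existing feature of uniqid.py: the function name encode_with_pattern, the FASTA_DEFLINE default pattern and the ID format are hypothetical. The pattern's first capture group marks the text to be replaced by the unique ID.

```python
import re

# Assumed default: capture everything after '>' on a FASTA definition line.
FASTA_DEFLINE = r"^>(.*)$"


def encode_with_pattern(source, pattern=FASTA_DEFLINE, prefix="ID"):
    """Replace group 1 of every match of pattern with a generated ID.

    Returns (encoded_text, rows), where rows pairs each ID with the
    text it replaced.
    """
    rows = []
    regex = re.compile(pattern, re.MULTILINE)

    def substitute(match):
        # Hypothetical ID format: prefix plus a zero-padded counter.
        uid = "%s%06d" % (prefix, len(rows) + 1)
        rows.append((uid, match.group(1)))
        # Rebuild the matched text, swapping only the captured group.
        whole = match.group(0)
        offset = match.start(0)
        start, end = match.span(1)
        return whole[:start - offset] + uid + whole[end - offset:]

    return regex.sub(substitute, source), rows
```

Calling encode_with_pattern(text) keeps the FASTA behaviour, while passing a different pattern, for example r"^name=(\w+)$", applies the same substitution to another line-oriented format.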
@modified: May 26 2010
@author: Dale Hamel
@contact: umhameld@cc.umanitoba.ca
Data
BM = <birchlib.Birchmod instance>
PROGRAM = 'uniqid.py: '
USAGE = '\n\t USAGE: uniqid.py [options] -encode sourcein ...uniqid.py [options] -decode textin textout csvin'
blib = '/home/u11/umhameld/public_html/local/pylib/'