AFLPcore
Class ABILaneFilter

java.lang.Object
  |
  +--AFLPcore.Operation
        |
        +--AFLPcore.ImportFilter
              |
              +--AFLPcore.ABILaneFilter

public class ABILaneFilter
extends ImportFilter

This class reads data from a lane file produced by extracting lanes from a gel run on an ABI377. It has been tested with lanes extracted by GeneScan 2.0. It will probably work with the ABI373 as well, but this has not been tested. This class reads in the processed data, so the lane must be processed as well as simply extracted. Also, it relies on the ABI software to find the peaks in the size standard.

It will extract the following pieces of information from the file:

This information will be stored in a Lane object, which is used by the program. The peaks read in will be passed to a SizeFunction, which will use them to calculate the sizing information for the data. Since the ABI software also calls peaks that are not part of the size standard, the program compares all of the peaks to an internal SizeStandard and uses only the sizes it finds in that internal size standard. For example, the peaks with locations of 50.00, 100.00, and 150.00 bp would be used, but 54.23 would not. (Unless 54.23 was defined as part of the size standard, which can't really happen since the size standard must contain whole values.)
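The comparison against the size standard can be illustrated with a minimal sketch. It is not the filter's actual code: the class name, the method name, and the tolerance parameter are all hypothetical, and the documentation does not say how closely a called peak must agree with a standard value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of AFLPcore.
final class SizeStandardMatchSketch {

    /**
     * Keeps only the called peak sizes that agree with some value in the
     * size standard. The tolerance is an assumption made for this sketch.
     */
    static List<Double> matchToStandard(double[] calledSizes,
                                        int[] standardSizes,
                                        double tolerance) {
        List<Double> kept = new ArrayList<>();
        for (double size : calledSizes) {
            for (int std : standardSizes) {
                if (Math.abs(size - std) <= tolerance) {
                    kept.add(size);
                    break; // one match is enough, move to the next peak
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] called = {50.00, 54.23, 100.00, 150.00};
        int[] standard = {50, 100, 150};
        // 54.23 is dropped because it is not in the standard.
        System.out.println(matchToStandard(called, standard, 0.005));
    }
}
```

With the values from the example above, the peak at 54.23 is discarded and the other three are kept.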

The filter has three options that must be set before it can run:

  1. Data Color
  2. Size Function to use
  3. Size Standard to use
These can be manipulated using getOptions() and setOptions(Option[]). All three options are a list of choices, one of which must be selected. The possible values for the color option are red, blue, green, and yellow. The size function and the size standard can be the name of any size function/standard known to the program. This class uses the FeatureList class to retrieve the known functions. Once the options have been set, the readLane method can be called to read the actual file.

The File Format

The first 4 bytes contain the value "ABIF", which indicates that the file is an ABI lane file (I think). The file contains a record structure. Each record is 28 bytes long. The number of records is given in a 32-bit integer at byte 18 (indexed from 0), and the offset from the beginning of the file to the start of the first record is given as a 32-bit integer at byte 26. A record has the following structure:

    struct{
      byte[4] name;      Four ASCII character name, like "DATA"
      int tagNumber;     Distinguishes fields with the same name for 
                         example: DATA1, DATA2, ... , DATA12
      short data_type;   Denotes the type of data 4 = integer
                         7 = float, 10 = mm/dd/yy 11 = hh/mm/ss
                         18 = pascal string, 1024 = some sort of structure
      short elementSize; The size of each element.
      int numElements;   The number of elements.
      int recordLength;  The length of the whole record.
      int dataOffset;    The offset from the beginning of the file to
                         the start of the record, unless the recordLength
                         is less than 4, in which case it contains the
                         actual data.
      int unknown;       Usually 0, but seems to change with the editing
                         of the file.
    }
 
Most of this information was obtained from Clark Tibbetts's paper (Tibbetts, Clark. "Raw Data File Formats and the Digital and Analog Raw Data Streams of the ABI PRISM DNA Sequencer(c)." 1995).
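The header fields and the 28-byte record layout described above can be read with absolute ByteBuffer accessors. The following is a sketch under stated assumptions, not the filter's implementation: the class and method names are hypothetical, the sample bytes are synthetic, and the integers are assumed to be big-endian (the default for both ByteBuffer and RandomAccessFile).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of AFLPcore.
final class AbiDirectorySketch {
    static final int RECORD_SIZE = 28; // 4 + 4 + 2 + 2 + 4 + 4 + 4 + 4 bytes

    /** Plain holder mirroring the record structure described above. */
    static final class Record {
        String name;
        int tagNumber;
        short dataType;
        short elementSize;
        int numElements;
        int recordLength;
        int dataOffset;
        int unknown;
    }

    static boolean hasSignature(ByteBuffer file) {
        return file.get(0) == 'A' && file.get(1) == 'B'
            && file.get(2) == 'I' && file.get(3) == 'F';
    }

    /** Reads record number index (0-based) from the record table. */
    static Record readRecord(ByteBuffer file, int index) {
        int count = file.getInt(18);          // number of records
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("record " + index);
        }
        int base = file.getInt(26) + index * RECORD_SIZE; // start of this record
        byte[] name = new byte[4];
        for (int i = 0; i < 4; i++) {
            name[i] = file.get(base + i);
        }
        Record r = new Record();
        r.name = new String(name, StandardCharsets.US_ASCII);
        r.tagNumber = file.getInt(base + 4);
        r.dataType = file.getShort(base + 8);
        r.elementSize = file.getShort(base + 10);
        r.numElements = file.getInt(base + 12);
        r.recordLength = file.getInt(base + 16);
        r.dataOffset = file.getInt(base + 20);
        r.unknown = file.getInt(base + 24);
        return r;
    }

    /** Builds a synthetic 58-byte file: a header plus one made-up record. */
    static ByteBuffer sampleFile() {
        ByteBuffer file = ByteBuffer.allocate(30 + RECORD_SIZE);
        byte[] sig = "ABIF".getBytes(StandardCharsets.US_ASCII);
        byte[] name = "DATA".getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < 4; i++) {
            file.put(i, sig[i]);
            file.put(30 + i, name[i]);
        }
        file.putInt(18, 1);   // one record
        file.putInt(26, 30);  // record table starts at byte 30
        file.putInt(34, 9);   // tagNumber
        file.putShort(38, (short) 4);
        file.putShort(40, (short) 2);
        file.putInt(42, 100); // numElements
        file.putInt(46, 200); // recordLength
        file.putInt(50, 58);  // dataOffset
        file.putInt(54, 0);   // unknown
        return file;
    }

    public static void main(String[] args) {
        ByteBuffer file = sampleFile();
        Record r = readRecord(file, 0);
        System.out.println(hasSignature(file) + " " + r.name + r.tagNumber);
    }
}
```

Absolute reads avoid any dependence on the buffer's current position, which makes it easy to jump between the header and individual records.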

The following records are of interest:

DATA
This contains the trace data as a sequence of 16-bit integers. The file can contain up to 12 DATA fields. The first 8 are always present. The first four represent the raw color data from the machine. The fifth through eighth represent values for the gel voltage, gel current, electrophoretic power, and gel temperature. The last four are the ones of interest to this program. They contain the color data after it has been processed and separated. Note that these will not always exist; for example, often only certain colors are extracted from a lane. The color number corresponds to the color in the following manner: blue = 0, yellow = 1, green = 2, red = 3. For the processed data, the correct tag number is simply given by 9 + colorNumber.
GELN
This is a pascal-type string representing the gel name.
LANE
This contains the lane number on the original gel. It is stored as a 16-bit integer in the first 2 bytes of the dataOffset field.
LANS
This probably contains lots of information, but I don't know what it is. However, the third and fourth byte in this structure give the number of the color for the size standard, as a short integer. Therefore, if the value there is stdColor, the standard trace is in DATA(9+stdColor) and PEAK(stdColor) contains the size standard peaks.
PEAK
This contains a number of peaks as called by the ABI software. This filter uses it for the size standard information. See below for a description of the peak data structure.
SpNm
This contains the name of the sample, as a pascal string.
StdF
In some cases, this contains the name of the size standard, but it seems to be missing in some files, so it is not used by this filter. (Stored as a pascal string.)
OFFS
This is not used by the program, and seems to be 1000 in most cases. 1000 is also the difference between the scan number displayed and the number stored in the peaks. This may have something to do with where the software thinks the zero point is, or it may not. It appears to be a single 16-bit integer.
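A few of the field conventions above (the processed DATA tag number, the LANE value packed into the dataOffset field, and Pascal strings) can be sketched as small helpers. The class, method, and constant names are hypothetical; the color constants simply follow the numbering given under DATA and are not claimed to equal the BLUE, YELLOW, GREEN, and RED fields of this class. The sketch assumes big-endian integers, so the "first 2 bytes" of dataOffset are its high-order 16 bits, and it assumes a Pascal string is one length byte followed by that many ASCII characters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of AFLPcore.
final class AbiFieldSketch {
    // Color numbers as listed in the DATA description.
    static final int COLOR_BLUE = 0, COLOR_YELLOW = 1, COLOR_GREEN = 2, COLOR_RED = 3;

    /** Tag number of the processed DATA field for a color: 9 + colorNumber. */
    static int processedDataTag(int colorNumber) {
        if (colorNumber < COLOR_BLUE || colorNumber > COLOR_RED) {
            throw new IllegalArgumentException("unknown color " + colorNumber);
        }
        return 9 + colorNumber;
    }

    /** Extracts the 16-bit lane number from the first 2 bytes of dataOffset. */
    static int laneNumber(int dataOffsetField) {
        return (short) (dataOffsetField >>> 16);
    }

    /** Decodes a length-prefixed (Pascal) string, clamping to the bytes available. */
    static String pascalString(byte[] raw) {
        if (raw.length == 0) {
            return "";
        }
        int length = Math.min(raw[0] & 0xFF, raw.length - 1);
        return new String(raw, 1, length, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(processedDataTag(COLOR_RED));                  // 12
        System.out.println(laneNumber(0x00070000));                       // 7
        System.out.println(pascalString(new byte[] {3, 'G', 'e', 'l'}));  // Gel
    }
}
```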
A peak in the ABI file is 96 bytes long. The first 4 bytes store the scan number as a 32-bit integer. This scan number differs from the one displayed by the ABI programs: it is 1000 less, although the number 1000 could vary. 1000 is also the value stored in OFFS. The next two bytes are the height, as a 16-bit integer. I don't know what the next 12 bytes are. After that, the peak area is stored as a 32-bit integer. Skip four bytes again. We then have the size of the peak, in bp, stored as an IEEE 754 single-precision float.
   Value     Start   Length(bytes)    Type
   scan        0           4           integer (1000 + this value)
   height      4           2           integer
   area       18           4           integer
   size       26           4           IEEE 754 single-precision float
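The peak layout in the table can be decoded as follows. This is a minimal sketch rather than the body of readPeak: the class, the field holder, and the scanOffset parameter are hypothetical, the sample record is synthetic, and the values are assumed to be big-endian.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not part of AFLPcore.
final class AbiPeakSketch {
    static final int PEAK_SIZE = 96;

    /** Holder for the four documented fields of a peak record. */
    static final class PeakFields {
        int scan;
        short height;
        int area;
        float sizeBp;
    }

    /**
     * Decodes one 96-byte peak record. scanOffset is the value added to the
     * stored scan number (typically 1000, the value found in OFFS).
     */
    static PeakFields parse(byte[] record, int scanOffset) {
        if (record.length < PEAK_SIZE) {
            throw new IllegalArgumentException("peak record must be 96 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(record); // big-endian by default
        PeakFields p = new PeakFields();
        p.scan = buf.getInt(0) + scanOffset; // bytes 0-3
        p.height = buf.getShort(4);          // bytes 4-5
        p.area = buf.getInt(18);             // bytes 18-21
        p.sizeBp = buf.getFloat(26);         // bytes 26-29
        return p;
    }

    /** Builds a synthetic record with made-up values for demonstration. */
    static byte[] sampleRecord() {
        ByteBuffer buf = ByteBuffer.allocate(PEAK_SIZE);
        buf.putInt(0, 234);
        buf.putShort(4, (short) 512);
        buf.putInt(18, 9000);
        buf.putFloat(26, 100.0f);
        return buf.array();
    }

    public static void main(String[] args) {
        PeakFields p = parse(sampleRecord(), 1000);
        System.out.println(p.scan + " " + p.height + " " + p.area + " " + p.sizeBp);
    }
}
```

The bytes not listed in the table (6 to 17, 22 to 25, and 30 to 95) are simply skipped, since their meaning is not documented here.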
 

See Also:
SizeFunction, SizeStandard

Field Summary
static int ALL
           
static int BLUE
          color channel
static int GREEN
          color channel
static int RED
          color channel
static int YELLOW
          color channel
 
Fields inherited from class AFLPcore.ImportFilter
filetype, GEL, LANE
 
Fields inherited from class AFLPcore.Operation
descript, helpFile, name, options
 
Constructor Summary
ABILaneFilter()
          Creates a new filter to read in ABI lane files.
 
Method Summary
 java.lang.String getDescription()
          Retrieves a short, approximately one sentence, description of the filter.
 int getFileType()
          Returns the type of input file supported by this filter: in this case ImportFilter.LANE, since the filter reads in lane data.
 java.lang.String getHelpFile()
          The help file describes which files the filter reads and the options that this filter accepts.
 java.lang.String getName()
          Access the name of the filter.
 Option[] getOptions()
          Returns the options for this filter, which includes the color of the data, the size function to use, and the size standard.
 Gel readGel(java.io.File inputFile)
          This filter does not read gels.
 Lane[] readLane(java.io.File inputFile)
          This is the method that is called to perform the actual reading of the file.
 Peak readPeak(java.io.RandomAccessFile in)
          Read in a peak from the file.
 void setOptions(Option[] opts)
          Sets the parameters for the filter to the specified values, including color.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

YELLOW

public static final int YELLOW
color channel

RED

public static final int RED
color channel

BLUE

public static final int BLUE
color channel

GREEN

public static final int GREEN
color channel

ALL

public static final int ALL
Constructor Detail

ABILaneFilter

public ABILaneFilter()
Creates a new filter to read in ABI lane files.
Method Detail

getName

public java.lang.String getName()
Access the name of the filter.
Overrides:
getName in class Operation
Returns:
name of the import filter

getFileType

public int getFileType()
Returns the type of input file supported by this filter: in this case ImportFilter.LANE, since the filter reads in lane data.
Overrides:
getFileType in class ImportFilter
Returns:
constant LANE.

getDescription

public java.lang.String getDescription()
Retrieves a short, approximately one sentence, description of the filter.
Overrides:
getDescription in class Operation
Returns:
the description

getHelpFile

public java.lang.String getHelpFile()
The help file describes which files the filter reads and the options that this filter accepts.
Overrides:
getHelpFile in class Operation
Returns:
File that contains the help information, either html or plaintext.

getOptions

public Option[] getOptions()
Returns the options for this filter, which include the color of the data, the size function to use, and the size standard. The first option is the color to read, which can be one of four possible values: Red, Blue, Green, or Yellow. The color choice is given as an Option of type CHOICE. The second option is also of type CHOICE. It tells which size method should be used to compute the size of the fragments. Please see the help files and the code for the size functions for a description of how they work. The third option describes the size standard to use. This simply gives the program a list of values. These are stored in a file called "standards.cfg". Possible values for all of these options are read in from the FeatureList class.
Overrides:
getOptions in class Operation
Returns:
an array containing the options described above.
See Also:
Option, FeatureList, SizeFunction, SizeStandard

setOptions

public void setOptions(Option[] opts)
Sets the parameters for the filter to the specified values, including color. The color must be set before this filter can run. The option representing the color should have a string value naming the color. The size function must also be set for the filter to work. It must contain the name of a valid SizeFunction. Note that the name is not the class name of the SizeFunction, but the name each SizeFunction stores internally. The third option must also be set.
Overrides:
setOptions in class Operation
Parameters:
opts - an array of length 3 which contains the options mentioned above and described in getOptions(). The order must be: color, size function, size standard.
Throws:
MissingParameterError - occurs when the filter fails to extract a string from the first option in opts.
java.lang.IllegalArgumentException - occurs when a string is found but cannot be matched to one of the colors Red, Blue, Green, or Yellow; when an array with length not equal to 3 is given as opts; or when the specified size function, the second option, cannot be matched to a defined size function.

readLane

public Lane[] readLane(java.io.File inputFile)
                throws java.io.IOException
This is the method that is called to perform the actual reading of the file. The data in the file represents data from a single lane. The options/parameters required for the filter should be set using setOptions; if they are not, an exception will be thrown.
Overrides:
readLane in class ImportFilter
Parameters:
inputFile - The file that contains the lane data.
Returns:
a Lane object with all of the appropriate information.
Throws:
MissingParameterError - occurs if the options are not set. Since this includes the required color, the filter cannot read in the lane.
java.io.IOException - If an error is encountered in the file, then this exception will be thrown

readGel

public Gel readGel(java.io.File inputFile)
            throws java.io.IOException
This filter does not read gels.
Overrides:
readGel in class ImportFilter
Returns:
Always null

readPeak

public Peak readPeak(java.io.RandomAccessFile in)
              throws java.io.IOException
Read in a peak from the file. A peak in the ABI file is 96 bytes long. The first 4 bytes store the scan number as a 32-bit integer. This scan number differs from the one displayed by the ABI programs: it is 1000 less, although the number 1000 could vary. 1000 is also the value stored in OFFS. The next two bytes are the height, as a 16-bit integer. I don't know what the next 12 bytes are. After that, the peak area is stored as a 32-bit integer. Skip four bytes again. We then have the size of the peak, in bp, stored as an IEEE 754 single-precision float.
   Value     Start   Length(bytes)    Type
   scan        0           4           integer (1000 + this value)
   height      4           2           integer
   area       18           4           integer
   size       26           4           IEEE 754 single-precision float
 
Parameters:
in - the input source
Returns:
a peak, with the size/location and height read from the file and the area set as the scan number, not the area.
Throws:
java.io.IOException - occurs if the file cannot be read.