This is a short description of the .mev file format. See the TM4 suite package for the full document.
Production Revision 4.0 (July 16, 2004)
A MultiExperimentViewer or .mev file is a tab-delimited text file that contains coordinate and expression data for a single microarray experiment. A single header row is required to precede the expression data in order to identify the columns below. With the exception of optional comment lines, each remaining row of the file stores data for a particular spot/feature on the array.
MeV and other TM4 software tools will consider comment lines non-computational. A comment line must start with the pound symbol '#', and can be included anywhere in the file. If the pound symbol is the first character on a line, the entire line (up to the newline character '\n') will be ignored by the software tool.
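The comment rule above can be sketched in a few lines of Python. This is an illustration only, not code from MeV or any TM4 tool; the function names is_comment and data_lines are hypothetical. It takes the rule literally: a line is a comment only when '#' is its very first character, so an indented '#' is treated as data.

```python
def is_comment(line: str) -> bool:
    """Return True only if '#' is the very first character of the line."""
    return line.startswith("#")


def data_lines(lines):
    """Yield every non-comment line with its line ending removed."""
    for line in lines:
        if is_comment(line):
            continue  # comments may appear anywhere in the file
        yield line.rstrip("\r\n")


sample = [
    "# version: V1.0\n",
    "UID\tIA\tIB\tR\tC\tMR\tMC\n",
    "# a comment in the middle of the data\n",
    "cage:20238\t1500\t900\t1\t1\t1\t1\n",
]
kept = list(data_lines(sample))
```

Here kept holds only the header row and the single data row.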
The mev files created at TIGR will typically contain at least one comment at the top of the file with the following information. The format and fields contained within these comments are subject to change. See Appendix 2 for details.
version | Version number based on revisions of expression data |
format_version | The version of the .mev file format document |
date | Date of file creation or update |
analyst | Owner or the person responsible for creating the file |
analysis_id | id from the analysis table that corresponds to this set of expression values |
slide_type | slide_type from the slide_type table that this array is based on |
input_row_count | Number of rows of expression (i.e. non-header) data in input files |
output_row_count | Number of rows of expression (i.e. non-header) data in this file |
created_by | Software tool used to create the file |
description | Common name or other details about the experiment |
An example of the leading comments:
# version: V1.0
# format_version: V4.0
# date: 10/06/2004
# analyst: aisaeed
# analysis_id: 10579
# slide_type: IASCAG1
# input_row_count: 32448
# output_row_count: 32448
# created_by: TIGR Spotfinder 2.2.3
# TIFF files processed: gpc30025a_532_nm.tif, gpc30025a_635_nm.tif
# description: Tumor type comparison
# This is the 4th experiment in a series of 20 to identify tissue-specific genes.
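As a rough sketch, the leading comments shown above could be collected into a dictionary as follows. The function name parse_leading_comments is hypothetical, and the "key: value" layout is assumed from the example only; the fields are subject to change, so treat this as illustrative. Comment lines without a colon are kept as free-text notes. A free-text note that happens to contain a colon would be misread as a field, which is a known limitation of this sketch.

```python
def parse_leading_comments(lines):
    """Split the leading '#' lines into (fields, notes).

    Assumes the 'key: value' layout seen in the example comments.
    Stops at the first line that is not a comment.
    """
    fields = {}
    notes = []
    for line in lines:
        if not line.startswith("#"):
            break
        body = line[1:].strip()
        key, sep, value = body.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
        elif body:
            notes.append(body)
    return fields, notes


example = [
    "# version: V1.0",
    "# analysis_id: 10579",
    "# created_by: TIGR Spotfinder 2.2.3",
    "# TIFF files processed: gpc30025a_532_nm.tif, gpc30025a_635_nm.tif",
    "# This is the 4th experiment in a series of 20 to identify tissue-specific genes.",
    "UID\tIA\tIB\tR\tC\tMR\tMC",
]
fields, notes = parse_leading_comments(example)
```

All values are kept as strings; converting analysis_id or the row counts to integers is left to the caller.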
The header row consists of the field names for each subsequent row in this file (with the exception of comment lines). A minimum of seven columns must be present, and these must use a set of specifically named headers. Any number of additional columns may be included. The seven required column headers are:
UID | Unique identifier for this spot |
IA | Intensity value in channel A |
IB | Intensity value in channel B |
R | Row (slide row) |
C | Column (slide column) |
MR | Meta-row (block row) |
MC | Meta-column (block column) |
As of version 4.0 of this file format, the IA and IB columns can be replaced by MedA and MedB. The new requirement is that at least one integrated intensity (IA, IB, etc.) or one median (MedA, MedB, etc.) value be reported for each channel in the microarray. For example, a two-channel microarray .mev file would require either IA and IB or MedA and MedB.
MedA | Median intensity in channel A |
MedB | Median intensity in channel B |
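A header check along these lines might look like the sketch below. The names POSITIONAL and check_header are hypothetical. One assumption to note: the rule is read per channel, so a mix such as IA with MedB passes. If the stricter reading (both channels from the same family) is intended, the check would need tightening.

```python
# Required columns other than the per-channel intensity columns.
POSITIONAL = ("UID", "R", "C", "MR", "MC")


def check_header(columns, channels=("A", "B")):
    """Return a list describing the required columns that are missing.

    For each channel, either the integrated intensity (I<ch>) or the
    median intensity (Med<ch>) must be present.
    """
    present = set(columns)
    missing = [name for name in POSITIONAL if name not in present]
    for ch in channels:
        if f"I{ch}" not in present and f"Med{ch}" not in present:
            missing.append(f"I{ch} or Med{ch}")
    return missing


ok_integrated = check_header(["UID", "IA", "IB", "R", "C", "MR", "MC"])
ok_median = check_header(["UID", "MedA", "MedB", "R", "C", "MR", "MC"])
no_channel_b = check_header(["UID", "IA", "R", "C", "MR", "MC"])
```

An empty list means the header satisfies the requirement. In the last call, channel B has neither IB nor MedB, so the result names that gap.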
The mev files created at TIGR may use one of the two formats for the header row, depending on the origin of the mev file. The non-required columns (i.e. anything after the 7th column) may be rearranged and their names are subject to change at this time.
1) Spotfinder created mev file:
UID \t IA \t IB \t R \t C \t MR \t MC \t SR \t SC \t FlagA \t FlagB \t SAA \t SAB \t SFA \t SFB \t QCS \t QCA \t QCB \t BkgA \t BkgB \t SDA \t SDB \t SDBkgA \t SDBkgB \t MedA \t MedB \t AID
UID | Unique identifier for this spot |
IA | Intensity value in channel A |
IB | Intensity value in channel B |
R | Row (slide row) |
C | Column (slide column) |
MR | Meta-row (block row) |
MC | Meta-column (block column) |
SR | Sub-row |
SC | Sub-column |
FlagA | TIGR Spotfinder flag value in channel A |
FlagB | TIGR Spotfinder flag value in channel B |
SA | Actual spot area (in pixels) |
SF | Saturation factor |
QC | Cumulative quality control score |
QCA | Quality control score in channel A |
QCB | Quality control score in channel B |
BkgA | Background value in channel A |
BkgB | Background value in channel B |
SDA | Standard deviation for spot pixels in channel A |
SDB | Standard deviation for spot pixels in channel B |
SDBkgA | Standard deviation of the background value in channel A |
SDBkgB | Standard deviation of the background value in channel B |
MedA | Median intensity value in channel A |
MedB | Median intensity value in channel B |
MNA | Mean intensity value in channel A |
MNB | Mean intensity value in channel B |
X | X coordinate of the spot cell rectangle |
Y | Y coordinate of the spot cell rectangle |
PValueA | P-value in channel A |
PValueB | P-value in channel B |
DBID | Database ID (used when UID is substituted) |
The first seven fields (UID, IA, IB, R, C, MR and MC) are required as specified above.
This flexible format allows users to track slide-specific data of interest, such as background, spot size and alternate intensities without requiring them of all users or adopting a limited 'vocabulary' of field names. This header row serves to identify the required and additional data columns. UID must be the left-most column in the mev file. Other columns do not need to be present in a fixed order.
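Because columns are identified by the header row rather than by position, a reader can look fields up by name. The following is a minimal sketch using Python's standard csv module; the function name read_mev is hypothetical and this is not the parser used by MeV. It skips comment lines, checks that UID is the left-most column, and returns each data row as a dictionary keyed by header name.

```python
import csv
import io


def read_mev(stream):
    """Read a .mev text stream into a list of dicts keyed by column name."""
    # Drop comment lines wherever they occur, including between data rows.
    rows = (line for line in stream if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    if not reader.fieldnames or reader.fieldnames[0] != "UID":
        raise ValueError("UID must be the left-most column")
    return list(reader)


text = (
    "# version: V1.0\n"
    "UID\tIA\tIB\tR\tC\tMR\tMC\tMedB\tBkgA\tMedA\n"
    "cage:20238\t1500\t900\t3\t2\t1\t1\t850\t40\t1400\n"
    "# a comment between data rows\n"
    "cage:20239\t700\t650\t3\t3\t1\t1\t640\t38\t690\n"
)
spots = read_mev(io.StringIO(text))
```

Values are returned as strings, and the additional columns (MedB, BkgA, MedA here) are found by name regardless of where they appear.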
For mev files generated at TIGR, the UIDs may be of the form database_name:spot_id (e.g. cage:20238). For any given microarray database, the id field in the spot table is unique, so the combination of database name and spot_id uniquely identifies any spot on any array created at TIGR. Note that this is not enough information to distinguish between spots in the same location on two slides of the same slide_type; that would typically require an analysis_id. Because annotation data is based on slide_type, this distinction is not necessary: all slides of a given type use the same annotation file.
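Splitting a UID of this form into its two parts could be sketched as follows. SpotUID and parse_uid are hypothetical names. The sketch assumes spot_id is a non-negative integer, as in the example cage:20238, and returns None for UIDs that do not follow the database_name:spot_id pattern, since that form is optional.

```python
from typing import NamedTuple, Optional


class SpotUID(NamedTuple):
    database: str
    spot_id: int


def parse_uid(uid: str) -> Optional[SpotUID]:
    """Split 'database_name:spot_id'; return None if the UID has another form."""
    database, sep, spot = uid.partition(":")
    # Assumption: spot_id is all digits, as in the example 'cage:20238'.
    if not sep or not database or not spot.isdigit():
        return None
    return SpotUID(database, int(spot))


parsed = parse_uid("cage:20238")
other = parse_uid("spot42")
```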
The AID column will usually contain an incremental sequence of numbers starting at 1. These can be used to return the file to the original sorted order and can function as a unique row identifier if necessary.
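Restoring the original order from AID is a one-line sort, sketched below under the assumption that rows have already been read into dictionaries keyed by column name and that every AID value is an integer. The function name restore_original_order is hypothetical. The values are compared as numbers, because a plain string sort would place "10" before "2".

```python
def restore_original_order(rows):
    """Return the rows sorted by their AID value, compared numerically."""
    return sorted(rows, key=lambda row: int(row["AID"]))


shuffled = [
    {"UID": "cage:20240", "AID": "10"},
    {"UID": "cage:20238", "AID": "1"},
    {"UID": "cage:20239", "AID": "2"},
]
ordered = restore_original_order(shuffled)
order = [row["AID"] for row in ordered]
```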
Applications that generate files of expression data (commonly in tav format) by retrieving records from the database access the spot table. TIGR Spotfinder, Midas and Madam are all capable of generating UIDs of the form described above in addition to the typical coordinate and intensity data.
mev files are required to end with the extension '.mev'. At this time there are no further naming conventions for mev files.