GenePix File Formats

Archived from www.axon.com/GN_GenePix_File_Formats.html,
where the original is no longer available.



GenePix Pro recognizes and uses several different file types:

ATF  -  Axon Text File
GAL  -  GenePix Array List
       Introduction
       Example GAL file
       Description of header records
       Description of data records
       A minimal GAL file header
GPR  -  GenePix Results
GPL  -  GenePix Lab Book
GPS  -  GenePix Settings
JPEG  -  Joint Photographic Experts Group
TIFF  -  Tagged Image File Format

ATF  -  Axon Text File format (*.atf)

ATF is a tab-delimited text file format that can be read by typical spreadsheet programs such as Microsoft Excel. It is used for GenePix Array List (GAL) files, and GenePix Results (GPR) files.

An ATF text file consists of records. Each line in the text file is a record. Each record may consist of several fields, separated by a field separator (column delimiter). The tab and comma characters are field separators. Space characters around a tab or comma are ignored and considered part of the field separator. Text strings are enclosed in quotation marks to ensure that any embedded spaces, commas and tabs are not mistaken for field separators.

The group of records at the beginning of the file is called the file header. The file header describes the file structure and includes column titles, units, and comments.

ATF File Structure

First header record    Format: ATF (all caps), Version number
Second header record   Number of optional header records n,
                       Number of data columns (fields) m
1st optional record    ...
2nd optional record    ...
nth optional record    ...
(n+3)th record         Required record containing m fields.
                       Each field contains a column title.
DATA RECORDS           Arranged in m columns (fields) of data.

See below under GenePix Array List format for an example of an ATF file.
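
As a rough illustration of the structure above, the following Python sketch splits ATF text into its header and data records. The names split_fields and parse_atf are hypothetical (they are not part of GenePix Pro), and the sketch makes simplifying assumptions: blank lines are skipped, quotation marks are simply removed, and each optional header record is kept as a single string. The sample text is modelled on the minimal GAL file shown later in this document and uses real tab characters.

```python
def split_fields(record):
    """Split one ATF record on tabs and commas, honouring quotation marks."""
    fields, current, in_quotes = [], [], False
    for ch in record:
        if ch == '"':
            in_quotes = not in_quotes  # quotes protect embedded separators
        elif ch in "\t," and not in_quotes:
            # Spaces around a separator are treated as part of the separator.
            fields.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    fields.append("".join(current).strip())
    return fields


def parse_atf(text):
    """Return the version, optional header records, column titles and data rows."""
    records = [line for line in text.splitlines() if line.strip()]
    fmt, version = split_fields(records[0])[:2]
    if fmt != "ATF":
        raise ValueError("first header record must start with ATF")
    n_optional, n_columns = (int(v) for v in split_fields(records[1])[:2])
    # Assumption: each optional record is kept whole, minus its outer quotes.
    optional = [r.strip().strip('"') for r in records[2:2 + n_optional]]
    titles = split_fields(records[2 + n_optional])  # the (n+3)th record
    if len(titles) != n_columns:
        raise ValueError("column title count does not match the second header record")
    data = [split_fields(r) for r in records[3 + n_optional:]]
    return {"version": version, "optional": optional, "titles": titles, "data": data}


sample = "\n".join([
    "ATF\t1.0",
    "1\t4",
    '"Type=GenePix ArrayList V1.0"',
    '"Block"\t"Column"\t"Row"\t"ID"',
    "1\t1\t1\tYAL002W",
    "1\t2\t1\tYAL015C",
])
parsed = parse_atf(sample)
print(parsed["titles"])
print(parsed["data"])
```

The column-count check is the only validation performed here; a stricter reader might also verify that every data record has m fields.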

GAL  -  GenePix Array List format (*.gal)

Introduction

Download a sample GAL file [broken link: http://www.axon.com/downloads/notes_gn/Demo.gal].
See also:
       Making GenePix Array List Files Application Note. [PDF] 281 KB [broken link: http://www.axon.com/downloads/notes_gn/Making_GAL_Files.pdf]
       GAL File Examples specifically for array and arrayer manufacturers. [broken link: http://www.axon.com/gn_GAL_Examples.html]

GenePix Array List files describe the size and position of blocks, the layout of feature-indicators in them, and the names and identifiers of the printed substances associated with each feature-indicator.

GenePix Pro includes an integrated Array List Generator which generates GAL files from plain text files; see the GenePix Pro online Help for details.

GAL files conform to the Axon Text File (ATF) format described above. As such, they can be created in Microsoft Excel by saving an Excel spreadsheet as Text (Tab delimited).

To create a GAL file that describes block and feature-indicator positions and geometry, but without substance IDs or names, save a settings file using the Save Settings As command in GenePix Pro (select *.gal as the output file type).

GAL files consist of two sections: the header, and data records. The header contains all the structural and positional information about the blocks; the data records contain all the name and identifier information for each spot.

GenePix Pro assigns block numbers such that the top leftmost block on the image is block #1, and the block numbers increase from left to right and then from top to bottom:

1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16

The order in which a block was created does not matter; GenePix Pro automatically renumbers all blocks to follow this rule.
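
The numbering rule can be sketched in Python as a sort on block position. This is only an illustration: number_blocks is a hypothetical helper, and it assumes that blocks in the same row share exactly the same y origin (how GenePix Pro groups slightly misaligned blocks into rows is not described here). The origins are borrowed from the example GAL file in this document, listed in an arbitrary creation order.

```python
def number_blocks(origins):
    """Map each (x, y) block origin to its block number.

    Blocks are ordered top to bottom (by y) and, within a row,
    left to right (by x), so the top leftmost block becomes block 1.
    """
    ordered = sorted(origins, key=lambda origin: (origin[1], origin[0]))
    return {origin: number for number, origin in enumerate(ordered, start=1)}


# Creation order is deliberately scrambled; it does not affect the result.
created = [(4896, 4896), (400, 400), (4896, 400), (400, 4896)]
numbers = number_blocks(created)
for origin in created:
    print(origin, "-> block", numbers[origin])
```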

Example GenePix Array List (GAL) file

The following very simple array list file describes four blocks ("BlockCount=4"), each with 24 columns and 5 rows. For simplicity, we have included the data record information (name, ID, etc) only for the first two features:

ATF       1.0
8         5
"Type=GenePix ArrayList V1.0"
"BlockCount=4"
"BlockType=0"
"URL=http://genome-www.stanford.edu/cgi-bin/dbrun/SacchDB?find+Locus+%22[ID]%22"
"Block1= 400, 400, 100, 24, 175, 5, 175"
"Block2= 4896, 400, 100, 24, 175, 5, 175"
"Block3= 400, 4896, 100, 24, 175, 5, 175"
"Block4= 4896, 4896, 100, 24, 175, 5, 175"
"Block"  "Column"  "Row"  "Name"  "ID"
 1        1         1      VPS8    YAL002W
 1        2         1      NTG1    YAL015C
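
Each Block record in the example carries seven comma-separated values: the x and y origin, the feature diameter, the number of columns and the column spacing, and the number of rows and the row spacing (see the description of header records). The Python sketch below parses such a record and computes the center of a feature from its column and row. It is an illustration under stated assumptions, not GenePix Pro's own code: the names Block, parse_block_record and feature_center are hypothetical, only rectangular blocks (BlockType=0) are handled, and column and row numbers are assumed to start at 1.

```python
import re
from typing import NamedTuple


class Block(NamedTuple):
    x_origin: float          # center of top leftmost feature, x
    y_origin: float          # center of top leftmost feature, y
    feature_diameter: float
    x_features: int          # number of columns
    x_spacing: float         # column spacing
    y_features: int          # number of rows
    y_spacing: float         # row spacing


def parse_block_record(record):
    """Parse a record such as 'Block1= 400, 400, 100, 24, 175, 5, 175'."""
    key, _, value = record.strip().strip('"').partition("=")
    match = re.fullmatch(r"Block(\d+)", key.strip())
    if match is None:  # e.g. BlockCount or BlockType
        raise ValueError("not a Blockn record: " + key)
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 7:
        raise ValueError("a Blockn record needs exactly 7 fields")
    x0, y0, dia, nx, dx, ny, dy = parts
    block = Block(float(x0), float(y0), float(dia), int(nx), float(dx), int(ny), float(dy))
    return int(match.group(1)), block


def feature_center(block, column, row):
    """Center of the feature at (column, row), assuming a regular rectangular grid."""
    if not (1 <= column <= block.x_features and 1 <= row <= block.y_features):
        raise ValueError("column or row outside the block")
    return (block.x_origin + (column - 1) * block.x_spacing,
            block.y_origin + (row - 1) * block.y_spacing)


number, block = parse_block_record('"Block2= 4896, 400, 100, 24, 175, 5, 175"')
print(number, feature_center(block, 1, 1), feature_center(block, 24, 5))
```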

Description of header records

The header section describes basic file information and all block properties apart from names and IDs (which are in the data records section). Each record is explained below:

ATF       1.0 (Required)  First line of an ATF file; the same in all GAL files:
           File format (ATF) and version (1.0).
8         5 (Required)  Second line of an ATF file:
           8 (number of optional header records).
           5 (number of data columns).
"Type=GenePix ArrayList V1.0" (Required)  Type of file; the same in all GAL files.
"BlockCount=4" (Optional)  Number of blocks described in the file.
"BlockType=0" (Optional)  Type of block described:
           0 = rectangular.
           1 = orange-packing #1.
           2 = orange-packing #2.
"URL=..." (Optional) The URL for the Go To Web command.
"Supplier=CompanyXYZ" (Optional) The manufacturer that supplied the array or arrayer.
"ArrayerSoftwareName=Printer Robot User Interface" (Optional) The name of the arrayer software.
"ArrayerSoftwareVersion=1.1" (Optional) The version number of the arrayer software.
"ArrayName=MouseApoptosisProteins 4000" (Optional) The name of an array as supplied by an array manufacturer.
"ArrayRevision=2.7" (Optional) The version of an array as supplied by an array manufacturer.
"SlideBarcode=abc0011, abc0012, abc0013" (Optional) Barcodes supported by the GAL file, used for barcode-driven automation.
"Blockn=" (Optional)  The position and dimensions of each block. There is one record for each block, and each record contains 7 fields. Each field is separated by a comma followed by a space.
  xOrigin X position of center of top leftmost feature of current block (in µm).
  yOrigin Y position of center of top leftmost feature of current block (in µm).
  FeatureDiameter Diameter of features within the current block (in µm).
  xFeatures Number of columns of features in current block.
  xSpacing Column spacing of current block (in µm).
  yFeatures Number of rows of features in current block.
  ySpacing Row spacing of current block (in µm).
  Note:  Positions on arrays are measured in microns with respect to the origin, which is the top left corner of the array.
"User Defined" (Optional)  You may include any number of correctly formatted extra lines in the header. When the GAL file is read as input by GenePix Pro 4.1, these will be passed to the output Results (GPR) file.
"Block" "Column" "Row" "Name" "ID" (Required)  Last line of the header, containing column titles for the data records. The quotation marks are advised, but not necessary.

Description of data records

The Data Record section contains records which describe each feature in detail. It includes the block, column, and row numbers for features, as well as descriptive name and identifier information. The GAL Data Record may also optionally contain user-defined fields (column titles) for extra annotation information that you may wish to include.

In GenePix Pro 4.1, any user-defined GAL file data columns are read and output to the Results (GPR) file; in earlier versions they are ignored. Also new in GenePix Pro 4.1 is that there is no longer a 40-character limit on the Name and ID fields; entries longer than 40 characters are truncated when read by earlier versions.

There is one record for each feature, containing a field for each of the descriptive columns:

Block (Required)  The block number for the feature.
Column (Required)  The column location within the block.
Row (Required)  The row location within the block.
Name (Optional)  Name to be displayed for the given feature (limited to 40 characters in GenePix Pro 4.0 and earlier, no limit in 4.1).
ID (Required)  Identifier for each feature (limited to 40 characters in GenePix Pro 4.0 and earlier, no limit in 4.1).
"User Defined" (Optional)  Annotation information.

Block, Column, Row and ID are required fields. The column titles can be in any order.

Note:  If you have empty features, use 'empty' as the feature ID, and the feature is flagged absent when the GAL file is opened by GenePix.
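
Because the column titles can appear in any order, a reader should look fields up by title rather than by position. The Python sketch below shows one way to do that, under stated assumptions: read_features is a hypothetical helper, the titles and rows are assumed to be already split into fields, "Plate" is an invented user-defined column, and the comparison with 'empty' is case-sensitive (whether GenePix Pro ignores case is not stated here).

```python
REQUIRED = ("Block", "Column", "Row", "ID")


def read_features(titles, rows):
    """Turn GAL data records into dictionaries, whatever the column order."""
    missing = [title for title in REQUIRED if title not in titles]
    if missing:
        raise ValueError("missing required columns: " + ", ".join(missing))
    features = []
    for row in rows:
        record = dict(zip(titles, row))
        features.append({
            "block": int(record["Block"]),
            "column": int(record["Column"]),
            "row": int(record["Row"]),
            "id": record["ID"],
            "name": record.get("Name", ""),       # Name is optional
            "absent": record["ID"] == "empty",    # empty features are flagged absent
            # Any remaining columns are user-defined annotation.
            "extra": {k: v for k, v in record.items()
                      if k not in REQUIRED and k != "Name"},
        })
    return features


titles = ["ID", "Name", "Block", "Column", "Row", "Plate"]
rows = [
    ["YAL002W", "VPS8", "1", "1", "1", "P1"],
    ["empty", "", "1", "2", "1", "P1"],
]
features = read_features(titles, rows)
for feature in features:
    print(feature)
```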

A minimal GAL file header

Because most of the GAL file header records are optional, it is relatively simple to construct a GAL file with a very minimal header. The following example also leaves out the optional Name column:

ATF       1.0
1         4
"Type=GenePix ArrayList V1.0"
"Block"  "Column"  "Row"  "ID"
 1        1         1      YAL002W
 1        2         1      YAL015C

When you open this GAL file in GenePix, you will be prompted with the New Blocks dialog box to enter block properties. You may find this method of configuring blocks via the New Blocks dialog box more convenient than working out block arrangements by hand.
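
A file like the one above is easy to generate from a script. The following Python sketch builds the minimal header and one data record per feature; minimal_gal is a hypothetical helper, fields are separated by tab characters, and the use of "\n" line endings is an assumption (the line-ending convention is not specified in this document).

```python
def minimal_gal(features):
    """Build minimal GAL text from (block, column, row, id) tuples."""
    lines = [
        "ATF\t1.0",
        "1\t4",  # 1 optional header record, 4 data columns
        '"Type=GenePix ArrayList V1.0"',
        '"Block"\t"Column"\t"Row"\t"ID"',
    ]
    for block, column, row, identifier in features:
        lines.append(f"{block}\t{column}\t{row}\t{identifier}")
    return "\n".join(lines) + "\n"


gal_text = minimal_gal([(1, 1, 1, "YAL002W"), (1, 2, 1, "YAL015C")])
print(gal_text, end="")
```

Writing gal_text to a file with a .gal extension should give a file equivalent to the example above.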

GPR  -  GenePix Results format (*.gpr)

GenePix Results data are saved as GPR files, which are in Axon Text File (ATF) format. A Results file contains general information about image acquisition and analysis, as well as the data extracted from each individual feature. Any user-defined feature data contained in a GAL file read by GenePix Pro 4.1 will be included in the output GPR file. As of GenePix Pro version 4.0.1.4, the GPR version number is 3.0.

Read a history of the changes to the GPR file format [broken link: http://www.axon.com/gn_GPR_Format_History.html] since GenePix Pro 3, including example GPR files in all the various formats.

GPR Header

A sample GPR file header and a description of each entry are shown below:

Entry Description
ATF     1.0 File type and version number.
29       48 Number of optional header records and number of data fields (columns).
"Type=GenePix Results 3" Type of ATF file.
"DateTime=2002/02/09 17:15:48" Date and time when the image was acquired.
"Settings=C:\Genepix\Genepix.gps" The name of the settings file that was used for analysis.
"GalFile=C:\Genepix\Demo.gal" The GenePix Array List file used to associate Names and IDs to each entry.
"PixelSize=10" Resolution of each pixel in µm.
"Wavelengths=635     532" Installed laser excitation sources in nm.
"ImageFiles=C:\Genepix\demo.tif 0    C:\Genepix\Genepix.tif 1" The name and path of the associated TIF file(s).
"NormalizationMethod=None" The type of normalization method used, if applicable.
"NormalizationFactors=1    1" The normalization factor applied to each channel.
"JpegImage=C:\Genepix\demo.jpg" The name and path of the associated Jpeg image files.
"StdDev=Type 1" The type of standard deviation calculation selected in the Options settings.
"RatioFormulation=W1/W2 (635/532)" The ratio formulation of the ratio image, showing which image is numerator and which is denominator.
"Barcode=00331" The barcode symbols read from the image.
"BackgroundSubtraction=LocalFeature" The background subtraction method selected in the Options settings.
"ImageOrigin=0, 0" The origin of the image relative to the scan area.
"JpegOrigin=390, 4320" The origin of the Results JPEG image (the bounding box of the analysis Blocks) relative to the scan area origin.
"Creator=GenePix 4.1.1.4" The version of the GenePix Pro software used to create the Results file.
"Scanner=GenePix 4000B [serial number]" Type and serial number of scanner used to acquire the image.
"FocusPosition=0" The focus position setting used to acquire the image, in microns.
"Temperature=19.6127" The temperature of the scanner, in degrees C.
"LinesAveraged=1" The line average setting used to acquire the image.
"Comment=hyb 2673" User-entered file comment.
"PMTGain=500     600" The PMT settings during acquisition.
"ScanPower=100    100" The amount of laser transmission during acquisition.
"LaserPower=1    1" The power of each laser, in volts.
"LaserOnTime=5    5" The laser on-time for each laser, in minutes.
"Filters=<Empty>    <Empty>" Emission filters used during acquisition (GenePix 4100 and 4200 only.)
"ScanRegion=100,100,2000,2000" The coordinate values of the scan region used during acquisition, in pixels.
"Supplier=" Header field supplied in GAL file.
Data record column headings Column titles for each measurement (see below).
Data Records Extracted data.
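
Since every optional header record has the form Key=Value, the GPR header maps naturally onto a dictionary. The Python sketch below is an illustration only: gpr_header and per_channel are hypothetical helpers, the records are assumed to have been read one per line, and per-channel entries such as Wavelengths and PMTGain are assumed to be separated by whitespace (tabs in the file).

```python
def gpr_header(records):
    """Collect Key=Value header records into a dictionary of strings."""
    header = {}
    for record in records:
        # Split at the first "=" only, so values may themselves contain "=".
        key, sep, value = record.strip().strip('"').partition("=")
        if sep:
            header[key] = value
    return header


def per_channel(header, key, convert=float):
    """Split a per-wavelength entry into one converted value per channel."""
    return [convert(value) for value in header[key].split()]


records = [
    '"Type=GenePix Results 3"',
    '"PixelSize=10"',
    '"Wavelengths=635\t532"',
    '"PMTGain=500\t600"',
    '"RatioFormulation=W1/W2 (635/532)"',
    '"Supplier="',
]
header = gpr_header(records)
print(header["Type"])
print(per_channel(header, "Wavelengths", int))
print(per_channel(header, "PMTGain", int))
```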

GPR Data

The list below describes each column of data in the Results file.

Column Title Description
Block the block number of the feature.
Column the column number of the feature.
Row the row number of the feature.
Name the name of the feature derived from the Array List (up to 40 characters long, contained in quotation marks).
ID the unique identifier of the feature derived from the Array List (up to 40 characters long, contained in quotation marks).
X the X-coordinate in µm of the center of the feature-indicator associated with the feature, where (0,0) is the top left of the image.
Y the Y-coordinate in µm of the center of the feature-indicator associated with the feature, where (0,0) is the top left of the image.
Dia. the diameter in µm of the feature-indicator.
F635 Median median feature pixel intensity at wavelength #1 (635 nm).
F635 Mean mean feature pixel intensity at wavelength #1 (635 nm).
F635 SD the standard deviation of the feature pixel intensity at wavelength #1 (635 nm).
B635 Median the median feature background intensity at wavelength #1 (635 nm).
B635 Mean the mean feature background intensity at wavelength #1 (635 nm).
B635 SD the standard deviation of the feature background intensity at wavelength #1 (635 nm).
% > B635 + 1 SD the percentage of feature pixels with intensities more than one standard deviation above the background pixel intensity, at wavelength #1 (635 nm).
% > B635 + 2 SD the percentage of feature pixels with intensities more than two standard deviations above the background pixel intensity, at wavelength #1 (635 nm).
F635 % Sat. the percentage of feature pixels at wavelength #1 that are saturated.
F532 Median median feature pixel intensity at wavelength #2 (532 nm).
F532 Mean mean feature pixel intensity at wavelength #2 (532 nm).
F532 SD the standard deviation of the feature intensity at wavelength #2 (532 nm).
B532 Median the median feature background intensity at wavelength #2 (532 nm).
B532 Mean the mean feature background intensity at wavelength #2 (532 nm).
B532 SD the standard deviation of the feature background intensity at wavelength #2 (532 nm).
% > B532 + 1 SD the percentage of feature pixels with intensities more than one standard deviation above the background pixel intensity, at wavelength #2 (532 nm).
% > B532 + 2 SD the percentage of feature pixels with intensities more than two standard deviations above the background pixel intensity, at wavelength #2 (532 nm).
F532 % Sat. the percentage of feature pixels at wavelength #2 that are saturated.
Ratio of Medians the ratio of the median intensities of each feature for each wavelength, with the median background subtracted.
Ratio of Means the ratio of the arithmetic mean intensities of each feature for each wavelength, with the median background subtracted.
Median of Ratios the median of pixel-by-pixel ratios of pixel intensities, with the median background subtracted.
Mean of Ratios the geometric mean of the pixel-by-pixel ratios of pixel intensities, with the median background subtracted.
Ratios SD the geometric standard deviation of the pixel intensity ratios.
Rgn Ratio the regression ratio of every pixel in a 2-feature-diameter circle around the center of the feature.
Rgn R² the coefficient of determination for the current regression value.
F Pixels the total number of feature pixels.
B Pixels the total number of background pixels.
Sum of Medians the sum of the median intensities for each wavelength, with the median background subtracted.
Sum of Means the sum of the arithmetic mean intensities for each wavelength, with the median background subtracted.
Log Ratio log (base 2) transform of the ratio of the medians.
Flags the type of flag associated with a feature.
Normalize the normalization status of the feature (included/not included).
F1 Median - B1 the median feature pixel intensity at wavelength #1 with the median background subtracted.
F2 Median - B2 the median feature pixel intensity at wavelength #2 with the median background subtracted.
F1 Mean - B1 the mean feature pixel intensity at wavelength #1 with the median background subtracted.
F2 Mean - B2 the mean feature pixel intensity at wavelength #2 with the median background subtracted.
SNR 1 the signal-to-noise ratio at wavelength #1, defined by (Mean Foreground 1 - Mean Background 1) / (Standard deviation of Background 1).
F1 Total Intensity the sum of feature pixel intensities at wavelength #1
Index the number of the feature as it occurs on the array.
"User Defined" user-defined feature data read from the GAL file (GenePix Pro 4.1).

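Several of the columns above can be recomputed from the raw feature and background columns, which is a useful sanity check when post-processing a Results file. The Python sketch below follows the column descriptions literally and is not GenePix Pro's own code: the function names are hypothetical, the ratio is assumed to be wavelength #1 over wavelength #2 (635/532, as in the sample RatioFormulation entry), and returning None for a zero denominator or a non-positive ratio is an assumption about how to handle cases the descriptions do not cover.

```python
import math


def subtracted_medians(row):
    """Median feature intensity minus median background, per wavelength."""
    return (row["F635 Median"] - row["B635 Median"],
            row["F532 Median"] - row["B532 Median"])


def ratio_of_medians(row):
    w1, w2 = subtracted_medians(row)
    return w1 / w2 if w2 != 0 else None


def log_ratio(row):
    """Log (base 2) transform of the ratio of medians."""
    ratio = ratio_of_medians(row)
    return math.log2(ratio) if ratio is not None and ratio > 0 else None


def sum_of_medians(row):
    return sum(subtracted_medians(row))


def snr_1(row):
    """(Mean Foreground 1 - Mean Background 1) / (SD of Background 1)."""
    sd = row["B635 SD"]
    return (row["F635 Mean"] - row["B635 Mean"]) / sd if sd != 0 else None


# Invented values for illustration only.
row = {
    "F635 Median": 1200, "B635 Median": 200,
    "F532 Median": 700, "B532 Median": 200,
    "F635 Mean": 1300, "B635 Mean": 220, "B635 SD": 40,
}
print(ratio_of_medians(row), log_ratio(row), sum_of_medians(row), snr_1(row))
```
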
GPL  -  GenePix Lab Book format (*.gpl)

The GenePix Lab Book is a binary file that contains a fixed-size structure for each line in the Lab Book.

GPS  -  GenePix Settings format (*.gps)

GenePix acquisition, analysis and display settings are saved as binary GenePix Settings Files. Settings are organized into a number of different categories (acquisition, analysis and display) all of which are saved together in the GPS file. However, when opening a settings file you can choose which subset of the settings you wish to open.

Acquisition settings include which laser was enabled during the acquisition, the PMT voltages, the lines averaged, and the scan area. Analysis settings include the location and identification of blocks and feature-indicators that were defined on the image. Display settings include brightness and contrast settings, and the color mapping.

JPEG  -  Joint Photographic Experts Group (*.jpg)

Images can be saved in the JPEG format, which is a lossy compressed image file format. GenePix implements minimal JPEG compression, which is enough to reduce image file size significantly, but which removes only a small amount of data from the image. However, we recommend that you do not use the JPEG format to archive images that are to be analyzed later. Rather, use the JPEG format to store images that are to be used in presentations.

TIFF  -  Tagged Image File Format (*.tif)

Images acquired in GenePix are by default saved as 16-bit unsigned TIFF images. This is a standard, uncompressed graphic file format that can be read by many graphics and imaging programs. The primary data acquired by GenePix are the single-wavelength images, and by default these are saved as 16-bit grayscale TIFFs in a single multi-image TIFF file. Not all graphics applications can read multi-image TIFF files. You may wish to try opening a multi-image file with your preferred graphics application to see whether it is supported. If not, save the single-wavelength images as separate single-image files.

GenePix exports its preview and pseudocolor ratio images as 24-bit color TIFFs, but it does not read them, as data are not extracted from them.

Copyright © 2004 Molecular Devices Corporation