MapAlignerPoseClustering

Corrects retention time distortions between maps, using a pose clustering approach.

potential predecessor tools → MapAlignerPoseClustering → potential successor tools
  Potential predecessor tools: FeatureFinderCentroided (or another feature finding algorithm)
  Potential successor tools: FeatureLinkerUnlabeled or FeatureLinkerUnlabeledQT

This tool provides an algorithm to align the retention time scales of multiple input files, correcting shifts and distortions between them. Retention time adjustment may be necessary to correct for chromatography differences, e.g. before data from multiple LC-MS runs can be combined (feature grouping), or when one run should be annotated with peptide identifications obtained in a different run.

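All map alignment tools (MapAligner...) collect retention time data from the input files and, by fitting a model to this data, compute transformations that map all runs to a common retention time scale. The tools can write the aligned files directly (parameter 'out') and/or store descriptions of the transformations in trafoXML format (parameter 'trafo_out').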

The map alignment tools differ in how they obtain retention time data for the modeling of transformations, and consequently what types of data they can be applied to. The alignment algorithm implemented here is the pose clustering algorithm as described in doi:10.1093/bioinformatics/btm209. It is used to find an affine transformation, which is further refined by a feature grouping step. This algorithm can be applied to features (featureXML) and peaks (mzML), but it has mostly been developed and tested on features. For more details and algorithm-specific parameters (set in the INI file) see "Detailed Description" in the algorithm documentation.
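The voting idea behind pose clustering can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the OpenMS implementation: each map is assumed to be a list of (rt, mz) tuples, the function name `estimate_affine` is hypothetical, and the bucket sizes merely reuse the defaults listed for the superimposer. Every pair of points in the reference map is matched against every m/z-compatible pair in the other map, each match votes for one (scale, shift) bucket, and the bucket with the most votes gives the affine transformation.

```python
from collections import Counter
from itertools import combinations, permutations


def estimate_affine(ref, other, mz_tol=0.5, scale_bucket=0.005, shift_bucket=3.0):
    """Estimate (scale, shift) such that rt_ref ~= scale * rt_other + shift.

    ref, other: lists of (rt, mz) tuples (an assumed, simplified data model).
    Returns None if no compatible pairs are found.
    """
    votes = Counter()
    for r1, r2 in combinations(ref, 2):
        for o1, o2 in permutations(other, 2):
            # Only pairs whose m/z values agree can correspond to each other.
            if abs(r1[1] - o1[1]) > mz_tol or abs(r2[1] - o2[1]) > mz_tol:
                continue
            if o1[0] == o2[0]:
                continue  # no RT spread, scale is undefined
            scale = (r2[0] - r1[0]) / (o2[0] - o1[0])
            if scale <= 0:
                continue  # elution order must be preserved
            shift = r1[0] - scale * o1[0]
            # Hash the candidate transformation into a discrete bucket.
            key = (round(scale / scale_bucket), round(shift / shift_bucket))
            votes[key] += 1
    if not votes:
        return None
    (scale_idx, shift_idx), _count = votes.most_common(1)[0]
    return scale_idx * scale_bucket, shift_idx * shift_bucket
```

The sketch enumerates all pairs and is therefore quadratic in each map; limiting the number of points considered per map keeps this tractable.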

See also
MapAlignerPoseClustering MapAlignerSpectrum MapRTTransformer

This algorithm uses an affine transformation model.

To speed up the alignment, consider reducing 'max_num_peaks_considered' (in the 'algorithm' section). If the alignment is not good enough, consider increasing this number; the alignment will take longer, though.
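The speed/quality trade-off of 'max_num_peaks_considered' can be pictured as truncating each map before alignment. The following is a minimal sketch, assuming features are (rt, mz, intensity) tuples and that the most intense elements are kept (the parameter documentation states selection by intensity only for 'num_used_points', so this is an assumption here). `limit_by_intensity` is a hypothetical helper, not an OpenMS function.

```python
def limit_by_intensity(features, max_count):
    """Keep the max_count most intense features; -1 keeps all of them.

    features: list of (rt, mz, intensity) tuples (assumed data model).
    """
    if max_count == -1 or len(features) <= max_count:
        return list(features)
    # Sort by intensity, highest first, and keep the top max_count entries.
    return sorted(features, key=lambda f: f[2], reverse=True)[:max_count]
```

Fewer elements mean fewer candidate pairs to evaluate, at the cost of discarding weaker signals that might have supported the correct transformation.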

The command line parameters of this tool are:

MapAlignerPoseClustering -- Corrects retention time distortions between maps using a pose clustering approach.
Version: 2.0.0 Aug 19 2015, 22:19:33, Revision: GIT-NOTFOUND

Usage:
  MapAlignerPoseClustering <options>

This tool has algorithm parameters that are not shown here! Please check the ini file for a detailed
description or use the --helphelp option.

Options (mandatory options marked with '*'):
  -in <files>*               Input files separated by blanks (all must have the same file type)
                             (valid formats: 'mzML', 'featureXML')
  -out <files>               Output files separated by blanks. Either 'out' or 'trafo_out' has to be
                             provided. They can be used together. (valid formats: 'mzML', 'featureXML')
  -trafo_out <files>         Transformation output files separated by blanks. Either 'out' or 'trafo_out'
                             has to be provided. They can be used together. (valid formats: 'trafoXML')

Options to define a reference file (use either 'file' or 'index', not both; if neither is given, 'index' is used):
  -reference:file <file>     File to use as reference (same file format as input files required)
                             (valid formats: 'mzML', 'featureXML')
  -reference:index <number>  Use one of the input files as reference ('1' for the first file, etc.).
                             If '0', no explicit reference is set - the algorithm will select a reference.
                             (default: '0' min: '0')

                             
Common TOPP options:
  -ini <file>                Use the given TOPP INI file
  -threads <n>               Sets the number of threads allowed to be used by the TOPP tool (default: '1')
  -write_ini <file>          Writes the default configuration file
  --help                     Shows options
  --helphelp                 Shows all options (including advanced)

The following configuration subsections are valid:
 - algorithm   Algorithm parameters section

You can write an example INI file using the '-write_ini' option.
Documentation of subsection parameters can be found in the doxygen documentation or the INIFileEditor.
Have a look at the OpenMS documentation for more information.
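The constraints stated in the help text (at least one of 'out' or 'trafo_out', and either 'reference:file' or 'reference:index' but not both) can be checked before the tool is called. The sketch below only assembles an argument list from the options shown above; `build_command` is a hypothetical helper and the file names in the usage note are placeholders. It does not run the tool.

```python
def build_command(inputs, outputs=None, trafo_outputs=None,
                  reference_file=None, reference_index=None, threads=1):
    """Assemble an argument list for MapAlignerPoseClustering."""
    if not inputs:
        raise ValueError("'in' is mandatory")
    if not outputs and not trafo_outputs:
        raise ValueError("Either 'out' or 'trafo_out' has to be provided")
    if reference_file is not None and reference_index is not None:
        raise ValueError("Use either 'reference:file' or 'reference:index', not both")

    cmd = ["MapAlignerPoseClustering", "-in", *inputs]
    if outputs:
        cmd += ["-out", *outputs]
    if trafo_outputs:
        cmd += ["-trafo_out", *trafo_outputs]
    if reference_file is not None:
        cmd += ["-reference:file", reference_file]
    elif reference_index is not None:
        if reference_index < 0:
            raise ValueError("'reference:index' must be >= 0")
        # 1-based index into the input files; 0 lets the algorithm choose.
        cmd += ["-reference:index", str(reference_index)]
    cmd += ["-threads", str(threads)]
    return cmd
```

For example, `build_command(["a.featureXML", "b.featureXML"], trafo_outputs=["a.trafoXML", "b.trafoXML"])` returns a list that could be passed to `subprocess.run(cmd, check=True)` on a system where the tool is installed.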

INI file documentation of this tool:

+ MapAlignerPoseClustering: Corrects retention time distortions between maps using a pose clustering approach.
  version (default: '2.0.0'): Version of the tool that generated this parameters file.
++ 1: Instance '1' section for 'MapAlignerPoseClustering'
  in (default: '[]'): Input files separated by blanks (all must have the same file type). (input file; valid formats: '*.mzML', '*.featureXML')
  out (default: '[]'): Output files separated by blanks. Either 'out' or 'trafo_out' has to be provided. They can be used together. (output file; valid formats: '*.mzML', '*.featureXML')
  trafo_out (default: '[]'): Transformation output files separated by blanks. Either 'out' or 'trafo_out' has to be provided. They can be used together. (output file; valid formats: '*.trafoXML')
  log (default: empty): Name of log file (created only when specified).
  debug (default: '0'): Sets the debug level.
  threads (default: '1'): Sets the number of threads allowed to be used by the TOPP tool.
  no_progress (default: 'false'): Disables progress logging to command line. (valid: 'true', 'false')
  force (default: 'false'): Overwrite tool specific checks. (valid: 'true', 'false')
  test (default: 'false'): Enables the test mode (needed for internal use only). (valid: 'true', 'false')
+++ reference: Options to define a reference file (use either 'file' or 'index', not both; if neither is given, 'index' is used).
  file (default: empty): File to use as reference (same file format as input files required). (input file; valid formats: '*.mzML', '*.featureXML')
  index (default: '0'): Use one of the input files as reference ('1' for the first file, etc.). If '0', no explicit reference is set - the algorithm will select a reference. (range: 0:∞)
+++ algorithm: Algorithm parameters section
  max_num_peaks_considered (default: '1000'): The maximal number of peaks/features to be considered per map. To use all, set to '-1'. (range: -1:∞)
++++ superimposer
  mz_pair_max_distance (default: '0.5'): Maximum of m/z deviation of corresponding elements in different maps. This condition applies to the pairs considered in hashing. (range: 0:∞)
  rt_pair_distance_fraction (default: '0.1'): Within each of the two maps, the pairs considered for pose clustering must be separated by at least this fraction of the total elution time interval (i.e., max - min). (range: 0:1)
  num_used_points (default: '2000'): Maximum number of elements considered in each map (selected by intensity). Use this to reduce the running time and to disregard weak signals during alignment. For using all points, set this to -1. (range: -1:∞)
  scaling_bucket_size (default: '0.005'): The scaling of the retention time interval is being hashed into buckets of this size during pose clustering. A good choice for this would be a bit smaller than the error you would expect from repeated runs. (range: 0:∞)
  shift_bucket_size (default: '3'): The shift at the lower (respectively, higher) end of the retention time interval is being hashed into buckets of this size during pose clustering. A good choice for this would be about the time between consecutive MS scans. (range: 0:∞)
  max_shift (default: '1000'): Maximal shift which is considered during histogramming. This applies for both directions. (range: 0:∞)
  max_scaling (default: '2'): Maximal scaling which is considered during histogramming. The minimal scaling is the reciprocal of this. (range: 1:∞)
  dump_buckets (default: empty): [DEBUG] If non-empty, base filename where hash table buckets will be dumped to. A serial number for each invocation will be appended automatically.
  dump_pairs (default: empty): [DEBUG] If non-empty, base filename where the individual hashed pairs will be dumped to (large!). A serial number for each invocation will be appended automatically.
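How the superimposer limits interact can be sketched in a few lines. This is a simplified illustration that assumes a single shift value per candidate (the tool itself hashes the shift at the lower and higher end of the retention time interval); the helper names `well_separated` and `bucket_for` are hypothetical, and the defaults are copied from the parameter list above.

```python
def well_separated(rt_a, rt_b, rt_min, rt_max, rt_pair_distance_fraction=0.1):
    """True if two elements of one map are far enough apart in RT to form a pair."""
    return abs(rt_a - rt_b) >= rt_pair_distance_fraction * (rt_max - rt_min)


def bucket_for(scale, shift, max_scaling=2.0, max_shift=1000.0,
               scaling_bucket_size=0.005, shift_bucket_size=3.0):
    """Return the (scaling, shift) bucket indices, or None if out of range."""
    # The minimal scaling is the reciprocal of max_scaling.
    if not (1.0 / max_scaling <= scale <= max_scaling):
        return None
    # max_shift applies in both directions.
    if abs(shift) > max_shift:
        return None
    return (round(scale / scaling_bucket_size), round(shift / shift_bucket_size))
```

Smaller bucket sizes resolve the transformation more finely but spread the votes over more buckets, which is why the parameter descriptions tie them to the expected run-to-run error and the scan spacing.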
++++ pairfinder
  second_nearest_gap (default: '2'): The distance to the second nearest neighbors must be larger by this factor than the distance to the matching element itself. (range: 1:∞)
  use_identifications (default: 'false'): Never link features that are annotated with different peptides (only the best hit per peptide identification is taken into account). (valid: 'true', 'false')
  ignore_charge (default: 'false'): Compare features normally even if their charge states are different. (valid: 'true', 'false')
+++++ distance_RT: Distance component based on RT differences
  max_difference (default: '100'): Maximum allowed difference in RT in seconds. (range: 0:∞)
  exponent (default: '1'): Normalized RT differences are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow). (range: 0:∞)
  weight (default: '1'): RT distances are weighted by this factor. (range: 0:∞)
+++++ distance_MZ: Distance component based on m/z differences
  max_difference (default: '0.3'): Maximum allowed difference in m/z (unit defined by 'unit'). (range: 0:∞)
  unit (default: 'Da'): Unit of the 'max_difference' parameter. (valid: 'Da', 'ppm')
  exponent (default: '2'): Normalized m/z differences are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow). (range: 0:∞)
  weight (default: '1'): m/z distances are weighted by this factor. (range: 0:∞)
+++++ distance_intensity: Distance component based on differences in relative intensity
  exponent (default: '1'): Differences in relative intensity are raised to this power (using 1 or 2 will be fast, everything else is REALLY slow). (range: 0:∞)
  weight (default: '0'): Distances based on relative intensity are weighted by this factor. (range: 0:∞)
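The pairfinder parameters can be read as a weighted distance plus an acceptance rule. The sketch below is one plausible reading, not the OpenMS code: it assumes features are (rt, mz, intensity) tuples, that m/z differences are in Da, that relative intensity is taken with respect to the larger of the two intensities, and that the weighted terms are divided by the sum of the weights. `feature_distance` and `accept_match` are hypothetical names; the defaults come from the list above.

```python
import math


def feature_distance(f1, f2, max_rt=100.0, max_mz=0.3,
                     exp_rt=1.0, exp_mz=2.0, exp_int=1.0,
                     w_rt=1.0, w_mz=1.0, w_int=0.0):
    """Weighted distance between two (rt, mz, intensity) features."""
    total_weight = w_rt + w_mz + w_int
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    d_rt = abs(f1[0] - f2[0])
    d_mz = abs(f1[1] - f2[1])
    # Pairs beyond the maximum allowed differences are never linked.
    if d_rt > max_rt or d_mz > max_mz:
        return math.inf
    max_int = max(f1[2], f2[2])
    d_int = abs(f1[2] - f2[2]) / max_int if max_int > 0 else 0.0
    return (w_rt * (d_rt / max_rt) ** exp_rt
            + w_mz * (d_mz / max_mz) ** exp_mz
            + w_int * d_int ** exp_int) / total_weight


def accept_match(distances, second_nearest_gap=2.0):
    """Index of the nearest candidate, or None if the match is ambiguous.

    distances: distances from one element to every candidate in the other map.
    """
    if not distances:
        return None
    order = sorted(range(len(distances)), key=distances.__getitem__)
    best = order[0]
    # The runner-up must be at least second_nearest_gap times farther away.
    if len(order) > 1 and distances[order[1]] < second_nearest_gap * distances[best]:
        return None
    return best
```

With the default intensity weight of 0, only RT and m/z contribute to the distance; raising 'second_nearest_gap' makes matching stricter by rejecting elements with two similarly close candidates.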

OpenMS / TOPP release 2.0.0 Documentation generated on Thu Aug 20 2015 01:44:31 using doxygen 1.8.9.1