MRMRTNormalizer Class Reference

The MRMRTNormalizer will find retention time peptides in data.

#include <OpenMS/ANALYSIS/OPENSWATH/MRMRTNormalizer.h>

Static Public Member Functions

static std::vector< std::pair< double, double > > removeOutliersRANSAC (std::vector< std::pair< double, double > > &pairs, double rsq_limit, double coverage_limit, size_t max_iterations, double max_rt_threshold, size_t sampling_size)
 This function removes potential outliers in a linear regression dataset.
 
static std::vector< std::pair< double, double > > ransac (std::vector< std::pair< double, double > > &pairs, size_t n, size_t k, double t, size_t d, bool test=false)
 This function provides a generic implementation of the RANSAC outlier detection algorithm. It is implemented and tested after the SciPy reference: http://wiki.scipy.org/Cookbook/RANSAC.
 
static std::vector< std::pair< double, double > > removeOutliersIterative (std::vector< std::pair< double, double > > &pairs, double rsq_limit, double coverage_limit, bool use_chauvenet, std::string method)
 This function removes potential outliers in a linear regression dataset.
 
static double chauvenet_probability (std::vector< double > &residuals, int pos)
 This function computes Chauvenet's criterion probability for a vector and a value whose position is submitted.
 
static bool chauvenet (std::vector< double > &residuals, int pos)
 This function computes Chauvenet's criterion for a vector and a value whose position is submitted.
 

Static Private Member Functions

static double llsm_rsq (std::vector< std::pair< double, double > > &pairs)
 Interface for the GSL or OpenMS::MATH linear regression implementation (standard least-squares fit to a straight line). Takes as input a standard vector of pairs of points in 2D space and returns the coefficients of the linear regression Y(c,x) = c0 + c1 * x.
 
static std::pair< double, double > llsm_fit (std::vector< std::pair< double, double > > &pairs)
 
static double llsm_rss (std::vector< std::pair< double, double > > &pairs, std::pair< double, double > &coefficients)
 
static std::vector< std::pair< double, double > > llsm_rss_inliers (std::vector< std::pair< double, double > > &pairs, std::pair< double, double > &coefficients, double max_threshold)
 
static int jackknifeOutlierCandidate (std::vector< double > &x, std::vector< double > &y)
 This function computes a candidate outlier peptide by iteratively leaving one peptide out to find the one which results in the maximum R^2 of a first order linear regression of the remaining ones. The data points are submitted as two vectors of doubles (x- and y-coordinates).
 
static int residualOutlierCandidate (std::vector< double > &x, std::vector< double > &y)
 This function computes a candidate outlier peptide by computing the residuals of all points to the linear fit and selecting the one with the largest deviation. The data points are submitted as two vectors of doubles (x- and y-coordinates).
 

Detailed Description

The MRMRTNormalizer will find retention time peptides in data.

This tool takes a description of RT peptides and their normalized retention times and writes out a transformation file that describes how to map the RT space into the normalized space.

The principle is adapted from the following publication: Escher, C. et al. (2012), Using iRT, a normalized retention time for more targeted measurement of peptides. Proteomics, 12: 1111-1121.

Member Function Documentation

static bool chauvenet ( std::vector< double > &  residuals,
int  pos 
)
static

This function computes Chauvenet's criterion for a vector and a value whose position is submitted.

Returns
TRUE, if Chauvenet's criterion is fulfilled and the outlier can be removed.

static double chauvenet_probability ( std::vector< double > &  residuals,
int  pos 
)
static

This function computes Chauvenet's criterion probability for a vector and a value whose position is submitted.

Returns
Chauvenet's criterion probability
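The two Chauvenet functions above can be illustrated with a minimal, self-contained sketch. It assumes the textbook form of the criterion: the two-sided tail probability of a value under a normal distribution fitted to the residuals, and rejection when that probability times the number of points falls below 0.5. The names chauvenetProbabilitySketch and chauvenetSketch are hypothetical, the use of the sample standard deviation is an assumption, and the OpenMS implementation may differ in detail.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: two-sided tail probability of residuals[pos] under a
// normal distribution whose mean and spread are estimated from the residuals.
double chauvenetProbabilitySketch(const std::vector<double>& residuals, std::size_t pos)
{
  assert(residuals.size() > 1 && pos < residuals.size());
  const double n = static_cast<double>(residuals.size());

  double mean = 0.0;
  for (double r : residuals) mean += r;
  mean /= n;

  double sum_sq = 0.0;
  for (double r : residuals) sum_sq += (r - mean) * (r - mean);
  const double stdev = std::sqrt(sum_sq / (n - 1.0)); // sample standard deviation (assumption)

  // Distance from the mean in units of the standard deviation.
  const double z = std::fabs(residuals[pos] - mean) / stdev;
  return std::erfc(z / std::sqrt(2.0));
}

// Hypothetical helper: true if the value at pos may be rejected as an outlier,
// i.e. fewer than half a point is expected to deviate that far.
bool chauvenetSketch(const std::vector<double>& residuals, std::size_t pos)
{
  const double n = static_cast<double>(residuals.size());
  return chauvenetProbabilitySketch(residuals, pos) * n < 0.5;
}
```

With residuals such as {0.1, -0.2, 0.15, -0.05, 0.0, -0.1, 0.1, 10.0}, this sketch flags the last value and keeps the first.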
static int jackknifeOutlierCandidate ( std::vector< double > &  x,
std::vector< double > &  y 
)
static private

This function computes a candidate outlier peptide by iteratively leaving one peptide out to find the one which results in the maximum R^2 of a first order linear regression of the remaining ones. The data points are submitted as two vectors of doubles (x- and y-coordinates).

Returns
The position (index) of the candidate outlier peptide in the supplied vectors.
Exceptions
Exception::UnableToFit  is thrown if fitting cannot be performed
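The leave-one-out search described above can be sketched as follows. This is an illustration only, not the OpenMS implementation: the names rsqSketch and jackknifeCandidateSketch are hypothetical, R^2 is computed as the squared Pearson correlation (which equals R^2 for a first order linear fit), and no exception handling is included.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: R^2 of a first order linear regression of y on x,
// computed as the squared Pearson correlation.
static double rsqSketch(const std::vector<double>& x, const std::vector<double>& y)
{
  const double n = static_cast<double>(x.size());
  double mx = 0.0, my = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) { mx += x[i]; my += y[i]; }
  mx /= n;
  my /= n;
  double sxx = 0.0, syy = 0.0, sxy = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    const double dx = x[i] - mx, dy = y[i] - my;
    sxx += dx * dx;
    syy += dy * dy;
    sxy += dx * dy;
  }
  return (sxy * sxy) / (sxx * syy);
}

// Leave each point out in turn and return the index whose removal gives the
// highest R^2 for the remaining points.
int jackknifeCandidateSketch(const std::vector<double>& x, const std::vector<double>& y)
{
  assert(x.size() == y.size() && x.size() > 3);
  int candidate = -1;
  double best_rsq = -1.0;
  for (std::size_t leave = 0; leave < x.size(); ++leave)
  {
    std::vector<double> xs, ys;
    for (std::size_t j = 0; j < x.size(); ++j)
    {
      if (j == leave) continue;
      xs.push_back(x[j]);
      ys.push_back(y[j]);
    }
    const double r = rsqSketch(xs, ys);
    if (r > best_rsq)
    {
      best_rsq = r;
      candidate = static_cast<int>(leave);
    }
  }
  return candidate;
}
```

The cost is one regression per point, so the search is quadratic in the number of peptides.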
static std::pair<double, double > llsm_fit ( std::vector< std::pair< double, double > > &  pairs)
static private

static double llsm_rsq ( std::vector< std::pair< double, double > > &  pairs)
static private

Interface for the GSL or OpenMS::MATH linear regression implementation (standard least-squares fit to a straight line). Takes as input a standard vector of pairs of points in 2D space and returns the coefficients of the linear regression Y(c,x) = c0 + c1 * x.

static double llsm_rss ( std::vector< std::pair< double, double > > &  pairs,
std::pair< double, double > &  coefficients 
)
static private

Interface for the GSL or OpenMS::MATH linear regression implementation. Calculates the residual sum of squares of the input points with respect to the linear fit with coefficients c0 and c1.
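For orientation, the straight-line fit Y(c,x) = c0 + c1 * x and its residual sum of squares can be written in closed form. The sketch below is self-contained and does not call GSL or OpenMS::MATH; the names llsmFitSketch and llsmRssSketch are hypothetical, and the coefficient pair is assumed to be ordered (c0, c1).

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Points = std::vector<std::pair<double, double>>;

// Ordinary least-squares fit of y = c0 + c1 * x; returns (c0, c1).
std::pair<double, double> llsmFitSketch(const Points& pairs)
{
  assert(pairs.size() >= 2);
  const double n = static_cast<double>(pairs.size());
  double mx = 0.0, my = 0.0;
  for (const auto& p : pairs) { mx += p.first; my += p.second; }
  mx /= n;
  my /= n;

  double sxx = 0.0, sxy = 0.0;
  for (const auto& p : pairs)
  {
    sxx += (p.first - mx) * (p.first - mx);
    sxy += (p.first - mx) * (p.second - my);
  }
  const double c1 = sxy / sxx;   // slope
  const double c0 = my - c1 * mx; // intercept
  return {c0, c1};
}

// Residual sum of squares of the points against the line (c0, c1).
double llsmRssSketch(const Points& pairs, const std::pair<double, double>& coefficients)
{
  double rss = 0.0;
  for (const auto& p : pairs)
  {
    const double residual = p.second - (coefficients.first + coefficients.second * p.first);
    rss += residual * residual;
  }
  return rss;
}
```

For the points (0,1), (1,3), (2,5) the sketch yields c0 = 1 and c1 = 2 with a residual sum of squares of zero.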

static std::vector<std::pair<double, double> > llsm_rss_inliers ( std::vector< std::pair< double, double > > &  pairs,
std::pair< double, double > &  coefficients,
double  max_threshold 
)
static private

Calculates the residual sum of squares of the input points with respect to the linear fit with coefficients c0 and c1. Additionally removes all points that have an error greater than or equal to max_threshold.
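A minimal sketch of the inlier filter follows. It assumes that the "error" of a point is its squared residual against the line, which matches the description of the ransac parameter t as a squared deviation; the name rssInliersSketch is hypothetical and the coefficient pair is assumed to be ordered (c0, c1).

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Points = std::vector<std::pair<double, double>>;

// Keep only the points whose squared residual against the line (c0, c1)
// is strictly below max_threshold.
Points rssInliersSketch(const Points& pairs,
                        const std::pair<double, double>& coefficients,
                        double max_threshold)
{
  assert(max_threshold >= 0.0);
  Points inliers;
  for (const auto& p : pairs)
  {
    const double residual = p.second - (coefficients.first + coefficients.second * p.first);
    if (residual * residual < max_threshold)
    {
      inliers.push_back(p);
    }
  }
  return inliers;
}
```

Because the comparison is strict, a point whose error equals max_threshold is dropped, in line with the "greater than or equal" wording above.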

static std::vector<std::pair<double, double> > ransac ( std::vector< std::pair< double, double > > &  pairs,
size_t  n,
size_t  k,
double  t,
size_t  d,
bool  test = false 
)
static

This function provides a generic implementation of the RANSAC outlier detection algorithm. It is implemented and tested after the SciPy reference: http://wiki.scipy.org/Cookbook/RANSAC.

Parameters
pairs  Input data (paired data of type <dim1, dim2>)
n  the minimum number of data points required to fit the model
k  the maximum number of iterations allowed in the algorithm
t  a threshold value for determining when a data point fits a model. Corresponds to the maximal squared deviation in units of the second dimension (dim2).
d  the number of close data values required to assert that a model fits well to data
test  disables the random component of the algorithm
Returns
A vector of pairs
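The roles of n, k, t and d can be illustrated with a compact RANSAC loop for a straight-line model, loosely following the structure of the SciPy cookbook recipe. This is a sketch under stated assumptions, not the OpenMS code: ransacSketch, fitLine and sqDev are hypothetical names, a fixed seed parameter stands in for the test flag, and an empty vector is returned when no model reaches the required consensus.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

using Points = std::vector<std::pair<double, double>>;

// Ordinary least-squares line y = c0 + c1 * x; returns (c0, c1).
static std::pair<double, double> fitLine(const Points& p)
{
  const double n = static_cast<double>(p.size());
  double mx = 0.0, my = 0.0;
  for (const auto& q : p) { mx += q.first; my += q.second; }
  mx /= n;
  my /= n;
  double sxx = 0.0, sxy = 0.0;
  for (const auto& q : p)
  {
    sxx += (q.first - mx) * (q.first - mx);
    sxy += (q.first - mx) * (q.second - my);
  }
  const double c1 = sxy / sxx;
  return {my - c1 * mx, c1};
}

// Squared deviation of a point from the line, in units of the second dimension.
static double sqDev(const std::pair<double, double>& q, const std::pair<double, double>& c)
{
  const double r = q.second - (c.first + c.second * q.first);
  return r * r;
}

Points ransacSketch(const Points& pairs, std::size_t n, std::size_t k, double t,
                    std::size_t d, unsigned seed = 42)
{
  assert(n >= 2 && n <= pairs.size());
  std::mt19937 rng(seed);
  std::vector<std::size_t> idx(pairs.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});

  Points best;
  double best_err = std::numeric_limits<double>::infinity();
  for (std::size_t it = 0; it < k; ++it)
  {
    // Draw n random points and fit a tentative model to them.
    std::shuffle(idx.begin(), idx.end(), rng);
    Points consensus;
    for (std::size_t i = 0; i < n; ++i) consensus.push_back(pairs[idx[i]]);
    const auto maybe_model = fitLine(consensus);

    // Collect the remaining points that lie within t of the tentative model.
    std::size_t also_inliers = 0;
    for (std::size_t i = n; i < idx.size(); ++i)
    {
      if (sqDev(pairs[idx[i]], maybe_model) < t)
      {
        consensus.push_back(pairs[idx[i]]);
        ++also_inliers;
      }
    }
    if (also_inliers <= d) continue; // not enough support for this model

    // Refit on the whole consensus set and keep it if its mean error improves.
    const auto better_model = fitLine(consensus);
    double err = 0.0;
    for (const auto& q : consensus) err += sqDev(q, better_model);
    err /= static_cast<double>(consensus.size());
    if (err < best_err)
    {
      best_err = err;
      best = consensus;
    }
  }
  return best;
}
```

Larger values of k raise the chance that at least one random sample consists only of inliers, while d guards against accepting a line supported by only a few points.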
static std::vector<std::pair<double, double> > removeOutliersIterative ( std::vector< std::pair< double, double > > &  pairs,
double  rsq_limit,
double  coverage_limit,
bool  use_chauvenet,
std::string  method 
)
static

This function removes potential outliers in a linear regression dataset.

Two thresholds need to be defined: first, a lower R^2 limit to accept the regression for the RT normalization and, second, the lower limit of peptide coverage. The algorithm then selects candidate outlier peptides and applies Chauvenet's criterion, on the assumption that the residuals are normally distributed, to determine whether the peptides can be removed. This is done iteratively until both limits are reached.

Parameters
pairs  Input data (paired data of type <experimental_rt, theoretical_rt>)
rsq_limit  Minimal R^2 required
coverage_limit  Minimal coverage required (if the number of points falls below this fraction, the algorithm aborts)
use_chauvenet  Whether to only remove outliers that fulfill Chauvenet's criterion for outliers (otherwise any outlier candidate is removed regardless of the criterion)
method  Outlier detection method ("iter_jackknife" or "iter_residual")
Returns
The remaining pairs, provided the R^2 limit was reached without the number of points falling below the coverage limit. If the limits cannot both be fulfilled, an exception is thrown.
Exceptions
Exception::UnableToFit  is thrown if fitting cannot be performed (rsq_limit and coverage_limit cannot be fulfilled)

Referenced by OpenSwathWorkflow::RTNormalization().
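The iterative procedure can be sketched as a loop that refits, checks R^2, and removes one candidate per round. The version below is a simplified illustration under several assumptions: the candidate is always the point with the largest absolute residual, the Chauvenet check controlled by use_chauvenet is omitted, std::runtime_error stands in for Exception::UnableToFit, and the exact order of the checks in OpenMS may differ. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using Points = std::vector<std::pair<double, double>>;

struct LineFitSketch { double c0; double c1; double rsq; };

// Least-squares line plus R^2 (squared Pearson correlation).
static LineFitSketch fitWithRsq(const Points& p)
{
  const double n = static_cast<double>(p.size());
  double mx = 0.0, my = 0.0;
  for (const auto& q : p) { mx += q.first; my += q.second; }
  mx /= n;
  my /= n;
  double sxx = 0.0, syy = 0.0, sxy = 0.0;
  for (const auto& q : p)
  {
    const double dx = q.first - mx, dy = q.second - my;
    sxx += dx * dx;
    syy += dy * dy;
    sxy += dx * dy;
  }
  const double c1 = sxy / sxx;
  return {my - c1 * mx, c1, (sxy * sxy) / (sxx * syy)};
}

Points removeOutliersIterativeSketch(Points pairs, double rsq_limit, double coverage_limit)
{
  assert(pairs.size() >= 3);
  const double n_start = static_cast<double>(pairs.size());
  while (true)
  {
    const LineFitSketch fit = fitWithRsq(pairs);
    if (fit.rsq >= rsq_limit) return pairs; // regression is good enough

    // Candidate outlier: the point with the largest absolute residual.
    std::size_t worst = 0;
    double worst_abs = -1.0;
    for (std::size_t i = 0; i < pairs.size(); ++i)
    {
      const double r = std::fabs(pairs[i].second - (fit.c0 + fit.c1 * pairs[i].first));
      if (r > worst_abs) { worst_abs = r; worst = i; }
    }
    pairs.erase(pairs.begin() + static_cast<std::ptrdiff_t>(worst));

    // Abort once too few of the original points remain.
    if (pairs.size() < 3 || pairs.size() / n_start < coverage_limit)
    {
      throw std::runtime_error("rsq_limit and coverage_limit cannot both be fulfilled");
    }
  }
}
```

Taking pairs by value keeps the caller's data untouched, whereas the documented function takes a non-const reference.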

static std::vector<std::pair<double, double> > removeOutliersRANSAC ( std::vector< std::pair< double, double > > &  pairs,
double  rsq_limit,
double  coverage_limit,
size_t  max_iterations,
double  max_rt_threshold,
size_t  sampling_size 
)
static

This function removes potential outliers in a linear regression dataset.

Two thresholds need to be defined: first, a lower R^2 limit to accept the regression for the RT normalization and, second, the lower limit of peptide coverage. The algorithm then selects candidate outlier peptides using the RANSAC outlier detection algorithm and returns the corrected set of peptides if the two thresholds are satisfied.

Parameters
pairs  Input data (paired data of type <experimental_rt, theoretical_rt>)
rsq_limit  Minimal R^2 required
coverage_limit  Minimal coverage required (if the number of points falls below this fraction, the algorithm aborts)
max_iterations  Maximum iterations for the RANSAC algorithm
max_rt_threshold  Maximum deviation from fit for the retention time. This must be in the unit of the second dimension (e.g. theoretical_rt).
sampling_size  The number of data points to sample for the RANSAC algorithm.
Returns
The remaining pairs, provided the R^2 limit was reached without the number of points falling below the coverage limit. If the limits cannot both be fulfilled, an exception is thrown.
Exceptions
Exception::UnableToFit  is thrown if fitting cannot be performed (rsq_limit and coverage_limit cannot be fulfilled)

Referenced by OpenSwathWorkflow::RTNormalization().
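The acceptance step after RANSAC, in which the inlier set is checked against both thresholds, might look like the following sketch. It assumes that coverage is the fraction of input points kept as inliers and uses std::runtime_error in place of Exception::UnableToFit; acceptInliersSketch and rsqOf are hypothetical names, and the actual checks in OpenMS may be ordered or phrased differently.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using Points = std::vector<std::pair<double, double>>;

// R^2 of a straight-line fit, computed as the squared Pearson correlation.
static double rsqOf(const Points& p)
{
  const double n = static_cast<double>(p.size());
  double mx = 0.0, my = 0.0;
  for (const auto& q : p) { mx += q.first; my += q.second; }
  mx /= n;
  my /= n;
  double sxx = 0.0, syy = 0.0, sxy = 0.0;
  for (const auto& q : p)
  {
    const double dx = q.first - mx, dy = q.second - my;
    sxx += dx * dx;
    syy += dy * dy;
    sxy += dx * dy;
  }
  return (sxy * sxy) / (sxx * syy);
}

// Return the inliers if both thresholds hold, otherwise throw.
Points acceptInliersSketch(const Points& inliers, std::size_t n_input,
                           double rsq_limit, double coverage_limit)
{
  assert(n_input >= inliers.size());
  if (static_cast<double>(inliers.size()) < coverage_limit * static_cast<double>(n_input))
  {
    throw std::runtime_error("coverage_limit cannot be fulfilled");
  }
  // Written as a negated >= so that an undefined (NaN) R^2 is rejected too.
  if (!(rsqOf(inliers) >= rsq_limit))
  {
    throw std::runtime_error("rsq_limit cannot be fulfilled");
  }
  return inliers;
}
```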

static int residualOutlierCandidate ( std::vector< double > &  x,
std::vector< double > &  y 
)
static private

This function computes a candidate outlier peptide by computing the residuals of all points to the linear fit and selecting the one with the largest deviation. The data points are submitted as two vectors of doubles (x- and y-coordinates).

Returns
The position (index) of the candidate outlier peptide in the supplied vectors.
Exceptions
Exception::UnableToFit  is thrown if fitting cannot be performed
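A minimal sketch of the residual-based candidate search is given below. It fits one line to all points and returns the index with the largest absolute residual; the name residualCandidateSketch is hypothetical, "largest deviation" is assumed to mean largest absolute residual, and error handling is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Fit y = c0 + c1 * x to all points, then return the index of the point
// that deviates most from the fitted line.
int residualCandidateSketch(const std::vector<double>& x, const std::vector<double>& y)
{
  assert(x.size() == y.size() && x.size() >= 3);
  const double n = static_cast<double>(x.size());
  double mx = 0.0, my = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) { mx += x[i]; my += y[i]; }
  mx /= n;
  my /= n;

  double sxx = 0.0, sxy = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    sxx += (x[i] - mx) * (x[i] - mx);
    sxy += (x[i] - mx) * (y[i] - my);
  }
  const double c1 = sxy / sxx;
  const double c0 = my - c1 * mx;

  int candidate = -1;
  double max_abs = -1.0;
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    const double r = std::fabs(y[i] - (c0 + c1 * x[i]));
    if (r > max_abs)
    {
      max_abs = r;
      candidate = static_cast<int>(i);
    }
  }
  return candidate;
}
```

Unlike the jackknife search, this needs only a single regression, but the fit used to rank the points is itself influenced by the outlier.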

OpenMS / TOPP release 2.0.0 Documentation generated on Thu Aug 20 2015 01:44:37 using doxygen 1.8.9.1