retention time drift alignment mass spec in matlab

There are different strategies for retention time alignment of mass spectrometry (MS) data in MATLAB. One common approach is based on computing a similarity score between the chromatographic profiles of pairs of MS peaks, which allows inferring the correspondences between the peaks from the two profiles. Drift correction can then be applied using regression methods or peak warping.

Here is an example code for retention time alignment of a set of MS spectra:

main.m
%% load MS data
data = load('example_data.mat');
mz = data.mz;
rt = data.rt;
intensity = data.intensity;

%% peak detection
[peaks,locs,widths,proms] = findpeaks(intensity,rt,'MinPeakHeight',2000);

%% create MS feature matrix
n_peaks = length(peaks);
n_samples = size(intensity,2);
ms_features = zeros(n_peaks,n_samples);
for i=1:n_samples
    ms_features(:,i) = interp1(rt,intensity(:,i),locs,'linear');
end

%% pairwise similarity
pairwise_sim = zeros(n_samples);
for i=1:n_samples-1
    for j=i+1:n_samples
        sim_ij = corr(ms_features(:,i),ms_features(:,j),'type','Spearman');
        pairwise_sim(i,j) = sim_ij;
        pairwise_sim(j,i) = sim_ij; % symmetric matrix
    end
end

%% peak correspondence
p_corr = peak_correspondence(pairwise_sim);

%% drift correction
aligned_rt = zeros(size(rt));
for i=1:n_samples
    aligned_rt(:,i) = rt(:,i) - polyval(p_corr,rt(:,i));
end

%% plot aligned data
figure;
subplot(1,2,1);
surf(rt,1:n_samples,intensity,'EdgeColor','none');
view(2); colormap jet;
title('Original data');

subplot(1,2,2);
surf(aligned_rt,1:n_samples,intensity,'EdgeColor','none');
view(2); colormap jet;
title('Aligned data');
1154 chars
48 lines

This code requires the findpeaks and interp1 functions from MATLAB Signal Processing Toolbox. The peak_correspondence function is a custom implementation of a greedy algorithm for maximum weight matching of bipartite graphs. The polyval function is a polynomial evaluation routine from MATLAB.

This code starts by detecting peaks in the MS spectra using the findpeaks function. Then, it creates an MS feature matrix where each row corresponds to the intensity values of a peak across all samples. The pairwise similarity between samples is computed using the corr function, and a peak correspondence matrix is inferred using the peak_correspondence function.

Finally, the drift correction is performed by computing a polynomial function that maps the original retention time values to the aligned values, and applying this mapping to the retention time arrays. The aligned spectra are plotted using a surf function.

Note that the peak detection, MS feature extraction, and similarity computation steps depend on the specific MS data format and experimental design, and may need to be customized for different applications.

gistlibby LogSnag