NB03c Centroidisation

  1. Centroidisation
  2. Peak fitting and picking functions
  3. Application of the peak picking function
  4. Questions

Centroidisation

In practice, a peak is a collection of signals from one peptide or fragment ion species, as measured by the detector. Because the measurement is imperfect, these signals scatter around the accurate mass. The resulting distribution of signals along the m/z axis is termed a profile peak. Centroiding converts each profile peak into a single m/z value with a corresponding intensity, which reduces the complexity of the spectrum. Peak fitting approaches are used to extract the masses needed for identification in a simple and fast way. Peak fitting algorithms are also needed to extract ion abundances and are therefore explained under quantification in the following section.
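
To make the idea concrete, here is a minimal sketch of centroiding a single profile peak with an intensity-weighted mean. This is not the wavelet-based method used later in this notebook. The helper name `centroidOfPeak` and the data points are made up for illustration, and reporting the apex intensity (rather than, say, the summed intensity) is an arbitrary choice.

```fsharp
// Hypothetical helper (not part of BioFSharp.Mz): reduces one profile peak
// to a single (m/z, intensity) pair using an intensity-weighted mean.
let centroidOfPeak (mz: float []) (intensity: float []) =
    if mz.Length = 0 || mz.Length <> intensity.Length then
        None
    else
        let totalIntensity = Array.sum intensity
        if totalIntensity <= 0. then
            None
        else
            // Sum of m/z values, each weighted by its intensity
            let weightedMz =
                Array.fold2 (fun acc m i -> acc + m * i) 0. mz intensity
            // Report the weighted mean m/z together with the apex intensity
            Some (weightedMz / totalIntensity, Array.max intensity)

// Made-up profile points scattered symmetrically around m/z 500.25
let profileMz        = [| 500.23; 500.24; 500.25; 500.26; 500.27 |]
let profileIntensity = [| 10.; 60.; 100.; 60.; 10. |]

let centroid = centroidOfPeak profileMz profileIntensity
```

Five profile points collapse into one pair here, which is the reduction in complexity described above.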

#r "nuget: BioFSharp, 2.0.0-beta5"
#r "nuget: BioFSharp.IO, 2.0.0-beta5"
#r "nuget: Plotly.NET, 4.2.0"
#r "nuget: BioFSharp.Mz, 0.1.5-beta"
#r "nuget: BIO-BTE-06-L-7_Aux, 0.0.10"

#if IPYNB
#r "nuget: Plotly.NET.Interactive, 4.2.0"
#endif // IPYNB

open Plotly.NET
open BioFSharp.Mz
open BIO_BTE_06_L_7_Aux.FS3_Aux
open System.IO

Peak fitting and picking functions

We declare a function that centroids the given m/z and intensity data. Within the function, the m/z and intensity data are first padded for the wavelet (you will read more about wavelet functions later in NB05a_Quantification.ipynb) and then centroided. For the centroidisation, we use a Ricker 2D wavelet.

// Code-Block 1

let ms1PeakPicking (mzData: float []) (intensityData: float []) = 
    if mzData.Length < 3 then 
        [||],[||]
    else
        let paddYValue = Array.min intensityData
        // we need to define some padding and wavelet parameters
        let paddingParams = 
            SignalDetection.Padding.createPaddingParameters paddYValue (Some 7) 0.05 150 95.
        let waveletParameters = 
            SignalDetection.Wavelet.createWaveletParameters 10 paddYValue 0.1 90. 1. false false
        
        let paddedMz,paddedIntensity = 
            SignalDetection.Padding.paddDataBy paddingParams mzData intensityData
        
        BioFSharp.Mz.SignalDetection.Wavelet.toCentroidWithRicker2D waveletParameters paddedMz paddedIntensity 
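
For intuition about the wavelet named above, the following sketch evaluates the standard one-dimensional Ricker ("Mexican hat") wavelet. It is written from the textbook formula and is not taken from BioFSharp.Mz. The function name `rickerWavelet` and its parameters are hypothetical, and the library's internal implementation may differ.

```fsharp
// Hypothetical illustration of the textbook Ricker wavelet:
// psi(t) = 2 / (sqrt(3 * sigma) * pi^(1/4)) * (1 - (t/sigma)^2) * exp(-t^2 / (2 * sigma^2))
let rickerWavelet (sigma: float) (t: float) =
    let norm = 2. / (sqrt (3. * sigma) * (System.Math.PI ** 0.25))
    let x = t / sigma
    norm * (1. - x * x) * exp (-(x * x) / 2.)

// Sample the wavelet on a small grid around its centre (sigma = 1)
let sampledWavelet =
    [| -3. .. 0.5 .. 3. |]
    |> Array.map (fun t -> t, rickerWavelet 1. t)
```

The wavelet is positive in the centre, crosses zero at a distance of sigma, and is negative further out. A wavelet of this shape therefore responds most strongly to peak-like features of matching width.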

We load a sample MS1 spectrum from an mgf file.

// Code-Block 2
let directory = __SOURCE_DIRECTORY__
let path = Path.Combine[|directory; "downloads/ms1MGF.mgf"|]
downloadFile path "ms1MGF.mgf" "bio-bte-06-l-7"

let ms1 = 
    BioFSharp.IO.Mgf.readMgf (path)
    |> List.head

ms1

Application of the peak picking function

We centroid the MS1 data using the function declared beforehand:

// Code-Block 3

let centroidedMs1 = 
    ms1PeakPicking ms1.Mass ms1.Intensity

// Code-Block 4

// Remove low-intensity data points (intensity <= 400) to keep the chart readable
let filteredMs1Mass, filteredMs1Intensity =
    Array.zip ms1.Mass ms1.Intensity
    |> Array.filter (fun (_, intensity) -> intensity > 400.)
    |> Array.unzip

let filteredChart =
    [
        Chart.Point(filteredMs1Mass,filteredMs1Intensity)
        |> Chart.withTraceName "Uncentroided MS1"
        Chart.Point(fst centroidedMs1,snd centroidedMs1)
        |> Chart.withTraceName "Centroided MS1"
    ]
    |> Chart.combine
    |> Chart.withYAxisStyle "Intensity"
    |> Chart.withXAxisStyle (title = "m/z", MinMax = (400., 800.))
    |> Chart.withSize (900., 900.)

filteredChart

Questions

  1. The aim of centroidisation is to find the m/z for each profile peak. How can this improve the performance and quality of the following steps?
  2. The result plot shows a single MS1 spectrum. Naively describe the differences between the uncentroided and the centroided spectra.
  3. Taking into consideration your answer to question 1, do your findings from question 2 meet your expectations? Why or why not?