NB03a Retention time and scan time

  1. Retention time and scan time
    1. m/z calculation of the digested peptides
    2. Determination of peptide hydrophobicity

Retention time and scan time

In general, peptides are separated by one or more steps of liquid chromatography (LC). The retention time (RT) is the time at which a measured peptide elutes from the column; it is therefore determined by the physicochemical interaction of that peptide with the column material. Scan time is essentially a synonym for retention time, seen from the point of view of the instrument: it is the time at which the mass spectrometer records the spectrum.

The aim of this notebook is to understand that even though peptides are roughly separated by the LC, multiple peptides elute at the same retention time and are recorded within one MS1 spectrum. Here, we will simulate an MS1 spectrum by random sampling from our previously generated peptide-mass distribution. We will then try to improve the simulation by incorporating information about peptide hydrophobicity. This is only a crude model, but it accounts for the fact that less hydrophobic peptides elute earlier from the C18 LC column.

As always, we start by loading our famous libraries.

#r "nuget: FSharp.Stats, 0.4.3"
#r "nuget: BioFSharp, 2.0.0-beta5"
#r "nuget: BioFSharp.IO, 2.0.0-beta5"
#r "nuget: Plotly.NET, 4.2.0"
#r "nuget: BIO-BTE-06-L-7_Aux, 0.0.10"

#if IPYNB
#r "nuget: Plotly.NET.Interactive, 4.2.0"
#endif // IPYNB

open BioFSharp
open Plotly.NET
open BioFSharp.Elements
open BIO_BTE_06_L_7_Aux
open FS3_Aux
open Retention_time_and_scan_time_Aux
open System.IO

open FSharp.Stats

m/z calculation of the digested peptides

You probably remember the protein digestion process from the previous notebook (see: NB02b_Digestion_and_mass_calculation.ipynb ). This time we also keep the peptide sequence, because we need it later for the hydrophobicity calculation.

// Code-Block 1

let directory = __SOURCE_DIRECTORY__
let path = Path.Combine[|directory; "downloads/Chlamy_JGI5_5(Cp_Mp).fasta"|]
downloadFile path "Chlamy_JGI5_5(Cp_Mp).fasta" "bio-bte-06-l-7"
// evaluate the path to check where the FASTA file was stored
path

let peptideAndMasses = 
    path
    |> IO.FastA.fromFile BioArray.ofAminoAcidString
    |> Seq.toArray
    |> Array.mapi (fun i fastAItem ->
        Digestion.BioArray.digest Digestion.Table.Trypsin i fastAItem.Sequence
        |> Digestion.BioArray.concernMissCleavages 0 0
        )
    |> Array.concat
    |> Array.map (fun peptide ->
        // calculate mass for each peptide
        peptide.PepSequence, BioSeq.toMonoisotopicMassWith (BioItem.monoisoMass ModificationInfo.Table.H2O) peptide.PepSequence
        )

peptideAndMasses |> Array.head

Calculate the m/z of the singly and doubly charged ions for all peptides and combine both in a single collection.

// Code-Block 2

// calculate m/z for each peptide z=1
let singleChargedPeptides =
    peptideAndMasses
    // we only consider peptides longer than 6 amino acids 
    |> Array.filter (fun (peptide,ucMass) -> peptide.Length >= 7)
    |> Array.map (fun (peptide,ucMass) -> peptide, Mass.toMZ ucMass 1.) 

// calculate m/z for each peptide z=2
let doubleChargedPeptides =
    peptideAndMasses
    // we only consider peptides longer than 6 amino acids 
    |> Array.filter (fun (peptide,ucMass) -> peptide.Length >= 7)
    |> Array.map (fun (peptide,ucMass) -> peptide, Mass.toMZ ucMass 2.) 

// combine the two collections
let chargedPeptides =
    Array.concat [singleChargedPeptides;doubleChargedPeptides]


chargedPeptides.[1]
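
The conversion from uncharged mass to m/z used in Code-Block 2 can be sketched by hand. This is a minimal sketch, assuming the usual definition m/z = (M + z * m_proton) / z for positively charged ions. The helper `toMZSketch`, the rounded proton mass, and the example mass are hypothetical and only stand in for `Mass.toMZ`, whose exact constants may differ slightly.

```fsharp
// Mass of a proton in Da (rounded literature value; an assumption of this sketch).
let protonMass = 1.00727646688

// Convert an uncharged monoisotopic mass to m/z for charge state z:
// add z protons, then divide by the charge.
let toMZSketch (mass: float) (z: float) =
    (mass + z * protonMass) / z

// Hypothetical peptide with an uncharged monoisotopic mass of 1000 Da.
let exampleMass = 1000.0

// Print the m/z for charge states 1 to 3.
[ 1.0; 2.0; 3.0 ]
|> List.iter (fun z -> printfn "z = %.0f -> m/z = %.4f" z (toMZSketch exampleMass z))
```

The same peptide therefore appears at very different positions on the m/z axis depending on its charge: roughly 1001.007 for z = 1 and 501.007 for z = 2 in this example.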

Now we can sample our random "MS1" spectrum from the collection of m/z values in chargedPeptides.

// Code-Block 3

// initialize a random generator 
let rnd = new System.Random()

// sample 100 random peptide ions from all Chlamydomonas reinhardtii peptides
let chargedPeptideChart =
    Array.sampleWithOutReplacement rnd chargedPeptides 100
    // we only want the m/z; every peak gets the same dummy intensity of 1
    |> Array.map (fun (_, mz) -> mz, 1.) 
    |> Chart.Column
    |> Chart.withXAxisStyle ("m/z", MinMax = (0., 3000.))
    |> Chart.withYAxisStyle ("Intensity", MinMax = (0., 1.3))
    |> Chart.withSize (900., 400.)
    |> Chart.withTemplate ChartTemplates.light

chargedPeptideChart
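
Because `rnd` is created without a seed, every run of Code-Block 3 draws a different set of peptides. The following sketch shows one common way to sample without replacement, a partial Fisher-Yates shuffle, combined with a fixed seed for reproducibility. `sampleSketch` is a hypothetical helper written for this illustration; it is not the FSharp.Stats implementation, and the seed 42 is arbitrary.

```fsharp
// Draw k distinct elements from source via a partial Fisher-Yates shuffle.
// The input array is copied, so the caller's data stays untouched.
let sampleSketch (rnd: System.Random) (k: int) (source: 'T[]) =
    if k > source.Length then
        invalidArg "k" "k must not exceed the number of elements"
    let pool = Array.copy source
    for i in 0 .. k - 1 do
        // pick a random index from the not yet fixed tail and swap it to position i
        let j = rnd.Next(i, pool.Length)
        let tmp = pool.[i]
        pool.[i] <- pool.[j]
        pool.[j] <- tmp
    // the first k positions now hold the sample
    Array.sub pool 0 k

// A fixed seed yields the same sample on every run.
let seededRnd = System.Random(42)
let sample = sampleSketch seededRnd 5 [| 1 .. 100 |]

printfn "%A" sample
```

Each element can be drawn at most once, which matches the situation in the simulated spectrum: a given peptide ion is either part of the scan or not.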

This simulated spectrum looks quite strange. You probably see immediately that we forgot about the isotopic cluster: a peptide doesn't produce a single peak, but a full isotopic cluster. Therefore, we use our convenience function from the previous notebook (see: NB02c_Isotopic_distribution.ipynb ).

// Code-Block 4

// Predicts the isotopic distribution of the given formula at the given charge, 
// normalized by the maximum probability, using the MIDAs algorithm
let generateIsotopicDistribution (charge: int) (f: Formula.Formula) =
    IsotopicDistribution.MIDA.ofFormula 
        IsotopicDistribution.MIDA.normalizeByMaxProb
        0.01
        0.005
        charge
        f
    |> List.toArray
        
generateIsotopicDistribution

// Code-Block 5

let peptidesAndMassesChart =
    // sample 500 random peptides from all Chlamydomonas reinhardtii peptides
    Array.sampleWithOutReplacement rnd peptideAndMasses 500
    |> Array.map (fun (peptide, _) -> 
            peptide
            |> BioSeq.toFormula
            // the formula covers the amino acid residues only, so we add H2O for the free termini
            |> Formula.add Formula.Table.H2O
            )
    |> Array.collect (fun formula -> 
        [
            // generate singly charged ions 
            generateIsotopicDistribution 1 formula
            // generate doubly charged ions 
            generateIsotopicDistribution 2 formula
        ] |> Array.concat
        )
        )
    |> Chart.Column
    |> Chart.withXAxisStyle ("m/z", MinMax = (0., 3000.))
    |> Chart.withYAxisStyle ("Intensity", MinMax = (0., 1.3))
    |> Chart.withSize (900., 400.)
    |> Chart.withTemplate ChartTemplates.light

peptidesAndMassesChart
// HINT: zoom in on peptides
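
When you zoom in on a single cluster, the spacing between neighbouring peaks reveals the charge state. The following is a simplified sketch, assuming that consecutive isotopic peaks differ by roughly the 13C/12C mass difference of about 1.00335 Da; on the m/z axis this spacing is divided by the charge. `isotopicPeakPositions`, the rounded constants, and the example mass are hypothetical and not part of BioFSharp.

```fsharp
// Approximate mass difference between 13C and 12C in Da (assumption: all
// isotopic peaks are treated as 13C substitutions).
let isotopeSpacing = 1.00335
// Mass of a proton in Da (rounded literature value).
let protonMassDa = 1.00727646688

// m/z positions of the first n isotopic peaks of an ion with the given
// uncharged monoisotopic mass and charge.
let isotopicPeakPositions (monoMass: float) (charge: int) (n: int) =
    let z = float charge
    Array.init n (fun k -> (monoMass + float k * isotopeSpacing + z * protonMassDa) / z)

// Hypothetical peptide of 1000 Da, first three peaks for z = 1 and z = 2.
let singly = isotopicPeakPositions 1000.0 1 3
let doubly = isotopicPeakPositions 1000.0 2 3

printfn "z = 1: %A" singly
printfn "z = 2: %A" doubly
```

For z = 1 the peaks are about 1.003 m/z units apart, for z = 2 only about 0.502, so doubly charged clusters look twice as dense in the chart.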

Determination of peptide hydrophobicity

In an MS1 scan, peptides don't appear at random. They elute from the LC according to their hydrophobicity and other physicochemical properties.

To represent an MS1 spectrum more accurately, we determine the hydrophobicity of each peptide. For this, we first need a function that maps from sequence to hydrophobicity.

// Code-Block 6

open BioFSharp.AminoProperties

// first, define a function that maps from amino acid to hydrophobicity
let getHydrophobicityIndex =
    BioFSharp.AminoProperties.initGetAminoProperty AminoProperty.HydrophobicityIndex
    
// second, use that function to map from peptide sequence to hydrophobicity
let toHydrophobicity (peptide:AminoAcids.AminoAcid[]) =
    peptide
    |> Array.map AminoAcidSymbols.aminoAcidSymbol
    |> AminoProperties.ofWindowedBioArray 3 getHydrophobicityIndex
    |> Array.average

toHydrophobicity

// Code-Block 7

let peptidesFirst200 = 
    chargedPeptides 
    // sort in ascending order of hydrophobicity, so the least hydrophobic peptides come first
    |> Array.sortBy (fun (peptide, _) ->   
        peptide
        |> Array.ofList
        |> toHydrophobicity
        )
    |> Array.take 200

peptidesFirst200 |> Array.head
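
The idea behind Code-Blocks 6 and 7, a windowed average per residue followed by sorting, can be illustrated on a toy example without BioFSharp. This is only a sketch under stated assumptions: it uses a few Kyte-Doolittle hydropathy values instead of the HydrophobicityIndex scale, a centered window that is clipped at the peptide ends (the edge handling of `ofWindowedBioArray` may differ), and three made-up peptide strings. All names below are hypothetical.

```fsharp
// Kyte-Doolittle hydropathy values for a few amino acids (one-letter code).
// ASSUMPTION: illustration only; this is not the scale used in Code-Block 6.
let kyteDoolittle =
    Map.ofList [
        ('A', 1.8); ('G', -0.4); ('I', 4.5); ('L', 3.8); ('V', 4.2)
        ('K', -3.9); ('R', -4.5); ('D', -3.5); ('E', -3.5); ('S', -0.8)
    ]

// Replace every value by the mean of its neighbourhood: n positions to the
// left and n to the right, clipped at the array borders.
let windowedAverage (n: int) (values: float[]) =
    values
    |> Array.mapi (fun i _ ->
        let lo = max 0 (i - n)
        let hi = min (values.Length - 1) (i + n)
        values.[lo .. hi] |> Array.average)

// One hydrophobicity score per peptide: mean of the windowed values.
let toyHydrophobicity (peptide: string) =
    peptide
    |> Seq.toArray
    |> Array.map (fun aa -> Map.find aa kyteDoolittle)
    |> windowedAverage 3
    |> Array.average

// Made-up peptides; ascending sort puts the least hydrophobic one first,
// mimicking early elution from a reversed-phase column.
let toyPeptides = [| "LIVAGLK"; "DEKSGER"; "AVLSGDK" |]
let sortedToyPeptides = toyPeptides |> Array.sortBy toyHydrophobicity

sortedToyPeptides
|> Array.iter (fun p -> printfn "%s %.3f" p (toyHydrophobicity p))
```

The charged, polar peptide ends up first and the aliphatic one last, which is the ordering the crude elution model relies on.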

Now we need to generate the isotopic clusters again for these 200 peptide ions and visualize them.

// Code-Block 8

let peptidesFirst200Chart =
    peptidesFirst200
    |> Array.map (fun (peptide, _) -> 
            peptide
            |> BioSeq.toFormula
            // the formula covers the amino acid residues only, so we add H2O for the free termini
            |> Formula.add Formula.Table.H2O
            )
    |> Array.collect (fun formula -> 
        [
            // generate singly charged ions 
            generateIsotopicDistribution 1 formula
            // generate doubly charged ions 
            generateIsotopicDistribution 2 formula
        ]
        |> Array.concat
        )
    // Display
    |> Chart.Column
    |> Chart.withXAxisStyle ("m/z", MinMax = (0., 3000.))
    |> Chart.withYAxisStyle ("Intensity", MinMax = (0., 1.3))
    |> Chart.withSize (900., 400.)
    |> Chart.withTemplate ChartTemplates.light

peptidesFirst200Chart
// HINT: zoom in on peptides

Questions

  1. How does the gradient applied in reversed-phase LC influence the retention time?
  2. Try generating your own MS1 spectrum with peptides of similar hydrophobicity. Take a look at Code-Blocks 7 and 8 to see how to do that.
  3. To better compare retention times between runs with different gradients or instruments, the retention time of those runs must be aligned. What could be some ways to align the retention time of different runs?