BioItems

Summary: This example shows how to use BioItems in BioFSharp.

Solving similar problems separately often leads to different approaches, which in programming can make things needlessly complex. In BioFSharp, nucleotides and amino acids are therefore based on the same structural scaffold, which gives a consistent way of working with both. This is especially handy when working with their formulas.

Basics

Many functions are similar for AminoAcids and Nucleotides. The examples below use the following declarations:

open BioFSharp
open BioFSharp.AminoAcids
open BioFSharp.Nucleotides

Accessing the full name:

AminoAcids.name Ala,
Nucleotides.name G 

(Alanine, Guanine)

or the underlying chemical formula:

AminoAcids.formula Lys |> Formula.toString,
Nucleotides.formula T |> Formula.toString 

(C6.00 H12.00 N2.00 O1.00, C10.00 H14.00 N2.00 O5.00)

Nucleotides and AminoAcids in BioFSharp are represented as union cases. Pattern matching on these cases makes it very easy to apply a function selectively.

let filterLysine aa = 
    match aa with
    | AminoAcids.Lys -> AminoAcids.Gap
    | _ -> aa

filterLysine Ala 

Ala

filterLysine Lys

Gap

Of course some functions like these are already defined. Let's use a predefined function to find charged amino acids.

let giveMePositiveAAs aminoAcid = 
    match aminoAcid with
    | a when AminoAcids.isPosCharged a -> 
        printfn 
            "Hey, how are you? I am %s, but my friends call me %c. I'm usually in a positive mood"
            (AminoAcids.name a)
            (AminoAcids.symbol a)

    | a when AminoAcids.isNegCharged a -> 
        printfn 
            "I am %s, short: %c. I'm usually in a negative mood"
            (AminoAcids.name a)
            (AminoAcids.symbol a)

    | _ -> printfn "Just strolling around, minding my own business."

Alanine is usually not charged:

giveMePositiveAAs Ala
Just strolling around, minding my own business.

Lysine is usually positively charged:

giveMePositiveAAs Lys
Hey, how are you? I am Lysine, but my friends call me K. I'm usually in a positive mood

Glutamic acid is usually negatively charged:

giveMePositiveAAs Glu
I am Glutamic Acid, short: E. I'm usually in a negative mood

Amino Acids

Modifying Amino Acids

What makes working on amino acids with BioFSharp truly powerful is the ability to easily modify AminoAcids, including their mass and formula. In the following example we work out the mass of a phosphorylated Serine. Calculations like this can be quite useful for identifying peptides in mass spectrometry.

Ser
|> AminoAcids.formula 
|> Formula.toString
C3.00 H5.00 N1.00 O2.00

As the formula shows, our Serine is missing two H and one O. In BioFSharp, all amino acids are stored in dehydrated form by default (they lack one H2O, as they do inside a peptide chain), because it is assumed that the user will work with collections representing a peptide rather than with single amino acids. For our purposes we want Serine in its free (hydrolysed) form. An easy way to achieve this is to modify it. Adding H2O is quite common, so this modification is predefined:

///Hydrolysed serine

let hydroSerine = AminoAcids.setModification ModificationInfo.Table.H2O Ser

hydroSerine
|> AminoAcids.formula 
|> Formula.toString
C3.00 H7.00 N1.00 O3.00

So far so good. Now let's add the phosphate. First we create a function that alters the formula of a given molecule the way a phosphorylation would. Second, we create a modification representing a phosphorylation at the residue. Finally, we apply this modification to our Serine.

///Phosphorylation of OH-Groups adds PO3 to formula and removes one H
let phosphorylate formula =  
    Formula.add (Formula.parseFormulaString "PO3") formula
    |> Formula.substract (Formula.parseFormulaString "H")

//We create a modification at the residual called phosphorylation.
//In our case it is hypothetical, hence the `false` for the `isBiological` parameter.
let phosphorylation = ModificationInfo.createModification "Phosphorylation" false ModificationInfo.ModLocation.Residual phosphorylate

///phosphorylated Serine
let phosphoSerine = AminoAcids.setModification phosphorylation hydroSerine

phosphoSerine 
|> AminoAcids.formula 
|> Formula.toString
P1.00 C3.00 H6.00 N1.00 O6.00
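
The element bookkeeping behind these two modifications can be checked by hand. The following is a minimal, self-contained sketch that does not use BioFSharp at all: it models a formula as a plain map from element symbol to atom count. The names `ToyFormula`, `combine`, `toyHydroSerine` and `toyPhosphoSerine` are hypothetical and exist only for this illustration.

```fsharp
// Hypothetical stand-in for a chemical formula: element symbol -> atom count.
// This is not BioFSharp's Formula type; it only illustrates the bookkeeping.
type ToyFormula = Map<string, int>

/// Combines two formulas; `sign` is 1 to add b to a and -1 to subtract it.
let combine (sign: int) (a: ToyFormula) (b: ToyFormula) : ToyFormula =
    let folder (acc: ToyFormula) (element: string) (count: int) =
        let current = acc |> Map.tryFind element |> Option.defaultValue 0
        Map.add element (current + sign * count) acc
    Map.fold folder a b

let serineResidue : ToyFormula = Map.ofList [ ("C", 3); ("H", 5); ("N", 1); ("O", 2) ]
let water : ToyFormula = Map.ofList [ ("H", 2); ("O", 1) ]
let phosphate : ToyFormula = Map.ofList [ ("P", 1); ("O", 3) ]
let hydrogen : ToyFormula = Map.ofList [ ("H", 1) ]

// Residue + H2O gives free serine: C3 H7 N1 O3
let toyHydroSerine = combine 1 serineResidue water

// Free serine + PO3 - H gives phosphoserine: C3 H6 N1 O6 P1
let toyPhosphoSerine =
    let withPhosphate = combine 1 toyHydroSerine phosphate
    combine (-1) withPhosphate hydrogen
```

The atom counts in `toyPhosphoSerine` match the formula printed above, which suggests the modification chain behaves as intended.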

As you can see, the Serine is phosphorylated just as we wanted. Our initial aim was to check the mass, which can be done quite easily:

AminoAcids.averageMass Ser

87.07757500000001

AminoAcids.averageMass phosphoSerine

183.05688399999997
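
As a plausibility check, the difference between the two masses should equal the mass of everything we added: one H2O from the hydrolysis, plus PO3 minus one H from the phosphorylation. The sketch below redoes that arithmetic with approximate standard atomic weights. These values are assumptions for illustration; the element table used internally by BioFSharp may differ slightly.

```fsharp
// Approximate standard atomic weights (assumed values, not taken from BioFSharp).
let massH = 1.00794
let massO = 15.9994
let massP = 30.973762

// Net change relative to the unmodified residue:
// + H2O (hydrolysis), + PO3 and - H (phosphorylation).
let expectedShift = (2.0 * massH + massO) + (massP + 3.0 * massO) - massH

// Difference between the two averageMass results shown above.
let observedShift = 183.05688399999997 - 87.07757500000001

printfn "expected shift: %.4f, observed shift: %.4f" expectedShift observedShift
```

Under these assumptions both values come out at roughly 95.98, so the reported mass of the phosphorylated Serine is consistent with the modifications applied.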

Nucleotides

Work with nucleotides usually focuses on the sequence of the bases rather than on their structure, so the list of nucleotide-specific functions is quite short. Here are some of the basic helper functions:

let myAdenine = Nucleotides.A 
let myThymine = Nucleotides.complement myAdenine 

myAdenine, myThymine

(A, T)

Nucleotides.replaceTbyU myAdenine

A

Nucleotides.replaceTbyU myThymine 

U
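
Because work with nucleotides is mostly sequence-oriented, helpers like these are typically mapped over a whole collection. The sketch below illustrates the idea with a small stand-in type rather than BioFSharp itself: the `ToyNucleotides` module, its `Base` type and both functions are hypothetical re-implementations written for this example, and the real library may handle more cases (for instance ambiguity codes) differently.

```fsharp
// Hypothetical stand-in for the nucleotide union type; not part of BioFSharp.
module ToyNucleotides =
    type Base = A | T | G | C | U

    /// Returns the Watson-Crick partner of a base (U is paired with A).
    let complement b =
        match b with
        | A -> T
        | T -> A
        | G -> C
        | C -> G
        | U -> A

    /// Replaces thymine with uracil and leaves every other base unchanged.
    let replaceTbyU b =
        match b with
        | T -> U
        | other -> other

// A short sequence represented as a list of bases.
let toySequence = [ ToyNucleotides.A; ToyNucleotides.T; ToyNucleotides.G ]

// Complement every base, then swap T for U.
let toyResult =
    toySequence
    |> List.map ToyNucleotides.complement
    |> List.map ToyNucleotides.replaceTbyU
// toyResult is [U; A; C]
```

Since each helper works on a single base, any collection function such as `List.map` or `Array.map` can lift it to a full sequence.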
