
Sequence Properties

Summary: This example shows how to calculate properties of amino acid sequences in BioFSharp.

General

BioFSharp ships with numerical values for a range of important amino acid properties. To access them, use the initGetAminoProperty function: it takes a property and returns a mapping function that assigns a value to each compatible amino acid.
In this tutorial, the aim is to find the hydrophobicity of a peptide. We start by calling the aforementioned function.

open BioFSharp
open AminoProperties

// Mapping function: amino acid symbol -> hydrophobicity index
let getHydrophobicityIndex = initGetAminoProperty AminoProperty.HydrophobicityIndex

getHydrophobicityIndex AminoAcidSymbols.AminoAcidSymbol.Ala
0.61

// Same property, with the values normalized to the Z-norm scale
let getHydrophobicityIndexZ = initGetAminoPropertyZnorm AminoProperty.HydrophobicityIndex

getHydrophobicityIndexZ AminoAcidSymbols.AminoAcidSymbol.Ala
-0.4813502112
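
As a rough illustration of what normalizing to a Z-norm scale means, the following sketch rescales a few values so that their mean is 0 and their standard deviation is 1. It does not use BioFSharp: the input values and the helper zNormalize are hypothetical, and the choice of the population standard deviation is an assumption, so the numbers will not necessarily match what initGetAminoPropertyZnorm returns.

```fsharp
// Hypothetical raw property values (illustration only, not library data).
let raw = [| 0.61; 0.60; 1.88; 0.07 |]

// Hypothetical helper: z-score normalization, (x - mean) / standard deviation.
let zNormalize (xs: float[]) : float[] =
    let mean = Array.average xs
    // Assumption: population standard deviation (divide by N, not N - 1).
    let sd = xs |> Array.averageBy (fun x -> (x - mean) ** 2.0) |> sqrt
    xs |> Array.map (fun x -> (x - mean) / sd)

let zScores = zNormalize raw
```

Values below the mean come out negative and values above it positive, which makes different property scales easier to compare.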

With this function you can estimate the hydrophobicity of the peptide by mapping it over every residue. However, neighboring amino acids in a peptide usually influence each other. To account for this, you can use the ofWindowedBioArray function. It additionally takes a window size n and returns, for every position in the chain, the property value averaged over that residue and the up to n residues on either side of it.

let peptide = 
    "REYAHMIGMEYDTVQK"
    |> BioArray.ofAminoAcidString
    |> Array.map AminoAcidSymbols.aminoAcidSymbol

let peptidehydrophobicites = peptide |> Array.map getHydrophobicityIndex
[|0.6; 0.47; 1.88; 0.61; 0.61; 1.18; 2.22; 0.07; 1.18; 0.47; 1.88; 0.46;
0.05; 1.32; 0.0; 1.15|]

let peptidehydrophobicites' = peptide |> AminoProperties.ofWindowedBioArray 3 getHydrophobicityIndex
[|0.89; 0.834; 0.8916666667; 1.081428571; 1.005714286; 1.107142857;
0.9057142857; 1.087142857; 1.065714286; 0.9042857143; 0.7757142857;
0.7657142857; 0.7614285714; 0.81; 0.596; 0.63|]

In the last step you can sum or average the values to get a summary value of the hydrophobicity, depending on whether you want a length-dependent (sum) or length-independent (average) value.

Array.sum peptidehydrophobicites
14.15

Array.sum peptidehydrophobicites'
14.11166667

Array.average peptidehydrophobicites
0.884375

Array.average peptidehydrophobicites'
0.8819791667
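
The windowed values shown above can be reproduced by hand, which makes it easier to see what the window size does. The following is a minimal sketch in plain F#, not the library's implementation: the helper windowedAverage is hypothetical, and it assumes that each window covers the current residue plus up to n residues on each side, truncated at the ends of the peptide. The input values are the per-residue hydrophobicities of REYAHMIGMEYDTVQK listed above.

```fsharp
// Per-residue hydrophobicity values of REYAHMIGMEYDTVQK, copied from the tutorial output.
let values =
    [| 0.6; 0.47; 1.88; 0.61; 0.61; 1.18; 2.22; 0.07; 1.18; 0.47; 1.88; 0.46; 0.05; 1.32; 0.0; 1.15 |]

// Hypothetical helper: average each position with up to n neighbors on each side.
// Windows are truncated at the start and end of the array.
let windowedAverage (n: int) (xs: float[]) : float[] =
    xs
    |> Array.mapi (fun i _ ->
        let lo = max 0 (i - n)
        let hi = min (xs.Length - 1) (i + n)
        Array.average xs.[lo..hi])

let windowed = windowedAverage 3 values
// The first window covers positions 0..3: (0.6 + 0.47 + 1.88 + 0.61) / 4 = 0.89
```

Under this assumption the sketch reproduces the windowed values listed above for a window size of 3, for example 0.89 at the first position and 0.63 at the last.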

Isoelectric Point

The isoelectric point (pI) of a protein is the pH at which it carries as many positive as negative charges, so that its overall charge is zero. Knowing this value can be useful, for example, for isolating single proteins in a voltage gradient.
The implementation is based on: this document. In principle, the distinct amino acids in the protein are counted. Using the Henderson-Hasselbalch equation and the pKr values (the pK values of the side chains), the theoretical charge state of each amino acid at a specific pH can be calculated. Multiplying these charge states by the counts of the associated amino acids and adding up the products gives the overall charge of the protein. Only amino acids that can be charged (basic or acidic) are taken into account. The isoelectric point is the pH value for which this charge function returns zero. It is found by bisection (also called binary search).
Disclaimer: Keep in mind that this algorithm ignores post-translational modifications and interactions of the amino acids with each other. It is therefore only intended as a rough approximation and should be used as such.
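
The procedure described above can be sketched in plain F# without the library. This is an illustration under stated assumptions, not the code behind the IsoelectricPoint module: all names (positivePKa, negativePKa, nTermPKa, cTermPKa, netCharge, findPI) are hypothetical, the pK values are rounded textbook-style numbers rather than the library's defaults, and the inclusion of the two chain termini is an assumption.

```fsharp
// Illustrative pK values (assumptions for this sketch, not the library's defaults).
let positivePKa = function
    | 'K' -> Some 10.5
    | 'R' -> Some 12.5
    | 'H' -> Some 6.0
    | _ -> None

let negativePKa = function
    | 'D' -> Some 3.9
    | 'E' -> Some 4.1
    | 'C' -> Some 8.3
    | 'Y' -> Some 10.1
    | _ -> None

// Assumption: the free N- and C-terminus also contribute one charge each.
let nTermPKa = 9.0
let cTermPKa = 2.0

// Henderson-Hasselbalch: average charge of one basic or acidic group at a given pH.
let positiveCharge (pKa: float) (pH: float) = 1.0 / (1.0 + 10.0 ** (pH - pKa))
let negativeCharge (pKa: float) (pH: float) = -1.0 / (1.0 + 10.0 ** (pKa - pH))

// Count each distinct residue, multiply the count by its charge, and add everything up.
let netCharge (sequence: string) (pH: float) : float =
    let sideChains =
        sequence
        |> Seq.countBy id
        |> Seq.sumBy (fun (aa, count) ->
            let perResidue =
                match positivePKa aa, negativePKa aa with
                | Some pKa, _ -> positiveCharge pKa pH
                | _, Some pKa -> negativeCharge pKa pH
                | _ -> 0.0 // uncharged side chain
            float count * perResidue)
    sideChains + positiveCharge nTermPKa pH + negativeCharge cTermPKa pH

// Bisection over pH 0..14: the net charge only decreases as the pH rises,
// so a positive charge means the pI lies above the midpoint.
let findPI (tolerance: float) (sequence: string) : float =
    let rec bisect lo hi =
        let mid = (lo + hi) / 2.0
        let charge = netCharge sequence mid
        if abs charge < tolerance || hi - lo < 1e-9 then mid
        elif charge > 0.0 then bisect mid hi
        else bisect lo mid
    bisect 0.0 14.0

let roughPI = findPI 0.5 "ATPIIEMNYPWTMNIKLSSDACMTNWWPNCMTLKIIA"
let finerPI = findPI 0.001 "ATPIIEMNYPWTMNIKLSSDACMTNWWPNCMTLKIIA"
```

In this sketch a coarse tolerance such as 0.5 already stops at the first midpoint, pH 7.0, because the net charge there is closer to zero than 0.5. A smaller tolerance forces further bisection steps and gives a more precise estimate.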


The function for finding the isoelectric point is found in the IsoelectricPoint module.

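// Amino acid sequence, converted to AminoAcidSymbols
let myProteinForPI = 
    "ATPIIEMNYPWTMNIKLSSDACMTNWWPNCMTLKIIA"
    |> Seq.map AminoAcidSymbols.aminoAcidSymbol

// Accuracy in z: the overall charge at the returned pH must be closer to 0 than this value
let acc = 0.5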

IsoelectricPoint.tryFind IsoelectricPoint.getpKr acc myProteinForPI
Some 7.0