BioFSharp


BioFSharp aims to be a user-friendly functional library for bioinformatics written in F#. It contains the basic data structures for common biological objects like amino acids and nucleotides based on chemical formulas and chemical elements.

BioFSharp facilitates working with sequences in a strongly typed way and is designed to work well with F# Interactive. It provides a variety of parsers for many biological file formats and a variety of algorithms suited for bioinformatic workflows.

The core data model implements, in ascending hierarchical order:

[Figure: Data model]


For applications and libraries

You can find all available package versions on NuGet.

For scripting and interactive notebooks

You can include the package via an inline package reference:

#r "nuget: BioFSharp"


The following example shows how easy it is to start working with sequences:

Create a peptide sequence:

open BioFSharp

"PEPTIDE" |> BioArray.ofAminoAcidString

         1  PEPTIDE

Create a nucleotide sequence:

"ATGC" |> BioArray.ofNucleotideString

         1  ATGC

BioFSharp comes equipped with a broad range of functions that operate on individual amino acids and nucleotides, for example to look up a complementary base or a monoisotopic mass.

// Returns the corresponding nucleotide of the complementary strand
Nucleotides.G |> Nucleotides.complement


// Returns the monoisotopic mass of Arginine (minus H2O)
AminoAcids.Arg |> AminoAcids.monoisoMass
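
Because a BioArray is an ordinary array of biological items, the per-item functions above can be combined with the standard F# Array functions. The following is a minimal sketch rather than an official recipe: the names dna, complementStrand, peptide and residueMassSum are illustrative only, and only functions already shown on this page are used from BioFSharp.

```fsharp
#r "nuget: BioFSharp"

open BioFSharp

// Build a nucleotide sequence and complement it item by item.
// Note: the bases stay in their original order; the strand is not reversed.
let dna = "ATGC" |> BioArray.ofNucleotideString
let complementStrand = dna |> Array.map Nucleotides.complement

// Sum the monoisotopic residue masses (each without H2O) of a peptide.
// This is the sum of residue masses only; add the mass of one water
// molecule yourself if you need the mass of the intact peptide.
let peptide = "PEPTIDE" |> BioArray.ofAminoAcidString
let residueMassSum = peptide |> Array.sumBy AminoAcids.monoisoMass

printfn "%A" complementStrand
printfn "%f" residueMassSum
```

Using plain Array.map and Array.sumBy keeps the sketch independent of any sequence-level helpers the library may offer for the same tasks.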


The file readers and writers in BioFSharp make it easy to read and write common biological file formats such as FastA:

open BioFSharp.IO

let filepathFastaA = __SOURCE_DIRECTORY__ + "/data/Chlamy_Cp.fastA"

// Reads the file into a sequence of FastaItems.
let fastaItems = FastA.fromFile BioArray.ofAminoAcidString filepathFastaA

This returns a sequence of FastaItems, so you can directly start working with the individual sequences, each represented as a BioArray of amino acids.

fastaItems |> Seq.item 0

{ Header = "sp|P19528| cytochrome b6/f complex subunit 4 GN=petD PE=petD.p01"
  Sequence =
   [|Met; Ser; Val; Thr; Lys; Lys; Pro; Asp; Leu; Ser; Asp; Pro; Val; Leu; Lys;
     Ala; Lys; Leu; Ala; Lys; Gly; Met; Gly; His; Asn; Thr; Tyr; Gly; Glu; Pro;
     Ala; Trp; Pro; Asn; Asp; Leu; Leu; Tyr; ...
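
Each FastaItem exposes the Header and Sequence fields seen in the output above, so a quick overview of a whole file can be produced with the usual Seq functions. This is a hedged sketch under a few assumptions: describe is a hypothetical helper introduced here for illustration, the extra package reference assumes the readers are distributed as a separate BioFSharp.IO NuGet package, and the file path is the one used earlier on this page.

```fsharp
#r "nuget: BioFSharp"
#r "nuget: BioFSharp.IO" // assumption: the file readers ship in this separate package

open BioFSharp
open BioFSharp.IO

// Hypothetical helper: builds one summary line per FastA entry.
let describe (header: string) (sequence: 'a[]) =
    sprintf "%s (%d residues)" header sequence.Length

let filepathFastaA = __SOURCE_DIRECTORY__ + "/data/Chlamy_Cp.fastA"

// Only read the file if it actually exists, so the script does not throw.
if System.IO.File.Exists filepathFastaA then
    FastA.fromFile BioArray.ofAminoAcidString filepathFastaA
    |> Seq.map (fun item -> describe item.Header item.Sequence)
    |> Seq.iter (printfn "%s")
else
    printfn "File not found: %s" filepathFastaA
```

Keeping the formatting in a small function that takes a header and an array makes it easy to try out in F# Interactive without a FastA file at hand.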

For more detailed examples, continue to explore the BioFSharp documentation. In the near future we will start to provide a cookbook-like tutorial in the CSBlog.

Contributing and copyright

The project is hosted on GitHub, where you can report issues, fork the project, and submit pull requests. If you're adding a new public API, please also consider adding samples that can be turned into documentation.

The library is available under the OSI-approved MIT license. For more information see the License file in the GitHub repository.
