FSharpML: Explore ML.NET in F#

FSharpML is a lightweight API written in F# on top of the ML.NET machine learning framework. It is designed to let users explore ML.NET in a scriptable manner while maintaining the functional style of F#. The samples are ported from the official Samples for ML.NET.

After installing the package via NuGet, we can load the reference script that ships with it and start using ML.NET together with FSharpML.

#load "FSharpML.fsx"

open System
open Microsoft.ML
open Microsoft.ML.Data
open FSharpML
open FSharpML.EstimatorModel
open FSharpML.TransformerModel

Start by creating a model context (MLContext) and a data reader with the loading configuration.

//Create the MLContext to share across components for deterministic results
let mlContext = MLContext(seed = Nullable 1) //Seed set to any number so you
                                             //have a deterministic environment

// STEP 1: Common data loading configuration
let fullData =
    let hasHeader = true
    let separatorChar = '\t'
    let columns =
        [|
            TextLoader.Column("Label", DataKind.Single, 0)
            TextLoader.Column("SepalLength", DataKind.Single, 1)
            TextLoader.Column("SepalWidth", DataKind.Single, 2)
            TextLoader.Column("PetalLength", DataKind.Single, 3)
            TextLoader.Column("PetalWidth", DataKind.Single, 4)
        |]

    __SOURCE_DIRECTORY__ + "/data/iris-full.txt"
    |> Data.loadFromTextFile mlContext separatorChar hasHeader columns
    |> DataModel.ofDataview<string> mlContext

//Split dataset in two parts: TrainingData (80%) and TestData (20%)
let trainingData, testingData = 
    fullData
    |> DataModel.trainTestSplit 0.2 


//Plain ML.NET equivalent for an IDataView (here called dataView):
//let split = mlContext.Data.TrainTestSplit(dataView, testFraction = 0.2)
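
As a quick sanity check, we can look at what was actually loaded before building the model. The sketch below continues the script above and assumes `trainingData` and `testingData` are in scope. `countRows` is a hypothetical helper introduced here, and `Schema` and `GetColumn` are plain ML.NET members rather than part of FSharpML.

```fsharp
// Assumes the loading script above has already run in this session.

// List the columns ML.NET sees in the training view.
for column in trainingData.Dataview.Schema do
    printfn "%s : %O" column.Name column.Type

// Hypothetical helper: count rows by reading a single numeric column.
let countRows (data: IDataView) =
    data.GetColumn<float32>("SepalLength") |> Seq.length

printfn "training rows: %d" (countRows trainingData.Dataview)
printfn "test rows:     %d" (countRows testingData.Dataview)
```

With a test fraction of 0.2, the training view should hold roughly four times as many rows as the test view.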

After initializing the model context (MLContext), we can start to build our model by appending functions that create estimators from that context. The EstimatorModel holds the context and the chain of estimators (EstimatorChain) and is then fitted to the training data in a training step. Fitting yields a TransformerModel.

//STEP 2: Process data, create and train the model
let model =
    EstimatorModel.create mlContext
    // Process data transformations in pipeline
    |> EstimatorModel.appendBy (fun mlc ->
        mlc.Transforms.Concatenate(
            DefaultColumnNames.Features,
            "SepalLength",
            "SepalWidth",
            "PetalLength",
            "PetalWidth"))
    // Create the model
    |> EstimatorModel.appendBy (fun mlc ->
        mlc.Clustering.Trainers.KMeans(
            featureColumnName = DefaultColumnNames.Features,
            numberOfClusters = 3))
    // Train the model
    |> EstimatorModel.fit trainingData.Dataview

The resulting TransformerModel serves as a predictor and can be tested by predicting our test data and evaluating the accuracy of the model.

// STEP 3: Run the prediction on the test data
let predictions =
    model
    |> TransformerModel.transform testingData.Dataview

// STEP 4: Evaluate the accuracy of the model
let metrics = 
    model
    |> Evaluation.Clustering.evaluate testingData.Dataview
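
To see what these two values contain, we can read a few predicted rows back as F# records and print the clustering metrics. This is a minimal sketch that continues the script above and assumes `mlContext`, `predictions` and `metrics` are in scope. The record type `ClusterPrediction` is hypothetical; the column names `PredictedLabel` and `Score` are the ML.NET defaults for clustering trainers, and `CreateEnumerable`, `AverageDistance` and `DaviesBouldinIndex` come from ML.NET itself, not from FSharpML.

```fsharp
// Assumes the training and evaluation script above has already run.

// Hypothetical record describing one KMeans output row.
// PredictedLabel is the assigned cluster, Score the distances to each centroid.
[<CLIMutable>]
type ClusterPrediction =
    { [<ColumnName("PredictedLabel")>] ClusterId : uint32
      [<ColumnName("Score")>] Distances : float32[] }

// Read the first few predicted rows back as records.
mlContext.Data.CreateEnumerable<ClusterPrediction>(predictions, reuseRowObject = false)
|> Seq.truncate 5
|> Seq.iter (fun p -> printfn "cluster %d, distances %A" p.ClusterId p.Distances)

// Clustering quality measures reported by ML.NET (lower is better for both).
printfn "Average distance:     %f" metrics.AverageDistance
printfn "Davies-Bouldin index: %f" metrics.DaviesBouldinIndex
```

Since KMeans was configured with `numberOfClusters = 3`, each `Distances` array should hold three values, one per centroid.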

For more detailed examples continue to explore the FSharpML documentation.

Contributing and copyright

The project is hosted on GitHub, where you can report issues, fork the project, and submit pull requests. If you're adding a new public API, please also consider adding samples that can be turned into documentation. You might also want to read the library design notes to understand how it works.

The library is available under a Public Domain license, which allows modification and redistribution for both commercial and non-commercial purposes. For more information, see the License file in the GitHub repository.
