TargetP BioContainer
TargetP 1.1 is a widely used tool for predicting the subcellular localization of proteins from their N-terminal presequences. We can leverage the power of TargetP from F# by running it in a Docker container. To get academic access to the TargetP software, please contact the friendly people at DTU.
The image
After acquiring the software, you can create a Dockerfile that abides by the BioContainers conventions at the package's root and run
docker build . -t nameOfYourContainer:yourTag
to get the image needed. Here is an example of a possible dockerfile:
Running targetp from F#
As always, we need to define the docker client endpoint, image, and container context to run:
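As a minimal sketch of the first of these definitions, the snippet below picks a Docker daemon endpoint for the current operating system. It uses only the .NET base library. The helper name `defaultDockerEndpoint` is hypothetical, the two URIs are the usual Docker defaults (a named pipe on Windows, a Unix socket elsewhere), and the image name and tag are the placeholders from the `docker build` call above. Handing the endpoint to the library's client-connection function is not shown, since that call is not preserved on this page.

```fsharp
open System.Runtime.InteropServices

/// Default Docker daemon endpoint for the current operating system.
/// Windows exposes the daemon through a named pipe; Linux and macOS
/// use a Unix domain socket. (Assumes a default Docker installation.)
let defaultDockerEndpoint () : string =
    if RuntimeInformation.IsOSPlatform(OSPlatform.Windows) then
        "npipe://./pipe/docker_engine"
    else
        "unix:///var/run/docker.sock"

// Placeholder image reference in name:tag form, matching the
// `docker build . -t nameOfYourContainer:yourTag` command.
let imageName, imageTag = "nameOfYourContainer", "yourTag"

printfn "endpoint: %s" (defaultDockerEndpoint ())
printfn "image:    %s:%s" imageName imageTag
```

Keeping the endpoint in one function makes it easy to run the same script on a Windows workstation and a Linux server without editing the connection string.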
To analyze a file with the container, we can use the runWithMountedFile function to work on a FASTA file in the mounted directory.
The file can come either from outside or from an upstream analysis pipeline using BioFSharp, written to disk by FastA.write.
Note: this function is available from version 2.0.0 onwards.
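Because the input file has to live inside the mounted directory, it can help to check that before starting the container. The following is a small sketch using only `System.IO`; it does not call BioFSharp. The helper `isInsideMount`, the directory name `targetp-mount`, the file name `test.fasta`, and the eight-residue sequence are all hypothetical placeholders, and the file is written with plain `File.WriteAllLines` rather than `FastA.write`.

```fsharp
open System
open System.IO

/// True when `filePath` lies inside `mountDir` once both paths are normalised.
/// The comparison is case-sensitive; relax it if your file system is not.
let isInsideMount (mountDir: string) (filePath: string) : bool =
    let sep = string Path.DirectorySeparatorChar
    let root = Path.GetFullPath(mountDir).TrimEnd(Path.DirectorySeparatorChar) + sep
    Path.GetFullPath(filePath).StartsWith(root, StringComparison.Ordinal)

// Hypothetical mount directory, created under the system temp folder.
let mountDir = Path.Combine(Path.GetTempPath(), "targetp-mount")
Directory.CreateDirectory(mountDir) |> ignore

// Placeholder FASTA record: a ">" header line followed by the sequence line.
let fastaPath = Path.Combine(mountDir, "test.fasta")
File.WriteAllLines(fastaPath, [| ">Header"; "MKTAYIAK" |])

printfn "inside mount: %b" (isInsideMount mountDir fastaPath)
```

A check like this fails early with a clear message instead of letting the container report a missing file.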
Here is an example result for the pathogenesis-related homeodomain protein (https://www.uniprot.org/uniprot/P48786.fasta):
val it : TargetP.TargetpItem = { Name = "P48786;" Len = 1088 Mtp = 0.054 Ctp = nan SP = 0.068 Other = 0.943 Loc = "_" RC = 1 TPlen = "" }
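To read such a record, it helps to know what `Loc` and `RC` stand for. The sketch below uses a local stand-in type, `TargetpResult`, that only mirrors the field names printed above; it is not the library's `TargetP.TargetpItem`. The meaning of the `Loc` codes and the reliability-class thresholds follow the TargetP 1.1 output description as I understand it, so treat them as assumptions to verify against the tool's own documentation.

```fsharp
/// Local stand-in mirroring the fields printed for `TargetP.TargetpItem`.
type TargetpResult =
    { Name: string
      Len: int
      Mtp: float
      Ctp: float
      SP: float
      Other: float
      Loc: string
      RC: int
      TPlen: string }

/// Human-readable meaning of a `Loc` code (assumed TargetP 1.1 convention).
let describeLoc (loc: string) : string =
    match loc with
    | "C" -> "chloroplast"
    | "M" -> "mitochondrion"
    | "S" -> "secretory pathway"
    | "_" -> "any other location"
    | "*" -> "no prediction above the requested cutoff"
    | other -> sprintf "unrecognised code %s" other

/// Reliability class from the gap between the two highest scores
/// (1 = strongest). NaN scores, such as Ctp for non-plant runs, are skipped.
let reliabilityClass (r: TargetpResult) : int option =
    let scores =
        [ r.Mtp; r.Ctp; r.SP; r.Other ]
        |> List.filter (fun s -> not (System.Double.IsNaN s))
        |> List.sortDescending
    match scores with
    | best :: second :: _ ->
        let diff = best - second
        if diff > 0.8 then Some 1
        elif diff > 0.6 then Some 2
        elif diff > 0.4 then Some 3
        elif diff > 0.2 then Some 4
        else Some 5
    | _ -> None

// Values copied from the example result for P48786.
let example =
    { Name = "P48786;"; Len = 1088; Mtp = 0.054; Ctp = nan; SP = 0.068
      Other = 0.943; Loc = "_"; RC = 1; TPlen = "" }

printfn "%s -> %s" example.Name (describeLoc example.Loc)
printfn "recomputed RC: %A" (reliabilityClass example)
```

For the example, the gap between `Other` (0.943) and the runner-up `SP` (0.068) is 0.875, which falls into class 1 and agrees with the reported `RC = 1`.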
It may not always be convenient to analyze files on the disk. To use a memory stream of a FastAItem instead, we can write it to a stream and analyze it using the runWithStream function.
Note: in versions before 2.0.0, this function is named run
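The stream preparation step can be sketched with the .NET base library alone. The helper `fastaToStream` below is hypothetical and writes a single record by hand instead of using BioFSharp's FastA writer; the header and the eight-residue sequence are placeholders. The important detail is rewinding the stream before passing it on, so that the consumer reads from the first byte.

```fsharp
open System.IO
open System.Text

/// Writes one FASTA record (header line plus sequence line) into a new
/// MemoryStream and rewinds it so a consumer can read from the start.
let fastaToStream (header: string) (sequence: string) : MemoryStream =
    let stream = new MemoryStream()
    // leaveOpen = true keeps `stream` usable after the writer is disposed.
    use writer = new StreamWriter(stream, UTF8Encoding(false), 1024, true)
    writer.Write(sprintf ">%s\n%s\n" header sequence)
    writer.Flush()
    stream.Position <- 0L
    stream

// Placeholder record; read it back to confirm the stream content.
let stream = fastaToStream "Header" "MKTAYIAK"
let reader = new StreamReader(stream)
printf "%s" (reader.ReadToEnd())
```

Forgetting to reset `Position` is a common cause of an apparently empty input, because the stream is left at its end after writing.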
val it : TargetP.TargetpItem = { Name = "Header" Len = 8 Mtp = 0.056 Ctp = nan SP = 0.033 Other = 0.975 Loc = "_" RC = 1 TPlen = "" }
What we are doing with it
Our workgroup uses this DSL to power our iMTS-L prediction service. For additional information about iMTS-L prediction, see the paper or take a look at our step-by-step recipe.