
Docker + NONMEM performance


#1

Wanted to put out a ‘call’ for any other users/groups using Docker for NONMEM models. I know at least Bill Denney is using it. I’ve been pretty happy with the isolation, but would like to start some conversations about overall project management. Namely, mapping shared folders carries a reasonably significant performance penalty that I picked up on: benchmarks were ~25-50% slower when running models in a folder mapped to a shared drive on the host.
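For context, the setup in question looks roughly like this (a minimal sketch; the nonmem-psn image name and the paths are placeholders): the model folder lives on the host and is bind-mounted into the container, so every scratch file NONMEM writes goes through the volume mapping.

```
# Minimal sketch of running a model in a folder mapped from the host.
# "nonmem-psn" and the paths are placeholders.
docker run --rm \
  -v /c/models/run001:/data \
  -w /data \
  nonmem-psn \
  execute run001.mod
```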

Looking to hear about any strategies that others have been happy with.


#2

Hi Devin,

I’ve been very happy with the isolation that Docker provides, too. I’ve not done speed comparisons of Docker to native performance myself (though it wouldn’t be too hard to do), but others who have made that comparison have reported identical or near-identical results.

I’ve been debating rewriting my Docker scripts to do the work in a RAM disk and then transfer it to a physical disk at the end of execution. That should make the network-mapping penalty pretty small (for models that take more than a few seconds to run). But a RAM disk, or any copy-on-completion method without permanent storage, would not be able to resume a model that is interrupted part-way through. And a RAM disk could have implications if you generate a large volume of results simultaneously and run out of RAM (e.g. with a bootstrap).
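A rough sketch of the idea, using a tmpfs mount as the RAM disk (the image name, paths, and size cap are all placeholders):

```
# Sketch: do the work in a RAM-backed tmpfs, copy results back to the
# bind-mounted host folder only when the run finishes. Note the caveat
# above: if the container is interrupted, the in-progress run is lost.
# "nonmem-psn", the paths, and the 2g size cap are placeholders.
docker run --rm \
  -v /c/models/run001:/results \
  --tmpfs /scratch:rw,size=2g \
  nonmem-psn \
  bash -c "cp -r /results/. /scratch && cd /scratch && execute run001.mod; cp -r /scratch/. /results"
```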

On a related note, I recall seeing a very significant improvement when moving from a spinning magnetic disk to solid-state, but that was simply observational. Maybe a benchmarked comparison of the impact of storage on NONMEM run times is in order. As a guess, it’d be: RAM < SSD <<< spinning disk <<< network SSD << network spinning disk. I’d guess that the difference between RAM and SSD may be imperceptible.

Bill


#3

So after posting I did some more benchmarking. It seems that the bottleneck with mapped volumes is the initial creation of the swath of temp files. I found that when running in a folder synced back to the host OS as a volume (Windows), the first run was ~30% slower; however, when re-running the model, Docker and native performance were identical (sometimes even faster than native on a Windows machine with different compilers). I’m not sure whether the extra I/O is blocking NONMEM from proceeding to the next step, or whether it’s a side effect of having those files in the host OS cache.
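For reference, a rough sketch of one way to isolate the mapping overhead (the image and model names are placeholders): time a run in the host-synced folder, then time it again with everything copied into the container’s own filesystem.

```
# (a) run in the folder that is synced back to the host as a volume
time docker run --rm -v /c/models/run001:/data -w /data \
    nonmem-psn execute run001.mod

# (b) copy the folder into the container filesystem and run there
time docker run --rm -v /c/models/run001:/data nonmem-psn \
    bash -c "cp -r /data /tmp/run && cd /tmp/run && execute run001.mod"
```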

To your point, the volatility and $$$ of a RAM disk seem like a bit of overkill IMO; I haven’t been bottlenecked yet, to my knowledge, with an SSD, even when writing out fort50 chains for Bayesian fits. Nevertheless, I also have my eye on some of the PCIe SSDs that have over 1 GB/s of throughput.

Anyway, I also pushed up some Dockerfiles that I’m using - https://github.com/dpastoor/dockerfiles

They are a bit more bare-bones than yours (you need to manually put the NMCD and the nonmem.lic file in the parent folder), but they are simple and easy to follow. The bare NONMEM + PsN image comes out to ~850 MB.

@bdenney have you tried doing any cloud deployments with Docker - e.g. package up a set of runs as a snapshot, push that snapshot to AWS/GCE/DigitalOcean, and run it there? Or are you mainly using it for isolation?
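Roughly what I have in mind (just a sketch; the repo/tag names are placeholders and assume a registry reachable from the cloud host):

```
# Bake the current set of runs into an image on top of the NONMEM/PsN image
# (a Dockerfile that COPYs the model folder in), then push the snapshot to a
# registry and run it on the cloud instance. Names/tags are placeholders.
docker build -t myrepo/nonmem-runs:proj-a .
docker push myrepo/nonmem-runs:proj-a

# on the AWS/GCE/DigitalOcean box:
docker pull myrepo/nonmem-runs:proj-a
docker run --rm myrepo/nonmem-runs:proj-a execute run001.mod
```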


#4

Hi Devin,

I’m glad that you were able to track down the speed issue and find the bottleneck. From your description, it sounds like that ~30% overhead would mainly affect fast model runs (a handful of seconds to complete) rather than long runs (minutes to hours to complete). Is that right?

Thanks for sharing your Dockerfiles. I have a few questions about them, but I’ll delve into the specifics on GitHub. For comparison, my current PsN image is ~500 MB. It looks like you strip away the same files from the NONMEM installation that I do (though you don’t appear to use NMQual, so that could differ a bit), and the difference may just be the cleanup I do after the PsN installation.

I’ve not been doing cloud deployments with Docker yet. Currently, I use Docker for isolation, and when I need bigger things, I’ve been using Metworx.


#5

Correct - so for simulations it is something to consider, but it’s likely negligible for longer-running things.

To your PsN image - is that the size including the base NONMEM image, or just PsN? The PsN image is ~300 MB more than the base NONMEM Docker image for me. If you have both NONMEM and PsN in 500 MB, I definitely need to check out what more you’re stripping out.


#6

My PsN image is the whole shebang: g77, perl, NONMEM (with NMQual and mpich), and PsN. The current image I use is 506 MB total.

One big difference in size that I found is removing the PsN tests directory: I run the tests and then delete the directory. If the tests don’t pass, I don’t get an image, so that is an implicit confirmation that the tests ran.
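In Dockerfile terms, the pattern is roughly this (a sketch; the PsN source path and the exact test invocation are placeholders for whatever your install layout uses):

```
# Run the PsN test suite and delete the tests directory in the same RUN
# layer, so a test failure fails the build (no image is produced) and the
# removed directory never contributes to the final image size.
# The path and "prove -r ." are placeholders for the actual install layout.
RUN cd /opt/PsN-source/test && \
    prove -r . && \
    cd / && rm -rf /opt/PsN-source/test
```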