The impact of Docker containers on the performance of genomic pipelines [PeerJ]
In this paper, we have assessed the impact of Docker containers technology on the performance of genomic pipelines, showing that container “virtualization” has a negligible overhead on pipeline performance when it is composed of medium/long running tasks, which is the most common scenario in computational genomic pipelines. Interestingly for these tasks the observed standard deviation is smaller when running with Docker. This suggests that the execution with containers is more “homogeneous,” presumably due to the isolation provided by the container environment. The performance degradation is more significant for pipelines where most of the tasks have a fine or very fine granularity (a few seconds or milliseconds). In this case, the container instantiation time, though small, cannot be ignored and produces a perceptible loss of performance.
(tags: performance docker ops genomics papers)