Docker Global Hack Day mania: DockerComp

For distributed (scientific) computing, scientists around the world have mostly relied on pre-configured VM images that let volunteers contribute by running small processing tasks on chunks of raw data received over the network.

But since the introduction of Docker, life has changed, and so have the performance benchmarks. We propose a system that uses the benefits of Docker to, hopefully, perform far better than what has been achieved so far with VMs. VMs carry a large startup overhead compared to Docker containers. Moreover, we don’t even need to explain the difference between running more than one VM on a host OS and running multiple Docker containers on that same machine! See the point? 🙂

A user just needs to run the installer.sh script: run it and grab a mug of coffee. The server admin just needs to inject the distributed computing task into the Docker image, along with the server-side code. [ GitHub: https://github.com/arcolife/dockerComp/ ]. Some screenshots below:

installation and first contact

communication and outage
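
The workflow described above (the server cuts raw data into chunks, volunteers process them, and the server collects the results) can be illustrated with a minimal, network-free sketch. This is not dockerComp's actual code: the function names, the chunk size and the sum-of-squares task are all hypothetical stand-ins for whatever task the server admin injects into the Docker image.

```python
from typing import Callable, List


def split_into_chunks(data: List[int], chunk_size: int) -> List[List[int]]:
    """Server side (hypothetical): cut raw data into fixed-size chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def process_chunk(chunk: List[int]) -> int:
    """Volunteer side (hypothetical task): sum of squares of one chunk."""
    return sum(x * x for x in chunk)


def run_job(data: List[int], chunk_size: int,
            task: Callable[[List[int]], int] = process_chunk) -> int:
    """Simulate the full round trip in one process.

    In a real deployment each chunk would travel over the network to a
    container on a volunteer's machine; here the task is called directly.
    """
    chunks = split_into_chunks(data, chunk_size)
    partial_results = [task(chunk) for chunk in chunks]
    # Server side: combine the partial results into the final answer.
    return sum(partial_results)


if __name__ == "__main__":
    print(run_job(list(range(1, 6)), chunk_size=2))
```

Because each chunk is processed independently, the same job could be spread over any number of containers without changing the task itself.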

References:

  1. http://www.rightscale.com/blog/sites/default/files/docker-containers-vms.png
  2. http://en.wikipedia.org/wiki/Docker_%28software%29#cite_ref-3

So, just to give you some context for this whole project, take a look at a project called CernVM. It is a really awesome project, developed to help collect CERN’s LHC data and perform data analysis on a volunteer’s computer or even on commercial clouds. Just imagine if that whole VM-based process were dockerized!

FEATURES

  • Can be used for:
    • Image Processing
    • General Data Analysis
    • Scientific Computing
    • Crowdsourcing projects

FUTURE GOALS

  • Better ease of use, as a pluggable, dockerized distributed computing framework
  • Integration with Docker Compose and Swarm for scalability
  • Security layers
  • Sample distributed-system apps packaged with the project
  • Possible integration with CernVM
  • Integration with the Tor network, and intelligent task distribution with independent cluster management
  • Plugin for mounting FuseFS inside Docker, in non-privileged mode
  • OpenVPN-based tunnel integration
  • More optimizations in task distribution redundancy
  • Release of benchmarks and comparisons with existing methodologies
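
To make the "task distribution redundancy" goal a bit more concrete, here is one possible approach, sketched under assumptions: hand each chunk to several volunteers and accept a result only when a strict majority agree. This is an illustration, not something dockerComp implements; the function names, the round-robin assignment and the majority rule are all hypothetical choices.

```python
from collections import Counter
from typing import Dict, Hashable, List, Optional


def assign_with_redundancy(chunk_ids: List[str], volunteers: List[str],
                           replicas: int) -> Dict[str, List[str]]:
    """Round-robin assignment: each chunk goes to `replicas` distinct volunteers."""
    if not 0 < replicas <= len(volunteers):
        raise ValueError("replicas must be between 1 and the number of volunteers")
    assignment = {}
    for n, chunk_id in enumerate(chunk_ids):
        assignment[chunk_id] = [
            volunteers[(n + k) % len(volunteers)] for k in range(replicas)
        ]
    return assignment


def majority_result(results: List[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of volunteers, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count * 2 > len(results) else None


if __name__ == "__main__":
    plan = assign_with_redundancy(["c0", "c1"], ["a", "b", "c"], replicas=2)
    print(plan)
    print(majority_result([42, 42, 7]))
```

A chunk whose results show no majority (for example, after a volunteer drops out or returns bad data) would simply be handed out again.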