Docker is probably one of the best ideas anyone has had for making code reproducible, among other things. In a sense, it’s like GitHub on steroids. GitHub is great, but if you have ever tried to run some random person’s code you found there, it can take days to get it working, even though it’s the exact same code they had running. They might have used a different version of a library, or there may be some other minor difference, and that can mean a lot of work to reproduce their setup on your computer. Docker lets all of the libraries and dependencies used by the code you want to run be packaged into a Docker image. You can then use that image to recreate the environment needed to run the code. Let’s see how to install Docker Toolbox on Windows 10.
Head over to https://www.docker.com/products/docker-toolbox and click on the download button for your operating system. We’re using Windows here.
Next, just run the executable that was downloaded and follow the installer’s instructions. Once Docker Toolbox is installed, there will be a shortcut icon on your desktop.
Double-click that to start the Docker terminal. If you have a firewall, you might need to disable it now and re-enable it when you’re done with Jupyter Notebook. I also had to enable virtualization in the BIOS for this to work. The Docker whale will appear when Docker is ready (this part sometimes takes a few minutes).
Now that we’re in this terminal, we can use Ubuntu-style commands. The next step in getting a Jupyter Notebook running on Docker is to find a Docker image that has the libraries you want. The one I’m using is one we set up on Docker Hub called mpcrlearn. It has TensorFlow, Python, SciPy, TFLearn, and a few other libraries used for machine learning. The Docker Hub page gives you the pull command for the image: mpcr/mpcrlearn.
Now that we’ve found an image we want, we need to get into Jupyter Notebook to be able to use it. To do that, we need to set up two things: a Dockerfile and a file called docker-compose.yml. I created a folder in my home directory called Docker (you can choose whatever name you want) to put these in.
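Creating that folder can be done from the Docker terminal itself, since it’s a bash shell and Ubuntu-style commands work; Docker is just the example folder name from above:

```shell
cd ~             # start from the home directory
mkdir -p Docker  # folder that will hold Dockerfile and docker-compose.yml
```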
Inside that folder go the two text files, Dockerfile and docker-compose.yml. Once you have created the folder, change directories in the Docker terminal to it. I would run cd Docker, since mine is named Docker. Now, let’s create a Dockerfile inside that folder. Type echo >> Dockerfile (or touch Dockerfile) to create one. The Dockerfile created will be a blank text file. My Dockerfile looks like this.
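A minimal Dockerfile along these lines does the job — this is a sketch, assuming the mpcr/mpcrlearn image and the /notebooks working directory used later in this post; the exact jupyter launch flags are assumptions:

```dockerfile
# base image with TensorFlow, TFLearn, SciPy, etc.
FROM mpcr/mpcrlearn

# directory inside the container where notebook work happens
WORKDIR /notebooks

# start Jupyter Notebook, listening on all interfaces so the
# mapped port is reachable from the host browser
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser"]
```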
It tells Docker which Docker image I want, where I’m going to do my work in the Jupyter Notebook, and which command launches the notebook. You can probably leave all of this the same. Next, let’s look at what’s in the docker-compose.yml file.
You can probably leave the first line the same; it’s the Docker Compose file-format version. Leave lines two through six the same as well (if you didn’t use the mpcr Docker image, change line three). The 8888:8888 tells Docker which host port the Jupyter Notebook’s port gets mapped to. The two lines under volumes are the ones you should change. Each of those lines connects a directory on my laptop to a directory inside the container where we’ll be working. Since I want the work I do in Jupyter to be saved on my computer, I tell Docker that the notebook files I work on in /notebooks should show up in C:/Users/Michael/Documents/Work/MPCR/notebooks. You can change the part before the colon to whatever folder you want to save work in, but leave the part after the colon the same. The other line under volumes saves my jupyter_notebook_config.py file to a local directory too. This line is not necessary; I added it because I modified the config file to put a password on my notebook, and I wanted the password to be remembered after closing Docker. You can create a file called jupyter_notebook_config.py on your computer and put its path in place of the one I have before the colon.
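The file itself isn’t reproduced here, but a docker-compose.yml consistent with that description looks roughly like this — a sketch, in which the service name, the version number, and the container-side path for the config file are assumptions (Jupyter reads jupyter_notebook_config.py from ~/.jupyter by default); its line numbers won’t match the original exactly:

```yaml
version: '2'
services:
  notebook:
    build: .              # build from the Dockerfile in this folder
    ports:
      - "8888:8888"       # host port : container port for Jupyter
    volumes:
      # host path (before the colon) : container path (after the colon)
      - C:/Users/Michael/Documents/Work/MPCR/notebooks:/notebooks
      - C:/Users/Michael/Documents/Work/MPCR/jupyter_notebook_config.py:/root/.jupyter/jupyter_notebook_config.py
```

With something like this in place, docker-compose up builds the image from the Dockerfile and starts the notebook with both mounts attached.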
Now that we have done all of that, it is time to launch Jupyter Notebook. To do this, go to the Docker terminal. While in the terminal, make sure you are in the directory where your Dockerfile and docker-compose.yml file are located. Next, type docker-compose up. You should see something like this.
Now, start Google Chrome and type in the IP address located under the whale when Docker started followed by :8888. So I would type 192.168.99.100:8888 into the address bar in Chrome.
You should now be in your Jupyter Notebook.
So, just click on New in the top right corner of the notebook, and then, in the dropdown menu, select Python 2 to begin a new notebook.
Now you can run your code in Python and TensorFlow. To run the code, click Cell > Run All. Let’s run a simple linear model from TFLearn’s examples. Just paste the code found here, and then run it. You should get something like this.
So we just ran code using TensorFlow, TFLearn, and Python without having any of them installed on our computer, thanks to Docker and Jupyter Notebook. Now that we’ve run that and created a new notebook, let’s check the local directory we connected to the notebook to see whether our linear-model notebook was saved correctly.
Yep. There’s the notebook we created, along with everything else in my Jupyter Notebook. When you exit your Jupyter Notebook, just make sure to go into the Docker terminal, press CTRL-C, and then run docker-compose down.
There are a lot of cool features in Jupyter, and I encourage all of you to experiment with them. For instance, you can go to the File tab and download the notebook as a PDF, a slideshow, or an .html file. All of my blog posts are created with Jupyter Notebook. This is going to change everything in science, and especially in programming and machine learning: you can reproduce an experiment just like that. Thanks for reading this blog post, and have fun with your Jupyter Notebook.