Configuring Python for SSRDE

Python is a popular general-purpose programming language that has become a leader in the data science ecosystem. Two "branches" of Python are in wide use: Python 2 (which was officially supported only until January 1, 2020) and Python 3. A huge variety of third-party libraries is available for both branches. Below are instructions for setting up a Python environment with the modules you need on SSRDE.

By default, python on SSRDE refers to Python 2.7, and python3 refers to Python 3.6.
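
To confirm which interpreter each command name starts on the machine you are logged into, you can ask each one for its version. This is only a quick check; the exact version numbers printed depend on how the machine is configured.

```shell
# Show which Python version each command name starts.
# Python 2 prints its version to stderr, so 2>&1 merges it into normal output.
python --version 2>&1 || echo "python is not available here"
python3 --version 2>&1 || echo "python3 is not available here"
```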

Often, however, we'll want a specific version of Python. We can control this using a tool called a virtual environment; virtual environments have a number of other uses that we'll see shortly. The syntax for creating a virtual environment is virtualenv <environment name> --python=<python version>.
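
As a concrete instance of this syntax, the command below would create an environment named test_env that uses Python 3.6. This is a sketch: it assumes that virtualenv is available on your PATH and that python3.6 is the name of an interpreter installed on the machine. The name test_env is just an example.

```shell
# Create a virtual environment in a new directory called test_env,
# built on the Python 3.6 interpreter (assumed to be installed as python3.6).
virtualenv test_env --python=python3.6
```

The environment lives entirely inside the test_env directory, so deleting that directory removes the environment.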

Suppose we have created a virtual environment named test_env that uses Python 3.6. To activate the virtual environment, we run source <environment name>/bin/activate; to exit it, we run deactivate. Notice that while the environment is active, python refers to the version of Python we specified when creating the environment.
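
A typical session might look like the following sketch, assuming an environment named test_env already exists in the current directory.

```shell
# Turn the environment on for the current shell session.
source test_env/bin/activate

# While the environment is active, "python" is the interpreter chosen
# when the environment was created.
python --version

# Turn the environment off again and return to the system default.
deactivate
```

Activation only affects the shell session in which you run it, so each new login or terminal needs its own source command.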

One of the main benefits of Python is its extensive library of third-party modules that add functionality such as data visualization, numerical methods, and neural networks. Python has a package management tool called pip that makes installing these packages easy. Installing packages system-wide with pip normally requires sudo (administrator) access, but when you run pip from your virtual environment you no longer need sudo, because the packages are installed inside the environment's own directory.

Pip has the ability to read in lists of requirements as well, so you can use one command to install all the packages you need for your projects. By convention those lists are kept in a file named requirements.txt, and the command for installing them is pip install -r requirements.txt.
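
To illustrate the file format, the sketch below writes a small requirements.txt from the command line and prints it. The package list and version numbers are only examples, not a recommendation for your project. You could equally create the file in a text editor.

```shell
# Write an example requirements.txt: one package per line.
# "name==version" pins an exact version; a bare name lets pip choose one.
cat > requirements.txt <<'EOF'
numpy==1.19.5
pandas==1.1.5
matplotlib
EOF

# Print the file to check its contents.
cat requirements.txt
```

With your virtual environment active, running pip install -r requirements.txt in the same directory would then install everything listed in the file.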

Pip also has a useful command called freeze, which lists every package (with its version) currently installed in your virtual environment. Running pip freeze > requirements.txt will create a requirements.txt that can be used to recreate your environment elsewhere. If you're moving to SSRDE from another development machine, running pip freeze on that machine and moving the requirements.txt to SSRDE along with your code will help you rebuild the environment you need in a flash.
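
On the machine you are moving from, the capture step might look like this sketch. It assumes pip is available there and that the environment you want to copy is active. Note that it overwrites any existing requirements.txt in the current directory.

```shell
# Record every installed package and its version, one per line.
pip freeze > requirements.txt

# Look over what was captured before copying the file to SSRDE.
cat requirements.txt
```

Once the file is on SSRDE, activating your virtual environment there and running pip install -r requirements.txt should reinstall the same package versions, provided they are available for the Python version in that environment.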

It's important to note that in order to have access to the packages that you need when it's time to run your full code, your virtual environment has to be active while the job is running. The simplest way to ensure this is to add an activate command into your job submission script.
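
As a rough sketch, the commands below write a minimal job submission script that activates an environment before running the code. The file names run_analysis.sh and my_script.py, the job name, and the location of test_env in your home directory are all hypothetical; adjust them to match your own setup, and add whatever other Slurm options your job needs.

```shell
# Write an example Slurm job script to run_analysis.sh.
cat > run_analysis.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=my_analysis      # hypothetical job name
#SBATCH --output=my_analysis.out    # hypothetical file for the job's output

# Activate the environment so the job sees the packages installed in it.
source "$HOME/test_env/bin/activate"

# Run the analysis with the environment's interpreter.
python my_script.py
EOF
```

The script could then be submitted with sbatch run_analysis.sh. Using the full path to the environment means the job does not depend on which directory it was submitted from.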

SSCF representatives are not programmers and will not be able to provide substantial help with your code itself, but we can assist with the transfer and organization of files, navigating the server, and using the job submission program Slurm.

If you need additional help setting up your environment, please don't hesitate to contact SSCF at sscfhelp@ucsd.edu (or reach out to your department SSCF representative, e.g., sscf-econ@ucsd.edu) and we'll be happy to assist!