Python basics: what is a virtual environment?

If you work with Python you’ve most certainly heard of virtual environments, and hopefully you’re using them, but what concretely are they and when are they useful? This article introduces the concept of virtual environments, describes how to create one and explains what’s going on behind the scenes. A basic understanding of the Linux command line is assumed.

The idea

Most Python projects require additional packages which do not come as part of the Python Standard Library. Managing these dependencies can be troublesome when we have multiple Python projects on the same machine, each of which may require a different version of the same package. Additionally, when we transfer code from a development environment to a production environment, such as a server or an embedded system, we need to ensure that the code has access to the exact same set of packages with which it was developed and tested. This is where virtual environments come in. Conceptually, a virtual environment provides an isolated sandbox for your project. It achieves this by creating a separate directory tree which contains its own Python interpreter (or a link to one) and the packages installed for the project.

Creating a virtual environment

Virtual environments are created using the Python venv module. The environment is based on whichever Python interpreter is used to run the module, so python3 -m venv uses the version that python3 resolves to on your system. The following example commands were run using Python 3 on an Ubuntu Linux distribution. To create the virtual environment, we run the venv module as a script, followed by the path of the directory to be created.

$ python3 -m venv myenv

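The ‘myenv’ directory now contains the newly created directory structure. Its ‘bin’ folder houses the Python interpreter, pip and the ‘activate’ script used to start the environment. Packages installed into the environment are placed in a ‘site-packages’ folder under ‘lib’, and a ‘pyvenv.cfg’ file records the Python installation from which the environment was created. The Standard Library itself is not copied; it is still loaded from that base installation.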
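The same environment can also be created from Python code through the venv module's create function. The following is a minimal sketch rather than a recommended workflow: the helper name create_and_list is made up for illustration, the environment is built in a temporary directory, and pip is deliberately left out to keep the run short. The exact entries listed vary between platforms and Python versions.

```python
import tempfile
import venv
from pathlib import Path


def create_and_list(parent_dir, name="myenv"):
    """Create a virtual environment and return its top-level entry names."""
    env_dir = Path(parent_dir) / name
    # with_pip=False skips installing pip, which keeps this sketch fast.
    # "python3 -m venv myenv" installs pip by default.
    venv.create(env_dir, with_pip=False)
    return sorted(entry.name for entry in env_dir.iterdir())


# Build the environment in a throwaway directory and show what it contains.
with tempfile.TemporaryDirectory() as tmp:
    entries = create_and_list(tmp)
    print(entries)
```

On Linux the list should include the ‘bin’ and ‘lib’ folders and the ‘pyvenv.cfg’ file described above.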

Environment activation – behind the scenes

The activate script is run using the source command, which is a shell builtin. When executed in the shell, the activate script prepends the virtual environment’s binary directory to the shell’s path, making the virtual environment’s executables directly available in that shell instance.

$ echo $PATH # path prior to virtual environment activation
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

$ source ./myenv/bin/activate
(myenv) $ echo $PATH # path following virtual environment activation
/home/username/myenv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

Additionally, as the virtual environment’s binary directory is added to the beginning of the path, executables in this directory take precedence over executables of the same name found in system directories appearing later in the path. This is how the isolating effect of the virtual environment is achieved for executables: the shell searches the virtual environment’s bin directory before the system directories. When the virtual environment is active, calling the Python interpreter will execute the virtual environment’s interpreter instead of the system version. When that interpreter starts, it finds the ‘pyvenv.cfg’ file in the environment directory and imports third-party packages from the environment’s own ‘site-packages’ folder rather than from the system one.

$ which python3 # virtual environment not active
/usr/bin/python3

(myenv) $ which python3 # virtual environment active
/home/username/myenv/bin/python3
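The "first match wins" rule can be reproduced without touching the real path. The sketch below assumes a POSIX system and uses two temporary directories as stand-ins for a system directory and a virtual environment's bin directory; the executable name ‘tool’ and the helper names are hypothetical. It relies on shutil.which, which searches a list of directories in order, much as the shell does.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_executable(directory, name):
    """Create a tiny executable shell script called `name` in `directory`."""
    path = Path(directory) / name
    path.write_text("#!/bin/sh\n")
    path.chmod(path.stat().st_mode | stat.S_IXUSR)
    return str(path)


def resolve(name, directories):
    """Return the first match for `name`, searching `directories` in order."""
    return shutil.which(name, path=os.pathsep.join(directories))


with tempfile.TemporaryDirectory() as system_bin, \
        tempfile.TemporaryDirectory() as env_bin:
    system_tool = make_executable(system_bin, "tool")
    env_tool = make_executable(env_bin, "tool")

    # Only the "system" directory is on the search path.
    before = resolve("tool", [system_bin])
    # The "environment" directory is prepended, as the activate script does.
    after = resolve("tool", [env_bin, system_bin])

    print("before activation:", before)
    print("after activation: ", after)
```

Both directories contain an executable with the same name, yet the lookup returns the one whose directory appears first in the search list.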

The same concept applies to packages. For example, if the jupyter package is installed using pip when the virtual environment is active, its executables will be located in the virtual environment’s binary directory rather than in a system directory. The jupyter command can therefore only be run when the virtual environment is active, or by manually giving the full path to the executable in the virtual environment’s binary directory.

(myenv) $ pip3 install jupyter
(myenv) $ type jupyter # virtual environment active
jupyter is /home/username/myenv/bin/jupyter

$ type jupyter # virtual environment not active
bash: type: jupyter: not found
$ type ./myenv/bin/jupyter # use full path
./myenv/bin/jupyter is ./myenv/bin/jupyter

The virtual environment is deactivated by calling ‘deactivate’ from the shell in which the virtual environment was activated. This command executes the deactivate function, which is loaded into the shell when the activate script is run.
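Because activation only changes the shell's path, it can be useful to ask the interpreter itself whether it belongs to a virtual environment. A minimal sketch, using the documented sys.prefix and sys.base_prefix attributes (the function name is made up for illustration):

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # In a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix points at the Python installation it was created from.
    # Outside a venv the two values are identical.
    return sys.prefix != sys.base_prefix


print("virtual environment:", in_virtual_environment())
print("sys.prefix:         ", sys.prefix)
print("sys.base_prefix:    ", sys.base_prefix)
```

Note that this reports on the interpreter that runs the script, not on the state of the shell: running ./myenv/bin/python3 directly gives True even if the environment was never activated.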

Copies vs Symbolic Links

When a Python virtual environment is created on a Unix machine, by default the Python interpreter in the virtual environment’s bin directory is actually a symbolic link to the system version of Python, rather than a copy. The copy vs symbolic link behavior of the virtual environment can be controlled using the ‘--symlinks’ and ‘--copies’ flags when calling the venv module. The ‘--copies’ flag is useful if the environment should not depend on a link to the system interpreter binary, although the environment still relies on the base installation for the Standard Library. When using IDEs such as PyCharm, virtual environments are often created automatically by the IDE, potentially leaving the developer unaware of whether copies or symbolic links have been used. This can easily be checked by running the ls -l command in the virtual environment’s binary directory.
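The same check can be scripted. The sketch below assumes a POSIX layout (a ‘bin’ folder containing ‘python3’) and uses a hypothetical helper, interpreter_kind. It creates one environment with symbolic links and one with copies, using the symlinks parameter of venv.create, which corresponds to the two command line flags.

```python
import tempfile
import venv
from pathlib import Path


def interpreter_kind(env_dir, name="python3"):
    """Report whether the environment's interpreter is a symlink or a copy."""
    interpreter = Path(env_dir) / "bin" / name
    # Test for a symlink first: is_file() follows links and would also
    # return True for a link that points at a regular file.
    if interpreter.is_symlink():
        return "symlink"
    if interpreter.is_file():
        return "copy"
    return "missing"


with tempfile.TemporaryDirectory() as tmp:
    linked_env = Path(tmp) / "linked"
    copied_env = Path(tmp) / "copied"
    # with_pip=False keeps the sketch fast; it does not affect the check.
    venv.create(linked_env, symlinks=True, with_pip=False)
    venv.create(copied_env, symlinks=False, with_pip=False)

    linked_kind = interpreter_kind(linked_env)
    copied_kind = interpreter_kind(copied_env)
    print("linked environment:", linked_kind)
    print("copied environment:", copied_kind)
```

Pointing interpreter_kind at an environment created by an IDE shows which of the two behaviors was used.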

Freezing dependencies

Keeping a record of which package versions were used with a certain version of the source code is essential to ensuring that the production environment into which the code is deployed matches the development environment. Additionally, it avoids the ‘it works on my machine’ problem which occurs in teams when developers have differences in their local environments. This record is produced by the pip freeze command, which prints the virtual environment’s packages and their version numbers to the standard output. This output can be redirected to a text file, which can be put under version control in a shared repository.

(myenv) $ pip3 freeze
numpy==1.16.4
...
pkg-resources==0.0.0
(myenv) $ pip3 freeze > requirements.txt

The same set of package dependencies can then be recreated in a different virtual environment using the -r flag with the pip install command:

(myenv-new) $ pip3 install -r requirements.txt
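Once the packages are installed, it can be reassuring to confirm that the new environment really matches the recorded versions. The following is a rough sketch of such a check, not part of pip: the function names are hypothetical, only simple ‘name==version’ lines are understood, and package names are compared in lowercase without the full name normalization that pip applies. Installed versions are read with the standard importlib.metadata module (Python 3.8 or later).

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {name: version} for every 'name==version' line."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop comments and environment markers, then surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # blank lines, comments and unpinned entries are skipped
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(requirements_text, lookup=installed_version):
    """Return {name: (wanted, found)} for pins the environment does not satisfy."""
    mismatches = {}
    for name, wanted in parse_pins(requirements_text).items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Example input mirroring the freeze output above; the result depends on
# what is installed in the environment running this script.
print(find_mismatches("numpy==1.16.4\n"))
```

An empty dictionary means every pinned package is installed at the recorded version. In practice the text would be read from the requirements.txt file under version control.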

Conclusion

Python virtual environments are simple but powerful tools for isolating and reproducing project dependencies. This is achieved by creating a separate virtual environment directory tree and prepending its binary directory to the path. The default use of copies vs symbolic links varies from platform to platform and can be controlled using venv module flags. The packages in a virtual environment can be ‘frozen’ using pip’s freeze command and recreated elsewhere using pip install with the -r flag.