Programs requiring dependencies
When you’re working at school you’re building almost everything from the ground up: data structures, abstract data types, working with structured and unstructured data, implementing algorithms.
When you get into the real world things suddenly change: your boss is probably going to be annoyed with you if you spend an afternoon building a linked list in a language that has extensive support for a variety of different types of collections. More often than not in the real world, you’re going to be writing software that depends on other software: other programs, other libraries, or other frameworks of code.
While you’re (probably) most familiar with languages like Java, we’re going to take a diversion and look at downloading and running a program in Python that requires some external dependencies.
Downloading the program
The program you’re going to be using as an example is actually something that extends Pandoc: a “filter” for Pandoc.
You can find the repository here: https://code.cs.umanitoba.ca/cs-lab-course/suppress-notes
You’re going to want to get this into your user directory on Aviary (download a zip from GitLab and unzip, git clone, whatever works best for you).
Do this entirely on your own machine on the command line.
This is easier on macOS than Windows because unzip is already installed.
If you do this on your own machine, you’re on your own, but be forewarned that you’re going to need to install Python (on Windows or on macOS).
Dependency management in Python
Dependency management in Python is… not exactly a solved problem.
We’re going to stick to the basics, though, and use Python’s pip.
pip is a command-line tool that you can use to install Python libraries (or packages) that you can then use (or depend on) in your own Python programs.
Using pip is pretty easy (once you know the dependency that you want):
- Figure out what the name of the dependency is.
- Install the dependency with pip install.
Assuming you know what the name of your dependency is, installing it is straightforward with pip. Let’s assume that you’re building a command-line tool in Python and you want to make it look real shiny. A Python package you can use to make your command-line tools look great is rich. Once you know the name of the package (rich in this case), you can use pip to install your dependency:
pip install rich
pip gives us the ability to quickly install dependencies on the command line, and it installs the most recent version of the library by default.
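If you want to check from Python which version pip actually picked, here is a minimal sketch using the standard library's importlib.metadata. The helper name installed_version is made up for illustration, and rich is only the example package from above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string if rich is installed, otherwise None.
    print(installed_version("rich"))
```

Returning None rather than raising keeps the check easy to use in an if statement, which is a design choice for this sketch and not something pip requires.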
We can also install specific versions of dependencies, or versions that meet specific criteria:
# install exactly this version
pip install rich==12.6.0
# install a version that's compatible with X
pip install rich~=12.6.0
In this context, “compatible with” means that the API (the functions you’re calling in the package) is expected to stay the same, but newer releases may include important security or other bug fixes. For rich~=12.6.0, pip accepts any 12.6.x release at or above 12.6.0, but not 12.7.0.
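To make the ~= rule concrete, here is a simplified sketch of the comparison. It assumes plain dotted numeric versions like 12.6.0 only; pip's real version handling also covers pre-releases and other cases that are ignored here. The function names parse_version and is_compatible are made up for illustration.

```python
def parse_version(text):
    """Turn a simple dotted version like '12.6.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(candidate, base):
    """Simplified check for 'candidate satisfies ~=base'."""
    cand, spec = parse_version(candidate), parse_version(base)
    if len(spec) < 2:
        raise ValueError("~= needs at least two version parts, e.g. 12.6")
    # Pad with zeros so that 12.6 and 12.6.0 compare as equal.
    width = max(len(cand), len(spec))
    cand_padded = cand + (0,) * (width - len(cand))
    spec_padded = spec + (0,) * (width - len(spec))
    # Everything except the last part of the spec must match exactly...
    prefix = spec[:-1]
    same_series = cand_padded[:len(prefix)] == prefix
    # ...and the candidate must not be older than the spec.
    return same_series and cand_padded >= spec_padded


if __name__ == "__main__":
    print(is_compatible("12.6.3", "12.6.0"))
    print(is_compatible("12.7.0", "12.6.0"))
```

Under these assumptions, 12.6.3 satisfies ~=12.6.0 and 12.7.0 does not.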
Listing project dependencies
Installing individual dependencies on the command line is fine, and is a good way for us to very quickly get to work when we’re trying to build something without having to worry about versions.
When you’re ready to actually distribute your code to other people, though, you’re suddenly depending on both specific versions of packages that you had installed when you were writing your code, and … well, probably a lot of different packages.
You can just list out instructions for the users of your program or package:
My really cool project
======================

This is my amazing project :tada:!

Installing
----------

You need to install the following dependencies with `pip`:

* `rich` version 12.6.0
* `click` version 8.0.3
* `Pillow` version 9.0.1
… but that would be tedious for anyone using your code (you know how to use pip now, but this would be tedious; can you imagine your parents doing this???).
Thankfully, pip can read files that contain a listing of the dependencies that you have for your project: a file named requirements.txt.
requirements.txt is a file that straight-up lists the dependencies and their versions, everything that comes after pip install.
Here’s a requirements.txt that describes the dependencies for the amazing project above:
rich==12.6.0
click==8.0.3
Pillow==9.0.1
pip’s requirements files can also have comments: a line that starts with # is ignored, and so is anything after a # that follows whitespace on a line.
Here’s the same requirements.txt, but one that has comments:
# we need rich 12.6.0 for a very important, but arbitrary reason.
rich==12.6.0
# we use click for handling command line arguments
click==8.0.3
# Pillow is a great image processing library
Pillow==9.0.1
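To see how little structure a file like this needs, here is a minimal sketch that reads the pinned lines and skips blank lines and whole-line comments. It only understands name==version lines, which is an assumption of this sketch; the real requirements format that pip reads supports much more. The names parse_requirements and EXAMPLE are made up for illustration.

```python
def parse_requirements(text):
    """Return (name, version) pairs from 'name==version' lines."""
    pins = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith("#"):
            continue
        name, separator, version = line.partition("==")
        if not separator:
            raise ValueError(f"expected name==version, got: {line!r}")
        pins.append((name.strip(), version.strip()))
    return pins


EXAMPLE = """\
# we use click for handling command line arguments
click==8.0.3
# Pillow is a great image processing library
Pillow==9.0.1
"""

if __name__ == "__main__":
    for name, version in parse_requirements(EXAMPLE):
        print(name, version)
```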
Once we’ve got a file that has these dependencies listed, we can use pip to install the dependencies all at once with pip install -r.
If you have a file named requirements.txt that has the contents listed above, you can use pip to install the dependencies:
pip install -r requirements.txt
You can write your own requirements.txt by hand, but you can also generate one using the command pip freeze.
pip freeze lists the Python packages currently installed in your environment along with their versions (even ones you didn’t install yourself and ones that are unrelated to your program), so it gives you a comprehensive list of the state of your system.
You can use pip freeze and some output redirection to generate a requirements.txt:
pip freeze > requirements.txt
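If you’re curious where a listing like this can come from, here is a rough sketch that builds similar name==version lines with the standard library's importlib.metadata. It is an approximation for illustration, not how pip freeze is implemented, and its output may differ from what pip prints. The function name freeze_lines is made up.

```python
from importlib.metadata import distributions


def freeze_lines():
    """Return sorted 'name==version' lines for installed distributions."""
    lines = set()
    for dist in distributions():
        metadata = dist.metadata
        if metadata is None:
            continue
        name = metadata.get("Name")
        version = metadata.get("Version")
        # Skip broken installs that have no readable name or version.
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in freeze_lines():
        print(line)
```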
How important is it that your file is named requirements.txt? Not at all! You could call it “dependencies.txt”. You could call it “cats are cool.txt”. You could call it “dogs drool.txt”. (Can you sense a theme here? 🐱)
The name requirements.txt is a “de facto” standard. All that matters is that what comes after the -r option is the name of a file that follows the standard of “one package and version per line”.
Installing the dependencies
The repository that you downloaded/cloned/whatever has a list of requirements as requirements.txt. You should install those requirements for this project.
Running the program
The repository that you downloaded/cloned/whatever has a README.md that describes how to run the software. You should read the README.md to see how to run the software.
After you get the software running, let’s see what can happen when you don’t install the dependencies for a Python project.
Let’s start by uninstalling dependencies with pip:
pip uninstall -r requirements.txt
Found existing installation: panflute 1.12.5
Uninstalling panflute-1.12.5:
  Would remove:
    /home/you/.local/bin/panfl
    /home/you/.local/bin/panflute
    /home/you/.local/lib/python3.10/site-packages/panflute-1.12.5.dist-info/*
    /home/you/.local/lib/python3.10/site-packages/panflute/*
Proceed (Y/n)? Y
  Successfully uninstalled panflute-1.12.5
Now try running the software using the instructions again (through Pandoc, but don’t pipe to lynx); what kind of output do you see now?
Python is helpful enough to tell you the name of the missing package:
Traceback (most recent call last):
  File "/home/you/suppress-notes/./suppress-notes", line 3, in <module>
    import panflute as pf
ModuleNotFoundError: No module named 'panflute'
Error running filter suppress-notes:
Filter returned error status 1
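If you’re writing your own program, you can catch this error and point people at pip yourself. Here is a minimal sketch; the helper name import_or_hint is made up for illustration, and this is not something the suppress-notes filter does. Keep in mind that the name you import and the name you install are not always the same (Pillow, for example, is imported as PIL), which is why the sketch takes an optional package name.

```python
import importlib


def import_or_hint(module_name, package_name=None):
    """Import a module, or return None after printing an install hint."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as error:
        # error.name is the module Python could not find.
        missing = package_name or error.name
        print(f"Missing dependency: try 'pip install {missing}'")
        return None


if __name__ == "__main__":
    # Hypothetical usage: the module is PIL, but the package is Pillow.
    image_module = import_or_hint("PIL", package_name="Pillow")
```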
If you’re planning on using this filter again, you should reinstall the dependencies with pip:
pip install -r requirements.txt
Further reading
Dependency management in Python is… not exactly a solved problem.
Pip is the best place to start with Python package and dependency management because it comes for free as part of Python itself (Python’s got “batteries included”).
That said, not everyone agrees that Pip alone is the right choice, or even that Pip is the right tool at all.
You can read a little more about dependency management and packaging in Python in a couple of places:
- You might be noticing that we’re installing specific versions of packages “globally”; installing with pip means we’re installing into a folder in our own user directory that any Python program we run will use. Sometimes we’ll have multiple different Python programs that use the same dependency, but rely on different versions that aren’t compatible with one another (panflute is fixed at version 1.12.5 in suppress-notes, but the most recent version of panflute is 2.1.X; these are not compatible with one another). Python has something called “virtual environments” where you can install dependencies for a specific program in a specific directory rather than “globally”.
- Poetry is a more advanced tool than pip that is both a dependency manager and a build and packaging tool for Python. Poetry doesn’t work with pip, but instead replaces pip.
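To give a feel for the virtual environments mentioned above, here is a minimal sketch that creates one with the standard library's venv module and checks whether the running interpreter is inside one. The helper names and the example-env directory are made up for illustration, and with_pip=False is only chosen to keep the sketch quick; a real project would usually want pip available inside the environment.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_environment(directory):
    """Create a bare virtual environment and return its path."""
    # with_pip=False keeps this quick; a real project would usually want pip.
    venv.create(directory, with_pip=False)
    return Path(directory)


def in_virtual_environment():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        env = create_environment(Path(scratch) / "example-env")
        # A virtual environment is marked by a pyvenv.cfg file at its root.
        print((env / "pyvenv.cfg").exists())
        print(in_virtual_environment())
```

Packages installed while an environment like this is active go into its own directory, so two programs can pin different versions of the same dependency without clashing.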