My previous article recommended that one should reconsider using Python in production. However, there’s one category of use case where Python is the dominant option for running production workloads. And that’s data analysis and machine learning.
Almost all bleeding-edge work in data analysis and machine learning, especially around LLMs, happens in Python.
So, here are some of my learnings on how to run Python in production.
Project quality
Package manager
Python has a fragmented ecosystem of package managers.
The only ones I can recommend are poetry and uv. After learning about uv on Hacker News, I decided to give it a try. uv is blazingly fast and manages the Python binary as well. It even supports migrations from other package managers. The only downside is that uv has not yet reached a stable release.
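As a rough sketch of the workflow, here is how a new project might be set up with uv (the project and dependency names are just examples):

```shell
# uv downloads and manages the Python interpreter itself
uv python install 3.12

# Scaffold a new project with a pyproject.toml and lockfile
uv init my-service
cd my-service

# Add a dependency; the lockfile is updated automatically
uv add requests

# Run commands inside the managed virtual environment
uv run python -c "import requests; print(requests.__version__)"
```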
Linters
Since Python is a dynamically typed language, it is very easy to write code that is either outright broken or breaks along certain code paths.
Linters are the first line of defense against such code. There is a plethora of linters available for Python, and none seems to be sufficient on its own. My current stack consists of autoflake, flake8, ruff, isort, and pylint.
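Running the whole stack in CI might look like the following (the src/ path is an assumption; each tool catches a different class of issues):

```shell
autoflake --check --recursive src/   # unused imports and variables
ruff check src/                      # fast, broad rule set
flake8 src/                          # style and simple logic errors
isort --check-only src/              # import ordering
pylint src/                          # deeper, slower static analysis
```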
Microsoft’s pyright might be good but, in my experience, produces too many false positives.
I haven’t yet found a good way to enforce type hints or type checking in Python.
Prevent secret leaks
Use gitguardian , gitleaks , or noseyparker to prevent secrets from being committed to the repository.
In my experience, GitGuardian is the best, but it is a closed-source tool, while Gitleaks and Noseyparker are open-source.
This advice isn’t specific to Python, but engineers who have spent a lot of time writing non-production code in Python notebooks are especially prone to this mistake.
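For the open-source option, a typical gitleaks invocation looks like this:

```shell
# Scan the full git history of the current repository for committed secrets
gitleaks detect --source . --verbose

# Or scan only staged changes, which is suitable for a pre-commit hook
gitleaks protect --staged
```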
Use git commit hooks
[Pre-commit](https://pre-commit.com/) hooks are good for enforcing code quality. This is not specific to Python either, but it is a good practice that’s useful when you are working with data engineers and data scientists who excel at data analysis more than at writing production-ready code.
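A minimal .pre-commit-config.yaml tying the two previous sections together might look like this (the rev pins are examples; pin whatever versions you actually use, then run pre-commit install once per clone):

```yaml
# .pre-commit-config.yaml — run a secret scanner and a linter before every commit
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.0
    hooks:
      - id: ruff
```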
Project maintainability
FastAPI
If you are writing a web service, then go for a combination of fastapi and gunicorn. In my benchmarking, everything else being equal, FastAPI+gunicorn has 3x the throughput of Flask+gunicorn.
Data classes
Use dataclasses, or the more advanced pydantic, for holding data, and use helper classes to group pure functions that operate on those data classes. I was planning to write more, but then I came across a recent, elaborate article on this topic.
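A small sketch of the pattern using only the standard library (the Order domain is an invented example):

```python
# A frozen dataclass holds data; a helper class groups pure functions over it
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Order:
    order_id: str
    quantity: int
    unit_price: float  # price per item

class OrderOps:
    """Pure functions over Order; no hidden state."""

    @staticmethod
    def total(order: Order) -> float:
        return order.quantity * order.unit_price

    @staticmethod
    def with_quantity(order: Order, quantity: int) -> Order:
        # Frozen dataclasses are immutable, so return a new instance
        return replace(order, quantity=quantity)
```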
Avoid async and multi-threading
Python’s GIL and async are a mess. Multi-threading in Python codebases is not well tested and is a source of bugs. It is best to avoid any concurrency in Python codebases. If you need performance, use multiple processes instead.
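The multi-process alternative can be as simple as this sketch, which fans CPU-bound work out to worker processes and sidesteps the GIL entirely (cpu_heavy is a stand-in for real work):

```python
# Fan CPU-bound work out to processes instead of threads
from multiprocessing import Pool

def cpu_heavy(n: int) -> int:
    # Stand-in for real CPU-bound work
    return sum(i * i for i in range(n))

def run_parallel(inputs: list[int], workers: int = 4) -> list[int]:
    # Each input is handled in a separate worker process
    with Pool(processes=workers) as pool:
        return pool.map(cpu_heavy, inputs)

if __name__ == "__main__":
    print(run_parallel([10, 100, 1000]))
```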
Dependency management
pip-audit can be useful for catching dependencies with known vulnerabilities. It has never flagged anything for me, primarily because I use dependabot for automatic dependency updates.
Further, deptry is a useful tool for finding unused dependencies in Python projects. The results do contain false positives, but it is a good starting point for cleaning up unused dependencies.
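Both tools are one-liners to run from the project root:

```shell
# Audit installed dependencies against known-vulnerability databases
pip-audit

# Find unused, missing, and misplaced dependencies declared in pyproject.toml
deptry .
```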
Keep code legally compliant
Python has a lot of libraries with licenses that could be troublesome for the codebase, e.g., GPL-licensed libraries that could force the whole codebase to become GPL. To avoid this, run licensecheck in CI.
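In CI, you want the check to actually fail the build; if I recall correctly, licensecheck supports a flag for a non-zero exit code on incompatible licenses:

```shell
# Fail the CI job if a dependency's license is incompatible with the project
licensecheck --zero
```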
Deployments
Docker
Use Docker for deployments. Even if you are using GPU-enabled VMs, use Docker and expose the GPU to the container with Docker's --gpus flag.
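For example (the image name is a placeholder; this requires the NVIDIA Container Toolkit on the host):

```shell
# Expose all host GPUs to the container
docker run --gpus all my-ml-service:latest

# Or expose a specific selection of GPUs
docker run --gpus '"device=0,1"' my-ml-service:latest
```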
Further, use multi-stage builds where you use poetry/uv to build the package and then copy the built package to a smaller base image on top of python:3.XX-slim.
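A rough sketch of such a multi-stage Dockerfile with uv (paths, versions, and the module name are examples, not a canonical recipe):

```dockerfile
# Stage 1: build the virtual environment with uv
FROM python:3.12-slim AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev

# Stage 2: copy only the built environment onto a clean slim image
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /app/.venv /app/.venv
COPY src/ src/
ENV PATH="/app/.venv/bin:$PATH"
CMD ["python", "-m", "my_service"]
```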
I have tried Python’s Alpine-based images (python:alpine), and for any non-trivial project they are very hard to use due to the difference between Debian’s glibc and Alpine’s musl: most prebuilt wheels target glibc and will not install on musl. So, I would recommend against using Alpine-based images for Python.
Note that while there have been attempts at making Python faster, like PyPy and Codon, they are really difficult to use for any non-trivial project. So, stick to the standard CPython interpreter.
Use CPU-only libraries for non-GPU deployments
PyTorch is huge. If you are going to be using pytorch in a non-GPU deployment, then use the CPU-only version. It is significantly smaller, with no loss of accuracy. You can configure this with multi-stage Docker builds, or follow uv's detailed explanation of how to do it using pyproject.toml.
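Following uv's PyTorch guide, the pyproject.toml approach pins torch to the CPU-only wheel index; a sketch:

```toml
# Force the CPU-only PyTorch wheels via a dedicated package index
[project]
dependencies = ["torch"]

[tool.uv.sources]
torch = [{ index = "pytorch-cpu" }]

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true
```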
Compile code during build
Compile code during Docker builds. This ensures that the .pyc files exist. It is especially useful for faster boot times during container auto-scaling.
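In a Dockerfile, this is a single step after copying the application code (/app is an assumed path):

```dockerfile
# Pre-compile application code so .pyc files ship in the image
RUN python -m compileall -q /app
```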
Download external dependencies at build time
Many libraries like spacy and transformers download large chunks of data on the first use. This not only slows down the container boot time but also makes the Docker build non-hermetic. This was exposed during a HuggingFace outage last year.
Further, prevent downloads during execution with additional library-specific guards.
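For HuggingFace libraries, a sketch of both halves of the advice in a Dockerfile (the model name is just an example):

```dockerfile
# Bake model weights into the image at build time
RUN python -c "from transformers import AutoTokenizer; \
    AutoTokenizer.from_pretrained('bert-base-uncased')"

# Guard against any network fetches at runtime
ENV HF_HUB_OFFLINE=1 TRANSFORMERS_OFFLINE=1
```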
Alternatively, you can place these models in cloud/VM storage (PVC on Kubernetes) and mount them as Docker volumes during runtime. For larger models, usually, this is the only choice as building and deploying 5 GiB+ docker images is noticeably slower.
Run Docker containers as a non-root user
Python Docker images have a much larger attack surface than my favorite scratch image for Go deployments.
One should run Python-based containers as a non-root user to reduce the attack surface.
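This takes two lines at the end of a Dockerfile (the username is arbitrary):

```dockerfile
# Create an unprivileged user and drop root before starting the service
RUN useradd --create-home --shell /usr/sbin/nologin appuser
USER appuser
```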