Distributions and packaging

Shipping your creations

Author

Karsten Naert

Published

November 15, 2025

Why Packaging Matters

The dependency problem

You’ve built something brilliant. It runs perfectly on your machine. Then you share it with a colleague, and… nothing works.

This is the “it works on my machine” syndrome, and it’s caused by differences in:

  • Python versions
  • Installed packages and their versions
  • Operating system libraries
  • Environment variables

In data science and software engineering, reproducibility matters. Your analysis should produce the same results whether you run it today or six months from now, on your laptop or on a cloud server.

The Python packaging landscape

The Python packaging ecosystem has been… complicated. But there’s good news: dramatic improvements happened in 2023-2024.

Some key players:

  • PyPI: The Python Package Index, where packages live
  • uv: Modern, fast tool (we’ll focus on this)
  • pip: The traditional package installer
  • conda: Package and environment manager for scientific computing
  • Poetry: All-in-one dependency management tool

Terminology clarification

Before we dive in, let’s get the vocabulary straight:

  • Module: A Python object of type module, usually created from a file
  • Package: A Python object of type module with a __path__ attribute, usually created from a directory with an __init__.py file
  • Distribution: A bundled version of a package ready for installation (e.g., a .whl file). This is what this lecture is about.
  • Source distribution (sdist): Distribution containing source code (.tar.gz or .zip)
  • Built distribution (wheel): Pre-built binary distribution (.whl)
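
A quick way to see the first two definitions in action, using the standard library's json package (json is created from a directory with an __init__.py; json.decoder from a single file):

import types
import json           # a package: created from a directory with __init__.py
import json.decoder   # a plain module: created from a single .py file

print(isinstance(json, types.ModuleType))          # True
print(isinstance(json.decoder, types.ModuleType))  # True: both are module objects
print(hasattr(json, "__path__"))                   # True: packages have __path__
print(hasattr(json.decoder, "__path__"))           # False: plain modules don't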

Python Packaging Fundamentals

PyPI: The Python Package Index

PyPI (pypi.org) is the central repository for Python packages. Think of it as the app store for Python code.

When you browse a package page, you’ll find:

  • Documentation and README
  • Available versions
  • Dependencies
  • Download statistics
  • Project links (homepage, repository, issue tracker)
Warning

Distributions contain arbitrary code you are going to run on your machine! Verifying that the project looks trustworthy is not a luxury but a must!

Visit pypi.org and pypistats.org and explore the pages for requests, numpy, and fastapi. Look at:

  1. How many versions are available?
  2. What Python versions do they support?
  3. What are their dependencies?
  4. How many downloads per month?

It’s also interesting to look at earlier releases. Sometimes releases get yanked for security reasons, and PyPI is one of the places where you can learn about this.

This will help you evaluate packages before adding them to your project.

Distribution formats

Source distributions (sdist)

A source distribution is a .tar.gz or .zip file containing:

  • Source code
  • pyproject.toml (configuration)
  • README, LICENSE, and other metadata

When you install from a source distribution, Python must:

  1. Download the source
  2. Install build tools
  3. Compile any extensions
  4. Create the package

This is portable but slower, and requires build tools on the target machine.

Wheels: The modern binary format

Wheels (.whl files) are pre-built distributions. They’re just ZIP files with a specific structure.

Benefits:

  • Fast installation: Just unzip and place files
  • No compiler needed: Binaries are pre-compiled
  • Consistent: Same build used everywhere

Wheels come in two flavors:

  • Pure Python wheels: Work on any platform (*-py3-none-any.whl)
  • Platform-specific wheels: Compiled for specific OS/architecture

The wheel filename tells you everything:

numpy-1.24.0-cp311-cp311-win_amd64.whl
  │      │     │     │      └─ Platform (Windows, 64-bit)
  │      │     │     └─ ABI (CPython 3.11)
  │      │     └─ Python version (3.11)
  │      └─ Package version
  └─ Package name
Exercise

Download a wheel file from PyPI (pick any package), rename it to .zip, and extract it. Explore the contents to understand the structure.
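
If you prefer to do this from Python instead of renaming files, here is a short sketch using the standard library (substitute whichever wheel you actually downloaded):

import zipfile

# Hypothetical filename: use the wheel you downloaded
wheel = "requests-2.31.0-py3-none-any.whl"

with zipfile.ZipFile(wheel) as whl:
    for name in whl.namelist():
        print(name)   # package files plus a *.dist-info/ metadata directory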

The build system architecture

Modern Python packaging separates concerns:

  • Build frontends (pip, uv, build): Orchestrate the process
  • Build backends (setuptools, hatchling, flit): Actually create distributions

Think of it this way:

  • Frontend = General contractor who manages the project
  • Backend = Construction crew who does the actual work

This separation, defined by PEP 517/518, means you can mix and match tools. The frontend reads your pyproject.toml, sets up an isolated environment, installs the backend, and asks it to build your package.

The [build-system] section in pyproject.toml specifies which backend to use:

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
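
Concretely, a build backend is just an importable Python object that exposes a couple of standardized functions (PEP 517 calls them hooks). A rough sketch of what a frontend conceptually does with the configuration above, assuming hatchling is installed and you run this from the project root:

from importlib import import_module

# Import whatever [build-system].build-backend points at
backend = import_module("hatchling.build")

# Call the standardized PEP 517 hooks; each returns the created filename
wheel_name = backend.build_wheel("dist")
sdist_name = backend.build_sdist("dist")

In practice the frontend does this inside the isolated environment it created, so you never call these hooks yourself.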

Virtual environments: Isolation is key

Virtual environments solve the conflict problem. Each project gets its own isolated Python + packages directory.

Structure:

.venv/
├── Scripts/          # Executables (Windows)
│   ├── python.exe
│   ├── pip.exe
│   └── activate.bat
├── Lib/
│   └── site-packages/  # Installed packages
└── pyvenv.cfg        # Configuration

When you “activate” a virtual environment, it modifies your PATH so that python points to the environment’s Python.

Modern tools like uv create and manage virtual environments automatically, so you rarely need to think about this, especially when they are well integrated with your editor.
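
To see the PATH effect for yourself using only the standard library (uv does the equivalent behind the scenes):

python -m venv .venv
.venv\Scripts\activate
where python

After activation, where python lists .venv\Scripts\python.exe first, so python and pip now operate inside the environment.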

Security considerations

Dependencies can have vulnerabilities or even be malicious. The fact that packages are open source means you can look at their source; it doesn’t mean someone actually has looked at it!

Best practices:

  • Use lockfiles: uv.lock includes hashes
  • Vet dependencies: Check before adding to your project
  • Keep updated: Security patches matter
  • Use audit tools: pip-audit (works with uv too)

Example:

pip install pip-audit
pip-audit
Warning

Supply chain security in the Python ecosystem is still rather limited. Be extra cautious about what you install, and double-check package names: there are many reported incidents of typosquatting, where attackers upload malicious Python packages under names that correspond to frequently made typos.

Modern Packaging with uv

What is uv?

uv is from Astral (the team behind Ruff), written in Rust. It’s an all-in-one tool:

  • Package installer (replaces pip)
  • Environment manager (replaces venv)
  • Python version manager (replaces pyenv)

It’s 10-100x faster than pip while being fully compatible with the existing ecosystem.

Why we’re teaching it first: It’s the future of Python packaging.

Tip

Fun fact: The name “uv” doesn’t stand for anything! It’s just short, fast to type, and suggests speed (like UV light at the fast end of the spectrum). The logo features a lightning bolt ⚡

We assume you already have uv installed. There are different ways of using uv, but for our purposes we focus on the project workflow.

Understanding uv’s project workflow

Three key files:

  • pyproject.toml: Your project configuration (dependencies, metadata)
  • uv.lock: Exact versions of everything (commit this!)
  • .python-version: Pinned Python version

Typical workflow:

  1. uv init to start a new project
  2. uv add package_name as you need dependencies
  3. uv sync to install everything
  4. Commit pyproject.toml, uv.lock, and .python-version to Git
Exercise

Create a small project:

  1. Create a new directory and run uv init
  2. Add requests and rich as dependencies
  3. Examine the generated pyproject.toml and uv.lock
  4. Create a simple script that uses these packages
  5. Run it with uv run python script.py
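
One possible script for steps 4 and 5, assuming you added requests and rich in step 2:

# script.py
import requests
from rich.console import Console

console = Console()
response = requests.get("https://httpbin.org/get")
console.print(f"[bold green]Status code:[/bold green] {response.status_code}")

Then uv run python script.py executes it inside the project’s environment.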

Development dependencies

Some packages are only needed during development (testing, linting):

uv add --dev pytest ruff mypy

In pyproject.toml, these appear in [dependency-groups]:

[dependency-groups]
dev = ["pytest>=7.0", "ruff>=0.1.0", "mypy>=1.0"]

Install everything including dev dependencies:

uv sync --dev

Run dev tools:

uv run pytest
uv run ruff check

uv and requirements.txt

uv maintains backward compatibility with traditional workflows. For instance, rather than a uv.lock file, projects will sometimes have a requirements.txt file that tells pip which dependencies to install. You can generate such a file with uv as follows:

uv pip compile pyproject.toml -o requirements.txt

We can install the dependencies from the file as follows:

uv pip install -r requirements.txt

This makes migration easy—you can adopt uv incrementally.

Exercise

Take an existing project with requirements.txt:

  1. Create a virtual environment with uv venv
  2. Install dependencies with uv pip install -r requirements.txt
  3. Convert to modern format by creating pyproject.toml manually
  4. Run uv lock to create uv.lock
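
For step 3, a minimal pyproject.toml could look like this, assuming the hypothetical requirements.txt listed requests and rich:

[project]
name = "legacy-project"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "requests>=2.28.0",
    "rich>=13.0",
]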

Building and Distributing Packages

Package structure

The modern best practice is the src/ layout:

myproject/
├── src/
│   └── mypackage/
│       ├── __init__.py
│       └── module.py
├── tests/
│   └── test_module.py
├── pyproject.toml
├── README.md
├── LICENSE
└── .gitignore

Why src/? If you add myproject/src/ to your Python path, Python can only find your package. If you add myproject/ instead, it might also discover tests/ even though that isn’t really a package.

Essential files:

  • pyproject.toml: Configuration
  • README.md: Documentation
  • LICENSE: Legal requirements
  • .gitignore: Version control hygiene

In many distributions you will also find a setup.py file. This dates from before we had a standardized pyproject.toml and contains instructions for setuptools, a specific build backend.

pyproject.toml: The configuration file

This is the modern standard for Python projects (PEP 517/518/621):

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "mypackage"
version = "0.1.0"
description = "A short description"
authors = [
    {name = "Albert Einstein", email = "albert.einstein@example.com"}
]
dependencies = [
    "requests>=2.28.0",
]
requires-python = ">=3.10"

[dependency-groups]
dev = [
    "pytest>=7.0",
    "ruff>=0.1.0",
]

[project.scripts]
mytool = "mypackage.cli:main"

Key sections:

  • [build-system]: Specifies the build backend
  • [project]: Name, version, dependencies, authors
  • [dependency-groups]: Development dependencies (uv standard)
  • [project.scripts]: Command-line entry points
  • [tool.*]: Tool-specific configuration (tools like ruff, pyright, …)

Build backends: Under the hood

The build backend transforms your source code into distributions.

Some popular choices you will encounter in the wild:

hatchling (recommended for new projects):

  • Modern and simple
  • Sensible defaults
  • Minimal configuration

setuptools (traditional):

  • Most compatible
  • Most packages still use it
  • Very flexible but more complex

flit-core (minimal):

  • For pure-Python packages
  • Extremely simple

meson-python:

  • For packages with C/C++ extensions

Building your package

With uv, building is simple:

uv build

This creates a dist/ directory with:

  • A source distribution (*.tar.gz)
  • A wheel (*.whl)

Behind the scenes, uv:

  1. Reads your [build-system] configuration
  2. Creates an isolated environment
  3. Installs the build requirements
  4. Calls the backend’s build functions
  5. Generates the distributions

Inspect the wheel:

python -m zipfile -l dist\mypackage-0.1.0-py3-none-any.whl

Exercise

Create a minimal package:

  1. Create the src/ layout structure
  2. Write a simple pyproject.toml with hatchling
  3. Add a basic Python module with a function
  4. Build with uv build
  5. Inspect the generated wheel file
  6. Install the generated wheel in a separate virtual environment.
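
For step 6, one way to test the wheel in a clean environment (the wheel filename will match your package’s name and version):

uv venv testenv
testenv\Scripts\activate
uv pip install dist\mypackage-0.1.0-py3-none-any.whl
python -c "import mypackage; print(mypackage.__file__)"

The printed path should point into the new environment’s site-packages, not your source tree.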

Publishing to PyPI

Prerequisites:

  • PyPI account at pypi.org
  • API token (Settings → API tokens)

For practice, use TestPyPI first: test.pypi.org

Install twine (the upload tool):

uv pip install twine

Check your distributions:

twine check dist/*

Upload to TestPyPI:

twine upload --repository testpypi dist/*

Upload to PyPI:

twine upload dist/*
Warning

Use API tokens, not passwords! Set up a token in your PyPI account settings and use it when prompted for credentials.

Best practices:

  • Follow semantic versioning (MAJOR.MINOR.PATCH)
  • Tag releases in Git
  • Write changelogs
  • Test on TestPyPI first

Entry points and CLI scripts

Make your package executable from the command line:

[project.scripts]
mytool = "mypackage.cli:main"

This creates a mytool command that calls the main() function in mypackage.cli.

Example module:

# src/mypackage/cli.py
def main():
    print("Hello from mytool!")

if __name__ == "__main__":
    main()

After installing the package, users can run:

mytool

Version management

Static version (hard-coded):

[project]
version = "0.1.0"

Dynamic version (from file):

[project]
dynamic = ["version"]

[tool.hatch.version]
path = "src/mypackage/__init__.py"

Then in __init__.py:

__version__ = "0.1.0"

Best practice: Single source of truth for version numbers.
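
With a single source of truth in place, other code can read the version from the installed metadata instead of hard-coding it again:

from importlib.metadata import version

# Reads the version recorded in the installed distribution's metadata
print(version("mypackage"))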

Application Deployment Strategies

The deployment challenge

Distributing libraries (for other developers) is different from deploying applications (for end users).

Goals for application deployment:

  • Reproducible execution
  • Standalone (users don’t manage dependencies)
  • Easy to run

Options range from simple (lockfiles) to complex (containers, compiled executables).

Simple: Lockfiles and virtual environments

Best for: Servers, cloud deployments, development teams

Ship:

  • uv.lock or requirements.txt
  • pyproject.toml
  • Your code

Deploy on target machine:

uv sync

or

uv pip install -r requirements.txt

Fast, simple, reproducible.

Freezing applications with PyInstaller

Turn your Python app into a standalone executable.

Install:

uv pip install pyinstaller

Basic usage:

pyinstaller script.py

Single-file executable:

pyinstaller --onefile script.py

Output appears in dist/ directory.

Limitations:

  • Large file sizes
  • Antivirus false positives
  • Platform-specific (build on Windows for Windows)
Exercise

  1. Create a simple script with a dependency (e.g., using requests)
  2. Use uv to set up environment and install dependencies
  3. Freeze with PyInstaller: pyinstaller --onefile script.py
  4. Test the executable from the dist/ folder

Containerization with Docker

Containers solve “works on my machine” completely. They package your app with its entire runtime environment.

Example Dockerfile with uv:

FROM python:3.12-slim

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

# Set working directory
WORKDIR /app

# Copy dependency files
COPY pyproject.toml uv.lock ./

# Install dependencies only; the project source isn't copied yet,
# so tell uv not to try to install the project itself
RUN uv sync --frozen --no-dev --no-install-project

# Copy application code
COPY . .

# Run application
CMD ["uv", "run", "python", "-m", "myapp"]

Multi-stage build for smaller images:

# Build stage
FROM python:3.12-slim AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev --no-install-project

# Runtime stage
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /app/.venv /app/.venv
COPY . .
ENV PATH="/app/.venv/bin:$PATH"
CMD ["python", "-m", "myapp"]

Build and run:

docker build -t myapp .
docker run myapp
Exercise
  1. Create a simple Flask or FastAPI application
  2. Write a Dockerfile using uv
  3. Build the image
  4. Run it and test that it works

(This requires Docker Desktop installed on Windows)
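
A minimal FastAPI app for step 1 might look like this (add fastapi and uvicorn to your project first):

# app.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    # Trivial endpoint to verify the container is serving requests
    return {"status": "ok"}

Inside the container, you would start it with something like uv run uvicorn app:app --host 0.0.0.0 --port 8000.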

Platform-as-a-Service (PaaS)

Services like Heroku, Railway, Render, and Fly.io automatically detect Python projects.

Typically they:

  1. Detect requirements.txt or pyproject.toml
  2. Install dependencies
  3. Run your app

Configuration usually involves:

  • Procfile (specifies how to run)
  • Environment variables
  • buildpack settings

Modern platforms support uv—just specify it in your buildpack configuration.

Serverless deployment

AWS Lambda, Google Cloud Functions, Azure Functions let you run code without managing servers.

Considerations:

  • Cold start latency
  • Execution time limits (typically 15 minutes max)
  • Package size limits

Packaging for Lambda:

  1. Create environment with uv
  2. Zip the contents
  3. Upload to Lambda

Or use container images:

FROM public.ecr.aws/lambda/python:3.12
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev
COPY . .
CMD ["myapp.handler"]

Best for: Event-driven workloads, sporadic usage

Alternative Tools and Workflows

Some other tools and ecosystems are worth knowing about.

The pip ecosystem

pip: The standard package installer, built into Python.

pip install package
pip uninstall package
pip list
pip show package
pip freeze > requirements.txt

pip improved dramatically in 2020 with a new dependency resolver, but it is still slower than uv.

pip-tools: Adds reproducibility to pip.

pip install pip-tools

Workflow:

  1. Write requirements.in with loose constraints
  2. Run pip-compile to generate requirements.txt with exact versions
  3. Use pip-sync to install exactly what’s in requirements.txt
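
For example, with a hypothetical requirements.in containing only loose constraints:

# requirements.in
requests>=2.28
rich

Compile and sync:

pip-compile requirements.in   # writes requirements.txt with exact pinned versions
pip-sync requirements.txt     # makes the environment match it exactly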

venv: Standard library for virtual environments.

python -m venv myenv
myenv\Scripts\activate

Works but less convenient than uv.

When to use: Legacy systems, maximum compatibility, minimal dependencies.

Poetry: The all-in-one alternative

Poetry handles dependency management, building, and publishing.

Install:

pip install poetry

Workflow:

poetry new myproject
cd myproject
poetry add requests
poetry install
poetry build
poetry publish

Pros:

  • Excellent user experience
  • Mature ecosystem
  • Popular in web development

Cons:

  • Slower than uv
  • Occasional dependency resolution issues
  • Opinionated (can’t easily mix with other tools)

poetry.lock ensures reproducible installs, similar to uv.lock.

PDM: Standards-compliant alternative

Similar workflow to Poetry but more flexible:

pdm init
pdm add requests
pdm install

PDM supports PEP 582 __pypackages__ (an experimental local package directory). However, PEP 582 has since been rejected, and PDM itself now recommends using virtual environments.

Pros:

  • Fast
  • Standards-compliant
  • Flexible

Cons:

  • Smaller community than Poetry

The Conda Ecosystem

conda is both a package manager AND an environment manager. Unlike pip, which only handles Python packages, conda can manage:

  • Python packages
  • System libraries (C/C++, CUDA)
  • R packages
  • Julia packages

It was born to solve the “NumPy/SciPy installation nightmare” of the pre-wheel era and gained massive popularity especially in the data science community.

Key terminology:

  • Anaconda: Full distribution (3GB+, has commercial restrictions)
  • Miniconda: Minimal installer (just conda + Python, also has commercial restrictions)
  • Miniforge: Community-driven, uses conda-forge by default (recommended)
  • conda-forge: Community-maintained package channel

For today’s purposes we are just going to recommend not using conda as a software developer. I have many concerns about misalignment between Anaconda, the company behind this ecosystem, and what developers need. Furthermore, there are many licensing issues, and Anaconda is currently suing companies like Intel and Dell for misusing their software. The current state of affairs feels like surfing to the New York Times website, reading a few articles, and later getting sued because you should have paid for a subscription. Exactly what is and isn’t allowed is complicated enough that consulting a lawyer is, in my opinion, not a luxury here.

If you’re interested in using the conda ecosystem anyhow, perhaps also have a look at pixi. It comes from a company started by Wolf Vollprecht, the person who made conda fast(er) despite not even working at Anaconda!

My opinion

I have never used Poetry or PDM myself, so here I will just parrot what I hear from other people, but I don’t personally see them being used much in the future now that uv is out there.

For conda it’s a bit different, since there is still a niche of packages that require conda for a fluent install experience, but that niche seems to get smaller with each passing year. Furthermore, Anaconda has been focusing on things like getting “Python in Excel” by running things in the cloud, while its whole raison d’être is being hollowed out.

Advanced Topics

Version constraints

The Python Packaging Authority maintains a page on valid version specifiers.

Specify dependencies with constraint operators:

  • ==1.2.3: Exact version (rarely use this)
  • >=1.2.0: Minimum version
  • <2.0.0: Upper bound
  • ~=1.2: Compatible release (≥1.2, <2.0)
  • !=1.2.5: Exclude specific version
  • >=1.2,<2.0: Multiple constraints

Example:

dependencies = [
    "requests>=2.28.0,<3.0.0",
    "numpy~=1.24",
]
Tip

Libraries should be permissive: Use >= constraints to maximize compatibility.

Applications should be strict: Use lock files to pin exact versions.

Debugging installation issues

Sometimes installing a package fails. One common reason is that the package is only available for Unix-like systems while you’re on Windows. Use verbose output to see what is going on:

uv pip install -v package_name

Common errors:

  • “Package not found”: Check spelling, verify it exists on PyPI
  • “No matching distribution”: Platform issue, may need build tools
  • Version conflicts: Read the error message carefully

Working with private packages

Options:

  • Private PyPI server (devpi, Artifactory)
  • Cloud services (AWS CodeArtifact, Google Artifact Registry)

Configure uv:

uv pip install --index-url https://pypi.company.com/simple package

Or in pyproject.toml:

[[tool.uv.index]]
name = "private"
url = "https://pypi.company.com/simple"

Authentication via API tokens in environment variables or config files.

Editable installs

Install a package in development mode:

uv pip install -e .

The -e flag creates a link to your source directory instead of copying files. Changes to your code are immediately reflected without reinstalling.

Uses:

  • Active development
  • Testing changes immediately
  • Working on multiple related packages

Modern approach uses PEP 660 (better than the legacy method).
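
A quick way to confirm an editable install worked, assuming the src/ layout from earlier:

uv pip install -e .
python -c "import mypackage; print(mypackage.__file__)"

The printed path points into your src/ directory rather than site-packages, so edits take effect immediately.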

Cross-platform considerations

uv.lock works across Windows, macOS, and Linux.

Platform-specific dependencies use markers:

dependencies = [
    "pywin32>=300; sys_platform == 'win32'",
    "python-magic>=0.4.27; sys_platform != 'win32'",
]

Test on multiple platforms using CI/CD (GitHub Actions with matrix builds).

Path handling: Use pathlib, not string manipulation:

from pathlib import Path

config = Path("config") / "settings.json"  # Works everywhere

Performance tips

uv is already fast, but here are some tips:

  • Caching: uv caches downloads globally (shared across projects)
  • Cache location: %LOCALAPPDATA%\uv\cache (Windows)
  • Clear cache: uv cache clean (rarely needed)
  • Parallel operations: uv downloads and installs in parallel automatically

Network optimization: Use mirrors if PyPI is slow in your region.
