The Standard Libraries

Batteries included

Author

Karsten Naert

Published

November 15, 2025

Python’s batteries included philosophy

Python comes with batteries included. This means a large number of standard libraries are provided to make your life easier.

You can find a complete overview on the official documentation page. It’s definitely worth browsing through this to get an idea of everything Python has to offer.

Today we’ll examine some of the most commonly used libraries.

sys and os

import sys
import os
  • sys contains information about the Python installation itself
  • os is for interaction with the operating system. These functions can vary significantly between Windows/Linux/macOS/…
sys.version
'3.13.5 (main, Jul  8 2025, 21:00:57) [Clang 20.1.4 ]'
sys.path
['/home/knaert/.local/share/uv/python/cpython-3.13.5-linux-x86_64-gnu/lib/python313.zip',
 '/home/knaert/.local/share/uv/python/cpython-3.13.5-linux-x86_64-gnu/lib/python3.13',
 '/home/knaert/.local/share/uv/python/cpython-3.13.5-linux-x86_64-gnu/lib/python3.13/lib-dynload',
 '',
 '/var/www/pydev2/.venv/lib/python3.13/site-packages']
sys.implementation
namespace(name='cpython',
          cache_tag='cpython-313',
          version=sys.version_info(major=3, minor=13, micro=5, releaselevel='final', serial=0),
          hexversion=51185136,
          _multiarch='x86_64-linux-gnu')
sys.version_info
sys.version_info(major=3, minor=13, micro=5, releaselevel='final', serial=0)

sys.modules is a dict containing all modules that Python has loaded. (See the lecture on modules and imports.)

# sys.modules is quite large, let's just see how many modules are loaded
len(sys.modules)
1348
os.getcwd()
'/var/www/pydev2/advanced-python'

The os module provides lots of functionality for working with paths and files, e.g. os.path.join. You’ll often see this in recommendations on the internet (Stack Overflow, etc.). But there’s a more modern alternative: pathlib, and for most applications it’s better to use that instead.

Environment variables are also important for software developers.

These are key-value pairs managed by the operating system. In Python they’re accessible via os.environ. Note that these are read when a process starts and passed to all subprocesses. So if you change an environment variable externally after Python has started, you won’t see any difference. However, if you modify an environment variable in Python via os.environ and then start a subprocess, you will see this change.

if sys.platform == "win32":
    # Windows uses USERPROFILE for the home path and USERNAME for the user name
    home_dir = os.getenv('USERPROFILE')
    user_name = os.getenv('USERNAME')
else:
    # POSIX systems (Linux, macOS) generally use HOME and USER
    home_dir = os.getenv('HOME')
    user_name = os.getenv('USER')

print(f"Home Directory: {home_dir}")
print(f"User Name: {user_name}")
Home Directory: /home/knaert
User Name: knaert
sys.executable
'/var/www/pydev2/.venv/bin/python3'
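The inheritance of environment variables described above can be checked with a small sketch. It sets a variable in the current process and then starts a child Python process that reads it. The variable name DEMO_GREETING is made up for this illustration.

```python
import os
import subprocess
import sys

# DEMO_GREETING is a hypothetical variable name, used only for this sketch
os.environ["DEMO_GREETING"] = "hello from the parent"

# The child process receives a copy of the parent's environment when it starts
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('DEMO_GREETING'))"],
    capture_output=True,
    text=True,
    check=True,
)
child_output = result.stdout.strip()
print(child_output)  # hello from the parent
```

Using sys.executable makes sure the child runs with the same interpreter as the parent.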
Exercise
  1. Modify an environment variable (or create a new one) on your computer and start a second Python session to view this environment variable.

  2. Call Python without activating a conda environment. Check the environment variables with a mini script:

    python -c "import os; print(os.getenv('PATH'))"

    Then activate conda and check the environment variables again.

pathlib

import pathlib

The pathlib library handles all kinds of operations on paths to files and folders, on Linux, macOS and Windows alike. It’s a modern alternative to the older os.path and glob modules.

Much documentation still refers to these older modules, which is why you still see them frequently in code.

  • It’s highly recommended to use pathlib wherever possible instead of os.path
  • The lesser-known glob module is for finding all files matching a certain pattern, e.g. all files with extension “.txt”. pathlib covers this too, with Path.glob

In pathlib you work with path objects in an object-oriented way. This allows you to do things like path.parent and path.with_suffix.
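As a small illustration of this object-oriented style, the sketch below takes a made-up path apart and builds variations of it. It uses PurePosixPath so that the output is the same on every operating system and nothing is read from disk. The path itself is only an example.

```python
from pathlib import PurePosixPath

# A made-up path for illustration; PurePosixPath never touches the file system
report = PurePosixPath("/data/reports/2025/summary.tar.gz")

print(report.parent)                  # /data/reports/2025
print(report.name)                    # summary.tar.gz
print(report.suffix)                  # .gz
print(report.with_suffix(".zip"))     # /data/reports/2025/summary.tar.zip
print(report.with_name("notes.txt"))  # /data/reports/2025/notes.txt
```

Note that with_suffix only replaces the last suffix, so .tar is kept.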

The documentation has a handy overview of the differences between os and pathlib.

for path in pathlib.Path('.').iterdir():
    print(path)
    # print(path.absolute())
    # print(path.suffix)
    # print(path.with_suffix('.png'))
    # print(path, path.is_file())
11-modules-imports.html
index.qmd
.jupyter_cache
12-linters-types.qmd
07-oop-inheritance.qmd
05-functions.qmd
11-modules-imports.qmd
index.html
06-standard-libraries.qmd
13-extra-libraries.qmd
01-control-structures.quarto_ipynb
12-linters-types.html
stderr_colored.png
01-control-structures.qmd
10-pandas-matplotlib.qmd
08-oop-advanced.quarto_ipynb
08-oop-advanced.qmd
04-loose-ends.qmd
02-bytes-files.qmd
05-functions.quarto_ipynb
03-oop-basics.quarto_ipynb
07-oop-inheritance.quarto_ipynb
05-functions.html
07-oop-inheritance.html
09-scientific-stack.qmd
02-bytes-files.quarto_ipynb
06-standard-libraries.quarto_ipynb
03-oop-basics.qmd
for path in pathlib.Path('.').glob('*.qmd'):
    print(path)
index.qmd
12-linters-types.qmd
07-oop-inheritance.qmd
05-functions.qmd
11-modules-imports.qmd
06-standard-libraries.qmd
13-extra-libraries.qmd
01-control-structures.qmd
10-pandas-matplotlib.qmd
08-oop-advanced.qmd
04-loose-ends.qmd
02-bytes-files.qmd
09-scientific-stack.qmd
03-oop-basics.qmd

The way to combine paths in pathlib is with /:

p = pathlib.Path('.').absolute() / 'example.txt'
print(p)
/var/www/pydev2/advanced-python/example.txt

How have the authors of the pathlib module created this “magic” functionality of the / operation?
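One plausible answer is operator overloading: Python translates a / b into a call to the special method __truediv__. The class below is a deliberately simplified, hypothetical stand-in (SimplePath is not part of pathlib) that only shows the mechanism, not how pathlib is actually implemented.

```python
class SimplePath:
    """Toy path class, only meant to illustrate overloading the / operator."""

    def __init__(self, *parts):
        self.parts = tuple(str(part) for part in parts)

    def __truediv__(self, other):
        # Called for: SimplePath(...) / other
        return SimplePath(*self.parts, other)

    def __rtruediv__(self, other):
        # Called for: other / SimplePath(...), when other does not support /
        return SimplePath(other, *self.parts)

    def __str__(self):
        return "/".join(self.parts)


combined = SimplePath("home") / "user" / "notes.txt"
print(combined)  # home/user/notes.txt

reversed_order = "root" / SimplePath("etc")
print(reversed_order)  # root/etc
```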

Many functions will directly accept a pathlib object. For example, this is the case for the open function to open a file.

If you’re dealing with a function that only accepts strings, you can use str(p) or p.as_posix() depending on what works for your application.

print(str(p))
print(p.as_posix())
/var/www/pydev2/advanced-python/example.txt
/var/www/pydev2/advanced-python/example.txt

The simplest way to create a file is with p.touch():

p.touch()
p_stat = p.stat()
print(p_stat)
os.stat_result(st_mode=33204, st_ino=552944, st_dev=2049, st_nlink=1, st_uid=1001, st_gid=33, st_size=0, st_atime=1759217428, st_mtime=1759217428, st_ctime=1759217428)

These timestamps are given in seconds since the epoch, so it takes an extra step to convert them to a readable form:

import time
time.ctime(p_stat.st_atime)
'Tue Sep 30 07:30:28 2025'
import datetime
datetime.datetime.fromtimestamp(p_stat.st_ctime)
datetime.datetime(2025, 9, 30, 7, 30, 28, 867334)

See this Stack Overflow question for some explanation:

But also pay attention to the documentation: https://docs.python.org/3.12/library/os.html#os.stat_result.st_ctime

st_atime
    Time of most recent access expressed in seconds.

st_mtime
    Time of most recent content modification expressed in seconds.

st_ctime
    Time of most recent metadata change expressed in seconds.
    
    Changed in version 3.12: st_ctime is deprecated on Windows. 
    Use st_birthtime for the file creation time.

You can even change these timestamps manually with os.utime:

hundred_days = 3600 * 24 * 100  # 100 days, expressed in seconds
os.utime(p, times=(p_stat.st_atime - hundred_days, p_stat.st_mtime - hundred_days))
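To check that such a change really took effect, you can compare the result of stat() before and after. The following is a minimal sketch that works in a temporary directory so that no existing files are touched. The file name and the shift of one day are arbitrary choices.

```python
import os
import pathlib
import tempfile

ONE_DAY = 24 * 3600  # seconds

with tempfile.TemporaryDirectory() as tmp:
    demo_file = pathlib.Path(tmp) / "example.txt"  # arbitrary file name
    demo_file.touch()

    before = demo_file.stat()
    # Move both the access time and the modification time one day back
    os.utime(demo_file, times=(before.st_atime - ONE_DAY, before.st_mtime - ONE_DAY))
    after = demo_file.stat()

    shift = before.st_mtime - after.st_mtime

print(f"Modification time moved back by about {round(shift)} seconds")
```

The exact value can differ slightly from 86400, depending on the timestamp resolution of the file system.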

re

import re

re stands for “regular expression”. Regular expressions, or regexes, are a small pattern language that allows you to search for very detailed patterns in text.

Regular expressions exist everywhere, not just in Python. Virtually every programming language supports them, and text editors often do too.

Sometimes there are differences between various regex implementations, but the things we’ll cover here are fairly standard.

success = re.search('the pattern', 'This is a text in which I search for the pattern')
print(success)
<re.Match object; span=(37, 48), match='the pattern'>

A search with re always has a pattern and a text in which you search for the pattern.

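The result is a match object or None if no match was found. The span of the match object gives the start and end position of the match, which you can use to slice the original text:

'This is a text in which I search for the pattern'[37:48]
'the pattern'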

You can work further with a match object:

if success is not None:
    print(success.string)
    print(f"Found '{success.group()}' at position {success.start()}-{success.end()}")
This is a text in which I search for the pattern
Found 'the pattern' at position 37-48

If no match is found, the result is None:

failed = re.search('winner', 'Who won the lottery?')
print(failed)
None

The re module has several other handy functions:

re.match finds a match but only at the beginning of the string:

result = re.match('road', 'Where is the road when the road is gone?')
print(result)
None
result = re.match('Where', 'Where is the road when the road is gone?')
print(result)
<re.Match object; span=(0, 5), match='Where'>

re.findall finds all places where the pattern occurs. It only returns the matched text, so if you want to know where each match occurs, this isn’t very practical:

result = re.findall('road', 'Where is the road when the road is gone?')
print(result)
['road', 'road']

re.finditer iterates over all matches and returns a complete match object each time:

for r in re.finditer('road', 'Where is the road when the road is gone?'):
    print(f"Found '{r.group()}' at position {r.start()}-{r.end()}")
Found 'road' at position 13-17
Found 'road' at position 27-31

Regex patterns

Regular expressions are extremely versatile and allow you to search for very varied patterns. Some examples:

  • blabla$ matches blabla only at the end of a string
  • ^blabla matches blabla only at the beginning of a string
  • A|B matches A or B
  • x* matches the empty string, x, xx, xxx, …
  • x+ matches 1 or more copies of x
  • x? matches 0 or 1 copies of x
  • \d matches a digit

Check the documentation for (much) more explanation.

for m in re.finditer("this", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'this' at 0-4
Found 'this' at 18-22
for m in re.finditer("^this", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'this' at 0-4
for m in re.finditer("is|a", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'is' at 2-4
Found 'is' at 5-7
Found 'a' at 8-9
Found 'is' at 20-22
for m in re.finditer("this (is|system)", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'this is' at 0-7
Found 'this system' at 18-29
for m in re.finditer("e+", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'e' at 11-12
Found 'e' at 27-28
for m in re.finditer("syste*m", "systm system systeeeeeem"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'systm' at 0-5
Found 'system' at 6-12
Found 'systeeeeeem' at 13-24
for m in re.finditer("systee?m", "systm system systeem systeeem systeeeeeem"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'system' at 6-12
Found 'systeem' at 13-20
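The two patterns from the list that are not demonstrated above, \d and $, can be tried out in the same way. A short sketch with made-up example strings:

```python
import re

# \d+ matches one or more consecutive digits
numbers = re.findall(r"\d+", "Room 12, floor 3, building 2025")
print(numbers)  # ['12', '3', '2025']

# \.txt$ only matches when the string ends with ".txt"
names = ["notes.txt", "notes.txt.bak", "txt"]
text_files = [name for name in names if re.search(r"\.txt$", name)]
print(text_files)  # ['notes.txt']
```

The r prefix creates a raw string, so the backslashes reach the re module unchanged.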

Regex groups

Regular expressions have the concept of “groups”. These are certain sub-expressions that you can capture to use later. For example, the following regex matches URLs:

https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)

See this Stack Overflow answer for an explanation.

url_re = r'http(s?):\/\/((www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256})\.([a-zA-Z0-9()]{1,6})\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)?'
match_obj = re.match(url_re, 'https://duckduckgo.com/?t=ffab&q=test&ia=web')
print(match_obj)
<re.Match object; span=(0, 44), match='https://duckduckgo.com/?t=ffab&q=test&ia=web'>

Each set of parentheses defines a group, where group 0 is always the complete match.

The first group will capture whether the URL uses http or https:

print("Full match:", match_obj.group(0))
print("HTTP/HTTPS:", match_obj.group(1))
Full match: https://duckduckgo.com/?t=ffab&q=test&ia=web
HTTP/HTTPS: s

The next group captures the domain and subdomains:

print("Domain:", match_obj.group(2))
Domain: duckduckgo

The following groups look at whether there’s www and what the top-level domain is:

print("WWW part:", match_obj.group(3))
print("TLD:", match_obj.group(4))
WWW part: None
TLD: com

The last group looks at what comes after the base URL:

print("Path/query:", match_obj.group(5))
Path/query: /?t=ffab&q=test&ia=web
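Counting parentheses quickly becomes error-prone. The re module also supports named groups, written as (?P<name>...), which can be looked up by name instead of by number. A minimal sketch with a simple date pattern (the pattern and the text are only examples):

```python
import re

# Three named groups: year, month and day
date_re = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
date_match = re.search(date_re, "Published on 2025-11-15 by the author")

print(date_match.group("year"))  # 2025
print(date_match.groups())       # ('2025', '11', '15')
print(date_match.groupdict())    # {'year': '2025', 'month': '11', 'day': '15'}
```

Named groups still have a number as well, so date_match.group(1) returns the year too.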
Exercise
  1. URL Analysis: Research the result for many alternative URLs. For example, investigate https://www.google.be/search?q=test. Adapt the code to separately capture the domain (e.g., google or duckduckgo).

  2. File Renaming with Regex: Use Python (pathlib) to create a folder on your computer containing the following files (just create empty files). Then rename the files of the different lessons by changing a single-digit number i to 0i. For example, Les 1.ipynb becomes Les 01.ipynb, etc. Use the re module as much as possible.

    advanced python.png
    Les 1.ipynb
    Les 10.ipynb
    Les 2 Demo - PNG bestanden lezen.ipynb
    Les 2.ipynb
    Les 3.ipynb
    Les 4.ipynb
    Les 5 - functies.ipynb
    Les 6-7 - OOP deel 2.ipynb
    Les 8 - de batterijen.ipynb
    Les 9 - modules en imports.ipynb

textwrap and string

Also interesting to know: textwrap.dedent removes indentation:

s = """
    this is a text
    with a certain indentation
    that is always constant
    """

import textwrap
print(textwrap.dedent(s))

this is a text
with a certain indentation
that is always constant

And if you ever need the alphabet:

import string
print("Letters:", string.ascii_letters)
print("Lowercase:", string.ascii_lowercase)  
print("Uppercase:", string.ascii_uppercase)
Letters: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
Lowercase: abcdefghijklmnopqrstuvwxyz
Uppercase: ABCDEFGHIJKLMNOPQRSTUVWXYZ

functools, itertools, operator

import functools
import itertools
import operator

These are three handy libraries for working with functions and iterations.

For functools, it’s especially interesting to know partial, which is a way to transform an existing function into a new function where certain arguments are already filled in:

def multiply(a, b):
    print(f'{a=}, {b=}')
    return a * b

times_3 = functools.partial(multiply, 3)
print(times_3(4))
a=3, b=4
12
times_100 = functools.partial(multiply, b=100)
print(times_100(5))
a=5, b=100
500
print2 = functools.partial(print, end='')
print2('everything')
print2('together')
print2('now')
print()  # Add a newline at the end
everythingtogethernow

The itertools library allows you to iterate over existing lists and other iterable objects in new ways:

L = list(range(4))
print(L)
[0, 1, 2, 3]

For example, pairwise (available since Python 3.10) will return pairs of consecutive elements (x1, x2), (x2, x3), …:

for u in itertools.pairwise(L):
    print(u)
(0, 1)
(1, 2)
(2, 3)
changes = [
    (1, "red"),
    (2, "green"), 
    (3, "orange"),
    (4, "red")
]

for first, second in itertools.pairwise(changes):
    match second:
        case (t2, "orange"):
            print("The light became orange at time", t2)
        case _:
            pass
The light became orange at time 3

And combinations(..., n) will produce all possible combinations of n different elements:

for c in itertools.combinations(L, 2):
    print(c)
(0, 1)
(0, 2)
(0, 3)
(1, 2)
(1, 3)
(2, 3)

The operator library offers functions that correspond to the normal operations in Python. For example, + corresponds to operator.add:

print(operator.add(3, 4))
7

This can be handy in combination with other functionality from the functools library, e.g., functools.reduce.

functools.reduce(f, [a, b, c, ...]) first calculates r = f(a, b), then r2 = f(r, c), etc. If you use operator.add for function f, this is the sum:

print(functools.reduce(operator.add, [1, 2, 3, 4]))
print(functools.reduce(operator.mul, [1, 2, 3, 4]))
print(functools.reduce(operator.add, ['this', ' works', ' for', ' strings']))
10
24
this works for strings

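The difference with the built-in sum function is that sum always starts from an initial value (0 by default), whereas reduce without an initializer starts from the first element of the sequence.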

Note that to concatenate strings, it’s much better to use the ''.join(...) function.
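The following sketch puts these remarks side by side: joining strings with ''.join, giving sum a different start value, and giving reduce an explicit initial value as its third argument.

```python
import functools
import operator

words = ['this', ' works', ' for', ' strings']
joined = ''.join(words)
print(joined)  # this works for strings

# sum accepts a start value as second argument
total = sum([1, 2, 3, 4], 10)
print(total)  # 20

# reduce accepts an initial value as third argument
empty_product = functools.reduce(operator.mul, [], 1)
print(empty_product)  # 1

# Without an initial value, reduce cannot handle an empty sequence
try:
    functools.reduce(operator.add, [])
except TypeError as exc:
    empty_error = type(exc).__name__
print(empty_error)  # TypeError
```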

datetime

You’ve undoubtedly used datetime before:

import datetime as dt

now = dt.datetime.now()
print(now)
2025-09-30 07:30:29.121094
formatted = now.strftime('%Y-%m-%d %H:%M:%S')
print(formatted)
parsed = dt.datetime.strptime(formatted, '%Y-%m-%d %H:%M:%S')
print(parsed)
2025-09-30 07:30:29
2025-09-30 07:30:29
today = dt.date.today()
print(today)
print(today.strftime('%Y-%m-%d'))
2025-09-30
2025-09-30

This is also useful for calculations with time. For example:

print(dt.date.today() - dt.date(2023, 8, 15))
777 days, 0:00:00

Pay attention: date - date exists and datetime - datetime exists but you can’t mix them:

dt.date.today() - dt.datetime(2023, 8, 15)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[57], line 1
----> 1 dt.date.today() - dt.datetime(2023, 8, 15)

TypeError: unsupported operand type(s) for -: 'datetime.date' and 'datetime.datetime'
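To subtract them anyway, first convert both sides to the same type. The sketch below shows two options with fixed example values: reduce the datetime to a date with .date(), or extend the date to a datetime at midnight with datetime.combine.

```python
import datetime as dt

day = dt.date(2025, 9, 30)
moment = dt.datetime(2023, 8, 15, 12, 30)

# Option 1: drop the time part of the datetime
diff_days = day - moment.date()
print(diff_days)  # 777 days, 0:00:00

# Option 2: turn the date into a datetime at midnight
midnight = dt.datetime.combine(day, dt.time.min)
diff_exact = midnight - moment
print(diff_exact)  # 776 days, 11:30:00
```

Which option is appropriate depends on whether the time of day matters for your calculation.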
Exercise
  1. Timedelta Analysis: What is the type of date - date? How can you extract the number of days from it?

  2. Leap Year Calculator: Use the datetime.date class to calculate the number of days in February for the following years: 2000, 2001, 2004, 1900, 2400, 1, 0

    for year in [2000, 2001, 2004, 1900, 2400, 1, 0]:
        nr_feb_days = (dt.date(year, 3, 1) - dt.date(year, 2, 1)).days
        print(f'February {year} has {nr_feb_days} days')

collections

The collections library has useful variations on the familiar dict and set.

For example, see collections.Counter and collections.defaultdict:

import collections

c = collections.Counter(['here', 'there', 'here', 'here', 'there'])
print(c)
print("Most common:", c.most_common(1))
Counter({'here': 3, 'there': 2})
Most common: [('here', 3)]

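Pay attention to the signature of collections.defaultdict: the first argument is a function that is called without arguments to produce the default value, not the default value itself: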

def default_integer():
    return 5

d = collections.defaultdict(default_integer, a=2, b=3)
print(f"a: {d['a']}, b: {d['b']}, c: {d['c']}")  # c gets default value
a: 2, b: 3, c: 5
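Because the first argument only needs to be callable, built-in types such as list or int are often used as the factory. A common pattern, sketched here with made-up data, is grouping items without first checking whether a key already exists:

```python
import collections

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# list() produces an empty list for every key that is seen for the first time
by_letter = collections.defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```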
Exercise

Shakespeare Analysis: Download the collected works of Shakespeare from Project Gutenberg.

Which letter occurs most frequently?

import requests
shakespeare = requests.get('https://www.gutenberg.org/cache/epub/100/pg100.txt')
text = shakespeare.content.decode()
# Use collections.Counter to analyze the text

fractions, decimal, statistics

Python also offers support for calculating with fractions and decimal numbers. This can be handy in, for example, financial applications where rounding errors are not acceptable.

The decimal module also offers lots of support for rounding in specific ways and maintaining precision.

print(0.1 + 0.2)  # Floating point imprecision
0.30000000000000004
from decimal import Decimal
d1, d2 = Decimal('0.1'), Decimal('0.2')
print(d1 + d2)  # Exact decimal arithmetic
0.3
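As an illustration of rounding in a specific way, the sketch below rounds the same amount to two decimals with two different rounding modes. The amount is an arbitrary example. quantize, ROUND_HALF_UP and ROUND_HALF_EVEN are part of the decimal module.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

price = Decimal("2.665")  # arbitrary example amount
cent = Decimal("0.01")    # the target precision: two decimals

# Ties are rounded away from zero
half_up = price.quantize(cent, rounding=ROUND_HALF_UP)
print(half_up)  # 2.67

# Ties are rounded to the nearest even digit ("banker's rounding")
half_even = price.quantize(cent, rounding=ROUND_HALF_EVEN)
print(half_even)  # 2.66
```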

Similarly, the fractions module allows exact calculation with fractions:

from fractions import Fraction
print(Fraction(1, 7) + Fraction(2, 7))
print(Fraction(1, 6) + Fraction(1, 3) + Fraction(1, 2))
print(Fraction(3, 10) + Fraction(1, 5))
3/7
1
1/2

The statistics module offers support for certain commonly occurring statistical operations. There’s support for Fraction, Decimal, float, and int.

If you only need the simplest statistical functions, it can be worthwhile to use this module and avoid an extra dependency.

To quote the documentation:

The module is not intended to be a competitor to third-party libraries such as NumPy, SciPy, or proprietary full-featured statistics packages aimed at professional statisticians such as Minitab, SAS and Matlab. It is aimed at the level of graphing and scientific calculators.

numbers = [0, 2, 3, 4, 100]
import statistics

mean_val = statistics.mean(numbers)
median_val = statistics.median(numbers) 
stdev_val = statistics.stdev(numbers)
print(f"Mean: {mean_val}, Median: {median_val}, Std Dev: {stdev_val:.2f}")
Mean: 21.8, Median: 3, Std Dev: 43.74
decimal_numbers = [Decimal(str(nr)) for nr in numbers]
print("With Decimal:", statistics.mean(decimal_numbers))
With Decimal: 21.8

GUI Libraries: tkinter and curses

tkinter

import tkinter

Python has a built-in module for building graphical user interfaces (GUIs) in the form of tkinter.

Advantages:

  • Very portable between different platforms
  • “Built into” Python, so no extra installation required

Disadvantages:

  • Interface feels outdated
  • Learning curve
  • Documentation is very confusing because it’s built on top of Tcl/Tk

There are also other libraries like PyQt, PySide, Kivy, etc. Additionally, there’s a stronger trend toward browser apps, for which many frameworks also exist. Before starting a large project with one of these frameworks, it’s worth getting to know several so you can assess their strengths and challenges.

See https://wiki.python.org/moin/GuiProgramming

The windows in tkinter always have a strict hierarchical structure where each window depends on a parent window. You must have a “root” window to start from, which you create with tkinter.Tk() (often written as tk.Tk() after import tkinter as tk).

Exercise

Simple Calculator: Create a simple tkinter app with two input fields where the user can enter numbers, one “calculate” button, and one field where the calculated sum of the two numbers is displayed.

Exercise

File rename application: Create a simple tkinter app where the user can select a folder and rename all files via regular expressions. The application will display all files inside the folder and show the new names before changing the files to their new file name. 1

curses

import curses

The curses module can be used to create textual user interfaces (TUIs) that run in the terminal. It feels very old-school but in some cases can be an elegant solution for making a small tool.

Nowadays there’s also Textual, a third-party TUI library with the same goal: https://www.textualize.io/

And much more

The standard library contains many more useful modules:

  • json, csv, configparser, tomllib – for parsing JSON, CSV, .ini and .toml files
  • xml.etree – for parsing XML documents
  • http.server – for starting up a minimal HTTP server
  • pickle – for “freezing” Python objects for a later session
import pickle
import datetime

# Example of pickling and unpickling
data = datetime.datetime(2024, 10, 23)

# Save to file
with open('data.pkl', 'wb') as f:
    pickle.dump(data, f)

# Load from file
with open('data.pkl', 'rb') as f:
    loaded_data = pickle.load(f)

print(f"Original: {data}")
print(f"Loaded: {loaded_data}")  
print(f"Year: {loaded_data.year}")
Original: 2024-10-23 00:00:00
Loaded: 2024-10-23 00:00:00
Year: 2024
Pro Tip

The Python standard library is vast and well-documented. Before installing external packages, check if the standard library already has what you need. It’s always there, thoroughly tested, and maintained by the core Python team.

Footnotes

  1. It’s trickier than you think: for instance, if the user wants to swap the names A.txt and B.txt, the file B.txt might be overwritten when renaming A.txt!