The Standard Libraries
import sys
import os
Batteries included
Python’s batteries included philosophy
Python comes with batteries included. This means a large number of standard libraries are provided to make your life easier.
You can find a complete overview on the official documentation page. It’s definitely worth browsing through this to get an idea of everything Python has to offer.
Today we’ll examine some of the most commonly used libraries.
sys and os
- sys contains information about the Python installation itself
- os is for interaction with the operating system. These functions can vary significantly between Windows/Linux/macOS/…
sys.version
'3.13.5 (main, Jul 8 2025, 21:00:57) [Clang 20.1.4 ]'
sys.path
['/home/knaert/.local/share/uv/python/cpython-3.13.5-linux-x86_64-gnu/lib/python313.zip',
 '/home/knaert/.local/share/uv/python/cpython-3.13.5-linux-x86_64-gnu/lib/python3.13',
 '/home/knaert/.local/share/uv/python/cpython-3.13.5-linux-x86_64-gnu/lib/python3.13/lib-dynload',
 '',
 '/var/www/pydev2/.venv/lib/python3.13/site-packages']
sys.implementation
namespace(name='cpython',
          cache_tag='cpython-313',
          version=sys.version_info(major=3, minor=13, micro=5, releaselevel='final', serial=0),
          hexversion=51185136,
          _multiarch='x86_64-linux-gnu')
sys.version_info
sys.version_info(major=3, minor=13, micro=5, releaselevel='final', serial=0)
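Because sys.version_info behaves like a tuple, it can be compared directly against a (major, minor) pair. The following is a minimal sketch; the 3.10 threshold (the release that introduced match statements) and the message text are illustrative choices, not part of the lecture.

```python
import sys

# version_info compares element by element, just like a regular tuple
if sys.version_info >= (3, 10):
    message = "match statements are available"
else:
    message = "match statements are not available"

print(message)
```

Comparing against sys.version_info is usually more robust than parsing the sys.version string, whose format differs between builds.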
sys.modules is a dict containing all modules that Python has loaded. (See the lecture on modules and imports.)
# sys.modules is quite large, let's just see how many modules are loaded
len(sys.modules)
1348
os.getcwd()
'/var/www/pydev2/advanced-python'
The os module provides lots of functionality for working with paths and files, e.g. os.path.join. You’ll often see this in recommendations on the internet (Stack Overflow, etc.). But there’s a more modern alternative: pathlib, and for most applications it’s better to use that instead.
Environment variables are also important for software developers.
These are key-value pairs managed by the operating system. In Python they’re accessible via os.environ. Note that these are read when a process starts and passed to all subprocesses. So if you change an environment variable externally after Python has started, you won’t see any difference. However, if you modify an environment variable in Python via os.environ and then start a subprocess, you will see this change.
if sys.platform == "win32":
    # Windows uses USERPROFILE for the home path and USERNAME for the user name
    home_dir = os.getenv('USERPROFILE')
    user_name = os.getenv('USERNAME')
else:
    # POSIX systems (Linux, macOS) generally use HOME and USER
    home_dir = os.getenv('HOME')
    user_name = os.getenv('USER')
print(f"Home Directory: {home_dir}")
print(f"User Name: {user_name}")
Home Directory: /home/knaert
User Name: knaert
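The inheritance of environment variables by subprocesses, described above, can be sketched as follows. The variable name DEMO_GREETING is hypothetical and only exists for this illustration; the child process is simply a second Python interpreter started through subprocess.run.

```python
import os
import subprocess
import sys

# DEMO_GREETING is a made-up variable name used only for this illustration
os.environ["DEMO_GREETING"] = "hello from the parent"

# The child process receives a copy of the parent's environment when it starts
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['DEMO_GREETING'])"],
    capture_output=True,
    text=True,
    check=True,
)
child_output = result.stdout.strip()
print(child_output)
```

The child only sees the value because it was set before the subprocess was started; changes made afterwards would not reach a process that is already running.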
sys.executable
'/var/www/pydev2/.venv/bin/python3'
pathlib
import pathlib
The pathlib library is for all kinds of operations on paths to files and folders, both in Linux and Windows. It’s a modern alternative to the older os.path, along with the glob module.
Much documentation still refers to these older modules, which is why you still see them frequently in code.
- It’s highly recommended to use pathlib wherever possible instead of os.path
- The lesser-known glob module is for finding all files matching a certain pattern, e.g. all files with extension “.txt”. Here too you can use Path.glob
In pathlib you work with path objects in an object-oriented way. This allows you to do things like path.parent and path.with_suffix.
The documentation has a handy overview of the differences between os and pathlib.
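As a small sketch of the object-oriented style, the attributes mentioned above can be tried on a pure path. The path /data/reports/summary.txt is a hypothetical example; PurePosixPath is used so that nothing touches the file system and the result is the same on every platform.

```python
from pathlib import PurePosixPath

# A hypothetical path, used purely for illustration; pure paths never access the disk
path = PurePosixPath("/data/reports/summary.txt")

print(path.parent)              # /data/reports
print(path.name)                # summary.txt
print(path.stem)                # summary
print(path.suffix)              # .txt
print(path.with_suffix(".md"))  # /data/reports/summary.md
```

In everyday code you would normally use pathlib.Path, which offers the same attributes plus methods that do access the file system, such as exists() and iterdir().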
for path in pathlib.Path('.').iterdir():
    print(path)
    # print(path.absolute())
    # print(path.suffix)
    # print(path.with_suffix('.png'))
    # print(path, path.is_file())
11-modules-imports.html
index.qmd
.jupyter_cache
12-linters-types.qmd
07-oop-inheritance.qmd
05-functions.qmd
11-modules-imports.qmd
index.html
06-standard-libraries.qmd
13-extra-libraries.qmd
01-control-structures.quarto_ipynb
12-linters-types.html
stderr_colored.png
01-control-structures.qmd
10-pandas-matplotlib.qmd
08-oop-advanced.quarto_ipynb
08-oop-advanced.qmd
04-loose-ends.qmd
02-bytes-files.qmd
05-functions.quarto_ipynb
03-oop-basics.quarto_ipynb
07-oop-inheritance.quarto_ipynb
05-functions.html
07-oop-inheritance.html
09-scientific-stack.qmd
02-bytes-files.quarto_ipynb
06-standard-libraries.quarto_ipynb
03-oop-basics.qmd
for path in pathlib.Path('.').glob('*.qmd'):
    print(path)
index.qmd
12-linters-types.qmd
07-oop-inheritance.qmd
05-functions.qmd
11-modules-imports.qmd
06-standard-libraries.qmd
13-extra-libraries.qmd
01-control-structures.qmd
10-pandas-matplotlib.qmd
08-oop-advanced.qmd
04-loose-ends.qmd
02-bytes-files.qmd
09-scientific-stack.qmd
03-oop-basics.qmd
The way to combine paths in pathlib is with /:
p = pathlib.Path('.').absolute() / 'example.txt'
print(p)
/var/www/pydev2/advanced-python/example.txt
Many functions will directly accept a pathlib object. For example, this is the case for the open function to open a file.
If you’re dealing with a function that only accepts strings, you can use str(p) or p.as_posix() depending on what works for your application.
print(str(p))
print(p.as_posix())
/var/www/pydev2/advanced-python/example.txt
/var/www/pydev2/advanced-python/example.txt
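On Linux the two conversions give the same result, as seen above. The difference only shows up with Windows-style paths, which the following sketch illustrates with PureWindowsPath so that it runs on any platform. The path C:/Users/demo/notes.txt is a hypothetical example.

```python
from pathlib import PureWindowsPath

# A hypothetical Windows path; pure paths can be built on any operating system
win_path = PureWindowsPath("C:/Users/demo") / "notes.txt"

as_string = str(win_path)       # native Windows form, with backslashes
as_posix = win_path.as_posix()  # always uses forward slashes

print(as_string)
print(as_posix)
```

Which form is appropriate depends on the consumer: str(p) suits native tools on the current platform, while as_posix() is convenient for URLs or configuration files that expect forward slashes.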
The simplest way to create a file is with p.touch():
p.touch()
p_stat = p.stat()
print(p_stat)
os.stat_result(st_mode=33204, st_ino=552944, st_dev=2049, st_nlink=1, st_uid=1001, st_gid=33, st_size=0, st_atime=1759217428, st_mtime=1759217428, st_ctime=1759217428)
It takes some digging to convert this to a normal timestamp:
import time
time.ctime(p_stat.st_atime)
'Tue Sep 30 07:30:28 2025'
import datetime
datetime.datetime.fromtimestamp(p_stat.st_ctime)
datetime.datetime(2025, 9, 30, 7, 30, 28, 867334)
See this Stack Overflow question for some explanation:
But also pay attention to the documentation: https://docs.python.org/3.12/library/os.html#os.stat_result.st_ctime
st_atime
Time of most recent access expressed in seconds.
st_mtime
Time of most recent content modification expressed in seconds.
st_ctime
Time of most recent metadata change expressed in seconds.
Changed in version 3.12: st_ctime is deprecated on Windows.
Use st_birthtime for the file creation time.
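The timestamps quoted above are plain numbers of seconds since the epoch. One possible way to turn them into unambiguous values is to convert them to timezone-aware datetime objects, as in the sketch below. The temporary directory and the file name demo.txt are assumptions made so the example does not touch any existing files.

```python
import datetime
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    demo = pathlib.Path(tmp) / "demo.txt"  # hypothetical file, removed again afterwards
    demo.touch()
    stat = demo.stat()

    # Passing tz= yields an aware datetime instead of a naive local time
    modified = datetime.datetime.fromtimestamp(stat.st_mtime, tz=datetime.timezone.utc)

print(modified.isoformat())
```

Without the tz argument, fromtimestamp returns a naive datetime in local time, which is what the earlier example shows.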
You can even manually change these timestamps, see here
hundred_days = 3600 * 24 * 100  # 100 days, expressed in seconds
os.utime(p, times=(p_stat.st_atime - hundred_days, p_stat.st_mtime - hundred_days))
re
import re
re stands for “regular expression”. Regular expressions, or regexes, are a small pattern language that allows you to search for very detailed patterns in text.
Regular expressions are not unique to Python: most programming languages support them, and text editors often do too.
Sometimes there are differences between various regex implementations, but the things we’ll cover here are fairly standard.
success = re.search('the pattern', 'This is a text in which I search for the pattern')
print(success)
<re.Match object; span=(37, 48), match='the pattern'>
A search with re always has a pattern and a text in which you search for the pattern.
The result is a match object or None if no match was found.
The span of the match object gives the slice of the text that matched:
'This is a text in which I search for the pattern'[37:48]
'the pattern'
You can work further with a match object:
if success is not None:
    print(success.string)
    print(f"Found '{success.group()}' at position {success.start()}-{success.end()}")
This is a text in which I search for the pattern
Found 'the pattern' at position 37-48
If no match is found, the result is None:
failed = re.search('winner', 'Who won the lottery?')
print(failed)
None
The re module has several other handy functions:
re.match finds a match but only at the beginning of the string:
result = re.match('road', 'Where is the road when the road is gone?')
print(result)
None
result = re.match('Where', 'Where is the road when the road is gone?')
print(result)
<re.Match object; span=(0, 5), match='Where'>
re.findall finds all places where the pattern occurs. It only returns the matched strings, so it isn’t very practical if you also want to know where they occur:
result = re.findall('road', 'Where is the road when the road is gone?')
print(result)
['road', 'road']
re.finditer iterates over all matches and returns a complete match object each time:
for r in re.finditer('road', 'Where is the road when the road is gone?'):
    print(f"Found '{r.group()}' at position {r.start()}-{r.end()}")
Found 'road' at position 13-17
Found 'road' at position 27-31
Regex patterns
Regular expressions are extremely versatile and allow you to search for very varied patterns. Some examples:
- blabla$ matches blabla only at the end of a string
- ^blabla matches blabla only at the beginning of a string
- A|B matches A or B
- x* matches the empty string, x, xx, xxx, …
- x+ matches 1 or more copies of x
- x? matches 0 or 1 copies of x
- \d matches a digit
Check the documentation for (much) more explanation.
for m in re.finditer("this", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'this' at 0-4
Found 'this' at 18-22
for m in re.finditer("^this", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'this' at 0-4
for m in re.finditer("is|a", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'is' at 2-4
Found 'is' at 5-7
Found 'a' at 8-9
Found 'is' at 20-22
for m in re.finditer("this (is|system)", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'this is' at 0-7
Found 'this system' at 18-29
for m in re.finditer("e+", "this is a test of this system"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'e' at 11-12
Found 'e' at 27-28
for m in re.finditer("syste*m", "systm system systeeeeeem"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'systm' at 0-5
Found 'system' at 6-12
Found 'systeeeeeem' at 13-24
for m in re.finditer("systee?m", "systm system systeem systeeem systeeeeeem"):
    print(f"Found '{m.group()}' at {m.start()}-{m.end()}")
Found 'system' at 6-12
Found 'systeem' at 13-20
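Two patterns from the list, \d and $, are not covered by the examples above. The sketch below tries them on a made-up sentence. It also writes the patterns as raw strings (r"..."), which is the usual way to keep Python from interpreting the backslashes itself.

```python
import re

text = "order 66 shipped in 2024"  # made-up sample text

# \d+ matches runs of one or more digits
numbers = re.findall(r"\d+", text)
print(numbers)  # ['66', '2024']

# $ anchors the pattern to the end of the string, so only the last number qualifies
ends_with_number = re.search(r"\d+$", text)
print(ends_with_number.group())  # 2024
```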
Regex groups
Regular expressions have the concept of “groups”. These are certain sub-expressions that you can capture to use later. For example, the following regex matches URLs:
https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)
See this Stack Overflow answer for an explanation.
url_re = r'http(s?):\/\/((www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256})\.([a-zA-Z0-9()]{1,6})\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)?'
match_obj = re.match(url_re, 'https://duckduckgo.com/?t=ffab&q=test&ia=web')
print(match_obj)
<re.Match object; span=(0, 44), match='https://duckduckgo.com/?t=ffab&q=test&ia=web'>
Each set of parentheses defines a group, where group 0 is always the complete match.
The first group will capture whether the URL uses http or https:
print("Full match:", match_obj.group(0))
print("HTTP/HTTPS:", match_obj.group(1))
Full match: https://duckduckgo.com/?t=ffab&q=test&ia=web
HTTP/HTTPS: s
The next group captures the domain and subdomains:
print("Domain:", match_obj.group(2))
Domain: duckduckgo
The following groups look at whether there’s www and what the top-level domain is:
print("WWW part:", match_obj.group(3))
print("TLD:", match_obj.group(4))
WWW part: None
TLD: com
The last group looks at what comes after the base URL:
print("Path/query:", match_obj.group(5))
Path/query: /?t=ffab&q=test&ia=web
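Counting parentheses to find the right group number gets error-prone in longer patterns. The re module also supports named groups with the (?P<name>...) syntax. The sketch below uses a much simpler pattern than the URL regex: the date pattern and the group names year, month and day are illustrative assumptions.

```python
import re

# Named groups: each (?P<name>...) can be retrieved by name as well as by number
date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

m = date_re.search("Generated on 2025-09-30 at night")
print(m.group(0))       # 2025-09-30
print(m.group("year"))  # 2025
print(m.group(2))       # 09
print(m.groups())       # ('2025', '09', '30')
print(m.groupdict())    # {'year': '2025', 'month': '09', 'day': '30'}
```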
textwrap and string
Also interesting to know: textwrap.dedent removes indentation:
s = """
    this is a text
    with a certain indentation
    that is always constant
"""
import textwrap
print(textwrap.dedent(s))
this is a text
with a certain indentation
that is always constant
And if you ever need the alphabet:
import string
print("Letters:", string.ascii_letters)
print("Lowercase:", string.ascii_lowercase)
print("Uppercase:", string.ascii_uppercase)
Letters: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
Lowercase: abcdefghijklmnopqrstuvwxyz
Uppercase: ABCDEFGHIJKLMNOPQRSTUVWXYZ
functools, itertools, operator
import functools
import itertools
import operator
These are three handy libraries for working with functions and iterations.
For functools, it’s especially interesting to know partial, which is a way to transform an existing function into a new function where certain arguments are already filled in:
def multiply(a, b):
    print(f'{a=}, {b=}')
    return a * b
times_3 = functools.partial(multiply, 3)
print(times_3(4))
a=3, b=4
12
times_100 = functools.partial(multiply, b=100)
print(times_100(5))
a=5, b=100
500
print2 = functools.partial(print, end='')
print2('everything')
print2('together')
print2('now')
print() # Add a newline at the end
everythingtogethernow
The itertools library allows you to iterate over existing lists and other iterable objects in new ways:
L = list(range(4))
print(L)
[0, 1, 2, 3]
For example, pairwise will return pairs of consecutive elements (x1, x2), (x2, x3), …:
for u in itertools.pairwise(L):
    print(u)
(0, 1)
(1, 2)
(2, 3)
changes = [
    (1, "red"),
    (2, "green"),
    (3, "orange"),
    (4, "red")
]
for first, second in itertools.pairwise(changes):
    match second:
        case (t2, "orange"):
            print("The light became orange at time", t2)
        case _:
            pass
The light became orange at time 3
And combinations(..., n) will produce all possible combinations of n different elements:
for c in itertools.combinations(L, 2):
    print(c)
(0, 1)
(0, 2)
(0, 3)
(1, 2)
(1, 3)
(2, 3)
The operator library offers functions that correspond to Python’s built-in operators; for example, + corresponds to operator.add:
print(operator.add(3, 4))
7
This can be handy in combination with other functionality from the functools library, e.g., functools.reduce.
functools.reduce(f, [a, b, c, ...]) first calculates r = f(a, b), then r2 = f(r, c), etc. If you use operator.add for function f, this is the sum:
print(functools.reduce(operator.add, [1, 2, 3, 4]))
print(functools.reduce(operator.mul, [1, 2, 3, 4]))
print(functools.reduce(operator.add, ['this', ' works', ' for', ' strings']))
10
24
this works for strings
The difference with the built-in sum function is that sum must start from something, by default 0, whereas reduce without an initial value starts from the first element of the sequence.
Note that to concatenate strings, it’s much better to use the ''.join(...) method.
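The difference in starting values is easiest to see on an empty sequence. The following sketch compares sum, reduce and str.join; the variable names are illustrative only.

```python
import functools
import operator

words = ["this", " works", " for", " strings"]

# str.join is the idiomatic way to concatenate many strings
joined = "".join(words)
print(joined)

# sum starts from 0 by default, so an empty list is not a problem
empty_sum = sum([])
total_with_start = sum([1, 2, 3], 10)  # the second argument is the start value
print(empty_sum, total_with_start)

# reduce without an initial value has nothing to start from on an empty sequence
try:
    functools.reduce(operator.add, [])
    reduce_failed = False
except TypeError as exc:
    reduce_failed = True
    print("reduce failed:", exc)

# With an explicit initial value, reduce copes with the empty case too
empty_reduce = functools.reduce(operator.add, [], 0)
print(empty_reduce)
```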
datetime
You’ve undoubtedly used datetime before:
import datetime as dt
now = dt.datetime.now()
print(now)
2025-09-30 07:30:29.121094
formatted = now.strftime('%Y-%m-%d %H:%M:%S')
print(formatted)
parsed = dt.datetime.strptime(formatted, '%Y-%m-%d %H:%M:%S')
print(parsed)
2025-09-30 07:30:29
2025-09-30 07:30:29
today = dt.date.today()
print(today)
print(today.strftime('%Y-%m-%d'))
2025-09-30
2025-09-30
This is also useful for calculations with time. For example:
print(dt.date.today() - dt.date(2023, 8, 15))
777 days, 0:00:00
Pay attention: date - date exists and datetime - datetime exists but you can’t mix them:
dt.date.today() - dt.datetime(2023, 8, 15)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[57], line 1
----> 1 dt.date.today() - dt.datetime(2023, 8, 15)

TypeError: unsupported operand type(s) for -: 'datetime.date' and 'datetime.datetime'
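To subtract a date and a datetime anyway, first convert one of them to the other type. The sketch below shows both directions; the two dates are fixed, made-up values so that the result does not depend on when the code runs.

```python
import datetime as dt

moment = dt.datetime(2023, 8, 15, 12, 30)  # made-up datetime
day = dt.date(2024, 1, 1)                  # made-up date

# Option 1: drop the time part of the datetime and subtract two dates
diff_dates = day - moment.date()
print(diff_dates)  # 139 days, 0:00:00

# Option 2: promote the date to a datetime at midnight and subtract two datetimes
diff_datetimes = dt.datetime.combine(day, dt.time()) - moment
print(diff_datetimes)  # 138 days, 11:30:00
```

Which option fits depends on whether the time of day matters for the calculation.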
collections
The collections library has useful variations on the familiar dict and set.
For example, see collections.Counter and collections.defaultdict:
import collections
c = collections.Counter(['here', 'there', 'here', 'here', 'there'])
print(c)
print("Most common:", c.most_common(1))
Counter({'here': 3, 'there': 2})
Most common: [('here', 3)]
Pay attention to the signature of collections.defaultdict:
def default_integer():
    return 5
d = collections.defaultdict(default_integer, a=2, b=3)
print(f"a: {d['a']}, b: {d['b']}, c: {d['c']}") # c gets default value
a: 2, b: 3, c: 5
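The first argument of defaultdict is thus a callable that produces the default value, not the default value itself. In practice that callable is often a built-in type such as list or int. The sketch below groups words by their first letter; the word list is made up for the illustration.

```python
import collections

words = ["apple", "avocado", "banana", "blueberry", "cherry"]  # made-up data

# list() is called to create an empty list the first time a key is accessed
by_letter = collections.defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))

# Passing a plain value instead of a callable is rejected
try:
    collections.defaultdict(5)
    rejected = False
except TypeError as exc:
    rejected = True
    print("not allowed:", exc)
```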
fractions, decimal, statistics
Python also offers support for calculating with fractions and decimal numbers. This can be handy in, for example, financial applications where floating-point rounding errors are not acceptable.
The decimal module also offers lots of support for rounding in specific ways and maintaining precision.
print(0.1 + 0.2) # Floating point imprecision
0.30000000000000004
from decimal import Decimal
d1, d2 = Decimal('0.1'), Decimal('0.2')
print(d1 + d2) # Exact decimal arithmetic
0.3
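As an illustration of rounding in specific ways, Decimal.quantize rounds to a fixed number of decimal places with an explicit rounding mode. The amount 0.125 is a made-up value chosen because it lies exactly halfway between two cents, so the two modes give different answers.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

amount = Decimal("0.125")  # made-up amount, exactly halfway between 0.12 and 0.13
cent = Decimal("0.01")     # the exponent of this value sets the number of decimals

# ROUND_HALF_UP: ties go away from zero, as commonly taught in school
half_up = amount.quantize(cent, rounding=ROUND_HALF_UP)

# ROUND_HALF_EVEN: ties go to the nearest even digit ("banker's rounding")
half_even = amount.quantize(cent, rounding=ROUND_HALF_EVEN)

print(half_up)    # 0.13
print(half_even)  # 0.12
```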
Similarly, the fractions module allows exact calculation with fractions:
from fractions import Fraction
print(Fraction(1, 7) + Fraction(2, 7))
print(Fraction(1, 6) + Fraction(1, 3) + Fraction(1, 2))
print(Fraction(3, 10) + Fraction(1, 5))
3/7
1
1/2
The statistics module offers support for certain commonly occurring statistical operations. There’s support for Fraction, Decimal, float, and int.
If you only need basic statistical functions, using this module can save you an external dependency.
To quote the documentation:
The module is not intended to be a competitor to third-party libraries such as NumPy, SciPy, or proprietary full-featured statistics packages aimed at professional statisticians such as Minitab, SAS and Matlab. It is aimed at the level of graphing and scientific calculators.
numbers = [0, 2, 3, 4, 100]
import statistics
mean_val = statistics.mean(numbers)
median_val = statistics.median(numbers)
stdev_val = statistics.stdev(numbers)
print(f"Mean: {mean_val}, Median: {median_val}, Std Dev: {stdev_val:.2f}")
Mean: 21.8, Median: 3, Std Dev: 43.74
decimal_numbers = [Decimal(str(nr)) for nr in numbers]
print("With Decimal:", statistics.mean(decimal_numbers))
With Decimal: 21.8
GUI Libraries: tkinter and curses
tkinter
import tkinter
Python has a built-in module for building graphical user interfaces (GUIs) in the form of tkinter.
Advantages:
- Very portable between different platforms
- “Built into” Python, so no extra installation required
Disadvantages:
- Interface feels outdated
- Learning curve
- Documentation is very confusing because it’s built on top of Tcl/Tk
There are also other libraries like PyQt, PySide, Kivy, etc. Additionally, there’s a stronger trend toward browser apps, for which many frameworks also exist. Before starting a large project with one of these frameworks, it’s worth getting to know several so you can assess their strengths and challenges.
See https://wiki.python.org/moin/GuiProgramming
The windows in tkinter always have a strict hierarchical structure where each window depends on a parent window. You must have a “root” window to start from, which is created with tkinter.Tk() (commonly written tk.Tk() after import tkinter as tk).
curses
import curses
The curses module can be used to create textual user interfaces (TUIs) that run in the terminal. It feels very old-school but in some cases can be an elegant solution for making a small tool.
Nowadays there’s textual, a third-party TUI library with the same goal: https://www.textualize.io/
And much more
The standard library contains many more useful modules:
- json, csv, configparser, tomllib – for parsing json, csv files, .ini files and .toml files
- xml.etree – for parsing XML documents
- http.server – for starting up a minimal HTTP server
- pickle – for “freezing” Python objects for a later session
import pickle
import datetime
# Example of pickling and unpickling
data = datetime.datetime(2024, 10, 23)
# Save to file
with open('data.pkl', 'wb') as f:
    pickle.dump(data, f)
# Load from file
with open('data.pkl', 'rb') as f:
    loaded_data = pickle.load(f)
print(f"Original: {data}")
print(f"Loaded: {loaded_data}")
print(f"Year: {loaded_data.year}")
Original: 2024-10-23 00:00:00
Loaded: 2024-10-23 00:00:00
Year: 2024
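Pickle files can only be read back by Python. For data that has to be shared with other programs, the json module from the list above is one alternative. The sketch below round-trips a small dictionary through a JSON string; the configuration values are made up for the illustration.

```python
import json

config = {"name": "demo", "retries": 3, "tags": ["a", "b"]}  # made-up data

# Serialize to a human-readable string and parse it back again
text = json.dumps(config, indent=2)
restored = json.loads(text)

print(text)
print(restored == config)  # True
```

JSON only covers basic types such as dicts, lists, strings, numbers, booleans and None, so an object like the datetime above would have to be converted first, for example to an ISO-formatted string.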
The Python standard library is vast and well-documented. Before installing external packages, check if the standard library already has what you need. It’s always there, thoroughly tested, and maintained by the core Python team.
Additional resources
Footnotes
It’s trickier than you think: for instance, if the user wants to swap the names A.txt and B.txt, the file B.txt might be overwritten when renaming A.txt!