Multiprocessing and Multithreading in Practice
Threads, queues, and a live π estimator
Introduction
In Lecture 1 we laid the groundwork: processes are isolated warehouses, threads are workers inside those warehouses, and the GIL means Python threads can’t hammer simultaneously. We explored psutil, subprocess, and multiprocessing.Process/Pool.
This lecture puts all of that into practice. We’ll:
- Learn the `threading` module hands-on.
- Understand how threads and processes communicate via queues.
- Build a complete TKinter application that uses a background thread to coordinate worker processes—all to approximate π in real time.
This is the second lecture in a five-part series:

1. Processes and Threads
2. Multiprocessing and Multithreading in Practice (you are here)
3. Interprocess Communication and Sockets
4. Client-Server Architectures and RESTful APIs
5. Async Programming, Event Loops, and ASGI
The threading Module
The API mirrors multiprocessing.Process almost exactly. If you can spawn a process, you can spawn a thread.
Your First Thread
```python
import threading
import time

def worker(name, duration):
    print(f"[{name}] Starting (thread {threading.current_thread().name})")
    time.sleep(duration)
    print(f"[{name}] Done after {duration}s")

t1 = threading.Thread(target=worker, args=("Alice", 2))
t2 = threading.Thread(target=worker, args=("Bob", 1))
t1.start()
t2.start()
t1.join()
t2.join()
print("Both threads finished.")
```
This looks almost identical to multiprocessing.Process—and that’s intentional. Python’s concurrency APIs are designed to be swappable. The critical difference: these threads run inside the same process, sharing the same memory space.
Run this a few times. You’ll notice Bob finishes before Alice despite starting second. The OS decides when each thread gets CPU time—this is preemptive scheduling. Your code has no say in the order.
The GIL in Action: A Benchmark
Let’s make this concrete. We’ll time the Monte Carlo π estimation from Lecture 1 using three approaches: sequential, threaded, and multiprocessed.
```python
import random
import time

def monte_carlo_pi(num_points):
    inside = 0
    for _ in range(num_points):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside
```

Sequential:

```python
N = 4_000_000

t0 = time.perf_counter()
total_inside = sum(monte_carlo_pi(1_000_000) for _ in range(4))
pi_est = 4 * total_inside / N
print(f"Sequential: π ≈ {pi_est:.6f} ({time.perf_counter() - t0:.2f}s)")
```

Threaded (4 threads):
```python
import threading

results = [0] * 4

def threaded_worker(index, n):
    results[index] = monte_carlo_pi(n)

t0 = time.perf_counter()
threads = [threading.Thread(target=threaded_worker, args=(i, 1_000_000)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
pi_est = 4 * sum(results) / N
print(f"Threaded: π ≈ {pi_est:.6f} ({time.perf_counter() - t0:.2f}s)")
```

Multiprocessed (4 processes):
```python
from multiprocessing import Pool

if __name__ == '__main__':
    t0 = time.perf_counter()
    with Pool(4) as pool:
        results = pool.map(monte_carlo_pi, [1_000_000] * 4)
    pi_est = 4 * sum(results) / N
    print(f"Multiproc: π ≈ {pi_est:.6f} ({time.perf_counter() - t0:.2f}s)")
```

On a typical 4-core machine you’ll see something like:
```
Sequential: π ≈ 3.141247 (3.12s)
Threaded:   π ≈ 3.141528 (3.15s)  ← no speedup!
Multiproc:  π ≈ 3.141692 (1.05s)  ← ~3× speedup
```
Threads didn’t help at all. The GIL ensures only one thread executes Python bytecode at a time, so four threads doing CPU-bound work are effectively sequential. Processes bypass the GIL entirely—each has its own interpreter.
- CPU-bound → use `multiprocessing` (or wait for the free-threaded build to mature)
- IO-bound (network, file, user input) → use `threading` (or `asyncio`—Lecture 5)
- Both → combine them, which is exactly what we’ll do in the capstone project
Daemon Threads
By default, Python waits for all threads to finish before exiting. A daemon thread is a background worker that gets killed automatically when the main thread exits.
```python
import threading
import time

def background_task():
    while True:
        print("Still working...")
        time.sleep(1)

t = threading.Thread(target=background_task, daemon=True)
t.start()

time.sleep(3)
print("Main thread done — daemon dies with us.")
```

This is useful for monitoring or heartbeat tasks. For our capstone, we’ll use a regular (non-daemon) thread and control its lifecycle with a `threading.Event`.
```python
import threading
import time

stop_event = threading.Event()

def polite_background():
    while not stop_event.is_set():
        print("Working...")
        time.sleep(1)
    print("Received stop signal. Shutting down.")

t = threading.Thread(target=polite_background)
t.start()

time.sleep(3)
stop_event.set()
t.join()
print("Clean shutdown complete.")
```

`threading.Event` is a simple flag: one thread sets it, others check it. Much cleaner than killing a thread mid-work.
Queues: Thread-Safe and Process-Safe Mailboxes
We know threads share memory and processes don’t. But even with shared memory, direct access leads to race conditions. The clean solution is a queue: a thread-safe (or process-safe) FIFO pipe where one side puts data in and the other takes data out.
queue.Queue — For Threads
The queue module from the standard library provides Queue, a thread-safe mailbox. The classic use case is the producer–consumer pattern.
```python
import queue
import threading
import time

def producer(q, n):
    for i in range(n):
        time.sleep(0.1)
        q.put(f"item-{i}")
    q.put(None)  # sentinel: tells the consumer to stop

def consumer(q):
    while True:
        item = q.get()
        if item is None:
            break
        print(f"Consumed: {item}")

q = queue.Queue()
t_prod = threading.Thread(target=producer, args=(q, 5))
t_cons = threading.Thread(target=consumer, args=(q,))
t_prod.start()
t_cons.start()
t_prod.join()
t_cons.join()
```

The `None` sentinel signals the consumer to stop. `q.get()` blocks until an item is available—no busy waiting, no race conditions. This is the bread and butter of concurrent Python.
A few useful methods:
- `q.put(item)` — add an item (blocks if the queue is full, when a `maxsize` is set)
- `q.get()` — remove and return an item (blocks until one is available)
- `q.get_nowait()` — like `get()` but raises `queue.Empty` instead of blocking
- `q.qsize()` — approximate size (don’t rely on this for synchronization)
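The non-blocking variants are easy to demo in a few lines—a quick sketch using a bounded queue, where `put_nowait` raises `queue.Full` when the queue is at capacity and `get_nowait` raises `queue.Empty` when there is nothing to take:

```python
import queue

q = queue.Queue(maxsize=2)
q.put("a")
q.put("b")

try:
    q.put_nowait("c")          # queue is full — raises instead of blocking
except queue.Full:
    print("put_nowait raised queue.Full")

print(q.get())                 # "a" — FIFO order
print(q.get())                 # "b"

try:
    q.get_nowait()             # queue is now empty
except queue.Empty:
    print("get_nowait raised queue.Empty")
```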
multiprocessing.Queue — For Processes
Same idea, but data crosses process boundaries. Under the hood, Python serializes (pickles) the data, sends it through a pipe, and deserializes it on the other side.
```python
from multiprocessing import Process, Queue
import os

def worker(q, n):
    total = sum(range(n))
    q.put((os.getpid(), total))

if __name__ == '__main__':
    q = Queue()
    processes = [Process(target=worker, args=(q, 10_000_000)) for _ in range(4)]
    for p in processes:
        p.start()
    for _ in range(4):
        pid, result = q.get()
        print(f"PID {pid}: {result}")
    for p in processes:
        p.join()
```

Everything you put on a `multiprocessing.Queue` must be picklable. That means basic types, most standard library objects, and your own classes (as long as they’re defined at module level). Lambda functions, open file handles, and database connections are not picklable.
Why Two Different Queues?

| | `queue.Queue` | `multiprocessing.Queue` |
|---|---|---|
| Scope | Threads within one process | Across processes |
| Speed | Fast (shared memory) | Slower (serialization + IPC) |
| Data | Any Python object | Must be picklable |
| Use case | Thread coordination | Process coordination |
In our capstone project, we’ll use both: a multiprocessing.Queue for worker processes to send results back, and a queue.Queue for the coordinator thread to feed updates to the UI thread.
TKinter: A Minimal GUI Toolkit
TKinter ships with Python—no pip install needed. We won’t do a deep dive here; we just need enough to build our capstone.
A Minimal Window
```python
import tkinter as tk

root = tk.Tk()
root.title("Hello TKinter")

label = tk.Label(root, text="Nothing happening yet.", font=("Consolas", 16))
label.pack(padx=20, pady=20)

button = tk.Button(root, text="Click me", command=lambda: label.config(text="Clicked!"))
button.pack(pady=10)

root.mainloop()
```

Save this as `hello_tk.py` and run it from CMD:

```
python hello_tk.py
```

A window appears. Click the button. The label changes. Close the window to exit.
The key line is root.mainloop(). This hands control to TKinter’s event loop: it listens for user actions (clicks, key presses, window resizes) and dispatches them to your callbacks. The main thread is now occupied running this loop. If your callback takes a long time, the event loop can’t process other events—the UI freezes.
The Blocking Problem
Let’s simulate a long computation in a button callback:
```python
import tkinter as tk
import time

def slow_task():
    label.config(text="Computing...")
    time.sleep(5)
    label.config(text="Done!")

root = tk.Tk()
root.title("Blocking Demo")

label = tk.Label(root, text="Ready.", font=("Consolas", 16))
label.pack(padx=20, pady=20)

button = tk.Button(root, text="Start slow task", command=slow_task)
button.pack(pady=10)

root.mainloop()
```

Run this and click the button. The window becomes unresponsive for 5 seconds—you can’t move it, resize it, or close it. The “Computing…” text might not even appear until after the sleep, because TKinter hasn’t had a chance to repaint.
This is why long-running work must happen off the main thread.
Updating the UI from Another Thread
TKinter is not thread-safe. You cannot call widget methods (like label.config(...)) directly from a background thread—it will work sometimes and crash unpredictably other times.
The safe pattern is:
- Background thread puts results on a `queue.Queue`.
- Main thread periodically polls that queue using `root.after()`.
root.after(ms, callback) schedules a function to run on the main thread after ms milliseconds. It’s TKinter’s equivalent of “check your mailbox every 100ms.”
```python
import tkinter as tk
import threading
import queue
import time

def background_counter(q, stop_event):
    i = 0
    while not stop_event.is_set():
        i += 1
        q.put(i)
        time.sleep(0.5)

def poll_queue():
    while not ui_queue.empty():
        try:
            value = ui_queue.get_nowait()
            label.config(text=f"Count: {value}")
        except queue.Empty:
            break
    root.after(100, poll_queue)  # reschedule ourselves on the main thread

root = tk.Tk()
root.title("Threaded Counter")

label = tk.Label(root, text="Count: 0", font=("Consolas", 16))
label.pack(padx=20, pady=20)

ui_queue = queue.Queue()
stop_event = threading.Event()

worker = threading.Thread(target=background_counter, args=(ui_queue, stop_event), daemon=True)
worker.start()

root.after(100, poll_queue)
root.mainloop()
stop_event.set()
```

The background thread counts and puts values on the queue every 500ms. The main thread checks the queue every 100ms via `poll_queue` and updates the label. The UI stays responsive throughout.
root.after() Pattern
This is the standard way to bridge threads and TKinter. You’ll see it in virtually every non-trivial TKinter application. The idea generalizes: any framework with an event loop (TKinter, Qt, GTK, even web frameworks) needs a mechanism to safely inject work into that loop from the outside. In Lecture 5, we’ll see the async equivalent.
This pattern is exactly what we need for our capstone. Let’s build it.
Capstone: Live π Estimation with TKinter
We’re going to combine everything from this lecture and Lecture 1 into a single application:
- Worker processes (CPU-bound) generate random points and count how many fall inside the unit circle.
- A coordinator thread spawns those processes, collects results via `multiprocessing.Queue`, and forwards running totals to the UI via `queue.Queue`.
- The main thread runs TKinter’s event loop, polling the UI queue with `root.after()` to display the live π estimate.
Architecture
```
┌─────────────────────────────────────────────────────┐
│                    Main Process                     │
│                                                     │
│  ┌──────────────────┐    queue.Queue    ┌────────┐  │
│  │   Coordinator    │ ───────────────►  │  Main  │  │
│  │     Thread       │                   │ Thread │  │
│  │                  │                   │  (Tk)  │  │
│  └──────┬───────────┘                   └────────┘  │
│         │                                           │
│         │ multiprocessing.Queue                     │
│         │                                           │
│  ┌──────┴───────────────────────────────────┐       │
│  │    Worker Processes (separate PIDs)      │       │
│  │  ┌──────┐  ┌──────┐  ┌──────┐  ┌──────┐  │       │
│  │  │  W1  │  │  W2  │  │  W3  │  │  W4  │  │       │
│  │  └──────┘  └──────┘  └──────┘  └──────┘  │       │
│  └──────────────────────────────────────────┘       │
└─────────────────────────────────────────────────────┘
```
Why this three-layer design?
- The worker processes bypass the GIL, giving us real parallelism for CPU-bound work.
- The coordinator thread bridges the process world and the thread world. It blocks on `multiprocessing.Queue.get()`—an IO-bound wait during which the GIL is released.
- The main thread never blocks, keeping the UI responsive.
The Complete Application
Below is the full script. Save it as pi_estimator.py and run it from CMD with python pi_estimator.py. We’ll walk through each piece afterward.
```python
import tkinter as tk
import threading
import queue
import random
import os
import time
from multiprocessing import Process, Queue as MPQueue

BATCH_SIZE = 100_000
NUM_WORKERS = 4

def pi_worker(result_queue, stop_event_flag):
    pid = os.getpid()
    while not stop_event_flag.is_set():
        inside = 0
        for _ in range(BATCH_SIZE):
            x, y = random.random(), random.random()
            if x * x + y * y <= 1.0:
                inside += 1
        result_queue.put((pid, inside, BATCH_SIZE))

def coordinator(mp_queue, ui_queue, stop_event, mp_stop_flag):
    workers = []
    for _ in range(NUM_WORKERS):
        p = Process(target=pi_worker, args=(mp_queue, mp_stop_flag))
        p.start()
        workers.append(p)

    total_inside = 0
    total_points = 0
    while not stop_event.is_set():
        try:
            pid, inside, count = mp_queue.get(timeout=0.2)
            total_inside += inside
            total_points += count
            pi_est = 4 * total_inside / total_points
            ui_queue.put((pi_est, total_points, len(workers)))
        except Exception:
            pass  # timeout expired — loop back and re-check stop_event

    mp_stop_flag.set()
    for p in workers:
        p.join(timeout=3)
        if p.is_alive():
            p.terminate()

class PiEstimatorApp:
    def __init__(self, root):
        self.root = root
        self.root.title("Live π Estimator")
        self.root.resizable(False, False)

        self.pi_label = tk.Label(root, text="π ≈ ???", font=("Consolas", 28))
        self.pi_label.pack(padx=30, pady=(20, 5))

        self.info_label = tk.Label(root, text="Points: 0 | Workers: 0",
                                   font=("Consolas", 12))
        self.info_label.pack(padx=30, pady=(0, 10))

        self.status_label = tk.Label(root, text="Status: Idle",
                                     font=("Consolas", 10), fg="gray")
        self.status_label.pack(padx=30, pady=(0, 5))

        self.button = tk.Button(root, text="Start", font=("Consolas", 14),
                                command=self.toggle, width=12)
        self.button.pack(pady=(5, 20))

        self.ui_queue = queue.Queue()
        self.stop_event = threading.Event()
        self.mp_stop_flag = None
        self.coord_thread = None
        self.running = False

        self.root.protocol("WM_DELETE_WINDOW", self.on_close)

    def toggle(self):
        if self.running:
            self.stop()
        else:
            self.start()

    def start(self):
        self.running = True
        self.button.config(text="Stop")
        self.status_label.config(text="Status: Running", fg="green")
        self.stop_event.clear()

        mp_queue = MPQueue()
        self.mp_stop_flag = MPEvent()

        self.coord_thread = threading.Thread(
            target=coordinator,
            args=(mp_queue, self.ui_queue, self.stop_event, self.mp_stop_flag),
            daemon=True,
        )
        self.coord_thread.start()
        self.poll_queue()

    def stop(self):
        self.running = False
        self.stop_event.set()
        self.button.config(text="Start")
        self.status_label.config(text="Status: Idle", fg="gray")

    def poll_queue(self):
        while not self.ui_queue.empty():
            try:
                pi_est, total_points, num_workers = self.ui_queue.get_nowait()
                self.pi_label.config(text=f"π ≈ {pi_est:.8f}")
                self.info_label.config(
                    text=f"Points: {total_points:,} | Workers: {num_workers}")
            except queue.Empty:
                break
        if self.running:
            self.root.after(100, self.poll_queue)

    def on_close(self):
        self.stop_event.set()
        if self.mp_stop_flag:
            self.mp_stop_flag.set()
        self.root.destroy()

if __name__ == "__main__":
    from multiprocessing import Event as MPEvent
    root = tk.Tk()
    app = PiEstimatorApp(root)
    root.mainloop()
```

The if __name__ == "__main__": Guard
On Windows, multiprocessing uses spawn to create child processes—it starts a fresh Python interpreter and re-imports your script. Without the guard, the child would try to create a TKinter window, which would try to spawn more workers, and so on. The guard ensures only the original process builds the GUI.
The `from multiprocessing import Event as MPEvent` line sits inside the guard but still works: it binds the name in the module’s globals before the GUI starts, so `start()` can reach it at call time. Moving it to the top of the file alongside the other imports would be equally valid—what actually protects child processes is that the window-creating code itself lives inside the guard.
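You can check which start method your platform uses—a quick sketch (per the `multiprocessing` documentation, `spawn` is the default on Windows and macOS, `fork` on most Linux builds):

```python
import multiprocessing

# 'spawn' starts a fresh interpreter and re-imports your script;
# 'fork' clones the parent process (Unix only).
method = multiprocessing.get_start_method()
print(f"Default start method here: {method}")

# With spawn, any top-level code in your script runs again in every child —
# which is exactly why the GUI setup must live behind the __main__ guard.
```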
Walking Through the Code
pi_worker is the function each child process runs. It loops continuously, generating BATCH_SIZE random points per iteration, counting how many land inside the unit circle, and putting the result tuple (pid, inside, count) onto the multiprocessing.Queue. It checks stop_event_flag (a multiprocessing.Event) each iteration to know when to stop.
coordinator runs in a background thread. It spawns NUM_WORKERS child processes, then enters its own loop: read a result from mp_queue, update running totals, compute the current π estimate, and push it onto ui_queue (a queue.Queue). The timeout=0.2 on mp_queue.get() ensures we don’t block forever—if no result arrives within 200ms, we loop back and check stop_event. On shutdown, it signals the workers via mp_stop_flag and joins them, with a terminate() fallback for stubborn processes.
PiEstimatorApp is the GUI. The constructor builds the window, creates the queues and events, and registers on_close to handle the window’s X button cleanly.
- `start()` clears the stop event, creates fresh queues, launches the coordinator thread, and kicks off `poll_queue`.
- `stop()` sets the stop event, which propagates through the coordinator to the workers.
- `poll_queue()` drains the UI queue and updates labels. It reschedules itself with `root.after(100, self.poll_queue)` as long as the app is running—this is the heartbeat that keeps the display updating.
- `on_close()` ensures everything shuts down when the user closes the window.
Running It
Save the complete script as `pi_estimator.py` and run from CMD:

```
python pi_estimator.py
```

You should see a window with a large “π ≈ ???” label. Click Start. The estimate begins updating live, converging toward 3.14159265… as millions of points are sampled. Click Stop to pause, Start again to resume (with fresh workers). Close the window to exit cleanly.
Try changing NUM_WORKERS and BATCH_SIZE at the top of the script. More workers = faster convergence (up to your CPU core count). Larger batches = less queue overhead but less frequent UI updates. Finding the sweet spot is part of the fun.
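If you’d rather not hard-code the worker count, one common tweak (a hypothetical variation, not part of the script above) is to derive it from the machine, leaving one core free for the UI and coordinator:

```python
import os

# Use all but one core, but never fewer than one worker.
# os.cpu_count() can return None on exotic platforms, hence the fallback.
NUM_WORKERS = max(1, (os.cpu_count() or 2) - 1)
print(f"Spawning {NUM_WORKERS} workers")
```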
Summary
We’ve covered a lot of ground. Here’s the cheat sheet:
| Concept | What It Does | Python |
|---|---|---|
| Thread | Worker inside a process (shared memory) | `threading.Thread` |
| Race condition | Two threads touching the same data | Fix with `threading.Lock` |
| GIL | Only one thread runs Python at a time | Bypass with `multiprocessing` |
| `queue.Queue` | Thread-safe mailbox | `queue.Queue` |
| `multiprocessing.Queue` | Process-safe mailbox (pickled) | `multiprocessing.Queue` |
| `threading.Event` | Simple stop/go flag for threads | `threading.Event` |
| `root.after()` | Schedule work on TKinter’s main thread | Bridge between threads and UI |
The capstone demonstrated the architecture pattern that shows up everywhere in real software:
- UI thread stays responsive (never blocks).
- Coordinator thread manages workers and bridges communication.
- Worker processes do the heavy lifting in parallel.
This same pattern—an event loop polling for results from background workers—is the foundation of client-server architectures. In Lecture 3, we’ll take communication to the next level: instead of queues within one machine, we’ll send messages over the network between separate processes. And in Lecture 5, we’ll see how async/await replaces threading for IO-bound server work.
Exercises & Project Ideas
Additional Resources
- threading documentation
- queue documentation
- multiprocessing documentation
- TKinter documentation
- PEP 703 — Making the GIL Optional
Next: Lecture 3 — Interprocess Communication and Sockets, where we move beyond queues and learn how separate processes communicate over the network.