Cleanup
The weakref module has a wonderful mechanism for performing cleanup actions: finalizer objects. The idea is to register a cleanup function which is called when some object is garbage collected, or when the application exits, without worrying about the lifetime of the function itself. I’ve used finalizer objects as a replacement for the ordinary atexit module, but they have 2 drawbacks:
- they bind the function call to the existence of an object: global cleanups require a global application object;
- you can’t easily call all the cleanup functions before atexit handlers run.
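For reference, here is a minimal sketch of the finalizer objects mentioned above, using only the documented weakref.finalize API. The Connection class and the log list are hypothetical names made up for this illustration.

```python
import weakref


class Connection:
    """Hypothetical resource; any weak-referenceable object works."""


log = []

conn = Connection()
# Register log.append("connection closed") to run when `conn` is garbage
# collected or when the interpreter exits, whichever happens first.
finalizer = weakref.finalize(conn, log.append, "connection closed")

# Calling the finalizer by hand runs the callback right away...
finalizer()
# ...and a finalizer runs at most once, so this second call does nothing.
finalizer()

del conn  # nothing left to run: the finalizer is already dead
print(log, finalizer.alive)
```

Note that the callback is tied to conn: without some object to hang it on, there is nothing to register against, which is the first drawback listed above. Setting finalizer.atexit = False would additionally stop a live finalizer from running at interpreter exit.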
The second drawback becomes painful when we’re dealing with non-daemonic threads.
(sidenote: Daemon threads borrow their name from Unix daemon processes.
They are threads which run in the background and aren’t automatically joined
by Python when the application exits. They are a kind of run-and-forget
threads.)
If we can tell a thread that it should stop, then we should do so before we
try to join it (either manually or automatically); otherwise we risk
unnecessarily delaying application exit. Timers are a good example: if we exit
an application while a timer is still waiting, then joining its thread means
waiting out the remaining time and then running the underlying action anyway.
When the application is quitting, it would be preferable to cancel the timer
first, but there’s no way to do that with the weakref or atexit modules.
(sidenote: The CPython implementation of the threading module provides an
undocumented _register_atexit function for running cleanup actions before
non-daemonic threads are joined.)
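The timer problem can be reproduced with a short sketch. It joins the timer thread by hand, which is the same join that Python performs automatically for non-daemonic threads at exit. The join_timer helper and its parameters are hypothetical names invented for this illustration.

```python
import threading
import time


def join_timer(cancel_first: bool, interval: float = 0.3):
    """Start a timer, optionally cancel it, then join its thread.

    Returns (seconds spent joining, whether the timer's action ran).
    """
    fired = threading.Event()
    timer = threading.Timer(interval, fired.set)
    timer.start()

    started = time.monotonic()
    if cancel_first:
        # cancel() wakes the waiting thread and makes it skip the action
        timer.cancel()
    # Without a prior cancel(), this blocks for the remaining interval
    timer.join()
    return time.monotonic() - started, fired.is_set()


for cancel_first in (False, True):
    waited, ran = join_timer(cancel_first)
    print(f"cancel_first={cancel_first}: joined after {waited:.2f}s, action ran: {ran}")
```

Without the cancel, the join takes roughly the whole interval and the action still runs. With it, the join returns almost immediately and the action is skipped.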
Custom Solution
Inspired by weakref’s finalizer objects, I wrote a custom class for lifetime management. It is part of one of my projects, kpsh (the implementation and tests are available under the terms of the GPL-3.0-or-later license).
The usage is simple: you can register any cleanup function by passing it and
its arguments to the cleanup() call. Without anything else, registered
functions are called in reverse order when the application quits, the same as
with the atexit module.
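The reverse ordering of plain atexit handlers can be checked with a small sketch. Since atexit handlers only run at interpreter exit, the example launches a child interpreter and captures its output. The SCRIPT string and the messages in it are made up for this illustration.

```python
import subprocess
import sys

# Child script: two handlers registered one after another.
SCRIPT = """
import atexit
atexit.register(print, "registered first")
atexit.register(print, "registered second")
print("main code done")
"""

result = subprocess.run(
    [sys.executable, "-c", SCRIPT],
    capture_output=True,
    text=True,
    check=True,
)
lines = result.stdout.splitlines()
print(lines)
```

The handler registered last runs first, so the child prints "main code done", then "registered second", then "registered first".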
Calling cleanup() returns a proxy object. Calling that object without any
arguments runs the stored function immediately, and the function won’t be
called again later. You may also cancel() the cleanup, in which case the
function is never called.
You may call all remaining cleanup actions (in reverse order) at any time by
running cleanup.run_all(). You can also wrap blocks of code in a context
manager, cleanup.clean_on_exit(), in which case run_all() is called
automatically when you exit the block. You can then register more cleanup
actions for later use.
import sys
from cleanup import cleanup

def print_on_exit(s: str, **kw):
    print(f"Cleanup: {s}", **kw)

def recursive_cleanup():
    cleanup(print_on_exit, "Recursive cleanup")

cleanup(print_on_exit, "Some cleanup")
cleanup(print_on_exit, "Cleanup on stderr", file=sys.stderr)
cleanup(recursive_cleanup)
c = cleanup(print_on_exit, "Cleanup which will be cancelled")
p = cleanup(print_on_exit, "Preemptive cleanup")

print("Application code")
c.cancel()
p()
cleanup.run_all()
print("-------------------------------------")

try:
    with cleanup.clean_on_exit():
        cleanup(print_on_exit, "Cleanup even though there's an exception")
        assert False
except AssertionError:
    print("Exception handler")
print("-------------------------------------")

cleanup(print_on_exit, "This will be called AFTER the traceback")
assert False, "The End"
The output of the above code is:
Application code
Cleanup: Preemptive cleanup
Cleanup: Recursive cleanup
Cleanup: Cleanup on stderr
Cleanup: Some cleanup
-------------------------------------
Cleanup: Cleanup even though there's an exception
Exception handler
-------------------------------------
Traceback (most recent call last):
  File "/home/mgoral/temp/cl/./bla.py", line 41, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/mgoral/temp/cl/./bla.py", line 39, in main
    assert False, "The End"
AssertionError: The End
Cleanup: This will be called AFTER the traceback
Source Code
# SPDX-License-Identifier: GPL-3.0-or-later
# Copyright (C) 2024 Michał Góral.

import sys
import atexit
import itertools
from threading import RLock
from contextlib import contextmanager


# Rough idea for function registry stolen from weakref.finalize implementation
class cleanup:
    __slots__ = ()
    _index_iter = itertools.count()
    _registry = {}
    _is_atexit = False
    lock = RLock()

    class _Info:
        __slots__ = ("fn", "args", "kwargs", "index")

    def __init__(self, fn, *args, **kwargs):
        # lazily hook run_all into atexit on first registration
        if not self._is_atexit:
            atexit.register(self.run_all)
            cleanup._is_atexit = True
        info = self._Info()
        info.fn = fn
        info.args = args
        info.kwargs = kwargs
        info.index = next(self._index_iter)
        with self.lock:
            self._registry[self] = info

    def __call__(self):
        # popping the entry guarantees that the function runs at most once
        with self.lock:
            info = self._registry.pop(self, None)
        if info:
            return info.fn(*info.args, **info.kwargs)
        return None

    def cancel(self):
        with self.lock:
            self._registry.pop(self, None)

    @classmethod
    @contextmanager
    def clean_on_exit(cls):
        try:
            yield
        finally:
            cls.run_all()

    @classmethod
    def run_all(cls):
        try:
            # theoretically cleanup actions may create new cleanups by
            # themselves, so we must handle this in an infinite loop
            while True:
                cleanups = cls._get_cleanups()
                if not cleanups:
                    break
                cl = cleanups.pop()
                try:
                    cl()
                except Exception:
                    sys.excepthook(*sys.exc_info())  # show exception on stderr
            with cls.lock:
                assert len(cls._registry) == 0
        finally:
            atexit.unregister(cls.run_all)
            cls._is_atexit = False

    @classmethod
    def _get_cleanups(cls):
        with cls.lock:
            lst = list(cls._registry.items())
        # ascending index: list.pop() takes the newest, so the oldest runs last
        lst.sort(key=lambda elem: elem[1].index)
        return [cl for cl, _ in lst]  # force retrieval of info via cleanup object