Circular Imports in Python

To understand circular imports, let's take a brief look at what happens when you import a module. When the interpreter reaches an import module_name statement, it carries out the following steps. (This is a rough version of what happens.)

a.) It looks up module_name in the module cache sys.modules (a dictionary, shared by the whole running program, that maps the names of already-imported modules to their module objects). If the module is in the cache, it is returned. Otherwise the interpreter searches for it; for ordinary source modules this means going through the directories listed in sys.path in order. If it's still not found, it bails with a ModuleNotFoundError; otherwise it continues with the following steps.

b.) It creates a module object (types.ModuleType) based on module_name.

c.) It then adds the module object to the module cache, before any of the module's code has run: sys.modules[module_name] = module_obj.

d.) Finally, it executes the found module's body in the module object's __dict__: exec(module_body, module_obj.__dict__, module_obj.__dict__).

Module attributes such as __file__, __name__, and __doc__ are also set on module_obj. Once the import finishes, the module's definitions can be used in the executing script's namespace as module_name.attribute.
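The steps above can be sketched in a few lines of Python. This is only a rough imitation under simplifying assumptions: simple_import and demo_mod are hypothetical names made up for the illustration, only plain .py files directly inside sys.path directories are considered, and the real import machinery does considerably more (packages, finders and loaders, bytecode caching, cleanup on failure).

```python
import os
import sys
import tempfile
import types


def simple_import(module_name):
    """Rough imitation of `import module_name` (hypothetical helper)."""
    # Step a: return the cached module if it has already been imported.
    if module_name in sys.modules:
        return sys.modules[module_name]

    # Step a (continued): look for module_name.py in each sys.path directory.
    for directory in sys.path:
        candidate = os.path.join(directory or ".", module_name + ".py")
        if os.path.isfile(candidate):
            break
    else:
        raise ModuleNotFoundError(f"No module named {module_name!r}")

    # Step b: create an empty module object.
    module_obj = types.ModuleType(module_name)
    module_obj.__file__ = candidate

    # Step c: cache the module *before* running its body.
    sys.modules[module_name] = module_obj

    # Step d: execute the module body in the module object's namespace.
    with open(candidate, encoding="utf-8") as source_file:
        code = compile(source_file.read(), candidate, "exec")
    exec(code, module_obj.__dict__)
    return module_obj


# Demo: create a throwaway module on disk and import it with the sketch.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo_mod.py"), "w", encoding="utf-8") as f:
    f.write("VALUE = 42\n")
sys.path.insert(0, demo_dir)

demo_mod = simple_import("demo_mod")
print(demo_mod.VALUE)                         # prints 42
print(simple_import("demo_mod") is demo_mod)  # prints True (served from the cache)
```

The detail that matters for circular imports is that step c happens before step d: the module object is visible in the cache while its body is still running.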

This whole process is carried out even for wildcard imports (from module_name import *) and partial imports (from module_name import definition1, definition2, ...); the only difference is that just the imported definitions are bound in the executing script's namespace. The name module_name itself is not bound, so the full module is not directly accessible as module_name, but it can still be reached through sys.modules[module_name]. (This implies partial imports are not meaningfully more efficient, in memory or speed, than full module imports.)
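A quick way to check this is with a standard-library module. The snippet below is a small illustration using textwrap, chosen arbitrarily; any importable module would do.

```python
import sys

from textwrap import dedent  # partial import: only `dedent` is bound here

# The whole module was still loaded and cached.
print("textwrap" in sys.modules)

# The module name itself was not bound by the partial import.
print("textwrap" in globals())

# The imported name is the very same object that lives in the cached module.
print(sys.modules["textwrap"].dedent is dedent)
```

Run as a standalone script, the first and third checks report True, while the second reports False because only dedent was bound.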

Suppose you have two files, a.py and b.py, structured as follows:

# -----a.py-----
import b

def a_func():
    print("a_func")

def a_func2():
    return b.b_func()

# -----b.py-----
import a

def b_func():
    return a.a_func()

(Note that the two modules import each other.)

If b.py is run as a script, it is registered in the module cache under the name __main__, not b. On reaching the import a line, module a is searched for and its body starts executing (steps a to d). Executing its body reaches the import b line; since the cache has no entry named b yet, b.py is loaded again, this time as module b, and its import a line is reached again. Instead of going into a recursive cycle, module a (though incompletely executed at this point) is already in the module cache (step c) and is pulled from there. This leads to the full execution of module b, switching control back to module a.

Module a is then completely executed, and the module object bound to the name a in module b's namespace is now complete with its definitions (i.e. a.a_func() can be used). Control finally returns to the originally executing script, b.py, which continues past its own import a line.

(A key point to note is that function bodies aren’t executed on module load. This is important as we’ll come to see.)
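The control flow just described can be traced with a small experiment. The sketch below writes the two modules to a temporary directory, with print calls added at the start and end of each file purely for tracing (they are not part of the original example), and runs b.py in a child interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# a.py from the example, plus tracing prints at the top and bottom.
A_SOURCE = '''\
print(f"start a as {__name__}")
import b

def a_func():
    print("a_func")

def a_func2():
    return b.b_func()

print(f"end a as {__name__}")
'''

# b.py from the example, plus tracing prints at the top and bottom.
B_SOURCE = '''\
print(f"start b as {__name__}")
import a

def b_func():
    return a.a_func()

print(f"end b as {__name__}")
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.py").write_text(A_SOURCE)
    Path(tmp, "b.py").write_text(B_SOURCE)
    # Run b.py as a script; its directory becomes the first sys.path entry.
    result = subprocess.run(
        [sys.executable, "b.py"], cwd=tmp, capture_output=True, text=True
    )

print(result.stdout)
# start b as __main__
# start a as a
# start b as b
# end b as b
# end a as a
# end b as __main__
```

The trace shows b.py being executed twice, once as the script (__main__) and once as module b, while a.py runs only once. No error occurs because nothing touches a.a_func or b.b_func while the modules are still loading.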

Consider instead the following structure:

# -----a.py-----
import b

def a_func():
    print("a_func")

# Other code a. (Assume no top-level dependencies with module b.)
...

# -----b.py-----
import a

print(a.a_func())

# Other code b. (Assume no top-level dependencies with module a.)
...

Now we have a top-level dependency: a.a_func() is a top-level call that executes as soon as b.py is loaded (it is not enclosed in, say, a function). Running b.py fails with an AttributeError caused by the circular import.

Let us observe the control flow to understand what happens: Running b.py reaches the import a line, searches for module a and starts executing its body (steps a to d). Executing its body gets to the import b line. The running script is cached as __main__ rather than b, so b.py is loaded again as module b and its import a line is reached again. Though module a has been added to the module cache (step c), it is incompletely executed at this point and doesn't have the a_func() definition yet. The print(a.a_func()) line is reached in module b and fails with an AttributeError due to the incompletely loaded module a.
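The failure can be reproduced with a short script. This is a minimal sketch that writes the two files to a temporary directory (leaving out the "Other code" placeholders) and runs b.py in a child interpreter; the exact wording of the error message varies between Python versions, so only the exception type is relied on here.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

A_SOURCE = '''\
import b

def a_func():
    print("a_func")
'''

B_SOURCE = '''\
import a

print(a.a_func())
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.py").write_text(A_SOURCE)
    Path(tmp, "b.py").write_text(B_SOURCE)
    result = subprocess.run(
        [sys.executable, "b.py"], cwd=tmp, capture_output=True, text=True
    )

# The child interpreter exits with an error; the last traceback line names it.
print(result.returncode != 0)
print(result.stderr.strip().splitlines()[-1])
```

The last line of the traceback is the AttributeError raised when module b looks up a_func on the half-initialized module a.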

One way to fix this is to move the import in a.py below the definition that b.py needs:

# -----a.py-----
# Define what b.py needs to complete execution---its top-level dependency.
def a_func():
    print("a_func")

# This ordering is important for reasons explained below.
import b

# Other code a. (Assume no top-level dependencies with module b.)
...

# -----b.py-----
import a

print(a.a_func())

# Other code b. (Assume no top-level dependencies with module a.)

Now when b.py is run and control reaches module b through a.py's import b line, module a already has the a_func() definition, so module b can complete its execution before module a has been fully executed. Recall that the segment marked "Other code a" still has to run before module a is fully executed and the module object a, as seen from module b, has all its definitions.
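The reordered version can be checked the same way. The sketch below writes the fixed files to a temporary directory (again without the "Other code" placeholders) and runs b.py in a child interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# a.py with the definition placed *before* the import of b.
A_SOURCE = '''\
def a_func():
    print("a_func")

import b
'''

B_SOURCE = '''\
import a

print(a.a_func())
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.py").write_text(A_SOURCE)
    Path(tmp, "b.py").write_text(B_SOURCE)
    result = subprocess.run(
        [sys.executable, "b.py"], cwd=tmp, capture_output=True, text=True
    )

print(result.returncode)  # 0: no AttributeError this time
print(result.stdout)
# a_func
# None
# a_func
# None
```

Two details in the output are worth noting. a_func() prints its message and returns None, which print() then displays. And the pair of lines appears twice because b.py's top-level code runs twice: once when a.py imports it as module b, and once as the script itself.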

An even better approach is to organize your modules so that no two of them depend on each other, for example by moving the shared definitions into a separate module that both can import. This approach is less error-prone and makes your projects easier to maintain.
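As one possible illustration of that restructuring, the sketch below moves a_func into a third module. The name helpers.py is hypothetical, and the right split for a real project depends on what the modules actually share; the point is only that the imports now flow in one direction.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# helpers.py (hypothetical): holds the definition both modules need.
HELPERS_SOURCE = '''\
def a_func():
    print("a_func")
'''

# a.py depends on helpers only; it no longer imports b.
A_SOURCE = '''\
import helpers

def a_func2():
    return helpers.a_func()
'''

# b.py depends on a and helpers; nothing imports b back.
B_SOURCE = '''\
import a
import helpers

print(helpers.a_func())
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "helpers.py").write_text(HELPERS_SOURCE)
    Path(tmp, "a.py").write_text(A_SOURCE)
    Path(tmp, "b.py").write_text(B_SOURCE)
    result = subprocess.run(
        [sys.executable, "b.py"], cwd=tmp, capture_output=True, text=True
    )

print(result.stdout)
# a_func
# None
```

With the cycle gone, b.py's top-level code runs exactly once, and the order of statements within each file no longer matters for the import to succeed.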