Python has many strong points and tends to have a learning curve that is friendly to beginners. The import system, however, is one part of the language which can be confusing. There are a few possible sources of confusion:
- Certain import statements which work for modules inside packages do not work for modules outside of packages, and vice versa.
- Modules run as scripts have some special properties which can affect imports.
- Lack of familiarity with the Python package mechanism in general.
- Some of the import syntax does not generalize in the way one might first expect it does.
- Old documentation can come up in searches. (For example, Python 2 had something called implicit relative import which Python 3 does not have. If you’re still using Python 2, disable implicit relative imports by using from __future__ import absolute_import as the first import.)
This article only discusses modern Python imports (Python versions >3.0). Basic familiarity with the concepts of files and directories/folders is assumed. Before imports are discussed Python modules and packages will be briefly reviewed.
1. Review of Python modules and packages
At some point most non-trivial projects become large enough that it makes sense to separate the code into separate files. In Python those separate files are called modules. The use of modules can make the codebase easier to understand and, as the name implies, more modular. Imports are used to allow code in one module to access and use code from other modules.
The Python module and package system is closely tied to the directories and files in the underlying filesystem. Understanding the correspondence is important in understanding the Python import system. The basic correspondence can be summarized as follows:
A file containing Python code is always a module and vice versa. The filename
of a module should end with the .py
extension. The name of a Python module
is the same as its filename without the .py
extension (except for modules run
directly as scripts; those are generally not imported and are always named
__main__
, see Section 5). All Python programs are composed of
one or more modules.
Packages are collections of modules organized in a certain way:
- Any Python module which is inside a directory which contains an __init__.py file (while that directory's parent directory does not) is by definition part of the package associated with that __init__.py file. The __init__.py file may or may not be an empty file. The name of the package is the same as the name of the directory. (Formally, packages are just a special kind of module, but in this context packages and modules will be considered to be distinct.)
- Subpackages are defined similarly to packages, as subdirectories of a package directory which also contain an __init__.py file. Subpackages can have their own subpackages, and so forth. They are members of the package corresponding to the top-level directory of their subtree of directories containing __init__.py files.
A module with no __init__.py
in its directory is not part of any package or
subpackage (excepting namespace packages, an advanced topic which is briefly
discussed later).
Consider this example directory structure, which will be used throughout the article:
my_project
│
├── bin
│   └── my_script.py
│
└── src
    ├── my_standalone_module.py
    │
    └── my_package
        ├── __init__.py
        ├── foo.py
        ├── bar.py
        │
        └── my_subpackage
            ├── __init__.py
            └── baz.py
This is the skeleton of a project called my_project
which contains one script
called my_script
in its bin
directory and a package called my_package
in
its src
directory. (Some people prefer not to have a separate src
directory, but there are some good reasons to include it.) There
is also a module named my_standalone_module
in the src
directory.
The __init__.py
files in the directories are essentially the modules for the
corresponding packages or subpackages. When a module is imported its code
is run to initialize it. When a package or subpackage is
imported its __init__.py
is implicitly imported and run.
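The role of __init__.py can be checked with a small experiment. The sketch below builds a throwaway copy of my_package in a temporary directory (the file contents, and the use of a temporary directory, are assumptions made only for illustration) and then imports it:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of part of the skeleton (contents are illustrative).
src = Path(tempfile.mkdtemp())
package_dir = src / "my_package"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("init_var = 'set in __init__.py'\n")
(package_dir / "foo.py").write_text("foo_var = 1\n")

sys.path.insert(0, str(src))    # the parent of the package directory
importlib.invalidate_caches()

import my_package               # runs my_package/__init__.py

init_file = Path(my_package.__file__).name
print(init_file)                        # __init__.py
print(my_package.init_var)              # set in __init__.py
print(hasattr(my_package, "foo"))       # False: foo.py was not imported
```

The module object for the package is backed by the __init__.py file, and importing the package alone does not import the modules inside it.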
A package’s namespace is by definition the namespace of the __init__.py
module in its directory. This includes any names which are explicitly imported
into the __init__.py
file. Whenever a subpackage of a package is imported
its name is automatically added to the namespace of the parent package (i.e. to
the namespace of that package’s __init__.py
).
Subpackages are imported (and their __init__.py
files are run) when they are
either 1) explicitly imported or 2) automatically imported just before a module
or subpackage contained within that subpackage is imported. As noted above,
the module
object representing the subpackage is also added to the namespace
of the package or subpackage that imports it (under its subpackage name).
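As a rough illustration of this behavior, the following sketch recreates the package part of the skeleton in a temporary directory (an assumption for the demo, with empty or minimal file contents) and looks at what importing baz does to the parent package:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate my_package/my_subpackage/baz.py in a temporary src directory.
src = Path(tempfile.mkdtemp())
subpackage_dir = src / "my_package" / "my_subpackage"
subpackage_dir.mkdir(parents=True)
(src / "my_package" / "__init__.py").write_text("")
(subpackage_dir / "__init__.py").write_text("")
(subpackage_dir / "baz.py").write_text("baz_var = 3\n")
sys.path.insert(0, str(src))
importlib.invalidate_caches()

import my_package
before = hasattr(my_package, "my_subpackage")
print(before)    # False: the subpackage has not been imported yet

import my_package.my_subpackage.baz   # imports the subpackage on the way down
after = hasattr(my_package, "my_subpackage")
print(after)     # True: the name was added to the parent package's namespace

imported = sorted(name for name in sys.modules if name.startswith("my_package"))
print(imported)
```

The final list contains my_package, my_package.my_subpackage, and my_package.my_subpackage.baz, showing that every package along the way was imported.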
It is easy to underestimate __init__.py
files, since they are often empty
files, but they are quite important as far as how Python packages work. The
top-level namespace of a package constitutes its main application-programmer
interface (API). Names which should be exposed by that API need to be imported
into the __init__.py
file.
Python import
statements always contain a specifier for a package or module
to import. Equivalently, they always contain a specifier for the corresponding
file or a directory in the filesystem. Remember that while Python’s import
statements never use the .py
file extension for naming modules, other than
that the names of modules, packages, and subpackages generally correspond
directly with filesystem objects (files and directories) and their filesystem names.
2. The Python path-search list: sys.path
The sys.path
list is the root of all imports in Python (excepting built-in modules such as sys, which are compiled into the interpreter rather than found on disk). This list tells the Python import system where to look for packages
and modules to import. It is just a list containing directory pathnames,
represented as strings.
All standard, non-library imports have the sys.path
list at their root: A
standalone module cannot be imported if its containing directory is not on
the sys.path
list, and a package cannot be imported if the parent
directory of its top-level directory (the parent of the top directory
containing an __init__.py
file) is not on the sys.path
list. Note
that when external packages are installed with pip
or similar programs they
are placed in the system site-packages
directory, which is on sys.path
by
default. That is what allows them to be discovered and imported.
Ordering in the sys.path
list is important: The first match found in the list
is the one that is used. The paths themselves are strings which can represent
relative or absolute pathnames for the underlying operating system. Any
relative pathnames in sys.path
(such as ".."
) are interpreted relative to
Python’s current working directory (CWD). The CWD is initially set to the
command shell’s notion of current directory, i.e., the directory you are in
when you invoke the python
command. The Python CWD can be changed by calls to
os.chdir()
.
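The "first match wins" rule can be seen with two directories that each contain a module of the same name. This is only a sketch: the module name shadow_demo and both temporary directories are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two directories, each holding a different module called shadow_demo.
first_dir = Path(tempfile.mkdtemp())
second_dir = Path(tempfile.mkdtemp())
(first_dir / "shadow_demo.py").write_text("origin = 'first'\n")
(second_dir / "shadow_demo.py").write_text("origin = 'second'\n")

# Put both at the front of sys.path, first_dir ahead of second_dir.
sys.path[:0] = [str(first_dir), str(second_dir)]
importlib.invalidate_caches()

import shadow_demo
print(shadow_demo.origin)   # first
```

The copy in second_dir is never consulted, which is how an unintended module can silently shadow the one you meant to import.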
Directories can be added to the initial sys.path
list from command shells
like Bash by setting the PYTHONPATH
environment variable before invoking the
python
command. The PYTHONPATH
environment variable should contain a
colon-separated string of the pathnames to be added (semicolon-separated on Windows). While this has its uses,
it is usually not the recommended way to initialize sys.path
.
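To see the effect of PYTHONPATH without touching your own shell, a child interpreter can be started with the variable set. The sketch below assumes nothing beyond the standard library; the extra directory is an empty temporary directory created for the demo.

```python
import os
import subprocess
import sys
import tempfile

extra_dir = tempfile.mkdtemp()
env = dict(os.environ, PYTHONPATH=extra_dir)

# Ask a child interpreter to print its initial sys.path, one entry per line.
child = subprocess.run(
    [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
    env=env, capture_output=True, text=True, check=True,
)
child_sys_path = [entry for entry in child.stdout.splitlines() if entry]

found = any(
    os.path.samefile(entry, extra_dir)
    for entry in child_sys_path
    if os.path.exists(entry)
)
print(found)   # True
```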
The command to import a package which is located in a directory on the
sys.path
list is simple: just import the package name (which is the name of
its top-level directory). Similarly, to import a non-package module located in
a directory on sys.path
just import the module’s name (which is the filename
leaving off the .py
extension). For example, suppose the path to directory
my_project/src
in the skeleton project above is in the sys.path
list. Then
the following imports work:
import my_package
import my_standalone_module
The first statement imports the package my_package
in the directory of that
same name, and the second statement imports the module my_standalone_module
with code located in the file my_standalone_module.py
. The same imports can
be done with a single statement, although that style of import is not generally recommended:
import my_package, my_standalone_module # Same as above two imports.
What is actually being imported here are two module
objects, one representing
the package my_package
and the other representing the module my_standalone_module
.
For example, if you run str(type(my_package))
after the above import the
result is "<class 'module'>"
.
All the names in the namespace of a package or module represented by a module
object are also attributes of that module
object (i.e., they are in its
__dict__
). This is what allows those attributes to be accessed directly from
the imported module objects. For example, suppose the __init__.py
of
my_package
defines the variable init_var
and my_standalone_module
defines
the variable my_standalone_module_var
. Then expressions like
my_package.init_var
and my_standalone_module.my_standalone_module_var
can
be used to access those variables in any module that makes the above imports.
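The same relationship holds for any module, so it can be checked with a standard-library module instead of the example project (math is used here only because it is always available):

```python
import math

print(type(math))                       # <class 'module'>

# Names in the module's namespace are entries in its __dict__ ...
same_object = math.__dict__["pi"] is math.pi
print(same_object)                      # True

# ... and vars() returns that same dictionary.
print(vars(math) is math.__dict__)      # True
```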
The as
keyword can optionally be used to rename an import under an alias:
import my_package as mp
import my_standalone_module as msm
import my_package as mp, my_standalone_module as msm # Same as above two.
The as
keyword can be used in any import statement to change the name under
which the import is saved in the local namespace. It only renames the local
reference, not the actual name of the module.
Python always keeps a cache of imported packages and modules as module
objects in the sys.modules
dict, keyed by the fully-qualified name of the
package or module. When an import statement is executed Python first looks in
that dict to see if the package or module has previously been imported. If so
it returns the previously-imported object. Otherwise it tries to import from
the filesystem. Re-importing a module requires the explicit use of the
reload
function from the Python importlib
library.
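A minimal sketch of the caching behavior follows. The module cache_demo and the CACHE_DEMO_RUNS environment variable are hypothetical; the variable is just a convenient counter of how many times the module's top-level code has run.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

# A module that counts how many times its top-level code runs.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "cache_demo.py").write_text(
    "import os\n"
    "os.environ['CACHE_DEMO_RUNS'] = str(int(os.environ['CACHE_DEMO_RUNS']) + 1)\n"
)
os.environ["CACHE_DEMO_RUNS"] = "0"
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import cache_demo
import cache_demo                      # served from sys.modules, code not re-run
runs_after_imports = int(os.environ["CACHE_DEMO_RUNS"])
print(runs_after_imports)                          # 1
print(sys.modules["cache_demo"] is cache_demo)     # True

importlib.reload(cache_demo)           # explicitly re-executes the module
runs_after_reload = int(os.environ["CACHE_DEMO_RUNS"])
print(runs_after_reload)                           # 2
```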
The from
statement can be used to import subpackages as well as to import
particular attributes defined in a package or module:
from my_package import init_var as iv, my_subpackage as msp
from my_standalone_module import my_standalone_module_var
The first of these statements imports the attribute init_var
from the package
namespace of my_package
, renaming it as iv
. It also imports the subpackage
my_subpackage
, renamed to msp
. The second statement imports the attribute
my_standalone_module_var
from my_standalone_module
with no renaming.
Imports using the from
keyword will be referred to as from
imports, and
imports without the from
keyword will be referred to as bare import
statements.
3. Absolute imports
We have already seen one kind of absolute import, which is the import of a
module or package from a directory on the sys.path
list. There is one more
kind of absolute import which has not yet been covered. These are used to
import modules and subpackages which are located inside packages. That kind of
import cannot be done correctly simply by placing the directory on sys.path
and then importing the module or subpackage. (In fact, a package directory or
subdirectory, i.e., a directory with an __init__.py
file, should never
appear in the sys.path
list. Doing that can introduce subtle bugs which can
be difficult to find. Only the parent directory of the top-level package
directory should ever appear in sys.path
.)
Absolute imports can always be used, in any Python module, regardless of
whether it is inside a package or outside of a package. Absolute imports
require that the directory containing either the top-level package directory
or the non-package module being imported be discoverable on the sys.path
list.
Absolute imports for modules inside packages use a dotted-path syntax. For example:
import my_package.foo
This statement imports the module foo
, located in the file
my_package/foo.py
. After this import the foo
module is accessible under
the name my_package.foo
. An as
keyword could have been used to create an
alias, if desired. The next subsection covers the syntax of these dotted paths
and their relation to the files and directories of the filesystem. Once dotted
paths are understood absolute imports will be much easier to discuss.
3.1. Absolute dotted paths and the filesystem
For any package which can be discovered by looking in the directories on the
sys.path
list there is a corresponding dotted path to specify modules
(files) and subpackages (subdirectories) located inside the package (inside the
package’s directory subtree). The slashes in operating-system pathnames are
essentially replaced with dots. These dotted paths are always relative to the
package’s top-level directory (i.e., the highest-level directory containing an
__init__.py
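file).
Here are some examples of the correspondence, based on the project skeleton above. The filesystem pathnames (relative to the src directory, assuming forward slashes) are given on the left, and the corresponding dotted paths are on the right:
my_package                          my_package
my_package/foo.py                   my_package.foo
my_package/my_subpackage            my_package.my_subpackage
my_package/my_subpackage/baz.py     my_package.my_subpackage.baz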
Note that the .py
extension is omitted, but other than that the
correspondence is fairly simple. In an import statement these dotted paths
always refer to objects on the filesystem.
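One way to confirm where a dotted path leads is to ask the import system itself with importlib.util.find_spec. The sketch below is run against a temporary copy of the skeleton with empty files (an assumption for the demo). For a package, the reported file is its __init__.py.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Temporary copy of the package part of the skeleton, with empty files.
src = Path(tempfile.mkdtemp())
subpackage_dir = src / "my_package" / "my_subpackage"
subpackage_dir.mkdir(parents=True)
for file_path in [
    src / "my_package" / "__init__.py",
    src / "my_package" / "foo.py",
    subpackage_dir / "__init__.py",
    subpackage_dir / "baz.py",
]:
    file_path.write_text("")
sys.path.insert(0, str(src))
importlib.invalidate_caches()

mapping = {}
for dotted in [
    "my_package",
    "my_package.foo",
    "my_package.my_subpackage",
    "my_package.my_subpackage.baz",
]:
    spec = importlib.util.find_spec(dotted)
    origin = Path(spec.origin).resolve().relative_to(src.resolve())
    mapping[dotted] = origin.as_posix()
    print(f"{dotted:32} -> {mapping[dotted]}")
```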
3.2. Absolute imports of subpackages and modules in packages
Now that dotted paths have been covered the discussion of importing modules
that are inside packages is fairly simple: just put the dotted path after the
import
or from
statement. The first component of the dotted path for an
absolute import is always the top-level package name (i.e., the name of the
top-level directory of the package subtree).
For package my_package
in the skeleton given earlier these are all valid bare
import
statements:
import my_package
import my_package.foo
import my_package.my_subpackage
import my_package.my_subpackage.baz
Each of these imports results in a module
object in the namespace which, when
used in an expression, syntactically matches the dotted path in the import
statement. The syntax looks the same but in an expression the dots are
attribute accesses on module
objects. For example, the second import does
not actually add anything to the namespace of the module doing the import. The
module for my_package
is already in the namespace due to the first import.
The second import just adds the module attribute foo
to the my_package
namespace so that my_package.foo
works in expressions.
This is a general property of bare import
statements: After a bare import
without an as
the dotted-path used to make the import is always usable in
Python expressions in the importing module. But in those expressions the dot
symbol represents attribute access, unlike in the import statement itself.
This is discussed further in the next subsection.
Python uses its sys.modules
cache for dotted-path imports, too. It goes down
the names on the dotted path and if it finds one that has not previously been
imported then it imports the remainder of the dotted path from the filesystem.
Any previously-imported packages or modules are taken from the cache.
Imports using from
also work for dotted paths. The imports below are all
valid imports from the example package my_package
. They correspond to the
imports above (except the first one above, which has no corresponding from
import). After a from
import, though, only the package or module following
the import
keyword is added to the namespace of the importing module (as a
reference to a module
object):
from my_package import foo
from my_package import my_subpackage
from my_package.my_subpackage import baz
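The difference in which name ends up in the importing namespace can be observed directly. This sketch uses the standard-library package xml rather than my_package so that it runs anywhere, and uses exec with a fresh dictionary to stand in for the namespace of an importing module:

```python
def names_bound_by(statement):
    """Run an import statement in an empty namespace and list the names it binds."""
    namespace = {}
    exec(statement, namespace)
    return sorted(name for name in namespace if not name.startswith("__"))

bare = names_bound_by("import xml.etree.ElementTree")
print(bare)        # ['xml']

from_form = names_bound_by("from xml.etree import ElementTree")
print(from_form)   # ['ElementTree']

aliased = names_bound_by("import xml.etree.ElementTree as ET")
print(aliased)     # ['ET']
```

The bare import binds only the top-level package name, the from import binds only the name after the import keyword, and an alias replaces whichever name would otherwise have been bound.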
Imports using from
can also be used to import particular attributes from
inside the namespaces of packages and modules. For example, if the namespace
of module foo
contains a variable foo_var
then that variable can be
imported with this statement:
from my_package.foo import foo_var
In fact, attributes inside package and module namespaces can only be imported
using a from
import statement, never with a bare import
statement. This is
discussed further in the next subsection.
3.3. Possible confusions in import syntax
One possibly-confusing aspect of Python imports is that the dot symbol is
overloaded in Python’s syntax. In Python expressions the dot is used for
attribute access, such as in my_class.my_attribute
. But in the dotted paths
of import statements the dot essentially means “subdirectory” and should be
thought of more as a “/” character in a pathname. Import statements are an
exception in that they are the only statements where the dot syntax means
something other than attribute access. In import statements the dot can only
be part of a dotted path.
Consider these valid import statements, assuming that foo_var
is a variable
assigned in module foo.py
:
from my_package import foo # Works.
import my_package.foo # Works.
After the second import above the subexpression my_package.foo
is definitely
usable in Python expressions. The subexpression my_package.foo.foo_var
is
too, because the initial module-scope attributes of foo
are created when it
is imported and initialized. The name foo_var
is then an attribute of the
module
object for foo
.
The first import above is essentially the same as the second one except that in the first one the module object for foo is bound directly to the name foo in the importing module, whereas the second one binds only the top-level name my_package.
Given the apparent pattern above the following may seem like it should work, but it is not allowed:
from my_package.foo import foo_var # Works.
import my_package.foo.foo_var # FAILS!
import my_package.foo.foo_var as fv # Also FAILS!
The first import works because from
imports are allowed to import attributes
from the namespaces of packages and modules. But the second import fails
because bare import
statements cannot be used to import attributes from the
namespaces of packages and modules. Bare import
statements can only be
passed dotted paths, and dotted paths correspond to files and directories in
the filesystem, not to attributes inside modules. Renaming doesn’t change
that, so the third import also fails. This holds even when the expression
my_package.foo.foo_var
is usable in Python expressions.
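The same failure can be reproduced with the standard library, which avoids needing the example project on disk. In this sketch json.decoder plays the role of my_package.foo and its class JSONDecoder plays the role of foo_var:

```python
from json.decoder import JSONDecoder    # Works: from can import an attribute.

failure = None
try:
    import json.decoder.JSONDecoder     # FAILS: JSONDecoder is not a module.
except ImportError as exc:
    failure = exc
    print("bare import failed:", type(exc).__name__)

# The expression form is still fine once the module itself is imported.
import json.decoder
print(json.decoder.JSONDecoder is JSONDecoder)   # True
```

The exception raised is ModuleNotFoundError, a subclass of ImportError, because Python looks for a module named JSONDecoder inside json.decoder and finds none.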
Another thing you cannot do is assign Python variables as aliases to dotted paths. So, while it seems like it would be convenient, this code does not work:
import my_package.foo as mpf # Works.
from mpf import foo_var # FAILS! Only dotted paths directly after from statements.
Although the attribute-access pattern of modules mimics the dotted-path
syntax, they are not the same thing. The variable mpf
is a reference
to the module
object for foo
. It cannot be substituted for a dotted path.
Since references to module objects cannot be used in import statements, the full dotted paths must always be entered. Relative dotted paths, covered in the next section, can simplify some cases of having to write out the full dotted paths.
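A short sketch of the alias restriction, again using json.decoder from the standard library in place of my_package.foo. The alias name json_decoder_alias is arbitrary and assumed not to be the name of any real module on sys.path:

```python
import json.decoder as json_decoder_alias    # Works.

failure = None
try:
    # FAILS: the alias is a variable, not a dotted path to a module.
    from json_decoder_alias import JSONDecoder
except ImportError as exc:
    failure = exc
    print("from-import via alias failed:", type(exc).__name__)

# Ordinary attribute access on the alias does work.
JSONDecoder = json_decoder_alias.JSONDecoder
print(JSONDecoder.__name__)   # JSONDecoder
```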
To avoid these possible confusions, remember that dotted paths in Python import
statements always refer to filesystem objects (either directories or .py
files). The first specifier in any import statement, whether a bare
import
or a from
import, can only be a dotted path.
4. Relative imports
In the previous section we saw that dotted paths in absolute import statements must always be typed out in full. In the case of intra-package imports — imports from subpackages and modules inside the same package — relative imports can often be used to simplify the dotted-path expressions. Keep in mind that relative imports are only allowed for intra-package imports; all other imports must be absolute imports.
Relative imports are to absolute imports as relative filename paths are to absolute filename paths. They allow for shortened expressions relative to another directory. First we will extend the definition of dotted paths to allow for relative dotted paths.
4.1. Relative dotted paths
A relative dotted path is similar to an absolute dotted path except that it always starts with a dot symbol. If you are familiar with relative paths in a shell such as Bash the syntax is similar.
Relative dotted paths have different meanings depending on the location of the module in which they occur. They are interpreted relative to the directory containing the module in which they occur:
- A single dot refers to the directory containing the module. It can occur alone or at the beginning of a longer dotted path. As examples, the following correspondences hold inside the foo module (located in directory my_package). The first two components on a line are equivalent filesystem paths relative to directory src/my_package, and the final one is the Python dotted path. (Note in the second line that while bar without the dot would be an equivalent relative pathname in a shell, as a relative dotted path the leading dot is required.)
- Two dots refer to the parent directory of the directory containing the module. The two dots can occur alone or at the beginning of a longer dotted path. The following correspondences hold inside the baz module (which is located in directory my_subpackage). The first two components on a line are equivalent filesystem paths relative to directory src/my_package/my_subpackage and the final one is the Python dotted path:
- Each additional dot goes up one more directory level.
Suppose there were another subpackage named sibling at the same level as my_subpackage. Then a module cousin in it could be referred to from baz by going up and then down, using the relative dotted path ..sibling.cousin (two dots to go up to my_package, then down into sibling).
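The standard library exposes the resolution rule for relative dotted paths as importlib.util.resolve_name, which makes it easy to experiment without creating any files. In the sketch below the second argument is the package containing the module in which the relative path appears: my_package for foo, and my_package.my_subpackage for baz.

```python
from importlib.util import resolve_name

# Relative dotted paths as seen from the foo module (package: my_package).
from_foo_dot = resolve_name(".", "my_package")
from_foo_bar = resolve_name(".bar", "my_package")
print(from_foo_dot)    # my_package
print(from_foo_bar)    # my_package.bar

# Relative dotted paths as seen from baz (package: my_package.my_subpackage).
from_baz_up = resolve_name("..", "my_package.my_subpackage")
from_baz_bar = resolve_name("..bar", "my_package.my_subpackage")
from_baz_cousin = resolve_name("..sibling.cousin", "my_package.my_subpackage")
print(from_baz_up)       # my_package
print(from_baz_bar)      # my_package.bar
print(from_baz_cousin)   # my_package.sibling.cousin

# Going above the top-level package is an error (the exception type has
# varied between Python versions, so both are caught here).
went_too_far = False
try:
    resolve_name("...bar", "my_package.my_subpackage")
except (ImportError, ValueError):
    went_too_far = True
print(went_too_far)      # True
```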
4.2. Relative imports
Now that relative dotted paths have been covered, relative imports are straightforward: just use a relative dotted path instead of an absolute dotted path (but remember that they are only allowed for intra-package imports).
There is another important restriction on relative imports: A relative dotted
path can only appear after a from
statement. It might seem like you
should be able to write imports such as import .bar
from the foo
module and
import ..bar
from the baz
module, but those are syntax errors. The reason this
is not allowed is that the relative dotted paths (such as .bar
) after the bare
import
statements are not valid Python names and therefore cannot be used in
Python expressions.
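That this restriction is enforced by the parser, before any import is attempted, can be checked by compiling the statements without running them:

```python
statements = [
    "import .bar",
    "import ..bar",
    "from . import bar",
    "from ..bar import bar_var",
]

is_valid = {}
for statement in statements:
    try:
        compile(statement, "<example>", "exec")   # parse only, nothing is imported
        is_valid[statement] = True
    except SyntaxError:
        is_valid[statement] = False
    print(f"{statement!r}: {'valid syntax' if is_valid[statement] else 'SyntaxError'}")
```

Only the two from forms are accepted.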
The following are all valid relative imports from the foo
module:
from . import bar
from .bar import bar_var
from . import my_subpackage
from .my_subpackage import baz
from .my_subpackage.baz import baz_var
These relative imports are all valid in the baz
module:
from .. import bar
from ..bar import bar_var
In addition to importing modules and subpackages, from
imports using only-dot
paths such as .
and ..
can also be used to import attributes from package
and subpackage namespaces (i.e., from __init__.py
namespaces). For example,
this import in module foo
would import the variable init_var
defined in
my_package/__init__.py
:
from . import init_var
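To try these relative imports out, the sketch below writes a temporary copy of my_package whose foo and baz modules use several of the statements above, and then imports foo from outside the package. The variable values are invented for the demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

files = {
    "my_package/__init__.py": "init_var = 'from __init__'\n",
    "my_package/bar.py": "bar_var = 'from bar'\n",
    "my_package/foo.py": (
        "from . import bar\n"
        "from .bar import bar_var\n"
        "from . import init_var\n"
        "from .my_subpackage.baz import baz_var\n"
    ),
    "my_package/my_subpackage/__init__.py": "",
    "my_package/my_subpackage/baz.py": (
        "from ..bar import bar_var\n"
        "baz_var = 'baz sees ' + bar_var\n"
    ),
}

src = Path(tempfile.mkdtemp())
for relative_path, text in files.items():
    path = src / relative_path
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)

sys.path.insert(0, str(src))
importlib.invalidate_caches()

import my_package.foo as foo

print(foo.bar_var)     # from bar
print(foo.init_var)    # from __init__
print(foo.baz_var)     # baz sees from bar
print(foo.bar is sys.modules["my_package.bar"])   # True
```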
5. Imports in scripts
A script is any Python module which is directly run by the Python
interpreter. This can be done from the command line with the python
command,
by clicking an icon, or via some other invocation method such as from a menu.
Python applications are usually started by running a Python module
as a script.
Scripts have a few unique properties not shared by other modules:
- The directory containing the script file is automatically inserted at sys.path[0] when the script is run by the Python interpreter. The absolute directory path is always added; the current working directory, in the shell or in Python, has no effect on this.
- The __name__ attribute of the script’s module is always set to "__main__" when it is run as a script, regardless of the file’s name.
- By default a script is not run as part of a package, even if there happens to be an __init__.py in its directory.
Property 1 allows a script to import any package or module which is located in its directory as an absolute, non-dotted import. This is helpful if the directory contains top-level packages or standalone modules that are intended to be imported. In some situations this can cause problems such as unintended imports due to name shadowing or modules inside packages being imported as if they were standalone modules.
Property 2 is what allows the use of this common idiom in Python scripts:
if __name__ == "__main__":
main() # A commonly-seen example, running function `main`.
Code in that conditional block only executes when the module is directly run as a script and not when the module is imported from another Python module (some modules are meant to be used both ways).
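The three script properties can be observed by writing a tiny script and running it in a child interpreter. This is a sketch under default interpreter settings; the script location is a temporary directory created for the demo.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

script_dir = Path(tempfile.mkdtemp())
script = script_dir / "my_script.py"
script.write_text(
    "import sys\n"
    "print(__name__)\n"
    "print(sys.path[0])\n"
    "print(repr(__package__))\n"
)

result = subprocess.run(
    [sys.executable, str(script)],
    capture_output=True, text=True, check=True,
)
name, path_zero, package = result.stdout.splitlines()

print(name)       # __main__
path_is_script_dir = os.path.realpath(path_zero) == os.path.realpath(script_dir)
print(path_is_script_dir)   # True
print(package)    # None
```

The script's module is named __main__ rather than my_script, its directory is first on sys.path, and it has no containing package.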
5.1. Scripts outside of packages
The standard idiom for Python scripts is that they should be located outside of packages. The script can then load any packages or modules it needs. There are some use cases for scripts inside packages, which will be covered in the subsection following this one.
The rule for imports in scripts located outside packages is simple: scripts
outside packages can only use absolute imports. Any absolute imports are
allowed, but of course modules inside packages should almost always be imported
as part of their package, using the dotted-path syntax relative to their
package root. In some cases it may be necessary to insert paths to
sys.path[1]
(after the current directory at sys.path[0]
) in order for
Python to discover the necessary modules and packages to import.
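As one possible way to do this for the project skeleton, my_script.py can compute the location of the src directory from its own __file__ and insert it at sys.path[1]. The sketch below builds a temporary copy of the project and runs the script in a child interpreter; the variable values and the choice to compute the path inside the script are assumptions for illustration, not the only approach.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

script_text = (
    "import sys\n"
    "from pathlib import Path\n"
    "# Locate my_project/src relative to this script in my_project/bin.\n"
    "src_dir = Path(__file__).resolve().parent.parent / 'src'\n"
    "sys.path.insert(1, str(src_dir))\n"
    "import my_package.foo\n"
    "import my_standalone_module\n"
    "print(my_package.foo.foo_var, my_standalone_module.my_standalone_module_var)\n"
)
files = {
    "bin/my_script.py": script_text,
    "src/my_standalone_module.py": "my_standalone_module_var = 'standalone'\n",
    "src/my_package/__init__.py": "",
    "src/my_package/foo.py": "foo_var = 'foo'\n",
}

project = Path(tempfile.mkdtemp()) / "my_project"
for relative_path, text in files.items():
    path = project / relative_path
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)

result = subprocess.run(
    [sys.executable, str(project / "bin" / "my_script.py")],
    capture_output=True, text=True, check=True,
)
script_output = result.stdout.strip()
print(script_output)   # foo standalone
```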
If you use a setup.py
for your project then scripts outside packages can be
added to a project
by using the scripts
keyword argument. For development this would involve
setting up the project with a setup.py
and then installing the project in
editable mode, such as by running pip install -e .
in the directory with
setup.py
. (The setup.py
file is usually placed in the project’s root
directory, which is my_project
in the project skeleton given earlier). This
provides a shell command, in the shell’s search path, to run the script. To
add or remove scripts from the project the setup.py
would have to be modified
and the package reinstalled. A similar thing can be done using the more-recent
pyproject.toml
files if you use that method to set up projects rather than
using a setup.py
.
5.2. Scripts inside packages
Scripts can also be run inside packages, but the special properties of scripts listed above have some side-effects which need to be taken into account.
Property 3 means that the package the script is inside of is not automatically
imported when the script runs. To import modules from the package the script
will by default use non-dotted absolute imports (based on Property 1, that the
directory is added to sys.path
). This only works correctly in simple cases
where the imported modules are essentially standalone modules themselves. Even
if the script itself imports the full package in the usual way, the running
script is still not correctly set up as a module of the package.
If the script does explicitly import its containing package then dotted
absolute imports from the package will work. But the script module itself
should never be imported by any other module in the package since it is cached
as the __main__
module by Property 2 and a double import will result.
To get around these problems and correctly run scripts inside packages what is needed is a way to automatically import the containing package and then run the script as a part of the package. There are several possible ways to do this:
- Invoke the script using python -m <fullyQualifiedName>, where <fullyQualifiedName> is the fully-qualified name of the module inside the package (i.e., its absolute dotted path). Note that the directory containing the top-level package directory must be in sys.path or the command will fail. You could write a shell script wrapper for the python command to calculate a PYTHONPATH and qualified name and then invoke python -m. Generally, though, the invocation method differs from that of other Python scripts.
- Set the __package__ attribute of the script, in the script, to the fully-qualified name and then import the package in the correct way. This is more complex than you might expect, but fortunately there is a package on PyPI which can do this for you automatically (and optionally also remove the directory’s sys.path entry).
- Create a setup.py file and create an entry point for the script via the console_scripts keyword. This is similar to the scripts keyword described above, but it allows modules inside packages to be run via an entry-point function. To add or remove entry points the setup.py file would have to be modified and the package reinstalled. This creates commands which are directly executable in the shell, under the command name specified in setup.py.
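The difference between the first option and running the file directly can be demonstrated with a module that uses a relative import. In this sketch a temporary my_package is created, its foo module is run both ways in child interpreters, and PYTHONPATH is used to make the package's parent directory discoverable (the file contents are invented for the demo).

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

src = Path(tempfile.mkdtemp())
package_dir = src / "my_package"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "bar.py").write_text("bar_var = 'from bar'\n")
(package_dir / "foo.py").write_text(
    "from .bar import bar_var\n"
    "if __name__ == '__main__':\n"
    "    print(__package__, bar_var)\n"
)

env = dict(os.environ, PYTHONPATH=str(src))

# Run as part of the package: the relative import works.
as_module = subprocess.run(
    [sys.executable, "-m", "my_package.foo"],
    env=env, capture_output=True, text=True,
)
module_output = as_module.stdout.strip()
print(module_output)                      # my_package from bar

# Run as a plain script: there is no parent package, so the import fails.
as_file = subprocess.run(
    [sys.executable, str(package_dir / "foo.py")],
    env=env, capture_output=True, text=True,
)
print(as_file.returncode != 0)            # True
print("ImportError" in as_file.stderr)    # True
```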
6. Not covered
This article has covered the basics of the Python import system, but some important topics have not been discussed. They tend to occur or be used in special or advanced cases.
Star imports: By default, the statement from my_module import *
imports
all the names in the my_module
namespace which do not start with the
underscore character. If __all__
is defined in my_module
as a list of
string variable names then *
-imports from the module will only import those
names. Anything else would need to be explicitly imported. The __all__
list
for an __init__.py
file can also contain the names of modules and subpackages
to import: a *
-import of the corresponding package or subpackage will then
implicitly perform the imports (which would need to be done explicitly in an
ordinary module).
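The effect of __all__ on star imports can be sketched without creating any files by registering a module object by hand. The module name star_demo and its contents are hypothetical, and exec with a fresh dictionary stands in for the namespace of the importing module.

```python
import sys
import types

# Create a module in memory and register it so that it can be imported.
star_demo = types.ModuleType("star_demo")
exec(
    "__all__ = ['public_a']\n"
    "public_a = 1\n"
    "public_b = 2\n"
    "_private = 3\n",
    star_demo.__dict__,
)
sys.modules["star_demo"] = star_demo

def star_import_names():
    """Return the names bound by a star import of star_demo."""
    namespace = {}
    exec("from star_demo import *", namespace)
    return sorted(name for name in namespace if not name.startswith("__"))

with_all = star_import_names()
print(with_all)       # ['public_a']

del star_demo.__all__
without_all = star_import_names()
print(without_all)    # ['public_a', 'public_b']
```

With __all__ defined only the listed name is imported; without it, every name not starting with an underscore is imported.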
Circular imports: This problem can arise when one module imports another module which then imports the first module again. The usual solution is to reorganize the module structure or to put the problematic import inside a function so it is not performed on module initialization. Circular imports are discussed in the answer to this Python FAQ question: “What are the ‘best practices’ for using import in a module?”
Namespace packages: Namespace packages allow one or more top-level
directories having the same directory name, but without __init__.py
files, to
function like a common namespace for all the modules and packages in all of
those directories which are discoverable on sys.path
. This can be useful for
large distributions, but there are also drawbacks such as the lack of
__init__.py
files. Most people should continue to use __init__.py
files
and create packages with a single top-level directory.
Dynamic import calls: Suppose you want to perform an import but you do
not know the name of the module to import until runtime. The functional
interface to the import command is called __import__
. It can take a string
argument, e.g., module_found_at_runtime =
__import__(runtime_calculated_name)
, where runtime_calculated_name
is a string.
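In practice the standard library function importlib.import_module is usually the more convenient way to do this, since for a dotted name it returns the module actually named, whereas __import__ returns the top-level package. A small sketch using json.decoder as the name computed at runtime:

```python
import importlib

runtime_calculated_name = "json.decoder"

# import_module returns the module that the dotted name refers to.
module_found_at_runtime = importlib.import_module(runtime_calculated_name)
print(module_found_at_runtime.__name__)   # json.decoder

# __import__ with a dotted name returns the top-level package instead.
top_level = __import__(runtime_calculated_name)
print(top_level.__name__)                 # json
print(top_level.decoder is module_found_at_runtime)   # True
```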
pth files: Pth files are special files which list, one per line, the pathnames of
additional directories to be added to sys.path. Pth files only take effect when they are placed
in a site directory such as the system site-packages
directory.
Importing from zip files: Python allows modules to be imported from
zipfiles, provided the .zip
archive file is located on sys.path
. The
directory structure in the zip file then acts as a regular directory.
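A minimal sketch of a zip import, assuming a hypothetical archive bundle.zip containing a small package zip_pkg:

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

# Create an archive whose internal layout looks like a package directory.
archive = Path(tempfile.mkdtemp()) / "bundle.zip"
with zipfile.ZipFile(archive, "w") as zip_file:
    zip_file.writestr("zip_pkg/__init__.py", "")
    zip_file.writestr("zip_pkg/inner.py", "inner_var = 'loaded from zip'\n")

sys.path.insert(0, str(archive))    # the .zip file itself goes on sys.path
importlib.invalidate_caches()

import zip_pkg.inner

print(zip_pkg.inner.inner_var)                       # loaded from zip
print(str(archive) in zip_pkg.inner.__file__)        # True
```

The module's __file__ points inside the archive, and dotted paths work exactly as they would for an ordinary directory.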
Lower-level APIs of the import system: The full Python import system is complicated and customizable. There are protocols to allow it to be dynamically modified in various ways for special applications.
7. Further reading
- The official Python documentation on imports and modules.
- A detailed guide to Python imports by Chris Yeh.
- An introduction to absolute vs. relative imports, including a discussion of formatting style. By Mbithe Nzomo.
- A discussion of some of the often subtle import traps which can arise, by Nick Coghlan.