What's up with __init__.py and __main__.py?
I have noticed that a lot of articles showing how to create a Python application use __init__.py in the root folder of the application. Some also use __main__.py in addition to the actual Python script that is executed to run the application.
Since Python does not enforce a folder structure for applications, it is easy for those new to Python to get really confused about it all.
Let's dig into what these files do and how they should be used.
Application folder structure
A lot of the posts I have seen have been for applications in the Data Science space. These were Python apps written to be executed from the command line without being installed as a package into your Python environment.
Thus, the majority of these applications would be executed from the command line as follows:
$ python awesome.py
The most common folder structure for these applications looks something like this:
|-- AwesomeApp
    |-- __init__.py        <-- Why is this here?
    |-- __main__.py        <-- What does this do?
    |-- awesome.py
    |-- readme.md
    |-- requirements.txt
    |-- some_module.py
    |-- package
        |-- __init__.py
        |-- package_module.py
As you can see in the above folder structure, it is the __init__.py and __main__.py files that normally raised questions for me.
There are a whole bunch of ways you can structure your Python applications depending on their use and how you would like your users to install and run them.
Again, we are looking at the above structure because it is often used with __init__.py at the root level, not because it is the correct folder structure for your project. The correct structure depends on what type of application you want to create and how you want to distribute it.
If you would like to read more about Python project structures, then have a look at these two links.
The Hitchhiker's Guide to Python - Structuring Your Project
Real Python - Python Application Layouts
__init__.py
So, let's have a look at __init__.py. As per the Python documentation:
The __init__.py files are required to make Python treat directories containing the file as packages. This prevents directories with a common name, such as string, unintentionally hiding valid modules that occur later on the module search path. In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable, described later.
You can read the full detail here in the official documentation.
Python Documentation - Packages
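To make the quoted description concrete, here is a minimal sketch. It assumes a throwaway package named demo_pkg with a greetings module (both names are made up for illustration and are not part of AwesomeApp), built in a temporary folder so the example is self-contained. Its __init__.py runs a line of initialization code and sets __all__.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a tiny package called demo_pkg (a hypothetical name) under root."""
    pkg = root / "demo_pkg"
    pkg.mkdir()
    # __init__.py is executed once, the first time the package is imported.
    (pkg / "__init__.py").write_text(
        "from .greetings import hello\n"
        "__all__ = ['hello']  # names exported by 'from demo_pkg import *'\n"
        "INITIALISED = True   # stand-in for real initialization code\n"
    )
    (pkg / "greetings.py").write_text(
        "def hello(name):\n"
        "    return f'Hello, {name}!'\n"
    )


def import_demo_package():
    """Build the package in a temporary folder and import it from there."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_demo_package(root)
        sys.path.insert(0, str(root))  # make the folder importable
        try:
            importlib.invalidate_caches()
            return importlib.import_module("demo_pkg")
        finally:
            sys.path.remove(str(root))


demo_pkg = import_demo_package()
print(demo_pkg.INITIALISED)     # set by the code in __init__.py
print(demo_pkg.__all__)         # the export list defined in __init__.py
print(demo_pkg.hello("world"))  # re-exported from demo_pkg.greetings
```

The point of the sketch is only that the code in __init__.py runs at import time; a real package would be installed or placed on the module search path rather than generated on the fly.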
In our example project structure, the top level __init__.py
file would only be valid if AwesomeApp was intended to be imported as a package by other applications. Since this is not the case it is unnecessary to have the top level __init__.py
file there. It seems this is a mostly harmless habit some people develop. Try to avoid it if you can as it is not the correct use of __init__.py
.
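As a rough check of that claim, the sketch below rebuilds the example layout without the top level __init__.py and runs awesome.py the same way a user would. The file contents are invented for illustration (the original article does not show them); only the file names come from the folder structure above.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents; only the file names come from the example layout.
FILES = {
    "awesome.py": (
        "import some_module\n"
        "from package import package_module\n"
        "print(some_module.NAME, package_module.NAME)\n"
    ),
    "some_module.py": "NAME = 'some_module'\n",
    "package/__init__.py": "",  # the sub-package keeps its __init__.py
    "package/package_module.py": "NAME = 'package_module'\n",
}


def run_awesome() -> str:
    """Create AwesomeApp without a top level __init__.py and run awesome.py."""
    with tempfile.TemporaryDirectory() as tmp:
        app = Path(tmp) / "AwesomeApp"
        for relative_path, content in FILES.items():
            target = app / relative_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content)
        # Equivalent to running "python awesome.py" inside the AwesomeApp folder.
        result = subprocess.run(
            [sys.executable, "awesome.py"],
            cwd=app,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


output = run_awesome()
print(output)
```

The imports resolve because Python puts the folder containing the executed script at the front of the module search path, so the root folder itself never needs to be a package.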
For some additional information around Python packages and the use of __init__.py, have a look at the following two links.
Real Python - Python Modules and Packages
The Hitchhiker's Guide to Python - Packages
__main__.py
Now let's move on to __main__.py. This file also has a specific purpose in the context of Python packages. When a package is installed in your Python environment and you run it with Python's -m option, Python looks for the package's __main__.py file and executes the code in it.
For example, if AwesomeApp was a package installed into our Python environment (which it is not, but go with me on this one), we could run the following on the command line, which would then execute the code in __main__.py:
$ python -m AwesomeApp
Since our current app is not designed to be installed as a Python package, it also does not make a lot of sense to have __main__.py in the root level folder.
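The -m behaviour described above can be tried out with a small sketch. It assumes a made-up package called awesome_pkg, and instead of installing it, the sketch puts its parent folder on PYTHONPATH, which is enough for Python to find the package.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_package_with_dash_m() -> str:
    """Create a package with a __main__.py and run it via "python -m"."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "awesome_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        # This is the code Python executes for "python -m awesome_pkg".
        (pkg / "__main__.py").write_text(
            "print('running', __package__, __name__)\n"
        )
        # Assumption: PYTHONPATH stands in for a real installation here.
        env = {**os.environ, "PYTHONPATH": tmp}
        result = subprocess.run(
            [sys.executable, "-m", "awesome_pkg"],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


output = run_package_with_dash_m()
print(output)
```

Inside __main__.py the module name is "__main__", just as it is in a script run directly, while __package__ holds the package name.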
You can read a bit more about __main__.py at the following links.
Real Python - How to Publish an Open-Source Python Package to PyPI
Python Documentation - Top-level script environment
Conclusion
We have looked at the use of __init__.py and __main__.py in the context of a Python application that is not designed to be installed or used as a Python package but as a standalone application.
In this context, the use of both __init__.py and __main__.py can cause some confusion, as they are primarily used when creating Python packages.
The folder structure we looked at, which seems common in a lot of blog posts, is also not optimal for developing Python packages or applications that are to be distributed via PyPI or to be installed manually via setuptools etc.
Python packages and distributing applications are complex topics that we will look at in future blog posts.