Geoff Ruddock

Embed markdown documentation into your Airflow DAGs

Why you should do it

I recently discovered that Apache Airflow allows you to embed markdown documentation directly into the Web UI. This is a very neat feature, because it lets you keep your documentation as close as possible to the thing it describes, rather than hiding it away in some Google Doc or Confluence wiki. This, in turn, increases the chance that it is actually read, rather than being promptly forgotten and never discovered by new team members.

Screenshot of documentation in Airflow UI

How to do it

To make your markdown visible in the Web UI, simply assign a markdown string to the doc_md attribute of your DAG, e.g. dag.doc_md = "My documentation here". That said, I generally put the docs in a string variable at the top of the file, and then assign it further down. This way, it serves the dual purpose of also providing context to anyone editing the DAG definition file itself.

Example code

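from airflow import DAG

# Markdown shown in the Web UI; it also documents this file for anyone editing it
docs = """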
## DAG Name

#### Purpose

This DAG connects data from one source to another,
performs necessary transformations,
and creates a set of tables that can be used by analysts.

#### Outputs

This pipeline produces the following output tables:

- `table_A` – Contains useful information about ABC.
- `table_B` – Contains useful information about XYZ.

#### Owner

For any questions or concerns, please contact 
[me@mycompany.com](mailto:me@mycompany.com).
"""

# "my_dag" is a placeholder dag_id; pass your usual DAG arguments here
with DAG(dag_id="my_dag") as dag:
    dag.doc_md = docs
