May 13, 2019

Embed markdown documentation directly into your Airflow DAGs

Why you should do it

I recently discovered that Apache Airflow allows you to embed markdown documentation directly into the Web UI. This is a very neat feature, because it enables you to keep your documentation as close as possible to the thing it describes, rather than hiding it away in some Google Doc or Confluence wiki. This, in turn, increases the chance that it is actually read, rather than being promptly forgotten and never discovered by new team members.
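As a minimal sketch of the idea (not taken from the full post), Airflow DAG objects expose a `doc_md` attribute whose markdown is rendered in the Web UI. The DAG id `sales`, the start date, and the documentation text below are hypothetical placeholders, and the Airflow import is deferred into a function only so the snippet loads on a machine without Airflow installed.

```python
from datetime import datetime

# Hypothetical documentation text; Airflow renders it as markdown in the Web UI.
SALES_DOC = """\
### Sales pipeline

Builds the `sales` table once a day.

- **Owner**: data engineering team (placeholder)
- **On failure**: clear the failed task and re-run it
"""


def build_documented_dag():
    """Return a DAG with SALES_DOC attached as its documentation."""
    # Deferred import: an assumption made for portability of this sketch.
    from airflow import DAG

    dag = DAG(dag_id="sales", start_date=datetime(2019, 1, 1))
    dag.doc_md = SALES_DOC  # attach the markdown to the DAG itself
    return dag
```

In a real DAG file you would likely assign the result at module level, for example `dag = build_documented_dag()`, so that the scheduler can discover it.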

April 15, 2019

Save entire webpages for reference with SingleFile

I’ve been reading through a lot of Tiago Forte’s writing on his members-only publication Praxis. Since reading his series on progressive summarization, I have become more conscientious about saving the “work-in-progress” artifacts of my thinking process to Evernote. Often this involves a link to a piece of content, a couple of highlights, and a bullet point or two about key takeaways.

The problem

It’s pretty easy to surface relevant notes using the Search function if I’ve added enough contextual info to the note, but less so if it’s just a link.

February 11, 2019

The best way to manage dependencies between DAGs in Airflow

Airflow provides a few different sensors and operators that enable you to coordinate scheduling between different DAGs, including ExternalTaskSensor, TriggerDagRunOperator, and SubDagOperator. Which one is the best to use? I have previously written about how to use ExternalTaskSensor in Airflow, but have since realized that it is not always the best tool for the job. Depending on your specific decision criteria, one of the other approaches may be more suitable for your problem.

January 21, 2019

Set dependencies between Airflow DAGs with ExternalTaskSensor

Problem

You are an analyst/data engineer/data scientist building a data processing pipeline in Airflow. Last week you wrote a job that performs all the necessary processing to build your sales table in the database. This week, you are building a customers table that aggregates data from your previous sales table. Should you add the necessary customers logic as a new task on the existing DAG, or should you create an entirely new DAG?
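For the second option, a rough sketch of what a separate customers DAG waiting on the sales DAG might look like is shown here. It assumes Airflow 2.x, where the sensor lives in `airflow.sensors.external_task` (other versions use a different module path). The DAG ids, task ids, and start date are hypothetical, and the imports are deferred into the function only so the snippet loads without Airflow installed.

```python
from datetime import datetime

# Hypothetical identifiers for the upstream job this DAG should wait on.
UPSTREAM = {
    "external_dag_id": "sales",
    "external_task_id": "build_sales_table",
}


def build_customers_dag():
    """Return a customers DAG whose first task waits for the sales task."""
    # Deferred imports; the module path below is the Airflow 2.x location.
    from airflow import DAG
    from airflow.sensors.external_task import ExternalTaskSensor

    dag = DAG(dag_id="customers", start_date=datetime(2019, 1, 1))

    # By default the sensor looks for an upstream run with the same logical
    # date, so this sketch assumes both DAGs run on the same schedule.
    ExternalTaskSensor(task_id="wait_for_sales", dag=dag, **UPSTREAM)
    return dag
```

The tasks that build the customers table would then be set downstream of `wait_for_sales`.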

June 5, 2017

Essential productivity apps for Mac users

Once a year I try to reevaluate my “personal tech stack” to see if I am using fundamental tools as effectively as possible. Not just bigger tools such as todo lists, calendars, and note-taking apps, but also the smaller utility apps that get used so frequently they blend into our daily work routine. Our fluency with the tools we use every day is the foundation of personal productivity, so it makes sense to optimize even small interactions such as switching between windows.

© Geoff Ruddock 2019