March 11, 2019

Calculating the bearing (angle) between coordinates in Redshift

I fielded an interesting request recently from our PR team, who wanted to generate a creative representation of our data based on the direction and distance of trips booked on our platform. Distance is a key attribute of interest for a travel business, so it is naturally easy to retrieve this data. However, the direction of a trip had not previously been analyzed, so it was not available off the shelf in our data warehouse. Read more
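To give a rough idea of what "direction of a trip" can mean numerically, here is a minimal Python sketch of the standard great-circle initial bearing formula. It is an illustration only, not the Redshift query from the post: the function name `initial_bearing` and its parameters are hypothetical, and it assumes coordinates in decimal degrees on a spherical Earth.

```python
import math


def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees (0 = north, 90 = east)
    when travelling from point 1 to point 2 along a great circle.

    Hypothetical helper: inputs are assumed to be decimal degrees.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dlon = math.radians(lon2 - lon1)

    # East-west and north-south components of the direction of travel.
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))

    # atan2 returns a value in (-180, 180]; shift it into [0, 360).
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
```

The same expression can in principle be written with the trigonometric functions of a SQL dialect, though the exact form used in the post is not shown in this excerpt.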

June 10, 2018

Redshift function of the week: RATIO_TO_REPORT

A common use-case: A very common scenario in data analysis is wanting a basic count of some event (such as visits, searches, or purchases) split by a single dimension (such as country, device, or marketing channel). Quite often this arises as an intermediate need while working towards some other primary task. Let’s work with a simple example: you’d like to get a rough sense of how many of your company’s orders come from each country. Read more
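As a minimal sketch of the count-and-share computation described above, here is the idea in plain Python rather than Redshift SQL. The function name `share_by_dimension` and the sample country codes are hypothetical and used only for illustration.

```python
from collections import Counter


def share_by_dimension(values):
    """Return {value: (count, share_of_total)} for an iterable of
    dimension values, e.g. one country code per order.

    Hypothetical helper; shares are fractions between 0 and 1.
    """
    counts = Counter(values)
    total = sum(counts.values())
    # most_common() orders the result from largest to smallest count.
    return {key: (n, n / total) for key, n in counts.most_common()}


# Made-up sample: one entry per order.
orders = ["DE", "DE", "FR", "US"]
result = share_by_dimension(orders)
```

Each count is divided by the grand total, which is the per-row ratio that a window function such as RATIO_TO_REPORT computes inside the warehouse.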

April 16, 2017

Jupyter Notebooks for Interactive SQL Exploration

I’m always hesitant to tell people that I work as a data scientist, partly because it’s too vague a job description to mean much, but also because it feels hubristic to use the job title “scientist” to describe work which does not necessarily involve the scientific method. Data is a collection of facts. Data, in general, is not the subject of study. Data about something in particular, such as physical phenomena or the human mind, provides the content of study. Read more

© Geoff Ruddock 2020