DRY Analytics

I was talking to one of our partner teams in a developing market. They are a newer analytics department of savvy Business Analysts & SCMs (Supply Chain Managers). Long story short, they are working on this new metric and I was like, How are you filtering for our specific products in your marketplace? And they were like: Oh we have this case statement that we include in all of our jobs…
Read more →

Wide Tables

A few considerations before building that extremely wide table. Every once in a while, an idea comes along to build a table with a wide array of columns at a particular granularity. This is usually associated with “self-serve analytics” and/or an attempt to define a “source of truth” table. Wide tables, self-serve analytics, and source of truth tables aren’t inherently good or bad… but there are some considerations to this type of table that I’m going to rant about.
Read more →

[6 / 4 = 1 = 1.5]: Type Coercion and Precision in SQL

An overview of what’s to come: I was originally writing this as a high-level introduction to send people when they encounter things like 6/4 returning 1 instead of 1.5 in some Database Management Systems (DBMS)… overtime its sprawled far beyond that. The points below serve as a TLDR, with a sprawling discussion expanding afterward. Operations like division in SQL can be confusing for novices or anyone without a programming background
Read more →

Temporary Tables

The database management system (DBMS) I’ll reference today is Redshift (AWS Cloud Datawarehouse offering) which is based on Postgres. I’ll dive into some of the anti-patterns around Temporary Tables I’ve seen abused during my work experience, why they don’t make sense, and how to fix them. Lets get it out of the way… Temporary Tables are not “better” than Common Table Expressions (CTEs) and CTEs are not “better” than temporary tables.
Read more →