Learn how to prepare your data for a funnel chart.
Funnels are a simple way to see how your users are making their way through your product. They can help you focus your product efforts on places where users are dropping off the most.
For example, imagine you have two events tables. One is called orders, which logs a row every time a user orders something on your website. Another is called clicks, which logs a row for every click on your website. We want to understand how many users get through each step of the process, from clicking an item, to adding it to a cart, to completing an order. The goal is to identify the step(s) in the user journey where we're losing the most customers and take action to encourage users to continue to the next step (e.g. by re-designing a page or creating an email campaign for users at the drop-off step).
The simplest way to visualize a funnel is as a bar chart of a query whose output is a count of distinct users at each funnel step. In this post, I'm going to walk through how to prepare data for a simple funnel chart.
How to do this in SQL
1. Union the two tables.
2. Let's say you know that some of these users are test accounts created internally and want to filter those out. You need to join to another users table to get that information.
3. Lastly, we may only want to build the funnel for a subset of events. Let's filter to only the clicked_item, added_to_cart, and ordered_item events.
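Putting the three steps together, the query might look something like the sketch below. The exact table and column names (user_id, event_name, users.id, users.is_test_account) are assumptions for illustration; adapt them to your own schema.

```sql
-- Sketch of the funnel query: union the event tables,
-- filter out test accounts, keep only the funnel events,
-- and count distinct users at each step.
with events as (
  select user_id, event_name from clicks
  union all
  select user_id, event_name from orders
)
select
  e.event_name as funnel_step,
  count(distinct e.user_id) as users
from events e
join users u
  on u.id = e.user_id
where u.is_test_account = false
  and e.event_name in ('clicked_item', 'added_to_cart', 'ordered_item')
group by e.event_name;
```

The output of this query, one row per funnel step with a distinct user count, is exactly the shape a bar-chart funnel visualization expects.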
This works fine and is quick to write, but even in this simple example, there are some gotchas that aren't obvious. For example, how would a new analyst or product manager know that they need to filter out test account activity? And how would they know which table contains that information?
Also, what happens when we inevitably want to extend this analysis? Do we keep chucking new logic into this already hard-to-read query? Maybe there's a better way to abstract some of the logic out of this query so it can be re-used later.
Now that you've published these metrics to Supergrain, you can consume them from any tool by making an API request. The easiest way to query for metrics is to open an interactive query shell and run an SGQL query.
Using Supergrain requires more upfront steps than doing everything in raw SQL. These upfront steps effectively remove logical burden from every subsequent query, so if the logic you're representing here is important and frequently used, the time will be well worth it.
Comparing raw SQL and Supergrain
SQL has the advantage of being a familiar language for most users, so it's likely faster to get started with the SQL approach. But as most analysts know, managing a library of raw SQL queries quickly becomes difficult. Queries balloon as analyses get extended, the original authors leave the company, and before you know it, there are dozens of scattered hundred-line SQL queries that somehow power the whole business.
With Supergrain, business logic is defined centrally. This offers the following advantages.
Simpler end-user queries
Funnel metrics implemented using Supergrain move a lot of the core logic into the Supergrain Metrics Language (SML) layer. This means end-users go from copy-pasting the following SQL block:
to the following SGQL query:
In addition to fewer keystrokes, notice how the SGQL query operates at the level of metrics, not at lower-level details like table or column names. This also makes the queries more robust against table schema changes, since the mapping of metrics to tables only needs to be changed in the original SML grain.
All the work we invested in SML can be re-used in future queries. Let's say your users table has a country field. Assuming you generated a grain on that table, it's easy to pivot to a different analysis. For example, you could count users by country.
The filtering of test accounts in count_users is already taken care of in the SML layer, so query authors don't have to remember to add that logic every time they want to count users.
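For comparison, the raw-SQL equivalent of that pivot has to restate the test-account filter by hand. A minimal sketch, again assuming hypothetical users.id, country, and is_test_account columns:

```sql
-- Count users by country in raw SQL. Note the test-account
-- filter must be repeated here (and in every similar query);
-- with SML, it lives once in the count_users definition.
select
  country,
  count(distinct id) as users
from users
where is_test_account = false
group by country
order by users desc;
```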
The bottom line
Funnel metrics are a core component of every analyst's toolkit. It's easy and fast to build them in raw SQL, but for frequently used logic, it may be worth building them with a metrics framework like Supergrain's.