Review of Hex
Table of Contents
Hex Review⌗
It’s Easter Sunday. After a long day of chocolate mogging and stealing eggs from small children, I’ve decided to check out Hex which calls itself the “The AI Analytics Platform for your whole team”. I’m going to toss a small csv into it, if it does well I might hit it with a 20GB Parquet file.
I’m not going to connect a data warehouse to it, I’m going to treat this as the classic “There’s a dashboard, but I need to download the underlying data and re-manipulate”
I signed up for a free trial and started off with a prompt and some data.
- I have not read a single guide, looked at any template or read any docs
- I have no idea how Hex works on the backend
- I’m just going to figure it out as a I go and bumble my way through and self-onboard
The data⌗
I gave it a ~740 line csv with the historical March Madness data
| Season | Round | Team | Odds | Result |
|---|---|---|---|---|
| 2026 | 1 | TCU | 120 | 1 |
| 2026 | 1 | Ohio State | -142 | 0 |
| 2026 | 1 | Troy | 650 | 0 |
| 2026 | 1 | Nebraska | -1000 | 1 |
The Prompt⌗
I’ve added the prompt below but the TLDR is we want to evaluate the “accuracy” of pre-game odds for March Madness from a CSV containing 5 years of historical data.
Click to view the entire prompt word for word
I have captured historical odds on NCAA march madness tournaments. The CSV contains data with the year, round, team, odds (in american odds) and result (1 for win 0 for loss). The data is not perfectly clean, some of the odds formatting may be wrong or missing, not all years have the correct number of games.
I’d like to visualize the accuracy of these predictions on a discrete vent over the years have been. Use something like brier scores and/or log loss to do this. If you have a better way feel free to do that as well. Split this by round as well as there are many blowouts or “easy” to call games in the beginning of the tournament.
I’d also like to measure the competitiveness of these games over the years. Have the odds gotten closer together on avg over the years, how is this changes if we look by round?
Dropping in⌗
After issuing the prompt and attaching the csv, I immediately find myself in a “project” or, more descriptively, a “notebook style environment”. I’m prompted to give my project a name and watch as some LLM gets to work on the right side of my screen. I can give the project a status and category, which is probably nice as I could see these things getting unwieldy pretty fast if you had 20-30 people doing “analysis”.

The first cell⌗
There’s a collapsible cell with a SELECT * of my data displayed as a table. The column structure is there, and I can actually create a new column with a formula that can reference existing columns if I want. If you’ve read my blog you’ll know I don’t like SELECT *, but I have no idea if this is auto-materialized for all new projects (will test later).
I can re-write the SQL to adjust the data I’ve materialized, but I can also manually filter it via the UI. Applying the filter updates the “relation” (I don’t really know what to call this yet), and it also looks like it updates visualizations/cells that depend on the relation (The LLM seems has been hacking away below).

Importantly, comments can be added, responded, reacted, and resolved in a cell. Maybe more importantly? Cells can be triggered in DAG fashion with downstream cells taking dependencies. This is good. It’s kind of feeling like Marimo which was recently acquired by CoreWeave. Although hex is not open source and you don’t really own your notebooks they live in cloud land.
Speaking of DAGs there’s this graph view which is nice. the graph shows me that Hex has decided to SELECT * from my csv, written some pandas transformations, and then created three charts. I can’t figure out how to pop the graph view out (Full screen) or change the size… thus the blurry image. In general, every square inch of my screen is packed with features.

The LLM & Context⌗
Moving over to the right pane there’s a sort of chat window with an LLM. The chat window can host multiple threads with conversations >90 days being deleted. Unlike the graph view, I can make this pane basically eat the entire screen.

I quickly find out who’s baking this cookie (claude), but it won’t share system prompts/instructions. After a bit of back and forth, Claude tells me I should be using Guides which exists inside “context studio”, which seems to be exactly how context is managed in Hex. This made me a little nervous at first as it seemed you’d need to embed/maintain context in Hex but they allow connections from outside sources (i.e. context in github updated via github actions) If relevant context is haphazardly stored in a bunch of people’s brains this is still of little use though.
But wtf is the compute here⌗
So now I’m curious how this SQL is getting executed. I ask in my little Claude thread and the weights and biases tell me “it’s querying the uploaded OddsMM.csv file directly using the embedded DuckDB engine”. DuckDB… tight. I can confirm this going into the SQL Cells Introduction. It also seems that you can parameterize SQL using Jinja which is a huge win in my book. It also looks like it integrates well with dbt Cloud which is solid. Dbt or something like it is pretty simply how modern data teams should be authoring transformations, there’s just not much of a question at this point.

Opening the “Environment” on the left tool box shows I’m using a Medium instance with 8GB of memory and 4CPUs. They have a beefy 128GB instance, but I’m getting the feeling that if you need this inside of Hex, you may also need a better data model. I’ve also got access to python 3.9 -> 3.12 and a bunch of python packages come pre-installed (no polars… which is big sad) but it seems you can just add to your venv with uv.
Other staff in the left most toolbox⌗
There’s a full revision history with version checkpoints which is very nice. There’s a large space dedicated to a search feature which seems to be able to kind of grep through your current workspace. The variables section stores all the built in variables as well as anything we create in our python code and dataframes we create via SQL. There’s also a data browser which seems to be acting as a data catalogue for assets inside the workspace. Opening up the column types inferred from my csv: Team & Odds are labelled as having type “object”… which is a bit weird, but I guess they are strings… so arrays… so… objects? Idk I’m stretching here for what that doesn’t say “varchar”.
Back in the middle⌗
Coming back to center stage with our data/assets, there’s a nice little chart telling us what is taking the longest inside of our “DAG”.

Kind of a nice little feature rather than setting up timing on some of these sprawling notebooks.
Continuing on, Claude has written a little pandas to create some new data-frames based on the aggregations I requested. These data-frames are available for view in the data browser. These have become materialized relations (I’m pretty sure) that I can query with SQL in a new cell and are also referenced in the visualizations.
Visualizations⌗
Speaking of visualizations, Hex seems to be using Vega after they performed some sort of acqui-hire of the VegaFusion maintainer. The visualizations seem fine and responsive. There’s a suite of UI tools to edit them a bit (both style and substance). You get access to the generated SQL, which seems to build the relation required for each one.

From the screenshot you can see accuracy_by_season_round is referenced which was created by some pandas in the cell above and exposed in the data browser. I’d like to have full control of my charts as code… and it seems like I can easily do that. I created a cell that references that same dataframe, but creates a basic chart with plotly and it works.

Make an… App?⌗
So it seems like the main other thing you can do is switch out of notebook mode and make an “app”. I turned my analysis into an app which seems to expose the visualizations aa an endpoint, and they can refresh on a schedule. You can export it, take snapshots, turn it into a presentation. These feel like interactive dashboards that you might expose to an end user rather than tossing them a notebook and saying “run this” and look at cell 8.
These apps can be added to a “library” that contains published knowledge that can be “endorsed”. I don’t know how this solves for people just endorsing a bunch of stuff. I don’t think there’s a re-endorsement system where you have to be like “yeah this is still good”… but all of this manual tagging typically loses meaning. It’s like people flagging workloads as “business critical” to get access to resources faster… it’s a short time before everything is “business critical”. Some kind of continuous mechanism to confirm endorsement might be nice (Or just annoying enough that people would disregard it… its a hard problem admittedly).
Competitors⌗
It looks like DeepNote offers something similar, but I’ve never used it.
Marimo is a great notebook based analytics tool that’s open source (recently CoreWeave is in the picture)
Overall Thoughts⌗
I’m impressed. It’s a well thought out product. It’s definitely much more than just a “Claude wrapper”, and they are clearly trying to integrate things that modern data teams need. For a quick proof of concept on a personal project I could easily see myself tossing some data in here with some prompting/context as long as I had access to the elevated features of a “Team Plan”. I do think this is geared more toward enterprise, as it’s way overkill for a hobby project, and I can pretty much get all the features I’d want out of this with Claude Code and a little know-how. But they know this and can cater to the “Yeah we don’t want to maintain that but want that functionality” crowd.
I’m always a little weary of notebook development for large teams, but I think there’s some much broader problems about how modern work is performed that our well out of the scope of Hex. How do you stop people from building data assets that they say describe the “same” thing but end up with a different result. This is a perennial problem that Hex would be miles better at reducing then people hand waiving Jupyter notebooks and pivot tables. And holy FUCK can we please realize that there are better things out their than just raw, isolated, local notebook style data wrangling, visualization, and presentation they do NOT scale… use Hex.
I’m trying to think of other headwinds here. I mean, this is A LOT of product. I’m familiar with these concepts, but if the ideal state is to drop this in front of 20 PMs and say “Hey you can make your won dashboards and data assets now” there would be a significant amount of churn/upskilling where as a Data Science could probably get on board with this very quickly. You still need to solve for upstream, Hex isn’t going to fix your broken medallion architecture and redundant/stale tables. I also have no idea if you can throw 60 GBs at this and have it work through it, but that might also be missing the point.
Further exploration⌗
Smash a 20 GB Kalshi Parquet file and seee what happens.
Can she handle TB scale?
I’m not sure if this is the intended audience or how this product is actually used… like it could very well be that the majority of people are just hooking up their cleaned, tabular, aggregated data-warehouse layer with a bunch of context.