OpenAI published a working paper on how people use ChatGPT. It includes a visualisation that triggered my curiosity.

The width of the columns represents the cumulative proportion of the main category, and the height of each sub-category represents the proportion of the sub-category within the main category.
The first problem is that the percentages of the main categories do not add up to 100.

| Main Category | Percentage |
|---|---|
| Multimedia | 6.00 |
| Other / Unknown | 4.60 |
| Practical Guidance | 28.30 |
| Seeking Information | 21.30 |
| Self-Expression | 4.30 |
| Technical Help | 7.50 |
| Writing | 28.10 |
| Grand Total | 100.10 |
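
The total in the last row is easy to reproduce. The following is a minimal sketch that simply sums the values from the table above; the variable names are my own, and the figures are the ones read off the chart, not data taken from the paper itself.

```python
# Main-category percentages as listed in the table above.
main_categories = {
    "Multimedia": 6.0,
    "Other / Unknown": 4.6,
    "Practical Guidance": 28.3,
    "Seeking Information": 21.3,
    "Self-Expression": 4.3,
    "Technical Help": 7.5,
    "Writing": 28.1,
}

# Round to two decimals to avoid floating-point noise in the sum.
total = round(sum(main_categories.values()), 2)
excess = round(total - 100, 2)

print(f"Total: {total:.2f}, excess over 100: {excess:+.2f}")
# prints "Total: 100.10, excess over 100: +0.10"
```

An excess of 0.1 is small enough to be a rounding effect, but it means the printed labels cannot all be exact.
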

The percentages of the sub-categories, however, do add up to 100. Next, I measured the width of the columns to check whether the chart actually draws them to scale.

I then (ironically) used ChatGPT to extract the numerical values from the graph.

Next, I divided the width of the column by the percentage it is supposed to represent.

| Main Category | Calculated Percentage | Column Width | Column Width Ratio |
|---|---|---|---|
| Multimedia | 5.9 | 43.0 | 7.3 |
| Other / Unknown | 4.5 | 32.5 | 7.2 |
| Practical Guidance | 28.3 | 208.5 | 7.4 |
| Seeking Information | 21.3 | 157.0 | 7.4 |
| Self-Expression | 4.3 | 30.0 | 7.0 |
| Technical Help | 7.6 | 54.0 | 7.1 |
| Writing | 28.1 | 207.5 | 7.4 |
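
The ratio column can be recomputed from the other two. This is a rough sketch using the values from the table above; the names `measurements`, `ratios` and `spread_pct` are illustrative, and the widths are in whatever unit the measurement used (the table does not state one). If the chart were drawn exactly to scale, every ratio would be identical.

```python
# (calculated percentage, measured column width) per main category,
# copied from the table above.
measurements = {
    "Multimedia": (5.9, 43.0),
    "Other / Unknown": (4.5, 32.5),
    "Practical Guidance": (28.3, 208.5),
    "Seeking Information": (21.3, 157.0),
    "Self-Expression": (4.3, 30.0),
    "Technical Help": (7.6, 54.0),
    "Writing": (28.1, 207.5),
}

# Width per percentage point for each column.
ratios = {name: width / pct for name, (pct, width) in measurements.items()}

# Relative spread between the smallest and the largest ratio.
lowest, highest = min(ratios.values()), max(ratios.values())
spread_pct = (highest - lowest) / lowest * 100

for name, ratio in ratios.items():
    print(f"{name:<20} {ratio:.1f}")
print(f"Spread between narrowest and widest scale: {spread_pct:.1f}%")
```

The narrowest scale belongs to Self-Expression and the widest to Writing, so the smallest columns are the ones drawn slightly too thin.
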

While the column widths approximate the main category percentages, the scale still varies considerably: the width per percentage point ranges from 7.0 to 7.4.

So, how can we do this better? The answer is a good old area graph. It scales each box according to the percentage and groups the boxes into the main categories by color.

A new type of visualization is not always a better visualization.
