Headset collects retail cannabis data via direct connections through the Point-of-Sale (POS). This data fuels the forecasts that power Headset Insights. This diagram gives an in-depth look at that process.
Headset Data Cleaning
We do not use every store in our data, and for some stores we do not use all of their sales. Instead, a representative sample of state-level sales is used. Here are some examples of how we transform store data:
- Large Retailers - a single large retailer (or chain of retailers) could skew sales mixes if it represented a large portion of sales in the Headset database. We are aware of this self-selection bias in our sample and as such deflate or exclude sales from large retailers to minimize the skew they have on overall sales.
- Outlier Retailers - some retailers are outliers and do not look at all like an "average" store. Their sales may also be excluded or deflated based on how unusual they appear compared to other stores in our database.
- Too Many Stores - having enough coverage to show an accurate picture of the market is paramount, but more data isn't necessarily better. In markets where our coverage is greater than 25%, we often exclude large amounts of sales data from informing Insights. We do this because we can get a fairly accurate read of the market with 25% of the data, and including more only increases how much data needs to be cleaned and cataloged without much additional benefit.
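To make the three adjustments above more concrete, here is a minimal Python sketch of how each one could work in principle. It is an illustration only, not Headset's actual method: the function names, the 10% share cap, and the median-absolute-deviation rule with a 3.5 cutoff are all assumptions made for the example. Only the 25% coverage figure comes from the text above.

```python
from statistics import median


def deflate_large_retailers(sales, max_share=0.10):
    """Return a weight in (0, 1] per retailer.

    One-pass rule (hypothetical): a retailer's weighted sales may not exceed
    max_share of the original, unweighted sample total.
    """
    total = sum(sales.values())
    cap = max_share * total
    weights = {}
    for retailer, amount in sales.items():
        # Retailers at or under the cap keep full weight.
        weights[retailer] = min(1.0, cap / amount) if amount > 0 else 1.0
    return weights


def flag_outliers(metric, threshold=3.5):
    """Return the retailers whose metric is far from the median.

    Distance is measured in median absolute deviations (MAD); the metric
    (for example average basket size) and the threshold are assumptions.
    """
    values = list(metric.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread in the data, so nothing can be called unusual.
        return set()
    return {r for r, v in metric.items() if abs(v - med) / mad > threshold}


def coverage_keep_fraction(sample_sales, market_sales, target=0.25):
    """Return the fraction of sample sales to keep so that coverage
    does not exceed the target share of the estimated market."""
    coverage = sample_sales / market_sales
    return min(1.0, target / coverage)
```

For example, with a sample covering 40% of an estimated market, `coverage_keep_fraction(400, 1000)` keeps about 62.5% of the sample's sales, while a sample at 10% coverage is kept in full. A production pipeline would likely choose which stores to drop rather than scale everything uniformly, so that the remaining sample stays representative.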