MARS: A method for linking barcodes and stratifying products for price index calculation
The increased availability of electronic transaction data for the consumer price index (CPI) gives national statistical institutes (NSIs) the opportunity to improve the quality of index numbers. More refined methods can be applied that handle the dynamics of consumption patterns more appropriately than traditional fixed-basket methods. For instance, multilateral methods can be used to specify sales-based weights at the most detailed product level, and new products can be included directly in index calculations.
Electronic transaction or scanner data sets contain expenditures and quantities sold of items purchased by consumers at physical or online sales points of a retail chain. The sales data are often aggregated by retailers to a weekly level and are specified by the barcode or Global Trade Item Number (GTIN) of each individual item. Transaction data sets also contain characteristics, such as brand and package volume, of the items sold. While traditional price collection methods typically record prices of several tens of products in shops, electronic transaction data sets may contain several tens of thousands of items at the GTIN level for a single retail chain.
GTINs represent the most detailed product level in electronic transaction data sets. Each item has a unique barcode. In principle, this means that NSIs are given a set of tightly defined products. The ratio of monthly expenditure to quantity sold yields a transaction price, which can be followed for each product/GTIN from month to month. However, items may be removed from the market and reintroduced with modified packaging, for instance in order to fit within a retailer’s new product line. The quality characteristics of such “relaunch” items may remain the same, while the barcode changes after reintroduction and the price may differ from the price charged under the previous GTIN. The barcodes of the old and the reintroduced items have to be linked in order to capture the price changes that accompany such relaunches.
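The transaction price described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the record layout (GTIN, month, expenditure, quantity), the function name `unit_values` and the sample GTIN are all assumptions made for the example.

```python
from collections import defaultdict


def unit_values(records):
    """Return {(gtin, month): expenditure / quantity}.

    `records` is an iterable of (gtin, month, expenditure, quantity)
    tuples, e.g. weekly retailer deliveries already tagged with a month.
    This layout is hypothetical.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for gtin, month, expenditure, quantity in records:
        totals[(gtin, month)][0] += expenditure
        totals[(gtin, month)][1] += quantity
    # Skip GTIN/month pairs without sales: no transaction price exists.
    return {key: exp / qty for key, (exp, qty) in totals.items() if qty > 0}


# Made-up weekly records for one hypothetical GTIN.
records = [
    ("8710000000011", "2018-01", 200.0, 100),
    ("8710000000011", "2018-01", 100.0, 50),
    ("8710000000011", "2018-02", 330.0, 150),
]
prices = unit_values(records)
```

Summing expenditure and quantity before dividing gives a quantity-weighted monthly price, rather than an unweighted average of weekly prices.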
Typical market segments that are characterised by relaunches are pharmacy products, clothing and electronics. Rates of item churn may be so high that new product lines replace the former ones every year. The GTIN level is not appropriate as the product level in such situations. GTINs of relaunch items have to be linked, which means that broader product concepts are needed.
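One way to picture a broader product concept is to define a product by a combination of item characteristics instead of by GTIN, so that a relaunched item with a new barcode falls into the same stratum as its predecessor. The sketch below assumes this; the attribute names (`brand`, `volume_ml`), the GTIN labels and the figures are invented for illustration and are not taken from the paper.

```python
from collections import defaultdict


def stratum_unit_values(items, sales, attributes):
    """Unit values per (stratum, month), where a stratum is the tuple of
    the chosen attribute values.

    items: {gtin: {attribute: value}}
    sales: iterable of (gtin, month, expenditure, quantity)
    Both layouts are hypothetical.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for gtin, month, expenditure, quantity in sales:
        stratum = tuple(items[gtin][a] for a in attributes)
        totals[(stratum, month)][0] += expenditure
        totals[(stratum, month)][1] += quantity
    return {key: exp / qty for key, (exp, qty) in totals.items() if qty > 0}


# "A1" leaves the market after January; "B7" is its relaunch in February.
items = {
    "A1": {"brand": "BrandX", "volume_ml": 250},
    "B7": {"brand": "BrandX", "volume_ml": 250},
}
sales = [
    ("A1", "2018-01", 500.0, 200),
    ("B7", "2018-02", 600.0, 200),
]
prices = stratum_unit_values(items, sales, ["brand", "volume_ml"])
stratum = ("BrandX", 250)
price_ratio = prices[(stratum, "2018-02")] / prices[(stratum, "2018-01")]
```

At the GTIN level neither item has a price in both months, so no price change can be measured. At the stratum level the two barcodes are linked and the price increase at relaunch becomes visible.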
The problem of identifying “suitable” levels of product stratification has to be resolved before index methods are applied to calculate price movements from period to period. The key question is what constitutes a suitable level of product stratification and how this notion can be formalised and operationalised. In addition, the size of electronic data sets calls for a method that enables statistical agencies to automate the stratification process to a high degree.
This paper presents the method MARS, which stratifies products by balancing product homogeneity against the extent to which products can be followed over time. Including the latter measure makes it possible to identify relaunches and the corresponding price changes. Results for televisions and hair care are shown.
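The abstract does not state how MARS measures homogeneity or the extent to which products can be followed over time, nor how the two are combined. The sketch below is therefore purely illustrative: it assumes homogeneity is the quantity-weighted share of price variance explained by the strata, continuity is the share of current quantity sold in strata that also existed in a base period, and the two are multiplied. All function names and data are hypothetical.

```python
def homogeneity(prices, quantities, strata):
    """Assumed measure: share of quantity-weighted price variance that
    lies between strata (an R-squared-type quantity)."""
    total_q = sum(quantities)
    mean = sum(p * q for p, q in zip(prices, quantities)) / total_q
    total_var = sum(q * (p - mean) ** 2 for p, q in zip(prices, quantities))
    if total_var == 0:
        return 1.0  # all prices identical: any stratification is homogeneous
    groups = {}
    for p, q, s in zip(prices, quantities, strata):
        exp, qty = groups.get(s, (0.0, 0.0))
        groups[s] = (exp + p * q, qty + q)
    between = sum(qty * (exp / qty - mean) ** 2 for exp, qty in groups.values())
    return between / total_var


def continuity(base_strata, quantities, strata):
    """Assumed measure: share of current quantity sold in strata that
    were also present in the base period."""
    base = set(base_strata)
    matched = sum(q for q, s in zip(quantities, strata) if s in base)
    return matched / sum(quantities)


def balance_score(prices, quantities, strata, base_strata):
    """Assumed combination: product of the two measures."""
    return homogeneity(prices, quantities, strata) * continuity(
        base_strata, quantities, strata
    )


# Current period: three items whose GTINs all changed at a relaunch.
prices = [2.0, 2.2, 4.0]
quantities = [10, 10, 10]
gtins, brands = ["B1", "B2", "B3"], ["BrandX", "BrandX", "BrandY"]
base_gtins, base_brands = ["A1", "A2", "A3"], ["BrandX", "BrandX", "BrandY"]

gtin_score = balance_score(prices, quantities, gtins, base_gtins)
brand_score = balance_score(prices, quantities, brands, base_brands)
```

In this made-up case, stratifying by GTIN is perfectly homogeneous but no stratum survives the relaunch, so `gtin_score` is zero. Stratifying by brand gives up a little homogeneity in exchange for full continuity and scores higher, which is the kind of trade-off the method is described as balancing.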
Reference:
CPS11-002
Session:
Experimental Statistics
Presenter/s:
Antonio Chessa
Presentation type:
Oral presentation
Room:
JENK
Chair:
Martin Karlberg, Eurostat, Luxembourg
Date:
Thursday, 14 March
Time:
15:45 - 16:45
Session times:
15:45 - 16:45