Applying Machine Learning for Automatic Product Categorization
The North American Product Classification System (NAPCS) is a comprehensive, hierarchical classification system for products (goods and services) that is consistent across the three North American countries. It also promotes improvements in the identification and classification of service products across international classification systems, such as the Central Product Classification System of the United Nations.
Every five years, the U.S. Census Bureau conducts an economic census, providing official benchmark measures of American business and the economy. Beginning in 2017, the economic census will use NAPCS to produce economy-wide product tabulations. Respondents are asked to report data from a long, pre-specified list of potential products in a given industry, with some lists containing more than 50 potential products. Many of the more than 1,200 NAPCS codes are complex and ambiguous. Businesses have expressed a desire to supply Universal Product Codes (UPCs) to the U.S. Census Bureau instead, since they already store these codes in their databases.
This research considers the text classification problem of predicting NAPCS classification codes from UPC product descriptions. We present a supervised learning method for automating this product classification step of the Economic Census.
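To make the problem setup concrete, the following is a minimal sketch of supervised text classification that maps short product descriptions to category codes. The abstract does not name the model used, so the multinomial naive Bayes classifier here is an assumption chosen for illustration only. The class name, the training descriptions, and the labels (CODE_DAIRY, CODE_ELECTRONICS) are hypothetical placeholders, not real UPC descriptions or NAPCS codes.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a product description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    An illustrative stand-in: the abstract does not specify the model.
    """

    def fit(self, descriptions, labels):
        self.label_counts = Counter(labels)       # documents per label
        self.token_counts = defaultdict(Counter)  # token frequencies per label
        self.vocabulary = set()
        for text, label in zip(descriptions, labels):
            tokens = tokenize(text)
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, description):
        tokens = tokenize(description)
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods per token.
            score = math.log(doc_count / total_docs)
            label_total = sum(self.token_counts[label].values())
            for token in tokens:
                count = self.token_counts[label][token]
                score += math.log((count + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training pairs: (product description, placeholder code).
TRAIN = [
    ("organic whole milk 1 gal", "CODE_DAIRY"),
    ("cheddar cheese block 8 oz", "CODE_DAIRY"),
    ("greek yogurt vanilla 32 oz", "CODE_DAIRY"),
    ("aa alkaline batteries 8 pack", "CODE_ELECTRONICS"),
    ("usb charging cable 6 ft", "CODE_ELECTRONICS"),
    ("wireless mouse optical usb", "CODE_ELECTRONICS"),
]

model = NaiveBayesTextClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)

print(model.predict("low fat milk 1 gal"))
print(model.predict("usb wall charger"))
```

In practice, a setup like this would be trained on a large set of labeled descriptions and evaluated on held-out data. With more than 1,200 possible codes, per-code accuracy would likely matter as much as the overall rate.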
Reference:
STS10-002
Session:
Machine Learning in statistical production
Presenter/s:
Andrea Roberson
Presentation type:
Oral presentation
Room:
MANS
Chair:
Diego KUONEN, Statoo Consulting & Geneva School of Economics and Management, University of Geneva, Switzerland
Date:
Thursday, 14 March
Time:
14:30 - 15:30
Session times:
14:30 - 15:30