12:30 - 13:30
Poster Session
Room: Lunches Space
Using web scraped data to verify Egyptian consumer price indices
Mina Gerges (1, 2), El-sayed M. El-Horbaty (2)
1 CAPMAS, National Statistics Office of Egypt, Cairo
2 Faculty of Computer and Information Sciences, Ain Shams University, Cairo 11566, Egypt
This paper presents an alternative method of data collection for national statistical offices (NSOs) and covers the manipulation and analysis of web-scraped data, tracking online prices across market websites and cities in near real time. In Egypt, many companies have recently launched e-commerce websites, among them souq.com, which is owned by Amazon, Inc. This has made data more readily available for scraping and has encouraged the use of web scrapers, software tools that extract data from web pages. The growth of online markets in recent years means that many products and their associated price information can be found online and can be scraped. The consumer price index (CPI) is an official statistic constructed from the prices of a sample of representative items, collected periodically, which makes it one of the best use cases: scraping e-commerce websites and websites that publish current product prices can automatically collect prices for some products and services, replacing physical visits to stores for manual price collection. This offers a range of benefits, including reducing data collection costs, increasing the frequency of collection and the number of products in the basket, and improving our understanding of price behaviour. The paper introduces a generic tool that automatically collects online prices, as “scraped data”, using multiple search engines to crawl the newest prices and e-commerce websites. The tool aims to reduce data collection costs and relies on big data analytics. Finally, the methodology is based on machine learning methods that support crawling market data on the web, automatic price scraping, and evaluation of the scanned data.
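The idea of turning scraped prices into an index can be illustrated with a minimal sketch. It is not the tool described in the abstract: the HTML snippet, the `price` class name, and the helper names `scrape_prices` and `jevons_index` are all hypothetical, and the Jevons formula (geometric mean of price relatives) is only one common choice of elementary index, which the abstract does not specify.

```python
import math
import re
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collect numbers from elements whose class list contains 'price'.

    The 'price' class is an assumed markup convention, not taken from
    any real website.
    """

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        if "price" in classes.split():
            self._in_price = True

    def handle_endtag(self, tag):
        self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            # Drop thousands separators, then take the first number found.
            match = re.search(r"\d+(?:\.\d+)?", data.replace(",", ""))
            if match:
                self.prices.append(float(match.group()))


def scrape_prices(html):
    """Return all prices found in an HTML string."""
    parser = PriceParser()
    parser.feed(html)
    return parser.prices


def jevons_index(base_prices, current_prices):
    """Geometric mean of price relatives (x100) over items in both periods."""
    common = sorted(set(base_prices) & set(current_prices))
    if not common:
        raise ValueError("no matched items between the two periods")
    log_sum = sum(math.log(current_prices[k] / base_prices[k]) for k in common)
    return 100.0 * math.exp(log_sum / len(common))


if __name__ == "__main__":
    page = '<div><span class="name">Rice 1kg</span><span class="price">EGP 1,250.50</span></div>'
    print(scrape_prices(page))
    print(jevons_index({"rice": 10.0, "oil": 20.0}, {"rice": 11.0, "oil": 22.0}))
```

Matching items across periods before computing the index matters for scraped data, because products appear and disappear from websites far more often than in a fixed, manually collected basket.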


Reference:
POST01-015
Session:
Big data analytics (poster)
Presenter/s:
Mina Gerges
Presentation type:
Poster presentation
Room:
Lunches Space
Date:
Tuesday, 12 March
Time:
12:30 - 13:30