Hacker News

How does this work? Is there a database of the most popular online stores and you're constantly scraping (or GETing) product data?


We aggregate data from a variety of sources (crawling, data dumps, RSS feeds, and in some cases even manual curation), after which we integrate it into our data pipeline. We update products on a schedule that follows a power law distribution: the top 1% of best-selling products (based on our internal ranking system) are updated hourly, the next 3% every two hours, etc. The whole index is refreshed at the end of each month.
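For readers who want to picture the tiering, here is a rough sketch of how a rank could be mapped to a refresh interval. It is not the actual pipeline: the names `TIERS`, `FALLBACK_HOURS`, and `refresh_interval_hours` are made up for illustration, only the two tiers stated above are encoded (the ones hidden behind "etc." are deliberately not guessed), and the monthly refresh is approximated as 30 days.

```python
# Illustrative only: tiers are (cumulative percent of catalog, interval in hours).
TIERS = [
    (1, 1),  # top 1% of products: refreshed hourly (stated in the comment)
    (4, 2),  # next 3% (cumulative 4%): refreshed every two hours (stated)
    # Further tiers exist per the comment ("etc.") but are not specified,
    # so none are invented here.
]

# Assumption: "end of each month" approximated as a 30-day interval.
FALLBACK_HOURS = 30 * 24


def refresh_interval_hours(rank: int, total: int) -> int:
    """Return the refresh interval for a product.

    rank is 1-based (1 = best seller); total is the catalog size.
    """
    if total <= 0 or not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    for cumulative_percent, hours in TIERS:
        # Integer comparison avoids floating-point edge cases at tier boundaries.
        if rank * 100 <= cumulative_percent * total:
            return hours
    return FALLBACK_HOURS


if __name__ == "__main__":
    for rank in (1, 10, 11, 40, 41, 1000):
        print(rank, refresh_interval_hours(rank, total=1000))
```

With a 1,000-product catalog, ranks 1 to 10 land in the hourly tier, ranks 11 to 40 in the two-hour tier, and everything else falls through to the monthly refresh.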


Very cool. Thanks for the explanation.



