This paper proposes a new methodological framework for analysing economic clusters over space and time, building on recent developments in data science. Specifically, we employ a unique open data source of archived, geolocated commercial webpages. We interrogate these data using natural language processing (NLP) and other data science methods to build bottom-up classifications of economic activities from the text of these webpages. We take a fresh look at Shoreditch, an iconic London neighbourhood that is rich in technology and creative industries and has become a leading digital and creative cluster over the past two decades. Our results provide valuable insights into the economic profile of Shoreditch. They outperform results produced using administrative data and match the findings of past qualitative studies, which were based on lengthy participant observation and in-depth interviews.