Analytics at the Bank of England are undergoing a transformation. New data-driven analytics are being developed to complement more theoretically rich models. Three factors are driving this change. First, the Bank’s responsibilities have expanded in recent years, notably around microprudential regulation. While macroeconomics might be feasible with only aggregate data, microprudential regulation requires micro-data on company portfolios and transactions. In turn, these micro-data require new analytical tools and techniques.
Second, the financial crisis devastated popular faith in economic orthodoxies of all types, since most economists failed to accurately forecast it. As a result, space has opened for other analytical approaches, notably machine learning, a branch of statistics originating from computer science.
Finally, wider trends have influenced the Bank. These include increases in computing power, developments in programming packages, and the maturity of big data and data science techniques.
Fintech is one sector where the Bank is applying new data-driven analytics. Although in the long term fintech may improve the efficiency of the financial sector, in the short term there is a risk that entrepreneurs and new technologies destabilise incumbents, creating financial instability. This is known as ‘Uber risk’, named after the popular taxi-hailing service. These ‘disruptive’ firms can be identified with a machine-learning algorithm applied to a dataset from start-up tracker Crunchbase, which contains information on venture capital funding of new technology companies.
Venture capital funding of fintech firms has increased significantly in recent years. According to Crunchbase, there are more than 50,000 unlisted technology start-ups globally.
A machine-learning algorithm called ‘k-means clustering’, where ‘k’ refers to the number of clusters, allows us to identify those firms that may pose Uber risk. The algorithm groups firms in such a way that the similarity of firms within each cluster is maximised. This analysis draws on four features derived from Crunchbase, namely: the amount of money raised by a firm per year; the number of unique investors in a firm per year; the total number of funding rounds for a firm per year; and a weighted score of the investors in a firm based on whether they previously funded eventual ‘unicorn’ businesses.
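The clustering step described above can be sketched in a few lines of Python. This is a minimal illustration, not the Bank's implementation: the firm names and numbers are invented, the standardisation of features and the deterministic farthest-point seeding are assumptions, and 'similarity' is taken to mean small squared Euclidean distance to a cluster's centre, which is the usual k-means formulation.

```python
import math

# Hypothetical firms with invented numbers (not Crunchbase data). Columns follow
# the four features in the text: money raised per year, unique investors per
# year, funding rounds per year, weighted investor score.
FIRMS = {
    "big_1": (200.0, 8.0, 2.0, 0.9),
    "big_2": (180.0, 7.0, 2.0, 0.8),
    "big_3": (220.0, 9.0, 3.0, 0.9),
    "small_1": (2.0, 1.0, 1.0, 0.0),
    "small_2": (3.0, 2.0, 1.0, 0.1),
    "small_3": (1.0, 1.0, 0.5, 0.0),
}


def standardise(rows):
    """Rescale each column to zero mean and unit variance (assumed preprocessing)."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(columns, means)]
    return [tuple((x - m) / s for x, m, s in zip(row, means, stds)) for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    # Seed with the first point, then repeatedly add the point farthest
    # from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each firm joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


names = list(FIRMS)
labels, centroids = k_means(standardise(list(FIRMS.values())), k=2)
clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(label, []).append(name)
print(clusters)
```

Standardising first keeps the money-raised column, which is orders of magnitude larger than the others in this toy data, from dominating the distance calculation.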
The aim in applying this algorithm was to identify fintech firms that clustered close to ‘unicorns’, the moniker given to firms valued at more than $1bn. Crunchbase contained information on 204 unicorns as of November 2015.
When the ‘k-means clustering’ algorithm was applied to Crunchbase data from November 2015, the firms were grouped into 20 clusters. One of these contained around 3,000 firms, including all the unicorns. Near these unicorns was a group of 60 unlisted fintech firms, several of which are based in the UK. These are potential unicorns worth tracking because they may pose Uber risk.
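One way to make 'near these unicorns' concrete is to rank the non-unicorn fintech firms in the unicorn cluster by their distance to the closest unicorn. The sketch below assumes Euclidean distance on already standardised features; the article does not say how nearness was measured, and the firm names, flags and feature values are all hypothetical.

```python
import math

# Hypothetical members of the cluster that contains the unicorns. The feature
# vectors are invented and assumed to be already standardised.
CLUSTER = [
    {"name": "unicorn_1", "is_unicorn": True, "is_fintech": False, "features": (1.0, 1.0, 1.0, 1.0)},
    {"name": "unicorn_2", "is_unicorn": True, "is_fintech": True, "features": (1.2, 0.8, 1.1, 0.9)},
    {"name": "fintech_a", "is_unicorn": False, "is_fintech": True, "features": (0.9, 0.9, 1.0, 0.8)},
    {"name": "fintech_b", "is_unicorn": False, "is_fintech": True, "features": (-1.0, -0.5, -1.0, -1.0)},
    {"name": "fintech_c", "is_unicorn": False, "is_fintech": True, "features": (0.5, 0.6, 0.4, 0.7)},
    {"name": "adtech_x", "is_unicorn": False, "is_fintech": False, "features": (1.0, 1.0, 1.0, 0.95)},
]


def watchlist(firms, top_n):
    """Return the top_n non-unicorn fintech firms closest to any unicorn."""
    unicorns = [f for f in firms if f["is_unicorn"]]
    candidates = [f for f in firms if f["is_fintech"] and not f["is_unicorn"]]

    def nearest_unicorn_distance(firm):
        # Euclidean distance to the closest unicorn (an assumed nearness measure).
        return min(math.dist(firm["features"], u["features"]) for u in unicorns)

    ranked = sorted(candidates, key=nearest_unicorn_distance)
    return [f["name"] for f in ranked[:top_n]]


print(watchlist(CLUSTER, top_n=2))
```

Because the ranking is recomputed from whatever data are passed in, rerunning it on a refreshed extract would yield an updated watchlist without further manual analysis.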
The Crunchbase case study exemplifies how central banks like the Bank of England are benefiting from new data sources and advanced analytics. It also illustrates a couple of challenges typically encountered when working with such data.
Central bankers historically have worked with ‘clean’ data neatly compiled by their national statistical authority or other professional data providers. Often these data are sampled to be representative of a population. However, many of the newer datasets of interest to central banks are less well ordered. In the case of Crunchbase, any registered user can submit new data points, subject to review by moderators. There is no guarantee that the data represent the universe of start-ups in an unbiased way.
The traditional way venture capitalists identify unicorns is to apply some variant of discounted cash flow analysis. The advantage of this approach is that it rests on solid finance theory. By contrast, algorithmic analytics are vulnerable to the criticism that their approach is too complex or not easily understood. For results from these analytical models to be accepted by policy-makers, an intuitive story needs to be woven around them.
The challenges are considerable, but the potential benefits are enormous. Algorithmic analysis, as applied by the Bank of England, is an efficient way to monitor Uber risk at a much lower cost than performing fundamental analysis of each firm’s business plan and financial statements. The analytics are easy to update; as soon as the underlying data change, the algorithm produces an updated profile.
In this case and others, new analytical techniques provide central banks with an opportunity to improve their efficiency and insight.
David Bholat leads a team of data scientists in Advanced Analytics and Chiranjit Chakraborty is a Data Scientist, both at the Bank of England.