Thomas Piketty is in no doubt that data underpin the conclusions of his best-selling economics book, “Capital in the Twenty-First Century”.
He writes, in the introduction: “Compared with previous works, one reason why this book stands out is that I have made an effort to collect as complete and consistent a set of historical sources as possible in order to study the dynamics of income and wealth distribution over the long run”.
While the conclusions of his work, including his call for an international wealth tax, have stirred controversy among academics, commentators and policy makers, even his critics have generally praised the ambition and quality of the data presented in the text.
Reviewing the book this month, Lord Mervyn King, former governor of the Bank of England, said, “the principal weakness of the book is that the carefully assembled data do not live up to Piketty’s rhetoric about the nature of capitalism”.
The sense of diligence in Professor Piketty’s compilation of trends in wealth is bolstered by an online technical annex and spreadsheets containing the data, with sources.
An investigation by the Financial Times, however, has revealed many unexplained data entries and errors in the figures underlying some of the book’s key charts.
These are sufficiently serious to undermine Prof Piketty’s claim that the share of wealth owned by the richest in society has been rising and “the reason why wealth today is not as unequally distributed as in the past is simply that not enough time has passed since 1945”.
After referring back to the original data sources, the investigation found numerous mistakes in Prof Piketty’s work: simple fat-finger errors of transcription; suboptimal averaging techniques; multiple unexplained adjustments to the numbers; data entries with no sourcing; unexplained use of different time periods; and inconsistent uses of source data.
Together, the flawed data produce long historical trends on wealth inequality that appear more comprehensive than the source data allow, providing spurious support to Prof Piketty’s conclusion that the “central contradiction of capitalism” is the inexorable concentration of wealth among the richest individuals.
Once the data are cleaned and simplified, the European results do not show any tendency towards rising wealth inequality after 1970.
The US source data are also too inconsistent to draw a single long series. But when the individual sources are graphed, none of them supports the view that the wealth share of the top 1 per cent has increased in the past few decades. There is some evidence of a rise in the top 10 per cent wealth share since 1970.
The FT uncovered several types of defect.
One apparent example of straightforward transcription error in Prof Piketty’s spreadsheet is the Swedish entry for 1920. The economist appears to have incorrectly copied the data from the 1908 line in the original source.
A second class of problems relates to unexplained alterations of the original source data. Prof Piketty adjusts his own French data on wealth inequality at death to obtain inequality among the living. However, he used a larger adjustment scale for 1910 than for all the other years, without explaining why.
In the UK data, instead of using his source for the wealth of the top 10 per cent population during the 19th century, Prof Piketty inexplicably adds 26 percentage points to the wealth share of the top 1 per cent for 1870 and 28 percentage points for 1810.
A third problem is that when averaging different countries to estimate wealth in Europe, Prof Piketty gives the same weight to Sweden as to France and the UK, even though Sweden has only one-seventh of the population of either country.
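To see why the choice of weights matters, the following is a minimal sketch comparing an equal-weight average with a population-weighted one. The wealth shares are hypothetical placeholders, not figures from Prof Piketty's spreadsheets or from the FT, and the 7:7:1 population ratio is an assumption that reads the one-seventh figure as applying to France and the UK individually. Population weighting is shown only as one alternative the criticism implies, not as the method either party used.

```python
def weighted_mean(values, weights):
    """Return the weighted arithmetic mean of values."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical top-decile wealth shares in per cent (illustrative only).
shares = {"France": 62.0, "UK": 70.0, "Sweden": 58.0}

# Assumed relative populations: Sweden at one-seventh of each larger country.
population = {"France": 7.0, "UK": 7.0, "Sweden": 1.0}

countries = list(shares)
values = [shares[c] for c in countries]

# Equal weights: every country counts the same regardless of size.
equal_weight_avg = weighted_mean(values, [1.0] * len(countries))

# Population weights: each country counts in proportion to its size.
population_weight_avg = weighted_mean(values, [population[c] for c in countries])

print(f"equal-weight average:      {equal_weight_avg:.2f}")
print(f"population-weight average: {population_weight_avg:.2f}")
```

Under equal weights Sweden contributes a third of the European figure; under the assumed population weights it contributes one-fifteenth, so any error or peculiarity in the Swedish series has a correspondingly smaller effect.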
There are also inconsistencies with the years chosen for comparison. For Sweden, the academic uses data from 2004 to represent those from 2000, even though the source data themselves include an estimate for 2000.
Prof Piketty’s documents explaining his sources and methods suggest that he uses similar data from death duty records around the world. In fact, he interchanges between such source material and surveys of the living, which often give very different answers. Switching between the two sorts of data series, particularly for the US, is important to his results.
Some of the biggest defects relate to the UK data, where his original sources consistently show very large declines of near 10 percentage points in wealth held by the rich in the highly inflationary 1970s.
By contrast, Prof Piketty shows the super rich holding a greater share of wealth by 1980, with the share of the top 10 per cent falling by only 1.5 percentage points.
The official data series that Prof Piketty says he used for the UK after 1980 shows little increase in inequality over the next 30 years, while his figures show a steep rise.
I am happy to see that FT journalists are using the excel files that I have put on line! I would very much appreciate if you could publish this response along with your piece.
Let me first say that the reason why I put all excel files on line, including all the detailed excel formulas about data constructions and adjustments, is precisely because I want to promote an open and transparent debate about these important and sensitive measurement issues (if there was anything to hide, any “fat finger problem”, why would I put everything on line?).
Let me also say that I certainly agree that available data sources on wealth are much less systematic than for income. In fact, one of the main reasons why I am in favor of wealth taxation and automatic exchange of bank information is that this would be a way to develop more financial transparency and more reliable sources of information on wealth dynamics (even if the tax was charged at very low rates, which you might agree with).
For the time being, we have to make do with what we have, that is, a very diverse and heterogeneous set of data sources on wealth: historical inheritance declarations and estate tax statistics, scarce property and wealth tax data, and household surveys with self-reported data on wealth (with typically a lot of under-reporting at the top). As I make clear in the book, in the on-line appendix, and in the many technical papers I have published on this topic, one needs to make a number of adjustments to the raw data sources so as to make them more homogeneous over time and across countries. I have tried in the context of this book to make the most justified choices and arbitrages about data sources and adjustments. I have no doubt that my historical data series can be improved and will be improved in the future (this is why I put everything on line). In fact, the “World Top Incomes Database” (WTID) is set to become a “World Wealth and Income Database” in the coming years, and we will put on-line updated estimates covering more countries. But I would be very surprised if any of the substantive conclusions about the long run evolution of wealth distributions were much affected by these improvements.
For instance, my US series have already been extended and improved by an important new research paper by Emmanuel Saez (Berkeley) and Gabriel Zucman (LSE). This work was done after my book was written, so unfortunately I could not use it for my book. Saez and Zucman use much more systematic data than I used in my book, especially for the recent period. Also their series are constructed using a completely different data source and methodology (namely, the capitalisation method using capital income flows and income statements by asset class). The main results are available here: http://gabriel-zucman.eu/files/SaezZucman2014Slides.pdf.
As you can see for yourself, their results confirm and reinforce my own findings: the rise in top wealth shares in the US in recent decades has been even larger than what I show in my book.
In the attached graph, I compare their series with the approximate series that I provide in the book. As you can see for yourself, the general historical profiles are very similar. This is exactly what I expect as we collect more data in other countries as well: we will certainly improve upon my series and adjustments (some of which can certainly be discussed), but I don’t think this will have much of an impact on the general findings.
Finally, let me say that my estimates on wealth concentration do not fully take into account offshore wealth, and are likely to err on the low side. I am certainly not trying to make the picture look darker than it is. As I make clear in chapter 12 of my book (see in particular tables 12.1-12.2), the wealth of top wealth holders has apparently been rising a lot faster than average wealth in recent decades, at least according to the wealth rankings published in magazines such as Forbes. This is true not only in the US, but also in Britain and at the global level (see attached table). This is not well taken into account by wealth surveys and official statistics, including the recent statistics that were published for Britain. Of course, as I make clear in my book, wealth rankings published by magazines are far from being a perfectly reliable data source. But for the time being, this is what we have, and what we have suggests that the concentration of wealth at the top is rising pretty much everywhere. Of course, if the FT produces statistics and wealth rankings showing the opposite, I would be very interested to see these statistics, and I would be happy to change my conclusion! Please keep me posted.