Katonomics 3: Evidence-Based Policy – The challenge of data

A third slice of Katonomics. The IPKat again welcomes new readers who may be unfamiliar with the term 'Katonomics'. As regular readers now know, it's the title of a mini-series on IP economics for IPKat readers -- the vast majority of whom are not (yet) economists. This series is brought to you by the IPKat's own Katonomist, Nicola Searle, who turns this week to a theme which seems to excite policymakers today -- the increasing demand for IP policy to be based on evidence, rather than on the cherished tenets of our own received wisdom:
Evidence-Based Policy – The challenge of data 
Recommendation number one of the Hargreaves Review begins: “Government should ensure that development of the IP System is driven as far as possible by objective evidence.” The push for evidence-based policy continues and, along with health care and other areas of government policy, Intellectual Property policy looks set for a bit of a nudge.

Not everyone is irrational 
Economics is grounded in theory and models, and economists are mildly obsessed with equations, graphs and demand curves. However, the economy has many variables, including, to economists’ great annoyance, irrational beings. To ensure that models and theories are robust, they must be tested. A preferred method is empirical analysis, which examines whether economic policies supported by theory are also supported by evidence.

What constitutes evidence in economics? The current preference is for quantitative as opposed to qualitative methods and data. Data collection is not an easy task. For a lot of economic analysis, the government provides this service (e.g. the Office for National Statistics, the ONS). However, some interesting data may not be collected or made public. Researchers must instead collect their own data or generate it through surveys or experiments. Direct measurement of concepts (e.g. innovation) is not possible, so economists use proxies. For example, wealth serves as a proxy for the standard of living. Even money may be a proxy for the abstract concept of value. 
The economic theories and resulting policies discussed in earlier posts (here and here) call for testing. If IP is a means of incentivising innovation, then the innovation and the incentives created by IP should be analysed. Proposed changes to IP policy also merit analysis and, in some cases, require investigation via regulatory impact assessments. 
However, how do you measure IP? How do you test whether copyright incentivises creation? What is a trade secret worth? As IP is, by definition, intangible, data analysis is often achieved through proxies. These proxies vary by the type of IP. 
Patent data has long been a favourite of economists. It is a data-rich source, as patents are formally registered, publicly available, and carry useful information such as citations. Further, patent data gets tantalisingly close to a proxy for innovation. The number of times a patent is cited, or the number of claims, suggests the innovative value of the patent. It is not without its problems: for example, newer patents will have had less time to accumulate citations ("citation lag"). Dietmar Harhoff has done a lot of work in this area, including this presentation at an EPIP workshop (with Karin Hoisl and Colin Webb) which examines European patent citations and the problem of citation lags. Harhoff also investigates European patent examination in this paper with Stefan Wagner.
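To make the citation-lag point concrete, here is a toy sketch of the simplest possible correction: normalising raw citation counts by patent age. The patent IDs and figures below are invented for illustration, and real studies use far more sophisticated adjustments.

```python
# Invented (not real) patent records: grant year and citations
# received by a fixed reference year.
patents = {
    "EP-A": (2000, 120),
    "EP-B": (2008, 45),
    "EP-C": (2011, 6),
}

reference_year = 2012

# Raw counts penalise newer patents ("citation lag"); a crude
# correction is to normalise by years since grant.
citations_per_year = {
    pid: cites / (reference_year - year)
    for pid, (year, cites) in patents.items()
}

for pid, rate in sorted(citations_per_year.items()):
    print(f"{pid}: {rate:.2f} citations per year")
```

On raw counts the oldest patent (EP-A, 120 citations) looks the most influential, but per year of exposure the 2008 patent actually outperforms it, which is exactly the distortion citation-lag corrections try to remove.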
While patent data has its problems, copyright data is even more challenging. Unlike patents, no central source for copyright data exists. In particular, piracy data is problematic. A 2010 U.S. government report addresses this:
“Three widely cited U.S. government estimates of economic losses resulting from counterfeiting cannot be substantiated due to the absence of underlying studies. Generally, the illicit nature of counterfeiting and piracy makes estimating the economic impact of IP infringements extremely difficult, so assumptions must be used to offset the lack of data.”
To compound this, much of the relevant data (e.g. royalties stemming from licensing copyright material) is held privately. Policy makers may only have access to conclusions based on privately held data, with no ability to verify them. This is problematic given the challenges to economic analysis of copyright data and the contentious nature of copyright debates. 
When data is made available, it can offer surprising insights. For example, a computer games developer in the Netherlands, Joost "Oogst" Van Dongen, shared detailed statistics on his recent Proun game. Unusually, he offered the game as pay-what-you-want and found that only 1.8% of players actually paid. He also discusses his revenue (about $0.09 per game) and, based on his download statistics, estimates piracy at around 41%. 
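As a back-of-envelope check, the two figures reported above are enough to derive a third: the average payment among the minority of players who did pay. The derivation below is our own, but it uses only the numbers quoted from the post.

```python
# Figures as reported in the Proun post.
pay_rate = 0.018          # 1.8% of players paid something
revenue_per_copy = 0.09   # ~$0.09 average revenue per downloaded copy

# If only 1.8% of downloads generate the $0.09 average across all
# downloads, the implied average payment per paying player is:
avg_payment_per_payer = revenue_per_copy / pay_rate
print(f"Implied average payment per paying player: ${avg_payment_per_payer:.2f}")
```

So those who chose to pay gave about $5 each on average, a figure that is invisible in either headline number alone and illustrates why access to the underlying data matters.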
However, where privately held data is unavailable, researchers may use a variety of public data sources to investigate copyright policy. Ruth Towse uses artists' earnings to analyse copyright and the royalty system of payment. In a UK IPO report, Martin Kretschmer uses sales prices of copying devices to investigate the levy system. Richard Watt details other examples of empirical work in a WIPO paper, which include North American copyright registrations and comparative analysis of book pricing. 
Copyright is not alone in its data challenges. Unregistered rights are particularly tricky; for example, the mere existence of a trade secret may not be known. Even registered rights are complicated; economic analysis of trade marks is limited by the lack of economic data in trade mark registrations. 
Given these data challenges, how should researchers proceed? Is evaluating policy based on evidence the way forward? Or, for my question of the week, should governments compel rights holders to share data?