NZ Initiative's Eric Crampton says New Zealand should follow the US example of making public data widely available

By Eric Crampton*

I hate self-inflicted wounds. New Zealand’s use of public data, or lack thereof, counts.

Some countries are too small or too poor to be able to collect reasonable data. New Zealand does not suffer from that problem. We have an excellent statistics agency - Statistics New Zealand. And, even better, they’re exceptionally helpful: I have heard little but praise for their staff from any researcher who has needed help in finding data.

The problem comes rather from data that is just a bit too difficult to use because of New Zealand’s Statistics Act 1975. When it is too hard to access New Zealand data, many of the researchers who are able to deal with statistics instead look to American data – and New Zealand policy debate is the worse for it. Let me illustrate by example.

When I was at the University of Canterbury, I supervised a Master’s thesis testing one explanation for the gender wage gap. I suspected that employers might avoid hiring younger women, or pay them a bit less, because of the costs that maternity leave might put on the firm. It might not be nice or right or legal, but it could be one explanation for part of the relatively small gender wage gap that remains after adjusting for differences in education, work experience, time outside of the workforce, industry of employment and the like.

So, what did we do? We went to the U.S. Census and to the American Community Survey. Both of those have a public use dataset available, for free, to anyone in the world with an internet connection. The public use samples are designed to be statistically representative of the overall dataset, but are anonymised. The datasets have rather personal information in them, and they have to for the test we wanted to run: we used the difference in childbearing rates between lesbians and heterosexual women, by age cohort, to see whether differences in future fertility rates explained part of the rather well-established lesbian wage premium.

After we finished the work, or, rather, after my excellent student Hayden Skilling did all of the work and I provided minor bits of advice from the sidelines, we wondered whether we should check it against the New Zealand data. If employers are scared that employees will head off on maternity leave and if that is part of the wage gap here as well, that is of public policy consequence. If it were true here, then the government might consider changing how it handles maternity leave to provide greater support for employers facilitating an employee’s maternity leave. If firms faced fewer costs when hiring employees more likely to take maternity leave, employment and wage outcomes might improve.

Despite the project’s appeal, we quickly ruled it out. Why? It would have been far too great a hassle.

Academics in New Zealand wanting to use individual-level publicly collected data to look at questions of public policy interest have two basic options. They can request a Confidentialised Unit Record File from Statistics New Zealand for the Census or other collected data series, or they can request access to the data lab to get closer to the raw data.

Where America makes its anonymised samples free for anyone to download, New Zealand restricts things. The application form requires that you only use the data for the exact purpose you specify and limits how you use it.

Sometimes it is worth the costs of making that application, but a lot of the time a researcher might want to run a very simple correlation test or check whether average outcomes differ between two groups, just to see if there is any reason to go any further.
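To give a sense of how little is involved in that kind of first look, here is a minimal Python sketch of the two checks just mentioned: a comparison of average outcomes between two groups (using Welch's t statistic) and a simple correlation. Everything in it is an assumption for illustration: the function names `welch_t` and `pearson_r` are hypothetical helpers, and the numbers are made up rather than drawn from any New Zealand or American dataset.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # Sample variance of each group divided by its size
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up hourly wages for two hypothetical groups (illustration only)
group_a = [24.0, 27.5, 22.0, 30.0, 26.5, 25.0]
group_b = [21.0, 23.5, 20.0, 26.0, 22.5, 24.0]
# Made-up years of experience for the people in group_a
experience = [3, 6, 1, 9, 5, 4]

t_stat, dof = welch_t(group_a, group_b)
print(f"difference in means: t = {t_stat:.2f}, df = {dof:.1f}")
print(f"wage vs experience:  r = {pearson_r(experience, group_a):.2f}")
```

With a freely downloadable public-use sample, a check like this takes minutes; the result only tells a researcher whether a fuller, properly specified study looks worth pursuing.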

And, a lot of the benefit of easily accessible public data is in simple myth-dispelling. Newspaper columns make plenty of assertions that are testable – at least in principle. Debunking them is not worth the hassle when it requires a specific application and approval process – even though Statistics New Zealand is exceptionally helpful through those processes.

If New Zealand followed the United States in providing simple and accessible confidentialised public-use microsamples of its large datasets, researchers could undertake the kind of exploratory data analysis that leads to bigger projects down the track. Sometimes, we need to let the data tell us which hypotheses to test – and we need access to the data to be able to do it.

But, a New Zealand researcher applying for access to a confidentialised unit record file has to promise never to use it for anything other than the purposes stated on the application form, to keep the data secure (using it on laptop computers is effectively banned), and to destroy the data at the end of the research project or at the end of the 12-month licence.

Where rich American data is available to anybody with a web browser and an internet connection, and similar New Zealand data is rather difficult to get and to use, is it any particular surprise that a lot of New Zealand academics, paid in part by the New Zealand government, use American data rather than the New Zealand data that New Zealand taxpayers have already paid to collect?

And that is the self-inflicted wound.

On one side we have rich data that could throw light on countless interesting public policy questions, excellently and expertly collected and curated by Statistics New Zealand.

On the other we have an army of academics whose continued employment increasingly depends on landing refereed journal articles. Those academics chose to live in New Zealand rather than elsewhere, and many of them would love to use New Zealand data to shed a bit of light onto policy discussions that are too often devoid of evidence.

In between the data and the researchers we have the Statistics Act 1975 that binds Statistics New Zealand. To make things even worse, Statistics New Zealand will not release an anonymised data set to any researcher not based in New Zealand. Researchers from around the world help Americans to understand how their country works because they have free access to American data. New Zealand hoards its data like Tolkien’s dragon guards its gold.

If the United States can anonymise data from the American Community Survey for individual states as tiny as Rhode Island or North Dakota, and not yet have any particular issues with abuse of the data, New Zealand would do well to ease up a little.

And, as an addendum if it is of interest, adjusting for the risk that an employee might take maternity leave reduces the lesbian wage premium in America by ten to fifteen percent. This implies that somewhere between a fifth and a quarter of the American gender wage gap not explained by other differences might be due to employers’ worries about an employee’s likelihood of taking maternity leave. I wonder whether it is also true in New Zealand.


*Eric Crampton is head of research at The New Zealand Initiative.
