Tuesday, February 15, 2011

The latest on Canada's changing census

The metaphorical aftershocks of the Canadian government's decision to forgo the mandatory long-form census in favour of a voluntary household survey continued today. That decision triggered the resignation of then-Chief Statistician Munir Sheikh and created one of the more unusual national political controversies in recent Canadian history. The latest development is an interview of Wayne Smith, the new Chief Statistician of Canada, by The Globe and Mail's Ottawa reporters Steven Chase and Tavia Grant. Smith valiantly makes the point that the shift won't inevitably lead to lower-quality data: if enough organizations and enough people are mobilized to encourage participation in the voluntary survey, a drop-off in data quality is not a foregone conclusion. He doesn't exclude, however, the very strong likelihood that data quality will drop off significantly.

For instance:

Are you telling me that we’re going to get just as many Inuit, just as many unemployed, just as many people who don’t speak English and are immigrants?

I am saying that neither I nor anybody else can tell you today that we’re not.

But it’s a fair bet.

No it’s not a fair bet. There is no scientific reason why you would say that before it even starts, before I see the results, that there’s going to necessarily be a significant problem with the count of Inuit or Métis or immigrants beyond the levels we’ve already seen in the 2006 census.

So you’re saying there is no reason to believe right now that there might be a poorer reading of small sub-population groups?

There’s no necessary reason why that would be the case. There’s a heightened level of risk and that’s the most you can actually say. Then the rest of it will depend on what happens and that’s why it’s so vitally important we get the support.

You’re hoping that business groups, community groups [support this]. How do you ensure they do?

We’ve hired a very large communications staff regionally. And they’re already out there, they’re actually out there meeting with associations, we’ve met with Indian reserves, we’ve met with municipalities, we’ve met with a wide variety of organizations. And so far, I can tell you that the reaction has been very favourable. The indications are that we’re going to get support.

As well:

[U]sers would have preferred to keep the long-form census. I’m sure you’re aware of the concerns raised by different groups, whether it’s researchers or it’s health researchers. In some of the briefs we got from Access to Information, HRSDC told StatsCan that the less reliable data would compromise their ability to determine EI eligibility. Indian and Northern Affairs says it would effectively hurt their ability to effectively manage and evaluate performance in areas of health and housing. The province of Ontario, saying we need this for our tax and budget decisions.

Can you assure these people that they have no reason to worry?

No, I told you a minute ago. Nobody knows what's ultimately going to happen. And I told you there's heightened risk because of the lower response rate. There’s not an a priori reason, that doesn’t mean it can’t happen. So we won't know until we're out the other end. But neither does anybody else. All I’m saying is that nobody can draw firm conclusions that this data is going to be fundamentally flawed. There's no scientific basis for that. We will not know until we come out the other end. I understand that people perceive the higher risk, and they would rather that the risk wasn't there. And the perception is that if the long-form had remained mandatory, that risk would not be there, or would be smaller actually. And I understand that I don't know either and I can't give assurances because I won't know what I've got until we've actually gone thru this process. We've never done it this way before. All I’m saying, all I’m asking Canadians to do is to suspend judgment because there's no scientific basis for saying that the data is going to be fundamentally flawed.

No scientific basis in that nobody’s a fortune teller, is that what you’re saying?

I can sit down, and we can work out for you, for any given variable, for any given city, based on various response rates, what happens to the sampling error if the response rate falls. There's a necessary consequence. If the response rate is this, this is the effect. If the response rate is that, this is the effect. That's science. But people saying there is going to be massive under-coverage of certain populations, to such an extent that the data is ... there's nothing in statistical theory that supports that argument. You won't know until it's done.
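The arithmetic Smith describes can be roughed out with the textbook margin of error for a proportion. What follows is only a minimal sketch under loud assumptions: simple random sampling, and non-respondents who resemble respondents, which is exactly the point under dispute. The function name, the sample of 1,000 households, the 10% share and the response rates are all hypothetical figures chosen for illustration, not Statistics Canada numbers or methods.

```python
import math


def margin_of_error(households_sampled, response_rate, proportion, z=1.96):
    """Approximate 95% margin of error for an estimated proportion.

    Assumes simple random sampling and that non-respondents resemble
    respondents, so this captures sampling error only. It says nothing
    about non-response bias.
    """
    respondents = households_sampled * response_rate
    standard_error = math.sqrt(proportion * (1 - proportion) / respondents)
    return z * standard_error


# Hypothetical small community: 1,000 households sampled, and a group
# that makes up 10% of the population.
for rate in (0.9, 0.7, 0.5):
    moe = margin_of_error(1000, rate, 0.10)
    print(f"response rate {rate:.0%}: +/- {moe * 100:.1f} percentage points")
```

Under these assumptions the margin of error grows with the inverse square root of the response rate, so halving the response rate widens it by a factor of about 1.4. What the formula cannot show is whether the people who decline to answer differ systematically from those who respond.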

Granted, none of the consequences is absolutely certain. Note, though, that Smith doesn't say that they're unlikely or improbable.

Smith does raise the interesting possibility of doing away with the traditional model of the census and trying innovative new models.

If you think of the debate over the last few months, a number of options have been out there and discussed. Could Canada do a census based entirely on administrative records. Could Canada do a census like France where they do a part of the country every year and they don’t do the country in one year. Could we do something like the United States where they have a decennial census once every 10 years and run a very large survey every year in the interim that is capable of producing smaller area data. That's another model that's been talked about.

[. . .]

Data mining is a version of using administrative data. Basically you think of any administrative file that exists. What some countries are doing – for example, Denmark or Finland – if you live in those countries, you have to register your address. If you move you have to register. If you want to have a job you have to be registered. They actually have a basic register that says: here are the people living in the country and here is where they are living and this information is current and accurate.

And once they have done that, then they have all these other files: these people are in the education system; these people have cars, these people are in the health system ... You could think of all the files the government holds: they link those files to the basic population register and achieve something that is a very rich data base. There are certain types of data it doesn't contain obviously but it contains a very rich set of data.

The likelihood of a Canadian government that dropped the mandatory long-form census on account of its intrusiveness supporting data mining on such a scale, I leave to my readers to imagine.

1 comment:

Dwight Williams said...

Valiant, but it's still a party line being recited.