
Can you not do the weighting thing for the CFCS?

The other option for PEI is to do what I've seen journalists in particular do with increasing frequency these days: do up rankings of things like Canada's best premiers and have only nine in the list.

And what's with Newfoundland in the title if PEI is the star of this rant? Afraid we wouldn't give you enough hits?

It's a snow day here by the way.

Jim: you can't pick on PEI; it's not fair!

Wish I could remember stats. "..estimates of unemployment should not have a coefficient of variation (standard error relative to the estimate) greater than 2 percent for Canada, and 4 to 7 percent for the provinces."

I don't get the "4 to 7 percent". Would that mean 4% for Ontario and 7% for PEI? Would that fit with the 0.76 vs 5.93?
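For reference, the coefficient of variation in the quoted passage is just the standard error divided by the estimate, usually expressed as a percentage. A minimal sketch, with made-up numbers that are not LFS figures:

```python
def coefficient_of_variation(estimate, standard_error):
    """Return the CV as a percentage: standard error relative to the estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    return 100.0 * standard_error / abs(estimate)

# Hypothetical numbers, for illustration only (not taken from the LFS).
unemployed_estimate = 50_000   # estimated number of unemployed people
standard_error = 2_500         # standard error of that estimate

cv = coefficient_of_variation(unemployed_estimate, standard_error)
print(f"CV = {cv:.1f}%")
```

With the same standard error, a smaller estimate gives a larger CV, which is one reason a smaller province needs a proportionally larger sample to meet a similar target.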

Frances:

I would have thought pweight rather than fweight. These are sampling weights (reciprocals of sampling probabilities), not frequency weights. On the other hand, it doesn't make any difference if you're not computing standard errors.
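A small sketch of that point, in Python rather than Stata. The data and weights are invented, and the sampling-weight standard error uses a common linearization approximation that is assumed here for illustration; it is not claimed to be the exact formula any particular package uses.

```python
import math

def weighted_mean(x, w):
    """Weighted mean; identical whether w is read as frequencies or sampling weights."""
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)

def se_frequency(x, w):
    """SE if each weight counts identical duplicated observations (n = sum of weights)."""
    n_total = sum(w)
    m = weighted_mean(x, w)
    var = sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / (n_total - 1)
    return math.sqrt(var / n_total)

def se_sampling(x, w):
    """Linearization SE if weights are inverse selection probabilities (n = rows sampled)."""
    n = len(x)
    m = weighted_mean(x, w)
    squared = sum((wi * (xi - m)) ** 2 for xi, wi in zip(x, w))
    return math.sqrt(n / (n - 1) * squared) / sum(w)

# Hypothetical data: 1 = unemployed, 0 = not; six sampled people with unequal weights.
x = [0, 1, 0, 0, 1, 0]
w = [100, 100, 400, 400, 50, 50]

print("point estimate:", weighted_mean(x, w))
print("SE, weights read as frequencies:", se_frequency(x, w))
print("SE, weights read as sampling weights:", se_sampling(x, w))
```

The point estimate is the same either way. Only the standard error changes, because the frequency reading pretends there are 1,100 independent observations instead of six.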

Jim - yes, I can weight the CFCS.

But the problem is that I have fewer single parents in big CMAs, fewer recent immigrants, fewer really rich people and fewer people living in cities with totally outrageous housing prices than I would have if CFCS had just picked a random sample of people across the country. I care about the Maritimes and the Prairies, but I don't care about them *more* than the rest of the country.

I thought the Newfoundland title was catchy at the time but, you're right, I could probably have thought of a better one.

And excellent snow here today, too.

Nick, don't ask me to translate!

Thomas, the second column is calculated with the frequency weights provided by Statistics Canada in the Labour Force Survey public user file. I used the variable called "FWEIGHT" and I'm hoping that it is in fact an fweight! The third column is one that I calculated myself based on the other two. I guess those are probability weights.

Oh I see. The CFCS isn't big enough to give you the sample size you want of a small slice of Canadian society, and you think it's because they wasted resources trying to make sure they had reasonably reliable overall samples of all the provinces. Welcome to the federation that is Canada!

Jim: "Welcome to the federation that is Canada!"

The truth is I'd never realized that just about every survey in Canada over sampled the smaller provinces.

Perhaps it's a good decision, perhaps it's a bad one.

But it's something that I have never ever heard discussed: "what compromises are we making here? what are we gaining? what are we losing?" It's a conversation worth having - though I wish I was having it over a beer with you, Jim, instead of electronically.

For the LFS, given the structure of employment insurance, there's a good case to be made for the present design.

For other surveys, it might make more sense to have a representative sample of the whole country. Lots of people, for one reason or another, want to look at a particular slice of society (same sex couples, inter-racial marriages, multigenerational families, whatever), and if it happens to be a slice that's concentrated in the large provinces, the present way of designing surveys means it's harder than it needs to be to get a close look at that particular group.

The real shame is that there might be resource constraints that mean you can't do both.
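To put rough numbers on that trade-off, here is a toy comparison of two ways of splitting a fixed sample between a large and a small region. Every figure below is hypothetical and chosen only to show the arithmetic.

```python
def expected_subgroup_count(allocation, subgroup_share):
    """Expected number of sampled people from the subgroup, summed over regions."""
    return sum(allocation[r] * subgroup_share[r] for r in allocation)

# Hypothetical populations and subgroup shares (not real Canadian figures).
population = {"large": 9_000_000, "small": 1_000_000}
subgroup_share = {"large": 0.02, "small": 0.002}  # the subgroup is concentrated in the large region
total_sample = 10_000

# Proportional allocation: sample sizes mirror population shares.
total_population = sum(population.values())
proportional = {r: total_sample * population[r] / total_population for r in population}

# Equal allocation: each region gets the same sample, so the small region is oversampled.
equal = {r: total_sample / len(population) for r in population}

n_proportional = expected_subgroup_count(proportional, subgroup_share)
n_equal = expected_subgroup_count(equal, subgroup_share)
print("expected subgroup cases, proportional:", n_proportional)
print("expected subgroup cases, equal:", n_equal)
```

In this made-up case the equal split yields noticeably fewer subgroup cases, while giving the small region five times the sample it would get under a proportional design.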

Agree about the drinking of course.

"The truth is I'd never realized that just about every survey in Canada over sampled the smaller provinces."

That strikes me as peculiarly naive. Of course strata with smaller populations will be relatively "over" sampled compared to those with larger populations. That's what's necessary to obtain sufficiently precise population estimates; standard errors depend primarily on sample size, not population size. Regardless, those of us in smaller provinces aren't somehow less deserving of accurate data from the *federal* statistics agency.
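The claim about sample size versus population size can be checked with the textbook formula for the standard error of a proportion under simple random sampling without replacement. This is only a sketch: the LFS design is more complex than simple random sampling, and the populations and proportion below are hypothetical.

```python
import math

def se_proportion(p, n, population):
    """SE of an estimated proportion under simple random sampling without replacement."""
    fpc = (population - n) / (population - 1)  # finite population correction
    return math.sqrt(p * (1 - p) / n * fpc)

p = 0.07    # hypothetical unemployment rate
n = 1_000   # same sample size in both places

se_small = se_proportion(p, n, population=100_000)
se_large = se_proportion(p, n, population=10_000_000)
se_large_4n = se_proportion(p, 4 * n, population=10_000_000)

print("SE, small population:", se_small)
print("SE, large population:", se_large)
print("SE, large population, 4x sample:", se_large_4n)
```

With the same sample size, a hundredfold difference in population barely moves the standard error, whereas quadrupling the sample roughly halves it.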

Frances,
I would have thought pweight too, but like Thomas said, it doesn't matter if you don't compute standard errors.

According to Stata's help file,

"fweights, or frequency weights, are weights that indicate the number of duplicated observations."
"pweights, or sampling weights, are weights that denote the inverse of the probability that the observation is included because of the sampling design."

According to the LFS page:

Estimation

The final step in the processing of LFS data is the assignment of a weight to each individual record. This process involves several steps. Each record has an initial weight that corresponds to the inverse of the probability of selection. Adjustments are made to this weight to account for non-response that cannot be handled through imputation. In the final weighting step all of the record weights are adjusted so that the aggregate totals will match with independently derived population estimates for various age-sex groups by province and major sub-provincial areas. One feature of the LFS weighting process is that all individuals within a dwelling are assigned the same weight.

In January 2000, the LFS introduced a new estimation method called Regression Composite Estimation. This new method was used to re-base all historical LFS data. It is described in the research paper "Improvements to the Labour Force Survey (LFS)", Catalogue no. 71F0031X. Additional improvements are introduced over time; they are described in different issues of the same publication.

I don't remember ever using fweight.
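The three steps in the quoted LFS passage (inverse selection probability, non-response adjustment, matching to population totals) can be sketched roughly as below. This is a simplified illustration, not Statistics Canada's method: the records, regions, groups and totals are all invented, the non-response adjustment is done within hypothetical regions, and the last step is plain post-stratification rather than the regression composite estimation mentioned in the passage.

```python
from collections import defaultdict

def final_weights(records, population_totals):
    """Return respondent records with a 'weight' built in three steps."""
    records = [dict(r) for r in records]  # work on copies

    # Step 1: initial weight = inverse of the probability of selection.
    for r in records:
        r["weight"] = 1.0 / r["prob"]

    # Step 2: non-response adjustment within each region, so respondents
    # also carry the weight of sampled people who did not respond.
    sampled = defaultdict(float)
    responded = defaultdict(float)
    for r in records:
        sampled[r["region"]] += r["weight"]
        if r["responded"]:
            responded[r["region"]] += r["weight"]
    respondents = [r for r in records if r["responded"]]
    for r in respondents:
        r["weight"] *= sampled[r["region"]] / responded[r["region"]]

    # Step 3: scale weights so each group's total matches a known population total.
    current = defaultdict(float)
    for r in respondents:
        current[r["group"]] += r["weight"]
    for r in respondents:
        r["weight"] *= population_totals[r["group"]] / current[r["group"]]
    return respondents

# Hypothetical sample: two regions with different selection probabilities.
records = [
    {"region": "A", "group": "men", "prob": 0.01, "responded": True},
    {"region": "A", "group": "women", "prob": 0.01, "responded": True},
    {"region": "A", "group": "men", "prob": 0.01, "responded": False},
    {"region": "B", "group": "women", "prob": 0.002, "responded": True},
    {"region": "B", "group": "men", "prob": 0.002, "responded": True},
    {"region": "B", "group": "women", "prob": 0.002, "responded": False},
]
totals = {"men": 700, "women": 800}  # hypothetical independent population estimates

weighted = final_weights(records, totals)
for r in weighted:
    print(r["region"], r["group"], round(r["weight"], 1))
```

The resulting weights are generally not whole numbers and start life as inverse probabilities, which is the sense in which they behave like sampling weights even if a released file rounds them.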

Simon - the LFS file that I used has a variable that's labelled FWEIGHT. I would guess it's an FWEIGHT. To give a pweight the variable label FWEIGHT would be simply cruel, because people like me are bound to get confused. Statistics Canada wouldn't do that, would they?

It might be that you and I are using different LFS files - I'm using the public use one. From what you've quoted, it sounds as if pweights are used internally by Statistics Canada when deriving the fweights that are released in the public use file.

The passages you quote are an excellent illustration of why it is *vital* to have a mandatory census - the final adjustments for non-response etc would be impossible without census information.

Hi Frances,

I'm sorry, I meant to say that I never used the "fweight" function, not the "FWEIGHT" variable. I've never used the LFS myself.

I gave a quick look at the documentation and the file named rebased-record-layout.xls says:
FWEIGHT : Final individual or family weight. (Integer)

I'm afraid FWEIGHT could stand for Final Weight or Family Weight.

I'll try to clear that up and comment here later.
