Canada, following the lead of the US (http://data.gov), the UK (http://data.gov.uk/), and Australia (http://www.ands.org.au/), has created a new open data portal, http://data.gc.ca.
The data portal contains all of Statistics Canada's CANSIM data, as well as data from the Department of Finance, Health Canada, Environment Canada, Transport Canada, Citizenship and Immigration Canada, and so on.
To try out the data portal, I gave it a couple of tests.
First, I searched for government revenue. Two bodies produce government revenue data: Statistics Canada and the Department of Finance. Unfortunately, Statistics Canada does not produce a consistent, long-term series of government revenue data. For example, the current ongoing series, 385-0032, only goes back to 1991.
The Department of Finance does produce such a series in the fiscal reference tables, but those tables aren't something that a person would find unless he knew they were there and was looking for them.
Data.gc.ca contains both Statistics Canada's CANSIM series and the Department of Finance's fiscal reference tables. Unfortunately, when I did an unrestricted search for "government revenue," the fiscal reference tables only started to show up on the fourth page of hits. The download, once I got to the right place, was very easy, but only gave me data up to 2010. The Department of Finance web site currently has data up to 2011 available.
It is possible to search the open data portal by specific department, and in future I may well search everything except for Statistics Canada, so my results aren't dominated by pages of CANSIM series. If I want CANSIM, I'll go to Statistics Canada directly - the strength of the data portal is that it allows me to search all of the other government departments.
One thing that is exciting about the data portal is the availability of geo-spatial data. I tested that part of the site by searching for "Rideau River." Nothing came up. So I tried "Ottawa." The first hit was for "Paleo-environmental records of climate change across Canada - Vegetation History, Glaciated North America." I didn't try a download, but it looked like an amazing resource.
The current data portal is a pilot project. I doubt that it will ever be as shiny and glossy as the World Bank data portal, http://data.worldbank.org/, but it could incorporate some of the World Bank portal's features. For example, I would like to be able to search for micro data (or meso, e.g. community or region-level data) separately from time series data. Sure, the micro data search might only spit out the census, but it would be a start.
The data portal could also learn from commercial sites such as YouTube. One of the major problems with the data.gc.ca site, as with the CANSIM site, is that there are so many series, and the most useful series do not always rank highly in search results. Allowing users to flag the most useful series, and keeping track of downloads, would be a way to improve the quality of the search results.
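To make the idea concrete, here is a minimal sketch of how flags and download counts could be folded into search ranking. Everything in it is an assumption for illustration: the field names, the weights, and the sample numbers are made up, and I have no idea how the portal's search actually scores results.

```python
import math
from dataclasses import dataclass


@dataclass
class SeriesHit:
    """One search hit. All fields are hypothetical, not the portal's schema."""
    title: str
    text_score: float  # keyword relevance from the existing search, assumed 0..1
    downloads: int     # how many times the series has been downloaded
    flags: int         # how many users flagged the series as useful


def combined_score(hit, w_downloads=0.5, w_flags=1.0):
    """Boost keyword relevance by popularity signals.

    log1p damps the counts so that one hugely popular series
    cannot swamp relevance entirely. The weights are arbitrary.
    """
    boost = 1.0 + w_downloads * math.log1p(hit.downloads) + w_flags * math.log1p(hit.flags)
    return hit.text_score * boost


def rerank(hits):
    """Return hits ordered from most to least useful under combined_score."""
    return sorted(hits, key=combined_score, reverse=True)


# Invented example: a close keyword match nobody uses vs. a weaker
# keyword match that many users download and flag.
hits = [
    SeriesHit("Series A", text_score=0.9, downloads=10, flags=0),
    SeriesHit("Series B", text_score=0.6, downloads=5000, flags=40),
]
ranked = rerank(hits)
```

With these invented numbers, the heavily used Series B moves ahead of Series A even though its keyword match is weaker, which is the behaviour I would want when looking for something like the fiscal reference tables.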
I'm not a serious data person. Someone who was might have other suggestions. For example, how about giving data sets DOIs, those things at the end of academic citations, e.g. DOI:10.1126/science.7716547, that mark the permanent web home of the article?
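As a small illustration of why a DOI is handy: a DOI can be turned into a stable link by appending it to the https://doi.org/ resolver. The helper below is a sketch of that convention only; the function name is my own, and nothing here implies that data.gc.ca assigns DOIs.

```python
from urllib.parse import quote


def doi_to_url(doi: str) -> str:
    """Turn a DOI string such as 'DOI:10.1126/science.7716547' into a resolver link."""
    doi = doi.strip()
    # Drop an optional leading "doi:" label, in any letter case.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Keep "/" unescaped, since it separates the DOI prefix from the suffix.
    return "https://doi.org/" + quote(doi, safe="/")


url = doi_to_url("DOI:10.1126/science.7716547")
```

A data set with a DOI could then be cited in the same way as a journal article, and the citation would keep working even if the portal reorganized its pages.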
These are my thoughts, but take the portal for a spin and test drive it yourself. The best way to ensure that the open access initiative is maintained, supported, and expanded is to use the data.
Similar reaction to yours. Environment Canada still has the data that I used, but this tool could be useful to see what else is available, from other departments.
Posted by: edeast | March 19, 2012 at 10:44 AM
There's a danger of being too negative - e.g. that's a bit of a problem with my MumblingProfessor videos. The people who've worked to make this happen, and potential users, need to be encouraged. I remember sitting at some workshop back in the early 1990s when I was a junior prof, and trying to convince people at StatsCan that, yes, charging $1000 for micro data *was* a problem for Canadian academics and *did* discourage access. It's easy to forget what things used to be like in the bad old days.
But at the same time, if you look at something like gapminder.org and see what is possible it's hard not to go "meh" when you see the gc.ca stuff.
Posted by: Frances Woolley | March 19, 2012 at 11:01 AM
I'm glad they've done it, and I imagine it took an unreasonable amount of work. Any pan-department effort would be daunting. I'll try to build something with it when I get the time, to become an invested user. At least they have a feedback form.
A benefit of having all the data in one place would be increasing the public's exposure to the importance of data. E.g., during budget cuts, I don't know who stands up for data vs. jobs; oh yeah, legislation. This is why commenting is hard: it's almost impossible to come up with a relevant, indisputable point.
Basically, my preference relation for gov is data > everything else. And open data is greater still. It allows more parties to be involved in data interpretation, and then policy. Somewhere on the internet, Peter Norvig's "unreasonable effectiveness of data" kind of backs me up.
Posted by: edeast | March 19, 2012 at 12:51 PM
They respond very fast to feedback.
Posted by: edeast | March 19, 2012 at 01:39 PM
I started using free CANSIM this year, and my feedback is that there should be a comment section for each table, like a blog. I wouldn't worry too much about crazy people posting mad comments; it's a statistics website. In the right place, a comment board is a valuable thing.
I would also like to see users of the data have a good place to post their projects or findings. Perhaps this is the same thing as the comment section. Twitter handles could be used for follow up discussions on the data. @tholloway0619
Posted by: Thomas Holloway | March 20, 2012 at 06:20 PM
Thomas - agreed, there's a lot to be learned from search engines and other big sites, e.g. YouTube.
I think people in government genuinely want these projects to work, and the data to be used, so constructive feedback - especially to the open data portal, which looks like it's still in the somewhat pilot project stage - might help.
Posted by: Frances Woolley | March 20, 2012 at 06:50 PM
Frances,
If you're learning from YouTube, you probably wouldn't add comments (eg), but I think comments are actually helpful. What you do need is a moderator who is happy to enforce standards in the (presumably rare) cases when it's needed.
http://data.govt.nz/ allows comments
Posted by: Thomas Lumley | March 21, 2012 at 02:35 AM
Well, so much for legislation; the feds are getting out of freshwater. I'm interested in seeing how this works in Senate and provincial politics.
Posted by: edeast | March 25, 2012 at 09:39 PM