Monthly Archives: January 2013

Parliamentary inquiry into clinical trials

From the Radstats mailing list:

The Parliamentary Science and Technology Committee has started an inquiry into clinical trials, the disclosure of clinical trial data, and transparency. It was announced on 13 December; the deadline for submissions is noon on Friday 22 February.
Submissions may be up to 3,000 words.

Questions asked:
1. Do the European Commission’s proposed revisions to the Clinical Trials Directive address the main barriers to conducting clinical trials in the UK and EU?
2. What is the role of the Health Research Authority (HRA) in relation to clinical trials and how effective has it been to date?
3. What evidence is there that pharmaceutical companies withhold clinical trial data and what impact does this have on public health?
4. How could the occurrence and results of clinical trials be made more open to scrutiny? Who should be responsible?
5. Can lessons about transparency and disclosure of clinical data be learned from other countries?

The Committee is encouraging written submissions for this inquiry to be sent by email to scitechcom@parliament.uk and marked ‘Clinical Trials’.

http://www.parliament.uk/business/committees/committees-a-z/commons-select/science-and-technology-committee/news/121213-clinical-trials-inquiry-announced/

Filed under Uncategorized

A Handsome Atlas

This wonderful new website contains an intelligently curated selection of visualizations from the US census in the late 19th century. It’s fascinating to see how innovative people were before they got told what “best practice” was*, and how we are only now re-discovering how to do with computers these wonderful things that were once adequately performed with pencils and watercolours. In fact, Jim Vallandingham has been doing just that, using the D3 JavaScript library. Very nice indeed, and the advantage of the computer is you can just stick another data set in and re-run at the touch of a button.

* – however, I think it might not have been altogether bad to say farewell to bar charts that go off the right-hand side of the page and re-appear on the left…

 

2013 London Stata Users’ Group

I’ve just spotted that the 2013 SUG will be held at Cass Business School on 12-13 September 2013. For me, this is the highlight of the stats year. Every talk (my own excepted) is of such high calibre – and comes prêt-à-analyser, implemented in the Stata programming language – that each year I take away several great ideas I can start using straight away.

Book it as soon as they open registration, and if you don’t know Stata, learn it now, just so you can go along to this excellent meeting.

40+ Years of the Cox model: one-day conference

On 8 March 2013, LSHTM are hosting a meeting of the International Biometric Society on the semi-parametric proportional hazards model invented by Sir David Cox in 1972. This statistical tool went on to revolutionise clinical trials analysis and has indirectly saved and improved so many lives over the 41 years that we can’t even begin to count them. (An economist might, but we statisticians wouldn’t know how to count them.) The idea behind it, partial likelihood, was such a step change from what had gone before that we are still learning today all the applications and extensions we can use it for, and new semi-parametric models seem to come out in the stats journals every week. Click here for the programme and registration details.

The delights of re-reading the GAISE College Report

I am revising an introductory stats lecture for some Masters courses starting this term and thought I would re-read the GAISE College Report. This excellent document is not nearly as well known (or followed) among stats lecturers as it should be. It is a set of recommendations for effective statistics teaching in higher education, published in 2005 and revised in 2010 by the American Statistical Association.

Before giving you some choice quotes that I think sum up the problem of applied stats teaching outside a “mathematics” or “statistics” course, I shall just pause to say that the report opens with a uniquely lucid history of university statistics courses through the 20th century, from those based around Fisher’s and Snedecor’s famous textbooks from the 20s and 30s, to the exploratory data analysis emphasised by Tukey in the 70s and the double-edged sword of computing power and accessible software which has been the major force changing stats education since about 1990. They note that:

In the early years, statistics had to lean heavily on probability for its legitimacy.

Yes indeed, and there are still many books and courses around where we bore and confuse students in the first part of their study by endless talk about flipping coins and opening doors to reveal goats. No wonder they don’t see stats as relevant to their lives! Probability is a wonderful weapon in our armoury, but we generally don’t start to use it in earnest until we are quite advanced in our practice as data analysts and are looking to do something a bit bespoke. If you are training nurses or physiotherapists to be researchers, you have to face the fact that most of them have no intention of going that far down the road.

As a little aside, I too have got this wrong in the past. I thought that understanding how probability or risk can be a long-run frequency based on lots of data (ischaemic heart disease caused 17.4% of all deaths in England and Wales), or a one-off event based on some data and a lot of assumptions (Obama was given a 90.9% chance of winning in 2012 by Nate Silver), or a subjective belief (I think there is a 3% chance my train home today will be delayed by more than 10 minutes), would be useful for my students. Actually, I don’t think any of them recalled that distinction by the end of the course. It had been pushed out (if it ever went in) by what the GAISE College Report calls “recipes”: simple algorithms that show you what test to pick and therefore how to pass the exam. These recipes have a powerful lure for students, and it’s no good blaming them for being attracted to the recipe when there are books and websites and YouTube videos full of them. My attempt at deeper understanding failed because the relevance was not emphasised.

OK, time for some of those great quotes.

[Some courses teach] students to become statistically literate and wise consumers of data; this is somewhat similar to an art appreciation course. Some… teach students to become producers of statistical analyses; this is analogous to the studio [fine] art course. Most…are a blend of consumer and producer.

That is a great analogy, and it runs deeper than it might appear at first. To be a good data analyst, you need to learn some classic skills, and have the vision to understand what you are trying to communicate and the creativity to break the rules to better effect. I recently found an essay by composer and conductor Pierre Boulez which draws this parallel in the practice of music and math, which I have quoted in a forthcoming article in Significance on visualization (A life in stats – Nathan Yau).

In week 1 of the carpentry (statistics) course, we learned how to use various kinds of planes (summary statistics). In week 2, we learned about using hammers (confidence intervals). Later, we learned about the characteristics of different types of wood (tests). By the end of the course, we had covered many aspects of carpentry (statistics). But I wanted to learn how to build a table…and I never learned how to do that.

Teaching all the tools in the box is not the same as teaching how to think like a statistician. The latter is much harder and takes longer! But that is what we must aspire to. The satisfaction the student has at being able to do t-tests and report 95% CIs for the mean difference (and so pass their exam) will soon fade when they do the same with some real-life data on a scale with a ceiling effect and get told by peer reviewers that they should have bootstrapped. They would be justified in complaining that they were taught such a simplified set of tools as to be useless in the real world. But if we taught them how to think about the problem and investigate it quantitatively, and how to find help or learn new tricks, they would be much better equipped for using their new skills in earnest.
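To make the bootstrapping point concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a difference in means, written in Python using only the standard library. The scores are made up to mimic a 0-10 scale with a ceiling effect, and the function name, the number of resamples and the seed are all my own illustrative choices. The percentile method is only one of several bootstrap intervals, so treat this as a starting point rather than a recommendation.

```python
import random
import statistics


def bootstrap_mean_diff_ci(group_a, group_b, n_resamples=5000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the original group sizes
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.mean(resample_a) - statistics.mean(resample_b))
    diffs.sort()
    # Take the alpha/2 and 1 - alpha/2 quantiles of the resampled differences
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up scores on a 0-10 scale; many treated patients sit at the ceiling of 10
treated = [10, 10, 9, 10, 8, 10, 10, 7, 10, 9, 10, 10]
control = [6, 8, 7, 10, 5, 9, 7, 6, 8, 7, 9, 6]

observed = statistics.mean(treated) - statistics.mean(control)
low, high = bootstrap_mean_diff_ci(treated, control)
print(f"Observed difference in means: {observed:.2f}")
print(f"95% percentile bootstrap CI: ({low:.2f}, {high:.2f})")
```

The attraction for teaching is that the whole procedure is visible: resample, recompute, sort, read off the quantiles. No distributional assumption about the skewed, ceiling-bound scores is hidden inside a formula.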

While demands for dealing with data in an information age continue to grow, advances in technology and software make tools and procedures easier to use and more accessible to more people, thus decreasing the need to teach the mechanics of procedures, but increasing the importance of giving people a sounder grasp of the fundamental concepts needed to use and interpret those tools intelligently.

And this leads to a point the GAISE College Report doesn’t make, but I would like to promote: stats graduates now need to understand what is going on inside their computers to some extent. I don’t mean getting them to do a Wilcoxon signed-rank test by hand; I mean talking them through some of the key issues about storing data, precision and rounding errors, efficient parameterisation, and achieving global optima with iterative algorithms. To do this you don’t need to teach a load of algebra and calculus (though it would help, of course…) and you can start to introduce other computer-intensive concepts like bootstrapping or MCMC. This, surely, is the direction we will have to expand our courses into in coming years. How many papers have you seen recently that just had classic descriptive stats, tests and nothing more advanced than a log-transform or linear regression? We are doing our students a disservice if we send them out into the world equipped to deal with data analysis in the 1980s.
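As an illustration of the kind of precision and rounding issue I have in mind, here is a small Python sketch. The data and function names are invented for the example. It compares the one-pass “computational” formula for the sample variance, still found in many textbooks, with the two-pass version that subtracts the mean first.

```python
import math

# Decimal fractions such as 0.1 have no exact binary representation
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True


def naive_variance(values):
    """One-pass textbook formula: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return (sum(v * v for v in values) - n * mean * mean) / (n - 1)


def two_pass_variance(values):
    """Subtract the mean first, then square the (small) deviations."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


# Made-up measurements: a small spread sitting on top of a very large offset
data = [1e9 + x for x in (4.0, 7.0, 13.0, 16.0)]

print(naive_variance(data))     # wrong: the squares are too large to store exactly
print(two_pass_variance(data))  # 30.0
```

Both formulas are algebraically identical, which is exactly why the example is useful in class: the difference lies entirely in how the computer stores the numbers. The naive version subtracts two values of about 4 × 10^18 that have each already been rounded.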

Booze space – article on Significance website

Happy new year everybody! Just before Christmas I wrote an article for Significance which is on their website here. This took the HMRC booze data that Andrew McCulloch had previously analysed as time series, and turned them into an animation (and the R code to make it is here). Apart from the pretty pictures, an interesting angle is the difference between litres of booze (which is what the HMRC count in order to levy the tax) and the units of alcohol – particularly at Christmas, when we as a nation drink a lot of wine.
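For anyone unsure how litres and units relate, here is a minimal Python sketch of the conversion. One UK unit is 10 ml of pure alcohol, so a litre of drink contains as many units as its ABV percentage. The strengths below are round numbers I have assumed for illustration; they are not taken from the HMRC data or from the article.

```python
ML_PURE_ALCOHOL_PER_UK_UNIT = 10  # one UK unit is defined as 10 ml of pure alcohol


def litres_to_units(litres, abv_percent):
    """Convert a volume of drink (litres) at a given ABV (%) to UK units of alcohol."""
    ml_pure_alcohol = litres * 1000 * abv_percent / 100
    return ml_pure_alcohol / ML_PURE_ALCOHOL_PER_UK_UNIT


# Illustrative strengths only (assumed round numbers, not HMRC figures)
assumed_abv = {"beer": 4.0, "wine": 12.0, "spirits": 40.0}

for drink, abv in assumed_abv.items():
    print(f"1 litre of {drink} at {abv}% ABV = {litres_to_units(1, abv):.1f} units")
```

So a shift from beer to wine changes the units consumed far more than the litres would suggest.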
