## The polls were correct, we just didn't understand the question.

<![CDATA[Numbers are authoritative – this is probably a perception picked up in our school years, when mathematics seems so black and white: the answer to a sum or calculation is either right or wrong; there is no scope for debate or subjective judgement. Only those who go on to degree level or higher see that subjectivity creeps back into mathematics.
It is no surprise, therefore, that Douglas Adams chose a number to be the answer to the question of life, the universe and everything.
So it has created a small storm that the numbers making up the polls, which dominated the media during the election campaign, should so dramatically fail to predict the outcome. Some have started to blame this on people changing their minds, or simply lying to the pollsters, although why this should be more prevalent in this election than in others is not clear.
However, looking closely, the polls seem to be actually pretty accurate – the problem, like the answer to life, the universe and everything, is not being clear about the question.
Looking at the various polls included in the BBC Poll of Polls, all work in very similar ways. They select a group of 1000 people at random and ask them how they will vote. The differences between the polls include the wording of the question asked, how the group is selected, and any normalisation to remove bias in the group selection.
This means, of course, that the poll is indicative of how many people will vote for a given party.
Comparing the various polls (using the values from the 6th May just before the election; source http://www.bbc.co.uk/news/politics/poll-tracker) with the actual results is interesting:

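| Poll | Conservative | Labour | UKIP | LibDem | Green |
|---|---|---|---|---|---|
| BBC Poll of Polls | 34% | 33% | 13% | 8% | 5% |
| Survation | 33% | 34% | 16% | 9% | 4% |
| TNS-BMRB | 33% | 32% | 14% | 8% | 6% |
| Opinium | 35% | 34% | 12% | 8% | 6% |
| Populus | 34% | 34% | 13% | 9% | 5% |
| ICM | 35% | 35% | 11% | 9% | 3% |
| Ipsos Mori | 35% | 30% | 10% | 8% | 8% |
| ComRes | 35% | 32% | 14% | 9% | 4% |
| YouGov | 34% | 34% | 12% | 9% | 5% |
| Result | 36.90% | 30.40% | 12.60% | 7.90% | 3.80% |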

or as a graph

To me that looks pretty accurate – certainly well within the expected margins of error.
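As a rough check on those margins of error, here is a minimal sketch in Python. It assumes each poll behaves like a simple random sample of 1000 people and uses the usual 95% normal approximation; neither assumption comes from the pollsters (real polls are weighted and adjusted), so treat the output as indicative only. The function name `margin_of_error` is my own.

```python
import math

SAMPLE_SIZE = 1000  # assumed: the group of 1000 people each poll is said to use
Z_95 = 1.96         # assumed: 95% confidence level (normal approximation)

# BBC Poll of Polls figures and the actual results, in percent (from the table above)
poll = {"Conservative": 34, "Labour": 33, "UKIP": 13, "LibDem": 8, "Green": 5}
result = {"Conservative": 36.9, "Labour": 30.4, "UKIP": 12.6, "LibDem": 7.9, "Green": 3.8}


def margin_of_error(share_percent, n=SAMPLE_SIZE, z=Z_95):
    """Approximate margin of error, in percentage points, for one party's share."""
    p = share_percent / 100
    return 100 * z * math.sqrt(p * (1 - p) / n)


# For each party: (size of the poll's error, margin of error), in percentage points
errors = {
    party: (abs(result[party] - poll[party]), margin_of_error(poll[party]))
    for party in poll
}

for party, (error, moe) in errors.items():
    verdict = "within" if error <= moe else "outside"
    print(f"{party:<12} error {error:.1f}  margin ±{moe:.1f}  -> {verdict}")
```

Under these assumptions every party's Poll of Polls figure falls inside its margin, although the Conservative figure only just does.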
The problem is, of course, that in our First Past the Post system, seats aren’t allocated according to the total percentage, but by the winner in each electoral region. The relationship between the percentage vote and the number of seats a party actually wins is both complex and messy. As such, the polls are (still) a good predictor of a party’s overall vote, but a poor indicator of how many seats it will win.
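To make the gap between votes and seats concrete, here is a toy sketch in Python. The five constituencies, the parties A, B and C, and all the vote counts are invented for illustration; they are not real election data.

```python
from collections import Counter

# Hypothetical constituencies: party -> votes cast (all numbers invented)
constituencies = [
    {"A": 40, "B": 35, "C": 25},
    {"A": 40, "B": 35, "C": 25},
    {"A": 40, "B": 35, "C": 25},
    {"A": 10, "B": 70, "C": 20},
    {"A": 10, "B": 70, "C": 20},
]


def vote_shares(all_votes):
    """Each party's share of the total vote across all constituencies."""
    totals = Counter()
    for votes in all_votes:
        totals.update(votes)
    grand_total = sum(totals.values())
    return {party: count / grand_total for party, count in totals.items()}


def seats_won(all_votes):
    """First Past the Post: the party with the most votes takes the seat."""
    winners = Counter(max(votes, key=votes.get) for votes in all_votes)
    parties = {party for votes in all_votes for party in votes}
    return {party: winners.get(party, 0) for party in sorted(parties)}


shares = vote_shares(constituencies)
seats = seats_won(constituencies)

for party in sorted(shares):
    share = shares[party]
    print(f"Party {party}: {share:.0%} of votes, {seats[party]} of {len(constituencies)} seats")
```

In this made-up example party B gets the most votes (49%) but only two seats, party A wins three of the five seats on 28% of the vote, and party C gets no seats at all on 23%.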
This seems pretty obvious, but given that this has always been the case, why has it suddenly become an issue in this election? Not least because we’ve moved away from a two-party system: with various smaller parties becoming more significant, the discrepancy between the overall percentage votes and seats is much greater this time, as many commentators have observed:
(source: http://i100.independent.co.uk/)
Some will argue that this is necessary to ensure we have definitive winners and strong parliaments, whilst others will argue that this is unfair and our electoral system needs reform, and that coalition parliaments are more democratic rather than weak.
However, the real problem lies in a misunderstanding of what the polls tell us when there is no clear leader in the overall percentage vote.
If a party has a 20% lead in the polls, it doesn’t mean it will get 20% more seats – it is instead a good indicator that the party will get a majority of the seats.
If the party had a 40% lead, rather than indicating a bigger majority, it really indicates that you can be more confident that the party will win a majority of seats; on the other hand, a 10% lead means the prediction is less reliable.
Hence, a poll with the two leading parties running neck and neck does not tell us that we will get a hung parliament (as the media constantly reported) but rather that the poll’s ability to make a prediction is extremely weak (admittedly the media also talked about the outcome’s uncertainty, but not quite for the right reasons).
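One way to picture a lead as a measure of confidence rather than of seat share is the toy model below, sketched in Python. It assumes a simplified two-party contest in which the party with more votes wins a majority, and a polling error on the lead that is normally distributed with a standard deviation of 4 points. Both the model and the figure of 4 are my own assumptions for illustration, not estimates from any real poll.

```python
import math

POLL_ERROR_SD = 4.0  # assumed: standard deviation of the error in the polled lead, in points


def chance_of_majority(lead_points, sd=POLL_ERROR_SD):
    """Chance that the true lead is above zero, given a normally distributed polling error."""
    return 0.5 * (1 + math.erf(lead_points / (sd * math.sqrt(2))))


leads = [0, 5, 10, 20, 40]
chances = [chance_of_majority(lead) for lead in leads]

for lead, chance in zip(leads, chances):
    print(f"lead of {lead:>2} points -> chance of a majority {chance:.1%}")
```

In this model a bigger lead pushes the chance of a majority towards certainty, but says nothing directly about the size of that majority. A lead of zero gives exactly 50%: a coin toss, not a forecast of a hung parliament.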
It remains to be seen what influence the polls (or rather the misinterpretation of the polls) had on how people voted, and whether this had a significant impact: for example, how many people voted tactically on the assumption that the outcome would be a hung parliament?
One notable feature of all the polls, however, is that the SNP was lumped together in the “Others” category, which may explain why its performance came as such a surprise to the media and politicians.]]>

## Royal Society Future of Scholarly Scientific Communication Live Blog – Final Afternoon

<![CDATA[This is my first attempt at a live blog so apologies in advance. This follows on from a liveblog of earlier in the workshop here.
Economics of Scholarly Communication (chair: John Skehel, FRS)

• if scholarly communication should be done on commercial basis
• role of learned societies

Mark Thorley (NERC)
Claims not to be anti-publisher (some of my colleagues are publishers) but has been asked to be controversial!
Compares publishers with the plant from Little Shop of Horrors (“Feed me”).
“Publishers believe they have right to be fed” – “arrogance matched by avarice”
More seriously, publishers have responsibilities to shareholders etc. as well as to society. Where publishers have become very profitable, whose fault is that? The community is to blame for giving them money in the first place.
As long as publishers deliver services funders require in a cost effective manner, profit is not an issue – providing not a monopoly provider.
cf service from water company (can’t choose and get worse service)
OA is changing the philosophy. Readers don’t have a choice of where to find research, but authors in theory can choose where to place it. In practice, that choice is artificially limited by biased or misdirected reasons for choosing particular journals. Need to create a much more effective market. Author’s choice comes at a cost – no longer free, but dissemination needs to be seen as part of and (explicitly) funded by the research process. Need to understand (e.g. via Jisc work) total cost of ownership.
Role of funders: shouldn’t set APC or profit levels. Choice is with publishers to work with funders to create an effective and transparent market, so researchers and institutions can make real choices how to spend the scarce resources they have.
Learned societies are effective at recycling public funding through publishing into supporting research, especially early career researchers, so need support.
Liz Ferguson (Wiley)
Not a failed scientist but a scientifically trained professional.
Will be provocative. Will address what it is publishers do to earn\deserve that profit.
1665 – WW2: university and society publishers dominated the market. New entrants were willing to take risks, saw opportunities, and offered organisation and scale.
Growth in articles requires scale to manage. Commercial publishers support, manage and financially support this at industrial scale. An often-heard desire is to take management activity into libraries and institutions – but bringing this into overstretched institutions creates risks from austerity and the availability of public funding.
Sustainability is inherent part of being a commercial organisation – this needs long term profit not short term profit.
Wiley publish on behalf of a large number of learned societies – they look to publishers to bring efficiencies and sustainable income.
Publishers re-invest profit – previously into new journals, now into services – to make scholarly communication more effective and easier, for example Macmillan’s investment in Digital Science. Double-digit percentage investment in technology (e.g. semantic linkage and discovery).
Have supported gold and green OA infrastructure (and note green does not have an income stream).
Moves towards open data and open science – publishers putting in workflows to support this and data citation.
Also invest in benefits to scholarly communications as a whole e.g. CrossRef.
Making profit on which we pay taxes is part of a sustainable infrastructure – innovation requires profit.

Leighton Chipperfield (Society for General Microbiology)
Also has commercial publishers as friends (and wife!)
How can society publishers justify profit?
Charitable societies generate surplus to help support other activities.
When he started, SGM had to think about OA, a joined-up approach to digital etc. Now have all those things and seeing the benefits.
Learned societies to give communities what they want and stand up for them e.g. SGM does small focused meetings. Difficult to see how open support to communities could be done via commercial models. However, societies need to diversify their revenues. Societies are making efforts including conferences, big data, etc. Many societies reassessing the value to their members. Need to consider financial health and the value to their communities.
Are societies the best custodian of scholarly communication? depends on society! Three magic ingredients:

• governance structure empowers professional publishing staff to do business of publishing whilst scientists concentrate on science
• agility
• engagement from society members

Scholarly communication cannot be done at a loss – some form of publishing costs have always and will always exist. Question is whether to be done on profit or surplus basis – depends on political and philosophical ideals.

Open To Audience
David Colquhoun: Why should someone pay £5000 to publish (APC)?
Mark Thorley: concerns over hybrid models versus born OA – but need to know the value received from the costs. This isn’t yet as transparent as it needs to be
Liz Ferguson: not always difference between hybrid versus born OA. If not a born OA publisher there is a lot of legacy.
Audience (didn’t catch name): APC of only published articles cover the costs of all the submitted articles including rejected ones – hence high. Suggests submission fee instead.
Michael Jubb: Need dialog with publisher on what is needed – currently tends to be between funders and publishers but researchers not so good at engaging
Possible role of learned societies to represent researchers? Publisher digital product management groups now test on real users\researchers and seek opinions etc.
Mike Taylor rejoins: don’t need a dialog – publishers are service providers
Tony Hey: each year library asked which journals to cancel (due to increased costs and shrinking budgets) i.e. couldn’t afford what publishers were offering. Publishers didn’t recognise that they were charging more than the university ecosystem could afford, hence the change invoked by OA. Don’t want package deals (just the journals they want). Should librarians be dealing with professional deal makers?
Yvonne Nobis: charges are based on historical subscriptions rather than current ones i.e. e-journal costs predicated on print spend 25 years ago. Reductions in (e-)publishing costs not passed onto consumer.
Liz Ferguson: sales models are changing
Mike Brady: setting up a company and setting up a journal both require risk and agility (or die). These require money. Not requiring surplus\profit is ideologically naive. However, any company which overcharges will fail when alternatives emerge. That a small number overcharge does not imply profit is bad.
Mike Brady: How to transition from product to service?
Pete Binfield: legacy publishers distort the market as author choice is based on “impact” rather than service
Audience: hybrid model is more expensive. Market is dysfunctional. How do we fix this?
Leighton: can’t come up with UK solution – it needs to be global
Adam Smith: couldn’t funders just dictate?
Mark Thorley: bound by what was agreed in Finch process. Wants it to be an open market for author to publish in any journal

(coffee break)

Cost to Research Community (chair: John Skehel, FRS)
Stephen Curry (Imperial College)
Want a good system of scholarly communication, but one which is good value for money (in times of austerity).
Need a working market but if so profit is not a “bad thing”. But need to check market is working and levels of profit are not excessive.
Operating Profit Margins from High Tech
Pharma – Pfizer: 40%; GSK: 20%
Media – Disney: 14%; 21st Century Fox: 15%
Auto – Hyundai: 10%; Daimler: 6%
Bank – JPMorgan Chase: 16%; Bank of America: 20%
Academic publishers – Elsevier: 36%; Springer: 34%; Wiley: 42%
This looks high compared to other industries.
However, is the problem high profit, or that the market isn’t working? There is no real choice – journals are not exchangeable and publishers exploit this (entirely rational, but it does not give the consumer the best service).
Commercial publishers are 5 times more expensive than not for profit.
Publishers also get free labour (e.g. peer review)
Solutions not reached by creating a them and us situation
What are we going to do about it? Two major themes of meeting

• incentives (the right ones)
• transparency

Need more transparency and wider understanding\awareness of the true costs of publishing. Funders should be more aggressive (e.g. withdraw support from hybrid). Authors should be involved in cost-benefit analysis (not given a bottomless pit). Foster new, lean, better-value-for-money publishers entering the market. Value on service rather than “nice impact factor”.
John Wood (ACU)
Knowledge isn’t cheap – the world spends \$1 trillion a year on R&D; is this a good investment?
Advice in 1971: “you should spend as much time communicating your results as you do collecting and analysing them” (from Geoff Greenwood) – however, there is now demand for instant publication.
Hidden costs: time, financial, who picks up the bill after the research grant
(hopefully a link to the slides will be tweeted as they are flying past too fast!)
costs: to individual, to institution, to infrastructure provider (e.g. Jisc), benefits to other researchers, to industry, social benefit, benefit to country, benefit to solving grand challenges.
G8 Ministers report – “the data harvest” – recommendations:

• Do require a data management plan and how it is implemented
• Promote data literacy across society from researcher to citizen (start at schools)
• Grants and incentives should include data sharing (communicate)
• develop tools and policies to build trust in data-sharing
• support international collaboration
• don’t regulate what we don’t understand
• don’t stop what has begun

Huge costs moving forward – e.g. SKA will produce 10x the data flow of the current internet
Incentives of EC

• One Science Research Cloud
• two reports:
  • impact of open science on universities including promotion of staff
  • the Open Science – Open Innovation Ecosystem (how we do business in the future)

Audience Participation
John Skehel: to what extent does the ability to pay for gold OA vary across universities?
Stephen Curry: proportional to grants. Can be a vicious circle.
David Colquhoun: OA could mean publishing is very cheap – but how do you funnel the savings back to the learned societies?
John Skehel: Do commercial publishers quake when they hear such things?
Bob Campbell\Liz Ferguson: No! Masses of innovation in publishing (startups, new entities, new technologies etc.)
Audience: Do we need to deal with the impact factor culture first?
Stephen Curry: hear a lot about academic freedom but little about academic responsibility, which includes researchers seeking value for money. All these issues are interlinked – not a simple linear solution. Suggests FRS (since established) should have a policy of not publishing in high impact journals.
Emma Wilson: very few gold OA journals in chemistry – mainly hybrid. If RCs withdrew support for hybrid (due to high costs) it would impact chemistry. Cost of gold versus hybrid OA varies across disciplines (not such a higher cost in chemistry, for instance). Also authors really like hybrid journals.
Cameron Neylon: frustrated with discussion over transitional cost. We know there’s a lot of money in the system. We know there are savings to be made. How do we invest the future savings we can make? Should we move to more creative funding models (infrastructure funding, endowments), so we can invest in savings rather than the current revenue status?
Mike Taylor: on not paying hybrid APCs and whether this will harm the careers of academics, notes that Harvard doesn’t pay hybrid APCs – has this affected its academic standing or its researchers? Another comment from the audience: Norway also funds gold OA but not hybrid.
Ginny Barbour: need to evaluate journals on rigour, ethics, data policies – not brand
Audience: we need to address perverse incentives [a common meme]. Can the Royal Society help promote a policy\sign up to change the incentive structures\what makes a good academic CV.
Perception that people need a (e.g.) Nature paper but this is not always true. But how do you deal with administrators badgering people on quantity of papers, grants – (Stephen Curry) need integrity and moral courage in the leadership of the institution.
Aileen Fyfe: don’t forget the rest of the world! [this is becoming a meme too]
We are a long long way from a global market for OA but we have starting points
Tony Hey: life science discovery fund used a refereeing service from AAAS at a cost. Could learned societies offer a refereeing service at a cost? (This was discussed in the breakouts at the previous meeting.)
National Academy of Sciences – currently hybrid (plus green after six months) but no great push for Gold only – can’t speak for other journals
Audience: hybrid may not be so high if normalised for rejection rates. High cost not a monopoly of hybrids
Cameron Neylon: not all journals equal in terms of peer review, ethical review etc. OA may do better because they can’t rely on brand. However, lack of transparency of service differentials and costs (imaging check, statistical checks, etc.) – need better conversation on what services we currently get, what we want\need etc.

Remarkable range of topics over the four days created debate which could have lasted far longer.
Have ranged from APC, fraud, peer review, business models etc.
Characterisation of diverse ecosystem and actors, transformed by technology and rapidly evolving.
Generated more questions than answers but does suggest frameworks to address these in particular conversations between scientists, communication scientists and providers
RS evolved to stimulate debate on all issues around science – doesn’t mean we agree but act as a forum to generate “more light than heat”
RS wants to stimulate debate on all aspects of the science communication debate. Will begin by creating an account of the meetings on the RS website, but want this to be a living document.
Thanks to: Sir John Enderby, FRS (rapporteur); Stefan Janz (organising); Stuart Taylor (facilitating); and to all participants

]]>

## Why this blog is so late…

<![CDATA[The preface to Michael Dummett's Frege's Philosophy of Mathematics is an apology why his book was thirty years late. Following his example, this is an apology why this blog is ten years late!
Setting up a personal blog is something I've been meaning to do for, well a long time, probably as long as the early 2000s. As a techie, the first obstacle was the technology. I can't recall how many different blog platforms I've installed only to become side-tracked in playing with the configuration or comparing features between different blogging software.
The other distraction is that sitting in front of a computer is not conducive (for me at least) to thinking – not only are there distractions such as the inrush of incoming e-mails (somewhat belatedly I've learnt to switch off the new e-mail notification), but when your day job involves staring at a computer, staring at a computer in your spare time does not necessarily appeal. Most of my best thinking is done whilst walking down the street, or waiting at a bus stop, in the bath etc. – locations where it is not so easy to type up ideas: I don't find keyboards on mobile phones good for typing more than a few words of text (and typing whilst walking down the street is not a great idea); speech recognition works better for searching (where fuzziness of search can counteract errors in transcription) than dictating (last time I tried speech recognition software it made some interesting sentences out of bath water splashes); and direct mind interfaces are a long way off (and I'm not sure I'd trust them either).
However, the real excuse is that I'm not a natural writer, and tend to regard it as a chore rather than a delight, so we'll see how long this blog will last!
As an aside, I was finally jogged into actually creating this blog by the need to help liveblog a workshop at the Royal Society on the Future of Scholarly Communication. What I hadn’t recalled is that in the preface (from 1991) mentioned above, Dummett, after commenting on his disgraceful “completion rate”, launches into an attack on how “British universities are in the course of being transformed by ideologues who misunderstand everything about academic work … [as] part of a transformation of society as a whole … The plan of the ideologues is to increase academic productivity by creating conditions of intense competition … output is monitored by the use of performance indicators … universities have no option but to co-operate in organising the squalid scramble that graduate study has become, in introducing the new ‘incentives’ for their professors and lecturers and in supplying the data for the evaluation process. The question is to what extent they will absorb the values of their overlord and jettison those they used to have … it is catastrophic when these politicians display total ignorance of the need to judge academic productivity on principles quite different from those applicable to industry … overproduction defeats the very purpose of academic publication.” Much of the discussion at the workshop kept coming back to how broken the current incentive schemes are!]]>