




CATEGORY: Blog
TOPIC: How risky is too risky? Evaluating the expected impact of high-risk/high-reward research

FQXi Administrator Anthony Aguirre wrote on Feb. 4, 2016 @ 17:51 GMT
Image credit: Caltech, R. Hurt, IPAC
On Dec. 8, 2015, two different groups (sharing an author) posted papers to the arXiv announcing the possible detection of planet-sized objects in the far outer solar system (Vlemmings et al., arXiv:1512.02650v2 and Liseau et al., arXiv:1512.02652v2). There was a brief flutter on Twitter and in the media, which shortly died down. As far as I am aware, no large-scale effort has begun to confirm or refute these potential detections, and both papers have since been withdrawn until further data is available.

Six weeks later, on January 20, a paper appeared in The Astronomical Journal adducing strong circumstantial evidence, based on the orbits of solar-system objects, for a large ninth planet in the outer solar system (K. Batygin and M. E. Brown, The Astronomical Journal, Volume 151, Number 2). The media attention was staggering, and the paper has been downloaded 243,547 times as of this writing. There are almost certainly numerous intense efforts underway to try to detect the object.

While it may be surprising to see much more attention (and resources) directed toward circumstantial evidence for a 9th planet than to direct potential observation of one, this is the sort of decision with which researchers — and research funders, and journalists — are confronted all the time.  

These decisions are, in essence, predictions about how things are going to unfold; this has gotten me interested in how to better solicit and aggregate expert predictions in science and technology, and helped motivate a new project I and several other physicists have been developing, called Metaculus.

To be more specific, there is an important class of decisions that can be posed in the form of "what is the expected return on my investment of time/effort/attention/funding in X?" For some science-based examples:

— "What is my expected return in using my time on telescope X to search for the planet suggested by this data?" Here the potential "return" is fame and satisfaction at discovering a planet.

— "What is my expected return in skimming/reading/studying this new paper?" Here the return might be insight gained, entry into a promising new research direction, etc. 

— "What is the expected return in funding this research grant?" Here, the return could be papers published, talks given, meetings run, or more abstractly intellectual impact on a field or set of questions.

— "What is the expected return on building this instrument?" The impact here would be scientific discovery, possibly measured by papers, citations, etc.

A central idea in these questions is that of expected return. Most simply, this could be the likelihood of success times the return if successful. Or, if there are multiple possible outcomes, it could be the sum/integral of the probability of each outcome times that outcome's impact. 
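The probability-times-impact calculation above can be made concrete with a short sketch. The outcome labels and numbers below are purely illustrative, not drawn from any real project:

```python
# Expected return of a research "investment" as a probability-weighted
# sum over possible outcomes. All figures here are hypothetical.

def expected_return(outcomes):
    """outcomes: list of (probability, impact) pairs; probabilities
    should sum to at most 1 (the remainder is 'no result', impact 0)."""
    total_p = sum(p for p, _ in outcomes)
    if total_p > 1:
        raise ValueError("outcome probabilities exceed 1")
    return sum(p * impact for p, impact in outcomes)

# A long shot: 5% chance of a major result, 20% chance of a modest one
# (impact in arbitrary 'scientific value' units).
long_shot = [(0.05, 100.0), (0.20, 10.0)]
# A safe bet: 90% chance of a modest result.
safe_bet = [(0.90, 8.0)]

print(expected_return(long_shot))  # 7.0
print(expected_return(safe_bet))   # 7.2
```

Despite very different risk profiles, the two projects here have nearly the same expected return, which is exactly why the probability estimates matter so much.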

The idea of high expected return (per dollar) is part of FQXi's core philosophy (and grantmaking criteria). To make a financial analogy, government funding agencies tend to purchase the equivalent of a diverse-but-safe portfolio of bonds and index funds: decent returns, fairly safe. These agencies tend not to fund the science equivalents of startup companies — projects where the chance of major success is fairly low, but the impact if successful is very high. We believe that in the scientific world, as in the corporate world, both types of investment are very important, and one role of FQXi is to try to fill in this end of the research-funding portfolio.

Evaluating the "probability of success" is, though, rather difficult. It's often not hard to assess which of two projects is more likely to be successful. For example, I would say the Wendelstein 7-X fusion experiment and subsequent efforts are more likely to lead to useful energy generation than Brillouin Energy's LENR experiments. But how much more likely? Ten times? A thousand? A million? The 7-X's funding is probably about 1000 times higher, so which experiment has the higher per-dollar expected return on investment depends on this likelihood ratio! Or what about tabletop quantum-gravity experiments versus a bigger version of the "holometer"?
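The role of the likelihood ratio can be seen in a back-of-the-envelope sketch. All probabilities and costs below are invented for illustration and do not reflect the actual budgets or odds of either experiment:

```python
# Per-dollar expected return: probability of success times impact,
# divided by cost. All numbers below are hypothetical.

def per_dollar_return(p_success, impact, cost):
    return p_success * impact / cost

# Treat the payoff (useful energy generation) as the same unit impact
# for both projects; only probability and cost differ.
IMPACT = 1.0

big_experiment = per_dollar_return(p_success=0.30, impact=IMPACT, cost=1000.0)

# If the cheap experiment is 3000x less likely to succeed, its 1000x
# cost advantage is not enough and the big experiment wins per dollar;
# at a likelihood ratio below 1000x, the cheap one would win instead.
cheap_experiment = per_dollar_return(p_success=0.30 / 3000, impact=IMPACT, cost=1.0)

print(big_experiment > cheap_experiment)  # True
```

The crossover sits exactly at the funding ratio: whichever side of 1000x the true likelihood ratio falls on decides which project is the better per-dollar bet.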

The idea of Metaculus is to generate quantitative and well-calibrated predictions of success probabilities, by soliciting and aggregating expert opinion, and by (in the process) helping people improve their skills at quantifying and predicting impact. Metaculus poses a series of questions, for example "Has a new boson been discovered at the LHC?", with relatively precise criteria for resolving the question after a specific time. Users are invited to predict likelihoods (1-99%) for these questions, and later awarded points for accuracy in their predictions. Studies show that by carefully combining the predictions of many users, better precision and calibration can be achieved.
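One simple way such aggregation and scoring could work is sketched below. This is a generic illustration of crowd-forecast pooling (a mean in log-odds space) and Brier scoring, not a description of Metaculus's actual algorithm:

```python
import math

def logit(p):
    """Map a probability in (0, 1) to log-odds."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Map log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))

def aggregate(probs):
    """Pool individual forecasts by averaging in log-odds space."""
    return inv_logit(sum(map(logit, probs)) / len(probs))

def brier(forecast, outcome):
    """Squared error of a forecast against a 0/1 outcome (lower is better)."""
    return (forecast - outcome) ** 2

# Four users forecast "Has a new boson been discovered at the LHC?"
forecasts = [0.60, 0.70, 0.85, 0.55]
pooled = aggregate(forecasts)
print(round(pooled, 2))

# Once the question resolves (say, outcome = 1), each forecaster is scored:
print([brier(p, 1) for p in forecasts])
```

Averaging in log-odds space rather than averaging raw probabilities gives more weight to confident forecasts near 0 or 1, which is one common (though not the only) choice in the forecasting literature.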

My experience so far suggests to me that there are several ways a prediction platform like this, when applied to scientific research, can be complementary to traditional peer-review. The effort of creating precise criteria for 'success', and in trying to assign numbers to success likelihood, has a quite different feel than just reading to understand whether a paper/proposal is intellectually sound or correct. It also makes me realize that in all of the peer review and assessment that I have done, I've never been asked (or asked someone) to supply a number like "what is the probability that X will be the result of funding/publishing Y?"  Since that's a significant part of what peer review is, isn't that a bit odd?

Perhaps there is an opportunity for real improvement here. A recent study made the case that prediction 'markets' are quite effective — and more effective than surveys even of experts — in forecasting whether given research (in this case in psychology) would be successfully reproduced (PNAS, Vol 112, no. 50).

I'm very interested in everyone's ideas for how something like Metaculus could be used in trying to make the biggest impact we can out of the limited resources society throws in the direction of us scientists — please comment!


Pentcho Valev wrote on Feb. 4, 2016 @ 19:47 GMT
There is a type of research for which the probability of success is, a priori, zero. The FQXi contest "Which of Our Basic Physical Assumptions Are Wrong?" disclosed no wrong assumption, and I am sure that the participants have long abandoned their essays' topics. Research questioning basic principles is doomed. It is like trying to prove that a town has been built on land illegally. No matter how convincing your proof is, nobody is going to destroy the town.

Pentcho Valev


FQXi Administrator Anthony Aguirre replied on Feb. 4, 2016 @ 19:52 GMT
So you are suggesting that no research should be funded that questions basic principles? Which ones, exactly, should go unquestioned?


Pentcho Valev replied on Feb. 4, 2016 @ 20:17 GMT
On the contrary, I am complaining that this type of research is hated (funding is out of the question). Would you give support to a person questioning the second law of thermodynamics or Einstein's 1905 second postulate? Knowing that if Einstein's second postulate is proved false, theoretical physics as a whole collapses immediately?

Pentcho Valev


Georgina Woodward replied on Feb. 4, 2016 @ 20:57 GMT
Hi Pentcho,

this process is based on the wisdom of crowds. The aggregated wisdom of independent individuals can be greater than the best opinion of one or a small number of experts. It is important that the opinions are independent and not swayed by what is thought to be most acceptable to the others. This is a way of breaking out of the "group think". Wisdom of crowds, Wikipedia.


Robert H McEachern wrote on Feb. 4, 2016 @ 21:10 GMT
The biggest impact for whom? It can be (and has been) argued that anyone receiving funding has already gotten their impact (the funding). But what was the impact for the people providing the funds in the first place? Lack of any adequate answer to that question is what gets many programs, like the Superconducting Super Collider, defunded. A related issue is when the project should be funded. We all understand that many individual investigators would like the fame and glory of a discovery made possible by immediate funding. But which investigator (present or distant future) achieves that glory is of little concern to the funders, unless a case can be made for an immediate impact upon the funders (taxpayers, etc.), rather than those being funded.

Rob McEachern


Georgina Woodward wrote on Feb. 4, 2016 @ 21:23 GMT
Hi Anthony,

I can see how this can be used to predict likelihood of success, though there may need to be care in specifying what that success actually is. E.g., from Metaculus: "In 2016, will an AI player beat a professionally ranked human in the ancient game of Go? ... This question is positively resolved if, in 2016, an AI with no handicap beats a professional human player in an official game of Go." Despite widespread press coverage of a European champion being defeated, the prediction is only at 75%. Nova77 commented: "Apparently the game reported in the press was not 'official'." Yet, per Cade Metz, the AI "matched its artificial wits against Fan Hui, Europe's reigning Go champion, and the AI system went undefeated in five games witnessed by an editor from the journal Nature and an arbiter representing the British Go Federation." I wonder, is the discussion between voters counterproductive?


FQXi Administrator Anthony Aguirre replied on Feb. 4, 2016 @ 21:27 GMT
Hi Georgina,

In that case, the decision was that the 'behind closed doors' games in which Hui was defeated did not conform to the letter of the question. So what's reflected in the 75% is some combination of (a) uncertainty as to whether an official game will be played with characteristics of the unofficial one, and (b) sitting predictions that have not been updated.

It's quite hard to get just the right resolution criteria, we've discovered, but that is quite educational in itself!


Georgina Woodward replied on Feb. 4, 2016 @ 21:41 GMT
Hi Anthony,

I am also unsure of whether likelihood of initial success adequately differentiates between projects. The future impact from one paper or experiment might be very different from another.

Predictions are very often based on what we already know. I think the development of the World Wide Web, the prevalence of home computers, and the computer-gaming industry are things that would not have been crowd-predicted in 1970. Perhaps this approach is good for predicting outcomes that fit within our current mindsets, but not radical change.


Georgina Woodward replied on Feb. 4, 2016 @ 21:51 GMT
Hi Anthony,

what is an official game? Surely five matches with an independent witness and a Go Federation official meet that criterion; it does not say a public match. I agree that some of the voters may not have updated their opinions. There also seems to be only a very small crowd at the moment. How small is too small? You are looking for the wisdom of the crowd, not of a small group, especially not a small group conversing with each other and comparing their opinions.


Domenico Oricchio wrote on Feb. 5, 2016 @ 00:53 GMT
If I understand well, it would be possible to assign a probability of success to a piece of research when a referee reads the paper, so that a vote can be given that permits a nation to allocate resources to an experimental or theoretical team.

The only problem is that the number of referees for an article is low (it is easy to influence a few), and the number of readers may not be indicative of the likelihood. But a crowd of researchers can give a good likelihood estimate, if the work belongs to their area of expertise: for example, the endorsers of arXiv could give a likelihood vote for a research article, so that nations can fund recommended projects, or the arXiv votes could be used by foundations to choose promising projects. In this way arXiv (and other open-access sources) could become important for obtaining financing, and nations could increase the use of open access, improve the success of research, and reduce reliance on closed-access journals (and the cost to a nation).


Steve Dufourny wrote on Feb. 5, 2016 @ 21:09 GMT
Hello Mr Aguirre,

It is a very interesting article. Funds are important for the evolution of sciences, technologies, ... The most important things, after all, are determinism and rationalism. The priorities also are essential. The sciences are there also to help this planet and the evolution of animals and plants. It is time to focus on a purely altruistic and totally universal comportment. Research and inventions can be harmonised globally speaking. The synergies can be relevant. The financing is so important, in fact. Beautiful article. Regards


Anonymous replied on Feb. 6, 2016 @ 10:09 GMT
Hi Mr Aguirre,

If FQXi wants, I could try to create an international Humanistic Sciences Center, focused on global priorities. I have been trying for more than 8 years, but it is difficult; I have made promises to several friends in Africa and Europe, but I have had several serious personal problems. The solutions exist, in fact. I believe strongly that alone we are nothing. Complementarity is so important. Adapted productions can help. I am not a good administrator or businessman. I have discussed with Ms June Klein of the Bill and Melinda Gates Foundation. I have explained the project also to my region, to the director of an important system. I can have greenhouses (I am a producer of plants, flowers, compost, substrates, ...) near a geothermal site near my town in Belgium. I try to centralise the system, but it is not easy. Regards


Steve Dufourny replied on Feb. 6, 2016 @ 10:10 GMT
it was me


Georgina Woodward wrote on Feb. 6, 2016 @ 11:01 GMT
Hi; I've just taken a look at the linked article about psychology research replication and prediction markets. I was actually shocked and surprised by the amount of non-replicable research. From PNAS, "Using prediction markets to estimate the reproducibility of scientific research": "the costs associated with irreproducible preclinical research alone have recently been estimated at US$28 billion a year in the United States." ... "We find that the hypotheses being tested in psychology typically have low prior probabilities of being true (median, 9%) and that a 'statistically significant' finding needs to be confirmed in a well-powered replication to have a high probability of being true" ... "The RPP project recently found that more than one-half of 100 original findings published in top psychology journals failed to replicate (10)." Does anyone else find that a surprisingly large amount of irreproducible research and cost?


Georgina Woodward replied on Feb. 6, 2016 @ 11:40 GMT
Wouldn't it be good if there were a way to predict which experiments will be successful and reproducible before they are carried out? It seems there is a lot of wasted effort not producing actual scientific progress. On the site Hyperlipid, scientific research papers to do with metabolism are discussed and often pulled apart. It seems to me it would be really good if the sort of expertise used to criticize the research were instead able to work in an advisory role, pointing out the flaws and omissions in the experimental designs. Then there wouldn't be the abundant production of misleading conclusions. I think it would probably require a lot of the people working in particular fields to contribute to the success prediction. Could they be motivated to do that? Also, there might be reluctance to submit designs for evaluation for various reasons, e.g. potential loss of research funding, commercial confidentiality, or intellectual property concerns.


FQXi Administrator Anthony Aguirre replied on Feb. 6, 2016 @ 15:05 GMT
I find it pretty dismaying (though I would also like to see this study itself reproduced!). Of course, an experiment with a 'null' result is not a failure -- that's useful data. But an experiment with flawed methodology is pretty useless.

There's a problem, of course, in that a bad experiment is an *advantage* if your goal is to get (on their face) interesting results, since bad experiments will produce all sorts of results, and some will be interesting! Add that to publication bias (publishing just the interesting stuff) and there are a lot of forces producing the large fraction of irreproducible-but-interesting publications we seem to see.

I'm not sure what to do about this. I agree that some sort of criticism from other experts *before* the experiment would be great, though it may be hard to arrange. Big experiments (say, the LHC) with large collaborations do a lot of this; small lab experiments, social-science research, etc., much less so.

If there were a prediction platform for predicting reproducibility of experiments, that might help. It would need to have the right people involved and motivated though. A first step might be a site that just tracks which experiments have been reproduced, and which not -- a sort of 'live' version of what's in this paper. That would at least add some level of 'social' shame/pressure to make your own research reproducible. Maybe a prediction platform could then be glued onto that...


John R. Cox replied on Feb. 6, 2016 @ 17:10 GMT

What you have identified is an old problem; it's called 'ambition'. But it also highlights the value of peer review and the old-fashioned 'brick and mortar' collegiate experience, where group discussion is often structured yet informal, and the bugs tend to get worked out before an individual commits to making anything of a public record. I'd also like to say that this topic is one you already seem fairly well equipped to pursue. :-) jrc


Georgina Woodward wrote on Feb. 7, 2016 @ 03:01 GMT
Hi Anthony,

Why would people choose to be Metaculus participants? The site is almost monochrome and only seems to offer abstract points rewards (possibly redeemable for vouchers or competition entry). I can understand gamblers taking a lot of time to study the form of horses and to get as much information on the stables, training, courses, etc., as the reward is a potential financial gain. I can also understand slot-machine addiction even when the financial reward is unlikely. I have recently read about how losing rats continue to gamble for a sugar-water reward when the game is accompanied by lights and sounds. In both cases a basic reward-seeking drive is stimulated. What does participation mean to Metaculus members? What does successfully predicting an outcome 'mean' to them? 'How does it feel'? If this model is used to predict research potential using experts, how are commitment to background research and persistence of endeavour obtained from the participants?


Georgina Woodward replied on Feb. 7, 2016 @ 07:40 GMT
Re. current Metaculus terms and conditions, the upside for users is?


Georgina Woodward replied on Feb. 8, 2016 @ 00:00 GMT
Hi Anthony,

Re. a Metaculus-like model for evaluation of research potential: the site, as currently seen by a visitor, seems to work against a desire to engage and contribute, at a basic reward-seeking level and in a face-value cost-benefit analysis. I'm not saying that to be unkind, but in trying to address the question you have asked. I think that if you want experts to engage in a research-potential prediction market, a 'you lose all rights to absolutely anything discussed' approach wouldn't be in keeping with the face-value raison d'être of such a market: set up to benefit society by maximizing the return on research investment, not collecting, analyzing and utilizing individuals' contributions for money-making activities. Also, many experts will have limits on the time they can spend on such a project, so what incentive is there for them to do so? A co-operative business project might be better for the desire to engage, and ethically better than a business with a sole director or a small group of directors and shareholders. A co-operative might give each contributor a share of the profit made from selling research-potential predictions, and a share of any profits made from utilization of contributed information in new businesses, articles, etc. For a very large contributor crowd the payout to each will be relatively small, but it is sharing the wealth creation through the collective effort.


FQXi Administrator Anthony Aguirre replied on Feb. 8, 2016 @ 17:03 GMT

Another great question. I think we'll have to experiment with what excites people to get involved and spend time. Some sites like Wikipedia, Reddit, and Quora, as well as science projects like Galaxy Zoo, planet-finding efforts, etc., have created large communities that put in a lot of detailed effort that is paid in status, enjoyment, readership, a feeling of doing a public service, etc. I think all of those could in principle apply here, depending upon where we focus. There is also the fun of betting and seeing if you win or lose, which I think is enjoyable to some people even without money on the line (indeed, for me it is much more fun *without* money on the line!). We can also experiment with financial motivation, such as prizes or 'bounties' on particular questions, like what Quora has recently started.

In terms of the site's aesthetic, I think it appeals to some and not others. We're not aiming for Candy Crush, and wanted the site to have a unique feel that ties in with its more technical and scientific focus, relatively clean and efficient.


Georgina Woodward wrote on Feb. 11, 2016 @ 03:54 GMT
Hi Anthony,

is there evidence that making more predictions does improve skill at quantifying and predicting impact? Or is this a research project in itself? A project that may take a few years at least, because not all impacts are necessarily immediate. Can the Metaculus data from individual predictions be used to show prediction-skill improvement? Or does it fluctuate, as each question has its own unique challenges to consider? The questions all seem to be about well-known, large projects already reported in the popular press, which are quite different from the far more obscure project proposals that FQXi might be presented with, or that might be submitted to national grant-awarding bodies. The estimation of -impact- of these kinds of projects (not just the likelihood of a specified outcome) is quite different from what's currently happening on Metaculus.


FQXi Administrator Anthony Aguirre replied on Feb. 11, 2016 @ 16:49 GMT
Georgina, there is some evidence from the IARPA/ACE publications that prediction is an improvable skill, both from training in better calibration etc., and through practice. They also find that past performance is a strong predictor of future success.

It will definitely be possible to plot predictors' accuracy over time, though as you say this will take a while.

For more 'obscure' prediction targets, I think you would need to identify a niche of predictors with enough domain expertise to understand the issues well. It's combining the domain expertise with the aggregation and the training that makes it potentially powerful. And certainly for grantmaking you'd want a system somewhat different from, and expanded beyond, the current Metaculus. Though I think the probabilities for questions like "Will this lead to 100 citations within 2 years?" would be imperfect but still informative.


Robert H McEachern replied on Feb. 11, 2016 @ 18:39 GMT
"They also find that past performance is a strong predictor of future success."

Almost every stock-investment brochure cautions the opposite with regard to expert stock-portfolio managers. It all depends on how predictable the phenomenon one is attempting to predict ultimately is.

Planetary orbits are highly predictable. Human behaviors, in unfamiliar circumstances, rather less so.

Rob McEachern


Lorraine Ford replied on Feb. 11, 2016 @ 23:33 GMT
I agree with Rob.

