Kepler: Hold the Data?

by Paul Gilster on April 22, 2010

Not long ago I sent out a ‘tweet’ on the Centauri Dreams Twitter feed talking about the number of planet detection candidates the Kepler mission was working with. Almost immediately I discovered that the story had become unavailable at the Nature News site, making me wonder whether the figures were right, but the story is back up (available here) and I can cite it once again. Thus:

Since its launch on 6 March 2009, Kepler, with its 0.95-metre telescope, has been staring at the same field of stars near the northern star of Vega, looking for tiny reductions in starlight caused by a planet passing in front of a star’s face. In January, the Kepler team announced the discovery of five new exoplanets. [Kepler principal investigator William] Borucki says that the team, as of last week, has found 328 more candidates — but that as many as 50% of these may be false positives, where objects such as binary stars confuse the picture.

328 candidates, and much work ahead in weeding out the genuine planets from the false positives. But that’s what this kind of work is all about, a reminder that we’ll need to let Kepler run its three-year course before we really have the data nailed down and have identified (we assume) some terrestrial worlds around Sun-like stars, not to mention small planets around M-dwarfs that may prove to be in the habitable zone. No need to be in a hurry for results, in other words, and if you read the Nature News story, you’ll see that there’s no point in hoping for quick answers anyway.

That’s because, in addition to the inexorable rhythms of good research, the Kepler team’s new policy, as recommended by a NASA advisory panel, may be to hold back data on 400 ‘objects of interest’ until February of 2011, by which time some false positives will be eliminated. The issue is tricky: Researchers are usually allowed to keep data private until publication, which allows for the kind of rigor to be applied that will ensure the work’s accuracy. Caution in making media announcements that could lead to later retractions is completely understandable. The question is how long a delay is reasonable, given the multi-year period Kepler needs to confirm some detections. We should have a final decision soon on whether the strict policy will be put in place.

This comes up now because in June, Kepler is supposed to be releasing data, exposing the information to a wider audience that can work with the material to confirm planets. That original policy would have required the team to turn over its first 43 days’ worth of data, but now we wait for a final decision on possible policy changes, due in a week or so. According to Nature News, the Kepler team hoped to censor 500 objects until mission’s end in late 2013, so the subcommittee’s recommendation actually represents a compromise.

From the article:

Many astrophysics programmes allow researchers a proprietary period with the data. For instance, guest observers on the Hubble Space Telescope get exclusive use of their data for a year before public release. But the tradition for NASA Discovery missions — small, principal-investigator-led missions like Kepler — is to make calibrated data available immediately. That policy has already been changed once for Kepler, last year, when the team was given more than a year to pursue confirmations and work out the kinks in its data processing.

But Borucki says more time is needed because a mission launch delay meant that the team missed out on a season of the ground-based follow-up observations that are needed to verify candidate exoplanets. He also worries about releasing “half-baked” candidates that the media will jump on without an understanding of their uncertainty. “My worry is less of being scooped than it is of putting out inaccurate estimates of what exoplanets are really like out there,” he says.

The mission team’s views aren’t unanimously held by other astronomers, and the article quotes dissenting remarks from Scott Gaudi (Ohio State), who believes that other teams could help the Kepler crew work to confirm candidate planets. ESA’s Malcolm Fridlund, project scientist for CoRoT, is working under a system where data are kept proprietary for one year, but he’s quick to note that the upcoming PLATO (Planetary Transits and Oscillations of Stars) mission will set a different agenda, one that will require the immediate dissemination of data. I’m with Fridlund in believing that with fast release, “You get a larger community and you get a bigger workforce for free. It’s clear that the more people you get involved, the more support you get.”

{ 18 comments }

Erik Anderson April 22, 2010 at 12:39

“According to Nature News, the Kepler team hoped to censor 500 objects until mission’s end in late 2013.”

… which I guess means that they would have had to issue ~fake~ data, since simply removing the objects of interest in an earlier release would effectively identify them. Odd. :-/

Zen Blade April 22, 2010 at 13:23

The age old question of what do we care more about:

The individuals doing the work or the field and progress as a whole?

As a scientist, I kind of care about both and would hate to see research slowed down… but at the same time it sucks (really badly) when someone you know gets scooped and gets no credit for working on a difficult problem.

James M. Essig April 22, 2010 at 14:41

Hi Paul;

Holy Moly! 400 ‘objects of interest’, which may include terrestrial-like planets and small planets orbiting red dwarfs in the habitable zone! When any false positives are finally weeded out one or two years from now, my feeling is that numerous Earth-like, or semi-Earth-like, planets will have been discovered.

This can only help marshal support for Project Icarus. Now that President Obama’s mantra for NASA is to go further, and faster, I think that R&D on technologies such as fusion rockets, fission fragment drives, and perhaps even Project Orion type rockets will obtain a big jump in emphasis, since some versions of these systems could in theory do better than 0.15 C for fission systems and perhaps velocities greater than 0.5 C when you plug in values for the relativistic rocket equation of Delta V = C{Tanh[(Isp/C) ln (M0/M1)]} where Isp is expressed as an exhaust velocity (so that Isp/C is a fraction of C). Such craft are appealing because of their simple, independent and self-contained nature.
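The relativistic rocket equation quoted above can be evaluated with a minimal Python sketch. The function names and the sample exhaust velocities and mass ratios are illustrative assumptions, not figures from any actual propulsion study; Isp is treated here as an effective exhaust velocity given as a fraction of C.

```python
import math


def relativistic_delta_v(exhaust_velocity_c: float, mass_ratio: float) -> float:
    """Delta V as a fraction of c: tanh((v_e / c) * ln(M0 / M1)).

    exhaust_velocity_c -- effective exhaust velocity as a fraction of c
    mass_ratio         -- initial mass M0 divided by final mass M1
    """
    if not 0.0 < exhaust_velocity_c <= 1.0:
        raise ValueError("exhaust velocity must be in (0, 1] as a fraction of c")
    if mass_ratio < 1.0:
        raise ValueError("mass ratio M0/M1 must be at least 1")
    return math.tanh(exhaust_velocity_c * math.log(mass_ratio))


def classical_delta_v(exhaust_velocity_c: float, mass_ratio: float) -> float:
    """Non-relativistic (Tsiolkovsky) delta V as a fraction of c, for comparison."""
    return exhaust_velocity_c * math.log(mass_ratio)


if __name__ == "__main__":
    # Hypothetical (exhaust velocity, mass ratio) pairs chosen only for illustration.
    for v_e, ratio in [(0.03, 100.0), (0.1, 20.0)]:
        rel = relativistic_delta_v(v_e, ratio)
        cls = classical_delta_v(v_e, ratio)
        print(f"v_e = {v_e:.2f} c, M0/M1 = {ratio:.0f}: "
              f"relativistic {rel:.4f} c, classical {cls:.4f} c")
```

The tanh keeps the result below C for any mass ratio, while the classical formula grows without bound; at low exhaust velocities the two agree closely.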

As for discovering a non Fermi-Dirac annihilation reaction that is much, much more rest mass specific exothermic than known nuclear fission or nuclear fusion, we will have to wait to see what the particle physicists and QCD physicists can cook up. I have great hope in this regard.

Pandora, here we come.

NS April 22, 2010 at 14:59

Gee, first one…

The commenters on the article seem to support immediate release of the data, which was my initial reaction too (caveat: I’m a total non-scientist). However, based on what I’ve seen elsewhere the Kepler team has identified a number of scenarios that can lead to a false-positive indication of a planet, and invested substantial time in developing ways to filter these out. Some of these scenarios may not be readily obvious to an outside observer using the data. So maybe the Kepler team has a legitimate concern about releasing data before they’ve at least had a chance to eliminate the false positives they know how to identify?

cephyn April 22, 2010 at 19:14

there’s a centauri dreams twitter feed? i can’t find it on this site anywhere! what is it?

Mike April 22, 2010 at 19:20

Considering that the entire mission has been funded by taxpayer money, then yes, the resulting data should be considered public knowledge. The Europeans and the Canadians and anyone else can run ground-based observations to confirm or deny Kepler’s results. NASA pays (we pay) for RV follow-up on Kepler’s detections using Keck 1. All information should be released as soon as possible; we paid for it. And if corrections have to be made, well, that’s science, we all understand that. I’m sorry, but government agencies and public money should not be used to benefit someone’s run at a Nobel prize.

Administrator April 22, 2010 at 20:32

cephyn writes:

there’s a centauri dreams twitter feed? i can’t find it on this site anywhere! what is it?

I post the occasional ‘tweet’ via @centauri_dreams, accessible through the Twitter Web page or, preferentially, through a client program like Tweetdeck. I don’t send out a lot of tweets but try to get one or two off every day.

Michael Spencer April 23, 2010 at 7:25

The notion that, because it is publicly funded, all work should be immediately available is a little naive–and, frankly, annoying. I do wonder if the public need be informed of all incremental results.

We do hire these guys (in the parlance, I suppose), and so in that sense they do ‘work’ for us. Part of our ‘payment’ is surely requisite support? Isn’t it the case that accuracy is important to scientific careers and to the public value of the work?

And there is this lurking issue, relevant or not: let us say that all of these data are released, and that the media breathlessly report ‘dozens of earths!’, only to take back the claim at some point. Public reaction would arguably be a collective yawn. Why? Because deep understanding of the scientific method is simply lacking in any widespread sense. The result is more diminution in scientific efforts. Ever heard folks say, in response to a health study, for example: “Oh, just wait, another report will come out saying the exact opposite”?

Finally, three years? Seriously? That’s all we are talking about? What’s the rush? Cassini and Galileo were still on cruise after three years!

Eric Goldstein April 23, 2010 at 13:01

Michael Spencer,

You scoff at a delay of three years, but all of us are mortal. In three years, some people who would have greatly enjoyed seeing preliminary half-baked speculative results will be dead. Similarly, some people who could have worked on those results to make them less preliminary and half-baked will also be dead.

I’ve been told by people working on astronomy missions which have a proprietary period that they wouldn’t have worked on the mission without such a proprietary period. This tells me that the reward and incentive system is broken, and it tells me that people working on US taxpayer-funded science missions have a conflict of interest. How about this alternative set-up: one group of researchers works hard to get data into the hands of the community. Tenure, respect from one’s peers, pay, prizes, etc. would be granted based on how accurately the data was collected and how quickly it was made public.

In any case, it is insulting to suggest that the public couldn’t handle preliminary half-baked speculative results. Your medical analogy is excellent — it is indeed a problem that we don’t have the final answers to various medical questions, and it is frustrating and confusing for the public when they are given the opportunity to hear conflicting preliminary conclusions. But the alternative – to deny them that opportunity – is worse!

Mike April 23, 2010 at 13:31

Perhaps I was a little intemperate in my comments. I have a great admiration for the Kepler team. William Borucki deserves the Nobel prize alone just for guiding the project through NASA’s funding maze. They all certainly deserve prime recognition for Kepler’s bounty.
Still, with hundreds of candidates, the ground-based RV follow-up could proceed better if more groups could have access to the transit data. That is also the way science should be done, with different teams working the data.
I hope an acceptable compromise will be reached when the decision is made on how long to restrict access.
As for impatience, just consider the possible import of Kepler’s findings. We may finally learn how common Earth-like planets are. Some of the more elderly may not have the 3 or 4 years left and would really like to know these profound findings while they’re still living.

Eric April 23, 2010 at 14:19

One more quick point to Michael Spencer: You say “[a] deep understanding of the scientific method is simply lacking in any widespread sense.” I agree. But instead of justifying withholding information, why not see it as an opportunity for education? I believe there is even funding for “public outreach.” Such funding should teach by example that science is a process and not a set of results. Thanks for the soapbox!

NS April 23, 2010 at 17:28

Maybe I’m naive, but everybody seems to think the Kepler team wants to withhold data for prestige/career reasons, and I’m not so sure that’s the case. If it takes time to exclude known false-positive scenarios from the data so that follow-up efforts can concentrate on what look like genuine planetary transits, shouldn’t the team be allowed the time to do that? And after further thought, I’m also not so sure that giving the guys that put the probe up there first crack at its data is such an untoward idea either.

Raffaele Antonio Tavani April 24, 2010 at 8:33

Non è molto bello, quello che dice questo articolo: dovremo aspettare la fine della missione “Kepler”, per avere i dati?

Ma io dico: perchè non fare l’equivalente del “SETI@Home , cioè, per “velocizzare” il rilevamento dei dati, distribuirli con un programma adatto, perchè tutti gli interessati, possano cercarli utilizzando il proprio computer?

Mi par di ricordare che esiste già un programma, che fa questa ricerca di dati, in Internet…
http://www.centauri-dreams.org/?p=12203

Administrator April 24, 2010 at 8:36

Antonio Tavani’s post as translated by Google and tweaked a bit:

Not a pleasant prospect that, as this article says, we should wait until the end of the Kepler mission to get the data.

But I say why not do the equivalent of “SETI@Home,” that is, why not “speed up” the analysis of the data by distributing it with a suitable program, so that all interested parties can search it using their own computers?

I seem to remember that there is already a program that works with exoplanet data on the Internet [the latter a reference to the systemic console at oklo.org]

Adam April 25, 2010 at 6:05

Antonio has a good point. Would be a clever way to handle the data torrent by dispersing it to amateurs to explore and analyse.

philw1776 April 26, 2010 at 15:39

I know I’m reacting to too little data, but if there were fewer than 350 candidates in the queue (including false positives, which inflate this number) after all those months, it tells me that Kepler is finding FEWER planets around stars than expected from the pre-mission estimates. Candidates only need one occultation to be unconfirmed candidates. Out of over 100,000 stars there were expectations of finding 1,000 planetary systems, some with several planets.

Whoops. Looked it up and I recalled wrong…

http://kepler.nasa.gov/Science/expectedResults/

By this date, all early M stars and most fainter Ks that have systems in the ~2% observational plane should have seen 3 occultations by any planets in the HZ or closer in.

Mike April 26, 2010 at 17:35

Phil, when you factor in the number of K and M dwarfs, about 20% of the targeted stars in the observing list, and allow for an estimated false positive rate of at most 50%, you get about 180 candidates. That is somewhat of an overprediction, because the earliest-sequence K dwarfs might not be showing 3 HZ transits yet, and they may also be closer to a 1% chance of alignment due to their planets’ greater semi-major axis from the host star for a HZ orbit.
But this is really way too much guessing at this time.
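The alignment percentages mentioned in this exchange follow from simple geometry: for a small planet on a circular orbit, the chance that its orbit is oriented so that it transits is roughly the stellar radius divided by the semi-major axis. A minimal Python sketch follows; the function name and the stellar radii and orbital distances in the example are hypothetical round numbers chosen for illustration, not values from the Kepler target list.

```python
R_SUN_AU = 0.00465047  # solar radius expressed in astronomical units


def transit_probability(stellar_radius_rsun: float, semi_major_axis_au: float) -> float:
    """Approximate geometric transit probability, R_star / a.

    Assumes a circular orbit and a planet much smaller than its star.
    stellar_radius_rsun -- stellar radius in solar radii
    semi_major_axis_au  -- orbital semi-major axis in AU
    """
    if stellar_radius_rsun <= 0.0 or semi_major_axis_au <= 0.0:
        raise ValueError("radius and semi-major axis must be positive")
    return min(1.0, stellar_radius_rsun * R_SUN_AU / semi_major_axis_au)


if __name__ == "__main__":
    # Hypothetical habitable-zone cases: (label, stellar radius in R_sun, a in AU).
    cases = [
        ("early M dwarf", 0.5, 0.12),
        ("early K dwarf", 0.7, 0.40),
        ("Sun-like star", 1.0, 1.00),
    ]
    for label, radius, a in cases:
        print(f"{label}: {100.0 * transit_probability(radius, a):.2f}% chance of alignment")
```

The wider the habitable-zone orbit, the smaller the chance of alignment, which is the reason given above for expecting the earliest K dwarfs to sit nearer 1% than 2%.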

stephen April 29, 2010 at 16:39

Regarding publicizing such things:

Maybe we should think more in terms of who/what is present in popular culture.

Jay Leno, Regis Philbin, Oprah Winfrey, Larry King, etc.

NPR has some nice outlets, we just need to get NBC’s attention.

How we get their attention, I’m not sure. But it would be nice if they’d pay a little less attention to Hollywood actors…
