NASA’s news conference announcing the discovery of Kepler-90i and Kepler-80g was a delightful validation of a principle that has long fascinated me. We have such vast storehouses of astronomical data that finding the time for humans to mine them is deeply problematic. The application of machine learning via neural networks, as performed on Kepler data, shows what can be accomplished in digging out faint signals and hitherto undiscovered phenomena.

Specifically, we already knew that Kepler-90 was a multi-planet system: existing tools, human analysis coupled with automated selection methods, had determined that there were seven planets there. Kepler-90i emerged as a very weak signal, one that would not have made the initial cut using existing methods of analysis. When the Kepler data were subjected to the machine learning algorithms developed by Google’s Christopher Shallue and Andrew Vanderburg (UT-Austin), the light curve of Kepler-90i as well as that of Kepler-80g could be identified.

Christopher Shallue described the work at the news conference:

“Kepler produced so much data that scientists couldn’t examine it all manually. The method has been to look at the strongest signals, examining them with human eyes and automated tests, not so different from looking for needles in a haystack. Out of 30,000 signals examined, 2500 planets could be confirmed. We chose to search in weaker signals, as if in a much bigger haystack.”

Machine learning shines in such situations, with the neural network able to identify planets from far weaker signals that would never have made the initial cut for human analysis. In order to train the network, Shallue and Vanderburg fed it 15,000 Kepler signals that had already been labelled by human scientists, allowing it to learn by example to distinguish those patterns caused by planets. In their test runs, the model correctly distinguished planets from false positives 96 percent of the time.
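To make "learning by example" concrete, here is a minimal sketch in Python. It is not the authors' method: it swaps in a simple logistic regression for their deep neural network, and it trains on synthetic five-point "light curves" rather than real Kepler signals. The dip depth, noise level, dataset sizes and function names are all assumptions chosen for illustration.

```python
import math
import random

def make_example(rng, is_planet):
    """Five flux samples around a candidate event; planets get a central dip."""
    flux = [rng.gauss(0.0, 0.1) for _ in range(5)]
    if is_planet:
        flux[2] -= 1.0  # synthetic transit dip, arbitrary units (assumption)
    return flux, 1 if is_planet else 0

def make_dataset(rng, n):
    """Alternate planet and non-planet examples, each with a human-style label."""
    return [make_example(rng, i % 2 == 0) for i in range(n)]

def predict(weights, bias, flux):
    """Planet likelihood between 0 and 1 for one light curve."""
    z = sum(w * x for w, x in zip(weights, flux)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, steps=500, learning_rate=0.5):
    """Full-batch gradient descent on the logistic (cross-entropy) loss."""
    weights, bias = [0.0] * 5, 0.0
    n = len(examples)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * 5, 0.0
        for flux, label in examples:
            error = predict(weights, bias, flux) - label
            for i, x in enumerate(flux):
                grad_w[i] += error * x / n
            grad_b += error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias

def accuracy(weights, bias, examples):
    """Fraction of examples whose predicted class matches the label."""
    correct = sum((predict(weights, bias, flux) >= 0.5) == (label == 1)
                  for flux, label in examples)
    return correct / len(examples)

rng = random.Random(0)                 # fixed seed so the run is repeatable
training_set = make_dataset(rng, 200)  # labelled examples the model learns from
held_out = make_dataset(rng, 100)      # examples the model never saw in training
weights, bias = train(training_set)
print(f"held-out accuracy: {accuracy(weights, bias, held_out):.2f}")
```

The held-out set plays the role of the test runs mentioned above: the model is scored only on signals it never saw during training.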

Shallue described the machine learning system as a neural network made up of layers, each of which performs its own computations and passes the results along to the next layer in the stack. Given enough layers, it becomes possible to recognize complex patterns, as we have seen in language translation, image and object identification, and the detection of tumors. Now we turn these methods to exoplanet detection, with a result that bodes well for future discoveries.
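The layered structure Shallue describes can be sketched in a few lines of Python. This is only an illustration of the general idea, with two tiny fully connected layers and hand-picked weights. It does not reproduce the architecture or the trained weights of the Shallue and Vanderburg model, and every name and number below is an assumption.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias per unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Keep positive values, zero out the rest."""
    return [max(0.0, v) for v in values]

def sigmoid(v):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def forward(flux, layers):
    """Feed a light curve through each layer in turn and return a score in (0, 1)."""
    activations = flux
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(activations, weights, biases)[0])

# Hand-picked (not trained) weights, purely for illustration.
layers = [
    # Layer 1, unit 1: responds to a central dip relative to its neighbours.
    # Layer 1, unit 2: responds to overall dimming across the whole window.
    ([[1.0, 1.0, -2.0, 1.0, 1.0],
      [-1.0, -1.0, -1.0, -1.0, -1.0]], [0.0, 0.0]),
    # Layer 2: rewards a localized dip, penalizes broad dimming.
    ([[2.0, -1.0]], [-1.5]),
]

transit = [0.0, 0.0, -1.0, 0.0, 0.0]    # brief dip in brightness
flat = [0.0, 0.0, 0.0, 0.0, 0.0]        # no event
dimmed = [-1.0, -1.0, -1.0, -1.0, -1.0] # whole window dimmed, not transit-like

for name, flux in [("transit", transit), ("flat", flat), ("dimmed", dimmed)]:
    print(name, round(forward(flux, layers), 3))
```

Each layer only sees the output of the one before it, which is the point of the stack: the first layer picks out simple features, and the second combines them into a judgment.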

The two new planets were found through analysis of Kepler data on 670 stars, a major proof of concept for a method that will doubtless continue to improve, and one that will eventually be applied to the entire range of 150,000 stars in the Kepler and K2 dataset. That opens the possibility of numerous new planetary discoveries from the Kepler mission alone, not to mention what we will find with more advanced AI using the TESS and JWST datasets.

Andrew Vanderburg provides a bit more detail on the method at his CfA page:

Once we had built a neural network, we decided to test it out on some new signals. Using traditional transit-search methods (in particular, the same methods I use to search K2 data), we performed a new search of a handful of systems observed by Kepler (in particular, about 670 systems known already to host multiple planets). Importantly, we allowed this search to very sensitively explore weak signals. Usually, when searching Kepler data, a threshold in signal strength is set, below which weak signals are discarded, so as not to overwhelm the searcher with false positive signals. By lowering this threshold in our new search, we suspected that we might find some new planets, at the expense of a large increase in the number of false positives. But because we have a neural network that can efficiently identify real planets and screen out false positives, we could still efficiently identify new planets.
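Vanderburg's two-stage approach, a more permissive search followed by screening with the classifier, can be sketched as follows. The candidate list, the signal-strength thresholds and the score cutoff are all invented for illustration and are not the values used in the paper.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    snr: float    # signal strength reported by the transit search
    score: float  # classifier's estimate that the signal is a planet, 0 to 1

def search(candidates, snr_threshold):
    """Keep only the signals at or above the signal-strength threshold."""
    return [c for c in candidates if c.snr >= snr_threshold]

def screen(candidates, min_score):
    """Keep only the signals the classifier rates as likely planets."""
    return [c for c in candidates if c.score >= min_score]

# Hypothetical candidates: one strong signal and several weak ones.
candidates = [
    Candidate("A", 12.0, 0.97),
    Candidate("B", 6.1, 0.03),
    Candidate("C", 5.8, 0.94),
    Candidate("D", 5.5, 0.10),
    Candidate("E", 4.0, 0.50),
]

strict = search(candidates, snr_threshold=7.0)   # conventional cut (assumed value)
relaxed = search(candidates, snr_threshold=5.0)  # lowered cut (assumed value)
vetted = screen(relaxed, min_score=0.9)          # classifier removes false positives

print("strict search:  ", [c.name for c in strict])
print("relaxed search: ", [c.name for c in relaxed])
print("after screening:", [c.name for c in vetted])
```

In this toy list the lowered threshold lets through three extra weak signals, and the classifier keeps only the one that looks like a planet, which is the trade Vanderburg describes.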

As to the planets themselves, Kepler-90i, orbiting a G-class star somewhat larger and more massive than the Sun some 2500 light years away, is interesting because it turns the Kepler-90 system into the closest thing we have to a Solar System analog, at least in terms of the number of planets. But the resemblance is hardly complete, for these planets exist in a highly compact system. Have a look at the orbital configuration here.

Image: Kepler-90 is a Sun-like star, but all of its eight planets are scrunched into the equivalent distance of Earth to the Sun. The inner planets have extremely tight orbits with a “year” on Kepler-90i lasting only 14.4 days. In comparison, Mercury’s orbit is 88 days. Consequently, Kepler-90i has an average surface temperature of 800 degrees F. Credit: NASA.

The image below shows an artist’s concept of the planets in question, though the distances are obviously not to scale. The planet sizes, however, are.

Image: The Kepler-90 planets have a similar configuration to our solar system with small planets found orbiting close to their star, and the larger planets found farther away. Credit: NASA.

Kepler-80g has an orbital period close to that of Kepler-90i, about 14 days, and is the sixth planet in its system, whose host star is either a late K-dwarf or an early M-dwarf. Here we find the five previously discovered planets orbiting in a resonance chain, with mutual gravitational interactions keeping their orbits aligned. As Andrew Vanderburg pointed out, the orbital period of the new planet could have been predicted from the mathematical relations of this resonance to within about two minutes of the measured value.

It was heartening to hear at the news conference that the training model used in these detections will be made publicly available. According to Google’s Shallue, about two hours suffice to train the model on a desktop computer using open source machine learning software called TensorFlow, which is produced by Google. When the code becomes available, anyone will be able to use the model on the publicly available Kepler data on their own PCs.

The paper is Shallue & Vanderburg, “Identifying Exoplanets with Deep Learning: A Five Planet Resonant Chain around Kepler-80 and an Eighth Planet around Kepler-90,” accepted for publication in The Astronomical Journal, and for now available here.
