Arecibo’s Data Rescue
Priceless astronomy data saved to Ranch system following collapse of Arecibo Telescope
When Puerto Rico’s famed Arecibo telescope collapsed in 2020, astronomers lost one of the world’s most treasured instruments and, potentially, decades of priceless data holding still-undiscovered secrets about the universe. Now, thanks to a data rescue plan led by TACC, Arecibo’s observations will be preserved for generations of future astronomers.
Millions of people have seen footage of the collapse of the famed Arecibo radio telescope in December 2020. The gossamer-like cables suspending the 900-ton instrument platform snapped, sending it crashing through the radio dish below and into the Puerto Rican countryside, destroying the giant telescope.
Astronomers worldwide keenly felt the loss of one of the world’s premier telescopes, whose achievements include the discovery of the first planet found outside our solar system and of the first-ever binary pulsar, a find that tested Einstein’s General Theory of Relativity and earned its discoverers a Nobel Prize in 1993.
Luckily, the data center for the Arecibo telescope was spared any long-term damage from the collapse. It stored the ‘golden copy’ of the data: the original tapes, hard drives, and disk drives holding sky scans dating back to the 1960s. All in all, about three petabytes, or 3,000 terabytes, of telescope data needed to be moved off the island before another disaster could strike.
Within weeks of Arecibo’s collapse, TACC entered into a partnership to move the Arecibo radio telescope data to Ranch, the center’s long-term mass data storage system.
Future plans include expanding access to over 50 years of astronomy data from the Arecibo Observatory, which up until 2016 had been the world’s largest radio telescope.
“Many key astronomical discoveries come from observing changes in the universe. Preserving data is key to enabling these discoveries,” said Niall Gaffney, who led the development of the Hubble Legacy Archive and is Director of Data Intensive Computing at TACC.
Gaffney listed examples of treasured astronomical data that go back hundreds of years, such as Galileo's notebooks, William Herschel's binary star observations, and the Harvard Plate archive, the world's largest archive of stellar glass plate negatives. “By ensuring the rich data collection of the Arecibo telescope lives on, new discoveries not otherwise possible will be enabled,” Gaffney added.
Ranch, TACC’s long-term data archive, rapidly provided enhanced bandwidth, a dedicated data-transfer server with a large dedicated file system, and tape storage that kept redundant copies of the data as they moved onto tape. These resources were provided within days of the initial request.
TACC was able to meet Arecibo’s urgent request for a redundant, permanent archive of its data using the 142-petabyte Ranch archiving system, without affecting any current or planned use of the system.
“I am thrilled that UT Austin will become the home of the data repository for one of the most important telescopes of the past half-century,” said astronomer Dan Jaffe, Vice President for Research at The University of Texas at Austin.
“Arecibo made important contributions across many fields — studies of planets, setting the scale for the expansion of the universe, understanding the clouds from which stars form, to name a few. Preserving these data and making them available for further study will allow Arecibo’s legacy to have an ongoing impact on my field,” said Jaffe.
The data storage is part of the ongoing efforts at Arecibo Observatory to clean up debris from the telescope’s 900-ton instrument platform and reopen remaining infrastructure. Arecibo’s data include information from thousands of observing sessions, equivalent to watching 120 years of HD video. The data were collected from Arecibo’s 1,000-foot (305-meter) fixed spherical radio/radar telescope.
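As a rough check on the HD video comparison, the short Python sketch below works out the average bitrate implied by spreading about three petabytes over 120 years of continuous playback. The decimal unit convention (1 petabyte = 10^15 bytes), the 365.25-day year, and the function name are assumptions made for illustration, not figures or tools from Arecibo or TACC.

```python
# Assumptions (not from the article): decimal units and 365.25-day years.
BYTES_PER_PETABYTE = 10**15
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def implied_bitrate_mbps(petabytes: float, years: float) -> float:
    """Average bitrate in megabits per second for `petabytes` played over `years`."""
    total_bits = petabytes * BYTES_PER_PETABYTE * 8
    total_seconds = years * SECONDS_PER_YEAR
    return total_bits / total_seconds / 1e6


if __name__ == "__main__":
    rate = implied_bitrate_mbps(petabytes=3, years=120)
    print(f"Implied average bitrate: {rate:.1f} Mbit/s")
```

Under these assumptions the result comes to a little over 6 megabits per second, which is in the general range of HD video streams, so the comparison appears broadly consistent with the three-petabyte figure.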
To get from Arecibo to Austin, the data were copied to storage devices and driven to the University of Puerto Rico at Mayaguez for upload. This ensures that the research community can continue to access and conduct research on the existing data. The migration is being carried out in coordination with Arecibo’s IT department, led by Arun Venkataraman.
Once the data drives reached the university, Arecibo’s IT department worked with Globus and the University of Chicago to install data integrity software that could stage the next leg of the transfer. The data then traveled over the AMPATH Internet exchange point, which connects the University of Puerto Rico to Miami, Florida. Finally, the data crossed Internet2 and the LEARN network in Texas to reach TACC in Austin.
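The article does not describe how the integrity checks work internally. As a minimal sketch of the general idea only, and not the Globus implementation, the Python below compares SHA-256 digests of a source file and its copy. The function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source: Path, destination: Path) -> bool:
    """Return True when both files have identical SHA-256 digests."""
    return sha256_of(source) == sha256_of(destination)
```

A transfer pipeline built along these lines would typically re-send any file for which `copies_match` returns `False`, so that a corrupted copy never replaces the original.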
“The data is priceless,” emphasized Julio Alvarado, Big Data Program Manager at Arecibo.
As of November 2021, data are still being transferred, and work continues at TACC to create the infrastructure needed to simplify finding data for future discoveries. “While some of the data led to major discoveries over the years, there are reams of data that have yet to be analyzed and could very likely yield more discoveries. Arecibo’s plan is to work with TACC to provide researchers access to the data and the tools necessary to easily retrieve data to continue the science mission at Arecibo,” Alvarado said.