I must congratulate my colleague Paul Hirst at Gemini – he is to my knowledge the first astronomer to use “cheap” and “cloud” in the same title. His talk on building a new archive for the Gemini telescopes was presented at ADASS XXV in Sydney.
Gemini operates two 8.1 m optical/IR telescopes, one on Mauna Kea, Hawaii, and the other on Cerro Pachón, Chile. The data sets these telescopes produce are not especially large by modern standards: 5 GB per night of raw FITS files, for a total of 27.5 TB of raw FITS to date. They are diverse, however, covering imaging and spectroscopy (long-slit, cross-dispersed, fiber-fed, integral field; polarimetry, adaptive optics) over the 0.3 µm to 25 µm wavelength range.
The archive architecture looks like this:
The interesting part of the diagram is the “AWS S3” block, which represents the S3 storage system of Amazon Web Services (AWS), where the data are housed. The data flow from telescope to cloud may be described as follows:
AWS offers many options for scaling performance on demand.
The really interesting part of this concerns the cost. From Paul’s slides:
These charges total ~$6,000/yr. The bottom line is that, under Amazon’s current cost structure, the hourly cost of operating the archive is approximately the same as the power and cooling costs in Hilo alone, before any hardware is purchased. The cost-benefit analysis given above is a fine example of the analysis that needs to be done if you are thinking of migrating a project to the cloud.
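To make the per-hour comparison concrete, here is a rough back-of-the-envelope sketch in Python. The only number taken from Paul's talk is the ~$6,000/yr total. The function names are mine, and the electricity price is a purely hypothetical placeholder (it is not a Hilo figure), so treat the second result as an illustration of the method rather than a real estimate.

```python
# Back-of-the-envelope conversion of an annual cloud bill to an hourly rate.
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def hourly_cost(annual_cost_usd: float) -> float:
    """Average cost per hour for a service billed at annual_cost_usd per year."""
    return annual_cost_usd / HOURS_PER_YEAR


def break_even_power_kw(annual_cost_usd: float, price_per_kwh_usd: float) -> float:
    """Continuous power draw (kW) whose electricity bill equals the annual cost.

    price_per_kwh_usd is an assumed tariff supplied by the caller.
    """
    return hourly_cost(annual_cost_usd) / price_per_kwh_usd


annual_aws_cost = 6000.0      # ~$6,000/yr, from the talk
assumed_price_per_kwh = 0.30  # HYPOTHETICAL tariff, for illustration only

print(f"Hourly archive cost: ${hourly_cost(annual_aws_cost):.2f}")
print(
    "Break-even power draw at the assumed tariff: "
    f"{break_even_power_kw(annual_aws_cost, assumed_price_per_kwh):.2f} kW"
)
```

The first line works out to roughly $0.68 per hour. Substituting your own site's electricity rate into the second function shows how large a continuously running on-premises system would have to be before its power and cooling alone cost as much as the cloud archive.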
Moreover, the archive is fast: search pages typically respond in under 1 second, and new data are generally available less than 1 minute after readout. Staffing consumed 3 FTEs over 3 years from project start to deployment.
I wish to thank Dr Paul Hirst for supplying his charts and supporting the preparation of this blogpost.