Is HDF5 a good format to replace UVFITS?

Monday, December 8, 2014 17:31
(Before It's News)

The astronomical community has begun to discuss whether the Flexible Image Transport System (FITS) will continue to be the de facto standard for storage and exchange of data in astronomy. FITS has been the standard since the 1980s, in part because of its backwards compatibility and in part because of the large ecosystem of software that has been developed for processing FITS files.

Price, Barsdell and Greenhill (2014) recently showed, in their paper “Is HDF5 a good format to replace UVFITS?”, that the Hierarchical Data Format 5 (HDF5) (strictly an API rather than a data format) has advantages over FITS for radio visibility data. There are two registered FITS conventions for the storage of visibility data and associated metadata: FITS-IDI and UVFITS. Price et al. explain the differences between them: “In UVFITS, the visibility data are stored in a random group HDU (header data unit), whereas in FITS-IDI data are stored in a binary table HDU. In both formats, each row of the table contains columns for the timestamp and a baseline identifier, along with the multidimensional visibility array for that timestamp and baseline.” An alternative format, quite different from FITS, is the CASA MeasurementSet (MS), essentially a directory of files nested in child directories.

Price et al. point out that the structure of a FITS file can be readily mapped onto the structure of an HDF5 file, as shown in the figure from the original post:

[Figure: mapping between FITS and HDF5 file structures]

A Python utility, fits2hdf, uses this mapping to convert FITS files into HDF5 and vice versa. It in turn uses the PyFITS (ascl:1207.009) and h5py libraries for file I/O. While fits2hdf was aimed at porting UVFITS/FITS-IDI data into HDF5, it operates on any valid FITS file.
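To make the idea of such a mapping concrete, here is a minimal sketch. It assumes that each HDU becomes a group, that header cards become attributes, and that table columns become datasets; the layout actually used by fits2hdf may differ. Plain dictionaries stand in for HDF5 groups so the sketch runs without PyFITS or h5py, and the `Hdu` class, the `hdus_to_hierarchy` function and the sample keywords and values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Hdu:
    """Hypothetical stand-in for one FITS header data unit (HDU)."""
    name: str
    header: Dict[str, Any]  # header cards, keyword -> value
    columns: Dict[str, List[Any]] = field(default_factory=dict)  # table columns


def hdus_to_hierarchy(hdus: List[Hdu]) -> Dict[str, Dict[str, Any]]:
    """Map a list of HDUs onto an HDF5-like tree of groups.

    Assumed mapping (not taken from fits2hdf itself):
    HDU -> group, header cards -> attributes, table columns -> datasets.
    """
    tree: Dict[str, Dict[str, Any]] = {}
    for index, hdu in enumerate(hdus):
        # Unnamed HDUs get a positional group name.
        group_name = hdu.name or f"HDU{index}"
        tree[group_name] = {
            "attrs": dict(hdu.header),
            "datasets": {col: list(vals) for col, vals in hdu.columns.items()},
        }
    return tree


# Illustrative input: an empty primary HDU and a small visibility table.
example = [
    Hdu("PRIMARY", {"SIMPLE": True, "NAXIS": 0}),
    Hdu(
        "UV_DATA",
        {"EXTNAME": "UV_DATA", "NO_CHAN": 2},
        {"TIME": [0.0, 0.5], "BASELINE": [258, 259]},
    ),
]

tree = hdus_to_hierarchy(example)
print(sorted(tree))
print(tree["UV_DATA"]["attrs"]["NO_CHAN"])
```

With h5py, the same walk would create a group per HDU, write the header cards to the group's attributes, and store each column as a dataset.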

Further, they argue that the HDF5 storage model has a number of advantages over FITS, and over MS too, for supporting large data sets. “… HDF5 provides parallel and network I/O, data chunking methods, external (i.e. distributed) object storage, and a filter pipeline for data compression. Of specific interest for visibility data is bitshuffle, an HDF5 filter designed for fast compression of visibility data. Using bitshuffle on a 1.2 GB test dataset of data from the LEDA correlator (Kocz et al. 2014), we achieved lossless compression ratio of 1.65x, with total file compression and write time of 7.5 s; in comparison the data compressed by 1.40x in 53.0 s using standard gzip.”
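The reason a shuffle step helps a general-purpose compressor can be illustrated with a short sketch. Note that this is the simpler byte-level shuffle (the idea behind HDF5's built-in shuffle filter), not the bit-level bitshuffle algorithm used in the paper, and the sample data are synthetic integers rather than LEDA visibilities. The function names are hypothetical, and zlib stands in for the compressor.

```python
import struct
import zlib


def byte_shuffle(raw: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    if len(raw) % itemsize:
        raise ValueError("buffer length must be a multiple of itemsize")
    return b"".join(raw[k::itemsize] for k in range(itemsize))


def byte_unshuffle(shuffled: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle, restoring the original element order."""
    if len(shuffled) % itemsize:
        raise ValueError("buffer length must be a multiple of itemsize")
    count = len(shuffled) // itemsize
    out = bytearray(len(shuffled))
    for k in range(itemsize):
        # Plane k holds byte k of every element; scatter it back.
        out[k::itemsize] = shuffled[k * count:(k + 1) * count]
    return bytes(out)


# Synthetic, slowly varying 32-bit integers (little-endian).
values = list(range(10000))
raw = struct.pack("<%di" % len(values), *values)

plain_z = zlib.compress(raw)
shuffled_z = zlib.compress(byte_shuffle(raw, 4))

# The round trip is lossless: decompress, then unshuffle.
restored = byte_unshuffle(zlib.decompress(shuffled_z), 4)

print("unshuffled:", len(plain_z), "bytes")
print("shuffled:  ", len(shuffled_z), "bytes")
print("lossless:  ", restored == raw)
```

Shuffling places the rarely changing high-order bytes of neighbouring samples next to each other, which gives the compressor long, repetitive runs to work with. Bitshuffle applies the same idea at the level of individual bits.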



Source: https://astrocompute.wordpress.com/2014/11/12/is-hdf5-a-good-format-to-replace-uvfits/
