Review-11

by Ananth Rao last modified 2007-05-25 06:43

University Research Organization

Just yesterday, new information came to light about what may be a problem inherent in the design of the HDF5 format. It appeared in two postings on a mailing list that discusses the use of netCDF and HDF5 in high-performance computing applications running on thousands of processors with parallel I/O:

... Greg S at least has observed file corruption with the HDF5 file format during parallel I/O if a client dies at a particular time. As I understand it, it's hard to devise a solution for the HDF5 file format that is both rock-solid robust and delivers high-performance.

... However, there is concern about the robustness of the underlying HDF5 format. It is possible to corrupt the entire file if there is a crash at the wrong time. We cannot build our production system on a library that has this behavior. Some of the systems we run on are not known for their stability and if a job that has been running for a few days crashes and loses all data, that is not acceptable. With the netcdf-3 library, we would lose all or a portion of the last "time dump" written, but not previous data that had been synced to disk.
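
The failure mode described in the second posting can be illustrated with a toy model. The sketch below is not the netCDF-3 or HDF5 file format and uses neither library's API; the record layout and the function names (`append_time_dump`, `read_complete_dumps`, `simulate_run_with_crash`) are hypothetical. It assumes only that each "time dump" is appended as a fixed-size record and synced to disk before the next one is written, so a crash in the middle of a write can damage at most the trailing record.

```python
import os
import struct
import tempfile

# Hypothetical record layout: one int32 step index followed by four float64 values.
RECORD_FMT = "<i4d"
RECORD_SIZE = struct.calcsize(RECORD_FMT)


def append_time_dump(f, step, values):
    """Append one fixed-size record and force it to disk before returning."""
    f.write(struct.pack(RECORD_FMT, step, *values))
    f.flush()              # push Python's buffer to the operating system
    os.fsync(f.fileno())   # ask the operating system to write it to the device


def read_complete_dumps(path):
    """Return every complete record, ignoring a trailing partial record."""
    dumps = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD_SIZE)
            if len(chunk) < RECORD_SIZE:
                break  # end of file, or a record cut short by a crash
            step, *values = struct.unpack(RECORD_FMT, chunk)
            dumps.append((step, values))
    return dumps


def simulate_run_with_crash(path, n_steps=3):
    """Write n_steps synced records, then half of one more to mimic a crash."""
    with open(path, "wb") as f:
        for step in range(n_steps):
            append_time_dump(f, step, [float(step)] * 4)
        partial = struct.pack(RECORD_FMT, n_steps, *([float(n_steps)] * 4))
        f.write(partial[: RECORD_SIZE // 2])  # the "crash" interrupts this write


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "dumps.bin")
    simulate_run_with_crash(path)
    for step, values in read_complete_dumps(path):
        print(step, values)
```

In this model the reader recovers every record that was synced before the simulated crash and discards only the incomplete one. A format that must update shared metadata elsewhere in the file on each write does not get this property for free, which is the concern raised in the postings.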
