Abstract:
ORNL has embraced leadership-class computing and has quickly
become one of the leading institutions for the DOE and NSF for
large-scale computations. One of the most challenging problems associated
with running on these large computers is dealing with the huge amounts of
data that are generated. Researchers are quickly becoming overwhelmed with the
daunting task, not only of running their simulations on 100K processors,
but also of efficiently extracting and transporting the many terabytes of
data generated by the simulations, analyzing this data, and sharing it
with their colleagues in a timely manner. The impact of these challenges
on the overall time to solution is only growing as computers
get faster. To help address these challenges, we have been
developing a suite of software solutions, which are gaining acceptance
by the largest codes that are part of DOE open science.
Our suite of software solutions includes new APIs (ADIOS) that allow
for both MPI-IO and asynchronous I/O through Remote Direct Memory Access
(RDMA), workflow automation using the Kepler workflow package, fast wide
area data transfers, and dashboards that combine data management,
provenance management, and data analysis for monitoring complex
simulations. In this talk I will introduce these solutions and show how
this technology is being used in several fusion SciDAC projects.