Solving the import problem: Scalable Dynamic Loading Network File Systems

Description

The most common programming paradigm for scientific computing, SPMD (Single Program Multiple Data), interacts catastrophically with the loading strategies of dynamically linked executables and network-attached file systems, even on moderately sized high-performance computing clusters. The difficulty is further exacerbated by "function-shipped" I/O on modern supercomputer compute nodes, which rules out simple solutions. In this talk, we introduce a two-component solution: collfs, a set of low-level MPI-collective file operations that can selectively shadow file system access in a library, and walla, a set of Python import hooks that seamlessly enable parallel dynamic loading and scale to tens of thousands of cores.
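
To make the import-hook idea concrete, the following is a minimal sketch of a collective import hook. It is not the walla or collfs API: the class name BroadcastFinder, its parameters, and the flat directory of .py files are all hypothetical, chosen only for illustration. It assumes a communicator object with mpi4py-style Get_rank() and bcast(obj, root=0) methods, and that every rank imports the same modules in the same order, as SPMD programs typically do. Only rank 0 touches the file system; all other ranks receive the module source by broadcast.

```python
import importlib.abc
import importlib.util
import os
import sys


class BroadcastFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical import hook: rank 0 reads module source, every rank executes it.

    `comm` is assumed to behave like an mpi4py communicator. With comm=None
    the hook runs serially, which is convenient for trying it out without MPI.
    """

    def __init__(self, root_dir, comm=None):
        self.root_dir = root_dir
        self.comm = comm
        # Rank 0 lists the directory once; the set of names is shared so that
        # unrelated imports never trigger a collective operation.
        names = None
        if self._rank() == 0:
            names = {f[:-3] for f in os.listdir(root_dir) if f.endswith(".py")}
        self.names = self._bcast(names)

    def _rank(self):
        return 0 if self.comm is None else self.comm.Get_rank()

    def _bcast(self, obj):
        return obj if self.comm is None else self.comm.bcast(obj, root=0)

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.names:
            return None  # let the regular import machinery handle it
        source = None
        if self._rank() == 0:
            # The only file system read, performed by a single process.
            file_path = os.path.join(self.root_dir, fullname + ".py")
            with open(file_path, encoding="utf-8") as f:
                source = f.read()
        source = self._bcast(source)  # collective: all ranks must reach this
        spec = importlib.util.spec_from_loader(fullname, self)
        spec.loader_state = source  # carry the source over to exec_module
        return spec

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        source = module.__spec__.loader_state
        code = compile(source, module.__spec__.name + ".py", "exec")
        exec(code, module.__dict__)
```

Under MPI, a hook like this would be installed on every rank with something along the lines of sys.meta_path.insert(0, BroadcastFinder(some_dir, MPI.COMM_WORLD)), after mpi4py itself has been imported. The sketch handles only pure-Python, top-level modules; shared libraries and extension modules go through the dynamic loader rather than Python's import machinery, which is the part of the problem that low-level file operations such as those in collfs are aimed at.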