For many HPC applications, single directory write speed is critical for performance. In a typical use case, an application creates a separate output file for each node and task in a job, and these files must often all be written to a single directory. As the number of nodes and tasks increases, hundreds of thousands of files may be created that need to be written to a single directory within a short window of time to maintain overall performance.
Until now, filename lookup and file system modifying operations (such as create and unlink) in Lustre were protected by a single lock for an entire directory, so file writes to the directory were limited to serialized access. Whamcloud has eliminated this bottleneck by introducing a parallel locking mechanism within a single directory. This capability, called parallel directory operations (PDO), enables multiple metadata service threads to concurrently perform lookup, create, and unlink operations on a single directory. Multiple Object Indexes (MOIs) have also been implemented with PDO. These allow parallel lookups of a file's inode from the File Identifier (FID), which is how a client identifies a unique file in the Lustre file system.
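To make the difference between one directory-wide lock and parallel locking concrete, here is a minimal sketch of lock striping on a toy in-memory directory. It is not Lustre code and does not claim to reflect how PDO is implemented inside the MDS. The class `StripedDirectory`, the bucket count, the CRC32 hash, and the `out.<thread>.<n>` file names are all assumptions made for illustration. The idea it shows is only that operations on names that hash to different buckets never wait on the same lock.

```python
import threading
import zlib


class StripedDirectory:
    """Toy in-memory directory guarded by per-bucket locks (hypothetical)."""

    def __init__(self, num_buckets=16):
        # One lock per bucket instead of one lock for the whole directory.
        self._locks = [threading.Lock() for _ in range(num_buckets)]
        self._buckets = [{} for _ in range(num_buckets)]

    def _index(self, name):
        # A stable hash of the file name picks the bucket (and its lock).
        return zlib.crc32(name.encode("utf-8")) % len(self._locks)

    def create(self, name, inode):
        i = self._index(name)
        with self._locks[i]:
            if name in self._buckets[i]:
                raise FileExistsError(name)
            self._buckets[i][name] = inode

    def lookup(self, name):
        i = self._index(name)
        with self._locks[i]:
            return self._buckets[i].get(name)

    def unlink(self, name):
        i = self._index(name)
        with self._locks[i]:
            if name not in self._buckets[i]:
                raise FileNotFoundError(name)
            del self._buckets[i][name]

    def entry_count(self):
        # Takes each bucket lock in turn; only exact once writers are idle.
        total = 0
        for lock, bucket in zip(self._locks, self._buckets):
            with lock:
                total += len(bucket)
        return total


def run_demo(num_threads=8, files_per_thread=500):
    """Several threads create files in the same directory concurrently."""
    directory = StripedDirectory()

    def worker(tid):
        for n in range(files_per_thread):
            directory.create(f"out.{tid}.{n}", tid * files_per_thread + n)

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return directory
```

Calling `run_demo()` has eight threads populate one shared directory. Setting `num_buckets=1` reduces the sketch to the single-lock behavior described above. Because of Python's global interpreter lock, this illustrates the locking structure rather than any real speedup.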
PDO will allow:
- Increased concurrency during lookup operations within the shared directory to allow more disk concurrency and IO merging.
- Increased concurrency of service thread writes when a shared directory is modified.
- Reduced delays from thread context switches (sleep/wakeup) caused by contention on the single lock.
The PDO capability has been extensively tested, including unit testing on a single machine and performance testing on a large cluster. The graph below shows the results for open/create operations in a single directory on modest hardware. This test was run using the mds_survey tool, which simulates metadata traffic at the metadata target (MDT) layer of the MDS software stack. These results are useful for comparing Lustre performance with and without the PDO feature.
Whamcloud is continuing to invest in improving Lustre performance in various areas of the code base. PDO is an excellent example of these improvements. This feature was targeted for the Lustre 2.2 release, which is available now.
A full white paper on PDO is available here (registration required).
Bryon has been with the Lustre team in a leadership role since 2007. He previously worked for 11 years at IBM as a lead developer on the OS/2 operating system and has been managing development teams for startup companies since 1996. Bryon holds a master's degree in Computer Science and currently lives in Boulder, Colorado, where he bikes and skis at every opportunity.