Lecture: Parallel Storage Systems

Parallel programming is becoming increasingly important since even phones and laptops contain multiple processor cores nowadays. Supercomputers can contain up to several million cores and have become a useful and important tool for a wide range of scientific domains. The analyses and simulations enabled by them have accelerated the process of gaining scientific insight considerably.

The amount of collected and produced data is growing exponentially; it has to be stored, analyzed and processed efficiently since I/O significantly affects overall performance. Processor performance has improved much faster than storage performance, and the resulting imbalance makes it even more important to take a close look at storage systems in order to meet future demands.

The lecture teaches the fundamentals of parallel storage systems and I/O; in the exercises, participants apply the acquired skills using a systems programming language such as C, C++ or Rust.

As part of the lecture, we will cover the complete storage stack: storage devices and networks (hard disk drives, solid-state drives, storage area networks etc.), local and distributed file systems (in kernel and user space, including newer concepts such as snapshots and deduplication) as well as the I/O interfaces layered on top (POSIX, MPI-IO, NetCDF and ADIOS). Moreover, we will discuss the causes of performance problems and ways to solve them, as well as alternative approaches to I/O (such as cloud interfaces). Problems and examples will be motivated using real-world scientific applications.

Course

Learning Objective

Participants will learn how parallel applications perform I/O using different programming concepts and how I/O can be optimized. Additionally, they will gain insight into and practical experience with the internals of storage and file systems.

Requirements

Required skills:

  • Practical knowledge of a programming language and the ability to create simple applications

Recommended skills:

  • Basic knowledge about operating systems
  • Basic knowledge about parallel programming

Lecture

  • 2024-04-08: Introduction (Slides)
  • 2024-04-15: Storage Devices (Slides)
  • 2024-04-22: File Systems (Slides)
  • 2024-04-29: Modern File Systems (Slides)
  • 2024-05-06: Parallel Distributed File Systems (Slides)
  • 2024-05-13: Skipped
  • 2024-05-20: Public holiday
  • 2024-05-27: MPI-IO (Slides)
  • 2024-06-03: Libraries (Slides)
  • 2024-06-10: Optimizations (Slides)
  • 2024-06-17: Performance Analysis (Slides)
  • 2024-06-24: Data Reduction (Slides)
  • 2024-07-01: Current and Future Developments (Slides)
  • 2024-07-08: Research Talks and Debriefing

Exercises

  • 2024-04-08: Introduction (Sheet 0, Sheet 1, Materials)
    • Deadline: 2024-04-23, 23:59
  • 2024-04-24: Debugging and Checkpoints (Sheet 2, Materials)
    • Deadline: 2024-05-07, 23:59
  • 2024-05-08: I/O Tools (Sheet 3, Materials)
    • Deadline: 2024-05-21, 23:59
  • 2024-05-22: Dummy File System (Sheet 4, Materials)
    • Deadline: 2024-06-04, 23:59
  • 2024-06-05: Memory File System (Sheet 5, Materials)
    • Deadline: 2024-06-18 and 2024-06-25, 23:59
  • 2024-06-26: Persistent File System (Sheet 6, Materials)
    • Deadline: 2024-07-02 and 2024-07-09, 23:59

Literature

  • High Performance Parallel I/O (Prabhat and Quincey Koziol)

Last Modification: 01.07.2024 - Contact Person: Webmaster