Difference between revisions of "LustreFS"

From wiki.hpc.mk
Line 1: Line 1:
{|class="wikitable"
{|class="wikitable"
!Contents
!Contents
|-
|[[LustreFS#Section_LustreFS|Lustre – File System]]<br></br>[[LustreFS#Section_Architecture|Lustre Architecture]]
|[[LustreFS#Section_LustreFS|Lustre – File System]]<br></br>[[LustreFS#Section_Architecture|Lustre Architecture]]
|}
|}

Revision as of 09:58, 27 August 2021

Contents
Lustre – File System

Lustre Architecture




Lustre – File System


Lustre is a type of parallel distributed file system, generally used for large-scale cluster computing. The name Lustre is a portmanteau word derived from Linux and cluster. Lustre file system software is available under the GNU General Public License (version 2 only) and provides high performance file systems for computer clusters ranging in size from small workgroup clusters to large-scale, multi-site systems. Since June 2005, Lustre has consistently been used by at least half of the top ten, and more than 60 of the top 100 fastest supercomputers in the world, including the world's No. 1 ranked TOP500 supercomputer in June 2020, Fugaku, as well as previous top supercomputers such as Titan and Sequoia.


Lustre Architecture


A Lustre file system has three major functional units:

  • One or more metadata servers (MDS) nodes that have one or more metadata target (MDT) devices per Lustre filesystem that stores namespace metadata, such as filenames, directories, access permissions, and file layout. The MDT data is stored in a local disk filesystem. However, unlike block-based distributed filesystems, such as GPFS and PanFS, where the metadata server controls all the block allocation, the Lustre metadata server is only involved in pathname and permission checks and is not involved in any file I/O operations, avoiding I/O scalability bottlenecks on the metadata server. The ability to have multiple MDTs in a single filesystem is a new feature in Lustre 2.4 and allows directory subtrees to reside on the secondary MDTs, while 2.7 and later allow large single directories to be distributed across multiple MDTs as well.
  • One or more object storage server (OSS) nodes that store file data on one or more object storage target (OST) devices. Depending on the server's hardware, an OSS typically serves between two and eight OSTs, with each OST managing a single local disk filesystem. The capacity of a Lustre file system is the sum of the capacities provided by the OSTs.
  • Client(s) that access and use the data. Lustre presents all clients with a unified namespace for all the files and data in the filesystem, using standard POSIX semantics, and allows concurrent and coherent read and write access to the files in the filesystem.

The MDT, OST, and client may be on the same node (usually for testing purposes), but in typical production installations these devices are on separate nodes communicating over a network. Each MDT and OST may be part of only a single filesystem, though it is possible to have multiple MDTs or OSTs on a single node that are part of different filesystems. The Lustre Network (LNet) layer can use several types of network interconnects, including native InfiniBand verbs, Omni-Path, RoCE, and iWARP via OFED, TCP/IP on Ethernet, and other proprietary network technologies such as the Cray Gemini interconnect.