1.4 Run-Time Environment
In the HP XC environment, LSF-HPC, SLURM, and HP-MPI work together to provide a
powerful, flexible, and extensive run-time environment. This section describes LSF-HPC,
SLURM, and HP-MPI, and how these components work together to provide the HP XC
run-time environment.
1.4.1 SLURM
SLURM (Simple Linux Utility for Resource Management) is a resource management system
that is integrated into the HP XC system. SLURM is suitable for use on both large and small
Linux clusters. It was developed by Lawrence Livermore National Laboratory and Linux NetworX.
As a resource manager, SLURM allocates exclusive or non-exclusive access to resources
(application/compute nodes) for users to perform work, and provides a framework to start,
execute, and monitor work (normally a parallel job) on the set of allocated nodes. A SLURM
system consists of two daemons, one configuration file, and a set of commands and APIs. The
central controller daemon, slurmctld, maintains the global state and directs operations. A
slurmd daemon is deployed to each compute node and responds to job-related requests,
such as launching, signaling, and terminating jobs. End users and system software (such
as LSF-HPC) communicate with SLURM through commands or APIs to, for example,
allocate resources, launch parallel jobs on allocated resources, and kill running jobs.
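As a brief sketch of this command-level interface (the node count, job ID, and program name below are hypothetical), a user might run:

```
# Launch hostname on four compute nodes via the slurmd daemons
srun -N 4 hostname

# List queued and running jobs known to slurmctld
squeue

# Terminate a running job (job ID 42 is hypothetical)
scancel 42
```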
SLURM groups compute nodes (the nodes where jobs are run) together into partitions. The
HP XC system can have one or several partitions. When HP XC is installed, a single partition
of compute nodes is created by default for LSF batch jobs. The system administrator has the
option of creating additional partitions. For example, another partition could be created for
interactive jobs.
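For example, the sinfo command reports the configured partitions. The output below is an illustrative sketch, assuming a default lsf partition of four idle compute nodes; actual partition names, node counts, and node names depend on the site configuration:

```
$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE  NODELIST
lsf*         up   infinite      4   idle  n[1-4]
```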
1.4.2 Load Sharing Facility (LSF-HPC)
The Load Sharing Facility for High Performance Computing (LSF-HPC) from Platform
Computing Corporation is a batch system resource manager that has been integrated with
SLURM for use on the HP XC system. LSF-HPC for SLURM is included with the HP XC
System Software and is an integral part of the HP XC environment. LSF-HPC interacts with
SLURM to obtain and allocate available resources, and to launch and control all the jobs
submitted to LSF-HPC. LSF-HPC accepts, queues, schedules, dispatches, and controls all the
batch jobs that users submit, according to policies and configurations established by the HP
XC site administrator. On an HP XC system, LSF-HPC for SLURM is installed and runs on
one HP XC node, known as the LSF-HPC execution host.
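As an illustrative sketch (the script name, processor count, and output file are hypothetical), a batch job is submitted to LSF-HPC with the bsub command, and LSF-HPC then queues and dispatches it according to site policy:

```
# Submit a batch job requesting 4 processors; stdout is written to out.txt
$ bsub -n 4 -o out.txt ./my_batch_script
```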
A complete description of LSF-HPC is provided in Chapter 7. In addition, for your convenience,
the HP XC documentation CD contains LSF Version 6.0 manuals from Platform Computing.
1.4.3 How LSF-HPC and SLURM Interact
In the HP XC environment, LSF-HPC cooperates with SLURM to combine LSF-HPC’s
powerful scheduling functionality with SLURM’s scalable parallel job-launching capabilities.
LSF-HPC acts primarily as a workload scheduler on top of the SLURM system, providing
policy- and topology-based scheduling for end users. SLURM provides an execution and
monitoring layer for LSF-HPC. LSF-HPC uses SLURM to detect system topology information,
make scheduling decisions, and launch jobs on allocated resources.
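This division of labor can be sketched with a single hypothetical submission: LSF-HPC schedules the job and obtains the node allocation from SLURM, and the srun-based launcher inside the job then starts the processes on the allocated nodes (the process count and application name are hypothetical):

```
# LSF-HPC schedules the job and allocates nodes from the SLURM lsf partition;
# mpirun -srun then launches the MPI processes on the allocated nodes
$ bsub -n 8 mpirun -srun ./my_mpi_app
```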
When a job is submitted to LSF-HPC, LSF-HPC schedules the job based on job resource
requirements and communicates with SLURM to allocate the required HP XC compute nodes
for the job from the SLURM lsf partition. LSF-HPC provides node-level scheduling for
parallel jobs, and CPU-level scheduling for serial jobs. Because of node-level scheduling, a
parallel job may be allocated more CPUs than it requested, depending on its resource request;
the srun or mpirun -srun launch commands within the job still honor the original CPU