AUBURN HPC Research Community

Job Dependency and Intragroup Contention

The allocation and scheduling of resources by the Torque/Moab workload manager is a complex process. When multiple jobs are submitted to the same reservation, it becomes even more complicated and can lead to unexpected results. To facilitate preemption, the scheduler runs a node selection algorithm and attempts to free the selected nodes by breaking down any running jobs. In...

Disk Space and Inode Usage

Hopper provides a GPFS shared file system for users’ home directories and scratch space. While this file system offers high performance and ample space, there are limits on both disk space and the number of files.

Some Background

Each Unix file consists of two parts: 1) the data blocks that contain the contents of the file and 2)...

Deferring Job Execution

As the cluster tends to be busier on weekdays, it may be advantageous to delay the execution of your job until the evening or the weekend. To set the date and time at which a job becomes eligible to run, use the -a option of the qsub command, which lets you submit now and run later. Here’s an example that schedules a job...
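
As a minimal sketch of the -a option, the snippet below builds the timestamp with GNU date and prints the resulting qsub command instead of submitting it. The helper name qsub_start_time, the script name myjob.sh, and the chosen date are illustrative assumptions, not taken from the original post; Torque documents the -a argument in the form [[[[CC]YY]MM]DD]hhmm[.SS].

```shell
# Convert a human-readable time into the [[[[CC]YY]MM]DD]hhmm form used by qsub -a.
# Assumes GNU date (the -d option), as found on typical Linux clusters.
qsub_start_time() {
    date -d "$1" +%Y%m%d%H%M
}

# Hypothetical start time and job script, for illustration only.
start=$(qsub_start_time "2016-02-01 18:30")

# echo prints the command for inspection; remove it to actually submit the job.
echo qsub -a "$start" myjob.sh
```

Printing the command first is a cheap way to check the timestamp before the job enters the queue.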

Running MATLAB Jobs

Hopper provides X11 forwarding so that you can interact with software that has a GUI, which in many cases is a convenient way to prepare or run jobs. However, if used incorrectly, this feature can lead to high load on the login node and result in your processes being terminated. In the case of MATLAB, College of...

Reservations

Hopper implements a shared-maximum model of scheduling, which guarantees that each principal investor and their lab group has access to the resources they have purchased, while also providing extra computational power by leveraging underutilized processors. This model relies heavily on Moab “reservations”, which are similar to traditional queues but are defined in terms of ownership. Like queues, reservations serve...

Ownership and Preemption

As a “condo-model” cluster, Hopper is funded by a group of financial stakeholders known as Principal Investors. Each PI maintains ownership of a subset of nodes within the cluster and is guaranteed the purchased processing power on demand. In mild contrast to this, a secondary goal of the machine is to promote efficiency by allowing all cluster researchers to leverage any...

Environment Modules

Quite often, a very specific set of adjustments must be made to your shell environment in order to use a scientific software package. This requires knowledge of tedious file system specifics and involves issuing a number of lengthy commands, such as:
$ export PATH=$PATH:/a/binary
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/a/library
Environment modules make it easier to modify your environment, switch between different environments, and...
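
To make the contrast concrete, here is a rough sketch of the manual approach wrapped in a small shell function. The function name add_package_paths and the module name mypackage are hypothetical; /a/binary and /a/library are the placeholder paths from the excerpt above. It only illustrates the kind of bookkeeping a module typically automates and is not how Hopper's modules are actually defined.

```shell
# Manual approach: append a package's directories to the search paths by hand.
# $1 is the directory holding executables, $2 the directory holding shared libraries.
add_package_paths() {
    export PATH="$PATH:$1"
    # Avoid a leading ':' when LD_LIBRARY_PATH is currently unset or empty.
    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$2"
}

add_package_paths /a/binary /a/library

# With environment modules, the same adjustment is typically one command
# (the module name is hypothetical):
#   module load mypackage
```

Commands such as module avail and module list show which modules exist and which are currently loaded.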

Building the Hopper Cluster Part II: Networking

On Tuesday, January 19th, 2016, work started on the networking phase of the Auburn University “Hopper” High Performance Compute Cluster. This phase involved routing hundreds of InfiniBand, Ethernet, and fiber optic cables to enable high-speed communication between the previously installed servers. The InfiniBand network architecture provides the cluster with high-speed, low-latency shared disk access and the...

Building the Hopper Cluster Part I: Nodes

On Monday, January 5, the build began on the new HPC cluster in the AU Data Center. Resembling an old-fashioned ‘barn raising’, OIT and Lenovo personnel unboxed and racked equipment in step one of constructing Auburn’s newest and most powerful research computer. Work continues with the goal of being operational by mid-February.

Excluding a Host

As with any cluster problem you encounter, if you know of a problematic host in the cluster, please send an email to hpcadmin@auburn.edu. It may take some time for the cluster admins to respond, so in the meantime you can avoid that host with the following syntax…
bsub -R "select[hname!=node000]" …
qsub -l h=!node001
Additionally, you can specify one or...