HPC3 - Hardware Specifications

If you are not familiar with the general concepts of cluster computing, please see this introduction to distributed-memory parallel computers.

1. HPC3: Hardware

HPC3 had an initial procurement phase through a formal request for proposal (RFP) process. After evaluation, Hewlett Packard Enterprise was awarded the bid. Since the award, additional purchases have been made to bring the cluster to its current configuration.

The system started as a 4,000-core system when first constructed in June 2020, but it has expanded several times since initial deployment with both UCI- and faculty-purchased nodes. Every HPC3 node has at least 56Gb/s Infiniband (most nodes are 100Gb/s), at least 4GB of memory per core, and AVX2-capable CPUs.
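A quick way to confirm these baseline capabilities on a node is to inspect the standard Linux /proc interfaces. The Python sketch below is illustrative only (it is not an HPC3 utility); it assumes a Linux node and counts logical CPUs as reported by the kernel.

    #!/usr/bin/env python3
    """Check a node against the HPC3 baseline: AVX2 support and >= 4GB memory per core.

    Minimal sketch using standard Linux /proc interfaces; not an official HPC3 tool.
    """
    import os

    def has_avx2():
        # CPU feature flags appear on the "flags" lines of /proc/cpuinfo.
        with open("/proc/cpuinfo") as f:
            return any("avx2" in line.split() for line in f if line.startswith("flags"))

    def gb_per_core():
        # MemTotal is reported in kB; os.cpu_count() counts logical CPUs.
        with open("/proc/meminfo") as f:
            mem_kb = next(int(line.split()[1]) for line in f if line.startswith("MemTotal"))
        return mem_kb / 1024 / 1024 / os.cpu_count()

    if __name__ == "__main__":
        print(f"AVX2 supported : {has_avx2()}")
        print(f"Memory per core: {gb_per_core():.1f} GB")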

As of July 2021, the following totals broadly describe the cluster (a sketch for tallying the live numbers follows the list):

  • 198 Batch-accessible nodes

  • 8916 total cores (1344 AMD Epyc/7552 Intel)

  • 55,542 GB Aggregate Memory

  • Three load-balanced login nodes

  • 13 nodes with four Nvidia V100 GPUs (52 GPUs total)

  • 94% of nodes at 100Gb/s EDR Infiniband (186/198)
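These totals change as nodes are added; the batch system reports the live numbers. The sketch below assumes the scheduler is Slurm and that the sinfo command is on the PATH (assumptions not stated on this page); the exact counts depend on the current configuration.

    #!/usr/bin/env python3
    """Tally batch-accessible nodes, cores, and memory from the scheduler.

    Rough sketch that assumes a Slurm scheduler with sinfo on the PATH.
    """
    import subprocess

    # One line per node: hostname, CPU count, memory in MB.
    out = subprocess.run(
        ["sinfo", "-N", "-h", "-o", "%N %c %m"],
        capture_output=True, text=True, check=True,
    ).stdout

    nodes, cores, mem_mb = set(), 0, 0
    for line in out.splitlines():
        if not line.strip():
            continue
        name, cpus, mem = line.split()
        if name in nodes:          # a node can appear once per partition
            continue
        nodes.add(name)
        cores += int(cpus)
        mem_mb += int(mem.rstrip("+"))

    print(f"{len(nodes)} nodes, {cores} cores, {mem_mb / 1024:.0f} GB total memory")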

1.1. Node Types

There are two node types: CPU-only and GPU-enabled. The two most common chassis configurations for CPU-only nodes are listed below.

CPU-only nodes
  • HP Chassis

    • HPE Apollo 2000 Gen 10 chassis. 2RU with 4 nodes/chassis

    • Dual-socket Intel Skylake 6148 20-core CPUs @ 2.4GHz. 40 cores total.

  • Dell Chassis

    • Dell R640 1U Server

    • Dual-socket Intel Cascade Lake 6240R 24-core CPUs @ 2.4GHz. 48 cores total.

  • 10Gbit/s Ethernet

  • 100Gbit/s ConnectX-5, EDR Infiniband

  • 192GB DDR4 ECC memory (options: 384GB, 768GB)

GPU-Enabled Nodes
  • HPE DL380 Gen 10 chassis. 2RU. Up to 4 GPUs/chassis

  • CPU, network, memory, and SSD identical to the CPU-only nodes

  • Four Nvidia V100 GPUs, each with 16GB of high-bandwidth memory, connected via PCIe (a quick inventory check is sketched after this list)
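The GPU inventory on a node can be listed with the vendor tooling. The Python sketch below shells out to nvidia-smi, which is assumed to be installed on the GPU nodes; it simply prints one line per visible GPU.

    #!/usr/bin/env python3
    """List the GPUs visible on a GPU-enabled node.

    Sketch that shells out to nvidia-smi (assumed present on GPU nodes);
    an HPC3 GPU node is expected to show four V100 entries.
    """
    import subprocess

    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout

    for line in out.strip().splitlines():
        # Each line: GPU index, model name, total memory.
        print(line)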

1.2. Networking

Each HPC3 node is attached to the following networks:

  • 10 Gb Ethernet. This is the provisioning and control network and provides access to Ethernet-only resources.

  • 100Gbit/s EDR Infiniband. The fabric is a 2-level Clos topology with a maximum 8:1 oversubscription: nodes in the same rack (max 32) are connected to a full-bisection, 36-port Infiniband switch, and each lower-level switch is connected to two root-level switches with two links per switch. The subnet manager is opensm with LMC (LID Mask Control) set to 2 for multi-path diversity; a quick way to check a node's link rate is sketched after this list.
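On a node, the negotiated Infiniband link rate can be read from the standard Linux RDMA sysfs tree (/sys/class/infiniband). The sketch below is illustrative and assumes that tree is present; on an EDR-connected node the rate file should read "100 Gb/sec (4X EDR)".

    #!/usr/bin/env python3
    """Report the Infiniband link rate of each HCA port on a node.

    Sketch using the standard Linux RDMA sysfs tree; not an HPC3-specific tool.
    """
    import glob
    import pathlib

    for rate_file in sorted(glob.glob("/sys/class/infiniband/*/ports/*/rate")):
        path = pathlib.Path(rate_file)
        device, port = path.parts[-4], path.parts[-2]
        print(f"{device} port {port}: {path.read_text().strip()}")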

1.3. Support Nodes

Support nodes include:

  • Login nodes (3)

  • Scheduler (1)

  • Provisioning (2)

  • Firewall (pfSense) (2)

  • NFS (home area) with ZFS as the underlying file system

Login nodes have the same CPU, network, and memory configuration as the CPU-only nodes, but are supplied by Dell.