Cluster architectures are designed to enable high performance parallel computing and consist of one or more master nodes and one or more compute nodes interconnected by a private network system. Dell's High Performance Computing Cluster stacks are built around standards-based commodity components, including hardware, interconnects and software.
The master node is the architecture's gateway to external resources and supports the Network File System (NFS). To keep the master node highly available to users, High Availability (HA) clustering can be employed.
The compute nodes are the cluster workhorses and are used to execute parallel jobs. Typically, access to and management of compute nodes are provided via remote interfaces, such as network and/or serial-port connections through the master node. Since compute nodes do not need to access machines outside the cluster, and machines outside the cluster do not need to access the compute nodes directly, compute nodes commonly use private IP addresses.
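The private-addressing scheme described above can be sketched in a few lines. This is an illustrative example only; the subnet, node-naming convention, and node count are assumptions, not part of any Dell tooling.

```python
import ipaddress

# Hypothetical addressing scheme: compute nodes live on a private
# RFC 1918 subnet reachable only through the master node.
CLUSTER_SUBNET = ipaddress.ip_network("192.168.100.0/24")

def compute_node_ip(index):
    """Assign compute node `index` (1-based) an address in the private subnet."""
    return CLUSTER_SUBNET[index]

for i in range(1, 4):
    addr = compute_node_ip(i)
    assert addr.is_private  # RFC 1918 space: not routable from outside the cluster
    print(f"compute-{i:03d}  {addr}")
```

Because the addresses are private, the same subnet can be reused across clusters without conflicting with the site's public address plan.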
HPCC Infrastructure Stack
There are many choices available when considering an HPC cluster for your compute-intensive needs. That is why customers are increasingly turning to Dell. With Dell's HPCC offering, we have designed a straightforward solution and delivery model allowing you to easily customize your architecture from a breadth of industry-standard components and validated building blocks. The basic components to consider should include:
HPC software components include: operating system, hardware drivers, middleware, compiler, parallel program development environment, debugger, performance analyzer, hardware-level node monitoring/management tool, OS-level node monitoring/management tool, cluster monitoring/management tool and parallel applications.
Once the right components have been identified for your specific requirements, installation and implementation are designed to be very straightforward. After all the nodes are racked and all the interconnects and power cords are connected, the next step is to install the planned software. There are three steps to complete software installation:
- Step 1: Install the operating system and all the selected software on the master node(s) and compute nodes
- Step 2: Configure all the nodes
- Step 3: Perform a pilot run of the HPC cluster and tune cluster performance accordingly
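Step 2 above amounts to pushing consistent configuration to every node. As a minimal sketch, the snippet below generates /etc/hosts entries for the private cluster network; the subnet, hostnames, and node count are illustrative assumptions, not output of any Dell installer.

```python
import ipaddress

def hosts_entries(subnet="192.168.100.0/24", master="master", compute=4):
    """Build /etc/hosts lines for a master node and `compute` compute nodes."""
    net = ipaddress.ip_network(subnet)
    lines = [f"{net[1]}\t{master}"]                      # master on the first host address
    for i in range(1, compute + 1):
        lines.append(f"{net[i + 1]}\tcompute-{i:03d}")   # compute nodes follow sequentially
    return "\n".join(lines)

print(hosts_entries())
```

In practice a cluster-management tool distributes this kind of file to all nodes at once, so every node resolves its peers identically before the pilot run in Step 3.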
High-density rack-mounted servers are the most popular configuration for today's HPC cluster environment. Besides the compute nodes, each rack could be equipped with network switches, UPS, PDU (power distribution unit), and so on. For applications where communication bandwidth between nodes is critical, low-latency and/or high-bandwidth interconnects such as Gigabit Ethernet, Myrinet, and InfiniBand are common choices for interconnecting compute nodes.
Several connections are available for cluster monitoring and management. The serial port and BMC (Baseboard Management Controller, accessible via the first NIC) provide a console redirection feature as an additional route for monitoring and managing the compute nodes from the master without relying on the network connectivity or interfering with network activities. DRAC (Dell Remote Access Card) can be used for remote management along with a KVM (keyboard-video-mouse) switch that accesses the compute nodes over Cat-5 cabling and TCP/IP networking rather than traditional KVM cables.
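A management station typically acts on readings collected over these out-of-band paths. As a hedged sketch of that idea, the snippet below flags sensors whose status is not `ok` in pipe-delimited readings of the style produced by BMC query tools; the sample text, field layout, and status codes are illustrative assumptions.

```python
# Sample sensor readings, hypothetical values in a pipe-delimited layout:
# name | reading | unit | status
SAMPLE = """\
CPU1 Temp        | 48.000 | degrees C | ok
CPU2 Temp        | 92.000 | degrees C | nc
Fan1 RPM         | 7200.000 | RPM | ok
"""

def flag_sensors(text):
    """Return the names of sensors whose status field is not 'ok'."""
    flagged = []
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.split("|")]
        name, status = fields[0], fields[3]
        if status != "ok":
            flagged.append(name)
    return flagged

print(flag_sensors(SAMPLE))  # → ['CPU2 Temp']
```

Because this data arrives over the serial/BMC path, a node can be diagnosed this way even when its operating system or network stack is down.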
For additional information, please contact us at: 1-800-274-3355 or firstname.lastname@example.org or contact your Dell account representative.