High-performance computer cluster system
Update time: 2014-03-28

Instrument Type: Dawning 5000 computer cluster

Purchase date: September 2011

Instrument Introduction:

The Dawning 5000 high-performance computer builds on advances in multiprocessor chipset architecture, high-performance node operating systems, highly scalable interconnection networks, high-throughput communication software, a multi-threaded programming model with a partitioned global address space, and core compiler technology. It also provides high-density, high-performance compute nodes, a reliable system architecture, virtualization software, high-performance mass storage, parallel file systems, large-scale system management software, system robustness technology, and massively parallel algorithms. Together these products and technologies aim to deliver high performance, programmability, portability, and stability, so that the system can serve supercomputing centers for both capability computing and capacity computing.

The Dawning 5000A is built from high-density servers and uses a "hyper parallel" architecture (Hyper Parallel Processing, abbreviated HPP).

Peak performance reaches 20 Tflops for CPU computing and 50 Tflops for GPU computing. Storage comprises a 96TB parallel storage system and a 20TB backup storage system.
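
The CPU peak and the parallel storage total quoted above can be cross-checked against the configuration list that follows. The sketch below is illustrative only. It counts the Opteron 6172 nodes listed in the HPC, cloud service, and security isolation partitions, and it assumes 4 double-precision floating-point operations per core per cycle, a figure that does not appear in the configuration list. Leaving out the management nodes (Opteron 6128) and the workstations (Xeon E5620) is also an assumption. All names in the code are hypothetical.

```python
# Cross-check of the summary figures against the configuration list.
# Node counts and socket counts are copied from the list on this page.
# (partition, node type, node count, Opteron 6172 sockets per node)
OPTERON_6172_NODES = [
    ("HPC", "CB65-G compute blade", 64, 2),
    ("HPC", "A840-G SMP fat node", 4, 4),
    ("Cloud service", "A840-G cloud service node", 8, 4),
    ("Security isolation", "CB65-G compute blade", 10, 2),
    ("Security isolation", "A840-G SMP fat node", 1, 4),
]

CORES_PER_SOCKET = 12   # from the list: 12-core processor
CLOCK_GHZ = 2.1         # from the list: 2.1GHz
FLOPS_PER_CYCLE = 4     # ASSUMPTION: double-precision flops per core per cycle

# Raw capacity of the two Parastor P100-SSU units, in TB (from the list).
PARALLEL_STORAGE_TB = {
    "HPC partition": 64,
    "Security isolation partition": 32,
}


def count_sockets(nodes=OPTERON_6172_NODES):
    """Total number of Opteron 6172 sockets across the listed nodes."""
    return sum(count * per_node for _, _, count, per_node in nodes)


def cpu_peak_tflops(nodes=OPTERON_6172_NODES):
    """Theoretical peak = sockets * cores * clock (GHz) * flops per cycle."""
    gflops = count_sockets(nodes) * CORES_PER_SOCKET * CLOCK_GHZ * FLOPS_PER_CYCLE
    return gflops / 1000.0


def parallel_storage_tb(units=PARALLEL_STORAGE_TB):
    """Combined raw capacity of the parallel storage units, in TB."""
    return sum(units.values())


if __name__ == "__main__":
    print(f"Opteron 6172 sockets: {count_sockets()}")
    print(f"CPU peak: {cpu_peak_tflops():.2f} Tflops")
    print(f"Parallel storage: {parallel_storage_tb()} TB")
```

Under these assumptions the CPU figure rounds to 20.16 Tflops and the parallel storage total is 96TB, which agrees with the values given on this page.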

Applications:

1. High-performance computing applications: advanced manufacturing, electronic information, environmental engineering, equipment manufacturing, communication engineering, optical engineering, civil engineering, software engineering, computational fluid dynamics, biotechnology and agriculture, genetic engineering, electronics and information engineering, biomedicine, vehicle engineering, chemical engineering, digital city, digital engineering, film and television animation, 3D animation, special effects, game engine development, meteorology, meteorological disaster assessment, geographic information systems, and massive remote sensing data processing.

High-performance computing is a foundational tool for every discipline and industry. It supports productivity and innovation in several ways:

(1) Deepening basic scientific discovery: enlarging the problem scale and increasing solution accuracy both demand more high-performance computing resources. Examples: computational fluid dynamics, computational materials science, computational electromagnetics.

(2) Multidisciplinary design: large-scale collaborative computing across many departments requires an integrated high-performance platform. Examples: automotive design, ship design.

(3) Simulation-based engineering science: combining traditional engineering knowledge with high-performance computing provides cost-effective design and practice. Examples: simulation-based medical practice, digital city modeling, simulation tools for nuclear power and oil, new materials development, crash simulation, digital wind tunnels.

(4) Service improvement: high-performance computing improves the timeliness of decision-making and the economic efficiency of services in many industries. Examples: real-time weather forecasting, urban traffic control, video-on-demand services, animation design, online games, RFID-based cargo tracking, intelligent e-commerce.

(5) Data-intensive applications: coping with explosive data growth requires high-performance data processing. Examples: high-energy physics experiment data processing, remote sensing data processing, business intelligence, bioinformatics, RFID data mining, financial analysis of mortgage lending, mobile phone traffic analysis.

2. Cloud computing applications: IaaS hardware virtualization cloud services, cloud disaster recovery, cloud storage, cloud security, cloud integration, and research on the standardization of cloud services.

Technical parameters:

Dawning HPC cluster system configuration list

No.

Name

Technical Specifications

Unit

Quantity

1

The hardware part

1.1

Computing subsystem

1.1.1

HPC partition

Blade platform

Dawning TC3600

10U rack-mounted blade chassis can support 10 compute blades; 
2 * management module, integrated remote KVM and remote virtual media; 
1 * Gigabit Ethernet switching module provides 10 RJ45 gigabit interfaces; 
1 * 40Gb / s QDR Infiniband switch module, providing 18 QSFP interfaces; 
4 * Dual redundant hot-swappable cooling fans; 
3 * 2000W power supply (2 +1 redundant hot-swappable);

 

7

Compute blades

Dawning CB65-G

2 * AMD Opteron 6172 12-core processor (2.1GHz);
8 * 4GB DDR3 1333MHz ECC registered quad-channel memory;
1 * 146GB SAS hot-plug hard drive;
2 * 1000M Ethernet card;
1 * 40Gb Infiniband daughter card;

 

64

SMP fat node

Dawning A840-G

4 * AMD Opteron 6172 12-core processor (2.1GHz);
128GB DDR3 1333MHz ECC registered quad-channel memory;
4 * 300GB SAS hot-swappable hard drives (1 for the installed system, 3 in RAID 5);
2 * Gigabit Ethernet;
1 * 40Gb Infiniband HCA card;
1 * 16GB PCI-E I/O acceleration module;
1000W redundant power supply;

 

4

Scientific computing accelerators (GPGPU)

First-principles calculations

Standard 4U rackmount equipment;
4 * Nvidia Tesla C2050 (448 computing cores, 1.15GHz);
24GB DDR3 cache, 12GB GDDR5 cache;
1 management network port, 2 * 1Gb access ports, 1 * 40Gb QDR IB high-speed access port, redundant power supplies, built-in first-principles calculation module (supports TeraChem), support for GPU monitoring, management, and scheduling

 

6

Molecular dynamics calculations

Standard 4U rackmount equipment;
4 * Nvidia Tesla C2050 (448 computing cores, 1.15GHz);
24GB DDR3 cache, 12GB GDDR5 cache;
1 management network port, 2 * 1Gb access ports, 1 * 40Gb QDR IB high-speed access port, redundant power supplies, embedded molecular dynamics calculation module (supports NAMD and Gromacs), support for GPU monitoring, management, and scheduling

 

10

Fault-tolerant computing module

Dawning CluSnap

Standard 4U rackmount equipment;
Hardware-based system-level checkpoint function; 
16GB high-speed system cache; 
1 * 40Gb Infiniband interfaces; 
2 * 1000M Ethernet interface;

 

2

1.1.2

Cloud service partition

Cloud service node

Dawning A840-G

4 * AMD Opteron 6172 12-core processor (2.1GHz);
128GB DDR3 1333MHz ECC registered quad-channel memory;
4 * 300GB SAS hot-swappable hard drives (1 for the installed system, 3 in RAID 5);
5 * Gigabit Ethernet;
2 * 8Gb HBA card;
1 * 16GB PCI-E I/O acceleration module;
1000W redundant power supply;

 

8

1.1.3

Collaborative design partition

Workstation node

Dawning W580I

4U tower/rack convertible;
2 * Intel Xeon E5620 quad-core processor (2.4GHz);
24GB DDR3 1333MHz;
2 * 500GB 7200rpm enterprise-class SATA hard drives;
2 * 1000M Ethernet card;
1 * Nvidia Quadro FX4000 2GB memory;
3 * Nvidia Tesla C2050 3GB memory;
1 * quiet redundant power supply;
1 * Slim DVD-RW;
Rail kit included;

 

10

1.1.4

Security isolation partition

Blade platform

Dawning TC3600

10U rack-mounted blade chassis can support 10 compute blades; 
2 * management module, integrated remote KVM and remote virtual media; 
1 * Gigabit Ethernet switching module provides 10 RJ45 gigabit interfaces; 
1 * 40Gb / s QDR Infiniband switch module, providing 18 QSFP interfaces; 
4 * Dual redundant hot-swappable cooling fans; 
3 * 2000W power supply (2 +1 redundant hot-swappable);

 

1

Compute blades

Dawning CB65-G

2 * AMD Opteron 6172 12-core processor (2.1GHz);
8 * 4GB DDR3 1333MHz ECC registered quad-channel memory;
1 * 146GB 2.5-inch SAS hot-plug hard drive;
2 * 1000M Ethernet card; 
1 * 40Gb Infiniband daughter card;

 

10

SMP fat node

Dawning A840-G

4 * AMD Opteron 6172 12-core processor (2.1GHz);
128GB DDR3 1333MHz ECC registered quad-channel memory;
4 * 300GB SAS hot-swappable hard drives (1 for the installed system, 3 in RAID 5);
3 * Gigabit Ethernet;
1 * 40Gb Infiniband HCA card;
1 * 16GB PCI-E I/O acceleration module;
1000W redundant power supply;

 

1

1.2

Management control subsystem

Cluster Management / Monitoring Node

Dawning CB65-G

2 * AMD Opteron 6128 8-core processor (2.0GHz);
8 * 4GB DDR3 1333MHz ECC registered quad-channel memory;
1 * 146GB SAS hot-plug hard drive;
2 * 1000M Ethernet card; 
1 * 40Gb Infiniband daughter card;

 

1

IB subnet management nodes

Dawning CB65-G

2 * AMD Opteron 6128 8-core processor (2.0GHz);
8 * 4GB DDR3 1333MHz ECC registered quad-channel memory;
1 * 146GB SAS hot-plug hard drive;
2 * 1000M Ethernet card;
1 * 40Gb Infiniband daughter card;

 

1

Function node (NIS, NTP, FTP, job scheduling, License)

Dawning CB65-G

2 * AMD Opteron 6128 8-core processor (2.0GHz);
8 * 4GB DDR3 1333MHz ECC registered quad-channel memory;
1 * 146GB SAS hot-plug hard drive;
2 * 1000M Ethernet card; 
1 * 40Gb Infiniband daughter card;

 

2

Login node (partition)

Dawning CB65-G

2 * AMD Opteron 6128 8-core processor (2.0GHz);
8 * 4GB DDR3 1333MHz ECC registered quad-channel memory;
1 * 146GB SAS hot-plug hard drive;
2 * 1000M Ethernet card; 
1 * 40Gb Infiniband daughter card;

 

2

Login node (partition)

Dawning A620r-G

2U rack;
2 * AMD Opteron 6128 8-core processor (2.0GHz);
8 * 4GB DDR3 1333MHz;
1 * 146GB 2.5-inch SAS hot-plug hard drive;
2 * 1000M Ethernet card;
1 * 40Gb Infiniband HCA card;
1 * Redundant power supply;
1 * Slim DVD-RW;
Rail kit included;

 

2

1.3

Security Control Subsystem

Security Authentication Server

Dawning SecServer750

5U tower/rack convertible design, 1 Xeon E5506 processor, 2GB DDR3 memory, 500GB SATA 2 hard drive, DVD-ROM, dedicated high-performance encryption card, and encryption and authentication software CD. Provides CA certification, digital signature, and single sign-on; works together with NiKey to form an authentication solution; certified by the State Encryption Administration.

 

1

Network auditing equipment

Dawning NetFirm-A1600

6 Gigabit ports, 4 optical ports, 2U chassis, redundant power supplies; network content and behavior auditing with support for Web filtering, chat monitoring, mail monitoring, behavior management, logging, and auditing; 2.5 million concurrent connections, 4.5G throughput, 5000 users

 

1

Network firewall

Godson firewall C10TLFW-1000L

1,000,000 concurrent connections, throughput 1G, VPN Tunnels 600

 

2

Smart Key

Dawning NiKey100

Dawning NiKey100 intelligent cryptographic key for secure authentication; used in conjunction with Dawning GridView and the firewall VPN.

 

100

1.4

Storage Subsystem

Parallel storage system Parastor (HPC partition)

P100-MDC (data index controller)

High-performance 64-bit processor, 24GB cache, 1 management network port, 2 * 1Gb access ports, 1 * 40Gb QDR IB high-speed access port, redundant power supplies, embedded high-performance data indexing engine; provides a single global namespace, manages concurrent access from multiple clients, supports hot standby

 

2

P100-IOM (Data Access Module)

High-performance 64-bit processor, 24GB cache, 1 management network port, 2 * 1Gb access ports, 1 * 40Gb QDR IB high-speed access port, redundant power supplies, embedded high-performance data access engine; processes all client data access requests in parallel, automatic load balancing, supports on-demand dynamic expansion, supports hot standby

 

4

P100-SSU (Intelligent Storage Unit: 64TB)

High-performance 64-bit processor, fully redundant architecture, 1 management network port, supports RAID5, RAID6, and other RAID levels, supports three storage media (SAS, SATA, SSD), provides massive storage space expandable up to 1EB, supports multi-copy fault tolerance and dynamic on-demand online expansion; actual configuration: 64TB raw capacity of enterprise SATA hard drives as large-capacity storage space

 

1

Management Software

Dawning embedded parallel storage management software, Chinese-language graphical interface, real-time monitoring of the running status of the parallel storage system and its modules and of the I/O performance of each component

 

1

Parallel storage system Parastor (security isolation partition)

P100-MDC (data index controller)

High-performance 64-bit processor, 24GB cache, 1 management network port, 2 * 1Gb access ports, 1 * 40Gb QDR IB high-speed access port, redundant power supplies, embedded high-performance data indexing engine; provides a single global namespace, manages concurrent access from multiple clients, supports hot standby

 

2

P100-IOM (Data Access Module)

High-performance 64-bit processor, 24GB cache, 1 management network port, 2 * 1Gb access ports, 1 * 40Gb QDR IB high-speed access port, redundant power supplies, embedded high-performance data access engine; processes all client data access requests in parallel, automatic load balancing, supports on-demand dynamic expansion, supports hot standby

 

4

P100-SSU (Intelligent Storage Unit: 32TB)

High-performance 64-bit processor, fully redundant architecture, 1 management network port, supports RAID5, RAID6, and other RAID levels, supports three storage media (SAS, SATA, SSD), provides massive storage space expandable up to 1EB, supports multi-copy fault tolerance and dynamic on-demand online expansion; actual configuration: 32TB raw capacity of enterprise SATA hard drives as large-capacity storage space

 

1

Management Software

Dawning embedded parallel storage management software, Chinese-language graphical interface, real-time monitoring of the running status of the parallel storage system and its modules and of the I/O performance of each component

 

1

Backup storage system

Dawning Dbstor backup storage system

4U rack, 2 Gigabit Ethernet ports, 8GB cache, 20TB intelligent disk storage with data deduplication, Enterprise Edition backup software, 20 heterogeneous clients, 840W 2+1 redundant power supply

 

1

Fibre Channel switch

Brocade 24-port Fibre Channel switch

BR-360-0008-A, 24-port switch with 24 active ports, single (fixed) power supply, 24 * 8Gb shortwave SFP, includes Web Tools and Zoning software licenses, supports cascading, 1-year factory warranty (off-site)

 

1

1.5

Network Subsystem

High-speed computing network

Mellanox MIS5100Q-3DNC QDR IB switch

Standard chassis: 6U modular switch chassis supporting up to 108 ports (6 leaf boards); the chassis includes 3 internal switching modules, 3 auto-sensing 110/220V AC power supply modules (2+1 redundancy), management module, hot-swappable fan modules, and rack mount kit

 

1

Mellanox MIS5001QC 18QSFP 40Gb/s IB leaf board

18+18-port QSFP 40Gb/s IB leaf board, InfiniScale® IV chip, for MIS5XXX series switches

 

6

Infiniband cable

FreePort QSFP (QDR) IB cable (QDR 4X QSFP-QSFP IB cable)

 

108

Gigabit Ethernet

Force10 C150 switch

Configured with 96 10/100/1000BASE-T copper interfaces, including: 1 standard chassis (with 1 AC power supply and 1 routing and switching module), 2 * 48-port 1Gb RJ-45 Ethernet line cards, 1 operating system

 

1

1.6

Infrastructure subsystem

Dawning C1000 cooling system

Air Conditioner

Dawning C1000 fluorine-cooled module-level air conditioner indoor unit, maximum sensible cooling capacity 35kW, maximum air volume 7000 m³/h, dimensions 600 * 1200 * 2000mm

 

3

Air conditioner outdoor unit

GMVL-Rm600W/D, cooling capacity 60kW

 

3

Dust removal and dehumidification unit

Dawning dust removal and dehumidification unit, dehumidification capacity 5 kg/h, sub-high-efficiency filtration grade

 

1

Cabinets

Dawning C1000 enclosed cabinet, dimensions 600 * 1200 * 2000mm, 42U usable space, static load 1000kg

 

12

Monitoring System

C1000 Monitor Collector

Collects and forwards sensor data from the air conditioners and from inside the cabinets

 

1

C1000 Temperature Probe

Measures the temperature inside the cabinet

C1000 integrated temperature and humidity probe

Measures the temperature and humidity of the cold-aisle and hot-aisle microenvironments of the cabinet row

C1000 smoke detectors

Measures smoke concentration in the cold-aisle and hot-aisle microenvironments of the cabinets to prevent fire hazards

C1000 leak detector

Detects condensate leaks beneath the air conditioner indoor units to prevent flooding

Console

Dawning Cluster Console

1U manually retractable console (Dawning 17" LCD monitor, mouse, keyboard, 8-port switch, etc.)

 

1

Video Switching System

SKVM

SKVM IV Over IP (includes keyboard and mouse), CIM node control module

 

1

2

Software section

Operating system

Linux

SuSE Linux Enterprise Edition 11.1

 

1

GPU programming environment

Nvidia GPU development environment

nvcc C language compiler

 

1

CUDA FFT and BLAS libraries for the GPU (graphics processor)

Analyzer (Profiler)

gdb debugger for the GPU (graphics processor)

CUDA runtime driver

Cloud management system

Dawning Cloudview

Dawning Cloudview cloud management platform, supporting project organization and management, billing management, virtualization management, and cloud security management

 

1

Cluster Management System

Dawning GridView 2.5

GridView HPC version, supporting system deployment, system monitoring, cluster management, data reporting, unified alarms, and job scheduling

 

1

GridView Application Portal: 8 basic application portals supporting Ansys, Lsdyna, Abaqus, Fluent, Vasp, Gaussian, NAMD, and Gromacs; 4 custom application portals developed according to user requirements

 

1

GridView cluster power-saving module PowerConf 2.0

 

1

Application Development Environment

Compiler

GNU compilers, supporting C/C++ and Fortran 77/90
Intel compilers, supporting C/C++ and Fortran

 

1

Math Library

BLAS, LAPACK, ScaLAPACK, FFTW

 

1

MPI parallel environment

OpenMPI (supports Infiniband and Ethernet environments)

 

1

MPICH (supports Gigabit Ethernet environments)

 

1

CPU peak (Tflops)

20.16

 

 

 

GPU peak (Tflops)

50.43

 

 

 

 

Copyright 2011 - 2012 All Rights Reserved Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences
Add: No.266 Fangzheng Avenue,Shuitu Hi-tech Industrial Park, Shuitu Town, Beibei District, Chongqing     Postal: 400714