Lecture: Introduction to GP-GPU and CUDA - Dương Nhật Tân
High Performance Computing Center
Hanoi University of Science & Technology
Introduction to GP-GPU and CUDA
Duong Nhat Tan (dn.nhattan@gmail.com)
2012
Outline
Overview
What is GPGPU?
GPU Computing with CUDA
Hardware Model
Execution Model
Thread Hierarchy
Memory Model
GPU Computing Application Areas
Summary
Overview
Scientific computing has the following characteristics:
The problems are not interested.
Computers are used to carry out the arithmetic.
Programs are always expected to run faster.
Examples: weather forecasting, climate change, modeling, simulation, gene prediction, docking…
Several Approaches
Supercomputers
Mainframe
Cluster
Multi-/many-core systems
Microprocessor trends
Many cores running at lower frequencies are fundamentally more power-efficient
Multi-core (2-8 cores)
Intel CPUs: Pentium D, Core Duo, Core 2 Duo, quad-core parts, Core i3, i5, i7
Many-core (> 8 cores)
GPU - Graphics Processing Unit
A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, and R. W. Brodersen, “Optimizing Power Using Transformations,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
The development of modern GPUs
GPU - NVIDIA GeForce GTX 295
CUDA Cores: 480 (240 per GPU)
Graphics Clock (MHz): 576
Processor Clock (MHz): 1242
Memory Clock (MHz): 999
Memory Bandwidth (GB/sec): 223.8
Benchmark (GFLOPS): 1788.48
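The 1788.48 GFLOPS figure above is consistent with a simple peak-throughput estimate. The worked equation below is only a sketch: it assumes that each CUDA core issues 3 single-precision floating-point operations per processor-clock cycle (a multiply-add plus a multiply). That factor is an assumption chosen because it reproduces the listed number; it is not stated on the slide.

```latex
\begin{aligned}
\text{Peak GFLOPS} &= N_{\text{cores}} \times f_{\text{processor}}\,[\text{GHz}] \times \text{FLOPs per cycle} \\
                   &= 480 \times 1.242 \times 3 \\
                   &= 1788.48
\end{aligned}
```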
CPU vs GPU
CPUs are optimized for high performance on sequential code: transistors are dedicated to data caching and flow control
GPUs use the additional transistors directly for data processing
Book: “Programming Massively Parallel Processors: A Hands-on Approach”
GPU Solutions
NVIDIA
GeForce (gaming/movie playback)
Quadro (professional graphics)
Tesla (HPC)
AMD/ATI
Radeon (gaming/movie playback)
FireStream (HPC)
AMD FireStream 9170
Motivation
Cost/performance ratio
Power supply costs
Maintenance and operation costs
GPGPU
GP-GPU stands for General-Purpose computation on GPUs
An approach that uses the GPU chip on the video card as a coprocessor to accelerate operations that are normally executed on the CPU
How is GPGPU different from general graphics operations?
GPGPU means running various kinds of algorithms on a GPU, not necessarily image processing.
Examples: FFT, Monte Carlo, data sorting, data mining, and the list continues
Until 2006, developers had to cast their problems into the graphics domain and solve them using graphics APIs
Parallel Computing with GPU
NVIDIA GPU
11/2006: NVIDIA released the G80 architecture together with an application development environment, CUDA
CUDA allows developers to write GPGPU applications in high-level programming languages
- Built from a scalable array of Streaming Multiprocessors (SMs)
- Each SM contains 8 SPs (Scalar Processors)
- Each SM can initialize, manage, and execute up to 768 threads
G80 Architecture
NVIDIA GPU
G80-based GPU
GeForce 8800 GT
14 SMs (112 cores)
DRAM: 512 MB
06/2008
GeForce GT 200 series
30 SMs (240 cores)
DRAM: 1 GB
Tesla
30 SMs (240 cores)
DRAM: 4 GB
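The core counts listed above follow from the number of SMs. The short check below assumes 8 scalar processors per SM; the slides state this for the G80, and applying it to the GT 200 series and Tesla is inferred from the listed counts.

```latex
\begin{aligned}
N_{\text{cores}} &= N_{\text{SM}} \times 8 \\
14 \times 8 &= 112 \\
30 \times 8 &= 240
\end{aligned}
```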
Tesla Specification
Power consumption: 187 W!
GPU Computing with CUDA
CUDA: Compute Unified Device Architecture
Application development environment for NVIDIA GPUs
Compiler, debugger, profiler, high-level programming languages
Libraries (CUBLAS, CUFFT, ...) and code samples
GPU Computing with CUDA
The GPU is viewed as a compute device that:
Is a coprocessor to the CPU or host
Has its own DRAM (device memory)
CUDA C is an extension of the C/C++ language
Data-parallel programming model
Executes thousands of threads in parallel on the GPU
Synchronization is inexpensive
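The points above (the GPU as a coprocessor with its own DRAM, and a data-parallel model with many threads) can be illustrated with a vector addition. This is a minimal sketch, not code from the lecture: the names `vecAdd` and `runVecAdd`, the block size of 256, and the test data are all assumptions made for illustration. It uses only standard CUDA runtime calls (`cudaMalloc`, `cudaMemcpy`, `cudaFree`).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Device code: each thread adds one pair of elements (data parallelism).
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // the last block may be partly unused
        c[i] = a[i] + b[i];
}

// Host code (hypothetical helper): returns true if the GPU result matches the CPU result.
bool runVecAdd(int n)
{
    size_t bytes = (size_t)n * sizeof(float);
    float *hA = (float *)malloc(bytes);
    float *hB = (float *)malloc(bytes);
    float *hC = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { hA[i] = (float)i; hB[i] = 2.0f * i; }

    // The device has its own DRAM, so buffers are allocated there explicitly.
    float *dA = NULL, *dB = NULL, *dC = NULL;
    bool ok = cudaMalloc(&dA, bytes) == cudaSuccess &&
              cudaMalloc(&dB, bytes) == cudaSuccess &&
              cudaMalloc(&dC, bytes) == cudaSuccess;

    if (ok) {
        // Copy the inputs from host memory to device memory.
        cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

        // Launch enough 256-thread blocks to cover all n elements.
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        vecAdd<<<blocks, threads>>>(dA, dB, dC, n);

        // Blocking copy back to the host; it waits for the kernel to finish.
        ok = cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // Verify on the CPU (the chosen values are exactly representable as floats).
    for (int i = 0; ok && i < n; ++i)
        ok = (hC[i] == hA[i] + hB[i]);

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return ok;
}

int main()
{
    bool ok = runVecAdd(1 << 20);
    printf("%s\n", ok ? "PASS" : "FAIL");
    return ok ? 0 : 1;
}
```

Assuming the file is saved as `vecadd.cu`, it can be built with `nvcc vecadd.cu -o vecadd`. The host drives the computation and the device only executes the kernel, which matches the coprocessor view described above.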
Hardware implementation
A set of SIMD multiprocessors with on-chip shared memory
Scalable Programming Models
Memory Model
There are 6 memory types:
• Registers
o on chip
o fast access
o per thread
o limited amount
• Local Memory
o in DRAM
o slow
o non-cached
o per thread
o relatively large
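The two per-thread memory types above can be seen in a small kernel. This is a hedged sketch rather than code from the lecture: the names `perThreadStorage` and `runPerThreadStorage` and the array size of 64 are assumptions. Placement is decided by the compiler, not the programmer; a scalar is normally kept in a register, while a per-thread array indexed with a run-time value is a typical candidate for local memory.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void perThreadStorage(const int *idx, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    int scale = i + 1;   // per-thread scalar: a register candidate
    int scratch[64];     // per-thread array: a local-memory candidate

    for (int k = 0; k < 64; ++k)
        scratch[k] = k * scale;

    // The index is only known at run time, which usually prevents the
    // compiler from keeping the whole array in registers.
    out[i] = scratch[idx[i] & 63];   // & 63 keeps the index inside the array
}

// Host helper (hypothetical name): returns true if every thread produced
// the value expected from its own private copy of scratch.
bool runPerThreadStorage(int n)
{
    size_t bytes = (size_t)n * sizeof(int);
    int *hIdx = (int *)malloc(bytes);
    int *hOut = (int *)malloc(bytes);
    for (int i = 0; i < n; ++i) hIdx[i] = i;

    int *dIdx = NULL, *dOut = NULL;
    bool ok = cudaMalloc(&dIdx, bytes) == cudaSuccess &&
              cudaMalloc(&dOut, bytes) == cudaSuccess;

    if (ok) {
        cudaMemcpy(dIdx, hIdx, bytes, cudaMemcpyHostToDevice);
        int threads = 128;
        perThreadStorage<<<(n + threads - 1) / threads, threads>>>(dIdx, dOut, n);
        ok = cudaMemcpy(hOut, dOut, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    for (int i = 0; ok && i < n; ++i)
        ok = (hOut[i] == (i & 63) * (i + 1));

    cudaFree(dIdx); cudaFree(dOut);
    free(hIdx); free(hOut);
    return ok;
}

int main()
{
    bool ok = runPerThreadStorage(1024);
    printf("%s\n", ok ? "PASS" : "FAIL");
    return ok ? 0 : 1;
}
```

To see what the compiler actually chose, the file can be compiled with `nvcc -Xptxas -v`, which reports the register count and any local-memory (stack or spill) usage per kernel.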