
High Performance Computing Center  
Hanoi University of Science & Technology  
Introduction to GP-GPU and CUDA  
Duong Nhat Tan (dn.nhattan@gmail.com)  
2012  
Outline  
Overview  
What is GPGPU?  
GPU Computing with CUDA  
Hardware Model  
Execution Model  
Thread Hierarchy  
Memory Model  
GPU Computing Application Areas  
Summary  
Overview  
Scientific computing has the following  
characteristics:  
The problems themselves are not the main interest; the computation is.
Computers are used to carry out the arithmetic.
We always want the programs to run faster.
Examples: weather forecasting, climate change, modeling, simulation, gene prediction, docking, ...
Several Approaches  
Supercomputers  
Mainframes
Clusters
Multi-/many-core systems
Microprocessor trends  
Many cores running at lower frequencies are fundamentally  
more power-efficient  
Multi-core (2-8 cores)
CPUs: Intel Pentium D, Core Duo, Core 2 Duo, quad-core chips, Core i3/i5/i7
Many-core (> 8 cores)
GPU - Graphics Processing Unit
A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, and R. W. Brodersen,  
“Optimizing Power Using Transformations,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  
The development of modern GPUs
GPU - NVIDIA GeForce GTX 295
CUDA Cores: 480 (240 per GPU)
Graphics Clock: 576 MHz
Processor Clock: 1242 MHz
Memory Clock: 999 MHz
Memory Bandwidth: 223.8 GB/sec
Benchmark: 1788.48 GFLOPS
CPU vs GPU  
CPUs are optimized for high performance on sequential code:  
transistors dedicated to data caching and flow control  
GPUs use additional transistors directly for data processing  
Book: “Programming Massively Parallel Processors: A Hands-on Approach”
GPU Solutions  
NVIDIA  
GeForce (gaming/movie playback)  
Quadro (professional graphics)  
Tesla (HPC)  
AMD/ATI  
Radeon (gaming/movie playback)  
FireStream (HPC)  
AMD FireStream 9170  
Motivation  
Cost/performance ratio
Power supply costs
Maintenance and operation costs
GPGPU  
GP-GPU stands for General Purpose Computation on GPU  
A technique/technology/approach that uses the GPU chip on the video card as a coprocessor to accelerate operations that are normally executed on the CPU
How is GPGPU different from general graphics operations?
GPGPU runs various kinds of algorithms on a GPU, not necessarily image processing.
For example: FFT, Monte Carlo, data sorting, data mining, and the list continues
Until 2006, developers had to cast their problems into the graphics domain and solve them using graphics APIs
Parallel Computing with GPU  
NVIDIA GPU  
11/2006: NVIDIA released the G80 architecture together with an application development environment - CUDA
Allows developers to build GPGPU applications in high-level programming languages
- Built from a scalable array of Streaming Multiprocessors (SMs)
- Each SM contains 8 SPs (Scalar Processors)
- Each SM can initialize, manage, and execute up to 768 threads
G80 Architecture  
NVIDIA GPU  
G80-based GPU
GeForce 8800 GT: 14 SMs (equivalent to 112 cores), 512 MB DRAM
06/2008 - GeForce GT 200 series: 30 SMs (240 cores), 1 GB DRAM
Tesla: 30 SMs (240 cores), 4 GB DRAM
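The SM counts and DRAM sizes listed above can also be read back from the installed GPU at run time. Below is a minimal illustrative sketch (not from the original slides) using the CUDA runtime call cudaGetDeviceProperties; it assumes a CUDA-capable device 0 is present, and the printed labels are merely illustrative.

// query_device.cu - illustrative sketch: print the SM count and memory size of GPU 0
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("No CUDA-capable device found\n");
        return 1;
    }
    printf("Device                   : %s\n", prop.name);
    printf("Streaming Multiprocessors: %d\n", prop.multiProcessorCount);
    printf("Max threads per SM       : %d\n", prop.maxThreadsPerMultiProcessor);
    printf("Global memory (DRAM)     : %zu MB\n", prop.totalGlobalMem >> 20);
    return 0;
}

Compiled with nvcc, this would report 14 multiprocessors on a GeForce 8800 GT and 30 on a GT 200-series card, matching the figures above.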
Tesla Specification  
Power consumption: 187 W!  
GPU Computing with CUDA  
CUDA: Compute Unified Device Architecture
An application development environment for NVIDIA GPUs
Compiler, debugger, profiler, high-level programming languages
Libraries (CUBLAS, CUFFT, ...) and code samples (a short CUBLAS sketch follows below)
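As a quick illustration of the library support, the fragment below is a hedged sketch (not from the slides) that calls CUBLAS to compute y = alpha*x + y (SAXPY) on the GPU. The vector length and values are arbitrary assumptions; it is built with nvcc and linked with -lcublas.

// saxpy_cublas.cu - hedged sketch: offload SAXPY to the GPU through CUBLAS
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    const int n = 4;
    float h_x[] = {1, 2, 3, 4};
    float h_y[] = {10, 20, 30, 40};
    float alpha = 2.0f;

    // Device (GPU) copies of the vectors
    float *d_x, *d_y;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));

    cublasHandle_t handle;
    cublasCreate(&handle);                                // initialize CUBLAS
    cublasSetVector(n, sizeof(float), h_x, 1, d_x, 1);    // host -> device
    cublasSetVector(n, sizeof(float), h_y, 1, d_y, 1);
    cublasSaxpy(handle, n, &alpha, d_x, 1, d_y, 1);       // y = 2*x + y, computed on the GPU
    cublasGetVector(n, sizeof(float), d_y, 1, h_y, 1);    // device -> host

    for (int i = 0; i < n; ++i) printf("%.1f ", h_y[i]);  // prints 12.0 24.0 36.0 48.0
    printf("\n");

    cublasDestroy(handle);
    cudaFree(d_x);
    cudaFree(d_y);
    return 0;
}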
GPU Computing with CUDA  
The GPU is viewed as a compute device that:  
Is a coprocessor to the CPU or host  
Has its own DRAM (device memory)  
CUDA C is an extension of the C/C++ language (see the sketch below)
Data-parallel programming model
Thousands of threads execute in parallel on the GPU
The cost of thread synchronization is low
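To make these points concrete, here is a minimal hedged sketch (not taken from the slides) of a CUDA C program: the host allocates the GPU's own DRAM, copies data over, and launches a kernel so that each element is handled by one of thousands of lightweight GPU threads. The kernel name vecAdd and the sizes are illustrative assumptions.

// vecadd.cu - minimal sketch of the coprocessor model: host CPU + GPU with its own DRAM
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Kernel (runs on the GPU): one thread per output element
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host (CPU) data
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Device (GPU) DRAM: separate from host memory, so data is copied explicitly
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);                   // expect 3.000000

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}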
Hardware implementation  
A set of SIMD multiprocessors with on-chip shared memory
Scalable Programming Models  
Memory Model  
There are 6 memory types:
Registers  
o on chip  
o fast access  
o per thread  
o limited amount  
Memory Model  
There are 6 memory types:
Registers  
Local Memory  
o in DRAM  
o slow  
o non-cached  
o per thread  
o relatively large
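As a hedged sketch (not from the slides) of how these memory types show up in CUDA C, the kernel below indicates which declarations typically land in registers, local memory, and the on-chip shared memory mentioned earlier; the names, array sizes, and the assumption of 128 threads per block are all illustrative.

// memory_spaces.cu - illustrative kernel: registers vs. local memory vs. shared memory
__global__ void memorySpaces(const float *in, float *out) {  // in/out point into device DRAM (global memory)
    int tid = threadIdx.x;       // small scalar automatic variable: kept in a register (on-chip, fast, per thread)

    float big[256];              // large per-thread array: typically placed in local memory (off-chip DRAM, slow)
    for (int i = 0; i < 256; ++i)
        big[i] = in[tid] + (float)i;

    __shared__ float tile[128];  // shared memory: on-chip, shared by all threads of one block
    tile[tid] = big[255];        // assumes the kernel is launched with 128 threads per block
    __syncthreads();             // barrier so every thread sees the whole block's writes

    out[tid] = tile[(tid + 1) % 128];
}

It would be launched like the earlier vecAdd sketch, e.g. memorySpaces<<<numBlocks, 128>>>(d_in, d_out), after allocating d_in and d_out with cudaMalloc.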