PAC 2022
Problem
Frequently used commands on the GPU platform
Load the environment
source /opt/intel/oneapi/setvars.sh
-------->
:: initializing oneAPI environment ...
-bash: BASH_VERSION = 5.0.17(1)-release
args: Using "$@" for setvars.sh arguments:
:: advisor -- latest
:: ccl -- latest
:: clck -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: inspector -- latest
:: intelpython -- latest
:: ipp -- latest
:: ippcp -- latest
:: ipp -- latest
:: itac -- latest
:: mkl -- latest
:: mpi -- latest
:: tbb -- latest
:: vpl -- latest
:: vtune -- latest
:: oneAPI environment initialized ::
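Sourcing setvars.sh twice in the same shell is skipped with a warning unless --force is given. A minimal sketch of a guard for login scripts follows. It assumes setvars.sh exports SETVARS_COMPLETED=1 after a successful run; the helper name load_oneapi is hypothetical and not part of oneAPI.

```shell
# Hypothetical helper: source setvars.sh at most once per shell session.
# Assumption: setvars.sh exports SETVARS_COMPLETED=1 when it succeeds.
load_oneapi() {
    setvars="${1:-/opt/intel/oneapi/setvars.sh}"
    if [ "${SETVARS_COMPLETED:-0}" = "1" ]; then
        echo "oneAPI environment already loaded"
        return 0
    fi
    if [ ! -f "$setvars" ]; then
        echo "setvars.sh not found at $setvars" >&2
        return 1
    fi
    # Clear the function's positional parameters so setvars.sh does not
    # pick them up through "$@" when it is sourced.
    set --
    . "$setvars"
}
```

Usage: call load_oneapi with no arguments to use the default path above, or pass another install location as the first argument.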
List hardware
clinfo -l
-------->
Platform #0: Intel(R) FPGA Emulation Platform for OpenCL(TM)
`-- Device #0: Intel(R) FPGA Emulation Device
Platform #1: Intel(R) OpenCL
`-- Device #0: Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz
Platform #2: Intel(R) OpenCL HD Graphics
+-- Device #0: Intel(R) Graphics [0x020a]
`-- Device #1: Intel(R) Graphics [0x020a]
or
sycl-ls
-------->
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2022.13.3.0.16_160000]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz 3.0 [2022.13.3.0.16_160000]
[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) Graphics [0x020a] 3.0 [22.18.023111]
[opencl:gpu:3] Intel(R) OpenCL HD Graphics, Intel(R) Graphics [0x020a] 3.0 [22.18.023111]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Graphics [0x020a] 1.3 [1.3.23111]
[ext_oneapi_level_zero:gpu:1] Intel(R) Level-Zero, Intel(R) Graphics [0x020a] 1.3 [1.3.23111]
[host:host:0] SYCL host platform, SYCL host device 1.2 [1.2]
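The bracketed prefix on each sycl-ls line has the form backend:device_type:index. As a rough sketch, the output can be reduced to a count of devices per backend and type. The function name count_sycl_devices is hypothetical, and the pattern assumes the 2022-era line format shown above; other releases may print a different prefix.

```shell
# Hypothetical helper: count devices per "backend:type" from sycl-ls output.
# Reads sycl-ls output on stdin and prints "<backend>:<type> <count>".
count_sycl_devices() {
    # Keep only the "backend:type" part of each "[backend:type:index]" prefix.
    sed -n 's/^\[\([^:]*:[^:]*\):[0-9]*].*/\1/p' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}
```

Usage: sycl-ls | count_sycl_devices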
GPU utilization
sudo /usr/bin/intel_gpu_top -d pci:card=1
sudo /usr/bin/intel_gpu_top -d pci:card=2
-------->
intel-gpu-top - 0/ 0 MHz; 0% RC6; 0 irqs/s
IMC reads: ------ (null)/s
IMC writes: ------ (null)/s
ENGINE BUSY MI_SEMA MI_WAIT
Blitter/0 0.00% | | 0% 0%
Blitter/1 0.00% | | 0% 0%
Video/0 0.00% | | 0% 0%
Video/1 0.00% | | 0% 0%
Video/2 0.00% | | 0% 0%
Video/3 0.00% | | 0% 0%
Video/4 0.00% | | 0% 0%
Video/5 0.00% | | 0% 0%
Video/6 0.00% | | 0% 0%
Video/7 0.00% | | 0% 0%
Video/8 0.00% | | 0% 0%
Video/9 0.00% | | 0% 0%
Video/10 0.00% | | 0% 0%
Video/11 0.00% | | 0% 0%
Video/12 0.00% | | 0% 0%
Video/13 0.00% | | 0% 0%
VideoEnhance/0 0.00% | | 0% 0%
VideoEnhance/1 0.00% | | 0% 0%
VideoEnhance/2 0.00% | | 0% 0%
VideoEnhance/3 0.00% | | 0% 0%
VideoEnhance/4 0.00% | | 0% 0%
VideoEnhance/5 0.00% | | 0% 0%
VideoEnhance/6 0.00% | | 0% 0%
VideoEnhance/7 0.00% | | 0% 0%
Compute/0 0.00% | | 0% 0%
Compute/1 0.00% | | 0% 0%
Compute/2 0.00% | | 0% 0%
Compute/3 0.00% | | 0% 0%
Compute/4 0.00% | | 0% 0%
Compute/5 0.00% | | 0% 0%
Compute/6 0.00% | | 0% 0%
Compute/7 0.00% | | 0% 0%
PID NAME
Other query commands
clinfo | grep -E 'units|frequency|memory size'
-------->
Max compute units 64
Max clock frequency 2900MHz
Global memory size 270139891712 (251.6GiB)
Local memory size 262144 (256KiB)
Max compute units 64
Max clock frequency 2900MHz
Global memory size 270139891712 (251.6GiB)
Local memory size 32768 (32KiB)
Max compute units 960
Max clock frequency 1400MHz
Global memory size 32482365440 (30.25GiB)
Local memory size 65536 (64KiB)
Max compute units 960
Max clock frequency 1400MHz
Global memory size 32482365440 (30.25GiB)
Local memory size 65536 (64KiB)
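The filtered output repeats the same four properties once per device, which appears to follow the platform order reported by clinfo -l (FPGA emulator, CPU, then the two GPUs). A small sketch that condenses each group into one line is given below. The function name summarize_clinfo is hypothetical, and it assumes every device block starts with a "Max compute units" line, as in the output above.

```shell
# Hypothetical helper: one summary line per device from the filtered clinfo output.
# Reads lines on stdin; a "Max compute units" line starts a new device.
summarize_clinfo() {
    awk '
        function emit() {
            # Sizes are reported in bytes; convert to GiB and KiB.
            printf "units=%s clock=%s global=%.2fGiB local=%dKiB\n",
                   units, freq, gmem / 2^30, lmem / 1024
        }
        /Max compute units/   { if (seen) emit(); units = $NF; seen = 1 }
        /Max clock frequency/ { freq = $NF }
        /Global memory size/  { gmem = $4 }
        /Local memory size/   { lmem = $4 }
        END                   { if (seen) emit() }
    '
}
```

Usage: clinfo | grep -E 'units|frequency|memory size' | summarize_clinfo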