## P2P DMA sample

### Hardware requirements
1) AMD GPU, gfx9 or newer
2) NetInt Quadra Encoder (T1, T2, T4)

### Dependencies

#### Build tools

1) CMake (>= 3.14) : download from https://cmake.org/download/ and install
2) pkg-config      : `sudo apt install pkg-config`

#### Vulkan Loader and ICD

3) Vulkan SDK     : Get from https://vulkan.lunarg.com/sdk/home
4) RADV, AMDVLK (>= 21.40), or PRO (>= 21.40)

#### NetInt software stack

5) extract NetInt’s V3.0.0_RC2 package
```
v3.0.0/Quadra_SW_V3.0.0_RC2/
-------------------------------/dma-buf
-------------------------------/libxcoder
...
```

6) build NetInt dma-buf kernel driver and install
``` bash
cd v3.0.0/Quadra_SW_V3.0.0_RC2/dma-buf
make -j
sudo insmod netint.ko # Need to redo it after reboot
```

7) build NetInt libxcoder
``` bash
cd v3.0.0/Quadra_SW_V3.0.0_RC2/libxcoder
./configure && make install # build output goes to the `build` directory
```

8) initialize NetInt encoder card resources
``` bash
sudo ./build/init_rsrc # Need to redo it after reboot
```

9) extract NetInt’s v3.0.0_RC2 release, and install firmware
```
v3.0.0/
----------/quadra_quick_installer.sh
...
```

``` bash
./quadra_quick_installer.sh # follow the prompt to install firmware
```
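
Because the `insmod` step has to be repeated after every reboot, it can be useful to check whether the driver is currently loaded. The following is a minimal sketch and not part of the NetInt package. The module name `netint` is an assumption derived from the file name `netint.ko`, and `module_listed` is a hypothetical helper.

```bash
# Sketch: check whether a kernel module shows up in `lsmod`-style output.
# Assumption: the module name is "netint", derived from the file name netint.ko.

# Reads `lsmod` output on stdin; succeeds if the first column of any
# non-header line equals the name given as $1.
module_listed() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit found ? 0 : 1 }'
}

if command -v lsmod >/dev/null 2>&1 && lsmod | module_listed netint; then
    echo "netint module is loaded"
else
    echo "netint module is not loaded; repeat the insmod step"
fi
```

If the module is reported as missing, steps 6 and 8 above need to be run again.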

### Build 
1) add the libxcoder pkg-config directory (the one containing `xcoder.pc`) to the pkg-config search path
``` bash
export PKG_CONFIG_PATH=<libxcoder_install_path>/lib/pkgconfig:$PKG_CONFIG_PATH
```

2) set up the Vulkan SDK environment (PATH and related variables)
``` bash
source <VULKAN_SDK_PATH>/setup-env.sh
```

3) run cmake and build the sample
``` bash
cmake -DBUILD_NETINT_ENCODING_SUPPORT=ON -DCUBE_WSI_SELECTION=OFFSCREEN -Bcmake-build-offscreen -S.
cd cmake-build-offscreen && make
```
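
If the cmake step cannot find libxcoder or the Vulkan SDK, the environment set up in steps 1 and 2 can be verified with a small check such as the sketch below. It assumes the pkg-config package is named `xcoder` (after `xcoder.pc`) and that `setup-env.sh` exports `VULKAN_SDK`, as the LunarG SDK script does. `have_pkg` and `vulkan_sdk_configured` are hypothetical helper names.

```bash
# Sketch: pre-build checks for the pkg-config and Vulkan SDK environment.
# Assumption: the package name "xcoder" is taken from the file name xcoder.pc.

# Succeeds if pkg-config is installed and can resolve the given package.
have_pkg() {
    command -v pkg-config >/dev/null 2>&1 && pkg-config --exists "$1"
}

# Succeeds if VULKAN_SDK is set and points at an existing directory.
vulkan_sdk_configured() {
    [ -n "${VULKAN_SDK:-}" ] && [ -d "$VULKAN_SDK" ]
}

if have_pkg xcoder; then
    echo "xcoder $(pkg-config --modversion xcoder) found"
else
    echo "xcoder not found; PKG_CONFIG_PATH=${PKG_CONFIG_PATH:-<unset>}"
fi

if vulkan_sdk_configured; then
    echo "VULKAN_SDK=$VULKAN_SDK"
else
    echo "VULKAN_SDK is not set; source setup-env.sh first"
fi
```

Run it in the same shell session that will run cmake, since both settings are per-shell environment variables.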

### Run
1) start single process
``` bash
# Accessing NetInt card resources requires sudo privileges.
sudo ./vkcubepp --encoding_mode 4 # Refer to ./vkcubepp --help for the complete argument list
```

2) start multiple processes
``` bash
sudo ../scripts/run_multiple_encoding.py --frames <frames_num_per_instance> --instance <instance_num>
```

3) merge latency data
``` bash
# "log_directory" is the output directory of the command above, a directory named by timestamp.
# "outlier_threshold" is a floating-point Z-score threshold for latencies; it controls whether
# a specific frame is considered an outlier and printed to stdout.
./scripts/merge_instance_latency.py --log <log_directory> -z <outlier_threshold>
```
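
To illustrate what a Z-score threshold means, the sketch below flags samples whose Z-score magnitude exceeds a given threshold. It assumes the textbook definition z = (x - mean) / stddev with the population standard deviation. Whether `merge_instance_latency.py` computes it exactly this way is not stated here, and `zscore_outliers` is a hypothetical helper, not part of the sample.

```bash
# Sketch: flag Z-score outliers in a list of latency samples.
# Assumption: z = (x - mean) / population standard deviation.

# Reads one numeric sample per line from stdin and prints
# "<index> <value> <z-score>" for every sample with |z| > threshold ($1).
zscore_outliers() {
    awk -v thr="$1" '
        { x[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit
            mean = sum / NR
            for (i = 1; i <= NR; i++) ss += (x[i] - mean) ^ 2
            sd = sqrt(ss / NR)
            if (sd == 0) exit   # all samples identical, no outliers
            for (i = 1; i <= NR; i++) {
                z = (x[i] - mean) / sd
                if (z > thr || z < -thr) printf "%d %s %.2f\n", i, x[i], z
            }
        }'
}

# Ten made-up latencies; only the last one is far from the rest.
printf '%s\n' 10 11 9 10 12 10 11 9 10 40 | zscore_outliers 2.5
# prints: 10 40 2.99
```

A lower threshold reports more frames as outliers, a higher one reports fewer.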


## Appendix (GPU to GPU P2P)

Apart from GPU to NetInt DMA and encoding, the P2P DMA sample also supports DMA
transfer between two GPUs: a frame is rendered on one GPU and then presented on another GPU
through the X11 protocol.

This functionality has the following additional requirements:

### Hardware requirements

- two gfx9+ AMD GPUs
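
One way to check how many AMD GPUs the Vulkan loader can see is to count the devices that `vulkaninfo --summary` reports with AMD's PCI vendor ID `0x1002`. This is only a sketch: it does not verify the gfx9+ requirement, a GPU may be listed more than once if several AMD Vulkan drivers are installed, and `count_amd_gpus` is a hypothetical helper.

```bash
# Sketch: count AMD devices in `vulkaninfo --summary`-style text read from stdin.
# 0x1002 is AMD's PCI vendor ID.
count_amd_gpus() {
    # grep -c prints the count even when it is 0, but then exits non-zero.
    grep -ci 'vendorID[[:space:]]*=[[:space:]]*0x1002' || true
}

if command -v vulkaninfo >/dev/null 2>&1; then
    n=$(vulkaninfo --summary 2>/dev/null | count_amd_gpus)
    if [ "$n" -ge 2 ]; then
        echo "found $n AMD Vulkan devices"
    else
        echo "found $n AMD Vulkan device(s); GPU to GPU P2P needs two"
    fi
else
    echo "vulkaninfo not found; source the Vulkan SDK setup-env.sh first"
fi
```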

### Dependencies

- `sudo apt install libx11-dev`

 
### Build

``` bash
cmake -DBUILD_WSI_XCB_SUPPORT=ON -DCUBE_WSI_SELECTION=XCB -Bcmake-build-x11 -S.
cd cmake-build-x11 && make
```

### Run
1) start Xorg
`sudo Xorg`

2) run sample
``` bash
export DISPLAY=<..>
./vkcubepp --encoding_mode <0-3>
# Check ./vkcubepp --help for the meaning of each encoding mode number
```
