130 files changed, 23434 insertions, 5697 deletions
diff --git a/.gitignore b/.gitignore
index ed1d024..5172b61 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,16 +4,26 @@ compile_commands.json
 tests/sibs-build/
 tests/compile_commands.json
 
-external/wlr-export-dmabuf-unstable-v1-client-protocol.h
-external/wlr-export-dmabuf-unstable-v1-protocol.c
+**/xdg-output-unstable-v1-client-protocol.h
+**/xdg-output-unstable-v1-protocol.c
 
 .clangd/
 .cache/
 .vscode/
 
+build/
+debug-build/
+
 *.o
 gpu-screen-recorder
 gsr-kms-server
 
 *.mp4
 *.flv
+*.mkv
+*.mov
+*.webm
+*.ts
+*.jpg
+*.jpeg
+*.png
diff --git a/README.md b/README.md
index 32cf998..d41b3e4 100644
--- a/README.md
+++ b/README.md
@@ -1,143 +1,203 @@
+![](https://dec05eba.com/images/gpu_screen_recorder_logo_small.png)
+
 # GPU Screen Recorder
-This is a screen recorder that has minimal impact on system performance by recording a window using the GPU only,
+This is a screen recorder that has minimal impact on system performance by recording your monitor using the GPU only,
 similar to shadowplay on windows. This is the fastest screen recording tool for Linux.
 
 This screen recorder can be used for recording your desktop offline, for live streaming and for nvidia shadowplay-like instant replay,
-where only the last few seconds are saved.
+where only the last few minutes are saved.
+
+This software can also take screenshots.
+
+This is a CLI-only tool. If you want a UI for it, check out [GPU Screen Recorder GTK](https://git.dec05eba.com/gpu-screen-recorder-gtk/), or if you prefer a ShadowPlay-like UI, check out [GPU Screen Recorder UI](https://git.dec05eba.com/gpu-screen-recorder-ui/).
+
+Supported video codecs:
+* H264 (default)
+* HEVC (Optionally with HDR)
+* AV1 (Optionally with HDR. Not currently supported on NVIDIA if you use GPU Screen Recorder flatpak)
+* VP8
+* VP9
+
+Supported audio codecs:
+* Opus (default)
+* AAC
 
-## Note
-This software works with x11 and wayland, but when using AMD/Intel or Wayland then only monitors can be recorded.\
-GPU Screen Recorder only supports h264 and hevc codecs at the moment which means that webm files are not supported.\
-CPU usage may be higher on wayland than on x11 when using nvidia.
+Supported image formats:
+* JPEG
+* PNG
+
+This software works on X11 and Wayland on AMD, Intel and NVIDIA. Replay data is stored in RAM by default but there is an option to store it on disk instead.
 ### TEMPORARY ISSUES
-1) screen-direct capture has been temporary disabled as it causes issues with stuttering. This might be a nvfbc bug.
-2) Recording the monitor on steam deck might fail sometimes. This happens even when using ffmpeg directly. This might be a steam deck driver bug. Recording a single window doesn't have this issue.
-3) Videos created on AMD/Intel are in variable framerate format. Use MPV to play such videos, otherwise you might experience stuttering in the video if you are using a buggy video player. Try saving the video into a .mkv file instead when using AMD/Intel, as some software may have better support for .mkv files (such as kdenlive).
+1) Videos are in variable framerate format. Use MPV to play such videos, otherwise you might experience stuttering in the video if you are using a buggy video player. You can try saving the video into a .mkv file instead as some software may have better support for .mkv files (such as kdenlive). You can use the "-fm cfr" option to use constant framerate mode.
+2) FLAC audio codec is disabled at the moment because of temporary issues.
 ### AMD/Intel/Wayland root permission
-When recording a window under AMD/Intel no special user permission is required, however when recording a monitor (or when using wayland) the program needs root permission (to access KMS).\
-To make this safer, the part that needs root access has been moved to its own executable (to make it as small as possible).\
-For you as a user this only means that if you installed GPU Screen Recorder as a flatpak then a prompt asking for root password will show up when you start recording.
+When recording a window or when using the `-w portal` option no special user permission is required,
+however when recording a monitor the program needs root permission (to access KMS).\
+This is safe in GPU Screen Recorder as the part that needs root access has been moved to its own small program that only does one thing.\
+For you as a user this only means that if you installed GPU Screen Recorder as a flatpak then a prompt asking for root password will show up once when you start recording.
 # Performance
 On a system with a i5 4690k CPU and a GTX 1080 GPU:\
 When recording Legend of Zelda Breath of the Wild at 4k, fps drops from 30 to 7 when using OBS Studio + nvenc, however when using this screen recorder the fps remains at 30.\
-When recording GTA V at 4k on highest settings, fps drops from 60 to 23 when using obs-nvfbc + nvenc, however when using this screen recorder the fps only drops to 58. The quality is also much better when using gpu screen recorder.\
-It is recommended to save the video to a SSD because of the large file size, which a slow HDD might not be fast enough to handle.\
-Note that if you have a very powerful CPU and a not so powerful GPU and play a game that is bottlenecked by your GPU and barely uses your CPU then a CPU based screen recording (such as OBS with libx264 instead of nvenc) might perform slightly better than GPU Screen Recorder. At least on NVIDIA.
+When recording GTA V at 4k on highest settings, fps drops from 60 to 23 when using obs-nvfbc + nvenc, however when using this screen recorder the fps only drops to 58.\
+GPU Screen Recorder also produces much smoother videos than OBS when GPU utilization is close to 100%, see comparison here: [https://www.youtube.com/watch?v=zfj4sNVLLLg](https://www.youtube.com/watch?v=zfj4sNVLLLg) and [https://www.youtube.com/watch?v=aK67RSZw2ZQ](https://www.youtube.com/watch?v=aK67RSZw2ZQ).\
+GPU Screen Recorder has much better performance than OBS Studio even with version 30.2 that does "zero-copy" recording and encoding, see: [https://www.youtube.com/watch?v=jdroRjibsDw](https://www.youtube.com/watch?v=jdroRjibsDw).\
+It is recommended to save the video to a SSD because of the large file size, which a slow HDD might not be fast enough to handle. Using variable framerate mode (-fm vfr) which is the default is also recommended as this reduces encoding load. Ultra quality is also overkill most of the time, very high (the default) or lower quality is usually enough.\
+Note that for best performance you should close other screen recorders such as OBS Studio when using GPU Screen Recorder even if they are not recording, since they can affect performance even when idle. This is the case with OBS Studio.
 ## Note about optimal performance on NVIDIA
 NVIDIA driver has a "feature" (read: bug) where it will downclock memory transfer rate when a program uses cuda (or nvenc, which uses cuda), such as GPU Screen Recorder. To work around this bug, GPU Screen Recorder can overclock your GPU memory transfer rate to it's normal optimal level.\
 To enable overclocking for optimal performance use the `-oc` option when running GPU Screen Recorder. You also need to have "Coolbits" NVIDIA X setting set to "12" to enable overclocking. You can automatically add this option if you run `sudo nvidia-xconfig --cool-bits=12` and then reboot your computer.\
 Note that this only works when Xorg server is running as root, and using this option will only give you a performance boost if the game you are recording is bottlenecked by your GPU.\
 Note! use at your own risk!
-## Note about optimal performance on AMD/Intel
-Performance is the same when recording a single window or the monitor, however in some cases, such as when gpu usage is 100%, the video capture rate might be slower than the games fps when recording a single window instead of a monitor. Recording the monitor instead is recommended in such cases.
+# VRR/G-SYNC
+This should work fine on AMD/Intel X11 or Wayland. On Nvidia X11 G-SYNC only works with the -w screen-direct option, but because of bugs in the Nvidia driver this option is not always recommended.
+For example it can cause your computer to freeze when recording certain games.
 
 # Installation
-If you are running an Arch Linux based distro, then you can find gpu screen recorder on aur under the name gpu-screen-recorder-git (`yay -S gpu-screen-recorder-git`).\
+If you are running an Arch Linux based distro then you can find gpu screen recorder on aur under the name gpu-screen-recorder (`yay -S gpu-screen-recorder`).\
 If you are running another distro then you can run `sudo ./install.sh`, but you need to manually install the dependencies, as described below.\
 You can also install gpu screen recorder ([the gtk gui version](https://git.dec05eba.com/gpu-screen-recorder-gtk/)) from [flathub](https://flathub.org/apps/details/com.dec05eba.gpu_screen_recorder), which is the easiest method
 to install GPU Screen Recorder on non-arch based distros.\
-The only official ways to install GPU Screen Recorder is either from source, AUR or flathub. If you install GPU Screen Recorder from somewhere else and have an issue then try installing it
-from one of the official sources before reporting it as an issue.
 If you install GPU Screen Recorder flatpak, which is the gtk gui version then you can still run GPU Screen Recorder command line by using the flatpak command option, for example `flatpak run --command=gpu-screen-recorder com.dec05eba.gpu_screen_recorder -w screen -f 60 -o video.mp4`. Note that if you want to record your monitor on AMD/Intel then you need to install the flatpak system-wide (like so: `flatpak install flathub --system com.dec05eba.gpu_screen_recorder`).
 
+## Unofficial install methods
+The only official ways to install GPU Screen Recorder are from source, AUR or flathub. Other sources may be out of date and missing features or may not work correctly.\
+If you install GPU Screen Recorder from somewhere else and have an issue then try installing it from one of the official sources before reporting it as an issue.\
+If you still prefer to install GPU Screen Recorder with a package manager instead of from source or as a flatpak then you may be able to find a package for your distro.\
+Here are some known unofficial packages:
+* Debian/Ubuntu: [Pacstall](https://pacstall.dev/packages/gpu-screen-recorder)
+* Nix: [NixOS wiki](https://wiki.nixos.org/wiki/Gpu-screen-recorder)
+* openSUSE: [openSUSE software repository](https://software.opensuse.org/package/gpu-screen-recorder)
+* Fedora: [Copr](https://copr.fedorainfracloud.org/coprs/brycensranch/gpu-screen-recorder-git/)
+* OpenMandriva: [gpu-screen-recorder](https://github.com/OpenMandrivaAssociation/gpu-screen-recorder)
+* Solus: [gpu-screen-recorder](https://github.com/getsolus/packages/tree/main/packages/g/gpu-screen-recorder)
+* Nobara: [Nobara wiki](https://wiki.nobaraproject.org/en/general-usage/additional-software/GPU-Screen-Recorder)
+
 # Dependencies
-## AMD
-libglvnd (which provides libgl and libegl)\
-mesa\
-ffmpeg (libavcodec, libavformat, libavutil, libswresample, libavfilter)\
-x11 (libx11, libxcomposite, libxrandr)\
-libpulse\
-vaapi (libva, libva-mesa-driver)\
-libdrm\
-libcap\
-wayland-client
-## Intel
-libglvnd (which provides libgl and libegl)\
-mesa\
-ffmpeg (libavcodec, libavformat, libavutil, libswresample, libavfilter)\
-x11 (libx11, libxcomposite, libxrandr)\
-libpulse\
-vaapi (libva, libva-intel-driver)\
-libdrm\
-libcap\
-wayland-client
-## NVIDIA
-libglvnd (which provides libgl and libegl)\
-ffmpeg (libavcodec, libavformat, libavutil, libswresample, libavfilter)\
-x11 (libx11, libxcomposite, libxrandr)\
-libpulse\
-cuda runtime (libcuda.so.1) (libnvidia-compute)\
-nvenc (libnvidia-encode)\
-libva\
-libdrm\
-libcap\
-wayland-client\
-nvfbc (libnvidia-fbc1, when recording the screen on x11)\
-xnvctrl (libxnvctrl0, when using the `-oc` option)
+GPU Screen Recorder uses meson build system so you need to install `meson` to build GPU Screen Recorder.
+
+## Build dependencies
+These are the dependencies needed to build GPU Screen Recorder:
+
+* vulkan-headers
+* ffmpeg (libavcodec, libavformat, libavutil, libswresample, libavfilter)
+* x11 (libx11, libxcomposite, libxrandr, libxfixes, libxdamage)
+* libpulse
+* libva (and libva-drm)
+* libdrm
+* libcap
+* wayland (wayland-client, wayland-egl, wayland-scanner)
+
+## Runtime dependencies
+* libglvnd (which provides libgl, libglx and libegl) is needed. Your system needs to support at least OpenGL ES 3.0 (released in 2012)
+
+There are also additional dependencies needed at runtime depending on your GPU vendor:
+
+### AMD
+* mesa
+* vaapi (libva-mesa-driver)
+
+### Intel
+* mesa
+* vaapi (intel-media-driver/libva-intel-driver/linux-firmware-intel, depending on which intel iGPU you have)
+
+### NVIDIA
+* cuda runtime (libcuda.so.1) (libnvidia-compute)
+* nvenc (libnvidia-encode)
+* nvfbc (libnvidia-fbc1, when recording the screen on x11)
+* xnvctrl (libxnvctrl0, when using the `-oc` option)
+
+## Optional dependencies
+When compiling GPU Screen Recorder with portal support (`-Dportal=true`, which is enabled by default) these dependencies are also needed:
+* libdbus
+* libpipewire (and libspa which is usually part of libpipewire)
 
 # How to use
-Run `gpu-screen-recorder --help` to see all options.
+Run `gpu-screen-recorder --help` to see all options and also examples.\
+There is also a gui for the gpu screen recorder called [GPU Screen Recorder GTK](https://git.dec05eba.com/gpu-screen-recorder-gtk/).\
+There is also a new alternative UI for GPU Screen Recorder in the style of ShadowPlay called [GPU Screen Recorder UI](https://git.dec05eba.com/gpu-screen-recorder-ui/).
 ## Recording
-Here is an example of how to record all monitors and the default audio output: `gpu-screen-recorder -w screen -f 60 -a "$(pactl get-default-sink).monitor" -o ~/Videos/test_video.mp4` then stop the screen recorder with `Ctrl+C`, which will also save the recording. You can record a single monitor if you change `-w screen` to the name of a monitor, which you can find if you run the `xrandr`. An example of a monitor name is HDMI-1.
+Here is an example of how to record your monitor and the default audio output: `gpu-screen-recorder -w screen -f 60 -a default_output -o ~/Videos/test_video.mp4`.
+You can stop and save the recording with `Ctrl+C` or by running `pkill -SIGINT -f gpu-screen-recorder`.
+You can see a list of capture options to record if you run `gpu-screen-recorder --list-capture-options`. This will list possible capture options and monitor names, for example:\
+```
+  window
+  DP-1|1920x1080
+```
+in this case you could record a window or a monitor with the name `DP-1`.\
+To list available audio devices that you can use you can run `gpu-screen-recorder --list-audio-devices` and the name to use is on the left side of the `|`.\
+To list available audio application names that you can use you can run `gpu-screen-recorder --list-application-audio`.
 ## Streaming
-Streaming works the same as recording, but the `-o` argument should be path to the live streaming service you want to use (including your live streaming key). Take a look at scripts/twitch-stream.sh to see an example of how to stream to twitch.
+Streaming works the same way as recording, but the `-o` argument should be the path to the live streaming service you want to use (including your live streaming key). Take a look at `scripts/twitch-stream.sh` to see an example of how to stream to twitch.\
+GPU Screen Recorder uses FFmpeg so GPU Screen Recorder supports all protocols that FFmpeg supports.\
+If you want to reduce latency one thing you can do is to use the `-keyint` option, for example `-keyint 0.5`. Lower value means lower latency at the cost of increased bitrate/decreased quality.
 ## Replay mode
 Run `gpu-screen-recorder` with the `-c mp4` and `-r` option, for example: `gpu-screen-recorder -w screen -f 60 -r 30 -c mp4 -o ~/Videos`. Note that in this case, `-o` should point to a directory.\
-If `-mf yes` is set, replays are save in folders based on the date.
-To save a video in replay mode, you need to send signal SIGUSR1 to gpu screen recorder. You can do this by running `killall -SIGUSR1 gpu-screen-recorder`.\
-To stop recording, send SIGINT to gpu screen recorder. You can do this by running `killall gpu-screen-recorder` or pressing `Ctrl-C` in the terminal that runs gpu screen recorder.\
-The file path to the saved replay is output to stdout. All other output from GPU Screen Recorder is output to stderr.
-## Finding audio device name
-You can find the default output audio device (headset, speakers (in other words, desktop audio)) with the command `pactl get-default-sink`. Add `monitor` to the end of that to use that as an audio input in gpu screen recorder.\
-You can find the default input audio device (microphone) with the command `pactl get-default-source`. This input should not have `monitor` added to the end when used in gpu screen recorder.\
-Example of recording both desktop audio and microphone: `gpu-screen-recorder -w screen -f 60 -a "$(pactl get-default-sink).monitor" -a "$(pactl get-default-source)" -o ~/Videos/test_video.mp4`.\
-A name (that is visible to pipewire) can be given to an audio input device by prefixing the audio input with `<name>/`, for example `dummy/$(pactl get-default-sink).monitor`.\
-Note that if you use multiple audio inputs then they are each recorded into separate audio tracks in the video file. If you want to merge multiple audio inputs into one audio track then separate the audio inputs by "|" in one -a argument,
-for example `-a "$(pactl get-default-sink).monitor|$(pactl get-default-source)"`.
-
-There is also a gui for the gpu screen recorder called [gpu-screen-recorder-gtk](https://git.dec05eba.com/gpu-screen-recorder-gtk/).
+If `-df yes` is set, replays are saved in folders based on the date.
+The file path to the saved replay is output to stdout. All other output from GPU Screen Recorder is output to stderr.
+You can also use the `-sc` option to specify a script that should be run (asynchronously) when the video has been saved and the script will have access to the location of the saved file as its first argument.
+This can be used for example to show a notification when a replay has been saved, to rename the video with a title that matches the game played (see `scripts/record-save-application-name.sh` as an example on how to do this on X11) or to re-encode the video.\
+The replay buffer is stored in RAM (as encoded video), so don't use too long a replay time and/or too high a video quality unless you have enough RAM to store it.
+## Recording while using replay/streaming
+You can record a regular video while using replay/streaming by launching GPU Screen Recorder with the `-ro` option to specify a directory where to save the recording.\
+To start/stop (and save) recording use the SIGRTMIN signal, for example `pkill -SIGRTMIN -f gpu-screen-recorder`. The name of the video will be displayed in stdout when saving the video.\
+This way of recording while using replay/streaming is more efficient than running GPU Screen Recorder multiple times since this way it only records the screen and encodes the video once.
+## Controlling GPU Screen Recorder remotely
+To save a video in replay mode, you need to send signal SIGUSR1 to gpu screen recorder. You can do this by running `pkill -SIGUSR1 -f gpu-screen-recorder`.\
+To stop recording send SIGINT to gpu screen recorder. You can do this by running `pkill -SIGINT -f gpu-screen-recorder` or pressing `Ctrl-C` in the terminal that runs gpu screen recorder. When recording a regular non-replay video this will also save the video.\
+To pause/unpause recording send SIGUSR2 to gpu screen recorder. You can do this by running `pkill -SIGUSR2 -f gpu-screen-recorder`. This is only applicable and useful when recording (not streaming nor replay).\
+There are more signals to control GPU Screen Recorder. Run `gpu-screen-recorder --help` to list them all (under `NOTES` section).
 ## Simple way to run replay without gui
 Run the script `scripts/start-replay.sh` to start replay and then `scripts/save-replay.sh` to save a replay and `scripts/stop-replay.sh` to stop the replay. The videos are saved to `$HOME/Videos`.
 You can use these scripts to start replay at system startup if you add `scripts/start-replay.sh` to startup (this can be done differently depending on your desktop environment / window manager) and then go into
 hotkey settings on your system and choose a hotkey to run the script `scripts/save-replay.sh`. Modify `scripts/start-replay.sh` if you want to use other replay options.
 ## Run replay on system startup
-If you are running a distro that uses systemd then the `install.sh` script installs `extra/gpu-screen-recorder.service` on the system and that systemd service can be started with `systemctl enable --now --user gpu-screen-recorder`
-and it's configured with `$HOME/.config/gpu-screen-recorder.env` (create it if it doesn't exist).
-You can see which variables that you can use in the `gpu-screen-recorder.env` file by looking at the `extra/gpu-screen-recorder.service` file. In general you only need to set the `WINDOW` variable to a monitor to make it work.
-You can use the `scripts/save-replay.sh` script to save a replay and by default the systemd service saves files in `$HOME/Videos`.\
-If you are using a NVIDIA GPU then it's recommended to set PreserveVideoMemoryAllocations=1 as mentioned in the section below.
-## Issues
-### NVIDIA
-Nvidia drivers have an issue where CUDA breaks if CUDA is running when suspend/hibernation happens, and it remains broken until you reload the nvidia driver. To fix this, either disable suspend or tell the NVIDIA driver to preserve video memory on suspend/hibernate by using the `NVreg_PreserveVideoMemoryAllocations=1` option. You can run `sudo extra/install_preserve_video_memory.sh` to automatically add that option to your system.
+If you installed GPU Screen Recorder from AUR or from source and you are running a distro that uses systemd then you will have a systemd service installed that can be started with `systemctl enable --now --user gpu-screen-recorder`. This systemd service runs GPU Screen Recorder on system startup.\
+It's configured with `$HOME/.config/gpu-screen-recorder.env` (create it if it doesn't exist). You can look at [extra/gpu-screen-recorder.env](https://git.dec05eba.com/gpu-screen-recorder/plain/extra/gpu-screen-recorder.env) to see an example.
+You can see which variables you can use in the `gpu-screen-recorder.env` file by looking at the `extra/gpu-screen-recorder.service` file. Note that all of the variables are optional, you only have to set the ones that you are interested in.
+You can use the `scripts/save-replay.sh` script to save a replay and by default the systemd service saves videos in `$HOME/Videos`.
+# Issues
+## NVIDIA
+Nvidia drivers have an issue where CUDA breaks if CUDA is running when suspend/hibernation happens, and it remains broken until you reload the nvidia driver. `extra/gsr-nvidia.conf` will be installed by default when you install GPU Screen Recorder and that should fix this issue. If this doesn't fix the issue for you then your distro may use a different path for modprobe files. In that case you have to install that `extra/gsr-nvidia.conf` yourself into that location.
+You have to reboot your computer after installing GPU Screen Recorder for the first time for the fix to have any effect.
+# Examples
+Look at the [scripts](https://git.dec05eba.com/gpu-screen-recorder/tree/scripts) directory for script examples, for example a script that automatically saves a recording/replay into a folder with the same name as the game you are recording.
 
-# Reporting bugs/contributing patches
-See [https://git.dec05eba.com/?p=about](https://git.dec05eba.com/?p=about)
+# Reporting bugs, contributing patches, questions or donation
+See [https://git.dec05eba.com/?p=about](https://git.dec05eba.com/?p=about).
 
 # Demo
 [![Click here to watch a demo video on youtube](https://img.youtube.com/vi/n5tm0g01n6A/0.jpg)](https://www.youtube.com/watch?v=n5tm0g01n6A)
 
 # FAQ
-## How is this different from using OBS with nvenc?
-OBS only uses the gpu for video encoding, but the window image that is encoded is copied from the GPU to the CPU and then back to the GPU (video encoding unit). These operations are very slow and causes all of the fps drops when using OBS. OBS only uses the GPU efficiently on Windows 10 and Nvidia.\
-This gpu screen recorder keeps the window image on the GPU and sends it directly to the video encoding unit on the GPU by using CUDA. This means that CPU usage remains at around 0% when using this screen recorder.
-## How is this different from using OBS NvFBC plugin?
-The plugin does everything on the GPU and gives the texture to OBS, but OBS does not know how to use the texture directly on the GPU so it copies the texture to the CPU and then back to the GPU (video encoding unit). These operations are very slow and causes a lot of fps drops unless you have a fast CPU. This is especially noticable when recording at higher resolutions than 1080p.
-## How is this different from using FFMPEG with x11grab and nvenc?
-FFMPEG only uses the GPU with CUDA when doing transcoding from an input video to an output video, and not when recording the screen when using x11grab. So FFMPEG has the same fps drop issues that OBS has.
 ## It tells me that my AMD/Intel GPU is not supported or that my GPU doesn't support h264/hevc, but that's not true!
-Some linux distros (such as manjaro) disable hardware accelerated h264/hevc on AMD/Intel because of "patent license issues". If you are using an arch-based distro then you can install mesa-git instead of mesa and if you are using another distro then you may have to switch to a better distro.
+Some linux distros (such as manjaro and fedora) disable hardware accelerated h264/hevc on AMD/Intel because of "patent license issues". If you are using an arch-based distro then you can install mesa-git instead of mesa and if you are using another distro then you may have to switch to a better distro. On fedora based distros you can follow this: [Hardware Accelerated Codec](https://rpmfusion.org/Howto/Multimedia).\
+If you installed GPU Screen Recorder flatpak then you can try installing mesa-extra freedesktop runtime by running this command: `flatpak install --system org.freedesktop.Platform.GL.default//23.08-extra`
 ## I have an old nvidia GPU that supports nvenc but I get a cuda error when trying to record
 Newer ffmpeg versions don't support older nvidia cards. Try installing GPU Screen Recorder flatpak from [flathub](https://flathub.org/apps/details/com.dec05eba.gpu_screen_recorder) instead. It comes with an older ffmpeg version which might work for your GPU.
 ## I get a black screen/glitches while live streaming
-It seems like ffmpeg earlier than version 6.1 has some type of bug. Install ffmpeg 6.1 (ffmpeg-git in aur, ffmpeg in the offical repositories hasn't been updated yet) and then reinstall GPU Screen Recorder.
-
-# Donations
-If you want to donate you can donate via bitcoin or monero.
-* Bitcoin: bc1qqvuqnwrdyppf707ge27fqz2n9y9gu7lf5ypyuf
-* Monero: 4An9kp2qW1C9Gah7ewv4JzcNFQ5TAX7ineGCqXWK6vQnhsGGcRpNgcn8r9EC3tMcgY7vqCKs3nSRXhejMHBaGvFdN2egYet
-
-# TODO
-* Dynamically change bitrate/resolution to match desired fps. This would be helpful when streaming for example, where the encode output speed also depends on upload speed to the streaming service.
-* Show cursor when recording a window. Currently the cursor is only visible when recording a monitor.
-* Implement opengl injection to capture texture. This fixes VRR without having to use NvFBC direct capture.
-* Always use direct capture with NvFBC once the capture issue in mpv fullscreen has been resolved (maybe detect if direct capture fails in nvfbc and switch to non-direct recording. NvFBC says if direct capture fails).
+It seems like ffmpeg earlier than version 6.1 has some type of bug. Install ffmpeg version 6.1 or later and then reinstall GPU Screen Recorder to fix this issue. The flatpak version of GPU Screen Recorder comes with a newer version of ffmpeg so no extra steps are needed.
+## I can't play the video in my browser directly or in discord
+Browsers and discord don't support hevc video codec at the moment. Choose h264 video codec instead with the -k h264 option.
+Note that websites such as youtube support hevc so there is no need to choose h264 video codec if you intend to upload the video to youtube or if you want to play the video locally or if you intend to
+edit the video with a video editor. Hevc allows for better video quality (especially at lower file sizes) so hevc (or av1) is recommended for source videos.
+## I get a black bar/distorted colors on the sides in the video
+This is mostly an issue on AMD. For av1 it's a hardware issue, see: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9185. For hevc it's a software issue in the AMD driver that hasn't been fixed yet. This issue happens at certain video resolutions. If you get this issue then a workaround is to record with h264 video codec instead (using the -k h264 option).
+## The video doesn't display or has a green/yellow overlay
+This can happen if your video player is missing the H264/HEVC video codecs. Either install the codecs or use mpv.
+## I get stutter in the video
+Try recording to an SSD and make sure it's not using NTFS file system. Also record in variable framerate format.
+## The colors look washed out when recording a monitor with HDR enabled
+You have to either record in HDR mode (-k `hevc_hdr` or -k `av1_hdr` option) to record an HDR video or record with desktop portal option (`-w portal`) to turn the HDR recording into SDR.
+## GPU Screen Recorder records night light when recording in HDR mode
+You can record with desktop portal option (`-w portal`) instead which ignores night light, if you are ok with recording without HDR.
+## Kdenlive says that the video is not usable for editing because it has variable frame rate
+To fix this you can either press cancel, which will allow you to continue, or record the video in .mkv format or in constant frame rate mode (-fm cfr). I recommend recording the video in .mkv format and variable frame rate (-fm vfr).
+## Colors look incorrect when recording HDR (with hevc_hdr/av1_hdr) or using an ICC profile
+KDE Plasma version 6.2 broke HDR and ICC profiles for screen recorders. This was changed in KDE plasma version 6.3 and recording HDR works now, as long as you set HDR brightness to 100% (which means setting "Maximum SDR Brightness" in KDE plasma display settings to 203) and set color accuracy to "Prefer color accuracy". If you want to convert HDR to SDR then record with desktop portal option (`-w portal`) instead.
+I don't know how well recording HDR works in wayland compositors other than KDE plasma.
+## GPU Screen Recorder starts lagging after 30-40 minutes when launching GPU Screen Recorder from steam command launcher
+This is a [steam issue](https://github.com/ValveSoftware/steam-for-linux/issues/11446). Prepend the gpu-screen-recorder command with `LD_PREFIX=""`, for example `LD_PREFIX="" gpu-screen-recorder -w screen -o video.mp4`.
+## The video isn't smooth when gpu usage is 100%
+If you are using the flatpak version of GPU Screen Recorder then try installing GPU Screen Recorder from a non-flatpak source instead (such as from aur or from source). Flatpak has a limitation that prevents GPU Screen Recorder from running faster when playing very heavy games.
+## How do I apply audio effects, such as noise suppression?
+You have to use external software for that, such as Easy Effects or NoiseTorch.
diff --git a/TODO b/TODO
index 97a741d..d552616 100644
--- a/TODO
+++ b/TODO
@@ -2,25 +2,19 @@ Check for reparent.
 Quickly changing workspace and back while recording under i3 breaks the screen recorder. i3 probably unmaps windows in other workspaces.
 See https://trac.ffmpeg.org/wiki/EncodingForStreamingSites for optimizing streaming.
 Look at VK_EXT_external_memory_dma_buf.
-Allow setting a different output resolution than the input resolution.
 Use mov+faststart.
 Allow recording all monitors/selected monitor without nvfbc by recording the compositor proxy window and only recording the part that matches the monitor(s).
-Allow recording a region by recording the compositor proxy window / nvfbc window and copying part of it.
-Use nvenc directly, which allows removing the use of cuda.
-Handle xrandr monitor change in nvfbc.
-Implement follow focused in drm.
 Support amf and qsv.
 Disable flipping on nvidia? this might fix some stuttering issues on some setups. See NvCtrlGetAttribute/NvCtrlSetAttributeAndGetStatus NV_CTRL_SYNC_TO_VBLANK https://github.com/NVIDIA/nvidia-settings/blob/d5f022976368cbceb2f20b838ddb0bf992f0cfb9/src/gtk%2B-2.x/ctkopengl.c.
 Replays seem to have some issues with audio/video. Why?
 Cleanup unused gl/egl functions, macro, etc.
-Add option to disable overlapping of replays (the old behavior kinda. Remove the whole replay buffer data after saving when doing this).
 Set audio track name to audio device name (if not merge of multiple audio devices).
 Add support for webcam, but only really for amd/intel because amd/intel can get drm fd access to webcam, nvidia cant. This allows us to create an opengl texture directly from the webcam fd for optimal performance.
 Reverse engineer nvapi so we can disable "force p2 state" on linux too (nvapi profile api with the settings id 0x50166c5e).
 Support yuv444p on amd/intel.
 fix yuv444 for hevc.
 Do not allow streaming if yuv444.
-Re-enable yuv444.
+Re-enable yuv444 and allow yuv444 for software encoding. Good for remote desktop. But for remote desktop it's more ideal to use yuv420 and when the image is not moving then send a png image instead, for clear image when the image is static.
 Support 10 bit output because of better gradients. May even be smaller file size. Better supported on hevc (not supported at all on h264 on my gpu).
 Add nvidia/(amd/intel) specific install script for ubuntu. User should run install_ubuntu.sh but it should run different install dep script depending on if /proc/driver/nvidia/version exists or not. But what about switchable graphics setup?
 Test different combinations of switchable graphics. Intel hybrid mode (running intel but possible to run specific applications with prime-run), running pure intel. Detect switchable graphics.
@@ -31,19 +25,8 @@ https://djdallmann.github.io/GamingPCSetup/CONTENT/RESEARCH/FINDINGS/registrykey
 The video output will be black if if the system is suspended on nvidia and NVreg_PreserveVideoMemoryAllocations is not set to 1. This happens because I think that the driver invalidates textures/cuda buffers? To fix this we could try and recreate gsr capture when gsr_capture_capture fails (with timeout to retry again).
 
 NVreg_RegistryDwords.
-Restore nvfbc screen recording on monitor reconfiguration.
 Window capture doesn't work properly in _control_ game after going from pause menu to in-game (and back to pause menu). There might be some x11 event we need to catch. Same for vr-video-player.
 
-Fix constant framerate not working properly on amd/intel because capture framerate gets locked to the same framerate as game framerate, which doesn't work well when you need to encode multiple duplicate frames. We can skip multiple encode if we duplicate frame once and then use that same frame data as the difference between frames will be exactly the same, but hevc complains about that. Is there a way to make hevc shut up?
-
-Properly handle monitor reconfiguration (kms vaapi, nvfbc).
-
-Better configure vaapi. The file size is too large.
-
-Clear vaapi surface (for focused window).
-
-Window capture performance on steam deck isn't good when playing the witcher 3 for example. The capture itself is fine but video encoding puts it to 30fps even if the game runs at 57 fps.
-
 Monitor capture on steam deck is slightly below the game fps, but only when capturing on the steam deck screen. If capturing on another monitor, there is no issue.
     Is this related to the dma buf rotation issue? different modifier being slow? does this always happen?
 
@@ -54,29 +37,16 @@ Intel is a bit weird with monitor capture and multiple monitors. If one of the m
     Is that only the case when the primary monitor is rotated? Also the primary monitor becomes position 0, 0 so crtc (x11 randr) position doesn't match the drm pos. Maybe get monitor position and size from drm instead.
     How about if multiple monitors are rotated?
 
-Support vp8/vp9. This is especially important on amd which on some distros (such as Manjaro) where hardware accelerated h264/hevc is disabled in the mesa package.
-
 Support screen (all monitors) capture on amd/intel and nvidia wayland when no combined plane is found. Right now screen just takes the first output.
 Use separate plane (which has offset and pitch) from combined plane instead of the combined plane.
 
 Both twitch and youtube support variable bitrate but twitch recommends constant bitrate to reduce stream buffering/dropped frames when going from low motion to high motion: https://help.twitch.tv/s/article/broadcasting-guidelines?language=en_US. Info for youtube: https://support.google.com/youtube/answer/2853702?hl=en#zippy=%2Cvariable-bitrate-with-custom-stream-keys-in-live-control-room%2Ck-p-fps%2Cp-fps.
 
-Limit fps recording with x damage. This is good when running replay mode 24/7 and being afk or when not much is happening on the screen.
-
 On nvidia some games apparently causes the game to appear to stutter (without dropping fps) when recording a monitor but not using
     when using direct screen capture. Observed in Deus Ex and Apex Legends.
 
-Support "screen" (all monitors) capture on wayland. This should be done by getting all drm fds and multiple EGL_DMA_BUF_PLANEX_FD_EXT to create one egl image with all fds combined.
-
-Support pipewire screen capture?
-Support screen rotation.
-When nvidia supports hardware cursor then capture the cursor. Right now the cursor is captured because it's a software cursor so it's composed on the dma buf.
-CPU usage is pretty high on AMD/Intel/(Nvidia(wayland)), why? opening and closing fds, creating egl, cuda association, is slow when done every frame. Test if desktop portal screencast has better performance.
-
 Capture is broken on amd on wlroots. It's disabled at the moment and instead uses kms capture. Find out why we get a black screen in wlroots.
 
-First video and audio frame should be posted immediately instead of waiting 1000/fps milliseconds, to improve latency for remote desktop future functionality.
-
 Support vulkan video encoding. That might workaround forced p2 state nvidia driver "bug". Ffmpeg supports vulkan video encoding if it's encoding with --enable-vulkan
 
 It may be possible to improve color conversion rgb->yuv shader for color edges by biasing colors to an edge, instead of letting color overlaying with bilinear filtering handle it.
@@ -84,8 +54,6 @@ It may be possible to improve color conversion rgb->yuv shader for color edges b
 When webcam is supported mention that nvidia_drm.modeset=1 must be set on nvidia x11 (it's required on wayland so it's not needed there. Or does eglstream work without it??). Check if this really is the case.
   Support green screen removal, cropping, shader effects in general (circle mask, rounded corners, etc).
 
-Use vfr on nvidia x11 as well, otherwise network data could slow it down to below target fps and mess it up.
-
 Preset is set to p5 for now but it should ideally be p6 or p7.
     This change is needed because for certain sizes of a window (or monitor?) such as 971x780 causes encoding to freeze
     when using h264 codec. This is a new(?) nvidia driver bug.
@@ -94,15 +62,247 @@ Preset is set to p5 for now but it should ideally be p6 or p7.
 For low latency, see https://developer.download.nvidia.com/compute/nvenc/v4.0/NVENC_VideoEncoder_API_ProgGuide.pdf (section 7.1).
 Remove follow focused option.
 
-Overclocking (-oc) can overclock too much on some systems. Maybe remove the option?
-
 Exit if X11/Wayland killed (if drm plane dead or something?)
 
 Use SRC_W and SRC_H for screen plane instead of crtc_w and crtc_h.
 
-Make it possible to select which /dev/dri/card* to use, but that requires opengl to also use the same card. Not sure if that is possible for amd, intel and nvidia without using vulkan instead.
-
-Support I915_FORMAT_MOD_Y_TILED_CCS (and other power saving modifiers, see https://trac.ffmpeg.org/ticket/8542). The only fix may be to use desktop portal for recording. This issue doesn't appear on x11 since these modifiers are not used by xorg server.
-
 Test if p2 state can be worked around by using pure nvenc api and overwriting cuInit/cuCtxCreate* to not do anything. Cuda might be loaded when using nvenc but it might not be used, with certain record options? (such as h264 p5).
     nvenc uses cuda when using b frames and rgb->yuv conversion, so convert the image ourselves instead.-
+
+Drop frames if live streaming cant keep up with target fps, or dynamically change resolution/quality.
+
+Support low power option.
+
+Instead of sending a big list of drm data back to kms client, send the monitor we want to record to kms server and the server should respond with only the matching monitor, and cursor.
+
+Tonemap hdr to sdr when hdr is enabled and when hevc_hdr/av1_hdr is not used.
+
+Add 10 bit record option, h264_10bit, hevc_10bit and av1_10bit.
+
+Rotate cursor texture properly (around top left origin).
+
+Setup hardware video context so we can query constraints and capabilities for better default and better error messages.
+
+Use CAP_SYS_NICE in flatpak too on the main gpu screen recorder binary. It makes recording smoother, especially with constant framerate.
+
+Modify ffmpeg to accept opengl texture for nvenc encoding. Removes extra buffers and copies.
+
+When vulkan encode is added, mention minimum nvidia driver required. (550.54.14?).
+
+Support drm plane rotation. Neither X11 nor any Wayland compositor currently rotates drm planes so this might not be needed.
+
+Investigate if there is a way to do gpu->gpu copy directly without touching system ram to enable video encoding on a different gpu. On nvidia this is possible with cudaMemcpyPeer, but how about from an intel/amd gpu to an nvidia gpu or the other way around or any combination of iGPU and dedicated GPU?
+    Maybe something with clEnqueueMigrateMemObjects? on AMD something with DirectGMA maybe?
+
+Go back to using pure vaapi without opengl for video encoding? rotation (transpose) can be done if its done after (rgb to yuv) color conversion.
+
+Use lanczos resampling for better scaling quality. Lanczos resampling can also be used for YUV chroma for better color quality on small text.
+
+Flac is disabled because the frame sizes are too large which causes big audio/video desync.
+
+Enable b-frames.
+
+Support vfr matching games exact fps all the time. On x11 use damage tracking, on wayland? maybe there is drm plane damage tracking. But that may not be accurate as the compositor may update it every monitor hz anyways. On wayland maybe only support it for desktop portal + pipewire capture.
+    Another method to track damage that works regardless of the display server would be to do a diff between frames with a shader.
+
+Support selecting which gpu to use. This can be done in egl with eglQueryDevicesEXT and then eglGetPlatformDisplayEXT. This will automatically work on AMD and Intel as vaapi uses the same device. On nvidia we need to use eglQueryDeviceAttribEXT with EGL_CUDA_DEVICE_NV.
+    Maybe on glx (nvidia x11 nvfbc) we need to use __NV_PRIME_RENDER_OFFLOAD, __NV_PRIME_RENDER_OFFLOAD_PROVIDER, __GLX_VENDOR_LIBRARY_NAME, __VK_LAYER_NV_optimus, VK_ICD_FILENAMES instead. Just look at prime-run /usr/bin/prime-run.
+
+When adding support for steam deck, add option to send video to another computer.
+New gpu screen recorder gui should have the option to cut the video directly, maybe running an ffmpeg command or implementing that ourselves. Only support gpu screen recorder video files.
+
+Check if is software renderer by using eglQueryDisplayAttribEXT(egl_display, EGL_DEVICE_EXT..) eglQueryDeviceStringEXT(egl_device, EGL_EXTENSIONS) and check for "EGL_MESA_device_software".
+
+Use MapTexture2DINTEL for software encoding on intel.
+
+To test vulkan encode on amd set the environment variable RADV_PERFTEST=video_encode before running a program that uses vulkan encode (or queries for it, such as vulkaninfo).
+
+Support hevc/av1 for software encoder and hdr support at the same time. Need support for yuv420p shader for that. Use libx265 for hevc and libsvtav1 for av1 (libsvtav1 is the fastest software av1 video encoder). Also support vp8/vp9 since we are not limited by hardware.
+
+Cleanup pipewire code and add more error checks.
+
+Make dbus code and pipewire setup non blocking.
+
+Support portal (pipewire) hdr capture when pipewire adds support for it. Maybe use the result of SelectSources and then query the hdr metadata with drm.
+
+HDR support on x11?
+
+Move most kms data to kms client. We don't need root access for everything that is served from kms server right now, such as hdr metadata and drm plane properties. Only the drm plane fd really needs root access.
+
+Show rotated window size in monitor list when using incorrect monitor name.
+
+Desktop portal capture on kde plasma makes notifications not show up unless the notification is set as urgent. How to fix this? do we have to make our own notification system?
+
+Explicit sync is done with the drm property IN_FENCE_FD (see https://drmdb.emersion.fr/properties/4008636142/IN_FENCE_FD). Check if this needs to be used on wayland (especially on nvidia) when capturing a monitor directly without desktop portal.
+
+The update fps appears to be lower when recording a monitor instead of using portal on intel. Does this reflect in game framerate?
+
+Fix glitches when using prime-run with desktop portal. It happens when moving a window around. It's probably a syncing issue.
+
+Allow prime-run on x11 if monitor capture and the prime gpu is not nvidia.
+
+Enable 2-pass encoding.
+
+Restart replay/update video resolution if monitor resolution changes.
+
+Fix pure vaapi copy on intel.
+
+Use nvidia low latency options for better encoding times.
+
+Test ideal async_depth value. Increasing async_depth also increased gpu memory usage a lot (from 100mb to 500mb when moving from async_depth 2 to 16) at 4k resolution. Setting it to 8 increases it by 200mb which might be ok.
+
+Replace -encoder cpu with -k h264_software?
+
+Change vp8/vp9 quality options, right now the file size is too large (for vp9 at least at very_high quality).
+
+Support recording while in replay mode. This will be needed when enabling replay on system startup with systemd service and wanting to record a video besides that.
+    The harder and more bloat solution for this would be to make an IPC.
+    The simple solution would be to use SIGUSR2 for starting/stopping recording since SIGUSR2 is unused for replays. That would mean SIGUSR2 for pausing recording would be ignored.
+    It also means that the video will be created in the same directory as the replay (or have option to specify another location for that) but the filename would have to be generated automatically.
+    To rename the file you would have to use -sc to rename it with a script, or add an option to provide a template for the name.
+
+Dynamically change bitrate/resolution to match desired fps. This would be helpful when streaming for example, where the encode output speed also depends on upload speed to the streaming service.
+Implement opengl injection to capture texture. This fixes VRR without having to use NvFBC direct capture and also allows perfect frame timing.
+Always use direct capture with NvFBC once the capture issue in mpv fullscreen has been resolved (maybe detect if direct capture fails in nvfbc and switch to non-direct recording. NvFBC says if direct capture fails).
+
+Support ROI (AV_FRAME_DATA_REGIONS_OF_INTEREST).
+
+Default to hevc if capture size is larger than 4096 in width or height.
+
+Set low latency mode on vulkan encoding.
+
+Support recording/replay/livestreaming at the same time by allowing commands to be run on an existing gpu screen recorder instance.
+
+Test if `xrandr --output DP-1 --scale 1.5` captures correct size on nvidia.
+
+Fix cursor position and scale when scaling x11 display.
+
+Support application audio recording without pipewire combined sink.
+
+Support transposing (rotating) with vaapi. This isn't supported on many devices with rgb buffer, but its supported with nv12 buffer (on intel at least).
+
+Cleanup pipewire_audio.c (proper error handling and memory cleanup of proxies).
+
+Hide application audio module-null-sink by using sink_properties=media.class="Audio/Sink/Internal".
+
+Improve software encoding performance.
+
+Add option to record audio from the recorded window only.
+
+Add option to automatically select best video codec available. Add -k best, -k best_10bit and -k best_hdr.
+
+Use wayland color management protocol when it's available: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/14.
+
+Use different exit codes for different errors. Use one for invalid -w option, another one for invalid -a option for audio devices, etc. This is to make UI error reporting better.
+    Document these exit codes in an exit code .md file, or finally create a manpage where this can be documented.
+
+Ffmpeg fixed black bars in videos on amd when using hevc and when recording at some resolutions, such as 1080p:
+    https://github.com/FFmpeg/FFmpeg/commit/bcfbf2bac8f9eeeedc407b40596f5c7aaa0d5b47
+    https://github.com/FFmpeg/FFmpeg/commit/d0facac679faf45d3356dff2e2cb382580d7a521
+    Disable gpu screen recorder black bar handling when using hevc on amd when the libavcodec version is the one that comes after those commits.
+    Also consider the mesa version, to see if the gpu supports this.
+    The version is libavcodec >= 61.28.100
+
+Use opengl compute shader instead of graphics shader. This might allow for better performance when games are using 100% of graphics unit which might fix issue with 100% gpu usage causing gpu screen recorder to run slow when not using vaapi to convert rgb to nv12(?).
+
+Always disable prime run/dri prime and list all monitors to record from all cards.
+    Do this instead of adding an option to choose which gpu to use.
+    On X11 the primary gpu will always have the framebuffer for all monitors combined.
+        Use randr to list all monitors and always record and encode with the primary gpu.
+    On Wayland each gpu will have its own list of monitors with framebuffers.
+        Iterate through all cards with drm and list all monitors with associated framebuffers and when choosing a monitor to record
+        automatically use the associated gpu card.
+
+Allow flv av1 if recent ffmpeg version and streaming to youtube (and twitch?) and for custom services.
+Use explicit sync in pipewire video code: https://docs.pipewire.org/page_dma_buf.html.
+
+Support vaapi rotation. Support for it is added in mesa here: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32919.
+
+Replay (and recording?) fails to save properly sometimes (especially for long videos). This is noticeable with mp4 files since they get corrupted and become unplayable.
+    The entire video does seem to get saved (it's a large video file) and it seems to have the correct headers but it's not playable.
+
+Make it possible to save a shorter replay clip remotely. Maybe implement ipc first, to then also allow starting recording/stream while a replay is running.
+
+Add an option to pass http headers when streaming. Some streaming services require streaming keys to be passed in a http header instead of in the url as a parameter.
+
+When adding vulkan video support add VK_VIDEO_ENCODE_TUNING_MODE_LOW_LATENCY_KHR.
+
+Implement screenshot without invoking opengl (which is slow to start on some systems).
+
+Automatically use desktop portal on wayland when hdr is enabled (or night light) by checking if kms hdr metadata exists, if hdr video codec is not used.
+    Or maybe do this in the ui?
+
+Detect if cached portal session token is no longer valid (this can happen if the user switches to another wayland compositor).
+
+Support reconnecting (and setting things up again) if the audio server is restarted (for both device recording and app recording).
+
+Find out how nvidia-smi fixes nvenc not working on opensuse and do that ourselves instead of relying on nvidia-smi that is not always installed.
+
+Pulseaudio code: add "running" variable to loops to allow stopping the running code when quitting.
+
+Scale screenshot frame with libswscale or implement lanczos shader for improved scaling for video as well.
+
+Support high quality scaling with -s by using lanczos.
+
+Support spanning multiple monitors with region capture. This would also allow the user to record multiple monitors at the same time, the same way screen-direct works on nvidia x11.
+
+When webcam support is added also support v4l2loopback? this is done by using avdevice_register_all(); and -c v4l2 -o /dev/video0; but it needs to output raw data as well instead of h264 and possibly yuv420p. Maybe add a -k yuv420p option to do that or -k rgb.
+    This would be implemented by outputting the raw data directly into the output file, without using the video encoder.
+
+Do proper exit, to call gsr_capture_destroy which will properly stop gsr-kms-server. Otherwise there can be zombie gsr-kms-server on error.
+
+Replace all scissors with clearing textures if the cursor hits the outside of the frame image.
+
+Cursor position might be slightly wrong on rotated monitor.
+
+External texture doesn't work on nvidia x11, probably because of glx context (requires gles es). External texture is not used on nvidia x11 right now so it's not an issue.
+
+Add option to save replay buffer on disk instead of ram.
+
+nvfbc capture cursor with cursor.h instead and composite that on top. This allows us to also always get a cursor in direct capture mode. This could possibly give better performance as well.
+
+Maybe remove external shader code and make a simple external to internal texture converter (compute shader), to reduce texture sampling. Maybe this is faster?
+
+Fix opengl context broken after suspend on nvidia by using this: https://registry.khronos.org/OpenGL/extensions/NV/NV_robustness_video_memory_purge.txt requires glx context creation flags and GetGraphicsResetStatusARB() == PURGED_CONTEXT_RESET_NV check to recreate all graphics.
+
+HDR looks incorrect, brightest point gets cut off.
+
+Make "screen" capture the preferred monitor.
+
+When webcam support is added add the option to add it as a second video track, to make it easier to edit in video editors.
+
+Fix constant framerate not working properly on amd/intel because capture framerate gets locked to the same framerate as
+    game framerate, which doesn't work well when you need to encode multiple duplicate frames (AMD/Intel is slow at encoding!).
+    It also appears to skip audio frames on nvidia wayland? why? that should be fine, but it causes video stuttering because of audio/video sync.
+
+Add option to pass a fd (from socketpair) to use for rpc. In the rpc have a common header, with protocol version, data type and data in an enum.
+
+Add the option to set audio track name, for example with -a "track-name:blabla|device:default_output|app:firefox"
+
+Maybe disable qp/vbr for replay. In that case we can preallocate all replay data (for both ram and disk) and write to that directly when receiving packet (dont do that when also recording at the same time).
+    That could improve performance/disk write optimization and maybe even reduce ram usage because of less blocks/fragmentation.
+
+When rpc is added add the option to add/remove audio devices/app audio and also overlays (from new capture sources).
+
+Support hdr screenshot.
+
+Recreate opengl context on loss. This can happen if there is a gpu driver bug, causing context to need to be recreated. This is a nice improvement to not break recording even with buggy driver.
+
+Support saving video with surround sound. Surround sound audio capture does work, but it gets downmixed to stereo.
+
+Add (render) plugin support. To simplify it (and possibly best performance) create one rgba texture (with the size of the output video) that is used across all plugins.
+    Create a framebuffer and set this texture as the target and set the framebuffer as active before calling the plugins.
+    Then the plugins can render simply by doing simple opengl draw functions.
+    Maybe send some metadata to the plugin, such as video (and framebuffer) size. Although this data can be retrieved from the active framebuffer.
+
+Either support webcam with raw yuyv, mapping the buffer directly to opengl. Or use mjpeg, mapping the buffer directly to vaapi jpeg decoder and then map the decoded buffer to opengl.
+    Some webcams dont support raw yuyv and many webcams support higher framerates for mjpeg.
+
+Allow medium, high, very_high and ultra quality for -bm cbr. If that is used then it will automatically estimate the best bitrate for that quality based on resolution and fps.
+    Maybe do this in the ui instead (or both?), to show estimated file size.
+
+Maybe remove shader compute code. It doesn't seem necessary anymore now that glSwapBuffer/glFinish isn't used. dbus server isn't needed anymore either, the code can be moved back to the gpu screen recorder process.
+
+Add proper check if opengl functions are supported. dlsym for the symbol will return a no-op function if it's not supported, so it silently fails if used.
+
+Colors are offset to bottom left by 1 pixel or so on steam deck in landscape mode.
diff --git a/build.sh b/build.sh
deleted file mode 100755
index 5cf9954..0000000
--- a/build.sh
+++ /dev/null
@@ -1,57 +0,0 @@
-#!/bin/sh -e
-
-script_dir=$(dirname "$0")
-cd "$script_dir"
-
-CC=${CC:-gcc}
-CXX=${CXX:-g++}
-
-opts="-O2 -g0 -DNDEBUG -Wall -Wextra -Wshadow"
-[ -n "$DEBUG" ] && opts="-O0 -g3 -Wall -Wextra -Wshadow";
-
-build_wayland_protocol() {
-    wayland-scanner private-code external/wlr-export-dmabuf-unstable-v1.xml external/wlr-export-dmabuf-unstable-v1-protocol.c
-    wayland-scanner client-header external/wlr-export-dmabuf-unstable-v1.xml external/wlr-export-dmabuf-unstable-v1-client-protocol.h
-}
-
-build_gsr_kms_server() {
-    # TODO: -fcf-protection=full, not supported on arm
-    extra_opts="-fstack-protector-all"
-    dependencies="libdrm"
-    includes="$(pkg-config --cflags $dependencies)"
-    libs="$(pkg-config --libs $dependencies) -ldl"
-    $CC -c kms/server/kms_server.c $opts $extra_opts $includes
-    $CC -o gsr-kms-server kms_server.o $libs $opts $extra_opts
-}
-
-build_gsr() {
-    dependencies="libavcodec libavformat libavutil x11 xcomposite xrandr libpulse libswresample libavfilter libva libcap libdrm wayland-egl wayland-client"
-    includes="$(pkg-config --cflags $dependencies)"
-    libs="$(pkg-config --libs $dependencies) -ldl -pthread -lm"
-    $CC -c src/capture/capture.c $opts $includes
-    $CC -c src/capture/nvfbc.c $opts $includes
-    $CC -c src/capture/xcomposite_cuda.c $opts $includes
-    $CC -c src/capture/xcomposite_vaapi.c $opts $includes
-    $CC -c src/capture/kms_vaapi.c $opts $includes
-    $CC -c src/capture/kms_cuda.c $opts $includes
-    $CC -c kms/client/kms_client.c $opts $includes
-    $CC -c src/egl.c $opts $includes
-    $CC -c src/cuda.c $opts $includes
-    $CC -c src/xnvctrl.c $opts $includes
-    $CC -c src/overclock.c $opts $includes
-    $CC -c src/window_texture.c $opts $includes
-    $CC -c src/shader.c $opts $includes
-    $CC -c src/color_conversion.c $opts $includes
-    $CC -c src/utils.c $opts $includes
-    $CC -c src/library_loader.c $opts $includes
-    $CC -c external/wlr-export-dmabuf-unstable-v1-protocol.c $opts $includes
-    $CXX -c src/sound.cpp $opts $includes
-    $CXX -c src/main.cpp $opts $includes
-    $CXX -o gpu-screen-recorder capture.o nvfbc.o kms_client.o egl.o cuda.o xnvctrl.o overclock.o window_texture.o shader.o \
-        color_conversion.o utils.o library_loader.o xcomposite_cuda.o xcomposite_vaapi.o kms_vaapi.o kms_cuda.o wlr-export-dmabuf-unstable-v1-protocol.o sound.o main.o $libs $opts
-}
-
-build_wayland_protocol
-build_gsr_kms_server
-build_gsr
-echo "Successfully built gpu-screen-recorder"
diff --git a/dbus/client/dbus_client.c b/dbus/client/dbus_client.c
new file mode 100644
index 0000000..de2df62
--- /dev/null
+++ b/dbus/client/dbus_client.c
@@ -0,0 +1,269 @@
+#include "dbus_client.h"
+#include "../protocol.h"
+
+#include <sys/socket.h>
+#include <sys/wait.h>
+#include <sys/prctl.h>
+#include <unistd.h>
+#include <poll.h>
+
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+
+// TODO: Error checking for write/read
+
+static bool gsr_dbus_client_wait_for_startup(gsr_dbus_client *self) {
+    struct pollfd poll_fd = {
+        .fd = self->socket_pair[0],
+        .events = POLLIN,
+        .revents = 0
+    };
+    for(;;) {
+        int poll_res = poll(&poll_fd, 1, 100);
+        if(poll_res > 0 && (poll_fd.revents & POLLIN)) {
+            char msg;
+            read(self->socket_pair[0], &msg, 1);
+            return true;
+        } else {
+            int status = 0;
+            int wait_result = waitpid(self->pid, &status, WNOHANG);
+            if(wait_result != 0) {
+                int exit_code = -1;
+                if(WIFEXITED(status))
+                    exit_code = WEXITSTATUS(status);
+                fprintf(stderr, "gsr error: gsr_dbus_client_init: server died or never started, exit code: %d\n", exit_code);
+                self->pid = 0;
+                return false;
+            }
+        }
+    }
+}
+
+bool gsr_dbus_client_init(gsr_dbus_client *self, const char *screencast_restore_token) {
+    memset(self, 0, sizeof(*self));
+
+    if(socketpair(AF_UNIX, SOCK_STREAM, 0, self->socket_pair) == -1) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_init: socketpair failed, error: %s\n", strerror(errno));
+        return false;
+    }
+
+    if(screencast_restore_token) {
+        self->screencast_restore_token = strdup(screencast_restore_token);
+        if(!self->screencast_restore_token) {
+            fprintf(stderr, "gsr error: gsr_dbus_client_init: failed to clone restore token\n");
+            gsr_dbus_client_deinit(self);
+            return false;
+        }
+    }
+
+    self->pid = fork();
+    if(self->pid == -1) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_init: failed to fork process\n");
+        gsr_dbus_client_deinit(self);
+        return false;
+    } else if(self->pid == 0) { /* child */
+        char socket_pair_server_str[32];
+        snprintf(socket_pair_server_str, sizeof(socket_pair_server_str), "%d", self->socket_pair[1]);
+
+        /* Needed for NixOS for example, to make sure gsr-dbus-server doesn't inherit cap_sys_nice */
+        prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_CLEAR_ALL, 0, 0, 0);
+
+        const char *args[] = { "gsr-dbus-server", socket_pair_server_str, self->screencast_restore_token ? self->screencast_restore_token : "", NULL };
+        execvp(args[0], (char *const*)args);
+
+        fprintf(stderr, "gsr error: gsr_dbus_client_init: failed to launch \"gsr-dbus-server\", error: %s\n", strerror(errno));
+        _exit(127);
+    } else { /* parent */
+        if(!gsr_dbus_client_wait_for_startup(self)) {
+            gsr_dbus_client_deinit(self);
+            return false;
+        }
+    }
+
+    return true;
+}
+
+void gsr_dbus_client_deinit(gsr_dbus_client *self) {
+    for(int i = 0; i < 2; ++i) {
+        if(self->socket_pair[i] > 0) {
+            close(self->socket_pair[i]);
+            self->socket_pair[i] = -1;
+        }
+    }
+
+    if(self->screencast_restore_token) {
+        free(self->screencast_restore_token);
+        self->screencast_restore_token = NULL;
+    }
+
+    if(self->pid > 0) {
+        kill(self->pid, SIGKILL);
+        int status = 0;
+        waitpid(self->pid, &status, 0);
+        self->pid = 0;
+    }
+}
+
+int gsr_dbus_client_screencast_create_session(gsr_dbus_client *self, char *session_handle, size_t session_handle_size) {
+    const gsr_dbus_request_message request = {
+        .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+        .type = GSR_DBUS_MESSAGE_REQ_CREATE_SESSION,
+        .create_session = (gsr_dbus_message_req_create_session) {}
+    };
+    write(self->socket_pair[0], &request, sizeof(request));
+
+    gsr_dbus_response_message response = {0};
+    read(self->socket_pair[0], &response, sizeof(response));
+
+    if(response.protocol_version != GSR_DBUS_PROTOCOL_VERSION) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_create_session: server uses protocol version %d while the client is using protocol version %d\n", response.protocol_version, GSR_DBUS_PROTOCOL_VERSION);
+        return -1;
+    }
+
+    if(response.type == GSR_DBUS_MESSAGE_RESP_ERROR) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_create_session: server returned error: %s (%d)\n", response.error.message, (int)response.error.error_code);
+        return response.error.error_code;
+    }
+
+    if(response.type != GSR_DBUS_MESSAGE_RESP_CREATE_SESSION) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_create_session: received incorrect response type. Expected %d got %d\n", GSR_DBUS_MESSAGE_RESP_CREATE_SESSION, response.type);
+        return -1;
+    }
+
+    snprintf(session_handle, session_handle_size, "%s", response.create_session.session_handle);
+    return 0;
+}
+
+int gsr_dbus_client_screencast_select_sources(gsr_dbus_client *self, const char *session_handle, uint32_t capture_type, uint32_t cursor_mode) {
+    gsr_dbus_request_message request = {
+        .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+        .type = GSR_DBUS_MESSAGE_REQ_SELECT_SOURCES,
+        .select_sources = (gsr_dbus_message_req_select_sources) {
+            .capture_type = capture_type,
+            .cursor_mode = cursor_mode
+        }
+    };
+    snprintf(request.select_sources.session_handle, sizeof(request.select_sources.session_handle), "%s", session_handle);
+    write(self->socket_pair[0], &request, sizeof(request));
+
+    gsr_dbus_response_message response = {0};
+    read(self->socket_pair[0], &response, sizeof(response));
+
+    if(response.protocol_version != GSR_DBUS_PROTOCOL_VERSION) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_select_sources: server uses protocol version %d while the client is using protocol version %d\n", response.protocol_version, GSR_DBUS_PROTOCOL_VERSION);
+        return -1;
+    }
+
+    if(response.type == GSR_DBUS_MESSAGE_RESP_ERROR) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_select_sources: server returned error: %s (%d)\n", response.error.message, (int)response.error.error_code);
+        return response.error.error_code;
+    }
+
+    if(response.type != GSR_DBUS_MESSAGE_RESP_SELECT_SOURCES) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_select_sources: received incorrect response type. Expected %d got %d\n", GSR_DBUS_MESSAGE_RESP_SELECT_SOURCES, response.type);
+        return -1;
+    }
+
+    return 0;
+}
+
+int gsr_dbus_client_screencast_start(gsr_dbus_client *self, const char *session_handle, uint32_t *pipewire_node) {
+    *pipewire_node = 0;
+
+    gsr_dbus_request_message request = {
+        .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+        .type = GSR_DBUS_MESSAGE_REQ_START,
+        .start = (gsr_dbus_message_req_start) {}
+    };
+    snprintf(request.start.session_handle, sizeof(request.start.session_handle), "%s", session_handle);
+    write(self->socket_pair[0], &request, sizeof(request));
+
+    gsr_dbus_response_message response = {0};
+    read(self->socket_pair[0], &response, sizeof(response));
+
+    if(response.protocol_version != GSR_DBUS_PROTOCOL_VERSION) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_start: server uses protocol version %d while the client is using protocol version %d\n", response.protocol_version, GSR_DBUS_PROTOCOL_VERSION);
+        return -1;
+    }
+
+    if(response.type == GSR_DBUS_MESSAGE_RESP_ERROR) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_start: server returned error: %s (%d)\n", response.error.message, (int)response.error.error_code);
+        return response.error.error_code;
+    }
+
+    if(response.type != GSR_DBUS_MESSAGE_RESP_START) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_start: received incorrect response type. Expected %d got %d\n", GSR_DBUS_MESSAGE_RESP_START, response.type);
+        return -1;
+    }
+
+    /* Always store the latest restore token, even if none was set before (free(NULL) is a no-op) */
+    free(self->screencast_restore_token);
+    self->screencast_restore_token = NULL;
+    if(response.start.restore_token[0] != '\0') {
+        self->screencast_restore_token = strdup(response.start.restore_token);
+        /* If strdup fails the token is left as NULL */
+    }
+
+    *pipewire_node = response.start.pipewire_node;
+    return 0;
+}
+
+bool gsr_dbus_client_screencast_open_pipewire_remote(gsr_dbus_client *self, const char *session_handle, int *pipewire_fd) {
+    *pipewire_fd = 0;
+
+    gsr_dbus_request_message request = {
+        .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+        .type = GSR_DBUS_MESSAGE_REQ_OPEN_PIPEWIRE_REMOTE,
+        .open_pipewire_remote = (gsr_dbus_message_req_open_pipewire_remote) {}
+    };
+    snprintf(request.open_pipewire_remote.session_handle, sizeof(request.open_pipewire_remote.session_handle), "%s", session_handle);
+    write(self->socket_pair[0], &request, sizeof(request));
+
+    gsr_dbus_response_message response = {0};
+    struct iovec iov = {
+        .iov_base = &response,
+        .iov_len = sizeof(response)
+    };
+
+    char msg_control[CMSG_SPACE(sizeof(int))];
+
+    struct msghdr message = {
+        .msg_iov = &iov,
+        .msg_iovlen = 1,
+        .msg_control = msg_control,
+        .msg_controllen = sizeof(msg_control)
+    };
+
+    const ssize_t bytes_received = recvmsg(self->socket_pair[0], &message, MSG_WAITALL);
+    (void)bytes_received; /* TODO: Check for errors and short reads */
+
+    if(response.protocol_version != GSR_DBUS_PROTOCOL_VERSION) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_open_pipewire_remote: server uses protocol version %d while the client is using protocol version %d\n", response.protocol_version, GSR_DBUS_PROTOCOL_VERSION);
+        return false;
+    }
+
+    if(response.type == GSR_DBUS_MESSAGE_RESP_ERROR) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_open_pipewire_remote: server returned error: %s (%d)\n", response.error.message, (int)response.error.error_code);
+        return false;
+    }
+
+    if(response.type != GSR_DBUS_MESSAGE_RESP_OPEN_PIPEWIRE_REMOTE) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_open_pipewire_remote: received incorrect response type. Expected %d got %d\n", GSR_DBUS_MESSAGE_RESP_OPEN_PIPEWIRE_REMOTE, response.type);
+        return false;
+    }
+
+    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&message);
+    if(!cmsg || cmsg->cmsg_type != SCM_RIGHTS) {
+        fprintf(stderr, "gsr error: gsr_dbus_client_screencast_open_pipewire_remote: returned message data is missing file descriptor\n");
+        return false;
+    }
+
+    memcpy(pipewire_fd, CMSG_DATA(cmsg), sizeof(*pipewire_fd));
+    return true;
+}
+
+const char* gsr_dbus_client_screencast_get_restore_token(gsr_dbus_client *self) {
+    return self->screencast_restore_token;
+}
diff --git a/dbus/client/dbus_client.h b/dbus/client/dbus_client.h
new file mode 100644
index 0000000..98a1ecf
--- /dev/null
+++ b/dbus/client/dbus_client.h
@@ -0,0 +1,36 @@
+#ifndef GSR_DBUS_CLIENT_H
+#define GSR_DBUS_CLIENT_H
+
+/*
+    Using a client-server architecture is needed for dbus because cap_sys_nice breaks desktop portal.
+    The main binary has cap_sys_nice and we launch a new child process without it, which uses the desktop portal.
+*/
+
+#include "../portal.h"
+#include <stdbool.h>
+#include <stdint.h>
+#include <signal.h>
+
+typedef struct {
+    int socket_pair[2];
+    char *screencast_restore_token;
+    pid_t pid;
+} gsr_dbus_client;
+
+/* Blocking. TODO: Make non-blocking */
+bool gsr_dbus_client_init(gsr_dbus_client *self, const char *screencast_restore_token);
+void gsr_dbus_client_deinit(gsr_dbus_client *self);
+
+/* The following functions should be called in order to set up ScreenCast properly */
+/* The functions that return an int return the response status code */
+int gsr_dbus_client_screencast_create_session(gsr_dbus_client *self, char *session_handle, size_t session_handle_size);
+/*
+    |capture_type| is a bitmask of gsr_portal_capture_type values. gsr_portal_capture_type values that are not supported by the desktop portal will be ignored.
+    |cursor_mode| is a bitmask of gsr_portal_cursor_mode values. gsr_portal_cursor_mode values that are not supported will be ignored.
+*/
+int gsr_dbus_client_screencast_select_sources(gsr_dbus_client *self, const char *session_handle, uint32_t capture_type, uint32_t cursor_mode);
+int gsr_dbus_client_screencast_start(gsr_dbus_client *self, const char *session_handle, uint32_t *pipewire_node);
+bool gsr_dbus_client_screencast_open_pipewire_remote(gsr_dbus_client *self, const char *session_handle, int *pipewire_fd);
+const char* gsr_dbus_client_screencast_get_restore_token(gsr_dbus_client *self);
+
+#endif /* GSR_DBUS_CLIENT_H */
diff --git a/dbus/dbus_impl.c b/dbus/dbus_impl.c
new file mode 100644
index 0000000..600fcc5
--- /dev/null
+++ b/dbus/dbus_impl.c
@@ -0,0 +1,913 @@
+#include "dbus_impl.h"
+
+#include <sys/random.h>
+
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <assert.h>
+
+/* TODO: Make non-blocking when GPU Screen Recorder is turned into a library */
+/* TODO: Make sure responses match the requests */
+
+#define DESKTOP_PORTAL_SIGNAL_RULE "type='signal',interface='org.freedesktop.Portal.Request'"
+
+typedef enum {
+    DICT_TYPE_STRING,
+    DICT_TYPE_UINT32,
+    DICT_TYPE_BOOL,
+} dict_value_type;
+
+typedef struct {
+    const char *key;
+    dict_value_type value_type;
+    union {
+        char *str;
+        dbus_uint32_t u32;
+        dbus_bool_t boolean;
+    };
+} dict_entry;
+
+static bool generate_random_characters(char *buffer, int buffer_size, const char *alphabet, size_t alphabet_size) {
+    /* TODO: Use other functions on platforms other than Linux */
+    if(getrandom(buffer, buffer_size, 0) < buffer_size) {
+        fprintf(stderr, "Failed to get random bytes, error: %s\n", strerror(errno));
+        return false;
+    }
+
+    for(int i = 0; i < buffer_size; ++i) {
+        unsigned char c = *(unsigned char*)&buffer[i];
+        buffer[i] = alphabet[c % alphabet_size];
+    }
+
+    return true;
+}
+
+static bool generate_random_characters_standard_alphabet(char *buffer, int buffer_size) {
+    return generate_random_characters(buffer, buffer_size, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789", 62);
+}
+
+static const char* dict_value_type_to_string(dict_value_type type) {
+    switch(type) {
+        case DICT_TYPE_STRING: return "string";
+        case DICT_TYPE_UINT32: return "uint32";
+        case DICT_TYPE_BOOL:   return "boolean";
+    }
+    return "(unknown)";
+}
+
+bool gsr_dbus_init(gsr_dbus *self, const char *screencast_restore_token) {
+    memset(self, 0, sizeof(*self));
+    dbus_error_init(&self->err);
+
+    self->random_str[DBUS_RANDOM_STR_SIZE] = '\0';
+    if(!generate_random_characters_standard_alphabet(self->random_str, DBUS_RANDOM_STR_SIZE)) {
+        fprintf(stderr, "gsr error: gsr_dbus_init: failed to generate random string\n");
+        return false;
+    }
+
+    self->con = dbus_bus_get(DBUS_BUS_SESSION, &self->err);
+    if(dbus_error_is_set(&self->err)) {
+        fprintf(stderr, "gsr error: gsr_dbus_init: dbus_bus_get failed with error: %s\n", self->err.message);
+        return false;
+    }
+
+    if(!self->con) {
+        fprintf(stderr, "gsr error: gsr_dbus_init: failed to get dbus session\n");
+        return false;
+    }
+
+    /* TODO: Check the name */
+    const int ret = dbus_bus_request_name(self->con, "com.dec05eba.gpu_screen_recorder", DBUS_NAME_FLAG_REPLACE_EXISTING, &self->err);
+    if(dbus_error_is_set(&self->err)) {
+        fprintf(stderr, "gsr error: gsr_dbus_init: dbus_bus_request_name failed with error: %s\n", self->err.message);
+        gsr_dbus_deinit(self);
+        return false;
+    }
+
+    if(screencast_restore_token) {
+        self->screencast_restore_token = strdup(screencast_restore_token);
+        if(!self->screencast_restore_token) {
+            fprintf(stderr, "gsr error: gsr_dbus_init: failed to clone restore token\n");
+            gsr_dbus_deinit(self);
+            return false;
+        }
+    }
+
+    (void)ret;
+    // if(ret != DBUS_REQUEST_NAME_REPLY_PRIMARY_OWNER) {
+    //     fprintf(stderr, "gsr error: gsr_capture_portal_setup_dbus: dbus_bus_request_name failed to get primary owner\n");
+    //     return false;
+    // }
+
+    return true;
+}
+
+void gsr_dbus_deinit(gsr_dbus *self) {
+    if(self->screencast_restore_token) {
+        free(self->screencast_restore_token);
+        self->screencast_restore_token = NULL;
+    }
+
+    if(self->desktop_portal_rule_added) {
+        dbus_bus_remove_match(self->con, DESKTOP_PORTAL_SIGNAL_RULE, NULL);
+        // dbus_connection_flush(self->con);
+        self->desktop_portal_rule_added = false;
+    }
+
+    if(self->con) {
+        dbus_error_free(&self->err);
+
+        dbus_bus_release_name(self->con, "com.dec05eba.gpu_screen_recorder", NULL);
+
+        // Apparently shouldn't be used when a connection is set up by using dbus_bus_get
+        //dbus_connection_close(self->con);
+        dbus_connection_unref(self->con);
+        self->con = NULL;
+    }
+}
+
+static bool gsr_dbus_desktop_portal_get_property(gsr_dbus *self, const char *interface, const char *property_name, uint32_t *result) {
+    *result = 0;
+
+    DBusMessage *msg = dbus_message_new_method_call(
+        "org.freedesktop.portal.Desktop",    // target for the method call
+        "/org/freedesktop/portal/desktop",   // object to call on
+        "org.freedesktop.DBus.Properties",   // interface to call on
+        "Get");                              // method name
+    if(!msg) {
+        fprintf(stderr, "gsr error: gsr_dbus_desktop_portal_get_property: dbus_message_new_method_call failed\n");
+        return false;
+    }
+
+    DBusMessageIter it;
+    dbus_message_iter_init_append(msg, &it);
+
+    if(!dbus_message_iter_append_basic(&it, DBUS_TYPE_STRING, &interface)) {
+        fprintf(stderr, "gsr error: gsr_dbus_desktop_portal_get_property: failed to add interface\n");
+        dbus_message_unref(msg);
+        return false;
+    }
+
+    if(!dbus_message_iter_append_basic(&it, DBUS_TYPE_STRING, &property_name)) {
+        fprintf(stderr, "gsr error: gsr_dbus_desktop_portal_get_property: failed to add property_name\n");
+        dbus_message_unref(msg);
+        return false;
+    }
+
+    DBusPendingCall *pending = NULL;
+    if(!dbus_connection_send_with_reply(self->con, msg, &pending, -1) || !pending) { // -1 is default timeout
+        fprintf(stderr, "gsr error: gsr_dbus_desktop_portal_get_property: dbus_connection_send_with_reply failed\n");
+        dbus_message_unref(msg);
+        return false;
+    }
+    dbus_connection_flush(self->con);
+
+    //fprintf(stderr, "Request Sent\n");
+
+    dbus_message_unref(msg);
+    msg = NULL;
+
+    dbus_pending_call_block(pending);
+
+    msg = dbus_pending_call_steal_reply(pending);
+    if(!msg) {
+        fprintf(stderr, "gsr error: gsr_dbus_desktop_portal_get_property: dbus_pending_call_steal_reply failed\n");
+        dbus_pending_call_unref(pending);
+        /* msg is NULL here, so there is nothing to unref */
+        return false;
+    }
+
+    dbus_pending_call_unref(pending);
+    pending = NULL;
+
+    DBusMessageIter resp_args;
+    if(!dbus_message_iter_init(msg, &resp_args)) {
+        fprintf(stderr, "gsr error: gsr_dbus_desktop_portal_get_property: response message is missing arguments\n");
+        dbus_message_unref(msg);
+        return false;
+    } else if(DBUS_TYPE_UINT32 == dbus_message_iter_get_arg_type(&resp_args)) {
+        dbus_message_iter_get_basic(&resp_args, result);
+    } else if(DBUS_TYPE_VARIANT == dbus_message_iter_get_arg_type(&resp_args)) {
+        DBusMessageIter variant_iter;
+        dbus_message_iter_recurse(&resp_args, &variant_iter);
+
+        if(dbus_message_iter_get_arg_type(&variant_iter) == DBUS_TYPE_UINT32) {
+            dbus_message_iter_get_basic(&variant_iter, result);
+        } else {
+            fprintf(stderr, "gsr error: gsr_dbus_desktop_portal_get_property: response message is not a variant with an uint32, %c\n", dbus_message_iter_get_arg_type(&variant_iter));
+            dbus_message_unref(msg);
+            return false;
+        }
+    } else {
+        fprintf(stderr, "gsr error: gsr_dbus_desktop_portal_get_property: response message is not an uint32, %c\n", dbus_message_iter_get_arg_type(&resp_args));
+        dbus_message_unref(msg);
+        return false;
+        // TODO: Check dbus_error_is_set?
+    }
+
+    dbus_message_unref(msg);
+    return true;
+}
+
+static uint32_t gsr_dbus_get_screencast_version_cached(gsr_dbus *self) {
+    if(self->screencast_version == 0)
+        gsr_dbus_desktop_portal_get_property(self, "org.freedesktop.portal.ScreenCast", "version", &self->screencast_version);
+    return self->screencast_version;
+}
+
+static bool gsr_dbus_ensure_desktop_portal_rule_added(gsr_dbus *self) {
+    if(self->desktop_portal_rule_added)
+        return true;
+
+    dbus_bus_add_match(self->con, DESKTOP_PORTAL_SIGNAL_RULE, &self->err);
+    dbus_connection_flush(self->con);
+    if(dbus_error_is_set(&self->err)) {
+        fprintf(stderr, "gsr error: gsr_dbus_ensure_desktop_portal_rule_added: failed to add dbus rule %s, error: %s\n", DESKTOP_PORTAL_SIGNAL_RULE, self->err.message);
+        return false;
+    }
+    self->desktop_portal_rule_added = true;
+    return true;
+}
+
+static void gsr_dbus_portal_get_unique_handle_token(gsr_dbus *self, char *buffer, int size) {
+    snprintf(buffer, size, "gpu_screen_recorder_handle_%s_%u", self->random_str, self->handle_counter++);
+}
+
+static void gsr_dbus_portal_get_unique_session_token(gsr_dbus *self, char *buffer, int size) {
+    snprintf(buffer, size, "gpu_screen_recorder_session_%s", self->random_str);
+}
+
+static bool dbus_add_dict(DBusMessageIter *it, const dict_entry *entries, int num_entries) {
+    DBusMessageIter array_it;
+    if(!dbus_message_iter_open_container(it, DBUS_TYPE_ARRAY, "{sv}", &array_it))
+        return false;
+
+    for (int i = 0; i < num_entries; ++i) {
+        DBusMessageIter entry_it = DBUS_MESSAGE_ITER_INIT_CLOSED;
+        DBusMessageIter variant_it = DBUS_MESSAGE_ITER_INIT_CLOSED;
+
+        if(!dbus_message_iter_open_container(&array_it, DBUS_TYPE_DICT_ENTRY, NULL, &entry_it))
+            goto entry_err;
+
+        if(!dbus_message_iter_append_basic(&entry_it, DBUS_TYPE_STRING, &entries[i].key))
+            goto entry_err;
+
+        switch (entries[i].value_type) {
+            case DICT_TYPE_STRING: {
+                if(!dbus_message_iter_open_container(&entry_it, DBUS_TYPE_VARIANT, DBUS_TYPE_STRING_AS_STRING, &variant_it))
+                    goto entry_err;
+                if(!dbus_message_iter_append_basic(&variant_it, DBUS_TYPE_STRING, &entries[i].str))
+                    goto entry_err;
+                break;
+            }
+            case DICT_TYPE_UINT32: {
+                if(!dbus_message_iter_open_container(&entry_it, DBUS_TYPE_VARIANT, DBUS_TYPE_UINT32_AS_STRING, &variant_it))
+                    goto entry_err;
+                if(!dbus_message_iter_append_basic(&variant_it, DBUS_TYPE_UINT32, &entries[i].u32))
+                    goto entry_err;
+                break;
+            }
+            case DICT_TYPE_BOOL: {
+                if(!dbus_message_iter_open_container(&entry_it, DBUS_TYPE_VARIANT, DBUS_TYPE_BOOLEAN_AS_STRING, &variant_it))
+                    goto entry_err;
+                if(!dbus_message_iter_append_basic(&variant_it, DBUS_TYPE_BOOLEAN, &entries[i].boolean))
+                    goto entry_err;
+                break;
+            }
+        }
+
+        dbus_message_iter_close_container(&entry_it, &variant_it);
+        dbus_message_iter_close_container(&array_it, &entry_it);
+        continue;
+
+        entry_err:
+        dbus_message_iter_abandon_container_if_open(&entry_it, &variant_it);
+        dbus_message_iter_abandon_container_if_open(&array_it, &entry_it);
+        dbus_message_iter_abandon_container_if_open(it, &array_it);
+        return false;
+    }
+
+    return dbus_message_iter_close_container(it, &array_it);
+}
+
+/* If |response_msg| is NULL then we don't wait for a response signal */
+static bool gsr_dbus_call_screencast_method(gsr_dbus *self, const char *method_name, const char *session_handle, const char *parent_window, const dict_entry *entries, int num_entries, int *resp_fd, DBusMessage **response_msg) {
+    if(resp_fd)
+        *resp_fd = -1;
+
+    if(response_msg)
+        *response_msg = NULL;
+
+    if(!gsr_dbus_ensure_desktop_portal_rule_added(self))
+        return false;
+
+    DBusMessage *msg = dbus_message_new_method_call(
+        "org.freedesktop.portal.Desktop",    // target for the method call
+        "/org/freedesktop/portal/desktop",   // object to call on
+        "org.freedesktop.portal.ScreenCast", // interface to call on
+        method_name);                        // method name
+    if(!msg) {
+        fprintf(stderr, "gsr error: gsr_dbus_call_screencast_method: dbus_message_new_method_call failed\n");
+        return false;
+    }
+
+    DBusMessageIter it;
+    dbus_message_iter_init_append(msg, &it);
+
+    if(session_handle) {
+        if(!dbus_message_iter_append_basic(&it, DBUS_TYPE_OBJECT_PATH, &session_handle)) {
+            fprintf(stderr, "gsr error: gsr_dbus_call_screencast_method: failed to add session_handle\n");
+            dbus_message_unref(msg);
+            return false;
+        }
+    }
+
+    if(parent_window) {
+        if(!dbus_message_iter_append_basic(&it, DBUS_TYPE_STRING, &parent_window)) {
+            fprintf(stderr, "gsr error: gsr_dbus_call_screencast_method: failed to add parent_window\n");
+            dbus_message_unref(msg);
+            return false;
+        }
+    }
+
+    if(!dbus_add_dict(&it, entries, num_entries)) {
+        fprintf(stderr, "gsr error: gsr_dbus_call_screencast_method: failed to add dict\n");
+        dbus_message_unref(msg);
+        return false;
+    }
+
+    DBusPendingCall *pending = NULL;
+    if(!dbus_connection_send_with_reply(self->con, msg, &pending, -1) || !pending) { // -1 is default timeout
+        fprintf(stderr, "gsr error: gsr_dbus_call_screencast_method: dbus_connection_send_with_reply failed\n");
+        dbus_message_unref(msg);
+        return false;
+    }
+    dbus_connection_flush(self->con);
+
+    //fprintf(stderr, "Request Sent\n");
+
+    dbus_message_unref(msg);
+    msg = NULL;
+
+    dbus_pending_call_block(pending);
+
+    msg = dbus_pending_call_steal_reply(pending);
+    if(!msg) {
+        fprintf(stderr, "gsr error: gsr_dbus_call_screencast_method: dbus_pending_call_steal_reply failed\n");
+        dbus_pending_call_unref(pending);
+        /* msg is NULL here, so there is nothing to unref */
+        return false;
+    }
+
+    dbus_pending_call_unref(pending);
+    pending = NULL;
+
+    DBusMessageIter resp_args;
+    if(!dbus_message_iter_init(msg, &resp_args)) {
+        fprintf(stderr, "gsr error: gsr_dbus_call_screencast_method: response message is missing arguments\n");
+        dbus_message_unref(msg);
+        return false;
+    } else if (DBUS_TYPE_OBJECT_PATH == dbus_message_iter_get_arg_type(&resp_args)) {
+        const char *res = NULL;
+        dbus_message_iter_get_basic(&resp_args, &res);
+    } else if(DBUS_TYPE_UNIX_FD == dbus_message_iter_get_arg_type(&resp_args)) {
+        int fd = -1;
+        dbus_message_iter_get_basic(&resp_args, &fd);
+
+        if(resp_fd)
+            *resp_fd = fd;
+    } else if(DBUS_TYPE_STRING == dbus_message_iter_get_arg_type(&resp_args)) {
+        char *err = NULL;
+        dbus_message_iter_get_basic(&resp_args, &err);
+        fprintf(stderr, "gsr error: gsr_dbus_call_screencast_method: failed with error: %s\n", err);
+
+        dbus_message_unref(msg);
+        return false;
+        // TODO: Check dbus_error_is_set?
+    } else {
+        fprintf(stderr, "gsr error: gsr_dbus_call_screencast_method: response message is not an object path or unix fd\n");
+        dbus_message_unref(msg);
+        return false;
+        // TODO: Check dbus_error_is_set?
+    }
+
+    dbus_message_unref(msg);
+    if(!response_msg)
+        return true;
+
+    /* TODO: Add timeout, but take into consideration user interactive signals (such as selecting a monitor to capture for ScreenCast) */
+    for (;;) {
+        const int timeout_milliseconds = 10;
+        dbus_connection_read_write(self->con, timeout_milliseconds);
+        *response_msg = dbus_connection_pop_message(self->con);
+
+        if(!*response_msg)
+            continue;
+
+        if(!dbus_message_is_signal(*response_msg, "org.freedesktop.portal.Request", "Response")) {
+            dbus_message_unref(*response_msg);
+            *response_msg = NULL;
+            continue;
+        }
+
+        break;
+    }
+
+    return true;
+}
+
+static int gsr_dbus_get_response_status(DBusMessageIter *resp_args) {
+    if(dbus_message_iter_get_arg_type(resp_args) != DBUS_TYPE_UINT32) {
+        fprintf(stderr, "gsr error: gsr_dbus_get_response_status: missing uint32 in response\n");
+        return -1;
+    }
+
+    dbus_uint32_t response_status = 0;
+    dbus_message_iter_get_basic(resp_args, &response_status);
+
+    dbus_message_iter_next(resp_args);
+    return (int)response_status;
+}
+
+static dict_entry* find_dict_entry_by_key(dict_entry *entries, int num_entries, const char *key) {
+    for(int i = 0; i < num_entries; ++i) {
+        if(strcmp(entries[i].key, key) == 0)
+            return &entries[i];
+    }
+    return NULL;
+}
+
+static bool gsr_dbus_get_variant_value(DBusMessageIter *iter, dict_entry *entry) {
+    if(dbus_message_iter_get_arg_type(iter) != DBUS_TYPE_VARIANT) {
+        fprintf(stderr, "gsr error: gsr_dbus_get_variant_value: value is not a variant\n");
+        return false;
+    }
+
+    DBusMessageIter variant_iter;
+    dbus_message_iter_recurse(iter, &variant_iter);
+
+    switch(dbus_message_iter_get_arg_type(&variant_iter)) {
+        case DBUS_TYPE_STRING: {
+            if(entry->value_type != DICT_TYPE_STRING) {
+                fprintf(stderr, "gsr error: gsr_dbus_get_variant_value: expected entry value to be a(n) %s was a string\n", dict_value_type_to_string(entry->value_type));
+                return false;
+            }
+
+            const char *value = NULL;
+            dbus_message_iter_get_basic(&variant_iter, &value);
+
+            if(!value) {
+                fprintf(stderr, "gsr error: gsr_dbus_get_variant_value: failed to get entry value as value\n");
+                return false;
+            }
+
+            if(entry->str) {
+                free(entry->str);
+                entry->str = NULL;
+            }
+
+            entry->str = strdup(value);
+            if(!entry->str) {
+                fprintf(stderr, "gsr error: gsr_dbus_get_variant_value: failed to copy value\n");
+                return false;
+            }
+            return true;
+        }
+        case DBUS_TYPE_UINT32: {
+            if(entry->value_type != DICT_TYPE_UINT32) {
+                fprintf(stderr, "gsr error: gsr_dbus_get_variant_value: expected entry value to be a(n) %s was an uint32\n", dict_value_type_to_string(entry->value_type));
+                return false;
+            }
+
+            dbus_message_iter_get_basic(&variant_iter, &entry->u32);
+            return true;
+        }
+        case DBUS_TYPE_BOOLEAN: {
+            if(entry->value_type != DICT_TYPE_BOOL) {
+                fprintf(stderr, "gsr error: gsr_dbus_get_variant_value: expected entry value to be a(n) %s was a boolean\n", dict_value_type_to_string(entry->value_type));
+                return false;
+            }
+
+            dbus_message_iter_get_basic(&variant_iter, &entry->boolean);
+            return true;
+        }
+    }
+
+    fprintf(stderr, "gsr error: gsr_dbus_get_variant_value: got unexpected type, expected string, uint32 or boolean\n");
+    return false;
+}
+
+/*
+    Parses a{sv} into matching key entries in |entries|.
+    If the entry value is a string then it's allocated with malloc, is null-terminated
+    and has to be freed by the caller.
+    The entry values should be 0 before this method is called.
+    The string entries are freed if this function fails.
+*/
+static bool gsr_dbus_get_map(DBusMessageIter *resp_args, dict_entry *entries, int num_entries) {
+    if(dbus_message_iter_get_arg_type(resp_args) != DBUS_TYPE_ARRAY) {
+        fprintf(stderr, "gsr error: gsr_dbus_get_map: missing array in response\n");
+        return false;
+    }
+
+    DBusMessageIter subiter;
+    dbus_message_iter_recurse(resp_args, &subiter);
+
+    while(dbus_message_iter_get_arg_type(&subiter) != DBUS_TYPE_INVALID) {
+        DBusMessageIter dictiter = DBUS_MESSAGE_ITER_INIT_CLOSED;
+        const char *key = NULL;
+        dict_entry *entry = NULL;
+
+        // fprintf(stderr, "    array element type: %c, %s\n",
+        //         dbus_message_iter_get_arg_type(&subiter),
+        //         dbus_message_iter_get_signature(&subiter));
+        if(dbus_message_iter_get_arg_type(&subiter) != DBUS_TYPE_DICT_ENTRY) {
+            fprintf(stderr, "gsr error: gsr_dbus_get_map: array value is not an entry\n");
+            goto error;
+        }
+
+        dbus_message_iter_recurse(&subiter, &dictiter);
+
+        if(dbus_message_iter_get_arg_type(&dictiter) != DBUS_TYPE_STRING) {
+            fprintf(stderr, "gsr error: gsr_dbus_get_map: entry key is not a string\n");
+            goto error;
+        }
+
+        dbus_message_iter_get_basic(&dictiter, &key);
+        if(!key) {
+            fprintf(stderr, "gsr error: gsr_dbus_get_map: failed to get entry key as value\n");
+            goto error;
+        }
+
+        entry = find_dict_entry_by_key(entries, num_entries, key);
+        if(!entry) {
+            dbus_message_iter_next(&subiter);
+            continue;
+        }
+
+        if(!dbus_message_iter_next(&dictiter)) {
+            fprintf(stderr, "gsr error: gsr_dbus_get_map: missing entry value\n");
+            goto error;
+        }
+
+        if(!gsr_dbus_get_variant_value(&dictiter, entry))
+            goto error;
+
+        dbus_message_iter_next(&subiter);
+    }
+
+    return true;
+
+    error:
+    for(int i = 0; i < num_entries; ++i) {
+        if(entries[i].value_type == DICT_TYPE_STRING) {
+            free(entries[i].str);
+            entries[i].str = NULL;
+        }
+    }
+    return false;
+}
+
+int gsr_dbus_screencast_create_session(gsr_dbus *self, char **session_handle) {
+    assert(session_handle);
+    *session_handle = NULL;
+
+    char handle_token[64];
+    gsr_dbus_portal_get_unique_handle_token(self, handle_token, sizeof(handle_token));
+
+    char session_handle_token[64];
+    gsr_dbus_portal_get_unique_session_token(self, session_handle_token, sizeof(session_handle_token));
+
+    dict_entry args[2];
+    args[0].key = "handle_token";
+    args[0].value_type = DICT_TYPE_STRING;
+    args[0].str = handle_token;
+
+    args[1].key = "session_handle_token";
+    args[1].value_type = DICT_TYPE_STRING;
+    args[1].str = session_handle_token;
+
+    DBusMessage *response_msg = NULL;
+    if(!gsr_dbus_call_screencast_method(self, "CreateSession", NULL, NULL, args, 2, NULL, &response_msg)) {
+        fprintf(stderr, "gsr error: gsr_dbus_screencast_create_session: failed to set up ScreenCast session. Make sure you have a desktop portal running with support for the ScreenCast interface and that the desktop portal matches the Wayland compositor you are running.\n");
+        return -1;
+    }
+
+    // TODO: Verify signal path matches |res|, maybe check the below
+    // DBUS_TYPE_ARRAY value?
+    //fprintf(stderr, "signature: %s, sender: %s\n", dbus_message_get_signature(msg), dbus_message_get_sender(msg));
+    DBusMessageIter resp_args;
+    if(!dbus_message_iter_init(response_msg, &resp_args)) {
+        fprintf(stderr, "gsr error: gsr_dbus_screencast_create_session: missing response\n");
+        dbus_message_unref(response_msg);
+        return -1;
+    }
+
+    const int response_status = gsr_dbus_get_response_status(&resp_args);
+    if(response_status != 0) {
+        dbus_message_unref(response_msg);
+        return response_status;
+    }
+
+    dict_entry entries[1];
+    entries[0].key = "session_handle";
+    entries[0].str = NULL;
+    entries[0].value_type = DICT_TYPE_STRING;
+    if(!gsr_dbus_get_map(&resp_args, entries, 1)) {
+        dbus_message_unref(response_msg);
+        return -1;
+    }
+
+    if(!entries[0].str) {
+        fprintf(stderr, "gsr error: gsr_dbus_screencast_create_session: missing \"session_handle\" in response\n");
+        dbus_message_unref(response_msg);
+        return -1;
+    }
+
+    *session_handle = entries[0].str;
+    //fprintf(stderr, "session handle: |%s|\n", entries[0].str);
+    //free(entries[0].str);
+
+    dbus_message_unref(response_msg);
+    return 0;
+}
+
+static uint32_t unset_unsupported_capture_types(uint32_t requested_capture_types, uint32_t available_capture_types) {
+    if(!(available_capture_types & GSR_PORTAL_CAPTURE_TYPE_MONITOR))
+        requested_capture_types &= ~GSR_PORTAL_CAPTURE_TYPE_MONITOR;
+    if(!(available_capture_types & GSR_PORTAL_CAPTURE_TYPE_WINDOW))
+        requested_capture_types &= ~GSR_PORTAL_CAPTURE_TYPE_WINDOW;
+    if(!(available_capture_types & GSR_PORTAL_CAPTURE_TYPE_VIRTUAL))
+        requested_capture_types &= ~GSR_PORTAL_CAPTURE_TYPE_VIRTUAL;
+    return requested_capture_types;
+}
+
+static uint32_t unset_unsupported_cursor_modes(uint32_t requested_cursor_modes, uint32_t available_cursor_modes) {
+    if(!(available_cursor_modes & GSR_PORTAL_CURSOR_MODE_HIDDEN))
+        requested_cursor_modes &= ~GSR_PORTAL_CURSOR_MODE_HIDDEN;
+    if(!(available_cursor_modes & GSR_PORTAL_CURSOR_MODE_EMBEDDED))
+        requested_cursor_modes &= ~GSR_PORTAL_CURSOR_MODE_EMBEDDED;
+    if(!(available_cursor_modes & GSR_PORTAL_CURSOR_MODE_METADATA))
+        requested_cursor_modes &= ~GSR_PORTAL_CURSOR_MODE_METADATA;
+    return requested_cursor_modes;
+}
+
+int gsr_dbus_screencast_select_sources(gsr_dbus *self, const char *session_handle, uint32_t capture_type, uint32_t cursor_mode) {
+    assert(session_handle);
+
+    uint32_t available_source_types = 0;
+    gsr_dbus_desktop_portal_get_property(self, "org.freedesktop.portal.ScreenCast", "AvailableSourceTypes", &available_source_types);
+    if(available_source_types == 0)
+        fprintf(stderr, "gsr error: gsr_dbus_screencast_select_sources: no source types are available\n");
+    capture_type = unset_unsupported_capture_types(capture_type, available_source_types);
+
+    uint32_t available_cursor_modes = 0;
+    gsr_dbus_desktop_portal_get_property(self, "org.freedesktop.portal.ScreenCast", "AvailableCursorModes", &available_cursor_modes);
+    if(available_cursor_modes == 0)
+        fprintf(stderr, "gsr error: gsr_dbus_screencast_select_sources: no cursor modes are available\n");
+    cursor_mode = unset_unsupported_cursor_modes(cursor_mode, available_cursor_modes);
+
+    char handle_token[64];
+    gsr_dbus_portal_get_unique_handle_token(self, handle_token, sizeof(handle_token));
+
+    int num_arg_dict = 4;
+    dict_entry args[6];
+    args[0].key = "types";
+    args[0].value_type = DICT_TYPE_UINT32;
+    args[0].u32 = capture_type;
+
+    args[1].key = "multiple";
+    args[1].value_type = DICT_TYPE_BOOL;
+    args[1].boolean = false; /* TODO: Wayland ignores this and still gives the option to select multiple sources. Support that case. */
+
+    args[2].key = "handle_token";
+    args[2].value_type = DICT_TYPE_STRING;
+    args[2].str = handle_token;
+
+    args[3].key = "cursor_mode";
+    args[3].value_type = DICT_TYPE_UINT32;
+    args[3].u32 = cursor_mode;
+
+    const int screencast_server_version = gsr_dbus_get_screencast_version_cached(self);
+    if(screencast_server_version >= 4) {
+        num_arg_dict = 5;
+        args[4].key = "persist_mode";
+        args[4].value_type = DICT_TYPE_UINT32;
+        args[4].u32 = 2; /* persist until explicitly revoked */
+
+        if(self->screencast_restore_token && self->screencast_restore_token[0]) {
+            num_arg_dict = 6;
+
+            args[5].key = "restore_token";
+            args[5].value_type = DICT_TYPE_STRING;
+            args[5].str = self->screencast_restore_token;
+        }
+    } else if(self->screencast_restore_token && self->screencast_restore_token[0]) {
+        fprintf(stderr, "gsr warning: gsr_dbus_screencast_select_sources: tried to use restore token but this option is only available in screencast version >= 4, your wayland compositor's screencast version is %d\n", screencast_server_version);
+    }
+    
+    DBusMessage *response_msg = NULL;
+    if(!gsr_dbus_call_screencast_method(self, "SelectSources", session_handle, NULL, args, num_arg_dict, NULL, &response_msg)) {
+        if(num_arg_dict == 6) {
+            /* We don't know exactly what the error is, but assume it may be caused by an invalid restore token. In that case, retry without the restore token */
+            fprintf(stderr, "gsr warning: gsr_dbus_screencast_select_sources: SelectSources failed, retrying without restore_token\n");
+            num_arg_dict = 5;
+            if(!gsr_dbus_call_screencast_method(self, "SelectSources", session_handle, NULL, args, num_arg_dict, NULL, &response_msg))
+                return -1;
+        } else {
+            return -1;
+        }
+    }
+
+    // TODO: Verify signal path matches |res|, maybe check the below
+    //fprintf(stderr, "signature: %s, sender: %s\n", dbus_message_get_signature(msg), dbus_message_get_sender(msg));
+    DBusMessageIter resp_args;
+    if(!dbus_message_iter_init(response_msg, &resp_args)) {
+        fprintf(stderr, "gsr error: gsr_dbus_screencast_select_sources: missing response\n");
+        dbus_message_unref(response_msg);
+        return -1;
+    }
+
+    
+    const int response_status = gsr_dbus_get_response_status(&resp_args);
+    if(response_status != 0) {
+        dbus_message_unref(response_msg);
+        return response_status;
+    }
+
+    dbus_message_unref(response_msg);
+    return 0;
+}
+
+static dbus_uint32_t screencast_stream_get_pipewire_node(DBusMessageIter *iter) {
+    DBusMessageIter subiter;
+    dbus_message_iter_recurse(iter, &subiter);
+
+    if(dbus_message_iter_get_arg_type(&subiter) == DBUS_TYPE_STRUCT) {
+        DBusMessageIter structiter;
+        dbus_message_iter_recurse(&subiter, &structiter);
+
+        if(dbus_message_iter_get_arg_type(&structiter) == DBUS_TYPE_UINT32) {
+            dbus_uint32_t data = 0;
+            dbus_message_iter_get_basic(&structiter, &data);
+            return data;
+        }
+    }
+
+    return 0;
+}
+
+int gsr_dbus_screencast_start(gsr_dbus *self, const char *session_handle, uint32_t *pipewire_node) {
+    assert(session_handle);
+    *pipewire_node = 0;
+
+    char handle_token[64];
+    gsr_dbus_portal_get_unique_handle_token(self, handle_token, sizeof(handle_token));
+
+    dict_entry args[1];
+    args[0].key = "handle_token";
+    args[0].value_type = DICT_TYPE_STRING;
+    args[0].str = handle_token;
+    
+    DBusMessage *response_msg = NULL;
+    if(!gsr_dbus_call_screencast_method(self, "Start", session_handle, "", args, 1, NULL, &response_msg))
+        return -1;
+
+    // TODO: Verify signal path matches |res|, maybe check the below
+    //fprintf(stderr, "signature: %s, sender: %s\n", dbus_message_get_signature(msg), dbus_message_get_sender(msg));
+    DBusMessageIter resp_args;
+    if(!dbus_message_iter_init(response_msg, &resp_args)) {
+        fprintf(stderr, "gsr error: gsr_dbus_screencast_start: missing response\n");
+        dbus_message_unref(response_msg);
+        return -1;
+    }
+
+    const int response_status = gsr_dbus_get_response_status(&resp_args);
+    if(response_status != 0) {
+        dbus_message_unref(response_msg);
+        return response_status;
+    }
+
+    if(dbus_message_iter_get_arg_type(&resp_args) != DBUS_TYPE_ARRAY) {
+        fprintf(stderr, "gsr error: gsr_dbus_screencast_start: missing array in response\n");
+        dbus_message_unref(response_msg);
+        return -1;
+    }
+
+    DBusMessageIter subiter;
+    dbus_message_iter_recurse(&resp_args, &subiter);
+
+    while(dbus_message_iter_get_arg_type(&subiter) != DBUS_TYPE_INVALID) {
+        DBusMessageIter dictiter = DBUS_MESSAGE_ITER_INIT_CLOSED;
+        const char *key = NULL;
+
+        // fprintf(stderr, "    array element type: %c, %s\n",
+        //         dbus_message_iter_get_arg_type(&subiter),
+        //         dbus_message_iter_get_signature(&subiter));
+        if(dbus_message_iter_get_arg_type(&subiter) != DBUS_TYPE_DICT_ENTRY) {
+            fprintf(stderr, "gsr error: gsr_dbus_screencast_start: array value is not an entry\n");
+            goto error;
+        }
+
+        dbus_message_iter_recurse(&subiter, &dictiter);
+
+        if(dbus_message_iter_get_arg_type(&dictiter) != DBUS_TYPE_STRING) {
+            fprintf(stderr, "gsr error: gsr_dbus_screencast_start: entry key is not a string\n");
+            goto error;
+        }
+
+        dbus_message_iter_get_basic(&dictiter, &key);
+        if(!key) {
+            fprintf(stderr, "gsr error: gsr_dbus_screencast_start: failed to get entry key as value\n");
+            goto error;
+        }
+
+        if(strcmp(key, "restore_token") == 0) {
+            if(!dbus_message_iter_next(&dictiter)) {
+                fprintf(stderr, "gsr error: gsr_dbus_screencast_start: missing restore_token value\n");
+                goto error;
+            }
+
+            if(dbus_message_iter_get_arg_type(&dictiter) != DBUS_TYPE_VARIANT) {
+                fprintf(stderr, "gsr error: gsr_dbus_screencast_start: restore_token is not a variant\n");
+                goto error;
+            }
+
+            DBusMessageIter variant_iter;
+            dbus_message_iter_recurse(&dictiter, &variant_iter);
+
+            if(dbus_message_iter_get_arg_type(&variant_iter) != DBUS_TYPE_STRING) {
+                fprintf(stderr, "gsr error: gsr_dbus_screencast_start: restore_token is not a string\n");
+                goto error;
+            }
+
+            const char *restore_token_str = NULL;
+            dbus_message_iter_get_basic(&variant_iter, &restore_token_str);
+
+            if(restore_token_str) {
+                if(self->screencast_restore_token) {
+                    free(self->screencast_restore_token);
+                    self->screencast_restore_token = NULL;
+                }
+                self->screencast_restore_token = strdup(restore_token_str);
+                //fprintf(stderr, "got restore token: %s\n", self->screencast_restore_token);
+            }
+        } else if(strcmp(key, "streams") == 0) {
+            if(!dbus_message_iter_next(&dictiter)) {
+                fprintf(stderr, "gsr error: gsr_dbus_screencast_start: missing streams value\n");
+                goto error;
+            }
+
+            if(dbus_message_iter_get_arg_type(&dictiter) != DBUS_TYPE_VARIANT) {
+                fprintf(stderr, "gsr error: gsr_dbus_screencast_start: streams value is not a variant\n");
+                goto error;
+            }
+
+            DBusMessageIter variant_iter;
+            dbus_message_iter_recurse(&dictiter, &variant_iter);
+
+            if(dbus_message_iter_get_arg_type(&variant_iter) != DBUS_TYPE_ARRAY) {
+                fprintf(stderr, "gsr error: gsr_dbus_screencast_start: streams value is not an array\n");
+                goto error;
+            }
+
+            int num_streams = dbus_message_iter_get_element_count(&variant_iter);
+            //fprintf(stderr, "num streams: %d\n", num_streams);
+            /* Skip over all streams except the last one, since kde can return multiple streams even if only 1 is requested. The last one is the valid one */
+            for(int i = 0; i < num_streams - 1; ++i) {
+                screencast_stream_get_pipewire_node(&variant_iter);
+            }
+
+            if(num_streams > 0) {
+                *pipewire_node = screencast_stream_get_pipewire_node(&variant_iter);
+                //fprintf(stderr, "pipewire node: %u\n", *pipewire_node);
+            }
+        }
+
+        dbus_message_iter_next(&subiter);
+    }
+
+    if(*pipewire_node == 0) {
+        fprintf(stderr, "gsr error: gsr_dbus_screencast_start: no pipewire node returned\n");
+        goto error;
+    }
+
+    dbus_message_unref(response_msg);
+    return 0;
+
+    error:
+    dbus_message_unref(response_msg);
+    return -1;
+}
+
+bool gsr_dbus_screencast_open_pipewire_remote(gsr_dbus *self, const char *session_handle, int *pipewire_fd) {
+    assert(session_handle);
+    *pipewire_fd = -1;
+    return gsr_dbus_call_screencast_method(self, "OpenPipeWireRemote", session_handle, NULL, NULL, 0, pipewire_fd, NULL);
+}
+
+const char* gsr_dbus_screencast_get_restore_token(gsr_dbus *self) {
+    return self->screencast_restore_token;
+}
diff --git a/dbus/dbus_impl.h b/dbus/dbus_impl.h
new file mode 100644
index 0000000..c3f0751
--- /dev/null
+++ b/dbus/dbus_impl.h
@@ -0,0 +1,37 @@
+#ifndef GSR_DBUS_H
+#define GSR_DBUS_H
+
+#include "portal.h"
+#include <stdbool.h>
+#include <stdint.h>
+#include <dbus/dbus.h>
+
+#define DBUS_RANDOM_STR_SIZE 16
+
+typedef struct {
+    DBusConnection *con;
+    DBusError err;
+    char random_str[DBUS_RANDOM_STR_SIZE + 1];
+    unsigned int handle_counter;
+    bool desktop_portal_rule_added;
+    uint32_t screencast_version;
+    char *screencast_restore_token;
+} gsr_dbus;
+
+/* Blocking. TODO: Make non-blocking */
+bool gsr_dbus_init(gsr_dbus *self, const char *screencast_restore_token);
+void gsr_dbus_deinit(gsr_dbus *self);
+
+/* The following functions should be called in order to set up ScreenCast properly */
+/* The functions that return an int return the response status code */
+int gsr_dbus_screencast_create_session(gsr_dbus *self, char **session_handle);
+/*
+    |capture_type| is a bitmask of gsr_portal_capture_type values. gsr_portal_capture_type values that are not supported by the desktop portal will be ignored.
+    |cursor_mode| is a bitmask of gsr_portal_cursor_mode values. gsr_portal_cursor_mode values that are not supported by the desktop portal will be ignored.
+*/
+int gsr_dbus_screencast_select_sources(gsr_dbus *self, const char *session_handle, uint32_t capture_type, uint32_t cursor_mode);
+int gsr_dbus_screencast_start(gsr_dbus *self, const char *session_handle, uint32_t *pipewire_node);
+bool gsr_dbus_screencast_open_pipewire_remote(gsr_dbus *self, const char *session_handle, int *pipewire_fd);
+const char* gsr_dbus_screencast_get_restore_token(gsr_dbus *self);
+
+#endif /* GSR_DBUS_H */
diff --git a/dbus/portal.h b/dbus/portal.h
new file mode 100644
index 0000000..6b93aa6
--- /dev/null
+++ b/dbus/portal.h
@@ -0,0 +1,17 @@
+#ifndef GSR_PORTAL_H
+#define GSR_PORTAL_H
+
+typedef enum {
+    GSR_PORTAL_CAPTURE_TYPE_MONITOR = 1 << 0,
+    GSR_PORTAL_CAPTURE_TYPE_WINDOW  = 1 << 1,
+    GSR_PORTAL_CAPTURE_TYPE_VIRTUAL = 1 << 2,
+    GSR_PORTAL_CAPTURE_TYPE_ALL = GSR_PORTAL_CAPTURE_TYPE_MONITOR | GSR_PORTAL_CAPTURE_TYPE_WINDOW | GSR_PORTAL_CAPTURE_TYPE_VIRTUAL
+} gsr_portal_capture_type;
+
+typedef enum {
+    GSR_PORTAL_CURSOR_MODE_HIDDEN   = 1 << 0,
+    GSR_PORTAL_CURSOR_MODE_EMBEDDED = 1 << 1,
+    GSR_PORTAL_CURSOR_MODE_METADATA = 1 << 2
+} gsr_portal_cursor_mode;
+
+#endif /* GSR_PORTAL_H */
diff --git a/dbus/protocol.h b/dbus/protocol.h
new file mode 100644
index 0000000..212358d
--- /dev/null
+++ b/dbus/protocol.h
@@ -0,0 +1,86 @@
+#ifndef GSR_DBUS_PROTOCOL_H
+#define GSR_DBUS_PROTOCOL_H
+
+#include <stdint.h>
+
+#define GSR_DBUS_PROTOCOL_VERSION 1
+
+typedef enum {
+    GSR_DBUS_MESSAGE_REQ_CREATE_SESSION,
+    GSR_DBUS_MESSAGE_REQ_SELECT_SOURCES,
+    GSR_DBUS_MESSAGE_REQ_START,
+    GSR_DBUS_MESSAGE_REQ_OPEN_PIPEWIRE_REMOTE
+} gsr_dbus_message_req_type;
+
+typedef struct {
+
+} gsr_dbus_message_req_create_session;
+
+typedef struct {
+    char session_handle[128];
+    uint32_t capture_type;
+    uint32_t cursor_mode;
+} gsr_dbus_message_req_select_sources;
+
+typedef struct {
+    char session_handle[128];
+} gsr_dbus_message_req_start;
+
+typedef struct {
+    char session_handle[128];
+} gsr_dbus_message_req_open_pipewire_remote;
+
+typedef struct {
+    uint8_t protocol_version;
+    gsr_dbus_message_req_type type;
+    union {
+        gsr_dbus_message_req_create_session create_session;
+        gsr_dbus_message_req_select_sources select_sources;
+        gsr_dbus_message_req_start start;
+        gsr_dbus_message_req_open_pipewire_remote open_pipewire_remote;
+    };
+} gsr_dbus_request_message;
+
+typedef enum {
+    GSR_DBUS_MESSAGE_RESP_ERROR,
+    GSR_DBUS_MESSAGE_RESP_CREATE_SESSION,
+    GSR_DBUS_MESSAGE_RESP_SELECT_SOURCES,
+    GSR_DBUS_MESSAGE_RESP_START,
+    GSR_DBUS_MESSAGE_RESP_OPEN_PIPEWIRE_REMOTE
+} gsr_dbus_message_resp_type;
+
+typedef struct {
+    uint32_t error_code;
+    char message[128];
+} gsr_dbus_message_resp_error;
+
+typedef struct {
+    char session_handle[128];
+} gsr_dbus_message_resp_create_session;
+
+typedef struct {
+    
+} gsr_dbus_message_resp_select_sources;
+
+typedef struct {
+    char restore_token[128];
+    uint32_t pipewire_node;
+} gsr_dbus_message_resp_start;
+
+typedef struct {
+    
+} gsr_dbus_message_resp_open_pipewire_remote;
+
+typedef struct {
+    uint8_t protocol_version;
+    gsr_dbus_message_resp_type type;
+    union {
+        gsr_dbus_message_resp_error error;
+        gsr_dbus_message_resp_create_session create_session;
+        gsr_dbus_message_resp_select_sources select_sources;
+        gsr_dbus_message_resp_start start;
+        gsr_dbus_message_resp_open_pipewire_remote open_pipewire_remote;
+    };
+} gsr_dbus_response_message;
+
+#endif /* GSR_DBUS_PROTOCOL_H */
diff --git a/dbus/server/dbus_server.c b/dbus/server/dbus_server.c
new file mode 100644
index 0000000..bde6acb
--- /dev/null
+++ b/dbus/server/dbus_server.c
@@ -0,0 +1,175 @@
+#include "../dbus_impl.h"
+#include "../protocol.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#include <unistd.h>
+#include <sys/socket.h>
+
+/* TODO: Error check write/read */
+
+static int handle_create_session(gsr_dbus *dbus, int rpc_fd, const gsr_dbus_message_req_create_session *create_session) {
+    (void)create_session;
+    char *session_handle = NULL;
+    const int status = gsr_dbus_screencast_create_session(dbus, &session_handle);
+    if(status == 0) {
+        gsr_dbus_response_message response = {
+            .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+            .type = GSR_DBUS_MESSAGE_RESP_CREATE_SESSION,
+            .create_session = (gsr_dbus_message_resp_create_session) {}
+        };
+        snprintf(response.create_session.session_handle, sizeof(response.create_session.session_handle), "%s", session_handle);
+        free(session_handle);
+        write(rpc_fd, &response, sizeof(response));
+    }
+    return status;
+}
+
+static int handle_select_sources(gsr_dbus *dbus, int rpc_fd, const gsr_dbus_message_req_select_sources *select_sources) {
+    const int status = gsr_dbus_screencast_select_sources(dbus, select_sources->session_handle, select_sources->capture_type, select_sources->cursor_mode);
+    if(status == 0) {
+        gsr_dbus_response_message response = {
+            .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+            .type = GSR_DBUS_MESSAGE_RESP_SELECT_SOURCES,
+            .select_sources = (gsr_dbus_message_resp_select_sources) {}
+        };
+        write(rpc_fd, &response, sizeof(response));
+    }
+    return status;
+}
+
+static int handle_start(gsr_dbus *dbus, int rpc_fd, const gsr_dbus_message_req_start *start) {
+    uint32_t pipewire_node = 0;
+    const int status = gsr_dbus_screencast_start(dbus, start->session_handle, &pipewire_node);
+    if(status == 0) {
+        const char *screencast_restore_token = gsr_dbus_screencast_get_restore_token(dbus);
+        gsr_dbus_response_message response = {
+            .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+            .type = GSR_DBUS_MESSAGE_RESP_START,
+            .start = (gsr_dbus_message_resp_start) {
+                .pipewire_node = pipewire_node
+            }
+        };
+        snprintf(response.start.restore_token, sizeof(response.start.restore_token), "%s", screencast_restore_token ? screencast_restore_token : "");
+        write(rpc_fd, &response, sizeof(response));
+    }
+    return status;
+}
+
+static bool handle_open_pipewire_remote(gsr_dbus *dbus, int rpc_fd, const gsr_dbus_message_req_open_pipewire_remote *open_pipewire_remote) {
+    int pipewire_fd = 0;
+    const bool success = gsr_dbus_screencast_open_pipewire_remote(dbus, open_pipewire_remote->session_handle, &pipewire_fd);
+    if(success) {
+        gsr_dbus_response_message response = {
+            .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+            .type = GSR_DBUS_MESSAGE_RESP_OPEN_PIPEWIRE_REMOTE,
+            .open_pipewire_remote = (gsr_dbus_message_resp_open_pipewire_remote) {}
+        };
+
+        struct iovec iov = {
+            .iov_base = &response,
+            .iov_len = sizeof(response)
+        };
+
+        char msg_control[CMSG_SPACE(sizeof(int))];
+
+        struct msghdr message = {
+            .msg_iov = &iov,
+            .msg_iovlen = 1,
+            .msg_control = msg_control,
+            .msg_controllen = sizeof(msg_control)
+        };
+
+        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&message);
+        cmsg->cmsg_level = SOL_SOCKET;
+        cmsg->cmsg_type = SCM_RIGHTS;
+        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+        int *fds = (int*)CMSG_DATA(cmsg);
+        fds[0] = pipewire_fd;
+        message.msg_controllen = cmsg->cmsg_len;
+        sendmsg(rpc_fd, &message, 0);
+    }
+    return success;
+}
+
+int main(int argc, char **argv) {
+    if(argc != 3) {
+        fprintf(stderr, "usage: gsr-dbus-server <rpc-fd> <screencast-restore-token>\n");
+        return 1;
+    }
+
+    const char *rpc_fd_str = argv[1];
+    const char *screencast_restore_token = argv[2];
+
+    int rpc_fd = -1;
+    if(sscanf(rpc_fd_str, "%d", &rpc_fd) != 1) {
+        fprintf(stderr, "gsr-dbus-server error: rpc-fd is not a number: %s\n", rpc_fd_str);
+        return 1;
+    }
+
+    if(screencast_restore_token[0] == '\0')
+        screencast_restore_token = NULL;
+
+    gsr_dbus dbus;
+    if(!gsr_dbus_init(&dbus, screencast_restore_token))
+        return 1;
+
+    /* Tell client we have started up */
+    write(rpc_fd, "S", 1);
+
+    gsr_dbus_request_message request;
+    for(;;) {
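+        if(read(rpc_fd, &request, sizeof(request)) != (ssize_t)sizeof(request)) break; /* Stop on EOF, error or short read instead of handling a stale request */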
+
+        if(request.protocol_version != GSR_DBUS_PROTOCOL_VERSION) {
+            gsr_dbus_response_message response = {
+                .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+                .type = GSR_DBUS_MESSAGE_RESP_ERROR,
+                .error = (gsr_dbus_message_resp_error) {
+                    .error_code = 1
+                }
+            };
+            snprintf(response.error.message, sizeof(response.error.message), "Client uses protocol version %d while the server is using protocol version %d", request.protocol_version, GSR_DBUS_PROTOCOL_VERSION);
+            fprintf(stderr, "gsr-dbus-server error: %s\n", response.error.message);
+            write(rpc_fd, &response, sizeof(response));
+            continue;
+        }
+
+        int status = 0;
+        switch(request.type) {
+            case GSR_DBUS_MESSAGE_REQ_CREATE_SESSION: {
+                status = handle_create_session(&dbus, rpc_fd, &request.create_session);
+                break;
+            }
+            case GSR_DBUS_MESSAGE_REQ_SELECT_SOURCES: {
+                status = handle_select_sources(&dbus, rpc_fd, &request.select_sources);
+                break;
+            }
+            case GSR_DBUS_MESSAGE_REQ_START: {
+                status = handle_start(&dbus, rpc_fd, &request.start);
+                break;
+            }
+            case GSR_DBUS_MESSAGE_REQ_OPEN_PIPEWIRE_REMOTE: {
+                if(!handle_open_pipewire_remote(&dbus, rpc_fd, &request.open_pipewire_remote))
+                    status = -1;
+                break;
+            }
+        }
+
+        if(status != 0) {
+            gsr_dbus_response_message response = {
+                .protocol_version = GSR_DBUS_PROTOCOL_VERSION,
+                .type = GSR_DBUS_MESSAGE_RESP_ERROR,
+                .error = (gsr_dbus_message_resp_error) {
+                    .error_code = status
+                }
+            };
+            snprintf(response.error.message, sizeof(response.error.message), "%s", "Failed to handle request");
+            write(rpc_fd, &response, sizeof(response));
+        }
+    }
+
+    gsr_dbus_deinit(&dbus);
+    return 0;
+}
+\ No newline at end of file
diff --git a/debug-install.sh b/debug-install.sh
deleted file mode 100755
index 720d7ec..0000000
--- a/debug-install.sh
+++ /dev/null
@@ -1,20 +0,0 @@
-#!/bin/sh -e
-
-script_dir=$(dirname "$0")
-cd "$script_dir"
-
-[ $(id -u) -ne 0 ] && echo "You need root privileges to run the install script" && exit 1
-
-DEBUG=1 ./build.sh
-
-install -Dm755 "gsr-kms-server" "/usr/bin/gsr-kms-server"
-install -Dm755 "gpu-screen-recorder" "/usr/bin/gpu-screen-recorder"
-if [ -d "/usr/lib/systemd/user" ]; then
-    install -Dm644 "extra/gpu-screen-recorder.service" "/usr/lib/systemd/user/gpu-screen-recorder.service"
-fi
-# Not necessary, but removes the password prompt when trying to record a monitor on amd/intel or nvidia wayland
-setcap cap_sys_admin+ep /usr/bin/gsr-kms-server
-# Not ncessary, but allows use of EGL_CONTEXT_PRIORITY_LEVEL_IMG which might decrease performance impact on the system
-setcap cap_sys_nice+ep /usr/bin/gpu-screen-recorder
-
-echo "Successfully installed gpu-screen-recorder (debug)"
diff --git a/external/nvEncodeAPI.h b/external/nvEncodeAPI.h
new file mode 100644
index 0000000..281464c
--- /dev/null
+++ b/external/nvEncodeAPI.h
@@ -0,0 +1,4285 @@
+/*
+ * This copyright notice applies to this header file only:
+ *
+ * Copyright (c) 2010-2022 NVIDIA Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person
+ * obtaining a copy of this software and associated documentation
+ * files (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use,
+ * copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the software, and to permit persons to whom the
+ * software is furnished to do so, subject to the following
+ * conditions:
+ *
+ * The above copyright notice and this permission notice shall be
+ * included in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * \file nvEncodeAPI.h
+ *   NVIDIA GPUs - beginning with the Kepler generation - contain a hardware-based encoder
+ *   (referred to as NVENC) which provides fully-accelerated hardware-based video encoding.
+ *   NvEncodeAPI provides the interface for NVIDIA video encoder (NVENC).
+ * \date 2011-2022
+ *  This file contains the interface constants, structure definitions and function prototypes.
+ */
+
+#ifndef _NV_ENCODEAPI_H_
+#define _NV_ENCODEAPI_H_
+
+#include <stdlib.h>
+
+#ifdef _WIN32
+#include <windows.h>
+#endif
+
+#ifdef _MSC_VER
+#ifndef _STDINT
+typedef __int32 int32_t;
+typedef unsigned __int32 uint32_t;
+typedef __int64 int64_t;
+typedef unsigned __int64 uint64_t;
+typedef signed char int8_t;
+typedef unsigned char uint8_t;
+typedef short int16_t;
+typedef unsigned short uint16_t;
+#endif
+#else
+#include <stdint.h>
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * \addtogroup ENCODER_STRUCTURE NvEncodeAPI Data structures
+ * @{
+ */
+
+#if defined(_WIN32) || defined(__CYGWIN__)
+#define NVENCAPI __stdcall
+#else
+#define NVENCAPI
+#endif
+
+#ifdef _WIN32
+typedef RECT NVENC_RECT;
+#else
+#define NVENCAPI
+// =========================================================================================
+#if !defined(GUID) && !defined(GUID_DEFINED)
+#define GUID_DEFINED
+/*!
+ * \struct GUID
+ * Abstracts the GUID structure for non-windows platforms.
+ */
+// =========================================================================================
+typedef struct _GUID
+{
+    uint32_t Data1;                                      /**< [in]: Specifies the first 8 hexadecimal digits of the GUID.                                */
+    uint16_t Data2;                                      /**< [in]: Specifies the first group of 4 hexadecimal digits.                                   */
+    uint16_t Data3;                                      /**< [in]: Specifies the second group of 4 hexadecimal digits.                                  */
+    uint8_t  Data4[8];                                   /**< [in]: Array of 8 bytes. The first 2 bytes contain the third group of 4 hexadecimal digits.
+                                                                    The remaining 6 bytes contain the final 12 hexadecimal digits.                       */
+} GUID, *LPGUID;
+#endif // GUID
+
+/**
+ * \struct _NVENC_RECT
+ * Defines a Rectangle. Used in ::NV_ENC_PREPROCESS_FRAME.
+ */
+typedef struct _NVENC_RECT
+{
+    uint32_t left;                                        /**< [in]: X coordinate of the upper left corner of rectangular area to be specified.       */
+    uint32_t top;                                         /**< [in]: Y coordinate of the upper left corner of the rectangular area to be specified.   */
+    uint32_t right;                                       /**< [in]: X coordinate of the bottom right corner of the rectangular area to be specified. */
+    uint32_t bottom;                                      /**< [in]: Y coordinate of the bottom right corner of the rectangular area to be specified. */
+} NVENC_RECT;
+
+#endif // _WIN32
+
+/** @} */ /* End of GUID and NVENC_RECT structure grouping*/
+
+typedef void* NV_ENC_INPUT_PTR;             /**< NVENCODE API input buffer                              */
+typedef void* NV_ENC_OUTPUT_PTR;            /**< NVENCODE API output buffer*/
+typedef void* NV_ENC_REGISTERED_PTR;        /**< A Resource that has been registered with NVENCODE API*/
+typedef void* NV_ENC_CUSTREAM_PTR;          /**< Pointer to CUstream*/
+
+#define NVENCAPI_MAJOR_VERSION 12
+#define NVENCAPI_MINOR_VERSION 0
+
+#define NVENCAPI_VERSION (NVENCAPI_MAJOR_VERSION | (NVENCAPI_MINOR_VERSION << 24))
+
+/**
+ * Macro to generate per-structure version for use with API.
+ */
+#define NVENCAPI_STRUCT_VERSION(ver) ((uint32_t)NVENCAPI_VERSION | ((ver)<<16) | (0x7 << 28))
+
+
+#define NVENC_INFINITE_GOPLENGTH  0xffffffff
+
+#define NV_MAX_SEQ_HDR_LEN  (512)
+
+#ifdef __GNUC__
+#define NV_ENC_DEPRECATED __attribute__ ((deprecated("WILL BE REMOVED IN A FUTURE VIDEO CODEC SDK VERSION")))
+#elif defined(_MSC_VER)
+#define NV_ENC_DEPRECATED __declspec(deprecated("WILL BE REMOVED IN A FUTURE VIDEO CODEC SDK VERSION"))
+#endif
+
+// =========================================================================================
+// Encode Codec GUIDS supported by the NvEncodeAPI interface.
+// =========================================================================================
+
+// {6BC82762-4E63-4ca4-AA85-1E50F321F6BF}
+static const GUID NV_ENC_CODEC_H264_GUID =
+{ 0x6bc82762, 0x4e63, 0x4ca4, { 0xaa, 0x85, 0x1e, 0x50, 0xf3, 0x21, 0xf6, 0xbf } };
+
+// {790CDC88-4522-4d7b-9425-BDA9975F7603}
+static const GUID NV_ENC_CODEC_HEVC_GUID =
+{ 0x790cdc88, 0x4522, 0x4d7b, { 0x94, 0x25, 0xbd, 0xa9, 0x97, 0x5f, 0x76, 0x3 } };
+
+// {0A352289-0AA7-4759-862D-5D15CD16D254}
+static const GUID NV_ENC_CODEC_AV1_GUID =
+{ 0x0a352289, 0x0aa7, 0x4759, { 0x86, 0x2d, 0x5d, 0x15, 0xcd, 0x16, 0xd2, 0x54 } };
+
+
+
+// =========================================================================================
+// *   Encode Profile GUIDS supported by the NvEncodeAPI interface.
+// =========================================================================================
+
+// {BFD6F8E7-233C-4341-8B3E-4818523803F4}
+static const GUID NV_ENC_CODEC_PROFILE_AUTOSELECT_GUID =
+{ 0xbfd6f8e7, 0x233c, 0x4341, { 0x8b, 0x3e, 0x48, 0x18, 0x52, 0x38, 0x3, 0xf4 } };
+
+// {0727BCAA-78C4-4c83-8C2F-EF3DFF267C6A}
+static const GUID  NV_ENC_H264_PROFILE_BASELINE_GUID =
+{ 0x727bcaa, 0x78c4, 0x4c83, { 0x8c, 0x2f, 0xef, 0x3d, 0xff, 0x26, 0x7c, 0x6a } };
+
+// {60B5C1D4-67FE-4790-94D5-C4726D7B6E6D}
+static const GUID  NV_ENC_H264_PROFILE_MAIN_GUID =
+{ 0x60b5c1d4, 0x67fe, 0x4790, { 0x94, 0xd5, 0xc4, 0x72, 0x6d, 0x7b, 0x6e, 0x6d } };
+
+// {E7CBC309-4F7A-4b89-AF2A-D537C92BE310}
+static const GUID NV_ENC_H264_PROFILE_HIGH_GUID =
+{ 0xe7cbc309, 0x4f7a, 0x4b89, { 0xaf, 0x2a, 0xd5, 0x37, 0xc9, 0x2b, 0xe3, 0x10 } };
+
+// {7AC663CB-A598-4960-B844-339B261A7D52}
+static const GUID  NV_ENC_H264_PROFILE_HIGH_444_GUID =
+{ 0x7ac663cb, 0xa598, 0x4960, { 0xb8, 0x44, 0x33, 0x9b, 0x26, 0x1a, 0x7d, 0x52 } };
+
+// {40847BF5-33F7-4601-9084-E8FE3C1DB8B7}
+static const GUID NV_ENC_H264_PROFILE_STEREO_GUID =
+{ 0x40847bf5, 0x33f7, 0x4601, { 0x90, 0x84, 0xe8, 0xfe, 0x3c, 0x1d, 0xb8, 0xb7 } };
+
+// {B405AFAC-F32B-417B-89C4-9ABEED3E5978}
+static const GUID NV_ENC_H264_PROFILE_PROGRESSIVE_HIGH_GUID =
+{ 0xb405afac, 0xf32b, 0x417b, { 0x89, 0xc4, 0x9a, 0xbe, 0xed, 0x3e, 0x59, 0x78 } };
+
+// {AEC1BD87-E85B-48f2-84C3-98BCA6285072}
+static const GUID NV_ENC_H264_PROFILE_CONSTRAINED_HIGH_GUID =
+{ 0xaec1bd87, 0xe85b, 0x48f2, { 0x84, 0xc3, 0x98, 0xbc, 0xa6, 0x28, 0x50, 0x72 } };
+
+// {B514C39A-B55B-40fa-878F-F1253B4DFDEC}
+static const GUID NV_ENC_HEVC_PROFILE_MAIN_GUID =
+{ 0xb514c39a, 0xb55b, 0x40fa, { 0x87, 0x8f, 0xf1, 0x25, 0x3b, 0x4d, 0xfd, 0xec } };
+
+// {fa4d2b6c-3a5b-411a-8018-0a3f5e3c9be5}
+static const GUID NV_ENC_HEVC_PROFILE_MAIN10_GUID =
+{ 0xfa4d2b6c, 0x3a5b, 0x411a, { 0x80, 0x18, 0x0a, 0x3f, 0x5e, 0x3c, 0x9b, 0xe5 } };
+
+// For HEVC Main 444 8 bit and HEVC Main 444 10 bit profiles only
+// {51ec32b5-1b4c-453c-9cbd-b616bd621341}
+static const GUID NV_ENC_HEVC_PROFILE_FREXT_GUID =
+{ 0x51ec32b5, 0x1b4c, 0x453c, { 0x9c, 0xbd, 0xb6, 0x16, 0xbd, 0x62, 0x13, 0x41 } };
+
+// {5f2a39f5-f14e-4f95-9a9e-b76d568fcf97}
+static const GUID NV_ENC_AV1_PROFILE_MAIN_GUID =
+{ 0x5f2a39f5, 0xf14e, 0x4f95, { 0x9a, 0x9e, 0xb7, 0x6d, 0x56, 0x8f, 0xcf, 0x97 } };
+
+// =========================================================================================
+// *   Preset GUIDS supported by the NvEncodeAPI interface.
+// =========================================================================================
+// {B2DFB705-4EBD-4C49-9B5F-24A777D3E587}
+NV_ENC_DEPRECATED static const GUID NV_ENC_PRESET_DEFAULT_GUID =
+{ 0xb2dfb705, 0x4ebd, 0x4c49, { 0x9b, 0x5f, 0x24, 0xa7, 0x77, 0xd3, 0xe5, 0x87 } };
+
+// {60E4C59F-E846-4484-A56D-CD45BE9FDDF6}
+NV_ENC_DEPRECATED static const GUID NV_ENC_PRESET_HP_GUID =
+{ 0x60e4c59f, 0xe846, 0x4484, { 0xa5, 0x6d, 0xcd, 0x45, 0xbe, 0x9f, 0xdd, 0xf6 } };
+
+// {34DBA71D-A77B-4B8F-9C3E-B6D5DA24C012}
+NV_ENC_DEPRECATED static const GUID NV_ENC_PRESET_HQ_GUID =
+{ 0x34dba71d, 0xa77b, 0x4b8f, { 0x9c, 0x3e, 0xb6, 0xd5, 0xda, 0x24, 0xc0, 0x12 } };
+
+// {82E3E450-BDBB-4e40-989C-82A90DF9EF32}
+NV_ENC_DEPRECATED static const GUID NV_ENC_PRESET_BD_GUID  =
+{ 0x82e3e450, 0xbdbb, 0x4e40, { 0x98, 0x9c, 0x82, 0xa9, 0xd, 0xf9, 0xef, 0x32 } };
+
+// {49DF21C5-6DFA-4feb-9787-6ACC9EFFB726}
+NV_ENC_DEPRECATED static const GUID NV_ENC_PRESET_LOW_LATENCY_DEFAULT_GUID  =
+{ 0x49df21c5, 0x6dfa, 0x4feb, { 0x97, 0x87, 0x6a, 0xcc, 0x9e, 0xff, 0xb7, 0x26 } };
+
+// {C5F733B9-EA97-4cf9-BEC2-BF78A74FD105}
+NV_ENC_DEPRECATED static const GUID NV_ENC_PRESET_LOW_LATENCY_HQ_GUID  =
+{ 0xc5f733b9, 0xea97, 0x4cf9, { 0xbe, 0xc2, 0xbf, 0x78, 0xa7, 0x4f, 0xd1, 0x5 } };
+
+// {67082A44-4BAD-48FA-98EA-93056D150A58}
+NV_ENC_DEPRECATED static const GUID NV_ENC_PRESET_LOW_LATENCY_HP_GUID =
+{ 0x67082a44, 0x4bad, 0x48fa, { 0x98, 0xea, 0x93, 0x5, 0x6d, 0x15, 0xa, 0x58 } };
+
+// {D5BFB716-C604-44e7-9BB8-DEA5510FC3AC}
+NV_ENC_DEPRECATED static const GUID NV_ENC_PRESET_LOSSLESS_DEFAULT_GUID =
+{ 0xd5bfb716, 0xc604, 0x44e7, { 0x9b, 0xb8, 0xde, 0xa5, 0x51, 0xf, 0xc3, 0xac } };
+
+// {149998E7-2364-411d-82EF-179888093409}
+NV_ENC_DEPRECATED static const GUID NV_ENC_PRESET_LOSSLESS_HP_GUID =
+{ 0x149998e7, 0x2364, 0x411d, { 0x82, 0xef, 0x17, 0x98, 0x88, 0x9, 0x34, 0x9 } };
+
+// Performance degrades and quality improves as we move from P1 to P7. Presets P3 to P7 for H264 and Presets P2 to P7 for HEVC have B frames enabled by default
+// for HIGH_QUALITY and LOSSLESS tuning info, and will not work with Weighted Prediction enabled. In case Weighted Prediction is required, disable B frames by
+// setting frameIntervalP = 1
+// {FC0A8D3E-45F8-4CF8-80C7-298871590EBF}
+static const GUID NV_ENC_PRESET_P1_GUID   =
+{ 0xfc0a8d3e, 0x45f8, 0x4cf8, { 0x80, 0xc7, 0x29, 0x88, 0x71, 0x59, 0xe, 0xbf } };
+
+// {F581CFB8-88D6-4381-93F0-DF13F9C27DAB}
+static const GUID NV_ENC_PRESET_P2_GUID   =
+{ 0xf581cfb8, 0x88d6, 0x4381, { 0x93, 0xf0, 0xdf, 0x13, 0xf9, 0xc2, 0x7d, 0xab } };
+
+// {36850110-3A07-441F-94D5-3670631F91F6}
+static const GUID NV_ENC_PRESET_P3_GUID   =
+{ 0x36850110, 0x3a07, 0x441f, { 0x94, 0xd5, 0x36, 0x70, 0x63, 0x1f, 0x91, 0xf6 } };
+
+// {90A7B826-DF06-4862-B9D2-CD6D73A08681}
+static const GUID NV_ENC_PRESET_P4_GUID   =
+{ 0x90a7b826, 0xdf06, 0x4862, { 0xb9, 0xd2, 0xcd, 0x6d, 0x73, 0xa0, 0x86, 0x81 } };
+
+// {21C6E6B4-297A-4CBA-998F-B6CBDE72ADE3}
+static const GUID NV_ENC_PRESET_P5_GUID   =
+{ 0x21c6e6b4, 0x297a, 0x4cba, { 0x99, 0x8f, 0xb6, 0xcb, 0xde, 0x72, 0xad, 0xe3 } };
+
+// {8E75C279-6299-4AB6-8302-0B215A335CF5}
+static const GUID NV_ENC_PRESET_P6_GUID   =
+{ 0x8e75c279, 0x6299, 0x4ab6, { 0x83, 0x2, 0xb, 0x21, 0x5a, 0x33, 0x5c, 0xf5 } };
+
+// {84848C12-6F71-4C13-931B-53E283F57974}
+static const GUID NV_ENC_PRESET_P7_GUID   =
+{ 0x84848c12, 0x6f71, 0x4c13, { 0x93, 0x1b, 0x53, 0xe2, 0x83, 0xf5, 0x79, 0x74 } };
+
+/**
+ * \addtogroup ENCODER_STRUCTURE NvEncodeAPI Data structures
+ * @{
+ */
+
+/**
+ * Input frame encode modes
+ */
+typedef enum _NV_ENC_PARAMS_FRAME_FIELD_MODE
+{
+    NV_ENC_PARAMS_FRAME_FIELD_MODE_FRAME = 0x01,  /**< Frame mode */
+    NV_ENC_PARAMS_FRAME_FIELD_MODE_FIELD = 0x02,  /**< Field mode */
+    NV_ENC_PARAMS_FRAME_FIELD_MODE_MBAFF = 0x03   /**< MB adaptive frame/field */
+} NV_ENC_PARAMS_FRAME_FIELD_MODE;
+
+/**
+ * Rate Control Modes
+ */
+typedef enum _NV_ENC_PARAMS_RC_MODE
+{
+    NV_ENC_PARAMS_RC_CONSTQP                = 0x0,       /**< Constant QP mode */
+    NV_ENC_PARAMS_RC_VBR                    = 0x1,       /**< Variable bitrate mode */
+    NV_ENC_PARAMS_RC_CBR                    = 0x2,       /**< Constant bitrate mode */
+    NV_ENC_PARAMS_RC_CBR_LOWDELAY_HQ        = 0x8,       /**< Deprecated, use NV_ENC_PARAMS_RC_CBR + NV_ENC_TWO_PASS_QUARTER_RESOLUTION / NV_ENC_TWO_PASS_FULL_RESOLUTION +
+                                                              lowDelayKeyFrameScale=1 */
+    NV_ENC_PARAMS_RC_CBR_HQ                 = 0x10,      /**< Deprecated, use NV_ENC_PARAMS_RC_CBR + NV_ENC_TWO_PASS_QUARTER_RESOLUTION / NV_ENC_TWO_PASS_FULL_RESOLUTION */
+    NV_ENC_PARAMS_RC_VBR_HQ                 = 0x20       /**< Deprecated, use NV_ENC_PARAMS_RC_VBR + NV_ENC_TWO_PASS_QUARTER_RESOLUTION / NV_ENC_TWO_PASS_FULL_RESOLUTION */
+} NV_ENC_PARAMS_RC_MODE;
+
+/**
+ * Multi Pass encoding
+ */
+typedef enum _NV_ENC_MULTI_PASS
+{
+    NV_ENC_MULTI_PASS_DISABLED              = 0x0,        /**< Single Pass */
+    NV_ENC_TWO_PASS_QUARTER_RESOLUTION      = 0x1,        /**< Two Pass encoding is enabled where first Pass is quarter resolution */
+    NV_ENC_TWO_PASS_FULL_RESOLUTION         = 0x2,        /**< Two Pass encoding is enabled where first Pass is full resolution */
+} NV_ENC_MULTI_PASS;
+
+/**
+ * Emphasis Levels
+ */
+typedef enum _NV_ENC_EMPHASIS_MAP_LEVEL
+{
+    NV_ENC_EMPHASIS_MAP_LEVEL_0               = 0x0,       /**< Emphasis Map Level 0, for zero Delta QP value */
+    NV_ENC_EMPHASIS_MAP_LEVEL_1               = 0x1,       /**< Emphasis Map Level 1, for very low Delta QP value */
+    NV_ENC_EMPHASIS_MAP_LEVEL_2               = 0x2,       /**< Emphasis Map Level 2, for low Delta QP value */
+    NV_ENC_EMPHASIS_MAP_LEVEL_3               = 0x3,       /**< Emphasis Map Level 3, for medium Delta QP value */
+    NV_ENC_EMPHASIS_MAP_LEVEL_4               = 0x4,       /**< Emphasis Map Level 4, for high Delta QP value */
+    NV_ENC_EMPHASIS_MAP_LEVEL_5               = 0x5        /**< Emphasis Map Level 5, for very high Delta QP value */
+} NV_ENC_EMPHASIS_MAP_LEVEL;
+
+/**
+ * QP MAP MODE
+ */
+typedef enum _NV_ENC_QP_MAP_MODE
+{
+    NV_ENC_QP_MAP_DISABLED               = 0x0,             /**< Value in NV_ENC_PIC_PARAMS::qpDeltaMap has no effect. */
+    NV_ENC_QP_MAP_EMPHASIS               = 0x1,             /**< Value in NV_ENC_PIC_PARAMS::qpDeltaMap will be treated as Emphasis level. Currently this is only supported for H264 */
+    NV_ENC_QP_MAP_DELTA                  = 0x2,             /**< Value in NV_ENC_PIC_PARAMS::qpDeltaMap will be treated as QP delta map. */
+    NV_ENC_QP_MAP                        = 0x3,             /**< Currently not supported. Value in NV_ENC_PIC_PARAMS::qpDeltaMap will be treated as QP value. */
+} NV_ENC_QP_MAP_MODE;
+
+#define NV_ENC_PARAMS_RC_VBR_MINQP              (NV_ENC_PARAMS_RC_MODE)0x4          /**< Deprecated */
+#define NV_ENC_PARAMS_RC_2_PASS_QUALITY         NV_ENC_PARAMS_RC_CBR_LOWDELAY_HQ    /**< Deprecated */
+#define NV_ENC_PARAMS_RC_2_PASS_FRAMESIZE_CAP   NV_ENC_PARAMS_RC_CBR_HQ             /**< Deprecated */
+#define NV_ENC_PARAMS_RC_2_PASS_VBR             NV_ENC_PARAMS_RC_VBR_HQ             /**< Deprecated */
+#define NV_ENC_PARAMS_RC_CBR2                   NV_ENC_PARAMS_RC_CBR                /**< Deprecated */
+
+/**
+ * Input picture structure
+ */
+typedef enum _NV_ENC_PIC_STRUCT
+{
+    NV_ENC_PIC_STRUCT_FRAME             = 0x01,                 /**< Progressive frame */
+    NV_ENC_PIC_STRUCT_FIELD_TOP_BOTTOM  = 0x02,                 /**< Field encoding top field first */
+    NV_ENC_PIC_STRUCT_FIELD_BOTTOM_TOP  = 0x03                  /**< Field encoding bottom field first */
+} NV_ENC_PIC_STRUCT;
+
+/**
+ * Display picture structure
+ * Currently, this enum is only used for deciding the number of clock timestamp sets in Picture Timing SEI / Time Code SEI
+ * Otherwise, this has no impact on encoder behavior
+ */
+typedef enum _NV_ENC_DISPLAY_PIC_STRUCT
+{
+    NV_ENC_PIC_STRUCT_DISPLAY_FRAME             = 0x00,                 /**< Progressive frame */
+    NV_ENC_PIC_STRUCT_DISPLAY_FIELD_TOP_BOTTOM  = 0x01,                 /**< Field encoding top field first */
+    NV_ENC_PIC_STRUCT_DISPLAY_FIELD_BOTTOM_TOP  = 0x02,                 /**< Field encoding bottom field first */
+    NV_ENC_PIC_STRUCT_DISPLAY_FRAME_DOUBLING    = 0x03,                 /**< Frame doubling */
+    NV_ENC_PIC_STRUCT_DISPLAY_FRAME_TRIPLING    = 0x04                  /**< Frame tripling */
+} NV_ENC_DISPLAY_PIC_STRUCT;
+
+/**
+ * Input picture type
+ */
+typedef enum _NV_ENC_PIC_TYPE
+{
+    NV_ENC_PIC_TYPE_P               = 0x0,     /**< Forward predicted */
+    NV_ENC_PIC_TYPE_B               = 0x01,    /**< Bi-directionally predicted picture */
+    NV_ENC_PIC_TYPE_I               = 0x02,    /**< Intra predicted picture */
+    NV_ENC_PIC_TYPE_IDR             = 0x03,    /**< IDR picture */
+    NV_ENC_PIC_TYPE_BI              = 0x04,    /**< Bi-directionally predicted with only Intra MBs */
+    NV_ENC_PIC_TYPE_SKIPPED         = 0x05,    /**< Picture is skipped */
+    NV_ENC_PIC_TYPE_INTRA_REFRESH   = 0x06,    /**< First picture in intra refresh cycle */
+    NV_ENC_PIC_TYPE_NONREF_P        = 0x07,    /**< Non reference P picture */
+    NV_ENC_PIC_TYPE_UNKNOWN         = 0xFF     /**< Picture type unknown */
+} NV_ENC_PIC_TYPE;
+
+/**
+ * Motion vector precisions
+ */
+typedef enum _NV_ENC_MV_PRECISION
+{
+    NV_ENC_MV_PRECISION_DEFAULT     = 0x0,     /**< Driver selects Quarter-Pel motion vector precision by default */
+    NV_ENC_MV_PRECISION_FULL_PEL    = 0x01,    /**< Full-Pel motion vector precision */
+    NV_ENC_MV_PRECISION_HALF_PEL    = 0x02,    /**< Half-Pel motion vector precision */
+    NV_ENC_MV_PRECISION_QUARTER_PEL = 0x03     /**< Quarter-Pel motion vector precision */
+} NV_ENC_MV_PRECISION;
+
+
+/**
+ * Input buffer formats
+ */
+typedef enum _NV_ENC_BUFFER_FORMAT
+{
+    NV_ENC_BUFFER_FORMAT_UNDEFINED                       = 0x00000000,  /**< Undefined buffer format */
+
+    NV_ENC_BUFFER_FORMAT_NV12                            = 0x00000001,  /**< Semi-Planar YUV [Y plane followed by interleaved UV plane] */
+    NV_ENC_BUFFER_FORMAT_YV12                            = 0x00000010,  /**< Planar YUV [Y plane followed by V and U planes] */
+    NV_ENC_BUFFER_FORMAT_IYUV                            = 0x00000100,  /**< Planar YUV [Y plane followed by U and V planes] */
+    NV_ENC_BUFFER_FORMAT_YUV444                          = 0x00001000,  /**< Planar YUV [Y plane followed by U and V planes] */
+    NV_ENC_BUFFER_FORMAT_YUV420_10BIT                    = 0x00010000,  /**< 10 bit Semi-Planar YUV [Y plane followed by interleaved UV plane]. Each pixel of size 2 bytes. Most Significant 10 bits contain pixel data. */
+    NV_ENC_BUFFER_FORMAT_YUV444_10BIT                    = 0x00100000,  /**< 10 bit Planar YUV444 [Y plane followed by U and V planes]. Each pixel of size 2 bytes. Most Significant 10 bits contain pixel data.  */
+    NV_ENC_BUFFER_FORMAT_ARGB                            = 0x01000000,  /**< 8 bit Packed A8R8G8B8. This is a word-ordered format
+                                                                             where a pixel is represented by a 32-bit word with B
+                                                                             in the lowest 8 bits, G in the next 8 bits, R in the
+                                                                             8 bits after that and A in the highest 8 bits. */
+    NV_ENC_BUFFER_FORMAT_ARGB10                          = 0x02000000,  /**< 10 bit Packed A2R10G10B10. This is a word-ordered format
+                                                                             where a pixel is represented by a 32-bit word with B
+                                                                             in the lowest 10 bits, G in the next 10 bits, R in the
+                                                                             10 bits after that and A in the highest 2 bits. */
+    NV_ENC_BUFFER_FORMAT_AYUV                            = 0x04000000,  /**< 8 bit Packed A8Y8U8V8. This is a word-ordered format
+                                                                             where a pixel is represented by a 32-bit word with V
+                                                                             in the lowest 8 bits, U in the next 8 bits, Y in the
+                                                                             8 bits after that and A in the highest 8 bits. */
+    NV_ENC_BUFFER_FORMAT_ABGR                            = 0x10000000,  /**< 8 bit Packed A8B8G8R8. This is a word-ordered format
+                                                                             where a pixel is represented by a 32-bit word with R
+                                                                             in the lowest 8 bits, G in the next 8 bits, B in the
+                                                                             8 bits after that and A in the highest 8 bits. */
+    NV_ENC_BUFFER_FORMAT_ABGR10                          = 0x20000000,  /**< 10 bit Packed A2B10G10R10. This is a word-ordered format
+                                                                             where a pixel is represented by a 32-bit word with R
+                                                                             in the lowest 10 bits, G in the next 10 bits, B in the
+                                                                             10 bits after that and A in the highest 2 bits. */
+    NV_ENC_BUFFER_FORMAT_U8                              = 0x40000000,  /**< Buffer format representing one-dimensional buffer.
+                                                                             This format should be used only when registering the
+                                                                             resource as output buffer, which will be used to write
+                                                                             the encoded bit stream or H.264 ME only mode output. */
+} NV_ENC_BUFFER_FORMAT;
+
+#define NV_ENC_BUFFER_FORMAT_NV12_PL NV_ENC_BUFFER_FORMAT_NV12
+#define NV_ENC_BUFFER_FORMAT_YV12_PL NV_ENC_BUFFER_FORMAT_YV12
+#define NV_ENC_BUFFER_FORMAT_IYUV_PL NV_ENC_BUFFER_FORMAT_IYUV
+#define NV_ENC_BUFFER_FORMAT_YUV444_PL NV_ENC_BUFFER_FORMAT_YUV444
+
+/**
+ * Encoding levels
+ */
+typedef enum _NV_ENC_LEVEL
+{
+    NV_ENC_LEVEL_AUTOSELECT         = 0,
+
+    NV_ENC_LEVEL_H264_1             = 10,
+    NV_ENC_LEVEL_H264_1b            = 9,
+    NV_ENC_LEVEL_H264_11            = 11,
+    NV_ENC_LEVEL_H264_12            = 12,
+    NV_ENC_LEVEL_H264_13            = 13,
+    NV_ENC_LEVEL_H264_2             = 20,
+    NV_ENC_LEVEL_H264_21            = 21,
+    NV_ENC_LEVEL_H264_22            = 22,
+    NV_ENC_LEVEL_H264_3             = 30,
+    NV_ENC_LEVEL_H264_31            = 31,
+    NV_ENC_LEVEL_H264_32            = 32,
+    NV_ENC_LEVEL_H264_4             = 40,
+    NV_ENC_LEVEL_H264_41            = 41,
+    NV_ENC_LEVEL_H264_42            = 42,
+    NV_ENC_LEVEL_H264_5             = 50,
+    NV_ENC_LEVEL_H264_51            = 51,
+    NV_ENC_LEVEL_H264_52            = 52,
+    NV_ENC_LEVEL_H264_60            = 60,
+    NV_ENC_LEVEL_H264_61            = 61,
+    NV_ENC_LEVEL_H264_62            = 62,
+
+    NV_ENC_LEVEL_HEVC_1             = 30,
+    NV_ENC_LEVEL_HEVC_2             = 60,
+    NV_ENC_LEVEL_HEVC_21            = 63,
+    NV_ENC_LEVEL_HEVC_3             = 90,
+    NV_ENC_LEVEL_HEVC_31            = 93,
+    NV_ENC_LEVEL_HEVC_4             = 120,
+    NV_ENC_LEVEL_HEVC_41            = 123,
+    NV_ENC_LEVEL_HEVC_5             = 150,
+    NV_ENC_LEVEL_HEVC_51            = 153,
+    NV_ENC_LEVEL_HEVC_52            = 156,
+    NV_ENC_LEVEL_HEVC_6             = 180,
+    NV_ENC_LEVEL_HEVC_61            = 183,
+    NV_ENC_LEVEL_HEVC_62            = 186,
+
+    NV_ENC_TIER_HEVC_MAIN           = 0,
+    NV_ENC_TIER_HEVC_HIGH           = 1,
+
+    NV_ENC_LEVEL_AV1_2              = 0,
+    NV_ENC_LEVEL_AV1_21             = 1,
+    NV_ENC_LEVEL_AV1_22             = 2,
+    NV_ENC_LEVEL_AV1_23             = 3,
+    NV_ENC_LEVEL_AV1_3              = 4,
+    NV_ENC_LEVEL_AV1_31             = 5,
+    NV_ENC_LEVEL_AV1_32             = 6,
+    NV_ENC_LEVEL_AV1_33             = 7,
+    NV_ENC_LEVEL_AV1_4              = 8,
+    NV_ENC_LEVEL_AV1_41             = 9,
+    NV_ENC_LEVEL_AV1_42             = 10,
+    NV_ENC_LEVEL_AV1_43             = 11,
+    NV_ENC_LEVEL_AV1_5              = 12,
+    NV_ENC_LEVEL_AV1_51             = 13,
+    NV_ENC_LEVEL_AV1_52             = 14,
+    NV_ENC_LEVEL_AV1_53             = 15,
+    NV_ENC_LEVEL_AV1_6              = 16,
+    NV_ENC_LEVEL_AV1_61             = 17,
+    NV_ENC_LEVEL_AV1_62             = 18,
+    NV_ENC_LEVEL_AV1_63             = 19,
+    NV_ENC_LEVEL_AV1_7              = 20,
+    NV_ENC_LEVEL_AV1_71             = 21,
+    NV_ENC_LEVEL_AV1_72             = 22,
+    NV_ENC_LEVEL_AV1_73             = 23,
+    NV_ENC_LEVEL_AV1_AUTOSELECT         ,
+
+    NV_ENC_TIER_AV1_0               = 0,
+    NV_ENC_TIER_AV1_1               = 1
+} NV_ENC_LEVEL;
+
+/**
+ * Error Codes
+ */
+typedef enum _NVENCSTATUS
+{
+    /**
+     * This indicates that API call returned with no errors.
+     */
+    NV_ENC_SUCCESS,
+
+    /**
+     * This indicates that no encode capable devices were detected.
+     */
+    NV_ENC_ERR_NO_ENCODE_DEVICE,
+
+    /**
+     * This indicates that the device passed by the client is not supported.
+     */
+    NV_ENC_ERR_UNSUPPORTED_DEVICE,
+
+    /**
+     * This indicates that the encoder device supplied by the client is not
+     * valid.
+     */
+    NV_ENC_ERR_INVALID_ENCODERDEVICE,
+
+    /**
+     * This indicates that the device passed to the API call is invalid.
+     */
+    NV_ENC_ERR_INVALID_DEVICE,
+
+    /**
+     * This indicates that the device passed to the API call is no longer available
+     * and needs to be reinitialized. The client needs to destroy the current encoder
+     * session by freeing the allocated input and output buffers and destroying the
+     * device, and then create a new encoding session.
+     */
+    NV_ENC_ERR_DEVICE_NOT_EXIST,
+
+    /**
+     * This indicates that one or more of the pointers passed to the API call
+     * is invalid.
+     */
+    NV_ENC_ERR_INVALID_PTR,
+
+    /**
+     * This indicates that the completion event passed in the ::NvEncEncodePicture() call
+     * is invalid.
+     */
+    NV_ENC_ERR_INVALID_EVENT,
+
+    /**
+     * This indicates that one or more of the parameters passed to the API call
+     * is invalid.
+     */
+    NV_ENC_ERR_INVALID_PARAM,
+
+    /**
+     * This indicates that an API call was made in the wrong sequence/order.
+     */
+    NV_ENC_ERR_INVALID_CALL,
+
+    /**
+     * This indicates that the API call failed because it was unable to allocate
+     * enough memory to perform the requested operation.
+     */
+    NV_ENC_ERR_OUT_OF_MEMORY,
+
+    /**
+     * This indicates that the encoder has not been initialized with
+     * ::NvEncInitializeEncoder() or that initialization has failed.
+     * The client cannot allocate input or output buffers or do any encoding
+     * related operation before successfully initializing the encoder.
+     */
+    NV_ENC_ERR_ENCODER_NOT_INITIALIZED,
+
+    /**
+     * This indicates that an unsupported parameter was passed by the client.
+     */
+    NV_ENC_ERR_UNSUPPORTED_PARAM,
+
+    /**
+     * This indicates that ::NvEncLockBitstream() failed to lock the output
+     * buffer. This happens when the client makes a non-blocking lock call to
+     * access the output bitstream by passing the NV_ENC_LOCK_BITSTREAM::doNotWait flag.
+     * This is not a fatal error, and the client should retry the same operation after
+     * a few milliseconds.
+     */
+    NV_ENC_ERR_LOCK_BUSY,
+
+    /**
+     * This indicates that the size of the user buffer passed by the client is
+     * insufficient for the requested operation.
+     */
+    NV_ENC_ERR_NOT_ENOUGH_BUFFER,
+
+    /**
+     * This indicates that an invalid struct version was used by the client.
+     */
+    NV_ENC_ERR_INVALID_VERSION,
+
+    /**
+     * This indicates that ::NvEncMapInputResource() API failed to map the client
+     * provided input resource.
+     */
+    NV_ENC_ERR_MAP_FAILED,
+
+    /**
+     * This indicates that the encode driver requires more input buffers to produce an output
+     * bitstream. If this error is returned from ::NvEncEncodePicture() API, this
+     * is not a fatal error. If the client is encoding with B frames then,
+     * ::NvEncEncodePicture() API might be buffering the input frame for re-ordering.
+     *
+     * A client operating in synchronous mode cannot call ::NvEncLockBitstream()
+     * API on the output bitstream buffer if ::NvEncEncodePicture() returned the
+     * ::NV_ENC_ERR_NEED_MORE_INPUT error code.
+     * The client must continue providing input frames until the encode driver returns
+     * ::NV_ENC_SUCCESS. After receiving ::NV_ENC_SUCCESS status the client can call
+     * ::NvEncLockBitstream() API on the output buffers in the same order in which
+     * it has called ::NvEncEncodePicture().
+     */
+    NV_ENC_ERR_NEED_MORE_INPUT,
+
+    /**
+     * This indicates that the HW encoder is busy encoding and is unable to encode
+     * the input. The client should call ::NvEncEncodePicture() again after a few
+     * milliseconds.
+     */
+    NV_ENC_ERR_ENCODER_BUSY,
+
+    /**
+     * This indicates that the completion event passed in ::NvEncEncodePicture()
+     * API has not been registered with encoder driver using ::NvEncRegisterAsyncEvent().
+     */
+    NV_ENC_ERR_EVENT_NOT_REGISTERD,
+
+    /**
+     * This indicates that an unknown internal error has occurred.
+     */
+    NV_ENC_ERR_GENERIC,
+
+    /**
+     * This indicates that the client is attempting to use a feature
+     * that is not available for the license type for the current system.
+     */
+    NV_ENC_ERR_INCOMPATIBLE_CLIENT_KEY,
+
+    /**
+     * This indicates that the client is attempting to use a feature
+     * that is not implemented for the current version.
+     */
+    NV_ENC_ERR_UNIMPLEMENTED,
+
+    /**
+     * This indicates that the ::NvEncRegisterResource API failed to register the resource.
+     */
+    NV_ENC_ERR_RESOURCE_REGISTER_FAILED,
+
+    /**
+     * This indicates that the client is attempting to unregister a resource
+     * that has not been successfully registered.
+     */
+    NV_ENC_ERR_RESOURCE_NOT_REGISTERED,
+
+    /**
+     * This indicates that the client is attempting to unmap a resource
+     * that has not been successfully mapped.
+     */
+    NV_ENC_ERR_RESOURCE_NOT_MAPPED,
+
+} NVENCSTATUS;
+
+/**
+ * Encode picture flags.
+ */
+typedef enum _NV_ENC_PIC_FLAGS
+{
+    NV_ENC_PIC_FLAG_FORCEINTRA         = 0x1,   /**< Encode the current picture as an Intra picture */
+    NV_ENC_PIC_FLAG_FORCEIDR           = 0x2,   /**< Encode the current picture as an IDR picture.
+                                                     This flag is only valid when Picture type decision is taken by the Encoder
+                                                     [_NV_ENC_INITIALIZE_PARAMS::enablePTD == 1]. */
+    NV_ENC_PIC_FLAG_OUTPUT_SPSPPS      = 0x4,   /**< Write the sequence and picture header in encoded bitstream of the current picture */
+    NV_ENC_PIC_FLAG_EOS                = 0x8,   /**< Indicates end of the input stream */
+} NV_ENC_PIC_FLAGS;
+
+/**
+ * Memory heap to allocate input and output buffers.
+ */
+typedef enum _NV_ENC_MEMORY_HEAP
+{
+    NV_ENC_MEMORY_HEAP_AUTOSELECT      = 0, /**< Memory heap to be decided by the encoder driver based on the usage */
+    NV_ENC_MEMORY_HEAP_VID             = 1, /**< Memory heap is in local video memory */
+    NV_ENC_MEMORY_HEAP_SYSMEM_CACHED   = 2, /**< Memory heap is in cached system memory */
+    NV_ENC_MEMORY_HEAP_SYSMEM_UNCACHED = 3  /**< Memory heap is in uncached system memory */
+} NV_ENC_MEMORY_HEAP;
+
+/**
+ * B-frame used as reference modes
+ */
+typedef enum _NV_ENC_BFRAME_REF_MODE
+{
+    NV_ENC_BFRAME_REF_MODE_DISABLED = 0x0,          /**< B frame is not used for reference */
+    NV_ENC_BFRAME_REF_MODE_EACH     = 0x1,          /**< Each B-frame will be used for reference */
+    NV_ENC_BFRAME_REF_MODE_MIDDLE   = 0x2,          /**< Only the (number of B-frames)/2-th B-frame will be used for reference */
+} NV_ENC_BFRAME_REF_MODE;
+
+/**
+ * H.264 entropy coding modes.
+ */
+typedef enum _NV_ENC_H264_ENTROPY_CODING_MODE
+{
+    NV_ENC_H264_ENTROPY_CODING_MODE_AUTOSELECT = 0x0,   /**< Entropy coding mode is auto selected by the encoder driver */
+    NV_ENC_H264_ENTROPY_CODING_MODE_CABAC      = 0x1,   /**< Entropy coding mode is CABAC */
+    NV_ENC_H264_ENTROPY_CODING_MODE_CAVLC      = 0x2    /**< Entropy coding mode is CAVLC */
+} NV_ENC_H264_ENTROPY_CODING_MODE;
+
+/**
+ * H.264 specific BDirect modes
+ */
+typedef enum _NV_ENC_H264_BDIRECT_MODE
+{
+    NV_ENC_H264_BDIRECT_MODE_AUTOSELECT = 0x0,          /**< BDirect mode is auto selected by the encoder driver */
+    NV_ENC_H264_BDIRECT_MODE_DISABLE    = 0x1,          /**< Disable BDirect mode */
+    NV_ENC_H264_BDIRECT_MODE_TEMPORAL   = 0x2,          /**< Temporal BDirect mode */
+    NV_ENC_H264_BDIRECT_MODE_SPATIAL    = 0x3           /**< Spatial BDirect mode */
+} NV_ENC_H264_BDIRECT_MODE;
+
+/**
+ * H.264 specific FMO usage
+ */
+typedef enum _NV_ENC_H264_FMO_MODE
+{
+    NV_ENC_H264_FMO_AUTOSELECT          = 0x0,          /**< FMO usage is auto selected by the encoder driver */
+    NV_ENC_H264_FMO_ENABLE              = 0x1,          /**< Enable FMO */
+    NV_ENC_H264_FMO_DISABLE             = 0x2,          /**< Disable FMO */
+} NV_ENC_H264_FMO_MODE;
+
+/**
+ * H.264 specific Adaptive Transform modes
+ */
+typedef enum _NV_ENC_H264_ADAPTIVE_TRANSFORM_MODE
+{
+    NV_ENC_H264_ADAPTIVE_TRANSFORM_AUTOSELECT = 0x0,   /**< Adaptive Transform 8x8 mode is auto selected by the encoder driver*/
+    NV_ENC_H264_ADAPTIVE_TRANSFORM_DISABLE    = 0x1,   /**< Adaptive Transform 8x8 mode disabled */
+    NV_ENC_H264_ADAPTIVE_TRANSFORM_ENABLE     = 0x2,   /**< Adaptive Transform 8x8 mode should be used */
+} NV_ENC_H264_ADAPTIVE_TRANSFORM_MODE;
+
+/**
+ * Stereo frame packing modes.
+ */
+typedef enum _NV_ENC_STEREO_PACKING_MODE
+{
+    NV_ENC_STEREO_PACKING_MODE_NONE             = 0x0,  /**< No Stereo packing required */
+    NV_ENC_STEREO_PACKING_MODE_CHECKERBOARD     = 0x1,  /**< Checkerboard mode for packing stereo frames */
+    NV_ENC_STEREO_PACKING_MODE_COLINTERLEAVE    = 0x2,  /**< Column Interleave mode for packing stereo frames */
+    NV_ENC_STEREO_PACKING_MODE_ROWINTERLEAVE    = 0x3,  /**< Row Interleave mode for packing stereo frames */
+    NV_ENC_STEREO_PACKING_MODE_SIDEBYSIDE       = 0x4,  /**< Side-by-side mode for packing stereo frames */
+    NV_ENC_STEREO_PACKING_MODE_TOPBOTTOM        = 0x5,  /**< Top-Bottom mode for packing stereo frames */
+    NV_ENC_STEREO_PACKING_MODE_FRAMESEQ         = 0x6   /**< Frame Sequential mode for packing stereo frames */
+} NV_ENC_STEREO_PACKING_MODE;
+
+/**
+ *  Input Resource type
+ */
+typedef enum _NV_ENC_INPUT_RESOURCE_TYPE
+{
+    NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX          = 0x0,   /**< input resource type is a directx9 surface*/
+    NV_ENC_INPUT_RESOURCE_TYPE_CUDADEVICEPTR    = 0x1,   /**< input resource type is a cuda device pointer surface*/
+    NV_ENC_INPUT_RESOURCE_TYPE_CUDAARRAY        = 0x2,   /**< input resource type is a cuda array surface.
+                                                              This array must be a 2D array and the CUDA_ARRAY3D_SURFACE_LDST
+                                                              flag must have been specified when creating it. */
+    NV_ENC_INPUT_RESOURCE_TYPE_OPENGL_TEX       = 0x3    /**< input resource type is an OpenGL texture */
+} NV_ENC_INPUT_RESOURCE_TYPE;
+
+/**
+ *  Buffer usage
+ */
+typedef enum _NV_ENC_BUFFER_USAGE
+{
+    NV_ENC_INPUT_IMAGE              = 0x0,          /**< Registered surface will be used for input image */
+    NV_ENC_OUTPUT_MOTION_VECTOR     = 0x1,          /**< Registered surface will be used for output of H.264 ME only mode.
+                                                         This buffer usage type is not supported for HEVC ME only mode. */
+    NV_ENC_OUTPUT_BITSTREAM         = 0x2,          /**< Registered surface will be used for output bitstream in encoding */
+} NV_ENC_BUFFER_USAGE;
+
+/**
+ *  Encoder Device type
+ */
+typedef enum _NV_ENC_DEVICE_TYPE
+{
+    NV_ENC_DEVICE_TYPE_DIRECTX          = 0x0,   /**< encode device type is a directx9 device */
+    NV_ENC_DEVICE_TYPE_CUDA             = 0x1,   /**< encode device type is a cuda device */
+    NV_ENC_DEVICE_TYPE_OPENGL           = 0x2    /**< encode device type is an OpenGL device.
+                                                      Use of this device type is supported only on Linux */
+} NV_ENC_DEVICE_TYPE;
+
+/**
+ * Number of reference frames
+ */
+typedef enum _NV_ENC_NUM_REF_FRAMES
+{
+    NV_ENC_NUM_REF_FRAMES_AUTOSELECT       = 0x0,          /**< Number of reference frames is auto selected by the encoder driver */
+    NV_ENC_NUM_REF_FRAMES_1                = 0x1,          /**< Number of reference frames equal to 1 */
+    NV_ENC_NUM_REF_FRAMES_2                = 0x2,          /**< Number of reference frames equal to 2 */
+    NV_ENC_NUM_REF_FRAMES_3                = 0x3,          /**< Number of reference frames equal to 3 */
+    NV_ENC_NUM_REF_FRAMES_4                = 0x4,          /**< Number of reference frames equal to 4 */
+    NV_ENC_NUM_REF_FRAMES_5                = 0x5,          /**< Number of reference frames equal to 5 */
+    NV_ENC_NUM_REF_FRAMES_6                = 0x6,          /**< Number of reference frames equal to 6 */
+    NV_ENC_NUM_REF_FRAMES_7                = 0x7           /**< Number of reference frames equal to 7 */
+} NV_ENC_NUM_REF_FRAMES;
+
+/**
+ * Encoder capabilities enumeration.
+ */
+typedef enum _NV_ENC_CAPS
+{
+    /**
+     * Maximum number of B-Frames supported.
+     */
+    NV_ENC_CAPS_NUM_MAX_BFRAMES,
+
+    /**
+     * Rate control modes supported.
+     * \n The API return value is a bitmask of the values in NV_ENC_PARAMS_RC_MODE.
+     */
+    NV_ENC_CAPS_SUPPORTED_RATECONTROL_MODES,
+
+    /**
+     * Indicates HW support for field mode encoding.
+     * \n 0 : Interlaced mode encoding is not supported.
+     * \n 1 : Interlaced field mode encoding is supported.
+     * \n 2 : Interlaced frame encoding and field mode encoding are both supported.
+     */
+    NV_ENC_CAPS_SUPPORT_FIELD_ENCODING,
+
+    /**
+     * Indicates HW support for monochrome mode encoding.
+     * \n 0 : Monochrome mode not supported.
+     * \n 1 : Monochrome mode supported.
+     */
+    NV_ENC_CAPS_SUPPORT_MONOCHROME,
+
+    /**
+     * Indicates HW support for FMO.
+     * \n 0 : FMO not supported.
+     * \n 1 : FMO supported.
+     */
+    NV_ENC_CAPS_SUPPORT_FMO,
+
+    /**
+     * Indicates HW capability for Quarter pel motion estimation.
+     * \n 0 : Quarter-Pel Motion Estimation not supported.
+     * \n 1 : Quarter-Pel Motion Estimation supported.
+     */
+    NV_ENC_CAPS_SUPPORT_QPELMV,
+
+    /**
+     * H.264 specific. Indicates HW support for BDirect modes.
+     * \n 0 : BDirect mode encoding not supported.
+     * \n 1 : BDirect mode encoding supported.
+     */
+    NV_ENC_CAPS_SUPPORT_BDIRECT_MODE,
+
+    /**
+     * H264 specific. Indicates HW support for CABAC entropy coding mode.
+     * \n 0 : CABAC entropy coding not supported.
+     * \n 1 : CABAC entropy coding supported.
+     */
+    NV_ENC_CAPS_SUPPORT_CABAC,
+
+    /**
+     * Indicates HW support for Adaptive Transform.
+     * \n 0 : Adaptive Transform not supported.
+     * \n 1 : Adaptive Transform supported.
+     */
+    NV_ENC_CAPS_SUPPORT_ADAPTIVE_TRANSFORM,
+
+    /**
+     * Indicates HW support for Multi View Coding.
+     * \n 0 : Multi View Coding not supported.
+     * \n 1 : Multi View Coding supported.
+     */
+    NV_ENC_CAPS_SUPPORT_STEREO_MVC,
+
+    /**
+     * Indicates HW support for encoding Temporal layers.
+     * \n 0 : Encoding Temporal layers not supported.
+     * \n 1 : Encoding Temporal layers supported.
+     */
+    NV_ENC_CAPS_NUM_MAX_TEMPORAL_LAYERS,
+
+    /**
+     * Indicates HW support for Hierarchical P frames.
+     * \n 0 : Hierarchical P frames not supported.
+     * \n 1 : Hierarchical P frames supported.
+     */
+    NV_ENC_CAPS_SUPPORT_HIERARCHICAL_PFRAMES,
+
+    /**
+     * Indicates HW support for Hierarchical B frames.
+     * \n 0 : Hierarchical B frames not supported.
+     * \n 1 : Hierarchical B frames supported.
+     */
+    NV_ENC_CAPS_SUPPORT_HIERARCHICAL_BFRAMES,
+
+    /**
+     * Maximum Encoding level supported (See ::NV_ENC_LEVEL for details).
+     */
+    NV_ENC_CAPS_LEVEL_MAX,
+
+    /**
+     * Minimum Encoding level supported (See ::NV_ENC_LEVEL for details).
+     */
+    NV_ENC_CAPS_LEVEL_MIN,
+
+    /**
+     * Indicates HW support for separate colour plane encoding.
+     * \n 0 : Separate colour plane encoding not supported.
+     * \n 1 : Separate colour plane encoding supported.
+     */
+    NV_ENC_CAPS_SEPARATE_COLOUR_PLANE,
+
+    /**
+     * Maximum output width supported.
+     */
+    NV_ENC_CAPS_WIDTH_MAX,
+
+    /**
+     * Maximum output height supported.
+     */
+    NV_ENC_CAPS_HEIGHT_MAX,
+
+    /**
+     * Indicates Temporal Scalability Support.
+     * \n 0 : Temporal SVC encoding not supported.
+     * \n 1 : Temporal SVC encoding supported.
+     */
+    NV_ENC_CAPS_SUPPORT_TEMPORAL_SVC,
+
+    /**
+     * Indicates Dynamic Encode Resolution Change Support.
+     * Support added from NvEncodeAPI version 2.0.
+     * \n 0 : Dynamic Encode Resolution Change not supported.
+     * \n 1 : Dynamic Encode Resolution Change supported.
+     */
+    NV_ENC_CAPS_SUPPORT_DYN_RES_CHANGE,
+
+    /**
+     * Indicates Dynamic Encode Bitrate Change Support.
+     * Support added from NvEncodeAPI version 2.0.
+     * \n 0 : Dynamic Encode bitrate change not supported.
+     * \n 1 : Dynamic Encode bitrate change supported.
+     */
+    NV_ENC_CAPS_SUPPORT_DYN_BITRATE_CHANGE,
+
+    /**
+     * Indicates Forcing Constant QP On The Fly Support.
+     * Support added from NvEncodeAPI version 2.0.
+     * \n 0 : Forcing constant QP on the fly not supported.
+     * \n 1 : Forcing constant QP on the fly supported.
+     */
+    NV_ENC_CAPS_SUPPORT_DYN_FORCE_CONSTQP,
+
+    /**
+     * Indicates Dynamic rate control mode Change Support.
+     * \n 0 : Dynamic rate control mode change not supported.
+     * \n 1 : Dynamic rate control mode change supported.
+     */
+    NV_ENC_CAPS_SUPPORT_DYN_RCMODE_CHANGE,
+
+    /**
+     * Indicates Subframe readback support for slice-based encoding. If this feature is supported, it can be enabled by setting enableSubFrameWrite = 1.
+     * \n 0 : Subframe readback not supported.
+     * \n 1 : Subframe readback supported.
+     */
+    NV_ENC_CAPS_SUPPORT_SUBFRAME_READBACK,
+
+    /**
+     * Indicates Constrained Encoding mode support.
+     * Support added from NvEncodeAPI version 2.0.
+     * \n 0 : Constrained encoding mode not supported.
+     * \n 1 : Constrained encoding mode supported.
+     * If this mode is supported client can enable this during initialization.
+     * Client can then force a picture to be coded as constrained picture where
+     * in-loop filtering is disabled across slice boundaries and prediction vectors for inter
+     * macroblocks in each slice will be restricted to the slice region.
+     */
+    NV_ENC_CAPS_SUPPORT_CONSTRAINED_ENCODING,
+
+    /**
+     * Indicates Intra Refresh Mode Support.
+     * Support added from NvEncodeAPI version 2.0.
+     * \n 0 : Intra Refresh Mode not supported.
+     * \n 1 : Intra Refresh Mode supported.
+     */
+    NV_ENC_CAPS_SUPPORT_INTRA_REFRESH,
+
+    /**
+     * Indicates Custom VBV Buffer Size support. It can be used for capping frame size.
+     * Support added from NvEncodeAPI version 2.0.
+     * \n 0 : Custom VBV buffer size specification from client, not supported.
+     * \n 1 : Custom VBV buffer size specification from client, supported.
+     */
+    NV_ENC_CAPS_SUPPORT_CUSTOM_VBV_BUF_SIZE,
+
+    /**
+     * Indicates Dynamic Slice Mode Support.
+     * Support added from NvEncodeAPI version 2.0.
+     * \n 0 : Dynamic Slice Mode not supported.
+     * \n 1 : Dynamic Slice Mode supported.
+     */
+    NV_ENC_CAPS_SUPPORT_DYNAMIC_SLICE_MODE,
+
+    /**
+     * Indicates Reference Picture Invalidation Support.
+     * Support added from NvEncodeAPI version 2.0.
+     * \n 0 : Reference Picture Invalidation not supported.
+     * \n 1 : Reference Picture Invalidation supported.
+     */
+    NV_ENC_CAPS_SUPPORT_REF_PIC_INVALIDATION,
+
+    /**
+     * Indicates support for Pre-Processing.
+     * The API return value is a bitmask of the values defined in ::NV_ENC_PREPROC_FLAGS
+     */
+    NV_ENC_CAPS_PREPROC_SUPPORT,
+
+    /**
+     * Indicates support for Async mode.
+     * \n 0 : Async Encode mode not supported.
+     * \n 1 : Async Encode mode supported.
+     */
+    NV_ENC_CAPS_ASYNC_ENCODE_SUPPORT,
+
+    /**
+     * Maximum MBs per frame supported.
+     */
+    NV_ENC_CAPS_MB_NUM_MAX,
+
+    /**
+     * Maximum aggregate throughput in MBs per sec.
+     */
+    NV_ENC_CAPS_MB_PER_SEC_MAX,
+
+    /**
+     * Indicates HW support for YUV444 mode encoding.
+     * \n 0 : YUV444 mode encoding not supported.
+     * \n 1 : YUV444 mode encoding supported.
+     */
+    NV_ENC_CAPS_SUPPORT_YUV444_ENCODE,
+
+    /**
+     * Indicates HW support for lossless encoding.
+     * \n 0 : lossless encoding not supported.
+     * \n 1 : lossless encoding supported.
+     */
+    NV_ENC_CAPS_SUPPORT_LOSSLESS_ENCODE,
+
+    /**
+     * Indicates HW support for Sample Adaptive Offset.
+     * \n 0 : SAO not supported.
+     * \n 1 : SAO encoding supported.
+     */
+    NV_ENC_CAPS_SUPPORT_SAO,
+
+    /**
+     * Indicates HW support for Motion Estimation Only Mode.
+     * \n 0 : MEOnly Mode not supported.
+     * \n 1 : MEOnly Mode supported for I and P frames.
+     * \n 2 : MEOnly Mode supported for I, P and B frames.
+     */
+    NV_ENC_CAPS_SUPPORT_MEONLY_MODE,
+
+    /**
+     * Indicates HW support for lookahead encoding (enableLookahead=1).
+     * \n 0 : Lookahead not supported.
+     * \n 1 : Lookahead supported.
+     */
+    NV_ENC_CAPS_SUPPORT_LOOKAHEAD,
+
+    /**
+     * Indicates HW support for temporal AQ encoding (enableTemporalAQ=1).
+     * \n 0 : Temporal AQ not supported.
+     * \n 1 : Temporal AQ supported.
+     */
+    NV_ENC_CAPS_SUPPORT_TEMPORAL_AQ,
+    /**
+     * Indicates HW support for 10 bit encoding.
+     * \n 0 : 10 bit encoding not supported.
+     * \n 1 : 10 bit encoding supported.
+     */
+    NV_ENC_CAPS_SUPPORT_10BIT_ENCODE,
+    /**
+     * Maximum number of Long Term Reference frames supported
+     */
+    NV_ENC_CAPS_NUM_MAX_LTR_FRAMES,
+
+    /**
+     * Indicates HW support for Weighted Prediction.
+     * \n 0 : Weighted Prediction not supported.
+     * \n 1 : Weighted Prediction supported.
+     */
+    NV_ENC_CAPS_SUPPORT_WEIGHTED_PREDICTION,
+
+
+    /**
+     * On managed (vGPU) platforms (Windows only), this API, in conjunction with other GRID Management APIs, can be used
+     * to estimate the residual capacity of the hardware encoder on the GPU as a percentage of the total available encoder capacity.
+     * This API can be called at any time; i.e. during the encode session or before opening the encode session.
+     * If the available encoder capacity is returned as zero, applications may choose to switch to software encoding
+     * and continue to call this API (e.g. polling once per second) until capacity becomes available.
+     *
+     * On bare metal (non-virtualized GPU) and linux platforms, this API always returns 100.
+     */
+    NV_ENC_CAPS_DYNAMIC_QUERY_ENCODER_CAPACITY,
+
+    /**
+     * Indicates B as reference support.
+     * \n 0 : B as reference is not supported.
+     * \n 1 : each B-Frame as reference is supported.
+     * \n 2 : only Middle B-frame as reference is supported.
+     */
+    NV_ENC_CAPS_SUPPORT_BFRAME_REF_MODE,
+
+    /**
+     * Indicates HW support for Emphasis Level Map based delta QP computation.
+     * \n 0 : Emphasis Level Map based delta QP not supported.
+     * \n 1 : Emphasis Level Map based delta QP is supported.
+     */
+    NV_ENC_CAPS_SUPPORT_EMPHASIS_LEVEL_MAP,
+
+    /**
+     * Minimum input width supported.
+     */
+    NV_ENC_CAPS_WIDTH_MIN,
+
+    /**
+     * Minimum input height supported.
+     */
+    NV_ENC_CAPS_HEIGHT_MIN,
+
+    /**
+     * Indicates HW support for multiple reference frames.
+     */
+    NV_ENC_CAPS_SUPPORT_MULTIPLE_REF_FRAMES,
+
+    /**
+     * Indicates HW support for HEVC with alpha encoding.
+     * \n 0 : HEVC with alpha encoding not supported.
+     * \n 1 : HEVC with alpha encoding is supported.
+     */
+    NV_ENC_CAPS_SUPPORT_ALPHA_LAYER_ENCODING,
+
+    /**
+     * Indicates number of Encoding engines present on GPU.
+     */
+    NV_ENC_CAPS_NUM_ENCODER_ENGINES,
+
+    /**
+     * Indicates single slice intra refresh support.
+     */
+    NV_ENC_CAPS_SINGLE_SLICE_INTRA_REFRESH,
+
+    /**
+     * Reserved - Not to be used by clients.
+     */
+    NV_ENC_CAPS_EXPOSED_COUNT
+
+} NV_ENC_CAPS;
+
+/**
+ *  HEVC CU SIZE
+ */
+typedef enum _NV_ENC_HEVC_CUSIZE
+{
+    NV_ENC_HEVC_CUSIZE_AUTOSELECT = 0,
+    NV_ENC_HEVC_CUSIZE_8x8        = 1,
+    NV_ENC_HEVC_CUSIZE_16x16      = 2,
+    NV_ENC_HEVC_CUSIZE_32x32      = 3,
+    NV_ENC_HEVC_CUSIZE_64x64      = 4,
+}NV_ENC_HEVC_CUSIZE;
+
+/**
+*  AV1 PART SIZE
+*/
+typedef enum _NV_ENC_AV1_PART_SIZE
+{
+    NV_ENC_AV1_PART_SIZE_AUTOSELECT    = 0,
+    NV_ENC_AV1_PART_SIZE_4x4           = 1,
+    NV_ENC_AV1_PART_SIZE_8x8           = 2,
+    NV_ENC_AV1_PART_SIZE_16x16         = 3,
+    NV_ENC_AV1_PART_SIZE_32x32         = 4,
+    NV_ENC_AV1_PART_SIZE_64x64         = 5,
+}NV_ENC_AV1_PART_SIZE;
+
+/**
+*  Enums related to fields in VUI parameters.
+*/
+typedef enum _NV_ENC_VUI_VIDEO_FORMAT
+{
+    NV_ENC_VUI_VIDEO_FORMAT_COMPONENT   = 0,
+    NV_ENC_VUI_VIDEO_FORMAT_PAL         = 1,
+    NV_ENC_VUI_VIDEO_FORMAT_NTSC        = 2,
+    NV_ENC_VUI_VIDEO_FORMAT_SECAM       = 3,
+    NV_ENC_VUI_VIDEO_FORMAT_MAC         = 4,
+    NV_ENC_VUI_VIDEO_FORMAT_UNSPECIFIED = 5,
+}NV_ENC_VUI_VIDEO_FORMAT;
+
+typedef enum _NV_ENC_VUI_COLOR_PRIMARIES
+{
+    NV_ENC_VUI_COLOR_PRIMARIES_UNDEFINED   = 0,
+    NV_ENC_VUI_COLOR_PRIMARIES_BT709       = 1,
+    NV_ENC_VUI_COLOR_PRIMARIES_UNSPECIFIED = 2,
+    NV_ENC_VUI_COLOR_PRIMARIES_RESERVED    = 3,
+    NV_ENC_VUI_COLOR_PRIMARIES_BT470M      = 4,
+    NV_ENC_VUI_COLOR_PRIMARIES_BT470BG     = 5,
+    NV_ENC_VUI_COLOR_PRIMARIES_SMPTE170M   = 6,
+    NV_ENC_VUI_COLOR_PRIMARIES_SMPTE240M   = 7,
+    NV_ENC_VUI_COLOR_PRIMARIES_FILM        = 8,
+    NV_ENC_VUI_COLOR_PRIMARIES_BT2020      = 9,
+    NV_ENC_VUI_COLOR_PRIMARIES_SMPTE428    = 10,
+    NV_ENC_VUI_COLOR_PRIMARIES_SMPTE431    = 11,
+    NV_ENC_VUI_COLOR_PRIMARIES_SMPTE432    = 12,
+    NV_ENC_VUI_COLOR_PRIMARIES_JEDEC_P22   = 22,
+}NV_ENC_VUI_COLOR_PRIMARIES;
+
+typedef enum _NV_ENC_VUI_TRANSFER_CHARACTERISTIC
+{
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_UNDEFINED     = 0,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_BT709         = 1,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_UNSPECIFIED   = 2,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_RESERVED      = 3,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_BT470M        = 4,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_BT470BG       = 5,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_SMPTE170M     = 6,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_SMPTE240M     = 7,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_LINEAR        = 8,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_LOG           = 9,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_LOG_SQRT      = 10,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_IEC61966_2_4  = 11,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_BT1361_ECG    = 12,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_SRGB          = 13,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_BT2020_10     = 14,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_BT2020_12     = 15,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_SMPTE2084     = 16,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_SMPTE428      = 17,
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC_ARIB_STD_B67  = 18,
+}NV_ENC_VUI_TRANSFER_CHARACTERISTIC;
+
+typedef enum _NV_ENC_VUI_MATRIX_COEFFS
+{
+    NV_ENC_VUI_MATRIX_COEFFS_RGB         = 0,
+    NV_ENC_VUI_MATRIX_COEFFS_BT709       = 1,
+    NV_ENC_VUI_MATRIX_COEFFS_UNSPECIFIED = 2,
+    NV_ENC_VUI_MATRIX_COEFFS_RESERVED    = 3,
+    NV_ENC_VUI_MATRIX_COEFFS_FCC         = 4,
+    NV_ENC_VUI_MATRIX_COEFFS_BT470BG     = 5,
+    NV_ENC_VUI_MATRIX_COEFFS_SMPTE170M   = 6,
+    NV_ENC_VUI_MATRIX_COEFFS_SMPTE240M   = 7,
+    NV_ENC_VUI_MATRIX_COEFFS_YCGCO       = 8,
+    NV_ENC_VUI_MATRIX_COEFFS_BT2020_NCL  = 9,
+    NV_ENC_VUI_MATRIX_COEFFS_BT2020_CL   = 10,
+    NV_ENC_VUI_MATRIX_COEFFS_SMPTE2085   = 11,
+}NV_ENC_VUI_MATRIX_COEFFS;
+
+/**
+ * Input struct for querying Encoding capabilities.
+ */
+typedef struct _NV_ENC_CAPS_PARAM
+{
+    uint32_t version;                                  /**< [in]: Struct version. Must be set to ::NV_ENC_CAPS_PARAM_VER */
+    NV_ENC_CAPS  capsToQuery;                          /**< [in]: Specifies the encode capability to be queried. Client should pass a member of the ::NV_ENC_CAPS enum. */
+    uint32_t reserved[62];                             /**< [in]: Reserved and must be set to 0 */
+} NV_ENC_CAPS_PARAM;
+
+/** NV_ENC_CAPS_PARAM struct version. */
+#define NV_ENC_CAPS_PARAM_VER NVENCAPI_STRUCT_VERSION(1)
+
+
+/**
+ * Encoder Output parameters
+ */
+typedef struct _NV_ENC_ENCODE_OUT_PARAMS
+{
+    uint32_t                  version;                 /**< [out]: Struct version. */
+    uint32_t                  bitstreamSizeInBytes;    /**< [out]: Encoded bitstream size in bytes */
+    uint32_t                  reserved[62];            /**< [out]: Reserved and must be set to 0 */
+} NV_ENC_ENCODE_OUT_PARAMS;
+
+/** NV_ENC_ENCODE_OUT_PARAMS struct version. */
+#define NV_ENC_ENCODE_OUT_PARAMS_VER NVENCAPI_STRUCT_VERSION(1)
+
+/**
+ * Creation parameters for input buffer.
+ */
+typedef struct _NV_ENC_CREATE_INPUT_BUFFER
+{
+    uint32_t                  version;                 /**< [in]: Struct version. Must be set to ::NV_ENC_CREATE_INPUT_BUFFER_VER */
+    uint32_t                  width;                   /**< [in]: Input frame width */
+    uint32_t                  height;                  /**< [in]: Input frame height */
+    NV_ENC_MEMORY_HEAP        memoryHeap;              /**< [in]: Deprecated. Do not use */
+    NV_ENC_BUFFER_FORMAT      bufferFmt;               /**< [in]: Input buffer format */
+    uint32_t                  reserved;                /**< [in]: Reserved and must be set to 0 */
+    NV_ENC_INPUT_PTR          inputBuffer;             /**< [out]: Pointer to input buffer */
+    void*                     pSysMemBuffer;           /**< [in]: Pointer to existing system memory buffer */
+    uint32_t                  reserved1[57];           /**< [in]: Reserved and must be set to 0 */
+    void*                     reserved2[63];           /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_CREATE_INPUT_BUFFER;
+
+/** NV_ENC_CREATE_INPUT_BUFFER struct version. */
+#define NV_ENC_CREATE_INPUT_BUFFER_VER NVENCAPI_STRUCT_VERSION(1)
+
+/**
+ * Creation parameters for output bitstream buffer.
+ */
+typedef struct _NV_ENC_CREATE_BITSTREAM_BUFFER
+{
+    uint32_t              version;                     /**< [in]: Struct version. Must be set to ::NV_ENC_CREATE_BITSTREAM_BUFFER_VER */
+    uint32_t              size;                        /**< [in]: Deprecated. Do not use */
+    NV_ENC_MEMORY_HEAP    memoryHeap;                  /**< [in]: Deprecated. Do not use */
+    uint32_t              reserved;                    /**< [in]: Reserved and must be set to 0 */
+    NV_ENC_OUTPUT_PTR     bitstreamBuffer;             /**< [out]: Pointer to the output bitstream buffer */
+    void*                 bitstreamBufferPtr;          /**< [out]: Reserved and should not be used */
+    uint32_t              reserved1[58];               /**< [in]: Reserved and should be set to 0 */
+    void*                 reserved2[64];               /**< [in]: Reserved and should be set to NULL */
+} NV_ENC_CREATE_BITSTREAM_BUFFER;
+
+/** NV_ENC_CREATE_BITSTREAM_BUFFER struct version. */
+#define NV_ENC_CREATE_BITSTREAM_BUFFER_VER NVENCAPI_STRUCT_VERSION(1)
+
+/**
+ * Structs needed for ME only mode.
+ */
+typedef struct _NV_ENC_MVECTOR
+{
+    int16_t             mvx;               /**< the x component of MV in quarter-pel units */
+    int16_t             mvy;               /**< the y component of MV in quarter-pel units */
+} NV_ENC_MVECTOR;
+
+/**
+ * Motion vector structure per macroblock for H264 motion estimation.
+ */
+typedef struct _NV_ENC_H264_MV_DATA
+{
+    NV_ENC_MVECTOR      mv[4];             /**< up to 4 vectors for 8x8 partition */
+    uint8_t             mbType;            /**< 0 (I), 1 (P), 2 (IPCM), 3 (B) */
+    uint8_t             partitionType;     /**< Specifies the block partition type. 0:16x16, 1:8x8, 2:16x8, 3:8x16 */
+    uint16_t            reserved;          /**< reserved padding for alignment */
+    uint32_t            mbCost;
+} NV_ENC_H264_MV_DATA;
+
+/**
+ * Motion vector structure per CU for HEVC motion estimation.
+ */
+typedef struct _NV_ENC_HEVC_MV_DATA
+{
+    NV_ENC_MVECTOR    mv[4];               /**< up to 4 vectors within a CU */
+    uint8_t           cuType;              /**< 0 (I), 1(P) */
+    uint8_t           cuSize;              /**< 0: 8x8, 1: 16x16, 2: 32x32, 3: 64x64 */
+    uint8_t           partitionMode;       /**< The CU partition mode
+                                                0 (2Nx2N), 1 (2NxN), 2(Nx2N), 3 (NxN),
+                                                4 (2NxnU), 5 (2NxnD), 6(nLx2N), 7 (nRx2N) */
+    uint8_t           lastCUInCTB;         /**< Marker to separate CUs in the current CTB from CUs in the next CTB */
+} NV_ENC_HEVC_MV_DATA;
+
+/**
+ * Creation parameters for output motion vector buffer for ME only mode.
+ */
+typedef struct _NV_ENC_CREATE_MV_BUFFER
+{
+    uint32_t            version;           /**< [in]: Struct version. Must be set to NV_ENC_CREATE_MV_BUFFER_VER */
+    NV_ENC_OUTPUT_PTR   mvBuffer;          /**< [out]: Pointer to the output motion vector buffer */
+    uint32_t            reserved1[255];    /**< [in]: Reserved and should be set to 0 */
+    void*               reserved2[63];     /**< [in]: Reserved and should be set to NULL */
+} NV_ENC_CREATE_MV_BUFFER;
+
+/** NV_ENC_CREATE_MV_BUFFER struct version. */
+#define NV_ENC_CREATE_MV_BUFFER_VER NVENCAPI_STRUCT_VERSION(1)
+
+/**
+ * QP value for frames
+ */
+typedef struct _NV_ENC_QP
+{
+    uint32_t        qpInterP;     /**< [in]: Specifies QP value for P-frame. Even though this field is uint32_t for legacy reasons, the client should treat this as a signed parameter(int32_t) for cases in which negative QP values are to be specified. */
+    uint32_t        qpInterB;     /**< [in]: Specifies QP value for B-frame. Even though this field is uint32_t for legacy reasons, the client should treat this as a signed parameter(int32_t) for cases in which negative QP values are to be specified. */
+    uint32_t        qpIntra;      /**< [in]: Specifies QP value for Intra Frame. Even though this field is uint32_t for legacy reasons, the client should treat this as a signed parameter(int32_t) for cases in which negative QP values are to be specified. */
+} NV_ENC_QP;
+
+/**
+ * Rate Control Configuration Parameters
+ */
+ typedef struct _NV_ENC_RC_PARAMS
+ {
+    uint32_t                        version;
+    NV_ENC_PARAMS_RC_MODE           rateControlMode;                             /**< [in]: Specifies the rate control mode. Check support for various rate control modes using ::NV_ENC_CAPS_SUPPORTED_RATECONTROL_MODES caps. */
+    NV_ENC_QP                       constQP;                                     /**< [in]: Specifies the initial QP to be used for encoding, these values would be used for all frames if in Constant QP mode. */
+    uint32_t                        averageBitRate;                              /**< [in]: Specifies the average bitrate(in bits/sec) used for encoding. */
+    uint32_t                        maxBitRate;                                  /**< [in]: Specifies the maximum bitrate for the encoded output. This is used for VBR and ignored for CBR mode. */
+    uint32_t                        vbvBufferSize;                               /**< [in]: Specifies the VBV(HRD) buffer size in bits. Set 0 to use the default VBV buffer size. */
+    uint32_t                        vbvInitialDelay;                             /**< [in]: Specifies the VBV(HRD) initial delay in bits. Set 0 to use the default VBV initial delay. */
+    uint32_t                        enableMinQP          :1;                     /**< [in]: Set this to 1 if minimum QP is used for rate control. */
+    uint32_t                        enableMaxQP          :1;                     /**< [in]: Set this to 1 if maximum QP is used for rate control. */
+    uint32_t                        enableInitialRCQP    :1;                     /**< [in]: Set this to 1 if user supplied initial QP is used for rate control. */
+    uint32_t                        enableAQ             :1;                     /**< [in]: Set this to 1 to enable adaptive quantization (Spatial). */
+    uint32_t                        reservedBitField1    :1;                     /**< [in]: Reserved bitfields and must be set to 0. */
+    uint32_t                        enableLookahead      :1;                     /**< [in]: Set this to 1 to enable lookahead with depth <lookaheadDepth> (if lookahead is enabled, input frames must remain available to the encoder until encode completion) */
+    uint32_t                        disableIadapt        :1;                     /**< [in]: Set this to 1 to disable adaptive I-frame insertion at scene cuts (only has an effect when lookahead is enabled) */
+    uint32_t                        disableBadapt        :1;                     /**< [in]: Set this to 1 to disable adaptive B-frame decision (only has an effect when lookahead is enabled) */
+    uint32_t                        enableTemporalAQ     :1;                     /**< [in]: Set this to 1 to enable temporal AQ */
+    uint32_t                        zeroReorderDelay     :1;                     /**< [in]: Set this to 1 to indicate zero latency operation (no reordering delay, num_reorder_frames=0) */
+    uint32_t                        enableNonRefP        :1;                     /**< [in]: Set this to 1 to enable automatic insertion of non-reference P-frames (no effect if enablePTD=0) */
+    uint32_t                        strictGOPTarget      :1;                     /**< [in]: Set this to 1 to minimize GOP-to-GOP rate fluctuations */
+    uint32_t                        aqStrength           :4;                     /**< [in]: When AQ (Spatial) is enabled (i.e. NV_ENC_RC_PARAMS::enableAQ is set), this field is used to specify AQ strength. AQ strength scale is from 1 (low) - 15 (aggressive).
+                                                                                            If not set, strength is auto selected by driver. */
+    uint32_t                        reservedBitFields    :16;                    /**< [in]: Reserved bitfields and must be set to 0 */
+    NV_ENC_QP                       minQP;                                       /**< [in]: Specifies the minimum QP used for rate control. Client must set NV_ENC_RC_PARAMS::enableMinQP to 1. */
+    NV_ENC_QP                       maxQP;                                       /**< [in]: Specifies the maximum QP used for rate control. Client must set NV_ENC_RC_PARAMS::enableMaxQP to 1. */
+    NV_ENC_QP                       initialRCQP;                                 /**< [in]: Specifies the initial QP used for rate control. Client must set NV_ENC_RC_PARAMS::enableInitialRCQP to 1. */
+    uint32_t                        temporallayerIdxMask;                        /**< [in]: Specifies the temporal layers (as a bitmask) whose QPs have changed. Valid max bitmask is [2^NV_ENC_CAPS_NUM_MAX_TEMPORAL_LAYERS - 1].
+                                                                                            Applicable only for constant QP mode (NV_ENC_RC_PARAMS::rateControlMode = NV_ENC_PARAMS_RC_CONSTQP). */
+    uint8_t                         temporalLayerQP[8];                          /**< [in]: Specifies the temporal layer QPs used for rate control. Temporal layer index is used as the array index.
+                                                                                            Applicable only for constant QP mode (NV_ENC_RC_PARAMS::rateControlMode = NV_ENC_PARAMS_RC_CONSTQP). */
+    uint8_t                         targetQuality;                               /**< [in]: Target CQ (Constant Quality) level for VBR mode (range 0-51 with 0-automatic)  */
+    uint8_t                         targetQualityLSB;                            /**< [in]: Fractional part of target quality (as 8.8 fixed point format) */
+    uint16_t                        lookaheadDepth;                              /**< [in]: Maximum depth of lookahead with range 0-(31 - number of B frames).
+                                                                                            lookaheadDepth is only used if enableLookahead=1.*/
+    uint8_t                         lowDelayKeyFrameScale;                       /**< [in]: Specifies the ratio of I frame bits to P frame bits in case of single frame VBV and CBR rate control mode.
+                                                                                            It is set to 2 by default for low latency tuning info and 1 by default for ultra low latency tuning info. */
+    int8_t                          yDcQPIndexOffset;                            /**< [in]: Specifies the value of 'deltaQ_y_dc' in AV1.*/
+    int8_t                          uDcQPIndexOffset;                            /**< [in]: Specifies the value of 'deltaQ_u_dc' in AV1.*/
+    int8_t                          vDcQPIndexOffset;                            /**< [in]: Specifies the value of 'deltaQ_v_dc' in AV1 (for future use only - deltaQ_v_dc is currently always internally set to same value as deltaQ_u_dc). */
+    NV_ENC_QP_MAP_MODE              qpMapMode;                                   /**< [in]: This flag is used to interpret values in array specified by NV_ENC_PIC_PARAMS::qpDeltaMap.
+                                                                                            Set this to NV_ENC_QP_MAP_EMPHASIS to treat values specified by NV_ENC_PIC_PARAMS::qpDeltaMap as Emphasis Level Map.
+                                                                                            Emphasis Level can be assigned any value specified in enum NV_ENC_EMPHASIS_MAP_LEVEL.
+                                                                                            Emphasis Level Map is used to specify regions to be encoded at varying levels of quality.
+                                                                                            The hardware encoder adjusts the quantization within the image as per the provided emphasis map,
+                                                                                            by adjusting the quantization parameter (QP) assigned to each macroblock. This adjustment is commonly called "Delta QP".
+                                                                                            The adjustment depends on the absolute QP decided by the rate control algorithm, and is applied after the rate control has decided each macroblock's QP.
+                                                                                            Since the Delta QP overrides rate control, enabling Emphasis Level Map may violate bitrate and VBV buffer size constraints.
+                                                                                            Emphasis Level Map is useful in situations where client has a priori knowledge of the image complexity (e.g. via use of NVFBC's Classification feature) and encoding those high-complexity areas at higher quality (lower QP) is important, even at the possible cost of violating bitrate/VBV buffer size constraints
+                                                                                            This feature is not supported when AQ (Spatial/Temporal) is enabled.
+                                                                                            This feature is only supported for H264 codec currently.
+
+                                                                                            Set this to NV_ENC_QP_MAP_DELTA to treat values specified by NV_ENC_PIC_PARAMS::qpDeltaMap as QP Delta. This specifies QP modifier to be applied on top of the QP chosen by rate control
+
+                                                                                            Set this to NV_ENC_QP_MAP_DISABLED to ignore NV_ENC_PIC_PARAMS::qpDeltaMap values. In this case, qpDeltaMap should be set to NULL.
+
+                                                                                            Other values are reserved for future use.*/
+    NV_ENC_MULTI_PASS               multiPass;                                    /**< [in]: This flag is used to enable multi-pass encoding for a given ::NV_ENC_PARAMS_RC_MODE. This flag is not valid for H264 and HEVC MEOnly mode */
+    uint32_t                        alphaLayerBitrateRatio;                       /**< [in]: Specifies the ratio in which bitrate should be split between base and alpha layer. A value 'x' for this field will split the target bitrate in a ratio of x : 1 between base and alpha layer.
+                                                                                             The default split ratio is 15.*/
+    int8_t                          cbQPIndexOffset;                              /**< [in]: Specifies the value of 'chroma_qp_index_offset' in H264 / 'pps_cb_qp_offset' in HEVC / 'deltaQ_u_ac' in AV1.*/
+    int8_t                          crQPIndexOffset;                              /**< [in]: Specifies the value of 'second_chroma_qp_index_offset' in H264 / 'pps_cr_qp_offset' in HEVC / 'deltaQ_v_ac' in AV1 (for future use only - deltaQ_v_ac is currently always internally set to same value as deltaQ_u_ac). */
+    uint16_t                        reserved2;
+    uint32_t                        reserved[4];
+ } NV_ENC_RC_PARAMS;
+
+/** macro for constructing the version field of ::_NV_ENC_RC_PARAMS */
+#define NV_ENC_RC_PARAMS_VER NVENCAPI_STRUCT_VERSION(1)
+
+#define MAX_NUM_CLOCK_TS    3
+
+/**
+* Clock Timestamp set parameters
+* For H264, this structure is used to populate Picture Timing SEI when NV_ENC_CONFIG_H264::enableTimeCode is set to 1.
+* For HEVC, this structure is used to populate Time Code SEI when NV_ENC_CONFIG_HEVC::enableTimeCodeSEI is set to 1.
+* For more details, refer to Annex D of ITU-T Specification.
+*/
+
+typedef struct _NV_ENC_CLOCK_TIMESTAMP_SET
+{
+    uint32_t        countingType            : 1;    /**< [in] Specifies the 'counting_type' */
+    uint32_t        discontinuityFlag       : 1;    /**< [in] Specifies the 'discontinuity_flag' */
+    uint32_t        cntDroppedFrames        : 1;    /**< [in] Specifies the 'cnt_dropped_flag' */
+    uint32_t        nFrames                 : 8;    /**< [in] Specifies the value of 'n_frames' */
+    uint32_t        secondsValue            : 6;    /**< [in] Specifies the 'seconds_value' */
+    uint32_t        minutesValue            : 6;    /**< [in] Specifies the 'minutes_value' */
+    uint32_t        hoursValue              : 5;    /**< [in] Specifies the 'hours_value' */
+    uint32_t        reserved2               : 4;    /**< [in] Reserved and must be set to 0 */
+    uint32_t        timeOffset;                     /**< [in] Specifies the 'time_offset_value' */
+} NV_ENC_CLOCK_TIMESTAMP_SET;
+
+typedef struct _NV_ENC_TIME_CODE
+{
+    NV_ENC_DISPLAY_PIC_STRUCT       displayPicStruct;                   /**< [in] Display picStruct */
+    NV_ENC_CLOCK_TIMESTAMP_SET      clockTimestamp[MAX_NUM_CLOCK_TS];   /**< [in] Clock Timestamp set */
+} NV_ENC_TIME_CODE;
+
+
+/**
+ * \struct _NV_ENC_CONFIG_H264_VUI_PARAMETERS
+ * H264 Video Usability Info parameters
+ */
+typedef struct _NV_ENC_CONFIG_H264_VUI_PARAMETERS
+{
+    uint32_t                            overscanInfoPresentFlag;        /**< [in]: If set to 1 , it specifies that the overscanInfo is present */
+    uint32_t                            overscanInfo;                   /**< [in]: Specifies the overscan info(as defined in Annex E of the ITU-T Specification). */
+    uint32_t                            videoSignalTypePresentFlag;     /**< [in]: If set to 1, it specifies  that the videoFormat, videoFullRangeFlag and colourDescriptionPresentFlag are present. */
+    NV_ENC_VUI_VIDEO_FORMAT             videoFormat;                    /**< [in]: Specifies the source video format(as defined in Annex E of the ITU-T Specification).*/
+    uint32_t                            videoFullRangeFlag;             /**< [in]: Specifies the output range of the luma and chroma samples(as defined in Annex E of the ITU-T Specification). */
+    uint32_t                            colourDescriptionPresentFlag;   /**< [in]: If set to 1, it specifies that the colourPrimaries, transferCharacteristics and colourMatrix are present. */
+    NV_ENC_VUI_COLOR_PRIMARIES          colourPrimaries;                /**< [in]: Specifies color primaries for converting to RGB(as defined in Annex E of the ITU-T Specification) */
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC  transferCharacteristics;        /**< [in]: Specifies the opto-electronic transfer characteristics to use (as defined in Annex E of the ITU-T Specification) */
+    NV_ENC_VUI_MATRIX_COEFFS            colourMatrix;                   /**< [in]: Specifies the matrix coefficients used in deriving the luma and chroma from the RGB primaries (as defined in Annex E of the ITU-T Specification). */
+    uint32_t                            chromaSampleLocationFlag;       /**< [in]: If set to 1 , it specifies that the chromaSampleLocationTop and chromaSampleLocationBot are present.*/
+    uint32_t                            chromaSampleLocationTop;        /**< [in]: Specifies the chroma sample location for top field(as defined in Annex E of the ITU-T Specification) */
+    uint32_t                            chromaSampleLocationBot;        /**< [in]: Specifies the chroma sample location for bottom field(as defined in Annex E of the ITU-T Specification) */
+    uint32_t                            bitstreamRestrictionFlag;       /**< [in]: If set to 1, it specifies the bitstream restriction parameters are present in the bitstream.*/
+    uint32_t                            timingInfoPresentFlag;          /**< [in]: If set to 1, it specifies that the timingInfo is present and the 'numUnitInTicks' and 'timeScale' fields are specified by the application. */
+                                                                        /**< [in]: If not set, the timingInfo may still be present with timing related fields calculated internally based on the frame rate specified by the application. */
+    uint32_t                            numUnitInTicks;                 /**< [in]: Specifies the number of time units of the clock(as defined in Annex E of the ITU-T Specification). */
+    uint32_t                            timeScale;                      /**< [in]: Specifies the frequency of the clock(as defined in Annex E of the ITU-T Specification). */
+    uint32_t                            reserved[12];                   /**< [in]: Reserved and must be set to 0 */
+}NV_ENC_CONFIG_H264_VUI_PARAMETERS;
+
+typedef NV_ENC_CONFIG_H264_VUI_PARAMETERS NV_ENC_CONFIG_HEVC_VUI_PARAMETERS;
+
+/**
+ * \struct _NVENC_EXTERNAL_ME_HINT_COUNTS_PER_BLOCKTYPE
+ * External motion vector hint counts per block type.
+ * H264 and AV1 support multiple hints while HEVC supports one hint for each valid candidate.
+ */
+typedef struct _NVENC_EXTERNAL_ME_HINT_COUNTS_PER_BLOCKTYPE
+{
+    uint32_t   numCandsPerBlk16x16                   : 4;   /**< [in]: Supported for H264, HEVC. Specifies the number of candidates per 16x16 block. */
+    uint32_t   numCandsPerBlk16x8                    : 4;   /**< [in]: Supported for H264 only. Specifies the number of candidates per 16x8 block. */
+    uint32_t   numCandsPerBlk8x16                    : 4;   /**< [in]: Supported for H264 only. Specifies the number of candidates per 8x16 block. */
+    uint32_t   numCandsPerBlk8x8                     : 4;   /**< [in]: Supported for H264, HEVC. Specifies the number of candidates per 8x8 block. */
+    uint32_t   numCandsPerSb                         : 8;   /**< [in]: Supported for AV1 only. Specifies the number of candidates per SB. */
+    uint32_t   reserved                              : 8;   /**< [in]: Reserved for padding. */
+    uint32_t   reserved1[3];                                /**< [in]: Reserved for future use. */
+} NVENC_EXTERNAL_ME_HINT_COUNTS_PER_BLOCKTYPE;
+
+
+/**
+ * \struct _NVENC_EXTERNAL_ME_HINT
+ * External Motion Vector hint structure for H264 and HEVC.
+ */
+typedef struct _NVENC_EXTERNAL_ME_HINT
+{
+    int32_t    mvx         : 12;                        /**< [in]: Specifies the x component of integer pixel MV (relative to current MB) S12.0. */
+    int32_t    mvy         : 10;                        /**< [in]: Specifies the y component of integer pixel MV (relative to current MB) S10.0 .*/
+    int32_t    refidx      : 5;                         /**< [in]: Specifies the reference index (31=invalid). Currently we support only 1 reference frame per direction for external hints, so \p refidx must be 0. */
+    int32_t    dir         : 1;                         /**< [in]: Specifies the direction of motion estimation . 0=L0 1=L1.*/
+    int32_t    partType    : 2;                         /**< [in]: Specifies the block partition type.0=16x16 1=16x8 2=8x16 3=8x8 (blocks in partition must be consecutive).*/
+    int32_t    lastofPart  : 1;                         /**< [in]: Set to 1 for the last MV of (sub) partition  */
+    int32_t    lastOfMB    : 1;                         /**< [in]: Set to 1 for the last MV of macroblock. */
+} NVENC_EXTERNAL_ME_HINT;
+
+/**
+ * \struct _NVENC_EXTERNAL_ME_SB_HINT
+ * External Motion Vector SB hint structure for AV1
+ */
+typedef struct _NVENC_EXTERNAL_ME_SB_HINT
+{
+    int16_t    refidx         : 5;                      /**< [in]: Specifies the reference index (31=invalid) */
+    int16_t    direction      : 1;                      /**< [in]: Specifies the direction of motion estimation . 0=L0 1=L1.*/
+    int16_t    bi             : 1;                      /**< [in]: Specifies reference mode 0=single mv, 1=compound mv */
+    int16_t    partition_type : 3;                      /**< [in]: Specifies the partition type: 0: 2NX2N, 1:2NxN, 2:Nx2N. reserved 3bits for future modes */
+    int16_t    x8             : 3;                      /**< [in]: Specifies the current partition's top left x position in 8 pixel unit */
+    int16_t    last_of_cu     : 1;                      /**< [in]: Set to 1 for the last MV of current CU */
+    int16_t    last_of_sb     : 1;                      /**< [in]: Set to 1 for the last MV of current SB */
+    int16_t    reserved0      : 1;                      /**< [in]: Reserved and must be set to 0 */
+    int16_t    mvx            : 14;                     /**< [in]: Specifies the x component of integer pixel MV (relative to current MB) S12.2. */
+    int16_t    cu_size        : 2;                      /**< [in]: Specifies the CU size: 0: 8x8, 1: 16x16, 2:32x32, 3:64x64 */
+    int16_t    mvy            : 12;                     /**< [in]: Specifies the y component of integer pixel MV (relative to current MB) S10.2 .*/
+    int16_t    y8             : 3;                      /**< [in]: Specifies the current partition's top left y position in 8 pixel unit */
+    int16_t    reserved1      : 1;                      /**< [in]: Reserved and must be set to 0 */
+} NVENC_EXTERNAL_ME_SB_HINT;
+
+/**
+ * \struct _NV_ENC_CONFIG_H264
+ * H264 encoder configuration parameters
+ */
+typedef struct _NV_ENC_CONFIG_H264
+{
+    uint32_t enableTemporalSVC         :1;                          /**< [in]: Set to 1 to enable SVC temporal*/
+    uint32_t enableStereoMVC           :1;                          /**< [in]: Set to 1 to enable stereo MVC*/
+    uint32_t hierarchicalPFrames       :1;                          /**< [in]: Set to 1 to enable hierarchical P Frames */
+    uint32_t hierarchicalBFrames       :1;                          /**< [in]: Set to 1 to enable hierarchical B Frames */
+    uint32_t outputBufferingPeriodSEI  :1;                          /**< [in]: Set to 1 to write SEI buffering period syntax in the bitstream */
+    uint32_t outputPictureTimingSEI    :1;                          /**< [in]: Set to 1 to write SEI picture timing syntax in the bitstream. */
+    uint32_t outputAUD                 :1;                          /**< [in]: Set to 1 to write access unit delimiter syntax in bitstream */
+    uint32_t disableSPSPPS             :1;                          /**< [in]: Set to 1 to disable writing of Sequence and Picture parameter info in bitstream */
+    uint32_t outputFramePackingSEI     :1;                          /**< [in]: Set to 1 to enable writing of frame packing arrangement SEI messages to bitstream */
+    uint32_t outputRecoveryPointSEI    :1;                          /**< [in]: Set to 1 to enable writing of recovery point SEI message */
+    uint32_t enableIntraRefresh        :1;                          /**< [in]: Set to 1 to enable gradual decoder refresh or intra refresh. If the GOP structure uses B frames this will be ignored */
+    uint32_t enableConstrainedEncoding :1;                          /**< [in]: Set this to 1 to enable constrainedFrame encoding where each slice in the constrained picture is independent of other slices.
+                                                                               Constrained encoding works only with rectangular slices.
+                                                                               Check support for constrained encoding using ::NV_ENC_CAPS_SUPPORT_CONSTRAINED_ENCODING caps. */
+    uint32_t repeatSPSPPS              :1;                          /**< [in]: Set to 1 to enable writing of Sequence and Picture parameter for every IDR frame */
+    uint32_t enableVFR                 :1;                          /**< [in]: Setting enableVFR=1 currently only sets the fixed_frame_rate_flag=0 in the VUI but otherwise
+                                                                               has no impact on the encoder behavior. For more details please refer to E.1 VUI syntax of H.264 standard. Note, however, that NVENC does not support VFR encoding and rate control. */
+    uint32_t enableLTR                 :1;                          /**< [in]: Set to 1 to enable LTR (Long Term Reference) frame support. LTR can be used in two modes: "LTR Trust" mode and "LTR Per Picture" mode.
+                                                                               LTR Trust mode: In this mode, ltrNumFrames pictures after IDR are automatically marked as LTR. This mode is enabled by setting ltrTrustMode = 1.
+                                                                                               Use of LTR Trust mode is strongly discouraged as this mode may be deprecated in future.
+                                                                               LTR Per Picture mode: In this mode, client can control whether the current picture should be marked as LTR. Enable this mode by setting
+                                                                                                     ltrTrustMode = 0 and ltrMarkFrame = 1 for the picture to be marked as LTR. This is the preferred mode
+                                                                                                     for using LTR.
+                                                                               Note that LTRs are not supported if encoding session is configured with B-frames */
+    uint32_t qpPrimeYZeroTransformBypassFlag :1;                    /**< [in]: To enable lossless encode set this to 1, set QP to 0 and RC_mode to NV_ENC_PARAMS_RC_CONSTQP and profile to HIGH_444_PREDICTIVE_PROFILE.
+                                                                               Check support for lossless encoding using ::NV_ENC_CAPS_SUPPORT_LOSSLESS_ENCODE caps.  */
+    uint32_t useConstrainedIntraPred   :1;                          /**< [in]: Set 1 to enable constrained intra prediction. */
+    uint32_t enableFillerDataInsertion :1;                          /**< [in]: Set to 1 to enable insertion of filler data in the bitstream.
+                                                                               This flag will take effect only when one of the CBR rate
+                                                                               control modes (NV_ENC_PARAMS_RC_CBR, NV_ENC_PARAMS_RC_CBR_HQ,
+                                                                               NV_ENC_PARAMS_RC_CBR_LOWDELAY_HQ) is in use and both
+                                                                               NV_ENC_INITIALIZE_PARAMS::frameRateNum and
+                                                                               NV_ENC_INITIALIZE_PARAMS::frameRateDen are set to non-zero
+                                                                               values. Setting this field when
+                                                                               NV_ENC_INITIALIZE_PARAMS::enableOutputInVidmem is also set
+                                                                               is currently not supported and will make ::NvEncInitializeEncoder()
+                                                                               return an error. */
+    uint32_t disableSVCPrefixNalu      :1;                          /**< [in]: Set to 1 to disable writing of SVC Prefix NALU preceding each slice in bitstream.
+                                                                               Applicable only when temporal SVC is enabled (NV_ENC_CONFIG_H264::enableTemporalSVC = 1). */
+    uint32_t enableScalabilityInfoSEI  :1;                          /**< [in]: Set to 1 to enable writing of Scalability Information SEI message preceding each IDR picture in bitstream
+                                                                               Applicable only when temporal SVC is enabled (NV_ENC_CONFIG_H264::enableTemporalSVC = 1). */
+    uint32_t singleSliceIntraRefresh   :1;                          /**< [in]: Set to 1 to maintain single slice in frames during intra refresh.
+                                                                               Check support for single slice intra refresh using ::NV_ENC_CAPS_SINGLE_SLICE_INTRA_REFRESH caps.
+                                                                               This flag will be ignored if the value returned for ::NV_ENC_CAPS_SINGLE_SLICE_INTRA_REFRESH caps is false. */
+    uint32_t enableTimeCode            :1;                          /**< [in]: Set to 1 to enable writing of clock timestamp sets in picture timing SEI.  Note that this flag will be ignored for D3D12 interface. */
+    uint32_t reservedBitFields         :10;                         /**< [in]: Reserved bitfields and must be set to 0 */
+    uint32_t level;                                                 /**< [in]: Specifies the encoding level. Client is recommended to set this to NV_ENC_LEVEL_AUTOSELECT in order to enable the NvEncodeAPI interface to select the correct level. */
+    uint32_t idrPeriod;                                             /**< [in]: Specifies the IDR interval. If not set, this is made equal to gopLength in NV_ENC_CONFIG.Low latency application client can set IDR interval to NVENC_INFINITE_GOPLENGTH so that IDR frames are not inserted automatically. */
+    uint32_t separateColourPlaneFlag;                               /**< [in]: Set to 1 to enable 4:4:4 separate colour planes */
+    uint32_t disableDeblockingFilterIDC;                            /**< [in]: Specifies the deblocking filter mode. Permissible value range: [0,2]. This flag corresponds
+                                                                               to the flag disable_deblocking_filter_idc specified in section 7.4.3 of H.264 specification,
+                                                                               which specifies whether the operation of the deblocking filter shall be disabled across some
+                                                                               block edges of the slice and specifies for which edges the filtering is disabled. See section
+                                                                               7.4.3 of H.264 specification for more details.*/
+    uint32_t numTemporalLayers;                                     /**< [in]: Specifies number of temporal layers to be used for hierarchical coding / temporal SVC. Valid value range is [1,::NV_ENC_CAPS_NUM_MAX_TEMPORAL_LAYERS] */
+    uint32_t spsId;                                                 /**< [in]: Specifies the SPS id of the sequence header */
+    uint32_t ppsId;                                                 /**< [in]: Specifies the PPS id of the picture header */
+    NV_ENC_H264_ADAPTIVE_TRANSFORM_MODE adaptiveTransformMode;      /**< [in]: Specifies the AdaptiveTransform Mode. Check support for AdaptiveTransform mode using ::NV_ENC_CAPS_SUPPORT_ADAPTIVE_TRANSFORM caps. */
+    NV_ENC_H264_FMO_MODE                fmoMode;                    /**< [in]: Specifies the FMO Mode. Check support for FMO using ::NV_ENC_CAPS_SUPPORT_FMO caps. */
+    NV_ENC_H264_BDIRECT_MODE            bdirectMode;                /**< [in]: Specifies the BDirect mode. Check support for BDirect mode using ::NV_ENC_CAPS_SUPPORT_BDIRECT_MODE caps.*/
+    NV_ENC_H264_ENTROPY_CODING_MODE     entropyCodingMode;          /**< [in]: Specifies the entropy coding mode. Check support for CABAC mode using ::NV_ENC_CAPS_SUPPORT_CABAC caps. */
+    NV_ENC_STEREO_PACKING_MODE          stereoMode;                 /**< [in]: Specifies the stereo frame packing mode which is to be signaled in frame packing arrangement SEI */
+    uint32_t                            intraRefreshPeriod;         /**< [in]: Specifies the interval between successive intra refreshes. Requires enableIntraRefresh to be set.
+                                                                               Will be disabled if NV_ENC_CONFIG::gopLength is not set to NVENC_INFINITE_GOPLENGTH. */
+    uint32_t                            intraRefreshCnt;            /**< [in]: Specifies the length of intra refresh in number of frames for periodic intra refresh. This value should be smaller than intraRefreshPeriod */
+    uint32_t                            maxNumRefFrames;            /**< [in]: Specifies the DPB size used for encoding. Setting it to 0 will let driver use the default DPB size.
+                                                                               The low latency application which wants to invalidate reference frame as an error resilience tool
+                                                                               is recommended to use a large DPB size so that the encoder can keep old reference frames which can be used if recent
+                                                                               frames are invalidated. */
+    uint32_t                            sliceMode;                  /**< [in]: This parameter in conjunction with sliceModeData specifies the way in which the picture is divided into slices
+                                                                               sliceMode = 0 MB based slices, sliceMode = 1 Byte based slices, sliceMode = 2 MB row based slices, sliceMode = 3 numSlices in Picture.
+                                                                               When forceIntraRefreshWithFrameCnt is set it will have priority over sliceMode setting
+                                                                               When sliceMode == 0 and sliceModeData == 0 whole picture will be coded with one slice */
+    uint32_t                            sliceModeData;              /**< [in]: Specifies the parameter needed for sliceMode. For:
+                                                                               sliceMode = 0, sliceModeData specifies # of MBs in each slice (except last slice)
+                                                                               sliceMode = 1, sliceModeData specifies maximum # of bytes in each slice (except last slice)
+                                                                               sliceMode = 2, sliceModeData specifies # of MB rows in each slice (except last slice)
+                                                                               sliceMode = 3, sliceModeData specifies number of slices in the picture. Driver will divide picture into slices optimally */
+    NV_ENC_CONFIG_H264_VUI_PARAMETERS   h264VUIParameters;          /**< [in]: Specifies the H264 video usability info parameters */
+    uint32_t                            ltrNumFrames;               /**< [in]: Specifies the number of LTR frames. This parameter has different meaning in two LTR modes.
+                                                                               In "LTR Trust" mode (ltrTrustMode = 1), encoder will mark the first ltrNumFrames base layer reference frames within each IDR interval as LTR.
+                                                                               In "LTR Per Picture" mode (ltrTrustMode = 0 and ltrMarkFrame = 1), ltrNumFrames specifies maximum number of LTR frames in DPB. */
+    uint32_t                            ltrTrustMode;               /**< [in]: Specifies the LTR operating mode. See comments near NV_ENC_CONFIG_H264::enableLTR for description of the two modes.
+                                                                               Set to 1 to use "LTR Trust" mode of LTR operation. Clients are discouraged from using "LTR Trust" mode as this mode may
+                                                                               be deprecated in future releases.
+                                                                               Set to 0 when using "LTR Per Picture" mode of LTR operation. */
+    uint32_t                            chromaFormatIDC;            /**< [in]: Specifies the chroma format. Should be set to 1 for yuv420 input, 3 for yuv444 input.
+                                                                               Check support for YUV444 encoding using ::NV_ENC_CAPS_SUPPORT_YUV444_ENCODE caps.*/
+    uint32_t                            maxTemporalLayers;          /**< [in]: Specifies the max temporal layer used for temporal SVC / hierarchical coding.
+                                                                               Default value of this field is NV_ENC_CAPS::NV_ENC_CAPS_NUM_MAX_TEMPORAL_LAYERS. Note that the value NV_ENC_CONFIG_H264::maxNumRefFrames should
+                                                                               be greater than or equal to (NV_ENC_CONFIG_H264::maxTemporalLayers - 2) * 2, for NV_ENC_CONFIG_H264::maxTemporalLayers >= 2.*/
+    NV_ENC_BFRAME_REF_MODE              useBFramesAsRef;            /**< [in]: Specifies the B-Frame as reference mode. Check support for useBFramesAsRef mode using ::NV_ENC_CAPS_SUPPORT_BFRAME_REF_MODE caps.*/
+    NV_ENC_NUM_REF_FRAMES               numRefL0;                   /**< [in]: Specifies max number of reference frames in reference picture list L0, that can be used by hardware for prediction of a frame.
+                                                                               Check support for numRefL0 using ::NV_ENC_CAPS_SUPPORT_MULTIPLE_REF_FRAMES caps. */
+    NV_ENC_NUM_REF_FRAMES               numRefL1;                   /**< [in]: Specifies max number of reference frames in reference picture list L1, that can be used by hardware for prediction of a frame.
+                                                                               Check support for numRefL1 using ::NV_ENC_CAPS_SUPPORT_MULTIPLE_REF_FRAMES caps. */
+
+    uint32_t                            reserved1[267];             /**< [in]: Reserved and must be set to 0 */
+    void*                               reserved2[64];              /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_CONFIG_H264;
+
+/**
+ * \struct _NV_ENC_CONFIG_HEVC
+ * HEVC encoder configuration parameters to be set during initialization.
+ */
+typedef struct _NV_ENC_CONFIG_HEVC
+{
+    uint32_t level;                                                 /**< [in]: Specifies the level of the encoded bitstream.*/
+    uint32_t tier;                                                  /**< [in]: Specifies the level tier of the encoded bitstream.*/
+    NV_ENC_HEVC_CUSIZE minCUSize;                                   /**< [in]: Specifies the minimum size of luma coding unit.*/
+    NV_ENC_HEVC_CUSIZE maxCUSize;                                   /**< [in]: Specifies the maximum size of luma coding unit. Currently NVENC SDK only supports maxCUSize equal to NV_ENC_HEVC_CUSIZE_32x32.*/
+    uint32_t useConstrainedIntraPred               :1;              /**< [in]: Set 1 to enable constrained intra prediction. */
+    uint32_t disableDeblockAcrossSliceBoundary     :1;              /**< [in]: Set 1 to disable in loop filtering across slice boundary.*/
+    uint32_t outputBufferingPeriodSEI              :1;              /**< [in]: Set 1 to write SEI buffering period syntax in the bitstream */
+    uint32_t outputPictureTimingSEI                :1;              /**< [in]: Set 1 to write SEI picture timing syntax in the bitstream */
+    uint32_t outputAUD                             :1;              /**< [in]: Set 1 to write Access Unit Delimiter syntax. */
+    uint32_t enableLTR                             :1;              /**< [in]: Set to 1 to enable LTR (Long Term Reference) frame support. LTR can be used in two modes: "LTR Trust" mode and "LTR Per Picture" mode.
+                                                                               LTR Trust mode: In this mode, ltrNumFrames pictures after IDR are automatically marked as LTR. This mode is enabled by setting ltrTrustMode = 1.
+                                                                                               Use of LTR Trust mode is strongly discouraged as this mode may be deprecated in future releases.
+                                                                               LTR Per Picture mode: In this mode, client can control whether the current picture should be marked as LTR. Enable this mode by setting
+                                                                                                     ltrTrustMode = 0 and ltrMarkFrame = 1 for the picture to be marked as LTR. This is the preferred mode
+                                                                                                     for using LTR.
+                                                                               Note that LTRs are not supported if encoding session is configured with B-frames */
+    uint32_t disableSPSPPS                         :1;              /**< [in]: Set 1 to disable VPS, SPS and PPS signaling in the bitstream. */
+    uint32_t repeatSPSPPS                          :1;              /**< [in]: Set 1 to output VPS, SPS and PPS for every IDR frame.*/
+    uint32_t enableIntraRefresh                    :1;              /**< [in]: Set 1 to enable gradual decoder refresh or intra refresh. If the GOP structure uses B frames this will be ignored */
+    uint32_t chromaFormatIDC                       :2;              /**< [in]: Specifies the chroma format. Should be set to 1 for yuv420 input, 3 for yuv444 input.*/
+    uint32_t pixelBitDepthMinus8                   :3;              /**< [in]: Specifies pixel bit depth minus 8. Should be set to 0 for 8 bit input, 2 for 10 bit input.*/
+    uint32_t enableFillerDataInsertion             :1;              /**< [in]: Set to 1 to enable insertion of filler data in the bitstream.
+                                                                               This flag will take effect only when one of the CBR rate
+                                                                               control modes (NV_ENC_PARAMS_RC_CBR, NV_ENC_PARAMS_RC_CBR_HQ,
+                                                                               NV_ENC_PARAMS_RC_CBR_LOWDELAY_HQ) is in use and both
+                                                                               NV_ENC_INITIALIZE_PARAMS::frameRateNum and
+                                                                               NV_ENC_INITIALIZE_PARAMS::frameRateDen are set to non-zero
+                                                                               values. Setting this field when
+                                                                               NV_ENC_INITIALIZE_PARAMS::enableOutputInVidmem is also set
+                                                                               is currently not supported and will make ::NvEncInitializeEncoder()
+                                                                               return an error. */
+    uint32_t enableConstrainedEncoding             :1;              /**< [in]: Set this to 1 to enable constrainedFrame encoding where each slice in the constrained picture is independent of other slices.
+                                                                               Constrained encoding works only with rectangular slices.
+                                                                               Check support for constrained encoding using ::NV_ENC_CAPS_SUPPORT_CONSTRAINED_ENCODING caps. */
+    uint32_t enableAlphaLayerEncoding              :1;              /**< [in]: Set this to 1 to enable HEVC encode with alpha layer. */
+    uint32_t singleSliceIntraRefresh               :1;              /**< [in]: Set this to 1 to maintain single slice frames during intra refresh.
+                                                                               Check support for single slice intra refresh using ::NV_ENC_CAPS_SINGLE_SLICE_INTRA_REFRESH caps.
+                                                                               This flag will be ignored if the value returned for ::NV_ENC_CAPS_SINGLE_SLICE_INTRA_REFRESH caps is false. */
+    uint32_t outputRecoveryPointSEI                :1;              /**< [in]: Set to 1 to enable writing of recovery point SEI message */
+    uint32_t outputTimeCodeSEI                     :1;              /**< [in]: Set 1 to write SEI time code syntax in the bitstream. Note that this flag will be ignored for D3D12 interface.*/
+    uint32_t reserved                              :12;             /**< [in]: Reserved bitfields.*/
+    uint32_t idrPeriod;                                             /**< [in]: Specifies the IDR interval. If not set, this is made equal to gopLength in NV_ENC_CONFIG. Low latency application client can set IDR interval to NVENC_INFINITE_GOPLENGTH so that IDR frames are not inserted automatically. */
+    uint32_t intraRefreshPeriod;                                    /**< [in]: Specifies the interval between successive intra refreshes. Requires enableIntraRefresh to be set.
+                                                                    Will be disabled if NV_ENC_CONFIG::gopLength is not set to NVENC_INFINITE_GOPLENGTH. */
+    uint32_t intraRefreshCnt;                                       /**< [in]: Specifies the length of intra refresh in number of frames for periodic intra refresh. This value should be smaller than intraRefreshPeriod */
+    uint32_t maxNumRefFramesInDPB;                                  /**< [in]: Specifies the maximum number of reference frames in the DPB.*/
+    uint32_t ltrNumFrames;                                          /**< [in]: This parameter has different meaning in two LTR modes.
+                                                                               In "LTR Trust" mode (ltrTrustMode = 1), encoder will mark the first ltrNumFrames base layer reference frames within each IDR interval as LTR.
+                                                                               In "LTR Per Picture" mode (ltrTrustMode = 0 and ltrMarkFrame = 1), ltrNumFrames specifies maximum number of LTR frames in DPB.
+                                                                               The ltrNumFrames value acts as guidance to the encoder and is not necessarily honored. To achieve the right balance between the encoding
+                                                                               quality and keeping LTR frames in the DPB queue, the encoder can internally limit the number of LTR frames.
+                                                                               The number of LTR frames actually used depends upon the encoding preset being used; Faster encoding presets will use fewer LTR frames.*/
+    uint32_t vpsId;                                                 /**< [in]: Specifies the VPS id of the video parameter set */
+    uint32_t spsId;                                                 /**< [in]: Specifies the SPS id of the sequence header */
+    uint32_t ppsId;                                                 /**< [in]: Specifies the PPS id of the picture header */
+    uint32_t sliceMode;                                             /**< [in]: This parameter in conjunction with sliceModeData specifies the way in which the picture is divided into slices
+                                                                                sliceMode = 0 CTU based slices, sliceMode = 1 Byte based slices, sliceMode = 2 CTU row based slices, sliceMode = 3, numSlices in Picture
+                                                                                When sliceMode == 0 and sliceModeData == 0 whole picture will be coded with one slice */
+    uint32_t sliceModeData;                                         /**< [in]: Specifies the parameter needed for sliceMode. For:
+                                                                                sliceMode = 0, sliceModeData specifies # of CTUs in each slice (except last slice)
+                                                                                sliceMode = 1, sliceModeData specifies maximum # of bytes in each slice (except last slice)
+                                                                                sliceMode = 2, sliceModeData specifies # of CTU rows in each slice (except last slice)
+                                                                                sliceMode = 3, sliceModeData specifies number of slices in the picture. Driver will divide picture into slices optimally */
+    uint32_t maxTemporalLayersMinus1;                               /**< [in]: Specifies the max temporal layer used for hierarchical coding. */
+    NV_ENC_CONFIG_HEVC_VUI_PARAMETERS   hevcVUIParameters;          /**< [in]: Specifies the HEVC video usability info parameters */
+    uint32_t ltrTrustMode;                                          /**< [in]: Specifies the LTR operating mode. See comments near NV_ENC_CONFIG_HEVC::enableLTR for description of the two modes.
+                                                                               Set to 1 to use "LTR Trust" mode of LTR operation. Clients are discouraged from using "LTR Trust" mode as this mode may
+                                                                               be deprecated in future releases.
+                                                                               Set to 0 when using "LTR Per Picture" mode of LTR operation. */
+    NV_ENC_BFRAME_REF_MODE              useBFramesAsRef;            /**< [in]: Specifies the B-Frame as reference mode. Check support for useBFramesAsRef mode using  ::NV_ENC_CAPS_SUPPORT_BFRAME_REF_MODE caps.*/
+    NV_ENC_NUM_REF_FRAMES               numRefL0;                   /**< [in]: Specifies max number of reference frames in reference picture list L0, that can be used by hardware for prediction of a frame.
+                                                                               Check support for numRefL0 using ::NV_ENC_CAPS_SUPPORT_MULTIPLE_REF_FRAMES caps. */
+    NV_ENC_NUM_REF_FRAMES               numRefL1;                   /**< [in]: Specifies max number of reference frames in reference picture list L1, that can be used by hardware for prediction of a frame.
+                                                                               Check support for numRefL1 using ::NV_ENC_CAPS_SUPPORT_MULTIPLE_REF_FRAMES caps. */
+    uint32_t                            reserved1[214];             /**< [in]: Reserved and must be set to 0.*/
+    void*                               reserved2[64];              /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_CONFIG_HEVC;
+
+#define NV_MAX_TILE_COLS_AV1               64
+#define NV_MAX_TILE_ROWS_AV1               64
+
+/**
+ * \struct _NV_ENC_FILM_GRAIN_PARAMS_AV1
+ * AV1 Film Grain Parameters structure
+ */
+
+typedef struct _NV_ENC_FILM_GRAIN_PARAMS_AV1
+{
+    uint32_t applyGrain                 :1;                         /**< [in]: Set to 1 to specify film grain should be added to frame */
+    uint32_t chromaScalingFromLuma      :1;                         /**< [in]: Set to 1 to specify the chroma scaling is inferred from luma scaling */
+    uint32_t overlapFlag                :1;                         /**< [in]: Set to 1 to indicate that overlap between film grain blocks should be applied*/
+    uint32_t clipToRestrictedRange      :1;                         /**< [in]: Set to 1 to clip values to restricted (studio) range after adding film grain  */
+    uint32_t grainScalingMinus8         :2;                         /**< [in]: Represents the shift - 8 applied to the values of the chroma component */
+    uint32_t arCoeffLag                 :2;                         /**< [in]: Specifies the number of auto-regressive coefficients for luma and chroma */
+    uint32_t numYPoints                 :4;                         /**< [in]: Specifies the number of points for the piecewise linear scaling function of the luma component */
+    uint32_t numCbPoints                :4;                         /**< [in]: Specifies the number of points for the piecewise linear scaling function of the cb component */
+    uint32_t numCrPoints                :4;                         /**< [in]: Specifies the number of points for the piecewise linear scaling function of the cr component */
+    uint32_t arCoeffShiftMinus6         :2;                         /**< [in]: specifies the range of the auto-regressive coefficients */
+    uint32_t grainScaleShift            :2;                         /**< [in]: Specifies how much the Gaussian random numbers should be scaled down during the grain synthesis process  */
+    uint32_t reserved1                  :8;                         /**< [in]: Reserved bits field - should be set to 0 */
+    uint8_t  pointYValue[14];                                       /**< [in]: pointYValue[i]: x coordinate for i-th point of luma piecewise linear scaling function. Values on a scale of 0...255 */
+    uint8_t  pointYScaling[14];                                     /**< [in]: pointYScaling[i]: i-th point output value of luma piecewise linear scaling function */
+    uint8_t  pointCbValue[10];                                      /**< [in]: pointCbValue[i]: x coordinate for i-th point of cb piecewise linear scaling function. Values on a scale of 0...255 */
+    uint8_t  pointCbScaling[10];                                    /**< [in]: pointCbScaling[i]: i-th point output value of cb piecewise linear scaling function */
+    uint8_t  pointCrValue[10];                                      /**< [in]: pointCrValue[i]: x coordinate for i-th point of cr piecewise linear scaling function. Values on a scale of 0...255 */
+    uint8_t  pointCrScaling[10];                                    /**< [in]: pointCrScaling[i]: i-th point output value of cr piecewise linear scaling function */
+    uint8_t  arCoeffsYPlus128[24];                                  /**< [in]: Specifies auto-regressive coefficients used for the Y plane */
+    uint8_t  arCoeffsCbPlus128[25];                                 /**< [in]: Specifies auto-regressive coefficients used for the U plane */
+    uint8_t  arCoeffsCrPlus128[25];                                 /**< [in]: Specifies auto-regressive coefficients used for the V plane */
+    uint8_t  reserved2[2];                                          /**< [in]: Reserved bytes -  should be set to 0 */
+    uint8_t  cbMult;                                                /**< [in]: Represents a multiplier for the cb component used in derivation of the input index to the cb component scaling function */
+    uint8_t  cbLumaMult;                                            /**< [in]: represents a multiplier for the average luma component used in derivation of the input index to the cb component scaling function. */
+    uint16_t cbOffset;                                              /**< [in]: Represents an offset used in derivation of the input index to the cb component scaling function */
+    uint8_t  crMult;                                                /**< [in]: Represents a multiplier for the cr component used in derivation of the input index to the cr component scaling function */
+    uint8_t  crLumaMult;                                            /**< [in]: represents a multiplier for the average luma component used in derivation of the input index to the cr component scaling function. */
+    uint16_t crOffset;                                              /**< [in]: Represents an offset used in derivation of the input index to the cr component scaling function */
+} NV_ENC_FILM_GRAIN_PARAMS_AV1;
+
+/**
+* \struct _NV_ENC_CONFIG_AV1
+* AV1 encoder configuration parameters to be set during initialization.
+*/
+typedef struct _NV_ENC_CONFIG_AV1
+{
+    uint32_t level;                                                 /**< [in]: Specifies the level of the encoded bitstream.*/
+    uint32_t tier;                                                  /**< [in]: Specifies the level tier of the encoded bitstream.*/
+    NV_ENC_AV1_PART_SIZE minPartSize;                               /**< [in]: Specifies the minimum size of luma coding block partition.*/
+    NV_ENC_AV1_PART_SIZE maxPartSize;                               /**< [in]: Specifies the maximum size of luma coding block partition.*/
+    uint32_t outputAnnexBFormat             : 1;                    /**< [in]: Set 1 to use Annex B format for bitstream output.*/
+    uint32_t enableTimingInfo               : 1;                    /**< [in]: Set 1 to write Timing Info into sequence/frame headers */
+    uint32_t enableDecoderModelInfo         : 1;                    /**< [in]: Set 1 to write Decoder Model Info into sequence/frame headers */
+    uint32_t enableFrameIdNumbers           : 1;                    /**< [in]: Set 1 to write Frame id numbers in  bitstream */
+    uint32_t disableSeqHdr                  : 1;                    /**< [in]: Set 1 to disable Sequence Header signaling in the bitstream. */
+    uint32_t repeatSeqHdr                   : 1;                    /**< [in]: Set 1 to output Sequence Header for every Key frame.*/
+    uint32_t enableIntraRefresh             : 1;                    /**< [in]: Set 1 to enable gradual decoder refresh or intra refresh. If the GOP structure uses B frames this will be ignored */
+    uint32_t chromaFormatIDC                : 2;                    /**< [in]: Specifies the chroma format. Should be set to 1 for yuv420 input (yuv444 input currently not supported).*/
+    uint32_t enableBitstreamPadding         : 1;                    /**< [in]: Set 1 to enable bitstream padding. */
+    uint32_t enableCustomTileConfig         : 1;                    /**< [in]: Set 1 to enable custom tile configuration: numTileColumns and numTileRows must have non zero values and tileWidths and tileHeights must point to a valid address  */
+    uint32_t enableFilmGrainParams          : 1;                    /**< [in]: Set 1 to enable custom film grain parameters: filmGrainParams must point to a valid address  */
+    uint32_t inputPixelBitDepthMinus8       : 3;                    /**< [in]: Specifies pixel bit depth minus 8 of video input. Should be set to 0 for 8 bit input, 2 for 10 bit input.*/
+    uint32_t pixelBitDepthMinus8            : 3;                    /**< [in]: Specifies pixel bit depth minus 8 of encoded video. Should be set to 0 for 8 bit, 2 for 10 bit.
+                                                                               HW will do the bit depth conversion internally from inputPixelBitDepthMinus8 -> pixelBitDepthMinus8 if the bit depths differ.
+                                                                               Support for 8 bit input to 10 bit encode conversion only */
+    uint32_t reserved                       : 14;                   /**< [in]: Reserved bitfields.*/
+    uint32_t idrPeriod;                                             /**< [in]: Specifies the IDR/Key frame interval. If not set, this is made equal to gopLength in NV_ENC_CONFIG. Low latency application client can set IDR interval to NVENC_INFINITE_GOPLENGTH so that IDR frames are not inserted automatically. */
+    uint32_t intraRefreshPeriod;                                    /**< [in]: Specifies the interval between successive intra refreshes. Requires enableIntraRefresh to be set.
+                                                                               Will be disabled if NV_ENC_CONFIG::gopLength is not set to NVENC_INFINITE_GOPLENGTH. */
+    uint32_t intraRefreshCnt;                                       /**< [in]: Specifies the length of intra refresh in number of frames for periodic intra refresh. This value should be smaller than intraRefreshPeriod */
+    uint32_t maxNumRefFramesInDPB;                                  /**< [in]: Specifies the maximum number of reference frames in the DPB.*/
+    uint32_t numTileColumns;                                        /**< [in]: This parameter in conjunction with the flag enableCustomTileConfig and the array tileWidths[] specifies the way in which the picture is divided into tile columns.
+                                                                               When enableCustomTileConfig == 0, the picture will be uniformly divided into numTileColumns tile columns. If numTileColumns is not a power of 2,
+                                                                               it will be rounded down to the next power of 2 value. If numTileColumns == 0, the picture will be coded with the smallest number of vertical tiles as allowed by standard.
+                                                                               When enableCustomTileConfig == 1, numTileColumns must be > 0 and <= NV_MAX_TILE_COLS_AV1 and tileWidths must point to a valid array of numTileColumns entries.
+                                                                               Entry i specifies the width in 64x64 CTU unit of tile column i. The sum of all the entries should be equal to the picture width in 64x64 CTU units. */
+    uint32_t numTileRows;                                           /**< [in]: This parameter in conjunction with the flag enableCustomTileConfig and the array tileHeights[] specifies the way in which the picture is divided into tile rows.
+                                                                               When enableCustomTileConfig == 0, the picture will be uniformly divided into numTileRows tile rows. If numTileRows is not a power of 2,
+                                                                               it will be rounded down to the next power of 2 value. If numTileRows == 0, the picture will be coded with the smallest number of horizontal tiles as allowed by standard.
+                                                                               When enableCustomTileConfig == 1, numTileRows must be > 0 and <= NV_MAX_TILE_ROWS_AV1 and tileHeights must point to a valid array of numTileRows entries.
+                                                                               Entry i specifies the height in 64x64 CTU unit of tile row i. The sum of all the entries should be equal to the picture height in 64x64 CTU units. */
+    uint32_t *tileWidths;                                           /**< [in]: If enableCustomTileConfig == 1, tileWidths[i] specifies the width of tile column i in 64x64 CTU unit, with 0 <= i <= numTileColumns -1. */
+    uint32_t *tileHeights;                                          /**< [in]: If enableCustomTileConfig == 1, tileHeights[i] specifies the height of tile row i in 64x64 CTU unit, with 0 <= i <= numTileRows -1. */
+    uint32_t maxTemporalLayersMinus1;                               /**< [in]: Specifies the max temporal layer used for hierarchical coding. */
+    NV_ENC_VUI_COLOR_PRIMARIES colorPrimaries;                      /**< [in]: as defined in section of ISO/IEC 23091-4/ITU-T H.273 */
+    NV_ENC_VUI_TRANSFER_CHARACTERISTIC transferCharacteristics;     /**< [in]: as defined in section of ISO/IEC 23091-4/ITU-T H.273 */
+    NV_ENC_VUI_MATRIX_COEFFS matrixCoefficients;                    /**< [in]: as defined in section of ISO/IEC 23091-4/ITU-T H.273 */
+    uint32_t colorRange;                                            /**< [in]: 0: studio swing representation - 1: full swing representation */
+    uint32_t chromaSamplePosition;                                  /**< [in]: 0: unknown
+                                                                               1: Horizontally collocated with luma (0,0) sample, between two vertical samples
+                                                                               2: Co-located with luma (0,0) sample */
+    NV_ENC_BFRAME_REF_MODE useBFramesAsRef;                         /**< [in]: Specifies the B-Frame as reference mode. Check support for useBFramesAsRef mode using  ::NV_ENC_CAPS_SUPPORT_BFRAME_REF_MODE caps.*/
+    NV_ENC_FILM_GRAIN_PARAMS_AV1 *filmGrainParams;                  /**< [in]: If enableFilmGrainParams == 1, filmGrainParams must point to a valid NV_ENC_FILM_GRAIN_PARAMS_AV1 structure */
+    NV_ENC_NUM_REF_FRAMES  numFwdRefs;                              /**< [in]: Specifies max number of forward reference frames used for prediction of a frame. It must be in range 1-4 (Last, Last2, Last3 and Golden). It is a suggested value and is not necessarily honored. */
+    NV_ENC_NUM_REF_FRAMES  numBwdRefs;                              /**< [in]: Specifies max number of L1 list reference frames used for prediction of a frame. It must be in range 1-3 (Backward, Altref2, Altref). It is a suggested value and is not necessarily honored. */
+    uint32_t reserved1[235];                                        /**< [in]: Reserved and must be set to 0.*/
+    void*    reserved2[62];                                         /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_CONFIG_AV1;
+
+/**
+ * \struct _NV_ENC_CONFIG_H264_MEONLY
+ * H264 encoder configuration parameters for ME only Mode
+ *
+ */
+typedef struct _NV_ENC_CONFIG_H264_MEONLY
+{
+    uint32_t disablePartition16x16 :1;                          /**< [in]: Disable Motion Estimation on 16x16 blocks*/
+    uint32_t disablePartition8x16  :1;                          /**< [in]: Disable Motion Estimation on 8x16 blocks*/
+    uint32_t disablePartition16x8  :1;                          /**< [in]: Disable Motion Estimation on 16x8 blocks*/
+    uint32_t disablePartition8x8   :1;                          /**< [in]: Disable Motion Estimation on 8x8 blocks*/
+    uint32_t disableIntraSearch    :1;                          /**< [in]: Disable Intra search during Motion Estimation*/
+    uint32_t bStereoEnable         :1;                          /**< [in]: Enable Stereo Mode for Motion Estimation where each view is independently executed*/
+    uint32_t reserved              :26;                         /**< [in]: Reserved and must be set to 0 */
+    uint32_t reserved1 [255];                                   /**< [in]: Reserved and must be set to 0 */
+    void*    reserved2[64];                                     /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_CONFIG_H264_MEONLY;
+
+
+/**
+ * \struct _NV_ENC_CONFIG_HEVC_MEONLY
+ * HEVC encoder configuration parameters for ME only Mode
+ *
+ */
+typedef struct _NV_ENC_CONFIG_HEVC_MEONLY
+{
+    uint32_t reserved [256];                                   /**< [in]: Reserved and must be set to 0 */
+    void*    reserved1[64];                                     /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_CONFIG_HEVC_MEONLY;
+
+/**
+ * \struct _NV_ENC_CODEC_CONFIG
+ * Codec-specific encoder configuration parameters to be set during initialization.
+ */
+typedef union _NV_ENC_CODEC_CONFIG
+{
+    NV_ENC_CONFIG_H264        h264Config;                /**< [in]: Specifies the H.264-specific encoder configuration. */
+    NV_ENC_CONFIG_HEVC        hevcConfig;                /**< [in]: Specifies the HEVC-specific encoder configuration. */
+    NV_ENC_CONFIG_AV1         av1Config;                 /**< [in]: Specifies the AV1-specific encoder configuration. */
+    NV_ENC_CONFIG_H264_MEONLY h264MeOnlyConfig;          /**< [in]: Specifies the H.264-specific ME only encoder configuration. */
+    NV_ENC_CONFIG_HEVC_MEONLY hevcMeOnlyConfig;          /**< [in]: Specifies the HEVC-specific ME only encoder configuration. */
+    uint32_t                reserved[320];               /**< [in]: Reserved and must be set to 0 */
+} NV_ENC_CODEC_CONFIG;
+
+
+/**
+ * \struct _NV_ENC_CONFIG
+ * Encoder configuration parameters to be set during initialization.
+ */
+typedef struct _NV_ENC_CONFIG
+{
+    uint32_t                        version;                                     /**< [in]: Struct version. Must be set to ::NV_ENC_CONFIG_VER. */
+    GUID                            profileGUID;                                 /**< [in]: Specifies the codec profile GUID. If client specifies \p NV_ENC_CODEC_PROFILE_AUTOSELECT_GUID the NvEncodeAPI interface will select the appropriate codec profile. */
+    uint32_t                        gopLength;                                   /**< [in]: Specifies the number of pictures in one GOP. Low latency application client can set gopLength to NVENC_INFINITE_GOPLENGTH so that keyframes are not inserted automatically. */
+    int32_t                         frameIntervalP;                              /**< [in]: Specifies the GOP pattern as follows: \p frameIntervalP = 0: I, 1: IPP, 2: IBP, 3: IBBP. If gopLength is set to NVENC_INFINITE_GOPLENGTH, \p frameIntervalP should be set to 1. */
+    uint32_t                        monoChromeEncoding;                          /**< [in]: Set this to 1 to enable monochrome encoding for this session. */
+    NV_ENC_PARAMS_FRAME_FIELD_MODE  frameFieldMode;                              /**< [in]: Specifies the frame/field mode.
+                                                                                            Check support for field encoding using ::NV_ENC_CAPS_SUPPORT_FIELD_ENCODING caps.
+                                                                                            Using a frameFieldMode other than NV_ENC_PARAMS_FRAME_FIELD_MODE_FRAME for RGB input is not supported. */
+    NV_ENC_MV_PRECISION             mvPrecision;                                 /**< [in]: Specifies the desired motion vector prediction precision. */
+    NV_ENC_RC_PARAMS                rcParams;                                    /**< [in]: Specifies the rate control parameters for the current encoding session. */
+    NV_ENC_CODEC_CONFIG             encodeCodecConfig;                           /**< [in]: Specifies the codec specific config parameters through this union. */
+    uint32_t                        reserved [278];                              /**< [in]: Reserved and must be set to 0 */
+    void*                           reserved2[64];                               /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_CONFIG;
+
+/** macro for constructing the version field of ::_NV_ENC_CONFIG */
+#define NV_ENC_CONFIG_VER (NVENCAPI_STRUCT_VERSION(8) | ( 1u<<31 ))
+
+/**
+ *  Tuning information of NVENC encoding (TuningInfo is not applicable to H264 and HEVC MEOnly mode).
+ */
+typedef enum NV_ENC_TUNING_INFO
+{
+    NV_ENC_TUNING_INFO_UNDEFINED         = 0,                                     /**< Undefined tuningInfo. Invalid value for encoding. */
+    NV_ENC_TUNING_INFO_HIGH_QUALITY      = 1,                                     /**< Tune presets for latency tolerant encoding.*/
+    NV_ENC_TUNING_INFO_LOW_LATENCY       = 2,                                     /**< Tune presets for low latency streaming.*/
+    NV_ENC_TUNING_INFO_ULTRA_LOW_LATENCY = 3,                                     /**< Tune presets for ultra low latency streaming.*/
+    NV_ENC_TUNING_INFO_LOSSLESS          = 4,                                     /**< Tune presets for lossless encoding.*/
+    NV_ENC_TUNING_INFO_COUNT                                                      /**< Count number of tuningInfos. Invalid value. */
+}NV_ENC_TUNING_INFO;
+
+/**
+ * \struct _NV_ENC_INITIALIZE_PARAMS
+ * Encode Session Initialization parameters.
+ */
+typedef struct _NV_ENC_INITIALIZE_PARAMS
+{
+    uint32_t                                   version;                         /**< [in]: Struct version. Must be set to ::NV_ENC_INITIALIZE_PARAMS_VER. */
+    GUID                                       encodeGUID;                      /**< [in]: Specifies the Encode GUID for which the encoder is being created. ::NvEncInitializeEncoder() API will fail if this is not set, or set to unsupported value. */
+    GUID                                       presetGUID;                      /**< [in]: Specifies the preset for encoding. If the preset GUID is set, then the preset configuration will be applied before any other parameter. */
+    uint32_t                                   encodeWidth;                     /**< [in]: Specifies the encode width. If not set ::NvEncInitializeEncoder() API will fail. */
+    uint32_t                                   encodeHeight;                    /**< [in]: Specifies the encode height. If not set ::NvEncInitializeEncoder() API will fail. */
+    uint32_t                                   darWidth;                        /**< [in]: Specifies the display aspect ratio width (H264/HEVC) or the render width (AV1). */
+    uint32_t                                   darHeight;                       /**< [in]: Specifies the display aspect ratio height (H264/HEVC) or the render height (AV1). */
+    uint32_t                                   frameRateNum;                    /**< [in]: Specifies the numerator for frame rate used for encoding in frames per second ( Frame rate = frameRateNum / frameRateDen ). */
+    uint32_t                                   frameRateDen;                    /**< [in]: Specifies the denominator for frame rate used for encoding in frames per second ( Frame rate = frameRateNum / frameRateDen ). */
+    uint32_t                                   enableEncodeAsync;               /**< [in]: Set this to 1 to enable asynchronous mode. The client is then expected to use events to get picture completion notification. */
+    uint32_t                                   enablePTD;                       /**< [in]: Set this to 1 to enable the Picture Type Decision to be taken by the NvEncodeAPI interface. */
+    uint32_t                                   reportSliceOffsets        :1;    /**< [in]: Set this to 1 to enable reporting slice offsets in ::_NV_ENC_LOCK_BITSTREAM. NV_ENC_INITIALIZE_PARAMS::enableEncodeAsync must be set to 0 to use this feature. Client must set this to 0 if NV_ENC_CONFIG_H264::sliceMode is 1 on Kepler GPUs */
+    uint32_t                                   enableSubFrameWrite       :1;    /**< [in]: Set this to 1 to write out available bitstream to memory at subframe intervals.
+                                                                                           If enableSubFrameWrite = 1, then the hardware encoder returns data as soon as a slice (H264/HEVC) or tile (AV1) has completed encoding.
+                                                                                           This results in better encoding latency, but the downside is that the application has to keep polling via a call to nvEncLockBitstream API continuously to see if any encoded slice/tile data is available.
+                                                                                           Use this mode if you feel that the marginal reduction in latency from sub-frame encoding is worth the increase in complexity due to CPU-based polling. */
+    uint32_t                                   enableExternalMEHints     :1;    /**< [in]: Set to 1 to enable external ME hints for the current frame. For NV_ENC_INITIALIZE_PARAMS::enablePTD=1 with B frames, programming L1 hints is optional for B frames since Client doesn't know internal GOP structure.
+                                                                                           NV_ENC_PIC_PARAMS::meHintRefPicDist should preferably be set with enablePTD=1. */
+    uint32_t                                   enableMEOnlyMode          :1;    /**< [in]: Set to 1 to enable ME Only Mode. */
+    uint32_t                                   enableWeightedPrediction  :1;    /**< [in]: Set this to 1 to enable weighted prediction. Not supported if encode session is configured for B-Frames (i.e. NV_ENC_CONFIG::frameIntervalP > 1 or preset >=P3 when tuningInfo = ::NV_ENC_TUNING_INFO_HIGH_QUALITY or
+                                                                                           tuningInfo = ::NV_ENC_TUNING_INFO_LOSSLESS. This is because preset >=p3 internally enables B frames when tuningInfo = ::NV_ENC_TUNING_INFO_HIGH_QUALITY or ::NV_ENC_TUNING_INFO_LOSSLESS). */
+    uint32_t                                   enableOutputInVidmem      :1;    /**< [in]: Set this to 1 to enable output of NVENC in video memory buffer created by application. This feature is not supported for HEVC ME only mode. */
+    uint32_t                                   reservedBitFields         :26;   /**< [in]: Reserved bitfields and must be set to 0 */
+    uint32_t                                   privDataSize;                    /**< [in]: Reserved private data buffer size and must be set to 0 */
+    void*                                      privData;                        /**< [in]: Reserved private data buffer and must be set to NULL */
+    NV_ENC_CONFIG*                             encodeConfig;                    /**< [in]: Specifies the advanced codec specific structure. If client has sent a valid codec config structure, it will override parameters set by the NV_ENC_INITIALIZE_PARAMS::presetGUID parameter. If set to NULL the NvEncodeAPI interface will use the NV_ENC_INITIALIZE_PARAMS::presetGUID to set the codec specific parameters.
+                                                                                           Client can also optionally query the NvEncodeAPI interface to get codec specific parameters for a presetGUID using ::NvEncGetEncodePresetConfig() API. It can then modify (if required) some of the codec config parameters and send down a custom config structure as part of ::_NV_ENC_INITIALIZE_PARAMS.
+                                                                                           Even in this case client is recommended to pass the same preset guid it has used in ::NvEncGetEncodePresetConfig() API to query the config structure; as NV_ENC_INITIALIZE_PARAMS::presetGUID. This will not override the custom config structure but will be used to determine other Encoder HW specific parameters not exposed in the API. */
+    uint32_t                                   maxEncodeWidth;                  /**< [in]: Maximum encode width to be used for current Encode session.
+                                                                                           Client should allocate output buffers according to this dimension for dynamic resolution change. If set to 0, Encoder will not allow dynamic resolution change. */
+    uint32_t                                   maxEncodeHeight;                 /**< [in]: Maximum encode height to be allowed for current Encode session.
+                                                                                           Client should allocate output buffers according to this dimension for dynamic resolution change. If set to 0, Encoder will not allow dynamic resolution change. */
+    NVENC_EXTERNAL_ME_HINT_COUNTS_PER_BLOCKTYPE maxMEHintCountsPerBlock[2];     /**< [in]: If Client wants to pass external motion vectors in NV_ENC_PIC_PARAMS::meExternalHints buffer it must specify the maximum number of hint candidates per block per direction for the encode session.
+                                                                                           The NV_ENC_INITIALIZE_PARAMS::maxMEHintCountsPerBlock[0] is for L0 predictors and NV_ENC_INITIALIZE_PARAMS::maxMEHintCountsPerBlock[1] is for L1 predictors.
+                                                                                           The client must also set NV_ENC_INITIALIZE_PARAMS::enableExternalMEHints to 1. */
+    NV_ENC_TUNING_INFO                         tuningInfo;                      /**< [in]: Tuning Info of NVENC encoding(TuningInfo is not applicable to H264 and HEVC meonly mode). */
+    NV_ENC_BUFFER_FORMAT                       bufferFormat;                    /**< [in]: Input buffer format. Used only when DX12 interface type is used */
+    uint32_t                                   reserved [287];                  /**< [in]: Reserved and must be set to 0 */
+    void*                                      reserved2[64];                   /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_INITIALIZE_PARAMS;
+
+/** macro for constructing the version field of ::_NV_ENC_INITIALIZE_PARAMS */
+#define NV_ENC_INITIALIZE_PARAMS_VER (NVENCAPI_STRUCT_VERSION(5) | ( 1u<<31 ))
+
+
+/**
+ * \struct _NV_ENC_RECONFIGURE_PARAMS
+ * Encode Session Reconfigured parameters.
+ */
+typedef struct _NV_ENC_RECONFIGURE_PARAMS
+{
+    uint32_t                                    version;                        /**< [in]: Struct version. Must be set to ::NV_ENC_RECONFIGURE_PARAMS_VER. */
+    NV_ENC_INITIALIZE_PARAMS                    reInitEncodeParams;             /**< [in]: Encoder session re-initialization parameters.
+                                                                                           If reInitEncodeParams.encodeConfig is NULL and
+                                                                                           reInitEncodeParams.presetGUID is the same as the preset
+                                                                                           GUID specified on the call to NvEncInitializeEncoder(),
+                                                                                           EncodeAPI will continue to use the existing encode
+                                                                                           configuration.
+                                                                                           If reInitEncodeParams.encodeConfig is NULL and
+                                                                                           reInitEncodeParams.presetGUID is different from the preset
+                                                                                           GUID specified on the call to NvEncInitializeEncoder(),
+                                                                                           EncodeAPI will try to use the default configuration for
+                                                                                           the preset specified by reInitEncodeParams.presetGUID.
+                                                                                           In this case, reconfiguration may fail if the new
+                                                                                           configuration is incompatible with the existing
+                                                                                           configuration (e.g. the new configuration results in
+                                                                                           a change in the GOP structure). */
+    uint32_t                                    resetEncoder            :1;     /**< [in]: This resets the rate control states and other internal encoder states. This should be used only with an IDR frame.
+                                                                                           If NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1, encoder will force the frame type to IDR */
+    uint32_t                                    forceIDR                :1;     /**< [in]: Encode the current picture as an IDR picture. This flag is only valid when Picture type decision is taken by the Encoder
+                                                                                           [_NV_ENC_INITIALIZE_PARAMS::enablePTD == 1]. */
+    uint32_t                                    reserved                :30;
+
+}NV_ENC_RECONFIGURE_PARAMS;
+
+/** macro for constructing the version field of ::_NV_ENC_RECONFIGURE_PARAMS */
+#define NV_ENC_RECONFIGURE_PARAMS_VER (NVENCAPI_STRUCT_VERSION(1) | ( 1u<<31 ))
+
+/**
+ * \struct _NV_ENC_PRESET_CONFIG
+ * Encoder preset config
+ */
+typedef struct _NV_ENC_PRESET_CONFIG
+{
+    uint32_t      version;                               /**< [in]:  Struct version. Must be set to ::NV_ENC_PRESET_CONFIG_VER. */
+    NV_ENC_CONFIG presetCfg;                             /**< [out]: preset config returned by the Nvidia Video Encoder interface. */
+    uint32_t      reserved1[255];                        /**< [in]: Reserved and must be set to 0 */
+    void*         reserved2[64];                         /**< [in]: Reserved and must be set to NULL */
+}NV_ENC_PRESET_CONFIG;
+
+/** macro for constructing the version field of ::_NV_ENC_PRESET_CONFIG */
+#define NV_ENC_PRESET_CONFIG_VER (NVENCAPI_STRUCT_VERSION(4) | ( 1u<<31 ))
+
+
+/**
+ * \struct _NV_ENC_PIC_PARAMS_MVC
+ * MVC-specific parameters to be sent on a per-frame basis.
+ */
+typedef struct _NV_ENC_PIC_PARAMS_MVC
+{
+    uint32_t version;                                    /**< [in]: Struct version. Must be set to ::NV_ENC_PIC_PARAMS_MVC_VER. */
+    uint32_t viewID;                                     /**< [in]: Specifies the view ID associated with the current input view. */
+    uint32_t temporalID;                                 /**< [in]: Specifies the temporal ID associated with the current input view. */
+    uint32_t priorityID;                                 /**< [in]: Specifies the priority ID associated with the current input view. Reserved and ignored by the NvEncodeAPI interface. */
+    uint32_t reserved1[12];                              /**< [in]: Reserved and must be set to 0. */
+    void*    reserved2[8];                              /**< [in]: Reserved and must be set to NULL. */
+}NV_ENC_PIC_PARAMS_MVC;
+
+/** macro for constructing the version field of ::_NV_ENC_PIC_PARAMS_MVC */
+#define NV_ENC_PIC_PARAMS_MVC_VER NVENCAPI_STRUCT_VERSION(1)
+
+
+/**
+ * \union _NV_ENC_PIC_PARAMS_H264_EXT
+ * H264 extension  picture parameters
+ */
+typedef union _NV_ENC_PIC_PARAMS_H264_EXT
+{
+    NV_ENC_PIC_PARAMS_MVC mvcPicParams;                  /**< [in]: Specifies the MVC picture parameters. */
+    uint32_t reserved1[32];                              /**< [in]: Reserved and must be set to 0.        */
+}NV_ENC_PIC_PARAMS_H264_EXT;
+
+/**
+ * \struct _NV_ENC_SEI_PAYLOAD
+ *  User SEI message
+ */
+typedef struct _NV_ENC_SEI_PAYLOAD
+{
+    uint32_t payloadSize;            /**< [in] SEI payload size in bytes. SEI payload must be byte aligned, as described in Annex D */
+    uint32_t payloadType;            /**< [in] SEI payload types and syntax can be found in Annex D of the H.264 Specification. */
+    uint8_t *payload;                /**< [in] pointer to user data */
+} NV_ENC_SEI_PAYLOAD;
+
+#define NV_ENC_H264_SEI_PAYLOAD NV_ENC_SEI_PAYLOAD
+
+/**
+ * \struct _NV_ENC_PIC_PARAMS_H264
+ * H264 specific enc pic params. sent on a per frame basis.
+ */
+typedef struct _NV_ENC_PIC_PARAMS_H264
+{
+    uint32_t displayPOCSyntax;                           /**< [in]: Specifies the display POC syntax. This is required to be set if client is handling the picture type decision. */
+    uint32_t reserved3;                                  /**< [in]: Reserved and must be set to 0 */
+    uint32_t refPicFlag;                                 /**< [in]: Set to 1 for a reference picture. This is ignored if NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1. */
+    uint32_t colourPlaneId;                              /**< [in]: Specifies the colour plane ID associated with the current input. */
+    uint32_t forceIntraRefreshWithFrameCnt;              /**< [in]: Forces an intra refresh with duration equal to intraRefreshFrameCnt.
+                                                                    When outputRecoveryPointSEI is set this value is used for recovery_frame_cnt in recovery point SEI message
+                                                                    forceIntraRefreshWithFrameCnt cannot be used if B frames are used in the GOP structure specified */
+    uint32_t constrainedFrame           :1;              /**< [in]: Set to 1 if client wants to encode this frame with each slice completely independent of other slices in the frame.
+                                                                    NV_ENC_INITIALIZE_PARAMS::enableConstrainedEncoding should be set to 1 */
+    uint32_t sliceModeDataUpdate        :1;              /**< [in]: Set to 1 if client wants to change the sliceModeData field to specify new sliceSize Parameter
+                                                                    When forceIntraRefreshWithFrameCnt is set it will have priority over sliceMode setting */
+    uint32_t ltrMarkFrame               :1;              /**< [in]: Set to 1 if client wants to mark this frame as LTR */
+    uint32_t ltrUseFrames               :1;              /**< [in]: Set to 1 if client allows encoding this frame using the LTR frames specified in ltrFrameBitmap */
+    uint32_t reservedBitFields          :28;             /**< [in]: Reserved bit fields and must be set to 0 */
+    uint8_t* sliceTypeData;                              /**< [in]: Deprecated. */
+    uint32_t sliceTypeArrayCnt;                          /**< [in]: Deprecated. */
+    uint32_t seiPayloadArrayCnt;                         /**< [in]: Specifies the number of elements allocated in  seiPayloadArray array. */
+    NV_ENC_SEI_PAYLOAD* seiPayloadArray;                 /**< [in]: Array of SEI payloads which will be inserted for this frame. */
+    uint32_t sliceMode;                                  /**< [in]: This parameter in conjunction with sliceModeData specifies the way in which the picture is divided into slices
+                                                                    sliceMode = 0 MB based slices, sliceMode = 1 Byte based slices, sliceMode = 2 MB row based slices, sliceMode = 3, numSlices in Picture
+                                                                    When forceIntraRefreshWithFrameCnt is set it will have priority over sliceMode setting
+                                                                    When sliceMode == 0 and sliceModeData == 0 whole picture will be coded with one slice */
+    uint32_t sliceModeData;                              /**< [in]: Specifies the parameter needed for sliceMode. For:
+                                                                    sliceMode = 0, sliceModeData specifies # of MBs in each slice (except last slice)
+                                                                    sliceMode = 1, sliceModeData specifies maximum # of bytes in each slice (except last slice)
+                                                                    sliceMode = 2, sliceModeData specifies # of MB rows in each slice (except last slice)
+                                                                    sliceMode = 3, sliceModeData specifies number of slices in the picture. Driver will divide picture into slices optimally */
+    uint32_t ltrMarkFrameIdx;                            /**< [in]: Specifies the long term reference frame index to use for marking this frame as LTR.*/
+    uint32_t ltrUseFrameBitmap;                          /**< [in]: Specifies the associated bitmap of LTR frame indices to use when encoding this frame. */
+    uint32_t ltrUsageMode;                               /**< [in]: Not supported. Reserved for future use and must be set to 0. */
+    uint32_t forceIntraSliceCount;                       /**< [in]: Specifies the number of slices to be forced to Intra in the current picture.
+                                                                    This option along with forceIntraSliceIdx[] array needs to be used with sliceMode = 3 only */
+    uint32_t *forceIntraSliceIdx;                        /**< [in]: Slice indices to be forced to intra in the current picture. Each slice index should be <= num_slices_in_picture -1. Index starts from 0 for first slice.
+                                                                    The number of entries in this array should be equal to forceIntraSliceCount */
+    NV_ENC_PIC_PARAMS_H264_EXT h264ExtPicParams;         /**< [in]: Specifies the H264 extension config parameters using this config. */
+    NV_ENC_TIME_CODE timeCode;                           /**< [in]: Specifies the clock timestamp sets used in picture timing SEI. Applicable only when NV_ENC_CONFIG_H264::enableTimeCode is set to 1. */
+    uint32_t reserved [203];                             /**< [in]: Reserved and must be set to 0. */
+    void*    reserved2[61];                              /**< [in]: Reserved and must be set to NULL. */
+} NV_ENC_PIC_PARAMS_H264;
+
+/**
+ * \struct _NV_ENC_PIC_PARAMS_HEVC
+ * HEVC specific enc pic params. sent on a per frame basis.
+ */
+typedef struct _NV_ENC_PIC_PARAMS_HEVC
+{
+    uint32_t displayPOCSyntax;                           /**< [in]: Specifies the display POC syntax. This is required to be set if client is handling the picture type decision. */
+    uint32_t refPicFlag;                                 /**< [in]: Set to 1 for a reference picture. This is ignored if NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1. */
+    uint32_t temporalId;                                 /**< [in]: Specifies the temporal id of the picture */
+    uint32_t forceIntraRefreshWithFrameCnt;              /**< [in]: Forces an intra refresh with duration equal to intraRefreshFrameCnt.
+                                                                    When outputRecoveryPointSEI is set this value is used for recovery_frame_cnt in recovery point SEI message
+                                                                    forceIntraRefreshWithFrameCnt cannot be used if B frames are used in the GOP structure specified */
+    uint32_t constrainedFrame           :1;              /**< [in]: Set to 1 if client wants to encode this frame with each slice completely independent of other slices in the frame.
+                                                                    NV_ENC_INITIALIZE_PARAMS::enableConstrainedEncoding should be set to 1 */
+    uint32_t sliceModeDataUpdate        :1;              /**< [in]: Set to 1 if client wants to change the sliceModeData field to specify new sliceSize Parameter
+                                                                    When forceIntraRefreshWithFrameCnt is set it will have priority over sliceMode setting */
+    uint32_t ltrMarkFrame               :1;              /**< [in]: Set to 1 if client wants to mark this frame as LTR */
+    uint32_t ltrUseFrames               :1;              /**< [in]: Set to 1 if client allows encoding this frame using the LTR frames specified in ltrFrameBitmap */
+    uint32_t reservedBitFields          :28;             /**< [in]: Reserved bit fields and must be set to 0 */
+    uint8_t* sliceTypeData;                              /**< [in]: Array which specifies the slice type used to force intra slice for a particular slice. Currently supported only for NV_ENC_CONFIG_H264::sliceMode == 3.
+                                                                    Client should allocate array of size sliceModeData where sliceModeData is specified in field of ::_NV_ENC_CONFIG_H264
+                                                                    Array element with index n corresponds to nth slice. To force a particular slice to intra client should set corresponding array element to NV_ENC_SLICE_TYPE_I
+                                                                    all other array elements should be set to NV_ENC_SLICE_TYPE_DEFAULT */
+    uint32_t sliceTypeArrayCnt;                          /**< [in]: Client should set this to the number of elements allocated in sliceTypeData array. If sliceTypeData is NULL then this should be set to 0 */
+    uint32_t sliceMode;                                  /**< [in]: This parameter in conjunction with sliceModeData specifies the way in which the picture is divided into slices
+                                                                    sliceMode = 0 CTU based slices, sliceMode = 1 Byte based slices, sliceMode = 2 CTU row based slices, sliceMode = 3, numSlices in Picture
+                                                                    When forceIntraRefreshWithFrameCnt is set it will have priority over sliceMode setting
+                                                                    When sliceMode == 0 and sliceModeData == 0 whole picture will be coded with one slice */
+    uint32_t sliceModeData;                              /**< [in]: Specifies the parameter needed for sliceMode. For:
+                                                                    sliceMode = 0, sliceModeData specifies # of CTUs in each slice (except last slice)
+                                                                    sliceMode = 1, sliceModeData specifies maximum # of bytes in each slice (except last slice)
+                                                                    sliceMode = 2, sliceModeData specifies # of CTU rows in each slice (except last slice)
+                                                                    sliceMode = 3, sliceModeData specifies number of slices in the picture. Driver will divide picture into slices optimally */
+    uint32_t ltrMarkFrameIdx;                            /**< [in]: Specifies the long term reference frame index to use for marking this frame as LTR.*/
+    uint32_t ltrUseFrameBitmap;                          /**< [in]: Specifies the associated bitmap of LTR frame indices to use when encoding this frame. */
+    uint32_t ltrUsageMode;                               /**< [in]: Not supported. Reserved for future use and must be set to 0. */
+    uint32_t seiPayloadArrayCnt;                         /**< [in]: Specifies the number of elements allocated in  seiPayloadArray array. */
+    uint32_t reserved;                                   /**< [in]: Reserved and must be set to 0. */
+    NV_ENC_SEI_PAYLOAD* seiPayloadArray;                 /**< [in]: Array of SEI payloads which will be inserted for this frame. */
+    NV_ENC_TIME_CODE timeCode;                           /**< [in]: Specifies the clock timestamp sets used in time code SEI. Applicable only when NV_ENC_CONFIG_HEVC::enableTimeCodeSEI is set to 1. */
+    uint32_t reserved2 [237];                            /**< [in]: Reserved and must be set to 0. */
+    void*    reserved3[61];                              /**< [in]: Reserved and must be set to NULL. */
+} NV_ENC_PIC_PARAMS_HEVC;
+
+#define NV_ENC_AV1_OBU_PAYLOAD NV_ENC_SEI_PAYLOAD
+
+/**
+* \struct _NV_ENC_PIC_PARAMS_AV1
+* AV1 specific enc pic params. sent on a per frame basis.
+*/
+typedef struct _NV_ENC_PIC_PARAMS_AV1
+{
+    uint32_t displayPOCSyntax;                           /**< [in]: Specifies the display POC syntax. This is required to be set if client is handling the picture type decision. */
+    uint32_t refPicFlag;                                 /**< [in]: Set to 1 for a reference picture. This is ignored if NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1. */
+    uint32_t temporalId;                                 /**< [in]: Specifies the temporal id of the picture */
+    uint32_t forceIntraRefreshWithFrameCnt;              /**< [in]: Forces an intra refresh with duration equal to intraRefreshFrameCnt.
+                                                                    forceIntraRefreshWithFrameCnt cannot be used if B frames are used in the GOP structure specified */
+    uint32_t goldenFrameFlag            : 1;             /**< [in]: Encode frame as Golden Frame. This is ignored if NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1. */
+    uint32_t arfFrameFlag               : 1;             /**< [in]: Encode frame as Alternate Reference Frame. This is ignored if NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1. */
+    uint32_t arf2FrameFlag              : 1;             /**< [in]: Encode frame as Alternate Reference 2 Frame. This is ignored if NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1. */
+    uint32_t bwdFrameFlag               : 1;             /**< [in]: Encode frame as Backward Reference Frame. This is ignored if NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1. */
+    uint32_t overlayFrameFlag           : 1;             /**< [in]: Encode frame as overlay frame. A previously encoded frame with the same displayPOCSyntax value should be present in reference frame buffer.
+                                                                    This is ignored if NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1. */
+    uint32_t showExistingFrameFlag      : 1;             /**< [in]: When overlayFrameFlag is set to 1, this flag controls the value of the show_existing_frame syntax element associated with the overlay frame.
+                                                                    This flag is added to the interface as a placeholder. Its value is ignored for now and always assumed to be set to 1.
+                                                                    This is ignored if NV_ENC_INITIALIZE_PARAMS::enablePTD is set to 1. */
+    uint32_t errorResilientModeFlag     : 1;             /**< [in]: encode frame independently from previously encoded frames */
+
+    uint32_t tileConfigUpdate           : 1;             /**< [in]: Set to 1 if client wants to overwrite the default tile configuration with the tile parameters specified below
+                                                                    When forceIntraRefreshWithFrameCnt is set it will have priority over tileConfigUpdate setting */
+    uint32_t enableCustomTileConfig     : 1;             /**< [in]: Set 1 to enable custom tile configuration: numTileColumns and numTileRows must have non zero values and tileWidths and tileHeights must point to a valid address  */
+    uint32_t filmGrainParamsUpdate      : 1;             /**< [in]: Set to 1 if client wants to update previous film grain parameters: filmGrainParams must point to a valid address and encoder must have been configured with film grain enabled  */
+    uint32_t reservedBitFields          : 22;            /**< [in]: Reserved bitfields and must be set to 0 */
+    uint32_t numTileColumns;                             /**< [in]: This parameter in conjunction with the flag enableCustomTileConfig and the array tileWidths[] specifies the way in which the picture is divided into tile columns.
+                                                                    When enableCustomTileConfig == 0, the picture will be uniformly divided into numTileColumns tile columns. If numTileColumns is not a power of 2,
+                                                                    it will be rounded down to the next power of 2 value. If numTileColumns == 0, the picture will be coded with the smallest number of vertical tiles as allowed by standard.
+                                                                    When enableCustomTileConfig == 1, numTileColumns must be > 0 and <= NV_MAX_TILE_COLS_AV1 and tileWidths must point to a valid array of numTileColumns entries.
+                                                                    Entry i specifies the width in 64x64 CTU unit of tile column i. The sum of all the entries should be equal to the picture width in 64x64 CTU units. */
+    uint32_t numTileRows;                                /**< [in]: This parameter in conjunction with the flag enableCustomTileConfig and the array tileHeights[] specifies the way in which the picture is divided into tile rows.
+                                                                    When enableCustomTileConfig == 0, the picture will be uniformly divided into numTileRows tile rows. If numTileRows is not a power of 2,
+                                                                    it will be rounded down to the next power of 2 value. If numTileRows == 0, the picture will be coded with the smallest number of horizontal tiles as allowed by standard.
+                                                                    When enableCustomTileConfig == 1, numTileRows must be > 0 and <= NV_MAX_TILE_ROWS_AV1 and tileHeights must point to a valid array of numTileRows entries.
+                                                                    Entry i specifies the height in 64x64 CTU unit of tile row i. The sum of all the entries should be equal to the picture height in 64x64 CTU units. */
+    uint32_t *tileWidths;                                /**< [in]: If enableCustomTileConfig == 1, tileWidths[i] specifies the width of tile column i in 64x64 CTU unit, with 0 <= i <= numTileColumns -1. */
+    uint32_t *tileHeights;                               /**< [in]: If enableCustomTileConfig == 1, tileHeights[i] specifies the height of tile row i in 64x64 CTU unit, with 0 <= i <= numTileRows -1. */
+    uint32_t obuPayloadArrayCnt;                         /**< [in]: Specifies the number of elements allocated in  obuPayloadArray array. */
+    uint32_t reserved;                                   /**< [in]: Reserved and must be set to 0. */
+    NV_ENC_AV1_OBU_PAYLOAD* obuPayloadArray;             /**< [in]: Array of OBU payloads which will be inserted for this frame. */
+    NV_ENC_FILM_GRAIN_PARAMS_AV1 *filmGrainParams;       /**< [in]: If filmGrainParamsUpdate == 1, filmGrainParams must point to a valid NV_ENC_FILM_GRAIN_PARAMS_AV1 structure */
+    uint32_t reserved2[247];                             /**< [in]: Reserved and must be set to 0. */
+    void*    reserved3[61];                              /**< [in]: Reserved and must be set to NULL. */
+} NV_ENC_PIC_PARAMS_AV1;
+
+/**
+ * Codec specific per-picture encoding parameters.
+ */
+typedef union _NV_ENC_CODEC_PIC_PARAMS
+{
+    NV_ENC_PIC_PARAMS_H264 h264PicParams;                /**< [in]: H264 encode picture params. */
+    NV_ENC_PIC_PARAMS_HEVC hevcPicParams;                /**< [in]: HEVC encode picture params. */
+    NV_ENC_PIC_PARAMS_AV1  av1PicParams;                 /**< [in]: AV1 encode picture params. */
+    uint32_t               reserved[256];                /**< [in]: Reserved and must be set to 0. */
+} NV_ENC_CODEC_PIC_PARAMS;
+
+
+/**
+ * \struct _NV_ENC_PIC_PARAMS
+ * Encoding parameters that need to be sent on a per frame basis.
+ */
+typedef struct _NV_ENC_PIC_PARAMS
+{
+    uint32_t                                    version;                        /**< [in]: Struct version. Must be set to ::NV_ENC_PIC_PARAMS_VER. */
+    uint32_t                                    inputWidth;                     /**< [in]: Specifies the input frame width */
+    uint32_t                                    inputHeight;                    /**< [in]: Specifies the input frame height */
+    uint32_t                                    inputPitch;                     /**< [in]: Specifies the input buffer pitch. If pitch value is not known, set this to inputWidth. */
+    uint32_t                                    encodePicFlags;                 /**< [in]: Specifies bit-wise OR of encode picture flags. See ::NV_ENC_PIC_FLAGS enum. */
+    uint32_t                                    frameIdx;                       /**< [in]: Specifies the frame index associated with the input frame [optional]. */
+    uint64_t                                    inputTimeStamp;                 /**< [in]: Specifies opaque data which is associated with the encoded frame, but not actually encoded in the output bitstream.
+                                                                                           This opaque data can be used later to uniquely refer to the corresponding encoded frame. For example, it can be used
+                                                                                           for identifying the frame to be invalidated in the reference picture buffer, if lost at the client. */
+    uint64_t                                    inputDuration;                  /**< [in]: Specifies duration of the input picture */
+    NV_ENC_INPUT_PTR                            inputBuffer;                    /**< [in]: Specifies the input buffer pointer. Client must use a pointer obtained from ::NvEncCreateInputBuffer() or ::NvEncMapInputResource() APIs.*/
+    NV_ENC_OUTPUT_PTR                           outputBitstream;                /**< [in]: Specifies the output buffer pointer.
+                                                                                           If NV_ENC_INITIALIZE_PARAMS::enableOutputInVidmem is set to 0, specifies the pointer to output buffer. Client should use a pointer obtained from ::NvEncCreateBitstreamBuffer() API.
+                                                                                           If NV_ENC_INITIALIZE_PARAMS::enableOutputInVidmem is set to 1, client should allocate buffer in video memory for NV_ENC_ENCODE_OUT_PARAMS struct and encoded bitstream data. Client
+                                                                                           should use a pointer obtained from ::NvEncMapInputResource() API, when mapping this output buffer and assign it to NV_ENC_PIC_PARAMS::outputBitstream.
+                                                                                           First 256 bytes of this buffer should be interpreted as NV_ENC_ENCODE_OUT_PARAMS struct followed by encoded bitstream data. Recommended size for output buffer is sum of size of
+                                                                                           NV_ENC_ENCODE_OUT_PARAMS struct and twice the input frame size for lower resolution eg. CIF and 1.5 times the input frame size for higher resolutions. If encoded bitstream size is
+                                                                                           greater than the allocated buffer size for encoded bitstream, then the output buffer will have encoded bitstream data equal to buffer size. All CUDA operations on this buffer must use
+                                                                                           the default stream. */
+    void*                                       completionEvent;                /**< [in]: Specifies an event to be signaled on completion of encoding of this Frame [only if operating in Asynchronous mode]. Each output buffer should be associated with a distinct event pointer. */
+    NV_ENC_BUFFER_FORMAT                        bufferFmt;                      /**< [in]: Specifies the input buffer format. */
+    NV_ENC_PIC_STRUCT                           pictureStruct;                  /**< [in]: Specifies structure of the input picture. */
+    NV_ENC_PIC_TYPE                             pictureType;                    /**< [in]: Specifies input picture type. Must be set explicitly by the client if the client has not set NV_ENC_INITIALIZE_PARAMS::enablePTD to 1 while calling NvEncInitializeEncoder. */
+    NV_ENC_CODEC_PIC_PARAMS                     codecPicParams;                 /**< [in]: Specifies the codec specific per-picture encoding parameters. */
+    NVENC_EXTERNAL_ME_HINT_COUNTS_PER_BLOCKTYPE meHintCountsPerBlock[2];        /**< [in]: For H264 and Hevc, specifies the number of hint candidates per block per direction for the current frame. meHintCountsPerBlock[0] is for L0 predictors and meHintCountsPerBlock[1] is for L1 predictors.
+                                                                                           The candidate count in NV_ENC_PIC_PARAMS::meHintCountsPerBlock[lx] must never exceed NV_ENC_INITIALIZE_PARAMS::maxMEHintCountsPerBlock[lx] provided during encoder initialization. */
+    NVENC_EXTERNAL_ME_HINT                     *meExternalHints;                /**< [in]: For H264 and HEVC, specifies the pointer to ME external hints for the current frame. The size of ME hint buffer should be equal to number of macroblocks * the total number of candidates per macroblock.
+                                                                                           The total number of candidates per MB per direction = 1*meHintCountsPerBlock[Lx].numCandsPerBlk16x16 + 2*meHintCountsPerBlock[Lx].numCandsPerBlk16x8 + 2*meHintCountsPerBlock[Lx].numCandsPerBlk8x16
+                                                                                           + 4*meHintCountsPerBlock[Lx].numCandsPerBlk8x8. For frames using bidirectional ME, the total number of candidates for a single macroblock is the sum of the total number of candidates per MB for each direction (L0 and L1). */
+    uint32_t                                    reserved1[6];                    /**< [in]: Reserved and must be set to 0 */
+    void*                                       reserved2[2];                    /**< [in]: Reserved and must be set to NULL */
+    int8_t                                     *qpDeltaMap;                      /**< [in]: Specifies the pointer to signed byte array containing value per MB for H264, per CTB for HEVC and per SB for AV1 in raster scan order for the current picture, which will be interpreted depending on NV_ENC_RC_PARAMS::qpMapMode.
+                                                                                            If NV_ENC_RC_PARAMS::qpMapMode is NV_ENC_QP_MAP_DELTA, qpDeltaMap specifies QP modifier per MB for H264, per CTB for HEVC and per SB for AV1. This QP modifier will be applied on top of the QP chosen by rate control.
+                                                                                            If NV_ENC_RC_PARAMS::qpMapMode is NV_ENC_QP_MAP_EMPHASIS, qpDeltaMap specifies Emphasis Level Map per MB for H264. This level value along with QP chosen by rate control is used to
+                                                                                            compute the QP modifier, which in turn is applied on top of QP chosen by rate control.
+                                                                                            If NV_ENC_RC_PARAMS::qpMapMode is NV_ENC_QP_MAP_DISABLED, value in qpDeltaMap will be ignored.*/
+    uint32_t                                    qpDeltaMapSize;                  /**< [in]: Specifies the size in bytes of qpDeltaMap surface allocated by client and pointed to by NV_ENC_PIC_PARAMS::qpDeltaMap. Surface (array) should be picWidthInMbs * picHeightInMbs for H264, picWidthInCtbs * picHeightInCtbs for HEVC and
+                                                                                            picWidthInSbs * picHeightInSbs for AV1 */
+    uint32_t                                    reservedBitFields;               /**< [in]: Reserved bitfields and must be set to 0 */
+    uint16_t                                    meHintRefPicDist[2];             /**< [in]: Specifies temporal distance for reference picture (NVENC_EXTERNAL_ME_HINT::refidx = 0) used during external ME with NV_ENC_INITIALIZE_PARAMS::enablePTD = 1. meHintRefPicDist[0] is for L0 hints and meHintRefPicDist[1] is for L1 hints.
+                                                                                            If not set, will internally infer distance of 1. Ignored for NV_ENC_INITIALIZE_PARAMS::enablePTD = 0 */
+    NV_ENC_INPUT_PTR                            alphaBuffer;                     /**< [in]: Specifies the input alpha buffer pointer. Client must use a pointer obtained from ::NvEncCreateInputBuffer() or ::NvEncMapInputResource() APIs.
+                                                                                            Applicable only when encoding hevc with alpha layer is enabled. */
+    NVENC_EXTERNAL_ME_SB_HINT                  *meExternalSbHints;               /**< [in]: For AV1, specifies the pointer to ME external SB hints for the current frame. The size of ME hint buffer should be equal to meSbHintsCount. */
+    uint32_t                                    meSbHintsCount;                  /**< [in]: For AV1, specifies the total number of external ME SB hint candidates for the frame.
+                                                                                            NV_ENC_PIC_PARAMS::meSbHintsCount must never exceed the total number of SBs in frame * the max number of candidates per SB provided during encoder initialization.
+                                                                                            The max number of candidates per SB is maxMeHintCountsPerBlock[0].numCandsPerSb + maxMeHintCountsPerBlock[1].numCandsPerSb */
+    uint32_t                                    reserved3[285];                  /**< [in]: Reserved and must be set to 0 */
+    void*                                       reserved4[58];                   /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_PIC_PARAMS;
+
+/** Macro for constructing the version field of ::_NV_ENC_PIC_PARAMS */
+#define NV_ENC_PIC_PARAMS_VER (NVENCAPI_STRUCT_VERSION(6) | ( 1u<<31 ))
+
+
+/**
+ * \struct _NV_ENC_MEONLY_PARAMS
+ * MEOnly parameters that need to be sent on a per motion estimation basis.
+ * NV_ENC_MEONLY_PARAMS::meExternalHints is supported for H264 only.
+ */
+typedef struct _NV_ENC_MEONLY_PARAMS
+{
+    uint32_t                version;                            /**< [in]: Struct version. Must be set to NV_ENC_MEONLY_PARAMS_VER.*/
+    uint32_t                inputWidth;                         /**< [in]: Specifies the input frame width */
+    uint32_t                inputHeight;                        /**< [in]: Specifies the input frame height */
+    NV_ENC_INPUT_PTR        inputBuffer;                        /**< [in]: Specifies the input buffer pointer. Client must use a pointer obtained from NvEncCreateInputBuffer() or NvEncMapInputResource() APIs. */
+    NV_ENC_INPUT_PTR        referenceFrame;                     /**< [in]: Specifies the reference frame pointer */
+    NV_ENC_OUTPUT_PTR       mvBuffer;                           /**< [in]: Specifies the output buffer pointer.
+                                                                           If NV_ENC_INITIALIZE_PARAMS::enableOutputInVidmem is set to 0, specifies the pointer to motion vector data buffer allocated by NvEncCreateMVBuffer.
+                                                                           Client must lock mvBuffer using ::NvEncLockBitstream() API to get the motion vector data.
+                                                                           If NV_ENC_INITIALIZE_PARAMS::enableOutputInVidmem is set to 1, client should allocate buffer in video memory for storing the motion vector data. The size of this buffer must
+                                                                           be equal to total number of macroblocks multiplied by size of NV_ENC_H264_MV_DATA struct. Client should use a pointer obtained from ::NvEncMapInputResource() API, when mapping this
+                                                                           output buffer and assign it to NV_ENC_MEONLY_PARAMS::mvBuffer. All CUDA operations on this buffer must use the default stream. */
+    NV_ENC_BUFFER_FORMAT    bufferFmt;                          /**< [in]: Specifies the input buffer format. */
+    void*                   completionEvent;                    /**< [in]: Specifies an event to be signaled on completion of motion estimation
+                                                                           of this Frame [only if operating in Asynchronous mode].
+                                                                           Each output buffer should be associated with a distinct event pointer. */
+    uint32_t                viewID;                             /**< [in]: Specifies left or right viewID if NV_ENC_CONFIG_H264_MEONLY::bStereoEnable is set.
+                                                                            viewID can be 0,1 if bStereoEnable is set, 0 otherwise. */
+    NVENC_EXTERNAL_ME_HINT_COUNTS_PER_BLOCKTYPE
+                            meHintCountsPerBlock[2];            /**< [in]: Specifies the number of hint candidates per block for the current frame. meHintCountsPerBlock[0] is for L0 predictors.
+                                                                            The candidate count in NV_ENC_MEONLY_PARAMS::meHintCountsPerBlock[lx] must never exceed NV_ENC_INITIALIZE_PARAMS::maxMEHintCountsPerBlock[lx] provided during encoder initialization. */
+    NVENC_EXTERNAL_ME_HINT  *meExternalHints;                   /**< [in]: Specifies the pointer to ME external hints for the current frame. The size of ME hint buffer should be equal to number of macroblocks * the total number of candidates per macroblock.
+                                                                            The total number of candidates per MB per direction = 1*meHintCountsPerBlock[Lx].numCandsPerBlk16x16 + 2*meHintCountsPerBlock[Lx].numCandsPerBlk16x8 + 2*meHintCountsPerBlock[Lx].numCandsPerBlk8x16
+                                                                            + 4*meHintCountsPerBlock[Lx].numCandsPerBlk8x8. For frames using bidirectional ME, the total number of candidates for a single macroblock is the sum of the total number of candidates per MB for each direction (L0 and L1). */
+    uint32_t                reserved1[243];                     /**< [in]: Reserved and must be set to 0 */
+    void*                   reserved2[59];                      /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_MEONLY_PARAMS;
+
+/** NV_ENC_MEONLY_PARAMS struct version*/
+#define NV_ENC_MEONLY_PARAMS_VER NVENCAPI_STRUCT_VERSION(3)
+
+
+/**
+ * \struct _NV_ENC_LOCK_BITSTREAM
+ * Bitstream buffer lock parameters.
+ */
+typedef struct _NV_ENC_LOCK_BITSTREAM
+{
+    uint32_t                version;                     /**< [in]: Struct version. Must be set to ::NV_ENC_LOCK_BITSTREAM_VER. */
+    uint32_t                doNotWait         :1;        /**< [in]: If this flag is set, the NvEncodeAPI interface will return buffer pointer even if operation is not completed. If not set, the call will block until operation completes. */
+    uint32_t                ltrFrame          :1;        /**< [out]: Flag indicating this frame is marked as LTR frame */
+    uint32_t                getRCStats        :1;        /**< [in]: If this flag is set then lockBitstream call will add additional intra-inter MB count and average MVX, MVY */
+    uint32_t                reservedBitFields :29;       /**< [in]: Reserved bit fields and must be set to 0 */
+    void*                   outputBitstream;             /**< [in]: Pointer to the bitstream buffer being locked. */
+    uint32_t*               sliceOffsets;                /**< [in, out]: Array which receives the slice (H264/HEVC) or tile (AV1) offsets. This is not supported if NV_ENC_CONFIG_H264::sliceMode is 1 on Kepler GPUs. Array size must be equal to size of frame in MBs. */
+    uint32_t                frameIdx;                    /**< [out]: Frame no. for which the bitstream is being retrieved. */
+    uint32_t                hwEncodeStatus;              /**< [out]: The NvEncodeAPI interface status for the locked picture. */
+    uint32_t                numSlices;                   /**< [out]: Number of slices (H264/HEVC) or tiles (AV1) in the encoded picture. Will be reported only if NV_ENC_INITIALIZE_PARAMS::reportSliceOffsets set to 1. */
+    uint32_t                bitstreamSizeInBytes;        /**< [out]: Actual number of bytes generated and copied to the memory pointed to by bitstreamBufferPtr.
+                                                                     When HEVC alpha layer encoding is enabled, this field reports the total encoded size in bytes i.e it is the encoded size of the base plus the alpha layer.
+                                                                     For AV1 when enablePTD is set, this field reports the total encoded size in bytes of all the encoded frames packed into the current output surface i.e. show frame plus all preceding no-show frames */
+    uint64_t                outputTimeStamp;             /**< [out]: Presentation timestamp associated with the encoded output. */
+    uint64_t                outputDuration;              /**< [out]: Presentation duration associated with the encoded output. */
+    void*                   bitstreamBufferPtr;          /**< [out]: Pointer to the generated output bitstream.
+                                                                     For MEOnly mode _NV_ENC_LOCK_BITSTREAM::bitstreamBufferPtr should be typecast to
+                                                                     NV_ENC_H264_MV_DATA/NV_ENC_HEVC_MV_DATA pointer respectively for H264/HEVC  */
+    NV_ENC_PIC_TYPE         pictureType;                 /**< [out]: Picture type of the encoded picture. */
+    NV_ENC_PIC_STRUCT       pictureStruct;               /**< [out]: Structure of the generated output picture. */
+    uint32_t                frameAvgQP;                  /**< [out]: Average QP of the frame. */
+    uint32_t                frameSatd;                   /**< [out]: Total SATD cost for whole frame. */
+    uint32_t                ltrFrameIdx;                 /**< [out]: Frame index associated with this LTR frame. */
+    uint32_t                ltrFrameBitmap;              /**< [out]: Bitmap of LTR frames indices which were used for encoding this frame. Value of 0 if no LTR frames were used. */
+    uint32_t                temporalId;                  /**< [out]: TemporalId value of the frame when using temporalSVC encoding */
+    uint32_t                reserved[12];                /**< [in]: Reserved and must be set to 0 */
+    uint32_t                intraMBCount;                /**< [out]: For H264, Number of Intra MBs in the encoded frame. For HEVC, Number of Intra CTBs in the encoded frame. For AV1, Number of Intra SBs in the encoded show frame. Supported only if _NV_ENC_LOCK_BITSTREAM::getRCStats set to 1. */
+    uint32_t                interMBCount;                /**< [out]: For H264, Number of Inter MBs in the encoded frame, includes skip MBs. For HEVC, Number of Inter CTBs in the encoded frame. For AV1, Number of Inter SBs in the encoded show frame. Supported only if _NV_ENC_LOCK_BITSTREAM::getRCStats set to 1. */
+    int32_t                 averageMVX;                  /**< [out]: Average Motion Vector in X direction for the encoded frame. Supported only if _NV_ENC_LOCK_BITSTREAM::getRCStats set to 1. */
+    int32_t                 averageMVY;                  /**< [out]: Average Motion Vector in Y direction for the encoded frame. Supported only if _NV_ENC_LOCK_BITSTREAM::getRCStats set to 1. */
+    uint32_t                alphaLayerSizeInBytes;       /**< [out]: Number of bytes generated for the alpha layer in the encoded output. Applicable only when HEVC with alpha encoding is enabled. */
+
+    uint32_t                reserved1[218];              /**< [in]: Reserved and must be set to 0 */
+    void*                   reserved2[64];               /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_LOCK_BITSTREAM;
+
+/** Macro for constructing the version field of ::_NV_ENC_LOCK_BITSTREAM */
+#define NV_ENC_LOCK_BITSTREAM_VER NVENCAPI_STRUCT_VERSION(2)
+
+
+/**
+ * \struct _NV_ENC_LOCK_INPUT_BUFFER
+ * Uncompressed Input Buffer lock parameters.
+ */
+typedef struct _NV_ENC_LOCK_INPUT_BUFFER
+{
+    uint32_t                  version;                   /**< [in]:  Struct version. Must be set to ::NV_ENC_LOCK_INPUT_BUFFER_VER. */
+    uint32_t                  doNotWait         :1;      /**< [in]:  Set to 1 to make ::NvEncLockInputBuffer() a non-blocking call. If the encoding is not completed, driver will return ::NV_ENC_ERR_ENCODER_BUSY error code. */
+    uint32_t                  reservedBitFields :31;     /**< [in]:  Reserved bitfields and must be set to 0 */
+    NV_ENC_INPUT_PTR          inputBuffer;               /**< [in]:  Pointer to the input buffer to be locked, client should pass the pointer obtained from ::NvEncCreateInputBuffer() or ::NvEncMapInputResource API. */
+    void*                     bufferDataPtr;             /**< [out]: Pointer to the locked input buffer data. Client can only access input buffer using the \p bufferDataPtr. */
+    uint32_t                  pitch;                     /**< [out]: Pitch of the locked input buffer. */
+    uint32_t                  reserved1[251];            /**< [in]:  Reserved and must be set to 0  */
+    void*                     reserved2[64];             /**< [in]:  Reserved and must be set to NULL  */
+} NV_ENC_LOCK_INPUT_BUFFER;
+
+/** Macro for constructing the version field of ::_NV_ENC_LOCK_INPUT_BUFFER */
+#define NV_ENC_LOCK_INPUT_BUFFER_VER NVENCAPI_STRUCT_VERSION(1)
+
+
+/**
+ * \struct _NV_ENC_MAP_INPUT_RESOURCE
+ * Map an input resource to a Nvidia Encoder Input Buffer
+ */
+typedef struct _NV_ENC_MAP_INPUT_RESOURCE
+{
+    uint32_t                   version;                   /**< [in]:  Struct version. Must be set to ::NV_ENC_MAP_INPUT_RESOURCE_VER. */
+    uint32_t                   subResourceIndex;          /**< [in]:  Deprecated. Do not use. */
+    void*                      inputResource;             /**< [in]:  Deprecated. Do not use. */
+    NV_ENC_REGISTERED_PTR      registeredResource;        /**< [in]:  The Registered resource handle obtained by calling NvEncRegisterInputResource. */
+    NV_ENC_INPUT_PTR           mappedResource;            /**< [out]: Mapped pointer corresponding to the registeredResource. This pointer must be used in NV_ENC_PIC_PARAMS::inputBuffer parameter in ::NvEncEncodePicture() API. */
+    NV_ENC_BUFFER_FORMAT       mappedBufferFmt;           /**< [out]: Buffer format of the outputResource. This buffer format must be used in NV_ENC_PIC_PARAMS::bufferFmt if the client is using the above mapped resource pointer. */
+    uint32_t                   reserved1[251];            /**< [in]:  Reserved and must be set to 0. */
+    void*                      reserved2[63];             /**< [in]:  Reserved and must be set to NULL */
+} NV_ENC_MAP_INPUT_RESOURCE;
+
+/** Macro for constructing the version field of ::_NV_ENC_MAP_INPUT_RESOURCE */
+#define NV_ENC_MAP_INPUT_RESOURCE_VER NVENCAPI_STRUCT_VERSION(4)
+
+/**
+ * \struct _NV_ENC_INPUT_RESOURCE_OPENGL_TEX
+ * NV_ENC_REGISTER_RESOURCE::resourceToRegister must be a pointer to a variable of this type,
+ * when NV_ENC_REGISTER_RESOURCE::resourceType is NV_ENC_INPUT_RESOURCE_TYPE_OPENGL_TEX
+ */
+typedef struct _NV_ENC_INPUT_RESOURCE_OPENGL_TEX
+{
+    uint32_t texture;                                     /**< [in]: The name of the texture to be used. */
+    uint32_t target;                                      /**< [in]: Accepted values are GL_TEXTURE_RECTANGLE and GL_TEXTURE_2D. */
+} NV_ENC_INPUT_RESOURCE_OPENGL_TEX;
+
+/** \struct NV_ENC_FENCE_POINT_D3D12
+* Fence and fence value for synchronization.
+*/
+typedef struct _NV_ENC_FENCE_POINT_D3D12
+{
+    uint32_t                version;                   /**< [in]: Struct version. Must be set to ::NV_ENC_FENCE_POINT_D3D12_VER. */
+    uint32_t                reserved;                  /**< [in]: Reserved and must be set to 0. */
+    void*                   pFence;                    /**< [in]: Pointer to ID3D12Fence. This fence object is used for synchronization. */
+    uint64_t                waitValue;                 /**< [in]: Fence value to reach or exceed before the GPU operation. */
+    uint64_t                signalValue;               /**< [in]: Fence value to set the fence to, after the GPU operation. */
+    uint32_t                bWait:1;                   /**< [in]: Wait on 'waitValue' if bWait is set to 1, before starting GPU operation. */
+    uint32_t                bSignal:1;                 /**< [in]: Signal on 'signalValue' if bSignal is set to 1, after GPU operation is complete. */
+    uint32_t                reservedBitField:30;       /**< [in]: Reserved and must be set to 0. */
+    uint32_t                reserved1[7];              /**< [in]: Reserved and must be set to 0. */
+} NV_ENC_FENCE_POINT_D3D12;
+
+#define NV_ENC_FENCE_POINT_D3D12_VER NVENCAPI_STRUCT_VERSION(1)
+
+/**
+ * \struct _NV_ENC_INPUT_RESOURCE_D3D12
+ * NV_ENC_PIC_PARAMS::inputBuffer and NV_ENC_PIC_PARAMS::alphaBuffer must be a pointer to a struct of this type,
+ * when D3D12 interface is used
+ */
+typedef struct _NV_ENC_INPUT_RESOURCE_D3D12
+{
+    uint32_t                    version;                /**< [in]: Struct version. Must be set to ::NV_ENC_INPUT_RESOURCE_D3D12_VER. */
+    uint32_t                    reserved;               /**< [in]: Reserved and must be set to 0. */
+    NV_ENC_INPUT_PTR            pInputBuffer;           /**< [in]: Specifies the input surface pointer. Client must use a pointer obtained from NvEncMapInputResource() in NV_ENC_MAP_INPUT_RESOURCE::mappedResource
+                                                                   when mapping the input surface. */
+    NV_ENC_FENCE_POINT_D3D12    inputFencePoint;        /**< [in]: Specifies the fence and corresponding fence values to do GPU wait and signal. */
+    uint32_t                    reserved1[16];          /**< [in]: Reserved and must be set to 0. */
+    void*                       reserved2[16];          /**< [in]: Reserved and must be set to NULL. */
+} NV_ENC_INPUT_RESOURCE_D3D12;
+
+#define NV_ENC_INPUT_RESOURCE_D3D12_VER NVENCAPI_STRUCT_VERSION(1)
+
+/**
+ * \struct _NV_ENC_OUTPUT_RESOURCE_D3D12
+ * NV_ENC_PIC_PARAMS::outputBitstream and NV_ENC_LOCK_BITSTREAM::outputBitstream must be a pointer to a struct of this type,
+ * when D3D12 interface is used
+ */
+typedef struct _NV_ENC_OUTPUT_RESOURCE_D3D12
+{
+    uint32_t                    version;                /**< [in]: Struct version. Must be set to ::NV_ENC_OUTPUT_RESOURCE_D3D12_VER. */
+    uint32_t                    reserved;               /**< [in]: Reserved and must be set to 0. */
+    NV_ENC_INPUT_PTR            pOutputBuffer;          /**< [in]: Specifies the output buffer pointer. Client must use a pointer obtained from NvEncMapInputResource() in NV_ENC_MAP_INPUT_RESOURCE::mappedResource
+                                                                   when mapping output bitstream buffer */
+    NV_ENC_FENCE_POINT_D3D12    outputFencePoint;       /**< [in]: Specifies the fence and corresponding fence values to do GPU wait and signal.*/
+    uint32_t                    reserved1[16];          /**< [in]: Reserved and must be set to 0. */
+    void*                       reserved2[16];          /**< [in]: Reserved and must be set to NULL. */
+} NV_ENC_OUTPUT_RESOURCE_D3D12;
+
+#define NV_ENC_OUTPUT_RESOURCE_D3D12_VER NVENCAPI_STRUCT_VERSION(1)
+
+/**
+ * \struct _NV_ENC_REGISTER_RESOURCE
+ * Register a resource for future use with the Nvidia Video Encoder Interface.
+ */
+typedef struct _NV_ENC_REGISTER_RESOURCE
+{
+    uint32_t                    version;                        /**< [in]: Struct version. Must be set to ::NV_ENC_REGISTER_RESOURCE_VER. */
+    NV_ENC_INPUT_RESOURCE_TYPE  resourceType;                   /**< [in]: Specifies the type of resource to be registered.
+                                                                           Supported values are
+                                                                           ::NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX,
+                                                                           ::NV_ENC_INPUT_RESOURCE_TYPE_CUDADEVICEPTR,
+                                                                           ::NV_ENC_INPUT_RESOURCE_TYPE_OPENGL_TEX */
+    uint32_t                    width;                          /**< [in]: Input frame width. */
+    uint32_t                    height;                         /**< [in]: Input frame height. */
+    uint32_t                    pitch;                          /**< [in]: Input buffer pitch.
+                                                                           For ::NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX resources, set this to 0.
+                                                                           For ::NV_ENC_INPUT_RESOURCE_TYPE_CUDADEVICEPTR resources, set this to
+                                                                             the pitch as obtained from cuMemAllocPitch(), or to the width in
+                                                                             bytes (if this resource was created by using cuMemAlloc()). This
+                                                                             value must be a multiple of 4.
+                                                                           For ::NV_ENC_INPUT_RESOURCE_TYPE_CUDAARRAY resources, set this to the
+                                                                             width of the allocation in bytes (i.e.
+                                                                             CUDA_ARRAY3D_DESCRIPTOR::Width * CUDA_ARRAY3D_DESCRIPTOR::NumChannels).
+                                                                           For ::NV_ENC_INPUT_RESOURCE_TYPE_OPENGL_TEX resources, set this to the
+                                                                             texture width multiplied by the number of components in the texture
+                                                                             format. */
+    uint32_t                    subResourceIndex;               /**< [in]: Subresource Index of the DirectX resource to be registered. Should be set to 0 for other interfaces. */
+    void*                       resourceToRegister;             /**< [in]: Handle to the resource that is being registered. */
+    NV_ENC_REGISTERED_PTR       registeredResource;             /**< [out]: Registered resource handle. This should be used in future interactions with the Nvidia Video Encoder Interface. */
+    NV_ENC_BUFFER_FORMAT        bufferFormat;                   /**< [in]: Buffer format of resource to be registered. */
+    NV_ENC_BUFFER_USAGE         bufferUsage;                    /**< [in]: Usage of resource to be registered. */
+    NV_ENC_FENCE_POINT_D3D12*   pInputFencePoint;               /**< [in]: Specifies the input fence and corresponding fence values to do GPU wait and signal.
+                                                                           To be used only when NV_ENC_REGISTER_RESOURCE::resourceToRegister represents D3D12 surface and
+                                                                           NV_ENC_BUFFER_USAGE::bufferUsage is NV_ENC_INPUT_IMAGE.
+                                                                           The fence NV_ENC_FENCE_POINT_D3D12::pFence and NV_ENC_FENCE_POINT_D3D12::waitValue will be used to do GPU wait
+                                                                           before starting GPU operation, if NV_ENC_FENCE_POINT_D3D12::bWait is set.
+                                                                           The fence NV_ENC_FENCE_POINT_D3D12::pFence and NV_ENC_FENCE_POINT_D3D12::signalValue will be used to do GPU signal
+                                                                           when GPU operation finishes, if NV_ENC_FENCE_POINT_D3D12::bSignal is set. */
+    uint32_t                    reserved1[247];                 /**< [in]: Reserved and must be set to 0. */
+    void*                       reserved2[61];                  /**< [in]: Reserved and must be set to NULL. */
+} NV_ENC_REGISTER_RESOURCE;
+
+/** Macro for constructing the version field of ::_NV_ENC_REGISTER_RESOURCE */
+#define NV_ENC_REGISTER_RESOURCE_VER NVENCAPI_STRUCT_VERSION(4)
+
+/**
+ * \struct _NV_ENC_STAT
+ * Encode Stats structure.
+ */
+typedef struct _NV_ENC_STAT
+{
+    uint32_t            version;                         /**< [in]:  Struct version. Must be set to ::NV_ENC_STAT_VER. */
+    uint32_t            reserved;                        /**< [in]:  Reserved and must be set to 0 */
+    NV_ENC_OUTPUT_PTR   outputBitStream;                 /**< [out]: Specifies the pointer to output bitstream. */
+    uint32_t            bitStreamSize;                   /**< [out]: Size of generated bitstream in bytes. */
+    uint32_t            picType;                         /**< [out]: Picture type of encoded picture. See ::NV_ENC_PIC_TYPE. */
+    uint32_t            lastValidByteOffset;             /**< [out]: Offset of last valid bytes of completed bitstream */
+    uint32_t            sliceOffsets[16];                /**< [out]: Offsets of each slice */
+    uint32_t            picIdx;                          /**< [out]: Picture number */
+    uint32_t            frameAvgQP;                      /**< [out]: Average QP of the frame. */
+    uint32_t            ltrFrame          :1;            /**< [out]: Flag indicating this frame is marked as LTR frame */
+    uint32_t            reservedBitFields :31;           /**< [in]:  Reserved bit fields and must be set to 0 */
+    uint32_t            ltrFrameIdx;                     /**< [out]: Frame index associated with this LTR frame. */
+    uint32_t            intraMBCount;                    /**< [out]: For H264, Number of Intra MBs in the encoded frame. For HEVC, Number of Intra CTBs in the encoded frame. */
+    uint32_t            interMBCount;                    /**< [out]: For H264, Number of Inter MBs in the encoded frame, includes skip MBs. For HEVC, Number of Inter CTBs in the encoded frame. */
+    int32_t             averageMVX;                      /**< [out]: Average Motion Vector in X direction for the encoded frame. */
+    int32_t             averageMVY;                      /**< [out]: Average Motion Vector in Y direction for the encoded frame. */
+    uint32_t            reserved1[226];                  /**< [in]:  Reserved and must be set to 0 */
+    void*               reserved2[64];                   /**< [in]:  Reserved and must be set to NULL */
+} NV_ENC_STAT;
+
+/** Macro for constructing the version field of ::_NV_ENC_STAT */
+#define NV_ENC_STAT_VER NVENCAPI_STRUCT_VERSION(1)
+
+
+/**
+ * \struct _NV_ENC_SEQUENCE_PARAM_PAYLOAD
+ * Sequence and picture parameters payload.
+ */
+typedef struct _NV_ENC_SEQUENCE_PARAM_PAYLOAD
+{
+    uint32_t            version;                         /**< [in]:  Struct version. Must be set to ::NV_ENC_SEQUENCE_PARAM_PAYLOAD_VER. */
+    uint32_t            inBufferSize;                    /**< [in]:  Specifies the size of the spsppsBuffer provided by the client */
+    uint32_t            spsId;                           /**< [in]:  Specifies the SPS id to be used in sequence header. Default value is 0.  */
+    uint32_t            ppsId;                           /**< [in]:  Specifies the PPS id to be used in picture header. Default value is 0.  */
+    void*               spsppsBuffer;                    /**< [in]:  Specifies bitstream header pointer of size NV_ENC_SEQUENCE_PARAM_PAYLOAD::inBufferSize.
+                                                                     It is the client's responsibility to manage this memory. */
+    uint32_t*           outSPSPPSPayloadSize;            /**< [out]: Size of the sequence and picture header in bytes. */
+    uint32_t            reserved [250];                  /**< [in]:  Reserved and must be set to 0 */
+    void*               reserved2[64];                   /**< [in]:  Reserved and must be set to NULL */
+} NV_ENC_SEQUENCE_PARAM_PAYLOAD;
+
+/** Macro for constructing the version field of ::_NV_ENC_SEQUENCE_PARAM_PAYLOAD */
+#define NV_ENC_SEQUENCE_PARAM_PAYLOAD_VER NVENCAPI_STRUCT_VERSION(1)
+
+
+/**
+ * Event registration/unregistration parameters.
+ */
+typedef struct _NV_ENC_EVENT_PARAMS
+{
+    uint32_t            version;                          /**< [in]: Struct version. Must be set to ::NV_ENC_EVENT_PARAMS_VER. */
+    uint32_t            reserved;                         /**< [in]: Reserved and must be set to 0 */
+    void*               completionEvent;                  /**< [in]: Handle to event to be registered/unregistered with the NvEncodeAPI interface. */
+    uint32_t            reserved1[253];                   /**< [in]: Reserved and must be set to 0    */
+    void*               reserved2[64];                    /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_EVENT_PARAMS;
+
+/** Macro for constructing the version field of ::_NV_ENC_EVENT_PARAMS */
+#define NV_ENC_EVENT_PARAMS_VER NVENCAPI_STRUCT_VERSION(1)
+
+/**
+ * Encoder Session Creation parameters
+ */
+typedef struct _NV_ENC_OPEN_ENCODE_SESSIONEX_PARAMS
+{
+    uint32_t            version;                          /**< [in]: Struct version. Must be set to ::NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS_VER. */
+    NV_ENC_DEVICE_TYPE  deviceType;                       /**< [in]: Specifies the device type. */
+    void*               device;                           /**< [in]: Pointer to client device. */
+    void*               reserved;                         /**< [in]: Reserved and must be set to 0. */
+    uint32_t            apiVersion;                       /**< [in]: API version. Should be set to NVENCAPI_VERSION. */
+    uint32_t            reserved1[253];                   /**< [in]: Reserved and must be set to 0    */
+    void*               reserved2[64];                    /**< [in]: Reserved and must be set to NULL */
+} NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS;
+/** Macro for constructing the version field of ::_NV_ENC_OPEN_ENCODE_SESSIONEX_PARAMS */
+#define NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS_VER NVENCAPI_STRUCT_VERSION(1)
+
+/** @} */ /* END ENCODER_STRUCTURE */
+
+
+/**
+ * \addtogroup ENCODE_FUNC NvEncodeAPI Functions
+ * @{
+ */
+
+// NvEncOpenEncodeSession
+/**
+ * \brief Opens an encoding session.
+ *
+ * Deprecated.
+ *
+ * \return
+ * ::NV_ENC_ERR_INVALID_CALL\n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncOpenEncodeSession                     (void* device, uint32_t deviceType, void** encoder);
+
+// NvEncGetEncodeGUIDCount
+/**
+ * \brief Retrieves the number of supported encode GUIDs.
+ *
+ * The function returns the number of codec GUIDs supported by the NvEncodeAPI
+ * interface.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [out] encodeGUIDCount
+ *   Number of supported encode GUIDs.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodeGUIDCount                    (void* encoder, uint32_t* encodeGUIDCount);
+
+
+// NvEncGetEncodeGUIDs
+/**
+ * \brief Retrieves an array of supported encoder codec GUIDs.
+ *
+ * The function returns an array of codec GUIDs supported by the NvEncodeAPI interface.
+ * The client must allocate an array where the NvEncodeAPI interface can
+ * fill the supported GUIDs and pass the pointer in \p *GUIDs parameter.
+ * The size of the array can be determined by using ::NvEncGetEncodeGUIDCount() API.
+ * The Nvidia Encoding interface returns the number of codec GUIDs it has actually
+ * filled in the GUID array in the \p GUIDCount parameter.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] guidArraySize
+ *   Number of GUIDs to be retrieved. Should be set to the number retrieved using
+ *   ::NvEncGetEncodeGUIDCount.
+ * \param [out] GUIDs
+ *   Array of supported Encode GUIDs.
+ * \param [out] GUIDCount
+ *   Number of supported Encode GUIDs.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodeGUIDs                        (void* encoder, GUID* GUIDs, uint32_t guidArraySize, uint32_t* GUIDCount);
+
+
+// NvEncGetEncodeProfileGUIDCount
+/**
+ * \brief Retrieves the number of supported profile GUIDs.
+ *
+ * The function returns the number of profile GUIDs supported for a given codec.
+ * The client must first enumerate the codec GUIDs supported by the NvEncodeAPI
+ * interface. After determining the codec GUID, it can query the NvEncodeAPI
+ * interface to determine the number of profile GUIDs supported for a particular
+ * codec GUID.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encodeGUID
+ *   The codec GUID for which the profile GUIDs are being enumerated.
+ * \param [out] encodeProfileGUIDCount
+ *   Number of encode profiles supported for the given encodeGUID.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodeProfileGUIDCount                    (void* encoder, GUID encodeGUID, uint32_t* encodeProfileGUIDCount);
+
+
+// NvEncGetEncodeProfileGUIDs
+/**
+ * \brief Retrieves an array of supported encode profile GUIDs.
+ *
+ * The function returns an array of supported profile GUIDs for a particular
+ * codec GUID. The client must allocate an array where the NvEncodeAPI interface
+ * can populate the profile GUIDs. The client can determine the array size using
+ * ::NvEncGetEncodeProfileGUIDCount() API. The client must also validate that the
+ * NvEncodeAPI interface supports the GUID the client wants to pass as \p encodeGUID
+ * parameter.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encodeGUID
+ *   The encode GUID whose profile GUIDs are being enumerated.
+ * \param [in] guidArraySize
+ *   Number of GUIDs to be retrieved. Should be set to the number retrieved using
+ *   ::NvEncGetEncodeProfileGUIDCount.
+ * \param [out] profileGUIDs
+ *   Array of supported Encode Profile GUIDs
+ * \param [out] GUIDCount
+ *   Number of valid encode profile GUIDs in \p profileGUIDs array.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodeProfileGUIDs                               (void* encoder, GUID encodeGUID, GUID* profileGUIDs, uint32_t guidArraySize, uint32_t* GUIDCount);
+
+// NvEncGetInputFormatCount
+/**
+ * \brief Retrieve the number of supported Input formats.
+ *
+ * The function returns the number of supported input formats. The client must
+ * query the NvEncodeAPI interface to determine the supported input formats
+ * before creating the input surfaces.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encodeGUID
+ *   Encode GUID, corresponding to which the number of supported input formats
+ *   is to be retrieved.
+ * \param [out] inputFmtCount
+ *   Number of input formats supported for specified Encode GUID.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ */
+NVENCSTATUS NVENCAPI NvEncGetInputFormatCount                   (void* encoder, GUID encodeGUID, uint32_t* inputFmtCount);
+
+
+// NvEncGetInputFormats
+/**
+ * \brief Retrieves an array of supported Input formats
+ *
+ * Returns an array of supported input formats. The client must use the input
+ * format to create input surface using ::NvEncCreateInputBuffer() API.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encodeGUID
+ *   Encode GUID, corresponding to which the number of supported input formats
+ *   is to be retrieved.
+ * \param [in] inputFmtArraySize
+ *   Size of the input format array passed in \p inputFmts.
+ * \param [out] inputFmts
+ *   Array of input formats supported for this Encode GUID.
+ * \param [out] inputFmtCount
+ *   The number of valid input format types returned by the NvEncodeAPI
+ *   interface in \p inputFmts array.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetInputFormats                       (void* encoder, GUID encodeGUID, NV_ENC_BUFFER_FORMAT* inputFmts, uint32_t inputFmtArraySize, uint32_t* inputFmtCount);
+
+
+// NvEncGetEncodeCaps
+/**
+ * \brief Retrieves the capability value for a specified encoder attribute.
+ *
+ * The function returns the capability value for a given encoder attribute. The
+ * client must validate the encodeGUID using ::NvEncGetEncodeGUIDs() API before
+ * calling this function. The encoder attribute being queried are enumerated in
+ * ::NV_ENC_CAPS_PARAM enum.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encodeGUID
+ *   Encode GUID, corresponding to which the capability attribute is to be retrieved.
+ * \param [in] capsParam
+ *   Used to specify attribute being queried. Refer ::NV_ENC_CAPS_PARAM for more
+ *   details.
+ * \param [out] capsVal
+ *   The value corresponding to the capability attribute being queried.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodeCaps                     (void* encoder, GUID encodeGUID, NV_ENC_CAPS_PARAM* capsParam, int* capsVal);
+
+
+// NvEncGetEncodePresetCount
+/**
+ * \brief Retrieves the number of supported preset GUIDs.
+ *
+ * The function returns the number of preset GUIDs available for a given codec.
+ * The client must validate the codec GUID using ::NvEncGetEncodeGUIDs() API
+ * before calling this function.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encodeGUID
+ *   Encode GUID, corresponding to which the number of supported presets is to
+ *   be retrieved.
+ * \param [out] encodePresetGUIDCount
+ *   Receives the number of supported preset GUIDs.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodePresetCount              (void* encoder, GUID encodeGUID, uint32_t* encodePresetGUIDCount);
+
+
+// NvEncGetEncodePresetGUIDs
+/**
+ * \brief Receives an array of supported encoder preset GUIDs.
+ *
+ * The function returns an array of encode preset GUIDs available for a given codec.
+ * The client can directly use one of the preset GUIDs based upon the use case
+ * or target device. The preset GUID chosen can be directly used in
+ * NV_ENC_INITIALIZE_PARAMS::presetGUID parameter to ::NvEncInitializeEncoder() API.
+ * Alternatively, the client can also use the preset GUID to retrieve the encoding config
+ * parameters being used by NvEncodeAPI interface for that given preset, using
+ * ::NvEncGetEncodePresetConfig() API. It can then modify preset config parameters
+ * as per its use case and send it to NvEncodeAPI interface as part of
+ * NV_ENC_INITIALIZE_PARAMS::encodeConfig parameter for NvEncInitializeEncoder()
+ * API.
+ *
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encodeGUID
+ *   Encode GUID, corresponding to which the list of supported presets is to be
+ *   retrieved.
+ * \param [in] guidArraySize
+ *   Size of array of preset GUIDs passed in \p presetGUIDs
+ * \param [out] presetGUIDs
+ *   Array of supported Encode preset GUIDs from the NvEncodeAPI interface
+ *   to client.
+ * \param [out] encodePresetGUIDCount
+ *   Receives the number of preset GUIDs returned by the NvEncodeAPI
+ *   interface.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodePresetGUIDs                  (void* encoder, GUID encodeGUID, GUID* presetGUIDs, uint32_t guidArraySize, uint32_t* encodePresetGUIDCount);
+
+
+// NvEncGetEncodePresetConfig
+/**
+ * \brief Returns a preset config structure supported for given preset GUID.
+ *
+ * The function returns a preset config structure for a given preset GUID.
+ * NvEncGetEncodePresetConfig() API is not applicable to AV1.
+ * Before using this function the client must enumerate the preset GUIDs available for
+ * a given codec. The preset config structure can be modified by the client depending
+ * upon its use case and can be then used to initialize the encoder using
+ * ::NvEncInitializeEncoder() API. The client can use this function only if it
+ * wants to modify the NvEncodeAPI preset configuration, otherwise it can
+ * directly use the preset GUID.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encodeGUID
+ *   Encode GUID, corresponding to which the list of supported presets is to be
+ *   retrieved.
+ * \param [in] presetGUID
+ *   Preset GUID, corresponding to which the Encoding configurations is to be
+ *   retrieved.
+ * \param [out] presetConfig
+ *   The requested Preset Encoder Attribute set. Refer ::_NV_ENC_CONFIG for
+ *   more details.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodePresetConfig               (void* encoder, GUID encodeGUID, GUID  presetGUID, NV_ENC_PRESET_CONFIG* presetConfig);
+
+// NvEncGetEncodePresetConfigEx
+/**
+ * \brief Returns a preset config structure supported for given preset GUID.
+ *
+ * The function returns a preset config structure for a given preset GUID and tuning info.
+ * NvEncGetEncodePresetConfigEx() API is not applicable to H264 and HEVC ME-only mode.
+ * Before using this function the client must enumerate the preset GUIDs available for
+ * a given codec. The preset config structure can be modified by the client depending
+ * upon its use case and can be then used to initialize the encoder using
+ * ::NvEncInitializeEncoder() API. The client can use this function only if it
+ * wants to modify the NvEncodeAPI preset configuration, otherwise it can
+ * directly use the preset GUID.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encodeGUID
+ *   Encode GUID, corresponding to which the list of supported presets is to be
+ *   retrieved.
+ * \param [in] presetGUID
+ *   Preset GUID, corresponding to which the Encoding configurations is to be
+ *   retrieved.
+ * \param [in] tuningInfo
+ *   tuning info, corresponding to which the Encoding configurations is to be
+ *   retrieved.
+ * \param [out] presetConfig
+ *   The requested Preset Encoder Attribute set. Refer ::_NV_ENC_CONFIG for
+ *   more details.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodePresetConfigEx               (void* encoder, GUID encodeGUID, GUID  presetGUID, NV_ENC_TUNING_INFO tuningInfo, NV_ENC_PRESET_CONFIG* presetConfig);
+
+// NvEncInitializeEncoder
+/**
+ * \brief Initialize the encoder.
+ *
+ * This API must be used to initialize the encoder. The initialization parameter
+ * is passed using \p *createEncodeParams. The client must send the following
+ * fields of the _NV_ENC_INITIALIZE_PARAMS structure with a valid value.
+ * - NV_ENC_INITIALIZE_PARAMS::encodeGUID
+ * - NV_ENC_INITIALIZE_PARAMS::encodeWidth
+ * - NV_ENC_INITIALIZE_PARAMS::encodeHeight
+ *
+ * The client can pass a preset GUID directly to the NvEncodeAPI interface using
+ * NV_ENC_INITIALIZE_PARAMS::presetGUID field. If the client doesn't pass
+ * NV_ENC_INITIALIZE_PARAMS::encodeConfig structure, the codec specific parameters
+ * will be selected based on the preset GUID. The preset GUID must have been
+ * validated by the client using ::NvEncGetEncodePresetGUIDs() API.
+ * If the client passes a custom ::_NV_ENC_CONFIG structure through
+ * NV_ENC_INITIALIZE_PARAMS::encodeConfig , it will override the codec specific parameters
+ * based on the preset GUID. It is recommended that even if the client passes a custom config,
+ * it should also send a preset GUID. In this case, the preset GUID passed by the client
+ * will not override any of the custom config parameters programmed by the client,
+ * it is only used as a hint by the NvEncodeAPI interface to determine certain encoder parameters
+ * which are not exposed to the client.
+ *
+ * There are two modes of operation for the encoder namely:
+ * - Asynchronous mode
+ * - Synchronous mode
+ *
+ * The client can select asynchronous or synchronous mode by setting the \p
+ * enableEncodeAsync field in ::_NV_ENC_INITIALIZE_PARAMS to 1 or 0 respectively.
+ *\par Asynchronous mode of operation:
+ * The Asynchronous mode can be enabled by setting NV_ENC_INITIALIZE_PARAMS::enableEncodeAsync to 1.
+ * The client operating in asynchronous mode must allocate completion event object
+ * for each output buffer and pass the completion event object in the
+ * ::NvEncEncodePicture() API. The client can create another thread and wait on
+ * the event object to be signaled by NvEncodeAPI interface on completion of the
+ * encoding process for the output frame. This should unblock the main thread from
+ * submitting work to the encoder. When the event is signaled the client can call
+ * NvEncodeAPI interfaces to copy the bitstream data using ::NvEncLockBitstream()
+ * API. This is the preferred mode of operation.
+ *
+ * NOTE: Asynchronous mode is not supported on Linux.
+ *
+ *\par Synchronous mode of operation:
+ * The client can select synchronous mode by setting NV_ENC_INITIALIZE_PARAMS::enableEncodeAsync to 0.
+ * The client working in synchronous mode can work in a single threaded or multi
+ * threaded mode. The client need not allocate any event objects. The client can
+ * only lock the bitstream data after NvEncodeAPI interface has returned
+ * ::NV_ENC_SUCCESS from encode picture. The NvEncodeAPI interface can return
+ * ::NV_ENC_ERR_NEED_MORE_INPUT error code from ::NvEncEncodePicture() API. The
+ * client must not lock the output buffer in such case but should send the next
+ * frame for encoding. The client must keep on calling ::NvEncEncodePicture() API
+ * until it returns ::NV_ENC_SUCCESS. \n
+ * The client must always lock the bitstream data in the order in which it was submitted.
+ * This is true for both asynchronous and synchronous mode.
+ *
+ *\par Picture type decision:
+ * If the client is taking the picture type decision, it must disable the picture
+ * type decision module in NvEncodeAPI by setting NV_ENC_INITIALIZE_PARAMS::enablePTD
+ * to 0. In this case the client is required to send the pictures in encoding
+ * order to NvEncodeAPI by doing the re-ordering for B frames. \n
+ * If the client doesn't want to take the picture type decision it can enable
+ * picture type decision module in the NvEncodeAPI interface by setting
+ * NV_ENC_INITIALIZE_PARAMS::enablePTD to 1 and send the input pictures in display
+ * order.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] createEncodeParams
+ *   Refer ::_NV_ENC_INITIALIZE_PARAMS for details.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncInitializeEncoder                     (void* encoder, NV_ENC_INITIALIZE_PARAMS* createEncodeParams);
+
+
+// NvEncCreateInputBuffer
+/**
+ * \brief Allocates Input buffer.
+ *
+ * This function is used to allocate an input buffer. The client must enumerate
+ * the input buffer format before allocating the input buffer resources. The
+ * NV_ENC_INPUT_PTR returned by the NvEncodeAPI interface in the
+ * NV_ENC_CREATE_INPUT_BUFFER::inputBuffer field can be directly used in
+ * ::NvEncEncodePicture() API. The number of input buffers to be allocated by the
+ * client must be at least 4 more than the number of B frames being used for encoding.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] createInputBufferParams
+ *  Pointer to the ::NV_ENC_CREATE_INPUT_BUFFER structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncCreateInputBuffer                     (void* encoder, NV_ENC_CREATE_INPUT_BUFFER* createInputBufferParams);
+
+
+// NvEncDestroyInputBuffer
+/**
+ * \brief Releases an input buffer.
+ *
+ * This function is used to free an input buffer. If the client has allocated
+ * any input buffer using ::NvEncCreateInputBuffer() API, it must free those
+ * input buffers by calling this function. The client must release the input
+ * buffers before destroying the encoder using ::NvEncDestroyEncoder() API.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] inputBuffer
+ *   Pointer to the input buffer to be released.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncDestroyInputBuffer                    (void* encoder, NV_ENC_INPUT_PTR inputBuffer);
+
+// NvEncSetIOCudaStreams
+/**
+ * \brief Set input and output CUDA stream for specified encoder attribute.
+ *
+ * Encoding may involve CUDA pre-processing on the input and post-processing on encoded output.
+ * This function is used to set input and output CUDA streams to pipeline the CUDA pre-processing
+ * and post-processing tasks. Clients should call this function before the call to
+ * NvEncUnlockInputBuffer(). If this function is not called, the default CUDA stream is used for
+ * input and output processing. After a successful call to this function, the streams specified
+ * in that call will replace the previously-used streams.
+ * This API is supported for NVCUVID interface only.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] inputStream
+ *   Pointer to CUstream which is used to process ::NV_ENC_PIC_PARAMS::inputFrame for encode.
+ *   In case of ME-only mode, inputStream is used to process ::NV_ENC_MEONLY_PARAMS::inputBuffer and
+ *   ::NV_ENC_MEONLY_PARAMS::referenceFrame
+ * \param [in] outputStream
+ *  Pointer to CUstream which is used to process ::NV_ENC_PIC_PARAMS::outputBuffer for encode.
+ *  In case of ME-only mode, outputStream is used to process ::NV_ENC_MEONLY_PARAMS::mvBuffer
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_GENERIC \n
+ */
+NVENCSTATUS NVENCAPI NvEncSetIOCudaStreams                     (void* encoder, NV_ENC_CUSTREAM_PTR inputStream, NV_ENC_CUSTREAM_PTR outputStream);
+
+
+// NvEncCreateBitstreamBuffer
+/**
+ * \brief Allocates an output bitstream buffer
+ *
+ * This function is used to allocate an output bitstream buffer and returns a
+ * NV_ENC_OUTPUT_PTR to the bitstream buffer to the client in the
+ * NV_ENC_CREATE_BITSTREAM_BUFFER::bitstreamBuffer field.
+ * The client can only call this function after the encoder session has been
+ * initialized using ::NvEncInitializeEncoder() API. The minimum number of output
+ * buffers allocated by the client must be at least 4 more than the number of
+ * B frames being used for encoding. The client can only access the output
+ * bitstream data by locking the \p bitstreamBuffer using the ::NvEncLockBitstream()
+ * function.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] createBitstreamBufferParams
+ *   Pointer to the ::NV_ENC_CREATE_BITSTREAM_BUFFER structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncCreateBitstreamBuffer                 (void* encoder, NV_ENC_CREATE_BITSTREAM_BUFFER* createBitstreamBufferParams);
+
+
+// NvEncDestroyBitstreamBuffer
+/**
+ * \brief Release a bitstream buffer.
+ *
+ * This function is used to release the output bitstream buffer allocated using
+ * the ::NvEncCreateBitstreamBuffer() function. The client must release the output
+ * bitstreamBuffer using this function before destroying the encoder session.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] bitstreamBuffer
+ *   Pointer to the bitstream buffer being released.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncDestroyBitstreamBuffer                (void* encoder, NV_ENC_OUTPUT_PTR bitstreamBuffer);
+
+// NvEncEncodePicture
+/**
+ * \brief Submit an input picture for encoding.
+ *
+ * This function is used to submit an input picture buffer for encoding. The
+ * encoding parameters are passed using \p *encodePicParams which is a pointer
+ * to the ::_NV_ENC_PIC_PARAMS structure.
+ *
+ * If the client has set NV_ENC_INITIALIZE_PARAMS::enablePTD to 0, then it must
+ * send a valid value for the following fields.
+ * - NV_ENC_PIC_PARAMS::pictureType
+ * - NV_ENC_PIC_PARAMS_H264::displayPOCSyntax (H264 only)
+ * - NV_ENC_PIC_PARAMS_H264::frameNumSyntax (H264 only)
+ * - NV_ENC_PIC_PARAMS_H264::refPicFlag (H264 only)
+ *
+ *\par MVC Encoding:
+ * For MVC encoding the client must call the encode picture API for each view separately
+ * and must pass a valid view ID in the NV_ENC_PIC_PARAMS_MVC::viewID field. Currently
+ * NvEncodeAPI only supports stereo MVC, so the client must send viewID as 0 for the base
+ * view and as 1 for the dependent view.
+ *
+ *\par Asynchronous Encoding
+ * If the client has enabled asynchronous mode of encoding by setting
+ * NV_ENC_INITIALIZE_PARAMS::enableEncodeAsync to 1 in the ::NvEncInitializeEncoder()
+ * API, then the client must send a valid NV_ENC_PIC_PARAMS::completionEvent.
+ * In case of asynchronous mode of operation, the client can queue the ::NvEncEncodePicture()
+ * API commands from the main thread and then queue output buffers to be processed
+ * to a secondary worker thread. Before locking the output buffers in the
+ * secondary thread, the client must wait on the NV_ENC_PIC_PARAMS::completionEvent
+ * it has queued in ::NvEncEncodePicture() API call. The client must always process
+ * completion event and the output buffer in the same order in which they have been
+ * submitted for encoding. The NvEncodeAPI interface is responsible for any
+ * re-ordering required for B frames and will always ensure that encoded bitstream
+ * data is written in the same order in which output buffer is submitted.
+ * The NvEncodeAPI interface may return ::NV_ENC_ERR_NEED_MORE_INPUT error code for
+ * some ::NvEncEncodePicture() API calls but the client must not treat it as a fatal error.
+ * The NvEncodeAPI interface might not be able to submit an input picture buffer for encoding
+ * immediately due to re-ordering for B frames.
+ *\code
+  The example below shows how asynchronous encoding works in the case of 1 B frame
+  ------------------------------------------------------------------------
+  Suppose the client allocated 4 input buffers(I1,I2..), 4 output buffers(O1,O2..)
+  and 4 completion events(E1, E2, ...). The NvEncodeAPI interface will need to
+  keep a copy of the input buffers for re-ordering and it allocates following
+  internal buffers (NvI1, NvI2...). These internal buffers are managed by NvEncodeAPI
+  and the client is not responsible for allocating or freeing the memory of
+  the internal buffers.
+
+  a) The client main thread will queue the following encode frame calls.
+  Note the picture type is unknown to the client, the decision is being taken by
+  NvEncodeAPI interface. The client should pass ::_NV_ENC_PIC_PARAMS parameter
+  consisting of allocated input buffer, output buffer and output events in successive
+  ::NvEncEncodePicture() API calls along with other required encode picture params.
+  For example:
+  1st EncodePicture parameters - (I1, O1, E1)
+  2nd EncodePicture parameters - (I2, O2, E2)
+  3rd EncodePicture parameters - (I3, O3, E3)
+
+  b) NvEncodeAPI SW will receive the following encode Commands from the client.
+  The left side shows input from client in the form (Input buffer, Output Buffer,
+  Output Event). The right hand side shows a possible picture type decision taken by
+  the NvEncodeAPI interface.
+  (I1, O1, E1)    ---P1 Frame
+  (I2, O2, E2)    ---B2 Frame
+  (I3, O3, E3)    ---P3 Frame
+
+  c) NvEncodeAPI interface will make a copy of the input buffers to its internal
+   buffers for re-ordering. These copies are done as part of the ::NvEncEncodePicture()
+   function call from the client and NvEncodeAPI interface is responsible for
+   synchronization of copy operation with the actual encoding operation.
+   I1 --> NvI1
+   I2 --> NvI2
+   I3 --> NvI3
+
+   d) The NvEncodeAPI encodes I1 as P frame and submits I1 to encoder HW and returns ::NV_ENC_SUCCESS.
+   The NvEncodeAPI tries to encode I2 as B frame and fails with ::NV_ENC_ERR_NEED_MORE_INPUT error code.
+   The error is not fatal and it notifies the client that I2 is not submitted to the encoder immediately.
+   The NvEncodeAPI encodes I3 as P frame and submits I3 for encoding which will be used as the backward
+   reference frame for I2. The NvEncodeAPI then submits I2 for encoding and returns ::NV_ENC_SUCCESS.
+   Both submissions are part of the same ::NvEncEncodePicture() function call.
+
+  e) After returning from the ::NvEncEncodePicture() call, the client must queue the output
+   bitstream processing work to the secondary thread. The output bitstream processing
+   for asynchronous mode consists of first waiting on the completion event (E1, E2..)
+   and then locking the output bitstream buffer(O1, O2..) for reading the encoded
+   data. The work queued to the secondary thread by the client is in the following order
+   (I1, O1, E1)
+   (I2, O2, E2)
+   (I3, O3, E3)
+   Note they are in the same order in which client calls ::NvEncEncodePicture() API
+   in \p step a).
+
+  f) NvEncodeAPI interface will do the re-ordering such that Encoder HW will receive
+  the following encode commands:
+  (NvI1, O1, E1)   ---P1 Frame
+  (NvI3, O2, E2)   ---P3 Frame
+  (NvI2, O3, E3)   ---B2 frame
+
+  g) After the encoding operations are completed, the events will be signaled
+  by NvEncodeAPI interface in the following order:
+  (O1, E1) ---P1 Frame, output bitstream copied to O1 and event E1 signaled.
+  (O2, E2) ---P3 Frame, output bitstream copied to O2 and event E2 signaled.
+  (O3, E3) ---B2 Frame, output bitstream copied to O3 and event E3 signaled.
+
+  h) The client must lock the bitstream data using ::NvEncLockBitstream() API in
+   the order O1,O2,O3  to read the encoded data, after waiting for the events
+   to be signaled in the same order i.e. E1, E2 and E3. The output processing is
+   done in the secondary thread in the following order:
+   Waits on E1, copies encoded bitstream from O1
+   Waits on E2, copies encoded bitstream from O2
+   Waits on E3, copies encoded bitstream from O3
+
+  -Note the client will receive the event signals and output buffers in the
+   same order in which they were submitted for encoding.
+  -Note the LockBitstream will have a picture type field which will notify the
+   output picture type to the clients.
+  -Note the input, output buffer and the output completion event are free to be
+   reused once NvEncodeAPI interface has signaled the event and the client has
+   copied the data from the output buffer.
+
+ * \endcode
+ *
+ *\par Synchronous Encoding
+ * The client can enable synchronous mode of encoding by setting
+ * NV_ENC_INITIALIZE_PARAMS::enableEncodeAsync to 0 in ::NvEncInitializeEncoder() API.
+ * The NvEncodeAPI interface may return ::NV_ENC_ERR_NEED_MORE_INPUT error code for
+ * some ::NvEncEncodePicture() API calls when NV_ENC_INITIALIZE_PARAMS::enablePTD
+ * is set to 1, but the client must not treat it as a fatal error. The NvEncodeAPI
+ * interface might not be able to submit an input picture buffer for encoding
+ * immediately due to re-ordering for B frames. The NvEncodeAPI interface cannot
+ * submit the input picture which is decided to be encoded as B frame as it waits
+ * for backward reference from temporally subsequent frames. This input picture
+ * is buffered internally and waits for more input pictures to arrive. The client
+ * must not call ::NvEncLockBitstream() API on the output buffers whose
+ * ::NvEncEncodePicture() API returns ::NV_ENC_ERR_NEED_MORE_INPUT. The client must
+ * wait for the NvEncodeAPI interface to return ::NV_ENC_SUCCESS before locking the
+ * output bitstreams to read the encoded bitstream data. The following example
+ * explains the scenario with synchronous encoding with 1 B frame.
+ *\code
+ The example below shows how synchronous encoding works in the case of 1 B frame
+ -----------------------------------------------------------------------------
+ Suppose the client allocated 4 input buffers(I1,I2..), 4 output buffers(O1,O2..)
+ and 4 completion events(E1, E2, ...). The NvEncodeAPI interface will need to
+ keep a copy of the input buffers for re-ordering and it allocates following
+ internal buffers (NvI1, NvI2...). These internal buffers are managed by NvEncodeAPI
+ and the client is not responsible for allocating or freeing the memory of
+ the internal buffers.
+
+ The client calls ::NvEncEncodePicture() API with input buffer I1 and output buffer O1.
+ The NvEncodeAPI decides to encode I1 as P frame and submits it to encoder
+ HW and returns ::NV_ENC_SUCCESS.
+ The client can now read the encoded data by locking the output O1 by calling
+ NvEncLockBitstream API.
+
+ The client calls ::NvEncEncodePicture() API with input buffer I2 and output buffer O2.
+ The NvEncodeAPI decides to encode I2 as B frame and buffers I2 by copying it
+ to internal buffer and returns ::NV_ENC_ERR_NEED_MORE_INPUT.
+ The error is not fatal and it notifies client that it cannot read the encoded
+ data by locking the output O2 by calling ::NvEncLockBitstream() API without submitting
+ more work to the NvEncodeAPI interface.
+
+ The client calls ::NvEncEncodePicture() with input buffer I3 and output buffer O3.
+ The NvEncodeAPI decides to encode I3 as P frame and it first submits I3 for
+ encoding which will be used as backward reference frame for I2.
+ The NvEncodeAPI then submits I2 for encoding and returns ::NV_ENC_SUCCESS. Both
+ submissions are part of the same ::NvEncEncodePicture() function call.
+ The client can now read the encoded data for both the frames by locking the output
+ O2 followed by O3, by calling ::NvEncLockBitstream() API.
+
+ The client must always lock the output in the same order in which it has submitted
+ to receive the encoded bitstream in correct encoding order.
+
+ * \endcode
+ *
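+ *\par Illustrative sketch (not part of the SDK text)
+ * A minimal sketch of the synchronous flow described above. The client-side array
+ * \p pending, the counter \p numPending and the helpers \p read_output() and
+ * \p handle_fatal_error() are hypothetical; NV_ENC_PIC_PARAMS::outputBitstream is
+ * assumed to be the field carrying the output buffer.
+ *\code
+   // pending[] holds output buffers submitted but not yet read, in submission order.
+   pending[numPending++] = picParams.outputBitstream;
+
+   NVENCSTATUS status = NvEncEncodePicture(encoder, &picParams);
+   if (status == NV_ENC_SUCCESS) {
+       // All buffered pictures have been submitted; lock outputs in submission order.
+       for (uint32_t i = 0; i < numPending; ++i)
+           read_output(encoder, pending[i]);   // NvEncLockBitstream() + NvEncUnlockBitstream()
+       numPending = 0;
+   } else if (status != NV_ENC_ERR_NEED_MORE_INPUT) {
+       handle_fatal_error(status);
+   }
+   // On NV_ENC_ERR_NEED_MORE_INPUT: lock nothing yet and submit the next picture.
+ * \endcode
+ *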
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] encodePicParams
+ *   Pointer to the ::_NV_ENC_PIC_PARAMS structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_ENCODER_BUSY \n
+ * ::NV_ENC_ERR_NEED_MORE_INPUT \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncEncodePicture                         (void* encoder, NV_ENC_PIC_PARAMS* encodePicParams);
+
+
+// NvEncLockBitstream
+/**
+ * \brief Lock output bitstream buffer
+ *
+ * This function is used to lock the bitstream buffer to read the encoded data.
+ * The client can only access the encoded data by calling this function.
+ * The pointer to client accessible encoded data is returned in the
+ * NV_ENC_LOCK_BITSTREAM::bitstreamBufferPtr field. The size of the encoded data
+ * in the output buffer is returned in the NV_ENC_LOCK_BITSTREAM::bitstreamSizeInBytes field.
+ * The NvEncodeAPI interface also returns the output picture type and picture structure
+ * of the encoded frame in NV_ENC_LOCK_BITSTREAM::pictureType and
+ * NV_ENC_LOCK_BITSTREAM::pictureStruct fields respectively. If the client has
+ * set NV_ENC_LOCK_BITSTREAM::doNotWait to 1, the function might return
+ * ::NV_ENC_ERR_LOCK_BUSY if the client is operating in synchronous mode. This is not
+ * a fatal failure if NV_ENC_LOCK_BITSTREAM::doNotWait is set to 1. In that case the client can
+ * retry the function after a few milliseconds.
+ *
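+ *\par Illustrative sketch (not part of the SDK text)
+ * A minimal sketch of the non-blocking retry described above. \p outputBuffer,
+ * \p sleep_ms() and \p consume_packet() are hypothetical client-side names; the
+ * \p version and \p outputBitstream fields are assumed from the
+ * NV_ENC_LOCK_BITSTREAM definition.
+ *\code
+   NV_ENC_LOCK_BITSTREAM lock = { 0 };
+   lock.version         = NV_ENC_LOCK_BITSTREAM_VER;
+   lock.outputBitstream = outputBuffer;   // NV_ENC_OUTPUT_PTR from NvEncCreateBitstreamBuffer()
+   lock.doNotWait       = 1;              // return NV_ENC_ERR_LOCK_BUSY instead of blocking
+
+   NVENCSTATUS status;
+   while ((status = NvEncLockBitstream(encoder, &lock)) == NV_ENC_ERR_LOCK_BUSY)
+       sleep_ms(2);                       // not fatal: retry after a few milliseconds
+
+   if (status == NV_ENC_SUCCESS) {
+       consume_packet(lock.bitstreamBufferPtr, lock.bitstreamSizeInBytes);
+       NvEncUnlockBitstream(encoder, outputBuffer);   // unlock before reusing the buffer
+   }
+ * \endcode
+ *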
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] lockBitstreamBufferParams
+ *   Pointer to the ::_NV_ENC_LOCK_BITSTREAM structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_LOCK_BUSY \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncLockBitstream                         (void* encoder, NV_ENC_LOCK_BITSTREAM* lockBitstreamBufferParams);
+
+
+// NvEncUnlockBitstream
+/**
+ * \brief Unlock the output bitstream buffer
+ *
+ * This function is used to unlock the output bitstream buffer after the client
+ * has read the encoded data from output buffer. The client must call this function
+ * to unlock the output buffer which it has previously locked using ::NvEncLockBitstream()
+ * function. Using a locked bitstream buffer in ::NvEncEncodePicture() API will cause
+ * the function to fail.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] bitstreamBuffer
+ *   Pointer to the bitstream buffer being unlocked.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncUnlockBitstream                       (void* encoder, NV_ENC_OUTPUT_PTR bitstreamBuffer);
+
+
+// NvEncLockInputBuffer
+/**
+ * \brief Locks an input buffer
+ *
+ * This function is used to lock the input buffer to load the uncompressed YUV
+ * pixel data into input buffer memory. The client must pass the NV_ENC_INPUT_PTR
+ * it had previously allocated using ::NvEncCreateInputBuffer() in the
+ * NV_ENC_LOCK_INPUT_BUFFER::inputBuffer field.
+ * The NvEncodeAPI interface returns pointer to client accessible input buffer
+ * memory in NV_ENC_LOCK_INPUT_BUFFER::bufferDataPtr field.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] lockInputBufferParams
+ *   Pointer to the ::_NV_ENC_LOCK_INPUT_BUFFER structure
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_LOCK_BUSY \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncLockInputBuffer                      (void* encoder, NV_ENC_LOCK_INPUT_BUFFER* lockInputBufferParams);
+
+
+// NvEncUnlockInputBuffer
+/**
+ * \brief Unlocks the input buffer
+ *
+ * This function is used to unlock the input buffer memory previously locked for
+ * uploading YUV pixel data. The input buffer must be unlocked before being used
+ * again for encoding, otherwise the ::NvEncEncodePicture() call will fail.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] inputBuffer
+ *   Pointer to the input buffer that is being unlocked.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncUnlockInputBuffer                     (void* encoder, NV_ENC_INPUT_PTR inputBuffer);
+
+
+// NvEncGetEncodeStats
+/**
+ * \brief Get encoding statistics.
+ *
+ * This function is used to retrieve the encoding statistics.
+ * This API is not supported when encode device type is CUDA.
+ * Note that this API will be removed in a future Video Codec SDK release.
+ * Clients should use NvEncLockBitstream() API to retrieve the encoding statistics.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] encodeStats
+ *   Pointer to the ::_NV_ENC_STAT structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetEncodeStats                        (void* encoder, NV_ENC_STAT* encodeStats);
+
+
+// NvEncGetSequenceParams
+/**
+ * \brief Get encoded sequence and picture header.
+ *
+ * This function can be used to retrieve the sequence and picture header out of
+ * band. The client must call this function only after the encoder has been
+ * initialized using ::NvEncInitializeEncoder() function. The client must
+ * allocate the memory where the NvEncodeAPI interface can copy the bitstream
+ * header and pass the pointer to the memory in NV_ENC_SEQUENCE_PARAM_PAYLOAD::spsppsBuffer.
+ * The size of the buffer is passed in the field NV_ENC_SEQUENCE_PARAM_PAYLOAD::inBufferSize.
+ * The NvEncodeAPI interface copies the bitstream header payload and returns
+ * the actual size of the bitstream header in the field
+ * NV_ENC_SEQUENCE_PARAM_PAYLOAD::outSPSPPSPayloadSize.
+ * The client must call the ::NvEncGetSequenceParams() function from the same thread which is
+ * being used to call ::NvEncEncodePicture() function.
+ *
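+ *\par Illustrative sketch (not part of the SDK text)
+ * A minimal sketch of retrieving the header out of band as described above. The
+ * 256-byte buffer size and \p store_extradata() are assumptions made for the example;
+ * the \p version field is assumed from the NV_ENC_SEQUENCE_PARAM_PAYLOAD definition.
+ *\code
+   uint8_t  header[256];                  // client-allocated; size chosen for illustration
+   uint32_t headerSize = 0;
+
+   NV_ENC_SEQUENCE_PARAM_PAYLOAD payload = { 0 };
+   payload.version              = NV_ENC_SEQUENCE_PARAM_PAYLOAD_VER;
+   payload.spsppsBuffer         = header;
+   payload.inBufferSize         = sizeof(header);
+   payload.outSPSPPSPayloadSize = &headerSize;   // receives the actual header size
+
+   // Call from the same thread that calls NvEncEncodePicture().
+   if (NvEncGetSequenceParams(encoder, &payload) == NV_ENC_SUCCESS)
+       store_extradata(header, headerSize);
+ * \endcode
+ *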
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] sequenceParamPayload
+ *   Pointer to the ::_NV_ENC_SEQUENCE_PARAM_PAYLOAD structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetSequenceParams                     (void* encoder, NV_ENC_SEQUENCE_PARAM_PAYLOAD* sequenceParamPayload);
+
+// NvEncGetSequenceParamEx
+/**
+ * \brief Get sequence and picture header.
+ *
+ * This function can be used to retrieve the sequence and picture header out of band, even when
+ * encoder has not been initialized using ::NvEncInitializeEncoder() function.
+ * The client must allocate the memory where the NvEncodeAPI interface can copy the bitstream
+ * header and pass the pointer to the memory in NV_ENC_SEQUENCE_PARAM_PAYLOAD::spsppsBuffer.
+ * The size of the buffer is passed in the field NV_ENC_SEQUENCE_PARAM_PAYLOAD::inBufferSize.
+ * If encoder has not been initialized using ::NvEncInitializeEncoder() function, client must
+ * send NV_ENC_INITIALIZE_PARAMS as input. The NV_ENC_INITIALIZE_PARAMS passed must be the same as the
+ * one which will be used for initializing encoder using ::NvEncInitializeEncoder() function later.
+ * If encoder is already initialized using ::NvEncInitializeEncoder() function, the provided
+ * NV_ENC_INITIALIZE_PARAMS structure is ignored. The NvEncodeAPI interface copies the bitstream
+ * header payload and returns the actual size of the bitstream header in the field
+ * NV_ENC_SEQUENCE_PARAM_PAYLOAD::outSPSPPSPayloadSize. The client must call the ::NvEncGetSequenceParamEx()
+ * function from the same thread which is being used to call ::NvEncEncodePicture() function.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] encInitParams
+ *   Pointer to the ::_NV_ENC_INITIALIZE_PARAMS structure.
+ * \param [in,out] sequenceParamPayload
+ *   Pointer to the ::_NV_ENC_SEQUENCE_PARAM_PAYLOAD structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncGetSequenceParamEx                     (void* encoder, NV_ENC_INITIALIZE_PARAMS* encInitParams, NV_ENC_SEQUENCE_PARAM_PAYLOAD* sequenceParamPayload);
+
+// NvEncRegisterAsyncEvent
+/**
+ * \brief Register event for notification of encoding completion.
+ *
+ * This function is used to register the completion event with NvEncodeAPI
+ * interface. The event is required when the client has configured the encoder to
+ * work in asynchronous mode. In this mode the client needs to send a completion
+ * event with every output buffer. The NvEncodeAPI interface will signal the
+ * completion of the encoding process using this event. Only after the event is
+ * signaled can the client get the encoded data using ::NvEncLockBitstream() function.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] eventParams
+ *   Pointer to the ::_NV_ENC_EVENT_PARAMS structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncRegisterAsyncEvent                    (void* encoder, NV_ENC_EVENT_PARAMS* eventParams);
+
+
+// NvEncUnregisterAsyncEvent
+/**
+ * \brief Unregister completion event.
+ *
+ * This function is used to unregister completion event which has been previously
+ * registered using ::NvEncRegisterAsyncEvent() function. The client must unregister
+ * all events before destroying the encoder using ::NvEncDestroyEncoder() function.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] eventParams
+ *   Pointer to the ::_NV_ENC_EVENT_PARAMS structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncUnregisterAsyncEvent                  (void* encoder, NV_ENC_EVENT_PARAMS* eventParams);
+
+
+// NvEncMapInputResource
+/**
+ * \brief Map an externally created input resource pointer for encoding.
+ *
+ * Maps an externally allocated input resource and returns an NV_ENC_INPUT_PTR
+ * which can be used for encoding in the ::NvEncEncodePicture() function. The
+ * mapped resource is returned in the field NV_ENC_MAP_INPUT_RESOURCE::outputResourcePtr.
+ * The NvEncodeAPI interface also returns the buffer format of the mapped resource
+ * in the field NV_ENC_MAP_INPUT_RESOURCE::outbufferFmt.
+ * This function provides a synchronization guarantee that any graphics work submitted
+ * on the input buffer is completed before the buffer is used for encoding. This is
+ * also true for compute (i.e. CUDA) work, provided that the previous workload using
+ * the input resource was submitted to the default stream.
+ * The client should not access any input buffer while they are mapped by the encoder.
+ * For D3D12 interface type, this function does not provide a synchronization guarantee.
+ *
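+ *\par Illustrative sketch (not part of the SDK text)
+ * A minimal sketch of mapping a registered resource for one encode call. The field
+ * names \p registeredResource, \p mappedResource and \p mappedBufferFmt are assumed
+ * from the NV_ENC_MAP_INPUT_RESOURCE definition and should be checked against it;
+ * \p registered and \p picParams are hypothetical client-side variables.
+ *\code
+   NV_ENC_MAP_INPUT_RESOURCE map = { 0 };
+   map.version            = NV_ENC_MAP_INPUT_RESOURCE_VER;
+   map.registeredResource = registered;   // handle returned by NvEncRegisterResource()
+
+   if (NvEncMapInputResource(encoder, &map) == NV_ENC_SUCCESS) {
+       picParams.inputBuffer = map.mappedResource;    // NV_ENC_INPUT_PTR usable for encoding
+       picParams.bufferFmt   = map.mappedBufferFmt;
+       // ... NvEncEncodePicture(), then NvEncLockBitstream() and NvEncUnlockBitstream() ...
+       // Do not touch the underlying resource while it is mapped.
+       NvEncUnmapInputResource(encoder, map.mappedResource);
+   }
+ * \endcode
+ *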
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] mapInputResParams
+ *   Pointer to the ::_NV_ENC_MAP_INPUT_RESOURCE structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_RESOURCE_NOT_REGISTERED \n
+ * ::NV_ENC_ERR_MAP_FAILED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncMapInputResource                         (void* encoder, NV_ENC_MAP_INPUT_RESOURCE* mapInputResParams);
+
+
+// NvEncUnmapInputResource
+/**
+ * \brief Unmaps an NV_ENC_INPUT_PTR which was mapped for encoding
+ *
+ *
+ * Unmaps an input buffer which was previously mapped using ::NvEncMapInputResource()
+ * API. The mapping created using ::NvEncMapInputResource() should be invalidated
+ * using this API before the external resource is destroyed by the client. The client
+ * must unmap the buffer after ::NvEncLockBitstream() API returns successfully for encode
+ * work submitted using the mapped input buffer.
+ *
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] mappedInputBuffer
+ *   Pointer to the NV_ENC_INPUT_PTR
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_RESOURCE_NOT_REGISTERED \n
+ * ::NV_ENC_ERR_RESOURCE_NOT_MAPPED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncUnmapInputResource                         (void* encoder, NV_ENC_INPUT_PTR mappedInputBuffer);
+
+// NvEncDestroyEncoder
+/**
+ * \brief Destroy Encoding Session
+ *
+ * Destroys the encoder session previously created using ::NvEncOpenEncodeSession()
+ * function. The client must flush the encoder before freeing any resources. In order
+ * to flush the encoder the client must pass a NULL encode picture packet and either
+ * wait for the ::NvEncEncodePicture() function to return in synchronous mode or wait
+ * for the flush event to be signaled by the encoder in asynchronous mode.
+ * The client must free all the input and output resources created using the
+ * NvEncodeAPI interface before destroying the encoder. If the client is operating
+ * in asynchronous mode, it must also unregister the completion events previously
+ * registered.
+ *
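+ *\par Illustrative sketch (not part of the SDK text)
+ * A minimal sketch of a synchronous-mode shutdown in the order described above. It
+ * assumes the end-of-stream flag NV_ENC_PIC_FLAG_EOS is how the NULL encode picture
+ * packet is expressed; \p outputBuffer and \p inputBuffer are hypothetical handles
+ * created earlier by the client.
+ *\code
+   NV_ENC_PIC_PARAMS eos = { 0 };
+   eos.version        = NV_ENC_PIC_PARAMS_VER;
+   eos.encodePicFlags = NV_ENC_PIC_FLAG_EOS;   // flush request: no input picture attached
+   NvEncEncodePicture(encoder, &eos);          // synchronous mode: wait for the call to return
+
+   // Read any remaining output first, then free resources, then destroy the session.
+   NvEncDestroyBitstreamBuffer(encoder, outputBuffer);
+   NvEncDestroyInputBuffer(encoder, inputBuffer);
+   NvEncDestroyEncoder(encoder);
+ * \endcode
+ *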
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncDestroyEncoder                        (void* encoder);
+
+// NvEncInvalidateRefFrames
+/**
+ * \brief Invalidate reference frames
+ *
+ * Invalidates reference frames based on the timestamp provided by the client.
+ * The encoder marks any reference frames or any frames which have been reconstructed
+ * using the corrupt frame as invalid for motion estimation and uses older reference
+ * frames for motion estimation. The encoder forces the current frame to be encoded
+ * as an intra frame if no reference frames are left after invalidation process.
+ * This is useful for error resiliency in low-latency applications. The client
+ * is recommended to set NV_ENC_CONFIG_H264::maxNumRefFrames to a large value so
+ * that encoder can keep a backup of older reference frames in the DPB and can use them
+ * for motion estimation when the newer reference frames have been invalidated.
+ * This API can be called multiple times.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] invalidRefFrameTimeStamp
+ *   Timestamp of the reference frame which needs to be invalidated.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncInvalidateRefFrames(void* encoder, uint64_t invalidRefFrameTimeStamp);
+
+// NvEncOpenEncodeSessionEx
+/**
+ * \brief Opens an encoding session.
+ *
+ * Opens an encoding session and returns a pointer to the encoder interface in
+ * the \p **encoder parameter. The client should start the encoding process by calling
+ * this API first.
+ * The client must pass a pointer to IDirect3DDevice9 device or CUDA context in the \p *device parameter.
+ * For the OpenGL interface, \p device must be NULL. An OpenGL context must be current when
+ * calling all NvEncodeAPI functions.
+ * If the creation of the encoder session fails, the client must call ::NvEncDestroyEncoder API
+ * before exiting.
+ *
+ * \param [in] openSessionExParams
+ *    Pointer to a ::NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS structure.
+ * \param [out] encoder
+ *    Encode Session pointer to the NvEncodeAPI interface.
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_NO_ENCODE_DEVICE \n
+ * ::NV_ENC_ERR_UNSUPPORTED_DEVICE \n
+ * ::NV_ENC_ERR_INVALID_DEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncOpenEncodeSessionEx                   (NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS *openSessionExParams, void** encoder);
+
+// NvEncRegisterResource
+/**
+ * \brief Registers a resource with the Nvidia Video Encoder Interface.
+ *
+ * Registers a resource with the Nvidia Video Encoder Interface for bookkeeping.
+ * The client is expected to pass the registered resource handle as well, while calling ::NvEncMapInputResource API.
+ *
+ * \param [in] encoder
+ *   Pointer to the NVEncodeAPI interface.
+ *
+ * \param [in] registerResParams
+ *   Pointer to a ::_NV_ENC_REGISTER_RESOURCE structure
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_RESOURCE_REGISTER_FAILED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ * ::NV_ENC_ERR_UNIMPLEMENTED \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncRegisterResource                      (void* encoder, NV_ENC_REGISTER_RESOURCE* registerResParams);
+
+// NvEncUnregisterResource
+/**
+ * \brief Unregisters a resource previously registered with the Nvidia Video Encoder Interface.
+ *
+ * Unregisters a resource previously registered with the Nvidia Video Encoder Interface.
+ * The client is expected to unregister any resource that it has registered with the
+ * Nvidia Video Encoder Interface before destroying the resource.
+ *
+ * \param [in] encoder
+ *   Pointer to the NVEncodeAPI interface.
+ *
+ * \param [in] registeredResource
+ *   The registered resource pointer that was returned in ::NvEncRegisterResource.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_RESOURCE_NOT_REGISTERED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ * ::NV_ENC_ERR_UNIMPLEMENTED \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncUnregisterResource                    (void* encoder, NV_ENC_REGISTERED_PTR registeredResource);
+
+// NvEncReconfigureEncoder
+/**
+ * \brief Reconfigure an existing encoding session.
+ *
+ * Reconfigure an existing encoding session.
+ * The client should call this API to change/reconfigure the parameters passed during
+ * the NvEncInitializeEncoder API call.
+ * Currently, reconfiguration of the following is not supported:
+ * - Change in GOP structure.
+ * - Change in sync/async mode.
+ * - Change in MaxWidth & MaxHeight.
+ * - Change in PTD mode.
+ *
+ * Resolution change is possible only if maxEncodeWidth & maxEncodeHeight of NV_ENC_INITIALIZE_PARAMS
+ * is set while creating encoder session.
+ *
+ * \param [in] encoder
+ *   Pointer to the NVEncodeAPI interface.
+ *
+ * \param [in] reInitEncodeParams
+ *    Pointer to a ::NV_ENC_RECONFIGURE_PARAMS structure.
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_NO_ENCODE_DEVICE \n
+ * ::NV_ENC_ERR_UNSUPPORTED_DEVICE \n
+ * ::NV_ENC_ERR_INVALID_DEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_GENERIC \n
+ *
+ */
+NVENCSTATUS NVENCAPI NvEncReconfigureEncoder                   (void *encoder, NV_ENC_RECONFIGURE_PARAMS* reInitEncodeParams);
+
+
+
+// NvEncCreateMVBuffer
+/**
+ * \brief Allocates output MV buffer for ME only mode.
+ *
+ * This function is used to allocate an output MV buffer. The size of the mvBuffer is
+ * dependent on the frame height and width of the last ::NvEncCreateInputBuffer() call.
+ * The NV_ENC_OUTPUT_PTR returned by the NvEncodeAPI interface in the
+ * ::NV_ENC_CREATE_MV_BUFFER::mvBuffer field should be used in
+ * ::NvEncRunMotionEstimationOnly() API.
+ * Client must lock ::NV_ENC_CREATE_MV_BUFFER::mvBuffer using ::NvEncLockBitstream() API to get the motion vector data.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in,out] createMVBufferParams
+ *  Pointer to the ::NV_ENC_CREATE_MV_BUFFER structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_GENERIC \n
+ */
+NVENCSTATUS NVENCAPI NvEncCreateMVBuffer                        (void* encoder, NV_ENC_CREATE_MV_BUFFER* createMVBufferParams);
+
+
+// NvEncDestroyMVBuffer
+/**
+ * \brief Release an output MV buffer for ME only mode.
+ *
+ * This function is used to release the output MV buffer allocated using
+ * the ::NvEncCreateMVBuffer() function. The client must release the output
+ * mvBuffer using this function before destroying the encoder session.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] mvBuffer
+ *   Pointer to the mvBuffer being released.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ */
+NVENCSTATUS NVENCAPI NvEncDestroyMVBuffer                       (void* encoder, NV_ENC_OUTPUT_PTR mvBuffer);
+
+
+// NvEncRunMotionEstimationOnly
+/**
+ * \brief Submit an input picture and reference frame for motion estimation in ME only mode.
+ *
+ * This function is used to submit the input frame and reference frame for motion
+ * estimation. The ME parameters are passed using *meOnlyParams which is a pointer
+ * to ::_NV_ENC_MEONLY_PARAMS structure.
+ * Client must lock ::NV_ENC_CREATE_MV_BUFFER::mvBuffer using the
+ * ::NvEncLockBitstream() API to get the motion vector data.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ * \param [in] meOnlyParams
+ *   Pointer to the ::_NV_ENC_MEONLY_PARAMS structure.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ * ::NV_ENC_ERR_INVALID_ENCODERDEVICE \n
+ * ::NV_ENC_ERR_DEVICE_NOT_EXIST \n
+ * ::NV_ENC_ERR_UNSUPPORTED_PARAM \n
+ * ::NV_ENC_ERR_OUT_OF_MEMORY \n
+ * ::NV_ENC_ERR_INVALID_PARAM \n
+ * ::NV_ENC_ERR_INVALID_VERSION \n
+ * ::NV_ENC_ERR_NEED_MORE_INPUT \n
+ * ::NV_ENC_ERR_ENCODER_NOT_INITIALIZED \n
+ * ::NV_ENC_ERR_GENERIC \n
+ */
+NVENCSTATUS NVENCAPI NvEncRunMotionEstimationOnly               (void* encoder, NV_ENC_MEONLY_PARAMS* meOnlyParams);
+
+// NvEncodeAPIGetMaxSupportedVersion
+/**
+ * \brief Get the largest NvEncodeAPI version supported by the driver.
+ *
+ * This function can be used by clients to determine if the driver supports
+ * the NvEncodeAPI header the application was compiled with.
+ *
+ * \param [out] version
+ *   Pointer to the requested value. The 4 least significant bits of the returned
+ *   value indicate the minor version and the rest of the bits indicate the major
+ *   version of the largest supported version.
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ */
+NVENCSTATUS NVENCAPI NvEncodeAPIGetMaxSupportedVersion          (uint32_t* version);
+
+
+// NvEncGetLastErrorString
+/**
+ * \brief Get the description of the last error reported by the API.
+ *
+ * This function returns a null-terminated string that can be used by clients to better understand the reason
+ * for failure of a previous API call.
+ *
+ * \param [in] encoder
+ *   Pointer to the NvEncodeAPI interface.
+ *
+ * \return
+ *   Pointer to buffer containing the details of the last error encountered by the API.
+ */
+const char * NVENCAPI NvEncGetLastErrorString          (void* encoder);
+
+
+/// \cond API PFN
+/*
+ *  Defines API function pointers
+ */
+typedef NVENCSTATUS (NVENCAPI* PNVENCOPENENCODESESSION)         (void* device, uint32_t deviceType, void** encoder);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODEGUIDCOUNT)        (void* encoder, uint32_t* encodeGUIDCount);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODEGUIDS)            (void* encoder, GUID* GUIDs, uint32_t guidArraySize, uint32_t* GUIDCount);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODEPROFILEGUIDCOUNT) (void* encoder, GUID encodeGUID, uint32_t* encodeProfileGUIDCount);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODEPROFILEGUIDS)     (void* encoder, GUID encodeGUID, GUID* profileGUIDs, uint32_t guidArraySize, uint32_t* GUIDCount);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETINPUTFORMATCOUNT)       (void* encoder, GUID encodeGUID, uint32_t* inputFmtCount);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETINPUTFORMATS)           (void* encoder, GUID encodeGUID, NV_ENC_BUFFER_FORMAT* inputFmts, uint32_t inputFmtArraySize, uint32_t* inputFmtCount);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODECAPS)             (void* encoder, GUID encodeGUID, NV_ENC_CAPS_PARAM* capsParam, int* capsVal);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODEPRESETCOUNT)      (void* encoder, GUID encodeGUID, uint32_t* encodePresetGUIDCount);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODEPRESETGUIDS)      (void* encoder, GUID encodeGUID, GUID* presetGUIDs, uint32_t guidArraySize, uint32_t* encodePresetGUIDCount);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODEPRESETCONFIG)     (void* encoder, GUID encodeGUID, GUID  presetGUID, NV_ENC_PRESET_CONFIG* presetConfig);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODEPRESETCONFIGEX)   (void* encoder, GUID encodeGUID, GUID  presetGUID, NV_ENC_TUNING_INFO tuningInfo, NV_ENC_PRESET_CONFIG* presetConfig);
+typedef NVENCSTATUS (NVENCAPI* PNVENCINITIALIZEENCODER)         (void* encoder, NV_ENC_INITIALIZE_PARAMS* createEncodeParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCCREATEINPUTBUFFER)         (void* encoder, NV_ENC_CREATE_INPUT_BUFFER* createInputBufferParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCDESTROYINPUTBUFFER)        (void* encoder, NV_ENC_INPUT_PTR inputBuffer);
+typedef NVENCSTATUS (NVENCAPI* PNVENCCREATEBITSTREAMBUFFER)     (void* encoder, NV_ENC_CREATE_BITSTREAM_BUFFER* createBitstreamBufferParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCDESTROYBITSTREAMBUFFER)    (void* encoder, NV_ENC_OUTPUT_PTR bitstreamBuffer);
+typedef NVENCSTATUS (NVENCAPI* PNVENCENCODEPICTURE)             (void* encoder, NV_ENC_PIC_PARAMS* encodePicParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCLOCKBITSTREAM)             (void* encoder, NV_ENC_LOCK_BITSTREAM* lockBitstreamBufferParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCUNLOCKBITSTREAM)           (void* encoder, NV_ENC_OUTPUT_PTR bitstreamBuffer);
+typedef NVENCSTATUS (NVENCAPI* PNVENCLOCKINPUTBUFFER)           (void* encoder, NV_ENC_LOCK_INPUT_BUFFER* lockInputBufferParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCUNLOCKINPUTBUFFER)         (void* encoder, NV_ENC_INPUT_PTR inputBuffer);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETENCODESTATS)            (void* encoder, NV_ENC_STAT* encodeStats);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETSEQUENCEPARAMS)         (void* encoder, NV_ENC_SEQUENCE_PARAM_PAYLOAD* sequenceParamPayload);
+typedef NVENCSTATUS (NVENCAPI* PNVENCREGISTERASYNCEVENT)        (void* encoder, NV_ENC_EVENT_PARAMS* eventParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCUNREGISTERASYNCEVENT)      (void* encoder, NV_ENC_EVENT_PARAMS* eventParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCMAPINPUTRESOURCE)          (void* encoder, NV_ENC_MAP_INPUT_RESOURCE* mapInputResParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCUNMAPINPUTRESOURCE)        (void* encoder, NV_ENC_INPUT_PTR mappedInputBuffer);
+typedef NVENCSTATUS (NVENCAPI* PNVENCDESTROYENCODER)            (void* encoder);
+typedef NVENCSTATUS (NVENCAPI* PNVENCINVALIDATEREFFRAMES)       (void* encoder, uint64_t invalidRefFrameTimeStamp);
+typedef NVENCSTATUS (NVENCAPI* PNVENCOPENENCODESESSIONEX)       (NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS *openSessionExParams, void** encoder);
+typedef NVENCSTATUS (NVENCAPI* PNVENCREGISTERRESOURCE)          (void* encoder, NV_ENC_REGISTER_RESOURCE* registerResParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCUNREGISTERRESOURCE)        (void* encoder, NV_ENC_REGISTERED_PTR registeredRes);
+typedef NVENCSTATUS (NVENCAPI* PNVENCRECONFIGUREENCODER)        (void* encoder, NV_ENC_RECONFIGURE_PARAMS* reInitEncodeParams);
+
+typedef NVENCSTATUS (NVENCAPI* PNVENCCREATEMVBUFFER)            (void* encoder, NV_ENC_CREATE_MV_BUFFER* createMVBufferParams);
+typedef NVENCSTATUS (NVENCAPI* PNVENCDESTROYMVBUFFER)           (void* encoder, NV_ENC_OUTPUT_PTR mvBuffer);
+typedef NVENCSTATUS (NVENCAPI* PNVENCRUNMOTIONESTIMATIONONLY)   (void* encoder, NV_ENC_MEONLY_PARAMS* meOnlyParams);
+typedef const char * (NVENCAPI* PNVENCGETLASTERROR)             (void* encoder);
+typedef NVENCSTATUS (NVENCAPI* PNVENCSETIOCUDASTREAMS)          (void* encoder, NV_ENC_CUSTREAM_PTR inputStream, NV_ENC_CUSTREAM_PTR outputStream);
+typedef NVENCSTATUS (NVENCAPI* PNVENCGETSEQUENCEPARAMEX)        (void* encoder, NV_ENC_INITIALIZE_PARAMS* encInitParams, NV_ENC_SEQUENCE_PARAM_PAYLOAD* sequenceParamPayload);
+
+
+/// \endcond
+
+
+/** @} */ /* END ENCODE_FUNC */
+
+/**
+ * \ingroup ENCODER_STRUCTURE
+ * NV_ENCODE_API_FUNCTION_LIST
+ */
+typedef struct _NV_ENCODE_API_FUNCTION_LIST
+{
+    uint32_t                        version;                           /**< [in]: Client should pass NV_ENCODE_API_FUNCTION_LIST_VER.                               */
+    uint32_t                        reserved;                          /**< [in]: Reserved and should be set to 0.                                                  */
+    PNVENCOPENENCODESESSION         nvEncOpenEncodeSession;            /**< [out]: Client should access ::NvEncOpenEncodeSession() API through this pointer.        */
+    PNVENCGETENCODEGUIDCOUNT        nvEncGetEncodeGUIDCount;           /**< [out]: Client should access ::NvEncGetEncodeGUIDCount() API through this pointer.       */
+    PNVENCGETENCODEPROFILEGUIDCOUNT nvEncGetEncodeProfileGUIDCount;    /**< [out]: Client should access ::NvEncGetEncodeProfileGUIDCount() API through this pointer.*/
+    PNVENCGETENCODEPROFILEGUIDS     nvEncGetEncodeProfileGUIDs;        /**< [out]: Client should access ::NvEncGetEncodeProfileGUIDs() API through this pointer.    */
+    PNVENCGETENCODEGUIDS            nvEncGetEncodeGUIDs;               /**< [out]: Client should access ::NvEncGetEncodeGUIDs() API through this pointer.           */
+    PNVENCGETINPUTFORMATCOUNT       nvEncGetInputFormatCount;          /**< [out]: Client should access ::NvEncGetInputFormatCount() API through this pointer.      */
+    PNVENCGETINPUTFORMATS           nvEncGetInputFormats;              /**< [out]: Client should access ::NvEncGetInputFormats() API through this pointer.          */
+    PNVENCGETENCODECAPS             nvEncGetEncodeCaps;                /**< [out]: Client should access ::NvEncGetEncodeCaps() API through this pointer.            */
+    PNVENCGETENCODEPRESETCOUNT      nvEncGetEncodePresetCount;         /**< [out]: Client should access ::NvEncGetEncodePresetCount() API through this pointer.     */
+    PNVENCGETENCODEPRESETGUIDS      nvEncGetEncodePresetGUIDs;         /**< [out]: Client should access ::NvEncGetEncodePresetGUIDs() API through this pointer.     */
+    PNVENCGETENCODEPRESETCONFIG     nvEncGetEncodePresetConfig;        /**< [out]: Client should access ::NvEncGetEncodePresetConfig() API through this pointer.    */
+    PNVENCINITIALIZEENCODER         nvEncInitializeEncoder;            /**< [out]: Client should access ::NvEncInitializeEncoder() API through this pointer.        */
+    PNVENCCREATEINPUTBUFFER         nvEncCreateInputBuffer;            /**< [out]: Client should access ::NvEncCreateInputBuffer() API through this pointer.        */
+    PNVENCDESTROYINPUTBUFFER        nvEncDestroyInputBuffer;           /**< [out]: Client should access ::NvEncDestroyInputBuffer() API through this pointer.       */
+    PNVENCCREATEBITSTREAMBUFFER     nvEncCreateBitstreamBuffer;        /**< [out]: Client should access ::NvEncCreateBitstreamBuffer() API through this pointer.    */
+    PNVENCDESTROYBITSTREAMBUFFER    nvEncDestroyBitstreamBuffer;       /**< [out]: Client should access ::NvEncDestroyBitstreamBuffer() API through this pointer.   */
+    PNVENCENCODEPICTURE             nvEncEncodePicture;                /**< [out]: Client should access ::NvEncEncodePicture() API through this pointer.            */
+    PNVENCLOCKBITSTREAM             nvEncLockBitstream;                /**< [out]: Client should access ::NvEncLockBitstream() API through this pointer.            */
+    PNVENCUNLOCKBITSTREAM           nvEncUnlockBitstream;              /**< [out]: Client should access ::NvEncUnlockBitstream() API through this pointer.          */
+    PNVENCLOCKINPUTBUFFER           nvEncLockInputBuffer;              /**< [out]: Client should access ::NvEncLockInputBuffer() API through this pointer.          */
+    PNVENCUNLOCKINPUTBUFFER         nvEncUnlockInputBuffer;            /**< [out]: Client should access ::NvEncUnlockInputBuffer() API through this pointer.        */
+    PNVENCGETENCODESTATS            nvEncGetEncodeStats;               /**< [out]: Client should access ::NvEncGetEncodeStats() API through this pointer.           */
+    PNVENCGETSEQUENCEPARAMS         nvEncGetSequenceParams;            /**< [out]: Client should access ::NvEncGetSequenceParams() API through this pointer.        */
+    PNVENCREGISTERASYNCEVENT        nvEncRegisterAsyncEvent;           /**< [out]: Client should access ::NvEncRegisterAsyncEvent() API through this pointer.       */
+    PNVENCUNREGISTERASYNCEVENT      nvEncUnregisterAsyncEvent;         /**< [out]: Client should access ::NvEncUnregisterAsyncEvent() API through this pointer.     */
+    PNVENCMAPINPUTRESOURCE          nvEncMapInputResource;             /**< [out]: Client should access ::NvEncMapInputResource() API through this pointer.         */
+    PNVENCUNMAPINPUTRESOURCE        nvEncUnmapInputResource;           /**< [out]: Client should access ::NvEncUnmapInputResource() API through this pointer.       */
+    PNVENCDESTROYENCODER            nvEncDestroyEncoder;               /**< [out]: Client should access ::NvEncDestroyEncoder() API through this pointer.           */
+    PNVENCINVALIDATEREFFRAMES       nvEncInvalidateRefFrames;          /**< [out]: Client should access ::NvEncInvalidateRefFrames() API through this pointer.      */
+    PNVENCOPENENCODESESSIONEX       nvEncOpenEncodeSessionEx;          /**< [out]: Client should access ::NvEncOpenEncodeSessionEx() API through this pointer.      */
+    PNVENCREGISTERRESOURCE          nvEncRegisterResource;             /**< [out]: Client should access ::NvEncRegisterResource() API through this pointer.         */
+    PNVENCUNREGISTERRESOURCE        nvEncUnregisterResource;           /**< [out]: Client should access ::NvEncUnregisterResource() API through this pointer.       */
+    PNVENCRECONFIGUREENCODER        nvEncReconfigureEncoder;           /**< [out]: Client should access ::NvEncReconfigureEncoder() API through this pointer.       */
+    void*                           reserved1;
+    PNVENCCREATEMVBUFFER            nvEncCreateMVBuffer;               /**< [out]: Client should access ::NvEncCreateMVBuffer API through this pointer.             */
+    PNVENCDESTROYMVBUFFER           nvEncDestroyMVBuffer;              /**< [out]: Client should access ::NvEncDestroyMVBuffer API through this pointer.            */
+    PNVENCRUNMOTIONESTIMATIONONLY   nvEncRunMotionEstimationOnly;      /**< [out]: Client should access ::NvEncRunMotionEstimationOnly API through this pointer.    */
+    PNVENCGETLASTERROR              nvEncGetLastErrorString;           /**< [out]: Client should access ::NvEncGetLastErrorString API through this pointer.         */
+    PNVENCSETIOCUDASTREAMS          nvEncSetIOCudaStreams;             /**< [out]: Client should access ::NvEncSetIOCudaStreams API through this pointer.           */
+    PNVENCGETENCODEPRESETCONFIGEX   nvEncGetEncodePresetConfigEx;      /**< [out]: Client should access ::NvEncGetEncodePresetConfigEx() API through this pointer.  */
+    PNVENCGETSEQUENCEPARAMEX        nvEncGetSequenceParamEx;           /**< [out]: Client should access ::NvEncGetSequenceParamEx() API through this pointer.       */
+    void*                           reserved2[277];                    /**< [in]:  Reserved and must be set to NULL                                                 */
+} NV_ENCODE_API_FUNCTION_LIST;
+
+/** Macro for constructing the version field of ::_NV_ENCODE_API_FUNCTION_LIST. */
+#define NV_ENCODE_API_FUNCTION_LIST_VER NVENCAPI_STRUCT_VERSION(2)
+
+// NvEncodeAPICreateInstance
+/**
+ * \ingroup ENCODE_FUNC
+ * Entry Point to the NvEncodeAPI interface.
+ *
+ * Creates an instance of the NvEncodeAPI interface, and populates
+ * \p functionList with function pointers to the API routines implemented by the
+ * NvEncodeAPI interface.
+ *
+ * \param [out] functionList
+ *
+ * \return
+ * ::NV_ENC_SUCCESS \n
+ * ::NV_ENC_ERR_INVALID_PTR \n
+ */
+NVENCSTATUS NVENCAPI NvEncodeAPICreateInstance(NV_ENCODE_API_FUNCTION_LIST *functionList);
+
+#ifdef __cplusplus
+}
+#endif
+
+
+#endif
+
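Note on the `NvEncodeAPIGetMaxSupportedVersion` comment above: it says the 4 least significant bits of the returned value hold the minor version and the remaining bits hold the major version. The following is a minimal sketch of that decoding and of a driver-compatibility check built on it. The helper names (`nvenc_version_major`, `nvenc_version_minor`, `nvenc_version_pack`, `nvenc_driver_supports`) are hypothetical and are not part of nvEncodeAPI.h; the sketch deliberately avoids including the header so it compiles on its own.

```c
#include <stdint.h>

/* Hypothetical helpers (not part of nvEncodeAPI.h). They only mirror the
 * bit layout described in the NvEncodeAPIGetMaxSupportedVersion comment:
 * low 4 bits = minor version, remaining bits = major version. */
static uint32_t nvenc_version_major(uint32_t packed)
{
    return packed >> 4;
}

static uint32_t nvenc_version_minor(uint32_t packed)
{
    return packed & 0xFu;
}

/* Packs a major/minor pair into the same layout. */
static uint32_t nvenc_version_pack(uint32_t major, uint32_t minor)
{
    return (major << 4) | (minor & 0xFu);
}

/* Returns 1 if the driver's maximum supported version is at least the
 * version the application was compiled against, 0 otherwise. Because the
 * major version sits in the high bits, comparing the packed values orders
 * them by major first and minor second. */
static int nvenc_driver_supports(uint32_t driver_max,
                                 uint32_t header_major,
                                 uint32_t header_minor)
{
    return driver_max >= nvenc_version_pack(header_major, header_minor);
}
```

In real use, `driver_max` would be the value written by `NvEncodeAPIGetMaxSupportedVersion`, and the header version would presumably come from the `NVENCAPI_MAJOR_VERSION` and `NVENCAPI_MINOR_VERSION` macros, which are assumed to be defined earlier in nvEncodeAPI.h, outside the lines shown here.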
diff --git a/external/stb_image_write.h b/external/stb_image_write.h
new file mode 100644
index 0000000..e4b32ed
--- /dev/null
+++ b/external/stb_image_write.h
@@ -0,0 +1,1724 @@
+/* stb_image_write - v1.16 - public domain - http://nothings.org/stb
+   writes out PNG/BMP/TGA/JPEG/HDR images to C stdio - Sean Barrett 2010-2015
+                                     no warranty implied; use at your own risk
+
+   Before #including,
+
+       #define STB_IMAGE_WRITE_IMPLEMENTATION
+
+   in the file that you want to have the implementation.
+
+   Will probably not work correctly with strict-aliasing optimizations.
+
+ABOUT:
+
+   This header file is a library for writing images to C stdio or a callback.
+
+   The PNG output is not optimal; it is 20-50% larger than the file
+   written by a decent optimizing implementation; though providing a custom
+   zlib compress function (see STBIW_ZLIB_COMPRESS) can mitigate that.
+   This library is designed for source code compactness and simplicity,
+   not optimal image file size or run-time performance.
+
+BUILDING:
+
+   You can #define STBIW_ASSERT(x) before the #include to avoid using assert.h.
+   You can #define STBIW_MALLOC(), STBIW_REALLOC(), and STBIW_FREE() to replace
+   malloc,realloc,free.
+   You can #define STBIW_MEMMOVE() to replace memmove()
+   You can #define STBIW_ZLIB_COMPRESS to use a custom zlib-style compress function
+   for PNG compression (instead of the builtin one), it must have the following signature:
+   unsigned char * my_compress(unsigned char *data, int data_len, int *out_len, int quality);
+   The returned data will be freed with STBIW_FREE() (free() by default),
+   so it must be heap allocated with STBIW_MALLOC() (malloc() by default).
+
+UNICODE:
+
+   If compiling for Windows and you wish to use Unicode filenames, compile
+   with
+       #define STBIW_WINDOWS_UTF8
+   and pass utf8-encoded filenames. Call stbiw_convert_wchar_to_utf8 to convert
+   Windows wchar_t filenames to utf8.
+
+USAGE:
+
+   There are five functions, one for each image file format:
+
+     int stbi_write_png(char const *filename, int w, int h, int comp, const void *data, int stride_in_bytes);
+     int stbi_write_bmp(char const *filename, int w, int h, int comp, const void *data);
+     int stbi_write_tga(char const *filename, int w, int h, int comp, const void *data);
+     int stbi_write_jpg(char const *filename, int w, int h, int comp, const void *data, int quality);
+     int stbi_write_hdr(char const *filename, int w, int h, int comp, const float *data);
+
+     void stbi_flip_vertically_on_write(int flag); // flag is non-zero to flip data vertically
+
+   There are also five equivalent functions that use an arbitrary write function. You are
+   expected to open/close your file-equivalent before and after calling these:
+
+     int stbi_write_png_to_func(stbi_write_func *func, void *context, int w, int h, int comp, const void  *data, int stride_in_bytes);
+     int stbi_write_bmp_to_func(stbi_write_func *func, void *context, int w, int h, int comp, const void  *data);
+     int stbi_write_tga_to_func(stbi_write_func *func, void *context, int w, int h, int comp, const void  *data);
+     int stbi_write_hdr_to_func(stbi_write_func *func, void *context, int w, int h, int comp, const float *data);
+     int stbi_write_jpg_to_func(stbi_write_func *func, void *context, int x, int y, int comp, const void *data, int quality);
+
+   where the callback is:
+      void stbi_write_func(void *context, void *data, int size);
+
+   You can configure it with these global variables:
+      int stbi_write_tga_with_rle;             // defaults to true; set to 0 to disable RLE
+      int stbi_write_png_compression_level;    // defaults to 8; set to higher for more compression
+      int stbi_write_force_png_filter;         // defaults to -1; set to 0..5 to force a filter mode
+
+
+   You can define STBI_WRITE_NO_STDIO to disable the file variant of these
+   functions, so the library will not use stdio.h at all. However, this will
+   also disable HDR writing, because it requires stdio for formatted output.
+
+   Each function returns 0 on failure and non-0 on success.
+
+   The functions create an image file defined by the parameters. The image
+   is a rectangle of pixels stored from left-to-right, top-to-bottom.
+   Each pixel contains 'comp' channels of data stored interleaved with 8-bits
+   per channel, in the following order: 1=Y, 2=YA, 3=RGB, 4=RGBA. (Y is
+   monochrome color.) The rectangle is 'w' pixels wide and 'h' pixels tall.
+   The *data pointer points to the first byte of the top-left-most pixel.
+   For PNG, "stride_in_bytes" is the distance in bytes from the first byte of
+   a row of pixels to the first byte of the next row of pixels.
+
+   PNG creates output files with the same number of components as the input.
+   The BMP format expands Y to RGB in the file format and does not
+   output alpha.
+
+   PNG supports writing rectangles of data even when the bytes storing rows of
+   data are not consecutive in memory (e.g. sub-rectangles of a larger image),
+   by supplying the stride between the beginning of adjacent rows. The other
+   formats do not. (Thus you cannot write a native-format BMP through the BMP
+   writer, both because it is in BGR order and because it may have padding
+   at the end of the line.)
+
+   PNG allows you to set the deflate compression level by setting the global
+   variable 'stbi_write_png_compression_level' (it defaults to 8).
+
+   HDR expects linear float data. Since the format is always 32-bit rgb(e)
+   data, alpha (if provided) is discarded, and for monochrome data it is
+   replicated across all three channels.
+
+   TGA supports RLE or non-RLE compressed data. To use non-RLE-compressed
+   data, set the global variable 'stbi_write_tga_with_rle' to 0.
+
+   JPEG ignores alpha channels in input data; quality is between 1 and 100.
+   Higher quality looks better but results in a bigger image.
+   JPEG baseline (no JPEG progressive).
+
+CREDITS:
+
+
+   Sean Barrett           -    PNG/BMP/TGA
+   Baldur Karlsson        -    HDR
+   Jean-Sebastien Guay    -    TGA monochrome
+   Tim Kelsey             -    misc enhancements
+   Alan Hickman           -    TGA RLE
+   Emmanuel Julien        -    initial file IO callback implementation
+   Jon Olick              -    original jo_jpeg.cpp code
+   Daniel Gibson          -    integrate JPEG, allow external zlib
+   Aarni Koskela          -    allow choosing PNG filter
+
+   bugfixes:
+      github:Chribba
+      Guillaume Chereau
+      github:jry2
+      github:romigrou
+      Sergio Gonzalez
+      Jonas Karlsson
+      Filip Wasil
+      Thatcher Ulrich
+      github:poppolopoppo
+      Patrick Boettcher
+      github:xeekworx
+      Cap Petschulat
+      Simon Rodriguez
+      Ivan Tikhonov
+      github:ignotion
+      Adam Schackart
+      Andrew Kensler
+
+LICENSE
+
+  See end of file for license information.
+
+*/
+
+#ifndef INCLUDE_STB_IMAGE_WRITE_H
+#define INCLUDE_STB_IMAGE_WRITE_H
+
+#include <stdlib.h>
+
+// if STB_IMAGE_WRITE_STATIC causes problems, try defining STBIWDEF to 'inline' or 'static inline'
+#ifndef STBIWDEF
+#ifdef STB_IMAGE_WRITE_STATIC
+#define STBIWDEF  static
+#else
+#ifdef __cplusplus
+#define STBIWDEF  extern "C"
+#else
+#define STBIWDEF  extern
+#endif
+#endif
+#endif
+
+#ifndef STB_IMAGE_WRITE_STATIC  // C++ forbids static forward declarations
+STBIWDEF int stbi_write_tga_with_rle;
+STBIWDEF int stbi_write_png_compression_level;
+STBIWDEF int stbi_write_force_png_filter;
+#endif
+
+#ifndef STBI_WRITE_NO_STDIO
+STBIWDEF int stbi_write_png(char const *filename, int w, int h, int comp, const void  *data, int stride_in_bytes);
+STBIWDEF int stbi_write_bmp(char const *filename, int w, int h, int comp, const void  *data);
+STBIWDEF int stbi_write_tga(char const *filename, int w, int h, int comp, const void  *data);
+STBIWDEF int stbi_write_hdr(char const *filename, int w, int h, int comp, const float *data);
+STBIWDEF int stbi_write_jpg(char const *filename, int x, int y, int comp, const void  *data, int quality);
+
+#ifdef STBIW_WINDOWS_UTF8
+STBIWDEF int stbiw_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input);
+#endif
+#endif
+
+typedef void stbi_write_func(void *context, void *data, int size);
+
+STBIWDEF int stbi_write_png_to_func(stbi_write_func *func, void *context, int w, int h, int comp, const void  *data, int stride_in_bytes);
+STBIWDEF int stbi_write_bmp_to_func(stbi_write_func *func, void *context, int w, int h, int comp, const void  *data);
+STBIWDEF int stbi_write_tga_to_func(stbi_write_func *func, void *context, int w, int h, int comp, const void  *data);
+STBIWDEF int stbi_write_hdr_to_func(stbi_write_func *func, void *context, int w, int h, int comp, const float *data);
+STBIWDEF int stbi_write_jpg_to_func(stbi_write_func *func, void *context, int x, int y, int comp, const void  *data, int quality);
+
+STBIWDEF void stbi_flip_vertically_on_write(int flip_boolean);
+
+#endif//INCLUDE_STB_IMAGE_WRITE_H
+
+#ifdef STB_IMAGE_WRITE_IMPLEMENTATION
+
+#ifdef _WIN32
+   #ifndef _CRT_SECURE_NO_WARNINGS
+   #define _CRT_SECURE_NO_WARNINGS
+   #endif
+   #ifndef _CRT_NONSTDC_NO_DEPRECATE
+   #define _CRT_NONSTDC_NO_DEPRECATE
+   #endif
+#endif
+
+#ifndef STBI_WRITE_NO_STDIO
+#include <stdio.h>
+#endif // STBI_WRITE_NO_STDIO
+
+#include <stdarg.h>
+#include <stdlib.h>
+#include <string.h>
+#include <math.h>
+
+#if defined(STBIW_MALLOC) && defined(STBIW_FREE) && (defined(STBIW_REALLOC) || defined(STBIW_REALLOC_SIZED))
+// ok
+#elif !defined(STBIW_MALLOC) && !defined(STBIW_FREE) && !defined(STBIW_REALLOC) && !defined(STBIW_REALLOC_SIZED)
+// ok
+#else
+#error "Must define all or none of STBIW_MALLOC, STBIW_FREE, and STBIW_REALLOC (or STBIW_REALLOC_SIZED)."
+#endif
+
+#ifndef STBIW_MALLOC
+#define STBIW_MALLOC(sz)        malloc(sz)
+#define STBIW_REALLOC(p,newsz)  realloc(p,newsz)
+#define STBIW_FREE(p)           free(p)
+#endif
+
+#ifndef STBIW_REALLOC_SIZED
+#define STBIW_REALLOC_SIZED(p,oldsz,newsz) STBIW_REALLOC(p,newsz)
+#endif
+
+
+#ifndef STBIW_MEMMOVE
+#define STBIW_MEMMOVE(a,b,sz) memmove(a,b,sz)
+#endif
+
+
+#ifndef STBIW_ASSERT
+#include <assert.h>
+#define STBIW_ASSERT(x) assert(x)
+#endif
+
+#define STBIW_UCHAR(x) (unsigned char) ((x) & 0xff)
+
+#ifdef STB_IMAGE_WRITE_STATIC
+static int stbi_write_png_compression_level = 8;
+static int stbi_write_tga_with_rle = 1;
+static int stbi_write_force_png_filter = -1;
+#else
+int stbi_write_png_compression_level = 8;
+int stbi_write_tga_with_rle = 1;
+int stbi_write_force_png_filter = -1;
+#endif
+
+static int stbi__flip_vertically_on_write = 0;
+
+STBIWDEF void stbi_flip_vertically_on_write(int flag)
+{
+   stbi__flip_vertically_on_write = flag;
+}
+
+typedef struct
+{
+   stbi_write_func *func;
+   void *context;
+   unsigned char buffer[64];
+   int buf_used;
+} stbi__write_context;
+
+// initialize a callback-based context
+static void stbi__start_write_callbacks(stbi__write_context *s, stbi_write_func *c, void *context)
+{
+   s->func    = c;
+   s->context = context;
+}
+
+#ifndef STBI_WRITE_NO_STDIO
+
+static void stbi__stdio_write(void *context, void *data, int size)
+{
+   fwrite(data,1,size,(FILE*) context);
+}
+
+#if defined(_WIN32) && defined(STBIW_WINDOWS_UTF8)
+#ifdef __cplusplus
+#define STBIW_EXTERN extern "C"
+#else
+#define STBIW_EXTERN extern
+#endif
+STBIW_EXTERN __declspec(dllimport) int __stdcall MultiByteToWideChar(unsigned int cp, unsigned long flags, const char *str, int cbmb, wchar_t *widestr, int cchwide);
+STBIW_EXTERN __declspec(dllimport) int __stdcall WideCharToMultiByte(unsigned int cp, unsigned long flags, const wchar_t *widestr, int cchwide, char *str, int cbmb, const char *defchar, int *used_default);
+
+STBIWDEF int stbiw_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input)
+{
+   return WideCharToMultiByte(65001 /* UTF8 */, 0, input, -1, buffer, (int) bufferlen, NULL, NULL);
+}
+#endif
+
+static FILE *stbiw__fopen(char const *filename, char const *mode)
+{
+   FILE *f;
+#if defined(_WIN32) && defined(STBIW_WINDOWS_UTF8)
+   wchar_t wMode[64];
+   wchar_t wFilename[1024];
+   if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, filename, -1, wFilename, sizeof(wFilename)/sizeof(*wFilename)))
+      return 0;
+
+   if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, mode, -1, wMode, sizeof(wMode)/sizeof(*wMode)))
+      return 0;
+
+#if defined(_MSC_VER) && _MSC_VER >= 1400
+   if (0 != _wfopen_s(&f, wFilename, wMode))
+      f = 0;
+#else
+   f = _wfopen(wFilename, wMode);
+#endif
+
+#elif defined(_MSC_VER) && _MSC_VER >= 1400
+   if (0 != fopen_s(&f, filename, mode))
+      f=0;
+#else
+   f = fopen(filename, mode);
+#endif
+   return f;
+}
+
+static int stbi__start_write_file(stbi__write_context *s, const char *filename)
+{
+   FILE *f = stbiw__fopen(filename, "wb");
+   stbi__start_write_callbacks(s, stbi__stdio_write, (void *) f);
+   return f != NULL;
+}
+
+static void stbi__end_write_file(stbi__write_context *s)
+{
+   fclose((FILE *)s->context);
+}
+
+#endif // !STBI_WRITE_NO_STDIO
+
+typedef unsigned int stbiw_uint32;
+typedef int stb_image_write_test[sizeof(stbiw_uint32)==4 ? 1 : -1];
+
+static void stbiw__writefv(stbi__write_context *s, const char *fmt, va_list v)
+{
+   while (*fmt) {
+      switch (*fmt++) {
+         case ' ': break;
+         case '1': { unsigned char x = STBIW_UCHAR(va_arg(v, int));
+                     s->func(s->context,&x,1);
+                     break; }
+         case '2': { int x = va_arg(v,int);
+                     unsigned char b[2];
+                     b[0] = STBIW_UCHAR(x);
+                     b[1] = STBIW_UCHAR(x>>8);
+                     s->func(s->context,b,2);
+                     break; }
+         case '4': { stbiw_uint32 x = va_arg(v,int);
+                     unsigned char b[4];
+                     b[0]=STBIW_UCHAR(x);
+                     b[1]=STBIW_UCHAR(x>>8);
+                     b[2]=STBIW_UCHAR(x>>16);
+                     b[3]=STBIW_UCHAR(x>>24);
+                     s->func(s->context,b,4);
+                     break; }
+         default:
+            STBIW_ASSERT(0);
+            return;
+      }
+   }
+}
+
+static void stbiw__writef(stbi__write_context *s, const char *fmt, ...)
+{
+   va_list v;
+   va_start(v, fmt);
+   stbiw__writefv(s, fmt, v);
+   va_end(v);
+}
+
+static void stbiw__write_flush(stbi__write_context *s)
+{
+   if (s->buf_used) {
+      s->func(s->context, &s->buffer, s->buf_used);
+      s->buf_used = 0;
+   }
+}
+
+static void stbiw__putc(stbi__write_context *s, unsigned char c)
+{
+   s->func(s->context, &c, 1);
+}
+
+static void stbiw__write1(stbi__write_context *s, unsigned char a)
+{
+   if ((size_t)s->buf_used + 1 > sizeof(s->buffer))
+      stbiw__write_flush(s);
+   s->buffer[s->buf_used++] = a;
+}
+
+static void stbiw__write3(stbi__write_context *s, unsigned char a, unsigned char b, unsigned char c)
+{
+   int n;
+   if ((size_t)s->buf_used + 3 > sizeof(s->buffer))
+      stbiw__write_flush(s);
+   n = s->buf_used;
+   s->buf_used = n+3;
+   s->buffer[n+0] = a;
+   s->buffer[n+1] = b;
+   s->buffer[n+2] = c;
+}
+
+static void stbiw__write_pixel(stbi__write_context *s, int rgb_dir, int comp, int write_alpha, int expand_mono, unsigned char *d)
+{
+   unsigned char bg[3] = { 255, 0, 255}, px[3];
+   int k;
+
+   if (write_alpha < 0)
+      stbiw__write1(s, d[comp - 1]);
+
+   switch (comp) {
+      case 2: // 2 channels = mono + alpha, alpha is written separately, so same as 1-channel case
+      case 1:
+         if (expand_mono)
+            stbiw__write3(s, d[0], d[0], d[0]); // monochrome bmp
+         else
+            stbiw__write1(s, d[0]);  // monochrome TGA
+         break;
+      case 4:
+         if (!write_alpha) {
+            // composite against pink background
+            for (k = 0; k < 3; ++k)
+               px[k] = bg[k] + ((d[k] - bg[k]) * d[3]) / 255;
+            stbiw__write3(s, px[1 - rgb_dir], px[1], px[1 + rgb_dir]);
+            break;
+         }
+         /* FALLTHROUGH */
+      case 3:
+         stbiw__write3(s, d[1 - rgb_dir], d[1], d[1 + rgb_dir]);
+         break;
+   }
+   if (write_alpha > 0)
+      stbiw__write1(s, d[comp - 1]);
+}
+
+static void stbiw__write_pixels(stbi__write_context *s, int rgb_dir, int vdir, int x, int y, int comp, void *data, int write_alpha, int scanline_pad, int expand_mono)
+{
+   stbiw_uint32 zero = 0;
+   int i,j, j_end;
+
+   if (y <= 0)
+      return;
+
+   if (stbi__flip_vertically_on_write)
+      vdir *= -1;
+
+   if (vdir < 0) {
+      j_end = -1; j = y-1;
+   } else {
+      j_end =  y; j = 0;
+   }
+
+   for (; j != j_end; j += vdir) {
+      for (i=0; i < x; ++i) {
+         unsigned char *d = (unsigned char *) data + (j*x+i)*comp;
+         stbiw__write_pixel(s, rgb_dir, comp, write_alpha, expand_mono, d);
+      }
+      stbiw__write_flush(s);
+      s->func(s->context, &zero, scanline_pad);
+   }
+}
+
+static int stbiw__outfile(stbi__write_context *s, int rgb_dir, int vdir, int x, int y, int comp, int expand_mono, void *data, int alpha, int pad, const char *fmt, ...)
+{
+   if (y < 0 || x < 0) {
+      return 0;
+   } else {
+      va_list v;
+      va_start(v, fmt);
+      stbiw__writefv(s, fmt, v);
+      va_end(v);
+      stbiw__write_pixels(s,rgb_dir,vdir,x,y,comp,data,alpha,pad, expand_mono);
+      return 1;
+   }
+}
+
+static int stbi_write_bmp_core(stbi__write_context *s, int x, int y, int comp, const void *data)
+{
+   if (comp != 4) {
+      // write RGB bitmap
+      int pad = (-x*3) & 3;
+      return stbiw__outfile(s,-1,-1,x,y,comp,1,(void *) data,0,pad,
+              "11 4 22 4" "4 44 22 444444",
+              'B', 'M', 14+40+(x*3+pad)*y, 0,0, 14+40,  // file header
+               40, x,y, 1,24, 0,0,0,0,0,0);             // bitmap header
+   } else {
+      // RGBA bitmaps need a v4 header
+      // use BI_BITFIELDS mode with 32bpp and alpha mask
+      // (straight BI_RGB with alpha mask doesn't work in most readers)
+      return stbiw__outfile(s,-1,-1,x,y,comp,1,(void *)data,1,0,
+         "11 4 22 4" "4 44 22 444444 4444 4 444 444 444 444",
+         'B', 'M', 14+108+x*y*4, 0, 0, 14+108, // file header
+         108, x,y, 1,32, 3,0,0,0,0,0, 0xff0000,0xff00,0xff,0xff000000u, 0, 0,0,0, 0,0,0, 0,0,0, 0,0,0); // bitmap V4 header
+   }
+}
+
+STBIWDEF int stbi_write_bmp_to_func(stbi_write_func *func, void *context, int x, int y, int comp, const void *data)
+{
+   stbi__write_context s = { 0 };
+   stbi__start_write_callbacks(&s, func, context);
+   return stbi_write_bmp_core(&s, x, y, comp, data);
+}
+
+#ifndef STBI_WRITE_NO_STDIO
+STBIWDEF int stbi_write_bmp(char const *filename, int x, int y, int comp, const void *data)
+{
+   stbi__write_context s = { 0 };
+   if (stbi__start_write_file(&s,filename)) {
+      int r = stbi_write_bmp_core(&s, x, y, comp, data);
+      stbi__end_write_file(&s);
+      return r;
+   } else
+      return 0;
+}
+#endif //!STBI_WRITE_NO_STDIO
+
+static int stbi_write_tga_core(stbi__write_context *s, int x, int y, int comp, void *data)
+{
+   int has_alpha = (comp == 2 || comp == 4);
+   int colorbytes = has_alpha ? comp-1 : comp;
+   int format = colorbytes < 2 ? 3 : 2; // 3 color channels (RGB/RGBA) = 2, 1 color channel (Y/YA) = 3
+
+   if (y < 0 || x < 0)
+      return 0;
+
+   if (!stbi_write_tga_with_rle) {
+      return stbiw__outfile(s, -1, -1, x, y, comp, 0, (void *) data, has_alpha, 0,
+         "111 221 2222 11", 0, 0, format, 0, 0, 0, 0, 0, x, y, (colorbytes + has_alpha) * 8, has_alpha * 8);
+   } else {
+      int i,j,k;
+      int jend, jdir;
+
+      stbiw__writef(s, "111 221 2222 11", 0,0,format+8, 0,0,0, 0,0,x,y, (colorbytes + has_alpha) * 8, has_alpha * 8);
+
+      if (stbi__flip_vertically_on_write) {
+         j = 0;
+         jend = y;
+         jdir = 1;
+      } else {
+         j = y-1;
+         jend = -1;
+         jdir = -1;
+      }
+      for (; j != jend; j += jdir) {
+         unsigned char *row = (unsigned char *) data + j * x * comp;
+         int len;
+
+         for (i = 0; i < x; i += len) {
+            unsigned char *begin = row + i * comp;
+            int diff = 1;
+            len = 1;
+
+            if (i < x - 1) {
+               ++len;
+               diff = memcmp(begin, row + (i + 1) * comp, comp);
+               if (diff) {
+                  const unsigned char *prev = begin;
+                  for (k = i + 2; k < x && len < 128; ++k) {
+                     if (memcmp(prev, row + k * comp, comp)) {
+                        prev += comp;
+                        ++len;
+                     } else {
+                        --len;
+                        break;
+                     }
+                  }
+               } else {
+                  for (k = i + 2; k < x && len < 128; ++k) {
+                     if (!memcmp(begin, row + k * comp, comp)) {
+                        ++len;
+                     } else {
+                        break;
+                     }
+                  }
+               }
+            }
+
+            if (diff) {
+               unsigned char header = STBIW_UCHAR(len - 1);
+               stbiw__write1(s, header);
+               for (k = 0; k < len; ++k) {
+                  stbiw__write_pixel(s, -1, comp, has_alpha, 0, begin + k * comp);
+               }
+            } else {
+               unsigned char header = STBIW_UCHAR(len - 129);
+               stbiw__write1(s, header);
+               stbiw__write_pixel(s, -1, comp, has_alpha, 0, begin);
+            }
+         }
+      }
+      stbiw__write_flush(s);
+   }
+   return 1;
+}
+
+STBIWDEF int stbi_write_tga_to_func(stbi_write_func *func, void *context, int x, int y, int comp, const void *data)
+{
+   stbi__write_context s = { 0 };
+   stbi__start_write_callbacks(&s, func, context);
+   return stbi_write_tga_core(&s, x, y, comp, (void *) data);
+}
+
+#ifndef STBI_WRITE_NO_STDIO
+STBIWDEF int stbi_write_tga(char const *filename, int x, int y, int comp, const void *data)
+{
+   stbi__write_context s = { 0 };
+   if (stbi__start_write_file(&s,filename)) {
+      int r = stbi_write_tga_core(&s, x, y, comp, (void *) data);
+      stbi__end_write_file(&s);
+      return r;
+   } else
+      return 0;
+}
+#endif
+
+// *************************************************************************************************
+// Radiance RGBE HDR writer
+// by Baldur Karlsson
+
+#define stbiw__max(a, b)  ((a) > (b) ? (a) : (b))
+
+#ifndef STBI_WRITE_NO_STDIO
+
+static void stbiw__linear_to_rgbe(unsigned char *rgbe, float *linear)
+{
+   int exponent;
+   float maxcomp = stbiw__max(linear[0], stbiw__max(linear[1], linear[2]));
+
+   if (maxcomp < 1e-32f) {
+      rgbe[0] = rgbe[1] = rgbe[2] = rgbe[3] = 0;
+   } else {
+      float normalize = (float) frexp(maxcomp, &exponent) * 256.0f/maxcomp;
+
+      rgbe[0] = (unsigned char)(linear[0] * normalize);
+      rgbe[1] = (unsigned char)(linear[1] * normalize);
+      rgbe[2] = (unsigned char)(linear[2] * normalize);
+      rgbe[3] = (unsigned char)(exponent + 128);
+   }
+}
+
+static void stbiw__write_run_data(stbi__write_context *s, int length, unsigned char databyte)
+{
+   unsigned char lengthbyte = STBIW_UCHAR(length+128);
+   STBIW_ASSERT(length+128 <= 255);
+   s->func(s->context, &lengthbyte, 1);
+   s->func(s->context, &databyte, 1);
+}
+
+static void stbiw__write_dump_data(stbi__write_context *s, int length, unsigned char *data)
+{
+   unsigned char lengthbyte = STBIW_UCHAR(length);
+   STBIW_ASSERT(length <= 128); // inconsistent with spec but consistent with official code
+   s->func(s->context, &lengthbyte, 1);
+   s->func(s->context, data, length);
+}
+
+static void stbiw__write_hdr_scanline(stbi__write_context *s, int width, int ncomp, unsigned char *scratch, float *scanline)
+{
+   unsigned char scanlineheader[4] = { 2, 2, 0, 0 };
+   unsigned char rgbe[4];
+   float linear[3];
+   int x;
+
+   scanlineheader[2] = (width&0xff00)>>8;
+   scanlineheader[3] = (width&0x00ff);
+
+   /* skip RLE for images too small or large */
+   if (width < 8 || width >= 32768) {
+      for (x=0; x < width; x++) {
+         switch (ncomp) {
+            case 4: /* fallthrough */
+            case 3: linear[2] = scanline[x*ncomp + 2];
+                    linear[1] = scanline[x*ncomp + 1];
+                    linear[0] = scanline[x*ncomp + 0];
+                    break;
+            default:
+                    linear[0] = linear[1] = linear[2] = scanline[x*ncomp + 0];
+                    break;
+         }
+         stbiw__linear_to_rgbe(rgbe, linear);
+         s->func(s->context, rgbe, 4);
+      }
+   } else {
+      int c,r;
+      /* encode into scratch buffer */
+      for (x=0; x < width; x++) {
+         switch(ncomp) {
+            case 4: /* fallthrough */
+            case 3: linear[2] = scanline[x*ncomp + 2];
+                    linear[1] = scanline[x*ncomp + 1];
+                    linear[0] = scanline[x*ncomp + 0];
+                    break;
+            default:
+                    linear[0] = linear[1] = linear[2] = scanline[x*ncomp + 0];
+                    break;
+         }
+         stbiw__linear_to_rgbe(rgbe, linear);
+         scratch[x + width*0] = rgbe[0];
+         scratch[x + width*1] = rgbe[1];
+         scratch[x + width*2] = rgbe[2];
+         scratch[x + width*3] = rgbe[3];
+      }
+
+      s->func(s->context, scanlineheader, 4);
+
+      /* RLE each component separately */
+      for (c=0; c < 4; c++) {
+         unsigned char *comp = &scratch[width*c];
+
+         x = 0;
+         while (x < width) {
+            // find first run
+            r = x;
+            while (r+2 < width) {
+               if (comp[r] == comp[r+1] && comp[r] == comp[r+2])
+                  break;
+               ++r;
+            }
+            if (r+2 >= width)
+               r = width;
+            // dump up to first run
+            while (x < r) {
+               int len = r-x;
+               if (len > 128) len = 128;
+               stbiw__write_dump_data(s, len, &comp[x]);
+               x += len;
+            }
+            // if there's a run, output it
+            if (r+2 < width) { // same test that breaks out of the search loop, so only true if we broke out early
+               // find next byte after run
+               while (r < width && comp[r] == comp[x])
+                  ++r;
+               // output run up to r
+               while (x < r) {
+                  int len = r-x;
+                  if (len > 127) len = 127;
+                  stbiw__write_run_data(s, len, comp[x]);
+                  x += len;
+               }
+            }
+         }
+      }
+   }
+}
+
+static int stbi_write_hdr_core(stbi__write_context *s, int x, int y, int comp, float *data)
+{
+   if (y <= 0 || x <= 0 || data == NULL)
+      return 0;
+   else {
+      // Each component is stored separately. Allocate scratch space for full output scanline.
+      unsigned char *scratch = (unsigned char *) STBIW_MALLOC(x*4);
+      int i, len;
+      char buffer[128];
+      char header[] = "#?RADIANCE\n# Written by stb_image_write.h\nFORMAT=32-bit_rle_rgbe\n";
+      if (!scratch) return 0; // allocation failed; nothing has been written yet
+      s->func(s->context, header, sizeof(header)-1);
+#ifdef __STDC_LIB_EXT1__
+      len = sprintf_s(buffer, sizeof(buffer), "EXPOSURE=          1.0000000000000\n\n-Y %d +X %d\n", y, x);
+#else
+      len = sprintf(buffer, "EXPOSURE=          1.0000000000000\n\n-Y %d +X %d\n", y, x);
+#endif
+      s->func(s->context, buffer, len);
+
+      for(i=0; i < y; i++)
+         stbiw__write_hdr_scanline(s, x, comp, scratch, data + comp*x*(stbi__flip_vertically_on_write ? y-1-i : i));
+      STBIW_FREE(scratch);
+      return 1;
+   }
+}
+
+STBIWDEF int stbi_write_hdr_to_func(stbi_write_func *func, void *context, int x, int y, int comp, const float *data)
+{
+   stbi__write_context s = { 0 };
+   stbi__start_write_callbacks(&s, func, context);
+   return stbi_write_hdr_core(&s, x, y, comp, (float *) data);
+}
+
+STBIWDEF int stbi_write_hdr(char const *filename, int x, int y, int comp, const float *data)
+{
+   stbi__write_context s = { 0 };
+   if (stbi__start_write_file(&s,filename)) {
+      int r = stbi_write_hdr_core(&s, x, y, comp, (float *) data);
+      stbi__end_write_file(&s);
+      return r;
+   } else
+      return 0;
+}
+#endif // STBI_WRITE_NO_STDIO
+
+
+//////////////////////////////////////////////////////////////////////////////
+//
+// PNG writer
+//
+
+#ifndef STBIW_ZLIB_COMPRESS
+// stretchy buffer; stbiw__sbpush() == vector<>::push_back() -- stbiw__sbcount() == vector<>::size()
+#define stbiw__sbraw(a) ((int *) (void *) (a) - 2)
+#define stbiw__sbm(a)   stbiw__sbraw(a)[0]
+#define stbiw__sbn(a)   stbiw__sbraw(a)[1]
+
+#define stbiw__sbneedgrow(a,n)  ((a)==0 || stbiw__sbn(a)+(n) >= stbiw__sbm(a))
+#define stbiw__sbmaybegrow(a,n) (stbiw__sbneedgrow(a,(n)) ? stbiw__sbgrow(a,n) : 0)
+#define stbiw__sbgrow(a,n)  stbiw__sbgrowf((void **) &(a), (n), sizeof(*(a)))
+
+#define stbiw__sbpush(a, v)      (stbiw__sbmaybegrow(a,1), (a)[stbiw__sbn(a)++] = (v))
+#define stbiw__sbcount(a)        ((a) ? stbiw__sbn(a) : 0)
+#define stbiw__sbfree(a)         ((a) ? STBIW_FREE(stbiw__sbraw(a)),0 : 0)
+
+static void *stbiw__sbgrowf(void **arr, int increment, int itemsize)
+{
+   int m = *arr ? 2*stbiw__sbm(*arr)+increment : increment+1;
+   void *p = STBIW_REALLOC_SIZED(*arr ? stbiw__sbraw(*arr) : 0, *arr ? (stbiw__sbm(*arr)*itemsize + sizeof(int)*2) : 0, itemsize * m + sizeof(int)*2);
+   STBIW_ASSERT(p);
+   if (p) {
+      if (!*arr) ((int *) p)[1] = 0;
+      *arr = (void *) ((int *) p + 2);
+      stbiw__sbm(*arr) = m;
+   }
+   return *arr;
+}
+
+static unsigned char *stbiw__zlib_flushf(unsigned char *data, unsigned int *bitbuffer, int *bitcount)
+{
+   while (*bitcount >= 8) {
+      stbiw__sbpush(data, STBIW_UCHAR(*bitbuffer));
+      *bitbuffer >>= 8;
+      *bitcount -= 8;
+   }
+   return data;
+}
+
+static int stbiw__zlib_bitrev(int code, int codebits)
+{
+   int res=0;
+   while (codebits--) {
+      res = (res << 1) | (code & 1);
+      code >>= 1;
+   }
+   return res;
+}
+
+static unsigned int stbiw__zlib_countm(unsigned char *a, unsigned char *b, int limit)
+{
+   int i;
+   for (i=0; i < limit && i < 258; ++i)
+      if (a[i] != b[i]) break;
+   return i;
+}
+
+static unsigned int stbiw__zhash(unsigned char *data)
+{
+   stbiw_uint32 hash = data[0] + (data[1] << 8) + (data[2] << 16);
+   hash ^= hash << 3;
+   hash += hash >> 5;
+   hash ^= hash << 4;
+   hash += hash >> 17;
+   hash ^= hash << 25;
+   hash += hash >> 6;
+   return hash;
+}
+
+#define stbiw__zlib_flush() (out = stbiw__zlib_flushf(out, &bitbuf, &bitcount))
+#define stbiw__zlib_add(code,codebits) \
+      (bitbuf |= (code) << bitcount, bitcount += (codebits), stbiw__zlib_flush())
+#define stbiw__zlib_huffa(b,c)  stbiw__zlib_add(stbiw__zlib_bitrev(b,c),c)
+// default huffman tables
+#define stbiw__zlib_huff1(n)  stbiw__zlib_huffa(0x30 + (n), 8)
+#define stbiw__zlib_huff2(n)  stbiw__zlib_huffa(0x190 + (n)-144, 9)
+#define stbiw__zlib_huff3(n)  stbiw__zlib_huffa(0 + (n)-256,7)
+#define stbiw__zlib_huff4(n)  stbiw__zlib_huffa(0xc0 + (n)-280,8)
+#define stbiw__zlib_huff(n)  ((n) <= 143 ? stbiw__zlib_huff1(n) : (n) <= 255 ? stbiw__zlib_huff2(n) : (n) <= 279 ? stbiw__zlib_huff3(n) : stbiw__zlib_huff4(n))
+#define stbiw__zlib_huffb(n) ((n) <= 143 ? stbiw__zlib_huff1(n) : stbiw__zlib_huff2(n))
+
+#define stbiw__ZHASH   16384
+
+#endif // STBIW_ZLIB_COMPRESS
+
+STBIWDEF unsigned char * stbi_zlib_compress(unsigned char *data, int data_len, int *out_len, int quality)
+{
+#ifdef STBIW_ZLIB_COMPRESS
+   // user provided a zlib compress implementation, use that
+   return STBIW_ZLIB_COMPRESS(data, data_len, out_len, quality);
+#else // use builtin
+   static unsigned short lengthc[] = { 3,4,5,6,7,8,9,10,11,13,15,17,19,23,27,31,35,43,51,59,67,83,99,115,131,163,195,227,258, 259 };
+   static unsigned char  lengtheb[]= { 0,0,0,0,0,0,0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4,  4,  5,  5,  5,  5,  0 };
+   static unsigned short distc[]   = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577, 32768 };
+   static unsigned char  disteb[]  = { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13 };
+   unsigned int bitbuf=0;
+   int i,j, bitcount=0;
+   unsigned char *out = NULL;
+   unsigned char ***hash_table = (unsigned char***) STBIW_MALLOC(stbiw__ZHASH * sizeof(unsigned char**));
+   if (hash_table == NULL)
+      return NULL;
+   if (quality < 5) quality = 5;
+
+   stbiw__sbpush(out, 0x78);   // DEFLATE 32K window
+   stbiw__sbpush(out, 0x5e);   // FLEVEL = 1
+   stbiw__zlib_add(1,1);  // BFINAL = 1
+   stbiw__zlib_add(1,2);  // BTYPE = 1 -- fixed huffman
+
+   for (i=0; i < stbiw__ZHASH; ++i)
+      hash_table[i] = NULL;
+
+   i=0;
+   while (i < data_len-3) {
+      // hash next 3 bytes of data to be compressed
+      int h = stbiw__zhash(data+i)&(stbiw__ZHASH-1), best=3;
+      unsigned char *bestloc = 0;
+      unsigned char **hlist = hash_table[h];
+      int n = stbiw__sbcount(hlist);
+      for (j=0; j < n; ++j) {
+         if (hlist[j]-data > i-32768) { // if entry lies within window
+            int d = stbiw__zlib_countm(hlist[j], data+i, data_len-i);
+            if (d >= best) { best=d; bestloc=hlist[j]; }
+         }
+      }
+      // when hash table entry is too long, delete half the entries
+      if (hash_table[h] && stbiw__sbn(hash_table[h]) == 2*quality) {
+         STBIW_MEMMOVE(hash_table[h], hash_table[h]+quality, sizeof(hash_table[h][0])*quality);
+         stbiw__sbn(hash_table[h]) = quality;
+      }
+      stbiw__sbpush(hash_table[h],data+i);
+
+      if (bestloc) {
+         // "lazy matching" - check match at *next* byte, and if it's better, do cur byte as literal
+         h = stbiw__zhash(data+i+1)&(stbiw__ZHASH-1);
+         hlist = hash_table[h];
+         n = stbiw__sbcount(hlist);
+         for (j=0; j < n; ++j) {
+            if (hlist[j]-data > i-32767) {
+               int e = stbiw__zlib_countm(hlist[j], data+i+1, data_len-i-1);
+               if (e > best) { // if next match is better, bail on current match
+                  bestloc = NULL;
+                  break;
+               }
+            }
+         }
+      }
+
+      if (bestloc) {
+         int d = (int) (data+i - bestloc); // distance back
+         STBIW_ASSERT(d <= 32767 && best <= 258);
+         for (j=0; best > lengthc[j+1]-1; ++j);
+         stbiw__zlib_huff(j+257);
+         if (lengtheb[j]) stbiw__zlib_add(best - lengthc[j], lengtheb[j]);
+         for (j=0; d > distc[j+1]-1; ++j);
+         stbiw__zlib_add(stbiw__zlib_bitrev(j,5),5);
+         if (disteb[j]) stbiw__zlib_add(d - distc[j], disteb[j]);
+         i += best;
+      } else {
+         stbiw__zlib_huffb(data[i]);
+         ++i;
+      }
+   }
+   // write out final bytes
+   for (;i < data_len; ++i)
+      stbiw__zlib_huffb(data[i]);
+   stbiw__zlib_huff(256); // end of block
+   // pad with 0 bits to byte boundary
+   while (bitcount)
+      stbiw__zlib_add(0,1);
+
+   for (i=0; i < stbiw__ZHASH; ++i)
+      (void) stbiw__sbfree(hash_table[i]);
+   STBIW_FREE(hash_table);
+
+   // store uncompressed instead if compression was worse
+   if (stbiw__sbn(out) > data_len + 2 + ((data_len+32766)/32767)*5) {
+      stbiw__sbn(out) = 2;  // truncate to DEFLATE 32K window and FLEVEL = 1
+      for (j = 0; j < data_len;) {
+         int blocklen = data_len - j;
+         if (blocklen > 32767) blocklen = 32767;
+         stbiw__sbpush(out, data_len - j == blocklen); // BFINAL = ?, BTYPE = 0 -- no compression
+         stbiw__sbpush(out, STBIW_UCHAR(blocklen)); // LEN
+         stbiw__sbpush(out, STBIW_UCHAR(blocklen >> 8));
+         stbiw__sbpush(out, STBIW_UCHAR(~blocklen)); // NLEN
+         stbiw__sbpush(out, STBIW_UCHAR(~blocklen >> 8));
+         memcpy(out+stbiw__sbn(out), data+j, blocklen);
+         stbiw__sbn(out) += blocklen;
+         j += blocklen;
+      }
+   }
+
+   {
+      // compute adler32 on input
+      unsigned int s1=1, s2=0;
+      int blocklen = (int) (data_len % 5552);
+      j=0;
+      while (j < data_len) {
+         for (i=0; i < blocklen; ++i) { s1 += data[j+i]; s2 += s1; }
+         s1 %= 65521; s2 %= 65521;
+         j += blocklen;
+         blocklen = 5552;
+      }
+      stbiw__sbpush(out, STBIW_UCHAR(s2 >> 8));
+      stbiw__sbpush(out, STBIW_UCHAR(s2));
+      stbiw__sbpush(out, STBIW_UCHAR(s1 >> 8));
+      stbiw__sbpush(out, STBIW_UCHAR(s1));
+   }
+   *out_len = stbiw__sbn(out);
+   // make returned pointer freeable
+   STBIW_MEMMOVE(stbiw__sbraw(out), out, *out_len);
+   return (unsigned char *) stbiw__sbraw(out);
+#endif // STBIW_ZLIB_COMPRESS
+}
+
+static unsigned int stbiw__crc32(unsigned char *buffer, int len)
+{
+#ifdef STBIW_CRC32
+    return STBIW_CRC32(buffer, len);
+#else
+   static unsigned int crc_table[256] =
+   {
+      0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA, 0x076DC419, 0x706AF48F, 0xE963A535, 0x9E6495A3,
+      0x0eDB8832, 0x79DCB8A4, 0xE0D5E91E, 0x97D2D988, 0x09B64C2B, 0x7EB17CBD, 0xE7B82D07, 0x90BF1D91,
+      0x1DB71064, 0x6AB020F2, 0xF3B97148, 0x84BE41DE, 0x1ADAD47D, 0x6DDDE4EB, 0xF4D4B551, 0x83D385C7,
+      0x136C9856, 0x646BA8C0, 0xFD62F97A, 0x8A65C9EC, 0x14015C4F, 0x63066CD9, 0xFA0F3D63, 0x8D080DF5,
+      0x3B6E20C8, 0x4C69105E, 0xD56041E4, 0xA2677172, 0x3C03E4D1, 0x4B04D447, 0xD20D85FD, 0xA50AB56B,
+      0x35B5A8FA, 0x42B2986C, 0xDBBBC9D6, 0xACBCF940, 0x32D86CE3, 0x45DF5C75, 0xDCD60DCF, 0xABD13D59,
+      0x26D930AC, 0x51DE003A, 0xC8D75180, 0xBFD06116, 0x21B4F4B5, 0x56B3C423, 0xCFBA9599, 0xB8BDA50F,
+      0x2802B89E, 0x5F058808, 0xC60CD9B2, 0xB10BE924, 0x2F6F7C87, 0x58684C11, 0xC1611DAB, 0xB6662D3D,
+      0x76DC4190, 0x01DB7106, 0x98D220BC, 0xEFD5102A, 0x71B18589, 0x06B6B51F, 0x9FBFE4A5, 0xE8B8D433,
+      0x7807C9A2, 0x0F00F934, 0x9609A88E, 0xE10E9818, 0x7F6A0DBB, 0x086D3D2D, 0x91646C97, 0xE6635C01,
+      0x6B6B51F4, 0x1C6C6162, 0x856530D8, 0xF262004E, 0x6C0695ED, 0x1B01A57B, 0x8208F4C1, 0xF50FC457,
+      0x65B0D9C6, 0x12B7E950, 0x8BBEB8EA, 0xFCB9887C, 0x62DD1DDF, 0x15DA2D49, 0x8CD37CF3, 0xFBD44C65,
+      0x4DB26158, 0x3AB551CE, 0xA3BC0074, 0xD4BB30E2, 0x4ADFA541, 0x3DD895D7, 0xA4D1C46D, 0xD3D6F4FB,
+      0x4369E96A, 0x346ED9FC, 0xAD678846, 0xDA60B8D0, 0x44042D73, 0x33031DE5, 0xAA0A4C5F, 0xDD0D7CC9,
+      0x5005713C, 0x270241AA, 0xBE0B1010, 0xC90C2086, 0x5768B525, 0x206F85B3, 0xB966D409, 0xCE61E49F,
+      0x5EDEF90E, 0x29D9C998, 0xB0D09822, 0xC7D7A8B4, 0x59B33D17, 0x2EB40D81, 0xB7BD5C3B, 0xC0BA6CAD,
+      0xEDB88320, 0x9ABFB3B6, 0x03B6E20C, 0x74B1D29A, 0xEAD54739, 0x9DD277AF, 0x04DB2615, 0x73DC1683,
+      0xE3630B12, 0x94643B84, 0x0D6D6A3E, 0x7A6A5AA8, 0xE40ECF0B, 0x9309FF9D, 0x0A00AE27, 0x7D079EB1,
+      0xF00F9344, 0x8708A3D2, 0x1E01F268, 0x6906C2FE, 0xF762575D, 0x806567CB, 0x196C3671, 0x6E6B06E7,
+      0xFED41B76, 0x89D32BE0, 0x10DA7A5A, 0x67DD4ACC, 0xF9B9DF6F, 0x8EBEEFF9, 0x17B7BE43, 0x60B08ED5,
+      0xD6D6A3E8, 0xA1D1937E, 0x38D8C2C4, 0x4FDFF252, 0xD1BB67F1, 0xA6BC5767, 0x3FB506DD, 0x48B2364B,
+      0xD80D2BDA, 0xAF0A1B4C, 0x36034AF6, 0x41047A60, 0xDF60EFC3, 0xA867DF55, 0x316E8EEF, 0x4669BE79,
+      0xCB61B38C, 0xBC66831A, 0x256FD2A0, 0x5268E236, 0xCC0C7795, 0xBB0B4703, 0x220216B9, 0x5505262F,
+      0xC5BA3BBE, 0xB2BD0B28, 0x2BB45A92, 0x5CB36A04, 0xC2D7FFA7, 0xB5D0CF31, 0x2CD99E8B, 0x5BDEAE1D,
+      0x9B64C2B0, 0xEC63F226, 0x756AA39C, 0x026D930A, 0x9C0906A9, 0xEB0E363F, 0x72076785, 0x05005713,
+      0x95BF4A82, 0xE2B87A14, 0x7BB12BAE, 0x0CB61B38, 0x92D28E9B, 0xE5D5BE0D, 0x7CDCEFB7, 0x0BDBDF21,
+      0x86D3D2D4, 0xF1D4E242, 0x68DDB3F8, 0x1FDA836E, 0x81BE16CD, 0xF6B9265B, 0x6FB077E1, 0x18B74777,
+      0x88085AE6, 0xFF0F6A70, 0x66063BCA, 0x11010B5C, 0x8F659EFF, 0xF862AE69, 0x616BFFD3, 0x166CCF45,
+      0xA00AE278, 0xD70DD2EE, 0x4E048354, 0x3903B3C2, 0xA7672661, 0xD06016F7, 0x4969474D, 0x3E6E77DB,
+      0xAED16A4A, 0xD9D65ADC, 0x40DF0B66, 0x37D83BF0, 0xA9BCAE53, 0xDEBB9EC5, 0x47B2CF7F, 0x30B5FFE9,
+      0xBDBDF21C, 0xCABAC28A, 0x53B39330, 0x24B4A3A6, 0xBAD03605, 0xCDD70693, 0x54DE5729, 0x23D967BF,
+      0xB3667A2E, 0xC4614AB8, 0x5D681B02, 0x2A6F2B94, 0xB40BBE37, 0xC30C8EA1, 0x5A05DF1B, 0x2D02EF8D
+   };
+
+   unsigned int crc = ~0u;
+   int i;
+   for (i=0; i < len; ++i)
+      crc = (crc >> 8) ^ crc_table[buffer[i] ^ (crc & 0xff)];
+   return ~crc;
+#endif
+}
+
+#define stbiw__wpng4(o,a,b,c,d) ((o)[0]=STBIW_UCHAR(a),(o)[1]=STBIW_UCHAR(b),(o)[2]=STBIW_UCHAR(c),(o)[3]=STBIW_UCHAR(d),(o)+=4)
+#define stbiw__wp32(data,v) stbiw__wpng4(data, (v)>>24,(v)>>16,(v)>>8,(v));
+#define stbiw__wptag(data,s) stbiw__wpng4(data, s[0],s[1],s[2],s[3])
+
+static void stbiw__wpcrc(unsigned char **data, int len)
+{
+   unsigned int crc = stbiw__crc32(*data - len - 4, len+4);
+   stbiw__wp32(*data, crc);
+}
+
+static unsigned char stbiw__paeth(int a, int b, int c)
+{
+   int p = a + b - c, pa = abs(p-a), pb = abs(p-b), pc = abs(p-c);
+   if (pa <= pb && pa <= pc) return STBIW_UCHAR(a);
+   if (pb <= pc) return STBIW_UCHAR(b);
+   return STBIW_UCHAR(c);
+}
+
+// @OPTIMIZE: provide an option that always forces left-predict or paeth predict
+static void stbiw__encode_png_line(unsigned char *pixels, int stride_bytes, int width, int height, int y, int n, int filter_type, signed char *line_buffer)
+{
+   static int mapping[] = { 0,1,2,3,4 };
+   static int firstmap[] = { 0,1,0,5,6 };
+   int *mymap = (y != 0) ? mapping : firstmap;
+   int i;
+   int type = mymap[filter_type];
+   unsigned char *z = pixels + stride_bytes * (stbi__flip_vertically_on_write ? height-1-y : y);
+   int signed_stride = stbi__flip_vertically_on_write ? -stride_bytes : stride_bytes;
+
+   if (type==0) {
+      memcpy(line_buffer, z, width*n);
+      return;
+   }
+
+   // first loop isn't optimized since it's just one pixel
+   for (i = 0; i < n; ++i) {
+      switch (type) {
+         case 1: line_buffer[i] = z[i]; break;
+         case 2: line_buffer[i] = z[i] - z[i-signed_stride]; break;
+         case 3: line_buffer[i] = z[i] - (z[i-signed_stride]>>1); break;
+         case 4: line_buffer[i] = (signed char) (z[i] - stbiw__paeth(0,z[i-signed_stride],0)); break;
+         case 5: line_buffer[i] = z[i]; break;
+         case 6: line_buffer[i] = z[i]; break;
+      }
+   }
+   switch (type) {
+      case 1: for (i=n; i < width*n; ++i) line_buffer[i] = z[i] - z[i-n]; break;
+      case 2: for (i=n; i < width*n; ++i) line_buffer[i] = z[i] - z[i-signed_stride]; break;
+      case 3: for (i=n; i < width*n; ++i) line_buffer[i] = z[i] - ((z[i-n] + z[i-signed_stride])>>1); break;
+      case 4: for (i=n; i < width*n; ++i) line_buffer[i] = z[i] - stbiw__paeth(z[i-n], z[i-signed_stride], z[i-signed_stride-n]); break;
+      case 5: for (i=n; i < width*n; ++i) line_buffer[i] = z[i] - (z[i-n]>>1); break;
+      case 6: for (i=n; i < width*n; ++i) line_buffer[i] = z[i] - stbiw__paeth(z[i-n], 0,0); break;
+   }
+}
+
+STBIWDEF unsigned char *stbi_write_png_to_mem(const unsigned char *pixels, int stride_bytes, int x, int y, int n, int *out_len)
+{
+   int force_filter = stbi_write_force_png_filter;
+   int ctype[5] = { -1, 0, 4, 2, 6 };
+   unsigned char sig[8] = { 137,80,78,71,13,10,26,10 };
+   unsigned char *out,*o, *filt, *zlib;
+   signed char *line_buffer;
+   int j,zlen;
+
+   if (stride_bytes == 0)
+      stride_bytes = x * n;
+
+   if (force_filter >= 5) {
+      force_filter = -1;
+   }
+
+   filt = (unsigned char *) STBIW_MALLOC((x*n+1) * y); if (!filt) return 0;
+   line_buffer = (signed char *) STBIW_MALLOC(x * n); if (!line_buffer) { STBIW_FREE(filt); return 0; }
+   for (j=0; j < y; ++j) {
+      int filter_type;
+      if (force_filter > -1) {
+         filter_type = force_filter;
+         stbiw__encode_png_line((unsigned char*)(pixels), stride_bytes, x, y, j, n, force_filter, line_buffer);
+      } else { // Estimate the best filter by running through all of them:
+         int best_filter = 0, best_filter_val = 0x7fffffff, est, i;
+         for (filter_type = 0; filter_type < 5; filter_type++) {
+            stbiw__encode_png_line((unsigned char*)(pixels), stride_bytes, x, y, j, n, filter_type, line_buffer);
+
+            // Estimate the entropy of the line using this filter; the less, the better.
+            est = 0;
+            for (i = 0; i < x*n; ++i) {
+               est += abs((signed char) line_buffer[i]);
+            }
+            if (est < best_filter_val) {
+               best_filter_val = est;
+               best_filter = filter_type;
+            }
+         }
+         if (filter_type != best_filter) {  // If the last iteration already got us the best filter, don't redo it
+            stbiw__encode_png_line((unsigned char*)(pixels), stride_bytes, x, y, j, n, best_filter, line_buffer);
+            filter_type = best_filter;
+         }
+      }
+      // when we get here, filter_type contains the filter type, and line_buffer contains the data
+      filt[j*(x*n+1)] = (unsigned char) filter_type;
+      STBIW_MEMMOVE(filt+j*(x*n+1)+1, line_buffer, x*n);
+   }
+   STBIW_FREE(line_buffer);
+   zlib = stbi_zlib_compress(filt, y*( x*n+1), &zlen, stbi_write_png_compression_level);
+   STBIW_FREE(filt);
+   if (!zlib) return 0;
+
+   // each tag requires 12 bytes of overhead
+   out = (unsigned char *) STBIW_MALLOC(8 + 12+13 + 12+zlen + 12);
+   if (!out) { STBIW_FREE(zlib); return 0; }
+   *out_len = 8 + 12+13 + 12+zlen + 12;
+
+   o=out;
+   STBIW_MEMMOVE(o,sig,8); o+= 8;
+   stbiw__wp32(o, 13); // header length
+   stbiw__wptag(o, "IHDR");
+   stbiw__wp32(o, x);
+   stbiw__wp32(o, y);
+   *o++ = 8;
+   *o++ = STBIW_UCHAR(ctype[n]);
+   *o++ = 0;
+   *o++ = 0;
+   *o++ = 0;
+   stbiw__wpcrc(&o,13);
+
+   stbiw__wp32(o, zlen);
+   stbiw__wptag(o, "IDAT");
+   STBIW_MEMMOVE(o, zlib, zlen);
+   o += zlen;
+   STBIW_FREE(zlib);
+   stbiw__wpcrc(&o, zlen);
+
+   stbiw__wp32(o,0);
+   stbiw__wptag(o, "IEND");
+   stbiw__wpcrc(&o,0);
+
+   STBIW_ASSERT(o == out + *out_len);
+
+   return out;
+}
+
+#ifndef STBI_WRITE_NO_STDIO
+STBIWDEF int stbi_write_png(char const *filename, int x, int y, int comp, const void *data, int stride_bytes)
+{
+   FILE *f;
+   int len;
+   unsigned char *png = stbi_write_png_to_mem((const unsigned char *) data, stride_bytes, x, y, comp, &len);
+   if (png == NULL) return 0;
+
+   f = stbiw__fopen(filename, "wb");
+   if (!f) { STBIW_FREE(png); return 0; }
+   fwrite(png, 1, len, f);
+   fclose(f);
+   STBIW_FREE(png);
+   return 1;
+}
+#endif
+
+STBIWDEF int stbi_write_png_to_func(stbi_write_func *func, void *context, int x, int y, int comp, const void *data, int stride_bytes)
+{
+   int len;
+   unsigned char *png = stbi_write_png_to_mem((const unsigned char *) data, stride_bytes, x, y, comp, &len);
+   if (png == NULL) return 0;
+   func(context, png, len);
+   STBIW_FREE(png);
+   return 1;
+}
+
+
+/* ***************************************************************************
+ *
+ * JPEG writer
+ *
+ * This is based on Jon Olick's jo_jpeg.cpp:
+ * public domain Simple, Minimalistic JPEG writer - http://www.jonolick.com/code.html
+ */
+
+static const unsigned char stbiw__jpg_ZigZag[] = { 0,1,5,6,14,15,27,28,2,4,7,13,16,26,29,42,3,8,12,17,25,30,41,43,9,11,18,
+      24,31,40,44,53,10,19,23,32,39,45,52,54,20,22,33,38,46,51,55,60,21,34,37,47,50,56,59,61,35,36,48,49,57,58,62,63 };
+
+static void stbiw__jpg_writeBits(stbi__write_context *s, int *bitBufP, int *bitCntP, const unsigned short *bs) {
+   int bitBuf = *bitBufP, bitCnt = *bitCntP;
+   bitCnt += bs[1];
+   bitBuf |= bs[0] << (24 - bitCnt);
+   while(bitCnt >= 8) {
+      unsigned char c = (bitBuf >> 16) & 255;
+      stbiw__putc(s, c);
+      if(c == 255) {
+         stbiw__putc(s, 0);
+      }
+      bitBuf <<= 8;
+      bitCnt -= 8;
+   }
+   *bitBufP = bitBuf;
+   *bitCntP = bitCnt;
+}
+
+static void stbiw__jpg_DCT(float *d0p, float *d1p, float *d2p, float *d3p, float *d4p, float *d5p, float *d6p, float *d7p) {
+   float d0 = *d0p, d1 = *d1p, d2 = *d2p, d3 = *d3p, d4 = *d4p, d5 = *d5p, d6 = *d6p, d7 = *d7p;
+   float z1, z2, z3, z4, z5, z11, z13;
+
+   float tmp0 = d0 + d7;
+   float tmp7 = d0 - d7;
+   float tmp1 = d1 + d6;
+   float tmp6 = d1 - d6;
+   float tmp2 = d2 + d5;
+   float tmp5 = d2 - d5;
+   float tmp3 = d3 + d4;
+   float tmp4 = d3 - d4;
+
+   // Even part
+   float tmp10 = tmp0 + tmp3;   // phase 2
+   float tmp13 = tmp0 - tmp3;
+   float tmp11 = tmp1 + tmp2;
+   float tmp12 = tmp1 - tmp2;
+
+   d0 = tmp10 + tmp11;       // phase 3
+   d4 = tmp10 - tmp11;
+
+   z1 = (tmp12 + tmp13) * 0.707106781f; // c4
+   d2 = tmp13 + z1;       // phase 5
+   d6 = tmp13 - z1;
+
+   // Odd part
+   tmp10 = tmp4 + tmp5;       // phase 2
+   tmp11 = tmp5 + tmp6;
+   tmp12 = tmp6 + tmp7;
+
+   // The rotator is modified from fig 4-8 to avoid extra negations.
+   z5 = (tmp10 - tmp12) * 0.382683433f; // c6
+   z2 = tmp10 * 0.541196100f + z5; // c2-c6
+   z4 = tmp12 * 1.306562965f + z5; // c2+c6
+   z3 = tmp11 * 0.707106781f; // c4
+
+   z11 = tmp7 + z3;      // phase 5
+   z13 = tmp7 - z3;
+
+   *d5p = z13 + z2;         // phase 6
+   *d3p = z13 - z2;
+   *d1p = z11 + z4;
+   *d7p = z11 - z4;
+
+   *d0p = d0;  *d2p = d2;  *d4p = d4;  *d6p = d6;
+}
+
+static void stbiw__jpg_calcBits(int val, unsigned short bits[2]) {
+   int tmp1 = val < 0 ? -val : val;
+   val = val < 0 ? val-1 : val;
+   bits[1] = 1;
+   while(tmp1 >>= 1) {
+      ++bits[1];
+   }
+   bits[0] = val & ((1<<bits[1])-1);
+}
+
+static int stbiw__jpg_processDU(stbi__write_context *s, int *bitBuf, int *bitCnt, float *CDU, int du_stride, float *fdtbl, int DC, const unsigned short HTDC[256][2], const unsigned short HTAC[256][2]) {
+   const unsigned short EOB[2] = { HTAC[0x00][0], HTAC[0x00][1] };
+   const unsigned short M16zeroes[2] = { HTAC[0xF0][0], HTAC[0xF0][1] };
+   int dataOff, i, j, n, diff, end0pos, x, y;
+   int DU[64];
+
+   // DCT rows
+   for(dataOff=0, n=du_stride*8; dataOff<n; dataOff+=du_stride) {
+      stbiw__jpg_DCT(&CDU[dataOff], &CDU[dataOff+1], &CDU[dataOff+2], &CDU[dataOff+3], &CDU[dataOff+4], &CDU[dataOff+5], &CDU[dataOff+6], &CDU[dataOff+7]);
+   }
+   // DCT columns
+   for(dataOff=0; dataOff<8; ++dataOff) {
+      stbiw__jpg_DCT(&CDU[dataOff], &CDU[dataOff+du_stride], &CDU[dataOff+du_stride*2], &CDU[dataOff+du_stride*3], &CDU[dataOff+du_stride*4],
+                     &CDU[dataOff+du_stride*5], &CDU[dataOff+du_stride*6], &CDU[dataOff+du_stride*7]);
+   }
+   // Quantize/descale/zigzag the coefficients
+   for(y = 0, j=0; y < 8; ++y) {
+      for(x = 0; x < 8; ++x,++j) {
+         float v;
+         i = y*du_stride+x;
+         v = CDU[i]*fdtbl[j];
+         // DU[stbiw__jpg_ZigZag[j]] = (int)(v < 0 ? ceilf(v - 0.5f) : floorf(v + 0.5f));
+         // ceilf() and floorf() are C99, not C89, but I /think/ they're not needed here anyway?
+         DU[stbiw__jpg_ZigZag[j]] = (int)(v < 0 ? v - 0.5f : v + 0.5f);
+      }
+   }
+
+   // Encode DC
+   diff = DU[0] - DC;
+   if (diff == 0) {
+      stbiw__jpg_writeBits(s, bitBuf, bitCnt, HTDC[0]);
+   } else {
+      unsigned short bits[2];
+      stbiw__jpg_calcBits(diff, bits);
+      stbiw__jpg_writeBits(s, bitBuf, bitCnt, HTDC[bits[1]]);
+      stbiw__jpg_writeBits(s, bitBuf, bitCnt, bits);
+   }
+   // Encode ACs
+   end0pos = 63;
+   for(; (end0pos>0)&&(DU[end0pos]==0); --end0pos) {
+   }
+   // end0pos = first element in reverse order !=0
+   if(end0pos == 0) {
+      stbiw__jpg_writeBits(s, bitBuf, bitCnt, EOB);
+      return DU[0];
+   }
+   for(i = 1; i <= end0pos; ++i) {
+      int startpos = i;
+      int nrzeroes;
+      unsigned short bits[2];
+      for (; i<=end0pos && DU[i]==0; ++i) {
+      }
+      nrzeroes = i-startpos;
+      if ( nrzeroes >= 16 ) {
+         int lng = nrzeroes>>4;
+         int nrmarker;
+         for (nrmarker=1; nrmarker <= lng; ++nrmarker)
+            stbiw__jpg_writeBits(s, bitBuf, bitCnt, M16zeroes);
+         nrzeroes &= 15;
+      }
+      stbiw__jpg_calcBits(DU[i], bits);
+      stbiw__jpg_writeBits(s, bitBuf, bitCnt, HTAC[(nrzeroes<<4)+bits[1]]);
+      stbiw__jpg_writeBits(s, bitBuf, bitCnt, bits);
+   }
+   if(end0pos != 63) {
+      stbiw__jpg_writeBits(s, bitBuf, bitCnt, EOB);
+   }
+   return DU[0];
+}
+
+static int stbi_write_jpg_core(stbi__write_context *s, int width, int height, int comp, const void* data, int quality) {
+   // Constants that don't pollute global namespace
+   static const unsigned char std_dc_luminance_nrcodes[] = {0,0,1,5,1,1,1,1,1,1,0,0,0,0,0,0,0};
+   static const unsigned char std_dc_luminance_values[] = {0,1,2,3,4,5,6,7,8,9,10,11};
+   static const unsigned char std_ac_luminance_nrcodes[] = {0,0,2,1,3,3,2,4,3,5,5,4,4,0,0,1,0x7d};
+   static const unsigned char std_ac_luminance_values[] = {
+      0x01,0x02,0x03,0x00,0x04,0x11,0x05,0x12,0x21,0x31,0x41,0x06,0x13,0x51,0x61,0x07,0x22,0x71,0x14,0x32,0x81,0x91,0xa1,0x08,
+      0x23,0x42,0xb1,0xc1,0x15,0x52,0xd1,0xf0,0x24,0x33,0x62,0x72,0x82,0x09,0x0a,0x16,0x17,0x18,0x19,0x1a,0x25,0x26,0x27,0x28,
+      0x29,0x2a,0x34,0x35,0x36,0x37,0x38,0x39,0x3a,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4a,0x53,0x54,0x55,0x56,0x57,0x58,0x59,
+      0x5a,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6a,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0x83,0x84,0x85,0x86,0x87,0x88,0x89,
+      0x8a,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,0x9a,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xaa,0xb2,0xb3,0xb4,0xb5,0xb6,
+      0xb7,0xb8,0xb9,0xba,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xca,0xd2,0xd3,0xd4,0xd5,0xd6,0xd7,0xd8,0xd9,0xda,0xe1,0xe2,
+      0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0xea,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0xfa
+   };
+   static const unsigned char std_dc_chrominance_nrcodes[] = {0,0,3,1,1,1,1,1,1,1,1,1,0,0,0,0,0};
+   static const unsigned char std_dc_chrominance_values[] = {0,1,2,3,4,5,6,7,8,9,10,11};
+   static const unsigned char std_ac_chrominance_nrcodes[] = {0,0,2,1,2,4,4,3,4,7,5,4,4,0,1,2,0x77};
+   static const unsigned char std_ac_chrominance_values[] = {
+      0x00,0x01,0x02,0x03,0x11,0x04,0x05,0x21,0x31,0x06,0x12,0x41,0x51,0x07,0x61,0x71,0x13,0x22,0x32,0x81,0x08,0x14,0x42,0x91,
+      0xa1,0xb1,0xc1,0x09,0x23,0x33,0x52,0xf0,0x15,0x62,0x72,0xd1,0x0a,0x16,0x24,0x34,0xe1,0x25,0xf1,0x17,0x18,0x19,0x1a,0x26,
+      0x27,0x28,0x29,0x2a,0x35,0x36,0x37,0x38,0x39,0x3a,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4a,0x53,0x54,0x55,0x56,0x57,0x58,
+      0x59,0x5a,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6a,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0x82,0x83,0x84,0x85,0x86,0x87,
+      0x88,0x89,0x8a,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,0x9a,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xaa,0xb2,0xb3,0xb4,
+      0xb5,0xb6,0xb7,0xb8,0xb9,0xba,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xca,0xd2,0xd3,0xd4,0xd5,0xd6,0xd7,0xd8,0xd9,0xda,
+      0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0xea,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0xfa
+   };
+   // Huffman tables
+   static const unsigned short YDC_HT[256][2] = { {0,2},{2,3},{3,3},{4,3},{5,3},{6,3},{14,4},{30,5},{62,6},{126,7},{254,8},{510,9}};
+   static const unsigned short UVDC_HT[256][2] = { {0,2},{1,2},{2,2},{6,3},{14,4},{30,5},{62,6},{126,7},{254,8},{510,9},{1022,10},{2046,11}};
+   static const unsigned short YAC_HT[256][2] = {
+      {10,4},{0,2},{1,2},{4,3},{11,4},{26,5},{120,7},{248,8},{1014,10},{65410,16},{65411,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {12,4},{27,5},{121,7},{502,9},{2038,11},{65412,16},{65413,16},{65414,16},{65415,16},{65416,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {28,5},{249,8},{1015,10},{4084,12},{65417,16},{65418,16},{65419,16},{65420,16},{65421,16},{65422,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {58,6},{503,9},{4085,12},{65423,16},{65424,16},{65425,16},{65426,16},{65427,16},{65428,16},{65429,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {59,6},{1016,10},{65430,16},{65431,16},{65432,16},{65433,16},{65434,16},{65435,16},{65436,16},{65437,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {122,7},{2039,11},{65438,16},{65439,16},{65440,16},{65441,16},{65442,16},{65443,16},{65444,16},{65445,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {123,7},{4086,12},{65446,16},{65447,16},{65448,16},{65449,16},{65450,16},{65451,16},{65452,16},{65453,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {250,8},{4087,12},{65454,16},{65455,16},{65456,16},{65457,16},{65458,16},{65459,16},{65460,16},{65461,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {504,9},{32704,15},{65462,16},{65463,16},{65464,16},{65465,16},{65466,16},{65467,16},{65468,16},{65469,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {505,9},{65470,16},{65471,16},{65472,16},{65473,16},{65474,16},{65475,16},{65476,16},{65477,16},{65478,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {506,9},{65479,16},{65480,16},{65481,16},{65482,16},{65483,16},{65484,16},{65485,16},{65486,16},{65487,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {1017,10},{65488,16},{65489,16},{65490,16},{65491,16},{65492,16},{65493,16},{65494,16},{65495,16},{65496,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {1018,10},{65497,16},{65498,16},{65499,16},{65500,16},{65501,16},{65502,16},{65503,16},{65504,16},{65505,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {2040,11},{65506,16},{65507,16},{65508,16},{65509,16},{65510,16},{65511,16},{65512,16},{65513,16},{65514,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {65515,16},{65516,16},{65517,16},{65518,16},{65519,16},{65520,16},{65521,16},{65522,16},{65523,16},{65524,16},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {2041,11},{65525,16},{65526,16},{65527,16},{65528,16},{65529,16},{65530,16},{65531,16},{65532,16},{65533,16},{65534,16},{0,0},{0,0},{0,0},{0,0},{0,0}
+   };
+   static const unsigned short UVAC_HT[256][2] = {
+      {0,2},{1,2},{4,3},{10,4},{24,5},{25,5},{56,6},{120,7},{500,9},{1014,10},{4084,12},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {11,4},{57,6},{246,8},{501,9},{2038,11},{4085,12},{65416,16},{65417,16},{65418,16},{65419,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {26,5},{247,8},{1015,10},{4086,12},{32706,15},{65420,16},{65421,16},{65422,16},{65423,16},{65424,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {27,5},{248,8},{1016,10},{4087,12},{65425,16},{65426,16},{65427,16},{65428,16},{65429,16},{65430,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {58,6},{502,9},{65431,16},{65432,16},{65433,16},{65434,16},{65435,16},{65436,16},{65437,16},{65438,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {59,6},{1017,10},{65439,16},{65440,16},{65441,16},{65442,16},{65443,16},{65444,16},{65445,16},{65446,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {121,7},{2039,11},{65447,16},{65448,16},{65449,16},{65450,16},{65451,16},{65452,16},{65453,16},{65454,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {122,7},{2040,11},{65455,16},{65456,16},{65457,16},{65458,16},{65459,16},{65460,16},{65461,16},{65462,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {249,8},{65463,16},{65464,16},{65465,16},{65466,16},{65467,16},{65468,16},{65469,16},{65470,16},{65471,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {503,9},{65472,16},{65473,16},{65474,16},{65475,16},{65476,16},{65477,16},{65478,16},{65479,16},{65480,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {504,9},{65481,16},{65482,16},{65483,16},{65484,16},{65485,16},{65486,16},{65487,16},{65488,16},{65489,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {505,9},{65490,16},{65491,16},{65492,16},{65493,16},{65494,16},{65495,16},{65496,16},{65497,16},{65498,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {506,9},{65499,16},{65500,16},{65501,16},{65502,16},{65503,16},{65504,16},{65505,16},{65506,16},{65507,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {2041,11},{65508,16},{65509,16},{65510,16},{65511,16},{65512,16},{65513,16},{65514,16},{65515,16},{65516,16},{0,0},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {16352,14},{65517,16},{65518,16},{65519,16},{65520,16},{65521,16},{65522,16},{65523,16},{65524,16},{65525,16},{0,0},{0,0},{0,0},{0,0},{0,0},
+      {1018,10},{32707,15},{65526,16},{65527,16},{65528,16},{65529,16},{65530,16},{65531,16},{65532,16},{65533,16},{65534,16},{0,0},{0,0},{0,0},{0,0},{0,0}
+   };
+   static const int YQT[] = {16,11,10,16,24,40,51,61,12,12,14,19,26,58,60,55,14,13,16,24,40,57,69,56,14,17,22,29,51,87,80,62,18,22,
+                             37,56,68,109,103,77,24,35,55,64,81,104,113,92,49,64,78,87,103,121,120,101,72,92,95,98,112,100,103,99};
+   static const int UVQT[] = {17,18,24,47,99,99,99,99,18,21,26,66,99,99,99,99,24,26,56,99,99,99,99,99,47,66,99,99,99,99,99,99,
+                              99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99,99};
+   static const float aasf[] = { 1.0f * 2.828427125f, 1.387039845f * 2.828427125f, 1.306562965f * 2.828427125f, 1.175875602f * 2.828427125f,
+                                 1.0f * 2.828427125f, 0.785694958f * 2.828427125f, 0.541196100f * 2.828427125f, 0.275899379f * 2.828427125f };
+
+   int row, col, i, k, subsample;
+   float fdtbl_Y[64], fdtbl_UV[64];
+   unsigned char YTable[64], UVTable[64];
+
+   if(!data || !width || !height || comp > 4 || comp < 1) {
+      return 0;
+   }
+
+   quality = quality ? quality : 90;
+   subsample = quality <= 90 ? 1 : 0;
+   quality = quality < 1 ? 1 : quality > 100 ? 100 : quality;
+   quality = quality < 50 ? 5000 / quality : 200 - quality * 2;
+
+   for(i = 0; i < 64; ++i) {
+      int uvti, yti = (YQT[i]*quality+50)/100;
+      YTable[stbiw__jpg_ZigZag[i]] = (unsigned char) (yti < 1 ? 1 : yti > 255 ? 255 : yti);
+      uvti = (UVQT[i]*quality+50)/100;
+      UVTable[stbiw__jpg_ZigZag[i]] = (unsigned char) (uvti < 1 ? 1 : uvti > 255 ? 255 : uvti);
+   }
+
+   for(row = 0, k = 0; row < 8; ++row) {
+      for(col = 0; col < 8; ++col, ++k) {
+         fdtbl_Y[k]  = 1 / (YTable [stbiw__jpg_ZigZag[k]] * aasf[row] * aasf[col]);
+         fdtbl_UV[k] = 1 / (UVTable[stbiw__jpg_ZigZag[k]] * aasf[row] * aasf[col]);
+      }
+   }
+
+   // Write Headers
+   {
+      static const unsigned char head0[] = { 0xFF,0xD8,0xFF,0xE0,0,0x10,'J','F','I','F',0,1,1,0,0,1,0,1,0,0,0xFF,0xDB,0,0x84,0 };
+      static const unsigned char head2[] = { 0xFF,0xDA,0,0xC,3,1,0,2,0x11,3,0x11,0,0x3F,0 };
+      const unsigned char head1[] = { 0xFF,0xC0,0,0x11,8,(unsigned char)(height>>8),STBIW_UCHAR(height),(unsigned char)(width>>8),STBIW_UCHAR(width),
+                                      3,1,(unsigned char)(subsample?0x22:0x11),0,2,0x11,1,3,0x11,1,0xFF,0xC4,0x01,0xA2,0 };
+      s->func(s->context, (void*)head0, sizeof(head0));
+      s->func(s->context, (void*)YTable, sizeof(YTable));
+      stbiw__putc(s, 1);
+      s->func(s->context, UVTable, sizeof(UVTable));
+      s->func(s->context, (void*)head1, sizeof(head1));
+      s->func(s->context, (void*)(std_dc_luminance_nrcodes+1), sizeof(std_dc_luminance_nrcodes)-1);
+      s->func(s->context, (void*)std_dc_luminance_values, sizeof(std_dc_luminance_values));
+      stbiw__putc(s, 0x10); // HTYACinfo
+      s->func(s->context, (void*)(std_ac_luminance_nrcodes+1), sizeof(std_ac_luminance_nrcodes)-1);
+      s->func(s->context, (void*)std_ac_luminance_values, sizeof(std_ac_luminance_values));
+      stbiw__putc(s, 1); // HTUDCinfo
+      s->func(s->context, (void*)(std_dc_chrominance_nrcodes+1), sizeof(std_dc_chrominance_nrcodes)-1);
+      s->func(s->context, (void*)std_dc_chrominance_values, sizeof(std_dc_chrominance_values));
+      stbiw__putc(s, 0x11); // HTUACinfo
+      s->func(s->context, (void*)(std_ac_chrominance_nrcodes+1), sizeof(std_ac_chrominance_nrcodes)-1);
+      s->func(s->context, (void*)std_ac_chrominance_values, sizeof(std_ac_chrominance_values));
+      s->func(s->context, (void*)head2, sizeof(head2));
+   }
+
+   // Encode 8x8 macroblocks
+   {
+      static const unsigned short fillBits[] = {0x7F, 7};
+      int DCY=0, DCU=0, DCV=0;
+      int bitBuf=0, bitCnt=0;
+      // comp == 2 is grey+alpha (alpha is ignored)
+      int ofsG = comp > 2 ? 1 : 0, ofsB = comp > 2 ? 2 : 0;
+      const unsigned char *dataR = (const unsigned char *)data;
+      const unsigned char *dataG = dataR + ofsG;
+      const unsigned char *dataB = dataR + ofsB;
+      int x, y, pos;
+      if(subsample) {
+         for(y = 0; y < height; y += 16) {
+            for(x = 0; x < width; x += 16) {
+               float Y[256], U[256], V[256];
+               for(row = y, pos = 0; row < y+16; ++row) {
+                  // row >= height => use last input row
+                  int clamped_row = (row < height) ? row : height - 1;
+                  int base_p = (stbi__flip_vertically_on_write ? (height-1-clamped_row) : clamped_row)*width*comp;
+                  for(col = x; col < x+16; ++col, ++pos) {
+                     // if col >= width => use pixel from last input column
+                     int p = base_p + ((col < width) ? col : (width-1))*comp;
+                     float r = dataR[p], g = dataG[p], b = dataB[p];
+                     Y[pos]= +0.29900f*r + 0.58700f*g + 0.11400f*b - 128;
+                     U[pos]= -0.16874f*r - 0.33126f*g + 0.50000f*b;
+                     V[pos]= +0.50000f*r - 0.41869f*g - 0.08131f*b;
+                  }
+               }
+               DCY = stbiw__jpg_processDU(s, &bitBuf, &bitCnt, Y+0,   16, fdtbl_Y, DCY, YDC_HT, YAC_HT);
+               DCY = stbiw__jpg_processDU(s, &bitBuf, &bitCnt, Y+8,   16, fdtbl_Y, DCY, YDC_HT, YAC_HT);
+               DCY = stbiw__jpg_processDU(s, &bitBuf, &bitCnt, Y+128, 16, fdtbl_Y, DCY, YDC_HT, YAC_HT);
+               DCY = stbiw__jpg_processDU(s, &bitBuf, &bitCnt, Y+136, 16, fdtbl_Y, DCY, YDC_HT, YAC_HT);
+
+               // subsample U,V
+               {
+                  float subU[64], subV[64];
+                  int yy, xx;
+                  for(yy = 0, pos = 0; yy < 8; ++yy) {
+                     for(xx = 0; xx < 8; ++xx, ++pos) {
+                        int j = yy*32+xx*2;
+                        subU[pos] = (U[j+0] + U[j+1] + U[j+16] + U[j+17]) * 0.25f;
+                        subV[pos] = (V[j+0] + V[j+1] + V[j+16] + V[j+17]) * 0.25f;
+                     }
+                  }
+                  DCU = stbiw__jpg_processDU(s, &bitBuf, &bitCnt, subU, 8, fdtbl_UV, DCU, UVDC_HT, UVAC_HT);
+                  DCV = stbiw__jpg_processDU(s, &bitBuf, &bitCnt, subV, 8, fdtbl_UV, DCV, UVDC_HT, UVAC_HT);
+               }
+            }
+         }
+      } else {
+         for(y = 0; y < height; y += 8) {
+            for(x = 0; x < width; x += 8) {
+               float Y[64], U[64], V[64];
+               for(row = y, pos = 0; row < y+8; ++row) {
+                  // row >= height => use last input row
+                  int clamped_row = (row < height) ? row : height - 1;
+                  int base_p = (stbi__flip_vertically_on_write ? (height-1-clamped_row) : clamped_row)*width*comp;
+                  for(col = x; col < x+8; ++col, ++pos) {
+                     // if col >= width => use pixel from last input column
+                     int p = base_p + ((col < width) ? col : (width-1))*comp;
+                     float r = dataR[p], g = dataG[p], b = dataB[p];
+                     Y[pos]= +0.29900f*r + 0.58700f*g + 0.11400f*b - 128;
+                     U[pos]= -0.16874f*r - 0.33126f*g + 0.50000f*b;
+                     V[pos]= +0.50000f*r - 0.41869f*g - 0.08131f*b;
+                  }
+               }
+
+               DCY = stbiw__jpg_processDU(s, &bitBuf, &bitCnt, Y, 8, fdtbl_Y,  DCY, YDC_HT, YAC_HT);
+               DCU = stbiw__jpg_processDU(s, &bitBuf, &bitCnt, U, 8, fdtbl_UV, DCU, UVDC_HT, UVAC_HT);
+               DCV = stbiw__jpg_processDU(s, &bitBuf, &bitCnt, V, 8, fdtbl_UV, DCV, UVDC_HT, UVAC_HT);
+            }
+         }
+      }
+
+      // Do the bit alignment of the EOI marker
+      stbiw__jpg_writeBits(s, &bitBuf, &bitCnt, fillBits);
+   }
+
+   // EOI
+   stbiw__putc(s, 0xFF);
+   stbiw__putc(s, 0xD9);
+
+   return 1;
+}
+
+STBIWDEF int stbi_write_jpg_to_func(stbi_write_func *func, void *context, int x, int y, int comp, const void *data, int quality)
+{
+   stbi__write_context s = { 0 };
+   stbi__start_write_callbacks(&s, func, context);
+   return stbi_write_jpg_core(&s, x, y, comp, data, quality);
+}
+
+
+#ifndef STBI_WRITE_NO_STDIO
+STBIWDEF int stbi_write_jpg(char const *filename, int x, int y, int comp, const void *data, int quality)
+{
+   stbi__write_context s = { 0 };
+   if (stbi__start_write_file(&s,filename)) {
+      int r = stbi_write_jpg_core(&s, x, y, comp, data, quality);
+      stbi__end_write_file(&s);
+      return r;
+   } else
+      return 0;
+}
+#endif
+
+#endif // STB_IMAGE_WRITE_IMPLEMENTATION
+
+/* Revision history
+      1.16  (2021-07-11)
+             make Deflate code emit uncompressed blocks when it would otherwise expand
+             support writing BMPs with alpha channel
+      1.15  (2020-07-13) unknown
+      1.14  (2020-02-02) updated JPEG writer to downsample chroma channels
+      1.13
+      1.12
+      1.11  (2019-08-11)
+
+      1.10  (2019-02-07)
+             support utf8 filenames in Windows; fix warnings and platform ifdefs
+      1.09  (2018-02-11)
+             fix typo in zlib quality API, improve STB_I_W_STATIC in C++
+      1.08  (2018-01-29)
+             add stbi__flip_vertically_on_write, external zlib, zlib quality, choose PNG filter
+      1.07  (2017-07-24)
+             doc fix
+      1.06 (2017-07-23)
+             writing JPEG (using Jon Olick's code)
+      1.05   ???
+      1.04 (2017-03-03)
+             monochrome BMP expansion
+      1.03   ???
+      1.02 (2016-04-02)
+             avoid allocating large structures on the stack
+      1.01 (2016-01-16)
+             STBIW_REALLOC_SIZED: support allocators with no realloc support
+             avoid race-condition in crc initialization
+             minor compile issues
+      1.00 (2015-09-14)
+             installable file IO function
+      0.99 (2015-09-13)
+             warning fixes; TGA rle support
+      0.98 (2015-04-08)
+             added STBIW_MALLOC, STBIW_ASSERT etc
+      0.97 (2015-01-18)
+             fixed HDR asserts, rewrote HDR rle logic
+      0.96 (2015-01-17)
+             add HDR output
+             fix monochrome BMP
+      0.95 (2014-08-17)
+             add monochrome TGA output
+      0.94 (2014-05-31)
+             rename private functions to avoid conflicts with stb_image.h
+      0.93 (2014-05-27)
+             warning fixes
+      0.92 (2010-08-01)
+             casts to unsigned char to fix warnings
+      0.91 (2010-07-17)
+             first public release
+      0.90   first internal release
+*/
+
+/*
+------------------------------------------------------------------------------
+This software is available under 2 licenses -- choose whichever you prefer.
+------------------------------------------------------------------------------
+ALTERNATIVE A - MIT License
+Copyright (c) 2017 Sean Barrett
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+this software and associated documentation files (the "Software"), to deal in
+the Software without restriction, including without limitation the rights to
+use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
+of the Software, and to permit persons to whom the Software is furnished to do
+so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+------------------------------------------------------------------------------
+ALTERNATIVE B - Public Domain (www.unlicense.org)
+This is free and unencumbered software released into the public domain.
+Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
+software, either in source code form or as a compiled binary, for any purpose,
+commercial or non-commercial, and by any means.
+In jurisdictions that recognize copyright laws, the author or authors of this
+software dedicate any and all copyright interest in the software to the public
+domain. We make this dedication for the benefit of the public at large and to
+the detriment of our heirs and successors. We intend this dedication to be an
+overt act of relinquishment in perpetuity of all present and future rights to
+this software under copyright law.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+------------------------------------------------------------------------------
+*/
diff --git a/external/wlr-export-dmabuf-unstable-v1.xml b/external/wlr-export-dmabuf-unstable-v1.xml
deleted file mode 100644
index 2614065..0000000
--- a/external/wlr-export-dmabuf-unstable-v1.xml
+++ /dev/null
@@ -1,203 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<protocol name="wlr_export_dmabuf_unstable_v1">
-  <copyright>
-    Copyright © 2018 Rostislav Pehlivanov
-
-    Permission is hereby granted, free of charge, to any person obtaining a
-    copy of this software and associated documentation files (the "Software"),
-    to deal in the Software without restriction, including without limitation
-    the rights to use, copy, modify, merge, publish, distribute, sublicense,
-    and/or sell copies of the Software, and to permit persons to whom the
-    Software is furnished to do so, subject to the following conditions:
-
-    The above copyright notice and this permission notice (including the next
-    paragraph) shall be included in all copies or substantial portions of the
-    Software.
-
-    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
-    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
-    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
-    DEALINGS IN THE SOFTWARE.
-  </copyright>
-
-  <description summary="a protocol for low overhead screen content capturing">
-    An interface to capture surfaces in an efficient way by exporting DMA-BUFs.
-
-    Warning! The protocol described in this file is experimental and
-    backward incompatible changes may be made. Backward compatible changes
-    may be added together with the corresponding interface version bump.
-    Backward incompatible changes are done by bumping the version number in
-    the protocol and interface names and resetting the interface version.
-    Once the protocol is to be declared stable, the 'z' prefix and the
-    version number in the protocol and interface names are removed and the
-    interface version number is reset.
-  </description>
-
-  <interface name="zwlr_export_dmabuf_manager_v1" version="1">
-    <description summary="manager to inform clients and begin capturing">
-      This object is a manager with which to start capturing from sources.
-    </description>
-
-    <request name="capture_output">
-      <description summary="capture a frame from an output">
-        Capture the next frame of a an entire output.
-      </description>
-      <arg name="frame" type="new_id" interface="zwlr_export_dmabuf_frame_v1"/>
-      <arg name="overlay_cursor" type="int"
-           summary="include custom client hardware cursor on top of the frame"/>
-      <arg name="output" type="object" interface="wl_output"/>
-    </request>
-
-    <request name="destroy" type="destructor">
-      <description summary="destroy the manager">
-        All objects created by the manager will still remain valid, until their
-        appropriate destroy request has been called.
-      </description>
-    </request>
-  </interface>
-
-  <interface name="zwlr_export_dmabuf_frame_v1" version="1">
-    <description summary="a DMA-BUF frame">
-      This object represents a single DMA-BUF frame.
-
-      If the capture is successful, the compositor will first send a "frame"
-      event, followed by one or several "object". When the frame is available
-      for readout, the "ready" event is sent.
-
-      If the capture failed, the "cancel" event is sent. This can happen anytime
-      before the "ready" event.
-
-      Once either a "ready" or a "cancel" event is received, the client should
-      destroy the frame. Once an "object" event is received, the client is
-      responsible for closing the associated file descriptor.
-
-      All frames are read-only and may not be written into or altered.
-    </description>
-
-    <enum name="flags">
-      <description summary="frame flags">
-        Special flags that should be respected by the client.
-      </description>
-      <entry name="transient" value="0x1"
-             summary="clients should copy frame before processing"/>
-    </enum>
-
-    <event name="frame">
-      <description summary="a frame description">
-        Main event supplying the client with information about the frame. If the
-        capture didn't fail, this event is always emitted first before any other
-        events.
-
-        This event is followed by a number of "object" as specified by the
-        "num_objects" argument.
-      </description>
-      <arg name="width" type="uint"
-           summary="frame width in pixels"/>
-      <arg name="height" type="uint"
-           summary="frame height in pixels"/>
-      <arg name="offset_x" type="uint"
-           summary="crop offset for the x axis"/>
-      <arg name="offset_y" type="uint"
-           summary="crop offset for the y axis"/>
-      <arg name="buffer_flags" type="uint"
-           summary="flags which indicate properties (invert, interlacing),
-                    has the same values as zwp_linux_buffer_params_v1:flags"/>
-      <arg name="flags" type="uint" enum="flags"
-           summary="indicates special frame features"/>
-      <arg name="format" type="uint"
-           summary="format of the frame (DRM_FORMAT_*)"/>
-      <arg name="mod_high" type="uint"
-           summary="drm format modifier, high"/>
-      <arg name="mod_low" type="uint"
-           summary="drm format modifier, low"/>
-      <arg name="num_objects" type="uint"
-           summary="indicates how many objects (FDs) the frame has (max 4)"/>
-    </event>
-
-    <event name="object">
-      <description summary="an object description">
-        Event which serves to supply the client with the file descriptors
-        containing the data for each object.
-
-        After receiving this event, the client must always close the file
-        descriptor as soon as they're done with it and even if the frame fails.
-      </description>
-      <arg name="index" type="uint"
-           summary="index of the current object"/>
-      <arg name="fd" type="fd"
-           summary="fd of the current object"/>
-      <arg name="size" type="uint"
-           summary="size in bytes for the current object"/>
-      <arg name="offset" type="uint"
-           summary="starting point for the data in the object's fd"/>
-      <arg name="stride" type="uint"
-           summary="line size in bytes"/>
-      <arg name="plane_index" type="uint"
-           summary="index of the the plane the data in the object applies to"/>
-    </event>
-
-    <event name="ready">
-      <description summary="indicates frame is available for reading">
-        This event is sent as soon as the frame is presented, indicating it is
-        available for reading. This event includes the time at which
-        presentation happened at.
-
-        The timestamp is expressed as tv_sec_hi, tv_sec_lo, tv_nsec triples,
-        each component being an unsigned 32-bit value. Whole seconds are in
-        tv_sec which is a 64-bit value combined from tv_sec_hi and tv_sec_lo,
-        and the additional fractional part in tv_nsec as nanoseconds. Hence,
-        for valid timestamps tv_nsec must be in [0, 999999999]. The seconds part
-        may have an arbitrary offset at start.
-
-        After receiving this event, the client should destroy this object.
-      </description>
-      <arg name="tv_sec_hi" type="uint"
-           summary="high 32 bits of the seconds part of the timestamp"/>
-      <arg name="tv_sec_lo" type="uint"
-           summary="low 32 bits of the seconds part of the timestamp"/>
-      <arg name="tv_nsec" type="uint"
-           summary="nanoseconds part of the timestamp"/>
-    </event>
-
-    <enum name="cancel_reason">
-      <description summary="cancel reason">
-        Indicates reason for cancelling the frame.
-      </description>
-      <entry name="temporary" value="0"
-             summary="temporary error, source will produce more frames"/>
-      <entry name="permanent" value="1"
-             summary="fatal error, source will not produce frames"/>
-      <entry name="resizing" value="2"
-             summary="temporary error, source will produce more frames"/>
-    </enum>
-
-    <event name="cancel">
-      <description summary="indicates the frame is no longer valid">
-        If the capture failed or if the frame is no longer valid after the
-        "frame" event has been emitted, this event will be used to inform the
-        client to scrap the frame.
-
-        If the failure is temporary, the client may capture again the same
-        source. If the failure is permanent, any further attempts to capture the
-        same source will fail again.
-
-        After receiving this event, the client should destroy this object.
-      </description>
-      <arg name="reason" type="uint" enum="cancel_reason"
-           summary="indicates a reason for cancelling this frame capture"/>
-    </event>
-
-    <request name="destroy" type="destructor">
-      <description summary="delete this object, used or not">
-        Unreferences the frame. This request must be called as soon as its no
-        longer used.
-
-        It can be called at any time by the client. The client will still have
-        to close any FDs it has been given.
-      </description>
-    </request>
-  </interface>
-</protocol>
-\ No newline at end of file
diff --git a/extra/gpu-screen-recorder.env b/extra/gpu-screen-recorder.env
new file mode 100644
index 0000000..664fa5d
--- /dev/null
+++ b/extra/gpu-screen-recorder.env
@@ -0,0 +1,12 @@
+WINDOW=screen
+CONTAINER=mp4
+CODEC=h264
+AUDIO_CODEC=opus
+AUDIO_DEVICE=default_output
+SECONDARY_AUDIO_DEVICE=default_input
+FRAMERATE=60
+REPLAYDURATION=60
+OUTPUTDIR=/run/media/dec05eba/SSD1TB/Videos/aaaa
+KEYINT=2
+ENCODER=gpu
+RESTORE_PORTAL_SESSION=yes
+\ No newline at end of file
diff --git a/extra/gpu-screen-recorder.service b/extra/gpu-screen-recorder.service
index 61009be..7054e17 100644
--- a/extra/gpu-screen-recorder.service
+++ b/extra/gpu-screen-recorder.service
@@ -5,15 +5,23 @@ Description=GPU Screen Recorder Service
 EnvironmentFile=-%h/.config/gpu-screen-recorder.env
 Environment=WINDOW=screen
 Environment=CONTAINER=mp4
-Environment=QUALITY=very_high
+Environment=QUALITY=40000
+Environment=BITRATE_MODE=cbr
 Environment=CODEC=auto
 Environment=AUDIO_CODEC=opus
-Environment=AUDIO_DEVICE=
+Environment=AUDIO_DEVICE=default_output
+Environment=SECONDARY_AUDIO_DEVICE=
 Environment=FRAMERATE=60
 Environment=REPLAYDURATION=60
 Environment=OUTPUTDIR=%h/Videos
 Environment=MAKEFOLDERS=no
-ExecStart=/bin/sh -c 'AUDIO="${AUDIO_DEVICE:-$(pactl get-default-sink).monitor}"; gpu-screen-recorder -v no -w $WINDOW -c $CONTAINER -q $QUALITY -k $CODEC -ac $AUDIO_CODEC -a "$AUDIO" -f $FRAMERATE -r $REPLAYDURATION -o "$OUTPUTDIR" -mf $MAKEFOLDERS $ADDITIONAL_ARGS'
+Environment=COLOR_RANGE=limited
+Environment=KEYINT=2
+Environment=ENCODER=gpu
+Environment=RESTORE_PORTAL_SESSION=yes
+Environment=OUTPUT_RESOLUTION=0x0
+Environment=ADDITIONAL_ARGS=
+ExecStart=gpu-screen-recorder -v no -w "${WINDOW}" -s "${OUTPUT_RESOLUTION}" -c "${CONTAINER}" -q "${QUALITY}" -k "${CODEC}" -ac "${AUDIO_CODEC}" -a "${AUDIO_DEVICE}" -a "${SECONDARY_AUDIO_DEVICE}" -f "${FRAMERATE}" -r "${REPLAYDURATION}" -o "${OUTPUTDIR}" -df "${MAKEFOLDERS}" $ADDITIONAL_ARGS -cr "${COLOR_RANGE}" -keyint "${KEYINT}" -restore-portal-session "${RESTORE_PORTAL_SESSION}" -encoder "${ENCODER}" -bm "${BITRATE_MODE}"
 KillSignal=SIGINT
 Restart=on-failure
 RestartSec=5s
diff --git a/extra/install_preserve_video_memory.sh b/extra/install_preserve_video_memory.sh
deleted file mode 100755
index c5cf658..0000000
--- a/extra/install_preserve_video_memory.sh
+++ /dev/null
@@ -1,8 +0,0 @@
-#!/bin/sh
-
-script_dir=$(dirname "$0")
-cd "$script_dir"
-
-[ $(id -u) -ne 0 ] && echo "You need root privileges to run the install script" && exit 1
-
-install -Dm644 gsr-nvidia.conf /etc/modprobe.d/gsr-nvidia.conf
diff --git a/extra/meson_post_install.sh b/extra/meson_post_install.sh
new file mode 100755
index 0000000..143965c
--- /dev/null
+++ b/extra/meson_post_install.sh
@@ -0,0 +1,5 @@
+#!/bin/sh
+
+# Needed to remove password prompt when recording a monitor (without desktop portal option) on amd/intel or nvidia wayland
+/usr/sbin/setcap cap_sys_admin+ep "${MESON_INSTALL_DESTDIR_PREFIX}/bin/gsr-kms-server" \
+    || printf '\n!!! Please re-run install as root\n\n'
diff --git a/include/args_parser.h b/include/args_parser.h
new file mode 100644
index 0000000..e2fa46e
--- /dev/null
+++ b/include/args_parser.h
@@ -0,0 +1,106 @@
+#ifndef GSR_ARGS_PARSER_H
+#define GSR_ARGS_PARSER_H
+
+#include <stdbool.h>
+#include <stdint.h>
+#include "defs.h"
+#include "vec2.h"
+
+typedef struct gsr_egl gsr_egl;
+
+#define NUM_ARGS 30
+
+typedef enum {
+    ARG_TYPE_STRING,
+    ARG_TYPE_BOOLEAN,
+    ARG_TYPE_ENUM,
+    ARG_TYPE_I64,
+    ARG_TYPE_DOUBLE,
+} ArgType;
+
+typedef struct {
+    const char *name;
+    int value;
+} ArgEnum;
+
+typedef struct {
+    ArgType type;
+    const char **values;
+    int capacity_num_values;
+    int num_values;
+
+    const char *key;
+    bool optional;
+    bool list;
+
+    const ArgEnum *enum_values;
+    int num_enum_values;
+
+    int64_t integer_value_min;
+    int64_t integer_value_max;
+
+    union {
+        bool boolean;
+        int enum_value;
+        int64_t i64_value;
+        double d_value;
+    } typed_value;
+} Arg;
+
+typedef struct {
+    void (*version)(void *userdata);
+    void (*info)(void *userdata);
+    void (*list_audio_devices)(void *userdata);
+    void (*list_application_audio)(void *userdata);
+    void (*list_capture_options)(const char *card_path, void *userdata);
+} args_handlers;
+
+typedef struct {
+    Arg args[NUM_ARGS];
+
+    gsr_video_encoder_hardware video_encoder;
+    gsr_pixel_format pixel_format;
+    gsr_framerate_mode framerate_mode;
+    gsr_color_range color_range;
+    gsr_tune tune;
+    gsr_video_codec video_codec;
+    gsr_audio_codec audio_codec;
+    gsr_bitrate_mode bitrate_mode;
+    gsr_video_quality video_quality;
+    gsr_replay_storage replay_storage;
+    char window[64];
+    const char *container_format;
+    const char *filename;
+    const char *replay_recording_directory;
+    const char *portal_session_token_filepath;
+    const char *recording_saved_script;
+    bool verbose;
+    bool gl_debug;
+    bool record_cursor;
+    bool date_folders;
+    bool restore_portal_session;
+    bool restart_replay_on_save;
+    bool overclock;
+    bool is_livestream;
+    bool is_output_piped;
+    bool low_latency_recording;
+    bool very_old_gpu;
+    int64_t video_bitrate;
+    int64_t audio_bitrate;
+    int64_t fps;
+    int64_t replay_buffer_size_secs;
+    double keyint;
+    vec2i output_resolution;
+    vec2i region_size;
+    vec2i region_position;
+} args_parser;
+
+/* |argv| is stored as a reference */
+bool args_parser_parse(args_parser *self, int argc, char **argv, const args_handlers *args_handlers, void *userdata);
+void args_parser_deinit(args_parser *self);
+
+bool args_parser_validate_with_gl_info(args_parser *self, gsr_egl *egl);
+void args_parser_print_usage(void);
+Arg* args_parser_get_arg(args_parser *self, const char *arg_name);
+
+#endif /* GSR_ARGS_PARSER_H */
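Note on include/args_parser.h: the Arg struct carries integer_value_min and integer_value_max next to the ARG_TYPE_I64 type, which suggests integer arguments are range-checked while parsing. The implementation is not part of this header, so the following is only a minimal sketch of how such a check could look. The helper name example_parse_i64_in_range and its signature are hypothetical and do not appear in the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the patch): parse |str| as a base-10 integer
   and accept it only if it lies within [min, max], similar in spirit to
   |integer_value_min| and |integer_value_max| in Arg. */
static bool example_parse_i64_in_range(const char *str, int64_t min, int64_t max, int64_t *result) {
    if(!str || *str == '\0')
        return false;

    char *end = NULL;
    errno = 0;
    const long long value = strtoll(str, &end, 10);

    /* Reject overflow, input without digits and trailing garbage such as "60fps" */
    if(errno == ERANGE || end == str || *end != '\0')
        return false;

    if(value < min || value > max)
        return false;

    *result = (int64_t)value;
    return true;
}
```

A caller would pass the raw option string together with the bounds stored for that option and, if the helper returns false, report an error instead of using the value.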
diff --git a/include/capture/capture.h b/include/capture/capture.h
index 82a9555..634eee0 100644
--- a/include/capture/capture.h
+++ b/include/capture/capture.h
@@ -1,32 +1,51 @@
 #ifndef GSR_CAPTURE_CAPTURE_H
 #define GSR_CAPTURE_CAPTURE_H
 
+#include "../color_conversion.h"
 #include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
 
 typedef struct AVCodecContext AVCodecContext;
+typedef struct AVStream AVStream;
 typedef struct AVFrame AVFrame;
-
+typedef struct AVMasteringDisplayMetadata AVMasteringDisplayMetadata;
+typedef struct AVContentLightMetadata AVContentLightMetadata;
 typedef struct gsr_capture gsr_capture;
 
+typedef struct {
+    int width;
+    int height;
+    int fps;
+    AVCodecContext *video_codec_context; /* can be NULL */
+    AVFrame *frame; /* can be NULL, but will never be NULL if |video_codec_context| is set */
+} gsr_capture_metadata;
+
 struct gsr_capture {
     /* These methods should not be called manually. Call gsr_capture_* instead */
-    int (*start)(gsr_capture *cap, AVCodecContext *video_codec_context);
-    void (*tick)(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame **frame); /* can be NULL */
-    bool (*should_stop)(gsr_capture *cap, bool *err); /* can be NULL */
-    int (*capture)(gsr_capture *cap, AVFrame *frame);
-    void (*capture_end)(gsr_capture *cap, AVFrame *frame); /* can be NULL */
-    void (*destroy)(gsr_capture *cap, AVCodecContext *video_codec_context);
+    int (*start)(gsr_capture *cap, gsr_capture_metadata *capture_metadata);
+    void (*on_event)(gsr_capture *cap, gsr_egl *egl); /* can be NULL */
+    void (*tick)(gsr_capture *cap); /* can be NULL. If there is an event then |on_event| is called before this */
+    bool (*should_stop)(gsr_capture *cap, bool *err); /* can be NULL. If NULL, return false */
+    int (*capture)(gsr_capture *cap, gsr_capture_metadata *capture_metadata, gsr_color_conversion *color_conversion);
+    bool (*uses_external_image)(gsr_capture *cap); /* can be NULL. If NULL, return false */
+    bool (*set_hdr_metadata)(gsr_capture *cap, AVMasteringDisplayMetadata *mastering_display_metadata, AVContentLightMetadata *light_metadata); /* can be NULL. If NULL, return false */
+    uint64_t (*get_window_id)(gsr_capture *cap); /* can be NULL. Returns 0 if unknown */
+    bool (*is_damaged)(gsr_capture *cap); /* can be NULL */
+    void (*clear_damage)(gsr_capture *cap); /* can be NULL */
+    void (*destroy)(gsr_capture *cap);
 
     void *priv; /* can be NULL */
     bool started;
 };
 
-int gsr_capture_start(gsr_capture *cap, AVCodecContext *video_codec_context);
-void gsr_capture_tick(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame **frame);
+int gsr_capture_start(gsr_capture *cap, gsr_capture_metadata *capture_metadata);
+void gsr_capture_on_event(gsr_capture *cap, gsr_egl *egl);
+void gsr_capture_tick(gsr_capture *cap);
 bool gsr_capture_should_stop(gsr_capture *cap, bool *err);
-int gsr_capture_capture(gsr_capture *cap, AVFrame *frame);
-void gsr_capture_end(gsr_capture *cap, AVFrame *frame);
-/* Calls |gsr_capture_stop| as well */
-void gsr_capture_destroy(gsr_capture *cap, AVCodecContext *video_codec_context);
+int gsr_capture_capture(gsr_capture *cap, gsr_capture_metadata *capture_metadata, gsr_color_conversion *color_conversion);
+bool gsr_capture_uses_external_image(gsr_capture *cap);
+bool gsr_capture_set_hdr_metadata(gsr_capture *cap, AVMasteringDisplayMetadata *mastering_display_metadata, AVContentLightMetadata *light_metadata);
+void gsr_capture_destroy(gsr_capture *cap);
 
 #endif /* GSR_CAPTURE_CAPTURE_H */
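Note on include/capture/capture.h: several callbacks in struct gsr_capture are documented as "can be NULL. If NULL, return false", and callers are told to go through the gsr_capture_* wrappers instead of calling the function pointers directly. The wrapper bodies are not shown in this header, so the following is a reduced sketch of that dispatch pattern. All example_* names are hypothetical, and resetting |err| before dispatch is an assumption rather than something the header states.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct example_capture example_capture;

/* Reduced stand-in for struct gsr_capture with two optional callbacks */
struct example_capture {
    bool (*should_stop)(example_capture *cap, bool *err); /* can be NULL */
    bool (*uses_external_image)(example_capture *cap);    /* can be NULL */
};

/* Wrapper: a missing callback is treated as "do not stop" */
static bool example_capture_should_stop(example_capture *cap, bool *err) {
    if(err)
        *err = false; /* assumption: no error unless the backend reports one */
    if(!cap->should_stop)
        return false;
    return cap->should_stop(cap, err);
}

/* Wrapper: a missing callback is treated as "no external image" */
static bool example_capture_uses_external_image(example_capture *cap) {
    if(!cap->uses_external_image)
        return false;
    return cap->uses_external_image(cap);
}

/* Example backend callback that always asks to stop and reports an error */
static bool example_always_stop(example_capture *cap, bool *err) {
    (void)cap;
    if(err)
        *err = true;
    return true;
}
```

With this layout a backend only fills in the callbacks it needs, and the NULL checks live in one place instead of at every call site.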
diff --git a/include/capture/kms.h b/include/capture/kms.h
new file mode 100644
index 0000000..ce09817
--- /dev/null
+++ b/include/capture/kms.h
@@ -0,0 +1,19 @@
+#ifndef GSR_CAPTURE_KMS_H
+#define GSR_CAPTURE_KMS_H
+
+#include "capture.h"
+
+typedef struct {
+    gsr_egl *egl;
+    const char *display_to_capture; /* A copy is made of this */
+    bool hdr;
+    bool record_cursor;
+    int fps;
+    vec2i output_resolution;
+    vec2i region_size;
+    vec2i region_position;
+} gsr_capture_kms_params;
+
+gsr_capture* gsr_capture_kms_create(const gsr_capture_kms_params *params);
+
+#endif /* GSR_CAPTURE_KMS_H */
diff --git a/include/capture/kms_cuda.h b/include/capture/kms_cuda.h
deleted file mode 100644
index c2b2ad8..0000000
--- a/include/capture/kms_cuda.h
+++ /dev/null
@@ -1,19 +0,0 @@
-#ifndef GSR_CAPTURE_KMS_CUDA_H
-#define GSR_CAPTURE_KMS_CUDA_H
-
-#include "../vec2.h"
-#include "../utils.h"
-#include "capture.h"
-
-typedef struct _XDisplay Display;
-
-typedef struct {
-    gsr_egl *egl;
-    const char *display_to_capture; /* if this is "screen", then the first monitor is captured. A copy is made of this */
-    gsr_gpu_info gpu_inf;
-    const char *card_path; /* reference */
-} gsr_capture_kms_cuda_params;
-
-gsr_capture* gsr_capture_kms_cuda_create(const gsr_capture_kms_cuda_params *params);
-
-#endif /* GSR_CAPTURE_KMS_CUDA_H */
diff --git a/include/capture/kms_vaapi.h b/include/capture/kms_vaapi.h
deleted file mode 100644
index 26cda2c..0000000
--- a/include/capture/kms_vaapi.h
+++ /dev/null
@@ -1,20 +0,0 @@
-#ifndef GSR_CAPTURE_KMS_VAAPI_H
-#define GSR_CAPTURE_KMS_VAAPI_H
-
-#include "../vec2.h"
-#include "../utils.h"
-#include "capture.h"
-
-typedef struct _XDisplay Display;
-
-typedef struct {
-    gsr_egl *egl;
-    const char *display_to_capture; /* if this is "screen", then the first monitor is captured. A copy is made of this */
-    gsr_gpu_info gpu_inf;
-    const char *card_path; /* reference */
-    bool wayland;
-} gsr_capture_kms_vaapi_params;
-
-gsr_capture* gsr_capture_kms_vaapi_create(const gsr_capture_kms_vaapi_params *params);
-
-#endif /* GSR_CAPTURE_KMS_VAAPI_H */
diff --git a/include/capture/nvfbc.h b/include/capture/nvfbc.h
index 5678473..7e30d01 100644
--- a/include/capture/nvfbc.h
+++ b/include/capture/nvfbc.h
@@ -2,20 +2,17 @@
 #define GSR_CAPTURE_NVFBC_H
 
 #include "capture.h"
-#include "../egl.h"
 #include "../vec2.h"
 
-typedef struct _XDisplay Display;
-
 typedef struct {
-    Display *dpy;
     gsr_egl *egl;
     const char *display_to_capture; /* if this is "screen", then the entire x11 screen is captured (all displays). A copy is made of this */
     int fps;
-    vec2i pos;
-    vec2i size;
     bool direct_capture;
-    bool overclock;
+    bool record_cursor;
+    vec2i output_resolution;
+    vec2i region_size;
+    vec2i region_position;
 } gsr_capture_nvfbc_params;
 
 gsr_capture* gsr_capture_nvfbc_create(const gsr_capture_nvfbc_params *params);
diff --git a/include/capture/portal.h b/include/capture/portal.h
new file mode 100644
index 0000000..74cdba9
--- /dev/null
+++ b/include/capture/portal.h
@@ -0,0 +1,17 @@
+#ifndef GSR_CAPTURE_PORTAL_H
+#define GSR_CAPTURE_PORTAL_H
+
+#include "capture.h"
+
+typedef struct {
+    gsr_egl *egl;
+    bool record_cursor;
+    bool restore_portal_session;
+    /* If this is set to NULL then this defaults to $XDG_CONFIG_HOME/gpu-screen-recorder/restore_token ($XDG_CONFIG_HOME defaults to $HOME/.config) */
+    const char *portal_session_token_filepath;
+    vec2i output_resolution;
+} gsr_capture_portal_params;
+
+gsr_capture* gsr_capture_portal_create(const gsr_capture_portal_params *params);
+
+#endif /* GSR_CAPTURE_PORTAL_H */
diff --git a/include/capture/xcomposite.h b/include/capture/xcomposite.h
new file mode 100644
index 0000000..bf6532e
--- /dev/null
+++ b/include/capture/xcomposite.h
@@ -0,0 +1,17 @@
+#ifndef GSR_CAPTURE_XCOMPOSITE_H
+#define GSR_CAPTURE_XCOMPOSITE_H
+
+#include "capture.h"
+#include "../vec2.h"
+
+typedef struct {
+    gsr_egl *egl;
+    unsigned long window;
+    bool follow_focused; /* If this is set then |window| is ignored */
+    bool record_cursor;
+    vec2i output_resolution;
+} gsr_capture_xcomposite_params;
+
+gsr_capture* gsr_capture_xcomposite_create(const gsr_capture_xcomposite_params *params);
+
+#endif /* GSR_CAPTURE_XCOMPOSITE_H */
diff --git a/include/capture/xcomposite_cuda.h b/include/capture/xcomposite_cuda.h
deleted file mode 100644
index 2106714..0000000
--- a/include/capture/xcomposite_cuda.h
+++ /dev/null
@@ -1,22 +0,0 @@
-#ifndef GSR_CAPTURE_XCOMPOSITE_CUDA_H
-#define GSR_CAPTURE_XCOMPOSITE_CUDA_H
-
-#include "capture.h"
-#include "../egl.h"
-#include "../vec2.h"
-#include <X11/X.h>
-
-typedef struct _XDisplay Display;
-
-typedef struct {
-    gsr_egl *egl;
-    Display *dpy;
-    Window window;
-    bool follow_focused; /* If this is set then |window| is ignored */
-    vec2i region_size; /* This is currently only used with |follow_focused| */
-    bool overclock;
-} gsr_capture_xcomposite_cuda_params;
-
-gsr_capture* gsr_capture_xcomposite_cuda_create(const gsr_capture_xcomposite_cuda_params *params);
-
-#endif /* GSR_CAPTURE_XCOMPOSITE_CUDA_H */
diff --git a/include/capture/xcomposite_vaapi.h b/include/capture/xcomposite_vaapi.h
deleted file mode 100644
index e80c60a..0000000
--- a/include/capture/xcomposite_vaapi.h
+++ /dev/null
@@ -1,22 +0,0 @@
-#ifndef GSR_CAPTURE_XCOMPOSITE_VAAPI_H
-#define GSR_CAPTURE_XCOMPOSITE_VAAPI_H
-
-#include "capture.h"
-#include "../egl.h"
-#include "../vec2.h"
-#include <X11/X.h>
-
-typedef struct _XDisplay Display;
-
-typedef struct {
-    gsr_egl *egl;
-    Display *dpy;
-    Window window;
-    bool follow_focused; /* If this is set then |window| is ignored */
-    vec2i region_size; /* This is currently only used with |follow_focused| */
-    const char *card_path; /* reference */
-} gsr_capture_xcomposite_vaapi_params;
-
-gsr_capture* gsr_capture_xcomposite_vaapi_create(const gsr_capture_xcomposite_vaapi_params *params);
-
-#endif /* GSR_CAPTURE_XCOMPOSITE_VAAPI_H */
diff --git a/include/capture/ximage.h b/include/capture/ximage.h
new file mode 100644
index 0000000..e6c3607
--- /dev/null
+++ b/include/capture/ximage.h
@@ -0,0 +1,18 @@
+#ifndef GSR_CAPTURE_XIMAGE_H
+#define GSR_CAPTURE_XIMAGE_H
+
+#include "capture.h"
+#include "../vec2.h"
+
+typedef struct {
+    gsr_egl *egl;
+    const char *display_to_capture; /* A copy is made of this */
+    bool record_cursor;
+    vec2i output_resolution;
+    vec2i region_size;
+    vec2i region_position;
+} gsr_capture_ximage_params;
+
+gsr_capture* gsr_capture_ximage_create(const gsr_capture_ximage_params *params);
+
+#endif /* GSR_CAPTURE_XIMAGE_H */
diff --git a/include/codec_query/codec_query.h b/include/codec_query/codec_query.h
new file mode 100644
index 0000000..316217d
--- /dev/null
+++ b/include/codec_query/codec_query.h
@@ -0,0 +1,23 @@
+#ifndef GSR_CODEC_QUERY_H
+#define GSR_CODEC_QUERY_H
+
+#include <stdbool.h>
+
+typedef struct {
+    bool supported;
+    bool low_power;
+} gsr_supported_video_codec;
+
+typedef struct {
+    gsr_supported_video_codec h264;
+    gsr_supported_video_codec hevc;
+    gsr_supported_video_codec hevc_hdr;
+    gsr_supported_video_codec hevc_10bit;
+    gsr_supported_video_codec av1;
+    gsr_supported_video_codec av1_hdr;
+    gsr_supported_video_codec av1_10bit;
+    gsr_supported_video_codec vp8;
+    gsr_supported_video_codec vp9;
+} gsr_supported_video_codecs;
+
+#endif /* GSR_CODEC_QUERY_H */
diff --git a/include/codec_query/nvenc.h b/include/codec_query/nvenc.h
new file mode 100644
index 0000000..c01acf6
--- /dev/null
+++ b/include/codec_query/nvenc.h
@@ -0,0 +1,8 @@
+#ifndef GSR_CODEC_QUERY_NVENC_H
+#define GSR_CODEC_QUERY_NVENC_H
+
+#include "codec_query.h"
+
+bool gsr_get_supported_video_codecs_nvenc(gsr_supported_video_codecs *video_codecs, bool cleanup);
+
+#endif /* GSR_CODEC_QUERY_NVENC_H */
diff --git a/include/codec_query/vaapi.h b/include/codec_query/vaapi.h
new file mode 100644
index 0000000..60bdeca
--- /dev/null
+++ b/include/codec_query/vaapi.h
@@ -0,0 +1,8 @@
+#ifndef GSR_CODEC_QUERY_VAAPI_H
+#define GSR_CODEC_QUERY_VAAPI_H
+
+#include "codec_query.h"
+
+bool gsr_get_supported_video_codecs_vaapi(gsr_supported_video_codecs *video_codecs, const char *card_path, bool cleanup);
+
+#endif /* GSR_CODEC_QUERY_VAAPI_H */
diff --git a/include/codec_query/vulkan.h b/include/codec_query/vulkan.h
new file mode 100644
index 0000000..bb06c6b
--- /dev/null
+++ b/include/codec_query/vulkan.h
@@ -0,0 +1,8 @@
+#ifndef GSR_CODEC_QUERY_VULKAN_H
+#define GSR_CODEC_QUERY_VULKAN_H
+
+#include "codec_query.h"
+
+bool gsr_get_supported_video_codecs_vulkan(gsr_supported_video_codecs *video_codecs, const char *card_path, bool cleanup);
+
+#endif /* GSR_CODEC_QUERY_VULKAN_H */
diff --git a/include/color_conversion.h b/include/color_conversion.h
index 738cba5..cb074a1 100644
--- a/include/color_conversion.h
+++ b/include/color_conversion.h
@@ -2,41 +2,84 @@
 #define GSR_COLOR_CONVERSION_H
 
 #include "shader.h"
+#include "defs.h"
 #include "vec2.h"
+#include <stdbool.h>
+
+#define GSR_COLOR_CONVERSION_MAX_COMPUTE_SHADERS 12
+#define GSR_COLOR_CONVERSION_MAX_GRAPHICS_SHADERS 6
+#define GSR_COLOR_CONVERSION_MAX_FRAMEBUFFERS 2
 
 typedef enum {
-    GSR_SOURCE_COLOR_RGB
+    GSR_SOURCE_COLOR_RGB,
+    GSR_SOURCE_COLOR_BGR
 } gsr_source_color;
 
 typedef enum {
-    GSR_DESTINATION_COLOR_BGR,
-    GSR_DESTINATION_COLOR_NV12 /* YUV420, BT709, limited */
+    GSR_DESTINATION_COLOR_NV12, /* YUV420, BT709, 8-bit */
+    GSR_DESTINATION_COLOR_P010, /* YUV420, BT2020, 10-bit */
+    GSR_DESTINATION_COLOR_RGB8
 } gsr_destination_color;
 
+typedef enum {
+    GSR_ROT_0,
+    GSR_ROT_90,
+    GSR_ROT_180,
+    GSR_ROT_270
+} gsr_rotation;
+
+typedef struct {
+    int rotation_matrix;
+    int offset;
+} gsr_color_graphics_uniforms;
+
+typedef struct {
+    int rotation_matrix;
+    int source_position;
+    int target_position;
+    int scale;
+} gsr_color_compute_uniforms;
+
 typedef struct {
     gsr_egl *egl;
 
-    gsr_source_color source_color;
     gsr_destination_color destination_color;
 
     unsigned int destination_textures[2];
     int num_destination_textures;
+
+    gsr_color_range color_range;
+    bool load_external_image_shader;
+    bool force_graphics_shader;
 } gsr_color_conversion_params;
 
 typedef struct {
     gsr_color_conversion_params params;
-    int rotation_uniforms[2];
-    gsr_shader shaders[2];
+    gsr_color_compute_uniforms compute_uniforms[GSR_COLOR_CONVERSION_MAX_COMPUTE_SHADERS];
+    gsr_shader compute_shaders[GSR_COLOR_CONVERSION_MAX_COMPUTE_SHADERS];
+
+    /* These are only loaded if compute shaders (of the same type) fail to load */
+    gsr_color_graphics_uniforms graphics_uniforms[GSR_COLOR_CONVERSION_MAX_GRAPHICS_SHADERS];
+    gsr_shader graphics_shaders[GSR_COLOR_CONVERSION_MAX_GRAPHICS_SHADERS];
 
-    unsigned int framebuffers[2];
+    bool compute_shaders_failed_to_load;
+    bool external_compute_shaders_failed_to_load;
+
+    unsigned int framebuffers[GSR_COLOR_CONVERSION_MAX_FRAMEBUFFERS];
 
     unsigned int vertex_array_object_id;
     unsigned int vertex_buffer_object_id;
+
+    int max_local_size_dim;
 } gsr_color_conversion;
 
 int gsr_color_conversion_init(gsr_color_conversion *self, const gsr_color_conversion_params *params);
 void gsr_color_conversion_deinit(gsr_color_conversion *self);
 
-int gsr_color_conversion_draw(gsr_color_conversion *self, unsigned int texture_id, vec2i source_pos, vec2i source_size, vec2i texture_pos, vec2i texture_size, float rotation);
+void gsr_color_conversion_draw(gsr_color_conversion *self, unsigned int texture_id, vec2i destination_pos, vec2i destination_size, vec2i source_pos, vec2i source_size, vec2i texture_size, gsr_rotation rotation, gsr_source_color source_color, bool external_texture, bool alpha_blending);
+void gsr_color_conversion_clear(gsr_color_conversion *self);
+void gsr_color_conversion_read_destination_texture(gsr_color_conversion *self, int destination_texture_index, int x, int y, int width, int height, unsigned int color_format, unsigned int data_format, void *pixels);
+
+gsr_rotation gsr_monitor_rotation_to_rotation(gsr_monitor_rotation monitor_rotation);
 
 #endif /* GSR_COLOR_CONVERSION_H */
diff --git a/include/cuda.h b/include/cuda.h
index 41fe15b..fd1f9f9 100644
--- a/include/cuda.h
+++ b/include/cuda.h
@@ -73,7 +73,8 @@ typedef CUDA_MEMCPY2D_v2 CUDA_MEMCPY2D;
 
 typedef struct CUgraphicsResource_st *CUgraphicsResource;
 
-typedef struct {
+typedef struct gsr_cuda gsr_cuda;
+struct gsr_cuda {
     gsr_overclock overclock;
     bool do_overclock;
 
@@ -88,8 +89,9 @@ typedef struct {
     CUresult (*cuCtxPushCurrent_v2)(CUcontext ctx);
     CUresult (*cuCtxPopCurrent_v2)(CUcontext *pctx);
     CUresult (*cuGetErrorString)(CUresult error, const char **pStr);
-    CUresult (*cuMemsetD8_v2)(CUdeviceptr dstDevice, unsigned char uc, size_t N);
     CUresult (*cuMemcpy2D_v2)(const CUDA_MEMCPY2D *pCopy);
+    CUresult (*cuMemcpy2DAsync_v2)(const CUDA_MEMCPY2D *pcopy, CUstream hStream);
+    CUresult (*cuStreamSynchronize)(CUstream hStream);
 
     CUresult (*cuGraphicsGLRegisterImage)(CUgraphicsResource *pCudaResource, unsigned int image, unsigned int target, unsigned int flags);
     CUresult (*cuGraphicsEGLRegisterImage)(CUgraphicsResource *pCudaResource, void *image, unsigned int flags);
@@ -98,7 +100,7 @@ typedef struct {
     CUresult (*cuGraphicsUnmapResources)(unsigned int count, CUgraphicsResource *resources, CUstream hStream);
     CUresult (*cuGraphicsUnregisterResource)(CUgraphicsResource resource);
     CUresult (*cuGraphicsSubResourceGetMappedArray)(CUarray *pArray, CUgraphicsResource resource, unsigned int arrayIndex, unsigned int mipLevel);
-} gsr_cuda;
+};
 
 bool gsr_cuda_load(gsr_cuda *self, Display *display, bool overclock);
 void gsr_cuda_unload(gsr_cuda *self);
diff --git a/include/cursor.h b/include/cursor.h
new file mode 100644
index 0000000..1564714
--- /dev/null
+++ b/include/cursor.h
@@ -0,0 +1,28 @@
+#ifndef GSR_CURSOR_H
+#define GSR_CURSOR_H
+
+#include "egl.h"
+#include "vec2.h"
+
+typedef struct {
+    gsr_egl *egl;
+    Display *display;
+    int x_fixes_event_base;
+
+    unsigned int texture_id;
+    vec2i size;
+    vec2i hotspot;
+    vec2i position;
+
+    bool cursor_image_set;
+    bool visible;
+} gsr_cursor;
+
+int gsr_cursor_init(gsr_cursor *self, gsr_egl *egl, Display *display);
+void gsr_cursor_deinit(gsr_cursor *self);
+
+/* Returns true if the cursor image has updated or if the cursor has moved */
+bool gsr_cursor_on_event(gsr_cursor *self, XEvent *xev);
+void gsr_cursor_tick(gsr_cursor *self, Window relative_to);
+
+#endif /* GSR_CURSOR_H */
diff --git a/include/damage.h b/include/damage.h
new file mode 100644
index 0000000..4b10e58
--- /dev/null
+++ b/include/damage.h
@@ -0,0 +1,52 @@
+#ifndef GSR_DAMAGE_H
+#define GSR_DAMAGE_H
+
+#include "cursor.h"
+#include "utils.h"
+#include <stdbool.h>
+#include <stdint.h>
+
+typedef struct _XDisplay Display;
+typedef union _XEvent XEvent;
+
+typedef enum {
+    GSR_DAMAGE_TRACK_NONE,
+    GSR_DAMAGE_TRACK_WINDOW,
+    GSR_DAMAGE_TRACK_MONITOR
+} gsr_damage_track_type;
+
+typedef struct {
+    gsr_egl *egl;
+    Display *display;
+    bool track_cursor;
+    gsr_damage_track_type track_type;
+
+    int damage_event;
+    int damage_error;
+    uint64_t damage;
+    bool damaged;
+
+    int randr_event;
+    int randr_error;
+
+    uint64_t window;
+    //vec2i window_pos;
+    vec2i window_size;
+
+    gsr_cursor cursor; /* Relative to |window| */
+    gsr_monitor monitor;
+    char monitor_name[32];
+} gsr_damage;
+
+bool gsr_damage_init(gsr_damage *self, gsr_egl *egl, bool track_cursor);
+void gsr_damage_deinit(gsr_damage *self);
+
+bool gsr_damage_set_target_window(gsr_damage *self, uint64_t window);
+bool gsr_damage_set_target_monitor(gsr_damage *self, const char *monitor_name);
+void gsr_damage_on_event(gsr_damage *self, XEvent *xev);
+void gsr_damage_tick(gsr_damage *self);
+/* Also returns true if damage tracking is not available */
+bool gsr_damage_is_damaged(gsr_damage *self);
+void gsr_damage_clear(gsr_damage *self);
+
+#endif /* GSR_DAMAGE_H */
diff --git a/include/defs.h b/include/defs.h
new file mode 100644
index 0000000..d780005
--- /dev/null
+++ b/include/defs.h
@@ -0,0 +1,112 @@
+#ifndef GSR_DEFS_H
+#define GSR_DEFS_H
+
+#include <stdbool.h>
+
+#define GSR_VIDEO_CODEC_AUTO -1
+#define GSR_BITRATE_MODE_AUTO -1
+
+typedef enum {
+    GSR_GPU_VENDOR_AMD,
+    GSR_GPU_VENDOR_INTEL,
+    GSR_GPU_VENDOR_NVIDIA,
+    GSR_GPU_VENDOR_BROADCOM,
+} gsr_gpu_vendor;
+
+typedef struct {
+    gsr_gpu_vendor vendor;
+    int gpu_version; /* 0 if unknown */
+    bool is_steam_deck;
+} gsr_gpu_info;
+
+typedef enum {
+    GSR_MONITOR_ROT_0,
+    GSR_MONITOR_ROT_90,
+    GSR_MONITOR_ROT_180,
+    GSR_MONITOR_ROT_270,
+} gsr_monitor_rotation;
+
+typedef enum {
+    GSR_CONNECTION_X11,
+    GSR_CONNECTION_WAYLAND,
+    GSR_CONNECTION_DRM,
+} gsr_connection_type;
+
+typedef enum {
+    GSR_VIDEO_QUALITY_MEDIUM,
+    GSR_VIDEO_QUALITY_HIGH,
+    GSR_VIDEO_QUALITY_VERY_HIGH,
+    GSR_VIDEO_QUALITY_ULTRA,
+} gsr_video_quality;
+
+typedef enum {
+    GSR_VIDEO_CODEC_H264,
+    GSR_VIDEO_CODEC_HEVC,
+    GSR_VIDEO_CODEC_HEVC_HDR,
+    GSR_VIDEO_CODEC_HEVC_10BIT,
+    GSR_VIDEO_CODEC_AV1,
+    GSR_VIDEO_CODEC_AV1_HDR,
+    GSR_VIDEO_CODEC_AV1_10BIT,
+    GSR_VIDEO_CODEC_VP8,
+    GSR_VIDEO_CODEC_VP9,
+    GSR_VIDEO_CODEC_H264_VULKAN,
+    GSR_VIDEO_CODEC_HEVC_VULKAN,
+} gsr_video_codec;
+
+typedef enum {
+    GSR_AUDIO_CODEC_AAC,
+    GSR_AUDIO_CODEC_OPUS,
+    GSR_AUDIO_CODEC_FLAC,
+} gsr_audio_codec;
+
+typedef enum {
+    GSR_PIXEL_FORMAT_YUV420,
+    GSR_PIXEL_FORMAT_YUV444,
+} gsr_pixel_format;
+
+typedef enum {
+    GSR_FRAMERATE_MODE_CONSTANT,
+    GSR_FRAMERATE_MODE_VARIABLE,
+    GSR_FRAMERATE_MODE_CONTENT,
+} gsr_framerate_mode;
+
+typedef enum {
+    GSR_BITRATE_MODE_QP,
+    GSR_BITRATE_MODE_VBR,
+    GSR_BITRATE_MODE_CBR,
+} gsr_bitrate_mode;
+
+typedef enum {
+    GSR_TUNE_PERFORMANCE,
+    GSR_TUNE_QUALITY,
+} gsr_tune;
+
+typedef enum {
+    GSR_VIDEO_ENCODER_HW_GPU,
+    GSR_VIDEO_ENCODER_HW_CPU,
+} gsr_video_encoder_hardware;
+
+typedef enum {
+    GSR_COLOR_RANGE_LIMITED,
+    GSR_COLOR_RANGE_FULL,
+} gsr_color_range;
+
+typedef enum {
+    GSR_COLOR_DEPTH_8_BITS,
+    GSR_COLOR_DEPTH_10_BITS,
+} gsr_color_depth;
+
+typedef enum {
+    GSR_REPLAY_STORAGE_RAM,
+    GSR_REPLAY_STORAGE_DISK,
+} gsr_replay_storage;
+
+bool video_codec_is_hdr(gsr_video_codec video_codec);
+gsr_video_codec hdr_video_codec_to_sdr_video_codec(gsr_video_codec video_codec);
+gsr_color_depth video_codec_to_bit_depth(gsr_video_codec video_codec);
+const char* video_codec_to_string(gsr_video_codec video_codec);
+bool video_codec_is_av1(gsr_video_codec video_codec);
+bool video_codec_is_vulkan(gsr_video_codec video_codec);
+const char* audio_codec_get_name(gsr_audio_codec audio_codec);
+
+#endif /* GSR_DEFS_H */
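Note on the helpers declared at the end of defs.h: the header only declares them, so their behavior is not visible in this diff. The sketch below shows one plausible reading, assuming that only the *_HDR enum values count as HDR and that each maps back to its base codec. The enum is a trimmed copy; the function bodies are illustrative assumptions, not the project's implementation.

```c
#include <stdbool.h>

/* Trimmed copy of gsr_video_codec from defs.h, for illustration only */
typedef enum {
    GSR_VIDEO_CODEC_H264,
    GSR_VIDEO_CODEC_HEVC,
    GSR_VIDEO_CODEC_HEVC_HDR,
    GSR_VIDEO_CODEC_HEVC_10BIT,
    GSR_VIDEO_CODEC_AV1,
    GSR_VIDEO_CODEC_AV1_HDR,
    GSR_VIDEO_CODEC_AV1_10BIT
} gsr_video_codec;

/* Assumption: only the *_HDR values are treated as HDR */
static bool video_codec_is_hdr(gsr_video_codec video_codec) {
    switch(video_codec) {
        case GSR_VIDEO_CODEC_HEVC_HDR:
        case GSR_VIDEO_CODEC_AV1_HDR:
            return true;
        default:
            return false;
    }
}

/* Assumption: HDR variants fall back to their base codec, everything else is returned unchanged */
static gsr_video_codec hdr_video_codec_to_sdr_video_codec(gsr_video_codec video_codec) {
    switch(video_codec) {
        case GSR_VIDEO_CODEC_HEVC_HDR: return GSR_VIDEO_CODEC_HEVC;
        case GSR_VIDEO_CODEC_AV1_HDR:  return GSR_VIDEO_CODEC_AV1;
        default:                       return video_codec;
    }
}
```

A caller that cannot capture HDR content could check video_codec_is_hdr() first and then switch to the value returned by hdr_video_codec_to_sdr_video_codec().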
diff --git a/include/egl.h b/include/egl.h
index ea71b5a..e11557e 100644
--- a/include/egl.h
+++ b/include/egl.h
@@ -8,6 +8,9 @@
 #include <stdbool.h>
 #include <stdint.h>
 #include "vec2.h"
+#include "defs.h"
+
+typedef struct gsr_window gsr_window;
 
 #ifdef _WIN64
 typedef signed   long long int khronos_intptr_t;
@@ -33,12 +36,20 @@ typedef void* EGLImage;
 typedef void* EGLImageKHR;
 typedef void *GLeglImageOES;
 typedef void (*__eglMustCastToProperFunctionPointerType)(void);
+typedef struct __GLXFBConfigRec *GLXFBConfig;
+typedef struct __GLXcontextRec *GLXContext;
+typedef XID GLXDrawable;
+typedef void(*__GLXextFuncPtr)(void);
 
 #define EGL_SUCCESS                             0x3000
 #define EGL_BUFFER_SIZE                         0x3020
 #define EGL_RENDERABLE_TYPE                     0x3040
 #define EGL_OPENGL_API                          0x30A2
+#define EGL_OPENGL_ES_API                       0x30A0
 #define EGL_OPENGL_BIT                          0x0008
+#define EGL_OPENGL_ES_BIT                       0x0001
+#define EGL_OPENGL_ES2_BIT                      0x0004
+#define EGL_OPENGL_ES3_BIT                      0x00000040
 #define EGL_NONE                                0x3038
 #define EGL_CONTEXT_CLIENT_VERSION              0x3098
 #define EGL_BACK_BUFFER                         0x3084
@@ -52,8 +63,23 @@ typedef void (*__eglMustCastToProperFunctionPointerType)(void);
 #define EGL_DMA_BUF_PLANE0_FD_EXT               0x3272
 #define EGL_DMA_BUF_PLANE0_OFFSET_EXT           0x3273
 #define EGL_DMA_BUF_PLANE0_PITCH_EXT            0x3274
+#define EGL_DMA_BUF_PLANE1_FD_EXT               0x3275
+#define EGL_DMA_BUF_PLANE1_OFFSET_EXT           0x3276
+#define EGL_DMA_BUF_PLANE1_PITCH_EXT            0x3277
+#define EGL_DMA_BUF_PLANE2_FD_EXT               0x3278
+#define EGL_DMA_BUF_PLANE2_OFFSET_EXT           0x3279
+#define EGL_DMA_BUF_PLANE2_PITCH_EXT            0x327A
+#define EGL_DMA_BUF_PLANE3_FD_EXT               0x3440
+#define EGL_DMA_BUF_PLANE3_OFFSET_EXT           0x3441
+#define EGL_DMA_BUF_PLANE3_PITCH_EXT            0x3442
 #define EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT      0x3443
 #define EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT      0x3444
+#define EGL_DMA_BUF_PLANE1_MODIFIER_LO_EXT      0x3445
+#define EGL_DMA_BUF_PLANE1_MODIFIER_HI_EXT      0x3446
+#define EGL_DMA_BUF_PLANE2_MODIFIER_LO_EXT      0x3447
+#define EGL_DMA_BUF_PLANE2_MODIFIER_HI_EXT      0x3448
+#define EGL_DMA_BUF_PLANE3_MODIFIER_LO_EXT      0x3449
+#define EGL_DMA_BUF_PLANE3_MODIFIER_HI_EXT      0x344A
 #define EGL_LINUX_DMA_BUF_EXT                   0x3270
 #define EGL_RED_SIZE                            0x3024
 #define EGL_ALPHA_SIZE                          0x3021
@@ -63,16 +89,31 @@ typedef void (*__eglMustCastToProperFunctionPointerType)(void);
 #define EGL_CONTEXT_PRIORITY_HIGH_IMG           0x3101
 #define EGL_CONTEXT_PRIORITY_MEDIUM_IMG         0x3102
 #define EGL_CONTEXT_PRIORITY_LOW_IMG            0x3103
+#define EGL_DEVICE_EXT                          0x322C
+#define EGL_DRM_DEVICE_FILE_EXT                 0x3233
 
 #define GL_FLOAT                                0x1406
 #define GL_FALSE                                0
 #define GL_TRUE                                 1
 #define GL_TRIANGLES                            0x0004
 #define GL_TEXTURE_2D                           0x0DE1
-#define GL_TEXTURE_EXTERNAL_OES                 0x8D65 // TODO: Use this where applicable
+#define GL_TEXTURE_EXTERNAL_OES                 0x8D65
+#define GL_RED                                  0x1903
+#define GL_GREEN                                0x1904
+#define GL_BLUE                                 0x1905
+#define GL_ALPHA                                0x1906
+#define GL_TEXTURE_SWIZZLE_RGBA                 0x8E46
+#define GL_RG                                   0x8227
 #define GL_RGB                                  0x1907
 #define GL_RGBA                                 0x1908
+#define GL_RGB8                                 0x8051
 #define GL_RGBA8                                0x8058
+#define GL_R8                                   0x8229
+#define GL_RG8                                  0x822B
+#define GL_R16                                  0x822A
+#define GL_RG16                                 0x822C
+#define GL_RGB16                                0x8054
+#define GL_RGBA32F                              0x8814
 #define GL_UNSIGNED_BYTE                        0x1401
 #define GL_COLOR_BUFFER_BIT                     0x00004000
 #define GL_TEXTURE_WRAP_S                       0x2802
@@ -87,75 +128,80 @@ typedef void (*__eglMustCastToProperFunctionPointerType)(void);
 #define GL_FRAMEBUFFER                          0x8D40
 #define GL_COLOR_ATTACHMENT0                    0x8CE0
 #define GL_FRAMEBUFFER_COMPLETE                 0x8CD5
-#define GL_STREAM_DRAW                          0x88E0
+#define GL_DYNAMIC_DRAW                         0x88E8
 #define GL_ARRAY_BUFFER                         0x8892
 #define GL_BLEND                                0x0BE2
 #define GL_SRC_ALPHA                            0x0302
 #define GL_ONE_MINUS_SRC_ALPHA                  0x0303
+#define GL_DEBUG_OUTPUT                         0x92E0
+#define GL_SCISSOR_TEST                         0x0C11
+#define GL_PACK_ALIGNMENT                       0x0D05
+#define GL_UNPACK_ALIGNMENT                     0x0CF5
+#define GL_READ_ONLY                            0x88B8
+#define GL_WRITE_ONLY                           0x88B9
+#define GL_READ_WRITE                           0x88BA
+#define GL_MAX_COMPUTE_FIXED_GROUP_INVOCATIONS  0x90EB
+#define GL_TEXTURE0                             0x84C0
+#define GL_TEXTURE1                             0x84C1
+#define GL_SHADER_IMAGE_ACCESS_BARRIER_BIT      0x00000020
+#define GL_ALL_BARRIER_BITS                     0xFFFFFFFF
 
 #define GL_VENDOR                               0x1F00
 #define GL_RENDERER                             0x1F01
+#define GL_VERSION                              0x1F02
 
 #define GL_COMPILE_STATUS                       0x8B81
 #define GL_INFO_LOG_LENGTH                      0x8B84
 #define GL_FRAGMENT_SHADER                      0x8B30
 #define GL_VERTEX_SHADER                        0x8B31
+#define GL_COMPUTE_SHADER                       0x91B9
 #define GL_COMPILE_STATUS                       0x8B81
 #define GL_LINK_STATUS                          0x8B82
 
 typedef unsigned int (*FUNC_eglExportDMABUFImageQueryMESA)(EGLDisplay dpy, EGLImageKHR image, int *fourcc, int *num_planes, uint64_t *modifiers);
 typedef unsigned int (*FUNC_eglExportDMABUFImageMESA)(EGLDisplay dpy, EGLImageKHR image, int *fds, int32_t *strides, int32_t *offsets);
 typedef void (*FUNC_glEGLImageTargetTexture2DOES)(unsigned int target, GLeglImageOES image);
+typedef GLXContext (*FUNC_glXCreateContextAttribsARB)(Display *dpy, GLXFBConfig config, GLXContext share_context, Bool direct, const int *attrib_list);
+typedef void (*FUNC_glXSwapIntervalEXT)(Display * dpy, GLXDrawable drawable, int interval);
+typedef int (*FUNC_glXSwapIntervalMESA)(unsigned int interval);
+typedef int (*FUNC_glXSwapIntervalSGI)(int interval);
+typedef void (*GLDEBUGPROC)(unsigned int source, unsigned int type, unsigned int id, unsigned int severity, int length, const char *message, const void *userParam);
+typedef int (*FUNC_eglQueryDisplayAttribEXT)(EGLDisplay dpy, int32_t attribute, intptr_t *value);
+typedef const char* (*FUNC_eglQueryDeviceStringEXT)(void *device, int32_t name);
+typedef int (*FUNC_eglQueryDmaBufModifiersEXT)(EGLDisplay dpy, int32_t format, int32_t max_modifiers, uint64_t *modifiers, int *external_only, int32_t *num_modifiers);
+typedef void (*FUNC_glCreateMemoryObjectsEXT)(int n, unsigned int *memoryObjects);
+typedef void (*FUNC_glImportMemoryFdEXT)(unsigned int memory, uint64_t size, unsigned int handleType, int fd);
+typedef unsigned char (*FUNC_glIsMemoryObjectEXT)(unsigned int memoryObject);
+typedef void (*FUNC_glTexStorageMem2DEXT)(unsigned int target, int levels, unsigned int internalFormat, int width, int height, unsigned int memory, uint64_t offset);
+typedef void (*FUNC_glBufferStorageMemEXT)(unsigned int target, ssize_t size, unsigned int memory, uint64_t offset);
+typedef void (*FUNC_glNamedBufferStorageMemEXT)(unsigned int buffer, ssize_t size, unsigned int memory, uint64_t offset);
+typedef void (*FUNC_glMemoryObjectParameterivEXT)(unsigned int memoryObject, unsigned int pname, const int *params);
+
+typedef enum {
+    GSR_GL_CONTEXT_TYPE_EGL,
+    GSR_GL_CONTEXT_TYPE_GLX
+} gsr_gl_context_type;
 
-#define GSR_MAX_OUTPUTS 32
-
-typedef struct {
-    Display *dpy;
-    Window window;
-} gsr_x11;
-
-typedef struct {
-    uint32_t wl_name;
-    void *output;
-    vec2i pos;
-    vec2i size;
-    char *name;
-} gsr_wayland_output;
-
-typedef struct {
-    void *dpy;
-    void *window;
-    void *registry;
-    void *surface;
-    void *compositor;
-    void *export_manager;
-    void *current_frame;
-    void *frame_callback;
-    gsr_wayland_output outputs[GSR_MAX_OUTPUTS];
-    int num_outputs;
-    gsr_wayland_output *output_to_capture;
-} gsr_wayland;
-
-typedef struct {
+typedef struct gsr_egl gsr_egl;
+struct gsr_egl {
     void *egl_library;
+    void *glx_library;
     void *gl_library;
 
+    gsr_gl_context_type context_type;
+    gsr_window *window;
+
     EGLDisplay egl_display;
     EGLSurface egl_surface;
     EGLContext egl_context;
+    const char *dri_card_path;
 
-    gsr_x11 x11;
-    gsr_wayland wayland;
+    void *glx_context;
+    void *glx_fb_config;
 
-    int fd;
-    uint32_t x;
-    uint32_t y;
-    uint32_t width;
-    uint32_t height;
-    uint32_t pitch;
-    uint32_t offset;
-    uint32_t pixel_format;
-    uint64_t modifier;
+    gsr_gpu_info gpu_info;
+
+    char card_path[128];
 
     int32_t (*eglGetError)(void);
     EGLDisplay (*eglGetDisplay)(EGLNativeDisplayType display_id);
@@ -177,22 +223,55 @@ typedef struct {
     FUNC_eglExportDMABUFImageQueryMESA eglExportDMABUFImageQueryMESA;
     FUNC_eglExportDMABUFImageMESA eglExportDMABUFImageMESA;
     FUNC_glEGLImageTargetTexture2DOES glEGLImageTargetTexture2DOES;
+    FUNC_eglQueryDisplayAttribEXT eglQueryDisplayAttribEXT;
+    FUNC_eglQueryDeviceStringEXT eglQueryDeviceStringEXT;
+    FUNC_eglQueryDmaBufModifiersEXT eglQueryDmaBufModifiersEXT;
+    FUNC_glCreateMemoryObjectsEXT glCreateMemoryObjectsEXT;
+    FUNC_glImportMemoryFdEXT glImportMemoryFdEXT;
+    FUNC_glIsMemoryObjectEXT glIsMemoryObjectEXT;
+    FUNC_glTexStorageMem2DEXT glTexStorageMem2DEXT;
+    FUNC_glBufferStorageMemEXT glBufferStorageMemEXT;
+    FUNC_glNamedBufferStorageMemEXT glNamedBufferStorageMemEXT;
+    FUNC_glMemoryObjectParameterivEXT glMemoryObjectParameterivEXT;
+
+    __GLXextFuncPtr (*glXGetProcAddress)(const unsigned char *procName);
+    GLXFBConfig* (*glXChooseFBConfig)(Display *dpy, int screen, const int *attribList, int *nitems);
+    Bool (*glXMakeContextCurrent)(Display *dpy, GLXDrawable draw, GLXDrawable read, GLXContext ctx);
+    // TODO: Remove
+    GLXContext (*glXCreateNewContext)(Display *dpy, GLXFBConfig config, int renderType, GLXContext shareList, Bool direct);
+    void (*glXDestroyContext)(Display *dpy, GLXContext ctx);
+    void (*glXSwapBuffers)(Display *dpy, GLXDrawable drawable);
+    FUNC_glXCreateContextAttribsARB glXCreateContextAttribsARB;
+
+    /* Optional */
+    FUNC_glXSwapIntervalEXT glXSwapIntervalEXT;
+    FUNC_glXSwapIntervalMESA glXSwapIntervalMESA;
+    FUNC_glXSwapIntervalSGI glXSwapIntervalSGI;
 
     unsigned int (*glGetError)(void);
     const unsigned char* (*glGetString)(unsigned int name);
+    void (*glFlush)(void);
+    void (*glFinish)(void);
     void (*glClear)(unsigned int mask);
     void (*glClearColor)(float red, float green, float blue, float alpha);
     void (*glGenTextures)(int n, unsigned int *textures);
     void (*glDeleteTextures)(int n, const unsigned int *texture);
+    void (*glActiveTexture)(unsigned int texture);
     void (*glBindTexture)(unsigned int target, unsigned int texture);
+    void (*glBindImageTexture)(unsigned int unit, unsigned int texture, int level, unsigned char layered, int layer, unsigned int access, unsigned int format);
     void (*glTexParameteri)(unsigned int target, unsigned int pname, int param);
+    void (*glTexParameteriv)(unsigned int target, unsigned int pname, const int *params);
+    void (*glTexParameterfv)(unsigned int target, unsigned int pname, const float *params);
     void (*glGetTexLevelParameteriv)(unsigned int target, int level, unsigned int pname, int *params);
     void (*glTexImage2D)(unsigned int target, int level, int internalFormat, int width, int height, int border, unsigned int format, unsigned int type, const void *pixels);
-    void (*glCopyImageSubData)(unsigned int srcName, unsigned int srcTarget, int srcLevel, int srcX, int srcY, int srcZ, unsigned int dstName, unsigned int dstTarget, int dstLevel, int dstX, int dstY, int dstZ, int srcWidth, int srcHeight, int srcDepth);
-    void (*glClearTexImage)(unsigned int texture, unsigned int level, unsigned int format, unsigned int type, const void *data);
+    void (*glTexSubImage2D)(unsigned int target, int level, int xoffset, int yoffset, int width, int height, unsigned format, unsigned type, const void *pixels);
+    void (*glTexStorage2D)(unsigned int target, int levels, unsigned int internalformat, int width, int height);
+    void (*glGetTexImage)(unsigned int target, int level, unsigned int format, unsigned int type, void *pixels);
     void (*glGenFramebuffers)(int n, unsigned int *framebuffers);
     void (*glBindFramebuffer)(unsigned int target, unsigned int framebuffer);
     void (*glDeleteFramebuffers)(int n, const unsigned int *framebuffers);
+    void (*glDispatchCompute)(unsigned int num_groups_x, unsigned int num_groups_y, unsigned int num_groups_z);
+    void (*glMemoryBarrier)(unsigned int barriers);
     void (*glViewport)(int x, int y, int width, int height);
     void (*glFramebufferTexture2D)(unsigned int target, unsigned int attachment, unsigned int textarget, unsigned int texture, int level);
     void (*glDrawBuffers)(int n, const unsigned int *bufs);
@@ -223,18 +302,28 @@ typedef struct {
     void (*glEnableVertexAttribArray)(unsigned int index);
     void (*glDrawArrays)(unsigned int mode, int first, int count);
     void (*glEnable)(unsigned int cap);
+    void (*glDisable)(unsigned int cap);
     void (*glBlendFunc)(unsigned int sfactor, unsigned int dfactor);
+    void (*glPixelStorei)(unsigned int pname, int param);
     int (*glGetUniformLocation)(unsigned int program, const char *name);
     void (*glUniform1f)(int location, float v0);
-} gsr_egl;
+    void (*glUniform2f)(int location, float v0, float v1);
+    void (*glUniform1i)(int location, int v0);
+    void (*glUniform2i)(int location, int v0, int v1);
+    void (*glUniformMatrix2fv)(int location, int count, unsigned char transpose, const float *value);
+    void (*glDebugMessageCallback)(GLDEBUGPROC callback, const void *userParam);
+    void (*glScissor)(int x, int y, int width, int height);
+    void (*glCreateBuffers)(int n, unsigned int *buffers);
+    void (*glReadPixels)(int x, int y, int width, int height, unsigned int format, unsigned int type, void *pixels);
+    void* (*glMapBuffer)(unsigned int target, unsigned int access);
+    unsigned char (*glUnmapBuffer)(unsigned int target);
+    void (*glGetIntegerv)(unsigned int pname, int *params);
+};
 
-bool gsr_egl_load(gsr_egl *self, Display *dpy, bool wayland);
+bool gsr_egl_load(gsr_egl *self, gsr_window *window, bool is_monitor_capture, bool enable_debug);
 void gsr_egl_unload(gsr_egl *self);
 
-/* wayland protocol capture, does not include kms capture */
-bool gsr_egl_supports_wayland_capture(gsr_egl *self);
-bool gsr_egl_start_capture(gsr_egl *self, const char *monitor_to_capture);
-void gsr_egl_update(gsr_egl *self);
-void gsr_egl_cleanup_frame(gsr_egl *self);
+/* Does opengl swap with egl or glx, depending on which one is active */
+void gsr_egl_swap_buffers(gsr_egl *self);
 
 #endif /* GSR_EGL_H */
diff --git a/include/encoder/encoder.h b/include/encoder/encoder.h
new file mode 100644
index 0000000..7e550f6
--- /dev/null
+++ b/include/encoder/encoder.h
@@ -0,0 +1,43 @@
+#ifndef GSR_ENCODER_H
+#define GSR_ENCODER_H
+
+#include "../replay_buffer/replay_buffer.h"
+#include <stdbool.h>
+#include <stdint.h>
+#include <stddef.h>
+#include <pthread.h>
+
+#define GSR_MAX_RECORDING_DESTINATIONS 128
+
+typedef struct AVCodecContext AVCodecContext;
+typedef struct AVFormatContext AVFormatContext;
+typedef struct AVStream AVStream;
+
+typedef struct {
+    size_t id;
+    AVCodecContext *codec_context;
+    AVFormatContext *format_context;
+    AVStream *stream;
+    int64_t start_pts;
+    bool has_received_keyframe;
+} gsr_encoder_recording_destination;
+
+typedef struct {
+    gsr_replay_buffer *replay_buffer;
+    pthread_mutex_t file_write_mutex;
+    bool mutex_created;
+
+    gsr_encoder_recording_destination recording_destinations[GSR_MAX_RECORDING_DESTINATIONS];
+    size_t num_recording_destinations;
+    size_t recording_destination_id_counter;
+} gsr_encoder;
+
+bool gsr_encoder_init(gsr_encoder *self, gsr_replay_storage replay_storage, size_t replay_buffer_num_packets, double replay_buffer_time, const char *replay_directory);
+void gsr_encoder_deinit(gsr_encoder *self);
+
+void gsr_encoder_receive_packets(gsr_encoder *self, AVCodecContext *codec_context, int64_t pts, int stream_index);
+/* Returns the id of the recording destination, or (size_t)-1 on error */
+size_t gsr_encoder_add_recording_destination(gsr_encoder *self, AVCodecContext *codec_context, AVFormatContext *format_context, AVStream *stream, int64_t start_pts);
+bool gsr_encoder_remove_recording_destination(gsr_encoder *self, size_t id);
+
+#endif /* GSR_ENCODER_H */
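Note on the recording destination ids in encoder.h: the id is a size_t, so an error value of -1 has to be compared as (size_t)-1. The sketch below shows one way a fixed-size destination list with a running id counter could work. All names here (destination_list, MAX_DESTINATIONS and the functions) are hypothetical stand-ins, and the swap-with-last removal is an assumption rather than the project's code.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DESTINATIONS 4 /* small hypothetical stand-in for GSR_MAX_RECORDING_DESTINATIONS */

typedef struct {
    size_t id;
} destination;

typedef struct {
    destination destinations[MAX_DESTINATIONS];
    size_t num_destinations;
    size_t id_counter; /* never reused, so stale ids cannot match a newer destination */
} destination_list;

/* Returns the new id, or (size_t)-1 if the list is full */
static size_t destination_list_add(destination_list *self) {
    if(self->num_destinations >= MAX_DESTINATIONS)
        return (size_t)-1;

    destination *dest = &self->destinations[self->num_destinations++];
    dest->id = self->id_counter++;
    return dest->id;
}

/* Removes the destination with the given id by overwriting it with the last item */
static bool destination_list_remove(destination_list *self, size_t id) {
    for(size_t i = 0; i < self->num_destinations; ++i) {
        if(self->destinations[i].id == id) {
            self->destinations[i] = self->destinations[self->num_destinations - 1];
            self->num_destinations--;
            return true;
        }
    }
    return false;
}
```

Callers should test the result with `id == (size_t)-1` rather than `id < 0`, since the latter is always false for an unsigned type.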
diff --git a/include/encoder/video/nvenc.h b/include/encoder/video/nvenc.h
new file mode 100644
index 0000000..d4a906b
--- /dev/null
+++ b/include/encoder/video/nvenc.h
@@ -0,0 +1,16 @@
+#ifndef GSR_ENCODER_VIDEO_NVENC_H
+#define GSR_ENCODER_VIDEO_NVENC_H
+
+#include "video.h"
+
+typedef struct gsr_egl gsr_egl;
+
+typedef struct {
+    gsr_egl *egl;
+    bool overclock;
+    gsr_color_depth color_depth;
+} gsr_video_encoder_nvenc_params;
+
+gsr_video_encoder* gsr_video_encoder_nvenc_create(const gsr_video_encoder_nvenc_params *params);
+
+#endif /* GSR_ENCODER_VIDEO_NVENC_H */
diff --git a/include/encoder/video/software.h b/include/encoder/video/software.h
new file mode 100644
index 0000000..fd2dc6b
--- /dev/null
+++ b/include/encoder/video/software.h
@@ -0,0 +1,15 @@
+#ifndef GSR_ENCODER_VIDEO_SOFTWARE_H
+#define GSR_ENCODER_VIDEO_SOFTWARE_H
+
+#include "video.h"
+
+typedef struct gsr_egl gsr_egl;
+
+typedef struct {
+    gsr_egl *egl;
+    gsr_color_depth color_depth;
+} gsr_video_encoder_software_params;
+
+gsr_video_encoder* gsr_video_encoder_software_create(const gsr_video_encoder_software_params *params);
+
+#endif /* GSR_ENCODER_VIDEO_SOFTWARE_H */
diff --git a/include/encoder/video/vaapi.h b/include/encoder/video/vaapi.h
new file mode 100644
index 0000000..b509f17
--- /dev/null
+++ b/include/encoder/video/vaapi.h
@@ -0,0 +1,15 @@
+#ifndef GSR_ENCODER_VIDEO_VAAPI_H
+#define GSR_ENCODER_VIDEO_VAAPI_H
+
+#include "video.h"
+
+typedef struct gsr_egl gsr_egl;
+
+typedef struct {
+    gsr_egl *egl;
+    gsr_color_depth color_depth;
+} gsr_video_encoder_vaapi_params;
+
+gsr_video_encoder* gsr_video_encoder_vaapi_create(const gsr_video_encoder_vaapi_params *params);
+
+#endif /* GSR_ENCODER_VIDEO_VAAPI_H */
diff --git a/include/encoder/video/video.h b/include/encoder/video/video.h
new file mode 100644
index 0000000..7a706b5
--- /dev/null
+++ b/include/encoder/video/video.h
@@ -0,0 +1,30 @@
+#ifndef GSR_ENCODER_VIDEO_H
+#define GSR_ENCODER_VIDEO_H
+
+#include "../../color_conversion.h"
+#include <stdbool.h>
+
+#define GSR_MAX_RECORDING_DESTINATIONS 128
+
+typedef struct gsr_video_encoder gsr_video_encoder;
+typedef struct AVCodecContext AVCodecContext;
+typedef struct AVFrame AVFrame;
+
+struct gsr_video_encoder {
+    bool (*start)(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame);
+    void (*destroy)(gsr_video_encoder *encoder, AVCodecContext *video_codec_context);
+    void (*copy_textures_to_frame)(gsr_video_encoder *encoder, AVFrame *frame, gsr_color_conversion *color_conversion); /* Can be NULL */
+    /* |textures| should be able to fit 2 elements */
+    void (*get_textures)(gsr_video_encoder *encoder, unsigned int *textures, int *num_textures, gsr_destination_color *destination_color);
+
+    void *priv;
+    bool started;
+};
+
+/* Wrappers around the function pointers in |gsr_video_encoder| */
+bool gsr_video_encoder_start(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame);
+void gsr_video_encoder_destroy(gsr_video_encoder *encoder, AVCodecContext *video_codec_context);
+void gsr_video_encoder_copy_textures_to_frame(gsr_video_encoder *encoder, AVFrame *frame, gsr_color_conversion *color_conversion);
+void gsr_video_encoder_get_textures(gsr_video_encoder *encoder, unsigned int *textures, int *num_textures, gsr_destination_color *destination_color);
+
+#endif /* GSR_ENCODER_VIDEO_H */
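Note on the gsr_video_encoder interface in video.h: the struct is a table of function pointers plus a |priv| pointer, and each backend (nvenc, vaapi, software, vulkan) returns one from its create function. The sketch below shows how a wrapper such as gsr_video_encoder_start could dispatch to a backend. The struct is trimmed, the already-started guard is an assumption, and dummy_backend and dummy_start are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct AVCodecContext AVCodecContext;
typedef struct AVFrame AVFrame;
typedef struct gsr_video_encoder gsr_video_encoder;

/* Trimmed copy of the struct from video.h, for illustration only */
struct gsr_video_encoder {
    bool (*start)(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame);
    void *priv;
    bool started;
};

/* Assumed wrapper behavior: forward to the backend and remember whether it succeeded */
static bool gsr_video_encoder_start(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame) {
    if(encoder->started)
        return false; /* hypothetical guard against starting twice */

    encoder->started = encoder->start(encoder, video_codec_context, frame);
    return encoder->started;
}

/* Hypothetical backend that only counts how often it was started */
typedef struct {
    int start_calls;
} dummy_backend;

static bool dummy_start(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame) {
    (void)video_codec_context;
    (void)frame;
    dummy_backend *backend = encoder->priv;
    backend->start_calls++;
    return true;
}
```

Keeping backend state behind |priv| lets the caller hold a single gsr_video_encoder pointer regardless of which backend was created.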
diff --git a/include/encoder/video/vulkan.h b/include/encoder/video/vulkan.h
new file mode 100644
index 0000000..383fc4f
--- /dev/null
+++ b/include/encoder/video/vulkan.h
@@ -0,0 +1,15 @@
+#ifndef GSR_ENCODER_VIDEO_VULKAN_H
+#define GSR_ENCODER_VIDEO_VULKAN_H
+
+#include "video.h"
+
+typedef struct gsr_egl gsr_egl;
+
+typedef struct {
+    gsr_egl *egl;
+    gsr_color_depth color_depth;
+} gsr_video_encoder_vulkan_params;
+
+gsr_video_encoder* gsr_video_encoder_vulkan_create(const gsr_video_encoder_vulkan_params *params);
+
+#endif /* GSR_ENCODER_VIDEO_VULKAN_H */
diff --git a/include/image_writer.h b/include/image_writer.h
new file mode 100644
index 0000000..65e7497
--- /dev/null
+++ b/include/image_writer.h
@@ -0,0 +1,35 @@
+#ifndef GSR_IMAGE_WRITER_H
+#define GSR_IMAGE_WRITER_H
+
+#include <stdbool.h>
+
+typedef struct gsr_egl gsr_egl;
+
+typedef enum {
+    GSR_IMAGE_FORMAT_JPEG,
+    GSR_IMAGE_FORMAT_PNG
+} gsr_image_format;
+
+typedef enum {
+    GSR_IMAGE_WRITER_SOURCE_OPENGL,
+    GSR_IMAGE_WRITER_SOURCE_MEMORY
+} gsr_image_writer_source;
+
+typedef struct {
+    gsr_image_writer_source source;
+    gsr_egl *egl;
+    int width;
+    int height;
+    unsigned int texture;
+    const void *memory; /* Reference */
+} gsr_image_writer;
+
+bool gsr_image_writer_init_opengl(gsr_image_writer *self, gsr_egl *egl, int width, int height);
+/* |memory| is taken as a reference. The data is expected to be in rgba8 format (8 bit rgba) */
+bool gsr_image_writer_init_memory(gsr_image_writer *self, const void *memory, int width, int height);
+void gsr_image_writer_deinit(gsr_image_writer *self);
+
+/* Quality is between 1 and 100 where 100 is the max quality. Quality doesn't apply to lossless formats */
+bool gsr_image_writer_write_to_file(gsr_image_writer *self, const char *filepath, gsr_image_format image_format, int quality);
+
+#endif /* GSR_IMAGE_WRITER_H */
diff --git a/include/pipewire_audio.h b/include/pipewire_audio.h
new file mode 100644
index 0000000..68e5356
--- /dev/null
+++ b/include/pipewire_audio.h
@@ -0,0 +1,156 @@
+#ifndef GSR_PIPEWIRE_AUDIO_H
+#define GSR_PIPEWIRE_AUDIO_H
+
+#include <pipewire/thread-loop.h>
+#include <pipewire/context.h>
+#include <pipewire/core.h>
+#include <spa/utils/hook.h>
+
+#include <stdbool.h>
+
+typedef enum {
+    GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_OUTPUT, /* Application audio */
+    GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_INPUT,  /* Audio recording input */
+    GSR_PIPEWIRE_AUDIO_NODE_TYPE_SINK_OR_SOURCE /* Audio output or input device or combined (virtual) sink */
+} gsr_pipewire_audio_node_type;
+
+typedef struct {
+    uint32_t id;
+    char *name;
+    gsr_pipewire_audio_node_type type;
+} gsr_pipewire_audio_node;
+
+typedef enum {
+    GSR_PIPEWIRE_AUDIO_PORT_DIRECTION_INPUT,
+    GSR_PIPEWIRE_AUDIO_PORT_DIRECTION_OUTPUT
+} gsr_pipewire_audio_port_direction;
+
+typedef struct {
+    uint32_t id;
+    uint32_t node_id;
+    gsr_pipewire_audio_port_direction direction;
+    char *name;
+} gsr_pipewire_audio_port;
+
+typedef struct {
+    uint32_t id;
+    uint32_t output_node_id;
+    uint32_t input_node_id;
+} gsr_pipewire_audio_link;
+
+typedef enum {
+    GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_STREAM, /* Application */
+    GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_SINK    /* Combined (virtual) sink */
+} gsr_pipewire_audio_link_input_type;
+
+typedef enum {
+    GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_STANDARD,
+    GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT,
+    GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_INPUT
+} gsr_pipewire_audio_requested_type;
+
+typedef struct {
+    char *name;
+    gsr_pipewire_audio_requested_type type;
+} gsr_pipewire_audio_requested_output;
+
+typedef struct {
+    gsr_pipewire_audio_requested_output *outputs;
+    int num_outputs;
+    char *input_name;
+    bool inverted;
+    gsr_pipewire_audio_node_type output_type;
+    gsr_pipewire_audio_link_input_type input_type;
+} gsr_pipewire_audio_requested_link;
+
+typedef struct {
+    struct pw_thread_loop *thread_loop;
+    struct pw_context *context;
+    struct pw_core *core;
+    struct spa_hook core_listener;
+    struct pw_registry *registry;
+    struct spa_hook registry_listener;
+    int server_version_sync;
+
+    struct pw_proxy *metadata_proxy;
+    struct spa_hook metadata_listener;
+    struct spa_hook metadata_proxy_listener;
+    char default_output_device_name[128];
+    char default_input_device_name[128];
+
+    gsr_pipewire_audio_node *stream_nodes;
+    size_t num_stream_nodes;
+    size_t stream_nodes_capacity_items;
+
+    gsr_pipewire_audio_port *ports;
+    size_t num_ports;
+    size_t ports_capacity_items;
+
+    gsr_pipewire_audio_link *links;
+    size_t num_links;
+    size_t links_capacity_items;
+
+    gsr_pipewire_audio_requested_link *requested_links;
+    size_t num_requested_links;
+    size_t requested_links_capacity_items;
+
+    struct pw_proxy **virtual_sink_proxies;
+    size_t num_virtual_sink_proxies;
+    size_t virtual_sink_proxies_capacity_items;
+} gsr_pipewire_audio;
+
+bool gsr_pipewire_audio_init(gsr_pipewire_audio *self);
+void gsr_pipewire_audio_deinit(gsr_pipewire_audio *self);
+
+bool gsr_pipewire_audio_create_virtual_sink(gsr_pipewire_audio *self, const char *name);
+
+/*
+    This function links audio source outputs from applications whose name matches one of |app_names| to the input
+    that matches the name |stream_name_input|.
+    If a matching application starts outputting audio after this function is called
+    then its audio sources are linked automatically.
+    |app_names| and |stream_name_input| are matched case-insensitively.
+*/
+bool gsr_pipewire_audio_add_link_from_apps_to_stream(gsr_pipewire_audio *self, const char **app_names, int num_app_names, const char *stream_name_input);
+/*
+    This function links audio source outputs from all applications except the ones whose name matches one of |app_names| to the input
+    that matches the name |stream_name_input|.
+    If a non-matching application starts outputting audio after this function is called
+    then its audio sources are linked automatically.
+    |app_names| and |stream_name_input| are matched case-insensitively.
+*/
+bool gsr_pipewire_audio_add_link_from_apps_to_stream_inverted(gsr_pipewire_audio *self, const char **app_names, int num_app_names, const char *stream_name_input);
+
+/*
+    This function links audio source outputs from applications whose name matches one of |app_names| to the input
+    that matches the name |sink_name_input|.
+    If a matching application starts outputting audio after this function is called
+    then its audio sources are linked automatically.
+    |app_names| and |sink_name_input| are matched case-insensitively.
+*/
+bool gsr_pipewire_audio_add_link_from_apps_to_sink(gsr_pipewire_audio *self, const char **app_names, int num_app_names, const char *sink_name_input);
+/*
+    This function links audio source outputs from all applications except the ones that match the name |app_names| to the input
+    that matches the name |sink_name_input|.
+    If an application or a new application starts outputting audio after this function is called and the app name doesn't match
+    then it will automatically link the audio sources.
+    |app_names| and |sink_name_input| are case-insensitive matches.
+*/
+bool gsr_pipewire_audio_add_link_from_apps_to_sink_inverted(gsr_pipewire_audio *self, const char **app_names, int num_app_names, const char *sink_name_input);
+
+/*
+    This function links audio source outputs from devices that match the name |source_names| to the input
+    that matches the name |sink_name_input|.
+    If a device or a new device starts outputting audio after this function is called and the device name matches
+    then it will automatically link the audio sources.
+    |source_names| and |sink_name_input| are case-insensitive matches.
+    |source_names| can include "default_output" or "default_input" to use the default output/input
+    and it will automatically switch when the default output/input is changed in system audio settings.
+*/
+bool gsr_pipewire_audio_add_link_from_sources_to_sink(gsr_pipewire_audio *self, const char **source_names, int num_source_names, const char *sink_name_input);
+
+/* Return true to continue */
+typedef bool (*gsr_pipewire_audio_app_query_callback)(const char *app_name, void *userdata);
+void gsr_pipewire_audio_for_each_app(gsr_pipewire_audio *self, gsr_pipewire_audio_app_query_callback callback, void *userdata);
+
+#endif /* GSR_PIPEWIRE_AUDIO_H */
diff --git a/include/pipewire_video.h b/include/pipewire_video.h
new file mode 100644
index 0000000..785f56f
--- /dev/null
+++ b/include/pipewire_video.h
@@ -0,0 +1,113 @@
+#ifndef GSR_PIPEWIRE_VIDEO_H
+#define GSR_PIPEWIRE_VIDEO_H
+
+#include <stdbool.h>
+#include <stdint.h>
+#include <pthread.h>
+
+#include <spa/utils/hook.h>
+#include <spa/param/video/format.h>
+
+#define GSR_PIPEWIRE_VIDEO_MAX_MODIFIERS 1024
+#define GSR_PIPEWIRE_VIDEO_MAX_VIDEO_FORMATS 6
+#define GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES 4
+
+typedef struct gsr_egl gsr_egl;
+
+typedef struct {
+    int major;
+    int minor;
+    int micro;
+} gsr_pipewire_video_data_version;
+
+typedef struct {
+    uint32_t fps_num;
+    uint32_t fps_den;
+} gsr_pipewire_video_video_info;
+
+typedef struct {
+    int fd;
+    uint32_t offset;
+    int32_t stride;
+} gsr_pipewire_video_dmabuf_data;
+
+typedef struct {
+    int x, y;
+    int width, height;
+} gsr_pipewire_video_region;
+
+typedef struct {
+    enum spa_video_format format;
+    size_t modifiers_index;
+    size_t modifiers_size;
+} gsr_video_format;
+
+typedef struct {
+    unsigned int texture_id;
+    unsigned int external_texture_id;
+    unsigned int cursor_texture_id;
+} gsr_texture_map;
+
+typedef struct {
+    gsr_egl *egl;
+    int fd;
+    uint32_t node;
+    pthread_mutex_t mutex;
+    bool mutex_initialized;
+
+    struct pw_thread_loop *thread_loop;
+    struct pw_context *context;
+    struct pw_core *core;
+    struct spa_hook core_listener;
+    struct pw_stream *stream;
+    struct spa_hook stream_listener;
+    struct spa_source *reneg;
+    struct spa_video_info format;
+    int server_version_sync;
+    bool negotiated;
+    bool renegotiated;
+    bool damaged;
+
+    struct {
+        bool visible;
+        bool valid;
+        uint8_t *data;
+        int x, y;
+        int hotspot_x, hotspot_y;
+        int width, height;
+    } cursor;
+
+    struct {
+        bool valid;
+        int x, y;
+        uint32_t width, height;
+    } crop;
+
+    gsr_video_format supported_video_formats[GSR_PIPEWIRE_VIDEO_MAX_VIDEO_FORMATS];
+
+    gsr_pipewire_video_data_version server_version;
+    gsr_pipewire_video_video_info video_info;
+    gsr_pipewire_video_dmabuf_data dmabuf_data[GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES];
+    size_t dmabuf_num_planes;
+
+    bool no_modifiers_fallback;
+    bool external_texture_fallback;
+
+    uint64_t modifiers[GSR_PIPEWIRE_VIDEO_MAX_MODIFIERS];
+    size_t num_modifiers;
+} gsr_pipewire_video;
+
+/*
+    |capture_cursor| only applies when capturing a window or region.
+    In other cases |pipewire_node|'s setup will determine if the cursor is included.
+    Note that the cursor is not guaranteed to be shown even if set to true; it depends on the Wayland compositor.
+*/
+bool gsr_pipewire_video_init(gsr_pipewire_video *self, int pipewire_fd, uint32_t pipewire_node, int fps, bool capture_cursor, gsr_egl *egl);
+void gsr_pipewire_video_deinit(gsr_pipewire_video *self);
+
+/* |dmabuf_data| should be at least GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES in size */
+bool gsr_pipewire_video_map_texture(gsr_pipewire_video *self, gsr_texture_map texture_map, gsr_pipewire_video_region *region, gsr_pipewire_video_region *cursor_region, gsr_pipewire_video_dmabuf_data *dmabuf_data, int *num_dmabuf_data, uint32_t *fourcc, uint64_t *modifiers, bool *using_external_image);
+bool gsr_pipewire_video_is_damaged(gsr_pipewire_video *self);
+void gsr_pipewire_video_clear_damage(gsr_pipewire_video *self);
+
+#endif /* GSR_PIPEWIRE_VIDEO_H */
diff --git a/include/replay_buffer/replay_buffer.h b/include/replay_buffer/replay_buffer.h
new file mode 100644
index 0000000..a04a3be
--- /dev/null
+++ b/include/replay_buffer/replay_buffer.h
@@ -0,0 +1,54 @@
+#ifndef GSR_REPLAY_BUFFER_H
+#define GSR_REPLAY_BUFFER_H
+
+#include "../defs.h"
+#include <pthread.h>
+#include <stdbool.h>
+#include <libavcodec/packet.h>
+
+typedef struct gsr_replay_buffer gsr_replay_buffer;
+
+typedef struct {
+    size_t packet_index;
+    size_t file_index;
+} gsr_replay_buffer_iterator;
+
+struct gsr_replay_buffer {
+    void (*destroy)(gsr_replay_buffer *self);
+    bool (*append)(gsr_replay_buffer *self, const AVPacket *av_packet, double timestamp);
+    void (*clear)(gsr_replay_buffer *self);
+    AVPacket* (*iterator_get_packet)(gsr_replay_buffer *self, gsr_replay_buffer_iterator iterator);
+    /* The returned data should be free'd with free */
+    uint8_t* (*iterator_get_packet_data)(gsr_replay_buffer *self, gsr_replay_buffer_iterator iterator);
+    /* The clone has to be destroyed before the replay buffer it clones is destroyed */
+    gsr_replay_buffer* (*clone)(gsr_replay_buffer *self);
+    /* Returns {0, 0} if replay buffer is empty */
+    gsr_replay_buffer_iterator (*find_packet_index_by_time_passed)(gsr_replay_buffer *self, int seconds);
+    /* Returns {(size_t)-1, 0} if not found */
+    gsr_replay_buffer_iterator (*find_keyframe)(gsr_replay_buffer *self, gsr_replay_buffer_iterator start_iterator, int stream_index, bool invert_stream_index);
+    bool (*iterator_next)(gsr_replay_buffer *self, gsr_replay_buffer_iterator *iterator);
+
+    pthread_mutex_t mutex;
+    bool mutex_initialized;
+    gsr_replay_buffer *original_replay_buffer;
+};
+
+gsr_replay_buffer* gsr_replay_buffer_create(gsr_replay_storage replay_storage, const char *replay_directory, double replay_buffer_time, size_t replay_buffer_num_packets);
+void gsr_replay_buffer_destroy(gsr_replay_buffer *self);
+
+void gsr_replay_buffer_lock(gsr_replay_buffer *self);
+void gsr_replay_buffer_unlock(gsr_replay_buffer *self);
+bool gsr_replay_buffer_append(gsr_replay_buffer *self, const AVPacket *av_packet, double timestamp);
+void gsr_replay_buffer_clear(gsr_replay_buffer *self);
+AVPacket* gsr_replay_buffer_iterator_get_packet(gsr_replay_buffer *self, gsr_replay_buffer_iterator iterator);
+/* The returned data should be free'd with free */
+uint8_t* gsr_replay_buffer_iterator_get_packet_data(gsr_replay_buffer *self, gsr_replay_buffer_iterator iterator);
+/* The clone has to be destroyed before the replay buffer it clones is destroyed */
+gsr_replay_buffer* gsr_replay_buffer_clone(gsr_replay_buffer *self);
+/* Returns {0, 0} if replay buffer is empty */
+gsr_replay_buffer_iterator gsr_replay_buffer_find_packet_index_by_time_passed(gsr_replay_buffer *self, int seconds);
+/* Returns {(size_t)-1, 0} if not found */
+gsr_replay_buffer_iterator gsr_replay_buffer_find_keyframe(gsr_replay_buffer *self, gsr_replay_buffer_iterator start_iterator, int stream_index, bool invert_stream_index);
+bool gsr_replay_buffer_iterator_next(gsr_replay_buffer *self, gsr_replay_buffer_iterator *iterator);
+
+#endif /* GSR_REPLAY_BUFFER_H */
+\ No newline at end of file
diff --git a/include/replay_buffer/replay_buffer_disk.h b/include/replay_buffer/replay_buffer_disk.h
new file mode 100644
index 0000000..6873bb0
--- /dev/null
+++ b/include/replay_buffer/replay_buffer_disk.h
@@ -0,0 +1,44 @@
+#ifndef GSR_REPLAY_BUFFER_DISK_H
+#define GSR_REPLAY_BUFFER_DISK_H
+
+#include "replay_buffer.h"
+#include <limits.h>
+
+#define GSR_REPLAY_BUFFER_CAPACITY_NUM_FILES 1024
+
+typedef struct {
+    AVPacket packet;
+    size_t data_index;
+    double timestamp;
+} gsr_av_packet_disk;
+
+typedef struct {
+    size_t id;
+    double start_timestamp;
+    double end_timestamp;
+    int ref_counter;
+    int fd;
+    
+    gsr_av_packet_disk *packets;
+    size_t capacity_num_packets;
+    size_t num_packets;
+} gsr_replay_buffer_file;
+
+typedef struct {
+    gsr_replay_buffer replay_buffer;
+    double replay_buffer_time;
+
+    size_t storage_counter;
+    size_t storage_num_bytes_written;
+    int storage_fd;
+    gsr_replay_buffer_file *files[GSR_REPLAY_BUFFER_CAPACITY_NUM_FILES]; // GSR_REPLAY_BUFFER_CAPACITY_NUM_FILES * REPLAY_BUFFER_FILE_SIZE_BYTES = 256gb, should be enough for everybody
+    size_t num_files;
+
+    char replay_directory[PATH_MAX];
+
+    bool owns_directory;
+} gsr_replay_buffer_disk;
+
+gsr_replay_buffer* gsr_replay_buffer_disk_create(const char *replay_directory, double replay_buffer_time);
+
+#endif /* GSR_REPLAY_BUFFER_DISK_H */
+\ No newline at end of file
diff --git a/include/replay_buffer/replay_buffer_ram.h b/include/replay_buffer/replay_buffer_ram.h
new file mode 100644
index 0000000..a43d1b9
--- /dev/null
+++ b/include/replay_buffer/replay_buffer_ram.h
@@ -0,0 +1,22 @@
+#ifndef GSR_REPLAY_BUFFER_RAM_H
+#define GSR_REPLAY_BUFFER_RAM_H
+
+#include "replay_buffer.h"
+
+typedef struct {
+    AVPacket packet;
+    int ref_counter;
+    double timestamp;
+} gsr_av_packet_ram;
+
+typedef struct {
+    gsr_replay_buffer replay_buffer;
+    gsr_av_packet_ram **packets;
+    size_t capacity_num_packets;
+    size_t num_packets;
+    size_t index;
+} gsr_replay_buffer_ram;
+
+gsr_replay_buffer* gsr_replay_buffer_ram_create(size_t replay_buffer_num_packets);
+
+#endif /* GSR_REPLAY_BUFFER_RAM_H */
+\ No newline at end of file
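The RAM replay buffer struct carries `capacity_num_packets`, `num_packets` and `index`, which suggests a fixed-capacity circular buffer, though the header does not show the implementation. The sketch below illustrates that general technique and a time-based lookup in the spirit of `find_packet_index_by_time_passed`. It stores plain timestamps instead of packets, assumes `index` marks the oldest item, and assumes timestamps are appended in non-decreasing order. All `sketch_` names and `SKETCH_CAPACITY` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CAPACITY 4

typedef struct {
    double timestamps[SKETCH_CAPACITY]; /* stand-in for stored packets */
    size_t num_items;
    size_t index; /* physical position of the oldest item (assumption) */
} sketch_ring;

/* Append an item, overwriting the oldest one when the ring is full */
static void sketch_ring_append(sketch_ring *self, double timestamp) {
    assert(self != NULL);
    if(self->num_items < SKETCH_CAPACITY) {
        self->timestamps[(self->index + self->num_items) % SKETCH_CAPACITY] = timestamp;
        ++self->num_items;
    } else {
        self->timestamps[self->index] = timestamp;
        self->index = (self->index + 1) % SKETCH_CAPACITY;
    }
}

/* Logical index 0 is the oldest item, num_items - 1 is the newest */
static double sketch_ring_get(const sketch_ring *self, size_t logical_index) {
    assert(logical_index < self->num_items);
    return self->timestamps[(self->index + logical_index) % SKETCH_CAPACITY];
}

/* Binary search for the first item that is at most |seconds| older than |now|.
   Returns 0 if the ring is empty and num_items if every item is too old. */
static size_t sketch_ring_find_by_time_passed(const sketch_ring *self, double now, int seconds) {
    const double threshold = now - (double)seconds;
    size_t lower = 0;
    size_t upper = self->num_items;
    while(lower < upper) {
        const size_t mid = lower + (upper - lower) / 2;
        if(sketch_ring_get(self, mid) < threshold)
            lower = mid + 1;
        else
            upper = mid;
    }
    return lower;
}
```

A caller saving the last N seconds would start at the returned logical index and, for video, advance to the next keyframe before writing packets out.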
diff --git a/include/shader.h b/include/shader.h
index 37f4c09..285758d 100644
--- a/include/shader.h
+++ b/include/shader.h
@@ -1,7 +1,9 @@
 #ifndef GSR_SHADER_H
 #define GSR_SHADER_H
 
-#include "egl.h"
+#include <stdbool.h>
+
+typedef struct gsr_egl gsr_egl;
 
 typedef struct {
     gsr_egl *egl;
@@ -9,11 +11,13 @@ typedef struct {
 } gsr_shader;
 
 /* |vertex_shader| or |fragment_shader| may be NULL */
-int gsr_shader_init(gsr_shader *self, gsr_egl *egl, const char *vertex_shader, const char *fragment_shader);
+int gsr_shader_init(gsr_shader *self, gsr_egl *egl, const char *vertex_shader, const char *fragment_shader, const char *compute_shader);
 void gsr_shader_deinit(gsr_shader *self);
 
 int gsr_shader_bind_attribute_location(gsr_shader *self, const char *attribute, int location);
 void gsr_shader_use(gsr_shader *self);
 void gsr_shader_use_none(gsr_shader *self);
 
+void gsr_shader_enable_debug_output(bool enable);
+
 #endif /* GSR_SHADER_H */
diff --git a/include/sound.hpp b/include/sound.hpp
index 6873e90..87e2e2d 100644
--- a/include/sound.hpp
+++ b/include/sound.hpp
@@ -26,12 +26,30 @@ typedef struct {
     unsigned int frames;
 } SoundDevice;
 
-struct AudioInput {
+struct AudioDevice {
     std::string name;
     std::string description;
 };
 
+struct AudioDevices {
+    std::string default_output;
+    std::string default_input;
+    std::vector<AudioDevice> audio_inputs;
+};
+
+enum class AudioInputType {
+    DEVICE,
+    APPLICATION
+};
+
+struct AudioInput {
+    std::string name;
+    AudioInputType type = AudioInputType::DEVICE;
+    bool inverted = false;
+};
+
 struct MergedAudioInputs {
+    std::string track_name;
     std::vector<AudioInput> audio_inputs;
 };
 
@@ -42,9 +60,10 @@ typedef enum {
 } AudioFormat;
 
 /*
-    Get a sound device by name, returning the device into the @device parameter.
-    The device should be closed with @sound_device_close after it has been used
-    to clean up internal resources.
+    Get a sound device by name, returning the device into the |device| parameter.
+    |device_name| can be a device name or "default_output" or "default_input".
+    If the device name is "default_output" or "default_input" then it will automatically switch which
+    device it records from when the default output/input is changed in the system audio settings.
     Returns 0 on success, or a negative value on failure.
 */
 int sound_device_get_by_name(SoundDevice *device, const char *device_name, const char *description, unsigned int num_channels, unsigned int period_frame_size, AudioFormat audio_format);
@@ -55,8 +74,9 @@ void sound_device_close(SoundDevice *device);
     Returns the next chunk of audio into @buffer.
     Returns the number of frames read, or a negative value on failure.
 */
-int sound_device_read_next_chunk(SoundDevice *device, void **buffer);
+int sound_device_read_next_chunk(SoundDevice *device, void **buffer, double timeout_sec, double *latency_seconds);
 
-std::vector<AudioInput> get_pulseaudio_inputs();
+AudioDevices get_pulseaudio_inputs();
+bool pulseaudio_server_is_pipewire();
 
 #endif /* GPU_SCREEN_RECORDER_H */
diff --git a/include/utils.h b/include/utils.h
index 392591a..74ccf18 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -3,36 +3,23 @@
 
 #include "vec2.h"
 #include "../include/egl.h"
+#include "../include/defs.h"
 #include <stdbool.h>
 #include <stdint.h>
-#include <X11/extensions/Xrandr.h>
 
-typedef enum {
-    GSR_GPU_VENDOR_AMD,
-    GSR_GPU_VENDOR_INTEL,
-    GSR_GPU_VENDOR_NVIDIA
-} gsr_gpu_vendor;
-
-typedef struct {
-    gsr_gpu_vendor vendor;
-    int gpu_version; /* 0 if unknown */
-} gsr_gpu_info;
+typedef struct AVCodecContext AVCodecContext;
+typedef struct AVFrame AVFrame;
 
 typedef struct {
     const char *name;
     int name_len;
-    vec2i pos;
+    vec2i pos; /* This is 0, 0 on wayland. Use |drm_monitor_get_display_server_data| to get the position */
     vec2i size;
-    XRRCrtcInfo *crt_info; /* Only on x11 */
-    uint32_t connector_id; /* Only on drm */
+    uint32_t connector_id; /* Only on x11 and drm */
+    gsr_monitor_rotation rotation; /* Only on x11 and wayland */
+    uint32_t monitor_identifier; /* On x11 this is the crtc id */
 } gsr_monitor;
 
-typedef enum {
-    GSR_CONNECTION_X11,
-    GSR_CONNECTION_WAYLAND,
-    GSR_CONNECTION_DRM
-} gsr_connection_type;
-
 typedef struct {
     const char *name;
     int name_len;
@@ -41,18 +28,34 @@ typedef struct {
 } get_monitor_by_name_userdata;
 
 double clock_get_monotonic_seconds(void);
+bool generate_random_characters(char *buffer, int buffer_size, const char *alphabet, size_t alphabet_size);
+bool generate_random_characters_standard_alphabet(char *buffer, int buffer_size);
 
 typedef void (*active_monitor_callback)(const gsr_monitor *monitor, void *userdata);
-void for_each_active_monitor_output(void *connection, gsr_connection_type connection_type, active_monitor_callback callback, void *userdata);
-bool get_monitor_by_name(void *connection, gsr_connection_type connection_type, const char *name, gsr_monitor *monitor);
+void for_each_active_monitor_output_x11_not_cached(Display *display, active_monitor_callback callback, void *userdata);
+void for_each_active_monitor_output(const gsr_window *window, const char *card_path, gsr_connection_type connection_type, active_monitor_callback callback, void *userdata);
+bool get_monitor_by_name(const gsr_egl *egl, gsr_connection_type connection_type, const char *name, gsr_monitor *monitor);
+bool drm_monitor_get_display_server_data(const gsr_window *window, const gsr_monitor *monitor, gsr_monitor_rotation *monitor_rotation, vec2i *monitor_position);
+
+int get_connector_type_by_name(const char *name);
+int get_connector_type_id_by_name(const char *name);
+uint32_t monitor_identifier_from_type_and_count(int monitor_type_index, int monitor_type_count);
 
 bool gl_get_gpu_info(gsr_egl *egl, gsr_gpu_info *info);
 
+bool try_card_has_valid_plane(const char *card_path);
 /* |output| should be at least 128 bytes in size */
-bool gsr_get_valid_card_path(char *output);
+bool gsr_get_valid_card_path(gsr_egl *egl, char *output, bool is_monitor_capture);
 /* |render_path| should be at least 128 bytes in size */
 bool gsr_card_path_get_render_path(const char *card_path, char *render_path);
 
-int even_number_ceil(int value);
+int create_directory_recursive(char *path);
+
+/* |img_attr| needs to be at least 44 in size */
+void setup_dma_buf_attrs(intptr_t *img_attr, uint32_t format, uint32_t width, uint32_t height, const int *fds, const uint32_t *offsets, const uint32_t *pitches, const uint64_t *modifiers, int num_planes, bool use_modifier);
+
+vec2i scale_keep_aspect_ratio(vec2i from, vec2i to);
+
+unsigned int gl_create_texture(gsr_egl *egl, int width, int height, int internal_format, unsigned int format, int filter);
 
 #endif /* GSR_UTILS_H */
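utils.h now exports `generate_random_characters`, whose body this same change removes from kms_client.c. The sketch below follows that removed body (read random bytes from /dev/urandom, then map each byte onto the alphabet) and adds a loop for short reads, which is an addition of this sketch rather than something the original did. The standard-alphabet wrapper is assumed to pass the 62-character alphabet that the old call site in `create_socket_path` used. The `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Fill |buffer| with |buffer_size| characters drawn from |alphabet|.
   The caller is responsible for the null terminator. */
static bool sketch_generate_random_characters(char *buffer, int buffer_size, const char *alphabet, size_t alphabet_size) {
    assert(buffer != NULL && alphabet != NULL);
    if(buffer_size < 0 || alphabet_size == 0)
        return false;

    int fd = open("/dev/urandom", O_RDONLY);
    if(fd == -1) {
        perror("/dev/urandom");
        return false;
    }

    /* read() may return fewer bytes than requested, so loop until the buffer is full */
    int bytes_read = 0;
    while(bytes_read < buffer_size) {
        ssize_t n = read(fd, buffer + bytes_read, (size_t)(buffer_size - bytes_read));
        if(n <= 0) {
            close(fd);
            return false;
        }
        bytes_read += (int)n;
    }
    close(fd);

    /* Map each random byte onto the alphabet (slightly biased when 256 % alphabet_size != 0) */
    for(int i = 0; i < buffer_size; ++i) {
        unsigned char c = (unsigned char)buffer[i];
        buffer[i] = alphabet[c % alphabet_size];
    }
    return true;
}

/* Assumed behavior of the standard-alphabet variant: A-Z, a-z, 0-9 */
static bool sketch_generate_random_characters_standard_alphabet(char *buffer, int buffer_size) {
    return sketch_generate_random_characters(buffer, buffer_size,
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789", 62);
}
```

The modulo mapping is fine for generating a unique socket path suffix, but it is not uniform and should not be treated as a source of secrets.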
diff --git a/include/vec2.h b/include/vec2.h
index 3e33cfb..8fd3858 100644
--- a/include/vec2.h
+++ b/include/vec2.h
@@ -9,4 +9,8 @@ typedef struct {
     float x, y;
 } vec2f;
 
+typedef struct {
+    double x, y;
+} vec2d;
+
 #endif /* VEC2_H */
diff --git a/include/window/wayland.h b/include/window/wayland.h
new file mode 100644
index 0000000..3535b0f
--- /dev/null
+++ b/include/window/wayland.h
@@ -0,0 +1,8 @@
+#ifndef GSR_WINDOW_WAYLAND_H
+#define GSR_WINDOW_WAYLAND_H
+
+#include "window.h"
+
+gsr_window* gsr_window_wayland_create(void);
+
+#endif /* GSR_WINDOW_WAYLAND_H */
diff --git a/include/window/window.h b/include/window/window.h
new file mode 100644
index 0000000..7839f6a
--- /dev/null
+++ b/include/window/window.h
@@ -0,0 +1,37 @@
+#ifndef GSR_WINDOW_H
+#define GSR_WINDOW_H
+
+#include "../utils.h"
+#include <stdbool.h>
+
+typedef union _XEvent XEvent;
+typedef struct gsr_window gsr_window;
+
+typedef enum {
+    GSR_DISPLAY_SERVER_X11,
+    GSR_DISPLAY_SERVER_WAYLAND
+} gsr_display_server;
+
+struct gsr_window {
+    void (*destroy)(gsr_window *self);
+    /* Returns true if an event is available */
+    bool (*process_event)(gsr_window *self);
+    XEvent* (*get_event_data)(gsr_window *self); /* can be NULL */
+    gsr_display_server (*get_display_server)(void);
+    void* (*get_display)(gsr_window *self);
+    void* (*get_window)(gsr_window *self);
+    void (*for_each_active_monitor_output_cached)(const gsr_window *self, active_monitor_callback callback, void *userdata);
+    void *priv;
+};
+
+void gsr_window_destroy(gsr_window *self);
+
+/* Returns true if an event is available */
+bool gsr_window_process_event(gsr_window *self);
+XEvent* gsr_window_get_event_data(gsr_window *self);
+gsr_display_server gsr_window_get_display_server(const gsr_window *self);
+void* gsr_window_get_display(gsr_window *self);
+void* gsr_window_get_window(gsr_window *self);
+void gsr_window_for_each_active_monitor_output_cached(const gsr_window *self, active_monitor_callback callback, void *userdata);
+
+#endif /* GSR_WINDOW_H */
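window.h defines `gsr_window` as a struct of function pointers plus a `priv` pointer, with free functions such as `gsr_window_process_event` in front of it, while x11.h and wayland.h each expose a constructor that returns a `gsr_window*`. The sketch below shows how such an interface is commonly wired up in C: a backend fills in the function pointers and the wrappers dispatch through them. It is a reduced stand-in with a made-up backend, not the project's implementation, and every `sketch_` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct sketch_window sketch_window;

/* Interface: function pointers filled in by a backend, plus backend-private state */
struct sketch_window {
    void (*destroy)(sketch_window *self);
    bool (*process_event)(sketch_window *self);
    void *priv;
};

/* Private state of a made-up backend that just counts down pending events */
typedef struct {
    int pending_events;
} sketch_fake_backend;

static void sketch_fake_destroy(sketch_window *self) {
    free(self->priv);
    free(self);
}

static bool sketch_fake_process_event(sketch_window *self) {
    sketch_fake_backend *backend = self->priv;
    if(backend->pending_events <= 0)
        return false;
    --backend->pending_events;
    return true;
}

/* Backend constructor: allocates the interface and wires up the function pointers */
static sketch_window* sketch_window_fake_create(int pending_events) {
    sketch_window *window = calloc(1, sizeof(*window));
    sketch_fake_backend *backend = calloc(1, sizeof(*backend));
    if(!window || !backend) {
        free(window);
        free(backend);
        return NULL;
    }
    backend->pending_events = pending_events;
    window->destroy = sketch_fake_destroy;
    window->process_event = sketch_fake_process_event;
    window->priv = backend;
    return window;
}

/* Wrappers: callers use these and never touch the function pointers directly */
static void sketch_window_destroy(sketch_window *self) {
    assert(self != NULL);
    self->destroy(self);
}

static bool sketch_window_process_event(sketch_window *self) {
    assert(self != NULL);
    return self->process_event(self);
}
```

Code written against the wrappers works unchanged with any backend, which is the point of routing the X11 and Wayland windows through one struct.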
diff --git a/include/window/x11.h b/include/window/x11.h
new file mode 100644
index 0000000..e0c2948
--- /dev/null
+++ b/include/window/x11.h
@@ -0,0 +1,10 @@
+#ifndef GSR_WINDOW_X11_H
+#define GSR_WINDOW_X11_H
+
+#include "window.h"
+
+typedef struct _XDisplay Display;
+
+gsr_window* gsr_window_x11_create(Display *display);
+
+#endif /* GSR_WINDOW_X11_H */
diff --git a/include/window_texture.h b/include/window_texture.h
index 75bb2a7..6ee5df4 100644
--- a/include/window_texture.h
+++ b/include/window_texture.h
@@ -7,6 +7,7 @@ typedef struct {
     Display *display;
     Window window;
     Pixmap pixmap;
+    EGLImage image;
     unsigned int texture_id;
     int redirected;
     gsr_egl *egl;
diff --git a/install.sh b/install.sh
index 0d413ee..2a1abf8 100755
--- a/install.sh
+++ b/install.sh
@@ -5,18 +5,11 @@ cd "$script_dir"
 
 [ $(id -u) -ne 0 ] && echo "You need root privileges to run the install script" && exit 1
 
-./build.sh
-strip gsr-kms-server
-strip gpu-screen-recorder
+echo "Warning: this install.sh script is deprecated. Use meson directly instead if possible"
 
-install -Dm755 "gsr-kms-server" "/usr/bin/gsr-kms-server"
-install -Dm755 "gpu-screen-recorder" "/usr/bin/gpu-screen-recorder"
-if [ -d "/usr/lib/systemd/user" ]; then
-    install -Dm644 "extra/gpu-screen-recorder.service" "/usr/lib/systemd/user/gpu-screen-recorder.service"
-fi
-# Not necessary, but removes the password prompt when trying to record a monitor on amd/intel or nvidia wayland
-setcap cap_sys_admin+ep /usr/bin/gsr-kms-server
-# Not ncessary, but allows use of EGL_CONTEXT_PRIORITY_LEVEL_IMG which might decrease performance impact on the system
-setcap cap_sys_nice+ep /usr/bin/gpu-screen-recorder
+rm -rf build
+meson setup build
+meson configure --prefix=/usr --buildtype=release -Dsystemd=true -Dstrip=true build
+ninja -C build install
 
 echo "Successfully installed gpu-screen-recorder"
diff --git a/kms/client/kms_client.c b/kms/client/kms_client.c
index 28922ff..57afd04 100644
--- a/kms/client/kms_client.c
+++ b/kms/client/kms_client.c
@@ -1,4 +1,5 @@
 #include "kms_client.h"
+#include "../../include/utils.h"
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
@@ -10,6 +11,8 @@
 #include <sys/socket.h>
 #include <sys/un.h>
 #include <sys/wait.h>
+#include <poll.h>
+#include <sys/stat.h>
 #include <sys/capability.h>
 
 #define GSR_SOCKET_PAIR_LOCAL  0
@@ -18,34 +21,18 @@
 static void cleanup_socket(gsr_kms_client *self, bool kill_server);
 static int gsr_kms_client_replace_connection(gsr_kms_client *self);
 
-static bool generate_random_characters(char *buffer, int buffer_size, const char *alphabet, size_t alphabet_size) {
-    int fd = open("/dev/urandom", O_RDONLY);
-    if(fd == -1) {
-        perror("/dev/urandom");
-        return false;
-    }
-
-    if(read(fd, buffer, buffer_size) < buffer_size) {
-        fprintf(stderr, "Failed to read %d bytes from /dev/urandom\n", buffer_size);
-        close(fd);
-        return false;
-    }
-
-    for(int i = 0; i < buffer_size; ++i) {
-        unsigned char c = *(unsigned char*)&buffer[i];
-        buffer[i] = alphabet[c % alphabet_size];
-    }
-
-    close(fd);
-    return true;
-}
-
 static void close_fds(gsr_kms_response *response) {
-    for(int i = 0; i < response->num_fds; ++i) {
-        if(response->fds[i].fd > 0)
-            close(response->fds[i].fd);
-        response->fds[i].fd = 0;
+    for(int i = 0; i < response->num_items; ++i) {
+        for(int j = 0; j < response->items[i].num_dma_bufs; ++j) {
+            gsr_kms_response_dma_buf *dma_buf = &response->items[i].dma_buf[j];
+            if(dma_buf->fd > 0) {
+                close(dma_buf->fd);
+                dma_buf->fd = -1;
+            }
+        }
+        response->items[i].num_dma_bufs = 0;
     }
+    response->num_items = 0;
 }
 
 static int send_msg_to_server(int server_fd, gsr_kms_request *request) {
@@ -78,7 +65,7 @@ static int send_msg_to_server(int server_fd, gsr_kms_request *request) {
     return sendmsg(server_fd, &response_message, 0);
 }
 
-static int recv_msg_from_server(int server_fd, gsr_kms_response *response) {
+static int recv_msg_from_server(int server_pid, int server_fd, gsr_kms_response *response) {
     struct iovec iov;
     iov.iov_base = response;
     iov.iov_len = sizeof(*response);
@@ -87,21 +74,40 @@ static int recv_msg_from_server(int server_fd, gsr_kms_response *response) {
     response_message.msg_iov = &iov;
     response_message.msg_iovlen = 1;
 
-    char cmsgbuf[CMSG_SPACE(sizeof(int) * GSR_KMS_MAX_PLANES)];
+    char cmsgbuf[CMSG_SPACE(sizeof(int) * GSR_KMS_MAX_ITEMS * GSR_KMS_MAX_DMA_BUFS)];
     memset(cmsgbuf, 0, sizeof(cmsgbuf));
     response_message.msg_control = cmsgbuf;
     response_message.msg_controllen = sizeof(cmsgbuf);
 
-    int res = recvmsg(server_fd, &response_message, MSG_WAITALL);
-    if(res <= 0)
-        return res;
+    int res = 0;
+    for(;;) {
+        res = recvmsg(server_fd, &response_message, MSG_DONTWAIT);
+        if(res <= 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
+            // If we are replacing the connection and closing the application at the same time
+            // then recvmsg can get stuck (because the server died), so we prevent that by doing
+            // non-blocking recvmsg and checking if the server died
+            int status = 0;
+            int wait_result = waitpid(server_pid, &status, WNOHANG);
+            if(wait_result != 0) {
+                res = -1;
+                break;
+            }
+            usleep(1000);
+        } else {
+            break;
+        }
+    }
 
-    if(response->num_fds > 0) {
+    if(res > 0 && response->num_items > 0) {
         struct cmsghdr *cmsg = CMSG_FIRSTHDR(&response_message);
         if(cmsg) {
             int *fds = (int*)CMSG_DATA(cmsg);
-            for(int i = 0; i < response->num_fds; ++i) {
-                response->fds[i].fd = fds[i];
+            int fd_index = 0;
+            for(int i = 0; i < response->num_items; ++i) {
+                for(int j = 0; j < response->items[i].num_dma_bufs; ++j) {
+                    gsr_kms_response_dma_buf *dma_buf = &response->items[i].dma_buf[j];
+                    dma_buf->fd = fds[fd_index++];
+                }
             }
         } else {
             close_fds(response);
@@ -119,20 +125,48 @@ static bool create_socket_path(char *output_path, size_t output_path_size) {
 
     char random_characters[11];
     random_characters[10] = '\0';
-    if(!generate_random_characters(random_characters, 10, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789", 62))
+    if(!generate_random_characters_standard_alphabet(random_characters, 10))
         return false;
 
     snprintf(output_path, output_path_size, "%s/.gsr-kms-socket-%s", home, random_characters);
     return true;
 }
 
-static void strncpy_safe(char *dst, const char *src, int len) {
-    int src_len = strlen(src);
-    int min_len = src_len;
-    if(len - 1 < min_len)
-        min_len = len - 1;
-    memcpy(dst, src, min_len);
-    dst[min_len] = '\0';
+static bool readlink_realpath(const char *filepath, char *buffer) {
+    char symlinked_path[PATH_MAX];
+    ssize_t bytes_written = readlink(filepath, symlinked_path, sizeof(symlinked_path) - 1);
+    if(bytes_written == -1 && errno == EINVAL) {
+        /* Not a symlink */
+        snprintf(symlinked_path, sizeof(symlinked_path), "%s", filepath);
+    } else if(bytes_written == -1) {
+        return false;
+    } else {
+        symlinked_path[bytes_written] = '\0';
+    }
+
+    if(!realpath(symlinked_path, buffer))
+        return false;
+
+    return true;
+}
+
+static bool strcat_safe(char *str, int size, const char *str_to_add) {
+    const int str_len = strlen(str);
+    const int str_to_add_len = strlen(str_to_add);
+    if(str_len + str_to_add_len + 1 >= size)
+        return false;
+
+    memcpy(str + str_len, str_to_add, str_to_add_len);
+    str[str_len + str_to_add_len] = '\0';
+    return true;
+}
+
+static void file_get_directory(char *filepath) {
+    char *end = strrchr(filepath, '/');
+    if(end == NULL)
+        filepath[0] = '\0';
+    else
+        *end = '\0';
 }
 
 static bool find_program_in_path(const char *program_name, char *filepath, int filepath_len) {
@@ -186,34 +220,52 @@ int gsr_kms_client_init(gsr_kms_client *self, const char *card_path) {
     }
 
     char server_filepath[PATH_MAX];
-    if(!find_program_in_path("gsr-kms-server", server_filepath, sizeof(server_filepath))) {
-        fprintf(stderr, "gsr error: gsr_kms_client_init: gsr-kms-server is not installed\n");
+    if(!readlink_realpath("/proc/self/exe", server_filepath)) {
+        fprintf(stderr, "gsr error: gsr_kms_client_init: failed to resolve /proc/self/exe\n");
         return -1;
     }
+    file_get_directory(server_filepath);
+
+    if(!strcat_safe(server_filepath, sizeof(server_filepath), "/gsr-kms-server")) {
+        fprintf(stderr, "gsr error: gsr_kms_client_init: gsr-kms-server path too long\n");
+        return -1;
+    }
+
+    if(access(server_filepath, F_OK) != 0) {
+        fprintf(stderr, "gsr info: gsr_kms_client_init: gsr-kms-server is not installed in the same directory as gpu-screen-recorder (%s not found), looking for gsr-kms-server in PATH instead\n", server_filepath);
+        if(!find_program_in_path("gsr-kms-server", server_filepath, sizeof(server_filepath)) || access(server_filepath, F_OK) != 0) {
+            fprintf(stderr, "gsr error: gsr_kms_client_init: gsr-kms-server was not found in PATH. Please install gpu-screen-recorder properly\n");
+            return -1;
+        }
+    }
+
+    fprintf(stderr, "gsr info: gsr_kms_client_init: setting up connection to %s\n", server_filepath);
 
-    bool has_perm = 0;
     const bool inside_flatpak = getenv("FLATPAK_ID") != NULL;
-    if(!inside_flatpak) {
-        if(geteuid() == 0) {
-            has_perm = true;
-        } else {
-            cap_t kms_server_cap = cap_get_file(server_filepath);
-            if(kms_server_cap) {
-                cap_flag_value_t res = 0;
-                cap_get_flag(kms_server_cap, CAP_SYS_ADMIN, CAP_PERMITTED, &res);
-                if(res == CAP_SET) {
-                    //fprintf(stderr, "has permission!\n");
-                    has_perm = true;
-                } else {
-                    //fprintf(stderr, "No permission:(\n");
-                }
-                cap_free(kms_server_cap);
+    const char *home = getenv("HOME");
+    if(!home)
+        home = "/tmp";
+
+    bool has_perm = false;
+    if(geteuid() == 0) {
+        has_perm = true;
+    } else {
+        cap_t kms_server_cap = cap_get_file(server_filepath);
+        if(kms_server_cap) {
+            cap_flag_value_t res = CAP_CLEAR;
+            cap_get_flag(kms_server_cap, CAP_SYS_ADMIN, CAP_PERMITTED, &res);
+            if(res == CAP_SET) {
+                //fprintf(stderr, "has permission!\n");
+                has_perm = true;
             } else {
-                if(errno == ENODATA)
-                    fprintf(stderr, "gsr info: gsr_kms_client_init: gsr-kms-server is missing sys_admin cap and will require root authentication. To bypass this automatically, run: sudo setcap cap_sys_admin+ep '%s'\n", server_filepath);
-                else
-                    fprintf(stderr, "gsr info: gsr_kms_client_init: failed to get cap\n");
+                //fprintf(stderr, "No permission:(\n");
             }
+            cap_free(kms_server_cap);
+        } else if(!inside_flatpak) {
+            if(errno == ENODATA)
+                fprintf(stderr, "gsr info: gsr_kms_client_init: gsr-kms-server is missing sys_admin cap and will require root authentication. To bypass this automatically, run: sudo setcap cap_sys_admin+ep '%s'\n", server_filepath);
+            else
+                fprintf(stderr, "gsr info: gsr_kms_client_init: failed to get cap\n");
         }
     }
 
@@ -229,8 +281,13 @@ int gsr_kms_client_init(gsr_kms_client *self, const char *card_path) {
     }
 
     local_addr.sun_family = AF_UNIX;
-    strncpy_safe(local_addr.sun_path, self->initial_socket_path, sizeof(local_addr.sun_path));
-    if(bind(self->initial_socket_fd, (struct sockaddr*)&local_addr, sizeof(local_addr.sun_family) + strlen(local_addr.sun_path)) == -1) {
+    snprintf(local_addr.sun_path, sizeof(local_addr.sun_path), "%s", (const char*)self->initial_socket_path);
+
+    const mode_t prev_mask = umask(0000);
+    const int bind_res = bind(self->initial_socket_fd, (struct sockaddr*)&local_addr, sizeof(local_addr.sun_family) + strlen(local_addr.sun_path));
+    umask(prev_mask);
+
+    if(bind_res == -1) {
         fprintf(stderr, "gsr error: gsr_kms_client_init: failed to bind socket, error: %s\n", strerror(errno));
         goto err;
     }
@@ -246,7 +303,7 @@ int gsr_kms_client_init(gsr_kms_client *self, const char *card_path) {
         goto err;
     } else if(pid == 0) { /* child */
         if(inside_flatpak) {
-            const char *args[] = { "flatpak-spawn", "--host", "pkexec", "flatpak", "run", "--command=gsr-kms-server", "com.dec05eba.gpu_screen_recorder", self->initial_socket_path, card_path, NULL };
+            const char *args[] = { "flatpak-spawn", "--host", "/var/lib/flatpak/app/com.dec05eba.gpu_screen_recorder/current/active/files/bin/kms-server-proxy", self->initial_socket_path, card_path, home, NULL };
             execvp(args[0], (char *const*)args);
         } else if(has_perm) {
             const char *args[] = { server_filepath, self->initial_socket_path, card_path, NULL };
@@ -255,24 +312,21 @@ int gsr_kms_client_init(gsr_kms_client *self, const char *card_path) {
             const char *args[] = { "pkexec", server_filepath, self->initial_socket_path, card_path, NULL };
             execvp(args[0], (char *const*)args);
         }
-        fprintf(stderr, "gsr error: gsr_kms_client_init: execvp failed, error: %s\n", strerror(errno));
+        fprintf(stderr, "gsr error: gsr_kms_client_init: failed to launch \"gsr-kms-server\", error: %s\n", strerror(errno));
         _exit(127);
     } else { /* parent */
         self->kms_server_pid = pid;
     }
 
     fprintf(stderr, "gsr info: gsr_kms_client_init: waiting for server to connect\n");
+    struct pollfd poll_fd = {
+        .fd = self->initial_socket_fd,
+        .events = POLLIN,
+        .revents = 0
+    };
     for(;;) {
-        struct timeval tv;
-        fd_set rfds;
-        FD_ZERO(&rfds);
-        FD_SET(self->initial_socket_fd, &rfds);
-
-        tv.tv_sec = 0;
-        tv.tv_usec = 100 * 1000; // 100 ms
-
-        int select_res = select(1 + self->initial_socket_fd, &rfds, NULL, NULL, &tv);
-        if(select_res > 0) {
+        int poll_res = poll(&poll_fd, 1, 100);
+        if(poll_res > 0 && (poll_fd.revents & POLLIN)) {
             socklen_t sock_len = 0;
             self->initial_client_fd = accept(self->initial_socket_fd, (struct sockaddr*)&remote_addr, &sock_len);
             if(self->initial_client_fd == -1) {
@@ -312,12 +366,12 @@ int gsr_kms_client_init(gsr_kms_client *self, const char *card_path) {
 }
 
 void cleanup_socket(gsr_kms_client *self, bool kill_server) {
-    if(self->initial_client_fd != -1) {
+    if(self->initial_client_fd > 0) {
         close(self->initial_client_fd);
         self->initial_client_fd = -1;
     }
 
-    if(self->initial_socket_fd != -1) {
+    if(self->initial_socket_fd > 0) {
         close(self->initial_socket_fd);
         self->initial_socket_fd = -1;
     }
@@ -331,8 +385,11 @@ void cleanup_socket(gsr_kms_client *self, bool kill_server) {
         }
     }
 
-    if(kill_server && self->kms_server_pid != -1) {
+    if(kill_server && self->kms_server_pid > 0) {
         kill(self->kms_server_pid, SIGKILL);
+        // TODO:
+        //int status;
+        //waitpid(self->kms_server_pid, &status, 0);
         self->kms_server_pid = -1;
     }
 
@@ -361,7 +418,7 @@ int gsr_kms_client_replace_connection(gsr_kms_client *self) {
         return -1;
     }
 
-    const int recv_res = recv_msg_from_server(self->socket_pair[GSR_SOCKET_PAIR_LOCAL], &response);
+    const int recv_res = recv_msg_from_server(self->kms_server_pid, self->socket_pair[GSR_SOCKET_PAIR_LOCAL], &response);
     if(recv_res == 0) {
         fprintf(stderr, "gsr warning: gsr_kms_client_replace_connection: kms server shut down\n");
         return -1;
@@ -371,7 +428,7 @@ int gsr_kms_client_replace_connection(gsr_kms_client *self) {
     }
 
     if(response.version != GSR_KMS_PROTOCOL_VERSION) {
-        fprintf(stderr, "gsr error: gsr_kms_client_replace_connection: expected gsr-kms-server protocol version to be %u, but it's %u\n", GSR_KMS_PROTOCOL_VERSION, response.version);
+        fprintf(stderr, "gsr error: gsr_kms_client_replace_connection: expected gsr-kms-server protocol version to be %u, but it's %u. Please reinstall gpu screen recorder\n", GSR_KMS_PROTOCOL_VERSION, response.version);
         /*close_fds(response);*/
         return -1;
     }
@@ -394,7 +451,7 @@ int gsr_kms_client_get_kms(gsr_kms_client *self, gsr_kms_response *response) {
         return -1;
     }
 
-    const int recv_res = recv_msg_from_server(self->socket_pair[GSR_SOCKET_PAIR_LOCAL], response);
+    const int recv_res = recv_msg_from_server(self->kms_server_pid, self->socket_pair[GSR_SOCKET_PAIR_LOCAL], response);
     if(recv_res == 0) {
         fprintf(stderr, "gsr warning: gsr_kms_client_get_kms: kms server shut down\n");
         strcpy(response->err_msg, "failed to receive");
@@ -406,7 +463,7 @@ int gsr_kms_client_get_kms(gsr_kms_client *self, gsr_kms_response *response) {
     }
 
     if(response->version != GSR_KMS_PROTOCOL_VERSION) {
-        fprintf(stderr, "gsr error: gsr_kms_client_get_kms: expected gsr-kms-server protocol version to be %u, but it's %u\n", GSR_KMS_PROTOCOL_VERSION, response->version);
+        fprintf(stderr, "gsr error: gsr_kms_client_get_kms: expected gsr-kms-server protocol version to be %u, but it's %u. Please reinstall gpu screen recorder\n", GSR_KMS_PROTOCOL_VERSION, response->version);
         /*close_fds(response);*/
         strcpy(response->err_msg, "mismatching protocol version");
         return -1;
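Review note: the change from select() to poll() in gsr_kms_client_init boils down to "wait up to 100 ms for the listening socket to become readable, then accept". A minimal sketch of that wait in isolation is below. The helper name `sketch_wait_readable` is hypothetical and not part of this patch; it only mirrors the `poll_res > 0 && (revents & POLLIN)` check used above.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical helper (not in the patch): returns true if |fd| has data to
   read (or a pending connection, for a listening socket) within |timeout_ms|.
   Returns false on timeout or on poll error. */
static bool sketch_wait_readable(int fd, int timeout_ms) {
    struct pollfd poll_fd = {
        .fd = fd,
        .events = POLLIN,
        .revents = 0
    };
    const int poll_res = poll(&poll_fd, 1, timeout_ms);
    return poll_res > 0 && (poll_fd.revents & POLLIN) != 0;
}
```

Unlike select(), poll() has no FD_SETSIZE limit on the descriptor value and does not need the fd set to be rebuilt on every iteration, which is why the pollfd struct can live outside the loop.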
diff --git a/kms/client/kms_client.h b/kms/client/kms_client.h
index 02f2dd1..2d18848 100644
--- a/kms/client/kms_client.h
+++ b/kms/client/kms_client.h
@@ -5,13 +5,15 @@
 #include <sys/types.h>
 #include <limits.h>
 
-typedef struct {
+typedef struct gsr_kms_client gsr_kms_client;
+
+struct gsr_kms_client {
     pid_t kms_server_pid;
     int initial_socket_fd;
     int initial_client_fd;
     char initial_socket_path[PATH_MAX];
     int socket_pair[2];
-} gsr_kms_client;
+};
 
 /* |card_path| should be a path to card, for example /dev/dri/card0 */
 int gsr_kms_client_init(gsr_kms_client *self, const char *card_path);
diff --git a/kms/kms_shared.h b/kms/kms_shared.h
index b72d75d..2dbb655 100644
--- a/kms/kms_shared.h
+++ b/kms/kms_shared.h
@@ -3,9 +3,16 @@
 
 #include <stdint.h>
 #include <stdbool.h>
+#include <drm_mode.h>
 
-#define GSR_KMS_PROTOCOL_VERSION 1
-#define GSR_KMS_MAX_PLANES 32
+#define GSR_KMS_PROTOCOL_VERSION 4
+
+#define GSR_KMS_MAX_ITEMS 8
+#define GSR_KMS_MAX_DMA_BUFS 4
+
+typedef struct gsr_kms_response_dma_buf gsr_kms_response_dma_buf;
+typedef struct gsr_kms_response_item gsr_kms_response_item;
+typedef struct gsr_kms_response gsr_kms_response;
 
 typedef enum {
     KMS_REQUEST_TYPE_REPLACE_CONNECTION,
@@ -26,29 +33,35 @@ typedef struct {
     int new_connection_fd;
 } gsr_kms_request;
 
-typedef struct {
+struct gsr_kms_response_dma_buf {
     int fd;
-    uint32_t width;
-    uint32_t height;
     uint32_t pitch;
     uint32_t offset;
+};
+
+struct gsr_kms_response_item {
+    gsr_kms_response_dma_buf dma_buf[GSR_KMS_MAX_DMA_BUFS];
+    int num_dma_bufs;
+    uint32_t width;
+    uint32_t height;
     uint32_t pixel_format;
     uint64_t modifier;
     uint32_t connector_id; /* 0 if unknown */
-    bool is_combined_plane;
     bool is_cursor;
+    bool has_hdr_metadata;
     int x;
     int y;
     int src_w;
     int src_h;
-} gsr_kms_response_fd;
+    struct hdr_output_metadata hdr_metadata;
+};
 
-typedef struct {
+struct gsr_kms_response {
     uint32_t version; /* GSR_KMS_PROTOCOL_VERSION */
     int result;       /* gsr_kms_result */
     char err_msg[128];
-    gsr_kms_response_fd fds[GSR_KMS_MAX_PLANES];
-    int num_fds;
-} gsr_kms_response;
+    gsr_kms_response_item items[GSR_KMS_MAX_ITEMS];
+    int num_items;
+};
 
 #endif /* #define GSR_KMS_SHARED_H */
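Review note: with protocol version 4 a response no longer carries one fd per entry. Each item owns up to GSR_KMS_MAX_DMA_BUFS fds, and they travel as one flat array in item order. The receiving side is not shown in this part of the patch, so the following is only a sketch of how a receiver could map the flat array back onto items, under that ordering assumption. The `sketch_*` types are trimmed, hypothetical stand-ins for the structs in kms_shared.h.

```c
#define SKETCH_MAX_ITEMS 8
#define SKETCH_MAX_DMA_BUFS 4

/* Trimmed stand-ins for gsr_kms_response_dma_buf / _item / gsr_kms_response */
typedef struct { int fd; } sketch_dma_buf;
typedef struct { sketch_dma_buf dma_buf[SKETCH_MAX_DMA_BUFS]; int num_dma_bufs; } sketch_item;
typedef struct { sketch_item items[SKETCH_MAX_ITEMS]; int num_items; } sketch_response;

/* Assigns |fds| to the dma bufs of |response| in item order.
   Returns the number of fds assigned, or -1 if the counts in the response
   are out of range or do not match |num_fds| (nothing is assigned then). */
static int sketch_assign_fds(sketch_response *response, const int *fds, int num_fds) {
    if(response->num_items < 0 || response->num_items > SKETCH_MAX_ITEMS)
        return -1;

    int expected_fds = 0;
    for(int i = 0; i < response->num_items; ++i) {
        const int num_dma_bufs = response->items[i].num_dma_bufs;
        if(num_dma_bufs < 0 || num_dma_bufs > SKETCH_MAX_DMA_BUFS)
            return -1;
        expected_fds += num_dma_bufs;
    }

    if(expected_fds != num_fds)
        return -1;

    int fd_index = 0;
    for(int i = 0; i < response->num_items; ++i) {
        for(int j = 0; j < response->items[i].num_dma_bufs; ++j) {
            response->items[i].dma_buf[j].fd = fds[fd_index++];
        }
    }
    return fd_index;
}
```

Validating the counts before touching any fd matters here because the struct arrives over a socket; a mismatched count would otherwise index past the received array.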
diff --git a/kms/server/kms_server.c b/kms/server/kms_server.c
index fbd101e..070875b 100644
--- a/kms/server/kms_server.c
+++ b/kms/server/kms_server.c
@@ -1,11 +1,17 @@
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif
+
 #include "../kms_shared.h"
 
 #include <stdio.h>
 #include <string.h>
 #include <errno.h>
 #include <stdlib.h>
+#include <locale.h>
 
 #include <unistd.h>
+#include <limits.h>
 #include <fcntl.h>
 #include <sys/socket.h>
 #include <sys/un.h>
@@ -14,17 +20,18 @@
 #include <xf86drm.h>
 #include <xf86drmMode.h>
 #include <drm_mode.h>
+#include <drm_fourcc.h>
 
 #define MAX_CONNECTORS 32
 
 typedef struct {
     int drmfd;
-    drmModePlaneResPtr planes;
 } gsr_drm;
 
 typedef struct {
     uint32_t connector_id;
     uint64_t crtc_id;
+    uint64_t hdr_metadata_blob_id;
 } connector_crtc_pair;
 
 typedef struct {
@@ -36,6 +43,14 @@ static int max_int(int a, int b) {
     return a > b ? a : b;
 }
 
+static int count_num_fds(const gsr_kms_response *response) {
+    int num_fds = 0;
+    for(int i = 0; i < response->num_items; ++i) {
+        num_fds += response->items[i].num_dma_bufs;
+    }
+    return num_fds;
+}
+
 static int send_msg_to_client(int client_fd, gsr_kms_response *response) {
     struct iovec iov;
     iov.iov_base = response;
@@ -45,21 +60,25 @@ static int send_msg_to_client(int client_fd, gsr_kms_response *response) {
     response_message.msg_iov = &iov;
     response_message.msg_iovlen = 1;
 
-    char cmsgbuf[CMSG_SPACE(sizeof(int) * max_int(1, response->num_fds))];
+    const int num_fds = count_num_fds(response);
+    char cmsgbuf[CMSG_SPACE(sizeof(int) * max_int(1, num_fds))];
     memset(cmsgbuf, 0, sizeof(cmsgbuf));
 
-    if(response->num_fds > 0) {
+    if(num_fds > 0) {
         response_message.msg_control = cmsgbuf;
         response_message.msg_controllen = sizeof(cmsgbuf);
 
         struct cmsghdr *cmsg = CMSG_FIRSTHDR(&response_message);
         cmsg->cmsg_level = SOL_SOCKET;
         cmsg->cmsg_type = SCM_RIGHTS;
-        cmsg->cmsg_len = CMSG_LEN(sizeof(int) * response->num_fds);
+        cmsg->cmsg_len = CMSG_LEN(sizeof(int) * num_fds);
 
         int *fds = (int*)CMSG_DATA(cmsg);
-        for(int i = 0; i < response->num_fds; ++i) {
-            fds[i] = response->fds[i].fd;
+        int fd_index = 0;
+        for(int i = 0; i < response->num_items; ++i) {
+            for(int j = 0; j < response->items[i].num_dma_bufs; ++j) {
+                fds[fd_index++] = response->items[i].dma_buf[j].fd;
+            }
         }
 
         response_message.msg_controllen = cmsg->cmsg_len;
@@ -118,18 +137,18 @@ static bool connector_get_property_by_name(int drmfd, drmModeConnectorPtr props,
 }
 
 typedef enum {
-    PLANE_PROPERTY_X         = 1 << 0,
-    PLANE_PROPERTY_Y         = 1 << 1,
-    PLANE_PROPERTY_SRC_X     = 1 << 2,
-    PLANE_PROPERTY_SRC_Y     = 1 << 3,
-    PLANE_PROPERTY_SRC_W     = 1 << 4,
-    PLANE_PROPERTY_SRC_H     = 1 << 5,
-    PLANE_PROPERTY_IS_CURSOR = 1 << 6,
+    PLANE_PROPERTY_X          = 1 << 0,
+    PLANE_PROPERTY_Y          = 1 << 1,
+    PLANE_PROPERTY_SRC_X      = 1 << 2,
+    PLANE_PROPERTY_SRC_Y      = 1 << 3,
+    PLANE_PROPERTY_SRC_W      = 1 << 4,
+    PLANE_PROPERTY_SRC_H      = 1 << 5,
+    PLANE_PROPERTY_IS_CURSOR  = 1 << 6,
+    PLANE_PROPERTY_IS_PRIMARY = 1 << 7,
 } plane_property_mask;
 
 /* Returns plane_property_mask */
-static uint32_t plane_get_properties(int drmfd, uint32_t plane_id, bool *is_cursor, int *x, int *y, int *src_x, int *src_y, int *src_w, int *src_h) {
-    *is_cursor = false;
+static uint32_t plane_get_properties(int drmfd, uint32_t plane_id, int *x, int *y, int *src_x, int *src_y, int *src_w, int *src_h) {
     *x = 0;
     *y = 0;
     *src_x = 0;
@@ -141,8 +160,9 @@ static uint32_t plane_get_properties(int drmfd, uint32_t plane_id, bool *is_curs
 
     drmModeObjectPropertiesPtr props = drmModeObjectGetProperties(drmfd, plane_id, DRM_MODE_OBJECT_PLANE);
     if(!props)
-        return false;
+        return property_mask;
 
+    // TODO: Don't do this every frame
     for(uint32_t i = 0; i < props->count_props; ++i) {
         drmModePropertyPtr prop = drmModeGetProperty(drmfd, props->props[i]);
         if(!prop)
@@ -171,8 +191,10 @@ static uint32_t plane_get_properties(int drmfd, uint32_t plane_id, bool *is_curs
         } else if((type & DRM_MODE_PROP_ENUM) && strcmp(prop->name, "type") == 0) {
             const uint64_t current_enum_value = props->prop_values[i];
             for(int j = 0; j < prop->count_enums; ++j) {
-                if(prop->enums[j].value == current_enum_value && strcmp(prop->enums[j].name, "Cursor") == 0) {
-                    *is_cursor = true;
+                if(prop->enums[j].value == current_enum_value && strcmp(prop->enums[j].name, "Primary") == 0) {
+                    property_mask |= PLANE_PROPERTY_IS_PRIMARY;
+                    break;
+                } else if(prop->enums[j].value == current_enum_value && strcmp(prop->enums[j].name, "Cursor") == 0) {
                     property_mask |= PLANE_PROPERTY_IS_CURSOR;
                     break;
                 }
@@ -186,13 +208,13 @@ static uint32_t plane_get_properties(int drmfd, uint32_t plane_id, bool *is_curs
     return property_mask;
 }
 
-/* Returns 0 if not found */
-static uint32_t get_connector_by_crtc_id(const connector_to_crtc_map *c2crtc_map, uint32_t crtc_id) {
+/* Returns NULL if not found */
+static const connector_crtc_pair* get_connector_pair_by_crtc_id(const connector_to_crtc_map *c2crtc_map, uint32_t crtc_id) {
     for(int i = 0; i < c2crtc_map->num_maps; ++i) {
         if(c2crtc_map->maps[i].crtc_id == crtc_id)
-            return c2crtc_map->maps[i].connector_id;
+            return &c2crtc_map->maps[i];
     }
-    return 0;
+    return NULL;
 }
 
 static void map_crtc_to_connector_ids(gsr_drm *drm, connector_to_crtc_map *c2crtc_map) {
@@ -209,8 +231,12 @@ static void map_crtc_to_connector_ids(gsr_drm *drm, connector_to_crtc_map *c2crt
         uint64_t crtc_id = 0;
         connector_get_property_by_name(drm->drmfd, connector, "CRTC_ID", &crtc_id);
 
+        uint64_t hdr_output_metadata_blob_id = 0;
+        connector_get_property_by_name(drm->drmfd, connector, "HDR_OUTPUT_METADATA", &hdr_output_metadata_blob_id);
+
         c2crtc_map->maps[c2crtc_map->num_maps].connector_id = connector->connector_id;
         c2crtc_map->maps[c2crtc_map->num_maps].crtc_id = crtc_id;
+        c2crtc_map->maps[c2crtc_map->num_maps].hdr_metadata_blob_id = hdr_output_metadata_blob_id;
         ++c2crtc_map->num_maps;
 
         drmModeFreeConnector(connector);
@@ -238,21 +264,56 @@ static void drm_mode_cleanup_handles(int drmfd, drmModeFB2Ptr drmfb) {
     }
 }
 
-static int kms_get_fb(gsr_drm *drm, gsr_kms_response *response, connector_to_crtc_map *c2crtc_map) {
+static bool get_hdr_metadata(int drm_fd, uint64_t hdr_metadata_blob_id, struct hdr_output_metadata *hdr_metadata) {
+    drmModePropertyBlobPtr hdr_metadata_blob = drmModeGetPropertyBlob(drm_fd, hdr_metadata_blob_id);
+    if(!hdr_metadata_blob)
+        return false;
+
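+    const bool has_metadata = hdr_metadata_blob->length >= sizeof(struct hdr_output_metadata);
+    if(has_metadata) /* Only report metadata when the blob is large enough to copy */
+        *hdr_metadata = *(struct hdr_output_metadata*)hdr_metadata_blob->data;
+    drmModeFreePropertyBlob(hdr_metadata_blob);
+    return has_metadata;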
+}
+
+/* Returns the number of drm handles that we managed to get */
+static int drm_prime_handles_to_fds(gsr_drm *drm, drmModeFB2Ptr drmfb, int *fb_fds) {
+    for(int i = 0; i < GSR_KMS_MAX_DMA_BUFS; ++i) {
+        if(!drmfb->handles[i])
+            return i;
+
+        const int ret = drmPrimeHandleToFD(drm->drmfd, drmfb->handles[i], O_RDONLY, &fb_fds[i]);
+        if(ret != 0 || fb_fds[i] == -1)
+            return i;
+    }
+    return GSR_KMS_MAX_DMA_BUFS;
+}
+
+static int kms_get_fb(gsr_drm *drm, gsr_kms_response *response) {
     int result = -1;
 
     response->result = KMS_RESULT_OK;
     response->err_msg[0] = '\0';
-    response->num_fds = 0;
+    response->num_items = 0;
+
+    connector_to_crtc_map c2crtc_map;
+    c2crtc_map.num_maps = 0;
+    map_crtc_to_connector_ids(drm, &c2crtc_map);
+
+    drmModePlaneResPtr planes = drmModeGetPlaneResources(drm->drmfd);
+    if(!planes) {
+        fprintf(stderr, "kms server error: failed to get plane resources, error: %s\n", strerror(errno));
+        goto done;
+    }
 
-    for(uint32_t i = 0; i < drm->planes->count_planes && response->num_fds < GSR_KMS_MAX_PLANES; ++i) {
+    for(uint32_t i = 0; i < planes->count_planes && response->num_items < GSR_KMS_MAX_ITEMS; ++i) {
         drmModePlanePtr plane = NULL;
         drmModeFB2Ptr drmfb = NULL;
 
-        plane = drmModeGetPlane(drm->drmfd, drm->planes->planes[i]);
+        plane = drmModeGetPlane(drm->drmfd, planes->planes[i]);
         if(!plane) {
             response->result = KMS_RESULT_FAILED_TO_GET_PLANE;
-            snprintf(response->err_msg, sizeof(response->err_msg), "failed to get drm plane with id %u, error: %s\n", drm->planes->planes[i], strerror(errno));
+            snprintf(response->err_msg, sizeof(response->err_msg), "failed to get drm plane with id %u, error: %s", planes->planes[i], strerror(errno));
             fprintf(stderr, "kms server error: %s\n", response->err_msg);
             goto next;
         }
@@ -279,41 +340,54 @@ static int kms_get_fb(gsr_drm *drm, gsr_kms_response *response, connector_to_crt
         // TODO: Check if dimensions have changed by comparing width and height to previous time this was called.
         // TODO: Support other plane formats than rgb (with multiple planes, such as direct YUV420 on wayland).
 
-        int fb_fd = -1;
-        const int ret = drmPrimeHandleToFD(drm->drmfd, drmfb->handles[0], O_RDONLY, &fb_fd);
-        if(ret != 0 || fb_fd == -1) {
+        int x = 0, y = 0, src_x = 0, src_y = 0, src_w = 0, src_h = 0;
+        plane_property_mask property_mask = plane_get_properties(drm->drmfd, plane->plane_id, &x, &y, &src_x, &src_y, &src_w, &src_h);
+        if(!(property_mask & PLANE_PROPERTY_IS_PRIMARY) && !(property_mask & PLANE_PROPERTY_IS_CURSOR))
+            goto cleanup_handles;
+
+        int fb_fds[GSR_KMS_MAX_DMA_BUFS];
+        const int num_fb_fds = drm_prime_handles_to_fds(drm, drmfb, fb_fds);
+        if(num_fb_fds == 0) {
             response->result = KMS_RESULT_FAILED_TO_GET_PLANE;
             snprintf(response->err_msg, sizeof(response->err_msg), "failed to get fd from drm handle, error: %s", strerror(errno));
             fprintf(stderr, "kms server error: %s\n", response->err_msg);
             goto cleanup_handles;
         }
 
-        bool is_cursor = false;
-        int x = 0, y = 0, src_x = 0, src_y = 0, src_w = 0, src_h = 0;
-        plane_get_properties(drm->drmfd, plane->plane_id, &is_cursor, &x, &y, &src_x, &src_y, &src_w, &src_h);
-
-        response->fds[response->num_fds].fd = fb_fd;
-        response->fds[response->num_fds].width = drmfb->width;
-        response->fds[response->num_fds].height = drmfb->height;
-        response->fds[response->num_fds].pitch = drmfb->pitches[0];
-        response->fds[response->num_fds].offset = drmfb->offsets[0];
-        response->fds[response->num_fds].pixel_format = drmfb->pixel_format;
-        response->fds[response->num_fds].modifier = drmfb->modifier;
-        response->fds[response->num_fds].connector_id = get_connector_by_crtc_id(c2crtc_map, plane->crtc_id);
-        response->fds[response->num_fds].is_cursor = is_cursor;
-        response->fds[response->num_fds].is_combined_plane = false;
-        if(is_cursor) {
-            response->fds[response->num_fds].x = x;
-            response->fds[response->num_fds].y = y;
-            response->fds[response->num_fds].src_w = 0;
-            response->fds[response->num_fds].src_h = 0;
+        const int item_index = response->num_items;
+
+        const connector_crtc_pair *crtc_pair = get_connector_pair_by_crtc_id(&c2crtc_map, plane->crtc_id);
+        if(crtc_pair && crtc_pair->hdr_metadata_blob_id) {
+            response->items[item_index].has_hdr_metadata = get_hdr_metadata(drm->drmfd, crtc_pair->hdr_metadata_blob_id, &response->items[item_index].hdr_metadata);
         } else {
-            response->fds[response->num_fds].x = src_x;
-            response->fds[response->num_fds].y = src_y;
-            response->fds[response->num_fds].src_w = src_w;
-            response->fds[response->num_fds].src_h = src_h;
+            response->items[item_index].has_hdr_metadata = false;
         }
-        ++response->num_fds;
+
+        for(int j = 0; j < num_fb_fds; ++j) {
+            response->items[item_index].dma_buf[j].fd = fb_fds[j];
+            response->items[item_index].dma_buf[j].pitch = drmfb->pitches[j];
+            response->items[item_index].dma_buf[j].offset = drmfb->offsets[j];
+        }
+        response->items[item_index].num_dma_bufs = num_fb_fds;
+
+        response->items[item_index].width = drmfb->width;
+        response->items[item_index].height = drmfb->height;
+        response->items[item_index].pixel_format = drmfb->pixel_format;
+        response->items[item_index].modifier = drmfb->flags & DRM_MODE_FB_MODIFIERS ? drmfb->modifier : DRM_FORMAT_MOD_INVALID;
+        response->items[item_index].connector_id = crtc_pair ? crtc_pair->connector_id : 0;
+        response->items[item_index].is_cursor = property_mask & PLANE_PROPERTY_IS_CURSOR;
+        if(property_mask & PLANE_PROPERTY_IS_CURSOR) {
+            response->items[item_index].x = x;
+            response->items[item_index].y = y;
+            response->items[item_index].src_w = 0;
+            response->items[item_index].src_h = 0;
+        } else {
+            response->items[item_index].x = src_x;
+            response->items[item_index].y = src_y;
+            response->items[item_index].src_w = src_w;
+            response->items[item_index].src_h = src_h;
+        }
+        ++response->num_items;
 
         cleanup_handles:
         drm_mode_cleanup_handles(drm->drmfd, drmfb);
@@ -325,16 +399,28 @@ static int kms_get_fb(gsr_drm *drm, gsr_kms_response *response, connector_to_crt
             drmModeFreePlane(plane);
     }
 
-    if(response->num_fds > 0)
+    done:
+
+    if(planes)
+        drmModeFreePlaneResources(planes);
+
+    if(response->num_items > 0)
         response->result = KMS_RESULT_OK;
 
     if(response->result == KMS_RESULT_OK) {
         result = 0;
     } else {
-        for(int i = 0; i < response->num_fds; ++i) {
-            close(response->fds[i].fd);
+        for(int i = 0; i < response->num_items; ++i) {
+            for(int j = 0; j < response->items[i].num_dma_bufs; ++j) {
+                gsr_kms_response_dma_buf *dma_buf = &response->items[i].dma_buf[j];
+                if(dma_buf->fd > 0) {
+                    close(dma_buf->fd);
+                    dma_buf->fd = -1;
+                }
+            }
+            response->items[i].num_dma_bufs = 0;
         }
-        response->num_fds = 0;
+        response->num_items = 0;
     }
 
     return result;
@@ -348,21 +434,13 @@ static double clock_get_monotonic_seconds(void) {
     return (double)ts.tv_sec + (double)ts.tv_nsec * 0.000000001;
 }
 
-static void strncpy_safe(char *dst, const char *src, int len) {
-    int src_len = strlen(src);
-    int min_len = src_len;
-    if(len - 1 < min_len)
-        min_len = len - 1;
-    memcpy(dst, src, min_len);
-    dst[min_len] = '\0';
-}
-
 int main(int argc, char **argv) {
+    setlocale(LC_ALL, "C"); // Sigh... stupid C
+
     int res = 0;
     int socket_fd = 0;
     gsr_drm drm;
     drm.drmfd = 0;
-    drm.planes = NULL;
 
     if(argc != 3) {
         fprintf(stderr, "usage: gsr-kms-server <domain_socket_path> <card_path>\n");
@@ -395,17 +473,6 @@ int main(int argc, char **argv) {
         fprintf(stderr, "kms server warning: drmSetClientCap DRM_CLIENT_CAP_ATOMIC failed, error: %s. The wrong monitor may be captured as a result\n", strerror(errno));
     }
 
-    drm.planes = drmModeGetPlaneResources(drm.drmfd);
-    if(!drm.planes) {
-        fprintf(stderr, "kms server error: failed to get plane resources, error: %s\n", strerror(errno));
-        res = 2;
-        goto done;
-    }
-
-    connector_to_crtc_map c2crtc_map;
-    c2crtc_map.num_maps = 0;
-    map_crtc_to_connector_ids(&drm, &c2crtc_map);
-
     fprintf(stderr, "kms server info: connecting to the client\n");
     bool connected = false;
     const double connect_timeout_sec = 5.0;
@@ -413,7 +480,7 @@ int main(int argc, char **argv) {
     while(clock_get_monotonic_seconds() - start_time < connect_timeout_sec) {
         struct sockaddr_un remote_addr = {0};
         remote_addr.sun_family = AF_UNIX;
-        strncpy_safe(remote_addr.sun_path, domain_socket_path, sizeof(remote_addr.sun_path));
+        snprintf(remote_addr.sun_path, sizeof(remote_addr.sun_path), "%s", domain_socket_path);
         // TODO: Check if parent disconnected
         if(connect(socket_fd, (struct sockaddr*)&remote_addr, sizeof(remote_addr.sun_family) + strlen(remote_addr.sun_path)) == -1) {
             if(errno == ECONNREFUSED || errno == ENOENT) {
@@ -463,7 +530,7 @@ int main(int argc, char **argv) {
         }
 
         if(request.version != GSR_KMS_PROTOCOL_VERSION) {
-            fprintf(stderr, "kms server error: expected gpu screen recorder protocol version to be %u, but it's %u\n", GSR_KMS_PROTOCOL_VERSION, request.version);
+            fprintf(stderr, "kms server error: expected gpu screen recorder protocol version to be %u, but it's %u. Please reinstall gpu screen recorder\n", GSR_KMS_PROTOCOL_VERSION, request.version);
             /*
             if(request.new_connection_fd > 0)
                 close(request.new_connection_fd);
@@ -475,7 +542,7 @@ int main(int argc, char **argv) {
             case KMS_REQUEST_TYPE_REPLACE_CONNECTION: {
                 gsr_kms_response response;
                 response.version = GSR_KMS_PROTOCOL_VERSION;
-                response.num_fds = 0;
+                response.num_items = 0;
 
                 if(request.new_connection_fd > 0) {
                     if(socket_fd > 0)
@@ -498,9 +565,9 @@ int main(int argc, char **argv) {
             case KMS_REQUEST_TYPE_GET_KMS: {
                 gsr_kms_response response;
                 response.version = GSR_KMS_PROTOCOL_VERSION;
-                response.num_fds = 0;
+                response.num_items = 0;
                 
-                if(kms_get_fb(&drm, &response, &c2crtc_map) == 0) {
+                if(kms_get_fb(&drm, &response) == 0) {
                     if(send_msg_to_client(socket_fd, &response) == -1)
                         fprintf(stderr, "kms server error: failed to respond to client KMS_REQUEST_TYPE_GET_KMS request\n");
                 } else {
@@ -508,9 +575,17 @@ int main(int argc, char **argv) {
                         fprintf(stderr, "kms server error: failed to respond to client KMS_REQUEST_TYPE_GET_KMS request\n");
                 }
 
-                for(int i = 0; i < response.num_fds; ++i) {
-                    close(response.fds[i].fd);
+                for(int i = 0; i < response.num_items; ++i) {
+                    for(int j = 0; j < response.items[i].num_dma_bufs; ++j) {
+                        gsr_kms_response_dma_buf *dma_buf = &response.items[i].dma_buf[j];
+                        if(dma_buf->fd > 0) {
+                            close(dma_buf->fd);
+                            dma_buf->fd = -1;
+                        }
+                    }
+                    response.items[i].num_dma_bufs = 0;
                 }
+                response.num_items = 0;
 
                 break;
             }
@@ -518,7 +593,7 @@ int main(int argc, char **argv) {
                 gsr_kms_response response;
                 response.version = GSR_KMS_PROTOCOL_VERSION;
                 response.result = KMS_RESULT_INVALID_REQUEST;
-                response.num_fds = 0;
+                response.num_items = 0;
 
                 snprintf(response.err_msg, sizeof(response.err_msg), "invalid request type %d, expected %d (%s)", request.type, KMS_REQUEST_TYPE_GET_KMS, "KMS_REQUEST_TYPE_GET_KMS");
                 fprintf(stderr, "kms server error: %s\n", response.err_msg);
@@ -531,8 +606,6 @@ int main(int argc, char **argv) {
     }
 
     done:
-    if(drm.planes)
-        drmModeFreePlaneResources(drm.planes);
     if(drm.drmfd > 0)
         close(drm.drmfd);
     if(socket_fd > 0)
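Review note: send_msg_to_client passes every dma-buf fd of every item in a single SCM_RIGHTS control message. For readers unfamiliar with the mechanism, here is a self-contained sketch of sending and receiving descriptors over a unix socket. The names `sketch_send_fds`, `sketch_recv_fds` and `SKETCH_MAX_FDS` are hypothetical and not taken from the patch; the value 32 simply equals GSR_KMS_MAX_ITEMS * GSR_KMS_MAX_DMA_BUFS.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define SKETCH_MAX_FDS 32

/* Control buffer, aligned for struct cmsghdr */
typedef union {
    char buf[CMSG_SPACE(sizeof(int) * SKETCH_MAX_FDS)];
    struct cmsghdr align;
} sketch_control;

/* Sends |payload| and |num_fds| descriptors in one message. Returns 0 on success, -1 on error */
static int sketch_send_fds(int sock, const void *payload, size_t payload_size, const int *fds, int num_fds) {
    if(num_fds < 0 || num_fds > SKETCH_MAX_FDS)
        return -1;

    struct iovec iov = { .iov_base = (void*)payload, .iov_len = payload_size };
    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    sketch_control control;
    memset(&control, 0, sizeof(control));
    if(num_fds > 0) {
        msg.msg_control = control.buf;
        msg.msg_controllen = CMSG_SPACE(sizeof(int) * num_fds);

        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int) * num_fds);
        memcpy(CMSG_DATA(cmsg), fds, sizeof(int) * num_fds);
    }

    return sendmsg(sock, &msg, 0) == (ssize_t)payload_size ? 0 : -1;
}

/* Receives a payload and up to |max_fds| descriptors. Returns the number of fds stored, or -1 on error */
static int sketch_recv_fds(int sock, void *payload, size_t payload_size, int *fds, int max_fds) {
    struct iovec iov = { .iov_base = payload, .iov_len = payload_size };
    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    sketch_control control;
    memset(&control, 0, sizeof(control));
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if(recvmsg(sock, &msg, 0) <= 0)
        return -1;

    int num_fds = 0;
    for(struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if(cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
            continue;

        const int count = (int)((cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int));
        for(int i = 0; i < count; ++i) {
            int fd;
            memcpy(&fd, CMSG_DATA(cmsg) + i * sizeof(int), sizeof(int));
            if(num_fds < max_fds)
                fds[num_fds++] = fd;
            else
                close(fd); /* No room for it, close instead of leaking */
        }
    }
    return num_fds;
}
```

The received descriptors are new fd numbers in the receiving process that refer to the same open files, so the sender can close its own copies right after sendmsg returns, as the server does after each KMS_REQUEST_TYPE_GET_KMS response.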
diff --git a/meson.build b/meson.build
new file mode 100644
index 0000000..43c429d
--- /dev/null
+++ b/meson.build
@@ -0,0 +1,120 @@
+project('gpu-screen-recorder', ['c', 'cpp'], version : '5.5.10', default_options : ['warning_level=2'])
+
+add_project_arguments('-Wshadow', language : ['c', 'cpp'])
+if get_option('buildtype') == 'debug'
+    add_project_arguments('-g3', language : ['c', 'cpp'])
+elif get_option('buildtype') == 'release'
+    add_project_arguments('-DNDEBUG', language : ['c', 'cpp'])
+endif
+
+src = [
+    'kms/client/kms_client.c',
+    'src/capture/capture.c',
+    'src/capture/nvfbc.c',
+    'src/capture/xcomposite.c',
+    'src/capture/ximage.c',
+    'src/capture/kms.c',
+    'src/encoder/encoder.c',
+    'src/encoder/video/video.c',
+    'src/encoder/video/nvenc.c',
+    'src/encoder/video/vaapi.c',
+    'src/encoder/video/vulkan.c',
+    'src/encoder/video/software.c',
+    'src/codec_query/nvenc.c',
+    'src/codec_query/vaapi.c',
+    'src/codec_query/vulkan.c',
+    'src/window/window.c',
+    'src/window/x11.c',
+    'src/window/wayland.c',
+    'src/replay_buffer/replay_buffer.c',
+    'src/replay_buffer/replay_buffer_ram.c',
+    'src/replay_buffer/replay_buffer_disk.c',
+    'src/egl.c',
+    'src/cuda.c',
+    'src/xnvctrl.c',
+    'src/overclock.c',
+    'src/window_texture.c',
+    'src/shader.c',
+    'src/color_conversion.c',
+    'src/utils.c',
+    'src/library_loader.c',
+    'src/cursor.c',
+    'src/damage.c',
+    'src/image_writer.c',
+    'src/args_parser.c',
+    'src/defs.c',
+    'src/sound.cpp',
+    'src/main.cpp',
+]
+
+subdir('protocol')
+src += protocol_src
+
+dep = [
+    dependency('threads'),
+    dependency('libavcodec'),
+    dependency('libavformat'),
+    dependency('libavutil'),
+    dependency('x11'),
+    dependency('xcomposite'),
+    dependency('xrandr'),
+    dependency('xfixes'),
+    dependency('xdamage'),
+    dependency('libpulse'),
+    dependency('libswresample'),
+    dependency('libavfilter'),
+    dependency('libva'),
+    dependency('libva-drm'),
+    dependency('libcap'),
+    dependency('libdrm'),
+    dependency('wayland-egl'),
+    dependency('wayland-client'),
+]
+
+uses_pipewire = false
+
+if get_option('portal') == true
+    src += [
+        'src/capture/portal.c',
+        'dbus/client/dbus_client.c',
+        'src/pipewire_video.c',
+    ]
+    add_project_arguments('-DGSR_PORTAL', language : ['c', 'cpp'])
+    uses_pipewire = true
+endif
+
+if get_option('app_audio') == true
+    src += [
+        'src/pipewire_audio.c',
+    ]
+    add_project_arguments('-DGSR_APP_AUDIO', language : ['c', 'cpp'])
+    uses_pipewire = true
+endif
+
+if uses_pipewire == true
+    dep += [
+        dependency('libpipewire-0.3'),
+        dependency('libspa-0.2'),
+    ]
+endif
+
+add_project_arguments('-DGSR_VERSION="' + meson.project_version() + '"', language: ['c', 'cpp'])
+
+executable('gsr-kms-server', 'kms/server/kms_server.c', dependencies : dependency('libdrm'), c_args : '-fstack-protector-all', install : true)
+executable('gpu-screen-recorder', src, dependencies : dep, install : true)
+
+if get_option('portal') == true
+    executable('gsr-dbus-server', ['dbus/server/dbus_server.c', 'dbus/dbus_impl.c'], dependencies : dependency('dbus-1'), install : true)
+endif
+
+if get_option('systemd') == true
+    install_data(files('extra/gpu-screen-recorder.service'), install_dir : 'lib/systemd/user')
+endif
+
+if get_option('capabilities') == true
+    meson.add_install_script('extra/meson_post_install.sh')
+endif
+
+if get_option('nvidia_suspend_fix') == true
+    install_data(files('extra/gsr-nvidia.conf'), install_dir : 'lib/modprobe.d')
+endif
diff --git a/meson_options.txt b/meson_options.txt
new file mode 100644
index 0000000..b1023c2
--- /dev/null
+++ b/meson_options.txt
@@ -0,0 +1,5 @@
+option('systemd', type : 'boolean', value : true, description : 'Install systemd service file')
+option('capabilities', type : 'boolean', value : true, description : 'Set binary admin capability on gsr-kms-server binary to remove password prompt when recording monitor (without desktop portal option) on amd/intel or nvidia wayland')
+option('nvidia_suspend_fix', type : 'boolean', value : true, description : 'Install nvidia modprobe config file to tell nvidia driver to preserve video memory on suspend. This is a workaround for an nvidia driver bug that breaks cuda (and gpu screen recorder) on suspend')
+option('portal', type : 'boolean', value : true, description : 'Build with support for xdg desktop portal ScreenCast capture (wayland only) (-w portal option). Requires pipewire')
+option('app_audio', type : 'boolean', value : true, description : 'Build with support for recording a single audio source (-a app: option). Requires pipewire')
diff --git a/project.conf b/project.conf
index 23092af..7cf013b 100644
--- a/project.conf
+++ b/project.conf
@@ -1,12 +1,16 @@
 [package]
 name = "gpu-screen-recorder"
 type = "executable"
-version = "3.0.0"
+version = "5.5.10"
 platforms = ["posix"]
 
 [config]
-ignore_dirs = ["kms/server"]
-error_on_warning = "true"
+ignore_dirs = ["kms/server", "build", "debug-build", "dbus/server"]
+#error_on_warning = "true"
+
+[define]
+GSR_PORTAL = "1"
+GSR_APP_AUDIO = "1"
 
 [dependencies]
 libavcodec = ">=58"
@@ -15,11 +19,18 @@ libavutil = ">=56.2"
 x11 = ">=1"
 xcomposite = ">=0.2"
 xrandr = ">=1"
+xfixes = ">=2"
+xdamage = ">=1"
 libpulse = ">=13"
 libswresample = ">=3"
 libavfilter = ">=5"
 libva = ">=1"
+libva-drm = ">=1"
 libcap = ">=2"
 libdrm = ">=2"
 wayland-egl = ">=15"
 wayland-client = ">=1"
+dbus-1 = ">=1"
+libpipewire-0.3 = ">=1"
+libspa-0.2 = ">=0"
+vulkan = ">=1"
diff --git a/protocol/meson.build b/protocol/meson.build
new file mode 100644
index 0000000..bbdccba
--- /dev/null
+++ b/protocol/meson.build
@@ -0,0 +1,25 @@
+wayland_scanner = dependency('wayland-scanner', native: true)
+wayland_scanner_path = wayland_scanner.get_variable(pkgconfig: 'wayland_scanner')
+wayland_scanner_prog = find_program(wayland_scanner_path, native: true)
+
+wayland_scanner_code = generator(
+	wayland_scanner_prog,
+	output: '@BASENAME@-protocol.c',
+	arguments: ['private-code', '@INPUT@', '@OUTPUT@'],
+)
+
+wayland_scanner_client = generator(
+	wayland_scanner_prog,
+	output: '@BASENAME@-client-protocol.h',
+	arguments: ['client-header', '@INPUT@', '@OUTPUT@'],
+)
+
+protocols = [
+	'xdg-output-unstable-v1.xml',
+]
+
+protocol_src = []
+foreach xml : protocols
+	protocol_src += wayland_scanner_code.process(xml)
+	protocol_src += wayland_scanner_client.process(xml)
+endforeach
diff --git a/protocol/xdg-output-unstable-v1.xml b/protocol/xdg-output-unstable-v1.xml
new file mode 100644
index 0000000..5d536aa
--- /dev/null
+++ b/protocol/xdg-output-unstable-v1.xml
@@ -0,0 +1,222 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<protocol name="xdg_output_unstable_v1">
+
+  <copyright>
+    Copyright © 2017 Red Hat Inc.
+
+    Permission is hereby granted, free of charge, to any person obtaining a
+    copy of this software and associated documentation files (the "Software"),
+    to deal in the Software without restriction, including without limitation
+    the rights to use, copy, modify, merge, publish, distribute, sublicense,
+    and/or sell copies of the Software, and to permit persons to whom the
+    Software is furnished to do so, subject to the following conditions:
+
+    The above copyright notice and this permission notice (including the next
+    paragraph) shall be included in all copies or substantial portions of the
+    Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+    THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+    FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+    DEALINGS IN THE SOFTWARE.
+  </copyright>
+
+  <description summary="Protocol to describe output regions">
+    This protocol aims at describing outputs in a way which is more in line
+    with the concept of an output on desktop oriented systems.
+
+    Some information are more specific to the concept of an output for
+    a desktop oriented system and may not make sense in other applications,
+    such as IVI systems for example.
+
+    Typically, the global compositor space on a desktop system is made of
+    a contiguous or overlapping set of rectangular regions.
+
+    The logical_position and logical_size events defined in this protocol
+    might provide information identical to their counterparts already
+    available from wl_output, in which case the information provided by this
+    protocol should be preferred to their equivalent in wl_output. The goal is
+    to move the desktop specific concepts (such as output location within the
+    global compositor space, etc.) out of the core wl_output protocol.
+
+    Warning! The protocol described in this file is experimental and
+    backward incompatible changes may be made. Backward compatible
+    changes may be added together with the corresponding interface
+    version bump.
+    Backward incompatible changes are done by bumping the version
+    number in the protocol and interface names and resetting the
+    interface version. Once the protocol is to be declared stable,
+    the 'z' prefix and the version number in the protocol and
+    interface names are removed and the interface version number is
+    reset.
+  </description>
+
+  <interface name="zxdg_output_manager_v1" version="3">
+    <description summary="manage xdg_output objects">
+      A global factory interface for xdg_output objects.
+    </description>
+
+    <request name="destroy" type="destructor">
+      <description summary="destroy the xdg_output_manager object">
+	Using this request a client can tell the server that it is not
+	going to use the xdg_output_manager object anymore.
+
+	Any objects already created through this instance are not affected.
+      </description>
+    </request>
+
+    <request name="get_xdg_output">
+      <description summary="create an xdg output from a wl_output">
+	This creates a new xdg_output object for the given wl_output.
+      </description>
+      <arg name="id" type="new_id" interface="zxdg_output_v1"/>
+      <arg name="output" type="object" interface="wl_output"/>
+    </request>
+  </interface>
+
+  <interface name="zxdg_output_v1" version="3">
+    <description summary="compositor logical output region">
+      An xdg_output describes part of the compositor geometry.
+
+      This typically corresponds to a monitor that displays part of the
+      compositor space.
+
+      For objects version 3 onwards, after all xdg_output properties have been
+      sent (when the object is created and when properties are updated), a
+      wl_output.done event is sent. This allows changes to the output
+      properties to be seen as atomic, even if they happen via multiple events.
+    </description>
+
+    <request name="destroy" type="destructor">
+      <description summary="destroy the xdg_output object">
+	Using this request a client can tell the server that it is not
+	going to use the xdg_output object anymore.
+      </description>
+    </request>
+
+    <event name="logical_position">
+      <description summary="position of the output within the global compositor space">
+	The position event describes the location of the wl_output within
+	the global compositor space.
+
+	The logical_position event is sent after creating an xdg_output
+	(see xdg_output_manager.get_xdg_output) and whenever the location
+	of the output changes within the global compositor space.
+      </description>
+      <arg name="x" type="int"
+	   summary="x position within the global compositor space"/>
+      <arg name="y" type="int"
+	   summary="y position within the global compositor space"/>
+    </event>
+
+    <event name="logical_size">
+      <description summary="size of the output in the global compositor space">
+	The logical_size event describes the size of the output in the
+	global compositor space.
+
+	Most regular Wayland clients should not pay attention to the
+	logical size and would rather rely on xdg_shell interfaces.
+
+	Some clients such as Xwayland, however, need this to configure
+	their surfaces in the global compositor space as the compositor
+	may apply a different scale from what is advertised by the output
+	scaling property (to achieve fractional scaling, for example).
+
+	For example, for a wl_output mode 3840×2160 and a scale factor 2:
+
+	- A compositor not scaling the monitor viewport in its compositing space
+	  will advertise a logical size of 3840×2160,
+
+	- A compositor scaling the monitor viewport with scale factor 2 will
+	  advertise a logical size of 1920×1080,
+
+	- A compositor scaling the monitor viewport using a fractional scale of
+	  1.5 will advertise a logical size of 2560×1440.
+
+	For example, for a wl_output mode 1920×1080 and a 90 degree rotation,
+	the compositor will advertise a logical size of 1080x1920.
+
+	The logical_size event is sent after creating an xdg_output
+	(see xdg_output_manager.get_xdg_output) and whenever the logical
+	size of the output changes, either as a result of a change in the
+	applied scale or because of a change in the corresponding output
+	mode(see wl_output.mode) or transform (see wl_output.transform).
+      </description>
+      <arg name="width" type="int"
+	   summary="width in global compositor space"/>
+      <arg name="height" type="int"
+	   summary="height in global compositor space"/>
+    </event>
+
+    <event name="done">
+      <description summary="all information about the output have been sent">
+	This event is sent after all other properties of an xdg_output
+	have been sent.
+
+	This allows changes to the xdg_output properties to be seen as
+	atomic, even if they happen via multiple events.
+
+	For objects version 3 onwards, this event is deprecated. Compositors
+	are not required to send it anymore and must send wl_output.done
+	instead.
+      </description>
+    </event>
+
+    <!-- Version 2 additions -->
+
+    <event name="name" since="2">
+      <description summary="name of this output">
+	Many compositors will assign names to their outputs, show them to the
+	user, allow them to be configured by name, etc. The client may wish to
+	know this name as well to offer the user similar behaviors.
+
+	The naming convention is compositor defined, but limited to
+	alphanumeric characters and dashes (-). Each name is unique among all
+	wl_output globals, but if a wl_output global is destroyed the same name
+	may be reused later. The names will also remain consistent across
+	sessions with the same hardware and software configuration.
+
+	Examples of names include 'HDMI-A-1', 'WL-1', 'X11-1', etc. However, do
+	not assume that the name is a reflection of an underlying DRM
+	connector, X11 connection, etc.
+
+	The name event is sent after creating an xdg_output (see
+	xdg_output_manager.get_xdg_output). This event is only sent once per
+	xdg_output, and the name does not change over the lifetime of the
+	wl_output global.
+
+        This event is deprecated, instead clients should use wl_output.name.
+        Compositors must still support this event.
+      </description>
+      <arg name="name" type="string" summary="output name"/>
+    </event>
+
+    <event name="description" since="2">
+      <description summary="human-readable description of this output">
+	Many compositors can produce human-readable descriptions of their
+	outputs.  The client may wish to know this description as well, to
+	communicate the user for various purposes.
+
+	The description is a UTF-8 string with no convention defined for its
+	contents. Examples might include 'Foocorp 11" Display' or 'Virtual X11
+	output via :1'.
+
+	The description event is sent after creating an xdg_output (see
+	xdg_output_manager.get_xdg_output) and whenever the description
+	changes. The description is optional, and may not be sent at all.
+
+	For objects of version 2 and lower, this event is only sent once per
+	xdg_output, and the description does not change over the lifetime of
+	the wl_output global.
+
+	This event is deprecated, instead clients should use
+	wl_output.description. Compositors must still support this event.
+      </description>
+      <arg name="description" type="string" summary="output description"/>
+    </event>
+
+  </interface>
+</protocol>
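Note on the logical_size event described in the protocol file above: the worked examples in its description (3840×2160 at scale 2 giving 1920×1080, scale 1.5 giving 2560×1440, and a 90 degree rotation swapping width and height) can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of those examples. The names `logical_size` and `compute_logical_size` are hypothetical and are not part of the protocol or of GPU Screen Recorder, and the round-to-nearest step is an assumption. A real client should use the values delivered by the compositor in the logical_size event rather than compute them itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type for illustration only, not defined by xdg-output. */
typedef struct {
    int width;
    int height;
} logical_size;

/*
 * Estimate the logical size a compositor might advertise for an output.
 * mode_width/mode_height: the wl_output mode in physical pixels.
 * scale: the scale applied by the compositor (may be fractional, must be > 0).
 * rotated_90_or_270: true if the output transform rotates by 90 or 270 degrees.
 * Rounding to the nearest integer is an assumption made for this sketch.
 */
static logical_size compute_logical_size(int mode_width, int mode_height,
                                         double scale, bool rotated_90_or_270) {
    assert(scale > 0.0);

    logical_size size;
    size.width = (int)(mode_width / scale + 0.5);
    size.height = (int)(mode_height / scale + 0.5);

    /* A 90 or 270 degree rotation swaps the two dimensions. */
    if(rotated_90_or_270) {
        const int tmp = size.width;
        size.width = size.height;
        size.height = tmp;
    }
    return size;
}
```

For instance, calling `compute_logical_size(3840, 2160, 1.5, false)` reproduces the fractional scaling example from the description, and `compute_logical_size(1920, 1080, 1.0, true)` reproduces the rotation example.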
diff --git a/scripts/interactive.sh b/scripts/interactive.sh
index 63b0eae..bfaaae0 100755
--- a/scripts/interactive.sh
+++ b/scripts/interactive.sh
@@ -1,7 +1,5 @@
 #!/bin/sh -e
 
-selected_audio_input="$(pactl get-default-sink).monitor"
-
 echo "Select a window to record"
 window_id=$(xdotool selectwindow)
 
@@ -14,4 +12,4 @@ read output_file_name
 output_dir=$(dirname "$output_file_name")
 mkdir -p "$output_dir"
 
-gpu-screen-recorder -w "$window_id" -c mp4 -f "$fps" -a "$selected_audio_input" -o "$output_file_name"
+gpu-screen-recorder -w "$window_id" -c mp4 -f "$fps" -a default_output -o "$output_file_name"
diff --git a/scripts/record-application-name.sh b/scripts/record-application-name.sh
new file mode 100755
index 0000000..f8c9b0d
--- /dev/null
+++ b/scripts/record-application-name.sh
@@ -0,0 +1,6 @@
+#!/bin/sh
+
+window=$(xdotool selectwindow)
+window_name=$(xdotool getwindowname "$window" || xdotool getwindowclassname "$window" || echo "Game")
+window_name="$(echo "$window_name" | tr '/\\' '_')"
+gpu-screen-recorder -w "$window" -f 60 -a default_output -o "$HOME/Videos/recording/$window_name/$(date +"Video_%Y-%m-%d_%H-%M-%S.mp4")"
diff --git a/scripts/record-save-application-name.sh b/scripts/record-save-application-name.sh
new file mode 100755
index 0000000..c95f398
--- /dev/null
+++ b/scripts/record-save-application-name.sh
@@ -0,0 +1,14 @@
+#!/bin/sh
+
+# This script should be passed to gpu-screen-recorder with the -sc option, for example:
+# gpu-screen-recorder -w screen -f 60 -a default_output -r 60 -sc scripts/record-save-application-name.sh -c mp4 -o "$HOME/Videos"
+
+window=$(xdotool getwindowfocus)
+window_name=$(xdotool getwindowname "$window" || xdotool getwindowclassname "$window" || echo "Game")
+window_name="$(echo "$window_name" | tr '/\\' '_')"
+
+video_dir="$HOME/Videos/Replays/$window_name"
+mkdir -p "$video_dir"
+video="$video_dir/$(date +"${window_name}_%Y-%m-%d_%H-%M-%S.mp4")"
+mv "$1" "$video"
+sleep 0.5 && notify-send -t 2000 -u low "GPU Screen Recorder" "Replay saved to $video"
+\ No newline at end of file
diff --git a/scripts/replay-application-name.sh b/scripts/replay-application-name.sh
new file mode 100755
index 0000000..3c3f8c5
--- /dev/null
+++ b/scripts/replay-application-name.sh
@@ -0,0 +1,6 @@
+#!/bin/sh
+
+window=$(xdotool selectwindow)
+window_name=$(xdotool getwindowname "$window" || xdotool getwindowclassname "$window" || echo "Game")
+window_name="$(echo "$window_name" | tr '/\\' '_')"
+gpu-screen-recorder -w "$window" -f 60 -c mkv -a default_output -bm cbr -q 40000 -r 60 -o "$HOME/Videos/Replays/$window_name"
diff --git a/scripts/replay.sh b/scripts/replay.sh
deleted file mode 100755
index cf6c494..0000000
--- a/scripts/replay.sh
+++ /dev/null
@@ -1,6 +0,0 @@
-#!/bin/sh -e
-
-[ "$#" -ne 4 ] && echo "usage: replay.sh <window_id> <fps> <replay_time_sec> <output_directory>" && exit 1
-active_sink="$(pactl get-default-sink).monitor"
-mkdir -p "$4"
-gpu-screen-recorder -w "$1" -c mp4 -f "$2" -a "$active_sink" -r "$3" -o "$4"
diff --git a/scripts/save-recording.sh b/scripts/save-recording.sh
new file mode 100755
index 0000000..90fefc1
--- /dev/null
+++ b/scripts/save-recording.sh
@@ -0,0 +1,3 @@
+#!/bin/sh
+
+killall -SIGINT gpu-screen-recorder && sleep 0.5 && notify-send -t 1500 -u low "GPU Screen Recorder" "Recording saved"
diff --git a/scripts/save-replay.sh b/scripts/save-replay.sh
index eac9141..f9390aa 100755
--- a/scripts/save-replay.sh
+++ b/scripts/save-replay.sh
@@ -1,4 +1,3 @@
 #!/bin/sh -e
 
-killall -SIGUSR1 gpu-screen-recorder
-notify-send -t 5000 -u low -- "GPU Screen Recorder" "Replay saved"
+killall -SIGUSR1 gpu-screen-recorder && sleep 0.5 && notify-send -t 1500 -u low -- "GPU Screen Recorder" "Replay saved"
diff --git a/scripts/start-recording.sh b/scripts/start-recording.sh
new file mode 100755
index 0000000..03fda73
--- /dev/null
+++ b/scripts/start-recording.sh
@@ -0,0 +1,5 @@
+#!/bin/sh
+
+pidof -q gpu-screen-recorder && exit 0
+video="$HOME/Videos/$(date +"Video_%Y-%m-%d_%H-%M-%S.mp4")"
+gpu-screen-recorder -w screen -f 60 -a default_output -o "$video"
diff --git a/scripts/start-replay.sh b/scripts/start-replay.sh
index 29ff67a..d47a614 100755
--- a/scripts/start-replay.sh
+++ b/scripts/start-replay.sh
@@ -1,5 +1,6 @@
 #!/bin/sh
 
+pidof -q gpu-screen-recorder && exit 0
 video_path="$HOME/Videos"
 mkdir -p "$video_path"
-gpu-screen-recorder -w screen -f 60 -a "$(pactl get-default-sink).monitor" -c mp4 -r 30 -o "$video_path"
+gpu-screen-recorder -w screen -f 60 -a default_output -c mkv -bm cbr -q 40000 -r 30 -o "$video_path"
diff --git a/scripts/start-stop-recording.sh b/scripts/start-stop-recording.sh
new file mode 100755
index 0000000..775a829
--- /dev/null
+++ b/scripts/start-stop-recording.sh
@@ -0,0 +1,10 @@
+#!/bin/sh
+
+# Simple script that starts recording if nothing is being recorded and stops
+# the recording if one is already running. Bind this script to a hotkey
+# to start/stop recording with a single key press.
+
+killall -SIGINT -q gpu-screen-recorder && exit 0
+video="$HOME/Videos/$(date +"Video_%Y-%m-%d_%H-%M-%S.mp4")"
+gpu-screen-recorder -w screen -f 60 -a default_output -o "$video"
+notify-send -t 2000 -u low "GPU Screen Recorder" "Video saved to $video"
diff --git a/scripts/toggle-recording-selected.sh b/scripts/toggle-recording-selected.sh
index f87f71c..d4c1b38 100755
--- a/scripts/toggle-recording-selected.sh
+++ b/scripts/toggle-recording-selected.sh
@@ -1,9 +1,9 @@
 #!/bin/sh -e
 
-killall -INT gpu-screen-recorder && notify-send -u low 'GPU Screen Recorder' 'Stopped recording' && exit 0;
+killall -SIGINT gpu-screen-recorder && sleep 0.5 && notify-send -t 1500 -u low 'GPU Screen Recorder' 'Stopped recording' && exit 0;
 window=$(xdotool selectwindow)
-active_sink="$(pactl get-default-sink).monitor"
+active_sink=default_output
 mkdir -p "$HOME/Videos"
 video="$HOME/Videos/$(date +"Video_%Y-%m-%d_%H-%M-%S.mp4")"
-notify-send -t 5000 -u low 'GPU Screen Recorder' "Started recording video to $video"
+notify-send -t 1500 -u low 'GPU Screen Recorder' "Started recording video to $video"
 gpu-screen-recorder -w "$window" -c mp4 -f 60 -a "$active_sink" -o "$video"
diff --git a/scripts/toggle-recording.sh b/scripts/toggle-recording.sh
new file mode 100755
index 0000000..b353dc9
--- /dev/null
+++ b/scripts/toggle-recording.sh
@@ -0,0 +1,6 @@
+#!/bin/sh -e
+
+killall -SIGINT gpu-screen-recorder && sleep 0.5 && notify-send -t 1500 -u low 'GPU Screen Recorder' 'Stopped recording' && exit 0;
+video="$HOME/Videos/$(date +"Video_%Y-%m-%d_%H-%M-%S.mp4")"
+notify-send -t 1500 -u low 'GPU Screen Recorder' "Started recording video to $video"
+gpu-screen-recorder -w screen -f 60 -a "default_output" -o "$video"
diff --git a/scripts/twitch-stream-local-copy.sh b/scripts/twitch-stream-local-copy.sh
index dba9d15..fa23cf6 100755
--- a/scripts/twitch-stream-local-copy.sh
+++ b/scripts/twitch-stream-local-copy.sh
@@ -3,5 +3,5 @@
 # Stream on twitch while also saving the video to disk locally
 
 [ "$#" -ne 4 ] && echo "usage: twitch-stream-local-copy.sh <window_id> <fps> <livestream_key> <local_file>" && exit 1
-active_sink="$(pactl get-default-sink).monitor"
-gpu-screen-recorder -w "$1" -c flv -f "$2" -a "$active_sink" | tee -- "$4" | ffmpeg -i pipe:0 -c copy -f flv -- "rtmp://live.twitch.tv/app/$3"
+active_sink=default_output
+gpu-screen-recorder -w "$1" -c flv -f "$2" -q high -a "$active_sink" | tee -- "$4" | ffmpeg -i pipe:0 -c copy -f flv -- "rtmp://live.twitch.tv/app/$3"
diff --git a/scripts/twitch-stream.sh b/scripts/twitch-stream.sh
index cd4737a..99dade8 100755
--- a/scripts/twitch-stream.sh
+++ b/scripts/twitch-stream.sh
@@ -1,5 +1,5 @@
 #!/bin/sh
 
 [ "$#" -ne 3 ] && echo "usage: twitch-stream.sh <window_id> <fps> <livestream_key>" && exit 1
-active_sink="$(pactl get-default-sink).monitor"
-gpu-screen-recorder -w "$1" -c flv -f "$2" -a "$active_sink" -o "rtmp://live.twitch.tv/app/$3"
+active_sink=default_output
+gpu-screen-recorder -w "$1" -c flv -f "$2" -q high -a "$active_sink" -o "rtmp://live.twitch.tv/app/$3"
diff --git a/scripts/youtube-hls-stream.sh b/scripts/youtube-hls-stream.sh
index 21619af..10fa6b2 100755
--- a/scripts/youtube-hls-stream.sh
+++ b/scripts/youtube-hls-stream.sh
@@ -1,11 +1,5 @@
 #!/bin/sh
 
 [ "$#" -ne 3 ] && echo "usage: youtube-hls-stream.sh <window_id> <fps> <livestream_key>" && exit 1
-mkdir "youtube_stream"
-cd "youtube_stream"
-active_sink="$(pactl get-default-sink).monitor"
-gpu-screen-recorder -w "$1" -c mpegts -f "$2" -a "$active_sink" | ffmpeg -i pipe:0 -c copy -f hls \
-    -hls_time 2 -hls_flags independent_segments -hls_flags delete_segments -hls_segment_type mpegts -hls_segment_filename stream%02d.ts -master_pl_name stream.m3u8 out1 &
-echo "Waiting until stream segments are created..."
-sleep 10
-ffmpeg -i stream.m3u8 -c copy -- "https://a.upload.youtube.com/http_upload_hls?cid=$3&copy=0&file=stream.m3u8"
+active_sink=default_output
+gpu-screen-recorder -w "$1" -c hls -f "$2" -q high -a "$active_sink" -ac aac -o "https://a.upload.youtube.com/http_upload_hls?cid=$3&copy=0&file=stream.m3u8"
+\ No newline at end of file
diff --git a/src/args_parser.c b/src/args_parser.c
new file mode 100644
index 0000000..0e05557
--- /dev/null
+++ b/src/args_parser.c
@@ -0,0 +1,924 @@
+#include "../include/args_parser.h"
+#include "../include/defs.h"
+#include "../include/egl.h"
+#include "../include/window/window.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <assert.h>
+#include <libgen.h>
+#include <sys/stat.h>
+
+#ifndef GSR_VERSION
+#define GSR_VERSION "unknown"
+#endif
+
+static const ArgEnum video_codec_enums[] = {
+    { .name = "auto",        .value = GSR_VIDEO_CODEC_AUTO       },
+    { .name = "h264",        .value = GSR_VIDEO_CODEC_H264       },
+    { .name = "h265",        .value = GSR_VIDEO_CODEC_HEVC       },
+    { .name = "hevc",        .value = GSR_VIDEO_CODEC_HEVC       },
+    { .name = "hevc_hdr",    .value = GSR_VIDEO_CODEC_HEVC_HDR   },
+    { .name = "hevc_10bit",  .value = GSR_VIDEO_CODEC_HEVC_10BIT },
+    { .name = "av1",         .value = GSR_VIDEO_CODEC_AV1        },
+    { .name = "av1_hdr",     .value = GSR_VIDEO_CODEC_AV1_HDR    },
+    { .name = "av1_10bit",   .value = GSR_VIDEO_CODEC_AV1_10BIT  },
+    { .name = "vp8",         .value = GSR_VIDEO_CODEC_VP8        },
+    { .name = "vp9",         .value = GSR_VIDEO_CODEC_VP9        },
+};
+
+static const ArgEnum audio_codec_enums[] = {
+    { .name = "opus", .value = GSR_AUDIO_CODEC_OPUS },
+    { .name = "aac",  .value = GSR_AUDIO_CODEC_AAC  },
+    { .name = "flac", .value = GSR_AUDIO_CODEC_FLAC },
+};
+
+static const ArgEnum video_encoder_enums[] = {
+    { .name = "gpu", .value = GSR_VIDEO_ENCODER_HW_GPU },
+    { .name = "cpu", .value = GSR_VIDEO_ENCODER_HW_CPU },
+};
+
+static const ArgEnum pixel_format_enums[] = {
+    { .name = "yuv420", .value = GSR_PIXEL_FORMAT_YUV420 },
+    { .name = "yuv444", .value = GSR_PIXEL_FORMAT_YUV444 },
+};
+
+static const ArgEnum framerate_mode_enums[] = {
+    { .name = "vfr",     .value = GSR_FRAMERATE_MODE_VARIABLE },
+    { .name = "cfr",     .value = GSR_FRAMERATE_MODE_CONSTANT },
+    { .name = "content", .value = GSR_FRAMERATE_MODE_CONTENT  },
+};
+
+static const ArgEnum bitrate_mode_enums[] = {
+    { .name = "auto", .value = GSR_BITRATE_MODE_AUTO },
+    { .name = "qp",   .value = GSR_BITRATE_MODE_QP   },
+    { .name = "cbr",  .value = GSR_BITRATE_MODE_CBR  },
+    { .name = "vbr",  .value = GSR_BITRATE_MODE_VBR  },
+};
+
+static const ArgEnum color_range_enums[] = {
+    { .name = "limited", .value = GSR_COLOR_RANGE_LIMITED },
+    { .name = "full",    .value = GSR_COLOR_RANGE_FULL    },
+};
+
+static const ArgEnum tune_enums[] = {
+    { .name = "performance", .value = GSR_TUNE_PERFORMANCE },
+    { .name = "quality",     .value = GSR_TUNE_QUALITY     },
+};
+
+static const ArgEnum replay_storage_enums[] = {
+    { .name = "ram",  .value = GSR_REPLAY_STORAGE_RAM  },
+    { .name = "disk", .value = GSR_REPLAY_STORAGE_DISK },
+};
+
+static void arg_deinit(Arg *arg) {
+    if(arg->values) {
+        free(arg->values);
+        arg->values = NULL;
+    }
+}
+
+static bool arg_append_value(Arg *arg, const char *value) {
+    if(arg->num_values + 1 >= arg->capacity_num_values) {
+        const int new_capacity_num_values = arg->capacity_num_values == 0 ? 4 : arg->capacity_num_values*2;
+        void *new_data = realloc(arg->values, new_capacity_num_values * sizeof(const char*));
+        if(!new_data)
+            return false;
+
+        arg->values = new_data;
+        arg->capacity_num_values = new_capacity_num_values;
+    }
+
+    arg->values[arg->num_values] = value;
+    ++arg->num_values;
+    return true;
+}
+
+static bool arg_get_enum_value_by_name(const Arg *arg, const char *name, int *enum_value) {
+    assert(arg->type == ARG_TYPE_ENUM);
+    assert(arg->enum_values);
+    for(int i = 0; i < arg->num_enum_values; ++i) {
+        if(strcmp(arg->enum_values[i].name, name) == 0) {
+            *enum_value = arg->enum_values[i].value;
+            return true;
+        }
+    }
+    return false;
+}
+
+static void arg_print_expected_enum_names(const Arg *arg) {
+    assert(arg->type == ARG_TYPE_ENUM);
+    assert(arg->enum_values);
+    for(int i = 0; i < arg->num_enum_values; ++i) {
+        if(i > 0) {
+            if(i == arg->num_enum_values -1)
+                fprintf(stderr, " or ");
+            else
+                fprintf(stderr, ", ");
+        }
+        fprintf(stderr, "'%s'", arg->enum_values[i].name);
+    }
+}
+
+static Arg* args_get_by_key(Arg *args, int num_args, const char *key) {
+    for(int i = 0; i < num_args; ++i) {
+        if(strcmp(args[i].key, key) == 0)
+            return &args[i];
+    }
+    return NULL;
+}
+
+static const char* args_get_value_by_key(Arg *args, int num_args, const char *key) {
+    for(int i = 0; i < num_args; ++i) {
+        if(strcmp(args[i].key, key) == 0) {
+            if(args[i].num_values == 0)
+                return NULL;
+            else
+                return args[i].values[0];
+        }
+    }
+    return NULL;
+}
+
+static bool args_get_boolean_by_key(Arg *args, int num_args, const char *key, bool default_value) {
+    Arg *arg = args_get_by_key(args, num_args, key);
+    assert(arg);
+    if(arg->num_values == 0) {
+        return default_value;
+    } else {
+        assert(arg->type == ARG_TYPE_BOOLEAN);
+        return arg->typed_value.boolean;
+    }
+}
+
+static int args_get_enum_by_key(Arg *args, int num_args, const char *key, int default_value) {
+    Arg *arg = args_get_by_key(args, num_args, key);
+    assert(arg);
+    if(arg->num_values == 0) {
+        return default_value;
+    } else {
+        assert(arg->type == ARG_TYPE_ENUM);
+        return arg->typed_value.enum_value;
+    }
+}
+
+static int64_t args_get_i64_by_key(Arg *args, int num_args, const char *key, int64_t default_value) {
+    Arg *arg = args_get_by_key(args, num_args, key);
+    assert(arg);
+    if(arg->num_values == 0) {
+        return default_value;
+    } else {
+        assert(arg->type == ARG_TYPE_I64);
+        return arg->typed_value.i64_value;
+    }
+}
+
+static double args_get_double_by_key(Arg *args, int num_args, const char *key, double default_value) {
+    Arg *arg = args_get_by_key(args, num_args, key);
+    assert(arg);
+    if(arg->num_values == 0) {
+        return default_value;
+    } else {
+        assert(arg->type == ARG_TYPE_DOUBLE);
+        return arg->typed_value.d_value;
+    }
+}
+
+static void usage_header() {
+    const bool inside_flatpak = getenv("FLATPAK_ID") != NULL;
+    const char *program_name = inside_flatpak ? "flatpak run --command=gpu-screen-recorder com.dec05eba.gpu_screen_recorder" : "gpu-screen-recorder";
+    printf("usage: %s -w <window_id|monitor|focused|portal|region> [-c <container_format>] [-s WxH] [-region WxH+X+Y] [-f <fps>] [-a <audio_input>] [-q <quality>] [-r <replay_buffer_size_sec>] [-replay-storage ram|disk] [-restart-replay-on-save yes|no] [-k h264|hevc|av1|vp8|vp9|hevc_hdr|av1_hdr|hevc_10bit|av1_10bit] [-ac aac|opus|flac] [-ab <bitrate>] [-oc yes|no] [-fm cfr|vfr|content] [-bm auto|qp|vbr|cbr] [-cr limited|full] [-tune performance|quality] [-df yes|no] [-sc <script_path>] [-cursor yes|no] [-keyint <value>] [-restore-portal-session yes|no] [-portal-session-token-filepath filepath] [-encoder gpu|cpu] [-o <output_file>] [-ro <output_directory>] [--list-capture-options [card_path]] [--list-audio-devices] [--list-application-audio] [-v yes|no] [-gl-debug yes|no] [--version] [-h|--help]\n", program_name);
+    fflush(stdout);
+}
+
+static void usage_full() {
+    const bool inside_flatpak = getenv("FLATPAK_ID") != NULL;
+    const char *program_name = inside_flatpak ? "flatpak run --command=gpu-screen-recorder com.dec05eba.gpu_screen_recorder" : "gpu-screen-recorder";
+    usage_header();
+    printf("\n");
+    printf("OPTIONS:\n");
+    printf("  -w    Window id to record, a display (monitor name), \"screen\", \"screen-direct\", \"focused\", \"portal\" or \"region\".\n");
+    printf("        If this is \"portal\" then xdg desktop screencast portal with PipeWire will be used. Portal option is only available on Wayland.\n");
+    printf("        If you select to save the session (token) in the desktop portal capture popup then the session will be saved for the next time you use \"portal\",\n");
+    printf("        but the session will be ignored unless you run GPU Screen Recorder with the '-restore-portal-session yes' option.\n");
+    printf("        If this is \"region\" then the region specified by the -region option is recorded.\n");
+    printf("        If this is \"screen\" then the first monitor found is recorded.\n");
+    printf("        \"screen-direct\" can only be used on Nvidia X11, to allow recording without breaking VRR (G-SYNC). This also records all of your monitors.\n");
+    printf("        Using this \"screen-direct\" option is not recommended unless you use VRR (G-SYNC) as there are Nvidia driver issues that can cause your system or games to freeze/crash.\n");
+    printf("        The \"screen-direct\" option is not needed on AMD, Intel or Nvidia on Wayland as VRR works properly in those cases.\n");
+    printf("        Run GPU Screen Recorder with the --list-capture-options option to list valid values for this option.\n");
+    printf("\n");
+    printf("  -c    Container format for output file, for example mp4 or flv. Only required if no output file is specified or if recording in replay buffer mode.\n");
+    printf("        If an output file is specified and -c is not used then the container format is determined from the output filename extension.\n");
+    printf("        Only containers that support h264, hevc, av1, vp8 or vp9 are supported, which means that only mp4, mkv, flv, webm (and some others) are supported.\n");
+    printf("\n");
+    printf("  -s    The output resolution limit of the video in the format WxH, for example 1920x1080. If this is 0x0 then the original resolution is used. Optional, except when -w is \"focused\".\n");
+    printf("        Note: the captured content is scaled to this size. The output resolution might not be exactly as specified by this option. The original aspect ratio is respected so the resolution will match that.\n");
+    printf("        The video encoder might also need to add padding, which will result in black bars on the sides of the video. This is especially an issue on AMD.\n");
+    printf("\n");
+    printf("  -region\n");
+    printf("        The region to capture, only to be used with -w region. This is in format WxH+X+Y, which is compatible with tools such as slop (X11) and slurp (kde plasma, wlroots and hyprland).\n");
+    printf("        The region can be inside any monitor. If width and height are 0 (for example 0x0+500+500) then the entire monitor that contains the region will be recorded.\n");
+    printf("        Note: currently the region can't span multiple monitors.\n");
+    printf("\n");
+    printf("  -f    Frame rate to record at. Recording will only capture frames at this target frame rate.\n");
+    printf("        For constant frame rate mode this option is the frame rate every frame will be captured at and if the capture frame rate is below this target frame rate then the frames will be duplicated.\n");
+    printf("        For variable frame rate mode this option is the max frame rate and if the capture frame rate is below this target frame rate then frames will not be duplicated.\n");
+    printf("        Content frame rate is similar to variable frame rate mode, except the frame rate will match the frame rate of the captured content when possible, but not capturing above the frame rate set in this -f option.\n");
+    printf("        Optional, set to 60 by default.\n");
+    printf("\n");
+    printf("  -a    Audio device or application to record from (pulse audio device). Can be specified multiple times. Each time this is specified a new audio track is added for the specified audio device or application.\n");
+    printf("        The audio device can also be \"default_output\" in which case the default output device is used, or \"default_input\" in which case the default input device is used.\n");
+    printf("        Multiple audio sources can be merged into one audio track by using \"|\" as a separator into one -a argument, for example: -a \"default_output|default_input\".\n");
+    printf("        The audio name can also be prefixed with \"device:\", for example: -a \"device:default_output\".\n");
+    printf("        To record audio from an application, prefix the audio name with \"app:\", for example: -a \"app:Brave\". The application name is case-insensitive.\n");
+    printf("        To record audio from all applications except the provided ones, prefix the audio name with \"app-inverse:\", for example: -a \"app-inverse:Brave\".\n");
+    printf("        \"app:\" and \"app-inverse:\" can't be mixed in one audio track.\n");
+    printf("        One audio track can contain both audio devices and application audio, for example: -a \"default_output|device:alsa_output.pci-0000_00_1b.0.analog-stereo.monitor|app:Brave\".\n");
+    printf("        Recording application audio is only possible when the sound server on the system is PipeWire.\n");
+    printf("        If the audio name is an empty string then the argument is ignored.\n");
+    printf("        Optional, no audio track is added by default.\n");
+    printf("        Run GPU Screen Recorder with the --list-audio-devices option to list valid audio device names.\n");
+    printf("        Run GPU Screen Recorder with the --list-application-audio option to list valid application names. It's possible to use an application name that is not listed in --list-application-audio,\n");
+    printf("        for example when trying to record audio from an application that hasn't started yet.\n");
+    printf("\n");
+    printf("  -q    Video quality. Should be either 'medium', 'high', 'very_high' or 'ultra' when using '-bm qp' or '-bm vbr' options, and '-bm qp' is the default option used.\n");
+    printf("        'high' is the recommended option when live streaming or when you have a slower hard drive.\n");
+    printf("        When using the '-bm cbr' option this option is instead used to specify the video bitrate in kbps.\n");
+    printf("        Optional when using '-bm qp' or '-bm vbr' options, set to 'very_high' by default.\n");
+    printf("        Required when using '-bm cbr' option.\n");
+    printf("\n");
+    printf("  -r    Replay buffer time in seconds. If this is set, then only the last seconds as set by this option will be stored\n");
+    printf("        and the video will only be saved when the gpu-screen-recorder is closed. This feature is similar to Nvidia's instant replay feature. This option has to be between 5 and 1200.\n");
+    printf("        Note that the video data is stored in RAM (unless -replay-storage disk is used), so don't use too long replay buffer time and use constant bitrate option (-bm cbr) to prevent RAM usage from going too high in busy scenes.\n");
+    printf("        Optional, disabled by default.\n");
+    printf("\n");
+    printf("  -replay-storage\n");
+    printf("        Specify where temporary replay is stored. Should be either 'ram' or 'disk'. If set to 'disk' then replay data is stored in temporary files in the same directory as -o.\n");
+    printf("        Preferably avoid setting this to 'disk' unless -o is set to a HDD, as constant writes to a SSD can reduce the life-time of the SSD.\n");
+    printf("        Optional, set to 'ram' by default.\n");
+    printf("\n");
+    printf("  -restart-replay-on-save\n");
+    printf("        Restart replay on save. For example if this is set to 'no' and replay time (-r) is set to 60 seconds and a replay is saved once then the first replay video is 60 seconds long\n");
+    printf("        and if a replay is saved 10 seconds later then the second replay video will also be 60 seconds long and contain 50 seconds of the previous video as well.\n");
+    printf("        If this is set to 'yes' then after a replay is saved the replay buffer data is cleared and the second replay will start from that point onward.\n");
+    printf("        The replay is only restarted when saving a full replay (SIGUSR1 signal)\n");
+    printf("        Optional, set to 'no' by default.\n");
+    printf("\n");
+    printf("  -k    Video codec to use. Should be either 'auto', 'h264', 'hevc', 'av1', 'vp8', 'vp9', 'hevc_hdr', 'av1_hdr', 'hevc_10bit' or 'av1_10bit'.\n");
+    printf("        Optional, set to 'auto' by default which defaults to 'h264'. Forcefully set to 'h264' if the file container type is 'flv'.\n");
+    printf("        The 'hevc_hdr' and 'av1_hdr' options are not available on X11 or when using the portal capture option.\n");
+    printf("        'hevc_10bit' and 'av1_10bit' options allow you to select 10 bit color depth which can reduce banding and improve quality in darker areas, but not all video players support 10 bit color depth\n");
+    printf("        and if you upload the video to a website the website might reduce 10 bit to 8 bit.\n");
+    printf("        Note that when using 'hevc_hdr' or 'av1_hdr' the color depth is also 10 bits.\n");
+    printf("\n");
+    printf("  -ac   Audio codec to use. Should be either 'aac', 'opus' or 'flac'. Optional, set to 'opus' for .mp4/.mkv files, otherwise set to 'aac'.\n");
+    printf("        'opus' and 'flac' are only supported by .mp4/.mkv files. 'opus' is recommended for best performance and smallest audio size.\n");
+    printf("        The flac audio codec option is disabled at the moment because of a temporary issue.\n");
+    printf("\n");
+    printf("  -ab   Audio bitrate in kbps. If this is set to 0 then it's the same as if it's absent, in which case the bitrate is determined automatically depending on the audio codec.\n");
+    printf("        Optional, by default the bitrate is 128kbps for opus and flac and 160kbps for aac.\n");
+    printf("\n");
+    printf("  -oc   Overclock memory transfer rate to the maximum performance level. This only applies to NVIDIA on X11 and exists to overcome a bug in NVIDIA driver where performance level\n");
+    printf("        is dropped when you record a game. Only needed if you are recording a game that is bottlenecked by GPU. The same issue exists on Wayland but overclocking is not possible on Wayland.\n");
+    printf("        Works only if you have \"Coolbits\" set to \"12\" in NVIDIA X settings, see README for more information. Note! use at your own risk! Optional, disabled by default.\n");
+    printf("\n");
+    printf("  -fm   Framerate mode. Should be either 'cfr' (constant frame rate), 'vfr' (variable frame rate) or 'content'. Optional, set to 'vfr' by default.\n");
+    printf("        'vfr' is recommended for recording as it causes fewer issues under very high system load, but some applications such as video editors may not support it properly.\n");
+    printf("        'content' is currently only supported on X11 or when using portal capture option. The 'content' option matches the recording frame rate to the captured content.\n");
+    printf("\n");
+    printf("  -bm   Bitrate mode. Should be either 'auto', 'qp' (constant quality), 'vbr' (variable bitrate) or 'cbr' (constant bitrate). Optional, set to 'auto' by default which defaults to 'qp' on all devices\n");
+    printf("        except steam deck that has broken drivers and doesn't support qp.\n");
+    printf("        Note: 'vbr' option is not supported when using '-encoder cpu' option.\n");
+    printf("\n");
+    printf("  -cr   Color range. Should be either 'limited' (aka mpeg) or 'full' (aka jpeg). Optional, set to 'limited' by default.\n");
+    printf("        Limited color range means that colors are in range 16-235 (4112-60395 for hdr) while full color range means that colors are in range 0-255 (0-65535 for hdr).\n");
+    printf("        Note that some buggy video players (such as vlc) are unable to correctly display videos in full color range and when you upload the video to a website the website\n");
+    printf("        might re-encode the video to limited color range.\n");
+    printf("\n");
+    printf("  -tune\n");
+    printf("        Tune for performance or quality. Should be either 'performance' or 'quality'. At the moment this option only has an effect on Nvidia where setting this to quality\n");
+    printf("        sets options such as preset, multipass and b frames. Optional, set to 'performance' by default.\n");
+    printf("\n");
+    printf("  -df   Organise replays in folders based on the current date.\n");
+    printf("\n");
+    printf("  -sc   Run a script on the saved video file (asynchronously). The first argument to the script is the filepath to the saved video file and the second argument is the recording type (either \"regular\" or \"replay\").\n");
+    printf("        Not applicable for live streams.\n");
+    printf("\n");
+    printf("  -cursor\n");
+    printf("        Record cursor. Optional, set to 'yes' by default.\n");
+    printf("\n");
+    printf("  -keyint\n");
+    printf("        Specifies the keyframe interval in seconds, the max amount of time to wait to generate a keyframe. Keyframes can be generated more often than this.\n");
+    printf("        This also affects seeking in the video and may affect how the replay video is cut. If this is set to 10 for example then you can only seek in 10-second chunks in the video.\n");
+    printf("        Setting this to a higher value reduces the video file size if you are ok with the previously described downside. This option is expected to be a floating point number.\n");
+    printf("        By default this value is set to 2.0.\n");
+    printf("\n");
+    printf("  -restore-portal-session\n");
+    printf("        If GPU Screen Recorder should use the same capture option as the last time. Using this option removes the popup asking what you want to record the next time you record with '-w portal'\n");
+    printf("        if you selected the option to save session (token) in the desktop portal screencast popup.\n");
+    printf("        This option may not have any effect on your Wayland compositor and your system's desktop portal needs to support ScreenCast version 5 or later. Optional, set to 'no' by default.\n");
+    printf("\n");
+    printf("  -portal-session-token-filepath\n");
+    printf("        This option is used together with -restore-portal-session option to specify the file path to save/restore the portal session token to/from.\n");
+    printf("        This can be used to remember different portal capture options for different recording options (such as recording/replay).\n");
+    printf("        Optional, set to \"$XDG_CONFIG_HOME/gpu-screen-recorder/restore_token\" by default ($XDG_CONFIG_HOME defaults to \"$HOME/.config\").\n");
+    printf("        Note: the directory to the portal session token file is created automatically if it doesn't exist.\n");
+    printf("\n");
+    printf("  -encoder\n");
+    printf("        Which device should be used for video encoding. Should either be 'gpu' or 'cpu'. The 'cpu' option currently only works with the h264 codec option (-k).\n");
+    printf("        Optional, set to 'gpu' by default.\n");
+    printf("\n");
+    printf("  --info\n");
+    printf("        List info about the system. Lists the following information (prints them to stdout and exits):\n");
+    printf("        Supported video codecs (h264, h264_software, hevc, hevc_hdr, hevc_10bit, av1, av1_hdr, av1_10bit, vp8, vp9) and image codecs (jpeg, png) (if supported).\n");
+    printf("        Supported capture options (window, focused, screen, monitors and portal, if supported by the system).\n");
+    printf("        If opengl initialization fails then the program exits with 22, if no usable drm device is found then it exits with 23. On success it exits with 0.\n");
+    printf("\n");
+    printf("  --list-capture-options\n");
+    printf("        List available capture options. Lists capture options in the following format (prints them to stdout and exits):\n");
+    printf("          <option>\n");
+    printf("          <monitor_name>|<resolution>\n");
+    printf("        For example:\n");
+    printf("          window\n");
+    printf("          DP-1|1920x1080\n");
+    printf("        The <option> and <monitor_name> are the names that can be passed to GPU Screen Recorder with the -w option.\n");
+    printf("        --list-capture-options optionally accepts a card path (\"/dev/dri/cardN\") which can improve the performance of running this command.\n");
+    printf("\n");
+    printf("  --list-audio-devices\n");
+    printf("        List audio devices. Lists audio devices in the following format (prints them to stdout and exits):\n");
+    printf("          <audio_device_name>|<audio_device_name_in_human_readable_format>\n");
+    printf("        For example:\n");
+    printf("          bluez_input.88:C9:E8:66:A2:27|WH-1000XM4\n");
+    printf("          alsa_output.pci-0000_0c_00.4.iec958-stereo|Monitor of Starship/Matisse HD Audio Controller Digital Stereo (IEC958)\n");
+    printf("        The <audio_device_name> is the name that can be passed to GPU Screen Recorder with the -a option.\n");
+    printf("\n");
+    printf("  --list-application-audio\n");
+    printf("        Lists applications that you can record from (prints them to stdout and exits), for example:\n");
+    printf("          firefox\n");
+    printf("          csgo\n");
+    printf("        These names are the application audio names that can be passed to GPU Screen Recorder with the -a option.\n");
+    printf("\n");
+    printf("  --version\n");
+    printf("        Print version (%s) and exit\n", GSR_VERSION);
+    printf("\n");
+    //fprintf(stderr, "  -pixfmt  The pixel format to use for the output video. yuv420 is the most common format and is best supported, but the color is compressed, so colors can look washed out and certain colors of text can look bad. Use yuv444 for no color compression, but the video may not work everywhere and it may not work with hardware video decoding. Optional, set to 'yuv420' by default\n");
+    printf("  -o    The output file path. If omitted then the encoded data is sent to stdout. Required in replay mode (when using -r).\n");
+    printf("        In replay mode this has to be a directory instead of a file.\n");
+    printf("        Note: the directory to the file is created automatically if it doesn't already exist.\n");
+    printf("\n");
+    printf("  -ro   The output directory for regular recordings in replay/streaming mode. Required to start recording in replay/streaming mode.\n");
+    printf("        Note: the directory to the file is created automatically if it doesn't already exist.\n");
+    printf("\n");
+    printf("  -v    Prints fps and damage info once per second. Optional, set to 'yes' by default.\n");
+    printf("\n");
+    printf("  -gl-debug\n");
+    printf("        Print opengl debug output. Optional, set to 'no' by default.\n");
+    printf("\n");
+    printf("  -h, --help\n");
+    printf("        Show this help.\n");
+    printf("\n");
+    printf("NOTES:\n");
+    printf("  Send signal SIGINT to gpu-screen-recorder (Ctrl+C, or pkill -SIGINT -f gpu-screen-recorder) to stop and save the recording. When in replay mode this stops recording without saving.\n");
+    printf("  Send signal SIGUSR2 to gpu-screen-recorder (pkill -SIGUSR2 -f gpu-screen-recorder) to pause/unpause recording. Only applicable when recording (not streaming nor replay).\n");
+    printf("  Send signal SIGUSR1 to gpu-screen-recorder (pkill -SIGUSR1 -f gpu-screen-recorder) to save a replay (when in replay mode).\n");
+    printf("  Send signal SIGRTMIN+1 to gpu-screen-recorder (pkill -SIGRTMIN+1 -f gpu-screen-recorder) to save a replay of the last 10 seconds (when in replay mode).\n");
+    printf("  Send signal SIGRTMIN+2 to gpu-screen-recorder (pkill -SIGRTMIN+2 -f gpu-screen-recorder) to save a replay of the last 30 seconds (when in replay mode).\n");
+    printf("  Send signal SIGRTMIN+3 to gpu-screen-recorder (pkill -SIGRTMIN+3 -f gpu-screen-recorder) to save a replay of the last 60 seconds (when in replay mode).\n");
+    printf("  Send signal SIGRTMIN+4 to gpu-screen-recorder (pkill -SIGRTMIN+4 -f gpu-screen-recorder) to save a replay of the last 5 minutes (when in replay mode).\n");
+    printf("  Send signal SIGRTMIN+5 to gpu-screen-recorder (pkill -SIGRTMIN+5 -f gpu-screen-recorder) to save a replay of the last 10 minutes (when in replay mode).\n");
+    printf("  Send signal SIGRTMIN+6 to gpu-screen-recorder (pkill -SIGRTMIN+6 -f gpu-screen-recorder) to save a replay of the last 30 minutes (when in replay mode).\n");
+    printf("  Send signal SIGRTMIN to gpu-screen-recorder (pkill -SIGRTMIN -f gpu-screen-recorder) to start/stop recording a regular video when in replay/streaming mode.\n");
+    printf("\n");
+    printf("EXAMPLES:\n");
+    printf("  %s -w screen -f 60 -a default_output -o video.mp4\n", program_name);
+    printf("  %s -w screen -f 60 -a default_output -a default_input -o video.mp4\n", program_name);
+    printf("  %s -w $(xdotool selectwindow) -f 60 -a default_output -o video.mp4\n", program_name);
+    printf("  %s -w screen -f 60 -a \"default_output|default_input\" -o video.mp4\n", program_name);
+    printf("  %s -w screen -f 60 -a default_output -c mkv -r 60 -o \"$HOME/Videos\"\n", program_name);
+    printf("  %s -w screen -f 60 -a default_output -c mkv -r 1800 -replay-storage disk -bm cbr -q 40000 -o \"$HOME/Videos\"\n", program_name);
+    printf("  %s -w screen -f 60 -a default_output -c mkv -sc script.sh -r 60 -o \"$HOME/Videos\"\n", program_name);
+    printf("  %s -w portal -f 60 -a default_output -restore-portal-session yes -o video.mp4\n", program_name);
+    printf("  %s -w screen -f 60 -a default_output -bm cbr -q 15000 -o video.mp4\n", program_name);
+    printf("  %s -w screen -f 60 -a \"app:firefox|app:csgo\" -o video.mp4\n", program_name);
+    printf("  %s -w screen -f 60 -a \"app-inverse:firefox|app-inverse:csgo\" -o video.mp4\n", program_name);
+    printf("  %s -w screen -f 60 -a \"default_input|app-inverse:Brave\" -o video.mp4\n", program_name);
+    printf("  %s -w screen -o image.jpg\n", program_name);
+    printf("  %s -w screen -q medium -o image.jpg\n", program_name);
+    printf("  %s -w region -region 640x480+100+100 -o video.mp4\n", program_name);
+    printf("  %s -w region -region $(slop) -o video.mp4\n", program_name);
+    printf("  %s -w region -region $(slurp -f \"%%wx%%h+%%x+%%y\") -o video.mp4\n", program_name);
+    //fprintf(stderr, "  gpu-screen-recorder -w screen -f 60 -q ultra -pixfmt yuv444 -o video.mp4\n");
+    fflush(stdout);
+}
+
+static void usage() {
+    usage_header();
+}
+
+// TODO: Does this match all livestreaming cases?
+static bool is_livestream_path(const char *str) {
+    const int len = strlen(str);
+    if((len >= 7 && memcmp(str, "http://", 7) == 0) || (len >= 8 && memcmp(str, "https://", 8) == 0))
+        return true;
+    else if((len >= 7 && memcmp(str, "rtmp://", 7) == 0) || (len >= 8 && memcmp(str, "rtmps://", 8) == 0))
+        return true;
+    else if((len >= 7 && memcmp(str, "rtsp://", 7) == 0))
+        return true;
+    else if((len >= 6 && memcmp(str, "srt://", 6) == 0))
+        return true;
+    else if((len >= 6 && memcmp(str, "tcp://", 6) == 0))
+        return true;
+    else if((len >= 6 && memcmp(str, "udp://", 6) == 0))
+        return true;
+    else
+        return false;
+}
+
+static bool args_parser_set_values(args_parser *self) {
+    self->video_encoder = (gsr_video_encoder_hardware)args_get_enum_by_key(self->args, NUM_ARGS, "-encoder", GSR_VIDEO_ENCODER_HW_GPU);
+    self->pixel_format = (gsr_pixel_format)args_get_enum_by_key(self->args, NUM_ARGS, "-pixfmt", GSR_PIXEL_FORMAT_YUV420);
+    self->framerate_mode = (gsr_framerate_mode)args_get_enum_by_key(self->args, NUM_ARGS, "-fm", GSR_FRAMERATE_MODE_VARIABLE);
+    self->color_range = (gsr_color_range)args_get_enum_by_key(self->args, NUM_ARGS, "-cr", GSR_COLOR_RANGE_LIMITED);
+    self->tune = (gsr_tune)args_get_enum_by_key(self->args, NUM_ARGS, "-tune", GSR_TUNE_PERFORMANCE);
+    self->video_codec = (gsr_video_codec)args_get_enum_by_key(self->args, NUM_ARGS, "-k", GSR_VIDEO_CODEC_AUTO);
+    self->audio_codec = (gsr_audio_codec)args_get_enum_by_key(self->args, NUM_ARGS, "-ac", GSR_AUDIO_CODEC_OPUS);
+    self->bitrate_mode = (gsr_bitrate_mode)args_get_enum_by_key(self->args, NUM_ARGS, "-bm", GSR_BITRATE_MODE_AUTO);
+    self->replay_storage = (gsr_replay_storage)args_get_enum_by_key(self->args, NUM_ARGS, "-replay-storage", GSR_REPLAY_STORAGE_RAM);
+
+    const char *window = args_get_value_by_key(self->args, NUM_ARGS, "-w");
+    snprintf(self->window, sizeof(self->window), "%s", window);
+    self->verbose = args_get_boolean_by_key(self->args, NUM_ARGS, "-v", true);
+    self->gl_debug = args_get_boolean_by_key(self->args, NUM_ARGS, "-gl-debug", false);
+    self->record_cursor = args_get_boolean_by_key(self->args, NUM_ARGS, "-cursor", true);
+    self->date_folders = args_get_boolean_by_key(self->args, NUM_ARGS, "-df", false);
+    self->restore_portal_session = args_get_boolean_by_key(self->args, NUM_ARGS, "-restore-portal-session", false);
+    self->restart_replay_on_save = args_get_boolean_by_key(self->args, NUM_ARGS, "-restart-replay-on-save", false);
+    self->overclock = args_get_boolean_by_key(self->args, NUM_ARGS, "-oc", false);
+
+    self->audio_bitrate = args_get_i64_by_key(self->args, NUM_ARGS, "-ab", 0);
+    self->audio_bitrate *= 1000LL;
+
+    self->keyint = args_get_double_by_key(self->args, NUM_ARGS, "-keyint", 2.0);
+
+    if(self->audio_codec == GSR_AUDIO_CODEC_FLAC) {
+        fprintf(stderr, "gsr warning: flac audio codec is temporarily disabled, using opus audio codec instead\n");
+        self->audio_codec = GSR_AUDIO_CODEC_OPUS;
+    }
+
+    self->portal_session_token_filepath = args_get_value_by_key(self->args, NUM_ARGS, "-portal-session-token-filepath");
+    if(self->portal_session_token_filepath) {
+        int len = strlen(self->portal_session_token_filepath);
+        if(len > 0 && self->portal_session_token_filepath[len - 1] == '/') {
+            fprintf(stderr, "gsr error: -portal-session-token-filepath should be a path to a file but it ends with a /: %s\n", self->portal_session_token_filepath);
+            return false;
+        }
+    }
+
+    self->recording_saved_script = args_get_value_by_key(self->args, NUM_ARGS, "-sc");
+    if(self->recording_saved_script) {
+        struct stat buf;
+        if(stat(self->recording_saved_script, &buf) == -1 || !S_ISREG(buf.st_mode)) {
+            fprintf(stderr, "gsr error: Script \"%s\" either doesn't exist or it's not a file\n", self->recording_saved_script);
+            usage();
+            return false;
+        }
+
+        if(!(buf.st_mode & S_IXUSR)) {
+            fprintf(stderr, "gsr error: Script \"%s\" is not executable\n", self->recording_saved_script);
+            usage();
+            return false;
+        }
+    }
+
+    const char *quality_str = args_get_value_by_key(self->args, NUM_ARGS, "-q");
+    self->video_quality = GSR_VIDEO_QUALITY_VERY_HIGH;
+    self->video_bitrate = 0;
+
+    if(self->bitrate_mode == GSR_BITRATE_MODE_CBR) {
+        if(!quality_str) {
+            fprintf(stderr, "gsr error: option '-q' is required when using '-bm cbr' option\n");
+            usage();
+            return false;
+        }
+
+        if(sscanf(quality_str, "%" PRIi64, &self->video_bitrate) != 1) {
+            fprintf(stderr, "gsr error: -q argument \"%s\" is not an integer value. When using '-bm cbr' option '-q' is expected to be an integer value\n", quality_str);
+            usage();
+            return false;
+        }
+
+        if(self->video_bitrate < 0) {
+            fprintf(stderr, "gsr error: -q is expected to be 0 or larger, got %" PRIi64 "\n", self->video_bitrate);
+            usage();
+            return false;
+        }
+
+        self->video_bitrate *= 1000LL;
+    } else {
+        if(!quality_str)
+            quality_str = "very_high";
+
+        if(strcmp(quality_str, "medium") == 0) {
+            self->video_quality = GSR_VIDEO_QUALITY_MEDIUM;
+        } else if(strcmp(quality_str, "high") == 0) {
+            self->video_quality = GSR_VIDEO_QUALITY_HIGH;
+        } else if(strcmp(quality_str, "very_high") == 0) {
+            self->video_quality = GSR_VIDEO_QUALITY_VERY_HIGH;
+        } else if(strcmp(quality_str, "ultra") == 0) {
+            self->video_quality = GSR_VIDEO_QUALITY_ULTRA;
+        } else {
+            fprintf(stderr, "gsr error: -q should either be 'medium', 'high', 'very_high' or 'ultra', got: '%s'\n", quality_str);
+            usage();
+            return false;
+        }
+    }
+
+    const char *output_resolution_str = args_get_value_by_key(self->args, NUM_ARGS, "-s");
+    if(!output_resolution_str && strcmp(self->window, "focused") == 0) {
+        fprintf(stderr, "gsr error: option -s is required when using '-w focused' option\n");
+        usage();
+        return false;
+    }
+
+    self->output_resolution = (vec2i){0, 0};
+    if(output_resolution_str) {
+        if(sscanf(output_resolution_str, "%dx%d", &self->output_resolution.x, &self->output_resolution.y) != 2) {
+            fprintf(stderr, "gsr error: invalid value for option -s '%s', expected a value in format WxH\n", output_resolution_str);
+            usage();
+            return false;
+        }
+
+        if(self->output_resolution.x < 0 || self->output_resolution.y < 0) {
+            fprintf(stderr, "gsr error: invalid value for option -s '%s', expected width and height to be greater than or equal to 0\n", output_resolution_str);
+            usage();
+            return false;
+        }
+    }
+
+    self->region_size = (vec2i){0, 0};
+    self->region_position = (vec2i){0, 0};
+    const char *region_str = args_get_value_by_key(self->args, NUM_ARGS, "-region");
+    if(region_str) {
+        if(strcmp(self->window, "region") != 0) {
+            fprintf(stderr, "gsr error: option -region can only be used when option '-w region' is used\n");
+            usage();
+            return false;
+        }
+
+        if(sscanf(region_str, "%dx%d+%d+%d", &self->region_size.x, &self->region_size.y, &self->region_position.x, &self->region_position.y) != 4) {
+            fprintf(stderr, "gsr error: invalid value for option -region '%s', expected a value in format WxH+X+Y\n", region_str);
+            usage();
+            return false;
+        }
+
+        if(self->region_size.x < 0 || self->region_size.y < 0 || self->region_position.x < 0 || self->region_position.y < 0) {
+            fprintf(stderr, "gsr error: invalid value for option -region '%s', expected width, height, x and y to be greater than or equal to 0\n", region_str);
+            usage();
+            return false;
+        }
+    } else {
+        if(strcmp(self->window, "region") == 0) {
+            fprintf(stderr, "gsr error: option -region is required when '-w region' is used\n");
+            usage();
+            return false;
+        }
+    }
+
+    self->fps = args_get_i64_by_key(self->args, NUM_ARGS, "-f", 60);
+    self->replay_buffer_size_secs = args_get_i64_by_key(self->args, NUM_ARGS, "-r", -1);
+    if(self->replay_buffer_size_secs != -1)
+        self->replay_buffer_size_secs += (int64_t)(self->keyint + 0.5); // Add the keyframe interval (rounded) to account for packets lost when leading non-keyframe packets are skipped
+
+    self->container_format = args_get_value_by_key(self->args, NUM_ARGS, "-c");
+    if(self->container_format && strcmp(self->container_format, "mkv") == 0)
+        self->container_format = "matroska";
+
+    const bool is_replaying = self->replay_buffer_size_secs != -1;
+    self->is_livestream = false;
+    self->filename = args_get_value_by_key(self->args, NUM_ARGS, "-o");
+    if(self->filename) {
+        self->is_livestream = is_livestream_path(self->filename);
+        if(self->is_livestream) {
+            if(is_replaying) {
+                fprintf(stderr, "gsr error: replay mode is not applicable to live streaming\n");
+                return false;
+            }
+        } else {
+            if(!is_replaying) {
+                char directory_buf[PATH_MAX];
+                snprintf(directory_buf, sizeof(directory_buf), "%s", self->filename);
+                char *directory = dirname(directory_buf);
+                if(strcmp(directory, ".") != 0 && strcmp(directory, "/") != 0) {
+                    if(create_directory_recursive(directory) != 0) {
+                        fprintf(stderr, "gsr error: failed to create directory for output file: %s\n", self->filename);
+                        return false;
+                    }
+                }
+            } else {
+                if(!self->container_format) {
+                    fprintf(stderr, "gsr error: option -c is required when using option -r\n");
+                    usage();
+                    return false;
+                }
+
+                struct stat buf;
+                if(stat(self->filename, &buf) != -1 && !S_ISDIR(buf.st_mode)) {
+                    fprintf(stderr, "gsr error: File \"%s\" exists but it's not a directory\n", self->filename);
+                    usage();
+                    return false;
+                }
+            }
+        }
+    } else {
+        if(!is_replaying) {
+            self->filename = "/dev/stdout";
+        } else {
+            fprintf(stderr, "gsr error: option -o is required when using option -r\n");
+            usage();
+            return false;
+        }
+
+        if(!self->container_format) {
+            fprintf(stderr, "gsr error: option -c is required when not using option -o\n");
+            usage();
+            return false;
+        }
+    }
+
+    self->is_output_piped = strcmp(self->filename, "/dev/stdout") == 0;
+    self->low_latency_recording = self->is_livestream || self->is_output_piped;
+
+    self->replay_recording_directory = args_get_value_by_key(self->args, NUM_ARGS, "-ro");
+
+    const bool is_portal_capture = strcmp(self->window, "portal") == 0;
+    if(!self->restore_portal_session && is_portal_capture)
+        fprintf(stderr, "gsr info: option '-w portal' was used without '-restore-portal-session yes'. The previous screencast session will be ignored\n");
+
+    if(self->is_livestream && self->recording_saved_script) {
+        fprintf(stderr, "gsr warning: live stream detected, -sc script is ignored\n");
+        self->recording_saved_script = NULL;
+    }
+
+    return true;
+}
+
+bool args_parser_parse(args_parser *self, int argc, char **argv, const args_handlers *arg_handlers, void *userdata) {
+    assert(arg_handlers);
+    memset(self, 0, sizeof(*self));
+
+    if(argc <= 1) {
+        usage_full();
+        return false;
+    }
+
+    if(argc == 2 && (strcmp(argv[1], "-h") == 0 || strcmp(argv[1], "--help") == 0)) {
+        usage_full();
+        return false;
+    }
+
+    if(argc == 2 && strcmp(argv[1], "--info") == 0) {
+        arg_handlers->info(userdata);
+        return true;
+    }
+
+    if(argc == 2 && strcmp(argv[1], "--list-audio-devices") == 0) {
+        arg_handlers->list_audio_devices(userdata);
+        return true;
+    }
+
+    if(argc == 2 && strcmp(argv[1], "--list-application-audio") == 0) {
+        arg_handlers->list_application_audio(userdata);
+        return true;
+    }
+
+    if(strcmp(argv[1], "--list-capture-options") == 0) {
+        if(argc == 2) {
+            arg_handlers->list_capture_options(NULL, userdata);
+            return true;
+        } else if(argc == 3 || argc == 4) {
+            const char *card_path = argv[2];
+            arg_handlers->list_capture_options(card_path, userdata);
+            return true;
+        } else {
+            fprintf(stderr, "gsr error: expected --list-capture-options to be called with either no extra arguments or 1 extra argument (card path)\n");
+            return false;
+        }
+    }
+
+    if(argc == 2 && strcmp(argv[1], "--version") == 0) {
+        arg_handlers->version(userdata);
+        return true;
+    }
+
+    int arg_index = 0;
+    self->args[arg_index++] = (Arg){ .key = "-w",                             .optional = false, .list = false, .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-c",                             .optional = true,  .list = false, .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-f",                             .optional = true,  .list = false, .type = ARG_TYPE_I64, .integer_value_min = 1, .integer_value_max = 1000 };
+    self->args[arg_index++] = (Arg){ .key = "-s",                             .optional = true,  .list = false, .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-region",                        .optional = true,  .list = false, .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-a",                             .optional = true,  .list = true,  .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-q",                             .optional = true,  .list = false, .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-o",                             .optional = true,  .list = false, .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-ro",                            .optional = true,  .list = false, .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-r",                             .optional = true,  .list = false, .type = ARG_TYPE_I64, .integer_value_min = 2, .integer_value_max = 86400 };
+    self->args[arg_index++] = (Arg){ .key = "-restart-replay-on-save",        .optional = true,  .list = false, .type = ARG_TYPE_BOOLEAN };
+    self->args[arg_index++] = (Arg){ .key = "-k",                             .optional = true,  .list = false, .type = ARG_TYPE_ENUM, .enum_values = video_codec_enums, .num_enum_values = sizeof(video_codec_enums)/sizeof(ArgEnum) };
+    self->args[arg_index++] = (Arg){ .key = "-ac",                            .optional = true,  .list = false, .type = ARG_TYPE_ENUM, .enum_values = audio_codec_enums, .num_enum_values = sizeof(audio_codec_enums)/sizeof(ArgEnum) };
+    self->args[arg_index++] = (Arg){ .key = "-ab",                            .optional = true,  .list = false, .type = ARG_TYPE_I64, .integer_value_min = 0, .integer_value_max = 50000 };
+    self->args[arg_index++] = (Arg){ .key = "-oc",                            .optional = true,  .list = false, .type = ARG_TYPE_BOOLEAN };
+    self->args[arg_index++] = (Arg){ .key = "-fm",                            .optional = true,  .list = false, .type = ARG_TYPE_ENUM, .enum_values = framerate_mode_enums, .num_enum_values = sizeof(framerate_mode_enums)/sizeof(ArgEnum) };
+    self->args[arg_index++] = (Arg){ .key = "-bm",                            .optional = true,  .list = false, .type = ARG_TYPE_ENUM, .enum_values = bitrate_mode_enums, .num_enum_values = sizeof(bitrate_mode_enums)/sizeof(ArgEnum) };
+    self->args[arg_index++] = (Arg){ .key = "-pixfmt",                        .optional = true,  .list = false, .type = ARG_TYPE_ENUM, .enum_values = pixel_format_enums, .num_enum_values = sizeof(pixel_format_enums)/sizeof(ArgEnum) };
+    self->args[arg_index++] = (Arg){ .key = "-v",                             .optional = true,  .list = false, .type = ARG_TYPE_BOOLEAN };
+    self->args[arg_index++] = (Arg){ .key = "-gl-debug",                      .optional = true,  .list = false, .type = ARG_TYPE_BOOLEAN };
+    self->args[arg_index++] = (Arg){ .key = "-df",                            .optional = true,  .list = false, .type = ARG_TYPE_BOOLEAN };
+    self->args[arg_index++] = (Arg){ .key = "-sc",                            .optional = true,  .list = false, .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-cr",                            .optional = true,  .list = false, .type = ARG_TYPE_ENUM, .enum_values = color_range_enums, .num_enum_values = sizeof(color_range_enums)/sizeof(ArgEnum) };
+    self->args[arg_index++] = (Arg){ .key = "-tune",                          .optional = true,  .list = false, .type = ARG_TYPE_ENUM, .enum_values = tune_enums, .num_enum_values = sizeof(tune_enums)/sizeof(ArgEnum) };
+    self->args[arg_index++] = (Arg){ .key = "-cursor",                        .optional = true,  .list = false, .type = ARG_TYPE_BOOLEAN };
+    self->args[arg_index++] = (Arg){ .key = "-keyint",                        .optional = true,  .list = false, .type = ARG_TYPE_DOUBLE, .integer_value_min = 0, .integer_value_max = 500 };
+    self->args[arg_index++] = (Arg){ .key = "-restore-portal-session",        .optional = true,  .list = false, .type = ARG_TYPE_BOOLEAN };
+    self->args[arg_index++] = (Arg){ .key = "-portal-session-token-filepath", .optional = true,  .list = false, .type = ARG_TYPE_STRING  };
+    self->args[arg_index++] = (Arg){ .key = "-encoder",                       .optional = true,  .list = false, .type = ARG_TYPE_ENUM, .enum_values = video_encoder_enums, .num_enum_values = sizeof(video_encoder_enums)/sizeof(ArgEnum) };
+    self->args[arg_index++] = (Arg){ .key = "-replay-storage",                .optional = true,  .list = false, .type = ARG_TYPE_ENUM, .enum_values = replay_storage_enums, .num_enum_values = sizeof(replay_storage_enums)/sizeof(ArgEnum) };
+    assert(arg_index == NUM_ARGS);
+
+    for(int i = 1; i < argc; i += 2) {
+        const char *arg_name = argv[i];
+        Arg *arg = args_get_by_key(self->args, NUM_ARGS, arg_name);
+        if(!arg) {
+            fprintf(stderr, "gsr error: invalid argument '%s'\n", arg_name);
+            usage();
+            return false;
+        }
+
+        if(arg->num_values > 0 && !arg->list) {
+            fprintf(stderr, "gsr error: expected argument '%s' to only be specified once\n", arg_name);
+            usage();
+            return false;
+        }
+
+        if(i + 1 >= argc) {
+            fprintf(stderr, "gsr error: missing value for argument '%s'\n", arg_name);
+            usage();
+            return false;
+        }
+
+        const char *arg_value = argv[i + 1];
+        switch(arg->type) {
+            case ARG_TYPE_STRING: {
+                break;
+            }
+            case ARG_TYPE_BOOLEAN: {
+                if(strcmp(arg_value, "yes") == 0) {
+                    arg->typed_value.boolean = true;
+                } else if(strcmp(arg_value, "no") == 0) {
+                    arg->typed_value.boolean = false;
+                } else {
+                    fprintf(stderr, "gsr error: %s should either be 'yes' or 'no', got: '%s'\n", arg_name, arg_value);
+                    usage();
+                    return false;
+                }
+                break;
+            }
+            case ARG_TYPE_ENUM: {
+                if(!arg_get_enum_value_by_name(arg, arg_value, &arg->typed_value.enum_value)) {
+                    fprintf(stderr, "gsr error: %s should either be ", arg_name);
+                    arg_print_expected_enum_names(arg);
+                    fprintf(stderr, ", got: '%s'\n", arg_value);
+                    usage();
+                    return false;
+                }
+                break;
+            }
+            case ARG_TYPE_I64: {
+                if(sscanf(arg_value, "%" PRIi64, &arg->typed_value.i64_value) != 1) {
+                    fprintf(stderr, "gsr error: %s argument \"%s\" is not an integer\n", arg_name, arg_value);
+                    usage();
+                    return false;
+                }
+
+                if(arg->typed_value.i64_value < arg->integer_value_min) {
+                    fprintf(stderr, "gsr error: %s argument is expected to be at least %" PRIi64 ", got %" PRIi64 "\n", arg_name, arg->integer_value_min, arg->typed_value.i64_value);
+                    usage();
+                    return false;
+                }
+
+                if(arg->typed_value.i64_value > arg->integer_value_max) {
+                    fprintf(stderr, "gsr error: %s argument is expected to be at most %" PRIi64 ", got %" PRIi64 "\n", arg_name, arg->integer_value_max, arg->typed_value.i64_value);
+                    usage();
+                    return false;
+                }
+                break;
+            }
+            case ARG_TYPE_DOUBLE: {
+                if(sscanf(arg_value, "%lf", &arg->typed_value.d_value) != 1) {
+                    fprintf(stderr, "gsr error: %s argument \"%s\" is not a floating-point number\n", arg_name, arg_value);
+                    usage();
+                    return false;
+                }
+
+                if(arg->typed_value.d_value < arg->integer_value_min) {
+                    fprintf(stderr, "gsr error: %s argument is expected to be at least %" PRIi64 ", got %lf\n", arg_name, arg->integer_value_min, arg->typed_value.d_value);
+                    usage();
+                    return false;
+                }
+
+                if(arg->typed_value.d_value > arg->integer_value_max) {
+                    fprintf(stderr, "gsr error: %s argument is expected to be at most %" PRIi64 ", got %lf\n", arg_name, arg->integer_value_max, arg->typed_value.d_value);
+                    usage();
+                    return false;
+                }
+                break;
+            }
+        }
+
+        if(!arg_append_value(arg, arg_value)) {
+            fprintf(stderr, "gsr error: failed to append argument, out of memory\n");
+            return false;
+        }
+    }
+
+    for(int i = 0; i < NUM_ARGS; ++i) {
+        const Arg *arg = &self->args[i];
+        if(!arg->optional && arg->num_values == 0) {
+            fprintf(stderr, "gsr error: missing argument '%s'\n", arg->key);
+            usage();
+            return false;
+        }
+    }
+
+    return args_parser_set_values(self);
+}
+
+void args_parser_deinit(args_parser *self) {
+    for(int i = 0; i < NUM_ARGS; ++i) {
+        arg_deinit(&self->args[i]);
+    }
+}
+
+bool args_parser_validate_with_gl_info(args_parser *self, gsr_egl *egl) {
+    const bool wayland = gsr_window_get_display_server(egl->window) == GSR_DISPLAY_SERVER_WAYLAND;
+
+    if(self->bitrate_mode == (gsr_bitrate_mode)GSR_BITRATE_MODE_AUTO) {
+        // QP is broken on steam deck, see https://github.com/ValveSoftware/SteamOS/issues/1609
+        self->bitrate_mode = egl->gpu_info.is_steam_deck ? GSR_BITRATE_MODE_VBR : GSR_BITRATE_MODE_QP;
+    }
+
+    if(egl->gpu_info.is_steam_deck && self->bitrate_mode == GSR_BITRATE_MODE_QP) {
+        fprintf(stderr, "gsr warning: qp bitrate mode is not supported on Steam Deck because of Steam Deck driver bugs. Using vbr instead\n");
+        self->bitrate_mode = GSR_BITRATE_MODE_VBR;
+    }
+
+    if(self->video_encoder == GSR_VIDEO_ENCODER_HW_CPU && self->bitrate_mode == GSR_BITRATE_MODE_VBR) {
+        fprintf(stderr, "gsr warning: bitrate mode has been forcefully set to qp because software encoding option doesn't support vbr option\n");
+        self->bitrate_mode = GSR_BITRATE_MODE_QP;
+    }
+
+    if(egl->gpu_info.vendor != GSR_GPU_VENDOR_NVIDIA && self->overclock) {
+        fprintf(stderr, "gsr info: overclock option has no effect on amd/intel, ignoring option\n");
+        self->overclock = false;
+    }
+
+    if(egl->gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA && self->overclock && wayland) {
+        fprintf(stderr, "gsr info: overclocking is not possible on nvidia on wayland, ignoring option\n");
+        self->overclock = false;
+    }
+
+    if(egl->gpu_info.is_steam_deck) {
+        fprintf(stderr, "gsr warning: steam deck has multiple driver issues. One of them has been reported here: https://github.com/ValveSoftware/SteamOS/issues/1609\n"
+            "If you have issues with GPU Screen Recorder on steam deck that you don't have on a desktop computer then report the issue to Valve and/or AMD.\n");
+    }
+
+    self->very_old_gpu = false;
+    if(egl->gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA && egl->gpu_info.gpu_version != 0 && egl->gpu_info.gpu_version < 900) {
+        fprintf(stderr, "gsr info: your gpu appears to be very old (older than maxwell architecture). Switching to lower preset\n");
+        self->very_old_gpu = true;
+    }
+
+    if(video_codec_is_hdr(self->video_codec) && !wayland) {
+        fprintf(stderr, "gsr error: hdr video codec option %s is not available on X11\n", video_codec_to_string(self->video_codec));
+        usage();
+        return false;
+    }
+
+    const bool is_portal_capture = strcmp(self->window, "portal") == 0;
+    if(video_codec_is_hdr(self->video_codec) && is_portal_capture) {
+        fprintf(stderr, "gsr warning: portal capture option doesn't support hdr yet (PipeWire doesn't support hdr), the video will be tonemapped from hdr to sdr\n");
+        self->video_codec = hdr_video_codec_to_sdr_video_codec(self->video_codec);
+    }
+
+    return true;
+}
+
+void args_parser_print_usage(void) {
+    usage();
+}
+
+Arg* args_parser_get_arg(args_parser *self, const char *arg_name) {
+    return args_get_by_key(self->args, NUM_ARGS, arg_name);
+}
diff --git a/src/capture/capture.c b/src/capture/capture.c
index eea0d1d..bc95300 100644
--- a/src/capture/capture.c
+++ b/src/capture/capture.c
@@ -1,59 +1,53 @@
 #include "../../include/capture/capture.h"
-#include <stdio.h>
+#include <assert.h>
 
-int gsr_capture_start(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    if(cap->started)
-        return -1;
-
-    int res = cap->start(cap, video_codec_context);
+int gsr_capture_start(gsr_capture *cap, gsr_capture_metadata *capture_metadata) {
+    assert(!cap->started);
+    int res = cap->start(cap, capture_metadata);
     if(res == 0)
         cap->started = true;
 
     return res;
 }
 
-void gsr_capture_tick(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame **frame) {
-    if(!cap->started) {
-        fprintf(stderr, "gsr error: gsp_capture_tick failed: the gsr capture has not been started\n");
-        return;
-    }
-
+void gsr_capture_tick(gsr_capture *cap) {
+    assert(cap->started);
     if(cap->tick)
-        cap->tick(cap, video_codec_context, frame);
+        cap->tick(cap);
 }
 
-bool gsr_capture_should_stop(gsr_capture *cap, bool *err) {
-    if(!cap->started) {
-        fprintf(stderr, "gsr error: gsr_capture_should_stop failed: the gsr capture has not been started\n");
-        return false;
-    }
+void gsr_capture_on_event(gsr_capture *cap, gsr_egl *egl) {
+    if(cap->on_event)
+        cap->on_event(cap, egl);
+}
 
-    if(!cap->should_stop)
+bool gsr_capture_should_stop(gsr_capture *cap, bool *err) {
+    assert(cap->started);
+    if(cap->should_stop)
+        return cap->should_stop(cap, err);
+    else
         return false;
-
-    return cap->should_stop(cap, err);
 }
 
-int gsr_capture_capture(gsr_capture *cap, AVFrame *frame) {
-    if(!cap->started) {
-        fprintf(stderr, "gsr error: gsr_capture_capture failed: the gsr capture has not been started\n");
-        return -1;
-    }
-    return cap->capture(cap, frame);
+int gsr_capture_capture(gsr_capture *cap, gsr_capture_metadata *capture_metadata, gsr_color_conversion *color_conversion) {
+    assert(cap->started);
+    return cap->capture(cap, capture_metadata, color_conversion);
 }
 
-void gsr_capture_end(gsr_capture *cap, AVFrame *frame) {
-    if(!cap->started) {
-        fprintf(stderr, "gsr error: gsr_capture_end failed: the gsr capture has not been started\n");
-        return;
-    }
-
-    if(!cap->capture_end)
-        return;
+bool gsr_capture_uses_external_image(gsr_capture *cap) {
+    if(cap->uses_external_image)
+        return cap->uses_external_image(cap);
+    else
+        return false;
+}
 
-    cap->capture_end(cap, frame);
+bool gsr_capture_set_hdr_metadata(gsr_capture *cap, AVMasteringDisplayMetadata *mastering_display_metadata, AVContentLightMetadata *light_metadata) {
+    if(cap->set_hdr_metadata)
+        return cap->set_hdr_metadata(cap, mastering_display_metadata, light_metadata);
+    else
+        return false;
 }
 
-void gsr_capture_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    cap->destroy(cap, video_codec_context);
+void gsr_capture_destroy(gsr_capture *cap) {
+    cap->destroy(cap);
 }
diff --git a/src/capture/kms.c b/src/capture/kms.c
new file mode 100644
index 0000000..36a5355
--- /dev/null
+++ b/src/capture/kms.c
@@ -0,0 +1,767 @@
+#include "../../include/capture/kms.h"
+#include "../../include/utils.h"
+#include "../../include/color_conversion.h"
+#include "../../include/cursor.h"
+#include "../../include/window/window.h"
+#include "../../kms/client/kms_client.h"
+
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+#include <xf86drm.h>
+#include <drm_fourcc.h>
+
+#include <libavutil/mastering_display_metadata.h>
+
+#define FIND_CRTC_BY_NAME_TIMEOUT_SECONDS 2.0
+
+#define HDMI_STATIC_METADATA_TYPE1 0
+#define HDMI_EOTF_SMPTE_ST2084 2
+
+#define MAX_CONNECTOR_IDS 32
+
+typedef struct {
+    uint32_t connector_ids[MAX_CONNECTOR_IDS];
+    int num_connector_ids;
+} MonitorId;
+
+typedef struct {
+    gsr_capture_kms_params params;
+    
+    gsr_kms_client kms_client;
+    gsr_kms_response kms_response;
+
+    vec2i capture_pos;
+    vec2i capture_size;
+    MonitorId monitor_id;
+
+    gsr_monitor_rotation monitor_rotation;
+
+    unsigned int input_texture_id;
+    unsigned int external_input_texture_id;
+    unsigned int cursor_texture_id;
+
+    bool no_modifiers_fallback;
+    bool external_texture_fallback;
+
+    struct hdr_output_metadata hdr_metadata;
+    bool hdr_metadata_set;
+
+    bool is_x11;
+    gsr_cursor x11_cursor;
+
+    //int drm_fd;
+    //uint64_t prev_sequence;
+    //bool damaged;
+
+    vec2i prev_target_pos;
+    vec2i prev_plane_size;
+
+    double last_time_monitor_check;
+} gsr_capture_kms;
+
+static void gsr_capture_kms_cleanup_kms_fds(gsr_capture_kms *self) {
+    for(int i = 0; i < self->kms_response.num_items; ++i) {
+        for(int j = 0; j < self->kms_response.items[i].num_dma_bufs; ++j) {
+            gsr_kms_response_dma_buf *dma_buf = &self->kms_response.items[i].dma_buf[j];
+            if(dma_buf->fd > 0) {
+                close(dma_buf->fd);
+                dma_buf->fd = -1;
+            }
+        }
+        self->kms_response.items[i].num_dma_bufs = 0;
+    }
+    self->kms_response.num_items = 0;
+}
+
+static void gsr_capture_kms_stop(gsr_capture_kms *self) {
+    if(self->input_texture_id) {
+        self->params.egl->glDeleteTextures(1, &self->input_texture_id);
+        self->input_texture_id = 0;
+    }
+
+    if(self->external_input_texture_id) {
+        self->params.egl->glDeleteTextures(1, &self->external_input_texture_id);
+        self->external_input_texture_id = 0;
+    }
+
+    if(self->cursor_texture_id) {
+        self->params.egl->glDeleteTextures(1, &self->cursor_texture_id);
+        self->cursor_texture_id = 0;
+    }
+
+    // if(self->drm_fd > 0) {
+    //     close(self->drm_fd);
+    //     self->drm_fd = -1;
+    // }
+
+    gsr_capture_kms_cleanup_kms_fds(self);
+    gsr_kms_client_deinit(&self->kms_client);
+    gsr_cursor_deinit(&self->x11_cursor);
+}
+
+static int max_int(int a, int b) {
+    return a > b ? a : b;
+}
+
+static void gsr_capture_kms_create_input_texture_ids(gsr_capture_kms *self) {
+    self->params.egl->glGenTextures(1, &self->input_texture_id);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, self->input_texture_id);
+    self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+    self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+
+    self->params.egl->glGenTextures(1, &self->external_input_texture_id);
+    self->params.egl->glBindTexture(GL_TEXTURE_EXTERNAL_OES, self->external_input_texture_id);
+    self->params.egl->glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+    self->params.egl->glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+    self->params.egl->glBindTexture(GL_TEXTURE_EXTERNAL_OES, 0);
+
+    const bool cursor_texture_id_is_external = self->params.egl->gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA;
+    const int cursor_texture_id_target = cursor_texture_id_is_external ? GL_TEXTURE_EXTERNAL_OES : GL_TEXTURE_2D;
+
+    self->params.egl->glGenTextures(1, &self->cursor_texture_id);
+    self->params.egl->glBindTexture(cursor_texture_id_target, self->cursor_texture_id);
+    self->params.egl->glTexParameteri(cursor_texture_id_target, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+    self->params.egl->glTexParameteri(cursor_texture_id_target, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+    self->params.egl->glBindTexture(cursor_texture_id_target, 0);
+}
+
+/* TODO: On monitor reconfiguration, find monitor x, y, width and height again. Do the same for nvfbc. */
+
+typedef struct {
+    MonitorId *monitor_id;
+    const char *monitor_to_capture;
+    int monitor_to_capture_len;
+    int num_monitors;
+} MonitorCallbackUserdata;
+
+static void monitor_callback(const gsr_monitor *monitor, void *userdata) {
+    MonitorCallbackUserdata *monitor_callback_userdata = userdata;
+    ++monitor_callback_userdata->num_monitors;
+
+    if(monitor_callback_userdata->monitor_to_capture_len != monitor->name_len || memcmp(monitor_callback_userdata->monitor_to_capture, monitor->name, monitor->name_len) != 0)
+        return;
+
+    if(monitor_callback_userdata->monitor_id->num_connector_ids < MAX_CONNECTOR_IDS) {
+        monitor_callback_userdata->monitor_id->connector_ids[monitor_callback_userdata->monitor_id->num_connector_ids] = monitor->connector_id;
+        ++monitor_callback_userdata->monitor_id->num_connector_ids;
+    }
+
+    if(monitor_callback_userdata->monitor_id->num_connector_ids == MAX_CONNECTOR_IDS)
+        fprintf(stderr, "gsr warning: reached max connector ids\n");
+}
+
+static vec2i rotate_capture_size_if_rotated(gsr_capture_kms *self, vec2i capture_size) {
+    if(self->monitor_rotation == GSR_MONITOR_ROT_90 || self->monitor_rotation == GSR_MONITOR_ROT_270) {
+        int tmp_x = capture_size.x;
+        capture_size.x = capture_size.y;
+        capture_size.y = tmp_x;
+    }
+    return capture_size;
+}
+
+static int gsr_capture_kms_start(gsr_capture *cap, gsr_capture_metadata *capture_metadata) {
+    gsr_capture_kms *self = cap->priv;
+
+    gsr_capture_kms_create_input_texture_ids(self);
+
+    gsr_monitor monitor;
+    self->monitor_id.num_connector_ids = 0;
+
+    int kms_init_res = gsr_kms_client_init(&self->kms_client, self->params.egl->card_path);
+    if(kms_init_res != 0)
+        return kms_init_res;
+
+    self->is_x11 = gsr_window_get_display_server(self->params.egl->window) == GSR_DISPLAY_SERVER_X11;
+    const gsr_connection_type connection_type = self->is_x11 ? GSR_CONNECTION_X11 : GSR_CONNECTION_DRM;
+    if(self->is_x11) {
+        Display *display = gsr_window_get_display(self->params.egl->window);
+        gsr_cursor_init(&self->x11_cursor, self->params.egl, display);
+    }
+
+    MonitorCallbackUserdata monitor_callback_userdata = {
+        &self->monitor_id,
+        self->params.display_to_capture, strlen(self->params.display_to_capture),
+        0,
+    };
+    for_each_active_monitor_output(self->params.egl->window, self->params.egl->card_path, connection_type, monitor_callback, &monitor_callback_userdata);
+
+    if(!get_monitor_by_name(self->params.egl, connection_type, self->params.display_to_capture, &monitor)) {
+        fprintf(stderr, "gsr error: gsr_capture_kms_start: failed to find monitor by name \"%s\"\n", self->params.display_to_capture);
+        gsr_capture_kms_stop(self);
+        return -1;
+    }
+
+    monitor.name = self->params.display_to_capture;
+    vec2i monitor_position = {0, 0};
+    drm_monitor_get_display_server_data(self->params.egl->window, &monitor, &self->monitor_rotation, &monitor_position);
+
+    self->capture_pos = monitor.pos;
+    /* Monitor size is already rotated on x11 when the monitor is rotated, no need to apply it ourselves */
+    if(self->is_x11)
+        self->capture_size = monitor.size;
+    else
+        self->capture_size = rotate_capture_size_if_rotated(self, monitor.size);
+
+    if(self->params.output_resolution.x > 0 && self->params.output_resolution.y > 0) {
+        self->params.output_resolution = scale_keep_aspect_ratio(self->capture_size, self->params.output_resolution);
+        capture_metadata->width = self->params.output_resolution.x;
+        capture_metadata->height = self->params.output_resolution.y;
+    } else if(self->params.region_size.x > 0 && self->params.region_size.y > 0) {
+        capture_metadata->width = self->params.region_size.x;
+        capture_metadata->height = self->params.region_size.y;
+    } else {
+        capture_metadata->width = self->capture_size.x;
+        capture_metadata->height = self->capture_size.y;
+    }
+
+    self->last_time_monitor_check = clock_get_monotonic_seconds();
+    return 0;
+}
+
+static void gsr_capture_kms_on_event(gsr_capture *cap, gsr_egl *egl) {
+    gsr_capture_kms *self = cap->priv;
+    if(!self->is_x11)
+        return;
+
+    XEvent *xev = gsr_window_get_event_data(egl->window);
+    gsr_cursor_on_event(&self->x11_cursor, xev);
+}
+
+// TODO: This is disabled for now because we want to be able to record at a framerate higher than the monitor framerate
+// static void gsr_capture_kms_tick(gsr_capture *cap) {
+//     gsr_capture_kms *self = cap->priv;
+
+//     if(self->drm_fd <= 0)
+//         self->drm_fd = open(self->params.egl->card_path, O_RDONLY);
+
+//     if(self->drm_fd <= 0)
+//         return;
+
+//     uint64_t sequence = 0;
+//     uint64_t ns = 0;
+//     if(drmCrtcGetSequence(self->drm_fd, 79, &sequence, &ns) != 0)
+//         return;
+
+//     if(sequence != self->prev_sequence) {
+//         self->prev_sequence = sequence;
+//         self->damaged = true;
+//     }
+// }
+
+static gsr_kms_response_item* find_drm_by_connector_id(gsr_kms_response *kms_response, uint32_t connector_id) {
+    for(int i = 0; i < kms_response->num_items; ++i) {
+        if(kms_response->items[i].connector_id == connector_id && !kms_response->items[i].is_cursor)
+            return &kms_response->items[i];
+    }
+    return NULL;
+}
+
+static gsr_kms_response_item* find_largest_drm(gsr_kms_response *kms_response) {
+    if(kms_response->num_items == 0)
+        return NULL;
+
+    int64_t largest_size = 0;
+    gsr_kms_response_item *largest_drm = &kms_response->items[0];
+    for(int i = 0; i < kms_response->num_items; ++i) {
+        const int64_t size = (int64_t)kms_response->items[i].width * (int64_t)kms_response->items[i].height;
+        if(size > largest_size && !kms_response->items[i].is_cursor) {
+            largest_size = size;
+            largest_drm = &kms_response->items[i];
+        }
+    }
+    return largest_drm;
+}
+
+static gsr_kms_response_item* find_cursor_drm(gsr_kms_response *kms_response, uint32_t connector_id) {
+    gsr_kms_response_item *cursor_drm = NULL;
+    for(int i = 0; i < kms_response->num_items; ++i) {
+        if(kms_response->items[i].is_cursor) {
+            cursor_drm = &kms_response->items[i];
+            if(kms_response->items[i].connector_id == connector_id)
+                break;
+        }
+    }
+    return cursor_drm;
+}
+
+static bool hdr_metadata_is_supported_format(const struct hdr_output_metadata *hdr_metadata) {
+    return hdr_metadata->metadata_type == HDMI_STATIC_METADATA_TYPE1 &&
+        hdr_metadata->hdmi_metadata_type1.metadata_type == HDMI_STATIC_METADATA_TYPE1 &&
+        hdr_metadata->hdmi_metadata_type1.eotf == HDMI_EOTF_SMPTE_ST2084;
+}
+
+// TODO: Check if this hdr data can be changed after the call to av_packet_side_data_add
+static void gsr_kms_set_hdr_metadata(gsr_capture_kms *self, const gsr_kms_response_item *drm_fd) {
+    if(self->hdr_metadata_set)
+        return;
+
+    self->hdr_metadata_set = true;
+    self->hdr_metadata = drm_fd->hdr_metadata;
+}
+
+static vec2i swap_vec2i(vec2i value) {
+    int tmp = value.x;
+    value.x = value.y;
+    value.y = tmp;
+    return value;
+}
+
+static EGLImage gsr_capture_kms_create_egl_image(gsr_capture_kms *self, const gsr_kms_response_item *drm_fd, const int *fds, const uint32_t *offsets, const uint32_t *pitches, const uint64_t *modifiers, bool use_modifiers) {
+    intptr_t img_attr[44];
+    setup_dma_buf_attrs(img_attr, drm_fd->pixel_format, drm_fd->width, drm_fd->height, fds, offsets, pitches, modifiers, drm_fd->num_dma_bufs, use_modifiers);
+    while(self->params.egl->eglGetError() != EGL_SUCCESS){}
+    EGLImage image = self->params.egl->eglCreateImage(self->params.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr);
+    if(!image || self->params.egl->eglGetError() != EGL_SUCCESS) {
+        if(image)
+            self->params.egl->eglDestroyImage(self->params.egl->egl_display, image);
+        return NULL;
+    }
+    return image;
+}
+
+static EGLImage gsr_capture_kms_create_egl_image_with_fallback(gsr_capture_kms *self, const gsr_kms_response_item *drm_fd) {
+    // TODO: This sometimes causes a crash on steam deck, why? Is it a driver bug? A pure vaapi version doesn't cause a crash.
+    // Even ffmpeg kmsgrab causes this crash. The error is:
+    // amdgpu: Failed to allocate a buffer:
+    // amdgpu:    size      : 28508160 bytes
+    // amdgpu:    alignment : 2097152 bytes
+    // amdgpu:    domains   : 4
+    // amdgpu:    flags   : 4
+    // amdgpu: Failed to allocate a buffer:
+    // amdgpu:    size      : 28508160 bytes
+    // amdgpu:    alignment : 2097152 bytes
+    // amdgpu:    domains   : 4
+    // amdgpu:    flags   : 4
+    // EE ../jupiter-mesa/src/gallium/drivers/radeonsi/radeon_vcn_enc.c:516 radeon_create_encoder UVD - Can't create CPB buffer.
+    // [hevc_vaapi @ 0x55ea72b09840] Failed to upload encode parameters: 2 (resource allocation failed).
+    // [hevc_vaapi @ 0x55ea72b09840] Encode failed: -5.
+    // Error: avcodec_send_frame failed, error: Input/output error
+    // Assertion pic->display_order == pic->encode_order failed at libavcodec/vaapi_encode_h265.c:765
+    // kms server info: kms client shutdown, shutting down the server
+
+    int fds[GSR_KMS_MAX_DMA_BUFS];
+    uint32_t offsets[GSR_KMS_MAX_DMA_BUFS];
+    uint32_t pitches[GSR_KMS_MAX_DMA_BUFS];
+    uint64_t modifiers[GSR_KMS_MAX_DMA_BUFS];
+
+    for(int i = 0; i < drm_fd->num_dma_bufs; ++i) {
+        fds[i] = drm_fd->dma_buf[i].fd;
+        offsets[i] = drm_fd->dma_buf[i].offset;
+        pitches[i] = drm_fd->dma_buf[i].pitch;
+        modifiers[i] = drm_fd->modifier;
+    }
+
+    EGLImage image = NULL;
+    if(self->no_modifiers_fallback) {
+        image = gsr_capture_kms_create_egl_image(self, drm_fd, fds, offsets, pitches, modifiers, false);
+    } else {
+        image = gsr_capture_kms_create_egl_image(self, drm_fd, fds, offsets, pitches, modifiers, true);
+        if(!image) {
+            fprintf(stderr, "gsr error: gsr_capture_kms_create_egl_image_with_fallback: failed to create egl image with modifiers, trying without modifiers\n");
+            self->no_modifiers_fallback = true;
+            image = gsr_capture_kms_create_egl_image(self, drm_fd, fds, offsets, pitches, modifiers, false);
+        }
+    }
+    return image;
+}
+
+static bool gsr_capture_kms_bind_image_to_texture(gsr_capture_kms *self, EGLImage image, unsigned int texture_id, bool external_texture) {
+    const int texture_target = external_texture ? GL_TEXTURE_EXTERNAL_OES : GL_TEXTURE_2D;
+    while(self->params.egl->glGetError() != 0){}
+    self->params.egl->glBindTexture(texture_target, texture_id);
+    self->params.egl->glEGLImageTargetTexture2DOES(texture_target, image);
+    const bool success = self->params.egl->glGetError() == 0;
+    self->params.egl->glBindTexture(texture_target, 0);
+    return success;
+}
+
+static void gsr_capture_kms_bind_image_to_input_texture_with_fallback(gsr_capture_kms *self, EGLImage image) {
+    if(self->external_texture_fallback) {
+        gsr_capture_kms_bind_image_to_texture(self, image, self->external_input_texture_id, true);
+    } else {
+        if(!gsr_capture_kms_bind_image_to_texture(self, image, self->input_texture_id, false)) {
+            fprintf(stderr, "gsr error: gsr_capture_kms_capture: failed to bind image to texture, trying with external texture\n");
+            self->external_texture_fallback = true;
+            gsr_capture_kms_bind_image_to_texture(self, image, self->external_input_texture_id, true);
+        }
+    }
+}
+
+static gsr_kms_response_item* find_monitor_drm(gsr_capture_kms *self, bool *capture_is_combined_plane) {
+    *capture_is_combined_plane = false;
+    gsr_kms_response_item *drm_fd = NULL;
+
+    for(int i = 0; i < self->monitor_id.num_connector_ids; ++i) {
+        drm_fd = find_drm_by_connector_id(&self->kms_response, self->monitor_id.connector_ids[i]);
+        if(drm_fd)
+            break;
+    }
+
+    // No plane matched the connector ids. On wayland this only happens if the target monitor has been disconnected, so the fallback to the largest (combined) plane is x11-only
+    if(!drm_fd && self->is_x11) {
+        drm_fd = find_largest_drm(&self->kms_response);
+        *capture_is_combined_plane = true;
+    }
+
+    return drm_fd;
+}
+
+static gsr_kms_response_item* find_cursor_drm_if_on_monitor(gsr_capture_kms *self, uint32_t monitor_connector_id, bool capture_is_combined_plane) {
+    gsr_kms_response_item *cursor_drm_fd = find_cursor_drm(&self->kms_response, monitor_connector_id);
+    if(!capture_is_combined_plane && cursor_drm_fd && cursor_drm_fd->connector_id != monitor_connector_id)
+        cursor_drm_fd = NULL;
+    return cursor_drm_fd;
+}
+
+static void render_drm_cursor(gsr_capture_kms *self, gsr_color_conversion *color_conversion, const gsr_kms_response_item *cursor_drm_fd, vec2i target_pos, vec2i output_size, vec2i framebuffer_size) {
+    const vec2d scale = {
+        self->capture_size.x == 0 ? 0 : (double)output_size.x / (double)self->capture_size.x,
+        self->capture_size.y == 0 ? 0 : (double)output_size.y / (double)self->capture_size.y
+    };
+
+    const bool cursor_texture_id_is_external = self->params.egl->gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA;
+    const vec2i cursor_size = {cursor_drm_fd->width, cursor_drm_fd->height};
+
+    vec2i cursor_pos = {cursor_drm_fd->x, cursor_drm_fd->y};
+    switch(self->monitor_rotation) {
+        case GSR_MONITOR_ROT_0:
+            break;
+        case GSR_MONITOR_ROT_90:
+            cursor_pos = swap_vec2i(cursor_pos);
+            cursor_pos.x = framebuffer_size.x - cursor_pos.x;
+            // TODO: Remove this horrible hack
+            cursor_pos.x -= cursor_size.x;
+            break;
+        case GSR_MONITOR_ROT_180:
+            cursor_pos.x = framebuffer_size.x - cursor_pos.x;
+            cursor_pos.y = framebuffer_size.y - cursor_pos.y;
+            // TODO: Remove this horrible hack
+            cursor_pos.x -= cursor_size.x;
+            cursor_pos.y -= cursor_size.y;
+            break;
+        case GSR_MONITOR_ROT_270:
+            cursor_pos = swap_vec2i(cursor_pos);
+            cursor_pos.y = framebuffer_size.y - cursor_pos.y;
+            // TODO: Remove this horrible hack
+            cursor_pos.y -= cursor_size.y;
+            break;
+    }
+
+    cursor_pos.x -= self->params.region_position.x;
+    cursor_pos.y -= self->params.region_position.y;
+
+    cursor_pos.x *= scale.x;
+    cursor_pos.y *= scale.y;
+
+    cursor_pos.x += target_pos.x;
+    cursor_pos.y += target_pos.y;
+
+    int fds[GSR_KMS_MAX_DMA_BUFS];
+    uint32_t offsets[GSR_KMS_MAX_DMA_BUFS];
+    uint32_t pitches[GSR_KMS_MAX_DMA_BUFS];
+    uint64_t modifiers[GSR_KMS_MAX_DMA_BUFS];
+
+    for(int i = 0; i < cursor_drm_fd->num_dma_bufs; ++i) {
+        fds[i] = cursor_drm_fd->dma_buf[i].fd;
+        offsets[i] = cursor_drm_fd->dma_buf[i].offset;
+        pitches[i] = cursor_drm_fd->dma_buf[i].pitch;
+        modifiers[i] = cursor_drm_fd->modifier;
+    }
+
+    intptr_t img_attr_cursor[44];
+    setup_dma_buf_attrs(img_attr_cursor, cursor_drm_fd->pixel_format, cursor_drm_fd->width, cursor_drm_fd->height,
+        fds, offsets, pitches, modifiers, cursor_drm_fd->num_dma_bufs, true);
+
+    EGLImage cursor_image = self->params.egl->eglCreateImage(self->params.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr_cursor);
+    const int target = cursor_texture_id_is_external ? GL_TEXTURE_EXTERNAL_OES : GL_TEXTURE_2D;
+    self->params.egl->glBindTexture(target, self->cursor_texture_id);
+    self->params.egl->glEGLImageTargetTexture2DOES(target, cursor_image);
+    self->params.egl->glBindTexture(target, 0);
+
+    if(cursor_image)
+        self->params.egl->eglDestroyImage(self->params.egl->egl_display, cursor_image);
+
+    self->params.egl->glEnable(GL_SCISSOR_TEST);
+    self->params.egl->glScissor(target_pos.x, target_pos.y, output_size.x, output_size.y);
+
+    gsr_color_conversion_draw(color_conversion, self->cursor_texture_id,
+        cursor_pos, (vec2i){cursor_size.x * scale.x, cursor_size.y * scale.y},
+        (vec2i){0, 0}, cursor_size, cursor_size,
+        gsr_monitor_rotation_to_rotation(self->monitor_rotation), GSR_SOURCE_COLOR_RGB, cursor_texture_id_is_external, true);
+
+    self->params.egl->glDisable(GL_SCISSOR_TEST);
+}
+
+static void render_x11_cursor(gsr_capture_kms *self, gsr_color_conversion *color_conversion, vec2i capture_pos, vec2i target_pos, vec2i output_size) {
+    if(!self->x11_cursor.visible)
+        return;
+
+    const vec2d scale = {
+        self->capture_size.x == 0 ? 0 : (double)output_size.x / (double)self->capture_size.x,
+        self->capture_size.y == 0 ? 0 : (double)output_size.y / (double)self->capture_size.y
+    };
+
+    Display *display = gsr_window_get_display(self->params.egl->window);
+    gsr_cursor_tick(&self->x11_cursor, DefaultRootWindow(display));
+
+    const vec2i cursor_pos = {
+        target_pos.x + (self->x11_cursor.position.x - self->x11_cursor.hotspot.x - capture_pos.x) * scale.x,
+        target_pos.y + (self->x11_cursor.position.y - self->x11_cursor.hotspot.y - capture_pos.y) * scale.y
+    };
+
+    self->params.egl->glEnable(GL_SCISSOR_TEST);
+    self->params.egl->glScissor(target_pos.x, target_pos.y, output_size.x, output_size.y);
+
+    gsr_color_conversion_draw(color_conversion, self->x11_cursor.texture_id,
+        cursor_pos, (vec2i){self->x11_cursor.size.x * scale.x, self->x11_cursor.size.y * scale.y},
+        (vec2i){0, 0}, self->x11_cursor.size, self->x11_cursor.size,
+        GSR_ROT_0, GSR_SOURCE_COLOR_RGB, false, true);
+
+    self->params.egl->glDisable(GL_SCISSOR_TEST);
+}
+
+static void gsr_capture_kms_update_capture_size_change(gsr_capture_kms *self, gsr_color_conversion *color_conversion, vec2i target_pos, const gsr_kms_response_item *drm_fd) {
+    if(target_pos.x != self->prev_target_pos.x || target_pos.y != self->prev_target_pos.y || drm_fd->src_w != self->prev_plane_size.x || drm_fd->src_h != self->prev_plane_size.y) {
+        self->prev_target_pos = target_pos;
+        self->prev_plane_size = (vec2i){ drm_fd->src_w, drm_fd->src_h };
+        gsr_color_conversion_clear(color_conversion);
+    }
+}
+
+static void gsr_capture_kms_update_connector_ids(gsr_capture_kms *self) {
+    const double now = clock_get_monotonic_seconds();
+    if(now - self->last_time_monitor_check < FIND_CRTC_BY_NAME_TIMEOUT_SECONDS)
+        return;
+
+    self->last_time_monitor_check = now;
+    /* TODO: Assume for now that there is only 1 framebuffer for all monitors and it doesn't change */
+    if(self->is_x11)
+        return;
+
+    self->monitor_id.num_connector_ids = 0;
+    const gsr_connection_type connection_type = self->is_x11 ? GSR_CONNECTION_X11 : GSR_CONNECTION_DRM;
+    // MonitorCallbackUserdata monitor_callback_userdata = {
+    //     &self->monitor_id,
+    //     self->params.display_to_capture, strlen(self->params.display_to_capture),
+    //     0,
+    // };
+    // for_each_active_monitor_output(self->params.egl->window, self->params.egl->card_path, connection_type, monitor_callback, &monitor_callback_userdata);
+
+    gsr_monitor monitor;
+    if(!get_monitor_by_name(self->params.egl, connection_type, self->params.display_to_capture, &monitor)) {
+        fprintf(stderr, "gsr error: gsr_capture_kms_update_connector_ids: failed to find monitor by name \"%s\"\n", self->params.display_to_capture);
+        return;
+    }
+
+    self->monitor_id.num_connector_ids = 1;
+    self->monitor_id.connector_ids[0] = monitor.connector_id;
+
+    monitor.name = self->params.display_to_capture;
+    vec2i monitor_position = {0, 0};
+    // TODO: This is cached. We need it updated.
+    drm_monitor_get_display_server_data(self->params.egl->window, &monitor, &self->monitor_rotation, &monitor_position);
+
+    self->capture_pos = monitor.pos;
+    /* Monitor size is already rotated on x11 when the monitor is rotated, no need to apply it ourselves */
+    if(self->is_x11)
+        self->capture_size = monitor.size;
+    else
+        self->capture_size = rotate_capture_size_if_rotated(self, monitor.size);
+}
+
+static int gsr_capture_kms_capture(gsr_capture *cap, gsr_capture_metadata *capture_metadata, gsr_color_conversion *color_conversion) {
+    gsr_capture_kms *self = cap->priv;
+
+    gsr_capture_kms_cleanup_kms_fds(self);
+
+    if(gsr_kms_client_get_kms(&self->kms_client, &self->kms_response) != 0) {
+        fprintf(stderr, "gsr error: gsr_capture_kms_capture: failed to get kms, error: %d (%s)\n", self->kms_response.result, self->kms_response.err_msg);
+        return -1;
+    }
+
+    if(self->kms_response.num_items == 0) {
+        static bool error_shown = false;
+        if(!error_shown) {
+            error_shown = true;
+            fprintf(stderr, "gsr error: no drm found, capture will fail\n");
+        }
+        return -1;
+    }
+
+    gsr_capture_kms_update_connector_ids(self);
+
+    bool capture_is_combined_plane = false;
+    const gsr_kms_response_item *drm_fd = find_monitor_drm(self, &capture_is_combined_plane);
+    if(!drm_fd) {
+        gsr_capture_kms_cleanup_kms_fds(self);
+        return -1;
+    }
+
+    if(drm_fd->has_hdr_metadata && self->params.hdr && hdr_metadata_is_supported_format(&drm_fd->hdr_metadata))
+        gsr_kms_set_hdr_metadata(self, drm_fd);
+
+    self->capture_size = rotate_capture_size_if_rotated(self, (vec2i){ drm_fd->src_w, drm_fd->src_h });
+    const vec2i original_frame_size = self->capture_size;
+    if(self->params.region_size.x > 0 && self->params.region_size.y > 0)
+        self->capture_size = self->params.region_size;
+
+    const bool is_scaled = self->params.output_resolution.x > 0 && self->params.output_resolution.y > 0;
+    vec2i output_size = is_scaled ? self->params.output_resolution : self->capture_size;
+    output_size = scale_keep_aspect_ratio(self->capture_size, output_size);
+
+    const vec2i target_pos = { max_int(0, capture_metadata->width / 2 - output_size.x / 2), max_int(0, capture_metadata->height / 2 - output_size.y / 2) };
+    gsr_capture_kms_update_capture_size_change(self, color_conversion, target_pos, drm_fd);
+
+    vec2i capture_pos = self->capture_pos;
+    if(!capture_is_combined_plane)
+        capture_pos = (vec2i){drm_fd->x, drm_fd->y};
+
+    capture_pos.x += self->params.region_position.x;
+    capture_pos.y += self->params.region_position.y;
+
+    //self->params.egl->glFlush();
+    //self->params.egl->glFinish();
+
+    EGLImage image = gsr_capture_kms_create_egl_image_with_fallback(self, drm_fd);
+    if(image) {
+        gsr_capture_kms_bind_image_to_input_texture_with_fallback(self, image);
+        self->params.egl->eglDestroyImage(self->params.egl->egl_display, image);
+    }
+
+    gsr_color_conversion_draw(color_conversion, self->external_texture_fallback ? self->external_input_texture_id : self->input_texture_id,
+        target_pos, output_size,
+        capture_pos, self->capture_size, original_frame_size,
+        gsr_monitor_rotation_to_rotation(self->monitor_rotation), GSR_SOURCE_COLOR_RGB, self->external_texture_fallback, false);
+
+    if(self->params.record_cursor) {
+        gsr_kms_response_item *cursor_drm_fd = find_cursor_drm_if_on_monitor(self, drm_fd->connector_id, capture_is_combined_plane);
+        // The cursor is handled by x11 on x11 instead of using the cursor drm plane because on prime systems with a dedicated nvidia gpu
+        // the cursor plane is not available when the cursor is on the monitor controlled by the nvidia device.
+        // TODO: This doesn't work properly with software cursor on x11 since it will draw the x11 cursor on top of the cursor already in the framebuffer.
+        // Detect if software cursor is used on x11 somehow.
+        if(self->is_x11) {
+            vec2i cursor_monitor_offset = self->capture_pos;
+            cursor_monitor_offset.x += self->params.region_position.x;
+            cursor_monitor_offset.y += self->params.region_position.y;
+            render_x11_cursor(self, color_conversion, cursor_monitor_offset, target_pos, output_size);
+        } else if(cursor_drm_fd) {
+            const vec2i framebuffer_size = rotate_capture_size_if_rotated(self, (vec2i){ drm_fd->src_w, drm_fd->src_h });
+            render_drm_cursor(self, color_conversion, cursor_drm_fd, target_pos, output_size, framebuffer_size);
+        }
+    }
+
+    //self->params.egl->glFlush();
+    //self->params.egl->glFinish();
+
+    gsr_capture_kms_cleanup_kms_fds(self);
+
+    return 0;
+}
+
+static bool gsr_capture_kms_should_stop(gsr_capture *cap, bool *err) {
+    (void)cap;
+    if(err)
+        *err = false;
+    return false;
+}
+
+static bool gsr_capture_kms_uses_external_image(gsr_capture *cap) {
+    (void)cap;
+    return true;
+}
+
+static bool gsr_capture_kms_set_hdr_metadata(gsr_capture *cap, AVMasteringDisplayMetadata *mastering_display_metadata, AVContentLightMetadata *light_metadata) {
+    gsr_capture_kms *self = cap->priv;
+
+    if(!self->hdr_metadata_set)
+        return false;
+
+    light_metadata->MaxCLL = self->hdr_metadata.hdmi_metadata_type1.max_cll;
+    light_metadata->MaxFALL = self->hdr_metadata.hdmi_metadata_type1.max_fall;
+
+    for(int i = 0; i < 3; ++i) {
+        mastering_display_metadata->display_primaries[i][0] = av_make_q(self->hdr_metadata.hdmi_metadata_type1.display_primaries[i].x, 50000);
+        mastering_display_metadata->display_primaries[i][1] = av_make_q(self->hdr_metadata.hdmi_metadata_type1.display_primaries[i].y, 50000);
+    }
+
+    mastering_display_metadata->white_point[0] = av_make_q(self->hdr_metadata.hdmi_metadata_type1.white_point.x, 50000);
+    mastering_display_metadata->white_point[1] = av_make_q(self->hdr_metadata.hdmi_metadata_type1.white_point.y, 50000);
+
+    mastering_display_metadata->min_luminance = av_make_q(self->hdr_metadata.hdmi_metadata_type1.min_display_mastering_luminance, 10000);
+    mastering_display_metadata->max_luminance = av_make_q(self->hdr_metadata.hdmi_metadata_type1.max_display_mastering_luminance, 1);
+
+    mastering_display_metadata->has_primaries = true;
+    mastering_display_metadata->has_luminance = true;
+
+    return true;
+}
+
+// static bool gsr_capture_kms_is_damaged(gsr_capture *cap) {
+//     gsr_capture_kms *self = cap->priv;
+//     return self->damaged;
+// }
+
+// static void gsr_capture_kms_clear_damage(gsr_capture *cap) {
+//     gsr_capture_kms *self = cap->priv;
+//     self->damaged = false;
+// }
+
+static void gsr_capture_kms_destroy(gsr_capture *cap) {
+    gsr_capture_kms *self = cap->priv;
+    if(cap->priv) {
+        gsr_capture_kms_stop(self);
+        free((void*)self->params.display_to_capture);
+        self->params.display_to_capture = NULL;
+        free(cap->priv);
+        cap->priv = NULL;
+    }
+    free(cap);
+}
+
+gsr_capture* gsr_capture_kms_create(const gsr_capture_kms_params *params) {
+    if(!params) {
+        fprintf(stderr, "gsr error: gsr_capture_kms_create params is NULL\n");
+        return NULL;
+    }
+
+    gsr_capture *cap = calloc(1, sizeof(gsr_capture));
+    if(!cap)
+        return NULL;
+
+    gsr_capture_kms *cap_kms = calloc(1, sizeof(gsr_capture_kms));
+    if(!cap_kms) {
+        free(cap);
+        return NULL;
+    }
+
+    const char *display_to_capture = strdup(params->display_to_capture);
+    if(!display_to_capture) {
+        free(cap);
+        free(cap_kms);
+        return NULL;
+    }
+
+    cap_kms->params = *params;
+    cap_kms->params.display_to_capture = display_to_capture;
+
+    *cap = (gsr_capture) {
+        .start = gsr_capture_kms_start,
+        .on_event = gsr_capture_kms_on_event,
+        //.tick = gsr_capture_kms_tick,
+        .should_stop = gsr_capture_kms_should_stop,
+        .capture = gsr_capture_kms_capture,
+        .uses_external_image = gsr_capture_kms_uses_external_image,
+        .set_hdr_metadata = gsr_capture_kms_set_hdr_metadata,
+        //.is_damaged = gsr_capture_kms_is_damaged,
+        //.clear_damage = gsr_capture_kms_clear_damage,
+        .destroy = gsr_capture_kms_destroy,
+        .priv = cap_kms
+    };
+
+    return cap;
+}
diff --git a/src/capture/kms_cuda.c b/src/capture/kms_cuda.c
deleted file mode 100644
index d93d603..0000000
--- a/src/capture/kms_cuda.c
+++ /dev/null
@@ -1,615 +0,0 @@
-#include "../../include/capture/kms_cuda.h"
-#include "../../kms/client/kms_client.h"
-#include "../../include/utils.h"
-#include "../../include/color_conversion.h"
-#include "../../include/cuda.h"
-#include <stdlib.h>
-#include <stdio.h>
-#include <unistd.h>
-#include <assert.h>
-#include <libavutil/hwcontext.h>
-#include <libavutil/hwcontext_cuda.h>
-#include <libavutil/frame.h>
-#include <libavcodec/avcodec.h>
-
-/*
-    TODO: Use dummy pool for cuda buffer so we can create our own cuda buffers from pixel buffer objects
-    and copy the input textures to the pixel buffer objects. Use sw_format NV12 as well. Then this is
-    similar to kms_vaapi. This allows us to remove one extra texture and texture copy.
-*/
-/* TODO: Support cursor plane capture when nvidia supports cursor plane */
-
-#define MAX_CONNECTOR_IDS 32
-
-typedef struct {
-    uint32_t connector_ids[MAX_CONNECTOR_IDS];
-    int num_connector_ids;
-} MonitorId;
-
-typedef struct {
-    gsr_capture_kms_cuda_params params;
-    XEvent xev;
-
-    bool should_stop;
-    bool stop_is_error;
-    bool created_hw_frame;
-
-    gsr_cuda cuda;
-    
-    gsr_kms_client kms_client;
-    gsr_kms_response kms_response;
-    gsr_kms_response_fd wayland_kms_data;
-    bool using_wayland_capture;
-
-    vec2i capture_pos;
-    vec2i capture_size;
-    MonitorId monitor_id;
-
-    CUgraphicsResource cuda_graphics_resource;
-    CUarray mapped_array;
-
-    unsigned int input_texture;
-    unsigned int target_texture;
-    gsr_color_conversion color_conversion;
-} gsr_capture_kms_cuda;
-
-static int max_int(int a, int b) {
-    return a > b ? a : b;
-}
-
-static void gsr_capture_kms_cuda_stop(gsr_capture *cap, AVCodecContext *video_codec_context);
-
-static bool cuda_create_codec_context(gsr_capture_kms_cuda *cap_kms, AVCodecContext *video_codec_context) {
-    CUcontext old_ctx;
-    cap_kms->cuda.cuCtxPushCurrent_v2(cap_kms->cuda.cu_ctx);
-
-    AVBufferRef *device_ctx = av_hwdevice_ctx_alloc(AV_HWDEVICE_TYPE_CUDA);
-    if(!device_ctx) {
-        fprintf(stderr, "Error: Failed to create hardware device context\n");
-        cap_kms->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    AVHWDeviceContext *hw_device_context = (AVHWDeviceContext*)device_ctx->data;
-    AVCUDADeviceContext *cuda_device_context = (AVCUDADeviceContext*)hw_device_context->hwctx;
-    cuda_device_context->cuda_ctx = cap_kms->cuda.cu_ctx;
-    if(av_hwdevice_ctx_init(device_ctx) < 0) {
-        fprintf(stderr, "Error: Failed to create hardware device context\n");
-        av_buffer_unref(&device_ctx);
-        cap_kms->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    AVBufferRef *frame_context = av_hwframe_ctx_alloc(device_ctx);
-    if(!frame_context) {
-        fprintf(stderr, "Error: Failed to create hwframe context\n");
-        av_buffer_unref(&device_ctx);
-        cap_kms->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    AVHWFramesContext *hw_frame_context =
-        (AVHWFramesContext *)frame_context->data;
-    hw_frame_context->width = video_codec_context->width;
-    hw_frame_context->height = video_codec_context->height;
-    hw_frame_context->sw_format = AV_PIX_FMT_BGR0;
-    hw_frame_context->format = video_codec_context->pix_fmt;
-    hw_frame_context->device_ref = device_ctx;
-    hw_frame_context->device_ctx = (AVHWDeviceContext*)device_ctx->data;
-
-    hw_frame_context->initial_pool_size = 1;
-
-    if (av_hwframe_ctx_init(frame_context) < 0) {
-        fprintf(stderr, "Error: Failed to initialize hardware frame context "
-                        "(note: ffmpeg version needs to be > 4.0)\n");
-        av_buffer_unref(&device_ctx);
-        //av_buffer_unref(&frame_context);
-        cap_kms->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    video_codec_context->hw_device_ctx = av_buffer_ref(device_ctx);
-    video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
-    return true;
-}
-
-// TODO: On monitor reconfiguration, find monitor x, y, width and height again. Do the same for nvfbc.
-
-typedef struct {
-    gsr_capture_kms_cuda *cap_kms;
-    const char *monitor_to_capture;
-    int monitor_to_capture_len;
-    int num_monitors;
-} MonitorCallbackUserdata;
-
-static void monitor_callback(const gsr_monitor *monitor, void *userdata) {
-    MonitorCallbackUserdata *monitor_callback_userdata = userdata;
-    ++monitor_callback_userdata->num_monitors;
-
-    if(monitor_callback_userdata->monitor_to_capture_len != monitor->name_len || memcmp(monitor_callback_userdata->monitor_to_capture, monitor->name, monitor->name_len) != 0)
-        return;
-
-    if(monitor_callback_userdata->cap_kms->monitor_id.num_connector_ids < MAX_CONNECTOR_IDS) {
-        monitor_callback_userdata->cap_kms->monitor_id.connector_ids[monitor_callback_userdata->cap_kms->monitor_id.num_connector_ids] = monitor->connector_id;
-        ++monitor_callback_userdata->cap_kms->monitor_id.num_connector_ids;
-    }
-
-    if(monitor_callback_userdata->cap_kms->monitor_id.num_connector_ids == MAX_CONNECTOR_IDS)
-        fprintf(stderr, "gsr warning: reached max connector ids\n");
-}
-
-static int gsr_capture_kms_cuda_start(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_kms_cuda *cap_kms = cap->priv;
-
-    gsr_monitor monitor;
-    cap_kms->monitor_id.num_connector_ids = 0;
-    if(gsr_egl_start_capture(cap_kms->params.egl, cap_kms->params.display_to_capture)) {
-        if(!get_monitor_by_name(cap_kms->params.egl, GSR_CONNECTION_WAYLAND, cap_kms->params.display_to_capture, &monitor)) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_cuda_start: failed to find monitor by name \"%s\"\n", cap_kms->params.display_to_capture);
-            gsr_capture_kms_cuda_stop(cap, video_codec_context);
-            return -1;
-        }
-        cap_kms->using_wayland_capture = true;
-    } else {
-        int kms_init_res = gsr_kms_client_init(&cap_kms->kms_client, cap_kms->params.card_path);
-        if(kms_init_res != 0) {
-            gsr_capture_kms_cuda_stop(cap, video_codec_context);
-            return kms_init_res;
-        }
-
-        MonitorCallbackUserdata monitor_callback_userdata = {
-            cap_kms,
-            cap_kms->params.display_to_capture, strlen(cap_kms->params.display_to_capture),
-            0
-        };
-        for_each_active_monitor_output((void*)cap_kms->params.card_path, GSR_CONNECTION_DRM, monitor_callback, &monitor_callback_userdata);
-
-        if(!get_monitor_by_name((void*)cap_kms->params.card_path, GSR_CONNECTION_DRM, cap_kms->params.display_to_capture, &monitor)) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_cuda_start: failed to find monitor by name \"%s\"\n", cap_kms->params.display_to_capture);
-            gsr_capture_kms_cuda_stop(cap, video_codec_context);
-            return -1;
-        }
-    }
-
-    cap_kms->capture_pos = monitor.pos;
-    cap_kms->capture_size = monitor.size;
-
-    video_codec_context->width = max_int(2, cap_kms->capture_size.x & ~1);
-    video_codec_context->height = max_int(2, cap_kms->capture_size.y & ~1);
-
-    /* Disable vsync */
-    cap_kms->params.egl->eglSwapInterval(cap_kms->params.egl->egl_display, 0);
-
-    // TODO: overclocking is not supported on wayland...
-    if(!gsr_cuda_load(&cap_kms->cuda, NULL, false)) {
-        fprintf(stderr, "gsr error: gsr_capture_kms_cuda_start: failed to load cuda\n");
-        gsr_capture_kms_cuda_stop(cap, video_codec_context);
-        return -1;
-    }
-
-    if(!cuda_create_codec_context(cap_kms, video_codec_context)) {
-        gsr_capture_kms_cuda_stop(cap, video_codec_context);
-        return -1;
-    }
-
-    return 0;
-}
-
-static unsigned int gl_create_texture(gsr_capture_kms_cuda *cap_kms, int width, int height) {
-    unsigned int texture_id = 0;
-    cap_kms->params.egl->glGenTextures(1, &texture_id);
-    cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, texture_id);
-    cap_kms->params.egl->glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, NULL);
-
-    cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
-    cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
-    cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
-    cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
-
-    cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-    return texture_id;
-}
-
-static bool cuda_register_opengl_texture(gsr_capture_kms_cuda *cap_kms) {
-    CUresult res;
-    CUcontext old_ctx;
-    res = cap_kms->cuda.cuCtxPushCurrent_v2(cap_kms->cuda.cu_ctx);
-    // TODO: Use cuGraphicsEGLRegisterImage instead with the window egl image (dont use window_texture).
-    // That removes the need for an extra texture and texture copy
-    res = cap_kms->cuda.cuGraphicsGLRegisterImage(
-        &cap_kms->cuda_graphics_resource, cap_kms->target_texture, GL_TEXTURE_2D,
-        CU_GRAPHICS_REGISTER_FLAGS_READ_ONLY);
-    if (res != CUDA_SUCCESS) {
-        const char *err_str = "unknown";
-        cap_kms->cuda.cuGetErrorString(res, &err_str);
-        fprintf(stderr, "gsr error: cuda_register_opengl_texture: cuGraphicsGLRegisterImage failed, error: %s, texture " "id: %u\n", err_str, cap_kms->target_texture);
-        res = cap_kms->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    res = cap_kms->cuda.cuGraphicsResourceSetMapFlags(cap_kms->cuda_graphics_resource, CU_GRAPHICS_MAP_RESOURCE_FLAGS_READ_ONLY);
-    res = cap_kms->cuda.cuGraphicsMapResources(1, &cap_kms->cuda_graphics_resource, 0);
-
-    res = cap_kms->cuda.cuGraphicsSubResourceGetMappedArray(&cap_kms->mapped_array, cap_kms->cuda_graphics_resource, 0, 0);
-    res = cap_kms->cuda.cuCtxPopCurrent_v2(&old_ctx);
-    return true;
-}
-
-static void gsr_capture_kms_cuda_tick(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame **frame) {
-    gsr_capture_kms_cuda *cap_kms = cap->priv;
-
-    // TODO:
-    cap_kms->params.egl->glClear(GL_COLOR_BUFFER_BIT);
-
-    if(!cap_kms->created_hw_frame) {
-        cap_kms->created_hw_frame = true;
-
-        av_frame_free(frame);
-        *frame = av_frame_alloc();
-        if(!frame) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_cuda_tick: failed to allocate frame\n");
-            cap_kms->should_stop = true;
-            cap_kms->stop_is_error = true;
-            return;
-        }
-        (*frame)->format = video_codec_context->pix_fmt;
-        (*frame)->width = video_codec_context->width;
-        (*frame)->height = video_codec_context->height;
-        (*frame)->color_range = video_codec_context->color_range;
-        (*frame)->color_primaries = video_codec_context->color_primaries;
-        (*frame)->color_trc = video_codec_context->color_trc;
-        (*frame)->colorspace = video_codec_context->colorspace;
-        (*frame)->chroma_location = video_codec_context->chroma_sample_location;
-
-        if(av_hwframe_get_buffer(video_codec_context->hw_frames_ctx, *frame, 0) < 0) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_cuda_tick: av_hwframe_get_buffer failed\n");
-            cap_kms->should_stop = true;
-            cap_kms->stop_is_error = true;
-            return;
-        }
-
-        cap_kms->params.egl->glGenTextures(1, &cap_kms->input_texture);
-        cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, cap_kms->input_texture);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
-        cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-        cap_kms->target_texture = gl_create_texture(cap_kms, video_codec_context->width, video_codec_context->height);
-        if(cap_kms->target_texture == 0) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_cuda_tick: failed to create opengl texture\n");
-            cap_kms->should_stop = true;
-            cap_kms->stop_is_error = true;
-            return;
-        }
-
-        if(!cuda_register_opengl_texture(cap_kms)) {
-            cap_kms->should_stop = true;
-            cap_kms->stop_is_error = true;
-            return;
-        }
-
-        gsr_color_conversion_params color_conversion_params = {0};
-        color_conversion_params.egl = cap_kms->params.egl;
-        color_conversion_params.source_color = GSR_SOURCE_COLOR_RGB;
-        color_conversion_params.destination_color = GSR_DESTINATION_COLOR_BGR;
-
-        color_conversion_params.destination_textures[0] = cap_kms->target_texture;
-        color_conversion_params.num_destination_textures = 1;
-
-        if(gsr_color_conversion_init(&cap_kms->color_conversion, &color_conversion_params) != 0) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_cuda_tick: failed to create color conversion\n");
-            cap_kms->should_stop = true;
-            cap_kms->stop_is_error = true;
-            return;
-        }
-    }
-}
-
-static bool gsr_capture_kms_cuda_should_stop(gsr_capture *cap, bool *err) {
-    gsr_capture_kms_cuda *cap_kms = cap->priv;
-    if(cap_kms->should_stop) {
-        if(err)
-            *err = cap_kms->stop_is_error;
-        return true;
-    }
-
-    if(err)
-        *err = false;
-    return false;
-}
-
-/* Prefer non combined planes */
-static gsr_kms_response_fd* find_drm_by_connector_id(gsr_kms_response *kms_response, uint32_t connector_id) {
-    int index_combined = -1;
-    for(int i = 0; i < kms_response->num_fds; ++i) {
-        if(kms_response->fds[i].connector_id == connector_id && !kms_response->fds[i].is_cursor) {
-            if(kms_response->fds[i].is_combined_plane)
-                index_combined = i;
-            else
-                return &kms_response->fds[i];
-        }
-    }
-
-    if(index_combined != -1)
-        return &kms_response->fds[index_combined];
-    else
-        return NULL;
-}
-
-static gsr_kms_response_fd* find_first_combined_drm(gsr_kms_response *kms_response) {
-    for(int i = 0; i < kms_response->num_fds; ++i) {
-        if(kms_response->fds[i].is_combined_plane && !kms_response->fds[i].is_cursor)
-            return &kms_response->fds[i];
-    }
-    return NULL;
-}
-
-static gsr_kms_response_fd* find_largest_drm(gsr_kms_response *kms_response) {
-    if(kms_response->num_fds == 0)
-        return NULL;
-
-    int64_t largest_size = 0;
-    gsr_kms_response_fd *largest_drm = &kms_response->fds[0];
-    for(int i = 0; i < kms_response->num_fds; ++i) {
-        const int64_t size = (int64_t)kms_response->fds[i].width * (int64_t)kms_response->fds[i].height;
-        if(size > largest_size && !kms_response->fds[i].is_cursor) {
-            largest_size = size;
-            largest_drm = &kms_response->fds[i];
-        }
-    }
-    return largest_drm;
-}
-
-static void gsr_capture_kms_unload_cuda_graphics(gsr_capture_kms_cuda *cap_kms) {
-    if(cap_kms->cuda.cu_ctx) {
-        CUcontext old_ctx;
-        cap_kms->cuda.cuCtxPushCurrent_v2(cap_kms->cuda.cu_ctx);
-
-        if(cap_kms->cuda_graphics_resource) {
-            cap_kms->cuda.cuGraphicsUnmapResources(1, &cap_kms->cuda_graphics_resource, 0);
-            cap_kms->cuda.cuGraphicsUnregisterResource(cap_kms->cuda_graphics_resource);
-            cap_kms->cuda_graphics_resource = 0;
-        }
-
-        cap_kms->cuda.cuCtxPopCurrent_v2(&old_ctx);
-    }
-}
-
-static gsr_kms_response_fd* find_cursor_drm(gsr_kms_response *kms_response) {
-    for(int i = 0; i < kms_response->num_fds; ++i) {
-        if(kms_response->fds[i].is_cursor)
-            return &kms_response->fds[i];
-    }
-    return NULL;
-}
-
-static int gsr_capture_kms_cuda_capture(gsr_capture *cap, AVFrame *frame) {
-    (void)frame;
-    gsr_capture_kms_cuda *cap_kms = cap->priv;
-
-    for(int i = 0; i < cap_kms->kms_response.num_fds; ++i) {
-        if(cap_kms->kms_response.fds[i].fd > 0)
-            close(cap_kms->kms_response.fds[i].fd);
-        cap_kms->kms_response.fds[i].fd = 0;
-    }
-    cap_kms->kms_response.num_fds = 0;
-
-    gsr_kms_response_fd *drm_fd = NULL;
-    gsr_kms_response_fd *cursor_drm_fd = NULL;
-    bool capture_is_combined_plane = false;
-    if(cap_kms->using_wayland_capture) {
-        gsr_egl_update(cap_kms->params.egl);
-        cap_kms->wayland_kms_data.fd = cap_kms->params.egl->fd;
-        cap_kms->wayland_kms_data.width = cap_kms->params.egl->width;
-        cap_kms->wayland_kms_data.height = cap_kms->params.egl->height;
-        cap_kms->wayland_kms_data.pitch = cap_kms->params.egl->pitch;
-        cap_kms->wayland_kms_data.offset = cap_kms->params.egl->offset;
-        cap_kms->wayland_kms_data.pixel_format = cap_kms->params.egl->pixel_format;
-        cap_kms->wayland_kms_data.modifier = cap_kms->params.egl->modifier;
-        cap_kms->wayland_kms_data.connector_id = 0;
-        cap_kms->wayland_kms_data.is_combined_plane = false;
-        cap_kms->wayland_kms_data.is_cursor = false;
-        cap_kms->wayland_kms_data.x = cap_kms->wayland_kms_data.x; // TODO: Use these
-        cap_kms->wayland_kms_data.y = cap_kms->wayland_kms_data.y;
-        cap_kms->wayland_kms_data.src_w = cap_kms->wayland_kms_data.width;
-        cap_kms->wayland_kms_data.src_h = cap_kms->wayland_kms_data.height;
-
-        cap_kms->capture_pos.x = cap_kms->wayland_kms_data.x;
-        cap_kms->capture_pos.y = cap_kms->wayland_kms_data.y;
-
-        if(cap_kms->wayland_kms_data.fd <= 0)
-            return -1;
-
-        drm_fd = &cap_kms->wayland_kms_data;
-    } else {
-        if(gsr_kms_client_get_kms(&cap_kms->kms_client, &cap_kms->kms_response) != 0) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_capture: failed to get kms, error: %d (%s)\n", cap_kms->kms_response.result, cap_kms->kms_response.err_msg);
-            return -1;
-        }
-
-        if(cap_kms->kms_response.num_fds == 0) {
-            static bool error_shown = false;
-            if(!error_shown) {
-                error_shown = true;
-                fprintf(stderr, "gsr error: no drm found, capture will fail\n");
-            }
-            return -1;
-        }
-
-        for(int i = 0; i < cap_kms->monitor_id.num_connector_ids; ++i) {
-            drm_fd = find_drm_by_connector_id(&cap_kms->kms_response, cap_kms->monitor_id.connector_ids[i]);
-            if(drm_fd)
-                break;
-        }
-
-        // Will never happen on wayland unless the target monitor has been disconnected
-        if(!drm_fd) {
-            drm_fd = find_first_combined_drm(&cap_kms->kms_response);
-            if(!drm_fd)
-                drm_fd = find_largest_drm(&cap_kms->kms_response);
-            capture_is_combined_plane = true;
-        }
-
-        cursor_drm_fd = find_cursor_drm(&cap_kms->kms_response);
-    }
-
-    if(!drm_fd)
-        return -1;
-
-    if(!capture_is_combined_plane && cursor_drm_fd && cursor_drm_fd->connector_id != drm_fd->connector_id)
-        cursor_drm_fd = NULL;
-
-    const intptr_t img_attr[] = {
-        //EGL_IMAGE_PRESERVED_KHR, EGL_TRUE,
-        EGL_LINUX_DRM_FOURCC_EXT,       drm_fd->pixel_format,//cap_kms->params.egl->pixel_format, ARGB8888
-        EGL_WIDTH,                      drm_fd->width,//cap_kms->params.egl->width,
-        EGL_HEIGHT,                     drm_fd->height,//cap_kms->params.egl->height,
-        EGL_DMA_BUF_PLANE0_FD_EXT,      drm_fd->fd,//cap_kms->params.egl->fd,
-        EGL_DMA_BUF_PLANE0_OFFSET_EXT,  drm_fd->offset,//cap_kms->params.egl->offset,
-        EGL_DMA_BUF_PLANE0_PITCH_EXT,   drm_fd->pitch,//cap_kms->params.egl->pitch,
-        EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT, drm_fd->modifier & 0xFFFFFFFFULL,//cap_kms->params.egl->modifier & 0xFFFFFFFFULL,
-        EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT, drm_fd->modifier >> 32ULL,//cap_kms->params.egl->modifier >> 32ULL,
-        EGL_NONE
-    };
-
-    EGLImage image = cap_kms->params.egl->eglCreateImage(cap_kms->params.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr);
-    cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, cap_kms->input_texture);
-    cap_kms->params.egl->glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
-    cap_kms->params.egl->eglDestroyImage(cap_kms->params.egl->egl_display, image);
-    cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-    vec2i capture_pos = cap_kms->capture_pos;
-    if(!capture_is_combined_plane)
-        capture_pos = (vec2i){drm_fd->x, drm_fd->y};
-
-    gsr_color_conversion_draw(&cap_kms->color_conversion, cap_kms->input_texture,
-        (vec2i){0, 0}, cap_kms->capture_size,
-        capture_pos, cap_kms->capture_size,
-        0.0f);
-
-    cap_kms->params.egl->eglSwapBuffers(cap_kms->params.egl->egl_display, cap_kms->params.egl->egl_surface);
-
-    frame->linesize[0] = frame->width * 4;
-
-    CUDA_MEMCPY2D memcpy_struct;
-    memcpy_struct.srcXInBytes = 0;
-    memcpy_struct.srcY = 0;
-    memcpy_struct.srcMemoryType = CU_MEMORYTYPE_ARRAY;
-
-    memcpy_struct.dstXInBytes = 0;
-    memcpy_struct.dstY = 0;
-    memcpy_struct.dstMemoryType = CU_MEMORYTYPE_DEVICE;
-
-    memcpy_struct.srcArray = cap_kms->mapped_array;
-    memcpy_struct.srcPitch = frame->linesize[0];
-    memcpy_struct.dstDevice = (CUdeviceptr)frame->data[0];
-    memcpy_struct.dstPitch = frame->linesize[0];
-    memcpy_struct.WidthInBytes = frame->width * 4;
-    memcpy_struct.Height = frame->height;
-    cap_kms->cuda.cuMemcpy2D_v2(&memcpy_struct);
-
-    return 0;
-}
-
-static void gsr_capture_kms_cuda_capture_end(gsr_capture *cap, AVFrame *frame) {
-    (void)frame;
-    gsr_capture_kms_cuda *cap_kms = cap->priv;
-
-    gsr_egl_cleanup_frame(cap_kms->params.egl);
-
-    for(int i = 0; i < cap_kms->kms_response.num_fds; ++i) {
-        if(cap_kms->kms_response.fds[i].fd > 0)
-            close(cap_kms->kms_response.fds[i].fd);
-        cap_kms->kms_response.fds[i].fd = 0;
-    }
-    cap_kms->kms_response.num_fds = 0;
-}
-
-static void gsr_capture_kms_cuda_stop(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_kms_cuda *cap_kms = cap->priv;
-
-    gsr_color_conversion_deinit(&cap_kms->color_conversion);
-
-    gsr_capture_kms_unload_cuda_graphics(cap_kms);
-
-    if(cap_kms->params.egl->egl_context) {
-        if(cap_kms->input_texture) {
-            cap_kms->params.egl->glDeleteTextures(1, &cap_kms->input_texture);
-            cap_kms->input_texture = 0;
-        }
-
-        if(cap_kms->target_texture) {
-            cap_kms->params.egl->glDeleteTextures(1, &cap_kms->target_texture);
-            cap_kms->target_texture = 0;
-        }
-    }
-
-    for(int i = 0; i < cap_kms->kms_response.num_fds; ++i) {
-        if(cap_kms->kms_response.fds[i].fd > 0)
-            close(cap_kms->kms_response.fds[i].fd);
-        cap_kms->kms_response.fds[i].fd = 0;
-    }
-    cap_kms->kms_response.num_fds = 0;
-
-    if(video_codec_context->hw_device_ctx)
-        av_buffer_unref(&video_codec_context->hw_device_ctx);
-    if(video_codec_context->hw_frames_ctx)
-        av_buffer_unref(&video_codec_context->hw_frames_ctx);
-
-    gsr_cuda_unload(&cap_kms->cuda);
-    gsr_kms_client_deinit(&cap_kms->kms_client);
-}
-
-static void gsr_capture_kms_cuda_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    (void)video_codec_context;
-    gsr_capture_kms_cuda *cap_kms = cap->priv;
-    if(cap->priv) {
-        gsr_capture_kms_cuda_stop(cap, video_codec_context);
-        free((void*)cap_kms->params.display_to_capture);
-        cap_kms->params.display_to_capture = NULL;
-        free(cap->priv);
-        cap->priv = NULL;
-    }
-    free(cap);
-}
-
-gsr_capture* gsr_capture_kms_cuda_create(const gsr_capture_kms_cuda_params *params) {
-    if(!params) {
-        fprintf(stderr, "gsr error: gsr_capture_kms_cuda_create params is NULL\n");
-        return NULL;
-    }
-
-    gsr_capture *cap = calloc(1, sizeof(gsr_capture));
-    if(!cap)
-        return NULL;
-
-    gsr_capture_kms_cuda *cap_kms = calloc(1, sizeof(gsr_capture_kms_cuda));
-    if(!cap_kms) {
-        free(cap);
-        return NULL;
-    }
-
-    const char *display_to_capture = strdup(params->display_to_capture);
-    if(!display_to_capture) {
-        free(cap);
-        free(cap_kms);
-        return NULL;
-    }
-
-    cap_kms->params = *params;
-    cap_kms->params.display_to_capture = display_to_capture;
-    
-    *cap = (gsr_capture) {
-        .start = gsr_capture_kms_cuda_start,
-        .tick = gsr_capture_kms_cuda_tick,
-        .should_stop = gsr_capture_kms_cuda_should_stop,
-        .capture = gsr_capture_kms_cuda_capture,
-        .capture_end = gsr_capture_kms_cuda_capture_end,
-        .destroy = gsr_capture_kms_cuda_destroy,
-        .priv = cap_kms
-    };
-
-    return cap;
-}
diff --git a/src/capture/kms_vaapi.c b/src/capture/kms_vaapi.c
deleted file mode 100644
index 681f345..0000000
--- a/src/capture/kms_vaapi.c
+++ /dev/null
@@ -1,661 +0,0 @@
-#include "../../include/capture/kms_vaapi.h"
-#include "../../kms/client/kms_client.h"
-#include "../../include/utils.h"
-#include "../../include/color_conversion.h"
-#include <stdlib.h>
-#include <stdio.h>
-#include <unistd.h>
-#include <assert.h>
-#include <libavutil/hwcontext.h>
-#include <libavutil/hwcontext_vaapi.h>
-#include <libavutil/frame.h>
-#include <libavcodec/avcodec.h>
-#include <va/va.h>
-#include <va/va_drmcommon.h>
-
-#define MAX_CONNECTOR_IDS 32
-
-typedef struct {
-    uint32_t connector_ids[MAX_CONNECTOR_IDS];
-    int num_connector_ids;
-} MonitorId;
-
-typedef struct {
-    gsr_capture_kms_vaapi_params params;
-    XEvent xev;
-
-    bool should_stop;
-    bool stop_is_error;
-    bool created_hw_frame;
-    
-    gsr_kms_client kms_client;
-    gsr_kms_response kms_response;
-    gsr_kms_response_fd wayland_kms_data;
-    bool using_wayland_capture;
-
-    vec2i capture_pos;
-    vec2i capture_size;
-    MonitorId monitor_id;
-
-    VADisplay va_dpy;
-    VADRMPRIMESurfaceDescriptor prime;
-
-    unsigned int input_texture;
-    unsigned int target_textures[2];
-    unsigned int cursor_texture;
-
-    gsr_color_conversion color_conversion;
-} gsr_capture_kms_vaapi;
-
-static int max_int(int a, int b) {
-    return a > b ? a : b;
-}
-
-static void gsr_capture_kms_vaapi_stop(gsr_capture *cap, AVCodecContext *video_codec_context);
-
-static bool drm_create_codec_context(gsr_capture_kms_vaapi *cap_kms, AVCodecContext *video_codec_context) {
-    char render_path[128];
-    if(!gsr_card_path_get_render_path(cap_kms->params.card_path, render_path)) {
-        fprintf(stderr, "gsr error: failed to get /dev/dri/renderDXXX file from %s\n", cap_kms->params.card_path);
-        return false;
-    }
-
-    AVBufferRef *device_ctx;
-    if(av_hwdevice_ctx_create(&device_ctx, AV_HWDEVICE_TYPE_VAAPI, render_path, NULL, 0) < 0) {
-        fprintf(stderr, "Error: Failed to create hardware device context\n");
-        return false;
-    }
-
-    AVBufferRef *frame_context = av_hwframe_ctx_alloc(device_ctx);
-    if(!frame_context) {
-        fprintf(stderr, "Error: Failed to create hwframe context\n");
-        av_buffer_unref(&device_ctx);
-        return false;
-    }
-
-    AVHWFramesContext *hw_frame_context =
-        (AVHWFramesContext *)frame_context->data;
-    hw_frame_context->width = video_codec_context->width;
-    hw_frame_context->height = video_codec_context->height;
-    hw_frame_context->sw_format = AV_PIX_FMT_NV12;//AV_PIX_FMT_0RGB32;//AV_PIX_FMT_YUV420P;//AV_PIX_FMT_0RGB32;//AV_PIX_FMT_NV12;
-    hw_frame_context->format = video_codec_context->pix_fmt;
-    hw_frame_context->device_ref = device_ctx;
-    hw_frame_context->device_ctx = (AVHWDeviceContext*)device_ctx->data;
-
-    hw_frame_context->initial_pool_size = 1; // TODO: (and in other places)
-
-    AVVAAPIDeviceContext *vactx =((AVHWDeviceContext*)device_ctx->data)->hwctx;
-    cap_kms->va_dpy = vactx->display;
-
-    if (av_hwframe_ctx_init(frame_context) < 0) {
-        fprintf(stderr, "Error: Failed to initialize hardware frame context "
-                        "(note: ffmpeg version needs to be > 4.0)\n");
-        av_buffer_unref(&device_ctx);
-        //av_buffer_unref(&frame_context);
-        return false;
-    }
-
-    video_codec_context->hw_device_ctx = av_buffer_ref(device_ctx);
-    video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
-    return true;
-}
-
-#define DRM_FORMAT_MOD_INVALID 72057594037927935
-
-// TODO: On monitor reconfiguration, find monitor x, y, width and height again. Do the same for nvfbc.
-
-typedef struct {
-    gsr_capture_kms_vaapi *cap_kms;
-    const char *monitor_to_capture;
-    int monitor_to_capture_len;
-    int num_monitors;
-} MonitorCallbackUserdata;
-
-static void monitor_callback(const gsr_monitor *monitor, void *userdata) {
-    (void)monitor;
-    MonitorCallbackUserdata *monitor_callback_userdata = userdata;
-    ++monitor_callback_userdata->num_monitors;
-
-    if(monitor_callback_userdata->monitor_to_capture_len != monitor->name_len || memcmp(monitor_callback_userdata->monitor_to_capture, monitor->name, monitor->name_len) != 0)
-        return;
-
-    if(monitor_callback_userdata->cap_kms->monitor_id.num_connector_ids < MAX_CONNECTOR_IDS) {
-        monitor_callback_userdata->cap_kms->monitor_id.connector_ids[monitor_callback_userdata->cap_kms->monitor_id.num_connector_ids] = monitor->connector_id;
-        ++monitor_callback_userdata->cap_kms->monitor_id.num_connector_ids;
-    }
-
-    if(monitor_callback_userdata->cap_kms->monitor_id.num_connector_ids == MAX_CONNECTOR_IDS)
-        fprintf(stderr, "gsr warning: reached max connector ids\n");
-}
-
-static int gsr_capture_kms_vaapi_start(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_kms_vaapi *cap_kms = cap->priv;
-
-    gsr_monitor monitor;
-    cap_kms->monitor_id.num_connector_ids = 0;
-    if(gsr_egl_start_capture(cap_kms->params.egl, cap_kms->params.display_to_capture)) {
-        if(!get_monitor_by_name(cap_kms->params.egl, GSR_CONNECTION_WAYLAND, cap_kms->params.display_to_capture, &monitor)) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_cuda_start: failed to find monitor by name \"%s\"\n", cap_kms->params.display_to_capture);
-            gsr_capture_kms_vaapi_stop(cap, video_codec_context);
-            return -1;
-        }
-        cap_kms->using_wayland_capture = true;
-    } else {
-        int kms_init_res = gsr_kms_client_init(&cap_kms->kms_client, cap_kms->params.card_path);
-        if(kms_init_res != 0) {
-            gsr_capture_kms_vaapi_stop(cap, video_codec_context);
-            return kms_init_res;
-        }
-
-        MonitorCallbackUserdata monitor_callback_userdata = {
-            cap_kms,
-            cap_kms->params.display_to_capture, strlen(cap_kms->params.display_to_capture),
-            0,
-        };
-        for_each_active_monitor_output((void*)cap_kms->params.card_path, GSR_CONNECTION_DRM, monitor_callback, &monitor_callback_userdata);
-
-        if(!get_monitor_by_name((void*)cap_kms->params.card_path, GSR_CONNECTION_DRM, cap_kms->params.display_to_capture, &monitor)) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_start: failed to find monitor by name \"%s\"\n", cap_kms->params.display_to_capture);
-            gsr_capture_kms_vaapi_stop(cap, video_codec_context);
-            return -1;
-        }
-    }
-
-    cap_kms->capture_pos = monitor.pos;
-    cap_kms->capture_size = monitor.size;
-
-    /* Disable vsync */
-    cap_kms->params.egl->eglSwapInterval(cap_kms->params.egl->egl_display, 0);
-
-    video_codec_context->width = max_int(2, even_number_ceil(cap_kms->capture_size.x));
-    video_codec_context->height = max_int(2, even_number_ceil(cap_kms->capture_size.y));
-
-    if(!drm_create_codec_context(cap_kms, video_codec_context)) {
-        gsr_capture_kms_vaapi_stop(cap, video_codec_context);
-        return -1;
-    }
-
-    return 0;
-}
-
-static uint32_t fourcc(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
-    return (d << 24) | (c << 16) | (b << 8) | a;
-}
-
-#define FOURCC_NV12 842094158
-
-static void gsr_capture_kms_vaapi_tick(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame **frame) {
-    gsr_capture_kms_vaapi *cap_kms = cap->priv;
-
-    // TODO:
-    cap_kms->params.egl->glClear(GL_COLOR_BUFFER_BIT);
-
-    if(!cap_kms->created_hw_frame) {
-        cap_kms->created_hw_frame = true;
-
-        av_frame_free(frame);
-        *frame = av_frame_alloc();
-        if(!frame) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_tick: failed to allocate frame\n");
-            cap_kms->should_stop = true;
-            cap_kms->stop_is_error = true;
-            return;
-        }
-        (*frame)->format = video_codec_context->pix_fmt;
-        (*frame)->width = video_codec_context->width;
-        (*frame)->height = video_codec_context->height;
-        (*frame)->color_range = video_codec_context->color_range;
-        (*frame)->color_primaries = video_codec_context->color_primaries;
-        (*frame)->color_trc = video_codec_context->color_trc;
-        (*frame)->colorspace = video_codec_context->colorspace;
-        (*frame)->chroma_location = video_codec_context->chroma_sample_location;
-
-        int res = av_hwframe_get_buffer(video_codec_context->hw_frames_ctx, *frame, 0);
-        if(res < 0) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_tick: av_hwframe_get_buffer failed: %d\n", res);
-            cap_kms->should_stop = true;
-            cap_kms->stop_is_error = true;
-            return;
-        }
-
-        VASurfaceID target_surface_id = (uintptr_t)(*frame)->data[3];
-
-        VAStatus va_status = vaExportSurfaceHandle(cap_kms->va_dpy, target_surface_id, VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2, VA_EXPORT_SURFACE_READ_WRITE | VA_EXPORT_SURFACE_SEPARATE_LAYERS, &cap_kms->prime);
-        if(va_status != VA_STATUS_SUCCESS) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_tick: vaExportSurfaceHandle failed, error: %d\n", va_status);
-            cap_kms->should_stop = true;
-            cap_kms->stop_is_error = true;
-            return;
-        }
-        vaSyncSurface(cap_kms->va_dpy, target_surface_id);
-
-        cap_kms->params.egl->glGenTextures(1, &cap_kms->input_texture);
-        cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, cap_kms->input_texture);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
-        cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-        cap_kms->params.egl->glGenTextures(1, &cap_kms->cursor_texture);
-        cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, cap_kms->cursor_texture);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
-        cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
-        cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-        if(cap_kms->prime.fourcc == FOURCC_NV12) {
-            cap_kms->params.egl->glGenTextures(2, cap_kms->target_textures);
-            for(int i = 0; i < 2; ++i) {
-                const uint32_t formats[2] = { fourcc('R', '8', ' ', ' '), fourcc('G', 'R', '8', '8') };
-                const int layer = i;
-                const int plane = 0;
-
-                const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
-                //const uint64_t modifier = cap_kms->prime.objects[cap_kms->prime.layers[layer].object_index[plane]].drm_format_modifier;
-
-                const intptr_t img_attr[] = {
-                    EGL_LINUX_DRM_FOURCC_EXT,       formats[i],
-                    EGL_WIDTH,                      cap_kms->prime.width / div[i],
-                    EGL_HEIGHT,                     cap_kms->prime.height / div[i],
-                    EGL_DMA_BUF_PLANE0_FD_EXT,      cap_kms->prime.objects[cap_kms->prime.layers[layer].object_index[plane]].fd,
-                    EGL_DMA_BUF_PLANE0_OFFSET_EXT,  cap_kms->prime.layers[layer].offset[plane],
-                    EGL_DMA_BUF_PLANE0_PITCH_EXT,   cap_kms->prime.layers[layer].pitch[plane],
-                    // TODO:
-                    //EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT, modifier & 0xFFFFFFFFULL,
-                    //EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT, modifier >> 32ULL,
-                    EGL_NONE
-                };
-
-                while(cap_kms->params.egl->eglGetError() != EGL_SUCCESS){}
-                EGLImage image = cap_kms->params.egl->eglCreateImage(cap_kms->params.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr);
-                if(!image) {
-                    fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_tick: failed to create egl image from drm fd for output drm fd, error: %d\n", cap_kms->params.egl->eglGetError());
-                    cap_kms->should_stop = true;
-                    cap_kms->stop_is_error = true;
-                    return;
-                }
-
-                cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, cap_kms->target_textures[i]);
-                cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
-                cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
-                cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
-                cap_kms->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
-
-                while(cap_kms->params.egl->glGetError()) {}
-                while(cap_kms->params.egl->eglGetError() != EGL_SUCCESS){}
-                cap_kms->params.egl->glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
-                if(cap_kms->params.egl->glGetError() != 0 || cap_kms->params.egl->eglGetError() != EGL_SUCCESS) {
-                    // TODO: Get the error properly
-                    fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_tick: failed to bind egl image to gl texture, error: %d\n", cap_kms->params.egl->eglGetError());
-                    cap_kms->should_stop = true;
-                    cap_kms->stop_is_error = true;
-                    cap_kms->params.egl->eglDestroyImage(cap_kms->params.egl->egl_display, image);
-                    cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-                    return;
-                }
-
-                cap_kms->params.egl->eglDestroyImage(cap_kms->params.egl->egl_display, image);
-                cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-            }
-
-            gsr_color_conversion_params color_conversion_params = {0};
-            color_conversion_params.egl = cap_kms->params.egl;
-            color_conversion_params.source_color = GSR_SOURCE_COLOR_RGB;
-            color_conversion_params.destination_color = GSR_DESTINATION_COLOR_NV12;
-
-            color_conversion_params.destination_textures[0] = cap_kms->target_textures[0];
-            color_conversion_params.destination_textures[1] = cap_kms->target_textures[1];
-            color_conversion_params.num_destination_textures = 2;
-
-            if(gsr_color_conversion_init(&cap_kms->color_conversion, &color_conversion_params) != 0) {
-                fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_tick: failed to create color conversion\n");
-                cap_kms->should_stop = true;
-                cap_kms->stop_is_error = true;
-                return;
-            }
-        } else {
-            fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_tick: unexpected fourcc %u for output drm fd, expected nv12\n", cap_kms->prime.fourcc);
-            cap_kms->should_stop = true;
-            cap_kms->stop_is_error = true;
-            return;
-        }
-    }
-}
-
-static bool gsr_capture_kms_vaapi_should_stop(gsr_capture *cap, bool *err) {
-    gsr_capture_kms_vaapi *cap_kms = cap->priv;
-    if(cap_kms->should_stop) {
-        if(err)
-            *err = cap_kms->stop_is_error;
-        return true;
-    }
-
-    if(err)
-        *err = false;
-    return false;
-}
-
-/* Prefer non combined planes */
-static gsr_kms_response_fd* find_drm_by_connector_id(gsr_kms_response *kms_response, uint32_t connector_id) {
-    int index_combined = -1;
-    for(int i = 0; i < kms_response->num_fds; ++i) {
-        if(kms_response->fds[i].connector_id == connector_id && !kms_response->fds[i].is_cursor) {
-            if(kms_response->fds[i].is_combined_plane)
-                index_combined = i;
-            else
-                return &kms_response->fds[i];
-        }
-    }
-
-    if(index_combined != -1)
-        return &kms_response->fds[index_combined];
-    else
-        return NULL;
-}
-
-static gsr_kms_response_fd* find_first_combined_drm(gsr_kms_response *kms_response) {
-    for(int i = 0; i < kms_response->num_fds; ++i) {
-        if(kms_response->fds[i].is_combined_plane && !kms_response->fds[i].is_cursor)
-            return &kms_response->fds[i];
-    }
-    return NULL;
-}
-
-static gsr_kms_response_fd* find_largest_drm(gsr_kms_response *kms_response) {
-    if(kms_response->num_fds == 0)
-        return NULL;
-
-    int64_t largest_size = 0;
-    gsr_kms_response_fd *largest_drm = &kms_response->fds[0];
-    for(int i = 0; i < kms_response->num_fds; ++i) {
-        const int64_t size = (int64_t)kms_response->fds[i].width * (int64_t)kms_response->fds[i].height;
-        if(size > largest_size && !kms_response->fds[i].is_cursor) {
-            largest_size = size;
-            largest_drm = &kms_response->fds[i];
-        }
-    }
-    return largest_drm;
-}
-
-static gsr_kms_response_fd* find_cursor_drm(gsr_kms_response *kms_response) {
-    for(int i = 0; i < kms_response->num_fds; ++i) {
-        if(kms_response->fds[i].is_cursor)
-            return &kms_response->fds[i];
-    }
-    return NULL;
-}
-
-static int gsr_capture_kms_vaapi_capture(gsr_capture *cap, AVFrame *frame) {
-    (void)frame;
-    gsr_capture_kms_vaapi *cap_kms = cap->priv;
-
-    for(int i = 0; i < cap_kms->kms_response.num_fds; ++i) {
-        if(cap_kms->kms_response.fds[i].fd > 0)
-            close(cap_kms->kms_response.fds[i].fd);
-        cap_kms->kms_response.fds[i].fd = 0;
-    }
-    cap_kms->kms_response.num_fds = 0;
-
-    gsr_kms_response_fd *drm_fd = NULL;
-    gsr_kms_response_fd *cursor_drm_fd = NULL;
-    bool capture_is_combined_plane = false;
-    if(cap_kms->using_wayland_capture) {
-        gsr_egl_update(cap_kms->params.egl);
-        cap_kms->wayland_kms_data.fd = cap_kms->params.egl->fd;
-        cap_kms->wayland_kms_data.width = cap_kms->params.egl->width;
-        cap_kms->wayland_kms_data.height = cap_kms->params.egl->height;
-        cap_kms->wayland_kms_data.pitch = cap_kms->params.egl->pitch;
-        cap_kms->wayland_kms_data.offset = cap_kms->params.egl->offset;
-        cap_kms->wayland_kms_data.pixel_format = cap_kms->params.egl->pixel_format;
-        cap_kms->wayland_kms_data.modifier = cap_kms->params.egl->modifier;
-        cap_kms->wayland_kms_data.connector_id = 0;
-        cap_kms->wayland_kms_data.is_combined_plane = false;
-        cap_kms->wayland_kms_data.is_cursor = false;
-        cap_kms->wayland_kms_data.x = cap_kms->wayland_kms_data.x; // TODO: Use these
-        cap_kms->wayland_kms_data.y = cap_kms->wayland_kms_data.y;
-        cap_kms->wayland_kms_data.src_w = cap_kms->wayland_kms_data.width;
-        cap_kms->wayland_kms_data.src_h = cap_kms->wayland_kms_data.height;
-
-        cap_kms->capture_pos.x = cap_kms->wayland_kms_data.x;
-        cap_kms->capture_pos.y = cap_kms->wayland_kms_data.y;
-
-        if(cap_kms->wayland_kms_data.fd <= 0)
-            return -1;
-
-        drm_fd = &cap_kms->wayland_kms_data;
-    } else {
-        if(gsr_kms_client_get_kms(&cap_kms->kms_client, &cap_kms->kms_response) != 0) {
-            fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_capture: failed to get kms, error: %d (%s)\n", cap_kms->kms_response.result, cap_kms->kms_response.err_msg);
-            return -1;
-        }
-
-        if(cap_kms->kms_response.num_fds == 0) {
-            static bool error_shown = false;
-            if(!error_shown) {
-                error_shown = true;
-                fprintf(stderr, "gsr error: no drm found, capture will fail\n");
-            }
-            return -1;
-        }
-
-        for(int i = 0; i < cap_kms->monitor_id.num_connector_ids; ++i) {
-            drm_fd = find_drm_by_connector_id(&cap_kms->kms_response, cap_kms->monitor_id.connector_ids[i]);
-            if(drm_fd)
-                break;
-        }
-
-        // Will never happen on wayland unless the target monitor has been disconnected
-        if(!drm_fd) {
-            drm_fd = find_first_combined_drm(&cap_kms->kms_response);
-            if(!drm_fd)
-                drm_fd = find_largest_drm(&cap_kms->kms_response);
-            capture_is_combined_plane = true;
-        }
-
-        cursor_drm_fd = find_cursor_drm(&cap_kms->kms_response);
-    }
-
-    if(!drm_fd)
-        return -1;
-
-    if(!capture_is_combined_plane && cursor_drm_fd && cursor_drm_fd->connector_id != drm_fd->connector_id)
-        cursor_drm_fd = NULL;
-
-    // TODO: This causes a crash sometimes on steam deck, why? is it a driver bug? a vaapi pure version doesn't cause a crash.
-    // Even ffmpeg kmsgrab causes this crash. The error is:
-    // amdgpu: Failed to allocate a buffer:
-    // amdgpu:    size      : 28508160 bytes
-    // amdgpu:    alignment : 2097152 bytes
-    // amdgpu:    domains   : 4
-    // amdgpu:    flags   : 4
-    // amdgpu: Failed to allocate a buffer:
-    // amdgpu:    size      : 28508160 bytes
-    // amdgpu:    alignment : 2097152 bytes
-    // amdgpu:    domains   : 4
-    // amdgpu:    flags   : 4
-    // EE ../jupiter-mesa/src/gallium/drivers/radeonsi/radeon_vcn_enc.c:516 radeon_create_encoder UVD - Can't create CPB buffer.
-    // [hevc_vaapi @ 0x55ea72b09840] Failed to upload encode parameters: 2 (resource allocation failed).
-    // [hevc_vaapi @ 0x55ea72b09840] Encode failed: -5.
-    // Error: avcodec_send_frame failed, error: Input/output error
-    // Assertion pic->display_order == pic->encode_order failed at libavcodec/vaapi_encode_h265.c:765
-    // kms server info: kms client shutdown, shutting down the server
-    const intptr_t img_attr[] = {
-        EGL_LINUX_DRM_FOURCC_EXT,       drm_fd->pixel_format,
-        EGL_WIDTH,                      drm_fd->width,
-        EGL_HEIGHT,                     drm_fd->height,
-        EGL_DMA_BUF_PLANE0_FD_EXT,      drm_fd->fd,
-        EGL_DMA_BUF_PLANE0_OFFSET_EXT,  drm_fd->offset,
-        EGL_DMA_BUF_PLANE0_PITCH_EXT,   drm_fd->pitch,
-        // TODO:
-        //EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT, drm_fd->modifier & 0xFFFFFFFFULL,
-        //EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT, drm_fd->modifier >> 32ULL,
-        EGL_NONE
-    };
-
-    EGLImage image = cap_kms->params.egl->eglCreateImage(cap_kms->params.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr);
-    cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, cap_kms->input_texture);
-    cap_kms->params.egl->glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
-    cap_kms->params.egl->eglDestroyImage(cap_kms->params.egl->egl_display, image);
-    cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-    vec2i capture_pos = cap_kms->capture_pos;
-
-    if(cap_kms->using_wayland_capture) {
-        gsr_color_conversion_draw(&cap_kms->color_conversion, cap_kms->input_texture,
-            (vec2i){0, 0}, cap_kms->capture_size,
-            (vec2i){0, 0}, cap_kms->capture_size,
-            0.0f);
-    } else {
-        if(!capture_is_combined_plane)
-            capture_pos = (vec2i){drm_fd->x, drm_fd->y};
-
-        float texture_rotation = 0.0f;
-
-        gsr_color_conversion_draw(&cap_kms->color_conversion, cap_kms->input_texture,
-            (vec2i){0, 0}, cap_kms->capture_size,
-            capture_pos, cap_kms->capture_size,
-            texture_rotation);
-
-        if(cursor_drm_fd) {
-            const intptr_t img_attr_cursor[] = {
-                EGL_LINUX_DRM_FOURCC_EXT,       cursor_drm_fd->pixel_format,
-                EGL_WIDTH,                      cursor_drm_fd->width,
-                EGL_HEIGHT,                     cursor_drm_fd->height,
-                EGL_DMA_BUF_PLANE0_FD_EXT,      cursor_drm_fd->fd,
-                EGL_DMA_BUF_PLANE0_OFFSET_EXT,  cursor_drm_fd->offset,
-                EGL_DMA_BUF_PLANE0_PITCH_EXT,   cursor_drm_fd->pitch,
-                EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT, cursor_drm_fd->modifier & 0xFFFFFFFFULL,
-                EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT, cursor_drm_fd->modifier >> 32ULL,
-                EGL_NONE
-            };
-
-            EGLImage cursor_image = cap_kms->params.egl->eglCreateImage(cap_kms->params.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr_cursor);
-            cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, cap_kms->cursor_texture);
-            cap_kms->params.egl->glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, cursor_image);
-            cap_kms->params.egl->eglDestroyImage(cap_kms->params.egl->egl_display, cursor_image);
-            cap_kms->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-            vec2i cursor_size = {cursor_drm_fd->width, cursor_drm_fd->height};
-            gsr_color_conversion_draw(&cap_kms->color_conversion, cap_kms->cursor_texture,
-                (vec2i){cursor_drm_fd->x, cursor_drm_fd->y}, cursor_size,
-                (vec2i){0, 0}, cursor_size,
-                0.0f);
-        }
-    }
-
-    cap_kms->params.egl->eglSwapBuffers(cap_kms->params.egl->egl_display, cap_kms->params.egl->egl_surface);
-
-    return 0;
-}
-
-static void gsr_capture_kms_vaapi_capture_end(gsr_capture *cap, AVFrame *frame) {
-    (void)frame;
-    gsr_capture_kms_vaapi *cap_kms = cap->priv;
-
-    gsr_egl_cleanup_frame(cap_kms->params.egl);
-
-    for(int i = 0; i < cap_kms->kms_response.num_fds; ++i) {
-        if(cap_kms->kms_response.fds[i].fd > 0)
-            close(cap_kms->kms_response.fds[i].fd);
-        cap_kms->kms_response.fds[i].fd = 0;
-    }
-    cap_kms->kms_response.num_fds = 0;
-}
-
-static void gsr_capture_kms_vaapi_stop(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_kms_vaapi *cap_kms = cap->priv;
-
-    gsr_color_conversion_deinit(&cap_kms->color_conversion);
-
-    for(uint32_t i = 0; i < cap_kms->prime.num_objects; ++i) {
-        if(cap_kms->prime.objects[i].fd > 0) {
-            close(cap_kms->prime.objects[i].fd);
-            cap_kms->prime.objects[i].fd = 0;
-        }
-    }
-
-    if(cap_kms->params.egl->egl_context) {
-        if(cap_kms->input_texture) {
-            cap_kms->params.egl->glDeleteTextures(1, &cap_kms->input_texture);
-            cap_kms->input_texture = 0;
-        }
-
-        if(cap_kms->cursor_texture) {
-            cap_kms->params.egl->glDeleteTextures(1, &cap_kms->cursor_texture);
-            cap_kms->cursor_texture = 0;
-        }
-
-        cap_kms->params.egl->glDeleteTextures(2, cap_kms->target_textures);
-        cap_kms->target_textures[0] = 0;
-        cap_kms->target_textures[1] = 0;
-    }
-
-    for(int i = 0; i < cap_kms->kms_response.num_fds; ++i) {
-        if(cap_kms->kms_response.fds[i].fd > 0)
-            close(cap_kms->kms_response.fds[i].fd);
-        cap_kms->kms_response.fds[i].fd = 0;
-    }
-    cap_kms->kms_response.num_fds = 0;
-
-    if(video_codec_context->hw_device_ctx)
-        av_buffer_unref(&video_codec_context->hw_device_ctx);
-    if(video_codec_context->hw_frames_ctx)
-        av_buffer_unref(&video_codec_context->hw_frames_ctx);
-
-    gsr_kms_client_deinit(&cap_kms->kms_client);
-}
-
-static void gsr_capture_kms_vaapi_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    (void)video_codec_context;
-    gsr_capture_kms_vaapi *cap_kms = cap->priv;
-    if(cap->priv) {
-        gsr_capture_kms_vaapi_stop(cap, video_codec_context);
-        free((void*)cap_kms->params.display_to_capture);
-        cap_kms->params.display_to_capture = NULL;
-        free(cap->priv);
-        cap->priv = NULL;
-    }
-    free(cap);
-}
-
-gsr_capture* gsr_capture_kms_vaapi_create(const gsr_capture_kms_vaapi_params *params) {
-    if(!params) {
-        fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_create params is NULL\n");
-        return NULL;
-    }
-
-    gsr_capture *cap = calloc(1, sizeof(gsr_capture));
-    if(!cap)
-        return NULL;
-
-    gsr_capture_kms_vaapi *cap_kms = calloc(1, sizeof(gsr_capture_kms_vaapi));
-    if(!cap_kms) {
-        free(cap);
-        return NULL;
-    }
-
-    const char *display_to_capture = strdup(params->display_to_capture);
-    if(!display_to_capture) {
-        /* TODO XCloseDisplay */
-        free(cap);
-        free(cap_kms);
-        return NULL;
-    }
-
-    cap_kms->params = *params;
-    cap_kms->params.display_to_capture = display_to_capture;
-    
-    *cap = (gsr_capture) {
-        .start = gsr_capture_kms_vaapi_start,
-        .tick = gsr_capture_kms_vaapi_tick,
-        .should_stop = gsr_capture_kms_vaapi_should_stop,
-        .capture = gsr_capture_kms_vaapi_capture,
-        .capture_end = gsr_capture_kms_vaapi_capture_end,
-        .destroy = gsr_capture_kms_vaapi_destroy,
-        .priv = cap_kms
-    };
-
-    return cap;
-}
diff --git a/src/capture/nvfbc.c b/src/capture/nvfbc.c
index 5b62310..13b46c3 100644
--- a/src/capture/nvfbc.c
+++ b/src/capture/nvfbc.c
@@ -1,16 +1,18 @@
 #include "../../include/capture/nvfbc.h"
 #include "../../external/NvFBC.h"
-#include "../../include/cuda.h"
+#include "../../include/egl.h"
+#include "../../include/utils.h"
+#include "../../include/color_conversion.h"
+#include "../../include/window/window.h"
+
 #include <dlfcn.h>
 #include <stdlib.h>
 #include <string.h>
 #include <stdio.h>
+#include <math.h>
+#include <assert.h>
+
 #include <X11/Xlib.h>
-#include <libavutil/hwcontext.h>
-#include <libavutil/hwcontext_cuda.h>
-#include <libavutil/frame.h>
-#include <libavutil/version.h>
-#include <libavcodec/avcodec.h>
 
 typedef struct {
     gsr_capture_nvfbc_params params;
@@ -22,16 +24,16 @@ typedef struct {
     bool fbc_handle_created;
     bool capture_session_created;
 
-    gsr_cuda cuda;
-    bool frame_initialized;
-} gsr_capture_nvfbc;
+    NVFBC_TOGL_SETUP_PARAMS setup_params;
 
-#if defined(_WIN64) || defined(__LP64__)
-typedef unsigned long long CUdeviceptr_v2;
-#else
-typedef unsigned int CUdeviceptr_v2;
-#endif
-typedef CUdeviceptr_v2 CUdeviceptr;
+    bool supports_direct_cursor;
+    uint32_t width, height;
+    NVFBC_TRACKING_TYPE tracking_type;
+    uint32_t output_id;
+    uint32_t tracking_width, tracking_height;
+    bool nvfbc_needs_recreate;
+    double nvfbc_dead_start;
+} gsr_capture_nvfbc;
 
 static int max_int(int a, int b) {
     return a > b ? a : b;
@@ -100,7 +102,7 @@ static void set_func_ptr(void **dst, void *src) {
 }
 
 static bool gsr_capture_nvfbc_load_library(gsr_capture *cap) {
-    gsr_capture_nvfbc *cap_nvfbc = cap->priv;
+    gsr_capture_nvfbc *self = cap->priv;
 
     dlerror(); /* clear */
     void *lib = dlopen("libnvidia-fbc.so.1", RTLD_LAZY);
@@ -109,157 +111,84 @@ static bool gsr_capture_nvfbc_load_library(gsr_capture *cap) {
         return false;
     }
 
-    set_func_ptr((void**)&cap_nvfbc->nv_fbc_create_instance, dlsym(lib, "NvFBCCreateInstance"));
-    if(!cap_nvfbc->nv_fbc_create_instance) {
+    set_func_ptr((void**)&self->nv_fbc_create_instance, dlsym(lib, "NvFBCCreateInstance"));
+    if(!self->nv_fbc_create_instance) {
         fprintf(stderr, "gsr error: unable to resolve symbol 'NvFBCCreateInstance'\n");
         dlclose(lib);
         return false;
     }
 
-    memset(&cap_nvfbc->nv_fbc_function_list, 0, sizeof(cap_nvfbc->nv_fbc_function_list));
-    cap_nvfbc->nv_fbc_function_list.dwVersion = NVFBC_VERSION;
-    NVFBCSTATUS status = cap_nvfbc->nv_fbc_create_instance(&cap_nvfbc->nv_fbc_function_list);
+    memset(&self->nv_fbc_function_list, 0, sizeof(self->nv_fbc_function_list));
+    self->nv_fbc_function_list.dwVersion = NVFBC_VERSION;
+    NVFBCSTATUS status = self->nv_fbc_create_instance(&self->nv_fbc_function_list);
     if(status != NVFBC_SUCCESS) {
         fprintf(stderr, "gsr error: failed to create NvFBC instance (status: %d)\n", status);
         dlclose(lib);
         return false;
     }
 
-    cap_nvfbc->library = lib;
+    self->library = lib;
     return true;
 }
 
-#if LIBAVUTIL_VERSION_MAJOR < 57
-static AVBufferRef* dummy_hw_frame_init(int size) {
-    return av_buffer_alloc(size);
-}
-#else
-static AVBufferRef* dummy_hw_frame_init(size_t size) {
-    return av_buffer_alloc(size);
-}
-#endif
-
-static bool ffmpeg_create_cuda_contexts(gsr_capture_nvfbc *cap_nvfbc, AVCodecContext *video_codec_context) {
-    AVBufferRef *device_ctx = av_hwdevice_ctx_alloc(AV_HWDEVICE_TYPE_CUDA);
-    if(!device_ctx) {
-        fprintf(stderr, "gsr error: cuda_create_codec_context failed: failed to create hardware device context\n");
-        return false;
+static void gsr_capture_nvfbc_destroy_session(gsr_capture_nvfbc *self) {
+    if(self->fbc_handle_created && self->capture_session_created) {
+        NVFBC_DESTROY_CAPTURE_SESSION_PARAMS destroy_capture_params;
+        memset(&destroy_capture_params, 0, sizeof(destroy_capture_params));
+        destroy_capture_params.dwVersion = NVFBC_DESTROY_CAPTURE_SESSION_PARAMS_VER;
+        self->nv_fbc_function_list.nvFBCDestroyCaptureSession(self->nv_fbc_handle, &destroy_capture_params);
+        self->capture_session_created = false;
     }
-
-    AVHWDeviceContext *hw_device_context = (AVHWDeviceContext*)device_ctx->data;
-    AVCUDADeviceContext *cuda_device_context = (AVCUDADeviceContext*)hw_device_context->hwctx;
-    cuda_device_context->cuda_ctx = cap_nvfbc->cuda.cu_ctx;
-    if(av_hwdevice_ctx_init(device_ctx) < 0) {
-        fprintf(stderr, "gsr error: cuda_create_codec_context failed: failed to create hardware device context\n");
-        av_buffer_unref(&device_ctx);
-        return false;
-    }
-
-    AVBufferRef *frame_context = av_hwframe_ctx_alloc(device_ctx);
-    if(!frame_context) {
-        fprintf(stderr, "gsr error: cuda_create_codec_context failed: failed to create hwframe context\n");
-        av_buffer_unref(&device_ctx);
-        return false;
-    }
-
-    AVHWFramesContext *hw_frame_context = (AVHWFramesContext*)frame_context->data;
-    hw_frame_context->width = video_codec_context->width;
-    hw_frame_context->height = video_codec_context->height;
-    hw_frame_context->sw_format = AV_PIX_FMT_BGR0;
-    hw_frame_context->format = video_codec_context->pix_fmt;
-    hw_frame_context->device_ref = device_ctx;
-    hw_frame_context->device_ctx = (AVHWDeviceContext*)device_ctx->data;
-
-    hw_frame_context->pool = av_buffer_pool_init(1, dummy_hw_frame_init);
-    hw_frame_context->initial_pool_size = 1;
-
-    if (av_hwframe_ctx_init(frame_context) < 0) {
-        fprintf(stderr, "gsr error: cuda_create_codec_context failed: failed to initialize hardware frame context "
-                        "(note: ffmpeg version needs to be > 4.0)\n");
-        av_buffer_unref(&device_ctx);
-        //av_buffer_unref(&frame_context);
-        return false;
-    }
-
-    video_codec_context->hw_device_ctx = av_buffer_ref(device_ctx);
-    video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
-    return true;
 }
 
-static int gsr_capture_nvfbc_start(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_nvfbc *cap_nvfbc = cap->priv;
-    if(!gsr_cuda_load(&cap_nvfbc->cuda, cap_nvfbc->params.dpy, cap_nvfbc->params.overclock))
-        return -1;
-
-    if(!gsr_capture_nvfbc_load_library(cap)) {
-        gsr_cuda_unload(&cap_nvfbc->cuda);
-        return -1;
+static void gsr_capture_nvfbc_destroy_handle(gsr_capture_nvfbc *self) {
+    if(self->fbc_handle_created) {
+        NVFBC_DESTROY_HANDLE_PARAMS destroy_params;
+        memset(&destroy_params, 0, sizeof(destroy_params));
+        destroy_params.dwVersion = NVFBC_DESTROY_HANDLE_PARAMS_VER;
+        self->nv_fbc_function_list.nvFBCDestroyHandle(self->nv_fbc_handle, &destroy_params);
+        self->fbc_handle_created = false;
+        self->nv_fbc_handle = 0;
     }
+}
 
-    const uint32_t x = max_int(cap_nvfbc->params.pos.x, 0);
-    const uint32_t y = max_int(cap_nvfbc->params.pos.y, 0);
-    const uint32_t width = max_int(cap_nvfbc->params.size.x, 0);
-    const uint32_t height = max_int(cap_nvfbc->params.size.y, 0);
-
-    const bool capture_region = (x > 0 || y > 0 || width > 0 || height > 0);
-
-    bool supports_direct_cursor = false;
-    bool direct_capture = cap_nvfbc->params.direct_capture;
-    int driver_major_version = 0;
-    int driver_minor_version = 0;
-    if(direct_capture && get_driver_version(&driver_major_version, &driver_minor_version)) {
-        fprintf(stderr, "Info: detected nvidia version: %d.%d\n", driver_major_version, driver_minor_version);
-
-        // TODO:
-        if(version_at_least(driver_major_version, driver_minor_version, 515, 57) && version_less_than(driver_major_version, driver_minor_version, 520, 56)) {
-            direct_capture = false;
-            fprintf(stderr, "Warning: \"screen-direct\" has temporary been disabled as it causes stuttering with driver versions >= 515.57 and < 520.56. Please update your driver if possible. Capturing \"screen\" instead.\n");
-        }
-
-        // TODO:
-        // Cursor capture disabled because moving the cursor doesn't update capture rate to monitor hz and instead captures at 10-30 hz
-        /*
-        if(direct_capture) {
-            if(version_at_least(driver_major_version, driver_minor_version, 515, 57))
-                supports_direct_cursor = true;
-            else
-                fprintf(stderr, "Info: capturing \"screen-direct\" but driver version appears to be less than 515.57. Disabling capture of cursor. Please update your driver if you want to capture your cursor or record \"screen\" instead.\n");
-        }
-        */
-    }
+static void gsr_capture_nvfbc_destroy_session_and_handle(gsr_capture_nvfbc *self) {
+    gsr_capture_nvfbc_destroy_session(self);
+    gsr_capture_nvfbc_destroy_handle(self);
+}
 
+static int gsr_capture_nvfbc_setup_handle(gsr_capture_nvfbc *self) {
     NVFBCSTATUS status;
-    NVFBC_TRACKING_TYPE tracking_type;
-    uint32_t output_id = 0;
-    cap_nvfbc->fbc_handle_created = false;
-    cap_nvfbc->capture_session_created = false;
 
     NVFBC_CREATE_HANDLE_PARAMS create_params;
     memset(&create_params, 0, sizeof(create_params));
     create_params.dwVersion = NVFBC_CREATE_HANDLE_PARAMS_VER;
+    create_params.bExternallyManagedContext = NVFBC_TRUE;
+    create_params.glxCtx = self->params.egl->glx_context;
+    create_params.glxFBConfig = self->params.egl->glx_fb_config;
 
-    status = cap_nvfbc->nv_fbc_function_list.nvFBCCreateHandle(&cap_nvfbc->nv_fbc_handle, &create_params);
+    status = self->nv_fbc_function_list.nvFBCCreateHandle(&self->nv_fbc_handle, &create_params);
     if(status != NVFBC_SUCCESS) {
         // Reverse engineering for interoperability
         const uint8_t enable_key[] = { 0xac, 0x10, 0xc9, 0x2e, 0xa5, 0xe6, 0x87, 0x4f, 0x8f, 0x4b, 0xf4, 0x61, 0xf8, 0x56, 0x27, 0xe9 };
         create_params.privateData = enable_key;
         create_params.privateDataSize = 16;
 
-        status = cap_nvfbc->nv_fbc_function_list.nvFBCCreateHandle(&cap_nvfbc->nv_fbc_handle, &create_params);
+        status = self->nv_fbc_function_list.nvFBCCreateHandle(&self->nv_fbc_handle, &create_params);
         if(status != NVFBC_SUCCESS) {
-            fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle));
+            fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", self->nv_fbc_function_list.nvFBCGetLastErrorStr(self->nv_fbc_handle));
             goto error_cleanup;
         }
     }
-    cap_nvfbc->fbc_handle_created = true;
+    self->fbc_handle_created = true;
 
     NVFBC_GET_STATUS_PARAMS status_params;
     memset(&status_params, 0, sizeof(status_params));
     status_params.dwVersion = NVFBC_GET_STATUS_PARAMS_VER;
 
-    status = cap_nvfbc->nv_fbc_function_list.nvFBCGetStatus(cap_nvfbc->nv_fbc_handle, &status_params);
+    status = self->nv_fbc_function_list.nvFBCGetStatus(self->nv_fbc_handle, &status_params);
     if(status != NVFBC_SUCCESS) {
-        fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle));
+        fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", self->nv_fbc_function_list.nvFBCGetLastErrorStr(self->nv_fbc_handle));
         goto error_cleanup;
     }
 
@@ -268,10 +197,13 @@ static int gsr_capture_nvfbc_start(gsr_capture *cap, AVCodecContext *video_codec
         goto error_cleanup;
     }
 
-    uint32_t tracking_width = XWidthOfScreen(DefaultScreenOfDisplay(cap_nvfbc->params.dpy));
-    uint32_t tracking_height = XHeightOfScreen(DefaultScreenOfDisplay(cap_nvfbc->params.dpy));
-    tracking_type = strcmp(cap_nvfbc->params.display_to_capture, "screen") == 0 ? NVFBC_TRACKING_SCREEN : NVFBC_TRACKING_OUTPUT;
-    if(tracking_type == NVFBC_TRACKING_OUTPUT) {
+    assert(gsr_window_get_display_server(self->params.egl->window) == GSR_DISPLAY_SERVER_X11);
+    Display *display = gsr_window_get_display(self->params.egl->window);
+
+    self->tracking_width = XWidthOfScreen(DefaultScreenOfDisplay(display));
+    self->tracking_height = XHeightOfScreen(DefaultScreenOfDisplay(display));
+    self->tracking_type = strcmp(self->params.display_to_capture, "screen") == 0 ? NVFBC_TRACKING_SCREEN : NVFBC_TRACKING_OUTPUT;
+    if(self->tracking_type == NVFBC_TRACKING_OUTPUT) {
         if(!status_params.bXRandRAvailable) {
             fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: the xrandr extension is not available\n");
             goto error_cleanup;
@@ -282,176 +214,201 @@ static int gsr_capture_nvfbc_start(gsr_capture *cap, AVCodecContext *video_codec
             goto error_cleanup;
         }
 
-        output_id = get_output_id_from_display_name(status_params.outputs, status_params.dwOutputNum, cap_nvfbc->params.display_to_capture, &tracking_width, &tracking_height);
-        if(output_id == 0) {
-            fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: display '%s' not found\n", cap_nvfbc->params.display_to_capture);
+        self->output_id = get_output_id_from_display_name(status_params.outputs, status_params.dwOutputNum, self->params.display_to_capture, &self->tracking_width, &self->tracking_height);
+        if(self->output_id == 0) {
+            fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: display '%s' not found\n", self->params.display_to_capture);
             goto error_cleanup;
         }
     }
 
+    self->width = self->tracking_width;
+    self->height = self->tracking_height;
+    return 0;
+
+    error_cleanup:
+    gsr_capture_nvfbc_destroy_session_and_handle(self);
+    return -1;
+}
+
+static int gsr_capture_nvfbc_setup_session(gsr_capture_nvfbc *self) {
     NVFBC_CREATE_CAPTURE_SESSION_PARAMS create_capture_params;
     memset(&create_capture_params, 0, sizeof(create_capture_params));
     create_capture_params.dwVersion = NVFBC_CREATE_CAPTURE_SESSION_PARAMS_VER;
-    create_capture_params.eCaptureType = NVFBC_CAPTURE_SHARED_CUDA;
-    create_capture_params.bWithCursor = (!direct_capture || supports_direct_cursor) ? NVFBC_TRUE : NVFBC_FALSE;
-    if(capture_region)
-        create_capture_params.captureBox = (NVFBC_BOX){ x, y, width, height };
-    create_capture_params.eTrackingType = tracking_type;
-    create_capture_params.dwSamplingRateMs = 1000u / ((uint32_t)cap_nvfbc->params.fps + 1);
-    create_capture_params.bAllowDirectCapture = direct_capture ? NVFBC_TRUE : NVFBC_FALSE;
-    create_capture_params.bPushModel = direct_capture ? NVFBC_TRUE : NVFBC_FALSE;
-    //create_capture_params.bDisableAutoModesetRecovery = true; // TODO:
-    if(tracking_type == NVFBC_TRACKING_OUTPUT)
-        create_capture_params.dwOutputId = output_id;
-
-    status = cap_nvfbc->nv_fbc_function_list.nvFBCCreateCaptureSession(cap_nvfbc->nv_fbc_handle, &create_capture_params);
+    create_capture_params.eCaptureType = NVFBC_CAPTURE_TO_GL;
+    create_capture_params.bWithCursor = (!self->params.direct_capture || self->supports_direct_cursor) ? NVFBC_TRUE : NVFBC_FALSE;
+    if(!self->params.record_cursor)
+        create_capture_params.bWithCursor = NVFBC_FALSE;
+    create_capture_params.eTrackingType = self->tracking_type;
+    create_capture_params.dwSamplingRateMs = (uint32_t)ceilf(1000.0f / (float)self->params.fps);
+    create_capture_params.bAllowDirectCapture = self->params.direct_capture ? NVFBC_TRUE : NVFBC_FALSE;
+    create_capture_params.bPushModel = self->params.direct_capture ? NVFBC_TRUE : NVFBC_FALSE;
+    create_capture_params.bDisableAutoModesetRecovery = NVFBC_TRUE;
+    if(self->tracking_type == NVFBC_TRACKING_OUTPUT)
+        create_capture_params.dwOutputId = self->output_id;
+
+    NVFBCSTATUS status = self->nv_fbc_function_list.nvFBCCreateCaptureSession(self->nv_fbc_handle, &create_capture_params);
     if(status != NVFBC_SUCCESS) {
-        fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle));
-        goto error_cleanup;
+        fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", self->nv_fbc_function_list.nvFBCGetLastErrorStr(self->nv_fbc_handle));
+        return -1;
     }
-    cap_nvfbc->capture_session_created = true;
+    self->capture_session_created = true;
 
-    NVFBC_TOCUDA_SETUP_PARAMS setup_params;
-    memset(&setup_params, 0, sizeof(setup_params));
-    setup_params.dwVersion = NVFBC_TOCUDA_SETUP_PARAMS_VER;
-    setup_params.eBufferFormat = NVFBC_BUFFER_FORMAT_BGRA;
+    memset(&self->setup_params, 0, sizeof(self->setup_params));
+    self->setup_params.dwVersion = NVFBC_TOGL_SETUP_PARAMS_VER;
+    self->setup_params.eBufferFormat = NVFBC_BUFFER_FORMAT_BGRA;
 
-    status = cap_nvfbc->nv_fbc_function_list.nvFBCToCudaSetUp(cap_nvfbc->nv_fbc_handle, &setup_params);
+    status = self->nv_fbc_function_list.nvFBCToGLSetUp(self->nv_fbc_handle, &self->setup_params);
     if(status != NVFBC_SUCCESS) {
-        fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle));
-        goto error_cleanup;
+        fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", self->nv_fbc_function_list.nvFBCGetLastErrorStr(self->nv_fbc_handle));
+        gsr_capture_nvfbc_destroy_session(self);
+        return -1;
     }
 
-    if(capture_region) {
-        video_codec_context->width = width & ~1;
-        video_codec_context->height = height & ~1;
-    } else {
-        video_codec_context->width = tracking_width & ~1;
-        video_codec_context->height = tracking_height & ~1;
+    return 0;
+}
+
+static void gsr_capture_nvfbc_stop(gsr_capture_nvfbc *self) {
+    gsr_capture_nvfbc_destroy_session_and_handle(self);
+    if(self->library) {
+        dlclose(self->library);
+        self->library = NULL;
+    }
+    if(self->params.display_to_capture) {
+        free((void*)self->params.display_to_capture);
+        self->params.display_to_capture = NULL;
     }
+}
 
-    if(!ffmpeg_create_cuda_contexts(cap_nvfbc, video_codec_context))
-        goto error_cleanup;
+static int gsr_capture_nvfbc_start(gsr_capture *cap, gsr_capture_metadata *capture_metadata) {
+    gsr_capture_nvfbc *self = cap->priv;
 
-    return 0;
+    if(!gsr_capture_nvfbc_load_library(cap))
+        return -1;
 
-    error_cleanup:
-    if(cap_nvfbc->fbc_handle_created) {
-        if(cap_nvfbc->capture_session_created) {
-            NVFBC_DESTROY_CAPTURE_SESSION_PARAMS destroy_capture_params;
-            memset(&destroy_capture_params, 0, sizeof(destroy_capture_params));
-            destroy_capture_params.dwVersion = NVFBC_DESTROY_CAPTURE_SESSION_PARAMS_VER;
-            cap_nvfbc->nv_fbc_function_list.nvFBCDestroyCaptureSession(cap_nvfbc->nv_fbc_handle, &destroy_capture_params);
-            cap_nvfbc->capture_session_created = false;
+    self->supports_direct_cursor = false;
+    int driver_major_version = 0;
+    int driver_minor_version = 0;
+    if(self->params.direct_capture && get_driver_version(&driver_major_version, &driver_minor_version)) {
+        fprintf(stderr, "gsr info: detected nvidia version: %d.%d\n", driver_major_version, driver_minor_version);
+
+        // TODO:
+        if(version_at_least(driver_major_version, driver_minor_version, 515, 57) && version_less_than(driver_major_version, driver_minor_version, 520, 56)) {
+            self->params.direct_capture = false;
+            fprintf(stderr, "gsr warning: \"screen-direct\" has temporarily been disabled as it causes stuttering with driver versions >= 515.57 and < 520.56. Please update your driver if possible. Capturing \"screen\" instead.\n");
         }
 
-        NVFBC_DESTROY_HANDLE_PARAMS destroy_params;
-        memset(&destroy_params, 0, sizeof(destroy_params));
-        destroy_params.dwVersion = NVFBC_DESTROY_HANDLE_PARAMS_VER;
-        cap_nvfbc->nv_fbc_function_list.nvFBCDestroyHandle(cap_nvfbc->nv_fbc_handle, &destroy_params);
-        cap_nvfbc->fbc_handle_created = false;
+        // TODO:
+        // Cursor capture disabled because moving the cursor doesn't update capture rate to monitor hz and instead captures at 10-30 hz
+        /*
+        if(self->params.direct_capture) {
+            if(version_at_least(driver_major_version, driver_minor_version, 515, 57))
+                self->supports_direct_cursor = true;
+            else
+                fprintf(stderr, "gsr info: capturing \"screen-direct\" but driver version appears to be less than 515.57. Disabling capture of cursor. Please update your driver if you want to capture your cursor or record \"screen\" instead.\n");
+        }
+        */
     }
 
-    if(video_codec_context->hw_device_ctx)
-        av_buffer_unref(&video_codec_context->hw_device_ctx);
-    if(video_codec_context->hw_frames_ctx)
-        av_buffer_unref(&video_codec_context->hw_frames_ctx);
-
-    gsr_cuda_unload(&cap_nvfbc->cuda);
-    return -1;
-}
+    if(gsr_capture_nvfbc_setup_handle(self) != 0) {
+        goto error_cleanup;
+    }
 
-static void gsr_capture_nvfbc_destroy_session(gsr_capture *cap) {
-    gsr_capture_nvfbc *cap_nvfbc = cap->priv;
+    if(gsr_capture_nvfbc_setup_session(self) != 0) {
+        goto error_cleanup;
+    }
 
-    if(cap_nvfbc->fbc_handle_created) {
-        if(cap_nvfbc->capture_session_created) {
-            NVFBC_DESTROY_CAPTURE_SESSION_PARAMS destroy_capture_params;
-            memset(&destroy_capture_params, 0, sizeof(destroy_capture_params));
-            destroy_capture_params.dwVersion = NVFBC_DESTROY_CAPTURE_SESSION_PARAMS_VER;
-            cap_nvfbc->nv_fbc_function_list.nvFBCDestroyCaptureSession(cap_nvfbc->nv_fbc_handle, &destroy_capture_params);
-            cap_nvfbc->capture_session_created = false;
-        }
+    capture_metadata->width = self->tracking_width;
+    capture_metadata->height = self->tracking_height;
 
-        NVFBC_DESTROY_HANDLE_PARAMS destroy_params;
-        memset(&destroy_params, 0, sizeof(destroy_params));
-        destroy_params.dwVersion = NVFBC_DESTROY_HANDLE_PARAMS_VER;
-        cap_nvfbc->nv_fbc_function_list.nvFBCDestroyHandle(cap_nvfbc->nv_fbc_handle, &destroy_params);
-        cap_nvfbc->fbc_handle_created = false;
+    if(self->params.output_resolution.x > 0 && self->params.output_resolution.y > 0) {
+        self->params.output_resolution = scale_keep_aspect_ratio((vec2i){capture_metadata->width, capture_metadata->height}, self->params.output_resolution);
+        capture_metadata->width = self->params.output_resolution.x;
+        capture_metadata->height = self->params.output_resolution.y;
+    } else if(self->params.region_size.x > 0 && self->params.region_size.y > 0) {
+        capture_metadata->width = self->params.region_size.x;
+        capture_metadata->height = self->params.region_size.y;
     }
 
-    cap_nvfbc->nv_fbc_handle = 0;
+    return 0;
+
+    error_cleanup:
+    gsr_capture_nvfbc_stop(self);
+    return -1;
 }
 
-static void gsr_capture_nvfbc_tick(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame **frame) {
-    gsr_capture_nvfbc *cap_nvfbc = cap->priv;
-    if(!cap_nvfbc->frame_initialized && video_codec_context->hw_frames_ctx) {
-        cap_nvfbc->frame_initialized = true;
-        (*frame)->hw_frames_ctx = video_codec_context->hw_frames_ctx;
-        (*frame)->buf[0] = av_buffer_pool_get(((AVHWFramesContext*)video_codec_context->hw_frames_ctx->data)->pool);
-        (*frame)->extended_data = (*frame)->data;
-        (*frame)->color_range = video_codec_context->color_range;
-        (*frame)->color_primaries = video_codec_context->color_primaries;
-        (*frame)->color_trc = video_codec_context->color_trc;
-        (*frame)->colorspace = video_codec_context->colorspace;
-        (*frame)->chroma_location = video_codec_context->chroma_sample_location;
+static int gsr_capture_nvfbc_capture(gsr_capture *cap, gsr_capture_metadata *capture_metadata, gsr_color_conversion *color_conversion) {
+    gsr_capture_nvfbc *self = cap->priv;
+
+    const double nvfbc_recreate_retry_time_seconds = 1.0;
+    if(self->nvfbc_needs_recreate) {
+        const double now = clock_get_monotonic_seconds();
+        if(now - self->nvfbc_dead_start >= nvfbc_recreate_retry_time_seconds) {
+            self->nvfbc_dead_start = now;
+            gsr_capture_nvfbc_destroy_session_and_handle(self);
+
+            if(gsr_capture_nvfbc_setup_handle(self) != 0) {
+                fprintf(stderr, "gsr error: gsr_capture_nvfbc_capture failed to recreate nvfbc handle, trying again in %f second(s)\n", nvfbc_recreate_retry_time_seconds);
+                return -1;
+            }
+
+            if(gsr_capture_nvfbc_setup_session(self) != 0) {
+                fprintf(stderr, "gsr error: gsr_capture_nvfbc_capture failed to recreate nvfbc session, trying again in %f second(s)\n", nvfbc_recreate_retry_time_seconds);
+                return -1;
+            }
+
+            self->nvfbc_needs_recreate = false;
+        } else {
+            return 0;
+        }
     }
-}
 
-static int gsr_capture_nvfbc_capture(gsr_capture *cap, AVFrame *frame) {
-    gsr_capture_nvfbc *cap_nvfbc = cap->priv;
+    vec2i frame_size = (vec2i){self->width, self->height};
+    const vec2i original_frame_size = frame_size;
+    if(self->params.region_size.x > 0 && self->params.region_size.y > 0)
+        frame_size = self->params.region_size;
 
-    CUdeviceptr cu_device_ptr = 0;
+    const bool is_scaled = self->params.output_resolution.x > 0 && self->params.output_resolution.y > 0;
+    vec2i output_size = is_scaled ? self->params.output_resolution : frame_size;
+    output_size = scale_keep_aspect_ratio(frame_size, output_size);
+
+    const vec2i target_pos = { max_int(0, capture_metadata->width / 2 - output_size.x / 2), max_int(0, capture_metadata->height / 2 - output_size.y / 2) };
 
     NVFBC_FRAME_GRAB_INFO frame_info;
     memset(&frame_info, 0, sizeof(frame_info));
 
-    NVFBC_TOCUDA_GRAB_FRAME_PARAMS grab_params;
+    NVFBC_TOGL_GRAB_FRAME_PARAMS grab_params;
     memset(&grab_params, 0, sizeof(grab_params));
-    grab_params.dwVersion = NVFBC_TOCUDA_GRAB_FRAME_PARAMS_VER;
-    grab_params.dwFlags = NVFBC_TOCUDA_GRAB_FLAGS_NOWAIT;/* | NVFBC_TOCUDA_GRAB_FLAGS_FORCE_REFRESH;*/
+    grab_params.dwVersion = NVFBC_TOGL_GRAB_FRAME_PARAMS_VER;
+    grab_params.dwFlags = NVFBC_TOGL_GRAB_FLAGS_NOWAIT | NVFBC_TOGL_GRAB_FLAGS_FORCE_REFRESH; // TODO: Remove NVFBC_TOGL_GRAB_FLAGS_FORCE_REFRESH
     grab_params.pFrameGrabInfo = &frame_info;
-    grab_params.pCUDADeviceBuffer = &cu_device_ptr;
     grab_params.dwTimeoutMs = 0;
 
-    NVFBCSTATUS status = cap_nvfbc->nv_fbc_function_list.nvFBCToCudaGrabFrame(cap_nvfbc->nv_fbc_handle, &grab_params);
+    NVFBCSTATUS status = self->nv_fbc_function_list.nvFBCToGLGrabFrame(self->nv_fbc_handle, &grab_params);
     if(status != NVFBC_SUCCESS) {
-        fprintf(stderr, "gsr error: gsr_capture_nvfbc_capture failed: %s\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle));
-        return -1;
+        fprintf(stderr, "gsr error: gsr_capture_nvfbc_capture failed: %s (%d), recreating session after %f second(s)\n", self->nv_fbc_function_list.nvFBCGetLastErrorStr(self->nv_fbc_handle), status, nvfbc_recreate_retry_time_seconds);
+        self->nvfbc_needs_recreate = true;
+        self->nvfbc_dead_start = clock_get_monotonic_seconds();
+        return 0;
     }
 
-    /*
-        *byte_size = frame_info.dwByteSize;
+    //self->params.egl->glFlush();
+    //self->params.egl->glFinish();
 
-        TODO: Check bIsNewFrame
-        TODO: Check dwWidth and dwHeight and update size in video output in ffmpeg. This can happen when xrandr is used to change monitor resolution
-    */
+    gsr_color_conversion_draw(color_conversion, self->setup_params.dwTextures[grab_params.dwTextureIndex],
+        target_pos, (vec2i){output_size.x, output_size.y},
+        self->params.region_position, frame_size, original_frame_size,
+        GSR_ROT_0, GSR_SOURCE_COLOR_BGR, false, false);
+
+    //self->params.egl->glFlush();
+    //self->params.egl->glFinish();
 
-    frame->data[0] = (uint8_t*)cu_device_ptr;
-    //frame->data[1] = (uint8_t*)cu_device_ptr;
-    //frame->data[2] = (uint8_t*)cu_device_ptr;
-    frame->linesize[0] = frame->width * 4;
-    // TODO: Use these when outputting yuv444 by changing nvfbc color to YUV444P and sw_format to YUV444P
-    //frame->linesize[1] = frame->width * 1;
-    //frame->linesize[2] = frame->width * 1;
     return 0;
 }
 
-static void gsr_capture_nvfbc_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_nvfbc *cap_nvfbc = cap->priv;
-    gsr_capture_nvfbc_destroy_session(cap);
-    if(video_codec_context->hw_device_ctx)
-        av_buffer_unref(&video_codec_context->hw_device_ctx);
-    if(video_codec_context->hw_frames_ctx)
-        av_buffer_unref(&video_codec_context->hw_frames_ctx);
-    if(cap_nvfbc) {
-        gsr_cuda_unload(&cap_nvfbc->cuda);
-        dlclose(cap_nvfbc->library);
-        free((void*)cap_nvfbc->params.display_to_capture);
-        cap_nvfbc->params.display_to_capture = NULL;
-        free(cap->priv);
-        cap->priv = NULL;
-    }
+static void gsr_capture_nvfbc_destroy(gsr_capture *cap) {
+    gsr_capture_nvfbc *self = cap->priv;
+    gsr_capture_nvfbc_stop(self);
+    free(cap->priv);
     free(cap);
 }
 
@@ -486,13 +443,13 @@ gsr_capture* gsr_capture_nvfbc_create(const gsr_capture_nvfbc_params *params) {
     cap_nvfbc->params = *params;
     cap_nvfbc->params.display_to_capture = display_to_capture;
     cap_nvfbc->params.fps = max_int(cap_nvfbc->params.fps, 1);
-    
+
     *cap = (gsr_capture) {
         .start = gsr_capture_nvfbc_start,
-        .tick = gsr_capture_nvfbc_tick,
+        .tick = NULL,
         .should_stop = NULL,
         .capture = gsr_capture_nvfbc_capture,
-        .capture_end = NULL,
+        .uses_external_image = NULL,
         .destroy = gsr_capture_nvfbc_destroy,
         .priv = cap_nvfbc
     };
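Note on the nvfbc.c change above: gsr_capture_nvfbc_capture no longer returns an error when a frame grab fails. It records the failure time and retries handle and session creation at most once per nvfbc_recreate_retry_time_seconds. The sketch below isolates that cooldown bookkeeping so it can be read without the NvFBC calls. The names retry_state, retry_mark_failed, retry_due and retry_mark_recovered are hypothetical and do not exist in the patch; they only mirror the nvfbc_needs_recreate and nvfbc_dead_start fields.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for nvfbc_needs_recreate / nvfbc_dead_start. */
typedef struct {
    bool needs_recreate; /* set when a frame grab fails */
    double dead_start;   /* monotonic time of the failure or of the last retry */
} retry_state;

/* Called when grabbing a frame fails: start the cooldown. */
static void retry_mark_failed(retry_state *self, double now) {
    assert(self);
    self->needs_recreate = true;
    self->dead_start = now;
}

/* Called when the handle and session were recreated successfully. */
static void retry_mark_recovered(retry_state *self) {
    assert(self);
    self->needs_recreate = false;
}

/*
    Returns true when the caller should attempt a recreate right now.
    The cooldown restarts on every attempt, so a failed attempt is
    not repeated until retry_seconds have passed again.
*/
static bool retry_due(retry_state *self, double now, double retry_seconds) {
    assert(self);
    if(!self->needs_recreate)
        return false;
    if(now - self->dead_start < retry_seconds)
        return false;
    self->dead_start = now;
    return true;
}
```

In terms of the patch, a capture call would skip the frame and return 0 while retry_due is false and a recreate is pending, and would call retry_mark_recovered only after both setup steps succeed.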
diff --git a/src/capture/portal.c b/src/capture/portal.c
new file mode 100644
index 0000000..581a2ed
--- /dev/null
+++ b/src/capture/portal.c
@@ -0,0 +1,416 @@
+#include "../../include/capture/portal.h"
+#include "../../include/color_conversion.h"
+#include "../../include/egl.h"
+#include "../../include/utils.h"
+#include "../../dbus/client/dbus_client.h"
+#include "../../include/pipewire_video.h"
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <limits.h>
+#include <assert.h>
+
+typedef struct {
+    gsr_capture_portal_params params;
+
+    gsr_texture_map texture_map;
+
+    gsr_dbus_client dbus_client;
+    char session_handle[128];
+
+    gsr_pipewire_video pipewire;
+    vec2i capture_size;
+    gsr_pipewire_video_dmabuf_data dmabuf_data[GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES];
+    int num_dmabuf_data;
+
+    gsr_pipewire_video_region region;
+    gsr_pipewire_video_region cursor_region;
+    uint32_t pipewire_fourcc;
+    uint64_t pipewire_modifiers;
+    bool using_external_image;
+} gsr_capture_portal;
+
+static void gsr_capture_portal_cleanup_plane_fds(gsr_capture_portal *self) {
+    for(int i = 0; i < self->num_dmabuf_data; ++i) {
+        if(self->dmabuf_data[i].fd > 0) {
+            close(self->dmabuf_data[i].fd);
+            self->dmabuf_data[i].fd = 0;
+        }
+    }
+    self->num_dmabuf_data = 0;
+}
+
+static void gsr_capture_portal_stop(gsr_capture_portal *self) {
+    if(self->texture_map.texture_id) {
+        self->params.egl->glDeleteTextures(1, &self->texture_map.texture_id);
+        self->texture_map.texture_id = 0;
+    }
+
+    if(self->texture_map.external_texture_id) {
+        self->params.egl->glDeleteTextures(1, &self->texture_map.external_texture_id);
+        self->texture_map.external_texture_id = 0;
+    }
+
+    if(self->texture_map.cursor_texture_id) {
+        self->params.egl->glDeleteTextures(1, &self->texture_map.cursor_texture_id);
+        self->texture_map.cursor_texture_id = 0;
+    }
+
+    gsr_capture_portal_cleanup_plane_fds(self);
+    gsr_pipewire_video_deinit(&self->pipewire);
+    gsr_dbus_client_deinit(&self->dbus_client);
+}
+
+static void gsr_capture_portal_create_input_textures(gsr_capture_portal *self) {
+    self->params.egl->glGenTextures(1, &self->texture_map.texture_id);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, self->texture_map.texture_id);
+    self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+    self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+
+    self->params.egl->glGenTextures(1, &self->texture_map.external_texture_id);
+    self->params.egl->glBindTexture(GL_TEXTURE_EXTERNAL_OES, self->texture_map.external_texture_id);
+    self->params.egl->glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+    self->params.egl->glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+    self->params.egl->glBindTexture(GL_TEXTURE_EXTERNAL_OES, 0);
+
+    self->params.egl->glGenTextures(1, &self->texture_map.cursor_texture_id);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, self->texture_map.cursor_texture_id);
+    self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+    self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+}
+
+static void get_default_gpu_screen_recorder_restore_token_path(char *buffer, size_t buffer_size) {
+    const char *xdg_config_home = getenv("XDG_CONFIG_HOME");
+    if(xdg_config_home) {
+        snprintf(buffer, buffer_size, "%s/gpu-screen-recorder/restore_token", xdg_config_home);
+    } else {
+        const char *home = getenv("HOME");
+        if(!home)
+            home = "/tmp";
+        snprintf(buffer, buffer_size, "%s/.config/gpu-screen-recorder/restore_token", home);
+    }
+}
+
+static bool create_directory_to_file(const char *filepath) {
+    char dir[PATH_MAX];
+    dir[0] = '\0';
+
+    const char *split = strrchr(filepath, '/');
+    if(!split) /* Assuming it's the current directory (for example if filepath is "restore_token"), which doesn't need to be created */
+        return true;
+
+    snprintf(dir, sizeof(dir), "%.*s", (int)(split - filepath), filepath);
+    if(create_directory_recursive(dir) != 0) {
+        fprintf(stderr, "gsr warning: gsr_capture_portal_save_restore_token: failed to create directory (%s) for restore token\n", dir);
+        return false;
+    }
+    return true;
+}
+
+static void gsr_capture_portal_save_restore_token(const char *restore_token, const char *portal_session_token_filepath) {
+    char restore_token_path[PATH_MAX];
+    restore_token_path[0] = '\0';
+    if(portal_session_token_filepath)
+        snprintf(restore_token_path, sizeof(restore_token_path), "%s", portal_session_token_filepath);
+    else
+        get_default_gpu_screen_recorder_restore_token_path(restore_token_path, sizeof(restore_token_path));
+
+    if(!create_directory_to_file(restore_token_path))
+        return;
+
+    FILE *f = fopen(restore_token_path, "wb");
+    if(!f) {
+        fprintf(stderr, "gsr warning: gsr_capture_portal_save_restore_token: failed to create restore token file (%s)\n", restore_token_path);
+        return;
+    }
+
+    const int restore_token_len = strlen(restore_token);
+    if((long)fwrite(restore_token, 1, restore_token_len, f) != restore_token_len) {
+        fprintf(stderr, "gsr warning: gsr_capture_portal_save_restore_token: failed to write restore token to file (%s)\n", restore_token_path);
+        fclose(f);
+        return;
+    }
+
+    fprintf(stderr, "gsr info: gsr_capture_portal_save_restore_token: saved restore token to cache (%s)\n", restore_token);
+    fclose(f);
+}
+
+static void gsr_capture_portal_get_restore_token_from_cache(char *buffer, size_t buffer_size, const char *portal_session_token_filepath) {
+    assert(buffer_size > 0);
+    buffer[0] = '\0';
+
+    char restore_token_path[PATH_MAX];
+    restore_token_path[0] = '\0';
+    if(portal_session_token_filepath)
+        snprintf(restore_token_path, sizeof(restore_token_path), "%s", portal_session_token_filepath);
+    else
+        get_default_gpu_screen_recorder_restore_token_path(restore_token_path, sizeof(restore_token_path));
+
+    FILE *f = fopen(restore_token_path, "rb");
+    if(!f) {
+        fprintf(stderr, "gsr info: gsr_capture_portal_get_restore_token_from_cache: no restore token found in cache or failed to load (%s)\n", restore_token_path);
+        return;
+    }
+
+    fseek(f, 0, SEEK_END);
+    long file_size = ftell(f);
+    fseek(f, 0, SEEK_SET);
+
+    if(file_size > 0 && file_size < 1024 && file_size < (long)buffer_size && (long)fread(buffer, 1, file_size, f) != file_size) {
+        buffer[0] = '\0';
+        fprintf(stderr, "gsr warning: gsr_capture_portal_get_restore_token_from_cache: failed to read restore token (%s)\n", restore_token_path);
+        fclose(f);
+        return;
+    }
+
+    if(file_size > 0 && file_size < (long)buffer_size)
+        buffer[file_size] = '\0';
+
+    fprintf(stderr, "gsr info: gsr_capture_portal_get_restore_token_from_cache: read cached restore token (%s)\n", buffer);
+    fclose(f);
+}
+
+static int gsr_capture_portal_setup_dbus(gsr_capture_portal *self, int *pipewire_fd, uint32_t *pipewire_node) {
+    *pipewire_fd = 0;
+    *pipewire_node = 0;
+    int response_status = 0;
+
+    char restore_token[1024];
+    restore_token[0] = '\0';
+    if(self->params.restore_portal_session)
+        gsr_capture_portal_get_restore_token_from_cache(restore_token, sizeof(restore_token), self->params.portal_session_token_filepath);
+
+    if(!gsr_dbus_client_init(&self->dbus_client, restore_token))
+        return -1;
+
+    fprintf(stderr, "gsr info: gsr_capture_portal_setup_dbus: CreateSession\n");
+    response_status = gsr_dbus_client_screencast_create_session(&self->dbus_client, self->session_handle, sizeof(self->session_handle));
+    if(response_status != 0) {
+        fprintf(stderr, "gsr error: gsr_capture_portal_setup_dbus: CreateSession failed\n");
+        return response_status;
+    }
+
+    fprintf(stderr, "gsr info: gsr_capture_portal_setup_dbus: SelectSources\n");
+    response_status = gsr_dbus_client_screencast_select_sources(&self->dbus_client, self->session_handle, GSR_PORTAL_CAPTURE_TYPE_ALL, self->params.record_cursor ? GSR_PORTAL_CURSOR_MODE_EMBEDDED : GSR_PORTAL_CURSOR_MODE_HIDDEN);
+    if(response_status != 0) {
+        fprintf(stderr, "gsr error: gsr_capture_portal_setup_dbus: SelectSources failed\n");
+        return response_status;
+    }
+
+    fprintf(stderr, "gsr info: gsr_capture_portal_setup_dbus: Start\n");
+    response_status = gsr_dbus_client_screencast_start(&self->dbus_client, self->session_handle, pipewire_node);
+    if(response_status != 0) {
+        fprintf(stderr, "gsr error: gsr_capture_portal_setup_dbus: Start failed\n");
+        return response_status;
+    }
+
+    const char *screencast_restore_token = gsr_dbus_client_screencast_get_restore_token(&self->dbus_client);
+    if(screencast_restore_token)
+        gsr_capture_portal_save_restore_token(screencast_restore_token, self->params.portal_session_token_filepath);
+
+    fprintf(stderr, "gsr info: gsr_capture_portal_setup_dbus: OpenPipeWireRemote\n");
+    if(!gsr_dbus_client_screencast_open_pipewire_remote(&self->dbus_client, self->session_handle, pipewire_fd)) {
+        fprintf(stderr, "gsr error: gsr_capture_portal_setup_dbus: OpenPipeWireRemote failed\n");
+        return -1;
+    }
+
+    fprintf(stderr, "gsr info: gsr_capture_portal_setup_dbus: desktop portal setup finished\n");
+    return 0;
+}
+
+static bool gsr_capture_portal_get_frame_dimensions(gsr_capture_portal *self) {
+    fprintf(stderr, "gsr info: gsr_capture_portal_start: waiting for pipewire negotiation\n");
+
+    const double start_time = clock_get_monotonic_seconds();
+    while(clock_get_monotonic_seconds() - start_time < 5.0) {
+        if(gsr_pipewire_video_map_texture(&self->pipewire, self->texture_map, &self->region, &self->cursor_region, self->dmabuf_data, &self->num_dmabuf_data, &self->pipewire_fourcc, &self->pipewire_modifiers, &self->using_external_image)) {
+            self->capture_size.x = self->region.width;
+            self->capture_size.y = self->region.height;
+            fprintf(stderr, "gsr info: gsr_capture_portal_start: pipewire negotiation finished\n");
+            return true;
+        }
+        usleep(30 * 1000); /* 30 milliseconds */
+    }
+
+    fprintf(stderr, "gsr info: gsr_capture_portal_start: timed out waiting for pipewire negotiation (5 seconds)\n");
+    return false;
+}
+
+static int gsr_capture_portal_start(gsr_capture *cap, gsr_capture_metadata *capture_metadata) {
+    gsr_capture_portal *self = cap->priv;
+
+    gsr_capture_portal_create_input_textures(self);
+
+    int pipewire_fd = 0;
+    uint32_t pipewire_node = 0;
+    const int response_status = gsr_capture_portal_setup_dbus(self, &pipewire_fd, &pipewire_node);
+    if(response_status != 0) {
+        gsr_capture_portal_stop(self);
+        // Response status values:
+        // 0: Success, the request is carried out
+        // 1: The user cancelled the interaction
+        // 2: The user interaction was ended in some other way
+        // Response status 2 usually means that some kind of error occurred in the desktop portal on the system
+        if(response_status == 2) {
+            fprintf(stderr, "gsr error: gsr_capture_portal_start: desktop portal capture failed. Either your Wayland compositor doesn't support desktop portal capture or it's incorrectly set up on your system\n");
+            return 50;
+        } else if(response_status == 1) {
+            fprintf(stderr, "gsr error: gsr_capture_portal_start: desktop portal capture failed. It seems like desktop portal capture was canceled by the user.\n");
+            return 60;
+        } else {
+            return -1;
+        }
+    }
+
+    fprintf(stderr, "gsr info: gsr_capture_portal_start: setting up pipewire\n");
+    /* TODO: support hdr when pipewire supports it */
+    /* gsr_pipewire closes the pipewire fd, even on failure */
+    if(!gsr_pipewire_video_init(&self->pipewire, pipewire_fd, pipewire_node, capture_metadata->fps, self->params.record_cursor, self->params.egl)) {
+        fprintf(stderr, "gsr error: gsr_capture_portal_start: failed to setup pipewire with fd: %d, node: %" PRIu32 "\n", pipewire_fd, pipewire_node);
+        gsr_capture_portal_stop(self);
+        return -1;
+    }
+    fprintf(stderr, "gsr info: gsr_capture_portal_start: pipewire setup finished\n");
+
+    if(!gsr_capture_portal_get_frame_dimensions(self)) {
+        gsr_capture_portal_stop(self);
+        return -1;
+    }
+
+    if(self->params.output_resolution.x == 0 && self->params.output_resolution.y == 0) {
+        capture_metadata->width = self->capture_size.x;
+        capture_metadata->height = self->capture_size.y;
+    } else {
+        self->params.output_resolution = scale_keep_aspect_ratio(self->capture_size, self->params.output_resolution);
+        capture_metadata->width = self->params.output_resolution.x;
+        capture_metadata->height = self->params.output_resolution.y;
+    }
+
+    return 0;
+}
+
+static int max_int(int a, int b) {
+    return a > b ? a : b;
+}
+
+static int gsr_capture_portal_capture(gsr_capture *cap, gsr_capture_metadata *capture_metadata, gsr_color_conversion *color_conversion) {
+    (void)color_conversion;
+    gsr_capture_portal *self = cap->priv;
+
+    /* TODO: Handle formats other than RGB(A) */
+    if(self->num_dmabuf_data == 0) {
+        if(gsr_pipewire_video_map_texture(&self->pipewire, self->texture_map, &self->region, &self->cursor_region, self->dmabuf_data, &self->num_dmabuf_data, &self->pipewire_fourcc, &self->pipewire_modifiers, &self->using_external_image)) {
+            if(self->region.width != self->capture_size.x || self->region.height != self->capture_size.y) {
+                self->capture_size.x = self->region.width;
+                self->capture_size.y = self->region.height;
+                gsr_color_conversion_clear(color_conversion);
+            }
+        } else {
+            return -1;
+        }
+    }
+
+    const bool is_scaled = self->params.output_resolution.x > 0 && self->params.output_resolution.y > 0;
+    vec2i output_size = is_scaled ? self->params.output_resolution : self->capture_size;
+    output_size = scale_keep_aspect_ratio(self->capture_size, output_size);
+
+    const vec2i target_pos = { max_int(0, capture_metadata->width / 2 - output_size.x / 2), max_int(0, capture_metadata->height / 2 - output_size.y / 2) };
+
+    //self->params.egl->glFlush();
+    //self->params.egl->glFinish();
+
+    // TODO: Handle region crop
+
+    gsr_color_conversion_draw(color_conversion, self->using_external_image ? self->texture_map.external_texture_id : self->texture_map.texture_id,
+        target_pos, output_size,
+        (vec2i){self->region.x, self->region.y}, self->capture_size, self->capture_size,
+        GSR_ROT_0, GSR_SOURCE_COLOR_RGB, self->using_external_image, false);
+
+    if(self->params.record_cursor && self->texture_map.cursor_texture_id > 0 && self->cursor_region.width > 0) {
+        const vec2d scale = {
+            self->capture_size.x == 0 ? 0 : (double)output_size.x / (double)self->capture_size.x,
+            self->capture_size.y == 0 ? 0 : (double)output_size.y / (double)self->capture_size.y
+        };
+
+        const vec2i cursor_pos = {
+            target_pos.x + (self->cursor_region.x * scale.x),
+            target_pos.y + (self->cursor_region.y * scale.y)
+        };
+
+        self->params.egl->glEnable(GL_SCISSOR_TEST);
+        self->params.egl->glScissor(target_pos.x, target_pos.y, output_size.x, output_size.y);
+        gsr_color_conversion_draw(color_conversion, self->texture_map.cursor_texture_id,
+            (vec2i){cursor_pos.x, cursor_pos.y}, (vec2i){self->cursor_region.width * scale.x, self->cursor_region.height * scale.y},
+            (vec2i){0, 0}, (vec2i){self->cursor_region.width, self->cursor_region.height}, (vec2i){self->cursor_region.width, self->cursor_region.height},
+            GSR_ROT_0, GSR_SOURCE_COLOR_RGB, false, true);
+        self->params.egl->glDisable(GL_SCISSOR_TEST);
+    }
+
+    //self->params.egl->glFlush();
+    //self->params.egl->glFinish();
+
+    gsr_capture_portal_cleanup_plane_fds(self);
+
+    return 0;
+}
+
+static bool gsr_capture_portal_uses_external_image(gsr_capture *cap) {
+    (void)cap;
+    return true;
+}
+
+static bool gsr_capture_portal_is_damaged(gsr_capture *cap) {
+    gsr_capture_portal *self = cap->priv;
+    return gsr_pipewire_video_is_damaged(&self->pipewire);
+}
+
+static void gsr_capture_portal_clear_damage(gsr_capture *cap) {
+    gsr_capture_portal *self = cap->priv;
+    gsr_pipewire_video_clear_damage(&self->pipewire);
+}
+
+static void gsr_capture_portal_destroy(gsr_capture *cap) {
+    gsr_capture_portal *self = cap->priv;
+    if(cap->priv) {
+        gsr_capture_portal_stop(self);
+        free(cap->priv);
+        cap->priv = NULL;
+    }
+    free(cap);
+}
+
+gsr_capture* gsr_capture_portal_create(const gsr_capture_portal_params *params) {
+    if(!params) {
+        fprintf(stderr, "gsr error: gsr_capture_portal_create params is NULL\n");
+        return NULL;
+    }
+
+    gsr_capture *cap = calloc(1, sizeof(gsr_capture));
+    if(!cap)
+        return NULL;
+
+    gsr_capture_portal *cap_portal = calloc(1, sizeof(gsr_capture_portal));
+    if(!cap_portal) {
+        free(cap);
+        return NULL;
+    }
+
+    cap_portal->params = *params;
+
+    *cap = (gsr_capture) {
+        .start = gsr_capture_portal_start,
+        .tick = NULL,
+        .should_stop = NULL,
+        .capture = gsr_capture_portal_capture,
+        .uses_external_image = gsr_capture_portal_uses_external_image,
+        .is_damaged = gsr_capture_portal_is_damaged,
+        .clear_damage = gsr_capture_portal_clear_damage,
+        .destroy = gsr_capture_portal_destroy,
+        .priv = cap_portal
+    };
+
+    return cap;
+}
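Note on the portal.c restore token handling above: the token file path comes from XDG_CONFIG_HOME, then HOME, then "/tmp", and the parent directory is derived by cutting the path at its last '/'. The sketch below restates those two steps as pure functions that take the environment values as parameters, which makes the fallback order easy to check. The names restore_token_path and parent_directory are hypothetical; the patch itself uses get_default_gpu_screen_recorder_restore_token_path and create_directory_to_file, which read the environment and create the directory directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
    Builds the default restore token path.
    xdg_config_home and home stand in for getenv("XDG_CONFIG_HOME")
    and getenv("HOME"); either may be NULL.
*/
static void restore_token_path(char *buffer, size_t buffer_size, const char *xdg_config_home, const char *home) {
    assert(buffer && buffer_size > 0);
    if(xdg_config_home) {
        snprintf(buffer, buffer_size, "%s/gpu-screen-recorder/restore_token", xdg_config_home);
    } else {
        if(!home)
            home = "/tmp";
        snprintf(buffer, buffer_size, "%s/.config/gpu-screen-recorder/restore_token", home);
    }
}

/*
    Copies the directory part of filepath into dir.
    Returns false when filepath has no '/', which means the file is in
    the current directory and nothing needs to be created.
*/
static bool parent_directory(const char *filepath, char *dir, size_t dir_size) {
    assert(filepath && dir && dir_size > 0);
    dir[0] = '\0';
    const char *split = strrchr(filepath, '/');
    if(!split)
        return false;
    snprintf(dir, dir_size, "%.*s", (int)(split - filepath), filepath);
    return true;
}
```

Passing the environment values in as arguments is a choice made for this sketch only, so the fallback order can be exercised without modifying the process environment.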
diff --git a/src/capture/xcomposite.c b/src/capture/xcomposite.c
new file mode 100644
index 0000000..db41f63
--- /dev/null
+++ b/src/capture/xcomposite.c
@@ -0,0 +1,338 @@
+#include "../../include/capture/xcomposite.h"
+#include "../../include/window_texture.h"
+#include "../../include/utils.h"
+#include "../../include/cursor.h"
+#include "../../include/color_conversion.h"
+#include "../../include/window/window.h"
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <assert.h>
+
+#include <X11/Xlib.h>
+
+typedef struct {
+    gsr_capture_xcomposite_params params;
+    Display *display;
+
+    bool should_stop;
+    bool stop_is_error;
+    bool window_resized;
+    bool follow_focused_initialized;
+    bool init_new_window;
+
+    Window window;
+    vec2i window_size;
+    vec2i texture_size;
+    double window_resize_timer;
+
+    WindowTexture window_texture;
+
+    Atom net_active_window_atom;
+
+    gsr_cursor cursor;
+
+    bool clear_background;
+} gsr_capture_xcomposite;
+
+static void gsr_capture_xcomposite_stop(gsr_capture_xcomposite *self) {
+    window_texture_deinit(&self->window_texture);
+    gsr_cursor_deinit(&self->cursor);
+}
+
+static int max_int(int a, int b) {
+    return a > b ? a : b;
+}
+
+static Window get_focused_window(Display *display, Atom net_active_window_atom) {
+    Atom type;
+    int format = 0;
+    unsigned long num_items = 0;
+    unsigned long bytes_after = 0;
+    unsigned char *properties = NULL;
+    if(XGetWindowProperty(display, DefaultRootWindow(display), net_active_window_atom, 0, 1024, False, AnyPropertyType, &type, &format, &num_items, &bytes_after, &properties) == Success && properties) {
+        Window focused_window = *(unsigned long*)properties;
+        XFree(properties);
+        return focused_window;
+    }
+    return None;
+}
+
+static int gsr_capture_xcomposite_start(gsr_capture *cap, gsr_capture_metadata *capture_metadata) {
+    gsr_capture_xcomposite *self = cap->priv;
+
+    if(self->params.follow_focused) {
+        self->net_active_window_atom = XInternAtom(self->display, "_NET_ACTIVE_WINDOW", False);
+        if(!self->net_active_window_atom) {
+            fprintf(stderr, "gsr error: gsr_capture_xcomposite_start failed: failed to get _NET_ACTIVE_WINDOW atom\n");
+            return -1;
+        }
+        self->window = get_focused_window(self->display, self->net_active_window_atom);
+    } else {
+        self->window = self->params.window;
+    }
+
+    /* TODO: Do these in tick, and allow error if follow_focused */
+
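+    XWindowAttributes attr = {0}; /* Zero-initialized so width/height are defined if XGetWindowAttributes fails while follow_focused is set */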
+    if(!XGetWindowAttributes(self->display, self->window, &attr) && !self->params.follow_focused) {
+        fprintf(stderr, "gsr error: gsr_capture_xcomposite_start failed: invalid window id: %lu\n", self->window);
+        return -1;
+    }
+
+    self->window_size.x = max_int(attr.width, 0);
+    self->window_size.y = max_int(attr.height, 0);
+
+    if(self->params.follow_focused)
+        XSelectInput(self->display, DefaultRootWindow(self->display), PropertyChangeMask);
+
+    // TODO: Get the currently selected input mask, add these events on top of it and restore the original mask at the end. Also do the same in the other xcomposite capture
+    XSelectInput(self->display, self->window, StructureNotifyMask | ExposureMask);
+
+    if(window_texture_init(&self->window_texture, self->display, self->window, self->params.egl) != 0 && !self->params.follow_focused) {
+        fprintf(stderr, "gsr error: gsr_capture_xcomposite_start: failed to get window texture for window %ld\n", (long)self->window);
+        return -1;
+    }
+
+    if(gsr_cursor_init(&self->cursor, self->params.egl, self->display) != 0) {
+        gsr_capture_xcomposite_stop(self);
+        return -1;
+    }
+
+    self->texture_size.x = 0;
+    self->texture_size.y = 0;
+
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&self->window_texture));
+    self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &self->texture_size.x);
+    self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &self->texture_size.y);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+
+    if(self->params.output_resolution.x == 0 && self->params.output_resolution.y == 0) {
+        capture_metadata->width = self->texture_size.x;
+        capture_metadata->height = self->texture_size.y;
+    } else {
+        capture_metadata->width = self->params.output_resolution.x;
+        capture_metadata->height = self->params.output_resolution.y;
+    }
+
+    self->window_resize_timer = clock_get_monotonic_seconds();
+    return 0;
+}
+
+static void gsr_capture_xcomposite_tick(gsr_capture *cap) {
+    gsr_capture_xcomposite *self = cap->priv;
+
+    if(self->params.follow_focused && !self->follow_focused_initialized) {
+        self->init_new_window = true;
+    }
+
+    if(self->init_new_window) {
+        self->init_new_window = false;
+        Window focused_window = get_focused_window(self->display, self->net_active_window_atom);
+        if(focused_window != self->window || !self->follow_focused_initialized) {
+            self->follow_focused_initialized = true;
+            XSelectInput(self->display, self->window, 0);
+            self->window = focused_window;
+            XSelectInput(self->display, self->window, StructureNotifyMask | ExposureMask);
+
+            XWindowAttributes attr;
+            attr.width = 0;
+            attr.height = 0;
+            if(!XGetWindowAttributes(self->display, self->window, &attr))
+                fprintf(stderr, "gsr error: gsr_capture_xcomposite_tick failed: invalid window id: %lu\n", self->window);
+
+            self->window_size.x = max_int(attr.width, 0);
+            self->window_size.y = max_int(attr.height, 0);
+
+            window_texture_deinit(&self->window_texture);
+            window_texture_init(&self->window_texture, self->display, self->window, self->params.egl); // TODO: Do not do the below window_texture_on_resize after this
+
+            self->texture_size.x = 0;
+            self->texture_size.y = 0;
+
+            self->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&self->window_texture));
+            self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &self->texture_size.x);
+            self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &self->texture_size.y);
+            self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+
+            self->window_resized = false;
+            self->clear_background = true;
+        }
+    }
+
+    const double window_resize_timeout = 1.0; // 1 second
+    if(self->window_resized && clock_get_monotonic_seconds() - self->window_resize_timer >= window_resize_timeout) {
+        self->window_resized = false;
+
+        if(window_texture_on_resize(&self->window_texture) != 0) {
+            fprintf(stderr, "gsr error: gsr_capture_xcomposite_tick: window_texture_on_resize failed\n");
+            //self->should_stop = true;
+            //self->stop_is_error = true;
+            return;
+        }
+
+        self->texture_size.x = 0;
+        self->texture_size.y = 0;
+
+        self->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&self->window_texture));
+        self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &self->texture_size.x);
+        self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &self->texture_size.y);
+        self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+
+        self->clear_background = true;
+    }
+}
+
+static void gsr_capture_xcomposite_on_event(gsr_capture *cap, gsr_egl *egl) {
+    gsr_capture_xcomposite *self = cap->priv;
+    XEvent *xev = gsr_window_get_event_data(egl->window);
+    switch(xev->type) {
+        case DestroyNotify: {
+            /* Window died (when not following focused window), so we stop recording */
+            if(!self->params.follow_focused && xev->xdestroywindow.window == self->window) {
+                self->should_stop = true;
+                self->stop_is_error = false;
+            }
+            break;
+        }
+        case Expose: {
+            /* Requires window texture recreate */
+            if(xev->xexpose.count == 0 && xev->xexpose.window == self->window) {
+                self->window_resize_timer = clock_get_monotonic_seconds();
+                self->window_resized = true;
+            }
+            break;
+        }
+        case ConfigureNotify: {
+            /* Window resized */
+            if(xev->xconfigure.window == self->window && (xev->xconfigure.width != self->window_size.x || xev->xconfigure.height != self->window_size.y)) {
+                self->window_size.x = max_int(xev->xconfigure.width, 0);
+                self->window_size.y = max_int(xev->xconfigure.height, 0);
+                self->window_resize_timer = clock_get_monotonic_seconds();
+                self->window_resized = true;
+            }
+            break;
+        }
+        case PropertyNotify: {
+            /* Focused window changed */
+            if(self->params.follow_focused && xev->xproperty.atom == self->net_active_window_atom) {
+                self->init_new_window = true;
+            }
+            break;
+        }
+    }
+
+    gsr_cursor_on_event(&self->cursor, xev);
+}
+
+static bool gsr_capture_xcomposite_should_stop(gsr_capture *cap, bool *err) {
+    gsr_capture_xcomposite *self = cap->priv;
+    if(self->should_stop) {
+        if(err)
+            *err = self->stop_is_error;
+        return true;
+    }
+
+    if(err)
+        *err = false;
+    return false;
+}
+
+static int gsr_capture_xcomposite_capture(gsr_capture *cap, gsr_capture_metadata *capture_metadata, gsr_color_conversion *color_conversion) {
+    gsr_capture_xcomposite *self = cap->priv;
+
+    if(self->clear_background) {
+        self->clear_background = false;
+        gsr_color_conversion_clear(color_conversion);
+    }
+
+    const bool is_scaled = self->params.output_resolution.x > 0 && self->params.output_resolution.y > 0;
+    vec2i output_size = is_scaled ? self->params.output_resolution : self->texture_size;
+    output_size = scale_keep_aspect_ratio(self->texture_size, output_size);
+
+    const vec2i target_pos = { max_int(0, capture_metadata->width / 2 - output_size.x / 2), max_int(0, capture_metadata->height / 2 - output_size.y / 2) };
+
+    //self->params.egl->glFlush();
+    //self->params.egl->glFinish();
+
+    gsr_color_conversion_draw(color_conversion, window_texture_get_opengl_texture_id(&self->window_texture),
+        target_pos, output_size,
+        (vec2i){0, 0}, self->texture_size, self->texture_size,
+        GSR_ROT_0, GSR_SOURCE_COLOR_RGB, false, false);
+
+    if(self->params.record_cursor && self->cursor.visible) {
+        const vec2d scale = {
+            self->texture_size.x == 0 ? 0 : (double)output_size.x / (double)self->texture_size.x,
+            self->texture_size.y == 0 ? 0 : (double)output_size.y / (double)self->texture_size.y
+        };
+
+        gsr_cursor_tick(&self->cursor, self->window);
+
+        const vec2i cursor_pos = {
+            target_pos.x + (self->cursor.position.x - self->cursor.hotspot.x) * scale.x,
+            target_pos.y + (self->cursor.position.y - self->cursor.hotspot.y) * scale.y
+        };
+
+        if(cursor_pos.x < target_pos.x || cursor_pos.x + self->cursor.size.x > target_pos.x + output_size.x || cursor_pos.y < target_pos.y || cursor_pos.y + self->cursor.size.y > target_pos.y + output_size.y)
+            self->clear_background = true;
+
+        gsr_color_conversion_draw(color_conversion, self->cursor.texture_id,
+            cursor_pos, (vec2i){self->cursor.size.x * scale.x, self->cursor.size.y * scale.y},
+            (vec2i){0, 0}, self->cursor.size, self->cursor.size,
+            GSR_ROT_0, GSR_SOURCE_COLOR_RGB, false, true);
+    }
+
+    //self->params.egl->glFlush();
+    //self->params.egl->glFinish();
+
+    return 0;
+}
+
+static uint64_t gsr_capture_xcomposite_get_window_id(gsr_capture *cap) {
+    gsr_capture_xcomposite *self = cap->priv;
+    return self->window;
+}
+
+static void gsr_capture_xcomposite_destroy(gsr_capture *cap) {
+    if(cap->priv) {
+        gsr_capture_xcomposite_stop(cap->priv);
+        free(cap->priv);
+        cap->priv = NULL;
+    }
+    free(cap);
+}
+
+gsr_capture* gsr_capture_xcomposite_create(const gsr_capture_xcomposite_params *params) {
+    if(!params) {
+        fprintf(stderr, "gsr error: gsr_capture_xcomposite_create params is NULL\n");
+        return NULL;
+    }
+
+    gsr_capture *cap = calloc(1, sizeof(gsr_capture));
+    if(!cap)
+        return NULL;
+
+    gsr_capture_xcomposite *cap_xcomp = calloc(1, sizeof(gsr_capture_xcomposite));
+    if(!cap_xcomp) {
+        free(cap);
+        return NULL;
+    }
+
+    cap_xcomp->params = *params;
+    cap_xcomp->display = gsr_window_get_display(params->egl->window);
+
+    *cap = (gsr_capture) {
+        .start = gsr_capture_xcomposite_start,
+        .on_event = gsr_capture_xcomposite_on_event,
+        .tick = gsr_capture_xcomposite_tick,
+        .should_stop = gsr_capture_xcomposite_should_stop,
+        .capture = gsr_capture_xcomposite_capture,
+        .uses_external_image = NULL,
+        .get_window_id = gsr_capture_xcomposite_get_window_id,
+        .destroy = gsr_capture_xcomposite_destroy,
+        .priv = cap_xcomp
+    };
+
+    return cap;
+}
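Note on the resize handling in `gsr_capture_xcomposite_on_event` and `gsr_capture_xcomposite_tick` above: `Expose` and `ConfigureNotify` only set `window_resized` and restart `window_resize_timer`, and the tick recreates the window texture once no such event has arrived for `window_resize_timeout` (1 second). The sketch below isolates that debounce pattern. It is an illustration, not code from this patch: the `resize_debounce` type and both function names are hypothetical, and the current time is passed in as a parameter rather than read from `clock_get_monotonic_seconds()` so the logic can be exercised without a real clock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the resize-related fields of gsr_capture_xcomposite */
typedef struct {
    bool window_resized;        /* A resize/expose event is pending */
    double window_resize_timer; /* Time of the most recent such event, in seconds */
} resize_debounce;

/* Called for every resize/expose event: mark the work as pending and restart the timer */
static void resize_debounce_on_event(resize_debounce *self, double now) {
    self->window_resize_timer = now;
    self->window_resized = true;
}

/*
    Called every tick. Returns true exactly once per burst of events,
    when at least |timeout| seconds have passed since the last event.
    The caller would then do the expensive work (texture recreation in the patch).
*/
static bool resize_debounce_tick(resize_debounce *self, double now, double timeout) {
    assert(timeout >= 0.0);
    if(self->window_resized && now - self->window_resize_timer >= timeout) {
        self->window_resized = false;
        return true;
    }
    return false;
}
```

Restarting the timer on every event means a window that is being dragged to a new size triggers one texture recreation after the drag settles, instead of one per intermediate size.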
diff --git a/src/capture/xcomposite_cuda.c b/src/capture/xcomposite_cuda.c
deleted file mode 100644
index 7e65efa..0000000
--- a/src/capture/xcomposite_cuda.c
+++ /dev/null
@@ -1,512 +0,0 @@
-#include "../../include/capture/xcomposite_cuda.h"
-#include "../../include/cuda.h"
-#include "../../include/window_texture.h"
-#include "../../include/utils.h"
-#include <libavutil/hwcontext.h>
-#include <libavutil/hwcontext_cuda.h>
-#include <libavutil/frame.h>
-#include <libavcodec/avcodec.h>
-
-typedef struct {
-    gsr_capture_xcomposite_cuda_params params;
-    XEvent xev;
-
-    bool should_stop;
-    bool stop_is_error;
-    bool window_resized;
-    bool created_hw_frame;
-    bool follow_focused_initialized;
-    double window_resize_timer;
-
-    vec2i window_size;
-
-    unsigned int target_texture_id;
-    vec2i texture_size;
-    Window window;
-    WindowTexture window_texture;
-    Atom net_active_window_atom;
-
-    CUgraphicsResource cuda_graphics_resource;
-    CUarray mapped_array;
-
-    gsr_cuda cuda;
-} gsr_capture_xcomposite_cuda;
-
-static int max_int(int a, int b) {
-    return a > b ? a : b;
-}
-
-static int min_int(int a, int b) {
-    return a < b ? a : b;
-}
-
-static Window get_focused_window(Display *display, Atom net_active_window_atom) {
-    Atom type;
-    int format = 0;
-    unsigned long num_items = 0;
-    unsigned long bytes_after = 0;
-    unsigned char *properties = NULL;
-    if(XGetWindowProperty(display, DefaultRootWindow(display), net_active_window_atom, 0, 1024, False, AnyPropertyType, &type, &format, &num_items, &bytes_after, &properties) == Success && properties) {
-        Window focused_window = *(unsigned long*)properties;
-        XFree(properties);
-        return focused_window;
-    }
-    return None;
-}
-
-static void gsr_capture_xcomposite_cuda_stop(gsr_capture *cap, AVCodecContext *video_codec_context);
-
-static bool cuda_register_opengl_texture(gsr_capture_xcomposite_cuda *cap_xcomp) {
-    CUresult res;
-    CUcontext old_ctx;
-    res = cap_xcomp->cuda.cuCtxPushCurrent_v2(cap_xcomp->cuda.cu_ctx);
-    // TODO: Use cuGraphicsEGLRegisterImage instead with the window egl image (dont use window_texture).
-    // That removes the need for an extra texture and texture copy
-    res = cap_xcomp->cuda.cuGraphicsGLRegisterImage(
-        &cap_xcomp->cuda_graphics_resource, cap_xcomp->target_texture_id, GL_TEXTURE_2D,
-        CU_GRAPHICS_REGISTER_FLAGS_READ_ONLY);
-    if (res != CUDA_SUCCESS) {
-        const char *err_str = "unknown";
-        cap_xcomp->cuda.cuGetErrorString(res, &err_str);
-        fprintf(stderr, "gsr error: cuda_register_opengl_texture: cuGraphicsGLRegisterImage failed, error: %s, texture " "id: %u\n", err_str, cap_xcomp->target_texture_id);
-        res = cap_xcomp->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    res = cap_xcomp->cuda.cuGraphicsResourceSetMapFlags(cap_xcomp->cuda_graphics_resource, CU_GRAPHICS_MAP_RESOURCE_FLAGS_READ_ONLY);
-    res = cap_xcomp->cuda.cuGraphicsMapResources(1, &cap_xcomp->cuda_graphics_resource, 0);
-
-    res = cap_xcomp->cuda.cuGraphicsSubResourceGetMappedArray(&cap_xcomp->mapped_array, cap_xcomp->cuda_graphics_resource, 0, 0);
-    res = cap_xcomp->cuda.cuCtxPopCurrent_v2(&old_ctx);
-    return true;
-}
-
-static bool cuda_create_codec_context(gsr_capture_xcomposite_cuda *cap_xcomp, AVCodecContext *video_codec_context) {
-    CUcontext old_ctx;
-    cap_xcomp->cuda.cuCtxPushCurrent_v2(cap_xcomp->cuda.cu_ctx);
-
-    AVBufferRef *device_ctx = av_hwdevice_ctx_alloc(AV_HWDEVICE_TYPE_CUDA);
-    if(!device_ctx) {
-        fprintf(stderr, "Error: Failed to create hardware device context\n");
-        cap_xcomp->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    AVHWDeviceContext *hw_device_context = (AVHWDeviceContext*)device_ctx->data;
-    AVCUDADeviceContext *cuda_device_context = (AVCUDADeviceContext*)hw_device_context->hwctx;
-    cuda_device_context->cuda_ctx = cap_xcomp->cuda.cu_ctx;
-    if(av_hwdevice_ctx_init(device_ctx) < 0) {
-        fprintf(stderr, "Error: Failed to create hardware device context\n");
-        av_buffer_unref(&device_ctx);
-        cap_xcomp->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    AVBufferRef *frame_context = av_hwframe_ctx_alloc(device_ctx);
-    if(!frame_context) {
-        fprintf(stderr, "Error: Failed to create hwframe context\n");
-        av_buffer_unref(&device_ctx);
-        cap_xcomp->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    AVHWFramesContext *hw_frame_context =
-        (AVHWFramesContext *)frame_context->data;
-    hw_frame_context->width = video_codec_context->width;
-    hw_frame_context->height = video_codec_context->height;
-    hw_frame_context->sw_format = AV_PIX_FMT_BGR0;
-    hw_frame_context->format = video_codec_context->pix_fmt;
-    hw_frame_context->device_ref = device_ctx;
-    hw_frame_context->device_ctx = (AVHWDeviceContext*)device_ctx->data;
-
-    hw_frame_context->initial_pool_size = 1;
-
-    if (av_hwframe_ctx_init(frame_context) < 0) {
-        fprintf(stderr, "Error: Failed to initialize hardware frame context "
-                        "(note: ffmpeg version needs to be > 4.0)\n");
-        av_buffer_unref(&device_ctx);
-        //av_buffer_unref(&frame_context);
-        cap_xcomp->cuda.cuCtxPopCurrent_v2(&old_ctx);
-        return false;
-    }
-
-    video_codec_context->hw_device_ctx = av_buffer_ref(device_ctx);
-    video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
-    return true;
-}
-
-static unsigned int gl_create_texture(gsr_capture_xcomposite_cuda *cap_xcomp, int width, int height) {
-    unsigned int texture_id = 0;
-    cap_xcomp->params.egl->glGenTextures(1, &texture_id);
-    cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, texture_id);
-    cap_xcomp->params.egl->glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, NULL);
-
-    cap_xcomp->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
-    cap_xcomp->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
-    cap_xcomp->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
-    cap_xcomp->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
-
-    cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-    return texture_id;
-}
-
-static int gsr_capture_xcomposite_cuda_start(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
-
-    if(cap_xcomp->params.follow_focused) {
-        cap_xcomp->net_active_window_atom = XInternAtom(cap_xcomp->params.dpy, "_NET_ACTIVE_WINDOW", False);
-        if(!cap_xcomp->net_active_window_atom) {
-            fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_start failed: failed to get _NET_ACTIVE_WINDOW atom\n");
-            return -1;
-        }
-        cap_xcomp->window = get_focused_window(cap_xcomp->params.dpy, cap_xcomp->net_active_window_atom);
-    } else {
-        cap_xcomp->window = cap_xcomp->params.window;
-    }
-
-    /* TODO: Do these in tick, and allow error if follow_focused */
-
-    XWindowAttributes attr;
-    attr.width = 0;
-    attr.height = 0;
-    if(!XGetWindowAttributes(cap_xcomp->params.dpy, cap_xcomp->window, &attr) && !cap_xcomp->params.follow_focused) {
-        fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_start failed: invalid window id: %lu\n", cap_xcomp->window);
-        return -1;
-    }
-
-    cap_xcomp->window_size.x = max_int(attr.width, 0);
-    cap_xcomp->window_size.y = max_int(attr.height, 0);
-
-    if(cap_xcomp->params.follow_focused)
-        XSelectInput(cap_xcomp->params.dpy, DefaultRootWindow(cap_xcomp->params.dpy), PropertyChangeMask);
-
-    XSelectInput(cap_xcomp->params.dpy, cap_xcomp->window, StructureNotifyMask | ExposureMask);
-
-    cap_xcomp->params.egl->eglSwapInterval(cap_xcomp->params.egl->egl_display, 0);
-    if(window_texture_init(&cap_xcomp->window_texture, cap_xcomp->params.dpy, cap_xcomp->window, cap_xcomp->params.egl) != 0 && !cap_xcomp->params.follow_focused) {
-        fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_start: failed to get window texture for window %ld\n", cap_xcomp->window);
-        return -1;
-    }
-
-    cap_xcomp->texture_size.x = 0;
-    cap_xcomp->texture_size.y = 0;
-
-    cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&cap_xcomp->window_texture));
-    cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &cap_xcomp->texture_size.x);
-    cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &cap_xcomp->texture_size.y);
-    cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-    cap_xcomp->texture_size.x = max_int(2, cap_xcomp->texture_size.x & ~1);
-    cap_xcomp->texture_size.y = max_int(2, cap_xcomp->texture_size.y & ~1);
-
-    video_codec_context->width = cap_xcomp->texture_size.x;
-    video_codec_context->height = cap_xcomp->texture_size.y;
-
-    if(cap_xcomp->params.region_size.x > 0 && cap_xcomp->params.region_size.y > 0) {
-        video_codec_context->width = max_int(2, cap_xcomp->params.region_size.x & ~1);
-        video_codec_context->height = max_int(2, cap_xcomp->params.region_size.y & ~1);
-    }
-
-    cap_xcomp->target_texture_id = gl_create_texture(cap_xcomp, video_codec_context->width, video_codec_context->height);
-    if(cap_xcomp->target_texture_id == 0) {
-        fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_start: failed to create opengl texture\n");
-        gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
-        return -1;
-    }
-
-    if(!gsr_cuda_load(&cap_xcomp->cuda, cap_xcomp->params.dpy, cap_xcomp->params.overclock)) {
-        gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
-        return -1;
-    }
-
-    if(!cuda_create_codec_context(cap_xcomp, video_codec_context)) {
-        gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
-        return -1;
-    }
-
-    if(!cuda_register_opengl_texture(cap_xcomp)) {
-        gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
-        return -1;
-    }
-
-    cap_xcomp->window_resize_timer = clock_get_monotonic_seconds();
-    return 0;
-}
-
-static void gsr_capture_xcomposite_cuda_stop(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
-
-    if(cap_xcomp->cuda.cu_ctx) {
-        CUcontext old_ctx;
-        cap_xcomp->cuda.cuCtxPushCurrent_v2(cap_xcomp->cuda.cu_ctx);
-
-        if(cap_xcomp->cuda_graphics_resource) {
-            cap_xcomp->cuda.cuGraphicsUnmapResources(1, &cap_xcomp->cuda_graphics_resource, 0);
-            cap_xcomp->cuda.cuGraphicsUnregisterResource(cap_xcomp->cuda_graphics_resource);
-        }
-
-        cap_xcomp->cuda.cuCtxPopCurrent_v2(&old_ctx);
-    }
-
-    window_texture_deinit(&cap_xcomp->window_texture);
-
-    if(cap_xcomp->target_texture_id) {
-        cap_xcomp->params.egl->glDeleteTextures(1, &cap_xcomp->target_texture_id);
-        cap_xcomp->target_texture_id = 0;
-    }
-
-    if(video_codec_context->hw_device_ctx)
-        av_buffer_unref(&video_codec_context->hw_device_ctx);
-    if(video_codec_context->hw_frames_ctx)
-        av_buffer_unref(&video_codec_context->hw_frames_ctx);
-
-    gsr_cuda_unload(&cap_xcomp->cuda);
-
-    if(cap_xcomp->params.dpy) {
-        // TODO: This causes a crash, why? maybe some other library dlclose xlib and that also happened to unload this???
-        //XCloseDisplay(cap_xcomp->dpy);
-        cap_xcomp->params.dpy = NULL;
-    }
-}
-
-static void gsr_capture_xcomposite_cuda_tick(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame **frame) {
-    gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
-
-    cap_xcomp->params.egl->glClear(GL_COLOR_BUFFER_BIT);
-
-    bool init_new_window = false;
-    while(XPending(cap_xcomp->params.dpy)) {
-        XNextEvent(cap_xcomp->params.dpy, &cap_xcomp->xev);
-
-        switch(cap_xcomp->xev.type) {
-            case DestroyNotify: {
-                /* Window died (when not following focused window), so we stop recording */
-                if(!cap_xcomp->params.follow_focused && cap_xcomp->xev.xdestroywindow.window == cap_xcomp->window) {
-                    cap_xcomp->should_stop = true;
-                    cap_xcomp->stop_is_error = false;
-                }
-                break;
-            }
-            case Expose: {
-                /* Requires window texture recreate */
-                if(cap_xcomp->xev.xexpose.count == 0 && cap_xcomp->xev.xexpose.window == cap_xcomp->window) {
-                    cap_xcomp->window_resize_timer = clock_get_monotonic_seconds();
-                    cap_xcomp->window_resized = true;
-                }
-                break;
-            }
-            case ConfigureNotify: {
-                /* Window resized */
-                if(cap_xcomp->xev.xconfigure.window == cap_xcomp->window && (cap_xcomp->xev.xconfigure.width != cap_xcomp->window_size.x || cap_xcomp->xev.xconfigure.height != cap_xcomp->window_size.y)) {
-                    cap_xcomp->window_size.x = max_int(cap_xcomp->xev.xconfigure.width, 0);
-                    cap_xcomp->window_size.y = max_int(cap_xcomp->xev.xconfigure.height, 0);
-                    cap_xcomp->window_resize_timer = clock_get_monotonic_seconds();
-                    cap_xcomp->window_resized = true;
-                }
-                break;
-            }
-            case PropertyNotify: {
-                /* Focused window changed */
-                if(cap_xcomp->params.follow_focused && cap_xcomp->xev.xproperty.atom == cap_xcomp->net_active_window_atom) {
-                    init_new_window = true;
-                }
-                break;
-            }
-        }
-    }
-
-    if(cap_xcomp->params.follow_focused && !cap_xcomp->follow_focused_initialized) {
-        init_new_window = true;
-    }
-
-    if(init_new_window) {
-        Window focused_window = get_focused_window(cap_xcomp->params.dpy, cap_xcomp->net_active_window_atom);
-        if(focused_window != cap_xcomp->window || !cap_xcomp->follow_focused_initialized) {
-            cap_xcomp->follow_focused_initialized = true;
-            XSelectInput(cap_xcomp->params.dpy, cap_xcomp->window, 0);
-            cap_xcomp->window = focused_window;
-            XSelectInput(cap_xcomp->params.dpy, cap_xcomp->window, StructureNotifyMask | ExposureMask);
-
-            XWindowAttributes attr;
-            attr.width = 0;
-            attr.height = 0;
-            if(!XGetWindowAttributes(cap_xcomp->params.dpy, cap_xcomp->window, &attr))
-                fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_tick failed: invalid window id: %lu\n", cap_xcomp->window);
-
-            cap_xcomp->window_size.x = max_int(attr.width, 0);
-            cap_xcomp->window_size.y = max_int(attr.height, 0);
-            cap_xcomp->window_resized = true;
-
-            window_texture_deinit(&cap_xcomp->window_texture);
-            window_texture_init(&cap_xcomp->window_texture, cap_xcomp->params.dpy, cap_xcomp->window, cap_xcomp->params.egl); // TODO: Do not do the below window_texture_on_resize after this
-            
-            cap_xcomp->texture_size.x = 0;
-            cap_xcomp->texture_size.y = 0;
-
-            cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&cap_xcomp->window_texture));
-            cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &cap_xcomp->texture_size.x);
-            cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &cap_xcomp->texture_size.y);
-            cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-            cap_xcomp->texture_size.x = min_int(video_codec_context->width, max_int(2, cap_xcomp->texture_size.x & ~1));
-            cap_xcomp->texture_size.y = min_int(video_codec_context->height, max_int(2, cap_xcomp->texture_size.y & ~1));
-        }
-    }
-
-    const double window_resize_timeout = 1.0; // 1 second
-    if(!cap_xcomp->created_hw_frame || (cap_xcomp->window_resized && clock_get_monotonic_seconds() - cap_xcomp->window_resize_timer >= window_resize_timeout)) {
-        cap_xcomp->window_resized = false;
-        if(window_texture_on_resize(&cap_xcomp->window_texture) != 0) {
-            fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_tick: window_texture_on_resize failed\n");
-            //cap_xcomp->should_stop = true;
-            //cap_xcomp->stop_is_error = true;
-            return;
-        }
-
-        cap_xcomp->texture_size.x = 0;
-        cap_xcomp->texture_size.y = 0;
-
-        cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&cap_xcomp->window_texture));
-        cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &cap_xcomp->texture_size.x);
-        cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &cap_xcomp->texture_size.y);
-        cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-        cap_xcomp->texture_size.x = min_int(video_codec_context->width, max_int(2, cap_xcomp->texture_size.x & ~1));
-        cap_xcomp->texture_size.y = min_int(video_codec_context->height, max_int(2, cap_xcomp->texture_size.y & ~1));
-
-        if(!cap_xcomp->created_hw_frame) {
-            cap_xcomp->created_hw_frame = true;
-            av_frame_free(frame);
-            *frame = av_frame_alloc();
-            if(!frame) {
-                fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_tick: failed to allocate frame\n");
-                cap_xcomp->should_stop = true;
-                cap_xcomp->stop_is_error = true;
-                return;
-            }
-            (*frame)->format = video_codec_context->pix_fmt;
-            (*frame)->width = video_codec_context->width;
-            (*frame)->height = video_codec_context->height;
-            (*frame)->color_range = video_codec_context->color_range;
-            (*frame)->color_primaries = video_codec_context->color_primaries;
-            (*frame)->color_trc = video_codec_context->color_trc;
-            (*frame)->colorspace = video_codec_context->colorspace;
-            (*frame)->chroma_location = video_codec_context->chroma_sample_location;
-
-            if(av_hwframe_get_buffer(video_codec_context->hw_frames_ctx, *frame, 0) < 0) {
-                fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_tick: av_hwframe_get_buffer failed\n");
-                cap_xcomp->should_stop = true;
-                cap_xcomp->stop_is_error = true;
-                return;
-            }
-        }
-
-        // Clear texture with black background because the source texture (window_texture_get_opengl_texture_id(&cap_xcomp->window_texture))
-        // might be smaller than cap_xcomp->target_texture_id
-        cap_xcomp->params.egl->glClearTexImage(cap_xcomp->target_texture_id, 0, GL_RGB, GL_UNSIGNED_BYTE, NULL);
-    }
-}
-
-static bool gsr_capture_xcomposite_cuda_should_stop(gsr_capture *cap, bool *err) {
-    gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
-    if(cap_xcomp->should_stop) {
-        if(err)
-            *err = cap_xcomp->stop_is_error;
-        return true;
-    }
-
-    if(err)
-        *err = false;
-    return false;
-}
-
-static int gsr_capture_xcomposite_cuda_capture(gsr_capture *cap, AVFrame *frame) {
-    gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
-
-    vec2i source_pos = { 0, 0 };
-    vec2i source_size = cap_xcomp->texture_size;
-
-    if(cap_xcomp->window_texture.texture_id != 0) {
-        while(cap_xcomp->params.egl->glGetError()) {}
-        /* TODO: Remove this copy, which is only possible by using nvenc directly and encoding window_pixmap.target_texture_id */
-        cap_xcomp->params.egl->glCopyImageSubData(
-            window_texture_get_opengl_texture_id(&cap_xcomp->window_texture), GL_TEXTURE_2D, 0, source_pos.x, source_pos.y, 0,
-            cap_xcomp->target_texture_id, GL_TEXTURE_2D, 0, 0, 0, 0,
-            source_size.x, source_size.y, 1);
-        unsigned int err = cap_xcomp->params.egl->glGetError();
-        if(err != 0) {
-            static bool error_shown = false;
-            if(!error_shown) {
-                error_shown = true;
-                fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_capture: glCopyImageSubData failed, gl error: %d\n", err);
-            }
-        }
-    }
-    cap_xcomp->params.egl->eglSwapBuffers(cap_xcomp->params.egl->egl_display, cap_xcomp->params.egl->egl_surface);
-
-    frame->linesize[0] = frame->width * 4;
-    //frame->linesize[0] = frame->width * 1;
-    //frame->linesize[1] = frame->width * 1;
-    //frame->linesize[2] = frame->width * 1;
-
-    CUDA_MEMCPY2D memcpy_struct;
-    memcpy_struct.srcXInBytes = 0;
-    memcpy_struct.srcY = 0;
-    memcpy_struct.srcMemoryType = CU_MEMORYTYPE_ARRAY;
-
-    memcpy_struct.dstXInBytes = 0;
-    memcpy_struct.dstY = 0;
-    memcpy_struct.dstMemoryType = CU_MEMORYTYPE_DEVICE;
-
-    memcpy_struct.srcArray = cap_xcomp->mapped_array;
-    memcpy_struct.dstDevice = (CUdeviceptr)frame->data[0];
-    memcpy_struct.dstPitch = frame->linesize[0];
-    memcpy_struct.WidthInBytes = frame->width * 4;//frame->width * 1;
-    memcpy_struct.Height = frame->height;
-    cap_xcomp->cuda.cuMemcpy2D_v2(&memcpy_struct);
-
-    //frame->data[1] = frame->data[0];
-    //frame->data[2] = frame->data[0];
-
-    return 0;
-}
-
-static void gsr_capture_xcomposite_cuda_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    if(cap->priv) {
-        gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
-        free(cap->priv);
-        cap->priv = NULL;
-    }
-    free(cap);
-}
-
-gsr_capture* gsr_capture_xcomposite_cuda_create(const gsr_capture_xcomposite_cuda_params *params) {
-    if(!params) {
-        fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_create params is NULL\n");
-        return NULL;
-    }
-
-    gsr_capture *cap = calloc(1, sizeof(gsr_capture));
-    if(!cap)
-        return NULL;
-
-    gsr_capture_xcomposite_cuda *cap_xcomp = calloc(1, sizeof(gsr_capture_xcomposite_cuda));
-    if(!cap_xcomp) {
-        free(cap);
-        return NULL;
-    }
-
-    cap_xcomp->params = *params;
-    
-    *cap = (gsr_capture) {
-        .start = gsr_capture_xcomposite_cuda_start,
-        .tick = gsr_capture_xcomposite_cuda_tick,
-        .should_stop = gsr_capture_xcomposite_cuda_should_stop,
-        .capture = gsr_capture_xcomposite_cuda_capture,
-        .capture_end = NULL,
-        .destroy = gsr_capture_xcomposite_cuda_destroy,
-        .priv = cap_xcomp
-    };
-
-    return cap;
-}
diff --git a/src/capture/xcomposite_vaapi.c b/src/capture/xcomposite_vaapi.c
deleted file mode 100644
index 2bc21a7..0000000
--- a/src/capture/xcomposite_vaapi.c
+++ /dev/null
@@ -1,517 +0,0 @@
-#include "../../include/capture/xcomposite_vaapi.h"
-#include "../../include/window_texture.h"
-#include "../../include/utils.h"
-#include "../../include/color_conversion.h"
-#include <stdlib.h>
-#include <stdio.h>
-#include <unistd.h>
-#include <assert.h>
-#include <X11/Xlib.h>
-#include <libavutil/hwcontext.h>
-#include <libavutil/hwcontext_vaapi.h>
-#include <libavutil/frame.h>
-#include <libavcodec/avcodec.h>
-#include <va/va.h>
-#include <va/va_drmcommon.h>
-
-typedef struct {
-    gsr_capture_xcomposite_vaapi_params params;
-    XEvent xev;
-
-    bool should_stop;
-    bool stop_is_error;
-    bool window_resized;
-    bool created_hw_frame;
-    bool follow_focused_initialized;
-
-    Window window;
-    vec2i window_size;
-    vec2i texture_size;
-    double window_resize_timer;
-    
-    WindowTexture window_texture;
-
-    VADisplay va_dpy;
-    VADRMPRIMESurfaceDescriptor prime;
-
-    unsigned int target_textures[2];
-
-    gsr_color_conversion color_conversion;
-
-    Atom net_active_window_atom;
-} gsr_capture_xcomposite_vaapi;
-
-static int max_int(int a, int b) {
-    return a > b ? a : b;
-}
-
-static int min_int(int a, int b) {
-    return a < b ? a : b;
-}
-
-static void gsr_capture_xcomposite_vaapi_stop(gsr_capture *cap, AVCodecContext *video_codec_context);
-
-static Window get_focused_window(Display *display, Atom net_active_window_atom) {
-    Atom type;
-    int format = 0;
-    unsigned long num_items = 0;
-    unsigned long bytes_after = 0;
-    unsigned char *properties = NULL;
-    if(XGetWindowProperty(display, DefaultRootWindow(display), net_active_window_atom, 0, 1024, False, AnyPropertyType, &type, &format, &num_items, &bytes_after, &properties) == Success && properties) {
-        Window focused_window = *(unsigned long*)properties;
-        XFree(properties);
-        return focused_window;
-    }
-    return None;
-}
-
-static bool drm_create_codec_context(gsr_capture_xcomposite_vaapi *cap_xcomp, AVCodecContext *video_codec_context) {
-    char render_path[128];
-    if(!gsr_card_path_get_render_path(cap_xcomp->params.card_path, render_path)) {
-        fprintf(stderr, "gsr error: failed to get /dev/dri/renderDXXX file from %s\n", cap_xcomp->params.card_path);
-        return false;
-    }
-
-    AVBufferRef *device_ctx;
-    if(av_hwdevice_ctx_create(&device_ctx, AV_HWDEVICE_TYPE_VAAPI, render_path, NULL, 0) < 0) {
-        fprintf(stderr, "Error: Failed to create hardware device context\n");
-        return false;
-    }
-
-    AVBufferRef *frame_context = av_hwframe_ctx_alloc(device_ctx);
-    if(!frame_context) {
-        fprintf(stderr, "Error: Failed to create hwframe context\n");
-        av_buffer_unref(&device_ctx);
-        return false;
-    }
-
-    AVHWFramesContext *hw_frame_context =
-        (AVHWFramesContext *)frame_context->data;
-    hw_frame_context->width = video_codec_context->width;
-    hw_frame_context->height = video_codec_context->height;
-    hw_frame_context->sw_format = AV_PIX_FMT_NV12;//AV_PIX_FMT_0RGB32;//AV_PIX_FMT_YUV420P;//AV_PIX_FMT_0RGB32;//AV_PIX_FMT_NV12;
-    hw_frame_context->format = video_codec_context->pix_fmt;
-    hw_frame_context->device_ref = device_ctx;
-    hw_frame_context->device_ctx = (AVHWDeviceContext*)device_ctx->data;
-
-    hw_frame_context->initial_pool_size = 1;
-
-    AVVAAPIDeviceContext *vactx =((AVHWDeviceContext*)device_ctx->data)->hwctx;
-    cap_xcomp->va_dpy = vactx->display;
-
-    if (av_hwframe_ctx_init(frame_context) < 0) {
-        fprintf(stderr, "Error: Failed to initialize hardware frame context "
-                        "(note: ffmpeg version needs to be > 4.0)\n");
-        av_buffer_unref(&device_ctx);
-        //av_buffer_unref(&frame_context);
-        return false;
-    }
-
-    video_codec_context->hw_device_ctx = av_buffer_ref(device_ctx);
-    video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
-    return true;
-}
-
-#define DRM_FORMAT_MOD_INVALID 72057594037927935
-
-static int gsr_capture_xcomposite_vaapi_start(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
-
-    if(cap_xcomp->params.follow_focused) {
-        cap_xcomp->net_active_window_atom = XInternAtom(cap_xcomp->params.dpy, "_NET_ACTIVE_WINDOW", False);
-        if(!cap_xcomp->net_active_window_atom) {
-            fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_start failed: failed to get _NET_ACTIVE_WINDOW atom\n");
-            return -1;
-        }
-        cap_xcomp->window = get_focused_window(cap_xcomp->params.dpy, cap_xcomp->net_active_window_atom);
-    } else {
-        cap_xcomp->window = cap_xcomp->params.window;
-    }
-
-    /* TODO: Do these in tick, and allow error if follow_focused */
-
-    XWindowAttributes attr;
-    if(!XGetWindowAttributes(cap_xcomp->params.dpy, cap_xcomp->params.window, &attr) && !cap_xcomp->params.follow_focused) {
-        fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_start failed: invalid window id: %lu\n", cap_xcomp->params.window);
-        return -1;
-    }
-
-    cap_xcomp->window_size.x = max_int(attr.width, 0);
-    cap_xcomp->window_size.y = max_int(attr.height, 0);
-
-    if(cap_xcomp->params.follow_focused)
-        XSelectInput(cap_xcomp->params.dpy, DefaultRootWindow(cap_xcomp->params.dpy), PropertyChangeMask);
-
-    // TODO: Get select and add these on top of it and then restore at the end. Also do the same in other xcomposite
-    XSelectInput(cap_xcomp->params.dpy, cap_xcomp->params.window, StructureNotifyMask | ExposureMask);
-
-    if(!cap_xcomp->params.egl->eglExportDMABUFImageQueryMESA) {
-        fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_start: could not find eglExportDMABUFImageQueryMESA\n");
-        return -1;
-    }
-
-    if(!cap_xcomp->params.egl->eglExportDMABUFImageMESA) {
-        fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_start: could not find eglExportDMABUFImageMESA\n");
-        return -1;
-    }
-
-    /* Disable vsync */
-    cap_xcomp->params.egl->eglSwapInterval(cap_xcomp->params.egl->egl_display, 0);
-    if(window_texture_init(&cap_xcomp->window_texture, cap_xcomp->params.dpy, cap_xcomp->params.window, cap_xcomp->params.egl) != 0 && !cap_xcomp->params.follow_focused) {
-        fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_start: failed to get window texture for window %ld\n", cap_xcomp->params.window);
-        return -1;
-    }
-
-    cap_xcomp->texture_size.x = 0;
-    cap_xcomp->texture_size.y = 0;
-
-    cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&cap_xcomp->window_texture));
-    cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &cap_xcomp->texture_size.x);
-    cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &cap_xcomp->texture_size.y);
-    cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-    cap_xcomp->texture_size.x = max_int(2, even_number_ceil(cap_xcomp->texture_size.x));
-    cap_xcomp->texture_size.y = max_int(2, even_number_ceil(cap_xcomp->texture_size.y));
-
-    video_codec_context->width = cap_xcomp->texture_size.x;
-    video_codec_context->height = cap_xcomp->texture_size.y;
-
-    if(cap_xcomp->params.region_size.x > 0 && cap_xcomp->params.region_size.y > 0) {
-        video_codec_context->width = max_int(2, even_number_ceil(cap_xcomp->params.region_size.x));
-        video_codec_context->height = max_int(2, even_number_ceil(cap_xcomp->params.region_size.y));
-    }
-
-    if(!drm_create_codec_context(cap_xcomp, video_codec_context)) {
-        gsr_capture_xcomposite_vaapi_stop(cap, video_codec_context);
-        return -1;
-    }
-
-    cap_xcomp->window_resize_timer = clock_get_monotonic_seconds();
-    return 0;
-}
-
-static uint32_t fourcc(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
-    return (d << 24) | (c << 16) | (b << 8) | a;
-}
-
-#define FOURCC_NV12 842094158
-
-static void gsr_capture_xcomposite_vaapi_tick(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame **frame) {
-    gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
-
-    // TODO:
-    cap_xcomp->params.egl->glClear(GL_COLOR_BUFFER_BIT);
-
-    bool init_new_window = false;
-    while(XPending(cap_xcomp->params.dpy)) {
-        XNextEvent(cap_xcomp->params.dpy, &cap_xcomp->xev);
-
-        switch(cap_xcomp->xev.type) {
-            case DestroyNotify: {
-                /* Window died (when not following focused window), so we stop recording */
-                if(!cap_xcomp->params.follow_focused && cap_xcomp->xev.xdestroywindow.window == cap_xcomp->window) {
-                    cap_xcomp->should_stop = true;
-                    cap_xcomp->stop_is_error = false;
-                }
-                break;
-            }
-            case Expose: {
-                /* Requires window texture recreate */
-                if(cap_xcomp->xev.xexpose.count == 0 && cap_xcomp->xev.xexpose.window == cap_xcomp->window) {
-                    cap_xcomp->window_resize_timer = clock_get_monotonic_seconds();
-                    cap_xcomp->window_resized = true;
-                }
-                break;
-            }
-            case ConfigureNotify: {
-                /* Window resized */
-                if(cap_xcomp->xev.xconfigure.window == cap_xcomp->window && (cap_xcomp->xev.xconfigure.width != cap_xcomp->window_size.x || cap_xcomp->xev.xconfigure.height != cap_xcomp->window_size.y)) {
-                    cap_xcomp->window_size.x = max_int(cap_xcomp->xev.xconfigure.width, 0);
-                    cap_xcomp->window_size.y = max_int(cap_xcomp->xev.xconfigure.height, 0);
-                    cap_xcomp->window_resize_timer = clock_get_monotonic_seconds();
-                    cap_xcomp->window_resized = true;
-                }
-                break;
-            }
-            case PropertyNotify: {
-                /* Focused window changed */
-                if(cap_xcomp->params.follow_focused && cap_xcomp->xev.xproperty.atom == cap_xcomp->net_active_window_atom) {
-                    init_new_window = true;
-                }
-                break;
-            }
-        }
-    }
-
-    if(cap_xcomp->params.follow_focused && !cap_xcomp->follow_focused_initialized) {
-        init_new_window = true;
-    }
-
-    if(init_new_window) {
-        Window focused_window = get_focused_window(cap_xcomp->params.dpy, cap_xcomp->net_active_window_atom);
-        if(focused_window != cap_xcomp->window || !cap_xcomp->follow_focused_initialized) {
-            cap_xcomp->follow_focused_initialized = true;
-            XSelectInput(cap_xcomp->params.dpy, cap_xcomp->window, 0);
-            cap_xcomp->window = focused_window;
-            XSelectInput(cap_xcomp->params.dpy, cap_xcomp->window, StructureNotifyMask | ExposureMask);
-
-            XWindowAttributes attr;
-            attr.width = 0;
-            attr.height = 0;
-            if(!XGetWindowAttributes(cap_xcomp->params.dpy, cap_xcomp->window, &attr))
-                fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_tick failed: invalid window id: %lu\n", cap_xcomp->window);
-
-            cap_xcomp->window_size.x = max_int(attr.width, 0);
-            cap_xcomp->window_size.y = max_int(attr.height, 0);
-            cap_xcomp->window_resized = true;
-
-            window_texture_deinit(&cap_xcomp->window_texture);
-            window_texture_init(&cap_xcomp->window_texture, cap_xcomp->params.dpy, cap_xcomp->window, cap_xcomp->params.egl); // TODO: Do not do the below window_texture_on_resize after this
-            
-            cap_xcomp->texture_size.x = 0;
-            cap_xcomp->texture_size.y = 0;
-
-            cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&cap_xcomp->window_texture));
-            cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &cap_xcomp->texture_size.x);
-            cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &cap_xcomp->texture_size.y);
-            cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-            cap_xcomp->texture_size.x = min_int(video_codec_context->width, max_int(2, even_number_ceil(cap_xcomp->texture_size.x)));
-            cap_xcomp->texture_size.y = min_int(video_codec_context->height, max_int(2, even_number_ceil(cap_xcomp->texture_size.y)));
-        }
-    }
-
-    const double window_resize_timeout = 1.0; // 1 second
-    if(!cap_xcomp->created_hw_frame || (cap_xcomp->window_resized && clock_get_monotonic_seconds() - cap_xcomp->window_resize_timer >= window_resize_timeout)) {
-        cap_xcomp->window_resized = false;
-
-        if(window_texture_on_resize(&cap_xcomp->window_texture) != 0) {
-            fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_tick: window_texture_on_resize failed\n");
-            //cap_xcomp->should_stop = true;
-            //cap_xcomp->stop_is_error = true;
-            return;
-        }
-
-        cap_xcomp->texture_size.x = 0;
-        cap_xcomp->texture_size.y = 0;
-
-        cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&cap_xcomp->window_texture));
-        cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &cap_xcomp->texture_size.x);
-        cap_xcomp->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &cap_xcomp->texture_size.y);
-        cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-
-        cap_xcomp->texture_size.x = min_int(video_codec_context->width, max_int(2, even_number_ceil(cap_xcomp->texture_size.x)));
-        cap_xcomp->texture_size.y = min_int(video_codec_context->height, max_int(2, even_number_ceil(cap_xcomp->texture_size.y)));
-
-        if(!cap_xcomp->created_hw_frame) {
-            cap_xcomp->created_hw_frame = true;
-            av_frame_free(frame);
-            *frame = av_frame_alloc();
-            if(!frame) {
-                fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_tick: failed to allocate frame\n");
-                cap_xcomp->should_stop = true;
-                cap_xcomp->stop_is_error = true;
-                return;
-            }
-            (*frame)->format = video_codec_context->pix_fmt;
-            (*frame)->width = video_codec_context->width;
-            (*frame)->height = video_codec_context->height;
-            (*frame)->color_range = video_codec_context->color_range;
-            (*frame)->color_primaries = video_codec_context->color_primaries;
-            (*frame)->color_trc = video_codec_context->color_trc;
-            (*frame)->colorspace = video_codec_context->colorspace;
-            (*frame)->chroma_location = video_codec_context->chroma_sample_location;
-
-            int res = av_hwframe_get_buffer(video_codec_context->hw_frames_ctx, *frame, 0);
-            if(res < 0) {
-                fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_tick: av_hwframe_get_buffer failed: %d\n", res);
-                cap_xcomp->should_stop = true;
-                cap_xcomp->stop_is_error = true;
-                return;
-            }
-
-            VASurfaceID target_surface_id = (uintptr_t)(*frame)->data[3];
-
-            VAStatus va_status = vaExportSurfaceHandle(cap_xcomp->va_dpy, target_surface_id, VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2, VA_EXPORT_SURFACE_READ_WRITE | VA_EXPORT_SURFACE_SEPARATE_LAYERS, &cap_xcomp->prime);
-            if(va_status != VA_STATUS_SUCCESS) {
-                fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_tick: vaExportSurfaceHandle failed, error: %d\n", va_status);
-                cap_xcomp->should_stop = true;
-                cap_xcomp->stop_is_error = true;
-                return;
-            }
-            vaSyncSurface(cap_xcomp->va_dpy, target_surface_id);
-
-            if(cap_xcomp->prime.fourcc == FOURCC_NV12) {
-                cap_xcomp->params.egl->glGenTextures(2, cap_xcomp->target_textures);
-                for(int i = 0; i < 2; ++i) {
-                    const uint32_t formats[2] = { fourcc('R', '8', ' ', ' '), fourcc('G', 'R', '8', '8') };
-                    const int layer = i;
-                    const int plane = 0;
-
-                    const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
-                    //const uint64_t modifier = cap_kms->prime.objects[cap_kms->prime.layers[layer].object_index[plane]].drm_format_modifier;
-
-                    const intptr_t img_attr[] = {
-                        EGL_LINUX_DRM_FOURCC_EXT,       formats[i],
-                        EGL_WIDTH,                      cap_xcomp->prime.width / div[i],
-                        EGL_HEIGHT,                     cap_xcomp->prime.height / div[i],
-                        EGL_DMA_BUF_PLANE0_FD_EXT,      cap_xcomp->prime.objects[cap_xcomp->prime.layers[layer].object_index[plane]].fd,
-                        EGL_DMA_BUF_PLANE0_OFFSET_EXT,  cap_xcomp->prime.layers[layer].offset[plane],
-                        EGL_DMA_BUF_PLANE0_PITCH_EXT,   cap_xcomp->prime.layers[layer].pitch[plane],
-                        // TODO:
-                        //EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT, modifier & 0xFFFFFFFFULL,
-                        //EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT, modifier >> 32ULL,
-                        EGL_NONE
-                    };
-
-                    while(cap_xcomp->params.egl->eglGetError() != EGL_SUCCESS){}
-                    EGLImage image = cap_xcomp->params.egl->eglCreateImage(cap_xcomp->params.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr);
-                    if(!image) {
-                        fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_tick: failed to create egl image from drm fd for output drm fd, error: %d\n", cap_xcomp->params.egl->eglGetError());
-                        cap_xcomp->should_stop = true;
-                        cap_xcomp->stop_is_error = true;
-                        return;
-                    }
-
-                    cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, cap_xcomp->target_textures[i]);
-                    cap_xcomp->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
-                    cap_xcomp->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
-                    cap_xcomp->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
-                    cap_xcomp->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
-
-                    while(cap_xcomp->params.egl->glGetError()) {}
-                    while(cap_xcomp->params.egl->eglGetError() != EGL_SUCCESS){}
-                    cap_xcomp->params.egl->glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
-                    if(cap_xcomp->params.egl->glGetError() != 0 || cap_xcomp->params.egl->eglGetError() != EGL_SUCCESS) {
-                        // TODO: Get the error properly
-                        fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_tick: failed to bind egl image to gl texture, error: %d\n", cap_xcomp->params.egl->eglGetError());
-                        cap_xcomp->should_stop = true;
-                        cap_xcomp->stop_is_error = true;
-                        cap_xcomp->params.egl->eglDestroyImage(cap_xcomp->params.egl->egl_display, image);
-                        cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-                        return;
-                    }
-
-                    cap_xcomp->params.egl->eglDestroyImage(cap_xcomp->params.egl->egl_display, image);
-                    cap_xcomp->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
-                }
-
-                gsr_color_conversion_params color_conversion_params = {0};
-                color_conversion_params.egl = cap_xcomp->params.egl;
-                color_conversion_params.source_color = GSR_SOURCE_COLOR_RGB;
-                color_conversion_params.destination_color = GSR_DESTINATION_COLOR_NV12;
-
-                color_conversion_params.destination_textures[0] = cap_xcomp->target_textures[0];
-                color_conversion_params.destination_textures[1] = cap_xcomp->target_textures[1];
-                color_conversion_params.num_destination_textures = 2;
-
-                if(gsr_color_conversion_init(&cap_xcomp->color_conversion, &color_conversion_params) != 0) {
-                    fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_tick: failed to create color conversion\n");
-                    cap_xcomp->should_stop = true;
-                    cap_xcomp->stop_is_error = true;
-                    return;
-                }
-            } else {
-                fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_tick: unexpected fourcc %u for output drm fd, expected nv12\n", cap_xcomp->prime.fourcc);
-                cap_xcomp->should_stop = true;
-                cap_xcomp->stop_is_error = true;
-                return;
-            }
-        }
-    }
-}
-
-static bool gsr_capture_xcomposite_vaapi_should_stop(gsr_capture *cap, bool *err) {
-    gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
-    if(cap_xcomp->should_stop) {
-        if(err)
-            *err = cap_xcomp->stop_is_error;
-        return true;
-    }
-
-    if(err)
-        *err = false;
-    return false;
-}
-
-static int gsr_capture_xcomposite_vaapi_capture(gsr_capture *cap, AVFrame *frame) {
-    (void)frame;
-    gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
-
-    float texture_rotation = 0.0f;
-    gsr_color_conversion_draw(&cap_xcomp->color_conversion, window_texture_get_opengl_texture_id(&cap_xcomp->window_texture),
-        (vec2i){0, 0}, cap_xcomp->texture_size,
-        (vec2i){0, 0}, cap_xcomp->texture_size,
-        texture_rotation);
-
-    cap_xcomp->params.egl->eglSwapBuffers(cap_xcomp->params.egl->egl_display, cap_xcomp->params.egl->egl_surface);
-
-    return 0;
-}
-
-static void gsr_capture_xcomposite_vaapi_stop(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
-
-    gsr_color_conversion_deinit(&cap_xcomp->color_conversion);
-
-    for(uint32_t i = 0; i < cap_xcomp->prime.num_objects; ++i) {
-        if(cap_xcomp->prime.objects[i].fd > 0) {
-            close(cap_xcomp->prime.objects[i].fd);
-            cap_xcomp->prime.objects[i].fd = 0;
-        }
-    }
-
-    if(cap_xcomp->params.egl->egl_context) {
-        cap_xcomp->params.egl->glDeleteTextures(2, cap_xcomp->target_textures);
-        cap_xcomp->target_textures[0] = 0;
-        cap_xcomp->target_textures[1] = 0;
-    }
-
-    window_texture_deinit(&cap_xcomp->window_texture);
-
-    if(video_codec_context->hw_device_ctx)
-        av_buffer_unref(&video_codec_context->hw_device_ctx);
-    if(video_codec_context->hw_frames_ctx)
-        av_buffer_unref(&video_codec_context->hw_frames_ctx);
-}
-
-static void gsr_capture_xcomposite_vaapi_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
-    (void)video_codec_context;
-    if(cap->priv) {
-        gsr_capture_xcomposite_vaapi_stop(cap, video_codec_context);
-        free(cap->priv);
-        cap->priv = NULL;
-    }
-    free(cap);
-}
-
-gsr_capture* gsr_capture_xcomposite_vaapi_create(const gsr_capture_xcomposite_vaapi_params *params) {
-    if(!params) {
-        fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_create params is NULL\n");
-        return NULL;
-    }
-
-    gsr_capture *cap = calloc(1, sizeof(gsr_capture));
-    if(!cap)
-        return NULL;
-
-    gsr_capture_xcomposite_vaapi *cap_xcomp = calloc(1, sizeof(gsr_capture_xcomposite_vaapi));
-    if(!cap_xcomp) {
-        free(cap);
-        return NULL;
-    }
-
-    cap_xcomp->params = *params;
-    
-    *cap = (gsr_capture) {
-        .start = gsr_capture_xcomposite_vaapi_start,
-        .tick = gsr_capture_xcomposite_vaapi_tick,
-        .should_stop = gsr_capture_xcomposite_vaapi_should_stop,
-        .capture = gsr_capture_xcomposite_vaapi_capture,
-        .capture_end = NULL,
-        .destroy = gsr_capture_xcomposite_vaapi_destroy,
-        .priv = cap_xcomp
-    };
-
-    return cap;
-}
diff --git a/src/capture/ximage.c b/src/capture/ximage.c
new file mode 100644
index 0000000..9b02907
--- /dev/null
+++ b/src/capture/ximage.c
@@ -0,0 +1,247 @@
+#include "../../include/capture/ximage.h"
+#include "../../include/utils.h"
+#include "../../include/cursor.h"
+#include "../../include/color_conversion.h"
+#include "../../include/window/window.h"
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <assert.h>
+
+#include <X11/Xlib.h>
+
+/* TODO: update when monitors are reconfigured */
+
+typedef struct {
+    gsr_capture_ximage_params params;
+    Display *display;
+    gsr_cursor cursor;
+    gsr_monitor monitor;
+    vec2i capture_pos;
+    vec2i capture_size;
+    unsigned int texture_id;
+    Window root_window;
+} gsr_capture_ximage;
+
+static void gsr_capture_ximage_stop(gsr_capture_ximage *self) {
+    gsr_cursor_deinit(&self->cursor);
+    if(self->texture_id) {
+        self->params.egl->glDeleteTextures(1, &self->texture_id);
+        self->texture_id = 0;
+    }
+}
+
+static int max_int(int a, int b) {
+    return a > b ? a : b;
+}
+
+static int gsr_capture_ximage_start(gsr_capture *cap, gsr_capture_metadata *capture_metadata) {
+    gsr_capture_ximage *self = cap->priv;
+    self->root_window = DefaultRootWindow(self->display);
+
+    if(gsr_cursor_init(&self->cursor, self->params.egl, self->display) != 0) {
+        gsr_capture_ximage_stop(self);
+        return -1;
+    }
+
+    if(!get_monitor_by_name(self->params.egl, GSR_CONNECTION_X11, self->params.display_to_capture, &self->monitor)) {
+        fprintf(stderr, "gsr error: gsr_capture_ximage_start: failed to find monitor by name \"%s\"\n", self->params.display_to_capture);
+        gsr_capture_ximage_stop(self);
+        return -1;
+    }
+
+    self->capture_pos = self->monitor.pos;
+    self->capture_size = self->monitor.size;
+
+    if(self->params.region_size.x > 0 && self->params.region_size.y > 0)
+        self->capture_size = self->params.region_size;
+
+    if(self->params.output_resolution.x > 0 && self->params.output_resolution.y > 0) {
+        self->params.output_resolution = scale_keep_aspect_ratio(self->capture_size, self->params.output_resolution);
+        capture_metadata->width = self->params.output_resolution.x;
+        capture_metadata->height = self->params.output_resolution.y;
+    } else if(self->params.region_size.x > 0 && self->params.region_size.y > 0) {
+        capture_metadata->width = self->params.region_size.x;
+        capture_metadata->height = self->params.region_size.y;
+    } else {
+        capture_metadata->width = self->capture_size.x;
+        capture_metadata->height = self->capture_size.y;
+    }
+
+    self->texture_id = gl_create_texture(self->params.egl, self->capture_size.x, self->capture_size.y, GL_RGB8, GL_RGB, GL_LINEAR);
+    if(self->texture_id == 0) {
+        fprintf(stderr, "gsr error: gsr_capture_ximage_start: failed to create texture\n");
+        gsr_capture_ximage_stop(self);
+        return -1;
+    }
+
+    return 0;
+}
+
+static void gsr_capture_ximage_on_event(gsr_capture *cap, gsr_egl *egl) {
+    gsr_capture_ximage *self = cap->priv;
+    XEvent *xev = gsr_window_get_event_data(egl->window);
+    gsr_cursor_on_event(&self->cursor, xev);
+}
+
+static bool gsr_capture_ximage_upload_to_texture(gsr_capture_ximage *self, int x, int y, int width, int height) {
+    const int max_width = XWidthOfScreen(DefaultScreenOfDisplay(self->display));
+    const int max_height = XHeightOfScreen(DefaultScreenOfDisplay(self->display));
+
+    if(x < 0)
+        x = 0;
+    else if(x >= max_width)
+        x = max_width - 1;
+
+    if(y < 0)
+        y = 0;
+    else if(y >= max_height)
+        y = max_height - 1;
+
+    if(width < 0)
+        width = 0;
+    else if(x + width >= max_width)
+        width = max_width - x;
+
+    if(height < 0)
+        height = 0;
+    else if(y + height >= max_height)
+        height = max_height - y;
+
+    XImage *image = XGetImage(self->display, self->root_window, x, y, width, height, AllPlanes, ZPixmap);
+    if(!image) {
+        fprintf(stderr, "gsr error: gsr_capture_ximage_upload_to_texture: XGetImage failed\n");
+        return false;
+    }
+
+    bool success = false;
+    uint8_t *image_data = malloc(image->width * image->height * 3);
+    if(!image_data) {
+        fprintf(stderr, "gsr error: gsr_capture_ximage_upload_to_texture: failed to allocate image data\n");
+        goto done;
+    }
+
+    for(int y = 0; y < image->height; ++y) {
+        for(int x = 0; x < image->width; ++x) {
+            unsigned long pixel = XGetPixel(image, x, y);
+            unsigned char red = (pixel & image->red_mask) >> 16;
+            unsigned char green = (pixel & image->green_mask) >> 8;
+            unsigned char blue = pixel & image->blue_mask;
+
+            const size_t texture_data_index = (x + y * image->width) * 3;
+            image_data[texture_data_index + 0] = red;
+            image_data[texture_data_index + 1] = green;
+            image_data[texture_data_index + 2] = blue;
+        }
+    }
+
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, self->texture_id);
+    self->params.egl->glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, image->width, image->height, GL_RGB, GL_UNSIGNED_BYTE, image_data);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+    success = true;
+
+    done:
+    free(image_data);
+    XDestroyImage(image);
+    return success;
+}
+
+static int gsr_capture_ximage_capture(gsr_capture *cap, gsr_capture_metadata *capture_metadata, gsr_color_conversion *color_conversion) {
+    gsr_capture_ximage *self = cap->priv;
+
+    const bool is_scaled = self->params.output_resolution.x > 0 && self->params.output_resolution.y > 0;
+    vec2i output_size = is_scaled ? self->params.output_resolution : self->capture_size;
+    output_size = scale_keep_aspect_ratio(self->capture_size, output_size);
+
+    const vec2i target_pos = { max_int(0, capture_metadata->width / 2 - output_size.x / 2), max_int(0, capture_metadata->height / 2 - output_size.y / 2) };
+    gsr_capture_ximage_upload_to_texture(self, self->capture_pos.x + self->params.region_position.x, self->capture_pos.y + self->params.region_position.y, self->capture_size.x, self->capture_size.y);
+
+    gsr_color_conversion_draw(color_conversion, self->texture_id,
+        target_pos, output_size,
+        (vec2i){0, 0}, self->capture_size, self->capture_size,
+        GSR_ROT_0, GSR_SOURCE_COLOR_RGB, false, false);
+
+    if(self->params.record_cursor && self->cursor.visible) {
+        const vec2d scale = {
+            self->capture_size.x == 0 ? 0 : (double)output_size.x / (double)self->capture_size.x,
+            self->capture_size.y == 0 ? 0 : (double)output_size.y / (double)self->capture_size.y
+        };
+
+        gsr_cursor_tick(&self->cursor, self->root_window);
+
+        const vec2i cursor_pos = {
+            target_pos.x + (self->cursor.position.x - self->cursor.hotspot.x) * scale.x - self->capture_pos.x - self->params.region_position.x,
+            target_pos.y + (self->cursor.position.y - self->cursor.hotspot.y) * scale.y - self->capture_pos.y - self->params.region_position.y
+        };
+
+        self->params.egl->glEnable(GL_SCISSOR_TEST);
+        self->params.egl->glScissor(target_pos.x, target_pos.y, output_size.x, output_size.y);
+
+        gsr_color_conversion_draw(color_conversion, self->cursor.texture_id,
+            cursor_pos, (vec2i){self->cursor.size.x * scale.x, self->cursor.size.y * scale.y},
+            (vec2i){0, 0}, self->cursor.size, self->cursor.size,
+            GSR_ROT_0, GSR_SOURCE_COLOR_RGB, false, true);
+
+        self->params.egl->glDisable(GL_SCISSOR_TEST);
+    }
+
+    self->params.egl->glFlush();
+    self->params.egl->glFinish();
+
+    return 0;
+}
+
+static void gsr_capture_ximage_destroy(gsr_capture *cap) {
+    gsr_capture_ximage *self = cap->priv;
+    if(cap->priv) {
+        gsr_capture_ximage_stop(self);
+        free((void*)self->params.display_to_capture);
+        self->params.display_to_capture = NULL;
+        free(self);
+        cap->priv = NULL;
+    }
+    free(cap);
+}
+
+gsr_capture* gsr_capture_ximage_create(const gsr_capture_ximage_params *params) {
+    if(!params) {
+        fprintf(stderr, "gsr error: gsr_capture_ximage_create params is NULL\n");
+        return NULL;
+    }
+
+    gsr_capture *cap = calloc(1, sizeof(gsr_capture));
+    if(!cap)
+        return NULL;
+
+    gsr_capture_ximage *cap_ximage = calloc(1, sizeof(gsr_capture_ximage));
+    if(!cap_ximage) {
+        free(cap);
+        return NULL;
+    }
+
+    const char *display_to_capture = strdup(params->display_to_capture);
+    if(!display_to_capture) {
+        free(cap);
+        free(cap_ximage);
+        return NULL;
+    }
+
+    cap_ximage->params = *params;
+    cap_ximage->display = gsr_window_get_display(params->egl->window);
+    cap_ximage->params.display_to_capture = display_to_capture;
+    
+    *cap = (gsr_capture) {
+        .start = gsr_capture_ximage_start,
+        .on_event = gsr_capture_ximage_on_event,
+        .tick = NULL,
+        .should_stop = NULL,
+        .capture = gsr_capture_ximage_capture,
+        .uses_external_image = NULL,
+        .get_window_id = NULL,
+        .destroy = gsr_capture_ximage_destroy,
+        .priv = cap_ximage
+    };
+
+    return cap;
+}
diff --git a/src/codec_query/nvenc.c b/src/codec_query/nvenc.c
new file mode 100644
index 0000000..0501851
--- /dev/null
+++ b/src/codec_query/nvenc.c
@@ -0,0 +1,235 @@
+#include "../../include/codec_query/nvenc.h"
+#include "../../include/cuda.h"
+#include "../../external/nvEncodeAPI.h"
+
+#include <dlfcn.h>
+#include <stdio.h>
+#include <string.h>
+
+static void* open_nvenc_library(void) {
+    dlerror(); /* clear */
+    void *lib = dlopen("libnvidia-encode.so.1", RTLD_LAZY);
+    if(!lib) {
+        lib = dlopen("libnvidia-encode.so", RTLD_LAZY);
+        if(!lib) {
+            fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc failed: failed to load libnvidia-encode.so/libnvidia-encode.so.1, error: %s\n", dlerror());
+            return NULL;
+        }
+    }
+    return lib;
+}
+
+static bool profile_is_h264(const GUID *profile_guid) {
+    const GUID *h264_guids[] = {
+        &NV_ENC_H264_PROFILE_BASELINE_GUID,
+        &NV_ENC_H264_PROFILE_MAIN_GUID,
+        &NV_ENC_H264_PROFILE_HIGH_GUID,
+        &NV_ENC_H264_PROFILE_PROGRESSIVE_HIGH_GUID,
+        &NV_ENC_H264_PROFILE_CONSTRAINED_HIGH_GUID
+    };
+
+    for(int i = 0; i < 5; ++i) {
+        if(memcmp(profile_guid, h264_guids[i], sizeof(GUID)) == 0)
+            return true;
+    }
+
+    return false;
+}
+
+static bool profile_is_hevc(const GUID *profile_guid) {
+    const GUID *hevc_guids[] = {
+        &NV_ENC_HEVC_PROFILE_MAIN_GUID,
+    };
+
+    for(int i = 0; i < 1; ++i) {
+        if(memcmp(profile_guid, hevc_guids[i], sizeof(GUID)) == 0)
+            return true;
+    }
+
+    return false;
+}
+
+static bool profile_is_hevc_10bit(const GUID *profile_guid) {
+    const GUID *hevc_10bit_guids[] = {
+        &NV_ENC_HEVC_PROFILE_MAIN10_GUID,
+    };
+
+    for(int i = 0; i < 1; ++i) {
+        if(memcmp(profile_guid, hevc_10bit_guids[i], sizeof(GUID)) == 0)
+            return true;
+    }
+
+    return false;
+}
+
+static bool profile_is_av1(const GUID *profile_guid) {
+    const GUID *av1_guids[] = {
+        &NV_ENC_AV1_PROFILE_MAIN_GUID,
+    };
+
+    for(int i = 0; i < 1; ++i) {
+        if(memcmp(profile_guid, av1_guids[i], sizeof(GUID)) == 0)
+            return true;
+    }
+
+    return false;
+}
+
+static bool encoder_get_supported_profiles(const NV_ENCODE_API_FUNCTION_LIST *function_list, void *nvenc_encoder, const GUID *encoder_guid, gsr_supported_video_codecs *supported_video_codecs) {
+    bool success = false;
+    GUID *profile_guids = NULL;
+
+    uint32_t profile_guid_count = 0;
+    if(function_list->nvEncGetEncodeProfileGUIDCount(nvenc_encoder, *encoder_guid, &profile_guid_count) != NV_ENC_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: nvEncGetEncodeProfileGUIDCount failed, error: %s\n", function_list->nvEncGetLastErrorString(nvenc_encoder));
+        goto fail;
+    }
+
+    if(profile_guid_count == 0)
+        goto fail;
+
+    profile_guids = calloc(profile_guid_count, sizeof(GUID));
+    if(!profile_guids) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: failed to allocate %d guids\n", (int)profile_guid_count);
+        goto fail;
+    }
+
+    if(function_list->nvEncGetEncodeProfileGUIDs(nvenc_encoder, *encoder_guid, profile_guids, profile_guid_count, &profile_guid_count) != NV_ENC_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: nvEncGetEncodeProfileGUIDs failed, error: %s\n", function_list->nvEncGetLastErrorString(nvenc_encoder));
+        goto fail;
+    }
+
+    for(uint32_t i = 0; i < profile_guid_count; ++i) {
+        if(profile_is_h264(&profile_guids[i])) {
+            supported_video_codecs->h264 = (gsr_supported_video_codec){ true, false };
+        } else if(profile_is_hevc(&profile_guids[i])) {
+            supported_video_codecs->hevc = (gsr_supported_video_codec){ true, false };
+        } else if(profile_is_hevc_10bit(&profile_guids[i])) {
+            supported_video_codecs->hevc_hdr = (gsr_supported_video_codec){ true, false };
+            supported_video_codecs->hevc_10bit = (gsr_supported_video_codec){ true, false };
+        } else if(profile_is_av1(&profile_guids[i])) {
+            supported_video_codecs->av1 = (gsr_supported_video_codec){ true, false };
+            supported_video_codecs->av1_hdr = (gsr_supported_video_codec){ true, false };
+            supported_video_codecs->av1_10bit = (gsr_supported_video_codec){ true, false };
+        }
+    }
+
+    success = true;
+    fail:
+
+    if(profile_guids)
+        free(profile_guids);
+
+    return success;
+}
+
+static bool get_supported_video_codecs(const NV_ENCODE_API_FUNCTION_LIST *function_list, void *nvenc_encoder, gsr_supported_video_codecs *supported_video_codecs) {
+    bool success = false;
+    GUID *encoder_guids = NULL;
+    *supported_video_codecs = (gsr_supported_video_codecs){0};
+
+    uint32_t encode_guid_count = 0;
+    if(function_list->nvEncGetEncodeGUIDCount(nvenc_encoder, &encode_guid_count) != NV_ENC_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: nvEncGetEncodeGUIDCount failed, error: %s\n", function_list->nvEncGetLastErrorString(nvenc_encoder));
+        goto fail;
+    }
+
+    if(encode_guid_count == 0)
+        goto fail;
+
+    encoder_guids = calloc(encode_guid_count, sizeof(GUID));
+    if(!encoder_guids) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: failed to allocate %d guids\n", (int)encode_guid_count);
+        goto fail;
+    }
+
+    if(function_list->nvEncGetEncodeGUIDs(nvenc_encoder, encoder_guids, encode_guid_count, &encode_guid_count) != NV_ENC_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: nvEncGetEncodeGUIDs failed, error: %s\n", function_list->nvEncGetLastErrorString(nvenc_encoder));
+        goto fail;
+    }
+
+    for(uint32_t i = 0; i < encode_guid_count; ++i) {
+        encoder_get_supported_profiles(function_list, nvenc_encoder, &encoder_guids[i], supported_video_codecs);
+    }
+
+    success = true;
+    fail:
+
+    if(encoder_guids)
+        free(encoder_guids);
+
+    return success;
+}
+
+#define NVENCAPI_VERSION_470 (11 | (1 << 24))
+#define NVENCAPI_STRUCT_VERSION_470(ver) ((uint32_t)NVENCAPI_VERSION_470 | ((ver)<<16) | (0x7 << 28))
+
+bool gsr_get_supported_video_codecs_nvenc(gsr_supported_video_codecs *video_codecs, bool cleanup) {
+    memset(video_codecs, 0, sizeof(*video_codecs));
+
+    bool success = false;
+    void *nvenc_lib = NULL;
+    void *nvenc_encoder = NULL;
+    gsr_cuda cuda;
+    memset(&cuda, 0, sizeof(cuda));
+
+    if(!gsr_cuda_load(&cuda, NULL, false)) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: failed to load cuda\n");
+        goto done;
+    }
+
+    nvenc_lib = open_nvenc_library();
+    if(!nvenc_lib)
+        goto done;
+
+    typedef NVENCSTATUS NVENCAPI (*FUNC_NvEncodeAPICreateInstance)(NV_ENCODE_API_FUNCTION_LIST *functionList);
+    FUNC_NvEncodeAPICreateInstance nvEncodeAPICreateInstance = (FUNC_NvEncodeAPICreateInstance)dlsym(nvenc_lib, "NvEncodeAPICreateInstance");
+    if(!nvEncodeAPICreateInstance) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: failed to find NvEncodeAPICreateInstance in libnvidia-encode.so\n");
+        goto done;
+    }
+
+    NV_ENCODE_API_FUNCTION_LIST function_list;
+    memset(&function_list, 0, sizeof(function_list));
+    function_list.version = NVENCAPI_STRUCT_VERSION(2);
+    if(nvEncodeAPICreateInstance(&function_list) != NV_ENC_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: nvEncodeAPICreateInstance failed\n");
+        goto done;
+    }
+
+    NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS params;
+    memset(&params, 0, sizeof(params));
+    params.version = NVENCAPI_STRUCT_VERSION(1);
+    params.deviceType = NV_ENC_DEVICE_TYPE_CUDA;
+    params.device = cuda.cu_ctx;
+    params.apiVersion = NVENCAPI_VERSION;
+    if(function_list.nvEncOpenEncodeSessionEx(&params, &nvenc_encoder) != NV_ENC_SUCCESS) {
+        // Old NVIDIA GPUs don't support the new NVENC API (which is required for AV1).
+        // In such cases fall back to the old API version if possible and try again.
+        function_list.version = NVENCAPI_STRUCT_VERSION_470(2);
+        if(nvEncodeAPICreateInstance(&function_list) != NV_ENC_SUCCESS) {
+            fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: nvEncodeAPICreateInstance (retry) failed\n");
+            goto done;
+        }
+
+        params.version = NVENCAPI_STRUCT_VERSION_470(1);
+        params.apiVersion = NVENCAPI_VERSION_470;
+        if(function_list.nvEncOpenEncodeSessionEx(&params, &nvenc_encoder) != NV_ENC_SUCCESS) {
+            fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_nvenc: nvEncOpenEncodeSessionEx (retry) failed\n");
+            goto done;
+        }
+    }
+
+    success = get_supported_video_codecs(&function_list, nvenc_encoder, video_codecs);
+
+    done:
+    if(cleanup) {
+        if(nvenc_encoder)
+            function_list.nvEncDestroyEncoder(nvenc_encoder);
+        if(nvenc_lib)
+            dlclose(nvenc_lib);
+        gsr_cuda_unload(&cuda);
+    }
+
+    return success;
+}
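Review note on the `NVENCAPI_VERSION_470` fallback above: the two macros pack version numbers into a single 32-bit word. The sketch below reproduces that bit layout so the resulting constants can be checked by hand. The function names are hypothetical and not part of the patch. Reading `11` and `1` as major and minor version is an assumption based on how the macro is shaped.

```c
#include <stdint.h>

/* Hypothetical helper mirroring NVENCAPI_VERSION_470: major | (minor << 24). */
uint32_t nvenc_api_version(uint32_t major, uint32_t minor) {
    return major | (minor << 24);
}

/* Hypothetical helper mirroring NVENCAPI_STRUCT_VERSION_470(ver):
 * api_version | (ver << 16) | (0x7 << 28). */
uint32_t nvenc_struct_version(uint32_t api_version, uint32_t struct_ver) {
    return api_version | (struct_ver << 16) | (UINT32_C(0x7) << 28);
}
```

With these, `nvenc_api_version(11, 1)` is `0x0100000B`. The function-list version used in the retry (`ver = 2`) is `0x7102000B`, and the session-params version (`ver = 1`) is `0x7101000B`.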
diff --git a/src/codec_query/vaapi.c b/src/codec_query/vaapi.c
new file mode 100644
index 0000000..8930a6c
--- /dev/null
+++ b/src/codec_query/vaapi.c
@@ -0,0 +1,203 @@
+#include "../../include/codec_query/vaapi.h"
+#include "../../include/utils.h"
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <va/va.h>
+#include <va/va_drm.h>
+
+static bool profile_is_h264(VAProfile profile) {
+    switch(profile) {
+        case 5: // VAProfileH264Baseline
+        case VAProfileH264Main:
+        case VAProfileH264High:
+        case VAProfileH264ConstrainedBaseline:
+            return true;
+        default:
+            return false;
+    }
+}
+
+static bool profile_is_hevc_8bit(VAProfile profile) {
+    switch(profile) {
+        case VAProfileHEVCMain:
+            return true;
+        default:
+            return false;
+    }
+}
+
+static bool profile_is_hevc_10bit(VAProfile profile) {
+    switch(profile) {
+        case VAProfileHEVCMain10:
+        //case VAProfileHEVCMain12:
+        //case VAProfileHEVCMain422_10:
+        //case VAProfileHEVCMain422_12:
+        //case VAProfileHEVCMain444:
+        //case VAProfileHEVCMain444_10:
+        //case VAProfileHEVCMain444_12:
+            return true;
+        default:
+            return false;
+    }
+}
+
+static bool profile_is_av1(VAProfile profile) {
+    switch(profile) {
+        case VAProfileAV1Profile0:
+        case VAProfileAV1Profile1:
+            return true;
+        default:
+            return false;
+    }
+}
+
+static bool profile_is_vp8(VAProfile profile) {
+    switch(profile) {
+        case VAProfileVP8Version0_3:
+            return true;
+        default:
+            return false;
+    }
+}
+
+static bool profile_is_vp9(VAProfile profile) {
+    switch(profile) {
+        case VAProfileVP9Profile0:
+        case VAProfileVP9Profile1:
+        case VAProfileVP9Profile2:
+        case VAProfileVP9Profile3:
+            return true;
+        default:
+            return false;
+    }
+}
+
+static bool profile_supports_video_encoding(VADisplay va_dpy, VAProfile profile, bool *low_power) {
+    *low_power = false;
+    int num_entrypoints = vaMaxNumEntrypoints(va_dpy);
+    if(num_entrypoints <= 0)
+        return false;
+
+    VAEntrypoint *entrypoint_list = calloc(num_entrypoints, sizeof(VAEntrypoint));
+    if(!entrypoint_list)
+        return false;
+
+    bool supports_encoding = false;
+    bool supports_low_power_encoding = false;
+    if(vaQueryConfigEntrypoints(va_dpy, profile, entrypoint_list, &num_entrypoints) == VA_STATUS_SUCCESS) {
+        for(int i = 0; i < num_entrypoints; ++i) {
+            if(entrypoint_list[i] == VAEntrypointEncSlice)
+                supports_encoding = true;
+            else if(entrypoint_list[i] == VAEntrypointEncSliceLP)
+                supports_low_power_encoding = true;
+        }
+    }
+
+    if(!supports_encoding && supports_low_power_encoding)
+        *low_power = true;
+
+    free(entrypoint_list);
+    return supports_encoding || supports_low_power_encoding;
+}
+
+static bool get_supported_video_codecs(VADisplay va_dpy, gsr_supported_video_codecs *video_codecs, bool cleanup) {
+    *video_codecs = (gsr_supported_video_codecs){0};
+    bool success = false;
+    VAProfile *profile_list = NULL;
+
+    vaSetInfoCallback(va_dpy, NULL, NULL);
+
+    int va_major = 0;
+    int va_minor = 0;
+    if(vaInitialize(va_dpy, &va_major, &va_minor) != VA_STATUS_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vaapi: vaInitialize failed\n");
+        return false;
+    }
+
+    int num_profiles = vaMaxNumProfiles(va_dpy);
+    if(num_profiles <= 0)
+        goto fail;
+
+    profile_list = calloc(num_profiles, sizeof(VAProfile));
+    if(!profile_list || vaQueryConfigProfiles(va_dpy, profile_list, &num_profiles) != VA_STATUS_SUCCESS)
+        goto fail;
+
+    for(int i = 0; i < num_profiles; ++i) {
+        bool low_power = false;
+        if(profile_is_h264(profile_list[i])) {
+            if(profile_supports_video_encoding(va_dpy, profile_list[i], &low_power)) {
+                video_codecs->h264 = (gsr_supported_video_codec){ true, low_power };
+            }
+        } else if(profile_is_hevc_8bit(profile_list[i])) {
+            if(profile_supports_video_encoding(va_dpy, profile_list[i], &low_power))
+                video_codecs->hevc = (gsr_supported_video_codec){ true, low_power };
+        } else if(profile_is_hevc_10bit(profile_list[i])) {
+            if(profile_supports_video_encoding(va_dpy, profile_list[i], &low_power)) {
+                video_codecs->hevc_hdr = (gsr_supported_video_codec){ true, low_power };
+                video_codecs->hevc_10bit = (gsr_supported_video_codec){ true, low_power };
+            }
+        } else if(profile_is_av1(profile_list[i])) {
+            if(profile_supports_video_encoding(va_dpy, profile_list[i], &low_power)) {
+                video_codecs->av1 = (gsr_supported_video_codec){ true, low_power };
+                video_codecs->av1_hdr = (gsr_supported_video_codec){ true, low_power };
+                video_codecs->av1_10bit = (gsr_supported_video_codec){ true, low_power };
+            }
+        } else if(profile_is_vp8(profile_list[i])) {
+            if(profile_supports_video_encoding(va_dpy, profile_list[i], &low_power))
+                video_codecs->vp8 = (gsr_supported_video_codec){ true, low_power };
+        } else if(profile_is_vp9(profile_list[i])) {
+            if(profile_supports_video_encoding(va_dpy, profile_list[i], &low_power))
+                video_codecs->vp9 = (gsr_supported_video_codec){ true, low_power };
+        }
+    }
+
+    success = true;
+    fail:
+    if(profile_list)
+        free(profile_list);
+
+    if(cleanup)
+        vaTerminate(va_dpy);
+
+    return success;
+}
+
+bool gsr_get_supported_video_codecs_vaapi(gsr_supported_video_codecs *video_codecs, const char *card_path, bool cleanup) {
+    memset(video_codecs, 0, sizeof(*video_codecs));
+    bool success = false;
+    int drm_fd = -1;
+
+    char render_path[128];
+    if(!gsr_card_path_get_render_path(card_path, render_path)) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vaapi: failed to get /dev/dri/renderDXXX file from %s\n", card_path);
+        goto done;
+    }
+
+    drm_fd = open(render_path, O_RDWR);
+    if(drm_fd == -1) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vaapi: failed to open device %s\n", render_path);
+        goto done;
+    }
+
+    VADisplay va_dpy = vaGetDisplayDRM(drm_fd);
+    if(va_dpy) {
+        if(!get_supported_video_codecs(va_dpy, video_codecs, cleanup)) {
+            fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vaapi: failed to query supported video codecs for device %s\n", render_path);
+            goto done;
+        }
+        success = true;
+    }
+
+    done:
+    if(cleanup) {
+        if(drm_fd != -1)
+            close(drm_fd);
+    }
+
+    return success;
+}
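Review note on `profile_supports_video_encoding` above: `low_power` is set only when the low-power entrypoint is the sole encode entrypoint for a profile. If both are present, the regular encoder wins. The sketch below isolates that decision without libva. The `entrypoint` enum and the function name are hypothetical stand-ins for `VAEntrypoint`, `VAEntrypointEncSlice` and `VAEntrypointEncSliceLP`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libva entrypoint values used in the patch. */
typedef enum { EP_DECODE, EP_ENC_SLICE, EP_ENC_SLICE_LP } entrypoint;

/* Returns true if any encode entrypoint is listed.
 * *low_power is true only when the low-power entrypoint is the only one. */
bool classify_encode_support(const entrypoint *eps, size_t n, bool *low_power) {
    bool normal = false;
    bool lp = false;
    for(size_t i = 0; i < n; ++i) {
        if(eps[i] == EP_ENC_SLICE)
            normal = true;
        else if(eps[i] == EP_ENC_SLICE_LP)
            lp = true;
    }
    *low_power = !normal && lp;
    return normal || lp;
}
```

Because the caller overwrites the codec entry on every matching profile, the last matching profile in the list decides the reported `low_power` value for that codec.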
diff --git a/src/codec_query/vulkan.c b/src/codec_query/vulkan.c
new file mode 100644
index 0000000..15dd98b
--- /dev/null
+++ b/src/codec_query/vulkan.c
@@ -0,0 +1,156 @@
+#include "../../include/codec_query/vulkan.h"
+
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <xf86drm.h>
+#define VK_NO_PROTOTYPES
+//#include <vulkan/vulkan.h>
+
+#define MAX_PHYSICAL_DEVICES 32
+
+static const char *required_device_extensions[] = {
+    "VK_KHR_external_memory_fd",
+    "VK_KHR_external_semaphore_fd",
+    "VK_KHR_video_encode_queue",
+    "VK_KHR_video_queue",
+    "VK_KHR_video_maintenance1",
+    "VK_EXT_external_memory_dma_buf",
+    "VK_EXT_external_memory_host",
+    "VK_EXT_image_drm_format_modifier"
+};
+static int num_required_device_extensions = 8;
+
+bool gsr_get_supported_video_codecs_vulkan(gsr_supported_video_codecs *video_codecs, const char *card_path, bool cleanup) {
+    memset(video_codecs, 0, sizeof(*video_codecs));
+#if 0
+    bool success = false;
+    VkInstance instance = NULL;
+    VkPhysicalDevice physical_devices[MAX_PHYSICAL_DEVICES];
+    VkDevice device = NULL;
+    VkExtensionProperties *device_extensions = NULL;
+
+    const VkApplicationInfo app_info = {
+        .sType = VK_STRUCTURE_TYPE_APPLICATION_INFO,
+        .pApplicationName = "GPU Screen Recorder",
+        .applicationVersion = VK_MAKE_VERSION(1, 0, 0),
+        .pEngineName = "GPU Screen Recorder",
+        .engineVersion = VK_MAKE_VERSION(1, 0, 0),
+        .apiVersion = VK_API_VERSION_1_3,
+    };
+
+    const VkInstanceCreateInfo instance_create_info = {
+        .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
+        .pApplicationInfo = &app_info
+    };
+
+    if(vkCreateInstance(&instance_create_info, NULL, &instance) != VK_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vulkan: vkCreateInstance failed\n");
+        goto done;
+    }
+
+    uint32_t num_devices = 0;
+    if(vkEnumeratePhysicalDevices(instance, &num_devices, NULL) != VK_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vulkan: vkEnumeratePhysicalDevices (query num devices) failed\n");
+        goto done;
+    }
+
+    if(num_devices == 0) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vulkan: no vulkan capable device found\n");
+        goto done;
+    }
+
+    if(num_devices > MAX_PHYSICAL_DEVICES)
+        num_devices = MAX_PHYSICAL_DEVICES;
+    
+    if(vkEnumeratePhysicalDevices(instance, &num_devices, physical_devices) != VK_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vulkan: vkEnumeratePhysicalDevices (get data) failed\n");
+        goto done;
+    }
+
+    VkPhysicalDevice physical_device = NULL;
+    char device_card_path[128];
+    for(uint32_t i = 0; i < num_devices; ++i) {
+        VkPhysicalDeviceDrmPropertiesEXT device_drm_properties = {
+            .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DRM_PROPERTIES_EXT
+        };
+
+        VkPhysicalDeviceProperties2 device_properties = {
+            .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2,
+            .pNext = &device_drm_properties
+        };
+        vkGetPhysicalDeviceProperties2(physical_devices[i], &device_properties);
+
+        if(!device_drm_properties.hasPrimary)
+            continue;
+
+        snprintf(device_card_path, sizeof(device_card_path), DRM_DEV_NAME, DRM_DIR_NAME, (int)device_drm_properties.primaryMinor);
+        if(strcmp(device_card_path, card_path) == 0) {
+            physical_device = physical_devices[i];
+            break;
+        }
+    }
+
+    if(!physical_device) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vulkan: failed to find a vulkan device that matches opengl device %s\n", card_path);
+        goto done;
+    }
+
+    const VkDeviceCreateInfo device_create_info = {
+        .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
+        .enabledExtensionCount = num_required_device_extensions,
+        .ppEnabledExtensionNames = required_device_extensions
+    };
+
+    if(vkCreateDevice(physical_device, &device_create_info, NULL, &device) != VK_SUCCESS) {
+        //fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vulkan: vkCreateDevice failed. Device %s likely doesn't support vulkan video encoding\n", card_path);
+        goto done;
+    }
+
+    uint32_t num_device_extensions = 0;
+    if(vkEnumerateDeviceExtensionProperties(physical_device, NULL, &num_device_extensions, NULL) != VK_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vulkan: vkEnumerateDeviceExtensionProperties (query num device extensions) failed\n");
+        goto done;
+    }
+
+    device_extensions = calloc(num_device_extensions, sizeof(VkExtensionProperties));
+    if(!device_extensions) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vulkan: failed to allocate %d device extensions\n", (int)num_device_extensions);
+        goto done;
+    }
+
+    if(vkEnumerateDeviceExtensionProperties(physical_device, NULL, &num_device_extensions, device_extensions) != VK_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_get_supported_video_codecs_vulkan: vkEnumerateDeviceExtensionProperties (get data) failed\n");
+        goto done;
+    }
+
+    for(uint32_t i = 0; i < num_device_extensions; ++i) {
+        if(strcmp(device_extensions[i].extensionName, "VK_KHR_video_encode_h264") == 0) {
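+            video_codecs->h264 = (gsr_supported_video_codec){ true, false };
+        } else if(strcmp(device_extensions[i].extensionName, "VK_KHR_video_encode_h265") == 0) {
+            // TODO: Verify if 10bit and hdr are actually supported
+            video_codecs->hevc = (gsr_supported_video_codec){ true, false };
+            video_codecs->hevc_10bit = (gsr_supported_video_codec){ true, false };
+            video_codecs->hevc_hdr = (gsr_supported_video_codec){ true, false };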
+        }
+    }
+
+    success = true;
+
+    done:
+    if(cleanup) {
+        if(device)
+            vkDestroyDevice(device, NULL);
+        if(instance)
+            vkDestroyInstance(instance, NULL);
+    }
+    if(device_extensions)
+        free(device_extensions);
+    return success;
+#else
+    // TODO: Low power query
+    video_codecs->h264 = (gsr_supported_video_codec){ true, false };
+    video_codecs->hevc = (gsr_supported_video_codec){ true, false };
+    return true;
+#endif
+}
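Review note for the `src/color_conversion.c` change that follows: its comments describe the limited-range matrices as the full-range ones multiplied by (235-16)/255, with 16/255 added to luma. The sketch below checks that arithmetic against the BT.2020 pair (`RGB_TO_P010_FULL` and `RGB_TO_P010_LIMITED`). The function names are hypothetical and not part of the patch.

```c
/* Scale one full-range matrix coefficient to limited range: coeff * (235 - 16) / 255. */
double full_to_limited(double coeff) {
    return coeff * (235.0 - 16.0) / 255.0;
}

/* Offset added to luma in limited range: 16 / 255. */
double limited_luma_offset(void) {
    return 16.0 / 255.0;
}
```

For example, the full-range red luma weight 0.2627 scales to about 0.225613 and the 0.5 chroma weight to about 0.429412. The luma offset is about 0.062745, matching the last column of the limited matrix. The BT.709 pair is not related the same way: per its comments the limited matrix starts from 0.21/0.71/0.07, while the full matrix uses 0.211/0.711/0.071.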
diff --git a/src/color_conversion.c b/src/color_conversion.c
index 821ae52..23b166e 100644
--- a/src/color_conversion.c
+++ b/src/color_conversion.c
@@ -1,144 +1,428 @@
 #include "../include/color_conversion.h"
+#include "../include/egl.h"
 #include <stdio.h>
-#include <string.h>
 #include <math.h>
+#include <string.h>
 #include <assert.h>
 
-#define MAX_SHADERS 2
-#define MAX_FRAMEBUFFERS 2
+#define COMPUTE_SHADER_INDEX_Y                  0
+#define COMPUTE_SHADER_INDEX_UV                 1
+#define COMPUTE_SHADER_INDEX_Y_EXTERNAL         2
+#define COMPUTE_SHADER_INDEX_UV_EXTERNAL        3
+#define COMPUTE_SHADER_INDEX_RGB                4
+#define COMPUTE_SHADER_INDEX_RGB_EXTERNAL       5
+#define COMPUTE_SHADER_INDEX_Y_BLEND            6
+#define COMPUTE_SHADER_INDEX_UV_BLEND           7
+#define COMPUTE_SHADER_INDEX_Y_EXTERNAL_BLEND   8
+#define COMPUTE_SHADER_INDEX_UV_EXTERNAL_BLEND  9
+#define COMPUTE_SHADER_INDEX_RGB_BLEND          10
+#define COMPUTE_SHADER_INDEX_RGB_EXTERNAL_BLEND 11
+
+#define GRAPHICS_SHADER_INDEX_Y                  0
+#define GRAPHICS_SHADER_INDEX_UV                 1
+#define GRAPHICS_SHADER_INDEX_Y_EXTERNAL         2
+#define GRAPHICS_SHADER_INDEX_UV_EXTERNAL        3
+#define GRAPHICS_SHADER_INDEX_RGB                4
+#define GRAPHICS_SHADER_INDEX_RGB_EXTERNAL       5
+
+/* https://en.wikipedia.org/wiki/YCbCr, see study/color_space_transform_matrix.png */
+
+/* ITU-R BT2020, full */
+/* https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.2020-2-201510-I!!PDF-E.pdf */
+#define RGB_TO_P010_FULL "const mat4 RGBtoYUV = mat4(0.262700, -0.139630,  0.500000, 0.000000,\n" \
+                         "                           0.678000, -0.360370, -0.459786, 0.000000,\n" \
+                         "                           0.059300,  0.500000, -0.040214, 0.000000,\n" \
+                         "                           0.000000,  0.500000,  0.500000, 1.000000);\n"
+
+/* ITU-R BT2020, limited (full multiplied by (235-16)/255, adding 16/255 to luma) */
+#define RGB_TO_P010_LIMITED "const mat4 RGBtoYUV = mat4(0.225613, -0.119918,  0.429412, 0.000000,\n" \
+                            "                           0.582282, -0.309494, -0.394875, 0.000000,\n" \
+                            "                           0.050928,  0.429412, -0.034537, 0.000000,\n" \
+                            "                           0.062745,  0.500000,  0.500000, 1.000000);\n"
+
+/* ITU-R BT709, full, custom values: 0.2110 0.7110 0.0710 */
+/* https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.709-6-201506-I!!PDF-E.pdf */
+#define RGB_TO_NV12_FULL "const mat4 RGBtoYUV = mat4(0.211000, -0.113563,  0.500000, 0.000000,\n" \
+                         "                           0.711000, -0.382670, -0.450570, 0.000000,\n" \
+                         "                           0.071000,  0.500000, -0.044994, 0.000000,\n" \
+                         "                           0.000000,  0.500000,  0.500000, 1.000000);\n"
+
+/* ITU-R BT709, limited, custom values: 0.2100 0.7100 0.0700 (full multiplied by (235-16)/255, adding 16/255 to luma) */
+#define RGB_TO_NV12_LIMITED "const mat4 RGBtoYUV = mat4(0.180353, -0.096964,  0.429412, 0.000000,\n" \
+                            "                           0.609765, -0.327830, -0.385927, 0.000000,\n" \
+                            "                           0.060118,  0.429412, -0.038049, 0.000000,\n" \
+                            "                           0.062745,  0.500000,  0.500000, 1.000000);\n"
 
-static float abs_f(float v) {
-    return v >= 0.0f ? v : -v;
+static int max_int(int a, int b) {
+    return a > b ? a : b;
+}
+
+static const char* color_format_range_get_transform_matrix(gsr_destination_color color_format, gsr_color_range color_range) {
+    switch(color_format) {
+        case GSR_DESTINATION_COLOR_NV12: {
+            switch(color_range) {
+                case GSR_COLOR_RANGE_LIMITED:
+                    return RGB_TO_NV12_LIMITED;
+                case GSR_COLOR_RANGE_FULL:
+                    return RGB_TO_NV12_FULL;
+            }
+            break;
+        }
+        case GSR_DESTINATION_COLOR_P010: {
+            switch(color_range) {
+                case GSR_COLOR_RANGE_LIMITED:
+                    return RGB_TO_P010_LIMITED;
+                case GSR_COLOR_RANGE_FULL:
+                    return RGB_TO_P010_FULL;
+            }
+            break;
+        }
+        case GSR_DESTINATION_COLOR_RGB8:
+            return "";
+        default:
+            return NULL;
+    }
+    return NULL;
 }
 
-#define ROTATE_Z   "mat4 rotate_z(in float angle) {\n"                        \
-                   "    return mat4(cos(angle), -sin(angle), 0.0, 0.0,\n"     \
-                   "                sin(angle),  cos(angle), 0.0, 0.0,\n"     \
-                   "                0.0,           0.0,      1.0, 0.0,\n"     \
-                   "                0.0,           0.0,      0.0, 1.0);\n"    \
-                   "}\n"
+static void get_compute_shader_header(char *header, size_t header_size, bool external_texture) {
+    if(external_texture) {
+        snprintf(header, header_size,
+            "#version 310 es\n"
+            "#extension GL_OES_EGL_image_external : enable\n"
+            "#extension GL_OES_EGL_image_external_essl3 : require\n"
+            "layout(binding = 0) uniform highp samplerExternalOES img_input;\n"
+            "layout(binding = 1) uniform highp sampler2D img_background;\n");
+    } else {
+        snprintf(header, header_size,
+            "#version 310 es\n"
+            "layout(binding = 0) uniform highp sampler2D img_input;\n"
+            "layout(binding = 1) uniform highp sampler2D img_background;\n");
+    }
+}
 
-/* BT709 limited */
-#define RGB_TO_YUV "const mat4 RGBtoYUV = mat4(0.1826, -0.1006,  0.4392, 0.0,\n" \
-                   "                           0.6142, -0.3386, -0.3989, 0.0,\n" \
-                   "                           0.0620,  0.4392, -0.0403, 0.0,\n" \
-                   "                           0.0625,  0.5000,  0.5000, 1.0);"
+static int load_compute_shader_y(gsr_shader *shader, gsr_egl *egl, gsr_color_compute_uniforms *uniforms, int max_local_size_dim, gsr_destination_color color_format, gsr_color_range color_range, bool external_texture, bool alpha_blending) {
+    const char *color_transform_matrix = color_format_range_get_transform_matrix(color_format, color_range);
 
-static int load_shader_bgr(gsr_shader *shader, gsr_egl *egl, int *rotation_uniform) {
-    char vertex_shader[2048];
-    snprintf(vertex_shader, sizeof(vertex_shader),
-        "#version 300 es                                   \n"
-        "in vec2 pos;                                      \n"
-        "in vec2 texcoords;                                \n"
-        "out vec2 texcoords_out;                           \n"
-        "uniform float rotation;                           \n"
-        ROTATE_Z
-        "void main()                                       \n"
-        "{                                                 \n"
-        "  texcoords_out = texcoords;                      \n"
-        "  gl_Position = vec4(pos.x, pos.y, 0.0, 1.0) * rotate_z(rotation);    \n"
-        "}                                                 \n");
+    char header[512];
+    get_compute_shader_header(header, sizeof(header), external_texture);
 
-    char fragment_shader[] =
-        "#version 300 es                                                                 \n"
-        "precision mediump float;                                                        \n"
-        "in vec2 texcoords_out;                                                          \n"
-        "uniform sampler2D tex1;                                                         \n"
-        "out vec4 FragColor;                                                             \n"
-        "void main()                                                                     \n"
-        "{                                                                               \n"
-        "  FragColor = texture(tex1, texcoords_out).bgra;                                 \n"
-        "}                                                                               \n";
-
-    if(gsr_shader_init(shader, egl, vertex_shader, fragment_shader) != 0)
+    char compute_shader[4096];
+    snprintf(compute_shader, sizeof(compute_shader),
+        "%s"
+        "layout (local_size_x = %d, local_size_y = %d, local_size_z = 1) in;\n"
+        "precision highp float;\n"
+        "uniform ivec2 source_position;\n"
+        "uniform ivec2 target_position;\n"
+        "uniform vec2 scale;\n"
+        "uniform mat2 rotation_matrix;\n"
+        "layout(rgba8, binding = 0) writeonly uniform highp image2D img_output;\n"
+        "%s"
+        "void main() {\n"
+        "    ivec2 texel_coord = ivec2(gl_GlobalInvocationID.xy);\n"
+        "    ivec2 size = ivec2(vec2(textureSize(img_input, 0)) * scale + 0.5);\n"
+        "    ivec2 size_shift = size >> 1;\n" // size/2
+        "    ivec2 output_size = textureSize(img_background, 0);\n"
+        "    vec2 rotated_texel_coord = vec2(texel_coord - source_position - size_shift) * rotation_matrix + vec2(size_shift) + 0.5;\n"
+        "    vec2 output_texel_coord = vec2(texel_coord - source_position + target_position) + 0.5;\n"
+        "    vec2 source_color_coords = rotated_texel_coord/vec2(size);\n"
+        "    vec4 source_color = texture(img_input, source_color_coords);\n"
+        "    if(source_color_coords.x > 1.0 || source_color_coords.y > 1.0)\n"
+        "        source_color.rgba = vec4(0.0, 0.0, 0.0, %s);\n"
+        "    vec4 source_color_yuv = RGBtoYUV * vec4(source_color.rgb, 1.0);\n"
+        "    vec4 output_color_yuv = %s;\n"
+        "    float y_color = mix(output_color_yuv.r, source_color_yuv.r, source_color.a);\n"
+        "    imageStore(img_output, texel_coord + target_position, vec4(y_color, 1.0, 1.0, 1.0));\n"
+        "}\n", header, max_local_size_dim, max_local_size_dim, color_transform_matrix,
+            alpha_blending ? "0.0" : "1.0",
+            alpha_blending ? "texture(img_background, output_texel_coord/vec2(output_size))" : "source_color_yuv");
+
+    if(gsr_shader_init(shader, egl, NULL, NULL, compute_shader) != 0)
         return -1;
 
-    gsr_shader_bind_attribute_location(shader, "pos", 0);
-    gsr_shader_bind_attribute_location(shader, "texcoords", 1);
-    *rotation_uniform = egl->glGetUniformLocation(shader->program_id, "rotation");
+    uniforms->source_position = egl->glGetUniformLocation(shader->program_id, "source_position");
+    uniforms->target_position = egl->glGetUniformLocation(shader->program_id, "target_position");
+    uniforms->rotation_matrix = egl->glGetUniformLocation(shader->program_id, "rotation_matrix");
+    uniforms->scale = egl->glGetUniformLocation(shader->program_id, "scale");
     return 0;
 }
 
-static int load_shader_y(gsr_shader *shader, gsr_egl *egl, int *rotation_uniform) {
+static int load_compute_shader_uv(gsr_shader *shader, gsr_egl *egl, gsr_color_compute_uniforms *uniforms, int max_local_size_dim, gsr_destination_color color_format, gsr_color_range color_range, bool external_texture, bool alpha_blending) {
+    const char *color_transform_matrix = color_format_range_get_transform_matrix(color_format, color_range);
+
+    char header[512];
+    get_compute_shader_header(header, sizeof(header), external_texture);
+
+    char compute_shader[4096];
+    snprintf(compute_shader, sizeof(compute_shader),
+        "%s"
+        "layout (local_size_x = %d, local_size_y = %d, local_size_z = 1) in;\n"
+        "precision highp float;\n"
+        "uniform ivec2 source_position;\n"
+        "uniform ivec2 target_position;\n"
+        "uniform vec2 scale;\n"
+        "uniform mat2 rotation_matrix;\n"
+        "layout(rgba8, binding = 0) writeonly uniform highp image2D img_output;\n"
+        "%s"
+        "void main() {\n"
+        "    ivec2 texel_coord = ivec2(gl_GlobalInvocationID.xy);\n"
+        "    ivec2 size = ivec2(vec2(textureSize(img_input, 0)) * scale + 0.5);\n"
+        "    ivec2 size_shift = size >> 2;\n" // size/4
+        "    ivec2 output_size = textureSize(img_background, 0);\n"
+        "    vec2 rotated_texel_coord = vec2(texel_coord - source_position - size_shift) * rotation_matrix + vec2(size_shift) + 0.5;\n"
+        "    vec2 output_texel_coord = vec2(texel_coord - source_position + target_position) + 0.5;\n"
+        "    vec2 source_color_coords = rotated_texel_coord/vec2(size>>1);\n" // size/2
+        "    vec4 source_color = texture(img_input, source_color_coords);\n"
+        "    if(source_color_coords.x > 1.0 || source_color_coords.y > 1.0)\n"
+        "        source_color.rgba = vec4(0.0, 0.0, 0.0, %s);\n"
+        "    vec4 source_color_yuv = RGBtoYUV * vec4(source_color.rgb, 1.0);\n"
+        "    vec4 output_color_yuv = %s;\n"
+        "    vec2 uv_color = mix(output_color_yuv.rg, source_color_yuv.gb, source_color.a);\n"
+        "    imageStore(img_output, texel_coord + target_position, vec4(uv_color, 1.0, 1.0));\n"
+        "}\n", header, max_local_size_dim, max_local_size_dim, color_transform_matrix,
+            alpha_blending ? "0.0" : "1.0",
+            alpha_blending ? "texture(img_background, output_texel_coord/vec2(output_size))" : "source_color_yuv");
+
+    if(gsr_shader_init(shader, egl, NULL, NULL, compute_shader) != 0)
+        return -1;
+
+    uniforms->source_position = egl->glGetUniformLocation(shader->program_id, "source_position");
+    uniforms->target_position = egl->glGetUniformLocation(shader->program_id, "target_position");
+    uniforms->rotation_matrix = egl->glGetUniformLocation(shader->program_id, "rotation_matrix");
+    uniforms->scale = egl->glGetUniformLocation(shader->program_id, "scale");
+    return 0;
+}
+
+static int load_compute_shader_rgb(gsr_shader *shader, gsr_egl *egl, gsr_color_compute_uniforms *uniforms, int max_local_size_dim, bool external_texture, bool alpha_blending) {
+    char header[512];
+    get_compute_shader_header(header, sizeof(header), external_texture);
+
+    char compute_shader[4096];
+    snprintf(compute_shader, sizeof(compute_shader),
+        "%s"
+        "layout (local_size_x = %d, local_size_y = %d, local_size_z = 1) in;\n"
+        "precision highp float;\n"
+        "uniform ivec2 source_position;\n"
+        "uniform ivec2 target_position;\n"
+        "uniform vec2 scale;\n"
+        "uniform mat2 rotation_matrix;\n"
+        "layout(rgba8, binding = 0) writeonly uniform highp image2D img_output;\n"
+        "void main() {\n"
+        "    ivec2 texel_coord = ivec2(gl_GlobalInvocationID.xy);\n"
+        "    ivec2 size = ivec2(vec2(textureSize(img_input, 0)) * scale + 0.5);\n"
+        "    ivec2 size_shift = size >> 1;\n" // size/2
+        "    ivec2 output_size = textureSize(img_background, 0);\n"
+        "    vec2 rotated_texel_coord = vec2(texel_coord - source_position - size_shift) * rotation_matrix + vec2(size_shift) + 0.5;\n"
+        "    vec2 output_texel_coord = vec2(texel_coord - source_position + target_position) + 0.5;\n"
+        "    vec2 source_color_coords = rotated_texel_coord/vec2(size);\n"
+        "    vec4 source_color = texture(img_input, source_color_coords);\n"
+        "    if(source_color_coords.x > 1.0 || source_color_coords.y > 1.0)\n"
+        "        source_color.rgba = vec4(0.0, 0.0, 0.0, %s);\n"
+        "    vec4 output_color = %s;\n"
+        "    vec3 color = mix(output_color.rgb, source_color.rgb, source_color.a);\n"
+        "    imageStore(img_output, texel_coord + target_position, vec4(color, 1.0));\n"
+        "}\n", header, max_local_size_dim, max_local_size_dim,
+            alpha_blending ? "0.0" : "1.0",
+            alpha_blending ? "texture(img_background, output_texel_coord/vec2(output_size))" : "source_color");
+
+    if(gsr_shader_init(shader, egl, NULL, NULL, compute_shader) != 0)
+        return -1;
+
+    uniforms->source_position = egl->glGetUniformLocation(shader->program_id, "source_position");
+    uniforms->target_position = egl->glGetUniformLocation(shader->program_id, "target_position");
+    uniforms->rotation_matrix = egl->glGetUniformLocation(shader->program_id, "rotation_matrix");
+    uniforms->scale = egl->glGetUniformLocation(shader->program_id, "scale");
+    return 0;
+}
+
+static int load_graphics_shader_y(gsr_shader *shader, gsr_egl *egl, gsr_color_graphics_uniforms *uniforms, gsr_destination_color color_format, gsr_color_range color_range, bool external_texture) {
+    const char *color_transform_matrix = color_format_range_get_transform_matrix(color_format, color_range);
+
     char vertex_shader[2048];
     snprintf(vertex_shader, sizeof(vertex_shader),
         "#version 300 es                                   \n"
         "in vec2 pos;                                      \n"
         "in vec2 texcoords;                                \n"
         "out vec2 texcoords_out;                           \n"
+        "uniform vec2 offset;                              \n"
         "uniform float rotation;                           \n"
-        ROTATE_Z
+        "uniform mat2 rotation_matrix;                     \n"
         "void main()                                       \n"
         "{                                                 \n"
-        "  texcoords_out = texcoords;                      \n"
-        "  gl_Position = vec4(pos.x, pos.y, 0.0, 1.0) * rotate_z(rotation);    \n"
+        "  texcoords_out = vec2(texcoords.x - 0.5, texcoords.y - 0.5) * rotation_matrix + vec2(0.5, 0.5);  \n"
+        "  gl_Position = vec4(offset.x, offset.y, 0.0, 0.0) + vec4(pos.x, pos.y, 0.0, 1.0);    \n"
         "}                                                 \n");
 
-    char fragment_shader[] =
-        "#version 300 es                                                                 \n"
-        "precision mediump float;                                                        \n"
-        "in vec2 texcoords_out;                                                          \n"
-        "uniform sampler2D tex1;                                                         \n"
-        "out vec4 FragColor;                                                             \n"
-        RGB_TO_YUV
-        "void main()                                                                     \n"
-        "{                                                                               \n"
-        "  vec4 pixel = texture(tex1, texcoords_out);                                    \n"
-        "  FragColor.x = (RGBtoYUV * vec4(pixel.rgb, 1.0)).x;                            \n"
-        "  FragColor.w = pixel.a;                                                        \n"
-        "}                                                                               \n";
-
-    if(gsr_shader_init(shader, egl, vertex_shader, fragment_shader) != 0)
+    /* Fragment shader body shared by the external and regular texture variants */
+    const char *main_code =
+            "  vec4 pixel = texture(tex1, texcoords_out);                                    \n"
+            "  FragColor.x = (RGBtoYUV * vec4(pixel.rgb, 1.0)).x;                            \n"
+            "  FragColor.w = pixel.a;                                                        \n";
+
+    char fragment_shader[2048];
+    if(external_texture) {
+        snprintf(fragment_shader, sizeof(fragment_shader),
+            "#version 300 es                                                                 \n"
+            "#extension GL_OES_EGL_image_external : enable                                   \n"
+            "#extension GL_OES_EGL_image_external_essl3 : require                            \n"
+            "precision highp float;                                                        \n"
+            "in vec2 texcoords_out;                                                          \n"
+            "uniform samplerExternalOES tex1;                                                \n"
+            "out vec4 FragColor;                                                             \n"
+            "%s"
+            "void main()                                                                     \n"
+            "{                                                                               \n"
+            "%s"
+            "}                                                                               \n", color_transform_matrix, main_code);
+    } else {
+        snprintf(fragment_shader, sizeof(fragment_shader),
+            "#version 300 es                                                                 \n"
+            "precision highp float;                                                        \n"
+            "in vec2 texcoords_out;                                                          \n"
+            "uniform sampler2D tex1;                                                         \n"
+            "out vec4 FragColor;                                                             \n"
+            "%s"
+            "void main()                                                                     \n"
+            "{                                                                               \n"
+            "%s"
+            "}                                                                               \n", color_transform_matrix, main_code);
+    }
+
+    if(gsr_shader_init(shader, egl, vertex_shader, fragment_shader, NULL) != 0)
         return -1;
 
     gsr_shader_bind_attribute_location(shader, "pos", 0);
     gsr_shader_bind_attribute_location(shader, "texcoords", 1);
-    *rotation_uniform = egl->glGetUniformLocation(shader->program_id, "rotation");
+    uniforms->offset = egl->glGetUniformLocation(shader->program_id, "offset");
+    uniforms->rotation_matrix = egl->glGetUniformLocation(shader->program_id, "rotation_matrix");
     return 0;
 }
 
-static unsigned int load_shader_uv(gsr_shader *shader, gsr_egl *egl, int *rotation_uniform) {
+static int load_graphics_shader_uv(gsr_shader *shader, gsr_egl *egl, gsr_color_graphics_uniforms *uniforms, gsr_destination_color color_format, gsr_color_range color_range, bool external_texture) {
+    const char *color_transform_matrix = color_format_range_get_transform_matrix(color_format, color_range);
+
     char vertex_shader[2048];
     snprintf(vertex_shader, sizeof(vertex_shader),
         "#version 300 es                                 \n"
         "in vec2 pos;                                    \n"
         "in vec2 texcoords;                              \n"
         "out vec2 texcoords_out;                         \n"
+        "uniform vec2 offset;                            \n"
         "uniform float rotation;                         \n"
-        ROTATE_Z
+        "uniform mat2 rotation_matrix;                   \n"
         "void main()                                     \n"
         "{                                               \n"
-        "  texcoords_out = texcoords;                    \n"
-        "  gl_Position = vec4(pos.x, pos.y, 0.0, 1.0) * rotate_z(rotation) * vec4(0.5, 0.5, 1.0, 1.0) - vec4(0.5, 0.5, 0.0, 0.0);   \n"
+        "  texcoords_out = vec2(texcoords.x - 0.5, texcoords.y - 0.5) * rotation_matrix + vec2(0.5, 0.5);                      \n"
+        "  gl_Position = (vec4(offset.x, offset.y, 0.0, 0.0) + vec4(pos.x, pos.y, 0.0, 1.0)) * vec4(0.5, 0.5, 1.0, 1.0) - vec4(0.5, 0.5, 0.0, 0.0);   \n"
         "}                                               \n");
 
-    char fragment_shader[] =
-        "#version 300 es                                                                       \n"
-        "precision mediump float;                                                              \n"
-        "in vec2 texcoords_out;                                                                \n"
-        "uniform sampler2D tex1;                                                               \n"
-        "out vec4 FragColor;                                                                   \n"
-        RGB_TO_YUV
-        "void main()                                                                           \n"
-        "{                                                                                     \n"
-        "  vec4 pixel = texture(tex1, texcoords_out);                                          \n"
-        "  FragColor.xy = (RGBtoYUV * vec4(pixel.rgb, 1.0)).yz;                                \n"
-        "  FragColor.w = pixel.a;                                                              \n"
-        "}                                                                                     \n";
-
-    if(gsr_shader_init(shader, egl, vertex_shader, fragment_shader) != 0)
+    /* Fragment shader body shared by the external and regular texture variants */
+    const char *main_code =
+            "  vec4 pixel = texture(tex1, texcoords_out);                                          \n"
+            "  FragColor.xy = (RGBtoYUV * vec4(pixel.rgb, 1.0)).yz;                                \n"
+            "  FragColor.w = pixel.a;                                                              \n";
+
+    char fragment_shader[2048];
+    if(external_texture) {
+        snprintf(fragment_shader, sizeof(fragment_shader),
+            "#version 300 es                                                                       \n"
+            "#extension GL_OES_EGL_image_external : enable                                         \n"
+            "#extension GL_OES_EGL_image_external_essl3 : require                                  \n"
+            "precision highp float;                                                              \n"
+            "in vec2 texcoords_out;                                                                \n"
+            "uniform samplerExternalOES tex1;                                                      \n"
+            "out vec4 FragColor;                                                                   \n"
+            "%s"
+            "void main()                                                                           \n"
+            "{                                                                                     \n"
+            "%s"
+            "}                                                                                     \n", color_transform_matrix, main_code);
+    } else {
+        snprintf(fragment_shader, sizeof(fragment_shader),
+            "#version 300 es                                                                       \n"
+            "precision highp float;                                                              \n"
+            "in vec2 texcoords_out;                                                                \n"
+            "uniform sampler2D tex1;                                                               \n"
+            "out vec4 FragColor;                                                                   \n"
+            "%s"
+            "void main()                                                                           \n"
+            "{                                                                                     \n"
+            "%s"
+            "}                                                                                     \n", color_transform_matrix, main_code);
+    }
+
+    if(gsr_shader_init(shader, egl, vertex_shader, fragment_shader, NULL) != 0)
+        return -1;
+
+    gsr_shader_bind_attribute_location(shader, "pos", 0);
+    gsr_shader_bind_attribute_location(shader, "texcoords", 1);
+    uniforms->offset = egl->glGetUniformLocation(shader->program_id, "offset");
+    uniforms->rotation_matrix = egl->glGetUniformLocation(shader->program_id, "rotation_matrix");
+    return 0;
+}
+
+static int load_graphics_shader_rgb(gsr_shader *shader, gsr_egl *egl, gsr_color_graphics_uniforms *uniforms, bool external_texture) {
+    char vertex_shader[2048];
+    snprintf(vertex_shader, sizeof(vertex_shader),
+        "#version 300 es                                   \n"
+        "in vec2 pos;                                      \n"
+        "in vec2 texcoords;                                \n"
+        "out vec2 texcoords_out;                           \n"
+        "uniform vec2 offset;                              \n"
+        "uniform float rotation;                           \n"
+        "uniform mat2 rotation_matrix;                     \n"
+        "void main()                                       \n"
+        "{                                                 \n"
+        "  texcoords_out = vec2(texcoords.x - 0.5, texcoords.y - 0.5) * rotation_matrix + vec2(0.5, 0.5);  \n"
+        "  gl_Position = vec4(offset.x, offset.y, 0.0, 0.0) + vec4(pos.x, pos.y, 0.0, 1.0);    \n"
+        "}                                                 \n");
+
+    /* Fragment shader body shared by the external and regular texture variants */
+    const char *main_code =
+            "  vec4 pixel = texture(tex1, texcoords_out);                                          \n"
+            "  FragColor = pixel;                                                                  \n";
+
+    char fragment_shader[2048];
+    if(external_texture) {
+        snprintf(fragment_shader, sizeof(fragment_shader),
+            "#version 300 es                                                                       \n"
+            "#extension GL_OES_EGL_image_external : enable                                         \n"
+            "#extension GL_OES_EGL_image_external_essl3 : require                                  \n"
+            "precision highp float;                                                              \n"
+            "in vec2 texcoords_out;                                                                \n"
+            "uniform samplerExternalOES tex1;                                                      \n"
+            "out vec4 FragColor;                                                                   \n"
+            "void main()                                                                           \n"
+            "{                                                                                     \n"
+            "%s"
+            "}                                                                                     \n", main_code);
+    } else {
+        snprintf(fragment_shader, sizeof(fragment_shader),
+            "#version 300 es                                                                       \n"
+            "precision highp float;                                                              \n"
+            "in vec2 texcoords_out;                                                                \n"
+            "uniform sampler2D tex1;                                                               \n"
+            "out vec4 FragColor;                                                                   \n"
+            "void main()                                                                           \n"
+            "{                                                                                     \n"
+            "%s"
+            "}                                                                                     \n", main_code);
+    }
+
+    if(gsr_shader_init(shader, egl, vertex_shader, fragment_shader, NULL) != 0)
         return -1;
 
     gsr_shader_bind_attribute_location(shader, "pos", 0);
     gsr_shader_bind_attribute_location(shader, "texcoords", 1);
-    *rotation_uniform = egl->glGetUniformLocation(shader->program_id, "rotation");
+    uniforms->offset = egl->glGetUniformLocation(shader->program_id, "offset");
+    uniforms->rotation_matrix = egl->glGetUniformLocation(shader->program_id, "rotation_matrix");
     return 0;
 }
 
 static int load_framebuffers(gsr_color_conversion *self) {
     /* TODO: Only generate the necessary amount of framebuffers (self->params.num_destination_textures) */
     const unsigned int draw_buffer = GL_COLOR_ATTACHMENT0;
-    self->params.egl->glGenFramebuffers(MAX_FRAMEBUFFERS, self->framebuffers);
+    self->params.egl->glGenFramebuffers(GSR_COLOR_CONVERSION_MAX_FRAMEBUFFERS, self->framebuffers);
 
     self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[0]);
     self->params.egl->glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, self->params.destination_textures[0], 0);
@@ -172,7 +456,7 @@ static int create_vertices(gsr_color_conversion *self) {
 
     self->params.egl->glGenBuffers(1, &self->vertex_buffer_object_id);
     self->params.egl->glBindBuffer(GL_ARRAY_BUFFER, self->vertex_buffer_object_id);
-    self->params.egl->glBufferData(GL_ARRAY_BUFFER, 24 * sizeof(float), NULL, GL_STREAM_DRAW);
+    self->params.egl->glBufferData(GL_ARRAY_BUFFER, 24 * sizeof(float), NULL, GL_DYNAMIC_DRAW);
 
     self->params.egl->glEnableVertexAttribArray(0);
     self->params.egl->glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 4 * sizeof(float), (void*)0);
@@ -184,42 +468,195 @@ static int create_vertices(gsr_color_conversion *self) {
     return 0;
 }
 
+static bool gsr_color_conversion_load_compute_shaders(gsr_color_conversion *self) {
+    switch(self->params.destination_color) {
+        case GSR_DESTINATION_COLOR_NV12:
+        case GSR_DESTINATION_COLOR_P010: {
+            if(load_compute_shader_y(&self->compute_shaders[COMPUTE_SHADER_INDEX_Y], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_Y], self->max_local_size_dim, self->params.destination_color, self->params.color_range, false, false) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load Y compute shader\n");
+                return false;
+            }
+
+            if(load_compute_shader_uv(&self->compute_shaders[COMPUTE_SHADER_INDEX_UV], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_UV], self->max_local_size_dim, self->params.destination_color, self->params.color_range, false, false) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load UV compute shader\n");
+                return false;
+            }
+
+            if(load_compute_shader_y(&self->compute_shaders[COMPUTE_SHADER_INDEX_Y_BLEND], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_Y_BLEND], self->max_local_size_dim, self->params.destination_color, self->params.color_range, false, true) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load Y blend compute shader\n");
+                return false;
+            }
+
+            if(load_compute_shader_uv(&self->compute_shaders[COMPUTE_SHADER_INDEX_UV_BLEND], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_UV_BLEND], self->max_local_size_dim, self->params.destination_color, self->params.color_range, false, true) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load UV blend compute shader\n");
+                return false;
+            }
+            break;
+        }
+        case GSR_DESTINATION_COLOR_RGB8: {
+            if(load_compute_shader_rgb(&self->compute_shaders[COMPUTE_SHADER_INDEX_RGB], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_RGB], self->max_local_size_dim, false, false) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load RGB compute shader\n");
+                return false;
+            }
+
+            if(load_compute_shader_rgb(&self->compute_shaders[COMPUTE_SHADER_INDEX_RGB_BLEND], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_RGB_BLEND], self->max_local_size_dim, false, true) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load RGB blend compute shader\n");
+                return false;
+            }
+            break;
+        }
+    }
+    return true;
+}
+
+static bool gsr_color_conversion_load_external_compute_shaders(gsr_color_conversion *self) {
+    switch(self->params.destination_color) {
+        case GSR_DESTINATION_COLOR_NV12:
+        case GSR_DESTINATION_COLOR_P010: {
+            if(load_compute_shader_y(&self->compute_shaders[COMPUTE_SHADER_INDEX_Y_EXTERNAL], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_Y_EXTERNAL], self->max_local_size_dim, self->params.destination_color, self->params.color_range, true, false) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external Y compute shader\n");
+                return false;
+            }
+
+            if(load_compute_shader_uv(&self->compute_shaders[COMPUTE_SHADER_INDEX_UV_EXTERNAL], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_UV_EXTERNAL], self->max_local_size_dim, self->params.destination_color, self->params.color_range, true, false) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external UV compute shader\n");
+                return false;
+            }
+
+            if(load_compute_shader_y(&self->compute_shaders[COMPUTE_SHADER_INDEX_Y_EXTERNAL_BLEND], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_Y_EXTERNAL_BLEND], self->max_local_size_dim, self->params.destination_color, self->params.color_range, true, true) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external Y blend compute shader\n");
+                return false;
+            }
+
+            if(load_compute_shader_uv(&self->compute_shaders[COMPUTE_SHADER_INDEX_UV_EXTERNAL_BLEND], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_UV_EXTERNAL_BLEND], self->max_local_size_dim, self->params.destination_color, self->params.color_range, true, true) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external UV blend compute shader\n");
+                return false;
+            }
+            break;
+        }
+        case GSR_DESTINATION_COLOR_RGB8: {
+            if(load_compute_shader_rgb(&self->compute_shaders[COMPUTE_SHADER_INDEX_RGB_EXTERNAL], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_RGB_EXTERNAL], self->max_local_size_dim, true, false) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external RGB compute shader\n");
+                return false;
+            }
+
+            if(load_compute_shader_rgb(&self->compute_shaders[COMPUTE_SHADER_INDEX_RGB_EXTERNAL_BLEND], self->params.egl, &self->compute_uniforms[COMPUTE_SHADER_INDEX_RGB_EXTERNAL_BLEND], self->max_local_size_dim, true, true) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external RGB blend compute shader\n");
+                return false;
+            }
+            break;
+        }
+    }
+    return true;
+}
+
+static bool gsr_color_conversion_load_graphics_shaders(gsr_color_conversion *self) {
+    switch(self->params.destination_color) {
+        case GSR_DESTINATION_COLOR_NV12:
+        case GSR_DESTINATION_COLOR_P010: {
+            if(load_graphics_shader_y(&self->graphics_shaders[GRAPHICS_SHADER_INDEX_Y], self->params.egl, &self->graphics_uniforms[GRAPHICS_SHADER_INDEX_Y], self->params.destination_color, self->params.color_range, false) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load Y graphics shader\n");
+                return false;
+            }
+
+            if(load_graphics_shader_uv(&self->graphics_shaders[GRAPHICS_SHADER_INDEX_UV], self->params.egl, &self->graphics_uniforms[GRAPHICS_SHADER_INDEX_UV], self->params.destination_color, self->params.color_range, false) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load UV graphics shader\n");
+                return false;
+            }
+            break;
+        }
+        case GSR_DESTINATION_COLOR_RGB8: {
+            if(load_graphics_shader_rgb(&self->graphics_shaders[GRAPHICS_SHADER_INDEX_RGB], self->params.egl, &self->graphics_uniforms[GRAPHICS_SHADER_INDEX_RGB], false) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load RGB graphics shader\n");
+                return false;
+            }
+            break;
+        }
+    }
+    return true;
+}
+
+static bool gsr_color_conversion_load_external_graphics_shaders(gsr_color_conversion *self) {
+    switch(self->params.destination_color) {
+        case GSR_DESTINATION_COLOR_NV12:
+        case GSR_DESTINATION_COLOR_P010: {
+            if(load_graphics_shader_y(&self->graphics_shaders[GRAPHICS_SHADER_INDEX_Y_EXTERNAL], self->params.egl, &self->graphics_uniforms[GRAPHICS_SHADER_INDEX_Y_EXTERNAL], self->params.destination_color, self->params.color_range, true) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external Y graphics shader\n");
+                return false;
+            }
+
+            if(load_graphics_shader_uv(&self->graphics_shaders[GRAPHICS_SHADER_INDEX_UV_EXTERNAL], self->params.egl, &self->graphics_uniforms[GRAPHICS_SHADER_INDEX_UV_EXTERNAL], self->params.destination_color, self->params.color_range, true) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external UV graphics shader\n");
+                return false;
+            }
+            break;
+        }
+        case GSR_DESTINATION_COLOR_RGB8: {
+            if(load_graphics_shader_rgb(&self->graphics_shaders[GRAPHICS_SHADER_INDEX_RGB_EXTERNAL], self->params.egl, &self->graphics_uniforms[GRAPHICS_SHADER_INDEX_RGB_EXTERNAL], true) != 0) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external RGB graphics shader\n");
+                return false;
+            }
+            break;
+        }
+    }
+    return true;
+}
+
 int gsr_color_conversion_init(gsr_color_conversion *self, const gsr_color_conversion_params *params) {
     assert(params);
     assert(params->egl);
     memset(self, 0, sizeof(*self));
     self->params.egl = params->egl;
     self->params = *params;
+
+    int max_compute_work_group_invocations = 256;
+    self->params.egl->glGetIntegerv(GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, &max_compute_work_group_invocations);
+    self->max_local_size_dim = sqrt(max_compute_work_group_invocations);
 
-    switch(params->destination_color) {
-        case GSR_DESTINATION_COLOR_BGR: {
-            if(self->params.num_destination_textures != 1) {
-                fprintf(stderr, "gsr error: gsr_color_conversion_init: expected 1 destination texture for destination color BGR, got %d destination texture(s)\n", self->params.num_destination_textures);
-                return -1;
-            }
-
-            if(load_shader_bgr(&self->shaders[0], self->params.egl, &self->rotation_uniforms[0]) != 0) {
-                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load bgr shader\n");
+    switch(self->params.destination_color) {
+        case GSR_DESTINATION_COLOR_NV12:
+        case GSR_DESTINATION_COLOR_P010: {
+            if(self->params.num_destination_textures != 2) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: expected 2 destination textures for destination color NV12/P010, got %d destination texture(s)\n", self->params.num_destination_textures);
                 goto err;
             }
             break;
         }
-        case GSR_DESTINATION_COLOR_NV12: {
-            if(self->params.num_destination_textures != 2) {
-                fprintf(stderr, "gsr error: gsr_color_conversion_init: expected 2 destination textures for destination color NV12, got %d destination texture(s)\n", self->params.num_destination_textures);
-                return -1;
-            }
-
-            if(load_shader_y(&self->shaders[0], self->params.egl, &self->rotation_uniforms[0]) != 0) {
-                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load Y shader\n");
+        case GSR_DESTINATION_COLOR_RGB8: {
+            if(self->params.num_destination_textures != 1) {
+                fprintf(stderr, "gsr error: gsr_color_conversion_init: expected 1 destination texture for destination color RGB8, got %d destination texture(s)\n", self->params.num_destination_textures);
                 goto err;
             }
+            break;
+        }
+    }
 
-            if(load_shader_uv(&self->shaders[1], self->params.egl, &self->rotation_uniforms[1]) != 0) {
-                fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load UV shader\n");
+    if(self->params.force_graphics_shader) {
+        self->compute_shaders_failed_to_load = true;
+        self->external_compute_shaders_failed_to_load = true;
+
+        if(!gsr_color_conversion_load_graphics_shaders(self))
+            goto err;
+
+        if(self->params.load_external_image_shader) {
+            if(!gsr_color_conversion_load_external_graphics_shaders(self))
                 goto err;
+        }
+    } else {
+        if(!gsr_color_conversion_load_compute_shaders(self)) {
+            self->compute_shaders_failed_to_load = true;
+            fprintf(stderr, "gsr info: failed to load one or more compute shaders, run gpu-screen-recorder with the '-gl-debug yes' option to see why. Falling back to slower graphics shader instead\n");
+            if(!gsr_color_conversion_load_graphics_shaders(self))
+                goto err;
+        }
+
+        if(self->params.load_external_image_shader) {
+            if(!gsr_color_conversion_load_external_compute_shaders(self)) {
+                self->external_compute_shaders_failed_to_load = true;
+                fprintf(stderr, "gsr info: failed to load one or more external compute shaders, run gpu-screen-recorder with the '-gl-debug yes' option to see why. Falling back to slower graphics shader instead\n");
+                if(!gsr_color_conversion_load_external_graphics_shaders(self))
+                    goto err;
             }
-            break;
         }
     }
 
@@ -250,97 +687,348 @@ void gsr_color_conversion_deinit(gsr_color_conversion *self) {
         self->vertex_array_object_id = 0;
     }
 
-    self->params.egl->glDeleteFramebuffers(MAX_FRAMEBUFFERS, self->framebuffers);
-    for(int i = 0; i < MAX_FRAMEBUFFERS; ++i) {
+    self->params.egl->glDeleteFramebuffers(GSR_COLOR_CONVERSION_MAX_FRAMEBUFFERS, self->framebuffers);
+    for(int i = 0; i < GSR_COLOR_CONVERSION_MAX_FRAMEBUFFERS; ++i) {
         self->framebuffers[i] = 0;
     }
 
-    for(int i = 0; i < MAX_SHADERS; ++i) {
-        gsr_shader_deinit(&self->shaders[i]);
+    for(int i = 0; i < GSR_COLOR_CONVERSION_MAX_COMPUTE_SHADERS; ++i) {
+        gsr_shader_deinit(&self->compute_shaders[i]);
+    }
+
+    for(int i = 0; i < GSR_COLOR_CONVERSION_MAX_GRAPHICS_SHADERS; ++i) {
+        gsr_shader_deinit(&self->graphics_shaders[i]);
     }
 
     self->params.egl = NULL;
 }
 
-/* |source_pos| is in pixel coordinates and |source_size|  */
-int gsr_color_conversion_draw(gsr_color_conversion *self, unsigned int texture_id, vec2i source_pos, vec2i source_size, vec2i texture_pos, vec2i texture_size, float rotation) {
+static void gsr_color_conversion_apply_rotation(gsr_rotation rotation, float rotation_matrix[2][2]) {
+    /*
+    rotation_matrix[0][0] =  cos(angle);
+    rotation_matrix[0][1] = -sin(angle);
+    rotation_matrix[1][0] =  sin(angle);
+    rotation_matrix[1][1] =  cos(angle);
+    The manual matrix code below is the same as this code above, but without floating-point errors.
+    This is done to remove any blurring caused by these floating-point errors.
+    */
+    switch(rotation) {
+        case GSR_ROT_0:
+            rotation_matrix[0][0] = 1.0f;
+            rotation_matrix[0][1] = 0.0f;
+            rotation_matrix[1][0] = 0.0f;
+            rotation_matrix[1][1] = 1.0f;
+            break;
+        case GSR_ROT_90:
+            rotation_matrix[0][0] =  0.0f;
+            rotation_matrix[0][1] = -1.0f;
+            rotation_matrix[1][0] =  1.0f;
+            rotation_matrix[1][1] =  0.0f;
+            break;
+        case GSR_ROT_180:
+            rotation_matrix[0][0] = -1.0f;
+            rotation_matrix[0][1] =  0.0f;
+            rotation_matrix[1][0] =  0.0f;
+            rotation_matrix[1][1] = -1.0f;
+            break;
+        case GSR_ROT_270:
+            rotation_matrix[0][0] =  0.0f;
+            rotation_matrix[0][1] =  1.0f;
+            rotation_matrix[1][0] = -1.0f;
+            rotation_matrix[1][1] =  0.0f;
+            break;
+    }
+}
+
+static void gsr_color_conversion_swizzle_texture_source(gsr_color_conversion *self, gsr_source_color source_color) {
+    if(source_color == GSR_SOURCE_COLOR_BGR) {
+        const int swizzle_mask[] = { GL_BLUE, GL_GREEN, GL_RED, 1 };
+        self->params.egl->glTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_RGBA, swizzle_mask);
+    }
+}
+
+static void gsr_color_conversion_swizzle_reset(gsr_color_conversion *self, gsr_source_color source_color) {
+    if(source_color == GSR_SOURCE_COLOR_BGR) {
+        const int swizzle_mask[] = { GL_RED, GL_GREEN, GL_BLUE, GL_ALPHA };
+        self->params.egl->glTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_RGBA, swizzle_mask);
+    }
+}
+
+typedef enum {
+    GSR_COLOR_COMP_Y,
+    GSR_COLOR_COMP_UV,
+    GSR_COLOR_COMP_RGB
+} gsr_color_component;
+
+static int color_component_get_destination_texture_index(gsr_color_component color_component) {
+    switch(color_component) {
+        case GSR_COLOR_COMP_Y:   return 0;
+        case GSR_COLOR_COMP_UV:  return 1;
+        case GSR_COLOR_COMP_RGB: return 0;
+    }
+    assert(false);
+    return 0;
+}
+
+static unsigned int color_component_get_color_format(gsr_color_component color_component, bool use_16bit_colors) {
+    switch(color_component) {
+        case GSR_COLOR_COMP_Y:   return use_16bit_colors ? GL_R16 : GL_R8;
+        case GSR_COLOR_COMP_UV:  return use_16bit_colors ? GL_RG16 : GL_RG8;
+        case GSR_COLOR_COMP_RGB: return GL_RGBA8; // TODO: 16-bit color support
+    }
+    assert(false);
+    return GL_RGBA8;
+}
+
+static int color_component_get_COMPUTE_SHADER_INDEX(gsr_color_component color_component, bool external_texture, bool alpha_blending) {
+    switch(color_component) {
+        case GSR_COLOR_COMP_Y: {
+            if(external_texture)
+                return alpha_blending ? COMPUTE_SHADER_INDEX_Y_EXTERNAL_BLEND : COMPUTE_SHADER_INDEX_Y_EXTERNAL;
+            else
+                return alpha_blending ? COMPUTE_SHADER_INDEX_Y_BLEND : COMPUTE_SHADER_INDEX_Y;
+        }
+        case GSR_COLOR_COMP_UV: {
+            if(external_texture)
+                return alpha_blending ? COMPUTE_SHADER_INDEX_UV_EXTERNAL_BLEND : COMPUTE_SHADER_INDEX_UV_EXTERNAL;
+            else
+                return alpha_blending ? COMPUTE_SHADER_INDEX_UV_BLEND : COMPUTE_SHADER_INDEX_UV;
+        }
+        case GSR_COLOR_COMP_RGB: {
+            if(external_texture)
+                return alpha_blending ? COMPUTE_SHADER_INDEX_RGB_EXTERNAL_BLEND : COMPUTE_SHADER_INDEX_RGB_EXTERNAL;
+            else
+                return alpha_blending ? COMPUTE_SHADER_INDEX_RGB_BLEND : COMPUTE_SHADER_INDEX_RGB;
+        }
+    }
+    assert(false);
+    return COMPUTE_SHADER_INDEX_RGB;
+}
+
+static void gsr_color_conversion_dispatch_compute_shader(gsr_color_conversion *self, bool external_texture, bool alpha_blending, float rotation_matrix[2][2], vec2i source_position, vec2i destination_pos, vec2i destination_size, vec2f scale, bool use_16bit_colors, gsr_color_component color_component) {
+    const int compute_shader_index = color_component_get_COMPUTE_SHADER_INDEX(color_component, external_texture, alpha_blending);
+    const int destination_texture_index = color_component_get_destination_texture_index(color_component);
+    const unsigned int color_format = color_component_get_color_format(color_component, use_16bit_colors);
+
+    self->params.egl->glActiveTexture(GL_TEXTURE1);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, self->params.destination_textures[destination_texture_index]);
+    self->params.egl->glActiveTexture(GL_TEXTURE0);
+
+    gsr_color_compute_uniforms *uniform = &self->compute_uniforms[compute_shader_index];
+    gsr_shader_use(&self->compute_shaders[compute_shader_index]);
+    self->params.egl->glUniformMatrix2fv(uniform->rotation_matrix, 1, GL_TRUE, (const float*)rotation_matrix);
+    self->params.egl->glUniform2i(uniform->source_position, source_position.x, source_position.y);
+    self->params.egl->glUniform2i(uniform->target_position, destination_pos.x, destination_pos.y);
+    self->params.egl->glUniform2f(uniform->scale, scale.x, scale.y);
+    self->params.egl->glBindImageTexture(0, self->params.destination_textures[destination_texture_index], 0, GL_FALSE, 0, GL_WRITE_ONLY, color_format);
+    const double num_groups_x = ceil((double)destination_size.x/(double)self->max_local_size_dim);
+    const double num_groups_y = ceil((double)destination_size.y/(double)self->max_local_size_dim);
+    self->params.egl->glDispatchCompute(max_int(1, num_groups_x), max_int(1, num_groups_y), 1);
+}
+
+static void gsr_color_conversion_draw_graphics(gsr_color_conversion *self, unsigned int texture_id, bool external_texture, float rotation_matrix[2][2], vec2i source_position, vec2i source_size, vec2i destination_pos, vec2i texture_size, vec2f scale, gsr_source_color source_color) {
     /* TODO: Do not call this every frame? */
     vec2i dest_texture_size = {0, 0};
     self->params.egl->glBindTexture(GL_TEXTURE_2D, self->params.destination_textures[0]);
     self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &dest_texture_size.x);
     self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &dest_texture_size.y);
+    self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
 
-    /* TODO: Do not call this every frame? */
-    vec2i source_texture_size = {0, 0};
-    self->params.egl->glBindTexture(GL_TEXTURE_2D, texture_id);
-    self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &source_texture_size.x);
-    self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &source_texture_size.y);
-
-    if(abs_f(M_PI * 0.5f - rotation) <= 0.001f || abs_f(M_PI * 1.5f - rotation) <= 0.001f) {
-        float tmp = source_texture_size.x;
-        source_texture_size.x = source_texture_size.y;
-        source_texture_size.y = tmp;
-    }
+    const int texture_target = external_texture ? GL_TEXTURE_EXTERNAL_OES : GL_TEXTURE_2D;
+
+    self->params.egl->glBindTexture(texture_target, texture_id);
+    gsr_color_conversion_swizzle_texture_source(self, source_color);
 
     const vec2f pos_norm = {
-        ((float)source_pos.x / (dest_texture_size.x == 0 ? 1.0f : (float)dest_texture_size.x)) * 2.0f,
-        ((float)source_pos.y / (dest_texture_size.y == 0 ? 1.0f : (float)dest_texture_size.y)) * 2.0f,
+        ((float)destination_pos.x / (dest_texture_size.x == 0 ? 1.0f : (float)dest_texture_size.x)) * 2.0f,
+        ((float)destination_pos.y / (dest_texture_size.y == 0 ? 1.0f : (float)dest_texture_size.y)) * 2.0f,
     };
 
     const vec2f size_norm = {
-        ((float)source_size.x / (dest_texture_size.x == 0 ? 1.0f : (float)dest_texture_size.x)) * 2.0f,
-        ((float)source_size.y / (dest_texture_size.y == 0 ? 1.0f : (float)dest_texture_size.y)) * 2.0f,
+        ((float)source_size.x / (dest_texture_size.x == 0 ? 1.0f : (float)dest_texture_size.x)) * 2.0f * scale.x,
+        ((float)source_size.y / (dest_texture_size.y == 0 ? 1.0f : (float)dest_texture_size.y)) * 2.0f * scale.y,
     };
 
     const vec2f texture_pos_norm = {
-        (float)texture_pos.x / (source_texture_size.x == 0 ? 1.0f : (float)source_texture_size.x),
-        (float)texture_pos.y / (source_texture_size.y == 0 ? 1.0f : (float)source_texture_size.y),
+        (float)source_position.x / (texture_size.x == 0 ? 1.0f : (float)texture_size.x),
+        (float)source_position.y / (texture_size.y == 0 ? 1.0f : (float)texture_size.y),
     };
 
     const vec2f texture_size_norm = {
-        (float)texture_size.x / (source_texture_size.x == 0 ? 1.0f : (float)source_texture_size.x),
-        (float)texture_size.y / (source_texture_size.y == 0 ? 1.0f : (float)source_texture_size.y),
+        (float)source_size.x / (texture_size.x == 0 ? 1.0f : (float)texture_size.x),
+        (float)source_size.y / (texture_size.y == 0 ? 1.0f : (float)texture_size.y),
     };
 
     const float vertices[] = {
-        -1.0f + pos_norm.x,               -1.0f + pos_norm.y + size_norm.y, texture_pos_norm.x,                       texture_pos_norm.y + texture_size_norm.y,
-        -1.0f + pos_norm.x,               -1.0f + pos_norm.y,               texture_pos_norm.x,                       texture_pos_norm.y,
-        -1.0f + pos_norm.x + size_norm.x, -1.0f + pos_norm.y,               texture_pos_norm.x + texture_size_norm.x, texture_pos_norm.y,
+        -1.0f + 0.0f,               -1.0f + 0.0f + size_norm.y, texture_pos_norm.x,                       texture_pos_norm.y + texture_size_norm.y,
+        -1.0f + 0.0f,               -1.0f + 0.0f,               texture_pos_norm.x,                       texture_pos_norm.y,
+        -1.0f + 0.0f + size_norm.x, -1.0f + 0.0f,               texture_pos_norm.x + texture_size_norm.x, texture_pos_norm.y,
 
-        -1.0f + pos_norm.x,               -1.0f + pos_norm.y + size_norm.y, texture_pos_norm.x,                       texture_pos_norm.y + texture_size_norm.y,
-        -1.0f + pos_norm.x + size_norm.x, -1.0f + pos_norm.y,               texture_pos_norm.x + texture_size_norm.x, texture_pos_norm.y,
-        -1.0f + pos_norm.x + size_norm.x, -1.0f + pos_norm.y + size_norm.y, texture_pos_norm.x + texture_size_norm.x, texture_pos_norm.y + texture_size_norm.y
+        -1.0f + 0.0f,               -1.0f + 0.0f + size_norm.y, texture_pos_norm.x,                       texture_pos_norm.y + texture_size_norm.y,
+        -1.0f + 0.0f + size_norm.x, -1.0f + 0.0f,               texture_pos_norm.x + texture_size_norm.x, texture_pos_norm.y,
+        -1.0f + 0.0f + size_norm.x, -1.0f + 0.0f + size_norm.y, texture_pos_norm.x + texture_size_norm.x, texture_pos_norm.y + texture_size_norm.y
     };
 
     self->params.egl->glBindVertexArray(self->vertex_array_object_id);
     self->params.egl->glViewport(0, 0, dest_texture_size.x, dest_texture_size.y);
-    self->params.egl->glBindTexture(GL_TEXTURE_2D, texture_id);
 
     /* TODO: this, also cleanup */
     //self->params.egl->glBindBuffer(GL_ARRAY_BUFFER, self->vertex_buffer_object_id);
     self->params.egl->glBufferSubData(GL_ARRAY_BUFFER, 0, 24 * sizeof(float), vertices);
 
-    {
-        self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[0]);
-        //cap_xcomp->params.egl->glClear(GL_COLOR_BUFFER_BIT); // TODO: Do this in a separate clear_ function. We want to do that when using multiple drm to create the final image (multiple monitors for example)
+    switch(self->params.destination_color) {
+        case GSR_DESTINATION_COLOR_NV12:
+        case GSR_DESTINATION_COLOR_P010: {
+            self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[0]);
+            //cap_xcomp->params.egl->glClear(GL_COLOR_BUFFER_BIT); // TODO: Do this in a separate clear_ function. We want to do that when using multiple drm to create the final image (multiple monitors for example)
+
+            int shader_index = external_texture ? GRAPHICS_SHADER_INDEX_Y_EXTERNAL : GRAPHICS_SHADER_INDEX_Y;
+            gsr_shader_use(&self->graphics_shaders[shader_index]);
+            self->params.egl->glUniformMatrix2fv(self->graphics_uniforms[shader_index].rotation_matrix, 1, GL_TRUE, (const float*)rotation_matrix);
+            self->params.egl->glUniform2f(self->graphics_uniforms[shader_index].offset, pos_norm.x, pos_norm.y);
+            self->params.egl->glDrawArrays(GL_TRIANGLES, 0, 6);
+
+            if(self->params.num_destination_textures > 1) {
+                self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[1]);
+                //cap_xcomp->params.egl->glClear(GL_COLOR_BUFFER_BIT);
+
+                shader_index = external_texture ? GRAPHICS_SHADER_INDEX_UV_EXTERNAL : GRAPHICS_SHADER_INDEX_UV;
+                gsr_shader_use(&self->graphics_shaders[shader_index]);
+                self->params.egl->glUniformMatrix2fv(self->graphics_uniforms[shader_index].rotation_matrix, 1, GL_TRUE, (const float*)rotation_matrix);
+                self->params.egl->glUniform2f(self->graphics_uniforms[shader_index].offset, pos_norm.x, pos_norm.y);
+                self->params.egl->glDrawArrays(GL_TRIANGLES, 0, 6);
+            }
+            break;
+        }
+        case GSR_DESTINATION_COLOR_RGB8: {
+            self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[0]);
+            //cap_xcomp->params.egl->glClear(GL_COLOR_BUFFER_BIT); // TODO: Do this in a separate clear_ function. We want to do that when using multiple drm to create the final image (multiple monitors for example)
+
+            const int shader_index = external_texture ? GRAPHICS_SHADER_INDEX_RGB_EXTERNAL : GRAPHICS_SHADER_INDEX_RGB;
+            gsr_shader_use(&self->graphics_shaders[shader_index]);
+            self->params.egl->glUniformMatrix2fv(self->graphics_uniforms[shader_index].rotation_matrix, 1, GL_TRUE, (const float*)rotation_matrix);
+            self->params.egl->glUniform2f(self->graphics_uniforms[shader_index].offset, pos_norm.x, pos_norm.y);
+            self->params.egl->glDrawArrays(GL_TRIANGLES, 0, 6);
+            break;
+        }
+    }
+
+    self->params.egl->glBindVertexArray(0);
+    self->params.egl->glUseProgram(0);
+    gsr_color_conversion_swizzle_reset(self, source_color);
+    self->params.egl->glBindTexture(texture_target, 0);
+    self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, 0);
+}
+
+void gsr_color_conversion_draw(gsr_color_conversion *self, unsigned int texture_id, vec2i destination_pos, vec2i destination_size, vec2i source_pos, vec2i source_size, vec2i texture_size, gsr_rotation rotation, gsr_source_color source_color, bool external_texture, bool alpha_blending) {
+    assert(!external_texture || self->params.load_external_image_shader);
+    if(external_texture && !self->params.load_external_image_shader) {
+        fprintf(stderr, "gsr error: gsr_color_conversion_draw: external image shader not loaded\n");
+        return;
+    }
+
+    vec2f scale = {0.0f, 0.0f};
+    if(source_size.x > 0 && source_size.y > 0)
+        scale = (vec2f){ (double)destination_size.x/(double)source_size.x, (double)destination_size.y/(double)source_size.y };
+
+    vec2i source_position = {0, 0};
+    float rotation_matrix[2][2] = {{0, 0}, {0, 0}};
+    gsr_color_conversion_apply_rotation(rotation, rotation_matrix);
+
+    const int texture_target = external_texture ? GL_TEXTURE_EXTERNAL_OES : GL_TEXTURE_2D;
+    self->params.egl->glBindTexture(texture_target, texture_id);
+    gsr_color_conversion_swizzle_texture_source(self, source_color);
+
+    const bool use_graphics_shader = external_texture ? self->external_compute_shaders_failed_to_load : self->compute_shaders_failed_to_load;
+    if(use_graphics_shader) {
+        source_position.x += source_pos.x;
+        source_position.y += source_pos.y;
+        gsr_color_conversion_draw_graphics(self, texture_id, external_texture, rotation_matrix, source_position, source_size, destination_pos, texture_size, scale, source_color);
+    } else {
+        switch(rotation) {
+            case GSR_ROT_0:
+                break;
+            case GSR_ROT_90:
+                source_position.x += (((double)texture_size.x*0.5 - (double)texture_size.y*0.5) * scale.x);
+                source_position.y += (((double)texture_size.y*0.5 - (double)texture_size.x*0.5) * scale.y);
+                break;
+            case GSR_ROT_180:
+                break;
+            case GSR_ROT_270:
+                source_position.x += (((double)texture_size.x*0.5 - (double)texture_size.y*0.5) * scale.x);
+                source_position.y += (((double)texture_size.y*0.5 - (double)texture_size.x*0.5) * scale.y);
+                break;
+        }
+        source_position.x -= (source_pos.x * scale.x + 0.5);
+        source_position.y -= (source_pos.y * scale.y + 0.5);
 
-        gsr_shader_use(&self->shaders[0]);
-        self->params.egl->glUniform1f(self->rotation_uniforms[0], rotation);
-        self->params.egl->glDrawArrays(GL_TRIANGLES, 0, 6);
+        switch(self->params.destination_color) {
+            case GSR_DESTINATION_COLOR_NV12:
+            case GSR_DESTINATION_COLOR_P010: {
+                const bool use_16bit_colors = self->params.destination_color == GSR_DESTINATION_COLOR_P010;
+                gsr_color_conversion_dispatch_compute_shader(self, external_texture, alpha_blending, rotation_matrix, source_position, destination_pos, destination_size, scale, use_16bit_colors, GSR_COLOR_COMP_Y);
+                gsr_color_conversion_dispatch_compute_shader(self, external_texture, alpha_blending, rotation_matrix, (vec2i){source_position.x/2, source_position.y/2},
+                    (vec2i){destination_pos.x/2, destination_pos.y/2}, (vec2i){destination_size.x/2, destination_size.y/2}, scale, use_16bit_colors, GSR_COLOR_COMP_UV);
+                break;
+            }
+            case GSR_DESTINATION_COLOR_RGB8: {
+                gsr_color_conversion_dispatch_compute_shader(self, external_texture, alpha_blending, rotation_matrix, source_position, destination_pos, destination_size, scale, false, GSR_COLOR_COMP_RGB);
+                break;
+            }
+        }
     }
 
+    self->params.egl->glFlush();
+    // TODO: Use the minimal barrier required
+    self->params.egl->glMemoryBarrier(GL_ALL_BARRIER_BITS); // GL_SHADER_IMAGE_ACCESS_BARRIER_BIT
+    self->params.egl->glUseProgram(0);
+
+    gsr_color_conversion_swizzle_reset(self, source_color);
+    self->params.egl->glBindTexture(texture_target, 0);
+}
+
+void gsr_color_conversion_clear(gsr_color_conversion *self) {
+    float color1[4] = {0.0f, 0.0f, 0.0f, 1.0f};
+    float color2[4] = {0.0f, 0.0f, 0.0f, 1.0f};
+
+    switch(self->params.destination_color) {
+        case GSR_DESTINATION_COLOR_NV12:
+        case GSR_DESTINATION_COLOR_P010: {
+            color2[0] = 0.5f;
+            color2[1] = 0.5f;
+            color2[2] = 0.0f;
+            color2[3] = 1.0f;
+            break;
+        }
+        case GSR_DESTINATION_COLOR_RGB8: {
+            color2[0] = 0.0f;
+            color2[1] = 0.0f;
+            color2[2] = 0.0f;
+            color2[3] = 1.0f;
+            break;
+        }
+    }
+
+    self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[0]);
+    self->params.egl->glClearColor(color1[0], color1[1], color1[2], color1[3]);
+    self->params.egl->glClear(GL_COLOR_BUFFER_BIT);
+
     if(self->params.num_destination_textures > 1) {
         self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[1]);
-        //cap_xcomp->params.egl->glClear(GL_COLOR_BUFFER_BIT);
-
-        gsr_shader_use(&self->shaders[1]);
-        self->params.egl->glUniform1f(self->rotation_uniforms[1], rotation);
-        self->params.egl->glDrawArrays(GL_TRIANGLES, 0, 6);
+        self->params.egl->glClearColor(color2[0], color2[1], color2[2], color2[3]);
+        self->params.egl->glClear(GL_COLOR_BUFFER_BIT);
     }
 
-    self->params.egl->glBindVertexArray(0);
-    gsr_shader_use_none(&self->shaders[0]);
-    self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
     self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, 0);
-    return 0;
+}
+
+void gsr_color_conversion_read_destination_texture(gsr_color_conversion *self, int destination_texture_index, int x, int y, int width, int height, unsigned int color_format, unsigned int data_format, void *pixels) {
+    assert(destination_texture_index >= 0 && destination_texture_index < self->params.num_destination_textures);
+    self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[destination_texture_index]);
+    self->params.egl->glReadPixels(x, y, width, height, color_format, data_format, pixels);
+    self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, 0);
+}
+
+gsr_rotation gsr_monitor_rotation_to_rotation(gsr_monitor_rotation monitor_rotation) {
+    return (gsr_rotation)monitor_rotation;
 }
diff --git a/src/cuda.c b/src/cuda.c
index dcbbb92..23954a4 100644
--- a/src/cuda.c
+++ b/src/cuda.c
@@ -19,7 +19,7 @@ bool gsr_cuda_load(gsr_cuda *self, Display *display, bool do_overclock) {
         }
     }
 
-    dlsym_assign required_dlsym[] = {
+    const dlsym_assign required_dlsym[] = {
         { (void**)&self->cuInit, "cuInit" },
         { (void**)&self->cuDeviceGetCount, "cuDeviceGetCount" },
         { (void**)&self->cuDeviceGet, "cuDeviceGet" },
@@ -28,8 +28,9 @@ bool gsr_cuda_load(gsr_cuda *self, Display *display, bool do_overclock) {
         { (void**)&self->cuCtxPushCurrent_v2, "cuCtxPushCurrent_v2" },
         { (void**)&self->cuCtxPopCurrent_v2, "cuCtxPopCurrent_v2" },
         { (void**)&self->cuGetErrorString, "cuGetErrorString" },
-        { (void**)&self->cuMemsetD8_v2, "cuMemsetD8_v2" },
         { (void**)&self->cuMemcpy2D_v2, "cuMemcpy2D_v2" },
+        { (void**)&self->cuMemcpy2DAsync_v2, "cuMemcpy2DAsync_v2" },
+        { (void**)&self->cuStreamSynchronize, "cuStreamSynchronize" },
 
         { (void**)&self->cuGraphicsGLRegisterImage, "cuGraphicsGLRegisterImage" },
         { (void**)&self->cuGraphicsEGLRegisterImage, "cuGraphicsEGLRegisterImage" },
@@ -81,12 +82,13 @@ bool gsr_cuda_load(gsr_cuda *self, Display *display, bool do_overclock) {
         goto fail;
     }
 
-    if(self->do_overclock) {
-        assert(display);
+    if(self->do_overclock && display) {
         if(gsr_overclock_load(&self->overclock, display))
             gsr_overclock_start(&self->overclock);
         else
             fprintf(stderr, "gsr warning: gsr_cuda_load: failed to load xnvctrl, failed to overclock memory transfer rate\n");
+    } else if(self->do_overclock && !display) {
+        fprintf(stderr, "gsr warning: gsr_cuda_load: overclocking enabled but no X server is running. Overclocking has been disabled\n");
     }
 
     self->library = lib;
diff --git a/src/cursor.c b/src/cursor.c
new file mode 100644
index 0000000..e818d72
--- /dev/null
+++ b/src/cursor.c
@@ -0,0 +1,138 @@
+#include "../include/cursor.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+
+#include <X11/extensions/Xfixes.h>
+
+// TODO: Test cursor visibility with XFixesHideCursor
+
+static bool gsr_cursor_set_from_x11_cursor_image(gsr_cursor *self, XFixesCursorImage *x11_cursor_image, bool *visible) {
+    uint8_t *cursor_data = NULL;
+    uint8_t *out = NULL;
+    *visible = false;
+
+    if(!x11_cursor_image)
+        goto err;
+        
+    if(!x11_cursor_image->pixels)
+        goto err;
+
+    self->hotspot.x = x11_cursor_image->xhot;
+    self->hotspot.y = x11_cursor_image->yhot;
+    self->egl->glBindTexture(GL_TEXTURE_2D, self->texture_id);
+
+    self->size.x = x11_cursor_image->width;
+    self->size.y = x11_cursor_image->height;
+    const unsigned long *pixels = x11_cursor_image->pixels;
+    cursor_data = malloc(self->size.x * self->size.y * 4);
+    if(!cursor_data)
+        goto err;
+    out = cursor_data;
+    /* Un-premultiply alpha */
+    for(int y = 0; y < self->size.y; ++y) {
+        for(int x = 0; x < self->size.x; ++x) {
+            uint32_t pixel = *pixels++;
+            uint8_t *in = (uint8_t*)&pixel;
+            uint8_t alpha = in[3];
+            if(alpha == 0) {
+                alpha = 1;
+            } else {
+                *visible = true;
+            }
+
+            out[0] = (float)in[2] * 255.0/(float)alpha;
+            out[1] = (float)in[1] * 255.0/(float)alpha;
+            out[2] = (float)in[0] * 255.0/(float)alpha;
+            out[3] = in[3];
+            out += 4;
+            in += 4;
+        }
+    }
+
+    // TODO: glTextureSubImage2D if same size
+    self->egl->glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, self->size.x, self->size.y, 0, GL_RGBA, GL_UNSIGNED_BYTE, cursor_data);
+    free(cursor_data);
+
+    self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+    self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+
+    self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+    XFree(x11_cursor_image);
+    return true;
+
+    err:
+    self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+    if(x11_cursor_image)
+        XFree(x11_cursor_image);
+    return false;
+}
+
+int gsr_cursor_init(gsr_cursor *self, gsr_egl *egl, Display *display) {
+    int x_fixes_error_base = 0;
+
+    assert(egl);
+    assert(display);
+    memset(self, 0, sizeof(*self));
+    self->egl = egl;
+    self->display = display;
+
+    self->x_fixes_event_base = 0;
+    if(!XFixesQueryExtension(self->display, &self->x_fixes_event_base, &x_fixes_error_base)) {
+        fprintf(stderr, "gsr error: gsr_cursor_init: your X11 server is missing the XFixes extension\n");
+        gsr_cursor_deinit(self);
+        return -1;
+    }
+
+    self->egl->glGenTextures(1, &self->texture_id);
+
+    XFixesSelectCursorInput(self->display, DefaultRootWindow(self->display), XFixesDisplayCursorNotifyMask);
+    gsr_cursor_set_from_x11_cursor_image(self, XFixesGetCursorImage(self->display), &self->visible);
+    self->cursor_image_set = true;
+
+    return 0;
+}
+
+void gsr_cursor_deinit(gsr_cursor *self) {
+    if(!self->egl)
+        return;
+
+    if(self->texture_id) {
+        self->egl->glDeleteTextures(1, &self->texture_id);
+        self->texture_id = 0;
+    }
+
+    if(self->display)
+        XFixesSelectCursorInput(self->display, DefaultRootWindow(self->display), 0);
+
+    self->display = NULL;
+    self->egl = NULL;
+}
+
+bool gsr_cursor_on_event(gsr_cursor *self, XEvent *xev) {
+    bool updated = false;
+
+    if(xev->type == self->x_fixes_event_base + XFixesCursorNotify) {
+        XFixesCursorNotifyEvent *cursor_notify_event = (XFixesCursorNotifyEvent*)xev;
+        if(cursor_notify_event->subtype == XFixesDisplayCursorNotify && cursor_notify_event->window == DefaultRootWindow(self->display)) {
+            self->cursor_image_set = false;
+        }
+    }
+
+    if(!self->cursor_image_set) {
+        self->cursor_image_set = true;
+        gsr_cursor_set_from_x11_cursor_image(self, XFixesGetCursorImage(self->display), &self->visible);
+        updated = true;
+    }
+
+    return updated;
+}
+
+void gsr_cursor_tick(gsr_cursor *self, Window relative_to) {
+    Window dummy_window;
+    int dummy_i;
+    unsigned int dummy_u;
+    XQueryPointer(self->display, relative_to, &dummy_window, &dummy_window, &dummy_i, &dummy_i, &self->position.x, &self->position.y, &dummy_u);
+}
diff --git a/src/damage.c b/src/damage.c
new file mode 100644
index 0000000..25a2225
--- /dev/null
+++ b/src/damage.c
@@ -0,0 +1,327 @@
+#include "../include/damage.h"
+#include "../include/utils.h"
+#include "../include/window/window.h"
+
+#include <stdio.h>
+#include <string.h>
+#include <X11/extensions/Xdamage.h>
+#include <X11/extensions/Xrandr.h>
+
+typedef struct {
+    vec2i pos;
+    vec2i size;
+} gsr_rectangle;
+
+static bool rectangles_intersect(gsr_rectangle rect1, gsr_rectangle rect2) {
+    return rect1.pos.x < rect2.pos.x + rect2.size.x && rect1.pos.x + rect1.size.x > rect2.pos.x &&
+        rect1.pos.y < rect2.pos.y + rect2.size.y && rect1.pos.y + rect1.size.y > rect2.pos.y;
+}
+
+static bool xrandr_is_supported(Display *display) {
+    int major_version = 0;
+    int minor_version = 0;
+    if(!XRRQueryVersion(display, &major_version, &minor_version))
+        return false;
+
+    return major_version > 1 || (major_version == 1 && minor_version >= 2);
+}
+
+bool gsr_damage_init(gsr_damage *self, gsr_egl *egl, bool track_cursor) {
+    memset(self, 0, sizeof(*self));
+    self->egl = egl;
+    self->track_cursor = track_cursor;
+
+    if(gsr_window_get_display_server(egl->window) != GSR_DISPLAY_SERVER_X11) {
+        fprintf(stderr, "gsr warning: gsr_damage_init: damage tracking is not supported on wayland\n");
+        return false;
+    }
+    self->display = gsr_window_get_display(egl->window);
+
+    if(!XDamageQueryExtension(self->display, &self->damage_event, &self->damage_error)) {
+        fprintf(stderr, "gsr warning: gsr_damage_init: XDamage is not supported by your X11 server\n");
+        gsr_damage_deinit(self);
+        return false;
+    }
+
+    if(!XRRQueryExtension(self->display, &self->randr_event, &self->randr_error)) {
+        fprintf(stderr, "gsr warning: gsr_damage_init: XRandr is not supported by your X11 server\n");
+        gsr_damage_deinit(self);
+        return false;
+    }
+
+    if(!xrandr_is_supported(self->display)) {
+        fprintf(stderr, "gsr warning: gsr_damage_init: your X11 randr version is too old\n");
+        gsr_damage_deinit(self);
+        return false;
+    }
+
+    if(self->track_cursor)
+        self->track_cursor = gsr_cursor_init(&self->cursor, self->egl, self->display) == 0;
+
+    XRRSelectInput(self->display, DefaultRootWindow(self->display), RRScreenChangeNotifyMask | RRCrtcChangeNotifyMask | RROutputChangeNotifyMask);
+
+    self->damaged = true;
+    return true;
+}
+
+void gsr_damage_deinit(gsr_damage *self) {
+    if(self->damage) {
+        XDamageDestroy(self->display, self->damage);
+        self->damage = None;
+    }
+
+    gsr_cursor_deinit(&self->cursor);
+
+    self->damage_event = 0;
+    self->damage_error = 0;
+
+    self->randr_event = 0;
+    self->randr_error = 0;
+}
+
+bool gsr_damage_set_target_window(gsr_damage *self, uint64_t window) {
+    if(self->damage_event == 0)
+        return false;
+
+    if(window == self->window)
+        return true;
+
+    if(self->damage) {
+        XDamageDestroy(self->display, self->damage);
+        self->damage = None;
+    }
+
+    if(self->window)
+        XSelectInput(self->display, self->window, 0);
+
+    self->window = window;
+    XSelectInput(self->display, self->window, StructureNotifyMask | ExposureMask);
+
+    XWindowAttributes win_attr;
+    win_attr.x = 0;
+    win_attr.y = 0;
+    win_attr.width = 0;
+    win_attr.height = 0;
+    if(!XGetWindowAttributes(self->display, self->window, &win_attr))
+        fprintf(stderr, "gsr warning: gsr_damage_set_target_window failed: failed to get window attributes: %ld\n", (long)self->window);
+
+    //self->window_pos.x = win_attr.x;
+    //self->window_pos.y = win_attr.y;
+
+    self->window_size.x = win_attr.width;
+    self->window_size.y = win_attr.height;
+
+    self->damage = XDamageCreate(self->display, window, XDamageReportNonEmpty);
+    if(self->damage) {
+        XDamageSubtract(self->display, self->damage, None, None);
+        self->damaged = true;
+        self->track_type = GSR_DAMAGE_TRACK_WINDOW;
+        return true;
+    } else {
+        fprintf(stderr, "gsr warning: gsr_damage_set_target_window: XDamageCreate failed\n");
+        self->track_type = GSR_DAMAGE_TRACK_NONE;
+        return false;
+    }
+}
+
+bool gsr_damage_set_target_monitor(gsr_damage *self, const char *monitor_name) {
+    if(self->damage_event == 0)
+        return false;
+
+    if(strcmp(self->monitor_name, monitor_name) == 0)
+        return true;
+
+    if(self->damage) {
+        XDamageDestroy(self->display, self->damage);
+        self->damage = None;
+    }
+
+    memset(&self->monitor, 0, sizeof(self->monitor));
+    if(strcmp(monitor_name, "screen-direct") != 0 && strcmp(monitor_name, "screen-direct-force") != 0) {
+        if(!get_monitor_by_name(self->egl, GSR_CONNECTION_X11, monitor_name, &self->monitor))
+            fprintf(stderr, "gsr warning: gsr_damage_set_target_monitor: failed to find monitor: %s\n", monitor_name);
+    }
+
+    if(self->window)
+        XSelectInput(self->display, self->window, 0);
+
+    self->window = DefaultRootWindow(self->display);
+    self->damage = XDamageCreate(self->display, self->window, XDamageReportNonEmpty);
+    if(self->damage) {
+        XDamageSubtract(self->display, self->damage, None, None);
+        self->damaged = true;
+        snprintf(self->monitor_name, sizeof(self->monitor_name), "%s", monitor_name);
+        self->track_type = GSR_DAMAGE_TRACK_MONITOR;
+        return true;
+    } else {
+        fprintf(stderr, "gsr warning: gsr_damage_set_target_monitor: XDamageCreate failed\n");
+        self->track_type = GSR_DAMAGE_TRACK_NONE;
+        return false;
+    }
+}
+
+static void gsr_damage_on_crtc_change(gsr_damage *self, XEvent *xev) {
+    const XRRCrtcChangeNotifyEvent *rr_crtc_change_event = (XRRCrtcChangeNotifyEvent*)xev;
+    if(rr_crtc_change_event->crtc == 0 || self->monitor.monitor_identifier == 0)
+        return;
+
+    if(rr_crtc_change_event->crtc != self->monitor.monitor_identifier)
+        return;
+
+    if(rr_crtc_change_event->width == 0 || rr_crtc_change_event->height == 0)
+        return;
+
+    if(rr_crtc_change_event->x != self->monitor.pos.x || rr_crtc_change_event->y != self->monitor.pos.y ||
+        (int)rr_crtc_change_event->width != self->monitor.size.x || (int)rr_crtc_change_event->height != self->monitor.size.y) {
+        self->monitor.pos.x = rr_crtc_change_event->x;
+        self->monitor.pos.y = rr_crtc_change_event->y;
+
+        self->monitor.size.x = rr_crtc_change_event->width;
+        self->monitor.size.y = rr_crtc_change_event->height;
+    }
+}
+
+static void gsr_damage_on_output_change(gsr_damage *self, XEvent *xev) {
+    const XRROutputChangeNotifyEvent *rr_output_change_event = (XRROutputChangeNotifyEvent*)xev;
+    if(!rr_output_change_event->output || self->monitor.monitor_identifier == 0)
+        return;
+
+    XRRScreenResources *screen_res = XRRGetScreenResources(self->display, DefaultRootWindow(self->display));
+    if(!screen_res)
+        return;
+
+    // TODO: What about scaled output? look at for_each_active_monitor_output_x11_not_cached
+    XRROutputInfo *out_info = XRRGetOutputInfo(self->display, screen_res, rr_output_change_event->output);
+    if(out_info && out_info->crtc && out_info->crtc == self->monitor.monitor_identifier) {
+        XRRCrtcInfo *crtc_info = XRRGetCrtcInfo(self->display, screen_res, out_info->crtc);
+        if(crtc_info && (crtc_info->x != self->monitor.pos.x || crtc_info->y != self->monitor.pos.y ||
+            (int)crtc_info->width != self->monitor.size.x || (int)crtc_info->height != self->monitor.size.y))
+        {
+            self->monitor.pos.x = crtc_info->x;
+            self->monitor.pos.y = crtc_info->y;
+
+            self->monitor.size.x = crtc_info->width;
+            self->monitor.size.y = crtc_info->height;
+        }
+
+        if(crtc_info)
+            XRRFreeCrtcInfo(crtc_info);
+    }
+
+    if(out_info)
+        XRRFreeOutputInfo(out_info);
+
+    XRRFreeScreenResources(screen_res);
+}
+
+static void gsr_damage_on_randr_event(gsr_damage *self, XEvent *xev) {
+    const XRRNotifyEvent *rr_event = (XRRNotifyEvent*)xev;
+    switch(rr_event->subtype) {
+        case RRNotify_CrtcChange:
+            gsr_damage_on_crtc_change(self, xev);
+            break;
+        case RRNotify_OutputChange:
+            gsr_damage_on_output_change(self, xev);
+            break;
+    }
+}
+
+static void gsr_damage_on_damage_event(gsr_damage *self, XEvent *xev) {
+    const XDamageNotifyEvent *de = (XDamageNotifyEvent*)xev;
+    XserverRegion region = XFixesCreateRegion(self->display, NULL, 0);
+    /* Subtract all the damage, repairing the window */
+    XDamageSubtract(self->display, de->damage, None, region);
+
+    if(self->track_type == GSR_DAMAGE_TRACK_WINDOW || (self->track_type == GSR_DAMAGE_TRACK_MONITOR && self->monitor.connector_id == 0)) {
+        self->damaged = true;
+    } else {
+        int num_rectangles = 0;
+        XRectangle *rectangles = XFixesFetchRegion(self->display, region, &num_rectangles);
+        if(rectangles) {
+            const gsr_rectangle monitor_region = { self->monitor.pos, self->monitor.size };
+            for(int i = 0; i < num_rectangles; ++i) {
+                const gsr_rectangle damage_region = { (vec2i){rectangles[i].x, rectangles[i].y}, (vec2i){rectangles[i].width, rectangles[i].height} };
+                self->damaged = self->damaged || rectangles_intersect(monitor_region, damage_region);
+                if(self->damaged)
+                    break;
+            }
+            XFree(rectangles);
+        }
+    }
+
+    XFixesDestroyRegion(self->display, region);
+    XFlush(self->display);
+}
+
+static void gsr_damage_on_tick_cursor(gsr_damage *self) {
+    vec2i prev_cursor_pos = self->cursor.position;
+    gsr_cursor_tick(&self->cursor, self->window);
+    if(self->cursor.position.x != prev_cursor_pos.x || self->cursor.position.y != prev_cursor_pos.y) {
+        const gsr_rectangle cursor_region = { self->cursor.position, self->cursor.size };
+        switch(self->track_type) {
+            case GSR_DAMAGE_TRACK_NONE: {
+                self->damaged = true;
+                break;
+            }
+            case GSR_DAMAGE_TRACK_WINDOW: {
+                const gsr_rectangle window_region = { (vec2i){0, 0}, self->window_size };
+                self->damaged = self->window_size.x == 0 || rectangles_intersect(window_region, cursor_region);
+                break;
+            }
+            case GSR_DAMAGE_TRACK_MONITOR: {
+                const gsr_rectangle monitor_region = { self->monitor.pos, self->monitor.size };
+                self->damaged = self->monitor.monitor_identifier == 0 || rectangles_intersect(monitor_region, cursor_region);
+                break;
+            }
+        }
+    }
+}
+
+static void gsr_damage_on_window_configure_notify(gsr_damage *self, XEvent *xev) {
+    if(xev->xconfigure.window != self->window)
+        return;
+
+    //self->window_pos.x = xev->xconfigure.x;
+    //self->window_pos.y = xev->xconfigure.y;
+
+    self->window_size.x = xev->xconfigure.width;
+    self->window_size.y = xev->xconfigure.height;
+}
+
+void gsr_damage_on_event(gsr_damage *self, XEvent *xev) {
+    if(self->damage_event == 0 || self->track_type == GSR_DAMAGE_TRACK_NONE)
+        return;
+
+    if(self->track_type == GSR_DAMAGE_TRACK_WINDOW && xev->type == ConfigureNotify)
+        gsr_damage_on_window_configure_notify(self, xev);
+
+    if(self->randr_event) {
+        if(xev->type == self->randr_event + RRScreenChangeNotify)
+            XRRUpdateConfiguration(xev);
+
+        if(xev->type == self->randr_event + RRNotify)
+            gsr_damage_on_randr_event(self, xev);
+    }
+
+    if(self->damage_event && xev->type == self->damage_event + XDamageNotify)
+        gsr_damage_on_damage_event(self, xev);
+
+    if(self->track_cursor)
+        gsr_cursor_on_event(&self->cursor, xev);
+}
+
+void gsr_damage_tick(gsr_damage *self) {
+    if(self->damage_event == 0 || self->track_type == GSR_DAMAGE_TRACK_NONE)
+        return;
+
+    if(self->track_cursor && self->cursor.visible && !self->damaged)
+        gsr_damage_on_tick_cursor(self);
+}
+
+bool gsr_damage_is_damaged(gsr_damage *self) {
+    return self->damage_event == 0 || !self->damage || self->damaged || self->track_type == GSR_DAMAGE_TRACK_NONE;
+}
+
+void gsr_damage_clear(gsr_damage *self) {
+    self->damaged = false;
+}
diff --git a/src/defs.c b/src/defs.c
new file mode 100644
index 0000000..319d21b
--- /dev/null
+++ b/src/defs.c
@@ -0,0 +1,100 @@
+#include "../include/defs.h"
+#include <assert.h>
+
+bool video_codec_is_hdr(gsr_video_codec video_codec) {
+    // TODO: Vulkan
+    switch(video_codec) {
+        case GSR_VIDEO_CODEC_HEVC_HDR:
+        case GSR_VIDEO_CODEC_AV1_HDR:
+            return true;
+        default:
+            return false;
+    }
+}
+
+gsr_video_codec hdr_video_codec_to_sdr_video_codec(gsr_video_codec video_codec) {
+    // TODO: Vulkan
+    switch(video_codec) {
+        case GSR_VIDEO_CODEC_HEVC_HDR:
+            return GSR_VIDEO_CODEC_HEVC;
+        case GSR_VIDEO_CODEC_AV1_HDR:
+            return GSR_VIDEO_CODEC_AV1;
+        default:
+            return video_codec;
+    }
+}
+
+gsr_color_depth video_codec_to_bit_depth(gsr_video_codec video_codec) {
+    // TODO: 10-bit Vulkan
+    switch(video_codec) {
+        case GSR_VIDEO_CODEC_HEVC_HDR:
+        case GSR_VIDEO_CODEC_HEVC_10BIT:
+        case GSR_VIDEO_CODEC_AV1_HDR:
+        case GSR_VIDEO_CODEC_AV1_10BIT:
+            return GSR_COLOR_DEPTH_10_BITS;
+        default:
+            return GSR_COLOR_DEPTH_8_BITS;
+    }
+}
+
+const char* video_codec_to_string(gsr_video_codec video_codec) {
+    switch(video_codec) {
+        case GSR_VIDEO_CODEC_H264:        return "h264";
+        case GSR_VIDEO_CODEC_HEVC:        return "hevc";
+        case GSR_VIDEO_CODEC_HEVC_HDR:    return "hevc_hdr";
+        case GSR_VIDEO_CODEC_HEVC_10BIT:  return "hevc_10bit";
+        case GSR_VIDEO_CODEC_AV1:         return "av1";
+        case GSR_VIDEO_CODEC_AV1_HDR:     return "av1_hdr";
+        case GSR_VIDEO_CODEC_AV1_10BIT:   return "av1_10bit";
+        case GSR_VIDEO_CODEC_VP8:         return "vp8";
+        case GSR_VIDEO_CODEC_VP9:         return "vp9";
+        case GSR_VIDEO_CODEC_H264_VULKAN: return "h264_vulkan";
+        case GSR_VIDEO_CODEC_HEVC_VULKAN: return "hevc_vulkan";
+    }
+    return "";
+}
+
+// bool video_codec_is_hevc(gsr_video_codec video_codec) {
+//     // TODO: 10-bit vulkan
+//     switch(video_codec) {
+//         case GSR_VIDEO_CODEC_HEVC:
+//         case GSR_VIDEO_CODEC_HEVC_HDR:
+//         case GSR_VIDEO_CODEC_HEVC_10BIT:
+//         case GSR_VIDEO_CODEC_HEVC_VULKAN:
+//             return true;
+//         default:
+//             return false;
+//     }
+// }
+
+bool video_codec_is_av1(gsr_video_codec video_codec) {
+    // TODO: Vulkan
+    switch(video_codec) {
+        case GSR_VIDEO_CODEC_AV1:
+        case GSR_VIDEO_CODEC_AV1_HDR:
+        case GSR_VIDEO_CODEC_AV1_10BIT:
+            return true;
+        default:
+            return false;
+    }
+}
+
+bool video_codec_is_vulkan(gsr_video_codec video_codec) {
+    switch(video_codec) {
+        case GSR_VIDEO_CODEC_H264_VULKAN:
+        case GSR_VIDEO_CODEC_HEVC_VULKAN:
+            return true;
+        default:
+            return false;
+    }
+}
+
+const char* audio_codec_get_name(gsr_audio_codec audio_codec) {
+    switch(audio_codec) {
+        case GSR_AUDIO_CODEC_AAC:  return "aac";
+        case GSR_AUDIO_CODEC_OPUS: return "opus";
+        case GSR_AUDIO_CODEC_FLAC: return "flac";
+    }
+    assert(false);
+    return "";
+}
diff --git a/src/egl.c b/src/egl.c
index faae6e7..bcb1663 100644
--- a/src/egl.c
+++ b/src/egl.c
@@ -1,273 +1,52 @@
 #include "../include/egl.h"
+#include "../include/window/window.h"
 #include "../include/library_loader.h"
+#include "../include/utils.h"
+
 #include <string.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <dlfcn.h>
 #include <assert.h>
-
-#include <wayland-client.h>
-#include <wayland-egl.h>
-#include "../external/wlr-export-dmabuf-unstable-v1-client-protocol.h"
 #include <unistd.h>
-#include <sys/capability.h>
-
-// Move this shit to a separate wayland file, and have a separate file for x11.
-
-static void output_handle_geometry(void *data, struct wl_output *wl_output,
-        int32_t x, int32_t y, int32_t phys_width, int32_t phys_height,
-        int32_t subpixel, const char *make, const char *model,
-        int32_t transform) {
-    (void)wl_output;
-    (void)phys_width;
-    (void)phys_height;
-    (void)subpixel;
-    (void)make;
-    (void)model;
-    (void)transform;
-    gsr_wayland_output *gsr_output = data;
-    gsr_output->pos.x = x;
-    gsr_output->pos.y = y;
-}
-
-static void output_handle_mode(void *data, struct wl_output *wl_output, uint32_t flags, int32_t width, int32_t height, int32_t refresh) {
-    (void)wl_output;
-    (void)flags;
-    (void)refresh;
-    gsr_wayland_output *gsr_output = data;
-    gsr_output->size.x = width;
-    gsr_output->size.y = height;
-}
-
-static void output_handle_done(void *data, struct wl_output *wl_output) {
-    (void)data;
-    (void)wl_output;
-}
 
-static void output_handle_scale(void* data, struct wl_output *wl_output, int32_t factor) {
-    (void)data;
-    (void)wl_output;
-    (void)factor;
-}
-
-static void output_handle_name(void *data, struct wl_output *wl_output, const char *name) {
-    (void)wl_output;
-    gsr_wayland_output *gsr_output = data;
-    if(gsr_output->name) {
-        free(gsr_output->name);
-        gsr_output->name = NULL;
-    }
-    gsr_output->name = strdup(name);
-}
-
-static void output_handle_description(void *data, struct wl_output *wl_output, const char *description) {
-    (void)data;
-    (void)wl_output;
-    (void)description;
-}
-
-static const struct wl_output_listener output_listener = {
-    .geometry = output_handle_geometry,
-    .mode = output_handle_mode,
-    .done = output_handle_done,
-    .scale = output_handle_scale,
-    .name = output_handle_name,
-    .description = output_handle_description,
-};
-
-static void registry_add_object(void *data, struct wl_registry *registry, uint32_t name, const char *interface, uint32_t version) {
-    (void)version;
-    gsr_egl *egl = data;
-    if (strcmp(interface, "wl_compositor") == 0) {
-        if(egl->wayland.compositor) {
-            wl_compositor_destroy(egl->wayland.compositor);
-            egl->wayland.compositor = NULL;
-        }
-        egl->wayland.compositor = wl_registry_bind(registry, name, &wl_compositor_interface, 1);
-    } else if(strcmp(interface, wl_output_interface.name) == 0) {
-        if(version < 4) {
-            fprintf(stderr, "gsr warning: wl output interface version is < 4, expected >= 4 to capture a monitor. Using KMS capture instead\n");
-            return;
-        }
-
-        if(egl->wayland.num_outputs == GSR_MAX_OUTPUTS) {
-            fprintf(stderr, "gsr warning: reached maximum outputs (32), ignoring output %u\n", name);
-            return;
-        }
-
-        gsr_wayland_output *gsr_output = &egl->wayland.outputs[egl->wayland.num_outputs];
-        egl->wayland.num_outputs++;
-        *gsr_output = (gsr_wayland_output) {
-            .wl_name = name,
-            .output = wl_registry_bind(registry, name, &wl_output_interface, 4),
-            .pos = { .x = 0, .y = 0 },
-            .size = { .x = 0, .y = 0 },
-            .name = NULL,
-        };
-        wl_output_add_listener(gsr_output->output, &output_listener, gsr_output);
-    } else if(strcmp(interface, zwlr_export_dmabuf_manager_v1_interface.name) == 0) {
-        if(egl->wayland.export_manager) {
-            zwlr_export_dmabuf_manager_v1_destroy(egl->wayland.export_manager);
-            egl->wayland.export_manager = NULL;
-        }
-        egl->wayland.export_manager = wl_registry_bind(registry, name, &zwlr_export_dmabuf_manager_v1_interface, 1);
-    }
-}
-
-static void registry_remove_object(void *data, struct wl_registry *registry, uint32_t name) {
-    (void)data;
-    (void)registry;
-    (void)name;
-}
-
-static struct wl_registry_listener registry_listener = {
-    .global = registry_add_object,
-    .global_remove = registry_remove_object,
-};
-
-static void frame_capture_output(gsr_egl *egl);
-
-static void frame_start(void *data, struct zwlr_export_dmabuf_frame_v1 *frame,
-        uint32_t width, uint32_t height, uint32_t offset_x, uint32_t offset_y,
-        uint32_t buffer_flags, uint32_t flags, uint32_t format,
-        uint32_t mod_high, uint32_t mod_low, uint32_t num_objects) {
-    (void)buffer_flags;
-    (void)flags;
-    (void)num_objects;
-    gsr_egl *egl = data;
-    //fprintf(stderr, "frame start %p, width: %u, height: %u, offset x: %u, offset y: %u, format: %u, num objects: %u\n", (void*)frame, width, height, offset_x, offset_y, format, num_objects);
-    egl->x = offset_x;
-    egl->y = offset_y;
-    egl->width = width;
-    egl->height = height;
-    egl->pixel_format = format;
-    egl->modifier = ((uint64_t)mod_high << 32) | (uint64_t)mod_low;
-    egl->wayland.current_frame = frame;
-}
-
-static void frame_object(void *data, struct zwlr_export_dmabuf_frame_v1 *frame,
-        uint32_t index, int32_t fd, uint32_t size, uint32_t offset,
-        uint32_t stride, uint32_t plane_index) {
-    // TODO: What if we get multiple objects? then we get multiple fd per frame
-    (void)frame;
-    (void)index;
-    (void)size;
-    (void)plane_index;
-    gsr_egl *egl = data;
-    if(egl->fd > 0) {
-        close(egl->fd);
-        egl->fd = 0;
-    }
-    egl->fd = fd;
-    egl->pitch = stride;
-    egl->offset = offset;
-    //fprintf(stderr, "new frame %p, fd: %d, index: %u, size: %u, offset: %u, stride: %u, plane_index: %u\n", (void*)frame, fd, index, size, offset, stride, plane_index);
-}
-
-static void frame_ready(void *data, struct zwlr_export_dmabuf_frame_v1 *frame, uint32_t tv_sec_hi, uint32_t tv_sec_lo, uint32_t tv_nsec) {
-    (void)frame;
-    (void)tv_sec_hi;
-    (void)tv_sec_lo;
-    (void)tv_nsec;
-    frame_capture_output(data);
-}
-
-static void frame_cancel(void *data, struct zwlr_export_dmabuf_frame_v1 *frame, uint32_t reason) {
-    (void)frame;
-    (void)reason;
-    frame_capture_output(data);
-}
-
-static const struct zwlr_export_dmabuf_frame_v1_listener frame_listener = {
-    .frame = frame_start,
-    .object = frame_object,
-    .ready = frame_ready,
-    .cancel = frame_cancel,
-};
-
-static void frame_capture_output(gsr_egl *egl) {
-    assert(egl->wayland.output_to_capture);
-    bool with_cursor = true;
-
-    if(egl->wayland.frame_callback) {
-        zwlr_export_dmabuf_frame_v1_destroy(egl->wayland.frame_callback);
-        egl->wayland.frame_callback = NULL;
-    }
-
-    egl->wayland.frame_callback = zwlr_export_dmabuf_manager_v1_capture_output(egl->wayland.export_manager, with_cursor, egl->wayland.output_to_capture->output);
-    zwlr_export_dmabuf_frame_v1_add_listener(egl->wayland.frame_callback, &frame_listener, egl);
-}
-
-static gsr_wayland_output* get_wayland_output_by_name(gsr_egl *egl, const char *name) {
-    assert(name);
-    for(int i = 0; i < egl->wayland.num_outputs; ++i) {
-        if(egl->wayland.outputs[i].name && strcmp(egl->wayland.outputs[i].name, name) == 0)
-            return &egl->wayland.outputs[i];
-    }
-    return NULL;
-}
-
-static void reset_cap_nice(void) {
-    cap_t caps = cap_get_proc();
-    if(!caps)
-        return;
-
-    const cap_value_t cap_to_remove = CAP_SYS_NICE;
-    cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_to_remove, CAP_CLEAR);
-    cap_set_flag(caps, CAP_PERMITTED, 1, &cap_to_remove, CAP_CLEAR);
-    cap_set_proc(caps);
-    cap_free(caps);
-}
+// TODO: rename gsr_egl to something else since this includes both egl and glx and in the future maybe vulkan too
+
+#define GLX_DRAWABLE_TYPE                  0x8010
+#define GLX_RENDER_TYPE                    0x8011
+#define GLX_RGBA_BIT                       0x00000001
+#define GLX_WINDOW_BIT                     0x00000001
+#define GLX_PIXMAP_BIT                     0x00000002
+#define GLX_BIND_TO_TEXTURE_RGBA_EXT       0x20D1
+#define GLX_BIND_TO_TEXTURE_TARGETS_EXT    0x20D3
+#define GLX_TEXTURE_2D_BIT_EXT             0x00000002
+#define GLX_DOUBLEBUFFER                   5
+#define GLX_RED_SIZE                       8
+#define GLX_GREEN_SIZE                     9
+#define GLX_BLUE_SIZE                      10
+#define GLX_ALPHA_SIZE                     11
+#define GLX_DEPTH_SIZE                     12
+#define GLX_RGBA_TYPE                      0x8014
 
 // TODO: Create egl context without surface (in other words, x11/wayland agnostic, doesn't require x11/wayland dependency)
-static bool gsr_egl_create_window(gsr_egl *self, bool wayland) {
+static bool gsr_egl_create_window(gsr_egl *self) {
     EGLConfig  ecfg;
     int32_t    num_config = 0;
 
     const int32_t attr[] = {
         EGL_BUFFER_SIZE, 24,
-        EGL_RENDERABLE_TYPE, EGL_OPENGL_BIT,
-        EGL_NONE
+        EGL_RENDERABLE_TYPE, EGL_OPENGL_ES3_BIT,
+        EGL_NONE, EGL_NONE
     };
 
     const int32_t ctxattr[] = {
         EGL_CONTEXT_CLIENT_VERSION, 2,
-        EGL_CONTEXT_PRIORITY_LEVEL_IMG, EGL_CONTEXT_PRIORITY_HIGH_IMG, /* requires cap_sys_nice, ignored otherwise */
-        EGL_NONE
+        EGL_NONE, EGL_NONE
     };
 
-    if(wayland) {
-        self->wayland.dpy = wl_display_connect(NULL);
-        if(!self->wayland.dpy) {
-            fprintf(stderr, "gsr error: gsr_egl_create_window failed: wl_display_connect failed\n");
-            goto fail;
-        }
-
-        self->wayland.registry = wl_display_get_registry(self->wayland.dpy); // TODO: Error checking
-        wl_registry_add_listener(self->wayland.registry, &registry_listener, self); // TODO: Error checking
-
-        // Fetch globals
-        wl_display_roundtrip(self->wayland.dpy);
-
-        // Fetch wl_output
-        wl_display_roundtrip(self->wayland.dpy);
+    self->eglBindAPI(EGL_OPENGL_ES_API);
 
-        if(!self->wayland.compositor) {
-            fprintf(stderr, "gsr error: gsr_gl_create_window failed: failed to find compositor\n");
-            goto fail;
-        }
-    } else {
-        self->x11.window = XCreateWindow(self->x11.dpy, DefaultRootWindow(self->x11.dpy), 0, 0, 16, 16, 0, CopyFromParent, InputOutput, CopyFromParent, 0, NULL);
-
-        if(!self->x11.window) {
-            fprintf(stderr, "gsr error: gsr_gl_create_window failed: failed to create gl window\n");
-            goto fail;
-        }
-    }
-
-    self->eglBindAPI(EGL_OPENGL_API);
-
-    self->egl_display = self->eglGetDisplay(self->wayland.dpy ? (EGLNativeDisplayType)self->wayland.dpy : (EGLNativeDisplayType)self->x11.dpy);
+    self->egl_display = self->eglGetDisplay((EGLNativeDisplayType)gsr_window_get_display(self->window));
     if(!self->egl_display) {
         fprintf(stderr, "gsr error: gsr_egl_create_window failed: eglGetDisplay failed\n");
         goto fail;
@@ -277,47 +56,116 @@ static bool gsr_egl_create_window(gsr_egl *self, bool wayland) {
         fprintf(stderr, "gsr error: gsr_egl_create_window failed: eglInitialize failed\n");
         goto fail;
     }
-    
+
     if(!self->eglChooseConfig(self->egl_display, attr, &ecfg, 1, &num_config) || num_config != 1) {
         fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to find a matching config\n");
         goto fail;
     }
-    
+
     self->egl_context = self->eglCreateContext(self->egl_display, ecfg, NULL, ctxattr);
     if(!self->egl_context) {
         fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to create egl context\n");
         goto fail;
     }
 
-    if(wayland) {
-        self->wayland.surface = wl_compositor_create_surface(self->wayland.compositor);
-        self->wayland.window = wl_egl_window_create(self->wayland.surface, 16, 16);
-        self->egl_surface = self->eglCreateWindowSurface(self->egl_display, ecfg, (EGLNativeWindowType)self->wayland.window, NULL);
-    } else {
-        self->egl_surface = self->eglCreateWindowSurface(self->egl_display, ecfg, (EGLNativeWindowType)self->x11.window, NULL);
-    }
-
+    self->egl_surface = self->eglCreateWindowSurface(self->egl_display, ecfg, (EGLNativeWindowType)gsr_window_get_window(self->window), NULL);
     if(!self->egl_surface) {
         fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to create window surface\n");
         goto fail;
     }
 
     if(!self->eglMakeCurrent(self->egl_display, self->egl_surface, self->egl_surface, self->egl_context)) {
-        fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to make context current\n");
+        fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to make egl context current\n");
         goto fail;
     }
 
-    reset_cap_nice();
     return true;
 
     fail:
-    reset_cap_nice();
     gsr_egl_unload(self);
     return false;
 }
 
+static GLXFBConfig glx_fb_config_choose(gsr_egl *self, Display *display) {
+    const int glx_visual_attribs[] = {
+        GLX_RENDER_TYPE, GLX_RGBA_BIT,
+        GLX_DRAWABLE_TYPE, GLX_WINDOW_BIT,
+        // TODO:
+        //GLX_BIND_TO_TEXTURE_RGBA_EXT, 1,
+        //GLX_BIND_TO_TEXTURE_TARGETS_EXT, GLX_TEXTURE_2D_BIT_EXT,
+        GLX_DOUBLEBUFFER, True,
+        GLX_RED_SIZE, 8,
+        GLX_GREEN_SIZE, 8,
+        GLX_BLUE_SIZE, 8,
+        GLX_ALPHA_SIZE, 0,
+        GLX_DEPTH_SIZE, 0,
+        None, None
+    };
+
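+    // glXChooseFBConfig returns an array that the caller must release with XFree
+    int c = 0;
+    GLXFBConfig *fb_configs = self->glXChooseFBConfig(display, DefaultScreen(display), glx_visual_attribs, &c);
+    GLXFBConfig fb_config = (fb_configs && c > 0) ? fb_configs[0] : NULL;
+    if(fb_configs)
+        XFree(fb_configs);
+    return fb_config;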
+}
+
+static bool gsr_egl_switch_to_glx_context(gsr_egl *self) {
+    // TODO: Cleanup
+    assert(gsr_window_get_display_server(self->window) == GSR_DISPLAY_SERVER_X11);
+    Display *display = gsr_window_get_display(self->window);
+    const Window window = (Window)gsr_window_get_window(self->window);
+
+    if(self->egl_context) {
+        self->eglMakeCurrent(self->egl_display, NULL, NULL, NULL);
+        self->eglDestroyContext(self->egl_display, self->egl_context);
+        self->egl_context = NULL;
+    }
+
+    if(self->egl_surface) {
+        self->eglDestroySurface(self->egl_display, self->egl_surface);
+        self->egl_surface = NULL;
+    }
+
+    if(self->egl_display) {
+        self->eglTerminate(self->egl_display);
+        self->egl_display = NULL;
+    }
+
+    self->glx_fb_config = glx_fb_config_choose(self, display);
+    if(!self->glx_fb_config) {
+        fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to find a suitable fb config\n");
+        goto fail;
+    }
+
+    // TODO:
+    //self->glx_context = self->glXCreateContextAttribsARB(display, self->glx_fb_config, NULL, True, context_attrib_list);
+    self->glx_context = self->glXCreateNewContext(display, self->glx_fb_config, GLX_RGBA_TYPE, NULL, True);
+    if(!self->glx_context) {
+        fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to create glx context\n");
+        goto fail;
+    }
+
+    if(!self->glXMakeContextCurrent(display, window, window, self->glx_context)) {
+        fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to make glx context current\n");
+        goto fail;
+    }
+
+    return true;
+
+    fail:
+    if(self->glx_context) {
+        self->glXMakeContextCurrent(display, None, None, NULL);
+        self->glXDestroyContext(display, self->glx_context);
+        self->glx_context = NULL;
+        self->glx_fb_config = NULL;
+    }
+    return false;
+}
+
 static bool gsr_egl_load_egl(gsr_egl *self, void *library) {
-    dlsym_assign required_dlsym[] = {
+    const dlsym_assign required_dlsym[] = {
         { (void**)&self->eglGetError, "eglGetError" },
         { (void**)&self->eglGetDisplay, "eglGetDisplay" },
         { (void**)&self->eglInitialize, "eglInitialize" },
@@ -350,6 +198,27 @@ static bool gsr_egl_proc_load_egl(gsr_egl *self) {
     self->eglExportDMABUFImageQueryMESA = (FUNC_eglExportDMABUFImageQueryMESA)self->eglGetProcAddress("eglExportDMABUFImageQueryMESA");
     self->eglExportDMABUFImageMESA = (FUNC_eglExportDMABUFImageMESA)self->eglGetProcAddress("eglExportDMABUFImageMESA");
     self->glEGLImageTargetTexture2DOES = (FUNC_glEGLImageTargetTexture2DOES)self->eglGetProcAddress("glEGLImageTargetTexture2DOES");
+    self->eglQueryDisplayAttribEXT = (FUNC_eglQueryDisplayAttribEXT)self->eglGetProcAddress("eglQueryDisplayAttribEXT");
+    self->eglQueryDeviceStringEXT = (FUNC_eglQueryDeviceStringEXT)self->eglGetProcAddress("eglQueryDeviceStringEXT");
+    self->eglQueryDmaBufModifiersEXT = (FUNC_eglQueryDmaBufModifiersEXT)self->eglGetProcAddress("eglQueryDmaBufModifiersEXT");
+
+    self->glCreateMemoryObjectsEXT = (FUNC_glCreateMemoryObjectsEXT)self->eglGetProcAddress("glCreateMemoryObjectsEXT");
+    self->glImportMemoryFdEXT = (FUNC_glImportMemoryFdEXT)self->eglGetProcAddress("glImportMemoryFdEXT");
+    self->glIsMemoryObjectEXT = (FUNC_glIsMemoryObjectEXT)self->eglGetProcAddress("glIsMemoryObjectEXT");
+    self->glTexStorageMem2DEXT = (FUNC_glTexStorageMem2DEXT)self->eglGetProcAddress("glTexStorageMem2DEXT");
+    self->glBufferStorageMemEXT = (FUNC_glBufferStorageMemEXT)self->eglGetProcAddress("glBufferStorageMemEXT");
+    self->glNamedBufferStorageMemEXT = (FUNC_glNamedBufferStorageMemEXT)self->eglGetProcAddress("glNamedBufferStorageMemEXT");
+    self->glMemoryObjectParameterivEXT = (FUNC_glMemoryObjectParameterivEXT)self->eglGetProcAddress("glMemoryObjectParameterivEXT");
+
+    if(!self->eglExportDMABUFImageQueryMESA) {
+        fprintf(stderr, "gsr error: gsr_egl_load failed: could not find eglExportDMABUFImageQueryMESA\n");
+        return false;
+    }
+
+    if(!self->eglExportDMABUFImageMESA) {
+        fprintf(stderr, "gsr error: gsr_egl_load failed: could not find eglExportDMABUFImageMESA\n");
+        return false;
+    }
 
     if(!self->glEGLImageTargetTexture2DOES) {
         fprintf(stderr, "gsr error: gsr_egl_load failed: could not find glEGLImageTargetTexture2DOES\n");
@@ -359,23 +228,62 @@ static bool gsr_egl_proc_load_egl(gsr_egl *self) {
     return true;
 }
 
+static bool gsr_egl_load_glx(gsr_egl *self, void *library) {
+    const dlsym_assign required_dlsym[] = {
+        { (void**)&self->glXGetProcAddress, "glXGetProcAddress" },
+        { (void**)&self->glXChooseFBConfig, "glXChooseFBConfig" },
+        { (void**)&self->glXMakeContextCurrent, "glXMakeContextCurrent" },
+        { (void**)&self->glXCreateNewContext, "glXCreateNewContext" },
+        { (void**)&self->glXDestroyContext, "glXDestroyContext" },
+        { (void**)&self->glXSwapBuffers, "glXSwapBuffers" },
+
+        { NULL, NULL }
+    };
+
+    if(!dlsym_load_list(library, required_dlsym)) {
+        fprintf(stderr, "gsr error: gsr_egl_load failed: missing required symbols in libGLX.so.0\n");
+        return false;
+    }
+
+    self->glXCreateContextAttribsARB = (FUNC_glXCreateContextAttribsARB)self->glXGetProcAddress((const unsigned char*)"glXCreateContextAttribsARB");
+    if(!self->glXCreateContextAttribsARB) {
+        fprintf(stderr, "gsr error: gsr_egl_load_glx failed: could not find glXCreateContextAttribsARB\n");
+        return false;
+    }
+
+    self->glXSwapIntervalEXT = (FUNC_glXSwapIntervalEXT)self->glXGetProcAddress((const unsigned char*)"glXSwapIntervalEXT");
+    self->glXSwapIntervalMESA = (FUNC_glXSwapIntervalMESA)self->glXGetProcAddress((const unsigned char*)"glXSwapIntervalMESA");
+    self->glXSwapIntervalSGI = (FUNC_glXSwapIntervalSGI)self->glXGetProcAddress((const unsigned char*)"glXSwapIntervalSGI");
+
+    return true;
+}
+
 static bool gsr_egl_load_gl(gsr_egl *self, void *library) {
-    dlsym_assign required_dlsym[] = {
+    const dlsym_assign required_dlsym[] = {
         { (void**)&self->glGetError, "glGetError" },
         { (void**)&self->glGetString, "glGetString" },
+        { (void**)&self->glFlush, "glFlush" },
+        { (void**)&self->glFinish, "glFinish" },
         { (void**)&self->glClear, "glClear" },
         { (void**)&self->glClearColor, "glClearColor" },
         { (void**)&self->glGenTextures, "glGenTextures" },
         { (void**)&self->glDeleteTextures, "glDeleteTextures" },
+        { (void**)&self->glActiveTexture, "glActiveTexture" },
         { (void**)&self->glBindTexture, "glBindTexture" },
+        { (void**)&self->glBindImageTexture, "glBindImageTexture" },
         { (void**)&self->glTexParameteri, "glTexParameteri" },
+        { (void**)&self->glTexParameteriv, "glTexParameteriv" },
+        { (void**)&self->glTexParameterfv, "glTexParameterfv" },
         { (void**)&self->glGetTexLevelParameteriv, "glGetTexLevelParameteriv" },
         { (void**)&self->glTexImage2D, "glTexImage2D" },
-        { (void**)&self->glCopyImageSubData, "glCopyImageSubData" },
-        { (void**)&self->glClearTexImage, "glClearTexImage" },
+        { (void**)&self->glTexSubImage2D, "glTexSubImage2D" },
+        { (void**)&self->glTexStorage2D, "glTexStorage2D" },
+        { (void**)&self->glGetTexImage, "glGetTexImage" },
         { (void**)&self->glGenFramebuffers, "glGenFramebuffers" },
         { (void**)&self->glBindFramebuffer, "glBindFramebuffer" },
         { (void**)&self->glDeleteFramebuffers, "glDeleteFramebuffers" },
+        { (void**)&self->glDispatchCompute, "glDispatchCompute" },
+        { (void**)&self->glMemoryBarrier, "glMemoryBarrier" },
         { (void**)&self->glViewport, "glViewport" },
         { (void**)&self->glFramebufferTexture2D, "glFramebufferTexture2D" },
         { (void**)&self->glDrawBuffers, "glDrawBuffers" },
@@ -406,9 +314,21 @@ static bool gsr_egl_load_gl(gsr_egl *self, void *library) {
         { (void**)&self->glEnableVertexAttribArray, "glEnableVertexAttribArray" },
         { (void**)&self->glDrawArrays, "glDrawArrays" },
         { (void**)&self->glEnable, "glEnable" },
+        { (void**)&self->glDisable, "glDisable" },
         { (void**)&self->glBlendFunc, "glBlendFunc" },
+        { (void**)&self->glPixelStorei, "glPixelStorei" },
         { (void**)&self->glGetUniformLocation, "glGetUniformLocation" },
         { (void**)&self->glUniform1f, "glUniform1f" },
+        { (void**)&self->glUniform2f, "glUniform2f" },
+        { (void**)&self->glUniform1i, "glUniform1i" },
+        { (void**)&self->glUniform2i, "glUniform2i" },
+        { (void**)&self->glUniformMatrix2fv, "glUniformMatrix2fv" },
+        { (void**)&self->glDebugMessageCallback, "glDebugMessageCallback" },
+        { (void**)&self->glScissor, "glScissor" },
+        { (void**)&self->glReadPixels, "glReadPixels" },
+        { (void**)&self->glMapBuffer, "glMapBuffer" },
+        { (void**)&self->glUnmapBuffer, "glUnmapBuffer" },
+        { (void**)&self->glGetIntegerv, "glGetIntegerv" },
 
         { NULL, NULL }
     };
@@ -421,56 +341,139 @@ static bool gsr_egl_load_gl(gsr_egl *self, void *library) {
     return true;
 }
 
-bool gsr_egl_load(gsr_egl *self, Display *dpy, bool wayland) {
-    memset(self, 0, sizeof(gsr_egl));
-    self->x11.dpy = dpy;
+#define GL_DEBUG_TYPE_ERROR               0x824C
+#define GL_DEBUG_SEVERITY_NOTIFICATION    0x826B
+static void debug_callback(unsigned int source, unsigned int type, unsigned int id, unsigned int severity, int length, const char* message, const void* userParam) {
+    (void)source;
+    (void)id;
+    (void)length;
+    (void)userParam;
+    if(severity != GL_DEBUG_SEVERITY_NOTIFICATION)
+        fprintf(stderr, "gsr info: gl callback: %s type = 0x%x, severity = 0x%x, message = %s\n", type == GL_DEBUG_TYPE_ERROR ? "** GL ERROR **" : "", type, severity, message);
+}
 
-    void *egl_lib = NULL;
-    void *gl_lib = NULL;
+/* TODO: check for glx swap control extension string (GLX_EXT_swap_control, etc) */
+static void set_vertical_sync_enabled(gsr_egl *egl, int enabled) {
+    int result = 0;
+
+    if(egl->glXSwapIntervalEXT) {
+        assert(gsr_window_get_display_server(egl->window) == GSR_DISPLAY_SERVER_X11);
+        Display *display = gsr_window_get_display(egl->window);
+        const Window window = (Window)gsr_window_get_window(egl->window);
+        egl->glXSwapIntervalEXT(display, window, enabled ? 1 : 0);
+    } else if(egl->glXSwapIntervalMESA) {
+        result = egl->glXSwapIntervalMESA(enabled ? 1 : 0);
+    } else if(egl->glXSwapIntervalSGI) {
+        result = egl->glXSwapIntervalSGI(enabled ? 1 : 0);
+    } else {
+        static int warned = 0;
+        if (!warned) {
+            warned = 1;
+            fprintf(stderr, "gsr warning: setting vertical sync not supported\n");
+        }
+    }
+
+    if(result != 0)
+        fprintf(stderr, "gsr warning: setting vertical sync failed\n");
+}
+
+static void gsr_egl_disable_vsync(gsr_egl *self) {
+    switch(self->context_type) {
+        case GSR_GL_CONTEXT_TYPE_EGL: {
+            self->eglSwapInterval(self->egl_display, 0);
+            break;
+        }
+        case GSR_GL_CONTEXT_TYPE_GLX: {
+            set_vertical_sync_enabled(self, 0);
+            break;
+        }
+    }
+}
+
+bool gsr_egl_load(gsr_egl *self, gsr_window *window, bool is_monitor_capture, bool enable_debug) {
+    memset(self, 0, sizeof(gsr_egl));
+    self->context_type = GSR_GL_CONTEXT_TYPE_EGL;
+    self->window = window;
 
     dlerror(); /* clear */
-    egl_lib = dlopen("libEGL.so.1", RTLD_LAZY);
-    if(!egl_lib) {
+    self->egl_library = dlopen("libEGL.so.1", RTLD_LAZY);
+    if(!self->egl_library) {
         fprintf(stderr, "gsr error: gsr_egl_load: failed to load libEGL.so.1, error: %s\n", dlerror());
         goto fail;
     }
 
-    gl_lib = dlopen("libGL.so.1", RTLD_LAZY);
-    if(!egl_lib) {
+    self->glx_library = dlopen("libGLX.so.0", RTLD_LAZY);
+
+    self->gl_library = dlopen("libGL.so.1", RTLD_LAZY);
+    if(!self->gl_library) {
         fprintf(stderr, "gsr error: gsr_egl_load: failed to load libGL.so.1, error: %s\n", dlerror());
         goto fail;
     }
 
-    if(!gsr_egl_load_egl(self, egl_lib))
+    if(!gsr_egl_load_egl(self, self->egl_library))
         goto fail;
 
-    if(!gsr_egl_load_gl(self, gl_lib))
+    /* In some distros (alpine for example) libGLX doesn't exist, but libGL can be used instead */
+    if(!gsr_egl_load_glx(self, self->glx_library ? self->glx_library : self->gl_library))
+        goto fail;
+
+    if(!gsr_egl_load_gl(self, self->gl_library))
         goto fail;
 
     if(!gsr_egl_proc_load_egl(self))
         goto fail;
 
-    if(!gsr_egl_create_window(self, wayland))
+    if(!gsr_egl_create_window(self))
+        goto fail;
+
+    if(!gl_get_gpu_info(self, &self->gpu_info))
         goto fail;
 
+    if(self->eglQueryDisplayAttribEXT && self->eglQueryDeviceStringEXT) {
+        intptr_t device = 0;
+        if(self->eglQueryDisplayAttribEXT(self->egl_display, EGL_DEVICE_EXT, &device) && device)
+            self->dri_card_path = self->eglQueryDeviceStringEXT((void*)device, EGL_DRM_DEVICE_FILE_EXT);
+    }
+
+    /* Nvfbc requires glx */
+    if(gsr_window_get_display_server(self->window) == GSR_DISPLAY_SERVER_X11 && is_monitor_capture && self->gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA) {
+        self->context_type = GSR_GL_CONTEXT_TYPE_GLX;
+        self->dri_card_path = NULL;
+        if(!gsr_egl_switch_to_glx_context(self))
+            goto fail;
+    }
+
     self->glEnable(GL_BLEND);
     self->glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
+    self->glPixelStorei(GL_PACK_ALIGNMENT, 1);
+    self->glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
+
+    if(enable_debug) {
+        self->glEnable(GL_DEBUG_OUTPUT);
+        self->glDebugMessageCallback(debug_callback, NULL);
+    }
+
+    gsr_egl_disable_vsync(self);
+
+    if(self->gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA) {
+        /* This fixes nvenc codecs unable to load on openSUSE tumbleweed because of a cuda error. Don't ask me why */
+        const bool inside_flatpak = getenv("FLATPAK_ID") != NULL;
+        if(inside_flatpak)
+            system("flatpak-spawn --host -- sh -c 'grep -q openSUSE /etc/os-release && nvidia-smi -f /dev/null'");
+        else
+            system("sh -c 'grep -q openSUSE /etc/os-release && nvidia-smi -f /dev/null'");
+    }
 
-    self->egl_library = egl_lib;
-    self->gl_library = gl_lib;
     return true;
 
     fail:
-    if(egl_lib)
-        dlclose(egl_lib);
-    if(gl_lib)
-        dlclose(gl_lib);
-    memset(self, 0, sizeof(gsr_egl));
+    gsr_egl_unload(self);
     return false;
 }
 
 void gsr_egl_unload(gsr_egl *self) {
     if(self->egl_context) {
+        self->eglMakeCurrent(self->egl_display, NULL, NULL, NULL);
         self->eglDestroyContext(self->egl_display, self->egl_context);
         self->egl_context = NULL;
     }
@@ -485,59 +488,13 @@ void gsr_egl_unload(gsr_egl *self) {
         self->egl_display = NULL;
     }
 
-    if(self->x11.window) {
-        XDestroyWindow(self->x11.dpy, self->x11.window);
-        self->x11.window = None;
-    }
-
-    gsr_egl_cleanup_frame(self);
-
-    if(self->wayland.frame_callback) {
-        zwlr_export_dmabuf_frame_v1_destroy(self->wayland.frame_callback);
-        self->wayland.frame_callback = NULL;
-    }
-
-    if(self->wayland.export_manager) {
-        zwlr_export_dmabuf_manager_v1_destroy(self->wayland.export_manager);
-        self->wayland.export_manager = NULL;
-    }
-
-    if(self->wayland.window) {
-        wl_egl_window_destroy(self->wayland.window);
-        self->wayland.window = NULL;
-    }
-
-    if(self->wayland.surface) {
-        wl_surface_destroy(self->wayland.surface);
-        self->wayland.surface = NULL;
-    }
-
-    for(int i = 0; i < self->wayland.num_outputs; ++i) {
-        if(self->wayland.outputs[i].output) {
-            wl_output_destroy(self->wayland.outputs[i].output);
-            self->wayland.outputs[i].output = NULL;
-        }
-
-        if(self->wayland.outputs[i].name) {
-            free(self->wayland.outputs[i].name);
-            self->wayland.outputs[i].name = NULL;
-        }
-    }
-    self->wayland.num_outputs = 0;
-
-    if(self->wayland.compositor) {
-        wl_compositor_destroy(self->wayland.compositor);
-        self->wayland.compositor = NULL;
-    }
-
-    if(self->wayland.registry) {
-        wl_registry_destroy(self->wayland.registry);
-        self->wayland.registry = NULL;
-    }
-
-    if(self->wayland.dpy) {
-        wl_display_disconnect(self->wayland.dpy);
-        self->wayland.dpy = NULL;
+    if(self->glx_context) {
+        assert(gsr_window_get_display_server(self->window) == GSR_DISPLAY_SERVER_X11);
+        Display *display = gsr_window_get_display(self->window);
+        self->glXMakeContextCurrent(display, None, None, NULL);
+        self->glXDestroyContext(display, self->glx_context);
+        self->glx_context = NULL;
+        self->glx_fb_config = NULL;
     }
 
     if(self->egl_library) {
@@ -545,6 +502,11 @@ void gsr_egl_unload(gsr_egl *self) {
         self->egl_library = NULL;
     }
 
+    if(self->glx_library) {
+        dlclose(self->glx_library);
+        self->glx_library = NULL;
+    }
+
     if(self->gl_library) {
         dlclose(self->gl_library);
         self->gl_library = NULL;
@@ -553,55 +515,8 @@ void gsr_egl_unload(gsr_egl *self) {
     memset(self, 0, sizeof(gsr_egl));
 }
 
-bool gsr_egl_supports_wayland_capture(gsr_egl *self) {
-    // TODO: wlroots capture is broken right now (black screen) on amd and multiple monitors
-    // so it has to be disabled right now. Find out why it happens and fix it.
-    (void)self;
-    return false;
-    //return !!self->wayland.export_manager && self->wayland.num_outputs > 0;
-}
-
-bool gsr_egl_start_capture(gsr_egl *self, const char *monitor_to_capture) {
-    assert(monitor_to_capture);
-    if(!monitor_to_capture)
-        return false;
-
-    if(!self->wayland.dpy)
-        return false;
-
-    if(!gsr_egl_supports_wayland_capture(self))
-        return false;
-
-    if(self->wayland.frame_callback)
-        return false;
-
-    self->wayland.output_to_capture = get_wayland_output_by_name(self, monitor_to_capture);
-    if(!self->wayland.output_to_capture)
-        return false;
-
-    frame_capture_output(self);
-    return true;
-}
-
-void gsr_egl_update(gsr_egl *self) {
-    if(!self->wayland.dpy)
-        return;
-
-    // TODO: pselect on wl_display_get_fd before doing dispatch
-    wl_display_dispatch(self->wayland.dpy);
-}
-
-void gsr_egl_cleanup_frame(gsr_egl *self) {
-    if(!self->wayland.dpy)
-        return;
-
-    if(self->fd > 0) {
-        close(self->fd);
-        self->fd = 0;
-    }
-
-    if(self->wayland.current_frame) {
-        //zwlr_export_dmabuf_frame_v1_destroy(self->wayland.current_frame);
-        self->wayland.current_frame = NULL;
-    }
+void gsr_egl_swap_buffers(gsr_egl *self) {
+    self->glFlush();
+    // TODO: Use the minimal barrier required
+    self->glMemoryBarrier(GL_ALL_BARRIER_BITS); // GL_SHADER_IMAGE_ACCESS_BARRIER_BIT
 }
diff --git a/src/encoder/encoder.c b/src/encoder/encoder.c
new file mode 100644
index 0000000..0f8eda5
--- /dev/null
+++ b/src/encoder/encoder.c
@@ -0,0 +1,155 @@
+#include "../../include/encoder/encoder.h"
+#include "../../include/utils.h"
+
+#include <string.h>
+#include <stdio.h>
+
+#include <libavcodec/avcodec.h>
+#include <libavformat/avformat.h>
+
+bool gsr_encoder_init(gsr_encoder *self, gsr_replay_storage replay_storage, size_t replay_buffer_num_packets, double replay_buffer_time, const char *replay_directory) {
+    memset(self, 0, sizeof(*self));
+    self->num_recording_destinations = 0;
+    self->recording_destination_id_counter = 0;
+
+    if(pthread_mutex_init(&self->file_write_mutex, NULL) != 0) {
+        fprintf(stderr, "gsr error: gsr_encoder_init: failed to create mutex\n");
+        return false;
+    }
+    self->mutex_created = true;
+
+    if(replay_buffer_num_packets > 0) {
+        self->replay_buffer = gsr_replay_buffer_create(replay_storage, replay_directory, replay_buffer_time, replay_buffer_num_packets);
+        if(!self->replay_buffer) {
+            fprintf(stderr, "gsr error: gsr_encoder_init: failed to create replay buffer\n");
+            gsr_encoder_deinit(self);
+            return false;
+        }
+    }
+
+    return true;
+}
+
+void gsr_encoder_deinit(gsr_encoder *self) {
+    if(self->mutex_created) {
+        self->mutex_created = false;
+        pthread_mutex_destroy(&self->file_write_mutex);
+    }
+
+    if(self->replay_buffer) {
+        gsr_replay_buffer_destroy(self->replay_buffer);
+        self->replay_buffer = NULL;
+    }
+
+    self->num_recording_destinations = 0;
+    self->recording_destination_id_counter = 0;
+}
+
+void gsr_encoder_receive_packets(gsr_encoder *self, AVCodecContext *codec_context, int64_t pts, int stream_index) {
+    for(;;) {
+        AVPacket *av_packet = av_packet_alloc();
+        if(!av_packet)
+            break;
+
+        av_packet->data = NULL;
+        av_packet->size = 0;
+        int res = avcodec_receive_packet(codec_context, av_packet);
+        if(res == 0) { // we have a packet, send the packet to the muxer
+            av_packet->stream_index = stream_index;
+            av_packet->pts = pts;
+            av_packet->dts = pts;
+
+            if(self->replay_buffer) {
+                const double time_now = clock_get_monotonic_seconds();
+                if(!gsr_replay_buffer_append(self->replay_buffer, av_packet, time_now))
+                    fprintf(stderr, "gsr error: gsr_encoder_receive_packets: failed to add replay buffer data\n");
+            }
+
+            pthread_mutex_lock(&self->file_write_mutex);
+            const bool is_keyframe = av_packet->flags & AV_PKT_FLAG_KEY;
+            for(size_t i = 0; i < self->num_recording_destinations; ++i) {
+                gsr_encoder_recording_destination *recording_destination = &self->recording_destinations[i];
+                if(recording_destination->codec_context != codec_context)
+                    continue;
+
+                if(is_keyframe)
+                    recording_destination->has_received_keyframe = true;
+                else if(!recording_destination->has_received_keyframe)
+                    continue;
+
+                av_packet->pts = pts - recording_destination->start_pts;
+                av_packet->dts = pts - recording_destination->start_pts;
+
+                av_packet_rescale_ts(av_packet, codec_context->time_base, recording_destination->stream->time_base);
+                // TODO: Is av_interleaved_write_frame needed? It might be needed for mkv, but don't use it: it makes the output inconsistent, skipping and duplicating frames.
+                // TODO: av_interleaved_write_frame might be needed for cfr, or always for flv
+                const int ret = av_write_frame(recording_destination->format_context, av_packet);
+                if(ret < 0) {
+                    char error_buffer[AV_ERROR_MAX_STRING_SIZE];
+                    if(av_strerror(ret, error_buffer, sizeof(error_buffer)) < 0)
+                        snprintf(error_buffer, sizeof(error_buffer), "Unknown error");
+                    fprintf(stderr, "gsr error: gsr_encoder_receive_packets: failed to write frame index %d to muxer, reason: %s (%d)\n", av_packet->stream_index, error_buffer, ret);
+                }
+            }
+            pthread_mutex_unlock(&self->file_write_mutex);
+
+            av_packet_free(&av_packet);
+        } else if (res == AVERROR(EAGAIN)) { // we have no packet
+                                             // fprintf(stderr, "No packet!\n");
+            av_packet_free(&av_packet);
+            break;
+        } else if (res == AVERROR_EOF) { // this is the end of the stream
+            av_packet_free(&av_packet);
+            fprintf(stderr, "End of stream!\n");
+            break;
+        } else {
+            av_packet_free(&av_packet);
+            fprintf(stderr, "Unexpected error: %d\n", res);
+            break;
+        }
+    }
+}
+
+size_t gsr_encoder_add_recording_destination(gsr_encoder *self, AVCodecContext *codec_context, AVFormatContext *format_context, AVStream *stream, int64_t start_pts) {
+    if(self->num_recording_destinations >= GSR_MAX_RECORDING_DESTINATIONS) {
+        fprintf(stderr, "gsr error: gsr_encoder_add_recording_destination: failed to add destination, reached the max amount of recording destinations (%d)\n", GSR_MAX_RECORDING_DESTINATIONS);
+        return (size_t)-1;
+    }
+
+    for(size_t i = 0; i < self->num_recording_destinations; ++i) {
+        if(self->recording_destinations[i].stream == stream) {
+            fprintf(stderr, "gsr error: gsr_encoder_add_recording_destination: failed to add destination, the stream %p already exists as an output\n", (void*)stream);
+            return (size_t)-1;
+        }
+    }
+
+    pthread_mutex_lock(&self->file_write_mutex);
+    gsr_encoder_recording_destination *recording_destination = &self->recording_destinations[self->num_recording_destinations];
+    recording_destination->id = self->recording_destination_id_counter;
+    recording_destination->codec_context = codec_context;
+    recording_destination->format_context = format_context;
+    recording_destination->stream = stream;
+    recording_destination->start_pts = start_pts;
+    recording_destination->has_received_keyframe = false;
+
+    ++self->recording_destination_id_counter;
+    ++self->num_recording_destinations;
+    pthread_mutex_unlock(&self->file_write_mutex);
+
+    return recording_destination->id;
+}
+
+bool gsr_encoder_remove_recording_destination(gsr_encoder *self, size_t id) {
+    bool found = false;
+    pthread_mutex_lock(&self->file_write_mutex);
+    for(size_t i = 0; i < self->num_recording_destinations; ++i) {
+        if(self->recording_destinations[i].id == id) {
+            self->recording_destinations[i] = self->recording_destinations[self->num_recording_destinations - 1];
+            --self->num_recording_destinations;
+            found = true;
+            break;
+        }
+    }
+    pthread_mutex_unlock(&self->file_write_mutex);
+    return found;
+}
diff --git a/src/encoder/video/nvenc.c b/src/encoder/video/nvenc.c
new file mode 100644
index 0000000..5f578c2
--- /dev/null
+++ b/src/encoder/video/nvenc.c
@@ -0,0 +1,237 @@
+#include "../../../include/encoder/video/nvenc.h"
+#include "../../../include/egl.h"
+#include "../../../include/cuda.h"
+#include "../../../include/window/window.h"
+
+#include <libavcodec/avcodec.h>
+#include <libavutil/hwcontext_cuda.h>
+
+#include <stdlib.h>
+
+typedef struct {
+    gsr_video_encoder_nvenc_params params;
+
+    unsigned int target_textures[2];
+
+    AVBufferRef *device_ctx;
+
+    gsr_cuda cuda;
+    CUgraphicsResource cuda_graphics_resources[2];
+    CUarray mapped_arrays[2];
+    CUstream cuda_stream;
+} gsr_video_encoder_nvenc;
+
+static bool gsr_video_encoder_nvenc_setup_context(gsr_video_encoder_nvenc *self, AVCodecContext *video_codec_context) {
+    self->device_ctx = av_hwdevice_ctx_alloc(AV_HWDEVICE_TYPE_CUDA);
+    if(!self->device_ctx) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_nvenc_setup_context failed: failed to create hardware device context\n");
+        return false;
+    }
+
+    AVHWDeviceContext *hw_device_context = (AVHWDeviceContext*)self->device_ctx->data;
+    AVCUDADeviceContext *cuda_device_context = (AVCUDADeviceContext*)hw_device_context->hwctx;
+    cuda_device_context->cuda_ctx = self->cuda.cu_ctx;
+    if(av_hwdevice_ctx_init(self->device_ctx) < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_nvenc_setup_context failed: failed to initialize hardware device context\n");
+        av_buffer_unref(&self->device_ctx);
+        return false;
+    }
+
+    AVBufferRef *frame_context = av_hwframe_ctx_alloc(self->device_ctx);
+    if(!frame_context) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_nvenc_setup_context failed: failed to create hwframe context\n");
+        av_buffer_unref(&self->device_ctx);
+        return false;
+    }
+
+    AVHWFramesContext *hw_frame_context = (AVHWFramesContext*)frame_context->data;
+    hw_frame_context->width = video_codec_context->width;
+    hw_frame_context->height = video_codec_context->height;
+    hw_frame_context->sw_format = self->params.color_depth == GSR_COLOR_DEPTH_10_BITS ? AV_PIX_FMT_P010LE : AV_PIX_FMT_NV12;
+    hw_frame_context->format = video_codec_context->pix_fmt;
+    hw_frame_context->device_ctx = (AVHWDeviceContext*)self->device_ctx->data;
+
+    if (av_hwframe_ctx_init(frame_context) < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_nvenc_setup_context failed: failed to initialize hardware frame context "
+                        "(note: ffmpeg version needs to be > 4.0)\n");
+        av_buffer_unref(&self->device_ctx);
+        av_buffer_unref(&frame_context);
+        return false;
+    }
+
+    self->cuda_stream = cuda_device_context->stream;
+    video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
+    av_buffer_unref(&frame_context);
+    return true;
+}
+
+static bool cuda_register_opengl_texture(gsr_cuda *cuda, CUgraphicsResource *cuda_graphics_resource, CUarray *mapped_array, unsigned int texture_id) {
+    CUresult res;
+    res = cuda->cuGraphicsGLRegisterImage(cuda_graphics_resource, texture_id, GL_TEXTURE_2D, CU_GRAPHICS_REGISTER_FLAGS_NONE);
+    if (res != CUDA_SUCCESS) {
+        const char *err_str = "unknown";
+        cuda->cuGetErrorString(res, &err_str);
+        fprintf(stderr, "gsr error: cuda_register_opengl_texture: cuGraphicsGLRegisterImage failed, error: %s, texture id: %u\n", err_str, texture_id);
+        return false;
+    }
+
+    res = cuda->cuGraphicsResourceSetMapFlags(*cuda_graphics_resource, CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
+    res = cuda->cuGraphicsMapResources(1, cuda_graphics_resource, 0);
+
+    res = cuda->cuGraphicsSubResourceGetMappedArray(mapped_array, *cuda_graphics_resource, 0, 0);
+    return true;
+}
+
+static bool gsr_video_encoder_nvenc_setup_textures(gsr_video_encoder_nvenc *self, AVCodecContext *video_codec_context, AVFrame *frame) {
+    const int res = av_hwframe_get_buffer(video_codec_context->hw_frames_ctx, frame, 0);
+    if(res < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_nvenc_setup_textures: av_hwframe_get_buffer failed: %d\n", res);
+        return false;
+    }
+
+    const unsigned int internal_formats_nv12[2] = { GL_R8, GL_RG8 };
+    const unsigned int internal_formats_p010[2] = { GL_R16, GL_RG16 };
+    const unsigned int formats[2] = { GL_RED, GL_RG };
+    const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+
+    for(int i = 0; i < 2; ++i) {
+        self->target_textures[i] = gl_create_texture(self->params.egl, video_codec_context->width / div[i], video_codec_context->height / div[i], self->params.color_depth == GSR_COLOR_DEPTH_8_BITS ? internal_formats_nv12[i] : internal_formats_p010[i], formats[i], GL_NEAREST);
+        if(self->target_textures[i] == 0) {
+            fprintf(stderr, "gsr error: gsr_video_encoder_nvenc_setup_textures: failed to create opengl texture\n");
+            return false;
+        }
+
+        if(!cuda_register_opengl_texture(&self->cuda, &self->cuda_graphics_resources[i], &self->mapped_arrays[i], self->target_textures[i])) {
+            return false;
+        }
+    }
+
+    return true;
+}
+
+static void gsr_video_encoder_nvenc_stop(gsr_video_encoder_nvenc *self, AVCodecContext *video_codec_context);
+
+static bool gsr_video_encoder_nvenc_start(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame) {
+    gsr_video_encoder_nvenc *self = encoder->priv;
+
+    const bool is_x11 = gsr_window_get_display_server(self->params.egl->window) == GSR_DISPLAY_SERVER_X11;
+    const bool overclock = is_x11 ? self->params.overclock : false;
+    Display *display = is_x11 ? gsr_window_get_display(self->params.egl->window) : NULL;
+    if(!gsr_cuda_load(&self->cuda, display, overclock)) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_nvenc_start: failed to load cuda\n");
+        gsr_video_encoder_nvenc_stop(self, video_codec_context);
+        return false;
+    }
+
+    video_codec_context->width = FFALIGN(video_codec_context->width, 2);
+    video_codec_context->height = FFALIGN(video_codec_context->height, 2);
+
+    if(video_codec_context->width < 128)
+        video_codec_context->width = 128;
+
+    if(video_codec_context->height < 128)
+        video_codec_context->height = 128;
+
+    frame->width = video_codec_context->width;
+    frame->height = video_codec_context->height;
+
+    if(!gsr_video_encoder_nvenc_setup_context(self, video_codec_context)) {
+        gsr_video_encoder_nvenc_stop(self, video_codec_context);
+        return false;
+    }
+
+    if(!gsr_video_encoder_nvenc_setup_textures(self, video_codec_context, frame)) {
+        gsr_video_encoder_nvenc_stop(self, video_codec_context);
+        return false;
+    }
+
+    return true;
+}
+
+void gsr_video_encoder_nvenc_stop(gsr_video_encoder_nvenc *self, AVCodecContext *video_codec_context) {
+    self->params.egl->glDeleteTextures(2, self->target_textures);
+    self->target_textures[0] = 0;
+    self->target_textures[1] = 0;
+
+    if(video_codec_context->hw_frames_ctx)
+        av_buffer_unref(&video_codec_context->hw_frames_ctx);
+    if(self->device_ctx)
+        av_buffer_unref(&self->device_ctx);
+
+    if(self->cuda.cu_ctx) {
+        for(int i = 0; i < 2; ++i) {
+            if(self->cuda_graphics_resources[i]) {
+                self->cuda.cuGraphicsUnmapResources(1, &self->cuda_graphics_resources[i], 0);
+                self->cuda.cuGraphicsUnregisterResource(self->cuda_graphics_resources[i]);
+                self->cuda_graphics_resources[i] = 0;
+            }
+        }
+    }
+
+    gsr_cuda_unload(&self->cuda);
+}
+
+static void gsr_video_encoder_nvenc_copy_textures_to_frame(gsr_video_encoder *encoder, AVFrame *frame, gsr_color_conversion *color_conversion) {
+    gsr_video_encoder_nvenc *self = encoder->priv;
+    const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+    for(int i = 0; i < 2; ++i) {
+        CUDA_MEMCPY2D memcpy_struct;
+        memcpy_struct.srcXInBytes = 0;
+        memcpy_struct.srcY = 0;
+        memcpy_struct.srcMemoryType = CU_MEMORYTYPE_ARRAY;
+
+        memcpy_struct.dstXInBytes = 0;
+        memcpy_struct.dstY = 0;
+        memcpy_struct.dstMemoryType = CU_MEMORYTYPE_DEVICE;
+
+        memcpy_struct.srcArray = self->mapped_arrays[i];
+        memcpy_struct.srcPitch = frame->width / div[i];
+        memcpy_struct.dstDevice = (CUdeviceptr)frame->data[i];
+        memcpy_struct.dstPitch = frame->linesize[i];
+        memcpy_struct.WidthInBytes = frame->width * (self->params.color_depth == GSR_COLOR_DEPTH_10_BITS ? 2 : 1);
+        memcpy_struct.Height = frame->height / div[i];
+        // TODO: Remove this copy if possible
+        self->cuda.cuMemcpy2DAsync_v2(&memcpy_struct, self->cuda_stream);
+    }
+
+    // TODO: needed?
+    self->cuda.cuStreamSynchronize(self->cuda_stream);
+}
+
+static void gsr_video_encoder_nvenc_get_textures(gsr_video_encoder *encoder, unsigned int *textures, int *num_textures, gsr_destination_color *destination_color) {
+    gsr_video_encoder_nvenc *self = encoder->priv;
+    textures[0] = self->target_textures[0];
+    textures[1] = self->target_textures[1];
+    *num_textures = 2;
+    *destination_color = self->params.color_depth == GSR_COLOR_DEPTH_10_BITS ? GSR_DESTINATION_COLOR_P010 : GSR_DESTINATION_COLOR_NV12;
+}
+
+static void gsr_video_encoder_nvenc_destroy(gsr_video_encoder *encoder, AVCodecContext *video_codec_context) {
+    gsr_video_encoder_nvenc_stop(encoder->priv, video_codec_context);
+    free(encoder->priv);
+    free(encoder);
+}
+
+gsr_video_encoder* gsr_video_encoder_nvenc_create(const gsr_video_encoder_nvenc_params *params) {
+    gsr_video_encoder *encoder = calloc(1, sizeof(gsr_video_encoder));
+    if(!encoder)
+        return NULL;
+
+    gsr_video_encoder_nvenc *encoder_cuda = calloc(1, sizeof(gsr_video_encoder_nvenc));
+    if(!encoder_cuda) {
+        free(encoder);
+        return NULL;
+    }
+
+    encoder_cuda->params = *params;
+
+    *encoder = (gsr_video_encoder) {
+        .start = gsr_video_encoder_nvenc_start,
+        .copy_textures_to_frame = gsr_video_encoder_nvenc_copy_textures_to_frame,
+        .get_textures = gsr_video_encoder_nvenc_get_textures,
+        .destroy = gsr_video_encoder_nvenc_destroy,
+        .priv = encoder_cuda
+    };
+
+    return encoder;
+}
diff --git a/src/encoder/video/software.c b/src/encoder/video/software.c
new file mode 100644
index 0000000..d8d9828
--- /dev/null
+++ b/src/encoder/video/software.c
@@ -0,0 +1,125 @@
+#include "../../../include/encoder/video/software.h"
+#include "../../../include/egl.h"
+#include "../../../include/utils.h"
+
+#include <libavcodec/avcodec.h>
+#include <libavutil/frame.h>
+
+#include <stdlib.h>
+
+#define LINESIZE_ALIGNMENT 4
+
+typedef struct {
+    gsr_video_encoder_software_params params;
+
+    unsigned int target_textures[2];
+} gsr_video_encoder_software;
+
+static bool gsr_video_encoder_software_setup_textures(gsr_video_encoder_software *self, AVCodecContext *video_codec_context, AVFrame *frame) {
+    int res = av_frame_get_buffer(frame, LINESIZE_ALIGNMENT);
+    if(res < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_software_setup_textures: av_frame_get_buffer failed: %d\n", res);
+        return false;
+    }
+
+    res = av_frame_make_writable(frame);
+    if(res < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_software_setup_textures: av_frame_make_writable failed: %d\n", res);
+        return false;
+    }
+
+    const unsigned int internal_formats_nv12[2] = { GL_R8, GL_RG8 };
+    const unsigned int internal_formats_p010[2] = { GL_R16, GL_RG16 };
+    const unsigned int formats[2] = { GL_RED, GL_RG };
+    const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+
+    for(int i = 0; i < 2; ++i) {
+        self->target_textures[i] = gl_create_texture(self->params.egl, video_codec_context->width / div[i], video_codec_context->height / div[i], self->params.color_depth == GSR_COLOR_DEPTH_8_BITS ? internal_formats_nv12[i] : internal_formats_p010[i], formats[i], GL_NEAREST);
+        if(self->target_textures[i] == 0) {
+            fprintf(stderr, "gsr error: gsr_video_encoder_software_setup_textures: failed to create opengl texture\n");
+            return false;
+        }
+    }
+
+    return true;
+}
+
+static void gsr_video_encoder_software_stop(gsr_video_encoder_software *self, AVCodecContext *video_codec_context);
+
+static bool gsr_video_encoder_software_start(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame) {
+    gsr_video_encoder_software *self = encoder->priv;
+
+    video_codec_context->width = FFALIGN(video_codec_context->width, LINESIZE_ALIGNMENT);
+    video_codec_context->height = FFALIGN(video_codec_context->height, 2);
+
+    frame->width = video_codec_context->width;
+    frame->height = video_codec_context->height;
+
+    if(!gsr_video_encoder_software_setup_textures(self, video_codec_context, frame)) {
+        gsr_video_encoder_software_stop(self, video_codec_context);
+        return false;
+    }
+
+    return true;
+}
+
+void gsr_video_encoder_software_stop(gsr_video_encoder_software *self, AVCodecContext *video_codec_context) {
+    (void)video_codec_context;
+    self->params.egl->glDeleteTextures(2, self->target_textures);
+    self->target_textures[0] = 0;
+    self->target_textures[1] = 0;
+}
+
+static void gsr_video_encoder_software_copy_textures_to_frame(gsr_video_encoder *encoder, AVFrame *frame, gsr_color_conversion *color_conversion) {
+    (void)encoder;
+    //gsr_video_encoder_software *self = encoder->priv;
+    // TODO: hdr support
+    const unsigned int formats[2] = { GL_RED, GL_RG };
+    const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+    for(int i = 0; i < 2; ++i) {
+        // TODO: Use glPixelStore?
+        gsr_color_conversion_read_destination_texture(color_conversion, i, 0, 0, frame->width / div[i], frame->height / div[i], formats[i], GL_UNSIGNED_BYTE, frame->data[i]);
+    }
+    // cap_kms->kms.base.egl->eglSwapBuffers(cap_kms->kms.base.egl->egl_display, cap_kms->kms.base.egl->egl_surface);
+
+    //self->params.egl->glFlush();
+    //self->params.egl->glFinish();
+}
+
+static void gsr_video_encoder_software_get_textures(gsr_video_encoder *encoder, unsigned int *textures, int *num_textures, gsr_destination_color *destination_color) {
+    gsr_video_encoder_software *self = encoder->priv;
+    textures[0] = self->target_textures[0];
+    textures[1] = self->target_textures[1];
+    *num_textures = 2;
+    *destination_color = self->params.color_depth == GSR_COLOR_DEPTH_10_BITS ? GSR_DESTINATION_COLOR_P010 : GSR_DESTINATION_COLOR_NV12;
+}
+
+static void gsr_video_encoder_software_destroy(gsr_video_encoder *encoder, AVCodecContext *video_codec_context) {
+    gsr_video_encoder_software_stop(encoder->priv, video_codec_context);
+    free(encoder->priv);
+    free(encoder);
+}
+
+gsr_video_encoder* gsr_video_encoder_software_create(const gsr_video_encoder_software_params *params) {
+    gsr_video_encoder *encoder = calloc(1, sizeof(gsr_video_encoder));
+    if(!encoder)
+        return NULL;
+
+    gsr_video_encoder_software *encoder_software = calloc(1, sizeof(gsr_video_encoder_software));
+    if(!encoder_software) {
+        free(encoder);
+        return NULL;
+    }
+
+    encoder_software->params = *params;
+
+    *encoder = (gsr_video_encoder) {
+        .start = gsr_video_encoder_software_start,
+        .copy_textures_to_frame = gsr_video_encoder_software_copy_textures_to_frame,
+        .get_textures = gsr_video_encoder_software_get_textures,
+        .destroy = gsr_video_encoder_software_destroy,
+        .priv = encoder_software
+    };
+
+    return encoder;
+}
diff --git a/src/encoder/video/vaapi.c b/src/encoder/video/vaapi.c
new file mode 100644
index 0000000..0daf4d8
--- /dev/null
+++ b/src/encoder/video/vaapi.c
@@ -0,0 +1,252 @@
+#include "../../../include/encoder/video/vaapi.h"
+#include "../../../include/utils.h"
+#include "../../../include/egl.h"
+
+#include <libavcodec/avcodec.h>
+#include <libavutil/hwcontext_vaapi.h>
+#include <libavutil/intreadwrite.h>
+
+#include <va/va_drmcommon.h>
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <fcntl.h>
+
+typedef struct {
+    gsr_video_encoder_vaapi_params params;
+
+    unsigned int target_textures[2];
+
+    AVBufferRef *device_ctx;
+    VADisplay va_dpy;
+    VADRMPRIMESurfaceDescriptor prime;
+} gsr_video_encoder_vaapi;
+
+static bool gsr_video_encoder_vaapi_setup_context(gsr_video_encoder_vaapi *self, AVCodecContext *video_codec_context) {
+    char render_path[128];
+    if(!gsr_card_path_get_render_path(self->params.egl->card_path, render_path)) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vaapi_setup_context: failed to get /dev/dri/renderDXXX file from %s\n", self->params.egl->card_path);
+        return false;
+    }
+
+    if(av_hwdevice_ctx_create(&self->device_ctx, AV_HWDEVICE_TYPE_VAAPI, render_path, NULL, 0) < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vaapi_setup_context: failed to create hardware device context\n");
+        return false;
+    }
+
+    AVBufferRef *frame_context = av_hwframe_ctx_alloc(self->device_ctx);
+    if(!frame_context) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vaapi_setup_context: failed to create hwframe context\n");
+        av_buffer_unref(&self->device_ctx);
+        return false;
+    }
+
+    AVHWFramesContext *hw_frame_context = (AVHWFramesContext*)frame_context->data;
+    hw_frame_context->width = video_codec_context->width;
+    hw_frame_context->height = video_codec_context->height;
+    hw_frame_context->sw_format = self->params.color_depth == GSR_COLOR_DEPTH_10_BITS ? AV_PIX_FMT_P010LE : AV_PIX_FMT_NV12;
+    hw_frame_context->format = video_codec_context->pix_fmt;
+    hw_frame_context->device_ctx = (AVHWDeviceContext*)self->device_ctx->data;
+
+    //hw_frame_context->initial_pool_size = 20;
+
+    AVVAAPIDeviceContext *vactx = ((AVHWDeviceContext*)self->device_ctx->data)->hwctx;
+    self->va_dpy = vactx->display;
+
+    if (av_hwframe_ctx_init(frame_context) < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vaapi_setup_context: failed to initialize hardware frame context "
+                        "(note: ffmpeg version needs to be > 4.0)\n");
+        av_buffer_unref(&self->device_ctx);
+        //av_buffer_unref(&frame_context);
+        return false;
+    }
+
+    video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
+    av_buffer_unref(&frame_context);
+    return true;
+}
+
+static uint32_t fourcc(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
+    return (d << 24) | (c << 16) | (b << 8) | a;
+}
+
+static bool gsr_video_encoder_vaapi_setup_textures(gsr_video_encoder_vaapi *self, AVCodecContext *video_codec_context, AVFrame *frame) {
+    const int res = av_hwframe_get_buffer(video_codec_context->hw_frames_ctx, frame, 0);
+    if(res < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vaapi_setup_textures: av_hwframe_get_buffer failed: %d\n", res);
+        return false;
+    }
+
+    VASurfaceID target_surface_id = (uintptr_t)frame->data[3];
+
+    VAStatus va_status = vaExportSurfaceHandle(self->va_dpy, target_surface_id, VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2, VA_EXPORT_SURFACE_WRITE_ONLY | VA_EXPORT_SURFACE_SEPARATE_LAYERS, &self->prime);
+    if(va_status != VA_STATUS_SUCCESS) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vaapi_setup_textures: vaExportSurfaceHandle failed, error: %d\n", va_status);
+        return false;
+    }
+    vaSyncSurface(self->va_dpy, target_surface_id);
+
+    const uint32_t formats_nv12[2] = { fourcc('R', '8', ' ', ' '), fourcc('G', 'R', '8', '8') };
+    const uint32_t formats_p010[2] = { fourcc('R', '1', '6', ' '), fourcc('G', 'R', '3', '2') };
+
+    if(self->prime.fourcc == VA_FOURCC_NV12 || self->prime.fourcc == VA_FOURCC_P010) {
+        const uint32_t *formats = self->prime.fourcc == VA_FOURCC_NV12 ? formats_nv12 : formats_p010;
+        const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+
+        self->params.egl->glGenTextures(2, self->target_textures);
+        for(int i = 0; i < 2; ++i) {
+            const int layer = i;
+
+            int fds[4];
+            uint32_t offsets[4];
+            uint32_t pitches[4];
+            uint64_t modifiers[4];
+            for(uint32_t j = 0; j < self->prime.layers[layer].num_planes; ++j) {
+                // These fds are closed in gsr_video_encoder_vaapi_stop, using self->prime
+                fds[j] = self->prime.objects[self->prime.layers[layer].object_index[j]].fd;
+                offsets[j] = self->prime.layers[layer].offset[j];
+                pitches[j] = self->prime.layers[layer].pitch[j];
+                modifiers[j] = self->prime.objects[self->prime.layers[layer].object_index[j]].drm_format_modifier;
+            }
+
+            intptr_t img_attr[44];
+            setup_dma_buf_attrs(img_attr, formats[i], self->prime.width / div[i], self->prime.height / div[i],
+                fds, offsets, pitches, modifiers, self->prime.layers[layer].num_planes, true);
+
+            while(self->params.egl->eglGetError() != EGL_SUCCESS){}
+            EGLImage image = self->params.egl->eglCreateImage(self->params.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr);
+            if(!image) {
+                fprintf(stderr, "gsr error: gsr_video_encoder_vaapi_setup_textures: failed to create egl image for output drm fd, error: %d\n", self->params.egl->eglGetError());
+                return false;
+            }
+
+            self->params.egl->glBindTexture(GL_TEXTURE_2D, self->target_textures[i]);
+            self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
+            self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
+
+            while(self->params.egl->glGetError()) {}
+            while(self->params.egl->eglGetError() != EGL_SUCCESS){}
+            self->params.egl->glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
+            const int gl_error = (int)self->params.egl->glGetError(), egl_error = (int)self->params.egl->eglGetError();
+            if(gl_error != 0 || egl_error != EGL_SUCCESS) {
+                fprintf(stderr, "gsr error: gsr_video_encoder_vaapi_setup_textures: failed to bind egl image to gl texture, gl error: %d, egl error: %d\n", gl_error, egl_error);
+                self->params.egl->eglDestroyImage(self->params.egl->egl_display, image);
+                self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+                return false;
+            }
+
+            self->params.egl->eglDestroyImage(self->params.egl->egl_display, image);
+            self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+        }
+
+        return true;
+    } else {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vaapi_setup_textures: unexpected fourcc %u for output drm fd, expected nv12 or p010\n", self->prime.fourcc);
+        return false;
+    }
+}
+
+static void gsr_video_encoder_vaapi_stop(gsr_video_encoder_vaapi *self, AVCodecContext *video_codec_context);
+
+static bool gsr_video_encoder_vaapi_start(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame) {
+    gsr_video_encoder_vaapi *self = encoder->priv;
+
+    if(self->params.egl->gpu_info.vendor == GSR_GPU_VENDOR_AMD && video_codec_context->codec_id == AV_CODEC_ID_HEVC) {
+        // TODO: Don't do this if ffmpeg reports that it is not needed (AMD driver bug that was fixed recently)
+        video_codec_context->width = FFALIGN(video_codec_context->width, 64);
+        video_codec_context->height = FFALIGN(video_codec_context->height, 16);
+    } else if(self->params.egl->gpu_info.vendor == GSR_GPU_VENDOR_AMD && video_codec_context->codec_id == AV_CODEC_ID_AV1) {
+        // TODO: Don't do this for VCN 5 and later, which should fix this hardware bug
+        video_codec_context->width = FFALIGN(video_codec_context->width, 64);
+        // AMD driver has special case handling for 1080 height to set it to 1082 instead of 1088 (1080 aligned to 16).
+        // TODO: Set height to 1082 in this case, but it won't work because it will be aligned to 1088.
+        if(video_codec_context->height == 1080) {
+            video_codec_context->height = 1080;
+        } else {
+            video_codec_context->height = FFALIGN(video_codec_context->height, 16);
+        }
+    } else {
+        video_codec_context->width = FFALIGN(video_codec_context->width, 2);
+        video_codec_context->height = FFALIGN(video_codec_context->height, 2);
+    }
+
+    if(FFALIGN(video_codec_context->width, 2) != FFALIGN(frame->width, 2) || FFALIGN(video_codec_context->height, 2) != FFALIGN(frame->height, 2)) {
+        fprintf(stderr, "gsr warning: gsr_video_encoder_vaapi_start: black bars have been added to the video because of a bug in AMD drivers/hardware. Record with h264 codec instead (-k h264) to get around this issue\n");
+    }
+
+    if(video_codec_context->width < 128)
+        video_codec_context->width = 128;
+
+    if(video_codec_context->height < 128)
+        video_codec_context->height = 128;
+
+    frame->width = video_codec_context->width;
+    frame->height = video_codec_context->height;
+
+    if(!gsr_video_encoder_vaapi_setup_context(self, video_codec_context)) {
+        gsr_video_encoder_vaapi_stop(self, video_codec_context);
+        return false;
+    }
+
+    if(!gsr_video_encoder_vaapi_setup_textures(self, video_codec_context, frame)) {
+        gsr_video_encoder_vaapi_stop(self, video_codec_context);
+        return false;
+    }
+
+    return true;
+}
+
+void gsr_video_encoder_vaapi_stop(gsr_video_encoder_vaapi *self, AVCodecContext *video_codec_context) {
+    self->params.egl->glDeleteTextures(2, self->target_textures);
+    self->target_textures[0] = 0;
+    self->target_textures[1] = 0;
+
+    if(video_codec_context->hw_frames_ctx)
+        av_buffer_unref(&video_codec_context->hw_frames_ctx);
+    if(self->device_ctx)
+        av_buffer_unref(&self->device_ctx);
+
+    for(uint32_t i = 0; i < self->prime.num_objects; ++i) {
+        if(self->prime.objects[i].fd > 0) {
+            close(self->prime.objects[i].fd);
+            self->prime.objects[i].fd = 0;
+        }
+    }
+}
+
+static void gsr_video_encoder_vaapi_get_textures(gsr_video_encoder *encoder, unsigned int *textures, int *num_textures, gsr_destination_color *destination_color) {
+    gsr_video_encoder_vaapi *self = encoder->priv;
+    textures[0] = self->target_textures[0];
+    textures[1] = self->target_textures[1];
+    *num_textures = 2;
+    *destination_color = self->params.color_depth == GSR_COLOR_DEPTH_10_BITS ? GSR_DESTINATION_COLOR_P010 : GSR_DESTINATION_COLOR_NV12;
+}
+
+static void gsr_video_encoder_vaapi_destroy(gsr_video_encoder *encoder, AVCodecContext *video_codec_context) {
+    gsr_video_encoder_vaapi_stop(encoder->priv, video_codec_context);
+    free(encoder->priv);
+    free(encoder);
+}
+
+gsr_video_encoder* gsr_video_encoder_vaapi_create(const gsr_video_encoder_vaapi_params *params) {
+    gsr_video_encoder *encoder = calloc(1, sizeof(gsr_video_encoder));
+    if(!encoder)
+        return NULL;
+
+    gsr_video_encoder_vaapi *encoder_vaapi = calloc(1, sizeof(gsr_video_encoder_vaapi));
+    if(!encoder_vaapi) {
+        free(encoder);
+        return NULL;
+    }
+
+    encoder_vaapi->params = *params;
+
+    *encoder = (gsr_video_encoder) {
+        .start = gsr_video_encoder_vaapi_start,
+        .get_textures = gsr_video_encoder_vaapi_get_textures,
+        .destroy = gsr_video_encoder_vaapi_destroy,
+        .priv = encoder_vaapi
+    };
+
+    return encoder;
+}
diff --git a/src/encoder/video/video.c b/src/encoder/video/video.c
new file mode 100644
index 0000000..ce3b61b
--- /dev/null
+++ b/src/encoder/video/video.c
@@ -0,0 +1,28 @@
+#include "../../../include/encoder/video/video.h"
+
+#include <assert.h>
+
+bool gsr_video_encoder_start(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame) {
+    assert(!encoder->started);
+    bool res = encoder->start(encoder, video_codec_context, frame);
+    if(res)
+        encoder->started = true;
+    return res;
+}
+
+void gsr_video_encoder_destroy(gsr_video_encoder *encoder, AVCodecContext *video_codec_context) {
+    assert(encoder->started);
+    encoder->started = false;
+    encoder->destroy(encoder, video_codec_context);
+}
+
+void gsr_video_encoder_copy_textures_to_frame(gsr_video_encoder *encoder, AVFrame *frame, gsr_color_conversion *color_conversion) {
+    assert(encoder->started);
+    if(encoder->copy_textures_to_frame)
+        encoder->copy_textures_to_frame(encoder, frame, color_conversion);
+}
+
+void gsr_video_encoder_get_textures(gsr_video_encoder *encoder, unsigned int *textures, int *num_textures, gsr_destination_color *destination_color) {
+    assert(encoder->started);
+    encoder->get_textures(encoder, textures, num_textures, destination_color);
+}
diff --git a/src/encoder/video/vulkan.c b/src/encoder/video/vulkan.c
new file mode 100644
index 0000000..802934d
--- /dev/null
+++ b/src/encoder/video/vulkan.c
@@ -0,0 +1,309 @@
+#include "../../../include/encoder/video/vulkan.h"
+#include "../../../include/utils.h"
+#include "../../../include/egl.h"
+
+#include <libavcodec/avcodec.h>
+#define VK_NO_PROTOTYPES
+#include <libavutil/hwcontext_vulkan.h>
+
+//#include <vulkan/vulkan_core.h>
+
+#define GL_HANDLE_TYPE_OPAQUE_FD_EXT      0x9586
+#define GL_TEXTURE_TILING_EXT             0x9580
+#define GL_OPTIMAL_TILING_EXT             0x9584
+#define GL_LINEAR_TILING_EXT              0x9585
+
+typedef struct {
+    gsr_video_encoder_vulkan_params params;
+    unsigned int target_textures[2];
+    AVBufferRef *device_ctx;
+} gsr_video_encoder_vulkan;
+
+static bool gsr_video_encoder_vulkan_setup_context(gsr_video_encoder_vulkan *self, AVCodecContext *video_codec_context) {
+    AVDictionary *options = NULL;
+    //av_dict_set(&options, "linear_images", "1", 0);
+    //av_dict_set(&options, "disable_multiplane", "1", 0);
+#if 0
+    // TODO: Use correct device
+    if(av_hwdevice_ctx_create(&self->device_ctx, AV_HWDEVICE_TYPE_VULKAN, NULL, options, 0) < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vulkan_setup_context: failed to create hardware device context\n");
+        return false;
+    }
+
+    AVBufferRef *frame_context = av_hwframe_ctx_alloc(self->device_ctx);
+    if(!frame_context) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vulkan_setup_context: failed to create hwframe context\n");
+        av_buffer_unref(&self->device_ctx);
+        return false;
+    }
+
+    AVHWFramesContext *hw_frame_context = (AVHWFramesContext*)frame_context->data;
+    hw_frame_context->width = video_codec_context->width;
+    hw_frame_context->height = video_codec_context->height;
+    hw_frame_context->sw_format = self->params.color_depth == GSR_COLOR_DEPTH_10_BITS ? AV_PIX_FMT_P010LE : AV_PIX_FMT_NV12;
+    hw_frame_context->format = video_codec_context->pix_fmt;
+    hw_frame_context->device_ctx = (AVHWDeviceContext*)self->device_ctx->data;
+
+    //AVVulkanFramesContext *vk_frame_ctx = (AVVulkanFramesContext*)hw_frame_context->hwctx;
+    //hw_frame_context->initial_pool_size = 20;
+
+    if (av_hwframe_ctx_init(frame_context) < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vulkan_setup_context: failed to initialize hardware frame context "
+                        "(note: ffmpeg version needs to be > 4.0)\n");
+        av_buffer_unref(&self->device_ctx);
+        //av_buffer_unref(&frame_context);
+        return false;
+    }
+
+    video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
+    av_buffer_unref(&frame_context);
+#endif
+    return true;
+}
+
+static AVVulkanDeviceContext* video_codec_context_get_vulkan_data(AVCodecContext *video_codec_context) {
+    AVBufferRef *hw_frames_ctx = video_codec_context->hw_frames_ctx;
+    if(!hw_frames_ctx)
+        return NULL;
+
+    AVHWFramesContext *hw_frame_context = (AVHWFramesContext*)hw_frames_ctx->data;
+    AVHWDeviceContext *device_context = (AVHWDeviceContext*)hw_frame_context->device_ctx;
+    if(device_context->type != AV_HWDEVICE_TYPE_VULKAN)
+        return NULL;
+
+    return (AVVulkanDeviceContext*)device_context->hwctx;
+}
+
+static uint32_t get_memory_type_idx(VkPhysicalDevice pdev, const VkMemoryRequirements *mem_reqs, VkMemoryPropertyFlagBits prop_flags, PFN_vkGetPhysicalDeviceMemoryProperties vkGetPhysicalDeviceMemoryProperties) {
+    VkPhysicalDeviceMemoryProperties pdev_mem_props;
+    uint32_t i;
+
+    vkGetPhysicalDeviceMemoryProperties(pdev, &pdev_mem_props);
+
+    for (i = 0; i < pdev_mem_props.memoryTypeCount; i++) {
+        const VkMemoryType *type = &pdev_mem_props.memoryTypes[i];
+
+        if ((mem_reqs->memoryTypeBits & (1u << i)) &&
+            (type->propertyFlags & prop_flags) == prop_flags) {
+            return i;
+            break;
+        }
+    }
+    return UINT32_MAX;
+}
+
+static bool gsr_video_encoder_vulkan_setup_textures(gsr_video_encoder_vulkan *self, AVCodecContext *video_codec_context, AVFrame *frame) {
+    const int res = av_hwframe_get_buffer(video_codec_context->hw_frames_ctx, frame, 0);
+    if(res < 0) {
+        fprintf(stderr, "gsr error: gsr_video_encoder_vulkan_setup_textures: av_hwframe_get_buffer failed: %d\n", res);
+        return false;
+    }
+
+    while(self->params.egl->glGetError()) {}
+#if 0
+    AVVkFrame *target_surface_id = (AVVkFrame*)frame->data[0];
+    AVVulkanDeviceContext* vv = video_codec_context_get_vulkan_data(video_codec_context);
+    const size_t luma_size = frame->width * frame->height;
+    if(vv) {
+        PFN_vkGetImageMemoryRequirements vkGetImageMemoryRequirements = (PFN_vkGetImageMemoryRequirements)vv->get_proc_addr(vv->inst, "vkGetImageMemoryRequirements");
+        PFN_vkAllocateMemory vkAllocateMemory = (PFN_vkAllocateMemory)vv->get_proc_addr(vv->inst, "vkAllocateMemory");
+        PFN_vkGetPhysicalDeviceMemoryProperties vkGetPhysicalDeviceMemoryProperties = (PFN_vkGetPhysicalDeviceMemoryProperties)vv->get_proc_addr(vv->inst, "vkGetPhysicalDeviceMemoryProperties");
+        PFN_vkGetMemoryFdKHR vkGetMemoryFdKHR = (PFN_vkGetMemoryFdKHR)vv->get_proc_addr(vv->inst, "vkGetMemoryFdKHR");
+
+        VkMemoryRequirements mem_reqs = {0};
+        vkGetImageMemoryRequirements(vv->act_dev, target_surface_id->img[0], &mem_reqs);
+
+        fprintf(stderr, "size: %lu, alignment: %lu, memory bits: 0x%08x\n", mem_reqs.size, mem_reqs.alignment, mem_reqs.memoryTypeBits);
+        VkDeviceMemory mem;
+        {
+            VkExportMemoryAllocateInfo exp_mem_info;
+            VkMemoryAllocateInfo mem_alloc_info;
+            VkMemoryDedicatedAllocateInfoKHR ded_info;
+
+            memset(&exp_mem_info, 0, sizeof(exp_mem_info));
+            exp_mem_info.sType = VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO;
+            exp_mem_info.handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT;
+            
+            memset(&ded_info, 0, sizeof(ded_info));
+            ded_info.sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO;
+            ded_info.image = target_surface_id->img[0];
+
+            exp_mem_info.pNext = &ded_info;
+
+            memset(&mem_alloc_info, 0, sizeof(mem_alloc_info));
+            mem_alloc_info.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
+            mem_alloc_info.pNext = &exp_mem_info;
+            mem_alloc_info.allocationSize = target_surface_id->size[0];
+            mem_alloc_info.memoryTypeIndex = get_memory_type_idx(vv->phys_dev, &mem_reqs, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, vkGetPhysicalDeviceMemoryProperties);
+
+            if (mem_alloc_info.memoryTypeIndex == UINT32_MAX) {
+                fprintf(stderr, "No suitable memory type index found.\n");
+                return false;
+            }
+
+            if (vkAllocateMemory(vv->act_dev, &mem_alloc_info, 0, &mem) !=
+                VK_SUCCESS)
+                return false;
+ 
+            fprintf(stderr, "memory: %p\n", (void*)mem);
+
+        }
+
+        fprintf(stderr, "target surface id: %p, %zu, %zu\n", (void*)target_surface_id->mem[0], target_surface_id->offset[0], target_surface_id->offset[1]);
+        fprintf(stderr, "vkGetMemoryFdKHR: %p\n", (void*)vkGetMemoryFdKHR);
+
+        int fd = 0;
+        VkMemoryGetFdInfoKHR fd_info;
+        memset(&fd_info, 0, sizeof(fd_info));
+        fd_info.sType = VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR;
+        fd_info.memory = target_surface_id->mem[0];
+        fd_info.handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT;
+        if(vkGetMemoryFdKHR(vv->act_dev, &fd_info, &fd) != VK_SUCCESS) {
+            fprintf(stderr, "failed!\n");
+        } else {
+            fprintf(stderr, "fd: %d\n", fd);
+        }
+
+        fprintf(stderr, "glImportMemoryFdEXT: %p, size: %zu\n", (void*)self->params.egl->glImportMemoryFdEXT, target_surface_id->size[0]);
+        const int tiling = target_surface_id->tiling == VK_IMAGE_TILING_LINEAR ? GL_LINEAR_TILING_EXT : GL_OPTIMAL_TILING_EXT;
+
+        if(tiling != GL_OPTIMAL_TILING_EXT) {
+            fprintf(stderr, "tiling %d is not supported, only GL_OPTIMAL_TILING_EXT (%d) is supported\n", tiling, GL_OPTIMAL_TILING_EXT);
+        }
+
+
+        unsigned int gl_memory_obj = 0;
+        self->params.egl->glCreateMemoryObjectsEXT(1, &gl_memory_obj);
+
+        //const int dedicated = GL_TRUE;
+        //self->params.egl->glMemoryObjectParameterivEXT(gl_memory_obj, GL_DEDICATED_MEMORY_OBJECT_EXT, &dedicated);
+
+        self->params.egl->glImportMemoryFdEXT(gl_memory_obj, target_surface_id->size[0], GL_HANDLE_TYPE_OPAQUE_FD_EXT, fd);
+        if(!self->params.egl->glIsMemoryObjectEXT(gl_memory_obj))
+            fprintf(stderr, "failed to create object!\n");
+
+        fprintf(stderr, "gl memory obj: %u, error: %d\n", gl_memory_obj, self->params.egl->glGetError());
+
+        // fprintf(stderr, "0 gl error: %d\n", self->params.egl->glGetError());
+        // unsigned int vertex_buffer = 0;
+        // self->params.egl->glGenBuffers(1, &vertex_buffer);
+        // self->params.egl->glBindBuffer(GL_ARRAY_BUFFER, vertex_buffer);
+        // self->params.egl->glBufferStorageMemEXT(GL_ARRAY_BUFFER, target_surface_id->size[0], gl_memory_obj, target_surface_id->offset[0]);
+        // fprintf(stderr, "1 gl error: %d\n", self->params.egl->glGetError());
+
+        // fprintf(stderr, "0 gl error: %d\n", self->params.egl->glGetError());
+        // unsigned int buffer = 0;
+        // self->params.egl->glCreateBuffers(1, &buffer);
+        // self->params.egl->glNamedBufferStorageMemEXT(buffer, target_surface_id->size[0], gl_memory_obj, target_surface_id->offset[0]);
+        // fprintf(stderr, "1 gl error: %d\n", self->params.egl->glGetError());
+
+        self->params.egl->glGenTextures(1, &self->target_textures[0]);
+        self->params.egl->glBindTexture(GL_TEXTURE_2D, self->target_textures[0]);
+
+        fprintf(stderr, "1 gl error: %d\n", self->params.egl->glGetError());
+        self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_TILING_EXT, tiling);
+
+        fprintf(stderr, "tiling: %d\n", tiling);
+
+        fprintf(stderr, "2 gl error: %d\n", self->params.egl->glGetError());
+        self->params.egl->glTexStorageMem2DEXT(GL_TEXTURE_2D, 1, GL_R8, frame->width, frame->height, gl_memory_obj, target_surface_id->offset[0]);
+
+        fprintf(stderr, "3 gl error: %d\n", self->params.egl->glGetError());
+        self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+
+        self->params.egl->glGenTextures(1, &self->target_textures[1]);
+        self->params.egl->glBindTexture(GL_TEXTURE_2D, self->target_textures[1]);
+
+        fprintf(stderr, "1 gl error: %d\n", self->params.egl->glGetError());
+        self->params.egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_TILING_EXT, tiling);
+
+        fprintf(stderr, "tiling: %d\n", tiling);
+
+        fprintf(stderr, "2 gl error: %d\n", self->params.egl->glGetError());
+        self->params.egl->glTexStorageMem2DEXT(GL_TEXTURE_2D, 1, GL_RG8, frame->width/2, frame->height/2, gl_memory_obj, target_surface_id->offset[0] + luma_size);
+
+        fprintf(stderr, "3 gl error: %d\n", self->params.egl->glGetError());
+        self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+     }
+#endif
+    return true;
+}
+
+static void gsr_video_encoder_vulkan_stop(gsr_video_encoder_vulkan *self, AVCodecContext *video_codec_context);
+
+static bool gsr_video_encoder_vulkan_start(gsr_video_encoder *encoder, AVCodecContext *video_codec_context, AVFrame *frame) {
+    gsr_video_encoder_vulkan *self = encoder->priv;
+
+    video_codec_context->width = FFALIGN(video_codec_context->width, 2);
+    video_codec_context->height = FFALIGN(video_codec_context->height, 2);
+
+    if(video_codec_context->width < 128)
+        video_codec_context->width = 128;
+
+    if(video_codec_context->height < 128)
+        video_codec_context->height = 128;
+
+    frame->width = video_codec_context->width;
+    frame->height = video_codec_context->height;
+
+    if(!gsr_video_encoder_vulkan_setup_context(self, video_codec_context)) {
+        gsr_video_encoder_vulkan_stop(self, video_codec_context);
+        return false;
+    }
+
+    if(!gsr_video_encoder_vulkan_setup_textures(self, video_codec_context, frame)) {
+        gsr_video_encoder_vulkan_stop(self, video_codec_context);
+        return false;
+    }
+
+    return true;
+}
+
+void gsr_video_encoder_vulkan_stop(gsr_video_encoder_vulkan *self, AVCodecContext *video_codec_context) {
+    self->params.egl->glDeleteTextures(2, self->target_textures);
+    self->target_textures[0] = 0;
+    self->target_textures[1] = 0;
+
+    if(video_codec_context->hw_frames_ctx)
+        av_buffer_unref(&video_codec_context->hw_frames_ctx);
+    if(self->device_ctx)
+        av_buffer_unref(&self->device_ctx);
+}
+
+static void gsr_video_encoder_vulkan_get_textures(gsr_video_encoder *encoder, unsigned int *textures, int *num_textures, gsr_destination_color *destination_color) {
+    gsr_video_encoder_vulkan *self = encoder->priv;
+    textures[0] = self->target_textures[0];
+    textures[1] = self->target_textures[1];
+    *num_textures = 2;
+    *destination_color = self->params.color_depth == GSR_COLOR_DEPTH_10_BITS ? GSR_DESTINATION_COLOR_P010 : GSR_DESTINATION_COLOR_NV12;
+}
+
+static void gsr_video_encoder_vulkan_destroy(gsr_video_encoder *encoder, AVCodecContext *video_codec_context) {
+    gsr_video_encoder_vulkan_stop(encoder->priv, video_codec_context);
+    free(encoder->priv);
+    free(encoder);
+}
+
+gsr_video_encoder* gsr_video_encoder_vulkan_create(const gsr_video_encoder_vulkan_params *params) {
+    gsr_video_encoder *encoder = calloc(1, sizeof(gsr_video_encoder));
+    if(!encoder)
+        return NULL;
+
+    gsr_video_encoder_vulkan *encoder_vulkan = calloc(1, sizeof(gsr_video_encoder_vulkan));
+    if(!encoder_vulkan) {
+        free(encoder);
+        return NULL;
+    }
+
+    encoder_vulkan->params = *params;
+
+    *encoder = (gsr_video_encoder) {
+        .start = gsr_video_encoder_vulkan_start,
+        .copy_textures_to_frame = NULL,
+        .get_textures = gsr_video_encoder_vulkan_get_textures,
+        .destroy = gsr_video_encoder_vulkan_destroy,
+        .priv = encoder_vulkan
+    };
+
+    return encoder;
+}
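
Reviewer note on the Vulkan encoder above: `gsr_video_encoder_vulkan_start` rounds the frame size up to an even value and enforces a 128 pixel minimum. The texture setup then binds a full-size `GL_R8` luma plane and a half-size `GL_RG8` chroma plane at `offset[0] + luma_size`. The sketch below works through that arithmetic. The names `align_up`, `encoder_dimension` and `nv12_layout_for` are hypothetical, and it assumes 8-bit NV12 with tightly packed rows, which the real driver allocation may not guarantee (how `luma_size` is computed is not visible in this part of the diff).

```c
#include <stddef.h>

/* Hypothetical helper: round x up to a multiple of a (a must be a power of two). */
static int align_up(int x, int a) {
    return (x + a - 1) & ~(a - 1);
}

/* Mirrors the clamping in gsr_video_encoder_vulkan_start: even size, at least 128. */
static int encoder_dimension(int requested) {
    const int aligned = align_up(requested, 2);
    return aligned < 128 ? 128 : aligned;
}

typedef struct {
    int luma_width, luma_height;     /* full-resolution Y plane (GL_R8) */
    int chroma_width, chroma_height; /* half-resolution interleaved UV plane (GL_RG8) */
    size_t luma_size;                /* bytes occupied by the Y plane */
    size_t chroma_offset;            /* byte offset of the UV plane in the allocation */
} nv12_layout;

/* Assumption: stride equals width and the UV plane directly follows the Y plane. */
static nv12_layout nv12_layout_for(int width, int height, size_t base_offset) {
    nv12_layout layout;
    layout.luma_width = encoder_dimension(width);
    layout.luma_height = encoder_dimension(height);
    layout.chroma_width = layout.luma_width / 2;
    layout.chroma_height = layout.luma_height / 2;
    layout.luma_size = (size_t)layout.luma_width * (size_t)layout.luma_height;
    layout.chroma_offset = base_offset + layout.luma_size;
    return layout;
}
```

For a 1920x1080 frame this gives a 960x540 chroma plane that starts 2073600 bytes after the luma plane. A real implementation would take the plane offsets and pitches reported by the allocator instead of deriving them.
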
diff --git a/src/image_writer.c b/src/image_writer.c
new file mode 100644
index 0000000..3d731a0
--- /dev/null
+++ b/src/image_writer.c
@@ -0,0 +1,100 @@
+#include "../include/image_writer.h"
+#include "../include/egl.h"
+#include "../include/utils.h"
+
+#define STB_IMAGE_WRITE_IMPLEMENTATION
+#include "../external/stb_image_write.h"
+
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <assert.h>
+
+/* TODO: Support hdr/10-bit */
+bool gsr_image_writer_init_opengl(gsr_image_writer *self, gsr_egl *egl, int width, int height) {
+    memset(self, 0, sizeof(*self));
+    self->source = GSR_IMAGE_WRITER_SOURCE_OPENGL;
+    self->egl = egl;
+    self->width = width;
+    self->height = height;
+    self->texture = gl_create_texture(self->egl, self->width, self->height, GL_RGBA8, GL_RGBA, GL_NEAREST); /* TODO: use GL_RGBA16 instead of GL_RGBA8 for hdr/10-bit */
+    if(self->texture == 0) {
+        fprintf(stderr, "gsr error: gsr_image_writer_init_opengl: failed to create texture\n");
+        return false;
+    }
+    return true;
+}
+
+bool gsr_image_writer_init_memory(gsr_image_writer *self, const void *memory, int width, int height) {
+    memset(self, 0, sizeof(*self));
+    self->source = GSR_IMAGE_WRITER_SOURCE_MEMORY;
+    self->width = width;
+    self->height = height;
+    self->memory = memory;
+    return true;
+}
+
+void gsr_image_writer_deinit(gsr_image_writer *self) {
+    if(self->texture) {
+        self->egl->glDeleteTextures(1, &self->texture);
+        self->texture = 0;
+    }
+}
+
+static bool gsr_image_writer_write_memory_to_file(gsr_image_writer *self, const char *filepath, gsr_image_format image_format, int quality, const void *data) {
+    if(quality < 1)
+        quality = 1;
+    else if(quality > 100)
+        quality = 100;
+
+    bool success = false;
+    switch(image_format) {
+        case GSR_IMAGE_FORMAT_JPEG:
+            success = stbi_write_jpg(filepath, self->width, self->height, 4, data, quality);
+            break;
+        case GSR_IMAGE_FORMAT_PNG:
+            success = stbi_write_png(filepath, self->width, self->height, 4, data, 0);
+            break;
+    }
+
+    if(!success)
+        fprintf(stderr, "gsr error: gsr_image_writer_write_to_file: failed to write image data to output file %s\n", filepath);
+
+    return success;
+}
+
+static bool gsr_image_writer_write_opengl_texture_to_file(gsr_image_writer *self, const char *filepath, gsr_image_format image_format, int quality) {
+    assert(self->source == GSR_IMAGE_WRITER_SOURCE_OPENGL);
+    uint8_t *frame_data = malloc((size_t)self->width * (size_t)self->height * 4);
+    if(!frame_data) {
+        fprintf(stderr, "gsr error: gsr_image_writer_write_to_file: failed to allocate memory for image frame\n");
+        return false;
+    }
+
+    unsigned int fbo = 0;
+    self->egl->glGenFramebuffers(1, &fbo);
+    self->egl->glBindFramebuffer(GL_FRAMEBUFFER, fbo);
+    self->egl->glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, self->texture, 0);
+
+    self->egl->glReadPixels(0, 0, self->width, self->height, GL_RGBA, GL_UNSIGNED_BYTE, frame_data);
+
+    self->egl->glBindFramebuffer(GL_FRAMEBUFFER, 0);
+    self->egl->glDeleteFramebuffers(1, &fbo);
+
+    self->egl->glFlush();
+    self->egl->glFinish();
+    
+    const bool success = gsr_image_writer_write_memory_to_file(self, filepath, image_format, quality, frame_data);
+    free(frame_data);
+    return success;
+}
+
+bool gsr_image_writer_write_to_file(gsr_image_writer *self, const char *filepath, gsr_image_format image_format, int quality) {
+    switch(self->source) {
+        case GSR_IMAGE_WRITER_SOURCE_OPENGL:
+            return gsr_image_writer_write_opengl_texture_to_file(self, filepath, image_format, quality);
+        case GSR_IMAGE_WRITER_SOURCE_MEMORY:
+            return gsr_image_writer_write_memory_to_file(self, filepath, image_format, quality, self->memory);
+    }
+    return false;
+}
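
Reviewer note on `image_writer.c` above: the writer clamps `quality` to the range 1 to 100 before calling `stbi_write_jpg`, and it allocates `width * height * 4` bytes for the RGBA readback. A minimal sketch of both steps follows, with the size calculation done in `size_t` and guarded against overflow. The helper names `clamp_quality` and `rgba_buffer_size` are hypothetical and are not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same clamping as gsr_image_writer_write_memory_to_file: keep quality within 1..100. */
static int clamp_quality(int quality) {
    if(quality < 1)
        return 1;
    if(quality > 100)
        return 100;
    return quality;
}

/*
 * Computes the byte size of a tightly packed 8-bit RGBA image.
 * Returns false for non-positive dimensions or if the size would overflow size_t.
 */
static bool rgba_buffer_size(int width, int height, size_t *size_out) {
    if(width <= 0 || height <= 0)
        return false;

    const size_t w = (size_t)width;
    const size_t h = (size_t)height;
    if(w > SIZE_MAX / 4 / h)
        return false;

    *size_out = w * h * 4;
    return true;
}
```

A caller would pass the result of `rgba_buffer_size` to `malloc` and bail out early when the function returns false, instead of multiplying `int` values directly.
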
diff --git a/src/main.cpp b/src/main.cpp
index 71452e6..67619f9 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -1,65 +1,134 @@
 extern "C" {
 #include "../include/capture/nvfbc.h"
-#include "../include/capture/xcomposite_cuda.h"
-#include "../include/capture/xcomposite_vaapi.h"
-#include "../include/capture/kms_vaapi.h"
-#include "../include/capture/kms_cuda.h"
+#include "../include/capture/xcomposite.h"
+#include "../include/capture/ximage.h"
+#include "../include/capture/kms.h"
+#ifdef GSR_PORTAL
+#include "../include/capture/portal.h"
+#include "../dbus/client/dbus_client.h"
+#endif
+#ifdef GSR_APP_AUDIO
+#include "../include/pipewire_audio.h"
+#endif
+#include "../include/encoder/encoder.h"
+#include "../include/encoder/video/nvenc.h"
+#include "../include/encoder/video/vaapi.h"
+#include "../include/encoder/video/vulkan.h"
+#include "../include/encoder/video/software.h"
+#include "../include/codec_query/nvenc.h"
+#include "../include/codec_query/vaapi.h"
+#include "../include/codec_query/vulkan.h"
+#include "../include/window/x11.h"
+#include "../include/window/wayland.h"
 #include "../include/egl.h"
 #include "../include/utils.h"
+#include "../include/damage.h"
+#include "../include/color_conversion.h"
+#include "../include/image_writer.h"
+#include "../include/args_parser.h"
 }
 
 #include <assert.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string>
-#include <vector>
-#include <unordered_map>
 #include <thread>
 #include <mutex>
-#include <map>
 #include <signal.h>
 #include <sys/stat.h>
 #include <unistd.h>
+#include <sys/wait.h>
+#include <inttypes.h>
+#include <libgen.h>
+#include <malloc.h>
 
 #include "../include/sound.hpp"
 
 extern "C" {
 #include <libavutil/pixfmt.h>
 #include <libavcodec/avcodec.h>
+#include <libavcodec/defs.h>
 #include <libavformat/avformat.h>
 #include <libavutil/opt.h>
 #include <libswresample/swresample.h>
 #include <libavutil/avutil.h>
 #include <libavutil/time.h>
+#include <libavutil/mastering_display_metadata.h>
 #include <libavfilter/avfilter.h>
 #include <libavfilter/buffersink.h>
 #include <libavfilter/buffersrc.h>
 }
 
-#include <deque>
 #include <future>
 
+#ifndef GSR_VERSION
+#define GSR_VERSION "unknown"
+#endif
+
 // TODO: If options are not supported then they are returned (allocated) in the options. This should be free'd.
 
 // TODO: Remove LIBAVUTIL_VERSION_MAJOR checks in the future when ubuntu, pop os LTS etc update ffmpeg to >= 5.0
 
+static const int AUDIO_SAMPLE_RATE = 48000;
+
 static const int VIDEO_STREAM_INDEX = 0;
 
 static thread_local char av_error_buffer[AV_ERROR_MAX_STRING_SIZE];
 
+typedef struct {
+    const gsr_window *window;
+} MonitorOutputCallbackUserdata;
+
 static void monitor_output_callback_print(const gsr_monitor *monitor, void *userdata) {
-    (void)userdata;
-    fprintf(stderr, "    \"%.*s\"    (%dx%d+%d+%d)\n", monitor->name_len, monitor->name, monitor->size.x, monitor->size.y, monitor->pos.x, monitor->pos.y);
+    const MonitorOutputCallbackUserdata *options = (MonitorOutputCallbackUserdata*)userdata;
+    vec2i monitor_position = monitor->pos;
+    vec2i monitor_size = monitor->size;
+    if(gsr_window_get_display_server(options->window) == GSR_DISPLAY_SERVER_WAYLAND) {
+        gsr_monitor_rotation monitor_rotation = GSR_MONITOR_ROT_0;
+        drm_monitor_get_display_server_data(options->window, monitor, &monitor_rotation, &monitor_position);
+        if(monitor_rotation == GSR_MONITOR_ROT_90 || monitor_rotation == GSR_MONITOR_ROT_270)
+            std::swap(monitor_size.x, monitor_size.y);
+    }
+    fprintf(stderr, "  \"%.*s\"    (%dx%d+%d+%d)\n", monitor->name_len, monitor->name, monitor_size.x, monitor_size.y, monitor_position.x, monitor_position.y);
 }
 
 typedef struct {
-    const char *output_name;
+    char *output_name;
 } FirstOutputCallback;
 
-static void get_first_output(const gsr_monitor *monitor, void *userdata) {
-    FirstOutputCallback *first_output = (FirstOutputCallback*)userdata;
-    if(!first_output->output_name)
-        first_output->output_name = strndup(monitor->name, monitor->name_len + 1);
+static void get_first_output_callback(const gsr_monitor *monitor, void *userdata) {
+    FirstOutputCallback *data = (FirstOutputCallback*)userdata;
+    if(!data->output_name)
+        data->output_name = strdup(monitor->name);
+}
+
+typedef struct {
+    gsr_window *window;
+    vec2i position;
+    char *output_name;
+    vec2i monitor_pos;
+    vec2i monitor_size;
+} MonitorByPositionCallback;
+
+static void get_monitor_by_position_callback(const gsr_monitor *monitor, void *userdata) {
+    MonitorByPositionCallback *data = (MonitorByPositionCallback*)userdata;
+
+    vec2i monitor_position = monitor->pos;
+    vec2i monitor_size = monitor->size;
+    if(gsr_window_get_display_server(data->window) == GSR_DISPLAY_SERVER_WAYLAND) {
+        gsr_monitor_rotation monitor_rotation = GSR_MONITOR_ROT_0;
+        drm_monitor_get_display_server_data(data->window, monitor, &monitor_rotation, &monitor_position);
+        if(monitor_rotation == GSR_MONITOR_ROT_90 || monitor_rotation == GSR_MONITOR_ROT_270)
+            std::swap(monitor_size.x, monitor_size.y);
+    }
+
+    if(!data->output_name && data->position.x >= monitor_position.x && data->position.x <= monitor_position.x + monitor_size.x
+        && data->position.y >= monitor_position.y && data->position.y <= monitor_position.y + monitor_size.y)
+    {
+        data->output_name = strdup(monitor->name);
+        data->monitor_pos = monitor_position;
+        data->monitor_size = monitor_size;
+    }
 }
 
 static char* av_error_to_string(int err) {
@@ -68,35 +137,6 @@ static char* av_error_to_string(int err) {
     return av_error_buffer;
 }
 
-enum class VideoQuality {
-    MEDIUM,
-    HIGH,
-    VERY_HIGH,
-    ULTRA
-};
-
-enum class VideoCodec {
-    H264,
-    H265,
-    AV1
-};
-
-enum class AudioCodec {
-    AAC,
-    OPUS,
-    FLAC
-};
-
-enum class PixelFormat {
-    YUV420,
-    YUV444
-};
-
-enum class FramerateMode {
-    CONSTANT,
-    VARIABLE
-};
-
 static int x11_error_handler(Display*, XErrorEvent*) {
     return 0;
 }
@@ -105,124 +145,61 @@ static int x11_io_error_handler(Display*) {
     return 0;
 }
 
-struct PacketData {
-    PacketData() {}
-    PacketData(const PacketData&) = delete;
-    PacketData& operator=(const PacketData&) = delete;
-
-    ~PacketData() {
-        av_free(data.data);
-    }
-
-    AVPacket data;
-};
-
-// |stream| is only required for non-replay mode
-static void receive_frames(AVCodecContext *av_codec_context, int stream_index, AVStream *stream, int64_t pts,
-                           AVFormatContext *av_format_context,
-                           double replay_start_time,
-                           std::deque<std::shared_ptr<PacketData>> &frame_data_queue,
-                           int replay_buffer_size_secs,
-                           bool &frames_erased,
-                           std::mutex &write_output_mutex) {
-    for (;;) {
-        // TODO: Use av_packet_alloc instead because sizeof(av_packet) might not be future proof(?)
-        AVPacket *av_packet = av_packet_alloc();
-        if(!av_packet)
-            break;
-
-        av_packet->data = NULL;
-        av_packet->size = 0;
-        int res = avcodec_receive_packet(av_codec_context, av_packet);
-        if (res == 0) { // we have a packet, send the packet to the muxer
-            av_packet->stream_index = stream_index;
-            av_packet->pts = pts;
-            av_packet->dts = pts;
-
-            std::lock_guard<std::mutex> lock(write_output_mutex);
-            if(replay_buffer_size_secs != -1) {
-                // TODO: Preallocate all frames data and use those instead.
-                // Why are we doing this you ask? there is a new ffmpeg bug that causes cpu usage to increase over time when you have
-                // packets that are not being free'd until later. So we copy the packet data, free the packet and then reconstruct
-                // the packet later on when we need it, to keep packets alive only for a short period.
-                auto new_packet = std::make_shared<PacketData>();
-                new_packet->data = *av_packet;
-                new_packet->data.data = (uint8_t*)av_malloc(av_packet->size);
-                memcpy(new_packet->data.data, av_packet->data, av_packet->size);
-
-                double time_now = clock_get_monotonic_seconds();
-                double replay_time_elapsed = time_now - replay_start_time;
-
-                frame_data_queue.push_back(std::move(new_packet));
-                if(replay_time_elapsed >= replay_buffer_size_secs) {
-                    frame_data_queue.pop_front();
-                    frames_erased = true;
-                }
-            } else {
-                av_packet_rescale_ts(av_packet, av_codec_context->time_base, stream->time_base);
-                av_packet->stream_index = stream->index;
-                // TODO: Is av_interleaved_write_frame needed?
-                int ret = av_write_frame(av_format_context, av_packet);
-                if(ret < 0) {
-                    fprintf(stderr, "Error: Failed to write frame index %d to muxer, reason: %s (%d)\n", av_packet->stream_index, av_error_to_string(ret), ret);
-                }
-            }
-            av_packet_free(&av_packet);
-        } else if (res == AVERROR(EAGAIN)) { // we have no packet
-                                             // fprintf(stderr, "No packet!\n");
-            av_packet_free(&av_packet);
-            break;
-        } else if (res == AVERROR_EOF) { // this is the end of the stream
-            av_packet_free(&av_packet);
-            fprintf(stderr, "End of stream!\n");
-            break;
-        } else {
-            av_packet_free(&av_packet);
-            fprintf(stderr, "Unexpected error: %d\n", res);
-            break;
-        }
-    }
-}
-
-static const char* audio_codec_get_name(AudioCodec audio_codec) {
+static AVCodecID audio_codec_get_id(gsr_audio_codec audio_codec) {
     switch(audio_codec) {
-        case AudioCodec::AAC:  return "aac";
-        case AudioCodec::OPUS: return "opus";
-        case AudioCodec::FLAC: return "flac";
-    }
-    assert(false);
-    return "";
-}
-
-static AVCodecID audio_codec_get_id(AudioCodec audio_codec) {
-    switch(audio_codec) {
-        case AudioCodec::AAC:  return AV_CODEC_ID_AAC;
-        case AudioCodec::OPUS: return AV_CODEC_ID_OPUS;
-        case AudioCodec::FLAC: return AV_CODEC_ID_FLAC;
+        case GSR_AUDIO_CODEC_AAC:  return AV_CODEC_ID_AAC;
+        case GSR_AUDIO_CODEC_OPUS: return AV_CODEC_ID_OPUS;
+        case GSR_AUDIO_CODEC_FLAC: return AV_CODEC_ID_FLAC;
     }
     assert(false);
     return AV_CODEC_ID_AAC;
 }
 
-static AVSampleFormat audio_codec_get_sample_format(AudioCodec audio_codec, const AVCodec *codec) {
+static AVSampleFormat audio_codec_get_sample_format(AVCodecContext *audio_codec_context, gsr_audio_codec audio_codec, const AVCodec *codec, bool mix_audio) {
+    (void)audio_codec_context;
     switch(audio_codec) {
-        case AudioCodec::AAC: {
+        case GSR_AUDIO_CODEC_AAC: {
             return AV_SAMPLE_FMT_FLTP;
         }
-        case AudioCodec::OPUS: {
+        case GSR_AUDIO_CODEC_OPUS: {
             bool supports_s16 = false;
             bool supports_flt = false;
 
-            for(size_t i = 0; codec->sample_fmts && codec->sample_fmts[i] != -1; ++i) {
+            #if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(61, 15, 0)
+            for(size_t i = 0; codec->sample_fmts && codec->sample_fmts[i] != AV_SAMPLE_FMT_NONE; ++i) {
                 if(codec->sample_fmts[i] == AV_SAMPLE_FMT_S16) {
                     supports_s16 = true;
                 } else if(codec->sample_fmts[i] == AV_SAMPLE_FMT_FLT) {
                     supports_flt = true;
                 }
             }
+            #else
+            const enum AVSampleFormat *sample_fmts = NULL;
+            if(avcodec_get_supported_config(audio_codec_context, codec, AV_CODEC_CONFIG_SAMPLE_FORMAT, 0, (const void**)&sample_fmts, NULL) >= 0) {
+                if(sample_fmts) {
+                    for(size_t i = 0; sample_fmts[i] != AV_SAMPLE_FMT_NONE; ++i) {
+                        if(sample_fmts[i] == AV_SAMPLE_FMT_S16) {
+                            supports_s16 = true;
+                        } else if(sample_fmts[i] == AV_SAMPLE_FMT_FLT) {
+                            supports_flt = true;
+                        }
+                    }
+                } else {
+                    // What a dumb API. It returns NULL if all formats are supported
+                    supports_s16 = true;
+                    supports_flt = true;
+                }
+            }
+            #endif
+
+            // Amix only works with float audio
+            if(mix_audio)
+                supports_s16 = false;
 
             if(!supports_s16 && !supports_flt) {
-                fprintf(stderr, "Warning: opus audio codec is chosen but your ffmpeg version does not support s16/flt sample format and performance might be slightly worse. You can either rebuild ffmpeg with libopus instead of the built-in opus, use the flatpak version of gpu screen recorder or record with flac audio codec instead (-ac flac). Falling back to fltp audio sample format instead.\n");
+                fprintf(stderr, "gsr warning: opus audio codec is chosen but your ffmpeg version does not support s16/flt sample format and performance might be slightly worse.\n");
+                fprintf(stderr, "  You can either rebuild ffmpeg with libopus instead of the built-in opus, use the flatpak version of gpu screen recorder or record with aac audio codec instead (-ac aac).\n");
+                fprintf(stderr, "  Falling back to fltp audio sample format instead.\n");
             }
 
             if(supports_s16)
@@ -232,7 +209,7 @@ static AVSampleFormat audio_codec_get_sample_format(AudioCodec audio_codec, cons
             else
                 return AV_SAMPLE_FMT_FLTP;
         }
-        case AudioCodec::FLAC: {
+        case GSR_AUDIO_CODEC_FLAC: {
             return AV_SAMPLE_FMT_S32;
         }
     }
@@ -240,14 +217,14 @@ static AVSampleFormat audio_codec_get_sample_format(AudioCodec audio_codec, cons
     return AV_SAMPLE_FMT_FLTP;
 }
 
-static int64_t audio_codec_get_get_bitrate(AudioCodec audio_codec) {
+static int64_t audio_codec_get_get_bitrate(gsr_audio_codec audio_codec) {
     switch(audio_codec) {
-        case AudioCodec::AAC:  return 128000;
-        case AudioCodec::OPUS: return 96000;
-        case AudioCodec::FLAC: return 96000;
+        case GSR_AUDIO_CODEC_AAC:  return 160000;
+        case GSR_AUDIO_CODEC_OPUS: return 128000;
+        case GSR_AUDIO_CODEC_FLAC: return 128000;
     }
     assert(false);
-    return 96000;
+    return 128000;
 }
 
 static AudioFormat audio_codec_context_get_audio_format(const AVCodecContext *audio_codec_context) {
@@ -270,10 +247,11 @@ static AVSampleFormat audio_format_to_sample_format(const AudioFormat audio_form
     return AV_SAMPLE_FMT_S16;
 }
 
-static AVCodecContext* create_audio_codec_context(int fps, AudioCodec audio_codec) {
+static AVCodecContext* create_audio_codec_context(int fps, gsr_audio_codec audio_codec, bool mix_audio, int64_t audio_bitrate) {
+    (void)fps;
     const AVCodec *codec = avcodec_find_encoder(audio_codec_get_id(audio_codec));
     if (!codec) {
-        fprintf(stderr, "Error: Could not find %s audio encoder\n", audio_codec_get_name(audio_codec));
+        fprintf(stderr, "gsr error: Could not find %s audio encoder\n", audio_codec_get_name(audio_codec));
         _exit(1);
     }
 
@@ -281,11 +259,16 @@ static AVCodecContext* create_audio_codec_context(int fps, AudioCodec audio_code
 
     assert(codec->type == AVMEDIA_TYPE_AUDIO);
     codec_context->codec_id = codec->id;
-    codec_context->sample_fmt = audio_codec_get_sample_format(audio_codec, codec);
-    codec_context->bit_rate = audio_codec_get_get_bitrate(audio_codec);
-    codec_context->sample_rate = 48000;
-    if(audio_codec == AudioCodec::AAC)
+    codec_context->sample_fmt = audio_codec_get_sample_format(codec_context, audio_codec, codec, mix_audio);
+    codec_context->bit_rate = audio_bitrate == 0 ? audio_codec_get_get_bitrate(audio_codec) : audio_bitrate;
+    codec_context->sample_rate = AUDIO_SAMPLE_RATE;
+    if(audio_codec == GSR_AUDIO_CODEC_AAC) {
+#if LIBAVCODEC_VERSION_MAJOR < 62
         codec_context->profile = FF_PROFILE_AAC_LOW;
+#else
+        codec_context->profile = AV_PROFILE_AAC_LOW;
+#endif
+    }
 #if LIBAVCODEC_VERSION_MAJOR < 60
     codec_context->channel_layout = AV_CH_LAYOUT_STEREO;
     codec_context->channels = 2;
@@ -294,18 +277,68 @@ static AVCodecContext* create_audio_codec_context(int fps, AudioCodec audio_code
 #endif
 
     codec_context->time_base.num = 1;
-    codec_context->time_base.den = AV_TIME_BASE;
-    codec_context->framerate.num = fps;
-    codec_context->framerate.den = 1;
+    codec_context->time_base.den = codec_context->sample_rate;
+    codec_context->thread_count = 1;
     codec_context->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
 
     return codec_context;
 }
 
-static AVCodecContext *create_video_codec_context(AVPixelFormat pix_fmt,
-                            VideoQuality video_quality,
-                            int fps, const AVCodec *codec, bool is_livestream, gsr_gpu_vendor vendor, FramerateMode framerate_mode) {
+static int vbr_get_quality_parameter(AVCodecContext *codec_context, gsr_video_quality video_quality, bool hdr) {
+    // 8 bit / 10 bit = 80%
+    const float qp_multiply = hdr ? 8.0f/10.0f : 1.0f;
+    if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+        switch(video_quality) {
+            case GSR_VIDEO_QUALITY_MEDIUM:
+                return 160 * qp_multiply;
+            case GSR_VIDEO_QUALITY_HIGH:
+                return 130 * qp_multiply;
+            case GSR_VIDEO_QUALITY_VERY_HIGH:
+                return 110 * qp_multiply;
+            case GSR_VIDEO_QUALITY_ULTRA:
+                return 90 * qp_multiply;
+        }
+    } else if(codec_context->codec_id == AV_CODEC_ID_H264) {
+        switch(video_quality) {
+            case GSR_VIDEO_QUALITY_MEDIUM:
+                return 35 * qp_multiply;
+            case GSR_VIDEO_QUALITY_HIGH:
+                return 30 * qp_multiply;
+            case GSR_VIDEO_QUALITY_VERY_HIGH:
+                return 25 * qp_multiply;
+            case GSR_VIDEO_QUALITY_ULTRA:
+                return 22 * qp_multiply;
+        }
+    } else if(codec_context->codec_id == AV_CODEC_ID_HEVC) {
+        switch(video_quality) {
+            case GSR_VIDEO_QUALITY_MEDIUM:
+                return 35 * qp_multiply;
+            case GSR_VIDEO_QUALITY_HIGH:
+                return 30 * qp_multiply;
+            case GSR_VIDEO_QUALITY_VERY_HIGH:
+                return 25 * qp_multiply;
+            case GSR_VIDEO_QUALITY_ULTRA:
+                return 22 * qp_multiply;
+        }
+    } else if(codec_context->codec_id == AV_CODEC_ID_VP8 || codec_context->codec_id == AV_CODEC_ID_VP9) {
+        switch(video_quality) {
+            case GSR_VIDEO_QUALITY_MEDIUM:
+                return 35 * qp_multiply;
+            case GSR_VIDEO_QUALITY_HIGH:
+                return 30 * qp_multiply;
+            case GSR_VIDEO_QUALITY_VERY_HIGH:
+                return 25 * qp_multiply;
+            case GSR_VIDEO_QUALITY_ULTRA:
+                return 22 * qp_multiply;
+        }
+    }
+    assert(false);
+    return 22 * qp_multiply;
+}
 
+static AVCodecContext *create_video_codec_context(AVPixelFormat pix_fmt, const AVCodec *codec, const gsr_egl &egl, const args_parser &arg_parser) {
+    const bool use_software_video_encoder = arg_parser.video_encoder == GSR_VIDEO_ENCODER_HW_CPU;
+    const bool hdr = video_codec_is_hdr(arg_parser.video_codec);
     AVCodecContext *codec_context = avcodec_alloc_context3(codec);
 
     //double fps_ratio = (double)fps / 30.0;
@@ -317,238 +350,176 @@ static AVCodecContext *create_video_codec_context(AVPixelFormat pix_fmt,
     // timebase should be 1/framerate and timestamp increments should be
     // identical to 1
     codec_context->time_base.num = 1;
-    codec_context->time_base.den = framerate_mode == FramerateMode::CONSTANT ? fps : AV_TIME_BASE;
-    codec_context->framerate.num = fps;
+    codec_context->time_base.den = arg_parser.framerate_mode == GSR_FRAMERATE_MODE_CONSTANT ? arg_parser.fps : AV_TIME_BASE;
+    codec_context->framerate.num = arg_parser.fps;
     codec_context->framerate.den = 1;
     codec_context->sample_aspect_ratio.num = 0;
     codec_context->sample_aspect_ratio.den = 0;
-    // High values reeduce file size but increases time it takes to seek
-    if(is_livestream) {
+    if(arg_parser.low_latency_recording) {
         codec_context->flags |= (AV_CODEC_FLAG_CLOSED_GOP | AV_CODEC_FLAG_LOW_DELAY);
         codec_context->flags2 |= AV_CODEC_FLAG2_FAST;
         //codec_context->gop_size = std::numeric_limits<int>::max();
         //codec_context->keyint_min = std::numeric_limits<int>::max();
-        codec_context->gop_size = fps * 2;
+        codec_context->gop_size = arg_parser.fps * arg_parser.keyint;
     } else {
-        codec_context->gop_size = fps * 2;
+        // Higher values reduce file size but increase the time it takes to seek
+        codec_context->gop_size = arg_parser.fps * arg_parser.keyint;
     }
     codec_context->max_b_frames = 0;
     codec_context->pix_fmt = pix_fmt;
-    //codec_context->color_range = AVCOL_RANGE_JPEG; // TODO: Amd/nvidia?
-    //codec_context->color_primaries = AVCOL_PRI_BT709;
-    //codec_context->color_trc = AVCOL_TRC_BT709;
-    //codec_context->colorspace = AVCOL_SPC_BT709;
+    codec_context->color_range = arg_parser.color_range == GSR_COLOR_RANGE_LIMITED ? AVCOL_RANGE_MPEG : AVCOL_RANGE_JPEG;
+    if(hdr) {
+        codec_context->color_primaries = AVCOL_PRI_BT2020;
+        codec_context->color_trc = AVCOL_TRC_SMPTE2084;
+        codec_context->colorspace = AVCOL_SPC_BT2020_NCL;
+    } else {
+        codec_context->color_primaries = AVCOL_PRI_BT709;
+        codec_context->color_trc = AVCOL_TRC_BT709;
+        codec_context->colorspace = AVCOL_SPC_BT709;
+    }
     //codec_context->chroma_sample_location = AVCHROMA_LOC_CENTER;
     if(codec->id == AV_CODEC_ID_HEVC)
-        codec_context->codec_tag = MKTAG('h', 'v', 'c', '1');
-    switch(video_quality) {
-        case VideoQuality::MEDIUM:
-            //codec_context->qmin = 35;
-            //codec_context->qmax = 35;
-            codec_context->bit_rate = 100000;//4500000 + (codec_context->width * codec_context->height)*0.75;
-            break;
-        case VideoQuality::HIGH:
-            //codec_context->qmin = 34;
-            //codec_context->qmax = 34;
-            codec_context->bit_rate = 100000;//10000000-9000000 + (codec_context->width * codec_context->height)*0.75;
-            break;
-        case VideoQuality::VERY_HIGH:
-            //codec_context->qmin = 28;
-            //codec_context->qmax = 28;
-            codec_context->bit_rate = 100000;//10000000-9000000 + (codec_context->width * codec_context->height)*0.75;
-            break;
-        case VideoQuality::ULTRA:
-            //codec_context->qmin = 22;
-            //codec_context->qmax = 22;
-            codec_context->bit_rate = 100000;//10000000-9000000 + (codec_context->width * codec_context->height)*0.75;
-            break;
+        codec_context->codec_tag = MKTAG('h', 'v', 'c', '1'); // QuickTime on macOS requires this tag or the video won't be playable
+
+    if(arg_parser.bitrate_mode == GSR_BITRATE_MODE_CBR) {
+        codec_context->bit_rate = arg_parser.video_bitrate;
+        codec_context->rc_max_rate = codec_context->bit_rate;
+        //codec_context->rc_min_rate = codec_context->bit_rate;
+        codec_context->rc_buffer_size = codec_context->bit_rate;//codec_context->bit_rate / 10;
+        codec_context->rc_initial_buffer_occupancy = 0;//codec_context->bit_rate;//codec_context->bit_rate * 1000;
+    } else if(arg_parser.bitrate_mode == GSR_BITRATE_MODE_VBR) {
+        const int quality = vbr_get_quality_parameter(codec_context, arg_parser.video_quality, hdr);
+        switch(arg_parser.video_quality) {
+            case GSR_VIDEO_QUALITY_MEDIUM:
+                codec_context->qmin = quality;
+                codec_context->qmax = quality;
+                codec_context->bit_rate = 100000;//4500000 + (codec_context->width * codec_context->height)*0.75;
+                break;
+            case GSR_VIDEO_QUALITY_HIGH:
+                codec_context->qmin = quality;
+                codec_context->qmax = quality;
+                codec_context->bit_rate = 100000;//10000000-9000000 + (codec_context->width * codec_context->height)*0.75;
+                break;
+            case GSR_VIDEO_QUALITY_VERY_HIGH:
+                codec_context->qmin = quality;
+                codec_context->qmax = quality;
+                codec_context->bit_rate = 100000;//10000000-9000000 + (codec_context->width * codec_context->height)*0.75;
+                break;
+            case GSR_VIDEO_QUALITY_ULTRA:
+                codec_context->qmin = quality;
+                codec_context->qmax = quality;
+                codec_context->bit_rate = 100000;//10000000-9000000 + (codec_context->width * codec_context->height)*0.75;
+                break;
+        }
+
+        codec_context->rc_max_rate = codec_context->bit_rate;
+        //codec_context->rc_min_rate = codec_context->bit_rate;
+        codec_context->rc_buffer_size = codec_context->bit_rate;//codec_context->bit_rate / 10;
+        codec_context->rc_initial_buffer_occupancy = codec_context->bit_rate;//codec_context->bit_rate * 1000;
+    } else {
+        //codec_context->rc_buffer_size = 50000 * 1000;
     }
     //codec_context->profile = FF_PROFILE_H264_MAIN;
     if (codec_context->codec_id == AV_CODEC_ID_MPEG1VIDEO)
         codec_context->mb_decision = 2;
 
-    // stream->time_base = codec_context->time_base;
-    // codec_context->ticks_per_frame = 30;
-    //av_opt_set(codec_context->priv_data, "tune", "hq", 0);
-    // TODO: Do this for better file size? also allows setting qmin, qmax per frame? which can then be used to dynamically set bitrate to reduce quality
-    // if live streaming is slow or if the users harddrive is cant handle writing megabytes of data per second.
-    #if 0
-    char qmin_str[32];
-    snprintf(qmin_str, sizeof(qmin_str), "%d", codec_context->qmin);
-
-    char qmax_str[32];
-    snprintf(qmax_str, sizeof(qmax_str), "%d", codec_context->qmax);
-
-    av_opt_set(codec_context->priv_data, "cq", qmax_str, 0);
-    av_opt_set(codec_context->priv_data, "rc", "vbr", 0);
-    av_opt_set(codec_context->priv_data, "qmin", qmin_str, 0);
-    av_opt_set(codec_context->priv_data, "qmax", qmax_str, 0);
-    codec_context->bit_rate = 0;
-    #endif
-
-    if(vendor != GSR_GPU_VENDOR_NVIDIA) {
-        switch(video_quality) {
-            case VideoQuality::MEDIUM:
-                codec_context->global_quality = 180;
-                break;
-            case VideoQuality::HIGH:
-                codec_context->global_quality = 120;
-                break;
-            case VideoQuality::VERY_HIGH:
-                codec_context->global_quality = 100;
-                break;
-            case VideoQuality::ULTRA:
-                codec_context->global_quality = 70;
-                break;
+    if(!use_software_video_encoder && egl.gpu_info.vendor != GSR_GPU_VENDOR_NVIDIA && arg_parser.bitrate_mode != GSR_BITRATE_MODE_CBR) {
+        // 8 bit / 10 bit = 80%, then scaled by a further 0.7 for HDR (0.56 in total)
+        const float quality_multiply = hdr ? (8.0f/10.0f * 0.7f) : 1.0f;
+        if(codec_context->codec_id == AV_CODEC_ID_AV1 || codec_context->codec_id == AV_CODEC_ID_H264 || codec_context->codec_id == AV_CODEC_ID_HEVC) {
+            switch(arg_parser.video_quality) {
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    codec_context->global_quality = 130 * quality_multiply;
+                    break;
+                case GSR_VIDEO_QUALITY_HIGH:
+                    codec_context->global_quality = 110 * quality_multiply;
+                    break;
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    codec_context->global_quality = 95 * quality_multiply;
+                    break;
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    codec_context->global_quality = 85 * quality_multiply;
+                    break;
+            }
+        } else if(codec_context->codec_id == AV_CODEC_ID_VP8) {
+            switch(arg_parser.video_quality) {
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    codec_context->global_quality = 35 * quality_multiply;
+                    break;
+                case GSR_VIDEO_QUALITY_HIGH:
+                    codec_context->global_quality = 30 * quality_multiply;
+                    break;
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    codec_context->global_quality = 25 * quality_multiply;
+                    break;
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    codec_context->global_quality = 10 * quality_multiply;
+                    break;
+            }
+        } else if(codec_context->codec_id == AV_CODEC_ID_VP9) {
+            switch(arg_parser.video_quality) {
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    codec_context->global_quality = 35 * quality_multiply;
+                    break;
+                case GSR_VIDEO_QUALITY_HIGH:
+                    codec_context->global_quality = 30 * quality_multiply;
+                    break;
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    codec_context->global_quality = 25 * quality_multiply;
+                    break;
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    codec_context->global_quality = 10 * quality_multiply;
+                    break;
+            }
         }
     }
 
     av_opt_set_int(codec_context->priv_data, "b_ref_mode", 0, 0);
     //av_opt_set_int(codec_context->priv_data, "cbr", true, 0);
 
-    if(vendor != GSR_GPU_VENDOR_NVIDIA) {
+    if(egl.gpu_info.vendor != GSR_GPU_VENDOR_NVIDIA) {
         // TODO: More options, better options
         //codec_context->bit_rate = codec_context->width * codec_context->height;
-        av_opt_set(codec_context->priv_data, "rc_mode", "CQP", 0);
+        switch(arg_parser.bitrate_mode) {
+            case GSR_BITRATE_MODE_QP: {
+                if(video_codec_is_vulkan(arg_parser.video_codec))
+                    av_opt_set(codec_context->priv_data, "rc_mode", "cqp", 0);
+                else if(egl.gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA)
+                    av_opt_set(codec_context->priv_data, "rc", "constqp", 0);
+                else
+                    av_opt_set(codec_context->priv_data, "rc_mode", "CQP", 0);
+                break;
+            }
+            case GSR_BITRATE_MODE_VBR: {
+                if(video_codec_is_vulkan(arg_parser.video_codec))
+                    av_opt_set(codec_context->priv_data, "rc_mode", "vbr", 0);
+                else if(egl.gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA)
+                    av_opt_set(codec_context->priv_data, "rc", "vbr", 0);
+                else
+                    av_opt_set(codec_context->priv_data, "rc_mode", "VBR", 0);
+                break;
+            }
+            case GSR_BITRATE_MODE_CBR: {
+                if(video_codec_is_vulkan(arg_parser.video_codec))
+                    av_opt_set(codec_context->priv_data, "rc_mode", "cbr", 0);
+                else if(egl.gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA)
+                    av_opt_set(codec_context->priv_data, "rc", "cbr", 0);
+                else
+                    av_opt_set(codec_context->priv_data, "rc_mode", "CBR", 0);
+                break;
+            }
+        }
         //codec_context->global_quality = 4;
         //codec_context->compression_level = 2;
     }
 
-    //codec_context->rc_max_rate = codec_context->bit_rate;
-    //codec_context->rc_min_rate = codec_context->bit_rate;
-    //codec_context->rc_buffer_size = codec_context->bit_rate / 10;
-    // TODO: Do this when not using cqp
-    //codec_context->rc_initial_buffer_occupancy = codec_context->bit_rate * 1000;
+    //av_opt_set(codec_context->priv_data, "bsf", "hevc_metadata=colour_primaries=9:transfer_characteristics=16:matrix_coefficients=9", 0);
 
     codec_context->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
 
     return codec_context;
 }
 
-static bool vaapi_create_codec_context(AVCodecContext *video_codec_context, const char *card_path) {
-    char render_path[128];
-    if(!gsr_card_path_get_render_path(card_path, render_path)) {
-        fprintf(stderr, "gsr error: failed to get /dev/dri/renderDXXX file from %s\n", card_path);
-        return false;
-    }
-
-    AVBufferRef *device_ctx;
-    if(av_hwdevice_ctx_create(&device_ctx, AV_HWDEVICE_TYPE_VAAPI, render_path, NULL, 0) < 0) {
-        fprintf(stderr, "Error: Failed to create hardware device context\n");
-        return false;
-    }
-
-    AVBufferRef *frame_context = av_hwframe_ctx_alloc(device_ctx);
-    if(!frame_context) {
-        fprintf(stderr, "Error: Failed to create hwframe context\n");
-        av_buffer_unref(&device_ctx);
-        return false;
-    }
-
-    AVHWFramesContext *hw_frame_context =
-        (AVHWFramesContext *)frame_context->data;
-    hw_frame_context->width = video_codec_context->width;
-    hw_frame_context->height = video_codec_context->height;
-    hw_frame_context->sw_format = AV_PIX_FMT_NV12;
-    hw_frame_context->format = video_codec_context->pix_fmt;
-    hw_frame_context->device_ref = device_ctx;
-    hw_frame_context->device_ctx = (AVHWDeviceContext*)device_ctx->data;
-
-    hw_frame_context->initial_pool_size = 1;
-
-    if (av_hwframe_ctx_init(frame_context) < 0) {
-        fprintf(stderr, "Error: Failed to initialize hardware frame context "
-                        "(note: ffmpeg version needs to be > 4.0)\n");
-        av_buffer_unref(&device_ctx);
-        //av_buffer_unref(&frame_context);
-        return false;
-    }
-
-    video_codec_context->hw_device_ctx = av_buffer_ref(device_ctx);
-    video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
-    return true;
-}
-
-static bool check_if_codec_valid_for_hardware(const AVCodec *codec, gsr_gpu_vendor vendor, const char *card_path) {
-    // Do not use AV_PIX_FMT_CUDA because we dont want to do full check with hardware context
-    AVCodecContext *codec_context = create_video_codec_context(vendor == GSR_GPU_VENDOR_NVIDIA ? AV_PIX_FMT_YUV420P : AV_PIX_FMT_VAAPI, VideoQuality::VERY_HIGH, 60, codec, false, vendor, FramerateMode::CONSTANT);
-    if(!codec_context)
-        return false;
-
-    codec_context->width = 512;
-    codec_context->height = 512;
-
-    if(vendor != GSR_GPU_VENDOR_NVIDIA) {
-        if(!vaapi_create_codec_context(codec_context, card_path)) {
-            avcodec_free_context(&codec_context);
-            return false;
-        }
-    }
-
-    bool success = false;
-    success = avcodec_open2(codec_context, codec_context->codec, NULL) == 0;
-    if(codec_context->hw_device_ctx)
-        av_buffer_unref(&codec_context->hw_device_ctx);
-    if(codec_context->hw_frames_ctx)
-        av_buffer_unref(&codec_context->hw_frames_ctx);
-    avcodec_free_context(&codec_context);
-    return success;
-}
-
-static const AVCodec* find_h264_encoder(gsr_gpu_vendor vendor, const char *card_path) {
-    const AVCodec *codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "h264_nvenc" : "h264_vaapi");
-    if(!codec)
-        codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "nvenc_h264" : "vaapi_h264");
-
-    if(!codec)
-        return nullptr;
-
-    static bool checked = false;
-    static bool checked_success = true;
-    if(!checked) {
-        checked = true;
-        if(!check_if_codec_valid_for_hardware(codec, vendor, card_path))
-            checked_success = false;
-    }
-    return checked_success ? codec : nullptr;
-}
-
-static const AVCodec* find_h265_encoder(gsr_gpu_vendor vendor, const char *card_path) {
-    const AVCodec *codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "hevc_nvenc" : "hevc_vaapi");
-    if(!codec)
-        codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "nvenc_hevc" : "vaapi_hevc");
-
-    if(!codec)
-        return nullptr;
-
-    static bool checked = false;
-    static bool checked_success = true;
-    if(!checked) {
-        checked = true;
-        if(!check_if_codec_valid_for_hardware(codec, vendor, card_path))
-            checked_success = false;
-    }
-    return checked_success ? codec : nullptr;
-}
-
-static const AVCodec* find_av1_encoder(gsr_gpu_vendor vendor, const char *card_path) {
-    const AVCodec *codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "av1_nvenc" : "av1_vaapi");
-    if(!codec)
-        codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "nvenc_av1" : "vaapi_av1");
-
-    if(!codec)
-        return nullptr;
-
-    static bool checked = false;
-    static bool checked_success = true;
-    if(!checked) {
-        checked = true;
-        if(!check_if_codec_valid_for_hardware(codec, vendor, card_path))
-            checked_success = false;
-    }
-    return checked_success ? codec : nullptr;
-}
-
 static void open_audio(AVCodecContext *audio_codec_context) {
     AVDictionary *options = nullptr;
     av_dict_set(&options, "strict", "experimental", 0);
@@ -587,165 +558,358 @@ static AVFrame* create_audio_frame(AVCodecContext *audio_codec_context) {
     return frame;
 }
 
-static void open_video(AVCodecContext *codec_context, VideoQuality video_quality, bool very_old_gpu, gsr_gpu_vendor vendor, PixelFormat pixel_format) {
+static void dict_set_profile(AVCodecContext *codec_context, gsr_gpu_vendor vendor, gsr_color_depth color_depth, gsr_video_codec video_codec, AVDictionary **options) {
+    #if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(61, 17, 100)
+    if(codec_context->codec_id == AV_CODEC_ID_H264) {
+        // TODO: Only for vaapi
+        //if(color_depth == GSR_COLOR_DEPTH_10_BITS)
+        //    av_dict_set(options, "profile", "high10", 0);
+        //else
+        av_dict_set(options, "profile", "high", 0);
+    } else if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+        if(vendor == GSR_GPU_VENDOR_NVIDIA) {
+            if(color_depth == GSR_COLOR_DEPTH_10_BITS)
+                av_dict_set_int(options, "highbitdepth", 1, 0);
+        } else {
+            av_dict_set(options, "profile", "main", 0); // TODO: use professional instead?
+        }
+    } else if(codec_context->codec_id == AV_CODEC_ID_HEVC) {
+        if(color_depth == GSR_COLOR_DEPTH_10_BITS)
+            av_dict_set(options, "profile", "main10", 0);
+        else
+            av_dict_set(options, "profile", "main", 0);
+    }
+    #else
+    const bool use_nvidia_values = vendor == GSR_GPU_VENDOR_NVIDIA && !video_codec_is_vulkan(video_codec);
+    if(codec_context->codec_id == AV_CODEC_ID_H264) {
+        // TODO: Only for vaapi
+        //if(color_depth == GSR_COLOR_DEPTH_10_BITS)
+        //    av_dict_set_int(options, "profile", AV_PROFILE_H264_HIGH_10, 0);
+        //else
+        av_dict_set_int(options, "profile", use_nvidia_values ? 2 : AV_PROFILE_H264_HIGH, 0);
+    } else if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+        if(use_nvidia_values) {
+            if(color_depth == GSR_COLOR_DEPTH_10_BITS)
+                av_dict_set_int(options, "highbitdepth", 1, 0);
+        } else {
+            av_dict_set_int(options, "profile", AV_PROFILE_AV1_MAIN, 0); // TODO: use professional instead?
+        }
+    } else if(codec_context->codec_id == AV_CODEC_ID_HEVC) {
+        if(color_depth == GSR_COLOR_DEPTH_10_BITS)
+            av_dict_set_int(options, "profile", use_nvidia_values ? 1 : AV_PROFILE_HEVC_MAIN_10, 0);
+        else
+            av_dict_set_int(options, "profile", use_nvidia_values ? 0 : AV_PROFILE_HEVC_MAIN, 0);
+    }
+    #endif
+}
+
+static void video_software_set_qp(AVCodecContext *codec_context, gsr_video_quality video_quality, bool hdr, AVDictionary **options) {
+    // 8 bit / 10 bit = 80%
+    const float qp_multiply = hdr ? 8.0f/10.0f : 1.0f;
+    if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+        switch(video_quality) {
+            case GSR_VIDEO_QUALITY_MEDIUM:
+                av_dict_set_int(options, "qp", 35 * qp_multiply, 0);
+                break;
+            case GSR_VIDEO_QUALITY_HIGH:
+                av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
+                break;
+            case GSR_VIDEO_QUALITY_VERY_HIGH:
+                av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
+                break;
+            case GSR_VIDEO_QUALITY_ULTRA:
+                av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
+                break;
+        }
+    } else if(codec_context->codec_id == AV_CODEC_ID_H264) {
+        switch(video_quality) {
+            case GSR_VIDEO_QUALITY_MEDIUM:
+                av_dict_set_int(options, "qp", 34 * qp_multiply, 0);
+                break;
+            case GSR_VIDEO_QUALITY_HIGH:
+                av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
+                break;
+            case GSR_VIDEO_QUALITY_VERY_HIGH:
+                av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
+                break;
+            case GSR_VIDEO_QUALITY_ULTRA:
+                av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
+                break;
+        }
+    } else {
+        switch(video_quality) {
+            case GSR_VIDEO_QUALITY_MEDIUM:
+                av_dict_set_int(options, "qp", 35 * qp_multiply, 0);
+                break;
+            case GSR_VIDEO_QUALITY_HIGH:
+                av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
+                break;
+            case GSR_VIDEO_QUALITY_VERY_HIGH:
+                av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
+                break;
+            case GSR_VIDEO_QUALITY_ULTRA:
+                av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
+                break;
+        }
+    }
+}
+
+static void open_video_software(AVCodecContext *codec_context, const args_parser &arg_parser) {
+    const bool hdr = video_codec_is_hdr(arg_parser.video_codec);
     AVDictionary *options = nullptr;
-    if(vendor == GSR_GPU_VENDOR_NVIDIA) {
-        bool supports_p4 = false;
-        bool supports_p5 = false;
-
-        const AVOption *opt = nullptr;
-        while((opt = av_opt_next(codec_context->priv_data, opt))) {
-            if(opt->type == AV_OPT_TYPE_CONST) {
-                if(strcmp(opt->name, "p4") == 0)
-                    supports_p4 = true;
-                else if(strcmp(opt->name, "p5") == 0)
-                    supports_p5 = true;
-            }
+
+    if(arg_parser.bitrate_mode == GSR_BITRATE_MODE_QP)
+        video_software_set_qp(codec_context, arg_parser.video_quality, hdr, &options);
+
+    av_dict_set(&options, "preset", "veryfast", 0);
+    av_dict_set(&options, "tune", "film", 0);
+
+    if(codec_context->codec_id == AV_CODEC_ID_H264) {
+        av_dict_set(&options, "coder", "cabac", 0); // TODO: cavlc is faster than cabac but compresses worse. Which to use?
+    }
+
+    av_dict_set(&options, "strict", "experimental", 0);
+
+    int ret = avcodec_open2(codec_context, codec_context->codec, &options);
+    if (ret < 0) {
+        fprintf(stderr, "gsr error: Could not open video codec: %s\n", av_error_to_string(ret));
+        _exit(1);
+    }
+}
+
+static void video_set_rc(gsr_video_codec video_codec, gsr_gpu_vendor vendor, gsr_bitrate_mode bitrate_mode, AVDictionary **options) {
+    switch(bitrate_mode) {
+        case GSR_BITRATE_MODE_QP: {
+            if(video_codec_is_vulkan(video_codec))
+                av_dict_set(options, "rc_mode", "cqp", 0);
+            else if(vendor == GSR_GPU_VENDOR_NVIDIA)
+                av_dict_set(options, "rc", "constqp", 0);
+            else
+                av_dict_set(options, "rc_mode", "CQP", 0);
+            break;
+        }
+        case GSR_BITRATE_MODE_VBR: {
+            if(video_codec_is_vulkan(video_codec))
+                av_dict_set(options, "rc_mode", "vbr", 0);
+            else if(vendor == GSR_GPU_VENDOR_NVIDIA)
+                av_dict_set(options, "rc", "vbr", 0);
+            else
+                av_dict_set(options, "rc_mode", "VBR", 0);
+            break;
         }
+        case GSR_BITRATE_MODE_CBR: {
+            if(video_codec_is_vulkan(video_codec))
+                av_dict_set(options, "rc_mode", "cbr", 0);
+            else if(vendor == GSR_GPU_VENDOR_NVIDIA)
+                av_dict_set(options, "rc", "cbr", 0);
+            else
+                av_dict_set(options, "rc_mode", "CBR", 0);
+            break;
+        }
+    }
+}
 
+static void video_hardware_set_qp(AVCodecContext *codec_context, gsr_video_quality video_quality, gsr_gpu_vendor vendor, bool hdr, AVDictionary **options) {
+    // 8 bit / 10 bit = 80%
+    const float qp_multiply = hdr ? 8.0f/10.0f : 1.0f;
+    if(vendor == GSR_GPU_VENDOR_NVIDIA) {
+        // TODO: Test if these should be in the same range as vaapi
         if(codec_context->codec_id == AV_CODEC_ID_AV1) {
             switch(video_quality) {
-                case VideoQuality::MEDIUM:
-                    av_dict_set_int(&options, "qp", 43, 0);
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    av_dict_set_int(options, "qp", 35 * qp_multiply, 0);
                     break;
-                case VideoQuality::HIGH:
-                    av_dict_set_int(&options, "qp", 39, 0);
+                case GSR_VIDEO_QUALITY_HIGH:
+                    av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
                     break;
-                case VideoQuality::VERY_HIGH:
-                    av_dict_set_int(&options, "qp", 34, 0);
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
                     break;
-                case VideoQuality::ULTRA:
-                    av_dict_set_int(&options, "qp", 28, 0);
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
                     break;
             }
-        } else if(very_old_gpu || codec_context->codec_id == AV_CODEC_ID_H264) {
+        } else if(codec_context->codec_id == AV_CODEC_ID_H264) {
             switch(video_quality) {
-                case VideoQuality::MEDIUM:
-                    av_dict_set_int(&options, "qp", 37, 0);
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    av_dict_set_int(options, "qp", 35 * qp_multiply, 0);
                     break;
-                case VideoQuality::HIGH:
-                    av_dict_set_int(&options, "qp", 32, 0);
+                case GSR_VIDEO_QUALITY_HIGH:
+                    av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
                     break;
-                case VideoQuality::VERY_HIGH:
-                    av_dict_set_int(&options, "qp", 27, 0);
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
                     break;
-                case VideoQuality::ULTRA:
-                    av_dict_set_int(&options, "qp", 21, 0);
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
                     break;
             }
-        } else {
+        } else if(codec_context->codec_id == AV_CODEC_ID_HEVC) {
             switch(video_quality) {
-                case VideoQuality::MEDIUM:
-                    av_dict_set_int(&options, "qp", 40, 0);
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    av_dict_set_int(options, "qp", 35 * qp_multiply, 0);
                     break;
-                case VideoQuality::HIGH:
-                    av_dict_set_int(&options, "qp", 35, 0);
+                case GSR_VIDEO_QUALITY_HIGH:
+                    av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
                     break;
-                case VideoQuality::VERY_HIGH:
-                    av_dict_set_int(&options, "qp", 30, 0);
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
                     break;
-                case VideoQuality::ULTRA:
-                    av_dict_set_int(&options, "qp", 24, 0);
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
                     break;
             }
-        }
-
-        if(!supports_p4 && !supports_p5)
-            fprintf(stderr, "Info: your ffmpeg version is outdated. It's recommended that you use the flatpak version of gpu-screen-recorder version instead, which you can find at https://flathub.org/apps/details/com.dec05eba.gpu_screen_recorder\n");
-
-        //if(is_livestream) {
-        //    av_dict_set_int(&options, "zerolatency", 1, 0);
-        //    //av_dict_set(&options, "preset", "llhq", 0);
-        //}
-
-        // I want to use a good preset for the gpu but all gpus prefer different
-        // presets. Nvidia and ffmpeg used to support "hq" preset that chose the best preset for the gpu
-        // with pretty good performance but you now have to choose p1-p7, which are gpu agnostic and on
-        // older gpus p5-p7 slow the gpu down to a crawl...
-        // "hq" is now just an alias for p7 in ffmpeg :(
-        // TODO: Temporary disable because of stuttering?
-
-        // TODO: Preset is set to p5 for now but it should ideally be p6 or p7.
-        // This change is needed because for certain sizes of a window (or monitor?) such as 971x780 causes encoding to freeze
-        // when using h264 codec. This is a new(?) nvidia driver bug.
-        if(very_old_gpu)
-            av_dict_set(&options, "preset", supports_p4 ? "p4" : "medium", 0);
-        else
-            av_dict_set(&options, "preset", supports_p5 ? "p5" : "slow", 0);
-
-        av_dict_set(&options, "tune", "hq", 0);
-        av_dict_set(&options, "rc", "constqp", 0);
-
-        if(codec_context->codec_id == AV_CODEC_ID_H264) {
-            switch(pixel_format) {
-                case PixelFormat::YUV420:
-                    av_dict_set(&options, "profile", "high", 0);
+        } else if(codec_context->codec_id == AV_CODEC_ID_VP8 || codec_context->codec_id == AV_CODEC_ID_VP9) {
+            switch(video_quality) {
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    av_dict_set_int(options, "qp", 35 * qp_multiply, 0);
                     break;
-                case PixelFormat::YUV444:
-                    av_dict_set(&options, "profile", "high444p", 0);
+                case GSR_VIDEO_QUALITY_HIGH:
+                    av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
                     break;
-            }
-        } else if(codec_context->codec_id == AV_CODEC_ID_AV1) {
-            switch(pixel_format) {
-                case PixelFormat::YUV420:
-                    av_dict_set(&options, "rgb_mode", "yuv420", 0);
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
                     break;
-                case PixelFormat::YUV444:
-                    av_dict_set(&options, "rgb_mode", "yuv444", 0);
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
                     break;
             }
-        } else {
-            //av_dict_set(&options, "profile", "main10", 0);
-            //av_dict_set(&options, "pix_fmt", "yuv420p16le", 0);
         }
     } else {
         if(codec_context->codec_id == AV_CODEC_ID_AV1) {
             // Using global_quality option
         } else if(codec_context->codec_id == AV_CODEC_ID_H264) {
             switch(video_quality) {
-                case VideoQuality::MEDIUM:
-                    av_dict_set_int(&options, "qp", 32, 0);
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    av_dict_set_int(options, "qp", 35 * qp_multiply, 0);
                     break;
-                case VideoQuality::HIGH:
-                    av_dict_set_int(&options, "qp", 28, 0);
+                case GSR_VIDEO_QUALITY_HIGH:
+                    av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
                     break;
-                case VideoQuality::VERY_HIGH:
-                    av_dict_set_int(&options, "qp", 24, 0);
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
                     break;
-                case VideoQuality::ULTRA:
-                    av_dict_set_int(&options, "qp", 18, 0);
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
                     break;
             }
-        } else {
+        } else if(codec_context->codec_id == AV_CODEC_ID_HEVC) {
             switch(video_quality) {
-                case VideoQuality::MEDIUM:
-                    av_dict_set_int(&options, "qp", 34, 0);
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    av_dict_set_int(options, "qp", 35 * qp_multiply, 0);
                     break;
-                case VideoQuality::HIGH:
-                    av_dict_set_int(&options, "qp", 30, 0);
+                case GSR_VIDEO_QUALITY_HIGH:
+                    av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
                     break;
-                case VideoQuality::VERY_HIGH:
-                    av_dict_set_int(&options, "qp", 26, 0);
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
                     break;
-                case VideoQuality::ULTRA:
-                    av_dict_set_int(&options, "qp", 20, 0);
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
                     break;
             }
+        } else if(codec_context->codec_id == AV_CODEC_ID_VP8 || codec_context->codec_id == AV_CODEC_ID_VP9) {
+            switch(video_quality) {
+                case GSR_VIDEO_QUALITY_MEDIUM:
+                    av_dict_set_int(options, "qp", 35 * qp_multiply, 0);
+                    break;
+                case GSR_VIDEO_QUALITY_HIGH:
+                    av_dict_set_int(options, "qp", 30 * qp_multiply, 0);
+                    break;
+                case GSR_VIDEO_QUALITY_VERY_HIGH:
+                    av_dict_set_int(options, "qp", 25 * qp_multiply, 0);
+                    break;
+                case GSR_VIDEO_QUALITY_ULTRA:
+                    av_dict_set_int(options, "qp", 22 * qp_multiply, 0);
+                    break;
+            }
+        }
+    }
+}
+
+static void open_video_hardware(AVCodecContext *codec_context, bool low_power, const gsr_egl &egl, const args_parser &arg_parser) {
+    const gsr_color_depth color_depth = video_codec_to_bit_depth(arg_parser.video_codec);
+    const bool hdr = video_codec_is_hdr(arg_parser.video_codec);
+    AVDictionary *options = nullptr;
+
+    if(arg_parser.bitrate_mode == GSR_BITRATE_MODE_QP)
+        video_hardware_set_qp(codec_context, arg_parser.video_quality, egl.gpu_info.vendor, hdr, &options);
+
+    video_set_rc(arg_parser.video_codec, egl.gpu_info.vendor, arg_parser.bitrate_mode, &options);
+
+    // TODO: Enable multipass
+
+    dict_set_profile(codec_context, egl.gpu_info.vendor, color_depth, arg_parser.video_codec, &options);
+
+    if(video_codec_is_vulkan(arg_parser.video_codec)) {
+        av_dict_set_int(&options, "async_depth", 3, 0);
+        av_dict_set(&options, "tune", "hq", 0);
+        av_dict_set(&options, "usage", "record", 0); // TODO: Set to stream when streaming
+        av_dict_set(&options, "content", "rendered", 0);
+    } else if(egl.gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA) {
+        // TODO: These don't seem to be necessary
+        // av_dict_set_int(&options, "zerolatency", 1, 0);
+        // if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+        //     av_dict_set(&options, "tune", "ll", 0);
+        // } else if(codec_context->codec_id == AV_CODEC_ID_H264 || codec_context->codec_id == AV_CODEC_ID_HEVC) {
+        //     av_dict_set(&options, "preset", "llhq", 0);
+        //     av_dict_set(&options, "tune", "ll", 0);
+        // }
+        av_dict_set(&options, "tune", "hq", 0);
+
+        switch(arg_parser.tune) {
+            case GSR_TUNE_PERFORMANCE:
+                //av_dict_set(&options, "multipass", "qres", 0);
+                break;
+            case GSR_TUNE_QUALITY:
+                av_dict_set(&options, "multipass", "fullres", 0);
+                av_dict_set(&options, "preset", "p6", 0);
+                av_dict_set_int(&options, "rc-lookahead", 0, 0);
+                break;
         }
 
+        if(codec_context->codec_id == AV_CODEC_ID_H264) {
+            // TODO: h264 10bit?
+            // TODO:
+            // switch(pixel_format) {
+            //     case GSR_PIXEL_FORMAT_YUV420:
+            //         av_dict_set_int(&options, "profile", AV_PROFILE_H264_HIGH, 0);
+            //         break;
+            //     case GSR_PIXEL_FORMAT_YUV444:
+            //         av_dict_set_int(&options, "profile", AV_PROFILE_H264_HIGH_444, 0);
+            //         break;
+            // }
+        } else if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+            switch(arg_parser.pixel_format) {
+                case GSR_PIXEL_FORMAT_YUV420:
+                    av_dict_set(&options, "rgb_mode", "yuv420", 0);
+                    break;
+                case GSR_PIXEL_FORMAT_YUV444:
+                    av_dict_set(&options, "rgb_mode", "yuv444", 0);
+                    break;
+            }
+        } else if(codec_context->codec_id == AV_CODEC_ID_HEVC) {
+            //av_dict_set(&options, "pix_fmt", "yuv420p16le", 0);
+        }
+    } else {
         // TODO: More quality options
-        av_dict_set(&options, "rc_mode", "CQP", 0);
-        //av_dict_set_int(&options, "low_power", 1, 0);
+        if(low_power)
+            av_dict_set_int(&options, "low_power", 1, 0);
+        // Improves performance but increases VRAM usage.
+        // TODO: Might need a different async_depth for optimal performance on different amd/intel gpus
+        av_dict_set_int(&options, "async_depth", 3, 0);
 
         if(codec_context->codec_id == AV_CODEC_ID_H264) {
-            av_dict_set(&options, "profile", "high", 0);
-            av_dict_set_int(&options, "quality", 5, 0); // quality preset
+            // Removed because it causes stutter in games for some people
+            //av_dict_set_int(&options, "quality", 5, 0); // quality preset
         } else if(codec_context->codec_id == AV_CODEC_ID_AV1) {
-            av_dict_set(&options, "profile", "main", 0); // TODO: use professional instead?
             av_dict_set(&options, "tier", "main", 0);
-        } else {
-            av_dict_set(&options, "profile", "main", 0);
+        } else if(codec_context->codec_id == AV_CODEC_ID_HEVC) {
+            if(hdr)
+                av_dict_set(&options, "sei", "hdr", 0);
         }
+
+        // TODO: vp8/vp9 10bit
     }
 
     if(codec_context->codec_id == AV_CODEC_ID_H264) {
@@ -756,108 +920,57 @@ static void open_video(AVCodecContext *codec_context, VideoQuality video_quality
 
     int ret = avcodec_open2(codec_context, codec_context->codec, &options);
     if (ret < 0) {
-        fprintf(stderr, "Error: Could not open video codec: %s\n", av_error_to_string(ret));
+        fprintf(stderr, "gsr error: Could not open video codec: %s\n", av_error_to_string(ret));
         _exit(1);
     }
 }
 
-static void usage_header() {
-    fprintf(stderr, "usage: gpu-screen-recorder -w <window_id|monitor|focused> [-c <container_format>] [-s WxH] -f <fps> [-a <audio_input>] [-q <quality>] [-r <replay_buffer_size_sec>] [-k h264|h265|av1] [-ac aac|opus|flac] [-oc yes|no] [-fm cfr|vfr] [-v yes|no] [-h|--help] [-o <output_file>] [-mf yes|no]\n");
-}
-
-static void usage_full() {
-    usage_header();
-    fprintf(stderr, "\n");
-    fprintf(stderr, "OPTIONS:\n");
-    fprintf(stderr, "  -w    Window to record, a display, \"screen\", \"screen-direct\", \"screen-direct-force\" or \"focused\".\n");
-    fprintf(stderr, "        The display is the display (monitor) name in xrandr and if \"screen\", \"screen-direct\" or \"screen-direct-force\" is selected then all displays are recorded.\n");
-    fprintf(stderr, "        If this is \"focused\" then the currently focused window is recorded. When recording the focused window then the -s option has to be used as well.\n");
-    fprintf(stderr, "        \"screen-direct\"/\"screen-direct-force\" skips one texture copy for fullscreen applications so it may lead to better performance and it works with VRR monitors\n");
-    fprintf(stderr, "        when recording fullscreen application but may break some applications, such as mpv in fullscreen mode or might cause games to freeze/crash because of nvidia driver issues.\n");
-    fprintf(stderr, "        Direct mode doesn't capture cursor either.\n");
-    fprintf(stderr, "        \"screen-direct-force\" is not recommended unless you use a VRR monitor and you are aware that using this option can cause games to freeze/crash or other issues.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -c    Container format for output file, for example mp4, or flv. Only required if no output file is specified or if recording in replay buffer mode.\n");
-    fprintf(stderr, "        If an output file is specified and -c is not used then the container format is determined from the output filename extension.\n");
-    fprintf(stderr, "        Only containers that support h264, hevc or av1 are supported, which means that only mp4, mkv, flv (and some others) are supported.\n");
-    fprintf(stderr, "        WebM is not supported yet.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -s    The size (area) to record at in the format WxH, for example 1920x1080. This option is only supported (and required) when -w is \"focused\".\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -f    Framerate to record at.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -a    Audio device to record from (pulse audio device). Can be specified multiple times. Each time this is specified a new audio track is added for the specified audio device.\n");
-    fprintf(stderr, "        A name can be given to the audio input device by prefixing the audio input with <name>/, for example \"dummy/alsa_output.pci-0000_00_1b.0.analog-stereo.monitor\".\n");
-    fprintf(stderr, "        Multiple audio devices can be merged into one audio track by using \"|\" as a separator into one -a argument, for example: -a \"alsa_output1|alsa_output2\".\n");
-    fprintf(stderr, "        Optional, no audio track is added by default.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -q    Video quality. Should be either 'medium', 'high', 'very_high' or 'ultra'. 'high' is the recommended option when live streaming or when you have a slower harddrive.\n");
-    fprintf(stderr, "        Optional, set to 'very_high' be default.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -r    Replay buffer size in seconds. If this is set, then only the last seconds as set by this option will be stored\n");
-    fprintf(stderr, "        and the video will only be saved when the gpu-screen-recorder is closed. This feature is similar to Nvidia's instant replay feature.\n");
-    fprintf(stderr, "        This option has be between 5 and 1200. Note that the replay buffer size will not always be precise, because of keyframes. Optional, disabled by default.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -k    Video codec to use. Should be either 'auto', 'h264', 'h265', 'av1'. Defaults to 'auto' which defaults to 'h265' unless recording at fps higher than 60. Defaults to 'h264' on intel.\n");
-    fprintf(stderr, "        Forcefully set to 'h264' if -c is 'flv'.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -ac   Audio codec to use. Should be either 'aac', 'opus' or 'flac'. Defaults to 'opus' for .mp4/.mkv files, otherwise defaults to 'aac'.\n");
-    fprintf(stderr, "        'opus' and 'flac' is only supported by .mp4/.mkv files. 'opus' is recommended for best performance and smallest audio size.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -oc   Overclock memory transfer rate to the maximum performance level. This only applies to NVIDIA on X11 and exists to overcome a bug in NVIDIA driver where performance level. The same issue exists on Wayland but overclocking is not possible on Wayland.\n");
-    fprintf(stderr, "        is dropped when you record a game. Only needed if you are recording a game that is bottlenecked by GPU.\n");
-    fprintf(stderr, "        Works only if your have \"Coolbits\" set to \"12\" in NVIDIA X settings, see README for more information. Note! use at your own risk! Optional, disabled by default.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -fm   Framerate mode. Should be either 'cfr' or 'vfr'. Defaults to 'cfr' on NVIDIA X11 and 'vfr' on AMD/Intel X11/Wayland or NVIDIA Wayland.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -v    Prints per second, fps updates. Optional, set to 'yes' by default.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -h    Show this help.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "  -mf   Organise replays in folders based on the current date.\n");
-    fprintf(stderr, "\n");
-    //fprintf(stderr, "  -pixfmt  The pixel format to use for the output video. yuv420 is the most common format and is best supported, but the color is compressed, so colors can look washed out and certain colors of text can look bad. Use yuv444 for no color compression, but the video may not work everywhere and it may not work with hardware video decoding. Optional, defaults to yuv420\n");
-    fprintf(stderr, "  -o    The output file path. If omitted then the encoded data is sent to stdout. Required in replay mode (when using -r).\n");
-    fprintf(stderr, "        In replay mode this has to be a directory instead of a file.\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "NOTES:\n");
-    fprintf(stderr, "  Send signal SIGINT to gpu-screen-recorder (Ctrl+C, or killall gpu-screen-recorder) to stop and save the recording (when not using replay mode).\n");
-    fprintf(stderr, "  Send signal SIGUSR1 to gpu-screen-recorder (killall -SIGUSR1 gpu-screen-recorder) to save a replay (when in replay mode).\n");
-    fprintf(stderr, "\n");
-    fprintf(stderr, "EXAMPLES\n");
-    fprintf(stderr, "  gpu-screen-recorder -w screen -f 60 -a \"$(pactl get-default-sink).monitor\" -o video.mp4\n");
-    fprintf(stderr, "  gpu-screen-recorder -w screen -f 60 -a \"$(pactl get-default-sink).monitor|$(pactl get-default-source)\" -o video.mp4\n");
-    //fprintf(stderr, "  gpu-screen-recorder -w screen -f 60 -q ultra -pixfmt yuv444 -o video.mp4\n");
-    _exit(1);
-}
-
-static void usage() {
-    usage_header();
-    _exit(1);
-}
+static const int save_replay_seconds_full = -1;
 
 static sig_atomic_t running = 1;
-static sig_atomic_t save_replay = 0;
+static sig_atomic_t toggle_pause = 0;
+static sig_atomic_t toggle_replay_recording = 0;
+static sig_atomic_t save_replay_seconds = 0;
 
-static void int_handler(int) {
+static void stop_handler(int) {
     running = 0;
 }
 
+static void toggle_pause_handler(int) {
+    toggle_pause = 1;
+}
+
+static void toggle_replay_recording_handler(int) {
+    toggle_replay_recording = 1;
+}
+
 static void save_replay_handler(int) {
-    save_replay = 1;
+    save_replay_seconds = save_replay_seconds_full;
 }
 
-struct Arg {
-    std::vector<const char*> values;
-    bool optional = false;
-    bool list = false;
+static void save_replay_10_seconds_handler(int) {
+    save_replay_seconds = 10;
+}
 
-    const char* value() const {
-        if(values.empty())
-            return nullptr;
-        return values.front();
-    }
-};
+static void save_replay_30_seconds_handler(int) {
+    save_replay_seconds = 30;
+}
+
+static void save_replay_1_minute_handler(int) {
+    save_replay_seconds = 60;
+}
+
+static void save_replay_5_minutes_handler(int) {
+    save_replay_seconds = 60*5;
+}
+
+static void save_replay_10_minutes_handler(int) {
+    save_replay_seconds = 60*10;
+}
+
+static void save_replay_30_minutes_handler(int) {
+    save_replay_seconds = 60*30;
+}
 
 static bool is_hex_num(char c) {
     return (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f') || (c >= '0' && c <= '9');
@@ -891,7 +1004,7 @@ static std::string get_date_str() {
     time_t now = time(NULL);
     struct tm *t = localtime(&now);
     strftime(str, sizeof(str)-1, "%Y-%m-%d_%H-%M-%S", t);
-    return str; 
+    return str;
 }
 
 static std::string get_date_only_str() {
@@ -913,7 +1026,7 @@ static std::string get_time_only_str() {
 static AVStream* create_stream(AVFormatContext *av_format_context, AVCodecContext *codec_context) {
     AVStream *stream = avformat_new_stream(av_format_context, nullptr);
     if (!stream) {
-        fprintf(stderr, "Error: Could not allocate stream\n");
+        fprintf(stderr, "gsr error: Could not allocate stream\n");
         _exit(1);
     }
     stream->id = av_format_context->nb_streams - 1;
@@ -922,7 +1035,70 @@ static AVStream* create_stream(AVFormatContext *av_format_context, AVCodecContex
     return stream;
 }
 
-struct AudioDevice {
+static void run_recording_saved_script_async(const char *script_file, const char *video_file, const char *type) {
+    char script_file_full[PATH_MAX];
+    script_file_full[0] = '\0';
+    if(!realpath(script_file, script_file_full)) {
+        fprintf(stderr, "gsr error: script file not found: %s\n", script_file);
+        return;
+    }
+
+    const char *args[7];
+    const bool inside_flatpak = getenv("FLATPAK_ID") != NULL;
+
+    if(inside_flatpak) {
+        args[0] = "flatpak-spawn";
+        args[1] = "--host";
+        args[2] = "--";
+        args[3] = script_file_full;
+        args[4] = video_file;
+        args[5] = type;
+        args[6] = NULL;
+    } else {
+        args[0] = script_file_full;
+        args[1] = video_file;
+        args[2] = type;
+        args[3] = NULL;
+    }
+
+    pid_t pid = fork();
+    if(pid == -1) {
+        perror(script_file_full);
+        return;
+    } else if(pid == 0) { // child
+        setsid();
+        signal(SIGHUP, SIG_IGN);
+
+        pid_t second_child = fork();
+        if(second_child == 0) { // child
+            execvp(args[0], (char* const*)args);
+            perror(script_file_full);
+            _exit(127);
+        } else { // parent, or the second fork failed: exit so this child never returns to the caller
+            _exit(0);
+        }
+    } else { // parent
+        waitpid(pid, NULL, 0);
+    }
+}
+
+static double audio_codec_get_desired_delay(gsr_audio_codec audio_codec, int fps) {
+    const double fps_inv = 1.0 / (double)fps;
+    const double base = 0.01 + 1.0/165.0;
+    switch(audio_codec) {
+        case GSR_AUDIO_CODEC_OPUS:
+            return std::max(0.0, base - fps_inv);
+        case GSR_AUDIO_CODEC_AAC:
+            return std::max(0.0, (base + 0.008) * 2.0 - fps_inv);
+        case GSR_AUDIO_CODEC_FLAC:
+            // TODO: Test
+            return std::max(0.0, base - fps_inv);
+    }
+    assert(false);
+    return std::max(0.0, base - fps_inv);
+}
+
+struct AudioDeviceData {
     SoundDevice sound_device;
     AudioInput audio_input;
     AVFilterContext *src_filter_ctx = nullptr;
@@ -932,180 +1108,262 @@ struct AudioDevice {
 
 // TODO: Cleanup
 struct AudioTrack {
+    std::string name;
     AVCodecContext *codec_context = nullptr;
-    AVStream *stream = nullptr;
 
-    std::vector<AudioDevice> audio_devices;
+    std::vector<AudioDeviceData> audio_devices;
     AVFilterGraph *graph = nullptr;
     AVFilterContext *sink = nullptr;
     int stream_index = 0;
+    int64_t pts = 0;
 };
 
-static std::future<void> save_replay_thread;
-static std::vector<std::shared_ptr<PacketData>> save_replay_packets;
-static std::string save_replay_output_filepath;
+static bool add_hdr_metadata_to_video_stream(gsr_capture *cap, AVStream *video_stream) {
+    size_t light_metadata_size = 0;
+    size_t mastering_display_metadata_size = 0;
+    AVContentLightMetadata *light_metadata = av_content_light_metadata_alloc(&light_metadata_size);
+    #if LIBAVUTIL_VERSION_INT < AV_VERSION_INT(59, 37, 100)
+    AVMasteringDisplayMetadata *mastering_display_metadata = av_mastering_display_metadata_alloc();
+    mastering_display_metadata_size = sizeof(*mastering_display_metadata);
+    #else
+    AVMasteringDisplayMetadata *mastering_display_metadata = av_mastering_display_metadata_alloc_size(&mastering_display_metadata_size);
+    #endif
 
-static int create_directory_recursive(char *path) {
-    int path_len = strlen(path);
-    char *p = path;
-    char *end = path + path_len;
-    for(;;) {
-        char *slash_p = strchr(p, '/');
+    if(!light_metadata || !mastering_display_metadata) {
+        if(light_metadata)
+            av_freep(&light_metadata);
 
-        // Skips first '/', we don't want to try and create the root directory
-        if(slash_p == path) {
-            ++p;
-            continue;
-        }
+        if(mastering_display_metadata)
+            av_freep(&mastering_display_metadata);
 
-        if(!slash_p)
-            slash_p = end;
+        return false;
+    }
 
-        char prev_char = *slash_p;
-        *slash_p = '\0';
-        int err = mkdir(path, S_IRWXU);
-        *slash_p = prev_char;
+    if(!gsr_capture_set_hdr_metadata(cap, mastering_display_metadata, light_metadata)) {
+        av_freep(&light_metadata);
+        av_freep(&mastering_display_metadata);
+        return false;
+    }
 
-        if(err == -1 && errno != EEXIST)
-            return err;
+    // TODO: More error checking
 
-        if(slash_p == end)
-            break;
-        else
-            p = slash_p + 1;
-    }
-    return 0;
+    #if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(60, 31, 102)
+    const bool content_light_level_added = av_stream_add_side_data(video_stream, AV_PKT_DATA_CONTENT_LIGHT_LEVEL, (uint8_t*)light_metadata, light_metadata_size) == 0;
+    #else
+    const bool content_light_level_added = av_packet_side_data_add(&video_stream->codecpar->coded_side_data, &video_stream->codecpar->nb_coded_side_data, AV_PKT_DATA_CONTENT_LIGHT_LEVEL, light_metadata, light_metadata_size, 0) != NULL;
+    #endif
+
+    #if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(60, 31, 102)
+    const bool mastering_display_metadata_added = av_stream_add_side_data(video_stream, AV_PKT_DATA_MASTERING_DISPLAY_METADATA, (uint8_t*)mastering_display_metadata, mastering_display_metadata_size) == 0;
+    #else
+    const bool mastering_display_metadata_added = av_packet_side_data_add(&video_stream->codecpar->coded_side_data, &video_stream->codecpar->nb_coded_side_data, AV_PKT_DATA_MASTERING_DISPLAY_METADATA, mastering_display_metadata, mastering_display_metadata_size, 0) != NULL;
+    #endif
+
+    if(!content_light_level_added)
+        av_freep(&light_metadata);
+
+    if(!mastering_display_metadata_added)
+        av_freep(&mastering_display_metadata);
+
+    // Return true even on failure because we don't want to retry adding HDR metadata on failure
+    return true;
 }
 
-static void save_replay_async(AVCodecContext *video_codec_context, int video_stream_index, std::vector<AudioTrack> &audio_tracks, std::deque<std::shared_ptr<PacketData>> &frame_data_queue, bool frames_erased, std::string output_dir, const char *container_format, const std::string &file_extension, std::mutex &write_output_mutex, bool make_folders) {
-    if(save_replay_thread.valid())
-        return;
-    
-    size_t start_index = (size_t)-1;
-    int64_t video_pts_offset = 0;
-    int64_t audio_pts_offset = 0;
+struct RecordingStartAudio {
+    const AudioTrack *audio_track;
+    AVStream *stream;
+};
 
-    {
-        std::lock_guard<std::mutex> lock(write_output_mutex);
-        start_index = (size_t)-1;
-        for(size_t i = 0; i < frame_data_queue.size(); ++i) {
-            const AVPacket &av_packet = frame_data_queue[i]->data;
-            if((av_packet.flags & AV_PKT_FLAG_KEY) && av_packet.stream_index == video_stream_index) {
-                start_index = i;
-                break;
-            }
-        }
+struct RecordingStartResult {
+    AVFormatContext *av_format_context = nullptr;
+    AVStream *video_stream = nullptr;
+    std::vector<RecordingStartAudio> audio_inputs;
+};
 
-        if(start_index == (size_t)-1)
-            return;
+static RecordingStartResult start_recording_create_streams(const char *filename, const char *container_format, AVCodecContext *video_codec_context, const std::vector<AudioTrack> &audio_tracks, bool hdr, gsr_capture *capture) {
+    AVFormatContext *av_format_context;
+    avformat_alloc_output_context2(&av_format_context, nullptr, container_format, filename);
 
-        if(frames_erased) {
-            video_pts_offset = frame_data_queue[start_index]->data.pts;
-            
-            // Find the next audio packet to use as audio pts offset
-            for(size_t i = start_index; i < frame_data_queue.size(); ++i) {
-                const AVPacket &av_packet = frame_data_queue[i]->data;
-                if(av_packet.stream_index != video_stream_index) {
-                    audio_pts_offset = av_packet.pts;
-                    break;
-                }
-            }
-        } else {
-            start_index = 0;
-        }
+    AVStream *video_stream = create_stream(av_format_context, video_codec_context);
+    avcodec_parameters_from_context(video_stream->codecpar, video_codec_context);
 
-        save_replay_packets.resize(frame_data_queue.size());
-        for(size_t i = 0; i < frame_data_queue.size(); ++i) {
-            save_replay_packets[i] = frame_data_queue[i];
-        }
+    RecordingStartResult result;
+    result.audio_inputs.reserve(audio_tracks.size());
+
+    for(const AudioTrack &audio_track : audio_tracks) {
+        AVStream *audio_stream = create_stream(av_format_context, audio_track.codec_context);
+        if(!audio_track.name.empty())
+            av_dict_set(&audio_stream->metadata, "title", audio_track.name.c_str(), 0);
+        avcodec_parameters_from_context(audio_stream->codecpar, audio_track.codec_context);
+        result.audio_inputs.push_back({&audio_track, audio_stream});
+    }
+
+    const int open_ret = avio_open(&av_format_context->pb, filename, AVIO_FLAG_WRITE);
+    if(open_ret < 0) {
+        fprintf(stderr, "gsr error: start: could not open '%s': %s\n", filename, av_error_to_string(open_ret));
+        return result;
+    }
+
+    AVDictionary *options = nullptr;
+    av_dict_set(&options, "strict", "experimental", 0);
+
+    const int header_write_ret = avformat_write_header(av_format_context, &options);
+    av_dict_free(&options);
+    if(header_write_ret < 0) {
+        fprintf(stderr, "gsr error: start: error occurred when writing header to output file: %s\n", av_error_to_string(header_write_ret));
+        avio_close(av_format_context->pb);
+        avformat_free_context(av_format_context);
+        return result;
     }
 
-    if (make_folders) {
-        std::string output_folder = output_dir + '/' + get_date_only_str();
-        create_directory_recursive(&output_folder[0]);
-        save_replay_output_filepath = output_folder + "/Replay_" + get_time_only_str() + "." + file_extension;
+    if(hdr)
+        add_hdr_metadata_to_video_stream(capture, video_stream);
+
+    result.av_format_context = av_format_context;
+    result.video_stream = video_stream;
+    return result;
+}
+
+static bool stop_recording_close_streams(AVFormatContext *av_format_context) {
+    bool trailer_written = true;
+    if(av_write_trailer(av_format_context) != 0) {
+        fprintf(stderr, "gsr error: end: failed to write trailer\n");
+        trailer_written = false;
+    }
+
+    const bool closed = avio_close(av_format_context->pb) == 0;
+    avformat_free_context(av_format_context);
+    return trailer_written && closed;
+}
+
+static std::future<void> save_replay_thread;
+static std::string save_replay_output_filepath;
+
+static std::string create_new_recording_filepath_from_timestamp(std::string directory, const char *filename_prefix, const std::string &file_extension, bool date_folders) {
+    std::string output_filepath;
+    if(date_folders) {
+        std::string output_folder = directory + '/' + get_date_only_str();
+        if(create_directory_recursive(&output_folder[0]) != 0)
+            fprintf(stderr, "gsr error: failed to create directory: %s\n", output_folder.c_str());
+        output_filepath = output_folder + "/" + filename_prefix + "_" + get_time_only_str() + "." + file_extension;
     } else {
-        create_directory_recursive(&output_dir[0]);
-        save_replay_output_filepath = output_dir + "/Replay_" + get_date_str() + "." + file_extension;
+        if(create_directory_recursive(&directory[0]) != 0)
+            fprintf(stderr, "gsr error: failed to create directory: %s\n", directory.c_str());
+        output_filepath = directory + "/" + filename_prefix + "_" + get_date_str() + "." + file_extension;
     }
+    return output_filepath;
+}
 
-    save_replay_thread = std::async(std::launch::async, [video_stream_index, container_format, start_index, video_pts_offset, audio_pts_offset, video_codec_context, &audio_tracks]() mutable {
-        AVFormatContext *av_format_context;
-        avformat_alloc_output_context2(&av_format_context, nullptr, container_format, nullptr);
+static RecordingStartAudio* get_recording_start_item_by_stream_index(RecordingStartResult &result, int stream_index) {
+    for(auto &audio_input : result.audio_inputs) {
+        if(audio_input.stream->index == stream_index)
+            return &audio_input;
+    }
+    return nullptr;
+}
 
-        AVStream *video_stream = create_stream(av_format_context, video_codec_context);
-        avcodec_parameters_from_context(video_stream->codecpar, video_codec_context);
+static void save_replay_async(AVCodecContext *video_codec_context, int video_stream_index, const std::vector<AudioTrack> &audio_tracks, gsr_replay_buffer *replay_buffer, std::string output_dir, const char *container_format, const std::string &file_extension, bool date_folders, bool hdr, gsr_capture *capture, int current_save_replay_seconds) {
+    if(save_replay_thread.valid())
+        return;
 
-        std::unordered_map<int, AudioTrack*> stream_index_to_audio_track_map;
-        for(AudioTrack &audio_track : audio_tracks) {
-            stream_index_to_audio_track_map[audio_track.stream_index] = &audio_track;
-            AVStream *audio_stream = create_stream(av_format_context, audio_track.codec_context);
-            avcodec_parameters_from_context(audio_stream->codecpar, audio_track.codec_context);
-            audio_track.stream = audio_stream;
-        }
+    const gsr_replay_buffer_iterator search_start_iterator = current_save_replay_seconds == save_replay_seconds_full ? gsr_replay_buffer_iterator{0, 0} : gsr_replay_buffer_find_packet_index_by_time_passed(replay_buffer, current_save_replay_seconds);
+    const gsr_replay_buffer_iterator video_start_iterator = gsr_replay_buffer_find_keyframe(replay_buffer, search_start_iterator, video_stream_index, false);
+    if(video_start_iterator.packet_index == (size_t)-1) {
+        fprintf(stderr, "gsr error: failed to save replay: failed to find a video keyframe. perhaps replay was saved too fast, before anything has been recorded\n");
+        return;
+    }
 
-        int ret = avio_open(&av_format_context->pb, save_replay_output_filepath.c_str(), AVIO_FLAG_WRITE);
-        if (ret < 0) {
-            fprintf(stderr, "Error: Could not open '%s': %s. Make sure %s is an existing directory with write access\n", save_replay_output_filepath.c_str(), av_error_to_string(ret), save_replay_output_filepath.c_str());
-            return;
-        }
+    const gsr_replay_buffer_iterator audio_start_iterator = gsr_replay_buffer_find_keyframe(replay_buffer, video_start_iterator, video_stream_index, true);
+    // if(audio_start_iterator.packet_index == (size_t)-1) {
+    //     fprintf(stderr, "gsr error: failed to save replay: failed to find an audio keyframe. perhaps replay was saved too fast, before anything has been recorded\n");
+    //     return;
+    // }
 
-        AVDictionary *options = nullptr;
-        av_dict_set(&options, "strict", "experimental", 0);
+    const int64_t video_pts_offset = gsr_replay_buffer_iterator_get_packet(replay_buffer, video_start_iterator)->pts;
+    const int64_t audio_pts_offset = audio_start_iterator.packet_index == (size_t)-1 ? 0 : gsr_replay_buffer_iterator_get_packet(replay_buffer, audio_start_iterator)->pts;
 
-        ret = avformat_write_header(av_format_context, &options);
-        if (ret < 0) {
-            fprintf(stderr, "Error occurred when writing header to output file: %s\n", av_error_to_string(ret));
-            return;
-        }
+    gsr_replay_buffer *cloned_replay_buffer = gsr_replay_buffer_clone(replay_buffer);
+    if(!cloned_replay_buffer) {
+        // TODO: Return this error to mark the replay as failed
+        fprintf(stderr, "gsr error: failed to save replay: failed to clone replay buffer\n");
+        return;
+    }
+
+    std::string output_filepath = create_new_recording_filepath_from_timestamp(output_dir, "Replay", file_extension, date_folders);
+    RecordingStartResult recording_start_result = start_recording_create_streams(output_filepath.c_str(), container_format, video_codec_context, audio_tracks, hdr, capture);
+    if(!recording_start_result.av_format_context)
+        return;
+
+    save_replay_output_filepath = std::move(output_filepath);
+
+    save_replay_thread = std::async(std::launch::async, [video_stream_index, recording_start_result, video_start_iterator, video_pts_offset, audio_pts_offset, video_codec_context, cloned_replay_buffer]() mutable {
+        gsr_replay_buffer_iterator replay_iterator = video_start_iterator;
+        for(;;) {
+            AVPacket *replay_packet = gsr_replay_buffer_iterator_get_packet(cloned_replay_buffer, replay_iterator);
+            uint8_t *replay_packet_data = NULL;
+            if(replay_packet)
+                replay_packet_data = gsr_replay_buffer_iterator_get_packet_data(cloned_replay_buffer, replay_iterator);
+
+            if(!replay_packet) {
+                fprintf(stderr, "gsr error: save_replay_async: no replay packet\n");
+                break;
+            }
+
+            if(!replay_packet->data && !replay_packet_data) {
+                fprintf(stderr, "gsr error: save_replay_async: no replay packet data\n");
+                break;
+            }
 
-        for(size_t i = start_index; i < save_replay_packets.size(); ++i) {
             // TODO: Check if successful
             AVPacket av_packet;
             memset(&av_packet, 0, sizeof(av_packet));
-            //av_packet_from_data(av_packet, save_replay_packets[i]->data.data, save_replay_packets[i]->data.size);
-            av_packet.data = save_replay_packets[i]->data.data;
-            av_packet.size = save_replay_packets[i]->data.size;
-            av_packet.stream_index = save_replay_packets[i]->data.stream_index;
-            av_packet.pts = save_replay_packets[i]->data.pts;
-            av_packet.dts = save_replay_packets[i]->data.pts;
-            av_packet.flags = save_replay_packets[i]->data.flags;
-
-            AVStream *stream = video_stream;
+            //av_packet_from_data(av_packet, replay_packet->data, replay_packet->size);
+            av_packet.data = replay_packet->data ? replay_packet->data : replay_packet_data;
+            av_packet.size = replay_packet->size;
+            av_packet.stream_index = replay_packet->stream_index;
+            av_packet.pts = replay_packet->pts;
+            av_packet.dts = replay_packet->pts;
+            av_packet.flags = replay_packet->flags;
+            //av_packet.duration = replay_packet->duration;
+
+            AVStream *stream = recording_start_result.video_stream;
             AVCodecContext *codec_context = video_codec_context;
 
             if(av_packet.stream_index == video_stream_index) {
                 av_packet.pts -= video_pts_offset;
                 av_packet.dts -= video_pts_offset;
             } else {
-                AudioTrack *audio_track = stream_index_to_audio_track_map[av_packet.stream_index];
-                stream = audio_track->stream;
+                RecordingStartAudio *recording_start_audio = get_recording_start_item_by_stream_index(recording_start_result, av_packet.stream_index);
+                if(!recording_start_audio) {
+                    fprintf(stderr, "gsr error: save_replay_async: failed to find audio stream by index: %d\n", av_packet.stream_index);
+                    free(replay_packet_data);
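+                    if(gsr_replay_buffer_iterator_next(cloned_replay_buffer, &replay_iterator)) continue; else break; // advance first: a bare continue would retry this same packet forever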
+                }
+
+                const AudioTrack *audio_track = recording_start_audio->audio_track;
+                stream = recording_start_audio->stream;
                 codec_context = audio_track->codec_context;
 
                 av_packet.pts -= audio_pts_offset;
                 av_packet.dts -= audio_pts_offset;
             }
 
-            av_packet.stream_index = stream->index;
+            //av_packet.stream_index = stream->index;
             av_packet_rescale_ts(&av_packet, codec_context->time_base, stream->time_base);
 
-            ret = av_write_frame(av_format_context, &av_packet);
+            const int ret = av_write_frame(recording_start_result.av_format_context, &av_packet);
             if(ret < 0)
-                fprintf(stderr, "Error: Failed to write frame index %d to muxer, reason: %s (%d)\n", stream->index, av_error_to_string(ret), ret);
+                fprintf(stderr, "gsr error: Failed to write frame index %d to muxer, reason: %s (%d)\n", av_packet.stream_index, av_error_to_string(ret), ret);
+
+            free(replay_packet_data);
 
             //av_packet_free(&av_packet);
+            if(!gsr_replay_buffer_iterator_next(cloned_replay_buffer, &replay_iterator))
+                break;
         }
 
-        if (av_write_trailer(av_format_context) != 0)
-            fprintf(stderr, "Failed to write trailer\n");
-
-        avio_close(av_format_context->pb);
-        avformat_free_context(av_format_context);
-        av_dict_free(&options);
-
-        for(AudioTrack &audio_track : audio_tracks) {
-            audio_track.stream = nullptr;
-        }
+        stop_recording_close_streams(recording_start_result.av_format_context);
+        gsr_replay_buffer_destroy(cloned_replay_buffer);
     });
 }
 
@@ -1123,57 +1381,99 @@ static void split_string(const std::string &str, char delimiter, std::function<b
     }
 }
 
-static std::vector<AudioInput> parse_audio_input_arg(const char *str) {
-    std::vector<AudioInput> audio_inputs;
-    split_string(str, '|', [&audio_inputs](const char *sub, size_t size) {
+static bool string_starts_with(const std::string &str, const char *substr) {
+    int len = strlen(substr);
+    return (int)str.size() >= len && memcmp(str.data(), substr, len) == 0;
+}
+
+static bool string_ends_with(const char *str, const char *substr) {
+    int str_len = strlen(str);
+    int substr_len = strlen(substr);
+    return str_len >= substr_len && memcmp(str + str_len - substr_len, substr, substr_len) == 0;
+}
+
+static const AudioDevice* get_audio_device_by_name(const std::vector<AudioDevice> &audio_devices, const char *name) {
+    for(const auto &audio_device : audio_devices) {
+        if(strcmp(audio_device.name.c_str(), name) == 0)
+            return &audio_device;
+    }
+    return nullptr;
+}
+
+static MergedAudioInputs parse_audio_input_arg(const char *str) {
+    MergedAudioInputs result;
+
+    split_string(str, '|', [&](const char *sub, size_t size) {
         AudioInput audio_input;
         audio_input.name.assign(sub, size);
-        const size_t index = audio_input.name.find('/');
-        if(index != std::string::npos) {
-            audio_input.description = audio_input.name.substr(0, index);
-            audio_input.name.erase(audio_input.name.begin(), audio_input.name.begin() + index + 1);
+
+        if(string_starts_with(audio_input.name, "app:")) {
+            audio_input.name.erase(audio_input.name.begin(), audio_input.name.begin() + 4);
+            audio_input.type = AudioInputType::APPLICATION;
+            audio_input.inverted = false;
+            result.audio_inputs.push_back(std::move(audio_input));
+            return true;
+        } else if(string_starts_with(audio_input.name, "app-inverse:")) {
+            audio_input.name.erase(audio_input.name.begin(), audio_input.name.begin() + 12);
+            audio_input.type = AudioInputType::APPLICATION;
+            audio_input.inverted = true;
+            result.audio_inputs.push_back(std::move(audio_input));
+            return true;
+        } else if(string_starts_with(audio_input.name, "device:")) {
+            audio_input.name.erase(audio_input.name.begin(), audio_input.name.begin() + 7);
+            audio_input.type = AudioInputType::DEVICE;
+            result.audio_inputs.push_back(std::move(audio_input));
+            return true;
+        } else {
+            audio_input.type = AudioInputType::DEVICE;
+            result.audio_inputs.push_back(std::move(audio_input));
+            return true;
         }
-        audio_inputs.push_back(std::move(audio_input));
-        return true;
     });
-    return audio_inputs;
-}
 
-// TODO: Does this match all livestreaming cases?
-static bool is_livestream_path(const char *str) {
-    const int len = strlen(str);
-    if((len >= 7 && memcmp(str, "http://", 7) == 0) || (len >= 8 && memcmp(str, "https://", 8) == 0))
-        return true;
-    else if((len >= 7 && memcmp(str, "rtmp://", 7) == 0) || (len >= 8 && memcmp(str, "rtmps://", 8) == 0))
-        return true;
-    else
-        return false;
+    return result;
 }
 
-// TODO: Proper cleanup
-static int init_filter_graph(AVCodecContext *audio_codec_context, AVFilterGraph **graph, AVFilterContext **sink, std::vector<AVFilterContext*> &src_filter_ctx, size_t num_sources) {
+static int init_filter_graph(AVCodecContext* audio_codec_context, AVFilterGraph** graph, AVFilterContext** sink, std::vector<AVFilterContext*>& src_filter_ctx, size_t num_sources) {
     char ch_layout[64];
     int err = 0;
- 
-    AVFilterGraph *filter_graph = avfilter_graph_alloc();
+    ch_layout[0] = '\0';
+
+    // Declare every variable up front (C89 style) so that
+    // no "goto fail" jumps over an initialization
+    AVFilterGraph* filter_graph = nullptr;
+    AVFilterContext* mix_ctx = nullptr;
+
+    const AVFilter* mix_filter = nullptr;
+    const AVFilter* abuffersink = nullptr;
+    AVFilterContext* abuffersink_ctx = nullptr;
+    char args[512] = { 0 };
+#if LIBAVFILTER_VERSION_INT >= AV_VERSION_INT(7, 107, 100)
+    bool normalize = false;
+#endif
+
+    filter_graph = avfilter_graph_alloc();
     if (!filter_graph) {
         fprintf(stderr, "Unable to create filter graph.\n");
-        return AVERROR(ENOMEM);
+        err = AVERROR(ENOMEM);
+        goto fail;
     }
- 
+
     for(size_t i = 0; i < num_sources; ++i) {
         const AVFilter *abuffer = avfilter_get_by_name("abuffer");
         if (!abuffer) {
             fprintf(stderr, "Could not find the abuffer filter.\n");
-            return AVERROR_FILTER_NOT_FOUND;
+            err = AVERROR_FILTER_NOT_FOUND;
+            goto fail;
         }
-    
+
         AVFilterContext *abuffer_ctx = avfilter_graph_alloc_filter(filter_graph, abuffer, NULL);
         if (!abuffer_ctx) {
             fprintf(stderr, "Could not allocate the abuffer instance.\n");
-            return AVERROR(ENOMEM);
+            err = AVERROR(ENOMEM);
+            goto fail;
         }
-    
+
         #if LIBAVCODEC_VERSION_MAJOR < 60
         av_get_channel_layout_string(ch_layout, sizeof(ch_layout), 0, AV_CH_LAYOUT_STEREO);
         #else
@@ -1184,50 +1484,56 @@ static int init_filter_graph(AVCodecContext *audio_codec_context, AVFilterGraph
         av_opt_set_q  (abuffer_ctx, "time_base",      audio_codec_context->time_base,                          AV_OPT_SEARCH_CHILDREN);
         av_opt_set_int(abuffer_ctx, "sample_rate",    audio_codec_context->sample_rate,                        AV_OPT_SEARCH_CHILDREN);
         av_opt_set_int(abuffer_ctx, "bit_rate",       audio_codec_context->bit_rate,                           AV_OPT_SEARCH_CHILDREN);
-    
+
         err = avfilter_init_str(abuffer_ctx, NULL);
         if (err < 0) {
             fprintf(stderr, "Could not initialize the abuffer filter.\n");
-            return err;
+            goto fail;
         }
 
         src_filter_ctx.push_back(abuffer_ctx);
     }
 
-    const AVFilter *mix_filter = avfilter_get_by_name("amix");
+    mix_filter = avfilter_get_by_name("amix");
     if (!mix_filter) {
         av_log(NULL, AV_LOG_ERROR, "Could not find the mix filter.\n");
-        return AVERROR_FILTER_NOT_FOUND;
+        err = AVERROR_FILTER_NOT_FOUND;
+        goto fail;
     }
-    
-    char args[512];
+
+#if LIBAVFILTER_VERSION_INT >= AV_VERSION_INT(7, 107, 100)
+    snprintf(args, sizeof(args), "inputs=%d:normalize=%s", (int)num_sources, normalize ? "true" : "false");
+#else
     snprintf(args, sizeof(args), "inputs=%d", (int)num_sources);
-    
-    AVFilterContext *mix_ctx;
+    fprintf(stderr, "gsr warning: your ffmpeg version doesn't support disabling normalization of mixed audio. The volume might be lower than expected\n");
+#endif
+
     err = avfilter_graph_create_filter(&mix_ctx, mix_filter, "amix", args, NULL, filter_graph);
     if (err < 0) {
         av_log(NULL, AV_LOG_ERROR, "Cannot create audio amix filter\n");
-        return err;
+        goto fail;
     }
- 
-    const AVFilter *abuffersink = avfilter_get_by_name("abuffersink");
+
+    abuffersink = avfilter_get_by_name("abuffersink");
     if (!abuffersink) {
         fprintf(stderr, "Could not find the abuffersink filter.\n");
-        return AVERROR_FILTER_NOT_FOUND;
+        err = AVERROR_FILTER_NOT_FOUND;
+        goto fail;
     }
- 
-    AVFilterContext *abuffersink_ctx = avfilter_graph_alloc_filter(filter_graph, abuffersink, "sink");
+
+    abuffersink_ctx = avfilter_graph_alloc_filter(filter_graph, abuffersink, "sink");
     if (!abuffersink_ctx) {
         fprintf(stderr, "Could not allocate the abuffersink instance.\n");
-        return AVERROR(ENOMEM);
+        err = AVERROR(ENOMEM);
+        goto fail;
     }
- 
+
     err = avfilter_init_str(abuffersink_ctx, NULL);
     if (err < 0) {
         fprintf(stderr, "Could not initialize the abuffersink instance.\n");
-        return err;
+        goto fail;
     }
- 
+
     err = 0;
     for(size_t i = 0; i < src_filter_ctx.size(); ++i) {
         AVFilterContext *src_ctx = src_filter_ctx[i];
@@ -1238,19 +1544,90 @@ static int init_filter_graph(AVCodecContext *audio_codec_context, AVFilterGraph
         err = avfilter_link(mix_ctx, 0, abuffersink_ctx, 0);
     if (err < 0) {
         av_log(NULL, AV_LOG_ERROR, "Error connecting filters\n");
-        return err;
+        goto fail;
     }
- 
+
     err = avfilter_graph_config(filter_graph, NULL);
     if (err < 0) {
         av_log(NULL, AV_LOG_ERROR, "Error configuring the filter graph\n");
-        return err;
+        goto fail;
     }
- 
+
     *graph = filter_graph;
-    *sink  = abuffersink_ctx;
- 
+    *sink = abuffersink_ctx;
+
     return 0;
+
+fail:
+    avfilter_graph_free(&filter_graph);
+    src_filter_ctx.clear();  // the contexts were owned by the graph freed above; drop the dangling pointers
+    return err;
+}
+
+static gsr_video_encoder* create_video_encoder(gsr_egl *egl, const args_parser &arg_parser) {
+    const gsr_color_depth color_depth = video_codec_to_bit_depth(arg_parser.video_codec);
+    gsr_video_encoder *video_encoder = nullptr;
+
+    if(arg_parser.video_encoder == GSR_VIDEO_ENCODER_HW_CPU) {
+        gsr_video_encoder_software_params params;
+        params.egl = egl;
+        params.color_depth = color_depth;
+        video_encoder = gsr_video_encoder_software_create(&params);
+        return video_encoder;
+    }
+
+    if(video_codec_is_vulkan(arg_parser.video_codec)) {
+        gsr_video_encoder_vulkan_params params;
+        params.egl = egl;
+        params.color_depth = color_depth;
+        video_encoder = gsr_video_encoder_vulkan_create(&params);
+        return video_encoder;
+    }
+
+    switch(egl->gpu_info.vendor) {
+        case GSR_GPU_VENDOR_AMD:
+        case GSR_GPU_VENDOR_INTEL:
+        case GSR_GPU_VENDOR_BROADCOM: {
+            gsr_video_encoder_vaapi_params params;
+            params.egl = egl;
+            params.color_depth = color_depth;
+            video_encoder = gsr_video_encoder_vaapi_create(&params);
+            break;
+        }
+        case GSR_GPU_VENDOR_NVIDIA: {
+            gsr_video_encoder_nvenc_params params;
+            params.egl = egl;
+            params.overclock = arg_parser.overclock;
+            params.color_depth = color_depth;
+            video_encoder = gsr_video_encoder_nvenc_create(&params);
+            break;
+        }
+    }
+
+    return video_encoder;
+}
+
+static bool get_supported_video_codecs(gsr_egl *egl, gsr_video_codec video_codec, bool use_software_video_encoder, bool cleanup, gsr_supported_video_codecs *video_codecs) {
+    memset(video_codecs, 0, sizeof(*video_codecs));
+
+    if(use_software_video_encoder) {
+        video_codecs->h264.supported = true;
+        return true;
+    }
+
+    if(video_codec_is_vulkan(video_codec))
+        return gsr_get_supported_video_codecs_vulkan(video_codecs, egl->card_path, cleanup);
+
+    switch(egl->gpu_info.vendor) {
+        case GSR_GPU_VENDOR_AMD:
+        case GSR_GPU_VENDOR_INTEL:
+        case GSR_GPU_VENDOR_BROADCOM:
+            return gsr_get_supported_video_codecs_vaapi(video_codecs, egl->card_path, cleanup);
+        case GSR_GPU_VENDOR_NVIDIA:
+            return gsr_get_supported_video_codecs_nvenc(video_codecs, cleanup);
+    }
+
+    return false;
 }
 
 static void xwayland_check_callback(const gsr_monitor *monitor, void *userdata) {
@@ -1267,250 +1644,350 @@ static bool is_xwayland(Display *display) {
         return true;
 
     bool xwayland_found = false;
-    for_each_active_monitor_output(display, GSR_CONNECTION_X11, xwayland_check_callback, &xwayland_found);
+    for_each_active_monitor_output_x11_not_cached(display, xwayland_check_callback, &xwayland_found);
     return xwayland_found;
 }
 
-struct ReceivePacketData {
-    AVCodecContext *codec_context;
-    int stream_index;
-    AVStream *stream;
-    int64_t pts;
-};
+static bool is_using_prime_run() {
+    const char *prime_render_offload = getenv("__NV_PRIME_RENDER_OFFLOAD");
+    return (prime_render_offload && strcmp(prime_render_offload, "1") == 0) || getenv("DRI_PRIME");
+}
 
-int main(int argc, char **argv) {
-    signal(SIGINT, int_handler);
-    signal(SIGUSR1, save_replay_handler);
+static void disable_prime_run() {
+    unsetenv("__NV_PRIME_RENDER_OFFLOAD");
+    unsetenv("__NV_PRIME_RENDER_OFFLOAD_PROVIDER");
+    unsetenv("__GLX_VENDOR_LIBRARY_NAME");
+    unsetenv("__VK_LAYER_NV_optimus");
+    unsetenv("DRI_PRIME");
+}
+
+static gsr_window* gsr_window_create(Display *display, bool wayland) {
+    if(wayland)
+        return gsr_window_wayland_create();
+    else
+        return gsr_window_x11_create(display);
+}
 
-    if(argc <= 1)
-        usage_full();
+static void list_system_info(bool wayland) {
+    printf("display_server|%s\n", wayland ? "wayland" : "x11");
+    bool supports_app_audio = false;
+#ifdef GSR_APP_AUDIO
+    supports_app_audio = pulseaudio_server_is_pipewire();
+    if(supports_app_audio) {
+        gsr_pipewire_audio audio;
+        if(gsr_pipewire_audio_init(&audio))
+            gsr_pipewire_audio_deinit(&audio);
+        else
+            supports_app_audio = false;
+    }
+#endif
+    printf("supports_app_audio|%s\n", supports_app_audio ? "yes" : "no");
+}
 
-    if(argc == 2 && (strcmp(argv[1], "-h") == 0 || strcmp(argv[1], "--help") == 0))
-        usage_full();
+static void list_gpu_info(gsr_egl *egl) {
+    switch(egl->gpu_info.vendor) {
+        case GSR_GPU_VENDOR_AMD:
+            printf("vendor|amd\n");
+            break;
+        case GSR_GPU_VENDOR_INTEL:
+            printf("vendor|intel\n");
+            break;
+        case GSR_GPU_VENDOR_NVIDIA:
+            printf("vendor|nvidia\n");
+            break;
+        case GSR_GPU_VENDOR_BROADCOM:
+            printf("vendor|broadcom\n");
+            break;
+    }
+    printf("card_path|%s\n", egl->card_path);
+}
 
-    //av_log_set_level(AV_LOG_TRACE);
+static const AVCodec* get_ffmpeg_video_codec(gsr_video_codec video_codec, gsr_gpu_vendor vendor) {
+    switch(video_codec) {
+        case GSR_VIDEO_CODEC_H264:
+            return avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "h264_nvenc" : "h264_vaapi");
+        case GSR_VIDEO_CODEC_HEVC:
+        case GSR_VIDEO_CODEC_HEVC_HDR:
+        case GSR_VIDEO_CODEC_HEVC_10BIT:
+            return avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "hevc_nvenc" : "hevc_vaapi");
+        case GSR_VIDEO_CODEC_AV1:
+        case GSR_VIDEO_CODEC_AV1_HDR:
+        case GSR_VIDEO_CODEC_AV1_10BIT:
+            return avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "av1_nvenc" : "av1_vaapi");
+        case GSR_VIDEO_CODEC_VP8:
+            return avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "vp8_nvenc" : "vp8_vaapi");
+        case GSR_VIDEO_CODEC_VP9:
+            return avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "vp9_nvenc" : "vp9_vaapi");
+        case GSR_VIDEO_CODEC_H264_VULKAN:
+            return avcodec_find_encoder_by_name("h264_vulkan");
+        case GSR_VIDEO_CODEC_HEVC_VULKAN:
+            return avcodec_find_encoder_by_name("hevc_vulkan");
+    }
+    return nullptr;
+}
 
-    std::map<std::string, Arg> args = {
-        { "-w", Arg { {}, false, false } },
-        { "-c", Arg { {}, true, false } },
-        { "-f", Arg { {}, false, false } },
-        { "-s", Arg { {}, true, false } },
-        { "-a", Arg { {}, true, true } },
-        { "-q", Arg { {}, true, false } },
-        { "-o", Arg { {}, true, false } },
-        { "-r", Arg { {}, true, false } },
-        { "-k", Arg { {}, true, false } },
-        { "-ac", Arg { {}, true, false } },
-        { "-oc", Arg { {}, true, false } },
-        { "-fm", Arg { {}, true, false } },
-        { "-pixfmt", Arg { {}, true, false } },
-        { "-v", Arg { {}, true, false } },
-        { "-mf", Arg { {}, true, false } },
-    };
-
-    for(int i = 1; i < argc; i += 2) {
-        auto it = args.find(argv[i]);
-        if(it == args.end()) {
-            fprintf(stderr, "Invalid argument '%s'\n", argv[i]);
-            usage();
-        }
-
-        if(!it->second.values.empty() && !it->second.list) {
-            fprintf(stderr, "Expected argument '%s' to only be specified once\n", argv[i]);
-            usage();
-        }
-
-        if(i + 1 >= argc) {
-            fprintf(stderr, "Missing value for argument '%s'\n", argv[i]);
-            usage();
-        }
-
-        it->second.values.push_back(argv[i + 1]);
-    }
-
-    for(auto &it : args) {
-        if(!it.second.optional && !it.second.value()) {
-            fprintf(stderr, "Missing argument '%s'\n", it.first.c_str());
-            usage();
-        }
-    }
-
-    VideoCodec video_codec = VideoCodec::H265;
-    const char *video_codec_to_use = args["-k"].value();
-    if(!video_codec_to_use)
-        video_codec_to_use = "auto";
-
-    if(strcmp(video_codec_to_use, "h264") == 0) {
-        video_codec = VideoCodec::H264;
-    } else if(strcmp(video_codec_to_use, "h265") == 0) {
-        video_codec = VideoCodec::H265;
-    } else if(strcmp(video_codec_to_use, "av1") == 0) {
-        video_codec = VideoCodec::AV1;
-    } else if(strcmp(video_codec_to_use, "auto") != 0) {
-        fprintf(stderr, "Error: -k should either be either 'auto', 'h264', 'h265' or 'av1', got: '%s'\n", video_codec_to_use);
-        usage();
-    }
-
-    AudioCodec audio_codec = AudioCodec::OPUS;
-    const char *audio_codec_to_use = args["-ac"].value();
-    if(!audio_codec_to_use)
-        audio_codec_to_use = "aac";
-
-    if(strcmp(audio_codec_to_use, "aac") == 0) {
-        audio_codec = AudioCodec::AAC;
-    } else if(strcmp(audio_codec_to_use, "opus") == 0) {
-        audio_codec = AudioCodec::OPUS;
-    } else if(strcmp(audio_codec_to_use, "flac") == 0) {
-        audio_codec = AudioCodec::FLAC;
-    } else {
-        fprintf(stderr, "Error: -ac should either be either 'aac', 'opus' or 'flac', got: '%s'\n", audio_codec_to_use);
-        usage();
+static void set_supported_video_codecs_ffmpeg(gsr_supported_video_codecs *supported_video_codecs, gsr_supported_video_codecs *supported_video_codecs_vulkan, gsr_gpu_vendor vendor) {
+    if(!get_ffmpeg_video_codec(GSR_VIDEO_CODEC_H264, vendor)) {
+        supported_video_codecs->h264.supported = false;
     }
 
-    if(audio_codec != AudioCodec::AAC) {
-        audio_codec_to_use = "aac";
-        audio_codec = AudioCodec::AAC;
-        fprintf(stderr, "Info: audio codec is forcefully set to aac at the moment because of issues with opus/flac. This is a temporary issue\n");
+    if(!get_ffmpeg_video_codec(GSR_VIDEO_CODEC_HEVC, vendor)) {
+        supported_video_codecs->hevc.supported = false;
+        supported_video_codecs->hevc_hdr.supported = false;
+        supported_video_codecs->hevc_10bit.supported = false;
     }
 
-    bool overclock = false;
-    const char *overclock_str = args["-oc"].value();
-    if(!overclock_str)
-        overclock_str = "no";
+    if(!get_ffmpeg_video_codec(GSR_VIDEO_CODEC_AV1, vendor)) {
+        supported_video_codecs->av1.supported = false;
+        supported_video_codecs->av1_hdr.supported = false;
+        supported_video_codecs->av1_10bit.supported = false;
+    }
 
-    if(strcmp(overclock_str, "yes") == 0) {
-        overclock = true;
-    } else if(strcmp(overclock_str, "no") == 0) {
-        overclock = false;
-    } else {
-        fprintf(stderr, "Error: -oc should either be either 'yes' or 'no', got: '%s'\n", overclock_str);
-        usage();
+    if(!get_ffmpeg_video_codec(GSR_VIDEO_CODEC_VP8, vendor)) {
+        supported_video_codecs->vp8.supported = false;
     }
 
-    bool verbose = true;
-    const char *verbose_str = args["-v"].value();
-    if(!verbose_str)
-        verbose_str = "yes";
+    if(!get_ffmpeg_video_codec(GSR_VIDEO_CODEC_VP9, vendor)) {
+        supported_video_codecs->vp9.supported = false;
+    }
 
-    if(strcmp(verbose_str, "yes") == 0) {
-        verbose = true;
-    } else if(strcmp(verbose_str, "no") == 0) {
-        verbose = false;
-    } else {
-        fprintf(stderr, "Error: -v should either be either 'yes' or 'no', got: '%s'\n", verbose_str);
-        usage();
+    if(!get_ffmpeg_video_codec(GSR_VIDEO_CODEC_H264_VULKAN, vendor)) {
+        supported_video_codecs_vulkan->h264.supported = false;
     }
 
-    bool make_folders = false;
-    const char *make_folders_str = args["-mf"].value();
-    if(!make_folders_str)
-        make_folders_str = "no";
+    if(!get_ffmpeg_video_codec(GSR_VIDEO_CODEC_HEVC_VULKAN, vendor)) {
+        supported_video_codecs_vulkan->hevc.supported = false;
+        supported_video_codecs_vulkan->hevc_hdr.supported = false;
+        supported_video_codecs_vulkan->hevc_10bit.supported = false;
+    }
+}
 
-    if(strcmp(make_folders_str, "yes") == 0) {
-        make_folders = true;
-    } else if(strcmp(make_folders_str, "no") == 0) {
-        make_folders = false;
+static void list_supported_video_codecs(gsr_egl *egl, bool wayland) {
+    // Intentionally not cleaned up, to speed up shutdown
+    gsr_supported_video_codecs supported_video_codecs;
+    get_supported_video_codecs(egl, GSR_VIDEO_CODEC_H264, false, false, &supported_video_codecs);
+
+    gsr_supported_video_codecs supported_video_codecs_vulkan;
+    get_supported_video_codecs(egl, GSR_VIDEO_CODEC_H264_VULKAN, false, false, &supported_video_codecs_vulkan);
+
+    set_supported_video_codecs_ffmpeg(&supported_video_codecs, &supported_video_codecs_vulkan, egl->gpu_info.vendor);
+
+    if(supported_video_codecs.h264.supported)
+        puts("h264");
+    if(avcodec_find_encoder_by_name("libx264"))
+        puts("h264_software");
+    if(supported_video_codecs.hevc.supported)
+        puts("hevc");
+    if(supported_video_codecs.hevc_hdr.supported && wayland)
+        puts("hevc_hdr");
+    if(supported_video_codecs.hevc_10bit.supported)
+        puts("hevc_10bit");
+    if(supported_video_codecs.av1.supported)
+        puts("av1");
+    if(supported_video_codecs.av1_hdr.supported && wayland)
+        puts("av1_hdr");
+    if(supported_video_codecs.av1_10bit.supported)
+        puts("av1_10bit");
+    if(supported_video_codecs.vp8.supported)
+        puts("vp8");
+    if(supported_video_codecs.vp9.supported)
+        puts("vp9");
+    //if(supported_video_codecs_vulkan.h264.supported)
+    //    puts("h264_vulkan");
+    //if(supported_video_codecs_vulkan.hevc.supported)
+    //    puts("hevc_vulkan"); // TODO: hdr, 10 bit
+}
+
+static bool monitor_capture_use_drm(const gsr_window *window, gsr_gpu_vendor vendor) {
+    return gsr_window_get_display_server(window) == GSR_DISPLAY_SERVER_WAYLAND || vendor != GSR_GPU_VENDOR_NVIDIA;
+}
+
+typedef struct {
+    const gsr_window *window;
+    int num_monitors;
+} capture_options_callback;
+
+static void output_monitor_info(const gsr_monitor *monitor, void *userdata) {
+    capture_options_callback *options = (capture_options_callback*)userdata;
+    if(gsr_window_get_display_server(options->window) == GSR_DISPLAY_SERVER_WAYLAND) {
+        vec2i monitor_size = monitor->size;
+        gsr_monitor_rotation monitor_rotation = GSR_MONITOR_ROT_0;
+        vec2i monitor_position = {0, 0};
+        drm_monitor_get_display_server_data(options->window, monitor, &monitor_rotation, &monitor_position);
+        if(monitor_rotation == GSR_MONITOR_ROT_90 || monitor_rotation == GSR_MONITOR_ROT_270)
+            std::swap(monitor_size.x, monitor_size.y);
+        printf("%.*s|%dx%d\n", monitor->name_len, monitor->name, monitor_size.x, monitor_size.y);
     } else {
-        fprintf(stderr, "Error: -mf should either be either 'yes' or 'no', got: '%s'\n", make_folders_str);
-        usage();
+        printf("%.*s|%dx%d\n", monitor->name_len, monitor->name, monitor->size.x, monitor->size.y);
     }
+    ++options->num_monitors;
+}
 
-    PixelFormat pixel_format = PixelFormat::YUV420;
-    const char *pixfmt = args["-pixfmt"].value();
-    if(!pixfmt)
-        pixfmt = "yuv420";
+static void list_supported_capture_options(const gsr_window *window, const char *card_path, bool list_monitors) {
+    const bool wayland = gsr_window_get_display_server(window) == GSR_DISPLAY_SERVER_WAYLAND;
+    if(!wayland) {
+        puts("window");
+        puts("focused");
+    }
 
-    if(strcmp(pixfmt, "yuv420") == 0) {
-        pixel_format = PixelFormat::YUV420;
-    } else if(strcmp(pixfmt, "yuv444") == 0) {
-        pixel_format = PixelFormat::YUV444;
-    } else {
-        fprintf(stderr, "Error: -pixfmt should either be either 'yuv420', or 'yuv444', got: '%s'\n", pixfmt);
-        usage();
+    capture_options_callback options;
+    options.window = window;
+    options.num_monitors = 0;
+    if(list_monitors) {
+        const bool is_x11 = gsr_window_get_display_server(window) == GSR_DISPLAY_SERVER_X11;
+        const gsr_connection_type connection_type = is_x11 ? GSR_CONNECTION_X11 : GSR_CONNECTION_DRM;
+        for_each_active_monitor_output(window, card_path, connection_type, output_monitor_info, &options);
     }
 
-    const Arg &audio_input_arg = args["-a"];
-    std::vector<AudioInput> audio_inputs;
-    if(!audio_input_arg.values.empty())
-        audio_inputs = get_pulseaudio_inputs();
-    std::vector<MergedAudioInputs> requested_audio_inputs;
+    if(options.num_monitors > 0)
+        puts("region");
 
-    // Manually check if the audio inputs we give exist. This is only needed for pipewire, not pulseaudio.
-    // Pipewire instead DEFAULTS TO THE DEFAULT AUDIO INPUT. THAT'S RETARDED.
-    // OH, YOU MISSPELLED THE AUDIO INPUT? FUCK YOU
-    for(const char *audio_input : audio_input_arg.values) {
-        if(!audio_input || audio_input[0] == '\0')
-            continue;
+#ifdef GSR_PORTAL
+    // Desktop portal capture on x11 doesn't seem to be hardware accelerated
+    if(!wayland)
+        return;
 
-        requested_audio_inputs.push_back({parse_audio_input_arg(audio_input)});
-        for(AudioInput &request_audio_input : requested_audio_inputs.back().audio_inputs) {
-            bool match = false;
-            for(const auto &existing_audio_input : audio_inputs) {
-                if(strcmp(request_audio_input.name.c_str(), existing_audio_input.name.c_str()) == 0) {
-                    if(request_audio_input.description.empty())
-                        request_audio_input.description = "gsr-" + existing_audio_input.description;
+    gsr_dbus_client dbus_client;
+    if(!gsr_dbus_client_init(&dbus_client, NULL))
+        return;
 
-                    match = true;
-                    break;
-                }
-            }
+    char session_handle[128];
+    if(gsr_dbus_client_screencast_create_session(&dbus_client, session_handle, sizeof(session_handle)) == 0)
+        puts("portal");
 
-            if(!match) {
-                fprintf(stderr, "Error: Audio input device '%s' is not a valid audio device, expected one of:\n", request_audio_input.name.c_str());
-                for(const auto &existing_audio_input : audio_inputs) {
-                    fprintf(stderr, "    %s\n", existing_audio_input.name.c_str());
-                }
-                _exit(2);
-            }
-        }
+    gsr_dbus_client_deinit(&dbus_client);
+#endif
+}
+
+static void version_command(void *userdata) {
+    (void)userdata;
+    puts(GSR_VERSION);
+    fflush(stdout);
+    _exit(0);
+}
+
+static void info_command(void *userdata) {
+    (void)userdata;
+    bool wayland = false;
+    Display *dpy = XOpenDisplay(nullptr);
+    if (!dpy) {
+        wayland = true;
+        fprintf(stderr, "gsr warning: failed to connect to the X server. Assuming wayland is running without Xwayland\n");
     }
 
-    const char *container_format = args["-c"].value();
-    if(container_format && strcmp(container_format, "mkv") == 0)
-        container_format = "matroska";
+    XSetErrorHandler(x11_error_handler);
+    XSetIOErrorHandler(x11_io_error_handler);
+
+    if(!wayland)
+        wayland = is_xwayland(dpy);
 
-    int fps = atoi(args["-f"].value());
-    if(fps == 0) {
-        fprintf(stderr, "Invalid fps argument: %s\n", args["-f"].value());
+    if(!wayland && is_using_prime_run()) {
+        // Disable prime-run and similar options as they don't work; the monitor to capture has to be run on the same device.
+        // This is fine on wayland since nvidia uses drm interface there and the monitor query checks the monitors connected
+        // to the drm device.
+        fprintf(stderr, "gsr warning: use of prime-run on X11 is not supported. Disabling prime-run\n");
+        disable_prime_run();
+    }
+
+    gsr_window *window = gsr_window_create(dpy, wayland);
+    if(!window) {
+        fprintf(stderr, "gsr error: failed to create window\n");
         _exit(1);
     }
-    if(fps < 1)
-        fps = 1;
-
-    const char *quality_str = args["-q"].value();
-    if(!quality_str)
-        quality_str = "very_high";
-
-    VideoQuality quality;
-    if(strcmp(quality_str, "medium") == 0) {
-        quality = VideoQuality::MEDIUM;
-    } else if(strcmp(quality_str, "high") == 0) {
-        quality = VideoQuality::HIGH;
-    } else if(strcmp(quality_str, "very_high") == 0) {
-        quality = VideoQuality::VERY_HIGH;
-    } else if(strcmp(quality_str, "ultra") == 0) {
-        quality = VideoQuality::ULTRA;
-    } else {
-        fprintf(stderr, "Error: -q should either be either 'medium', 'high', 'very_high' or 'ultra', got: '%s'\n", quality_str);
-        usage();
+
+    gsr_egl egl;
+    if(!gsr_egl_load(&egl, window, false, false)) {
+        fprintf(stderr, "gsr error: failed to load opengl\n");
+        _exit(22);
     }
 
-    int replay_buffer_size_secs = -1;
-    const char *replay_buffer_size_secs_str = args["-r"].value();
-    if(replay_buffer_size_secs_str) {
-        replay_buffer_size_secs = atoi(replay_buffer_size_secs_str);
-        if(replay_buffer_size_secs < 5 || replay_buffer_size_secs > 1200) {
-            fprintf(stderr, "Error: option -r has to be between 5 and 1200, was: %s\n", replay_buffer_size_secs_str);
-            _exit(1);
+    bool list_monitors = true;
+    egl.card_path[0] = '\0';
+    if(monitor_capture_use_drm(window, egl.gpu_info.vendor)) {
+        // TODO: Allow specifying another card, and in other places
+        if(!gsr_get_valid_card_path(&egl, egl.card_path, true)) {
+            fprintf(stderr, "gsr error: no /dev/dri/cardX device found. Make sure that you have at least one monitor connected\n");
+            list_monitors = false;
         }
-        replay_buffer_size_secs += 3; // Add a few seconds to account of lost packets because of non-keyframe packets skipped
     }
 
+    av_log_set_level(AV_LOG_FATAL);
+
+    puts("section=system_info");
+    list_system_info(wayland);
+    if(egl.gpu_info.is_steam_deck)
+        puts("is_steam_deck|yes");
+    else
+        puts("is_steam_deck|no");
+    printf("gsr_version|%s\n", GSR_VERSION);
+    puts("section=gpu_info");
+    list_gpu_info(&egl);
+    puts("section=video_codecs");
+    list_supported_video_codecs(&egl, wayland);
+    puts("section=image_formats");
+    puts("jpeg");
+    puts("png");
+    puts("section=capture_options");
+    list_supported_capture_options(window, egl.card_path, list_monitors);
+
+    fflush(stdout);
+
+    // Not needed as this will just slow down shutdown
+    //gsr_egl_unload(&egl);
+    //gsr_window_destroy(&window);
+    //if(dpy)
+    //    XCloseDisplay(dpy);
+
+    _exit(0);
+}
+
+static void list_audio_devices_command(void *userdata) {
+    (void)userdata;
+    const AudioDevices audio_devices = get_pulseaudio_inputs();
+
+    if(!audio_devices.default_output.empty())
+        puts("default_output|Default output");
+
+    if(!audio_devices.default_input.empty())
+        puts("default_input|Default input");
+
+    for(const auto &audio_input : audio_devices.audio_inputs) {
+        printf("%s|%s\n", audio_input.name.c_str(), audio_input.description.c_str());
+    }
+
+    fflush(stdout);
+    _exit(0);
+}
+
+static bool app_audio_query_callback(const char *app_name, void*) {
+    puts(app_name);
+    return true;
+}
+
+static void list_application_audio_command(void *userdata) {
+    (void)userdata;
+#ifdef GSR_APP_AUDIO
+    if(pulseaudio_server_is_pipewire()) {
+        gsr_pipewire_audio audio;
+        if(gsr_pipewire_audio_init(&audio)) {
+            gsr_pipewire_audio_for_each_app(&audio, app_audio_query_callback, NULL);
+            gsr_pipewire_audio_deinit(&audio);
+        }
+    }
+#endif
+
+    fflush(stdout);
+    _exit(0);
+}
+
+// |card_path| can be NULL, in which case EGL is loaded and a valid card path is looked up automatically
+static void list_capture_options_command(const char *card_path, void *userdata) {
+    (void)userdata;
     bool wayland = false;
     Display *dpy = XOpenDisplay(nullptr);
     if (!dpy) {
         wayland = true;
-        fprintf(stderr, "Warning: failed to connect to the X server. Assuming wayland is running without Xwayland\n");
+        fprintf(stderr, "gsr warning: failed to connect to the X server. Assuming wayland is running without Xwayland\n");
     }
 
     XSetErrorHandler(x11_error_handler);
@@ -1519,468 +1996,1253 @@ int main(int argc, char **argv) {
     if(!wayland)
         wayland = is_xwayland(dpy);
 
-    gsr_egl egl;
-    if(!gsr_egl_load(&egl, dpy, wayland)) {
-        fprintf(stderr, "gsr error: failed to load opengl\n");
+    if(!wayland && is_using_prime_run()) {
+        // Disable prime-run and similar options as they don't work; the monitor to capture has to be run on the same device.
+        // This is fine on wayland since nvidia uses drm interface there and the monitor query checks the monitors connected
+        // to the drm device.
+        fprintf(stderr, "gsr warning: use of prime-run on X11 is not supported. Disabling prime-run\n");
+        disable_prime_run();
+    }
+
+    gsr_window *window = gsr_window_create(dpy, wayland);
+    if(!window) {
+        fprintf(stderr, "gsr error: failed to create window\n");
         _exit(1);
     }
 
-    gsr_gpu_info gpu_inf;
-    bool very_old_gpu = false;
-    if(!gl_get_gpu_info(&egl, &gpu_inf))
-        _exit(2);
+    if(card_path) {
+        list_supported_capture_options(window, card_path, true);
+    } else {
+        gsr_egl egl;
+        if(!gsr_egl_load(&egl, window, false, false)) {
+            fprintf(stderr, "gsr error: failed to load opengl\n");
+            _exit(1);
+        }
 
-    if(gpu_inf.vendor == GSR_GPU_VENDOR_NVIDIA && gpu_inf.gpu_version != 0 && gpu_inf.gpu_version < 900) {
-        fprintf(stderr, "Info: your gpu appears to be very old (older than maxwell architecture). Switching to lower preset\n");
-        very_old_gpu = true;
+        bool list_monitors = true;
+        egl.card_path[0] = '\0';
+        if(monitor_capture_use_drm(window, egl.gpu_info.vendor)) {
+            // TODO: Allow specifying another card, and in other places
+            if(!gsr_get_valid_card_path(&egl, egl.card_path, true)) {
+                fprintf(stderr, "gsr error: no /dev/dri/cardX device found. Make sure that you have at least one monitor connected\n");
+                list_monitors = false;
+            }
+        }
+        list_supported_capture_options(window, egl.card_path, list_monitors);
     }
 
-    if(gpu_inf.vendor != GSR_GPU_VENDOR_NVIDIA && overclock) {
-        fprintf(stderr, "Info: overclock option has no effect on amd/intel, ignoring option\n");
-    }
+    fflush(stdout);
 
-    if(gpu_inf.vendor == GSR_GPU_VENDOR_NVIDIA && overclock && wayland) {
-        fprintf(stderr, "Info: overclocking is not possible on nvidia on wayland, ignoring option\n");
-    }
+    // Not needed as this will just slow down shutdown
+    //gsr_egl_unload(&egl);
+    //gsr_window_destroy(&window);
+    //if(dpy)
+    //    XCloseDisplay(dpy);
 
-    char card_path[128];
-    card_path[0] = '\0';
-    if(wayland || gpu_inf.vendor != GSR_GPU_VENDOR_NVIDIA) {
-        // TODO: Allow specifying another card, and in other places
-        if(!gsr_get_valid_card_path(card_path)) {
-            fprintf(stderr, "Error: no /dev/dri/cardX device found\n");
-            _exit(2);
+    _exit(0);
+}
+
+static std::string validate_monitor_get_valid(const gsr_egl *egl, const char* window) {
+    const bool is_x11 = gsr_window_get_display_server(egl->window) == GSR_DISPLAY_SERVER_X11;
+    const gsr_connection_type connection_type = is_x11 ? GSR_CONNECTION_X11 : GSR_CONNECTION_DRM;
+    const bool capture_use_drm = monitor_capture_use_drm(egl->window, egl->gpu_info.vendor);
+
+    std::string window_result = window;
+    if(strcmp(window_result.c_str(), "screen") == 0) {
+        FirstOutputCallback data;
+        data.output_name = NULL;
+        for_each_active_monitor_output(egl->window, egl->card_path, connection_type, get_first_output_callback, &data);
+
+        if(data.output_name) {
+            window_result = data.output_name;
+            free(data.output_name);
+        } else {
+            fprintf(stderr, "gsr error: no usable output found\n");
+            _exit(51);
+        }
+    } else if(capture_use_drm || (strcmp(window_result.c_str(), "screen-direct") != 0 && strcmp(window_result.c_str(), "screen-direct-force") != 0)) {
+        gsr_monitor gmon;
+        if(!get_monitor_by_name(egl, connection_type, window_result.c_str(), &gmon)) {
+            fprintf(stderr, "gsr error: display \"%s\" not found, expected one of:\n", window_result.c_str());
+            fprintf(stderr, "  \"screen\"\n");
+            if(!capture_use_drm)
+                fprintf(stderr, "  \"screen-direct\"\n");
+
+            MonitorOutputCallbackUserdata userdata;
+            userdata.window = egl->window;
+            for_each_active_monitor_output(egl->window, egl->card_path, connection_type, monitor_output_callback_print, &userdata);
+            _exit(51);
         }
     }
+    return window_result;
+}
 
-    // TODO: Fix constant framerate not working properly on amd/intel because capture framerate gets locked to the same framerate as
-    // game framerate, which doesn't work well when you need to encode multiple duplicate frames (AMD/Intel is slow at encoding!).
-    // It also appears to skip audio frames on nvidia wayland? why? that should be fine, but it causes video stuttering because of audio/video sync.
-    FramerateMode framerate_mode;
-    const char *framerate_mode_str = args["-fm"].value();
-    if(!framerate_mode_str)
-        framerate_mode_str = (gpu_inf.vendor == GSR_GPU_VENDOR_NVIDIA && !wayland) ? "cfr" : "vfr";
+static std::string get_monitor_by_region_center(const gsr_egl *egl, vec2i region_position, vec2i region_size, vec2i *monitor_pos, vec2i *monitor_size) {
+    const bool is_x11 = gsr_window_get_display_server(egl->window) == GSR_DISPLAY_SERVER_X11;
+    const gsr_connection_type connection_type = is_x11 ? GSR_CONNECTION_X11 : GSR_CONNECTION_DRM;
+
+    MonitorByPositionCallback data;
+    data.window = egl->window;
+    data.position = { region_position.x + region_size.x / 2, region_position.y + region_size.y / 2 };
+    data.output_name = NULL;
+    data.monitor_pos = {0, 0};
+    data.monitor_size = {0, 0};
+    for_each_active_monitor_output(egl->window, egl->card_path, connection_type, get_monitor_by_position_callback, &data);
+
+    std::string result;
+    if(data.output_name) {
+        result = data.output_name;
+        free(data.output_name);
+    }
+    *monitor_pos = data.monitor_pos;
+    *monitor_size = data.monitor_size;
+    return result;
+}
 
-    if(strcmp(framerate_mode_str, "cfr") == 0) {
-        framerate_mode = FramerateMode::CONSTANT;
-    } else if(strcmp(framerate_mode_str, "vfr") == 0) {
-        framerate_mode = FramerateMode::VARIABLE;
+static gsr_capture* create_monitor_capture(const args_parser &arg_parser, gsr_egl *egl, bool prefer_ximage) {
+    if(gsr_window_get_display_server(egl->window) == GSR_DISPLAY_SERVER_X11 && prefer_ximage) {
+        gsr_capture_ximage_params ximage_params;
+        ximage_params.egl = egl;
+        ximage_params.display_to_capture = arg_parser.window;
+        ximage_params.record_cursor = arg_parser.record_cursor;
+        ximage_params.output_resolution = arg_parser.output_resolution;
+        ximage_params.region_size = arg_parser.region_size;
+        ximage_params.region_position = arg_parser.region_position;
+        return gsr_capture_ximage_create(&ximage_params);
+    }
+
+    if(monitor_capture_use_drm(egl->window, egl->gpu_info.vendor)) {
+        gsr_capture_kms_params kms_params;
+        kms_params.egl = egl;
+        kms_params.display_to_capture = arg_parser.window;
+        kms_params.record_cursor = arg_parser.record_cursor;
+        kms_params.hdr = video_codec_is_hdr(arg_parser.video_codec);
+        kms_params.fps = arg_parser.fps;
+        kms_params.output_resolution = arg_parser.output_resolution;
+        kms_params.region_size = arg_parser.region_size;
+        kms_params.region_position = arg_parser.region_position;
+        return gsr_capture_kms_create(&kms_params);
     } else {
-        fprintf(stderr, "Error: -fm should either be either 'cfr' or 'vfr', got: '%s'\n", framerate_mode_str);
-        usage();
-    }
+        const char *capture_target = arg_parser.window;
+        const bool direct_capture = strcmp(arg_parser.window, "screen-direct") == 0 || strcmp(arg_parser.window, "screen-direct-force") == 0;
+        if(direct_capture) {
+            capture_target = "screen";
+            fprintf(stderr, "gsr warning: %s capture option is not recommended unless you use G-SYNC as Nvidia has driver issues that can cause your system or games to freeze/crash.\n", arg_parser.window);
+        }
 
-    const char *screen_region = args["-s"].value();
-    const char *window_str = strdup(args["-w"].value());
+        gsr_capture_nvfbc_params nvfbc_params;
+        nvfbc_params.egl = egl;
+        nvfbc_params.display_to_capture = capture_target;
+        nvfbc_params.fps = arg_parser.fps;
+        nvfbc_params.direct_capture = direct_capture;
+        nvfbc_params.record_cursor = arg_parser.record_cursor;
+        nvfbc_params.output_resolution = arg_parser.output_resolution;
+        nvfbc_params.region_size = arg_parser.region_size;
+        nvfbc_params.region_position = arg_parser.region_position;
+        return gsr_capture_nvfbc_create(&nvfbc_params);
+    }
+}
 
-    if(screen_region && strcmp(window_str, "focused") != 0) {
-        fprintf(stderr, "Error: option -s is only available when using -w focused\n");
-        usage();
+static std::string region_get_data(gsr_egl *egl, vec2i *region_size, vec2i *region_position) {
+    vec2i monitor_pos = {0, 0};
+    vec2i monitor_size = {0, 0};
+    std::string window = get_monitor_by_region_center(egl, *region_position, *region_size, &monitor_pos, &monitor_size);
+    if(window.empty()) {
+        const bool is_x11 = gsr_window_get_display_server(egl->window) == GSR_DISPLAY_SERVER_X11;
+        const gsr_connection_type connection_type = is_x11 ? GSR_CONNECTION_X11 : GSR_CONNECTION_DRM;
+        fprintf(stderr, "gsr error: the region %dx%d+%d+%d doesn't match any monitor. Available monitors and their regions:\n", region_size->x, region_size->y, region_position->x, region_position->y);
+
+        MonitorOutputCallbackUserdata userdata;
+        userdata.window = egl->window;
+        for_each_active_monitor_output(egl->window, egl->card_path, connection_type, monitor_output_callback_print, &userdata);
+        _exit(51);
+    }
+
+    // Capture whole monitor when region size is set to 0x0
+    if(region_size->x == 0 && region_size->y == 0) {
+        region_position->x = 0;
+        region_position->y = 0;
+    } else {
+        region_position->x -= monitor_pos.x;
+        region_position->y -= monitor_pos.y;
     }
+    return window;
+}
 
-    vec2i region_size = { 0, 0 };
+static gsr_capture* create_capture_impl(args_parser &arg_parser, gsr_egl *egl, bool prefer_ximage) {
     Window src_window_id = None;
     bool follow_focused = false;
+    const bool wayland = gsr_window_get_display_server(egl->window) == GSR_DISPLAY_SERVER_WAYLAND;
 
     gsr_capture *capture = nullptr;
-    if(strcmp(window_str, "focused") == 0) {
+    if(strcmp(arg_parser.window, "focused") == 0) {
         if(wayland) {
-            fprintf(stderr, "Error: GPU Screen Recorder window capture only works in a pure X11 session. Xwayland is not supported. You can record a monitor instead on wayland\n");
+            fprintf(stderr, "gsr error: GPU Screen Recorder window capture only works in a pure X11 session. Xwayland is not supported. You can record a monitor instead on wayland\n");
             _exit(2);
         }
 
-        if(!screen_region) {
-            fprintf(stderr, "Error: option -s is required when using -w focused\n");
-            usage();
-        }
-
-        if(sscanf(screen_region, "%dx%d", &region_size.x, &region_size.y) != 2) {
-            fprintf(stderr, "Error: invalid value for option -s '%s', expected a value in format WxH\n", screen_region);
-            usage();
-        }
-
-        if(region_size.x <= 0 || region_size.y <= 0) {
-            fprintf(stderr, "Error: invalud value for option -s '%s', expected width and height to be greater than 0\n", screen_region);
-            usage();
+        if(arg_parser.output_resolution.x <= 0 || arg_parser.output_resolution.y <= 0) {
+            fprintf(stderr, "gsr error: invalid value for option -s '%dx%d' when using the -w focused option, expected width and height to be greater than 0\n", arg_parser.output_resolution.x, arg_parser.output_resolution.y);
+            args_parser_print_usage();
+            _exit(1);
         }
 
         follow_focused = true;
-    } else if(contains_non_hex_number(window_str)) {
-        if(wayland || gpu_inf.vendor != GSR_GPU_VENDOR_NVIDIA) {
-            if(strcmp(window_str, "screen") == 0) {
-                FirstOutputCallback first_output;
-                first_output.output_name = NULL;
-                if(gsr_egl_supports_wayland_capture(&egl)) {
-                    for_each_active_monitor_output(&egl, GSR_CONNECTION_WAYLAND, get_first_output, &first_output);
-                } else {
-                    for_each_active_monitor_output(card_path, GSR_CONNECTION_DRM, get_first_output, &first_output);
-                }
-
-                if(first_output.output_name) {
-                    window_str = first_output.output_name;
-                } else {
-                    fprintf(stderr, "Error: no available output found\n");
-                }
-            }
-
-            if(gsr_egl_supports_wayland_capture(&egl)) {
-                gsr_monitor gmon;
-                if(!get_monitor_by_name(&egl, GSR_CONNECTION_WAYLAND, window_str, &gmon)) {
-                    fprintf(stderr, "gsr error: display \"%s\" not found, expected one of:\n", window_str);
-                    fprintf(stderr, "    \"screen\"\n");
-                    for_each_active_monitor_output(&egl, GSR_CONNECTION_WAYLAND, monitor_output_callback_print, NULL);
-                    _exit(1);
-                }
-            } else {
-                gsr_monitor gmon;
-                if(!get_monitor_by_name(card_path, GSR_CONNECTION_DRM, window_str, &gmon)) {
-                    fprintf(stderr, "gsr error: display \"%s\" not found, expected one of:\n", window_str);
-                    fprintf(stderr, "    \"screen\"\n");
-                    for_each_active_monitor_output(card_path, GSR_CONNECTION_DRM, monitor_output_callback_print, NULL);
-                    _exit(1);
-                }
-            }
-        } else {
-            if(strcmp(window_str, "screen") != 0 && strcmp(window_str, "screen-direct") != 0 && strcmp(window_str, "screen-direct-force") != 0) {
-                gsr_monitor gmon;
-                if(!get_monitor_by_name(dpy, GSR_CONNECTION_X11, window_str, &gmon)) {
-                    fprintf(stderr, "gsr error: display \"%s\" not found, expected one of:\n", window_str);
-                    fprintf(stderr, "    \"screen\"    (%dx%d+%d+%d)\n", XWidthOfScreen(DefaultScreenOfDisplay(dpy)), XHeightOfScreen(DefaultScreenOfDisplay(dpy)), 0, 0);
-                    fprintf(stderr, "    \"screen-direct\"    (%dx%d+%d+%d)\n", XWidthOfScreen(DefaultScreenOfDisplay(dpy)), XHeightOfScreen(DefaultScreenOfDisplay(dpy)), 0, 0);
-                    fprintf(stderr, "    \"screen-direct-force\"    (%dx%d+%d+%d)\n", XWidthOfScreen(DefaultScreenOfDisplay(dpy)), XHeightOfScreen(DefaultScreenOfDisplay(dpy)), 0, 0);
-                    for_each_active_monitor_output(dpy, GSR_CONNECTION_X11, monitor_output_callback_print, NULL);
-                    _exit(1);
-                }
-            }
+    } else if(strcmp(arg_parser.window, "portal") == 0) {
+#ifdef GSR_PORTAL
+        // Desktop portal capture on x11 doesn't seem to be hardware accelerated
+        if(!wayland) {
+            fprintf(stderr, "gsr error: desktop portal capture is not supported on X11\n");
+            _exit(1);
         }
 
-        if(gpu_inf.vendor == GSR_GPU_VENDOR_NVIDIA) {
-            if(wayland) {
-                gsr_capture_kms_cuda_params kms_params;
-                kms_params.egl = &egl;
-                kms_params.display_to_capture = window_str;
-                kms_params.gpu_inf = gpu_inf;
-                kms_params.card_path = card_path;
-                capture = gsr_capture_kms_cuda_create(&kms_params);
-                if(!capture)
-                    _exit(1);
-            } else {
-                const char *capture_target = window_str;
-                bool direct_capture = strcmp(window_str, "screen-direct") == 0;
-                if(direct_capture) {
-                    capture_target = "screen";
-                    // TODO: Temporary disable direct capture because push model causes stuttering when it's direct capturing. This might be a nvfbc bug. This does not happen when using a compositor.
-                    direct_capture = false;
-                    fprintf(stderr, "Warning: screen-direct has temporary been disabled as it causes stuttering. This is likely a NvFBC bug. Falling back to \"screen\".\n");
-                }
-
-                if(strcmp(window_str, "screen-direct-force") == 0) {
-                    direct_capture = true;
-                    capture_target = "screen";
-                }
-
-                gsr_egl_unload(&egl);
-
-                gsr_capture_nvfbc_params nvfbc_params;
-                nvfbc_params.dpy = dpy;
-                nvfbc_params.display_to_capture = capture_target;
-                nvfbc_params.fps = fps;
-                nvfbc_params.pos = { 0, 0 };
-                nvfbc_params.size = { 0, 0 };
-                nvfbc_params.direct_capture = direct_capture;
-                nvfbc_params.overclock = overclock;
-                capture = gsr_capture_nvfbc_create(&nvfbc_params);
-                if(!capture)
-                    _exit(1);
-            }
-        } else {
-            gsr_capture_kms_vaapi_params kms_params;
-            kms_params.egl = &egl;
-            kms_params.display_to_capture = window_str;
-            kms_params.gpu_inf = gpu_inf;
-            kms_params.card_path = card_path;
-            kms_params.wayland = wayland;
-            capture = gsr_capture_kms_vaapi_create(&kms_params);
-            if(!capture)
-                _exit(1);
-        }
+        gsr_capture_portal_params portal_params;
+        portal_params.egl = egl;
+        portal_params.record_cursor = arg_parser.record_cursor;
+        portal_params.restore_portal_session = arg_parser.restore_portal_session;
+        portal_params.portal_session_token_filepath = arg_parser.portal_session_token_filepath;
+        portal_params.output_resolution = arg_parser.output_resolution;
+        capture = gsr_capture_portal_create(&portal_params);
+        if(!capture)
+            _exit(1);
+#else
+        fprintf(stderr, "gsr error: option '-w portal' used but GPU Screen Recorder was compiled without desktop portal support. Please recompile GPU Screen Recorder with the -Dportal=true option\n");
+        _exit(2);
+#endif
+    } else if(strcmp(arg_parser.window, "region") == 0) {
+        const std::string window = region_get_data(egl, &arg_parser.region_size, &arg_parser.region_position);
+        snprintf(arg_parser.window, sizeof(arg_parser.window), "%s", window.c_str());
+        capture = create_monitor_capture(arg_parser, egl, prefer_ximage);
+        if(!capture)
+            _exit(1);
+    } else if(contains_non_hex_number(arg_parser.window)) {
+        const std::string window = validate_monitor_get_valid(egl, arg_parser.window);
+        snprintf(arg_parser.window, sizeof(arg_parser.window), "%s", window.c_str());
+        capture = create_monitor_capture(arg_parser, egl, prefer_ximage);
+        if(!capture)
+            _exit(1);
     } else {
         if(wayland) {
-            fprintf(stderr, "Error: GPU Screen Recorder window capture only works in a pure X11 session. Xwayland is not supported. You can record a monitor instead on wayland\n");
+            fprintf(stderr, "gsr error: GPU Screen Recorder window capture only works in a pure X11 session. Xwayland is not supported. You can record a monitor instead on wayland or use -w portal option which supports window capture if your wayland compositor supports window capture\n");
             _exit(2);
         }
 
         errno = 0;
-        src_window_id = strtol(window_str, nullptr, 0);
+        src_window_id = strtol(arg_parser.window, nullptr, 0);
         if(src_window_id == None || errno == EINVAL) {
-            fprintf(stderr, "Invalid window number %s\n", window_str);
-            usage();
+            fprintf(stderr, "gsr error: invalid window number %s\n", arg_parser.window);
+            args_parser_print_usage();
+            _exit(1);
         }
     }
 
     if(!capture) {
-        switch(gpu_inf.vendor) {
-            case GSR_GPU_VENDOR_AMD: {
-                gsr_capture_xcomposite_vaapi_params xcomposite_params;
-                xcomposite_params.egl = &egl;
-                xcomposite_params.dpy = dpy;
-                xcomposite_params.window = src_window_id;
-                xcomposite_params.follow_focused = follow_focused;
-                xcomposite_params.region_size = region_size;
-                xcomposite_params.card_path = card_path;
-                capture = gsr_capture_xcomposite_vaapi_create(&xcomposite_params);
-                if(!capture)
-                    _exit(1);
-                break;
-            }
-            case GSR_GPU_VENDOR_INTEL: {
-                gsr_capture_xcomposite_vaapi_params xcomposite_params;
-                xcomposite_params.egl = &egl;
-                xcomposite_params.dpy = dpy;
-                xcomposite_params.window = src_window_id;
-                xcomposite_params.follow_focused = follow_focused;
-                xcomposite_params.region_size = region_size;
-                xcomposite_params.card_path = card_path;
-                capture = gsr_capture_xcomposite_vaapi_create(&xcomposite_params);
-                if(!capture)
-                    _exit(1);
+        gsr_capture_xcomposite_params xcomposite_params;
+        xcomposite_params.egl = egl;
+        xcomposite_params.window = src_window_id;
+        xcomposite_params.follow_focused = follow_focused;
+        xcomposite_params.record_cursor = arg_parser.record_cursor;
+        xcomposite_params.output_resolution = arg_parser.output_resolution;
+        capture = gsr_capture_xcomposite_create(&xcomposite_params);
+        if(!capture)
+            _exit(1);
+    }
+
+    return capture;
+}
+
+static gsr_color_range image_format_to_color_range(gsr_image_format image_format) {
+    switch(image_format) {
+        case GSR_IMAGE_FORMAT_JPEG: return GSR_COLOR_RANGE_LIMITED;
+        case GSR_IMAGE_FORMAT_PNG:  return GSR_COLOR_RANGE_FULL;
+    }
+    assert(false);
+    return GSR_COLOR_RANGE_FULL;
+}
+
+static int video_quality_to_image_quality_value(gsr_video_quality video_quality) {
+    switch(video_quality) {
+        case GSR_VIDEO_QUALITY_MEDIUM:
+            return 75;
+        case GSR_VIDEO_QUALITY_HIGH:
+            return 85;
+        case GSR_VIDEO_QUALITY_VERY_HIGH:
+            return 90;
+        case GSR_VIDEO_QUALITY_ULTRA:
+            return 97;
+    }
+    assert(false);
+    return 90;
+}
+
+// TODO: 10-bit and hdr.
+static void capture_image_to_file(args_parser &arg_parser, gsr_egl *egl, gsr_image_format image_format) {
+    const gsr_color_range color_range = image_format_to_color_range(image_format);
+    const int fps = 60;
+    const bool prefer_ximage = true;
+    gsr_capture *capture = create_capture_impl(arg_parser, egl, prefer_ximage);
+
+    gsr_capture_metadata capture_metadata;
+    capture_metadata.width = 0;
+    capture_metadata.height = 0;
+    capture_metadata.fps = fps;
+    capture_metadata.video_codec_context = nullptr;
+    capture_metadata.frame = nullptr;
+
+    int capture_result = gsr_capture_start(capture, &capture_metadata);
+    if(capture_result != 0) {
+        fprintf(stderr, "gsr error: capture_image_to_file: gsr_capture_start failed\n");
+        _exit(capture_result);
+    }
+
+    gsr_image_writer image_writer;
+    if(!gsr_image_writer_init_opengl(&image_writer, egl, capture_metadata.width, capture_metadata.height)) {
+        fprintf(stderr, "gsr error: capture_image_to_file: gsr_image_writer_init_opengl failed\n");
+        _exit(1);
+    }
+
+    gsr_color_conversion_params color_conversion_params;
+    memset(&color_conversion_params, 0, sizeof(color_conversion_params));
+    color_conversion_params.color_range = color_range;
+    color_conversion_params.egl = egl;
+    color_conversion_params.load_external_image_shader = gsr_capture_uses_external_image(capture);
+
+    color_conversion_params.destination_textures[0] = image_writer.texture;
+    color_conversion_params.num_destination_textures = 1;
+    color_conversion_params.destination_color = GSR_DESTINATION_COLOR_RGB8;
+
+    gsr_color_conversion color_conversion;
+    if(gsr_color_conversion_init(&color_conversion, &color_conversion_params) != 0) {
+        fprintf(stderr, "gsr error: capture_image_to_file: failed to create color conversion\n");
+        _exit(1);
+    }
+
+    gsr_color_conversion_clear(&color_conversion);
+
+    bool should_stop_error = false;
+    egl->glClear(0);
+
+    while(running) {
+        should_stop_error = false;
+        if(gsr_capture_should_stop(capture, &should_stop_error)) {
+            running = 0;
+            break;
+        }
+
+        // It can fail, for example when capturing portal and the target is a monitor that hasn't been updated.
+        // Desktop portal won't refresh the image until there is an update.
+        // TODO: Find out if there is a way to force update desktop portal image.
+        // This can also happen, for example, if the system suspends and the framebuffer of the monitor being captured is gone, or if the target window disappeared.
+        if(gsr_capture_capture(capture, &capture_metadata, &color_conversion) == 0)
+            break;
+
+        usleep(30 * 1000); // 30 ms
+    }
+
+    gsr_egl_swap_buffers(egl);
+
+    const int image_quality = video_quality_to_image_quality_value(arg_parser.video_quality);
+    if(!gsr_image_writer_write_to_file(&image_writer, arg_parser.filename, image_format, image_quality)) {
+        fprintf(stderr, "gsr error: capture_image_to_file_wayland: failed to write opengl texture to image output file %s\n", arg_parser.filename);
+        _exit(1);
+    }
+
+    gsr_image_writer_deinit(&image_writer);
+    gsr_capture_destroy(capture);
+    _exit(should_stop_error ? 3 : 0);
+}
+
+static AVPixelFormat get_pixel_format(gsr_video_codec video_codec, gsr_gpu_vendor vendor, bool use_software_video_encoder) {
+    if(use_software_video_encoder) {
+        return AV_PIX_FMT_NV12;
+    } else {
+        if(video_codec_is_vulkan(video_codec))
+            return AV_PIX_FMT_VULKAN;
+        else
+            return vendor == GSR_GPU_VENDOR_NVIDIA ? AV_PIX_FMT_CUDA : AV_PIX_FMT_VAAPI;
+    }
+}
+
+static void match_app_audio_input_to_available_apps(const std::vector<AudioInput> &requested_audio_inputs, const std::vector<std::string> &app_audio_names) {
+    for(const AudioInput &request_audio_input : requested_audio_inputs) {
+        if(request_audio_input.type != AudioInputType::APPLICATION || request_audio_input.inverted)
+            continue;
+
+        bool match = false;
+        for(const std::string &app_name : app_audio_names) {
+            if(strcasecmp(app_name.c_str(), request_audio_input.name.c_str()) == 0) {
+                match = true;
                 break;
             }
-            case GSR_GPU_VENDOR_NVIDIA: {
-                gsr_capture_xcomposite_cuda_params xcomposite_params;
-                xcomposite_params.egl = &egl;
-                xcomposite_params.dpy = dpy;
-                xcomposite_params.window = src_window_id;
-                xcomposite_params.follow_focused = follow_focused;
-                xcomposite_params.region_size = region_size;
-                xcomposite_params.overclock = overclock;
-                capture = gsr_capture_xcomposite_cuda_create(&xcomposite_params);
-                if(!capture)
-                    _exit(1);
-                break;
+        }
+
+        if(!match) {
+            fprintf(stderr, "gsr warning: no audio application with the name \"%s\" was found, expected one of the following:\n", request_audio_input.name.c_str());
+            for(const std::string &app_name : app_audio_names) {
+                fprintf(stderr, "  * %s\n", app_name.c_str());
             }
+            fprintf(stderr, "  assuming this is intentional (if you are trying to record audio for applications that haven't started yet).\n");
         }
     }
+}
+
+// Manually check if the audio inputs we give exist. This is only needed for pipewire, not pulseaudio.
+// Pipewire instead silently falls back to the default audio input when the requested one doesn't exist,
+// so a misspelled audio input name would otherwise go unnoticed.
+static std::vector<MergedAudioInputs> parse_audio_inputs(const AudioDevices &audio_devices, const Arg *audio_input_arg) {
+    std::vector<MergedAudioInputs> requested_audio_inputs;
+
+    for(int i = 0; i < audio_input_arg->num_values; ++i) {
+        const char *audio_input = audio_input_arg->values[i];
+        if(!audio_input || audio_input[0] == '\0')
+            continue;
+
+        requested_audio_inputs.push_back(parse_audio_input_arg(audio_input));
+        for(AudioInput &request_audio_input : requested_audio_inputs.back().audio_inputs) {
+            if(request_audio_input.type != AudioInputType::DEVICE)
+                continue;
+
+            bool match = false;
 
-    const char *filename = args["-o"].value();
-    if(filename) {
-        if(replay_buffer_size_secs != -1) {
-            if(!container_format) {
-                fprintf(stderr, "Error: option -c is required when using option -r\n");
-                usage();
+            if(request_audio_input.name == "default_output") {
+                if(audio_devices.default_output.empty()) {
+                    fprintf(stderr, "gsr error: -a default_output was specified but no default audio output is specified in the audio server\n");
+                    _exit(2);
+                }
+                match = true;
+            } else if(request_audio_input.name == "default_input") {
+                if(audio_devices.default_input.empty()) {
+                    fprintf(stderr, "gsr error: -a default_input was specified but no default audio input is specified in the audio server\n");
+                    _exit(2);
+                }
+                match = true;
+            } else {
+                const bool name_is_existing_audio_device = get_audio_device_by_name(audio_devices.audio_inputs, request_audio_input.name.c_str()) != nullptr;
+                if(name_is_existing_audio_device)
+                    match = true;
             }
 
-            struct stat buf;
-            if(stat(filename, &buf) != -1 && !S_ISDIR(buf.st_mode)) {
-                fprintf(stderr, "Error: File \"%s\" exists but it's not a directory\n", filename);
-                usage();
+            if(!match) {
+                fprintf(stderr, "gsr error: Audio device '%s' is not a valid audio device, expected one of:\n", request_audio_input.name.c_str());
+                if(!audio_devices.default_output.empty())
+                    fprintf(stderr, "    default_output (Default output)\n");
+                if(!audio_devices.default_input.empty())
+                    fprintf(stderr, "    default_input (Default input)\n");
+                for(const auto &audio_device_input : audio_devices.audio_inputs) {
+                    fprintf(stderr, "    %s (%s)\n", audio_device_input.name.c_str(), audio_device_input.description.c_str());
+                }
+                _exit(50);
             }
         }
-    } else {
-        if(replay_buffer_size_secs == -1) {
-            filename = "/dev/stdout";
-        } else {
-            fprintf(stderr, "Error: Option -o is required when using option -r\n");
-            usage();
-        }
+    }
 
-        if(!container_format) {
-            fprintf(stderr, "Error: option -c is required when not using option -o\n");
-            usage();
-        }
+    return requested_audio_inputs;
+}
+
+static bool audio_inputs_has_app_audio(const std::vector<AudioInput> &audio_inputs) {
+    for(const auto &audio_input : audio_inputs) {
+        if(audio_input.type == AudioInputType::APPLICATION)
+            return true;
     }
+    return false;
+}
 
-    AVFormatContext *av_format_context;
-    // The output format is automatically guessed by the file extension
-    avformat_alloc_output_context2(&av_format_context, nullptr, container_format, filename);
-    if (!av_format_context) {
-        if(container_format)
-            fprintf(stderr, "Error: Container format '%s' (argument -c) is not valid\n", container_format);
-        else
-            fprintf(stderr, "Error: Failed to deduce container format from file extension\n");
-        _exit(1);
+static bool merged_audio_inputs_has_app_audio(const std::vector<MergedAudioInputs> &merged_audio_inputs) {
+    for(const auto &merged_audio_input : merged_audio_inputs) {
+        if(audio_inputs_has_app_audio(merged_audio_input.audio_inputs))
+            return true;
     }
+    return false;
+}
 
-    const AVOutputFormat *output_format = av_format_context->oformat;
+// Should use amix if a merged audio track has more than 1 audio device and 0 application audio
+static bool audio_inputs_should_use_amix(const std::vector<AudioInput> &audio_inputs) {
+    int num_audio_devices = 0;
+    int num_app_audio = 0;
 
-    std::string file_extension = output_format->extensions;
-    {
-        size_t comma_index = file_extension.find(',');
-        if(comma_index != std::string::npos)
-            file_extension = file_extension.substr(0, comma_index);
+    for(const auto &audio_input : audio_inputs) {
+        if(audio_input.type == AudioInputType::DEVICE)
+            ++num_audio_devices;
+        else if(audio_input.type == AudioInputType::APPLICATION)
+            ++num_app_audio;
+    }
+
+    return num_audio_devices > 1 && num_app_audio == 0;
+}
+
+static bool merged_audio_inputs_should_use_amix(const std::vector<MergedAudioInputs> &merged_audio_inputs) {
+    for(const auto &merged_audio_input : merged_audio_inputs) {
+        if(audio_inputs_should_use_amix(merged_audio_input.audio_inputs))
+            return true;
     }
+    return false;
+}
 
+static void validate_merged_audio_inputs_app_audio(const std::vector<MergedAudioInputs> &merged_audio_inputs, const std::vector<std::string> &app_audio_names) {
+    for(const auto &merged_audio_input : merged_audio_inputs) {
+        int num_app_audio = 0;
+        int num_app_inverted_audio = 0;
+
+        for(const auto &audio_input : merged_audio_input.audio_inputs) {
+            if(audio_input.type == AudioInputType::APPLICATION) {
+                if(audio_input.inverted)
+                    ++num_app_inverted_audio;
+                else
+                    ++num_app_audio;
+            }
+        }
+
+        match_app_audio_input_to_available_apps(merged_audio_input.audio_inputs, app_audio_names);
+
+        if(num_app_audio > 0 && num_app_inverted_audio > 0) {
+            fprintf(stderr, "gsr error: argument -a was provided with both app: and app-inverse:, only one of them can be used for one audio track\n");
+            _exit(2);
+        }
+    }
+}
+
+static gsr_audio_codec select_audio_codec_with_fallback(gsr_audio_codec audio_codec, const std::string &file_extension, bool uses_amix) {
     switch(audio_codec) {
-        case AudioCodec::AAC: {
+        case GSR_AUDIO_CODEC_AAC: {
+            if(file_extension == "webm") {
+                //audio_codec_to_use = "opus";
+                audio_codec = GSR_AUDIO_CODEC_OPUS;
+                fprintf(stderr, "gsr warning: .webm files only support opus audio codec, changing audio codec from aac to opus\n");
+            }
             break;
         }
-        case AudioCodec::OPUS: {
-            if(file_extension != "mp4" && file_extension != "mkv") {
-                audio_codec_to_use = "aac";
-                audio_codec = AudioCodec::AAC;
-                fprintf(stderr, "Warning: opus audio codec is only supported by .mp4 and .mkv files, falling back to aac instead\n");
+        case GSR_AUDIO_CODEC_OPUS: {
+            // TODO: Also check mpegts?
+            if(file_extension != "mp4" && file_extension != "mkv" && file_extension != "webm") {
+                //audio_codec_to_use = "aac";
+                audio_codec = GSR_AUDIO_CODEC_AAC;
+                fprintf(stderr, "gsr warning: opus audio codec is only supported by .mp4, .mkv and .webm files, falling back to aac instead\n");
             }
             break;
         }
-        case AudioCodec::FLAC: {
-            if(file_extension != "mp4" && file_extension != "mkv") {
-                audio_codec_to_use = "aac";
-                audio_codec = AudioCodec::AAC;
-                fprintf(stderr, "Warning: flac audio codec is only supported by .mp4 and .mkv files, falling back to aac instead\n");
+        case GSR_AUDIO_CODEC_FLAC: {
+            // TODO: Also check mpegts?
+            if(file_extension == "webm") {
+                //audio_codec_to_use = "opus";
+                audio_codec = GSR_AUDIO_CODEC_OPUS;
+                fprintf(stderr, "gsr warning: .webm files only support opus audio codec, changing audio codec from flac to opus\n");
+            } else if(file_extension != "mp4" && file_extension != "mkv") {
+                //audio_codec_to_use = "aac";
+                audio_codec = GSR_AUDIO_CODEC_AAC;
+                fprintf(stderr, "gsr warning: flac audio codec is only supported by .mp4 and .mkv files, falling back to aac instead\n");
+            } else if(uses_amix) {
+                // TODO: remove this? is it true anymore?
+                //audio_codec_to_use = "opus";
+                audio_codec = GSR_AUDIO_CODEC_OPUS;
+                fprintf(stderr, "gsr warning: flac audio codec is not supported when mixing audio sources, falling back to opus instead\n");
             }
             break;
         }
     }
+    return audio_codec;
+}
 
-    const double target_fps = 1.0 / (double)fps;
+static bool video_codec_only_supports_low_power_mode(const gsr_supported_video_codecs &supported_video_codecs, gsr_video_codec video_codec) {
+    switch(video_codec) {
+        case GSR_VIDEO_CODEC_H264:        return supported_video_codecs.h264.low_power;
+        case GSR_VIDEO_CODEC_HEVC:        return supported_video_codecs.hevc.low_power;
+        case GSR_VIDEO_CODEC_HEVC_HDR:    return supported_video_codecs.hevc_hdr.low_power;
+        case GSR_VIDEO_CODEC_HEVC_10BIT:  return supported_video_codecs.hevc_10bit.low_power;
+        case GSR_VIDEO_CODEC_AV1:         return supported_video_codecs.av1.low_power;
+        case GSR_VIDEO_CODEC_AV1_HDR:     return supported_video_codecs.av1_hdr.low_power;
+        case GSR_VIDEO_CODEC_AV1_10BIT:   return supported_video_codecs.av1_10bit.low_power;
+        case GSR_VIDEO_CODEC_VP8:         return supported_video_codecs.vp8.low_power;
+        case GSR_VIDEO_CODEC_VP9:         return supported_video_codecs.vp9.low_power;
+        case GSR_VIDEO_CODEC_H264_VULKAN: return supported_video_codecs.h264.low_power;
+        case GSR_VIDEO_CODEC_HEVC_VULKAN: return supported_video_codecs.hevc.low_power; // TODO: hdr, 10 bit
+    }
+    return false;
+}
 
-    const bool video_codec_auto = strcmp(video_codec_to_use, "auto") == 0;
-    if(video_codec_auto) {
-        if(gpu_inf.vendor == GSR_GPU_VENDOR_INTEL) {
-            const AVCodec *h264_codec = find_h264_encoder(gpu_inf.vendor, card_path);
-            if(!h264_codec) {
-                fprintf(stderr, "Info: using h265 encoder because a codec was not specified and your gpu does not support h264\n");
-                video_codec_to_use = "h265";
-                video_codec = VideoCodec::H265;
-            } else {
-                fprintf(stderr, "Info: using h264 encoder because a codec was not specified\n");
-                video_codec_to_use = "h264";
-                video_codec = VideoCodec::H264;
-            }
-        } else {
-            const AVCodec *h265_codec = find_h265_encoder(gpu_inf.vendor, card_path);
-
-            // h265 generally allows recording at a higher resolution than h264 on nvidia cards. On a gtx 1080 4k is the max resolution for h264 but for h265 it's 8k.
-            // Another important info is that when recording at a higher fps than.. 60? h265 has very bad performance. For example when recording at 144 fps the fps drops to 1
-            // while with h264 the fps doesn't drop.
-            if(!h265_codec) {
-                fprintf(stderr, "Info: using h264 encoder because a codec was not specified and your gpu does not support h265\n");
-                video_codec_to_use = "h264";
-                video_codec = VideoCodec::H264;
-            } else if(fps > 60) {
-                fprintf(stderr, "Info: using h264 encoder because a codec was not specified and fps is more than 60\n");
-                video_codec_to_use = "h264";
-                video_codec = VideoCodec::H264;
-            } else {
-                fprintf(stderr, "Info: using h265 encoder because a codec was not specified\n");
-                video_codec_to_use = "h265";
-                video_codec = VideoCodec::H265;
-            }
-        }
-    }
+static const AVCodec* pick_video_codec(gsr_video_codec *video_codec, gsr_egl *egl, bool use_software_video_encoder, bool video_codec_auto, bool is_flv, bool *low_power) {
+    // TODO: software encoder for hevc, av1, vp8 and vp9
+    *low_power = false;
 
-    // TODO: Allow hevc, vp9 and av1 in (enhanced) flv (supported since ffmpeg 6.1)
-    const bool is_flv = strcmp(file_extension.c_str(), "flv") == 0;
-    if(video_codec != VideoCodec::H264 && is_flv) {
-        video_codec_to_use = "h264";
-        video_codec = VideoCodec::H264;
-        fprintf(stderr, "Warning: h265 is not compatible with flv, falling back to h264 instead.\n");
+    gsr_supported_video_codecs supported_video_codecs;
+    if(!get_supported_video_codecs(egl, *video_codec, use_software_video_encoder, true, &supported_video_codecs)) {
+        fprintf(stderr, "gsr error: failed to query for supported video codecs\n");
+        _exit(11);
     }
 
     const AVCodec *video_codec_f = nullptr;
-    switch(video_codec) {
-        case VideoCodec::H264:
-            video_codec_f = find_h264_encoder(gpu_inf.vendor, card_path);
+
+    switch(*video_codec) {
+        case GSR_VIDEO_CODEC_H264: {
+            if(use_software_video_encoder)
+                video_codec_f = avcodec_find_encoder_by_name("libx264");
+            else if(supported_video_codecs.h264.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
+            break;
+        }
+        case GSR_VIDEO_CODEC_HEVC: {
+            if(supported_video_codecs.hevc.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
+            break;
+        }
+        case GSR_VIDEO_CODEC_HEVC_HDR: {
+            if(supported_video_codecs.hevc_hdr.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
+            break;
+        }
+        case GSR_VIDEO_CODEC_HEVC_10BIT: {
+            if(supported_video_codecs.hevc_10bit.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
+            break;
+        }
+        case GSR_VIDEO_CODEC_AV1: {
+            if(supported_video_codecs.av1.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
+            break;
+        }
+        case GSR_VIDEO_CODEC_AV1_HDR: {
+            if(supported_video_codecs.av1_hdr.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
             break;
-        case VideoCodec::H265:
-            video_codec_f = find_h265_encoder(gpu_inf.vendor, card_path);
+        }
+        case GSR_VIDEO_CODEC_AV1_10BIT: {
+            if(supported_video_codecs.av1_10bit.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
             break;
-        case VideoCodec::AV1:
-            video_codec_f = find_av1_encoder(gpu_inf.vendor, card_path);
+        }
+        case GSR_VIDEO_CODEC_VP8: {
+            if(supported_video_codecs.vp8.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
             break;
+        }
+        case GSR_VIDEO_CODEC_VP9: {
+            if(supported_video_codecs.vp9.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
+            break;
+        }
+        case GSR_VIDEO_CODEC_H264_VULKAN: {
+            if(supported_video_codecs.h264.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
+            break;
+        }
+        case GSR_VIDEO_CODEC_HEVC_VULKAN: {
+            // TODO: hdr, 10 bit
+            if(supported_video_codecs.hevc.supported)
+                video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
+            break;
+        }
     }
 
     if(!video_codec_auto && !video_codec_f && !is_flv) {
-        switch(video_codec) {
-            case VideoCodec::H264: {
-                fprintf(stderr, "Warning: selected video codec h264 is not supported, trying h265 instead\n");
-                video_codec_to_use = "h265";
-                video_codec = VideoCodec::H265;
-                video_codec_f = find_h265_encoder(gpu_inf.vendor, card_path);
+        switch(*video_codec) {
+            case GSR_VIDEO_CODEC_H264: {
+                fprintf(stderr, "gsr warning: selected video codec h264 is not supported, trying hevc instead\n");
+                *video_codec = GSR_VIDEO_CODEC_HEVC;
+                if(supported_video_codecs.hevc.supported)
+                    video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
                 break;
             }
-            case VideoCodec::H265: {
-                fprintf(stderr, "Warning: selected video codec h265 is not supported, trying h264 instead\n");
-                video_codec_to_use = "h264";
-                video_codec = VideoCodec::H264;
-                video_codec_f = find_h264_encoder(gpu_inf.vendor, card_path);
+            case GSR_VIDEO_CODEC_HEVC:
+            case GSR_VIDEO_CODEC_HEVC_HDR:
+            case GSR_VIDEO_CODEC_HEVC_10BIT: {
+                fprintf(stderr, "gsr warning: selected video codec hevc is not supported, trying h264 instead\n");
+                *video_codec = GSR_VIDEO_CODEC_H264;
+                if(supported_video_codecs.h264.supported)
+                    video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
                 break;
             }
-            case VideoCodec::AV1: {
-                fprintf(stderr, "Warning: selected video codec av1 is not supported, trying h264 instead\n");
-                video_codec_to_use = "h264";
-                video_codec = VideoCodec::H264;
-                video_codec_f = find_h264_encoder(gpu_inf.vendor, card_path);
+            case GSR_VIDEO_CODEC_AV1:
+            case GSR_VIDEO_CODEC_AV1_HDR:
+            case GSR_VIDEO_CODEC_AV1_10BIT: {
+                fprintf(stderr, "gsr warning: selected video codec av1 is not supported, trying h264 instead\n");
+                *video_codec = GSR_VIDEO_CODEC_H264;
+                if(supported_video_codecs.h264.supported)
+                    video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
                 break;
             }
-        }
-    }
-
-    if(!video_codec_f) {
-        const char *video_codec_name = "";
-        switch(video_codec) {
-            case VideoCodec::H264: {
-                video_codec_name = "h265";
+            case GSR_VIDEO_CODEC_VP8:
+            case GSR_VIDEO_CODEC_VP9:
+                // TODO: Can't fall back to another codec because webm only supports vp8/vp9
                 break;
-            }
-            case VideoCodec::H265: {
-                video_codec_name = "h265";
+            case GSR_VIDEO_CODEC_H264_VULKAN: {
+                fprintf(stderr, "gsr warning: selected video codec h264_vulkan is not supported, trying h264 instead\n");
+                *video_codec = GSR_VIDEO_CODEC_H264;
+                // Need to do a query again because this time it's without vulkan
+                if(!get_supported_video_codecs(egl, *video_codec, use_software_video_encoder, true, &supported_video_codecs)) {
+                    fprintf(stderr, "gsr error: failed to query for supported video codecs\n");
+                    _exit(11);
+                }
+                if(supported_video_codecs.h264.supported)
+                    video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
                 break;
             }
-            case VideoCodec::AV1: {
-                video_codec_name = "av1";
+            case GSR_VIDEO_CODEC_HEVC_VULKAN: {
+                fprintf(stderr, "gsr warning: selected video codec hevc_vulkan is not supported, trying hevc instead\n");
+                *video_codec = GSR_VIDEO_CODEC_HEVC;
+                // Need to do a query again because this time it's without vulkan
+                if(!get_supported_video_codecs(egl, *video_codec, use_software_video_encoder, true, &supported_video_codecs)) {
+                    fprintf(stderr, "gsr error: failed to query for supported video codecs\n");
+                    _exit(11);
+                }
+                if(supported_video_codecs.hevc.supported)
+                    video_codec_f = get_ffmpeg_video_codec(*video_codec, egl->gpu_info.vendor);
                 break;
             }
         }
+    }
 
-        fprintf(stderr, "Error: your gpu does not support '%s' video codec. If you are sure that your gpu does support '%s' video encoding and you are using an AMD/Intel GPU,\n"
-            "  then it's possible that your distro has disabled hardware accelerated video encoding for '%s' video codec.\n"
-            "  This may be the case on corporate distros such as Manjaro.\n"
-            "  You can test this by running 'vainfo | grep VAEntrypointEncSlice' to see if it matches any H264/HEVC/AV1 profile. vainfo is part of libva-utils.\n"
-            "  On such distros, you need to manually install mesa from source to enable H264/HEVC/AV1 hardware acceleration, or use a more user friendly distro.\n", video_codec_name, video_codec_name, video_codec_name);
+    if(!video_codec_f) {
+        const char *video_codec_name = video_codec_to_string(*video_codec);
+        fprintf(stderr, "gsr error: your gpu does not support '%s' video codec. If you are sure that your gpu does support '%s' video encoding and you are using an AMD/Intel GPU,\n"
+            "  then make sure you have installed the GPU specific vaapi packages (intel-media-driver, libva-intel-driver, libva-mesa-driver and linux-firmware).\n"
+            "  It's also possible that your distro has disabled hardware accelerated video encoding for '%s' video codec.\n"
+            "  This may be the case on corporate distros such as Manjaro, Fedora or OpenSUSE.\n"
+            "  You can test this by running 'vainfo | grep VAEntrypointEncSlice' to see if it matches any H264/HEVC/AV1/VP8/VP9 profile.\n"
+            "  On such distros, you need to manually install mesa from source to enable H264/HEVC hardware acceleration, or use a more user friendly distro. Alternatively record with AV1 if supported by your GPU.\n"
+            "  You can alternatively use the flatpak version of GPU Screen Recorder (https://flathub.org/apps/com.dec05eba.gpu_screen_recorder) which bypasses system issues with patented H264/HEVC codecs.\n"
+            "  Make sure you have mesa-extra freedesktop runtime installed when using the flatpak (this should be the default), which can be installed with this command:\n"
+            "  flatpak install --system org.freedesktop.Platform.GL.default//23.08-extra\n"
+            "  If your GPU doesn't support hardware accelerated video encoding then you can use '-encoder cpu' option to encode with your cpu instead.\n", video_codec_name, video_codec_name, video_codec_name);
         _exit(2);
     }
 
-    const bool is_livestream = is_livestream_path(filename);
+    *low_power = video_codec_only_supports_low_power_mode(supported_video_codecs, *video_codec);
+
+    return video_codec_f;
+}
+
+static const AVCodec* select_video_codec_with_fallback(gsr_video_codec *video_codec, const char *file_extension, bool use_software_video_encoder, gsr_egl *egl, bool *low_power) {
+    const bool video_codec_auto = *video_codec == (gsr_video_codec)GSR_VIDEO_CODEC_AUTO;
+    if(video_codec_auto) {
+        if(strcmp(file_extension, "webm") == 0) {
+            fprintf(stderr, "gsr info: using vp8 encoder because a codec was not specified and the file extension is .webm\n");
+            *video_codec = GSR_VIDEO_CODEC_VP8;
+        } else {
+            fprintf(stderr, "gsr info: using h264 encoder because a codec was not specified\n");
+            *video_codec = GSR_VIDEO_CODEC_H264;
+        }
+    }
+
+    // TODO: Allow hevc, vp9 and av1 in (enhanced) flv (supported since ffmpeg 6.1)
+    const bool is_flv = strcmp(file_extension, "flv") == 0;
+    if(is_flv) {
+        if(*video_codec != GSR_VIDEO_CODEC_H264) {
+            *video_codec = GSR_VIDEO_CODEC_H264;
+            fprintf(stderr, "gsr warning: only h264 is compatible with flv, falling back to h264 instead.\n");
+        }
+
+        // if(audio_codec != GSR_AUDIO_CODEC_AAC) {
+        //     audio_codec_to_use = "aac";
+        //     audio_codec = GSR_AUDIO_CODEC_AAC;
+        //     fprintf(stderr, "gsr warning: flv only supports aac, falling back to aac instead.\n");
+        // }
+    }
+
+    const bool is_hls = strcmp(file_extension, "m3u8") == 0;
+    if(is_hls) {
+        if(video_codec_is_av1(*video_codec)) {
+            *video_codec = GSR_VIDEO_CODEC_HEVC;
+            fprintf(stderr, "gsr warning: av1 is not compatible with hls (m3u8), falling back to hevc instead.\n");
+        }
+
+        // if(audio_codec != GSR_AUDIO_CODEC_AAC) {
+        //     audio_codec_to_use = "aac";
+        //     audio_codec = GSR_AUDIO_CODEC_AAC;
+        //     fprintf(stderr, "gsr warning: hls (m3u8) only supports aac, falling back to aac instead.\n");
+        // }
+    }
+
+    if(use_software_video_encoder && *video_codec != GSR_VIDEO_CODEC_H264) {
+        fprintf(stderr, "gsr error: \"-encoder cpu\" option is currently only available when using h264 codec option (-k)\n");
+        args_parser_print_usage();
+        _exit(1);
+    }
+
+    return pick_video_codec(video_codec, egl, use_software_video_encoder, video_codec_auto, is_flv, low_power);
+}
+
+static std::vector<AudioDeviceData> create_device_audio_inputs(const std::vector<AudioInput> &audio_inputs, AVCodecContext *audio_codec_context, int num_channels, double num_audio_frames_shift, std::vector<AVFilterContext*> &src_filter_ctx, bool use_amix) {
+    std::vector<AudioDeviceData> audio_track_audio_devices;
+    for(size_t i = 0; i < audio_inputs.size(); ++i) {
+        const auto &audio_input = audio_inputs[i];
+        AVFilterContext *src_ctx = nullptr;
+        if(use_amix)
+            src_ctx = src_filter_ctx[i];
+
+        AudioDeviceData audio_device;
+        audio_device.audio_input = audio_input;
+        audio_device.src_filter_ctx = src_ctx;
+
+        if(audio_input.name.empty()) {
+            audio_device.sound_device.handle = NULL;
+            audio_device.sound_device.frames = 0;
+        } else {
+            const std::string description = "gsr-" + audio_input.name;
+            if(sound_device_get_by_name(&audio_device.sound_device, audio_input.name.c_str(), description.c_str(), num_channels, audio_codec_context->frame_size, audio_codec_context_get_audio_format(audio_codec_context)) != 0) {
+                fprintf(stderr, "gsr error: failed to get \"%s\" audio device\n", audio_input.name.c_str());
+                _exit(1);
+            }
+        }
+
+        audio_device.frame = create_audio_frame(audio_codec_context);
+        audio_device.frame->pts = -audio_codec_context->frame_size * num_audio_frames_shift;
+
+        audio_track_audio_devices.push_back(std::move(audio_device));
+    }
+    return audio_track_audio_devices;
+}
+
+#ifdef GSR_APP_AUDIO
+static AudioDeviceData create_application_audio_audio_input(const MergedAudioInputs &merged_audio_inputs, AVCodecContext *audio_codec_context, int num_channels, double num_audio_frames_shift, gsr_pipewire_audio *pipewire_audio) {
+    AudioDeviceData audio_device;
+    audio_device.frame = create_audio_frame(audio_codec_context);
+    audio_device.frame->pts = -audio_codec_context->frame_size * num_audio_frames_shift;
+
+    char random_str[8];
+    if(!generate_random_characters_standard_alphabet(random_str, sizeof(random_str))) {
+        fprintf(stderr, "gsr error: failed to generate random string\n");
+        _exit(1);
+    }
+    std::string combined_sink_name = "gsr-combined-";
+    combined_sink_name.append(random_str, sizeof(random_str));
+
+    if(!gsr_pipewire_audio_create_virtual_sink(pipewire_audio, combined_sink_name.c_str())) {
+        fprintf(stderr, "gsr error: failed to create virtual sink for application audio\n");
+        _exit(1);
+    }
+
+    combined_sink_name += ".monitor";
+
+    if(sound_device_get_by_name(&audio_device.sound_device, combined_sink_name.c_str(), "gpu-screen-recorder", num_channels, audio_codec_context->frame_size, audio_codec_context_get_audio_format(audio_codec_context)) != 0) {
+        fprintf(stderr, "gsr error: failed to setup audio recording to combined sink\n");
+        _exit(1);
+    }
+
+    std::vector<const char*> audio_devices_sources;
+    for(const auto &audio_input : merged_audio_inputs.audio_inputs) {
+        if(audio_input.type == AudioInputType::DEVICE)
+            audio_devices_sources.push_back(audio_input.name.c_str());
+    }
+
+    bool app_audio_inverted = false;
+    std::vector<const char*> app_names;
+    for(const auto &audio_input : merged_audio_inputs.audio_inputs) {
+        if(audio_input.type == AudioInputType::APPLICATION) {
+            app_names.push_back(audio_input.name.c_str());
+            app_audio_inverted = audio_input.inverted;
+        }
+    }
+
+    if(!audio_devices_sources.empty()) {
+        if(!gsr_pipewire_audio_add_link_from_sources_to_sink(pipewire_audio, audio_devices_sources.data(), audio_devices_sources.size(), combined_sink_name.c_str())) {
+            fprintf(stderr, "gsr error: failed to add application audio link\n");
+            _exit(1);
+        }
+    }
+
+    if(app_audio_inverted) {
+        if(!gsr_pipewire_audio_add_link_from_apps_to_sink_inverted(pipewire_audio, app_names.data(), app_names.size(), combined_sink_name.c_str())) {
+            fprintf(stderr, "gsr error: failed to add application audio link\n");
+            _exit(1);
+        }
+    } else {
+        if(!gsr_pipewire_audio_add_link_from_apps_to_sink(pipewire_audio, app_names.data(), app_names.size(), combined_sink_name.c_str())) {
+            fprintf(stderr, "gsr error: failed to add application audio link\n");
+            _exit(1);
+        }
+    }
+
+    return audio_device;
+}
+#endif
+
+static bool get_image_format_from_filename(const char *filename, gsr_image_format *image_format) {
+    if(string_ends_with(filename, ".jpg") || string_ends_with(filename, ".jpeg")) {
+        *image_format = GSR_IMAGE_FORMAT_JPEG;
+        return true;
+    } else if(string_ends_with(filename, ".png")) {
+        *image_format = GSR_IMAGE_FORMAT_PNG;
+        return true;
+    } else {
+        return false;
+    }
+}
+
+// TODO: replace this with start_recording_create_streams
+static bool av_open_file_write_header(AVFormatContext *av_format_context, const char *filename) {
+    int ret = avio_open(&av_format_context->pb, filename, AVIO_FLAG_WRITE);
+    if(ret < 0) {
+        fprintf(stderr, "gsr error: Could not open '%s': %s\n", filename, av_error_to_string(ret));
+        return false;
+    }
+
+    AVDictionary *options = nullptr;
+    av_dict_set(&options, "strict", "experimental", 0);
+    //av_dict_set_int(&av_format_context->metadata, "video_full_range_flag", 1, 0);
+
+    ret = avformat_write_header(av_format_context, &options);
+    if(ret < 0)
+        fprintf(stderr, "Error occurred when writing header to output file: %s\n", av_error_to_string(ret));
+
+    const bool success = ret >= 0;
+    if(!success)
+        avio_close(av_format_context->pb);
+
+    av_dict_free(&options);
+    return success;
+}
+
+static int audio_codec_get_frame_size(gsr_audio_codec audio_codec) {
+    switch(audio_codec) {
+        case GSR_AUDIO_CODEC_AAC: return 1024;
+        case GSR_AUDIO_CODEC_OPUS: return 960;
+        case GSR_AUDIO_CODEC_FLAC:
+            assert(false);
+            return 1024;
+    }
+    assert(false);
+    return 1024;
+}
+
+static size_t calculate_estimated_replay_buffer_packets(int64_t replay_buffer_size_secs, int fps, gsr_audio_codec audio_codec, const std::vector<MergedAudioInputs> &audio_inputs) {
+    if(replay_buffer_size_secs == -1)
+        return 0;
+
+    int audio_fps = 0;
+    if(!audio_inputs.empty())
+        audio_fps = AUDIO_SAMPLE_RATE / audio_codec_get_frame_size(audio_codec);
+
+    return replay_buffer_size_secs * (fps + audio_fps * audio_inputs.size());
+}
+
+static void set_display_server_environment_variables() {
+    // Some users don't have a properly set up environment (no display manager that runs systemctl --user import-environment DISPLAY WAYLAND_DISPLAY)
+    const char *display = getenv("DISPLAY");
+    if(!display) {
+        display = ":0";
+        setenv("DISPLAY", display, true);
+    }
+
+    const char *wayland_display = getenv("WAYLAND_DISPLAY");
+    if(!wayland_display) {
+        wayland_display = "wayland-1";
+        setenv("WAYLAND_DISPLAY", wayland_display, true);
+    }
+}
+
+int main(int argc, char **argv) {
+    setlocale(LC_ALL, "C"); // Sigh... stupid C
+    mallopt(M_MMAP_THRESHOLD, 65536);
+
+    signal(SIGINT, stop_handler);
+    signal(SIGTERM, stop_handler);
+    signal(SIGUSR1, save_replay_handler);
+    signal(SIGUSR2, toggle_pause_handler);
+    signal(SIGRTMIN, toggle_replay_recording_handler);
+    signal(SIGRTMIN+1, save_replay_10_seconds_handler);
+    signal(SIGRTMIN+2, save_replay_30_seconds_handler);
+    signal(SIGRTMIN+3, save_replay_1_minute_handler);
+    signal(SIGRTMIN+4, save_replay_5_minutes_handler);
+    signal(SIGRTMIN+5, save_replay_10_minutes_handler);
+    signal(SIGRTMIN+6, save_replay_30_minutes_handler);
+
+    set_display_server_environment_variables();
+
+    // Stop nvidia driver from buffering frames
+    setenv("__GL_MaxFramesAllowed", "1", true);
+    // If this is set to 1 then cuGraphicsGLRegisterImage will fail for egl context with error: invalid OpenGL or DirectX context,
+    // so we overwrite it
+    setenv("__GL_THREADED_OPTIMIZATIONS", "0", true);
+    // Forces low latency encoding mode. Use this environment variable until vaapi supports setting this as a parameter.
+    // The downside of this is that it always uses maximum power, which is not ideal for replay mode that runs on system startup.
+    // This option was added in mesa 24.1.4, released on July 17, 2024.
+    // TODO: Add an option to enable/disable this?
+    // Seems like the performance issue is not in encoding, but rendering the frame.
+    // Some frames end up taking 10 times longer. Seems to be an issue with amd gpu power management when letting the application sleep on the cpu side?
+    setenv("AMD_DEBUG", "lowlatencyenc", true);
+    // Some people set this to nvidia (for nvdec) or vdpau (for nvidia vdpau), which breaks gpu screen recorder since
+    // nvidia doesn't support vaapi and nvidia-vaapi-driver doesn't support encoding yet.
+    // Let vaapi find the matching vaapi driver instead of forcing a specific one.
+    unsetenv("LIBVA_DRIVER_NAME");
+    // Some people set this to force all applications to vsync on nvidia, but this makes eglSwapBuffers never return.
+    unsetenv("__GL_SYNC_TO_VBLANK");
+    // Same as above, but for amd/intel
+    unsetenv("vblank_mode");
+
+    if(geteuid() == 0) {
+        fprintf(stderr, "gsr error: don't run gpu-screen-recorder as the root user\n");
+        _exit(1);
+    }
+
+    args_handlers arg_handlers;
+    arg_handlers.version = version_command;
+    arg_handlers.info = info_command;
+    arg_handlers.list_audio_devices = list_audio_devices_command;
+    arg_handlers.list_application_audio = list_application_audio_command;
+    arg_handlers.list_capture_options = list_capture_options_command;
+
+    args_parser arg_parser;
+    if(!args_parser_parse(&arg_parser, argc, argv, &arg_handlers, NULL))
+        _exit(1);
+
+    //av_log_set_level(AV_LOG_TRACE);
+
+    const Arg *audio_input_arg = args_parser_get_arg(&arg_parser, "-a");
+    assert(audio_input_arg);
+
+    AudioDevices audio_devices;
+    if(audio_input_arg->num_values > 0)
+        audio_devices = get_pulseaudio_inputs();
+
+    std::vector<MergedAudioInputs> requested_audio_inputs = parse_audio_inputs(audio_devices, audio_input_arg);
+
+    const bool uses_app_audio = merged_audio_inputs_has_app_audio(requested_audio_inputs);
+    std::vector<std::string> app_audio_names;
+#ifdef GSR_APP_AUDIO
+    gsr_pipewire_audio pipewire_audio;
+    memset(&pipewire_audio, 0, sizeof(pipewire_audio));
+    if(uses_app_audio) {
+        if(!pulseaudio_server_is_pipewire()) {
+            fprintf(stderr, "gsr error: your sound server is not PipeWire. Application audio is only available when running PipeWire audio server\n");
+            _exit(2);
+        }
+
+        if(!gsr_pipewire_audio_init(&pipewire_audio)) {
+            fprintf(stderr, "gsr error: failed to setup PipeWire audio for application audio capture\n");
+            _exit(2);
+        }
+
+        gsr_pipewire_audio_for_each_app(&pipewire_audio, [](const char *app_name, void *userdata) {
+            std::vector<std::string> *app_audio_names = (std::vector<std::string>*)userdata;
+            app_audio_names->push_back(app_name);
+            return true;
+        }, &app_audio_names);
+    }
+#endif
+
+    validate_merged_audio_inputs_app_audio(requested_audio_inputs, app_audio_names);
+
+    const bool is_replaying = arg_parser.replay_buffer_size_secs != -1;
+    const bool is_portal_capture = strcmp(arg_parser.window, "portal") == 0;
+
+    bool wayland = false;
+    Display *dpy = XOpenDisplay(nullptr);
+    if (!dpy) {
+        wayland = true;
+        fprintf(stderr, "gsr warning: failed to connect to the X server. Assuming wayland is running without Xwayland\n");
+    }
+
+    XSetErrorHandler(x11_error_handler);
+    XSetIOErrorHandler(x11_io_error_handler);
+
+    if(!wayland)
+        wayland = is_xwayland(dpy);
+
+    if(!wayland && is_using_prime_run()) {
+        // Disable prime-run and similar options as they don't work; the monitor to capture has to be run on the same device.
+        // This is fine on wayland since nvidia uses drm interface there and the monitor query checks the monitors connected
+        // to the drm device.
+        fprintf(stderr, "gsr warning: use of prime-run on X11 is not supported. Disabling prime-run\n");
+        disable_prime_run();
+    }
+
+    gsr_window *window = gsr_window_create(dpy, wayland);
+    if(!window) {
+        fprintf(stderr, "gsr error: failed to create window\n");
+        _exit(1);
+    }
+
+    if(is_portal_capture && is_using_prime_run()) {
+        fprintf(stderr, "gsr warning: use of prime-run with -w portal option is currently not supported. Disabling prime-run\n");
+        disable_prime_run();
+    }
+
+    const bool is_monitor_capture = strcmp(arg_parser.window, "focused") != 0 && strcmp(arg_parser.window, "region") != 0 && !is_portal_capture && contains_non_hex_number(arg_parser.window);
+    gsr_egl egl;
+    if(!gsr_egl_load(&egl, window, is_monitor_capture, arg_parser.gl_debug)) {
+        fprintf(stderr, "gsr error: failed to load opengl\n");
+        _exit(1);
+    }
+
+    gsr_shader_enable_debug_output(arg_parser.gl_debug);
+#ifndef NDEBUG
+    gsr_shader_enable_debug_output(true);
+#endif
+
+    if(!args_parser_validate_with_gl_info(&arg_parser, &egl))
+        _exit(1);
+
+    egl.card_path[0] = '\0';
+    if(monitor_capture_use_drm(window, egl.gpu_info.vendor)) {
+        // TODO: Allow specifying another card, and in other places
+        if(!gsr_get_valid_card_path(&egl, egl.card_path, is_monitor_capture)) {
+            fprintf(stderr, "gsr error: no /dev/dri/cardX device found. Make sure that you have at least one monitor connected or record a single window instead on X11 or record with the -w portal option\n");
+            _exit(2);
+        }
+    }
+
+    // if(wayland && is_monitor_capture) {
+    //     fprintf(stderr, "gsr warning: it's not possible to sync video to recorded monitor exactly on wayland when recording a monitor."
+    //         " If you experience stutter in the video then record with portal capture option instead (-w portal) or use X11 instead\n");
+    // }
+
+    gsr_image_format image_format;
+    if(get_image_format_from_filename(arg_parser.filename, &image_format)) {
+        if(audio_input_arg->num_values > 0) {
+            fprintf(stderr, "gsr error: can't record audio (-a) when taking a screenshot\n");
+            _exit(1);
+        }
+
+        capture_image_to_file(arg_parser, &egl, image_format);
+        _exit(0);
+    }
+
+    AVFormatContext *av_format_context;
+    // The output format is automatically guessed by the file extension
+    avformat_alloc_output_context2(&av_format_context, nullptr, arg_parser.container_format, arg_parser.filename);
+    if (!av_format_context) {
+        if(arg_parser.container_format) {
+            fprintf(stderr, "gsr error: Container format '%s' (argument -c) is not valid\n", arg_parser.container_format);
+        } else {
+            fprintf(stderr, "gsr error: Failed to deduce container format from file extension. Use the '-c' option to specify container format\n");
+            args_parser_print_usage();
+            _exit(1);
+        }
+        _exit(1);
+    }
+
+    const AVOutputFormat *output_format = av_format_context->oformat;
+
+    std::string file_extension = output_format->extensions ? output_format->extensions : "";
+    {
+        size_t comma_index = file_extension.find(',');
+        if(comma_index != std::string::npos)
+            file_extension = file_extension.substr(0, comma_index);
+    }
+
+    const bool force_no_audio_offset = arg_parser.is_livestream || arg_parser.is_output_piped || (file_extension != "mp4" && file_extension != "mkv" && file_extension != "webm");
+    const double target_fps = 1.0 / (double)arg_parser.fps;
+
+    const bool uses_amix = merged_audio_inputs_should_use_amix(requested_audio_inputs);
+    arg_parser.audio_codec = select_audio_codec_with_fallback(arg_parser.audio_codec, file_extension, uses_amix);
+    bool low_power = false;
+    const AVCodec *video_codec_f = select_video_codec_with_fallback(&arg_parser.video_codec, file_extension.c_str(), arg_parser.video_encoder == GSR_VIDEO_ENCODER_HW_CPU, &egl, &low_power);
+
+    gsr_capture *capture = create_capture_impl(arg_parser, &egl, false);
+
     // (Some?) livestreaming services require at least one audio track to work.
     // If not audio is provided then create one silent audio track.
-    if(is_livestream && requested_audio_inputs.empty()) {
-        fprintf(stderr, "Info: live streaming but no audio track was added. Adding a silent audio track\n");
+    if(arg_parser.is_livestream && requested_audio_inputs.empty()) {
+        fprintf(stderr, "gsr info: live streaming but no audio track was added. Adding a silent audio track\n");
         MergedAudioInputs mai;
-        mai.audio_inputs.push_back({ "", "gsr-silent" });
+        mai.audio_inputs.push_back({""});
         requested_audio_inputs.push_back(std::move(mai));
     }
 
-    if(is_livestream && framerate_mode != FramerateMode::CONSTANT) {
-        fprintf(stderr, "Info: framerate mode was forcefully set to \"cfr\" because live streaming was detected\n");
-        framerate_mode = FramerateMode::CONSTANT;
-        framerate_mode_str = "cfr";
-    }
-
     AVStream *video_stream = nullptr;
     std::vector<AudioTrack> audio_tracks;
 
-    AVCodecContext *video_codec_context = create_video_codec_context(gpu_inf.vendor == GSR_GPU_VENDOR_NVIDIA ? AV_PIX_FMT_CUDA : AV_PIX_FMT_VAAPI, quality, fps, video_codec_f, is_livestream, gpu_inf.vendor, framerate_mode);
-    if(replay_buffer_size_secs == -1)
+    const enum AVPixelFormat video_pix_fmt = get_pixel_format(arg_parser.video_codec, egl.gpu_info.vendor, arg_parser.video_encoder == GSR_VIDEO_ENCODER_HW_CPU);
+    AVCodecContext *video_codec_context = create_video_codec_context(video_pix_fmt, video_codec_f, egl, arg_parser);
+    if(!is_replaying)
         video_stream = create_stream(av_format_context, video_codec_context);
 
-    int capture_result = gsr_capture_start(capture, video_codec_context);
+    if(arg_parser.tune == GSR_TUNE_QUALITY)
+        video_codec_context->max_b_frames = 2;
+
+    AVFrame *video_frame = av_frame_alloc();
+    if(!video_frame) {
+        fprintf(stderr, "gsr error: Failed to allocate video frame\n");
+        _exit(1);
+    }
+    video_frame->format = video_codec_context->pix_fmt;
+    video_frame->width = 0;
+    video_frame->height = 0;
+    video_frame->color_range = video_codec_context->color_range;
+    video_frame->color_primaries = video_codec_context->color_primaries;
+    video_frame->color_trc = video_codec_context->color_trc;
+    video_frame->colorspace = video_codec_context->colorspace;
+    video_frame->chroma_location = video_codec_context->chroma_sample_location;
+
+    gsr_capture_metadata capture_metadata;
+    capture_metadata.width = 0;
+    capture_metadata.height = 0;
+    capture_metadata.fps = arg_parser.fps;
+    capture_metadata.video_codec_context = video_codec_context;
+    capture_metadata.frame = video_frame;
+
+    int capture_result = gsr_capture_start(capture, &capture_metadata);
     if(capture_result != 0) {
         fprintf(stderr, "gsr error: gsr_capture_start failed\n");
         _exit(capture_result);
     }
 
-    open_video(video_codec_context, quality, very_old_gpu, gpu_inf.vendor, pixel_format);
-    if(video_stream)
+    video_codec_context->width = capture_metadata.width;
+    video_codec_context->height = capture_metadata.height;
+    video_frame->width = capture_metadata.width;
+    video_frame->height = capture_metadata.height;
+
+    const size_t estimated_replay_buffer_packets = calculate_estimated_replay_buffer_packets(arg_parser.replay_buffer_size_secs, arg_parser.fps, arg_parser.audio_codec, requested_audio_inputs);
+    gsr_encoder encoder;
+    if(!gsr_encoder_init(&encoder, arg_parser.replay_storage, estimated_replay_buffer_packets, arg_parser.replay_buffer_size_secs, arg_parser.filename)) {
+        fprintf(stderr, "gsr error: failed to create encoder\n");
+        _exit(1);
+    }
+
+    gsr_video_encoder *video_encoder = create_video_encoder(&egl, arg_parser);
+    if(!video_encoder) {
+        fprintf(stderr, "gsr error: failed to create video encoder\n");
+        _exit(1);
+    }
+
+    if(!gsr_video_encoder_start(video_encoder, video_codec_context, video_frame)) {
+        fprintf(stderr, "gsr error: failed to start video encoder\n");
+        _exit(1);
+    }
+
+    capture_metadata.width = video_codec_context->width;
+    capture_metadata.height = video_codec_context->height;
+
+    gsr_color_conversion_params color_conversion_params;
+    memset(&color_conversion_params, 0, sizeof(color_conversion_params));
+    color_conversion_params.color_range = arg_parser.color_range;
+    color_conversion_params.egl = &egl;
+    color_conversion_params.load_external_image_shader = gsr_capture_uses_external_image(capture);
+    gsr_video_encoder_get_textures(video_encoder, color_conversion_params.destination_textures, &color_conversion_params.num_destination_textures, &color_conversion_params.destination_color);
+
+    gsr_color_conversion color_conversion;
+    if(gsr_color_conversion_init(&color_conversion, &color_conversion_params) != 0) {
+        fprintf(stderr, "gsr error: main: failed to create color conversion\n");
+        _exit(1);
+    }
+
+    gsr_color_conversion_clear(&color_conversion);
+
+    if(arg_parser.video_encoder == GSR_VIDEO_ENCODER_HW_CPU) {
+        open_video_software(video_codec_context, arg_parser);
+    } else {
+        open_video_hardware(video_codec_context, low_power, egl, arg_parser);
+    }
+
+    if(video_stream) {
         avcodec_parameters_from_context(video_stream->codecpar, video_codec_context);
+        gsr_encoder_add_recording_destination(&encoder, video_codec_context, av_format_context, video_stream, 0);
+    }
 
+    int audio_max_frame_size = 1024;
     int audio_stream_index = VIDEO_STREAM_INDEX + 1;
     for(const MergedAudioInputs &merged_audio_inputs : requested_audio_inputs) {
-        AVCodecContext *audio_codec_context = create_audio_codec_context(fps, audio_codec);
+        const bool use_amix = audio_inputs_should_use_amix(merged_audio_inputs.audio_inputs);
+        AVCodecContext *audio_codec_context = create_audio_codec_context(arg_parser.fps, arg_parser.audio_codec, use_amix, arg_parser.audio_bitrate);
 
         AVStream *audio_stream = nullptr;
-        if(replay_buffer_size_secs == -1)
+        if(!is_replaying) {
             audio_stream = create_stream(av_format_context, audio_codec_context);
+            if(gsr_encoder_add_recording_destination(&encoder, audio_codec_context, av_format_context, audio_stream, 0) == (size_t)-1)
+                fprintf(stderr, "gsr error: added too many audio sources\n");
+        }
+
+        if(audio_stream && !merged_audio_inputs.track_name.empty())
+            av_dict_set(&audio_stream->metadata, "title", merged_audio_inputs.track_name.c_str(), 0);
 
         open_audio(audio_codec_context);
         if(audio_stream)
@@ -1997,118 +3259,85 @@ int main(int argc, char **argv) {
         std::vector<AVFilterContext*> src_filter_ctx;
         AVFilterGraph *graph = nullptr;
         AVFilterContext *sink = nullptr;
-        bool use_amix = merged_audio_inputs.audio_inputs.size() > 1;
         if(use_amix) {
             int err = init_filter_graph(audio_codec_context, &graph, &sink, src_filter_ctx, merged_audio_inputs.audio_inputs.size());
             if(err < 0) {
-                fprintf(stderr, "Error: failed to create audio filter\n");
+                fprintf(stderr, "gsr error: failed to create audio filter\n");
                 _exit(1);
             }
         }
 
         // TODO: Cleanup above
 
-        std::vector<AudioDevice> audio_devices;
-        for(size_t i = 0; i < merged_audio_inputs.audio_inputs.size(); ++i) {
-            auto &audio_input = merged_audio_inputs.audio_inputs[i];
-            AVFilterContext *src_ctx = nullptr;
-            if(use_amix)
-                src_ctx = src_filter_ctx[i];
-
-            AudioDevice audio_device;
-            audio_device.audio_input = audio_input;
-            audio_device.src_filter_ctx = src_ctx;
-
-            if(audio_input.name.empty()) {
-                audio_device.sound_device.handle = NULL;
-                audio_device.sound_device.frames = 0;
-                audio_device.frame = NULL;
-            } else {
-                if(sound_device_get_by_name(&audio_device.sound_device, audio_input.name.c_str(), audio_input.description.c_str(), num_channels, audio_codec_context->frame_size, audio_codec_context_get_audio_format(audio_codec_context)) != 0) {
-                    fprintf(stderr, "Error: failed to get \"%s\" sound device\n", audio_input.name.c_str());
-                    _exit(1);
-                }
-                audio_device.frame = create_audio_frame(audio_codec_context);
-            }
+        const double audio_fps = (double)audio_codec_context->sample_rate / (double)audio_codec_context->frame_size;
+        const double timeout_sec = 1000.0 / audio_fps / 1000.0;
 
-            audio_devices.push_back(std::move(audio_device));
+        const double audio_startup_time_seconds = force_no_audio_offset ? 0 : audio_codec_get_desired_delay(arg_parser.audio_codec, arg_parser.fps);// * ((double)audio_codec_context->frame_size / 1024.0);
+        const double num_audio_frames_shift = audio_startup_time_seconds / timeout_sec;
+
+        std::vector<AudioDeviceData> audio_track_audio_devices;
+        if(audio_inputs_has_app_audio(merged_audio_inputs.audio_inputs)) {
+            assert(!use_amix);
+#ifdef GSR_APP_AUDIO
+            audio_track_audio_devices.push_back(create_application_audio_audio_input(merged_audio_inputs, audio_codec_context, num_channels, num_audio_frames_shift, &pipewire_audio));
+#endif
+        } else {
+            audio_track_audio_devices = create_device_audio_inputs(merged_audio_inputs.audio_inputs, audio_codec_context, num_channels, num_audio_frames_shift, src_filter_ctx, use_amix);
         }
 
         AudioTrack audio_track;
+        audio_track.name = merged_audio_inputs.track_name;
         audio_track.codec_context = audio_codec_context;
-        audio_track.stream = audio_stream;
-        audio_track.audio_devices = std::move(audio_devices);
+        audio_track.audio_devices = std::move(audio_track_audio_devices);
         audio_track.graph = graph;
         audio_track.sink = sink;
         audio_track.stream_index = audio_stream_index;
+        audio_track.pts = -audio_codec_context->frame_size * num_audio_frames_shift;
         audio_tracks.push_back(std::move(audio_track));
         ++audio_stream_index;
-    }
 
-    //av_dump_format(av_format_context, 0, filename, 1);
-
-    if (replay_buffer_size_secs == -1 && !(output_format->flags & AVFMT_NOFILE)) {
-        int ret = avio_open(&av_format_context->pb, filename, AVIO_FLAG_WRITE);
-        if (ret < 0) {
-            fprintf(stderr, "Error: Could not open '%s': %s\n", filename, av_error_to_string(ret));
-            _exit(1);
-        }
+        audio_max_frame_size = std::max(audio_max_frame_size, audio_codec_context->frame_size);
     }
 
-    if(replay_buffer_size_secs == -1) {
-        AVDictionary *options = nullptr;
-        av_dict_set(&options, "strict", "experimental", 0);
-        //av_dict_set_int(&av_format_context->metadata, "video_full_range_flag", 1, 0);
+    //av_dump_format(av_format_context, 0, filename, 1);
 
-        int ret = avformat_write_header(av_format_context, &options);
-        if (ret < 0) {
-            fprintf(stderr, "Error occurred when writing header to output file: %s\n", av_error_to_string(ret));
+    if(!is_replaying) {
+        if(!av_open_file_write_header(av_format_context, arg_parser.filename))
             _exit(1);
-        }
-
-        av_dict_free(&options);
     }
 
-    const double start_time_pts = clock_get_monotonic_seconds();
-
-    double start_time = clock_get_monotonic_seconds(); // todo - target_fps to make first frame start immediately?
-    double frame_timer_start = start_time;
+    double fps_start_time = clock_get_monotonic_seconds();
+    //double frame_timer_start = fps_start_time;
     int fps_counter = 0;
+    int damage_fps_counter = 0;
 
-    AVFrame *frame = av_frame_alloc();
-    if (!frame) {
-        fprintf(stderr, "Error: Failed to allocate frame\n");
-        _exit(1);
-    }
-    frame->format = video_codec_context->pix_fmt;
-    frame->width = video_codec_context->width;
-    frame->height = video_codec_context->height;
-    frame->color_range = video_codec_context->color_range;
-    frame->color_primaries = video_codec_context->color_primaries;
-    frame->color_trc = video_codec_context->color_trc;
-    frame->colorspace = video_codec_context->colorspace;
-    frame->chroma_location = video_codec_context->chroma_sample_location;
+    bool paused = false;
+    double paused_time_offset = 0.0;
+    double paused_time_start = 0.0;
+    bool replay_recording = false;
+    RecordingStartResult replay_recording_start_result;
+    std::vector<size_t> replay_recording_items;
+    std::string replay_recording_filepath;
+    bool force_iframe_frame = false; // Only needed for video since audio frames are always iframes
 
-    std::mutex write_output_mutex;
     std::mutex audio_filter_mutex;
 
     const double record_start_time = clock_get_monotonic_seconds();
-    std::deque<std::shared_ptr<PacketData>> frame_data_queue;
-    bool frames_erased = false;
 
-    const size_t audio_buffer_size = 1024 * 4 * 2; // max 4 bytes/sample, 2 channels
+    const size_t audio_buffer_size = audio_max_frame_size * 4 * 2; // max 4 bytes/sample, 2 channels
     uint8_t *empty_audio = (uint8_t*)malloc(audio_buffer_size);
     if(!empty_audio) {
-        fprintf(stderr, "Error: failed to create empty audio\n");
+        fprintf(stderr, "gsr error: failed to create empty audio\n");
         _exit(1);
     }
     memset(empty_audio, 0, audio_buffer_size);
 
     for(AudioTrack &audio_track : audio_tracks) {
-        for(AudioDevice &audio_device : audio_track.audio_devices) {
+        for(AudioDeviceData &audio_device : audio_track.audio_devices) {
             audio_device.thread = std::thread([&]() mutable {
                 const AVSampleFormat sound_device_sample_format = audio_format_to_sample_format(audio_codec_context_get_audio_format(audio_track.codec_context));
-                const bool needs_audio_conversion = audio_track.codec_context->sample_fmt != sound_device_sample_format;
+                // TODO: Always do conversion for now. This fixes issue with stuttering audio on pulseaudio with opus + multiple audio sources merged
+                const bool needs_audio_conversion = true;//audio_track.codec_context->sample_fmt != sound_device_sample_format;
                 SwrContext *swr = nullptr;
                 if(needs_audio_conversion) {
                     swr = swr_alloc();
@@ -2116,8 +3345,16 @@ int main(int argc, char **argv) {
                         fprintf(stderr, "Failed to create SwrContext\n");
                         _exit(1);
                     }
-                    av_opt_set_int(swr, "in_channel_layout", AV_CH_LAYOUT_STEREO, 0);
-                    av_opt_set_int(swr, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
+                    #if LIBAVUTIL_VERSION_MAJOR <= 56
+                    av_opt_set_channel_layout(swr, "in_channel_layout", AV_CH_LAYOUT_STEREO, 0);
+                    av_opt_set_channel_layout(swr, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
+                    #elif LIBAVUTIL_VERSION_MAJOR >= 59
+                    av_opt_set_chlayout(swr, "in_chlayout", &audio_track.codec_context->ch_layout, 0);
+                    av_opt_set_chlayout(swr, "out_chlayout", &audio_track.codec_context->ch_layout, 0);
+                    #else
+                    av_opt_set_chlayout(swr, "in_channel_layout", &audio_track.codec_context->ch_layout, 0);
+                    av_opt_set_chlayout(swr, "out_channel_layout", &audio_track.codec_context->ch_layout, 0);
+                    #endif
                     av_opt_set_int(swr, "in_sample_rate", audio_track.codec_context->sample_rate, 0);
                     av_opt_set_int(swr, "out_sample_rate", audio_track.codec_context->sample_rate, 0);
                     av_opt_set_sample_fmt(swr, "in_sample_fmt", sound_device_sample_format, 0);
@@ -2125,21 +3362,36 @@ int main(int argc, char **argv) {
                     swr_init(swr);
                 }
 
-                const double target_audio_hz = 1.0 / (double)audio_track.codec_context->sample_rate;
-                double received_audio_time = clock_get_monotonic_seconds();
-                const int64_t timeout_ms = std::round((1000.0 / (double)audio_track.codec_context->sample_rate) * 1000.0);
-                int64_t prev_pts = 0;
+                const double audio_fps = (double)audio_track.codec_context->sample_rate / (double)audio_track.codec_context->frame_size;
+                const int64_t timeout_ms = std::round(1000.0 / audio_fps);
+                const double timeout_sec = 1000.0 / audio_fps / 1000.0;
+                bool first_frame = true;
+                int64_t num_received_frames = 0;
 
                 while(running) {
                     void *sound_buffer;
                     int sound_buffer_size = -1;
-                    if(audio_device.sound_device.handle)
-                        sound_buffer_size = sound_device_read_next_chunk(&audio_device.sound_device, &sound_buffer);
+                    //const double time_before_read_seconds = clock_get_monotonic_seconds();
+                    if(audio_device.sound_device.handle) {
+                        // TODO: use this instead of calculating time to read. But this can fluctuate and we don't want to go back in time,
+                        // also it's 0.0 for some users???
+                        double latency_seconds = 0.0;
+                        sound_buffer_size = sound_device_read_next_chunk(&audio_device.sound_device, &sound_buffer, timeout_sec * 2.0, &latency_seconds);
+                    }
+
                     const bool got_audio_data = sound_buffer_size >= 0;
+                    //fprintf(stderr, "got audio data: %s\n", got_audio_data ? "yes" : "no");
+                    //const double time_after_read_seconds = clock_get_monotonic_seconds();
+                    //const double time_to_read_seconds = time_after_read_seconds - time_before_read_seconds;
+                    //fprintf(stderr, "time to read: %f, %s, %f\n", time_to_read_seconds, got_audio_data ? "yes" : "no", timeout_sec);
+                    const double this_audio_frame_time = clock_get_monotonic_seconds() - paused_time_offset;
 
-                    const double this_audio_frame_time = clock_get_monotonic_seconds();
-                    if(got_audio_data)
-                        received_audio_time = this_audio_frame_time;
+                    if(paused) {
+                        if(!audio_device.sound_device.handle)
+                            av_usleep(timeout_ms * 1000);
+
+                        continue;
+                    }
 
                     int ret = av_frame_make_writable(audio_device.frame);
                     if (ret < 0) {
@@ -2148,28 +3400,32 @@ int main(int argc, char **argv) {
                     }
 
                     // TODO: Is this |received_audio_time| really correct?
-                    int64_t num_missing_frames = std::round((this_audio_frame_time - received_audio_time) / target_audio_hz / (int64_t)audio_device.frame->nb_samples);
+                    const int64_t num_expected_frames = std::round((this_audio_frame_time - record_start_time) / timeout_sec);
+                    int64_t num_missing_frames = std::max((int64_t)0LL, num_expected_frames - num_received_frames);
+
                     if(got_audio_data)
-                        num_missing_frames = std::max((int64_t)0, num_missing_frames - 1);
+                        num_missing_frames = std::max((int64_t)0LL, num_missing_frames - 1);
 
                     if(!audio_device.sound_device.handle)
                         num_missing_frames = std::max((int64_t)1, num_missing_frames);
 
-                    // Jesus is there a better way to do this? I JUST WANT TO KEEP VIDEO AND AUDIO SYNCED HOLY FUCK I WANT TO KILL MYSELF NOW.
+                    // Fucking hell is there a better way to do this? I JUST WANT TO KEEP VIDEO AND AUDIO SYNCED HOLY FUCK I WANT TO KILL MYSELF NOW.
                     // THIS PIECE OF SHIT WANTS EMPTY FRAMES OTHERWISE VIDEO PLAYS TOO FAST TO KEEP UP WITH AUDIO OR THE AUDIO PLAYS TOO EARLY.
                     // BUT WE CANT USE DELAYS TO GIVE DUMMY DATA BECAUSE PULSEAUDIO MIGHT GIVE AUDIO A BIG DELAYED!!!
                     // This garbage is needed because we want to produce constant frame rate videos instead of variable frame rate
                     // videos because bad software such as video editing software and VLC do not support variable frame rate software,
                     // despite nvidia shadowplay and xbox game bar producing variable frame rate videos.
                     // So we have to make sure we produce frames at the same relative rate as the video.
-                    if(num_missing_frames >= 5 || !audio_device.sound_device.handle) {
+                    if((num_missing_frames >= 1 && got_audio_data) || num_missing_frames >= 5 || !audio_device.sound_device.handle) {
                         // TODO:
                         //audio_track.frame->data[0] = empty_audio;
-                        received_audio_time = this_audio_frame_time;
-                        if(needs_audio_conversion)
-                            swr_convert(swr, &audio_device.frame->data[0], audio_device.frame->nb_samples, (const uint8_t**)&empty_audio, audio_track.codec_context->frame_size);
-                        else
-                            audio_device.frame->data[0] = empty_audio;
+                        if(first_frame || num_missing_frames >= 5) {
+                            if(needs_audio_conversion)
+                                swr_convert(swr, &audio_device.frame->data[0], audio_track.codec_context->frame_size, (const uint8_t**)&empty_audio, audio_track.codec_context->frame_size);
+                            else
+                                audio_device.frame->data[0] = empty_audio;
+                        }
+                        first_frame = false;
 
                         // TODO: Check if duplicate frame can be saved just by writing it with a different pts instead of sending it again
                         std::lock_guard<std::mutex> lock(audio_filter_mutex);
@@ -2177,57 +3433,55 @@ int main(int argc, char **argv) {
                             if(audio_track.graph) {
                                 // TODO: av_buffersrc_add_frame
                                 if(av_buffersrc_write_frame(audio_device.src_filter_ctx, audio_device.frame) < 0) {
-                                    fprintf(stderr, "Error: failed to add audio frame to filter\n");
+                                    fprintf(stderr, "gsr error: failed to add audio frame to filter\n");
                                 }
                             } else {
-                                audio_device.frame->pts = (this_audio_frame_time - record_start_time) * (double)AV_TIME_BASE;
-                                const bool same_pts = audio_device.frame->pts == prev_pts;
-                                prev_pts = audio_device.frame->pts;
-                                if(same_pts)
-                                    continue;
-
                                 ret = avcodec_send_frame(audio_track.codec_context, audio_device.frame);
                                 if(ret >= 0) {
                                     // TODO: Move to separate thread because this could write to network (for example when livestreaming)
-                                    receive_frames(audio_track.codec_context, audio_track.stream_index, audio_track.stream, audio_device.frame->pts, av_format_context, record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, write_output_mutex);
+                                    gsr_encoder_receive_packets(&encoder, audio_track.codec_context, audio_device.frame->pts, audio_track.stream_index);
                                 } else {
                                     fprintf(stderr, "Failed to encode audio!\n");
                                 }
+                                audio_track.pts += audio_track.codec_context->frame_size;
                             }
+
+                            audio_device.frame->pts += audio_track.codec_context->frame_size;
+                            num_received_frames++;
                         }
                     }
 
                     if(!audio_device.sound_device.handle)
-                        usleep(timeout_ms * 1000);
+                        av_usleep(timeout_ms * 1000);
 
                     if(got_audio_data) {
                         // TODO: Instead of converting audio, get float audio from alsa. Or does alsa do conversion internally to get this format?
                         if(needs_audio_conversion)
-                            swr_convert(swr, &audio_device.frame->data[0], audio_device.frame->nb_samples, (const uint8_t**)&sound_buffer, audio_track.codec_context->frame_size);
+                            swr_convert(swr, &audio_device.frame->data[0], audio_track.codec_context->frame_size, (const uint8_t**)&sound_buffer, audio_track.codec_context->frame_size);
                         else
                             audio_device.frame->data[0] = (uint8_t*)sound_buffer;
+                        first_frame = false;
 
-                        audio_device.frame->pts = (this_audio_frame_time - record_start_time) * (double)AV_TIME_BASE;
-                        const bool same_pts = audio_device.frame->pts == prev_pts;
-                        prev_pts = audio_device.frame->pts;
-                        if(same_pts)
-                            continue;
+                        std::lock_guard<std::mutex> lock(audio_filter_mutex);
 
                         if(audio_track.graph) {
-                            std::lock_guard<std::mutex> lock(audio_filter_mutex);
                             // TODO: av_buffersrc_add_frame
                             if(av_buffersrc_write_frame(audio_device.src_filter_ctx, audio_device.frame) < 0) {
-                                fprintf(stderr, "Error: failed to add audio frame to filter\n");
+                                fprintf(stderr, "gsr error: failed to add audio frame to filter\n");
                             }
                         } else {
                             ret = avcodec_send_frame(audio_track.codec_context, audio_device.frame);
                             if(ret >= 0) {
                                 // TODO: Move to separate thread because this could write to network (for example when livestreaming)
-                                receive_frames(audio_track.codec_context, audio_track.stream_index, audio_track.stream, audio_device.frame->pts, av_format_context, record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, write_output_mutex);
+                                gsr_encoder_receive_packets(&encoder, audio_track.codec_context, audio_device.frame->pts, audio_track.stream_index);
                             } else {
                                 fprintf(stderr, "Failed to encode audio!\n");
                             }
+                            audio_track.pts += audio_track.codec_context->frame_size;
                         }
+
+                        audio_device.frame->pts += audio_track.codec_context->frame_size;
+                        num_received_frames++;
                     }
                 }
 
@@ -2237,161 +3491,383 @@ int main(int argc, char **argv) {
         }
     }
 
+    std::thread amix_thread;
+    if(uses_amix) {
+        amix_thread = std::thread([&]() {
+            AVFrame *aframe = av_frame_alloc();
+            while(running) {
+                {
+                    std::lock_guard<std::mutex> lock(audio_filter_mutex);
+                    for(AudioTrack &audio_track : audio_tracks) {
+                        if(!audio_track.sink)
+                            continue;
+
+                        int err = 0;
+                        while ((err = av_buffersink_get_frame(audio_track.sink, aframe)) >= 0) {
+                            aframe->pts = audio_track.pts;
+                            err = avcodec_send_frame(audio_track.codec_context, aframe);
+                            if(err >= 0){
+                                // TODO: Move to separate thread because this could write to network (for example when livestreaming)
+                                gsr_encoder_receive_packets(&encoder, audio_track.codec_context, aframe->pts, audio_track.stream_index);
+                            } else {
+                                fprintf(stderr, "Failed to encode audio!\n");
+                            }
+                            av_frame_unref(aframe);
+                            audio_track.pts += audio_track.codec_context->frame_size;
+                        }
+                    }
+                }
+                av_usleep(5 * 1000); // 5 milliseconds
+            }
+            av_frame_free(&aframe);
+        });
+    }
+
     // Set update_fps to 24 to test if duplicate/delayed frames cause video/audio desync or too fast/slow video.
-    const double update_fps = fps + 190;
+    //const double update_fps = fps + 190;
     bool should_stop_error = false;
 
-    AVFrame *aframe = av_frame_alloc();
-
     int64_t video_pts_counter = 0;
     int64_t video_prev_pts = 0;
-    int64_t audio_prev_pts = 0;
+
+    bool hdr_metadata_set = false;
+    const bool hdr = video_codec_is_hdr(arg_parser.video_codec);
+
+    double damage_timeout_seconds = arg_parser.framerate_mode == GSR_FRAMERATE_MODE_CONTENT ? 0.5 : 0.1;
+    damage_timeout_seconds = std::max(damage_timeout_seconds, target_fps);
+
+    bool use_damage_tracking = false;
+    gsr_damage damage;
+    memset(&damage, 0, sizeof(damage));
+    if(gsr_window_get_display_server(window) == GSR_DISPLAY_SERVER_X11) {
+        gsr_damage_init(&damage, &egl, arg_parser.record_cursor);
+        use_damage_tracking = true;
+    }
+
+    if(is_monitor_capture)
+        gsr_damage_set_target_monitor(&damage, arg_parser.window);
+
+    double last_capture_seconds = record_start_time;
+    bool wait_until_frame_time_elapsed = false;
 
     while(running) {
-        double frame_start = clock_get_monotonic_seconds();
+        const double frame_start = clock_get_monotonic_seconds();
+
+        while(gsr_window_process_event(window)) {
+            gsr_damage_on_event(&damage, gsr_window_get_event_data(window));
+            gsr_capture_on_event(capture, &egl);
+        }
+        gsr_damage_tick(&damage);
+        gsr_capture_tick(capture);
+
+        if(!is_monitor_capture) {
+            Window damage_target_window = 0;
+            if(capture->get_window_id)
+                damage_target_window = capture->get_window_id(capture);
+
+            if(damage_target_window != 0)
+                gsr_damage_set_target_window(&damage, damage_target_window);
+        }
 
-        gsr_capture_tick(capture, video_codec_context, &frame);
         should_stop_error = false;
         if(gsr_capture_should_stop(capture, &should_stop_error)) {
             running = 0;
             break;
         }
+
+        bool damaged = false;
+        if(use_damage_tracking)
+            damaged = gsr_damage_is_damaged(&damage);
+        else if(capture->is_damaged)
+            damaged = capture->is_damaged(capture);
+        else
+            damaged = true;
+
+        // TODO: Re-add wayland sync warning when removing this
+        if(arg_parser.framerate_mode != GSR_FRAMERATE_MODE_CONTENT)
+            damaged = true;
+
+        if(damaged)
+            ++damage_fps_counter;
+
         ++fps_counter;
+        const double time_now = clock_get_monotonic_seconds();
+        //const double frame_timer_elapsed = time_now - frame_timer_start;
+        const double elapsed = time_now - fps_start_time;
+        if (elapsed >= 1.0) {
+            if(arg_parser.verbose) {
+                fprintf(stderr, "update fps: %d, damage fps: %d\n", fps_counter, damage_fps_counter);
+            }
+            fps_start_time = time_now;
+            fps_counter = 0;
+            damage_fps_counter = 0;
+        }
 
-        {
-            std::lock_guard<std::mutex> lock(audio_filter_mutex);
-            for(AudioTrack &audio_track : audio_tracks) {
-                if(!audio_track.sink)
-                    continue;
+        const double this_video_frame_time = clock_get_monotonic_seconds() - paused_time_offset;
+        const double time_since_last_frame_captured_seconds = this_video_frame_time - last_capture_seconds;
+        double frame_time_overflow = time_since_last_frame_captured_seconds - target_fps;
+        const bool frame_timeout = frame_time_overflow >= 0.0;
+
+        bool force_frame_capture = wait_until_frame_time_elapsed && frame_timeout;
+        bool allow_capture = !wait_until_frame_time_elapsed || force_frame_capture;
+        if(arg_parser.framerate_mode == GSR_FRAMERATE_MODE_CONTENT) {
+            force_frame_capture = false;
+            allow_capture = frame_timeout;
+        }
 
-                int err = 0;
-                while ((err = av_buffersink_get_frame(audio_track.sink, aframe)) >= 0) {
-                    const double this_audio_frame_time = clock_get_monotonic_seconds();
-                    aframe->pts = (this_audio_frame_time - record_start_time) * (double)AV_TIME_BASE;
-                    const bool same_pts = aframe->pts == audio_prev_pts;
-                    audio_prev_pts = aframe->pts;
-                    if(same_pts) {
-                        av_frame_unref(aframe);
+        bool frame_captured = false;
+        if((damaged || force_frame_capture) && allow_capture && !paused) {
+            frame_captured = true;
+            frame_time_overflow = std::min(std::max(0.0, frame_time_overflow), target_fps);
+            last_capture_seconds = this_video_frame_time - frame_time_overflow;
+            wait_until_frame_time_elapsed = false;
+
+            gsr_damage_clear(&damage);
+            if(capture->clear_damage)
+                capture->clear_damage(capture);
+
+            // TODO: Don't do this if no damage?
+            egl.glClear(0);
+            gsr_capture_capture(capture, &capture_metadata, &color_conversion);
+            gsr_egl_swap_buffers(&egl);
+            gsr_video_encoder_copy_textures_to_frame(video_encoder, video_frame, &color_conversion);
+
+            if(hdr && !hdr_metadata_set && !is_replaying && add_hdr_metadata_to_video_stream(capture, video_stream))
+                hdr_metadata_set = true;
+
+            const int64_t expected_frames = std::round((this_video_frame_time - record_start_time) / target_fps);
+            const int num_missed_frames = std::max((int64_t)1LL, expected_frames - video_pts_counter);
+
+            // TODO: Check if duplicate frame can be saved just by writing it with a different pts instead of sending it again
+            const int num_frames_to_encode = arg_parser.framerate_mode == GSR_FRAMERATE_MODE_CONSTANT ? num_missed_frames : 1;
+            for(int i = 0; i < num_frames_to_encode; ++i) {
+                if(arg_parser.framerate_mode == GSR_FRAMERATE_MODE_CONSTANT) {
+                    video_frame->pts = video_pts_counter + i;
+                } else {
+                    video_frame->pts = (this_video_frame_time - record_start_time) * (double)AV_TIME_BASE;
+                    const bool same_pts = video_frame->pts == video_prev_pts;
+                    video_prev_pts = video_frame->pts;
+                    if(same_pts)
                         continue;
-                    }
+                }
 
-                    err = avcodec_send_frame(audio_track.codec_context, aframe);
-                    if(err >= 0){
-                        // TODO: Move to separate thread because this could write to network (for example when livestreaming)
-                        receive_frames(audio_track.codec_context, audio_track.stream_index, audio_track.stream, aframe->pts, av_format_context, record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, write_output_mutex);
-                    } else {
-                        fprintf(stderr, "Failed to encode audio!\n");
-                    }
-                    av_frame_unref(aframe);
+                if(force_iframe_frame) {
+                    video_frame->pict_type = AV_PICTURE_TYPE_I;
                 }
-            }
-        }
 
-        double time_now = clock_get_monotonic_seconds();
-        double frame_timer_elapsed = time_now - frame_timer_start;
-        double elapsed = time_now - start_time;
-        if (elapsed >= 1.0) {
-            if(verbose) {
-                fprintf(stderr, "update fps: %d\n", fps_counter);
+                int ret = avcodec_send_frame(video_codec_context, video_frame);
+                if(ret == 0) {
+                    // TODO: Move to separate thread because this could write to network (for example when livestreaming)
+                    gsr_encoder_receive_packets(&encoder, video_codec_context, video_frame->pts, VIDEO_STREAM_INDEX);
+                } else {
+                    fprintf(stderr, "gsr error: avcodec_send_frame failed, error: %s\n", av_error_to_string(ret));
+                }
+
+                if(force_iframe_frame) {
+                    force_iframe_frame = false;
+                    video_frame->pict_type = AV_PICTURE_TYPE_NONE;
+                }
             }
-            start_time = time_now;
-            fps_counter = 0;
+
+            video_pts_counter += num_frames_to_encode;
         }
 
-        double frame_time_overflow = frame_timer_elapsed - target_fps;
-        if (frame_time_overflow >= 0.0) {
-            frame_time_overflow = std::min(frame_time_overflow, target_fps);
-            frame_timer_start = time_now - frame_time_overflow;
+        if(toggle_pause == 1 && !is_replaying) {
+            const bool new_paused_state = !paused;
+            if(new_paused_state) {
+                paused_time_start = clock_get_monotonic_seconds();
+                fprintf(stderr, "Paused\n");
+            } else {
+                paused_time_offset += (clock_get_monotonic_seconds() - paused_time_start);
+                fprintf(stderr, "Unpaused\n");
+            }
 
-            const double this_video_frame_time = clock_get_monotonic_seconds();
-            const int64_t expected_frames = std::round((this_video_frame_time - start_time_pts) / target_fps);
-            const int num_frames = framerate_mode == FramerateMode::CONSTANT ? std::max((int64_t)0LL, expected_frames - video_pts_counter) : 1;
+            toggle_pause = 0;
+            paused = !paused;
+        }
 
-            if(num_frames > 0) {
-                gsr_capture_capture(capture, frame);
+        if(toggle_replay_recording && !arg_parser.replay_recording_directory) {
+            toggle_replay_recording = 0;
+            printf("gsr error: Unable to start recording since the -ro option was not specified\n");
+            fflush(stdout);
+        }
 
-                // TODO: Check if duplicate frame can be saved just by writing it with a different pts instead of sending it again
-                for(int i = 0; i < num_frames; ++i) {
-                    if(framerate_mode == FramerateMode::CONSTANT) {
-                        frame->pts = video_pts_counter + i;
-                    } else {
-                        frame->pts = (this_video_frame_time - record_start_time) * (double)AV_TIME_BASE;
-                        const bool same_pts = frame->pts == video_prev_pts;
-                        video_prev_pts = frame->pts;
-                        if(same_pts)
-                            continue;
+        if(toggle_replay_recording && arg_parser.replay_recording_directory) {
+            toggle_replay_recording = 0;
+            const bool new_replay_recording_state = !replay_recording;
+            if(new_replay_recording_state) {
+                std::lock_guard<std::mutex> lock(audio_filter_mutex);
+                replay_recording_items.clear();
+                replay_recording_filepath = create_new_recording_filepath_from_timestamp(arg_parser.replay_recording_directory, "Video", file_extension, arg_parser.date_folders);
+                replay_recording_start_result = start_recording_create_streams(replay_recording_filepath.c_str(), arg_parser.container_format, video_codec_context, audio_tracks, hdr, capture);
+                if(replay_recording_start_result.av_format_context) {
+                    const size_t video_recording_destination_id = gsr_encoder_add_recording_destination(&encoder, video_codec_context, replay_recording_start_result.av_format_context, replay_recording_start_result.video_stream, video_frame->pts);
+                    if(video_recording_destination_id != (size_t)-1)
+                        replay_recording_items.push_back(video_recording_destination_id);
+
+                    for(const auto &audio_input : replay_recording_start_result.audio_inputs) {
+                        const size_t audio_recording_destination_id = gsr_encoder_add_recording_destination(&encoder, audio_input.audio_track->codec_context, replay_recording_start_result.av_format_context, audio_input.stream, audio_input.audio_track->pts);
+                        if(audio_recording_destination_id != (size_t)-1)
+                            replay_recording_items.push_back(audio_recording_destination_id);
                     }
 
-                    int ret = avcodec_send_frame(video_codec_context, frame);
-                    if(ret == 0) {
-                        // TODO: Move to separate thread because this could write to network (for example when livestreaming)
-                        receive_frames(video_codec_context, VIDEO_STREAM_INDEX, video_stream, frame->pts, av_format_context,
-                            record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, write_output_mutex);
-                    } else {
-                        fprintf(stderr, "Error: avcodec_send_frame failed, error: %s\n", av_error_to_string(ret));
-                    }
+                    replay_recording = true;
+                    force_iframe_frame = true;
+                    fprintf(stderr, "Started recording\n");
+                } else {
+                    printf("gsr error: Failed to start recording\n");
+                    fflush(stdout);
+                }
+            } else if(replay_recording_start_result.av_format_context) {
+                for(size_t id : replay_recording_items) {
+                    gsr_encoder_remove_recording_destination(&encoder, id);
+                }
+                replay_recording_items.clear();
+
+                if(stop_recording_close_streams(replay_recording_start_result.av_format_context)) {
+                    fprintf(stderr, "Stopped recording\n");
+                    puts(replay_recording_filepath.c_str());
+                    fflush(stdout);
+                    if(arg_parser.recording_saved_script)
+                        run_recording_saved_script_async(arg_parser.recording_saved_script, replay_recording_filepath.c_str(), "regular");
+                } else {
+                    printf("gsr error: Failed to save recording\n");
+                    fflush(stdout);
                 }
 
-                gsr_capture_end(capture, frame);
-                video_pts_counter += num_frames;
+                replay_recording_start_result = RecordingStartResult{};
+                replay_recording = false;
+                replay_recording_filepath.clear();
             }
         }
 
         if(save_replay_thread.valid() && save_replay_thread.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
             save_replay_thread.get();
-            puts(save_replay_output_filepath.c_str());
-            fflush(stdout);
-            std::lock_guard<std::mutex> lock(write_output_mutex);
-            save_replay_packets.clear();
+            if(save_replay_output_filepath.empty()) {
+                printf("gsr error: Failed to save replay\n");
+                fflush(stdout);
+            } else {
+                puts(save_replay_output_filepath.c_str());
+                fflush(stdout);
+                if(arg_parser.recording_saved_script)
+                    run_recording_saved_script_async(arg_parser.recording_saved_script, save_replay_output_filepath.c_str(), "replay");
+            }
         }
 
-        if(save_replay == 1 && !save_replay_thread.valid() && replay_buffer_size_secs != -1) {
-            save_replay = 0;
-            save_replay_async(video_codec_context, VIDEO_STREAM_INDEX, audio_tracks, frame_data_queue, frames_erased, filename, container_format, file_extension, write_output_mutex, make_folders);
+        if(save_replay_seconds != 0 && !save_replay_thread.valid() && is_replaying) {
+            int current_save_replay_seconds = save_replay_seconds;
+            if(current_save_replay_seconds > 0)
+                current_save_replay_seconds += arg_parser.keyint;
+
+            save_replay_seconds = 0;
+            save_replay_output_filepath.clear();
+            save_replay_async(video_codec_context, VIDEO_STREAM_INDEX, audio_tracks, encoder.replay_buffer, arg_parser.filename, arg_parser.container_format, file_extension, arg_parser.date_folders, hdr, capture, current_save_replay_seconds);
+
+            if(arg_parser.restart_replay_on_save && current_save_replay_seconds == save_replay_seconds_full)
+                gsr_replay_buffer_clear(encoder.replay_buffer);
         }
 
-        double frame_end = clock_get_monotonic_seconds();
-        double frame_sleep_fps = 1.0 / update_fps;
-        double sleep_time = frame_sleep_fps - (frame_end - frame_start);
-        if(sleep_time > 0.0)
-            usleep(sleep_time * 1000.0 * 1000.0);
+        const double frame_end = clock_get_monotonic_seconds();
+        const double time_at_frame_end = frame_end - paused_time_offset;
+        const double time_elapsed_total = time_at_frame_end - record_start_time;
+        const int64_t frames_elapsed = (int64_t)(time_elapsed_total / target_fps);
+        const double time_at_next_frame = (frames_elapsed + 1) * target_fps;
+        double time_to_next_frame = time_at_next_frame - time_elapsed_total;
+        if(time_to_next_frame > target_fps*1.1)
+            time_to_next_frame = target_fps;
+
+        const double frame_time = frame_end - frame_start;
+        const bool frame_deadline_missed = frame_time > target_fps;
+        if(time_to_next_frame >= 0.0 && !frame_deadline_missed && frame_captured)
+            av_usleep(time_to_next_frame * 1000.0 * 1000.0);
+        else {
+            if(paused)
+                av_usleep(20.0 * 1000.0); // 20 milliseconds
+            else if(frame_deadline_missed)
+            {}
+            else if(arg_parser.framerate_mode == GSR_FRAMERATE_MODE_CONTENT || !frame_captured)
+                av_usleep(2.8 * 1000.0); // 2.8 milliseconds
+            else if(!frame_captured)
+                av_usleep(1.0 * 1000.0); // 1 millisecond
+            wait_until_frame_time_elapsed = true;
+        }
     }
 
     running = 0;
 
     if(save_replay_thread.valid()) {
         save_replay_thread.get();
-        puts(save_replay_output_filepath.c_str());
-        fflush(stdout);
-        std::lock_guard<std::mutex> lock(write_output_mutex);
-        save_replay_packets.clear();
+        if(save_replay_output_filepath.empty()) {
+            // TODO: Output failed to save
+        } else {
+            puts(save_replay_output_filepath.c_str());
+            fflush(stdout);
+            if(arg_parser.recording_saved_script)
+                run_recording_saved_script_async(arg_parser.recording_saved_script, save_replay_output_filepath.c_str(), "replay");
+        }
+    }
+
+    if(replay_recording_start_result.av_format_context) {
+        for(size_t id : replay_recording_items) {
+            gsr_encoder_remove_recording_destination(&encoder, id);
+        }
+        replay_recording_items.clear();
+
+        if(stop_recording_close_streams(replay_recording_start_result.av_format_context)) {
+            fprintf(stderr, "Stopped recording\n");
+            puts(replay_recording_filepath.c_str());
+            fflush(stdout);
+            if(arg_parser.recording_saved_script)
+                run_recording_saved_script_async(arg_parser.recording_saved_script, replay_recording_filepath.c_str(), "regular");
+        } else {
+            printf("gsr error: Failed to save recording\n");
+            fflush(stdout);
+        }
     }
 
     for(AudioTrack &audio_track : audio_tracks) {
-        for(AudioDevice &audio_device : audio_track.audio_devices) {
+        for(auto &audio_device : audio_track.audio_devices) {
             audio_device.thread.join();
             sound_device_close(&audio_device.sound_device);
         }
     }
 
-    av_frame_free(&aframe);
+    if(amix_thread.joinable())
+        amix_thread.join();
 
-    if (replay_buffer_size_secs == -1 && av_write_trailer(av_format_context) != 0) {
+    // TODO: Replace this with start_recording_create_streams
+    if(!is_replaying && av_write_trailer(av_format_context) != 0) {
         fprintf(stderr, "Failed to write trailer\n");
     }
 
-    if(replay_buffer_size_secs == -1 && !(output_format->flags & AVFMT_NOFILE))
+    if(!is_replaying) {
         avio_close(av_format_context->pb);
+        avformat_free_context(av_format_context);
+    }
+
+    gsr_damage_deinit(&damage);
+    gsr_color_conversion_deinit(&color_conversion);
+    gsr_video_encoder_destroy(video_encoder, video_codec_context);
+    gsr_encoder_deinit(&encoder);
+    gsr_capture_destroy(capture);
+#ifdef GSR_APP_AUDIO
+    gsr_pipewire_audio_deinit(&pipewire_audio);
+#endif
 
-    gsr_capture_destroy(capture, video_codec_context);
+    if(!is_replaying && arg_parser.recording_saved_script)
+        run_recording_saved_script_async(arg_parser.recording_saved_script, arg_parser.filename, "regular");
 
     if(dpy) {
         // TODO: This causes a crash, why? maybe some other library dlclose xlib and that also happened to unload this???
         //XCloseDisplay(dpy);
     }
 
-    free((void*)window_str);
+    //gsr_egl_unload(&egl);
+    //gsr_window_destroy(&window);
+
+    //av_frame_free(&video_frame);
     free(empty_audio);
+    args_parser_deinit(&arg_parser);
     // We do an _exit here because cuda uses at_exit to do _something_ that causes the program to freeze,
     // but only on some nvidia driver versions on some gpus (RTX?), and _exit exits the program without calling
     // the at_exit registered functions.
diff --git a/src/overclock.c b/src/overclock.c
index 9160a04..df2ae66 100644
--- a/src/overclock.c
+++ b/src/overclock.c
@@ -2,13 +2,12 @@
 #include <X11/Xlib.h>
 #include <stdio.h>
 #include <string.h>
+#include <stdlib.h>
 
-// HACK!!!: When a program uses cuda (including nvenc) then the nvidia driver drops to performance level 2 (memory transfer rate is dropped and possibly graphics clock).
+// HACK!!!: When a program uses cuda (including nvenc) then the nvidia driver drops to max performance level - 1 (memory transfer rate is dropped and possibly graphics clock).
 // Nvidia does this because in some very extreme cases of cuda there can be memory corruption when running at max memory transfer rate.
 // So to get around this we overclock memory transfer rate (maybe this should also be done for graphics clock?) to the best performance level while GPU Screen Recorder is running.
 
-// TODO: Does it always drop to performance level 2?
-
 static int min_int(int a, int b) {
     return a < b ? a : b;
 }
@@ -76,13 +75,13 @@ static unsigned int attribute_type_to_attribute_param_all_levels(NvCTRLAttribute
 }
 
 // Returns 0 on error
-static int xnvctrl_get_attribute_max_value(gsr_xnvctrl *xnvctrl, const NVCTRLPerformanceLevelQuery *query, NvCTRLAttributeType attribute_type) {
+static int xnvctrl_get_attribute_max_value(gsr_xnvctrl *xnvctrl, int num_performance_levels, NvCTRLAttributeType attribute_type) {
     NVCTRLAttributeValidValuesRec valid;
     if(xnvctrl->XNVCTRLQueryValidTargetAttributeValues(xnvctrl->display, NV_CTRL_TARGET_TYPE_GPU, 0, 0, attribute_type_to_attribute_param_all_levels(attribute_type), &valid)) {
         return valid.u.range.max;
     }
 
-    if(query->num_performance_levels > 0 && xnvctrl->XNVCTRLQueryValidTargetAttributeValues(xnvctrl->display, NV_CTRL_TARGET_TYPE_GPU, 0, query->num_performance_levels - 1, attribute_type_to_attribute_param(attribute_type), &valid)) {
+    if(num_performance_levels > 0 && xnvctrl->XNVCTRLQueryValidTargetAttributeValues(xnvctrl->display, NV_CTRL_TARGET_TYPE_GPU, 0, num_performance_levels - 1, attribute_type_to_attribute_param(attribute_type), &valid)) {
         return valid.u.range.max;
     }
     
@@ -208,6 +207,12 @@ static bool xnvctrl_get_performance_levels(gsr_xnvctrl *xnvctrl, NVCTRLPerforman
     return success;
 }
 
+static int compare_mem_transfer_rate_max_asc(const void *a, const void *b) {
+    const NVCTRLPerformanceLevel *perf_a = a;
+    const NVCTRLPerformanceLevel *perf_b = b;
+    return perf_a->mem_transfer_rate_max - perf_b->mem_transfer_rate_max;
+}
+
 bool gsr_overclock_load(gsr_overclock *self, Display *display) {
     memset(self, 0, sizeof(gsr_overclock));
     self->num_performance_levels = 0;
@@ -234,31 +239,32 @@ bool gsr_overclock_start(gsr_overclock *self) {
     }
     self->num_performance_levels = query.num_performance_levels;
 
-    int target_transfer_rate_offset = xnvctrl_get_attribute_max_value(&self->xnvctrl, &query, NVCTRL_ATTRIB_GPU_MEM_TRANSFER_RATE) / 2; // Divide by 2 just to be safe that we dont set it too high
-    if(query.num_performance_levels > 2) {
-        const int transfer_rate_max_diff = query.performance_level[query.num_performance_levels - 1].mem_transfer_rate_max - query.performance_level[2].mem_transfer_rate_max;
-        target_transfer_rate_offset = min_int(target_transfer_rate_offset, transfer_rate_max_diff);
-    }
+    qsort(query.performance_level, query.num_performance_levels, sizeof(NVCTRLPerformanceLevel), compare_mem_transfer_rate_max_asc);
 
-    if(xnvctrl_set_attribute_offset(&self->xnvctrl, self->num_performance_levels, target_transfer_rate_offset, NVCTRL_ATTRIB_GPU_MEM_TRANSFER_RATE)) {
-        fprintf(stderr, "gsr info: gsr_overclock_start: sucessfully set memory transfer rate offset to %d\n", target_transfer_rate_offset);
-    } else {
-        fprintf(stderr, "gsr info: gsr_overclock_start: failed to overclock memory transfer rate offset to %d\n", target_transfer_rate_offset);
+    int target_transfer_rate_offset = xnvctrl_get_attribute_max_value(&self->xnvctrl, query.num_performance_levels, NVCTRL_ATTRIB_GPU_MEM_TRANSFER_RATE);
+    if(query.num_performance_levels > 1) {
+        const int transfer_rate_max_diff = query.performance_level[query.num_performance_levels - 1].mem_transfer_rate_max - query.performance_level[query.num_performance_levels - 2].mem_transfer_rate_max;
+        target_transfer_rate_offset = min_int(target_transfer_rate_offset, transfer_rate_max_diff);
+        if(target_transfer_rate_offset >= 0 && xnvctrl_set_attribute_offset(&self->xnvctrl, self->num_performance_levels, target_transfer_rate_offset, NVCTRL_ATTRIB_GPU_MEM_TRANSFER_RATE)) {
+            fprintf(stderr, "gsr info: gsr_overclock_start: successfully set memory transfer rate offset to %d\n", target_transfer_rate_offset);
+        } else {
+            fprintf(stderr, "gsr info: gsr_overclock_start: failed to overclock memory transfer rate offset to %d\n", target_transfer_rate_offset);
+        }
     }
 
+    // TODO: Sort by nv_clock_max
 
     // TODO: Enable. Crashes on my system (gtx 1080) so it's disabled for now. Seems to crash even if graphics clock is increasd by 1, let alone 1200
     /*
-    int target_nv_clock_offset = xnvctrl_get_attribute_max_value(&self->xnvctrl, &query, NVCTRL_GPU_NVCLOCK) / 2; // Divide by 2 just to be safe that we dont set it too high
-    if(query.num_performance_levels > 2) {
-        const int nv_clock_max_diff = query.performance_level[query.num_performance_levels - 1].nv_clock_max - query.performance_level[2].nv_clock_max;
+    int target_nv_clock_offset = xnvctrl_get_attribute_max_value(&self->xnvctrl, query.num_performance_levels, NVCTRL_GPU_NVCLOCK);
+    if(query.num_performance_levels > 1) {
+        const int nv_clock_max_diff = query.performance_level[query.num_performance_levels - 1].nv_clock_max - query.performance_level[query.num_performance_levels - 2].nv_clock_max;
         target_nv_clock_offset = min_int(target_nv_clock_offset, nv_clock_max_diff);
-    }
-
-    if(xnvctrl_set_attribute_offset(&self->xnvctrl, self->num_performance_levels, target_nv_clock_offset, NVCTRL_GPU_NVCLOCK)) {
-        fprintf(stderr, "gsr info: gsr_overclock_start: sucessfully set nv clock offset to %d\n", target_nv_clock_offset);
-    } else {
-        fprintf(stderr, "gsr info: gsr_overclock_start: failed to overclock nv clock offset to %d\n", target_nv_clock_offset);
+        if(target_nv_clock_offset >= 0 && xnvctrl_set_attribute_offset(&self->xnvctrl, self->num_performance_levels, target_nv_clock_offset, NVCTRL_GPU_NVCLOCK)) {
+            fprintf(stderr, "gsr info: gsr_overclock_start: successfully set nv clock offset to %d\n", target_nv_clock_offset);
+        } else {
+            fprintf(stderr, "gsr info: gsr_overclock_start: failed to overclock nv clock offset to %d\n", target_nv_clock_offset);
+        }
     }
     */
 
diff --git a/src/pipewire_audio.c b/src/pipewire_audio.c
new file mode 100644
index 0000000..4ce07fb
--- /dev/null
+++ b/src/pipewire_audio.c
@@ -0,0 +1,854 @@
+#include "../include/pipewire_audio.h"
+
+#include <pipewire/pipewire.h>
+#include <pipewire/extensions/metadata.h>
+#include <pipewire/impl-module.h>
+
+typedef struct {
+    const gsr_pipewire_audio_port *output_port;
+    const gsr_pipewire_audio_port *input_port;
+} gsr_pipewire_audio_desired_link;
+
+static void on_core_info_cb(void *user_data, const struct pw_core_info *info) {
+    gsr_pipewire_audio *self = user_data;
+    //fprintf(stderr, "server name: %s\n", info->name);
+}
+
+static void on_core_error_cb(void *user_data, uint32_t id, int seq, int res, const char *message) {
+    gsr_pipewire_audio *self = user_data;
+    //fprintf(stderr, "gsr error: pipewire: error id:%u seq:%d res:%d: %s\n", id, seq, res, message);
+    pw_thread_loop_signal(self->thread_loop, false);
+}
+
+static void on_core_done_cb(void *user_data, uint32_t id, int seq) {
+    gsr_pipewire_audio *self = user_data;
+    if(id == PW_ID_CORE && self->server_version_sync == seq)
+        pw_thread_loop_signal(self->thread_loop, false);
+}
+
+static const struct pw_core_events core_events = {
+    PW_VERSION_CORE_EVENTS,
+    .info = on_core_info_cb,
+    .done = on_core_done_cb,
+    .error = on_core_error_cb,
+};
+
+static gsr_pipewire_audio_node* gsr_pipewire_audio_get_node_by_name_case_insensitive(gsr_pipewire_audio *self, const char *node_name, gsr_pipewire_audio_node_type node_type) {
+    for(size_t i = 0; i < self->num_stream_nodes; ++i) {
+        const gsr_pipewire_audio_node *node = &self->stream_nodes[i];
+        if(node->type == node_type && strcasecmp(node->name, node_name) == 0)
+            return &self->stream_nodes[i];
+    }
+    return NULL;
+}
+
+static gsr_pipewire_audio_port* gsr_pipewire_audio_get_node_port_by_name(gsr_pipewire_audio *self, uint32_t node_id, const char *port_name) {
+    for(size_t i = 0; i < self->num_ports; ++i) {
+        if(self->ports[i].node_id == node_id && strcmp(self->ports[i].name, port_name) == 0)
+            return &self->ports[i];
+    }
+    return NULL;
+}
+
+static bool requested_link_matches_name_case_insensitive(const gsr_pipewire_audio_requested_link *requested_link, const char *name) {
+    for(int i = 0; i < requested_link->num_outputs; ++i) {
+        if(requested_link->outputs[i].type == GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_STANDARD && strcasecmp(requested_link->outputs[i].name, name) == 0)
+            return true;
+    }
+    return false;
+}
+
+static bool requested_link_has_type(const gsr_pipewire_audio_requested_link *requested_link, gsr_pipewire_audio_requested_type type) {
+    for(int i = 0; i < requested_link->num_outputs; ++i) {
+        if(requested_link->outputs[i].type == type)
+            return true;
+    }
+    return false;
+}
+
+static void gsr_pipewire_get_node_input_port_by_type(gsr_pipewire_audio *self, const gsr_pipewire_audio_node *input_node, gsr_pipewire_audio_link_input_type input_type,
+    const gsr_pipewire_audio_port **input_fl_port, const gsr_pipewire_audio_port **input_fr_port)
+{
+    *input_fl_port = NULL;
+    *input_fr_port = NULL;
+
+    switch(input_type) {
+        case GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_STREAM: {
+            *input_fl_port = gsr_pipewire_audio_get_node_port_by_name(self, input_node->id, "input_FL");
+            *input_fr_port = gsr_pipewire_audio_get_node_port_by_name(self, input_node->id, "input_FR");
+            break;
+        }
+        case GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_SINK: {
+            *input_fl_port = gsr_pipewire_audio_get_node_port_by_name(self, input_node->id, "playback_FL");
+            *input_fr_port = gsr_pipewire_audio_get_node_port_by_name(self, input_node->id, "playback_FR");
+            break;
+        }
+    }
+}
+
+static bool string_starts_with(const char *str, const char *substr) {
+    const int len = strlen(str);
+    const int substr_len = strlen(substr);
+    return len >= substr_len && memcmp(str, substr, substr_len) == 0;
+}
+
+static bool string_ends_with(const char *str, const char *substr) {
+    const int len = strlen(str);
+    const int substr_len = strlen(substr);
+    return len >= substr_len && memcmp(str + len - substr_len, substr, substr_len) == 0;
+}
+
+/* Returns number of desired links */
+static size_t gsr_pipewire_get_node_output_ports(gsr_pipewire_audio *self, const gsr_pipewire_audio_node *output_node,
+    gsr_pipewire_audio_desired_link *desired_links, size_t desired_links_max_size,
+    const gsr_pipewire_audio_port *input_fl_port, const gsr_pipewire_audio_port *input_fr_port)
+{
+    size_t num_desired_links = 0;
+    for(size_t i = 0; i < self->num_ports && num_desired_links < desired_links_max_size; ++i) {
+        if(self->ports[i].node_id != output_node->id)
+            continue;
+
+        if(string_starts_with(self->ports[i].name, "playback_"))
+            continue;
+
+        if(string_ends_with(self->ports[i].name, "_MONO") || string_ends_with(self->ports[i].name, "_FC") || string_ends_with(self->ports[i].name, "_LFE")) {
+            if(num_desired_links + 2 >= desired_links_max_size)
+                break;
+
+            desired_links[num_desired_links + 0] = (gsr_pipewire_audio_desired_link){ .output_port = &self->ports[i], .input_port = input_fl_port };
+            desired_links[num_desired_links + 1] = (gsr_pipewire_audio_desired_link){ .output_port = &self->ports[i], .input_port = input_fr_port };
+            num_desired_links += 2;
+        } else if(string_ends_with(self->ports[i].name, "_FL") || string_ends_with(self->ports[i].name, "_RL") || string_ends_with(self->ports[i].name, "_SL")) {
+            if(num_desired_links + 1 >= desired_links_max_size)
+                break;
+
+            desired_links[num_desired_links] = (gsr_pipewire_audio_desired_link){ .output_port = &self->ports[i], .input_port = input_fl_port };
+            num_desired_links += 1;
+        } else if(string_ends_with(self->ports[i].name, "_FR") || string_ends_with(self->ports[i].name, "_RR") || string_ends_with(self->ports[i].name, "_SR")) {
+            if(num_desired_links + 1 >= desired_links_max_size)
+                break;
+
+            desired_links[num_desired_links] = (gsr_pipewire_audio_desired_link){ .output_port = &self->ports[i], .input_port = input_fr_port };
+            num_desired_links += 1;
+        }
+    }
+    return num_desired_links;
+}
+
+static void gsr_pipewire_audio_establish_link(gsr_pipewire_audio *self, const gsr_pipewire_audio_port *output_port, const gsr_pipewire_audio_port *input_port) {
+    // TODO: Detect if the link already exists so we don't create these proxies when not needed.
+    // We could do that by saving which nodes have been linked with which nodes after linking them.
+
+    //fprintf(stderr, "linking!\n");
+    // TODO: error check and cleanup
+    struct pw_properties *props = pw_properties_new(NULL, NULL);
+    pw_properties_setf(props, PW_KEY_LINK_OUTPUT_PORT, "%u", output_port->id);
+    pw_properties_setf(props, PW_KEY_LINK_INPUT_PORT, "%u", input_port->id);
+    // TODO: Clean this up when removing node
+    struct pw_proxy *proxy = pw_core_create_object(self->core, "link-factory", PW_TYPE_INTERFACE_Link, PW_VERSION_LINK, &props->dict, 0);
+    //self->server_version_sync = pw_core_sync(self->core, PW_ID_CORE, self->server_version_sync);
+    pw_properties_free(props);
+}
+
+static void gsr_pipewire_audio_create_link(gsr_pipewire_audio *self, const gsr_pipewire_audio_requested_link *requested_link) {
+    const gsr_pipewire_audio_node_type requested_link_node_type = requested_link->input_type == GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_STREAM ? GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_INPUT : GSR_PIPEWIRE_AUDIO_NODE_TYPE_SINK_OR_SOURCE;
+    const gsr_pipewire_audio_node *stream_input_node = gsr_pipewire_audio_get_node_by_name_case_insensitive(self, requested_link->input_name, requested_link_node_type);
+    if(!stream_input_node)
+        return;
+
+    const gsr_pipewire_audio_port *input_fl_port = NULL;
+    const gsr_pipewire_audio_port *input_fr_port = NULL;
+    gsr_pipewire_get_node_input_port_by_type(self, stream_input_node, requested_link->input_type, &input_fl_port, &input_fr_port);
+    if(!input_fl_port || !input_fr_port)
+        return;
+
+    gsr_pipewire_audio_desired_link desired_links[64];
+    for(size_t i = 0; i < self->num_stream_nodes; ++i) {
+        const gsr_pipewire_audio_node *output_node = &self->stream_nodes[i];
+        if(output_node->type != requested_link->output_type)
+            continue;
+
+        const bool requested_link_matches_app = requested_link_matches_name_case_insensitive(requested_link, output_node->name);
+        if(requested_link->inverted) {
+            if(requested_link_matches_app)
+                continue;
+        } else {
+            if(!requested_link_matches_app)
+                continue;
+        }
+
+        const size_t num_desired_links = gsr_pipewire_get_node_output_ports(self, output_node, desired_links, 64, input_fl_port, input_fr_port);
+        for(size_t j = 0; j < num_desired_links; ++j) {
+            gsr_pipewire_audio_establish_link(self, desired_links[j].output_port, desired_links[j].input_port);
+        }
+    }
+}
+
+static void gsr_pipewire_audio_create_links(gsr_pipewire_audio *self) {
+    for(size_t i = 0; i < self->num_requested_links; ++i) {
+        gsr_pipewire_audio_create_link(self, &self->requested_links[i]);
+    }
+}
+
+static void gsr_pipewire_audio_create_link_for_default_devices(gsr_pipewire_audio *self, const gsr_pipewire_audio_requested_link *requested_link, gsr_pipewire_audio_requested_type default_device_type) {
+    if(default_device_type == GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_STANDARD)
+        return;
+
+    const char *device_name = default_device_type == GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT ? self->default_output_device_name : self->default_input_device_name;
+    if(device_name[0] == '\0')
+        return;
+
+    if(!requested_link_has_type(requested_link, default_device_type))
+        return;
+
+    const gsr_pipewire_audio_node_type requested_link_node_type = requested_link->input_type == GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_STREAM ? GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_INPUT : GSR_PIPEWIRE_AUDIO_NODE_TYPE_SINK_OR_SOURCE;
+    const gsr_pipewire_audio_node *stream_input_node = gsr_pipewire_audio_get_node_by_name_case_insensitive(self, requested_link->input_name, requested_link_node_type);
+    if(!stream_input_node)
+        return;
+
+    const gsr_pipewire_audio_port *input_fl_port = NULL;
+    const gsr_pipewire_audio_port *input_fr_port = NULL;
+    gsr_pipewire_get_node_input_port_by_type(self, stream_input_node, requested_link->input_type, &input_fl_port, &input_fr_port);
+    if(!input_fl_port || !input_fr_port)
+        return;
+
+    const gsr_pipewire_audio_node *stream_output_node = gsr_pipewire_audio_get_node_by_name_case_insensitive(self, device_name, GSR_PIPEWIRE_AUDIO_NODE_TYPE_SINK_OR_SOURCE);
+    if(!stream_output_node)
+        return;
+
+    gsr_pipewire_audio_desired_link desired_links[64];
+    const size_t num_desired_links = gsr_pipewire_get_node_output_ports(self, stream_output_node, desired_links, 64, input_fl_port, input_fr_port);
+    for(size_t i = 0; i < num_desired_links; ++i) {
+        gsr_pipewire_audio_establish_link(self, desired_links[i].output_port, desired_links[i].input_port);
+    }
+}
+
+static void gsr_pipewire_audio_create_links_for_default_devices(gsr_pipewire_audio *self, gsr_pipewire_audio_requested_type default_device_type) {
+    for(size_t i = 0; i < self->num_requested_links; ++i) {
+        gsr_pipewire_audio_create_link_for_default_devices(self, &self->requested_links[i], default_device_type);
+    }
+}
+
+static void gsr_pipewire_audio_destroy_links_by_output_to_input(gsr_pipewire_audio *self, uint32_t output_node_id, uint32_t input_node_id) {
+    for(size_t i = 0; i < self->num_links; ++i) {
+        if(self->links[i].output_node_id == output_node_id && self->links[i].input_node_id == input_node_id)
+            pw_registry_destroy(self->registry, self->links[i].id);
+    }
+}
+
+static void gsr_pipewire_destroy_default_device_link(gsr_pipewire_audio *self, const gsr_pipewire_audio_requested_link *requested_link, gsr_pipewire_audio_requested_type default_device_type) {
+    if(default_device_type == GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_STANDARD)
+        return;
+
+    const char *device_name = default_device_type == GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT ? self->default_output_device_name : self->default_input_device_name;
+    if(device_name[0] == '\0')
+        return;
+
+    if(!requested_link_has_type(requested_link, default_device_type))
+        return;
+
+    /* default_output and default_input can be the same device. In that case both are the same link and we don't want to remove the link */
+    const gsr_pipewire_audio_requested_type opposite_device_type = default_device_type == GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT ? GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_INPUT : GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT;
+    const char *opposite_device_name = opposite_device_type == GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT ? self->default_output_device_name : self->default_input_device_name;
+    if(requested_link_has_type(requested_link, opposite_device_type) && strcmp(device_name, opposite_device_name) == 0)
+        return;
+
+    const gsr_pipewire_audio_node_type requested_link_node_type = requested_link->input_type == GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_STREAM ? GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_INPUT : GSR_PIPEWIRE_AUDIO_NODE_TYPE_SINK_OR_SOURCE;
+    const gsr_pipewire_audio_node *stream_input_node = gsr_pipewire_audio_get_node_by_name_case_insensitive(self, requested_link->input_name, requested_link_node_type);
+    if(!stream_input_node)
+        return;
+
+    const gsr_pipewire_audio_node *stream_output_node = gsr_pipewire_audio_get_node_by_name_case_insensitive(self, device_name, GSR_PIPEWIRE_AUDIO_NODE_TYPE_SINK_OR_SOURCE);
+    if(!stream_output_node)
+        return;
+
+    if(requested_link_matches_name_case_insensitive(requested_link, stream_output_node->name))
+        return;
+
+    gsr_pipewire_audio_destroy_links_by_output_to_input(self, stream_output_node->id, stream_input_node->id);
+    //fprintf(stderr, "destroying a link from %u to %u\n", stream_output_node->id, stream_input_node->id);
+}
+
+static void gsr_pipewire_destroy_default_device_links(gsr_pipewire_audio *self, gsr_pipewire_audio_requested_type default_device_type) {
+    for(size_t i = 0; i < self->num_requested_links; ++i) {
+        gsr_pipewire_destroy_default_device_link(self, &self->requested_links[i], default_device_type);
+    }
+}
+
+static bool json_get_value(const char *json_str, const char *key, char *value, size_t value_size) {
+    char key_full[32];
+    const int key_full_size = snprintf(key_full, sizeof(key_full), "\"%s\":", key);
+    const char *start = strstr(json_str, key_full);
+    if(!start)
+        return false;
+    
+    start += key_full_size;
+    const char *value_start = strchr(start, '"');
+    if(!value_start)
+        return false;
+
+    value_start += 1;
+    const char *value_end = strchr(value_start, '"');
+    if(!value_end)
+        return false;
+
+    snprintf(value, value_size, "%.*s", (int)(value_end - value_start), value_start);
+    return true;
+}
+
+static int on_metadata_property_cb(void *data, uint32_t id, const char *key, const char *type, const char *value) {
+    (void)type;
+    gsr_pipewire_audio *self = data;
+
+    if(id == PW_ID_CORE && key && value) {
+        char value_decoded[128];
+        if(strcmp(key, "default.audio.sink") == 0) {
+            if(json_get_value(value, "name", value_decoded, sizeof(value_decoded)) && strcmp(value_decoded, self->default_output_device_name) != 0) {
+                gsr_pipewire_destroy_default_device_links(self, GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT);
+                snprintf(self->default_output_device_name, sizeof(self->default_output_device_name), "%s", value_decoded);
+                gsr_pipewire_audio_create_links_for_default_devices(self, GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT);
+            }
+        } else if(strcmp(key, "default.audio.source") == 0) {
+            if(json_get_value(value, "name", value_decoded, sizeof(value_decoded)) && strcmp(value_decoded, self->default_input_device_name) != 0) {
+                gsr_pipewire_destroy_default_device_links(self, GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_INPUT);
+                snprintf(self->default_input_device_name, sizeof(self->default_input_device_name), "%s", value_decoded);
+                gsr_pipewire_audio_create_links_for_default_devices(self, GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_INPUT);
+            }
+        }
+    }
+
+    return 0;
+}
+
+static const struct pw_metadata_events metadata_events = {
+    PW_VERSION_METADATA_EVENTS,
+    .property = on_metadata_property_cb,
+};
+
+static void on_metadata_proxy_removed_cb(void *data) {
+    gsr_pipewire_audio *self = data;
+    if(self->metadata_proxy) {
+        pw_proxy_destroy(self->metadata_proxy);
+        self->metadata_proxy = NULL;
+    }
+}
+
+static void on_metadata_proxy_destroy_cb(void *data) {
+    gsr_pipewire_audio *self = data;
+
+    spa_hook_remove(&self->metadata_listener);
+    spa_hook_remove(&self->metadata_proxy_listener);
+    spa_zero(self->metadata_listener);
+    spa_zero(self->metadata_proxy_listener);
+
+    self->metadata_proxy = NULL;
+}
+
+static const struct pw_proxy_events metadata_proxy_events = {
+    PW_VERSION_PROXY_EVENTS,
+    .removed = on_metadata_proxy_removed_cb,
+    .destroy = on_metadata_proxy_destroy_cb,
+};
+
+static bool gsr_pipewire_audio_listen_on_metadata(gsr_pipewire_audio *self, uint32_t id) {
+    if(self->metadata_proxy) {
+        pw_proxy_destroy(self->metadata_proxy);
+        self->metadata_proxy = NULL;
+    }
+
+    self->metadata_proxy = pw_registry_bind(self->registry, id, PW_TYPE_INTERFACE_Metadata, PW_VERSION_METADATA, 0);
+    if(!self->metadata_proxy) {
+        fprintf(stderr, "gsr error: gsr_pipewire_audio_listen_on_metadata: failed to bind to registry\n");
+        return false;
+    }
+
+    pw_proxy_add_object_listener(self->metadata_proxy, &self->metadata_listener, &metadata_events, self);
+    pw_proxy_add_listener(self->metadata_proxy, &self->metadata_proxy_listener, &metadata_proxy_events, self);
+
+    self->server_version_sync = pw_core_sync(self->core, PW_ID_CORE, self->server_version_sync);
+    return true;
+}
+
+static bool array_ensure_capacity(void **array, size_t size, size_t *capacity_items, size_t element_size) {
+    if(size + 1 >= *capacity_items) {
+        size_t new_capacity_items = *capacity_items * 2;
+        if(new_capacity_items == 0)
+            new_capacity_items = 32;
+
+        void *new_data = realloc(*array, new_capacity_items * element_size);
+        if(!new_data) {
+            fprintf(stderr, "gsr error: pipewire_audio: failed to reallocate memory\n");
+            return false;
+        }
+
+        *array = new_data;
+        *capacity_items = new_capacity_items;
+    }
+    return true;
+}
+
+static void registry_event_global(void *data, uint32_t id, uint32_t permissions,
+                  const char *type, uint32_t version,
+                  const struct spa_dict *props)
+{
+    //fprintf(stderr, "add: id: %d, type: %s\n", (int)id, type);
+    if(!props || !type)
+        return;
+
+    //pw_properties_new_dict(props);
+
+    gsr_pipewire_audio *self = (gsr_pipewire_audio*)data;
+    if(strcmp(type, PW_TYPE_INTERFACE_Node) == 0) {
+        const char *node_name = spa_dict_lookup(props, PW_KEY_NODE_NAME);
+        const char *media_class = spa_dict_lookup(props, PW_KEY_MEDIA_CLASS);
+        //fprintf(stderr, "  node id: %u, node name: %s, media class: %s\n", id, node_name, media_class);
+        const bool is_stream_output = media_class && strcmp(media_class, "Stream/Output/Audio") == 0;
+        const bool is_stream_input = media_class && strcmp(media_class, "Stream/Input/Audio") == 0;
+        const bool is_sink = media_class && strcmp(media_class, "Audio/Sink") == 0;
+        const bool is_source = media_class && strcmp(media_class, "Audio/Source") == 0;
+        if(node_name && (is_stream_output || is_stream_input || is_sink || is_source)) {
+            //const char *application_binary = spa_dict_lookup(props, PW_KEY_APP_PROCESS_BINARY);
+            //const char *application_name = spa_dict_lookup(props, PW_KEY_APP_NAME);
+            //fprintf(stderr, "  node name: %s, app binary: %s, app name: %s\n", node_name, application_binary, application_name);
+
+            if(!array_ensure_capacity((void**)&self->stream_nodes, self->num_stream_nodes, &self->stream_nodes_capacity_items, sizeof(gsr_pipewire_audio_node)))
+                return;
+
+            char *node_name_copy = strdup(node_name);
+            if(node_name_copy) {
+                self->stream_nodes[self->num_stream_nodes].id = id;
+                self->stream_nodes[self->num_stream_nodes].name = node_name_copy;
+                if(is_stream_output)
+                    self->stream_nodes[self->num_stream_nodes].type = GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_OUTPUT;
+                else if(is_stream_input)
+                    self->stream_nodes[self->num_stream_nodes].type = GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_INPUT;
+                else if(is_sink || is_source)
+                    self->stream_nodes[self->num_stream_nodes].type = GSR_PIPEWIRE_AUDIO_NODE_TYPE_SINK_OR_SOURCE;
+                ++self->num_stream_nodes;
+
+                gsr_pipewire_audio_create_links(self);
+            }
+        }
+    } else if(strcmp(type, PW_TYPE_INTERFACE_Port) == 0) {
+        const char *port_name = spa_dict_lookup(props, PW_KEY_PORT_NAME);
+
+        const char *port_direction = spa_dict_lookup(props, PW_KEY_PORT_DIRECTION);
+        gsr_pipewire_audio_port_direction direction = -1;
+        if(port_direction && strcmp(port_direction, "in") == 0)
+            direction = GSR_PIPEWIRE_AUDIO_PORT_DIRECTION_INPUT;
+        else if(port_direction && strcmp(port_direction, "out") == 0)
+            direction = GSR_PIPEWIRE_AUDIO_PORT_DIRECTION_OUTPUT;
+
+        const char *node_id = spa_dict_lookup(props, PW_KEY_NODE_ID);
+        const int node_id_num = node_id ? atoi(node_id) : 0;
+
+        if(port_name && direction >= 0 && node_id_num > 0) {
+            if(!array_ensure_capacity((void**)&self->ports, self->num_ports, &self->ports_capacity_items, sizeof(gsr_pipewire_audio_port)))
+                return;
+
+            //fprintf(stderr, "  port name: %s, node id: %d, direction: %s\n", port_name, node_id_num, port_direction);
+            char *port_name_copy = strdup(port_name);
+            if(port_name_copy) {
+                //fprintf(stderr, "  port id: %u, node id: %u, name: %s\n", id, node_id_num, port_name_copy);
+                self->ports[self->num_ports].id = id;
+                self->ports[self->num_ports].node_id = node_id_num;
+                self->ports[self->num_ports].direction = direction;
+                self->ports[self->num_ports].name = port_name_copy;
+                ++self->num_ports;
+
+                gsr_pipewire_audio_create_links(self);
+            }
+        }
+    } else if(strcmp(type, PW_TYPE_INTERFACE_Link) == 0) {
+        const char *output_node = spa_dict_lookup(props, PW_KEY_LINK_OUTPUT_NODE);
+        const char *input_node = spa_dict_lookup(props, PW_KEY_LINK_INPUT_NODE);
+
+        const uint32_t output_node_id_num = output_node ? atoi(output_node) : 0;
+        const uint32_t input_node_id_num = input_node ? atoi(input_node) : 0;
+        if(output_node_id_num > 0 && input_node_id_num > 0) {
+            if(!array_ensure_capacity((void**)&self->links, self->num_links, &self->links_capacity_items, sizeof(gsr_pipewire_audio_link)))
+                return;
+
+            //fprintf(stderr, "  new link (%u): %u -> %u\n", id, output_node_id_num, input_node_id_num);
+            self->links[self->num_links].id = id;
+            self->links[self->num_links].output_node_id = output_node_id_num;
+            self->links[self->num_links].input_node_id = input_node_id_num;
+            ++self->num_links;
+        }
+    } else if(strcmp(type, PW_TYPE_INTERFACE_Metadata) == 0) {
+        const char *name = spa_dict_lookup(props, PW_KEY_METADATA_NAME);
+        if(name && strcmp(name, "default") == 0)
+            gsr_pipewire_audio_listen_on_metadata(self, id);
+    }
+}
+
+static bool gsr_pipewire_audio_remove_node_by_id(gsr_pipewire_audio *self, uint32_t node_id) {
+    for(size_t i = 0; i < self->num_stream_nodes; ++i) {
+        if(self->stream_nodes[i].id != node_id)
+            continue;
+
+        free(self->stream_nodes[i].name);
+        self->stream_nodes[i] = self->stream_nodes[self->num_stream_nodes - 1];
+        --self->num_stream_nodes;
+        return true;
+    }
+    return false;
+}
+
+static bool gsr_pipewire_audio_remove_port_by_id(gsr_pipewire_audio *self, uint32_t port_id) {
+    for(size_t i = 0; i < self->num_ports; ++i) {
+        if(self->ports[i].id != port_id)
+            continue;
+
+        free(self->ports[i].name);
+        self->ports[i] = self->ports[self->num_ports - 1];
+        --self->num_ports;
+        return true;
+    }
+    return false;
+}
+
+static bool gsr_pipewire_audio_remove_link_by_id(gsr_pipewire_audio *self, uint32_t link_id) {
+    for(size_t i = 0; i < self->num_links; ++i) {
+        if(self->links[i].id != link_id)
+            continue;
+
+        self->links[i] = self->links[self->num_links - 1];
+        --self->num_links;
+        return true;
+    }
+    return false;
+}
+
+static void registry_event_global_remove(void *data, uint32_t id) {
+    //fprintf(stderr, "remove: %d\n", (int)id);
+    gsr_pipewire_audio *self = (gsr_pipewire_audio*)data;
+    if(gsr_pipewire_audio_remove_node_by_id(self, id)) {
+        //fprintf(stderr, "removed node\n");
+        return;
+    }
+
+    if(gsr_pipewire_audio_remove_port_by_id(self, id)) {
+        //fprintf(stderr, "removed port\n");
+        return;
+    }
+
+    if(gsr_pipewire_audio_remove_link_by_id(self, id)) {
+        //fprintf(stderr, "removed link\n");
+        return;
+    }
+}
+
+static const struct pw_registry_events registry_events = {
+    PW_VERSION_REGISTRY_EVENTS,
+    .global = registry_event_global,
+    .global_remove = registry_event_global_remove,
+};
+
+bool gsr_pipewire_audio_init(gsr_pipewire_audio *self) {
+    memset(self, 0, sizeof(*self));
+
+    pw_init(NULL, NULL);
+
+    self->thread_loop = pw_thread_loop_new("gsr screen capture", NULL);
+    if(!self->thread_loop) {
+        fprintf(stderr, "gsr error: gsr_pipewire_audio_init: failed to create pipewire thread\n");
+        gsr_pipewire_audio_deinit(self);
+        return false;
+    }
+
+    self->context = pw_context_new(pw_thread_loop_get_loop(self->thread_loop), NULL, 0);
+    if(!self->context) {
+        fprintf(stderr, "gsr error: gsr_pipewire_audio_init: failed to create pipewire context\n");
+        gsr_pipewire_audio_deinit(self);
+        return false;
+    }
+
+    pw_context_load_module(self->context, "libpipewire-module-link-factory", NULL, NULL);
+
+    if(pw_thread_loop_start(self->thread_loop) < 0) {
+        fprintf(stderr, "gsr error: gsr_pipewire_audio_init: failed to start thread\n");
+        gsr_pipewire_audio_deinit(self);
+        return false;
+    }
+
+    pw_thread_loop_lock(self->thread_loop);
+
+    self->core = pw_context_connect(self->context, pw_properties_new(PW_KEY_REMOTE_NAME, NULL, NULL), 0);
+    if(!self->core) {
+        pw_thread_loop_unlock(self->thread_loop);
+        gsr_pipewire_audio_deinit(self);
+        return false;
+    }
+
+    // TODO: Error check
+    pw_core_add_listener(self->core, &self->core_listener, &core_events, self);
+
+    self->registry = pw_core_get_registry(self->core, PW_VERSION_REGISTRY, 0);
+    pw_registry_add_listener(self->registry, &self->registry_listener, &registry_events, self);
+
+    self->server_version_sync = pw_core_sync(self->core, PW_ID_CORE, self->server_version_sync);
+    pw_thread_loop_wait(self->thread_loop);
+
+    pw_thread_loop_unlock(self->thread_loop);
+    return true;
+}
+
+void gsr_pipewire_audio_deinit(gsr_pipewire_audio *self) {
+    if(self->thread_loop) {
+        //pw_thread_loop_wait(self->thread_loop);
+        pw_thread_loop_stop(self->thread_loop);
+    }
+
+    for(size_t i = 0; i < self->num_virtual_sink_proxies; ++i) {
+        if(self->virtual_sink_proxies[i]) {
+            pw_proxy_destroy(self->virtual_sink_proxies[i]);
+            self->virtual_sink_proxies[i] = NULL;
+        }
+    }
+    self->num_virtual_sink_proxies = 0;
+    self->virtual_sink_proxies_capacity_items = 0;
+
+    if(self->virtual_sink_proxies) {
+        free(self->virtual_sink_proxies);
+        self->virtual_sink_proxies = NULL;
+    }
+
+    if(self->metadata_proxy) {
+        spa_hook_remove(&self->metadata_listener);
+        spa_hook_remove(&self->metadata_proxy_listener);
+        pw_proxy_destroy(self->metadata_proxy);
+        spa_zero(self->metadata_listener);
+        spa_zero(self->metadata_proxy_listener);
+        self->metadata_proxy = NULL;
+    }
+
+    spa_hook_remove(&self->registry_listener);
+    spa_hook_remove(&self->core_listener);
+
+    if(self->core) {
+        pw_core_disconnect(self->core);
+        self->core = NULL;
+    }
+
+    if(self->context) {
+        pw_context_destroy(self->context);
+        self->context = NULL;
+    }
+
+    if(self->thread_loop) {
+        pw_thread_loop_destroy(self->thread_loop);
+        self->thread_loop = NULL;
+    }
+
+    if(self->stream_nodes) {
+        for(size_t i = 0; i < self->num_stream_nodes; ++i) {
+            free(self->stream_nodes[i].name);
+        }
+        self->num_stream_nodes = 0;
+        self->stream_nodes_capacity_items = 0;
+
+        free(self->stream_nodes);
+        self->stream_nodes = NULL;
+    }
+
+    if(self->ports) {
+        for(size_t i = 0; i < self->num_ports; ++i) {
+            free(self->ports[i].name);
+        }
+        self->num_ports = 0;
+        self->ports_capacity_items = 0;
+
+        free(self->ports);
+        self->ports = NULL;
+    }
+
+    if(self->links) {
+        self->num_links = 0;
+        self->links_capacity_items = 0;
+
+        free(self->links);
+        self->links = NULL;
+    }
+
+    if(self->requested_links) {
+        for(size_t i = 0; i < self->num_requested_links; ++i) {
+            for(int j = 0; j < self->requested_links[i].num_outputs; ++j) {
+                free(self->requested_links[i].outputs[j].name);
+            }
+            free(self->requested_links[i].outputs);
+            free(self->requested_links[i].input_name);
+        }
+        self->num_requested_links = 0;
+        self->requested_links_capacity_items = 0;
+
+        free(self->requested_links);
+        self->requested_links = NULL;
+    }
+
+#if PW_CHECK_VERSION(0, 3, 49)
+    pw_deinit();
+#endif
+}
+
+static struct pw_properties* gsr_pipewire_create_null_audio_sink(const char *name) {
+    char props_str[512];
+    snprintf(props_str, sizeof(props_str), "{ factory.name=support.null-audio-sink node.name=\"%s\" media.class=Audio/Sink object.linger=false audio.position=[FL FR] monitor.channel-volumes=true monitor.passthrough=true adjust_time=0 node.description=gsr-app-sink slaves=\"\" }", name);
+    struct pw_properties *props = pw_properties_new_string(props_str);
+    if(!props) {
+        fprintf(stderr, "gsr error: gsr_pipewire_create_null_audio_sink: failed to create virtual sink properties\n");
+        return NULL;
+    }
+    return props;
+}
+
+bool gsr_pipewire_audio_create_virtual_sink(gsr_pipewire_audio *self, const char *name) {
+    if(!array_ensure_capacity((void**)&self->virtual_sink_proxies, self->num_virtual_sink_proxies, &self->virtual_sink_proxies_capacity_items, sizeof(struct pw_proxy*)))
+        return false;
+
+    pw_thread_loop_lock(self->thread_loop);
+
+    struct pw_properties *virtual_sink_props = gsr_pipewire_create_null_audio_sink(name);
+    if(!virtual_sink_props) {
+        pw_thread_loop_unlock(self->thread_loop);
+        return false;
+    }
+
+    struct pw_proxy *virtual_sink_proxy = pw_core_create_object(self->core, "adapter", PW_TYPE_INTERFACE_Node, PW_VERSION_NODE, &virtual_sink_props->dict, 0);
+    // TODO:
+    // If these are done then the above needs sizeof(*self) as the last argument
+    //pw_proxy_add_object_listener(virtual_sink_proxy, &pd->object_listener, &node_events, self);
+    //pw_proxy_add_listener(virtual_sink_proxy, &pd->proxy_listener, &proxy_events, self);
+    // TODO: proxy
+    pw_properties_free(virtual_sink_props);
+    if(!virtual_sink_proxy) {
+        fprintf(stderr, "gsr error: gsr_pipewire_audio_create_virtual_sink: failed to create virtual sink\n");
+        pw_thread_loop_unlock(self->thread_loop);
+        return false;
+    }
+
+    self->server_version_sync = pw_core_sync(self->core, PW_ID_CORE, self->server_version_sync);
+    pw_thread_loop_wait(self->thread_loop);
+    pw_thread_loop_unlock(self->thread_loop);
+
+    self->virtual_sink_proxies[self->num_virtual_sink_proxies] = virtual_sink_proxy;
+    ++self->num_virtual_sink_proxies;
+
+    return true;
+}
+
+static bool string_remove_suffix(char *str, const char *suffix) {
+    const size_t str_len = strlen(str);
+    const size_t suffix_len = strlen(suffix);
+    if(str_len >= suffix_len && memcmp(str + str_len - suffix_len, suffix, suffix_len) == 0) {
+        str[str_len - suffix_len] = '\0';
+        return true;
+    } else {
+        return false;
+    }
+}
+
+static bool gsr_pipewire_audio_add_links_to_output(gsr_pipewire_audio *self, const char **output_names, int num_output_names, const char *input_name, gsr_pipewire_audio_node_type output_type, gsr_pipewire_audio_link_input_type input_type, bool inverted) {
+    if(!array_ensure_capacity((void**)&self->requested_links, self->num_requested_links, &self->requested_links_capacity_items, sizeof(gsr_pipewire_audio_requested_link)))
+        return false;
+
+    gsr_pipewire_audio_requested_output *outputs = calloc(num_output_names, sizeof(gsr_pipewire_audio_requested_output));
+    if(!outputs)
+        return false;
+
+    char *input_name_copy = strdup(input_name);
+    if(!input_name_copy)
+        goto error;
+
+    if(input_type == GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_SINK)
+        string_remove_suffix(input_name_copy, ".monitor");
+
+    for(int i = 0; i < num_output_names; ++i) {
+        outputs[i].name = strdup(output_names[i]);
+        if(!outputs[i].name)
+            goto error;
+
+        outputs[i].type = GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_STANDARD;
+        if(output_type == GSR_PIPEWIRE_AUDIO_NODE_TYPE_SINK_OR_SOURCE) {
+            string_remove_suffix(outputs[i].name, ".monitor");
+
+            if(strcmp(outputs[i].name, "default_output") == 0)
+                outputs[i].type = GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT;
+            else if(strcmp(outputs[i].name, "default_input") == 0)
+                outputs[i].type = GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_INPUT;
+            else
+                outputs[i].type = GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_STANDARD;
+        }
+    }
+
+    pw_thread_loop_lock(self->thread_loop);
+    self->requested_links[self->num_requested_links].outputs = outputs;
+    self->requested_links[self->num_requested_links].num_outputs = num_output_names;
+    self->requested_links[self->num_requested_links].input_name = input_name_copy;
+    self->requested_links[self->num_requested_links].output_type = output_type;
+    self->requested_links[self->num_requested_links].input_type = input_type;
+    self->requested_links[self->num_requested_links].inverted = inverted;
+    ++self->num_requested_links;
+    gsr_pipewire_audio_create_link(self, &self->requested_links[self->num_requested_links - 1]);
+    gsr_pipewire_audio_create_link_for_default_devices(self, &self->requested_links[self->num_requested_links - 1], GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_OUTPUT);
+    gsr_pipewire_audio_create_link_for_default_devices(self, &self->requested_links[self->num_requested_links - 1], GSR_PIPEWIRE_AUDIO_REQUESTED_TYPE_DEFAULT_INPUT);
+    pw_thread_loop_unlock(self->thread_loop);
+
+    return true;
+
+    error:
+    free(input_name_copy);
+    for(int i = 0; i < num_output_names; ++i) {
+        free(outputs[i].name);
+    }
+    free(outputs);
+    return false;
+}
+
+bool gsr_pipewire_audio_add_link_from_apps_to_stream(gsr_pipewire_audio *self, const char **app_names, int num_app_names, const char *stream_name_input) {
+    return gsr_pipewire_audio_add_links_to_output(self, app_names, num_app_names, stream_name_input, GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_OUTPUT, GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_STREAM, false);
+}
+
+bool gsr_pipewire_audio_add_link_from_apps_to_stream_inverted(gsr_pipewire_audio *self, const char **app_names, int num_app_names, const char *stream_name_input) {
+    return gsr_pipewire_audio_add_links_to_output(self, app_names, num_app_names, stream_name_input, GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_OUTPUT, GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_STREAM, true);
+}
+
+bool gsr_pipewire_audio_add_link_from_apps_to_sink(gsr_pipewire_audio *self, const char **app_names, int num_app_names, const char *sink_name_input) {
+    return gsr_pipewire_audio_add_links_to_output(self, app_names, num_app_names, sink_name_input, GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_OUTPUT, GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_SINK, false);
+}
+
+bool gsr_pipewire_audio_add_link_from_apps_to_sink_inverted(gsr_pipewire_audio *self, const char **app_names, int num_app_names, const char *sink_name_input) {
+    return gsr_pipewire_audio_add_links_to_output(self, app_names, num_app_names, sink_name_input, GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_OUTPUT, GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_SINK, true);
+}
+
+bool gsr_pipewire_audio_add_link_from_sources_to_sink(gsr_pipewire_audio *self, const char **source_names, int num_source_names, const char *sink_name_input) {
+    return gsr_pipewire_audio_add_links_to_output(self, source_names, num_source_names, sink_name_input, GSR_PIPEWIRE_AUDIO_NODE_TYPE_SINK_OR_SOURCE, GSR_PIPEWIRE_AUDIO_LINK_INPUT_TYPE_SINK, false);
+}
+
+void gsr_pipewire_audio_for_each_app(gsr_pipewire_audio *self, gsr_pipewire_audio_app_query_callback callback, void *userdata) {
+    pw_thread_loop_lock(self->thread_loop);
+    for(int i = 0; i < (int)self->num_stream_nodes; ++i) {
+        const gsr_pipewire_audio_node *node = &self->stream_nodes[i];
+        if(node->type != GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_OUTPUT)
+            continue;
+
+        bool duplicate_app = false;
+        for(int j = i - 1; j >= 0; --j) {
+            const gsr_pipewire_audio_node *prev_node = &self->stream_nodes[j];
+            if(prev_node->type != GSR_PIPEWIRE_AUDIO_NODE_TYPE_STREAM_OUTPUT)
+                continue;
+
+            if(strcasecmp(node->name, prev_node->name) == 0) {
+                duplicate_app = true;
+                break;
+            }
+        }
+
+        if(duplicate_app)
+            continue;
+
+        if(!callback(node->name, userdata))
+            break;
+    }
+    pw_thread_loop_unlock(self->thread_loop);
+}
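Review note on gsr_pipewire_audio_for_each_app above: the function reports each application name once by scanning the earlier entries for a case-insensitive match before invoking the callback. The sketch below isolates that idea on a plain array of strings so it can be checked without a PipeWire connection. The names for_each_unique_name, name_callback and count_names are hypothetical and exist only for this illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical callback type for this sketch. Return false to stop iterating. */
typedef bool (*name_callback)(const char *name, void *userdata);

/*
 * Calls |callback| once per distinct name, comparing case-insensitively.
 * A name is skipped when an equal name occurs earlier in the array.
 * Returns the number of times the callback was invoked.
 */
size_t for_each_unique_name(const char **names, size_t num_names, name_callback callback, void *userdata) {
    assert(callback != NULL);

    size_t num_calls = 0;
    for(size_t i = 0; i < num_names; ++i) {
        bool duplicate = false;
        for(size_t j = 0; j < i; ++j) {
            if(strcasecmp(names[i], names[j]) == 0) {
                duplicate = true;
                break;
            }
        }

        if(duplicate)
            continue;

        ++num_calls;
        if(!callback(names[i], userdata))
            break;
    }
    return num_calls;
}

/* Example callback: counts the names it receives and never stops early. */
bool count_names(const char *name, void *userdata) {
    (void)name;
    ++*(size_t*)userdata;
    return true;
}
```

The backward scan is quadratic in the number of names, which is probably acceptable for the handful of audio streams a desktop session exposes. A larger list would likely want a sorted copy or a hash set instead.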
diff --git a/src/pipewire_video.c b/src/pipewire_video.c
new file mode 100644
index 0000000..83b0bc3
--- /dev/null
+++ b/src/pipewire_video.c
@@ -0,0 +1,855 @@
+#include "../include/pipewire_video.h"
+#include "../include/egl.h"
+#include "../include/utils.h"
+
+#include <pipewire/pipewire.h>
+#include <spa/param/video/format-utils.h>
+#include <spa/debug/types.h>
+
+#include <drm_fourcc.h>
+
+#include <fcntl.h>
+#include <unistd.h>
+
+/* This code is partially based on the xr-video-player PipeWire implementation, which is in turn based on obs-studio's PipeWire implementation */
+
+/* TODO: Make gsr_pipewire_video_init asynchronous */
+/* TODO: Support hdr when pipewire supports it */
+/* TODO: Test all of the image formats */
+
+#ifndef SPA_POD_PROP_FLAG_DONT_FIXATE
+#define SPA_POD_PROP_FLAG_DONT_FIXATE (1 << 4)
+#endif
+
+#define CURSOR_META_SIZE(width, height)                                    \
+    (sizeof(struct spa_meta_cursor) + sizeof(struct spa_meta_bitmap) + \
+     (width) * (height) * 4)
+
+static bool parse_pw_version(gsr_pipewire_video_data_version *dst, const char *version) {
+    const int n_matches = sscanf(version, "%d.%d.%d", &dst->major, &dst->minor, &dst->micro);
+    return n_matches == 3;
+}
+
+static bool check_pw_version(const gsr_pipewire_video_data_version *pw_version, int major, int minor, int micro) {
+    if (pw_version->major != major)
+        return pw_version->major > major;
+    if (pw_version->minor != minor)
+        return pw_version->minor > minor;
+    return pw_version->micro >= micro;
+}
+
+static void update_pw_versions(gsr_pipewire_video *self, const char *version) {
+    fprintf(stderr, "gsr info: pipewire: server version: %s\n", version);
+    fprintf(stderr, "gsr info: pipewire: library version: %s\n", pw_get_library_version());
+    fprintf(stderr, "gsr info: pipewire: header version: %s\n", pw_get_headers_version());
+    if(!parse_pw_version(&self->server_version, version))
+        fprintf(stderr, "gsr error: pipewire: failed to parse server version\n");
+}
+
+static void on_core_info_cb(void *user_data, const struct pw_core_info *info) {
+    gsr_pipewire_video *self = user_data;
+    update_pw_versions(self, info->version);
+}
+
+static void on_core_error_cb(void *user_data, uint32_t id, int seq, int res, const char *message) {
+    gsr_pipewire_video *self = user_data;
+    fprintf(stderr, "gsr error: pipewire: error id:%u seq:%d res:%d: %s\n", id, seq, res, message);
+    pw_thread_loop_signal(self->thread_loop, false);
+}
+
+static void on_core_done_cb(void *user_data, uint32_t id, int seq) {
+    gsr_pipewire_video *self = user_data;
+    if (id == PW_ID_CORE && self->server_version_sync == seq)
+        pw_thread_loop_signal(self->thread_loop, false);
+}
+
+static bool is_cursor_format_supported(const enum spa_video_format format) {
+    switch(format) {
+        case SPA_VIDEO_FORMAT_RGBx:       return true;
+        case SPA_VIDEO_FORMAT_BGRx:       return true;
+        case SPA_VIDEO_FORMAT_RGBA:       return true;
+        case SPA_VIDEO_FORMAT_BGRA:       return true;
+        case SPA_VIDEO_FORMAT_RGB:        return true;
+        case SPA_VIDEO_FORMAT_BGR:        return true;
+        case SPA_VIDEO_FORMAT_ARGB:       return true;
+        case SPA_VIDEO_FORMAT_ABGR:       return true;
+#if PW_CHECK_VERSION(0, 3, 41)
+        case SPA_VIDEO_FORMAT_xRGB_210LE: return true;
+        case SPA_VIDEO_FORMAT_xBGR_210LE: return true;
+        case SPA_VIDEO_FORMAT_ARGB_210LE: return true;
+        case SPA_VIDEO_FORMAT_ABGR_210LE: return true;
+#endif
+        default:                    break;
+    }
+    return false;
+}
+
+static const struct pw_core_events core_events = {
+    PW_VERSION_CORE_EVENTS,
+    .info = on_core_info_cb,
+    .done = on_core_done_cb,
+    .error = on_core_error_cb,
+};
+
+static void on_process_cb(void *user_data) {
+    gsr_pipewire_video *self = user_data;
+    struct spa_meta_cursor *cursor = NULL;
+    //struct spa_meta *video_damage = NULL;
+
+    /* Find the most recent buffer */
+    struct pw_buffer *pw_buf = NULL;
+    for(;;) {
+        struct pw_buffer *aux = pw_stream_dequeue_buffer(self->stream);
+        if(!aux)
+            break;
+        if(pw_buf)
+            pw_stream_queue_buffer(self->stream, pw_buf);
+        pw_buf = aux;
+    }
+
+    if(!pw_buf) {
+        fprintf(stderr, "gsr info: pipewire: out of buffers!\n");
+        return;
+    }
+
+    struct spa_buffer *buffer = pw_buf->buffer;
+    const bool has_buffer = buffer->datas[0].chunk->size != 0;
+    if(!has_buffer)
+        goto read_metadata;
+
+    pthread_mutex_lock(&self->mutex);
+
+    if(buffer->datas[0].type == SPA_DATA_DmaBuf) {
+        for(size_t i = 0; i < self->dmabuf_num_planes; ++i) {
+            if(self->dmabuf_data[i].fd > 0) {
+                close(self->dmabuf_data[i].fd);
+                self->dmabuf_data[i].fd = -1;
+            }
+        }
+
+        self->dmabuf_num_planes = buffer->n_datas;
+        if(self->dmabuf_num_planes > GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES)
+            self->dmabuf_num_planes = GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES;
+
+        for(size_t i = 0; i < self->dmabuf_num_planes; ++i) {
+            self->dmabuf_data[i].fd = dup(buffer->datas[i].fd);
+            self->dmabuf_data[i].offset = buffer->datas[i].chunk->offset;
+            self->dmabuf_data[i].stride = buffer->datas[i].chunk->stride;
+        }
+
+        self->damaged = true;
+    } else {
+        // TODO:
+    }
+
+    // TODO: Move down to read_metadata
+    struct spa_meta_region *region = spa_buffer_find_meta_data(buffer, SPA_META_VideoCrop, sizeof(*region));
+    if(region && spa_meta_region_is_valid(region)) {
+        // fprintf(stderr, "gsr info: pipewire: crop Region available (%dx%d+%d+%d)\n",
+        //      region->region.position.x, region->region.position.y,
+        //      region->region.size.width, region->region.size.height);
+        self->crop.x = region->region.position.x;
+        self->crop.y = region->region.position.y;
+        self->crop.width = region->region.size.width;
+        self->crop.height = region->region.size.height;
+        self->crop.valid = true;
+    } else {
+        self->crop.valid = false;
+    }
+
+    pthread_mutex_unlock(&self->mutex);
+
+read_metadata:
+
+    // video_damage = spa_buffer_find_meta(buffer, SPA_META_VideoDamage);
+    // if(video_damage) {
+    //     struct spa_meta_region *r = spa_meta_first(video_damage);
+    //     if(spa_meta_check(r, video_damage)) {
+    //         //fprintf(stderr, "damage: %d,%d %ux%u\n", r->region.position.x, r->region.position.y, r->region.size.width, r->region.size.height);
+    //         pthread_mutex_lock(&self->mutex);
+    //         self->damaged = true;
+    //         pthread_mutex_unlock(&self->mutex);
+    //     }
+    // }
+
+    cursor = spa_buffer_find_meta_data(buffer, SPA_META_Cursor, sizeof(*cursor));
+    self->cursor.valid = cursor && spa_meta_cursor_is_valid(cursor);
+
+    if (self->cursor.visible && self->cursor.valid) {
+        pthread_mutex_lock(&self->mutex);
+
+        struct spa_meta_bitmap *bitmap = NULL;
+        if (cursor->bitmap_offset)
+            bitmap = SPA_MEMBER(cursor, cursor->bitmap_offset, struct spa_meta_bitmap);
+
+        if (bitmap && bitmap->size.width > 0 && bitmap->size.height > 0 && is_cursor_format_supported(bitmap->format)) {
+            const uint8_t *bitmap_data = SPA_MEMBER(bitmap, bitmap->offset, uint8_t);
+            fprintf(stderr, "gsr info: pipewire: cursor bitmap update, size: %dx%d, format: %s\n",
+                (int)bitmap->size.width, (int)bitmap->size.height, spa_debug_type_find_name(spa_type_video_format, bitmap->format));
+
+            const size_t bitmap_size = bitmap->size.width * bitmap->size.height * 4;
+            uint8_t *new_bitmap_data = realloc(self->cursor.data, bitmap_size);
+            if(new_bitmap_data) {
+                self->cursor.data = new_bitmap_data;
+                /* TODO: Convert bgr and other image formats to rgb here */
+                memcpy(self->cursor.data, bitmap_data, bitmap_size);
+            }
+
+            self->cursor.hotspot_x = cursor->hotspot.x;
+            self->cursor.hotspot_y = cursor->hotspot.y;
+            self->cursor.width = bitmap->size.width;
+            self->cursor.height = bitmap->size.height;
+        }
+
+        self->cursor.x = cursor->position.x;
+        self->cursor.y = cursor->position.y;
+        pthread_mutex_unlock(&self->mutex);
+
+        //fprintf(stderr, "gsr info: pipewire: cursor: %d %d %d %d\n", cursor->hotspot.x, cursor->hotspot.y, cursor->position.x, cursor->position.y);
+    }
+
+    pw_stream_queue_buffer(self->stream, pw_buf);
+}
+
+static void on_param_changed_cb(void *user_data, uint32_t id, const struct spa_pod *param) {
+    gsr_pipewire_video *self = user_data;
+
+    if (!param || id != SPA_PARAM_Format)
+        return;
+
+    int result = spa_format_parse(param, &self->format.media_type, &self->format.media_subtype);
+    if (result < 0)
+        return;
+
+    if (self->format.media_type != SPA_MEDIA_TYPE_video || self->format.media_subtype != SPA_MEDIA_SUBTYPE_raw)
+        return;
+
+    pthread_mutex_lock(&self->mutex);
+    spa_format_video_raw_parse(param, &self->format.info.raw);
+    pthread_mutex_unlock(&self->mutex);
+
+    uint32_t buffer_types = 0;
+    const bool has_modifier = spa_pod_find_prop(param, NULL, SPA_FORMAT_VIDEO_modifier) != NULL;
+    if(has_modifier || check_pw_version(&self->server_version, 0, 3, 24))
+        buffer_types |= 1 << SPA_DATA_DmaBuf;
+
+    fprintf(stderr, "gsr info: pipewire: negotiated format:\n");
+
+    fprintf(stderr, "gsr info: pipewire:    Format: %d (%s)\n",
+         self->format.info.raw.format,
+         spa_debug_type_find_name(spa_type_video_format, self->format.info.raw.format));
+
+    if(has_modifier) {
+        fprintf(stderr, "gsr info: pipewire:    Modifier: 0x%" PRIx64 "\n", self->format.info.raw.modifier);
+    }
+
+    fprintf(stderr, "gsr info: pipewire:    Size: %dx%d\n", self->format.info.raw.size.width, self->format.info.raw.size.height);
+    fprintf(stderr, "gsr info: pipewire:    Framerate: %d/%d\n", self->format.info.raw.framerate.num, self->format.info.raw.framerate.denom);
+
+    uint8_t params_buffer[1024];
+    struct spa_pod_builder pod_builder = SPA_POD_BUILDER_INIT(params_buffer, sizeof(params_buffer));
+    const struct spa_pod *params[4];
+
+    params[0] = spa_pod_builder_add_object(
+        &pod_builder, SPA_TYPE_OBJECT_ParamMeta, SPA_PARAM_Meta,
+        SPA_PARAM_META_type, SPA_POD_Id(SPA_META_VideoCrop),
+        SPA_PARAM_META_size,
+        SPA_POD_Int(sizeof(struct spa_meta_region)));
+
+    params[1] = spa_pod_builder_add_object(
+        &pod_builder, SPA_TYPE_OBJECT_ParamMeta, SPA_PARAM_Meta,
+        SPA_PARAM_META_type, SPA_POD_Id(SPA_META_VideoDamage),
+        SPA_PARAM_META_size, SPA_POD_CHOICE_RANGE_Int(
+                                sizeof(struct spa_meta_region) * 16,
+                                sizeof(struct spa_meta_region) * 1,
+                                sizeof(struct spa_meta_region) * 16));
+
+    params[2] = spa_pod_builder_add_object(
+        &pod_builder, SPA_TYPE_OBJECT_ParamMeta, SPA_PARAM_Meta,
+        SPA_PARAM_META_type, SPA_POD_Id(SPA_META_Cursor),
+        SPA_PARAM_META_size,
+        SPA_POD_CHOICE_RANGE_Int(CURSOR_META_SIZE(64, 64),
+                     CURSOR_META_SIZE(1, 1),
+                     CURSOR_META_SIZE(1024, 1024)));
+
+    params[3] = spa_pod_builder_add_object(
+        &pod_builder, SPA_TYPE_OBJECT_ParamBuffers, SPA_PARAM_Buffers,
+        SPA_PARAM_BUFFERS_dataType, SPA_POD_Int(buffer_types));
+
+    pw_stream_update_params(self->stream, params, 4);
+    self->negotiated = true;
+}
+
+static void on_state_changed_cb(void *user_data, enum pw_stream_state old, enum pw_stream_state state, const char *error) {
+    (void)old;
+    gsr_pipewire_video *self = user_data;
+
+    fprintf(stderr, "gsr info: pipewire: stream %p state: \"%s\" (error: %s)\n",
+         (void*)self->stream, pw_stream_state_as_string(state),
+         error ? error : "none");
+}
+
+static const struct pw_stream_events stream_events = {
+    PW_VERSION_STREAM_EVENTS,
+    .state_changed = on_state_changed_cb,
+    .param_changed = on_param_changed_cb,
+    .process = on_process_cb,
+};
+
+static inline struct spa_pod *build_format(struct spa_pod_builder *b,
+                       const gsr_pipewire_video_video_info *ovi,
+                       uint32_t format, const uint64_t *modifiers,
+                       size_t modifier_count)
+{
+    struct spa_pod_frame format_frame;
+
+    spa_pod_builder_push_object(b, &format_frame, SPA_TYPE_OBJECT_Format, SPA_PARAM_EnumFormat);
+    spa_pod_builder_add(b, SPA_FORMAT_mediaType, SPA_POD_Id(SPA_MEDIA_TYPE_video), 0);
+    spa_pod_builder_add(b, SPA_FORMAT_mediaSubtype, SPA_POD_Id(SPA_MEDIA_SUBTYPE_raw), 0);
+
+    spa_pod_builder_add(b, SPA_FORMAT_VIDEO_format, SPA_POD_Id(format), 0);
+
+    if (modifier_count > 0) {
+        struct spa_pod_frame modifier_frame;
+
+        spa_pod_builder_prop(b, SPA_FORMAT_VIDEO_modifier, SPA_POD_PROP_FLAG_MANDATORY | SPA_POD_PROP_FLAG_DONT_FIXATE);
+        spa_pod_builder_push_choice(b, &modifier_frame, SPA_CHOICE_Enum, 0);
+
+        /* The first element of choice pods is the preferred value. Here
+         * we arbitrarily pick the first modifier as the preferred one.
+         */
+        // TODO:
+        spa_pod_builder_long(b, modifiers[0]);
+
+        for(uint32_t i = 0; i < modifier_count; i++)
+            spa_pod_builder_long(b, modifiers[i]);
+
+        spa_pod_builder_pop(b, &modifier_frame);
+    }
+
+    spa_pod_builder_add(b, SPA_FORMAT_VIDEO_size,
+                SPA_POD_CHOICE_RANGE_Rectangle(
+                    &SPA_RECTANGLE(32, 32),
+                    &SPA_RECTANGLE(1, 1),
+                    &SPA_RECTANGLE(16384, 16384)),
+                SPA_FORMAT_VIDEO_framerate,
+                SPA_POD_CHOICE_RANGE_Fraction(
+                    &SPA_FRACTION(ovi->fps_num, ovi->fps_den),
+                    &SPA_FRACTION(0, 1), &SPA_FRACTION(500, 1)),
+                0);
+    return spa_pod_builder_pop(b, &format_frame);
+}
+
+/* https://gstreamer.freedesktop.org/documentation/additional/design/mediatype-video-raw.html?gi-language=c#formats */
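+/* SPA/GStreamer format names list components in memory (byte) order, while DRM format names list them from the most significant bits of a little-endian word, so the component order appears reversed */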
+static int64_t spa_video_format_to_drm_format(const enum spa_video_format format) {
+    switch(format) {
+        case SPA_VIDEO_FORMAT_RGBx:       return DRM_FORMAT_XBGR8888;
+        case SPA_VIDEO_FORMAT_BGRx:       return DRM_FORMAT_XRGB8888;
+       // case SPA_VIDEO_FORMAT_RGBA:       return DRM_FORMAT_ABGR8888;
+        //case SPA_VIDEO_FORMAT_BGRA:       return DRM_FORMAT_ARGB8888;
+        case SPA_VIDEO_FORMAT_RGB:        return DRM_FORMAT_XBGR8888;
+        case SPA_VIDEO_FORMAT_BGR:        return DRM_FORMAT_XRGB8888;
+        //case SPA_VIDEO_FORMAT_ARGB:       return DRM_FORMAT_XRGB8888;
+        //case SPA_VIDEO_FORMAT_ABGR:       return DRM_FORMAT_XRGB8888;
+#if PW_CHECK_VERSION(0, 3, 41)
+        case SPA_VIDEO_FORMAT_xRGB_210LE: return DRM_FORMAT_XRGB2101010;
+        case SPA_VIDEO_FORMAT_xBGR_210LE: return DRM_FORMAT_XBGR2101010;
+      //  case SPA_VIDEO_FORMAT_ARGB_210LE: return DRM_FORMAT_ARGB2101010;
+      //  case SPA_VIDEO_FORMAT_ABGR_210LE: return DRM_FORMAT_ABGR2101010;
+#endif
+        default:                          break;
+    }
+    return DRM_FORMAT_INVALID;
+}
+
+#if PW_CHECK_VERSION(0, 3, 41)
+#define GSR_PIPEWIRE_VIDEO_NUM_VIDEO_FORMATS GSR_PIPEWIRE_VIDEO_MAX_VIDEO_FORMATS
+#else
+#define GSR_PIPEWIRE_VIDEO_NUM_VIDEO_FORMATS 4
+#endif
+
+static const enum spa_video_format video_formats[GSR_PIPEWIRE_VIDEO_MAX_VIDEO_FORMATS] = {
+   // SPA_VIDEO_FORMAT_BGRA,
+    SPA_VIDEO_FORMAT_BGRx,
+    SPA_VIDEO_FORMAT_BGR,
+    SPA_VIDEO_FORMAT_RGBx,
+   // SPA_VIDEO_FORMAT_RGBA,
+    SPA_VIDEO_FORMAT_RGB,
+  //  SPA_VIDEO_FORMAT_ARGB,
+  //  SPA_VIDEO_FORMAT_ABGR,
+#if PW_CHECK_VERSION(0, 3, 41)
+    SPA_VIDEO_FORMAT_xRGB_210LE,
+    SPA_VIDEO_FORMAT_xBGR_210LE,
+  //  SPA_VIDEO_FORMAT_ARGB_210LE,
+  //  SPA_VIDEO_FORMAT_ABGR_210LE
+#endif
+};
+
+static bool gsr_pipewire_video_build_format_params(gsr_pipewire_video *self, struct spa_pod_builder *pod_builder, struct spa_pod **params, uint32_t *num_params) {
+    *num_params = 0;
+
+    if(!check_pw_version(&self->server_version, 0, 3, 33))
+        return false;
+
+    for(size_t i = 0; i < GSR_PIPEWIRE_VIDEO_NUM_VIDEO_FORMATS; i++) {
+        if(self->supported_video_formats[i].modifiers_size == 0)
+            continue;
+        params[*num_params] = build_format(pod_builder, &self->video_info, self->supported_video_formats[i].format, self->modifiers + self->supported_video_formats[i].modifiers_index, self->supported_video_formats[i].modifiers_size);
+        ++(*num_params);
+    }
+
+    return true;
+}
+
+static void renegotiate_format(void *data, uint64_t expirations) {
+    (void)expirations;
+    gsr_pipewire_video *self = (gsr_pipewire_video*)data;
+
+    pw_thread_loop_lock(self->thread_loop);
+
+    struct spa_pod *params[GSR_PIPEWIRE_VIDEO_NUM_VIDEO_FORMATS];
+    uint32_t num_video_formats = 0;
+    uint8_t params_buffer[4096];
+    struct spa_pod_builder pod_builder = SPA_POD_BUILDER_INIT(params_buffer, sizeof(params_buffer));
+    if (!gsr_pipewire_video_build_format_params(self, &pod_builder, params, &num_video_formats)) {
+        fprintf(stderr, "gsr error: renegotiate_format: failed to build formats\n");
+        pw_thread_loop_unlock(self->thread_loop);
+        return;
+    }
+
+    pw_stream_update_params(self->stream, (const struct spa_pod**)params, num_video_formats);
+    pw_thread_loop_unlock(self->thread_loop);
+}
+
+static bool spa_video_format_get_modifiers(gsr_pipewire_video *self, const enum spa_video_format format, uint64_t *modifiers, int32_t max_modifiers, int32_t *num_modifiers) {
+    *num_modifiers = 0;
+
+    if(max_modifiers == 0) {
+        fprintf(stderr, "gsr error: spa_video_format_get_modifiers: no space for modifiers left\n");
+        //modifiers[0] = DRM_FORMAT_MOD_LINEAR;
+        //modifiers[1] = DRM_FORMAT_MOD_INVALID;
+        //*num_modifiers = 2;
+        return false;
+    }
+
+    if(!self->egl->eglQueryDmaBufModifiersEXT) {
+        fprintf(stderr, "gsr error: spa_video_format_get_modifiers: failed to initialize modifiers because eglQueryDmaBufModifiersEXT is not available\n");
+        //modifiers[0] = DRM_FORMAT_MOD_LINEAR;
+        //modifiers[1] = DRM_FORMAT_MOD_INVALID;
+        //*num_modifiers = 2;
+        return false;
+    }
+
+    const int64_t drm_format = spa_video_format_to_drm_format(format);
+    if(drm_format == DRM_FORMAT_INVALID) {
+        fprintf(stderr, "gsr error: spa_video_format_get_modifiers: unsupported format: %d\n", (int)format);
+        return false;
+    }
+
+    if(!self->egl->eglQueryDmaBufModifiersEXT(self->egl->egl_display, drm_format, max_modifiers, modifiers, NULL, num_modifiers)) {
+        fprintf(stderr, "gsr error: spa_video_format_get_modifiers: eglQueryDmaBufModifiersEXT failed with drm format %d, %" PRIi64 "\n", (int)format, drm_format);
+        //modifiers[0] = DRM_FORMAT_MOD_LINEAR;
+        //modifiers[1] = DRM_FORMAT_MOD_INVALID;
+        //*num_modifiers = 2;
+        *num_modifiers = 0;
+        return false;
+    }
+
+    // if(*num_modifiers + 2 <= max_modifiers) {
+    //     modifiers[*num_modifiers + 0] = DRM_FORMAT_MOD_LINEAR;
+    //     modifiers[*num_modifiers + 1] = DRM_FORMAT_MOD_INVALID;
+    //     *num_modifiers += 2;
+    // }
+    return true;
+}
+
+static void gsr_pipewire_video_init_modifiers(gsr_pipewire_video *self) {
+    for(size_t i = 0; i < GSR_PIPEWIRE_VIDEO_NUM_VIDEO_FORMATS; i++) {
+        self->supported_video_formats[i].format = video_formats[i];
+        int32_t num_modifiers = 0;
+        spa_video_format_get_modifiers(self, self->supported_video_formats[i].format, self->modifiers + self->num_modifiers, GSR_PIPEWIRE_VIDEO_MAX_MODIFIERS - self->num_modifiers, &num_modifiers);
+        self->supported_video_formats[i].modifiers_index = self->num_modifiers;
+        self->supported_video_formats[i].modifiers_size = num_modifiers;
+        self->num_modifiers += num_modifiers;
+    }
+}
+
+static void gsr_pipewire_video_format_remove_modifier(gsr_pipewire_video *self, gsr_video_format *video_format, uint64_t modifier) {
+    for(size_t i = 0; i < video_format->modifiers_size; ++i) {
+        if(self->modifiers[video_format->modifiers_index + i] != modifier)
+            continue;
+
+        for(size_t j = i + 1; j < video_format->modifiers_size; ++j) {
+            self->modifiers[video_format->modifiers_index + j - 1] = self->modifiers[video_format->modifiers_index + j];
+        }
+        --video_format->modifiers_size;
+        return;
+    }
+}
+
+static void gsr_pipewire_video_remove_modifier(gsr_pipewire_video *self, uint64_t modifier) {
+    for(size_t i = 0; i < GSR_PIPEWIRE_VIDEO_NUM_VIDEO_FORMATS; i++) {
+        gsr_video_format *video_format = &self->supported_video_formats[i];
+        gsr_pipewire_video_format_remove_modifier(self, video_format, modifier);
+    }
+}
+
+static bool gsr_pipewire_video_setup_stream(gsr_pipewire_video *self) {
+    struct spa_pod *params[GSR_PIPEWIRE_VIDEO_NUM_VIDEO_FORMATS];
+    uint32_t num_video_formats = 0;
+    uint8_t params_buffer[4096];
+    struct spa_pod_builder pod_builder = SPA_POD_BUILDER_INIT(params_buffer, sizeof(params_buffer));
+
+    self->thread_loop = pw_thread_loop_new("gsr screen capture", NULL);
+    if(!self->thread_loop) {
+        fprintf(stderr, "gsr error: gsr_pipewire_video_setup_stream: failed to create pipewire thread\n");
+        goto error;
+    }
+
+    self->context = pw_context_new(pw_thread_loop_get_loop(self->thread_loop), NULL, 0);
+    if(!self->context) {
+        fprintf(stderr, "gsr error: gsr_pipewire_video_setup_stream: failed to create pipewire context\n");
+        goto error;
+    }
+
+    if(pw_thread_loop_start(self->thread_loop) < 0) {
+        fprintf(stderr, "gsr error: gsr_pipewire_video_setup_stream: failed to start thread\n");
+        goto error;
+    }
+
+    pw_thread_loop_lock(self->thread_loop);
+
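+    // The third fcntl argument (5) is the lowest fd number the duplicate may be assigned, which keeps it clear of stdin/stdout/stderr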
+    self->core = pw_context_connect_fd(self->context, fcntl(self->fd, F_DUPFD_CLOEXEC, 5), NULL, 0);
+    if(!self->core) {
+        pw_thread_loop_unlock(self->thread_loop);
+        fprintf(stderr, "gsr error: gsr_pipewire_video_setup_stream: failed to connect to fd %d\n", self->fd);
+        goto error;
+    }
+
+    // TODO: Error check
+    pw_core_add_listener(self->core, &self->core_listener, &core_events, self);
+
+    self->server_version_sync = pw_core_sync(self->core, PW_ID_CORE, 0);
+    pw_thread_loop_wait(self->thread_loop);
+
+    gsr_pipewire_video_init_modifiers(self);
+
+    // TODO: Cleanup?
+    self->reneg = pw_loop_add_event(pw_thread_loop_get_loop(self->thread_loop), renegotiate_format, self);
+    if(!self->reneg) {
+        pw_thread_loop_unlock(self->thread_loop);
+        fprintf(stderr, "gsr error: gsr_pipewire_video_setup_stream: pw_loop_add_event failed\n");
+        goto error;
+    }
+
+    self->stream = pw_stream_new(self->core, "com.dec05eba.gpu_screen_recorder",
+        pw_properties_new(PW_KEY_MEDIA_TYPE, "Video",
+                          PW_KEY_MEDIA_CATEGORY, "Capture",
+                          PW_KEY_MEDIA_ROLE, "Screen", NULL));
+    if(!self->stream) {
+        pw_thread_loop_unlock(self->thread_loop);
+        fprintf(stderr, "gsr error: gsr_pipewire_video_setup_stream: failed to create stream\n");
+        goto error;
+    }
+    pw_stream_add_listener(self->stream, &self->stream_listener, &stream_events, self);
+
+    if(!gsr_pipewire_video_build_format_params(self, &pod_builder, params, &num_video_formats)) {
+        pw_thread_loop_unlock(self->thread_loop);
+        fprintf(stderr, "gsr error: gsr_pipewire_video_setup_stream: failed to build format params\n");
+        goto error;
+    }
+
+    if(pw_stream_connect(
+        self->stream, PW_DIRECTION_INPUT, self->node,
+        PW_STREAM_FLAG_AUTOCONNECT | PW_STREAM_FLAG_MAP_BUFFERS, (const struct spa_pod**)params,
+        num_video_formats) < 0)
+    {
+        pw_thread_loop_unlock(self->thread_loop);
+        fprintf(stderr, "gsr error: gsr_pipewire_video_setup_stream: failed to connect stream\n");
+        goto error;
+    }
+
+    pw_thread_loop_unlock(self->thread_loop);
+    return true;
+
+    error:
+    if(self->thread_loop) {
+        //pw_thread_loop_wait(self->thread_loop);
+        pw_thread_loop_stop(self->thread_loop);
+    }
+
+    if(self->stream) {
+        pw_stream_disconnect(self->stream);
+        pw_stream_destroy(self->stream);
+        self->stream = NULL;
+    }
+
+    if(self->core) {
+        pw_core_disconnect(self->core);
+        self->core = NULL;
+    }
+
+    if(self->context) {
+        pw_context_destroy(self->context);
+        self->context = NULL;
+    }
+
+    if(self->thread_loop) {
+        pw_thread_loop_destroy(self->thread_loop);
+        self->thread_loop = NULL;
+    }
+    return false;
+}
+
+static int pw_init_counter = 0;
+bool gsr_pipewire_video_init(gsr_pipewire_video *self, int pipewire_fd, uint32_t pipewire_node, int fps, bool capture_cursor, gsr_egl *egl) {
+    if(pw_init_counter == 0)
+        pw_init(NULL, NULL);
+    ++pw_init_counter;
+
+    memset(self, 0, sizeof(*self));
+    self->egl = egl;
+    self->fd = pipewire_fd;
+    self->node = pipewire_node;
+    if(pthread_mutex_init(&self->mutex, NULL) != 0) {
+        fprintf(stderr, "gsr error: gsr_pipewire_video_init: failed to initialize mutex\n");
+        gsr_pipewire_video_deinit(self);
+        return false;
+    }
+    self->mutex_initialized = true;
+    self->video_info.fps_num = fps;
+    self->video_info.fps_den = 1;
+    self->cursor.visible = capture_cursor;
+
+    if(!gsr_pipewire_video_setup_stream(self)) {
+        gsr_pipewire_video_deinit(self);
+        return false;
+    }
+
+    return true;
+}
+
+void gsr_pipewire_video_deinit(gsr_pipewire_video *self) {
+    if(self->thread_loop) {
+        //pw_thread_loop_wait(self->thread_loop);
+        pw_thread_loop_stop(self->thread_loop);
+    }
+
+    if(self->stream) {
+        pw_stream_disconnect(self->stream);
+        pw_stream_destroy(self->stream);
+        self->stream = NULL;
+    }
+
+    if(self->core) {
+        pw_core_disconnect(self->core);
+        self->core = NULL;
+    }
+
+    if(self->context) {
+        pw_context_destroy(self->context);
+        self->context = NULL;
+    }
+
+    if(self->thread_loop) {
+        pw_thread_loop_destroy(self->thread_loop);
+        self->thread_loop = NULL;
+    }
+
+    if(self->fd > 0) {
+        close(self->fd);
+        self->fd = -1;
+    }
+
+    for(size_t i = 0; i < self->dmabuf_num_planes; ++i) {
+        if(self->dmabuf_data[i].fd > 0) {
+            close(self->dmabuf_data[i].fd);
+            self->dmabuf_data[i].fd = -1;
+        }
+    }
+    self->dmabuf_num_planes = 0;
+
+    self->negotiated = false;
+    self->renegotiated = false;
+
+    if(self->mutex_initialized) {
+        pthread_mutex_destroy(&self->mutex);
+        self->mutex_initialized = false;
+    }
+
+    if(self->cursor.data) {
+        free(self->cursor.data);
+        self->cursor.data = NULL;
+    }
+
+    --pw_init_counter;
+    if(pw_init_counter == 0) {
+#if PW_CHECK_VERSION(0, 3, 49)
+        pw_deinit();
+#endif
+    }
+}
+
+static EGLImage gsr_pipewire_video_create_egl_image(gsr_pipewire_video *self, const int *fds, const uint32_t *offsets, const uint32_t *pitches, const uint64_t *modifiers, bool use_modifiers) {
+    intptr_t img_attr[44];
+    setup_dma_buf_attrs(img_attr, spa_video_format_to_drm_format(self->format.info.raw.format), self->format.info.raw.size.width, self->format.info.raw.size.height,
+        fds, offsets, pitches, modifiers, self->dmabuf_num_planes, use_modifiers);
+    while(self->egl->eglGetError() != EGL_SUCCESS){}
+    EGLImage image = self->egl->eglCreateImage(self->egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr);
+    if(!image || self->egl->eglGetError() != EGL_SUCCESS) {
+        if(image)
+            self->egl->eglDestroyImage(self->egl->egl_display, image);
+        return NULL;
+    }
+    return image;
+}
+
+static EGLImage gsr_pipewire_video_create_egl_image_with_fallback(gsr_pipewire_video *self) {
+    int fds[GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES];
+    uint32_t offsets[GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES];
+    uint32_t pitches[GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES];
+    uint64_t modifiers[GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES];
+    for(size_t i = 0; i < self->dmabuf_num_planes; ++i) {
+        fds[i] = self->dmabuf_data[i].fd;
+        offsets[i] = self->dmabuf_data[i].offset;
+        pitches[i] = self->dmabuf_data[i].stride;
+        modifiers[i] = self->format.info.raw.modifier;
+    }
+
+    EGLImage image = NULL;
+    if(self->no_modifiers_fallback) {
+        image = gsr_pipewire_video_create_egl_image(self, fds, offsets, pitches, modifiers, false);
+    } else {
+        image = gsr_pipewire_video_create_egl_image(self, fds, offsets, pitches, modifiers, true);
+        if(!image) {
+            if(self->renegotiated) {
+                fprintf(stderr, "gsr error: gsr_pipewire_video_create_egl_image_with_fallback: failed to create egl image with modifiers, trying without modifiers\n");
+                self->no_modifiers_fallback = true;
+                image = gsr_pipewire_video_create_egl_image(self, fds, offsets, pitches, modifiers, false);
+            } else {
+                fprintf(stderr, "gsr error: gsr_pipewire_video_create_egl_image_with_fallback: failed to create egl image with modifiers, renegotiating with a different modifier\n");
+                self->negotiated = false;
+                self->renegotiated = true;
+                gsr_pipewire_video_remove_modifier(self, self->format.info.raw.modifier);
+                pw_thread_loop_lock(self->thread_loop);
+                pw_loop_signal_event(pw_thread_loop_get_loop(self->thread_loop), self->reneg);
+                pw_thread_loop_unlock(self->thread_loop);
+            }
+        }
+    }
+    return image;
+}
+
+static bool gsr_pipewire_video_bind_image_to_texture(gsr_pipewire_video *self, EGLImage image, unsigned int texture_id, bool external_texture) {
+    const int texture_target = external_texture ? GL_TEXTURE_EXTERNAL_OES : GL_TEXTURE_2D;
+    while(self->egl->glGetError() != 0){}
+    self->egl->glBindTexture(texture_target, texture_id);
+    self->egl->glEGLImageTargetTexture2DOES(texture_target, image);
+    const bool success = self->egl->glGetError() == 0;
+    self->egl->glBindTexture(texture_target, 0);
+    return success;
+}
+
+static void gsr_pipewire_video_bind_image_to_texture_with_fallback(gsr_pipewire_video *self, gsr_texture_map texture_map, EGLImage image) {
+    if(self->external_texture_fallback) {
+        gsr_pipewire_video_bind_image_to_texture(self, image, texture_map.external_texture_id, true);
+    } else {
+        if(!gsr_pipewire_video_bind_image_to_texture(self, image, texture_map.texture_id, false)) {
+            fprintf(stderr, "gsr error: gsr_pipewire_video_bind_image_to_texture_with_fallback: failed to bind image to texture, trying with external texture\n");
+            self->external_texture_fallback = true;
+            gsr_pipewire_video_bind_image_to_texture(self, image, texture_map.external_texture_id, true);
+        }
+    }
+}
+
+static void gsr_pipewire_video_update_cursor_texture(gsr_pipewire_video *self, gsr_texture_map texture_map) {
+    if(!self->cursor.data)
+        return;
+
+    self->egl->glBindTexture(GL_TEXTURE_2D, texture_map.cursor_texture_id);
+    // TODO: glTextureSubImage2D if same size
+    self->egl->glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, self->cursor.width, self->cursor.height, 0, GL_RGBA, GL_UNSIGNED_BYTE, self->cursor.data);
+    self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+    self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+    self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+
+    free(self->cursor.data);
+    self->cursor.data = NULL;
+}
+
+bool gsr_pipewire_video_map_texture(gsr_pipewire_video *self, gsr_texture_map texture_map, gsr_pipewire_video_region *region, gsr_pipewire_video_region *cursor_region, gsr_pipewire_video_dmabuf_data *dmabuf_data, int *num_dmabuf_data, uint32_t *fourcc, uint64_t *modifiers, bool *using_external_image) {
+    for(int i = 0; i < GSR_PIPEWIRE_VIDEO_DMABUF_MAX_PLANES; ++i) {
+        memset(&dmabuf_data[i], 0, sizeof(gsr_pipewire_video_dmabuf_data));
+    }
+    *num_dmabuf_data = 0;
+    *using_external_image = self->external_texture_fallback;
+    *fourcc = 0;
+    *modifiers = 0;
+    pthread_mutex_lock(&self->mutex);
+
+    if(!self->negotiated || self->dmabuf_data[0].fd <= 0) {
+        pthread_mutex_unlock(&self->mutex);
+        return false;
+    }
+
+    EGLImage image = gsr_pipewire_video_create_egl_image_with_fallback(self);
+    if(!image) {
+        pthread_mutex_unlock(&self->mutex);
+        return false;
+    }
+
+    gsr_pipewire_video_bind_image_to_texture_with_fallback(self, texture_map, image);
+    *using_external_image = self->external_texture_fallback;
+    self->egl->eglDestroyImage(self->egl->egl_display, image);
+
+    gsr_pipewire_video_update_cursor_texture(self, texture_map);
+
+    region->x = 0;
+    region->y = 0;
+
+    region->width = self->format.info.raw.size.width;
+    region->height = self->format.info.raw.size.height;
+
+    if(self->crop.valid) {
+        region->x = self->crop.x;
+        region->y = self->crop.y;
+
+        region->width = self->crop.width;
+        region->height = self->crop.height;
+    }
+
+    /* TODO: Test if cursor hotspot is correct */
+    cursor_region->x = self->cursor.x - self->cursor.hotspot_x;
+    cursor_region->y = self->cursor.y - self->cursor.hotspot_y;
+
+    cursor_region->width = self->cursor.width;
+    cursor_region->height = self->cursor.height;
+
+    for(size_t i = 0; i < self->dmabuf_num_planes; ++i) {
+        dmabuf_data[i] = self->dmabuf_data[i];
+        self->dmabuf_data[i].fd = -1;
+    }
+    *num_dmabuf_data = self->dmabuf_num_planes;
+    *fourcc = spa_video_format_to_drm_format(self->format.info.raw.format);
+    *modifiers = self->format.info.raw.modifier;
+    self->dmabuf_num_planes = 0;
+
+    pthread_mutex_unlock(&self->mutex);
+    return true;
+}
+
+bool gsr_pipewire_video_is_damaged(gsr_pipewire_video *self) {
+    bool damaged = false;
+    pthread_mutex_lock(&self->mutex);
+    damaged = self->damaged;
+    pthread_mutex_unlock(&self->mutex);
+    return damaged;
+}
+
+void gsr_pipewire_video_clear_damage(gsr_pipewire_video *self) {
+    pthread_mutex_lock(&self->mutex);
+    self->damaged = false;
+    pthread_mutex_unlock(&self->mutex);
+}
diff --git a/src/replay_buffer/replay_buffer.c b/src/replay_buffer/replay_buffer.c
new file mode 100644
index 0000000..92aa645
--- /dev/null
+++ b/src/replay_buffer/replay_buffer.c
@@ -0,0 +1,91 @@
+#include "../../include/replay_buffer/replay_buffer.h"
+#include "../../include/replay_buffer/replay_buffer_ram.h"
+#include "../../include/replay_buffer/replay_buffer_disk.h"
+
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+
+gsr_replay_buffer* gsr_replay_buffer_create(gsr_replay_storage replay_storage, const char *replay_directory, double replay_buffer_time, size_t replay_buffer_num_packets) {
+    gsr_replay_buffer *replay_buffer = NULL;
+    switch(replay_storage) {
+        case GSR_REPLAY_STORAGE_RAM:
+            replay_buffer = gsr_replay_buffer_ram_create(replay_buffer_num_packets);
+            break;
+        case GSR_REPLAY_STORAGE_DISK:
+            replay_buffer = gsr_replay_buffer_disk_create(replay_directory, replay_buffer_time);
+            break;
+    }
+    if(!replay_buffer)
+        return NULL;
+    replay_buffer->mutex_initialized = false;
+    replay_buffer->original_replay_buffer = NULL;
+    if(pthread_mutex_init(&replay_buffer->mutex, NULL) != 0) {
+        gsr_replay_buffer_destroy(replay_buffer);
+        return NULL;
+    }
+    replay_buffer->mutex_initialized = true;
+    return replay_buffer;
+}
+
+void gsr_replay_buffer_destroy(gsr_replay_buffer *self) {
+    self->destroy(self);
+    if(self->mutex_initialized && !self->original_replay_buffer) {
+        pthread_mutex_destroy(&self->mutex);
+        self->mutex_initialized = false;
+    }
+    self->original_replay_buffer = NULL;
+    free(self);
+}
+
+void gsr_replay_buffer_lock(gsr_replay_buffer *self) {
+    if(self->original_replay_buffer) {
+        gsr_replay_buffer_lock(self->original_replay_buffer);
+        return;
+    }
+
+    if(self->mutex_initialized)
+        pthread_mutex_lock(&self->mutex);
+}
+
+void gsr_replay_buffer_unlock(gsr_replay_buffer *self) {
+    if(self->original_replay_buffer) {
+        gsr_replay_buffer_unlock(self->original_replay_buffer);
+        return;
+    }
+
+    if(self->mutex_initialized)
+        pthread_mutex_unlock(&self->mutex);
+}
+
+bool gsr_replay_buffer_append(gsr_replay_buffer *self, const AVPacket *av_packet, double timestamp) {
+    return self->append(self, av_packet, timestamp);
+}
+
+void gsr_replay_buffer_clear(gsr_replay_buffer *self) {
+    self->clear(self);
+}
+
+AVPacket* gsr_replay_buffer_iterator_get_packet(gsr_replay_buffer *self, gsr_replay_buffer_iterator iterator) {
+    return self->iterator_get_packet(self, iterator);
+}
+
+uint8_t* gsr_replay_buffer_iterator_get_packet_data(gsr_replay_buffer *self, gsr_replay_buffer_iterator iterator) {
+    return self->iterator_get_packet_data(self, iterator);
+}
+
+gsr_replay_buffer* gsr_replay_buffer_clone(gsr_replay_buffer *self) {
+    return self->clone(self);
+}
+
+gsr_replay_buffer_iterator gsr_replay_buffer_find_packet_index_by_time_passed(gsr_replay_buffer *self, int seconds) {
+    return self->find_packet_index_by_time_passed(self, seconds);
+}
+
+gsr_replay_buffer_iterator gsr_replay_buffer_find_keyframe(gsr_replay_buffer *self, gsr_replay_buffer_iterator start_iterator, int stream_index, bool invert_stream_index) {
+    return self->find_keyframe(self, start_iterator, stream_index, invert_stream_index);
+}
+
+bool gsr_replay_buffer_iterator_next(gsr_replay_buffer *self, gsr_replay_buffer_iterator *iterator) {
+    return self->iterator_next(self, iterator);
+}
diff --git a/src/replay_buffer/replay_buffer_disk.c b/src/replay_buffer/replay_buffer_disk.c
new file mode 100644
index 0000000..3fff9f3
--- /dev/null
+++ b/src/replay_buffer/replay_buffer_disk.c
@@ -0,0 +1,437 @@
+#include "../../include/replay_buffer/replay_buffer_disk.h"
+#include "../../include/utils.h"
+
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <time.h>
+#include <errno.h>
+#include <assert.h>
+
+#define REPLAY_BUFFER_FILE_SIZE_BYTES (1024 * 1024 * 256) /* 256MB */
+#define FILE_PREFIX "Replay"
+
+static void gsr_replay_buffer_disk_set_impl_funcs(gsr_replay_buffer_disk *self);
+
+static void gsr_av_packet_disk_init(gsr_av_packet_disk *self, const AVPacket *av_packet, size_t data_index, double timestamp) {
+    self->packet = *av_packet;
+    self->packet.data = NULL;
+    self->data_index = data_index;
+    self->timestamp = timestamp;
+}
+
+static gsr_replay_buffer_file* gsr_replay_buffer_file_create(char *replay_directory, size_t replay_storage_counter, double timestamp, int *replay_storage_fd) {
+    gsr_replay_buffer_file *self = calloc(1, sizeof(gsr_replay_buffer_file));
+    if(!self) {
+        fprintf(stderr, "gsr error: gsr_replay_buffer_file_create: failed to create buffer file\n");
+        return NULL;
+    }
+
+    if(create_directory_recursive(replay_directory) != 0) {
+        fprintf(stderr, "gsr error: gsr_replay_buffer_file_create: failed to create replay directory: %s\n", replay_directory);
+        free(self);
+        return NULL;
+    }
+
+    char filename[PATH_MAX];
+    snprintf(filename, sizeof(filename), "%s/%s_%d.gsr", replay_directory, FILE_PREFIX, (int)replay_storage_counter);
+    *replay_storage_fd = creat(filename, 0700);
+    if(*replay_storage_fd <= 0) {
+        fprintf(stderr, "gsr error: gsr_replay_buffer_file_create: failed to create replay file: %s\n", filename);
+        free(self);
+        return NULL;
+    }
+
+    self->id = replay_storage_counter;
+    self->start_timestamp = timestamp;
+    self->end_timestamp = timestamp;
+    self->ref_counter = 1;
+    self->fd = -1;
+
+    self->packets = NULL;
+    self->capacity_num_packets = 0;
+    self->num_packets = 0;
+    return self;
+}
+
+static gsr_replay_buffer_file* gsr_replay_buffer_file_ref(gsr_replay_buffer_file *self) {
+    if(self->ref_counter >= 1)
+        ++self->ref_counter;
+    return self;
+}
+
+static void gsr_replay_buffer_file_free(gsr_replay_buffer_file *self, const char *replay_directory) {
+    self->ref_counter = 0;
+
+    if(self->fd > 0) {
+        close(self->fd);
+        self->fd = -1;
+    }
+
+    char filename[PATH_MAX];
+    snprintf(filename, sizeof(filename), "%s/%s_%d.gsr", replay_directory, FILE_PREFIX, (int)self->id);
+    remove(filename);
+
+    if(self->packets) {
+        free(self->packets);
+        self->packets = NULL;
+    }
+    self->num_packets = 0;
+    self->capacity_num_packets = 0;
+
+    free(self);
+}
+
+static void gsr_replay_buffer_file_unref(gsr_replay_buffer_file *self, const char *replay_directory) {
+    if(self->ref_counter > 0)
+        --self->ref_counter;
+
+    if(self->ref_counter <= 0)
+        gsr_replay_buffer_file_free(self, replay_directory);
+}
+
+static void gsr_replay_buffer_disk_clear(gsr_replay_buffer *replay_buffer) {
+    gsr_replay_buffer_disk *self = (gsr_replay_buffer_disk*)replay_buffer;
+    gsr_replay_buffer_lock(&self->replay_buffer);
+
+    for(size_t i = 0; i < self->num_files; ++i) {
+        gsr_replay_buffer_file_unref(self->files[i], self->replay_directory);
+    }
+    self->num_files = 0;
+
+    if(self->storage_fd > 0) {
+        close(self->storage_fd);
+        self->storage_fd = 0;
+    }
+
+    self->storage_num_bytes_written = 0;
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+}
+
+static void gsr_replay_buffer_disk_destroy(gsr_replay_buffer *replay_buffer) {
+    gsr_replay_buffer_disk *self = (gsr_replay_buffer_disk*)replay_buffer;
+    gsr_replay_buffer_disk_clear(replay_buffer);
+
+    if(self->owns_directory) {
+        remove(self->replay_directory);
+        self->owns_directory = false;
+    }
+}
+
+static bool file_write_all(int fd, const uint8_t *data, size_t size, size_t *bytes_written_total) {
+    *bytes_written_total = 0;
+    while(*bytes_written_total < size) {
+        const ssize_t bytes_written = write(fd, data + *bytes_written_total, size - *bytes_written_total);
+        if(bytes_written == -1) {
+            if(errno == EAGAIN || errno == EINTR)
+                continue;
+            else
+                return false;
+        }
+        *bytes_written_total += bytes_written;
+    }
+    return true;
+}
+
+static bool gsr_replay_buffer_disk_create_next_file(gsr_replay_buffer_disk *self, double timestamp) {
+    if(self->num_files + 1 >= GSR_REPLAY_BUFFER_CAPACITY_NUM_FILES) {
+        fprintf(stderr, "gsr error: gsr_replay_buffer_disk_create_next_file: too many replay buffer files created! (> %d), either reduce the replay buffer time or report this as a bug\n", (int)GSR_REPLAY_BUFFER_CAPACITY_NUM_FILES);
+        return false;
+    }
+
+    gsr_replay_buffer_file *replay_buffer_file = gsr_replay_buffer_file_create(self->replay_directory, self->storage_counter, timestamp, &self->storage_fd);
+    if(!replay_buffer_file)
+        return false;
+
+    self->files[self->num_files] = replay_buffer_file;
+    ++self->num_files;
+    ++self->storage_counter;
+    return true;
+}
+
+static bool gsr_replay_buffer_disk_append_to_current_file(gsr_replay_buffer_disk *self, const AVPacket *av_packet, double timestamp) {
+    gsr_replay_buffer_file *replay_buffer_file = self->files[self->num_files - 1];
+    replay_buffer_file->end_timestamp = timestamp;
+
+    if(replay_buffer_file->num_packets + 1 >= replay_buffer_file->capacity_num_packets) {
+        size_t new_capacity_num_packets = replay_buffer_file->capacity_num_packets * 2;
+        if(new_capacity_num_packets == 0)
+            new_capacity_num_packets = 256;
+
+        void *new_packets = realloc(replay_buffer_file->packets, new_capacity_num_packets * sizeof(gsr_av_packet_disk));
+        if(!new_packets) {
+            fprintf(stderr, "gsr error: gsr_replay_buffer_disk_append_to_current_file: failed to reallocate replay buffer file packets\n");
+            return false;
+        }
+
+        replay_buffer_file->capacity_num_packets = new_capacity_num_packets;
+        replay_buffer_file->packets = new_packets;
+    }
+
+    gsr_av_packet_disk *packet = &replay_buffer_file->packets[replay_buffer_file->num_packets];
+    gsr_av_packet_disk_init(packet, av_packet, self->storage_num_bytes_written, timestamp);
+    ++replay_buffer_file->num_packets;
+
+    size_t bytes_written = 0;
+    const bool file_written = file_write_all(self->storage_fd, av_packet->data, av_packet->size, &bytes_written);
+    self->storage_num_bytes_written += bytes_written;
+    if(self->storage_num_bytes_written >= REPLAY_BUFFER_FILE_SIZE_BYTES) {
+        self->storage_num_bytes_written = 0;
+        close(self->storage_fd);
+        self->storage_fd = 0;
+    }
+
+    return file_written;
+}
+
+static void gsr_replay_buffer_disk_remove_first_file(gsr_replay_buffer_disk *self) {
+    gsr_replay_buffer_file_unref(self->files[0], self->replay_directory);
+    for(size_t i = 1; i < self->num_files; ++i) {
+        self->files[i - 1] = self->files[i];
+    }
+    --self->num_files;
+}
+
+static bool gsr_replay_buffer_disk_append(gsr_replay_buffer *replay_buffer, const AVPacket *av_packet, double timestamp) {
+    gsr_replay_buffer_disk *self = (gsr_replay_buffer_disk*)replay_buffer;
+    bool success = false;
+    gsr_replay_buffer_lock(&self->replay_buffer);
+
+    if(self->storage_fd <= 0) {
+        if(!gsr_replay_buffer_disk_create_next_file(self, timestamp))
+            goto done;
+    }
+
+    const bool data_written = gsr_replay_buffer_disk_append_to_current_file(self, av_packet, timestamp);
+
+    if(self->num_files > 1) {
+        const double buffer_time_accumulated = timestamp - self->files[1]->start_timestamp;
+        if(buffer_time_accumulated >= self->replay_buffer_time)
+            gsr_replay_buffer_disk_remove_first_file(self);
+    }
+
+    success = data_written;
+
+    done:
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+    return success;
+}
+
+static AVPacket* gsr_replay_buffer_disk_iterator_get_packet(gsr_replay_buffer *replay_buffer, gsr_replay_buffer_iterator iterator) {
+    gsr_replay_buffer_disk *self = (gsr_replay_buffer_disk*)replay_buffer;
+    assert(iterator.file_index < self->num_files);
+    assert(iterator.packet_index < self->files[iterator.file_index]->num_packets);
+    return &self->files[iterator.file_index]->packets[iterator.packet_index].packet;
+}
+
+static uint8_t* gsr_replay_buffer_disk_iterator_get_packet_data(gsr_replay_buffer *replay_buffer, gsr_replay_buffer_iterator iterator) {
+    gsr_replay_buffer_disk *self = (gsr_replay_buffer_disk*)replay_buffer;
+    assert(iterator.file_index < self->num_files);
+    gsr_replay_buffer_file *file = self->files[iterator.file_index];
+    assert(iterator.packet_index < file->num_packets);
+
+    if(file->fd <= 0) {
+        char filename[PATH_MAX];
+        snprintf(filename, sizeof(filename), "%s/%s_%d.gsr", self->replay_directory, FILE_PREFIX, (int)file->id);
+        file->fd = open(filename, O_RDONLY);
+        if(file->fd <= 0) {
+            fprintf(stderr, "gsr error: gsr_replay_buffer_disk_iterator_get_packet_data: failed to open file\n");
+            return NULL;
+        }
+    }
+
+    const gsr_av_packet_disk *packet = &self->files[iterator.file_index]->packets[iterator.packet_index];
+    if(lseek(file->fd, packet->data_index, SEEK_SET) == -1) {
+        fprintf(stderr, "gsr error: gsr_replay_buffer_disk_iterator_get_packet_data: failed to seek\n");
+        return NULL;
+    }
+
+    uint8_t *packet_data = malloc(packet->packet.size);
+    if(!packet_data || read(file->fd, packet_data, packet->packet.size) != packet->packet.size) {
+        fprintf(stderr, "gsr error: gsr_replay_buffer_disk_iterator_get_packet_data: failed to allocate memory or read data from file\n");
+        free(packet_data);
+        return NULL;
+    }
+
+    return packet_data;
+}
+
+static gsr_replay_buffer* gsr_replay_buffer_disk_clone(gsr_replay_buffer *replay_buffer) {
+    gsr_replay_buffer_disk *self = (gsr_replay_buffer_disk*)replay_buffer;
+    gsr_replay_buffer_disk *destination = calloc(1, sizeof(gsr_replay_buffer_disk));
+    if(!destination)
+        return NULL;
+
+    gsr_replay_buffer_disk_set_impl_funcs(destination);
+    gsr_replay_buffer_lock(&self->replay_buffer);
+
+    destination->replay_buffer.original_replay_buffer = replay_buffer;
+    destination->replay_buffer.mutex = self->replay_buffer.mutex;
+    destination->replay_buffer.mutex_initialized = self->replay_buffer.mutex_initialized;
+    destination->replay_buffer_time = self->replay_buffer_time;
+    destination->storage_counter = self->storage_counter;
+    destination->storage_num_bytes_written = self->storage_num_bytes_written;
+    destination->storage_fd = 0; // The clone is read-only. TODO: open a storage file here if a clone ever needs to be written to
+
+    for(size_t i = 0; i < self->num_files; ++i) {
+        destination->files[i] = gsr_replay_buffer_file_ref(self->files[i]);
+    }
+    destination->num_files = self->num_files;
+
+    snprintf(destination->replay_directory, sizeof(destination->replay_directory), "%s", self->replay_directory);
+    destination->owns_directory = false;
+
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+    return (gsr_replay_buffer*)destination;
+}
+
+/* Binary search */
+static size_t gsr_replay_buffer_file_find_packet_index_by_time_passed(const gsr_replay_buffer_file *self, int seconds) {
+    const double now = clock_get_monotonic_seconds();
+    if(self->num_packets == 0) {
+        return 0;
+    }
+
+    size_t lower_bound = 0;
+    size_t upper_bound = self->num_packets;
+    size_t index = 0;
+
+    for(;;) {
+        index = lower_bound + (upper_bound - lower_bound) / 2;
+        const gsr_av_packet_disk *packet = &self->packets[index];
+        const double time_passed_since_packet = now - packet->timestamp;
+        if(time_passed_since_packet >= seconds) {
+            if(lower_bound == index)
+                break;
+            lower_bound = index;
+        } else {
+            if(upper_bound == index)
+                break;
+            upper_bound = index;
+        }
+    }
+
+    return index;
+}
+
+/* Binary search */
+static gsr_replay_buffer_iterator gsr_replay_buffer_disk_find_file_index_by_time_passed(gsr_replay_buffer *replay_buffer, int seconds) {
+    gsr_replay_buffer_disk *self = (gsr_replay_buffer_disk*)replay_buffer;
+    gsr_replay_buffer_lock(&self->replay_buffer);
+
+    const double now = clock_get_monotonic_seconds();
+    if(self->num_files == 0) {
+        gsr_replay_buffer_unlock(&self->replay_buffer);
+        return (gsr_replay_buffer_iterator){0, 0};
+    }
+
+    size_t lower_bound = 0;
+    size_t upper_bound = self->num_files;
+    size_t file_index = 0;
+
+    for(;;) {
+        file_index = lower_bound + (upper_bound - lower_bound) / 2;
+        const gsr_replay_buffer_file *file = self->files[file_index];
+        const double time_passed_since_file_start = now - file->start_timestamp;
+        const double time_passed_since_file_end = now - file->end_timestamp;
+        if(time_passed_since_file_start >= seconds && time_passed_since_file_end <= seconds) {
+            break;
+        } else if(time_passed_since_file_start >= seconds) {
+            if(lower_bound == file_index)
+                break;
+            lower_bound = file_index;
+        } else {
+            if(upper_bound == file_index)
+                break;
+            upper_bound = file_index;
+        }
+    }
+
+    const gsr_replay_buffer_file *file = self->files[file_index];
+    const size_t packet_index = gsr_replay_buffer_file_find_packet_index_by_time_passed(file, seconds);
+
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+    return (gsr_replay_buffer_iterator){packet_index, file_index};
+}
+
+static gsr_replay_buffer_iterator gsr_replay_buffer_disk_find_keyframe(gsr_replay_buffer *replay_buffer, gsr_replay_buffer_iterator start_iterator, int stream_index, bool invert_stream_index) {
+    gsr_replay_buffer_disk *self = (gsr_replay_buffer_disk*)replay_buffer;
+    gsr_replay_buffer_iterator keyframe_iterator = {(size_t)-1, 0};
+    gsr_replay_buffer_lock(&self->replay_buffer);
+    size_t packet_index = start_iterator.packet_index;
+    for(size_t file_index = start_iterator.file_index; file_index < self->num_files; ++file_index) {
+        const gsr_replay_buffer_file *file = self->files[file_index];
+        for(; packet_index < file->num_packets; ++packet_index) {
+            const gsr_av_packet_disk *packet = &file->packets[packet_index];
+            if((packet->packet.flags & AV_PKT_FLAG_KEY) && (invert_stream_index ? packet->packet.stream_index != stream_index : packet->packet.stream_index == stream_index)) {
+                keyframe_iterator.packet_index = packet_index;
+                keyframe_iterator.file_index = file_index;
+                goto done;
+            }
+        }
+        packet_index = 0;
+    }
+    done:
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+    return keyframe_iterator;
+}
+
+static bool gsr_replay_buffer_disk_iterator_next(gsr_replay_buffer *replay_buffer, gsr_replay_buffer_iterator *iterator) {
+    gsr_replay_buffer_disk *self = (gsr_replay_buffer_disk*)replay_buffer;
+    if(iterator->file_index >= self->num_files)
+        return false;
+
+    if(iterator->packet_index + 1 >= self->files[iterator->file_index]->num_packets) {
+        if(iterator->file_index + 1 >= self->num_files)
+            return false;
+
+        if(self->files[iterator->file_index + 1]->num_packets == 0)
+            return false;
+
+        ++iterator->file_index;
+        iterator->packet_index = 0;
+        return true;
+    } else {
+        ++iterator->packet_index;
+        return true;
+    }
+}
+
+static void get_current_time(char *time_str, size_t time_str_size) {
+    time_t now = time(NULL);
+    struct tm *t = localtime(&now);
+    strftime(time_str, time_str_size - 1, "%Y-%m-%d_%H-%M-%S", t);
+}
+
+static void gsr_replay_buffer_disk_set_impl_funcs(gsr_replay_buffer_disk *self) {
+    self->replay_buffer.destroy = gsr_replay_buffer_disk_destroy;
+    self->replay_buffer.append = gsr_replay_buffer_disk_append;
+    self->replay_buffer.clear = gsr_replay_buffer_disk_clear;
+    self->replay_buffer.iterator_get_packet = gsr_replay_buffer_disk_iterator_get_packet;
+    self->replay_buffer.iterator_get_packet_data = gsr_replay_buffer_disk_iterator_get_packet_data;
+    self->replay_buffer.clone = gsr_replay_buffer_disk_clone;
+    self->replay_buffer.find_packet_index_by_time_passed = gsr_replay_buffer_disk_find_file_index_by_time_passed;
+    self->replay_buffer.find_keyframe = gsr_replay_buffer_disk_find_keyframe;
+    self->replay_buffer.iterator_next = gsr_replay_buffer_disk_iterator_next;
+}
+
+gsr_replay_buffer* gsr_replay_buffer_disk_create(const char *replay_directory, double replay_buffer_time) {
+    assert(replay_buffer_time > 0);
+    gsr_replay_buffer_disk *replay_buffer = calloc(1, sizeof(gsr_replay_buffer_disk));
+    if(!replay_buffer)
+        return NULL;
+
+    char time_str[128];
+    get_current_time(time_str, sizeof(time_str));
+
+    replay_buffer->num_files = 0;
+    replay_buffer->storage_counter = 0;
+    replay_buffer->replay_buffer_time = replay_buffer_time;
+    snprintf(replay_buffer->replay_directory, sizeof(replay_buffer->replay_directory), "%s/gsr-replay-%s.gsr", replay_directory, time_str);
+    replay_buffer->owns_directory = true;
+
+    gsr_replay_buffer_disk_set_impl_funcs(replay_buffer);
+    return (gsr_replay_buffer*)replay_buffer;
+}
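Review note: the two "Binary search" functions above share one loop shape. The sketch below isolates that loop over a plain array of timestamps so the termination and edge cases are easier to check. It is a minimal illustration, not code from the patch: `find_index_by_time_passed` and its parameters are hypothetical names, and it assumes timestamps are sorted oldest first, as packets are when appended in order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch.
   Returns the index of the newest entry that is at least `seconds` old,
   or 0 if the array is empty or no entry is that old.
   `timestamps` must be sorted in ascending order (oldest first). */
static size_t find_index_by_time_passed(const double *timestamps, size_t count, double now, double seconds) {
    if(count == 0)
        return 0;

    size_t lower_bound = 0;
    size_t upper_bound = count;
    size_t index = 0;

    for(;;) {
        index = lower_bound + (upper_bound - lower_bound) / 2;
        if(now - timestamps[index] >= seconds) {
            /* Entry is old enough: the answer is here or further right */
            if(lower_bound == index)
                break;
            lower_bound = index;
        } else {
            /* Entry is too recent: the answer is further left */
            if(upper_bound == index)
                break;
            upper_bound = index;
        }
    }

    return index;
}
```

Note that the "nothing is old enough" case and the "first entry is the match" case both return 0, so a caller that needs to tell them apart has to check the timestamp at the returned index itself.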
diff --git a/src/replay_buffer/replay_buffer_ram.c b/src/replay_buffer/replay_buffer_ram.c
new file mode 100644
index 0000000..890588f
--- /dev/null
+++ b/src/replay_buffer/replay_buffer_ram.c
@@ -0,0 +1,256 @@
+#include "../../include/replay_buffer/replay_buffer_ram.h"
+#include "../../include/utils.h"
+
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+
+#include <libavutil/mem.h>
+
+static void gsr_replay_buffer_ram_set_impl_funcs(gsr_replay_buffer_ram *self);
+
+static gsr_av_packet_ram* gsr_av_packet_ram_create(const AVPacket *av_packet, double timestamp) {
+    gsr_av_packet_ram *self = malloc(sizeof(gsr_av_packet_ram));
+    if(!self)
+        return NULL;
+
+    self->ref_counter = 1;
+    self->packet = *av_packet;
+    self->timestamp = timestamp;
+    // Why copy the data? There is an ffmpeg bug that causes cpu usage to increase over time when
+    // packets are not freed until later. So we copy the packet data, free the packet and then reconstruct
+    // the packet later on when we need it, to keep packets alive only for a short period.
+    self->packet.data = av_memdup(av_packet->data, av_packet->size);
+    if(!self->packet.data) {
+        free(self);
+        return NULL;
+    }
+
+    return self;
+}
+
+static gsr_av_packet_ram* gsr_av_packet_ram_ref(gsr_av_packet_ram *self) {
+    if(self->ref_counter >= 1)
+        ++self->ref_counter;
+    return self;
+}
+
+static void gsr_av_packet_ram_free(gsr_av_packet_ram *self) {
+    self->ref_counter = 0;
+    if(self->packet.data) {
+        av_free(self->packet.data);
+        self->packet.data = NULL;
+    }
+    free(self);
+}
+
+static void gsr_av_packet_ram_unref(gsr_av_packet_ram *self) {
+    if(self->ref_counter >= 1)
+        --self->ref_counter;
+
+    if(self->ref_counter <= 0)
+        gsr_av_packet_ram_free(self);
+}
+
+static void gsr_replay_buffer_ram_destroy(gsr_replay_buffer *replay_buffer) {
+    gsr_replay_buffer_ram *self = (gsr_replay_buffer_ram*)replay_buffer;
+    gsr_replay_buffer_lock(&self->replay_buffer);
+    for(size_t i = 0; i < self->num_packets; ++i) {
+        if(self->packets[i]) {
+            gsr_av_packet_ram_unref(self->packets[i]);
+            self->packets[i] = NULL;
+        }
+    }
+    self->num_packets = 0;
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+
+    if(self->packets) {
+        free(self->packets);
+        self->packets = NULL;
+    }
+
+    self->capacity_num_packets = 0;
+    self->index = 0;
+}
+
+static bool gsr_replay_buffer_ram_append(gsr_replay_buffer *replay_buffer, const AVPacket *av_packet, double timestamp) {
+    gsr_replay_buffer_ram *self = (gsr_replay_buffer_ram*)replay_buffer;
+    gsr_replay_buffer_lock(&self->replay_buffer);
+    gsr_av_packet_ram *packet = gsr_av_packet_ram_create(av_packet, timestamp);
+    if(!packet) {
+        gsr_replay_buffer_unlock(&self->replay_buffer);
+        return false;
+    }
+
+    if(self->packets[self->index]) {
+        gsr_av_packet_ram_unref(self->packets[self->index]);
+        self->packets[self->index] = NULL;
+    }
+    self->packets[self->index] = packet;
+
+    self->index = (self->index + 1) % self->capacity_num_packets;
+    ++self->num_packets;
+    if(self->num_packets > self->capacity_num_packets)
+        self->num_packets = self->capacity_num_packets;
+
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+    return true;
+}
+
+static void gsr_replay_buffer_ram_clear(gsr_replay_buffer *replay_buffer) {
+    gsr_replay_buffer_ram *self = (gsr_replay_buffer_ram*)replay_buffer;
+    gsr_replay_buffer_lock(&self->replay_buffer);
+    for(size_t i = 0; i < self->num_packets; ++i) {
+        if(self->packets[i]) {
+            gsr_av_packet_ram_unref(self->packets[i]);
+            self->packets[i] = NULL;
+        }
+    }
+    self->num_packets = 0;
+    self->index = 0;
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+}
+
+static gsr_av_packet_ram* gsr_replay_buffer_ram_get_packet_at_index(gsr_replay_buffer *replay_buffer, size_t index) {
+    gsr_replay_buffer_ram *self = (gsr_replay_buffer_ram*)replay_buffer;
+    assert(index < self->num_packets);
+    size_t start_index = 0;
+    if(self->num_packets < self->capacity_num_packets)
+        start_index = self->num_packets - self->index;
+    else
+        start_index = self->index;
+
+    const size_t offset = (start_index + index) % self->capacity_num_packets;
+    return self->packets[offset];
+}
+
+static AVPacket* gsr_replay_buffer_ram_iterator_get_packet(gsr_replay_buffer *replay_buffer, gsr_replay_buffer_iterator iterator) {
+    return &gsr_replay_buffer_ram_get_packet_at_index(replay_buffer, iterator.packet_index)->packet;
+}
+
+static uint8_t* gsr_replay_buffer_ram_iterator_get_packet_data(gsr_replay_buffer *replay_buffer, gsr_replay_buffer_iterator iterator) {
+    (void)replay_buffer;
+    (void)iterator;
+    return NULL;
+}
+
+static gsr_replay_buffer* gsr_replay_buffer_ram_clone(gsr_replay_buffer *replay_buffer) {
+    gsr_replay_buffer_ram *self = (gsr_replay_buffer_ram*)replay_buffer;
+    gsr_replay_buffer_ram *destination = calloc(1, sizeof(gsr_replay_buffer_ram));
+    if(!destination)
+        return NULL;
+
+    gsr_replay_buffer_ram_set_impl_funcs(destination);
+    gsr_replay_buffer_lock(&self->replay_buffer);
+
+    destination->replay_buffer.original_replay_buffer = replay_buffer;
+    destination->replay_buffer.mutex = self->replay_buffer.mutex;
+    destination->replay_buffer.mutex_initialized = self->replay_buffer.mutex_initialized;
+    destination->capacity_num_packets = self->capacity_num_packets;
+    destination->index = self->index;
+    destination->packets = calloc(destination->capacity_num_packets, sizeof(gsr_av_packet_ram*));
+    if(!destination->packets) {
+        free(destination);
+        gsr_replay_buffer_unlock(&self->replay_buffer);
+        return NULL;
+    }
+
+    destination->num_packets = self->num_packets;
+    for(size_t i = 0; i < destination->num_packets; ++i) {
+        destination->packets[i] = gsr_av_packet_ram_ref(self->packets[i]);
+    }
+
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+    return (gsr_replay_buffer*)destination;
+}
+
+/* Binary search */
+static gsr_replay_buffer_iterator gsr_replay_buffer_ram_find_packet_index_by_time_passed(gsr_replay_buffer *replay_buffer, int seconds) {
+    gsr_replay_buffer_ram *self = (gsr_replay_buffer_ram*)replay_buffer;
+    gsr_replay_buffer_lock(&self->replay_buffer);
+
+    const double now = clock_get_monotonic_seconds();
+    if(self->num_packets == 0) {
+        gsr_replay_buffer_unlock(&self->replay_buffer);
+        return (gsr_replay_buffer_iterator){0, 0};
+    }
+
+    size_t lower_bound = 0;
+    size_t upper_bound = self->num_packets;
+    size_t index = 0;
+
+    for(;;) {
+        index = lower_bound + (upper_bound - lower_bound) / 2;
+        const gsr_av_packet_ram *packet = gsr_replay_buffer_ram_get_packet_at_index(replay_buffer, index);
+        const double time_passed_since_packet = now - packet->timestamp;
+        if(time_passed_since_packet >= seconds) {
+            if(lower_bound == index)
+                break;
+            lower_bound = index;
+        } else {
+            if(upper_bound == index)
+                break;
+            upper_bound = index;
+        }
+    }
+
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+    return (gsr_replay_buffer_iterator){index, 0};
+}
+
+static gsr_replay_buffer_iterator gsr_replay_buffer_ram_find_keyframe(gsr_replay_buffer *replay_buffer, gsr_replay_buffer_iterator start_iterator, int stream_index, bool invert_stream_index) {
+    gsr_replay_buffer_ram *self = (gsr_replay_buffer_ram*)replay_buffer;
+    size_t keyframe_index = (size_t)-1;
+    gsr_replay_buffer_lock(&self->replay_buffer);
+    for(size_t i = start_iterator.packet_index; i < self->num_packets; ++i) {
+        const gsr_av_packet_ram *packet = gsr_replay_buffer_ram_get_packet_at_index(replay_buffer, i);
+        if((packet->packet.flags & AV_PKT_FLAG_KEY) && (invert_stream_index ? packet->packet.stream_index != stream_index : packet->packet.stream_index == stream_index)) {
+            keyframe_index = i;
+            break;
+        }
+    }
+    gsr_replay_buffer_unlock(&self->replay_buffer);
+    return (gsr_replay_buffer_iterator){keyframe_index, 0};
+}
+
+static bool gsr_replay_buffer_ram_iterator_next(gsr_replay_buffer *replay_buffer, gsr_replay_buffer_iterator *iterator) {
+    gsr_replay_buffer_ram *self = (gsr_replay_buffer_ram*)replay_buffer;
+    if(iterator->packet_index + 1 < self->num_packets) {
+        ++iterator->packet_index;
+        return true;
+    } else {
+        return false;
+    }
+}
+
+static void gsr_replay_buffer_ram_set_impl_funcs(gsr_replay_buffer_ram *self) {
+    self->replay_buffer.destroy = gsr_replay_buffer_ram_destroy;
+    self->replay_buffer.append = gsr_replay_buffer_ram_append;
+    self->replay_buffer.clear = gsr_replay_buffer_ram_clear;
+    self->replay_buffer.iterator_get_packet = gsr_replay_buffer_ram_iterator_get_packet;
+    self->replay_buffer.iterator_get_packet_data = gsr_replay_buffer_ram_iterator_get_packet_data;
+    self->replay_buffer.clone = gsr_replay_buffer_ram_clone;
+    self->replay_buffer.find_packet_index_by_time_passed = gsr_replay_buffer_ram_find_packet_index_by_time_passed;
+    self->replay_buffer.find_keyframe = gsr_replay_buffer_ram_find_keyframe;
+    self->replay_buffer.iterator_next = gsr_replay_buffer_ram_iterator_next;
+}
+
+gsr_replay_buffer* gsr_replay_buffer_ram_create(size_t replay_buffer_num_packets) {
+    assert(replay_buffer_num_packets > 0);
+    gsr_replay_buffer_ram *replay_buffer = calloc(1, sizeof(gsr_replay_buffer_ram));
+    if(!replay_buffer)
+        return NULL;
+
+    replay_buffer->capacity_num_packets = replay_buffer_num_packets;
+    replay_buffer->num_packets = 0;
+    replay_buffer->index = 0;
+    replay_buffer->packets = calloc(replay_buffer->capacity_num_packets, sizeof(gsr_av_packet_ram*));
+    if(!replay_buffer->packets) {
+        gsr_replay_buffer_ram_destroy(&replay_buffer->replay_buffer);
+        free(replay_buffer);
+        return NULL;
+    }
+
+    gsr_replay_buffer_ram_set_impl_funcs(replay_buffer);
+    return (gsr_replay_buffer*)replay_buffer;
+}
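Review note: the RAM replay buffer above is a fixed-capacity ring. Callers use logical indices (0 is the oldest packet), and gsr_replay_buffer_ram_get_packet_at_index maps them to physical slots. The sketch below shows that mapping in isolation. It is a simplified illustration with hypothetical names (`ring_state`, `ring_append`, `ring_slot`), and it assumes the start slot is 0 until the ring has wrapped, which is what `num_packets - index` evaluates to in the patch while the buffer is still filling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a ring buffer; payload storage is omitted. */
typedef struct {
    size_t capacity;  /* total number of slots, must be > 0 */
    size_t count;     /* number of slots in use, at most capacity */
    size_t write_pos; /* slot that the next append writes to */
} ring_state;

/* Records one append: advances the write position and saturates the count. */
static void ring_append(ring_state *r) {
    r->write_pos = (r->write_pos + 1) % r->capacity;
    if(r->count < r->capacity)
        ++r->count;
}

/* Maps a logical index (0 = oldest entry) to a physical slot.
   Before the ring wraps the oldest entry is in slot 0;
   after it wraps the oldest entry is the one about to be overwritten. */
static size_t ring_slot(const ring_state *r, size_t logical_index) {
    const size_t start = r->count < r->capacity ? 0 : r->write_pos;
    return (start + logical_index) % r->capacity;
}
```

Because logical indices stay sorted by age, the binary search by time passed can run over them without knowing where the ring currently wraps.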
diff --git a/src/shader.c b/src/shader.c
index f8d7eb2..ba4db80 100644
--- a/src/shader.c
+++ b/src/shader.c
@@ -1,15 +1,18 @@
 #include "../include/shader.h"
+#include "../include/egl.h"
 #include <stdio.h>
 #include <assert.h>
 
+static bool print_compile_errors = false;
+
 static int min_int(int a, int b) {
     return a < b ? a : b;
 }
 
-static unsigned int loader_shader(gsr_egl *egl, unsigned int type, const char *source) {
+static unsigned int load_shader(gsr_egl *egl, unsigned int type, const char *source) {
     unsigned int shader_id = egl->glCreateShader(type);
     if(shader_id == 0) {
-        fprintf(stderr, "gsr error: loader_shader: failed to create shader, error: %d\n", egl->glGetError());
+        fprintf(stderr, "gsr error: load_shader: failed to create shader, error: %d\n", egl->glGetError());
         return 0;
     }
 
@@ -22,10 +25,10 @@ static unsigned int loader_shader(gsr_egl *egl, unsigned int type, const char *s
         int info_length = 0;
         egl->glGetShaderiv(shader_id, GL_INFO_LOG_LENGTH, &info_length);
         
-        if(info_length > 1) {
+        if(info_length > 1 && print_compile_errors) {
             char info_log[4096];
             egl->glGetShaderInfoLog(shader_id, min_int(4096, info_length), NULL, info_log);
-            fprintf(stderr, "gsr error: loader shader: failed to compile shader, error:\n%s\n", info_log);
+            fprintf(stderr, "gsr error: load_shader: failed to compile shader, error:\n%s\nshader source:\n%s\n", info_log, source);
         }
 
         egl->glDeleteShader(shader_id);
@@ -35,28 +38,36 @@ static unsigned int loader_shader(gsr_egl *egl, unsigned int type, const char *s
     return shader_id;
 }
 
-static unsigned int load_program(gsr_egl *egl, const char *vertex_shader, const char *fragment_shader) {
+static unsigned int load_program(gsr_egl *egl, const char *vertex_shader, const char *fragment_shader, const char *compute_shader) {
     unsigned int vertex_shader_id = 0;
     unsigned int fragment_shader_id = 0;
+    unsigned int compute_shader_id = 0;
     unsigned int program_id = 0;
     int linked = 0;
+    bool success = false;
 
     if(vertex_shader) {
-        vertex_shader_id = loader_shader(egl, GL_VERTEX_SHADER, vertex_shader);
+        vertex_shader_id = load_shader(egl, GL_VERTEX_SHADER, vertex_shader);
         if(vertex_shader_id == 0)
-            goto err;
+            goto done;
     }
 
     if(fragment_shader) {
-        fragment_shader_id = loader_shader(egl, GL_FRAGMENT_SHADER, fragment_shader);
+        fragment_shader_id = load_shader(egl, GL_FRAGMENT_SHADER, fragment_shader);
         if(fragment_shader_id == 0)
-            goto err;
+            goto done;
+    }
+
+    if(compute_shader) {
+        compute_shader_id = load_shader(egl, GL_COMPUTE_SHADER, compute_shader);
+        if(compute_shader_id == 0)
+            goto done;
     }
 
     program_id = egl->glCreateProgram();
     if(program_id == 0) {
         fprintf(stderr, "gsr error: load_program: failed to create shader program, error: %d\n", egl->glGetError());
-        goto err;
+        goto done;
     }
 
     if(vertex_shader_id)
@@ -65,6 +76,9 @@ static unsigned int load_program(gsr_egl *egl, const char *vertex_shader, const
     if(fragment_shader_id)
         egl->glAttachShader(program_id, fragment_shader_id);
 
+    if(compute_shader_id)
+        egl->glAttachShader(program_id, compute_shader_id);
+
     egl->glLinkProgram(program_id);
 
     egl->glGetProgramiv(program_id, GL_LINK_STATUS, &linked);
@@ -78,37 +92,36 @@ static unsigned int load_program(gsr_egl *egl, const char *vertex_shader, const
             fprintf(stderr, "gsr error: load program: linking shader program failed, error:\n%s\n", info_log);            
         }
 
-        goto err;
+        goto done;
     }
 
-    if(fragment_shader_id)
-        egl->glDeleteShader(fragment_shader_id);
-    if(vertex_shader_id)
-        egl->glDeleteShader(vertex_shader_id);
-
-    return program_id;
+    success = true;
+    done:
 
-    err:
-    if(program_id)
-        egl->glDeleteProgram(program_id);
+    if(!success) {
+        if(program_id)
+            egl->glDeleteProgram(program_id);
+    }
+    if(compute_shader_id)
+        egl->glDeleteShader(compute_shader_id);
     if(fragment_shader_id)
         egl->glDeleteShader(fragment_shader_id);
     if(vertex_shader_id)
         egl->glDeleteShader(vertex_shader_id);
-    return 0;
+    return success ? program_id : 0;
 }
 
-int gsr_shader_init(gsr_shader *self, gsr_egl *egl, const char *vertex_shader, const char *fragment_shader) {
+int gsr_shader_init(gsr_shader *self, gsr_egl *egl, const char *vertex_shader, const char *fragment_shader, const char *compute_shader) {
     assert(egl);
     self->egl = egl;
     self->program_id = 0;
 
-    if(!vertex_shader && !fragment_shader) {
-        fprintf(stderr, "gsr error: gsr_shader_init: vertex shader and fragment shader can't be NULL at the same time\n");
+    if(!vertex_shader && !fragment_shader && !compute_shader) {
+        fprintf(stderr, "gsr error: gsr_shader_init: vertex, fragment and compute shaders can't all be NULL at the same time\n");
         return -1;
     }
 
-    self->program_id = load_program(self->egl, vertex_shader, fragment_shader);
+    self->program_id = load_program(self->egl, vertex_shader, fragment_shader, compute_shader);
     if(self->program_id == 0)
         return -1;
 
@@ -140,3 +153,7 @@ void gsr_shader_use(gsr_shader *self) {
 void gsr_shader_use_none(gsr_shader *self) {
     self->egl->glUseProgram(0);
 }
+
+void gsr_shader_enable_debug_output(bool enable) {
+    print_compile_errors = enable;
+}
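Review note: load_program now funnels every path through a single `done:` label with a `success` flag. Intermediate objects (the shaders) are released on every path, while the final object (the program) is released only on failure, in which case the function must report failure to its caller. The sketch below shows the same pattern with heap buffers standing in for GL objects. It is a hypothetical example (`concat_copy` is not part of the patch), meant only to make the ownership rules explicit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example, not from the patch: returns a newly allocated
   concatenation of a and b, or NULL on failure. The caller frees the result. */
static char* concat_copy(const char *a, const char *b) {
    char *tmp = NULL;    /* intermediate resource, like the shader objects */
    char *result = NULL; /* final resource, like the linked program */
    bool success = false;

    const size_t a_len = strlen(a);
    const size_t b_len = strlen(b);

    tmp = malloc(a_len + 1);
    if(!tmp)
        goto done;
    memcpy(tmp, a, a_len + 1);

    result = malloc(a_len + b_len + 1);
    if(!result)
        goto done;
    memcpy(result, tmp, a_len);
    memcpy(result + a_len, b, b_len + 1);

    success = true;
    done:
    if(!success) {
        /* Release the final resource and make sure the caller sees a failure value */
        free(result);
        result = NULL;
    }
    free(tmp); /* Intermediate resource is released on every path */
    return result;
}
```

The important detail is that the value returned after a failed run is reset to the failure value, so the caller never receives a handle to something that was already released.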
diff --git a/src/sound.cpp b/src/sound.cpp
index c3aa4d4..d954609 100644
--- a/src/sound.cpp
+++ b/src/sound.cpp
@@ -8,12 +8,16 @@ extern "C" {
 #include <string.h>
 #include <cmath>
 #include <time.h>
+#include <mutex>
 
 #include <pulse/pulseaudio.h>
 #include <pulse/mainloop.h>
 #include <pulse/xmalloc.h>
 #include <pulse/error.h>
 
+#define RECONNECT_TRY_TIMEOUT_SECONDS 0.5
+#define DEVICE_NAME_MAX_SIZE 128
+
 #define CHECK_DEAD_GOTO(p, rerror, label)                               \
     do {                                                                \
         if (!(p)->context || !PA_CONTEXT_IS_GOOD(pa_context_get_state((p)->context)) || \
@@ -29,6 +33,12 @@ extern "C" {
         }                                                               \
     } while(false);
 
+enum class DeviceType {
+    STANDARD,
+    DEFAULT_OUTPUT,
+    DEFAULT_INPUT
+};
+
 struct pa_handle {
     pa_context *context;
     pa_stream *stream;
@@ -41,49 +51,153 @@ struct pa_handle {
     size_t output_index, output_length;
 
     int operation_success;
+    double latency_seconds;
+
+    pa_buffer_attr attr;
+    pa_sample_spec ss;
+
+    std::mutex reconnect_mutex;
+    DeviceType device_type;
+    char stream_name[256];
+    bool reconnect;
+    double reconnect_last_tried_seconds;
+
+    char device_name[DEVICE_NAME_MAX_SIZE];
+    char default_output_device_name[DEVICE_NAME_MAX_SIZE];
+    char default_input_device_name[DEVICE_NAME_MAX_SIZE];
 };
 
-static void pa_sound_device_free(pa_handle *s) {
-    assert(s);
+static void pa_sound_device_free(pa_handle *p) {
+    assert(p);
+
+    if (p->stream) {
+        pa_stream_unref(p->stream);
+        p->stream = NULL;
+    }
+
+    if (p->context) {
+        pa_context_disconnect(p->context);
+        pa_context_unref(p->context);
+        p->context = NULL;
+    }
+
+    if (p->mainloop) {
+        pa_mainloop_free(p->mainloop);
+        p->mainloop = NULL;
+    }
+
+    if (p->output_data) {
+        free(p->output_data);
+        p->output_data = NULL;
+    }
+
+    pa_xfree(p);
+}
+
+static void subscribe_update_default_devices(pa_context*, const pa_server_info *server_info, void *userdata) {
+    pa_handle *handle = (pa_handle*)userdata;
+    std::lock_guard<std::mutex> lock(handle->reconnect_mutex);
+
+    if(server_info->default_sink_name) {
+        // TODO: Size check
+        snprintf(handle->default_output_device_name, sizeof(handle->default_output_device_name), "%s.monitor", server_info->default_sink_name);
+        if(handle->device_type == DeviceType::DEFAULT_OUTPUT && strcmp(handle->device_name, handle->default_output_device_name) != 0) {
+            handle->reconnect = true;
+            handle->reconnect_last_tried_seconds = clock_get_monotonic_seconds();
+            // TODO: Size check
+            snprintf(handle->device_name, sizeof(handle->device_name), "%s", handle->default_output_device_name);
+        }
+    }
+
+    if(server_info->default_source_name) {
+        // TODO: Size check
+        snprintf(handle->default_input_device_name, sizeof(handle->default_input_device_name), "%s", server_info->default_source_name);
+        if(handle->device_type == DeviceType::DEFAULT_INPUT && strcmp(handle->device_name, handle->default_input_device_name) != 0) {
+            handle->reconnect = true;
+            handle->reconnect_last_tried_seconds = clock_get_monotonic_seconds();
+            // TODO: Size check
+            snprintf(handle->device_name, sizeof(handle->device_name), "%s", handle->default_input_device_name);
+        }
+    }
+}
+
+static void subscribe_cb(pa_context *c, pa_subscription_event_type_t t, uint32_t idx, void *userdata) {
+    (void)idx;
+    pa_handle *handle = (pa_handle*)userdata;
+    if((t & PA_SUBSCRIPTION_EVENT_FACILITY_MASK) == PA_SUBSCRIPTION_EVENT_SERVER) {
+        pa_operation *pa = pa_context_get_server_info(c, subscribe_update_default_devices, handle);
+        if(pa)
+            pa_operation_unref(pa);
+    }
+}
 
-    if (s->stream)
-        pa_stream_unref(s->stream);
+static void store_default_devices(pa_context*, const pa_server_info *server_info, void *userdata) {
+    pa_handle *handle = (pa_handle*)userdata;
+    if(server_info->default_sink_name)
+        snprintf(handle->default_output_device_name, sizeof(handle->default_output_device_name), "%s.monitor", server_info->default_sink_name);
+    if(server_info->default_source_name)
+        snprintf(handle->default_input_device_name, sizeof(handle->default_input_device_name), "%s", server_info->default_source_name);
+}
 
-    if (s->context) {
-        pa_context_disconnect(s->context);
-        pa_context_unref(s->context);
+static bool startup_get_default_devices(pa_handle *p, const char *device_name) {
+    pa_operation *pa = pa_context_get_server_info(p->context, store_default_devices, p);
+    while(pa) {
+        pa_operation_state state = pa_operation_get_state(pa);
+        if(state == PA_OPERATION_DONE) {
+            pa_operation_unref(pa);
+            break;
+        } else if(state == PA_OPERATION_CANCELLED) {
+            pa_operation_unref(pa);
+            return false;
+        }
+        pa_mainloop_iterate(p->mainloop, 1, NULL);
     }
 
-    if (s->mainloop)
-        pa_mainloop_free(s->mainloop);
+    if(p->default_output_device_name[0] == '\0') {
+        fprintf(stderr, "gsr error: failed to find default audio output device\n");
+        return false;
+    }
 
-    if (s->output_data) {
-        free(s->output_data);
-        s->output_data = NULL;
+    if(strcmp(device_name, "default_output") == 0) {
+        snprintf(p->device_name, sizeof(p->device_name), "%s", p->default_output_device_name);
+        p->device_type = DeviceType::DEFAULT_OUTPUT;
+    } else if(strcmp(device_name, "default_input") == 0) {
+        snprintf(p->device_name, sizeof(p->device_name), "%s", p->default_input_device_name);
+        p->device_type = DeviceType::DEFAULT_INPUT;
+    } else {
+        snprintf(p->device_name, sizeof(p->device_name), "%s", device_name);
+        p->device_type = DeviceType::STANDARD;
     }
 
-    pa_xfree(s);
+    return true;
 }
 
 static pa_handle* pa_sound_device_new(const char *server,
         const char *name,
-        const char *dev,
+        const char *device_name,
         const char *stream_name,
         const pa_sample_spec *ss,
         const pa_buffer_attr *attr,
         int *rerror) {
     pa_handle *p;
-    int error = PA_ERR_INTERNAL, r;
+    int error = PA_ERR_INTERNAL;
+    pa_operation *pa = NULL;
 
     p = pa_xnew0(pa_handle, 1);
-    p->read_data = NULL;
-    p->read_length = 0;
-    p->read_index = 0;
+    p->attr = *attr;
+    p->ss = *ss;
+    snprintf(p->stream_name, sizeof(p->stream_name), "%s", stream_name);
+
+    p->reconnect = true;
+    p->reconnect_last_tried_seconds = clock_get_monotonic_seconds() - 1000.0;
+    p->default_output_device_name[0] = '\0';
+    p->default_input_device_name[0] = '\0';
+    p->device_type = DeviceType::STANDARD;
 
-    const int buffer_size = attr->maxlength;
+    const int buffer_size = attr->fragsize;
     void *buffer = malloc(buffer_size);
     if(!buffer) {
-        fprintf(stderr, "failed to allocate buffer for audio\n");
+        fprintf(stderr, "gsr error: failed to allocate buffer for audio\n");
         *rerror = -1;
         return NULL;
     }
@@ -117,56 +231,96 @@ static pa_handle* pa_sound_device_new(const char *server,
         pa_mainloop_iterate(p->mainloop, 1, NULL);
     }
 
-    if (!(p->stream = pa_stream_new(p->context, stream_name, ss, NULL))) {
-        error = pa_context_errno(p->context);
+    if(!startup_get_default_devices(p, device_name))
         goto fail;
+
+    pa_context_set_subscribe_callback(p->context, subscribe_cb, p);
+    pa = pa_context_subscribe(p->context, PA_SUBSCRIPTION_MASK_SERVER, NULL, NULL);
+    if(pa)
+        pa_operation_unref(pa);
+
+    return p;
+
+fail:
+    if (rerror)
+        *rerror = error;
+    pa_sound_device_free(p);
+    return NULL;
+}
+
+static bool pa_sound_device_should_reconnect(pa_handle *p, double now, char *device_name, size_t device_name_size) {
+    std::lock_guard<std::mutex> lock(p->reconnect_mutex);
+    if(p->reconnect && now - p->reconnect_last_tried_seconds >= RECONNECT_TRY_TIMEOUT_SECONDS) {
+        p->reconnect_last_tried_seconds = now;
+        // TODO: Size check
+        snprintf(device_name, device_name_size, "%s", p->device_name);
+        return true;
+    }
+    return false;
+}
+
+static bool pa_sound_device_handle_reconnect(pa_handle *p, char *device_name, size_t device_name_size, double now) {
+    int r;
+    if(!pa_sound_device_should_reconnect(p, now, device_name, device_name_size))
+        return true;
+
+    if(p->stream) {
+        pa_stream_disconnect(p->stream);
+        pa_stream_unref(p->stream);
+        p->stream = NULL;
     }
 
-    r = pa_stream_connect_record(p->stream, dev, attr,
+    if(!(p->stream = pa_stream_new(p->context, p->stream_name, &p->ss, NULL))) {
+        //pa_context_errno(p->context);
+        return false;
+    }
+
+    r = pa_stream_connect_record(p->stream, device_name, &p->attr,
         (pa_stream_flags_t)(PA_STREAM_INTERPOLATE_TIMING|PA_STREAM_ADJUST_LATENCY|PA_STREAM_AUTO_TIMING_UPDATE));
 
-    if (r < 0) {
-        error = pa_context_errno(p->context);
-        goto fail;
+    if(r < 0) {
+        //pa_context_errno(p->context);
+        return false;
     }
 
-    for (;;) {
+    for(;;) {
         pa_stream_state_t state = pa_stream_get_state(p->stream);
 
-        if (state == PA_STREAM_READY)
+        if(state == PA_STREAM_READY)
             break;
 
-        if (!PA_STREAM_IS_GOOD(state)) {
-            error = pa_context_errno(p->context);
-            goto fail;
+        if(!PA_STREAM_IS_GOOD(state)) {
+            //pa_context_errno(p->context);
+            return false;
         }
 
         pa_mainloop_iterate(p->mainloop, 1, NULL);
     }
 
-    return p;
-
-fail:
-    if (rerror)
-        *rerror = error;
-    pa_sound_device_free(p);
-    return NULL;
+    std::lock_guard<std::mutex> lock(p->reconnect_mutex);
+    p->reconnect = false;
+    return true;
 }
 
-// Returns a negative value on failure or if |p->output_length| data is not available within the time frame specified by the sample rate
-static int pa_sound_device_read(pa_handle *p) {
+static int pa_sound_device_read(pa_handle *p, double timeout_seconds) {
     assert(p);
 
-    const int64_t timeout_ms = std::round((1000.0 / (double)pa_stream_get_sample_spec(p->stream)->rate) * 1000.0);
     const double start_time = clock_get_monotonic_seconds();
+    char device_name[DEVICE_NAME_MAX_SIZE];
 
     bool success = false;
     int r = 0;
     int *rerror = &r;
+    pa_usec_t latency = 0;
+    int negative = 0;
+
+    if(!pa_sound_device_handle_reconnect(p, device_name, sizeof(device_name), start_time))
+        goto fail;
+
     CHECK_DEAD_GOTO(p, rerror, fail);
 
     while (p->output_index < p->output_length) {
-        if((clock_get_monotonic_seconds() - start_time) * 1000 >= timeout_ms)
+        if(clock_get_monotonic_seconds() - start_time >= timeout_seconds)
             return -1;
 
         if(!p->read_data) {
@@ -195,6 +349,15 @@ static int pa_sound_device_read(pa_handle *p) {
                 CHECK_DEAD_GOTO(p, rerror, fail);
                 continue;
             }
+
+            pa_operation_unref(pa_stream_update_timing_info(p->stream, NULL, NULL));
+            // TODO: Deal with one pa_stream_peek not being enough. In that case we need to add multiple of these together(?)
+            if(pa_stream_get_latency(p->stream, &latency, &negative) >= 0) {
+                p->latency_seconds = negative ? -(double)latency : latency;
+                if(p->latency_seconds < 0.0)
+                    p->latency_seconds = 0.0;
+                p->latency_seconds *= 0.000001; // pa_usec_t is in microseconds
+            }
         }
 
         const size_t space_free_in_output_buffer = p->output_length - p->output_index;
@@ -254,16 +417,16 @@ int sound_device_get_by_name(SoundDevice *device, const char *device_name, const
     ss.channels = num_channels;
 
     pa_buffer_attr buffer_attr;
+    buffer_attr.fragsize = period_frame_size * audio_format_to_get_bytes_per_sample(audio_format) * num_channels; // 2/4 bytes/sample, @num_channels channels
     buffer_attr.tlength = -1;
     buffer_attr.prebuf = -1;
     buffer_attr.minreq = -1;
-    buffer_attr.maxlength = period_frame_size * audio_format_to_get_bytes_per_sample(audio_format) * num_channels; // 2/4 bytes/sample, @num_channels channels
-    buffer_attr.fragsize = buffer_attr.maxlength;
+    buffer_attr.maxlength = buffer_attr.fragsize;
 
     int error = 0;
     pa_handle *handle = pa_sound_device_new(nullptr, description, device_name, description, &ss, &buffer_attr, &error);
     if(!handle) {
-        fprintf(stderr, "pa_sound_device_new() failed: %s. Audio input device %s might not be valid\n", pa_strerror(error), description);
+        fprintf(stderr, "gsr error: pa_sound_device_new() failed: %s. Audio input device %s might not be valid\n", pa_strerror(error), device_name);
         return -1;
     }
 
@@ -278,13 +441,15 @@ void sound_device_close(SoundDevice *device) {
     device->handle = NULL;
 }
 
-int sound_device_read_next_chunk(SoundDevice *device, void **buffer) {
+int sound_device_read_next_chunk(SoundDevice *device, void **buffer, double timeout_sec, double *latency_seconds) {
     pa_handle *pa = (pa_handle*)device->handle;
-    if(pa_sound_device_read(pa) < 0) {
+    if(pa_sound_device_read(pa, timeout_sec) < 0) {
         //fprintf(stderr, "pa_simple_read() failed: %s\n", pa_strerror(error));
+        *latency_seconds = 0.0;
         return -1;
     }
     *buffer = pa->output_data;
+    *latency_seconds = pa->latency_seconds;
     return device->frames;
 }
 
@@ -308,26 +473,138 @@ static void pa_state_cb(pa_context *c, void *userdata) {
     }
 }
 
-static void pa_sourcelist_cb(pa_context *ctx, const pa_source_info *source_info, int eol, void *userdata) {
-    (void)ctx;
+static void pa_sourcelist_cb(pa_context*, const pa_source_info *source_info, int eol, void *userdata) {
     if(eol > 0)
         return;
 
-    std::vector<AudioInput> *inputs = (std::vector<AudioInput>*)userdata;
-    inputs->push_back({ source_info->name, source_info->description });
+    AudioDevices *audio_devices = (AudioDevices*)userdata;
+    audio_devices->audio_inputs.push_back({ source_info->name, source_info->description });
+}
+
+static void pa_server_info_cb(pa_context*, const pa_server_info *server_info, void *userdata) {
+    AudioDevices *audio_devices = (AudioDevices*)userdata;
+    if(server_info->default_sink_name)
+        audio_devices->default_output = std::string(server_info->default_sink_name) + ".monitor";
+    if(server_info->default_source_name)
+        audio_devices->default_input = server_info->default_source_name;
 }
 
-std::vector<AudioInput> get_pulseaudio_inputs() {
-    std::vector<AudioInput> inputs;
+static void server_info_callback(pa_context*, const pa_server_info *server_info, void *userdata) {
+    bool *is_server_pipewire = (bool*)userdata;
+    if(server_info->server_name && strstr(server_info->server_name, "PipeWire"))
+        *is_server_pipewire = true;
+}
+
+static void get_pulseaudio_default_inputs(AudioDevices &audio_devices) {
+    int state = 0;
+    int pa_ready = 0;
+    pa_operation *pa_op = NULL;
+
     pa_mainloop *main_loop = pa_mainloop_new();
+    if(!main_loop)
+        return;
 
     pa_context *ctx = pa_context_new(pa_mainloop_get_api(main_loop), "gpu-screen-recorder");
-    pa_context_connect(ctx, NULL, PA_CONTEXT_NOFLAGS, NULL);
+    if(pa_context_connect(ctx, NULL, PA_CONTEXT_NOFLAGS, NULL) < 0)
+        goto done;
+
+    pa_context_set_state_callback(ctx, pa_state_cb, &pa_ready);
+
+    for(;;) {
+        // Not ready
+        if(pa_ready == 0) {
+            pa_mainloop_iterate(main_loop, 1, NULL);
+            continue;
+        }
+
+        switch(state) {
+            case 0: {
+                pa_op = pa_context_get_server_info(ctx, pa_server_info_cb, &audio_devices);
+                ++state;
+                break;
+            }
+        }
+
+        // Couldn't get connection to the server
+        if(pa_ready == 2 || (state == 1 && pa_op && pa_operation_get_state(pa_op) == PA_OPERATION_DONE))
+            break;
+
+        pa_mainloop_iterate(main_loop, 1, NULL);
+    }
+
+    done:
+    if(pa_op)
+        pa_operation_unref(pa_op);
+    pa_context_disconnect(ctx);
+    pa_context_unref(ctx);
+    pa_mainloop_free(main_loop);
+}
+
+AudioDevices get_pulseaudio_inputs() {
+    AudioDevices audio_devices;
     int state = 0;
     int pa_ready = 0;
+    pa_operation *pa_op = NULL;
+
+    // TODO: Do this in the same connection below instead of two separate connections
+    get_pulseaudio_default_inputs(audio_devices);
+
+    pa_mainloop *main_loop = pa_mainloop_new();
+    if(!main_loop)
+        return audio_devices;
+
+    pa_context *ctx = pa_context_new(pa_mainloop_get_api(main_loop), "gpu-screen-recorder");
+    if(pa_context_connect(ctx, NULL, PA_CONTEXT_NOFLAGS, NULL) < 0)
+        goto done;
+
     pa_context_set_state_callback(ctx, pa_state_cb, &pa_ready);
 
+    for(;;) {
+        // Not ready
+        if(pa_ready == 0) {
+            pa_mainloop_iterate(main_loop, 1, NULL);
+            continue;
+        }
+
+        switch(state) {
+            case 0: {
+                pa_op = pa_context_get_source_info_list(ctx, pa_sourcelist_cb, &audio_devices);
+                ++state;
+                break;
+            }
+        }
+
+        // Couldn't get connection to the server
+        if(pa_ready == 2 || (state == 1 && pa_op && pa_operation_get_state(pa_op) == PA_OPERATION_DONE))
+            break;
+
+        pa_mainloop_iterate(main_loop, 1, NULL);
+    }
+
+    done:
+    if(pa_op)
+        pa_operation_unref(pa_op);
+    pa_context_disconnect(ctx);
+    pa_context_unref(ctx);
+    pa_mainloop_free(main_loop);
+    return audio_devices;
+}
+
+bool pulseaudio_server_is_pipewire() {
+    int state = 0;
+    int pa_ready = 0;
     pa_operation *pa_op = NULL;
+    bool is_server_pipewire = false;
+
+    pa_mainloop *main_loop = pa_mainloop_new();
+    if(!main_loop)
+        return is_server_pipewire;
+
+    pa_context *ctx = pa_context_new(pa_mainloop_get_api(main_loop), "gpu-screen-recorder");
+    if(pa_context_connect(ctx, NULL, PA_CONTEXT_NOFLAGS, NULL) < 0)
+        goto done;
+
+    pa_context_set_state_callback(ctx, pa_state_cb, &pa_ready);
 
     for(;;) {
         // Not ready
@@ -338,24 +615,24 @@ std::vector<AudioInput> get_pulseaudio_inputs() {
 
         switch(state) {
             case 0: {
-                pa_op = pa_context_get_source_info_list(ctx, pa_sourcelist_cb, &inputs);
+                pa_op = pa_context_get_server_info(ctx, server_info_callback, &is_server_pipewire);
                 ++state;
                 break;
             }
         }
 
         // Couldn't get connection to the server
-        if(pa_ready == 2 || (state == 1 && pa_op && pa_operation_get_state(pa_op) == PA_OPERATION_DONE)) {
-            if(pa_op)
-                pa_operation_unref(pa_op);
-            pa_context_disconnect(ctx);
-            pa_context_unref(ctx);
+        if(pa_ready == 2 || (state == 1 && pa_op && pa_operation_get_state(pa_op) == PA_OPERATION_DONE))
             break;
-        }
 
         pa_mainloop_iterate(main_loop, 1, NULL);
     }
 
+    done:
+    if(pa_op)
+        pa_operation_unref(pa_op);
+    pa_context_disconnect(ctx);
+    pa_context_unref(ctx);
     pa_mainloop_free(main_loop);
-    return inputs;
+    return is_server_pipewire;
 }
diff --git a/src/utils.c b/src/utils.c
index 94b0037..c1d399a 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -1,12 +1,25 @@
 #include "../include/utils.h"
+#include "../include/window/window.h"
+
 #include <time.h>
 #include <string.h>
 #include <stdio.h>
 #include <unistd.h>
 #include <fcntl.h>
+#include <stdlib.h>
+#include <sys/stat.h>
+#include <sys/random.h>
+#include <errno.h>
+#include <assert.h>
+
 #include <xf86drmMode.h>
 #include <xf86drm.h>
-#include <stdlib.h>
+#include <X11/Xatom.h>
+#include <X11/extensions/Xrandr.h>
+#include <libavcodec/avcodec.h>
+#include <libavutil/hwcontext_vaapi.h>
+
+#define DRM_NUM_BUF_ATTRS 4
 
 double clock_get_monotonic_seconds(void) {
     struct timespec ts;
@@ -16,6 +29,25 @@ double clock_get_monotonic_seconds(void) {
     return (double)ts.tv_sec + (double)ts.tv_nsec * 0.000000001;
 }
 
+bool generate_random_characters(char *buffer, int buffer_size, const char *alphabet, size_t alphabet_size) {
+    /* TODO: Use other functions on platforms other than Linux */
+    if(getrandom(buffer, buffer_size, 0) < buffer_size) {
+        fprintf(stderr, "gsr error: failed to get random bytes, error: %s\n", strerror(errno));
+        return false;
+    }
+
+    for(int i = 0; i < buffer_size; ++i) {
+        unsigned char c = *(unsigned char*)&buffer[i];
+        buffer[i] = alphabet[c % alphabet_size];
+    }
+
+    return true;
+}
+
+bool generate_random_characters_standard_alphabet(char *buffer, int buffer_size) {
+    return generate_random_characters(buffer, buffer_size, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789", 62);
+}
+
 static const XRRModeInfo* get_mode_info(const XRRScreenResources *sr, RRMode id) {
     for(int i = 0; i < sr->nmode; ++i) {
         if(sr->modes[i].id == id)
@@ -24,29 +56,73 @@ static const XRRModeInfo* get_mode_info(const XRRScreenResources *sr, RRMode id)
     return NULL;
 }
 
-static void for_each_active_monitor_output_x11(Display *display, active_monitor_callback callback, void *userdata) {
+static gsr_monitor_rotation x11_rotation_to_gsr_rotation(int rot) {
+    switch(rot) {
+        case RR_Rotate_0:   return GSR_MONITOR_ROT_0;
+        case RR_Rotate_90:  return GSR_MONITOR_ROT_90;
+        case RR_Rotate_180: return GSR_MONITOR_ROT_180;
+        case RR_Rotate_270: return GSR_MONITOR_ROT_270;
+    }
+    return GSR_MONITOR_ROT_0;
+}
+
+static uint32_t x11_output_get_connector_id(Display *dpy, RROutput output, Atom randr_connector_id_atom) {
+    Atom type = 0;
+    int format = 0;
+    unsigned long bytes_after = 0;
+    unsigned long nitems = 0;
+    unsigned char *prop = NULL;
+    XRRGetOutputProperty(dpy, output, randr_connector_id_atom, 0, 128, false, false, AnyPropertyType, &type, &format, &nitems, &bytes_after, &prop);
+
+    long result = 0;
+    if(type == XA_INTEGER && format == 32 && nitems > 0 && prop)
+        result = *(long*)prop;
+
+    if(prop) XFree(prop);
+    return result;
+}
+
+static vec2i get_monitor_size_rotated(int width, int height, gsr_monitor_rotation rotation) {
+    vec2i size = { .x = width, .y = height };
+    if(rotation == GSR_MONITOR_ROT_90 || rotation == GSR_MONITOR_ROT_270) {
+        int tmp_x = size.x;
+        size.x = size.y;
+        size.y = tmp_x;
+    }
+    return size;
+}
+
+void for_each_active_monitor_output_x11_not_cached(Display *display, active_monitor_callback callback, void *userdata) {
     XRRScreenResources *screen_res = XRRGetScreenResources(display, DefaultRootWindow(display));
     if(!screen_res)
         return;
 
+    const Atom randr_connector_id_atom = XInternAtom(display, "CONNECTOR_ID", False);
+
     char display_name[256];
     for(int i = 0; i < screen_res->noutput; ++i) {
         XRROutputInfo *out_info = XRRGetOutputInfo(display, screen_res, screen_res->outputs[i]);
         if(out_info && out_info->crtc && out_info->connection == RR_Connected) {
             XRRCrtcInfo *crt_info = XRRGetCrtcInfo(display, screen_res, out_info->crtc);
             if(crt_info && crt_info->mode) {
+                // We want to use the current mode info width/height (mode_info->width/height) instead of crtc info width/height (crt_info->width/height) because crtc info
+                // is scaled if the monitor is scaled (xrandr --output DP-1 --scale 1.5). Normally this is not an issue for x11 applications,
+                // but gpu screen recorder captures the drm framebuffer instead of using the x11 api, and the drm framebuffer doesn't increase in size when using xrandr scaling.
+                // Maybe a better option would be to get the drm crtc size instead.
                 const XRRModeInfo *mode_info = get_mode_info(screen_res, crt_info->mode);
-                if(mode_info && out_info->nameLen < (int)sizeof(display_name)) {
-                    memcpy(display_name, out_info->name, out_info->nameLen);
-                    display_name[out_info->nameLen] = '\0';
+                if(mode_info) {
+                    snprintf(display_name, sizeof(display_name), "%.*s", (int)out_info->nameLen, out_info->name);
+                    const gsr_monitor_rotation rotation = x11_rotation_to_gsr_rotation(crt_info->rotation);
+                    const vec2i monitor_size = get_monitor_size_rotated(mode_info->width, mode_info->height, rotation);
 
-                    gsr_monitor monitor = {
+                    const gsr_monitor monitor = {
                         .name = display_name,
                         .name_len = out_info->nameLen,
                         .pos = { .x = crt_info->x, .y = crt_info->y },
-                        .size = { .x = (int)crt_info->width, .y = (int)crt_info->height },
-                        .crt_info = crt_info,
-                        .connector_id = 0 // TODO: Get connector id
+                        .size = monitor_size,
+                        .connector_id = x11_output_get_connector_id(display, screen_res->outputs[i], randr_connector_id_atom),
+                        .rotation = rotation,
+                        .monitor_identifier = out_info->crtc
                     };
                     callback(&monitor, userdata);
                 }
@@ -61,27 +137,41 @@ static void for_each_active_monitor_output_x11(Display *display, active_monitor_
     XRRFreeScreenResources(screen_res);
 }
 
-typedef struct {
-    int type;
-    int count;
-} drm_connector_type_count;
-
-#define CONNECTOR_TYPE_COUNTS 32
+/* TODO: Support more connector types */
+int get_connector_type_by_name(const char *name) {
+    int len = strlen(name);
+    if(len >= 5 && strncmp(name, "HDMI-", 5) == 0)
+        return 1;
+    else if(len >= 3 && strncmp(name, "DP-", 3) == 0)
+        return 2;
+    else if(len >= 12 && strncmp(name, "DisplayPort-", 12) == 0)
+        return 3;
+    else if(len >= 4 && strncmp(name, "eDP-", 4) == 0)
+        return 4;
+    else
+        return -1;
+}
 
-static drm_connector_type_count* drm_connector_types_get_index(drm_connector_type_count *type_counts, int *num_type_counts, int connector_type) {
-    for(int i = 0; i < *num_type_counts; ++i) {
-        if(type_counts[i].type == connector_type)
-            return &type_counts[i];
+int get_connector_type_id_by_name(const char *name) {
+    int len = strlen(name);
+    int num_start = 0;
+    for(int i = len - 1; i >= 0; --i) {
+        const bool is_num = name[i] >= '0' && name[i] <= '9';
+        if(!is_num) {
+            num_start = i + 1;
+            break;
+        }
     }
 
-    if(*num_type_counts == CONNECTOR_TYPE_COUNTS)
-        return NULL;
+    const int num_len = len - num_start;
+    if(num_len <= 0)
+        return -1;
 
-    const int index = *num_type_counts;
-    type_counts[index].type = connector_type;
-    type_counts[index].count = 0;
-    ++*num_type_counts;
-    return &type_counts[index];
+    return atoi(name + num_start);
+}
+
+uint32_t monitor_identifier_from_type_and_count(int monitor_type_index, int monitor_type_count) {
+    return ((uint32_t)monitor_type_index << 16) | ((uint32_t)monitor_type_count);
 }
 
 static bool connector_get_property_by_name(int drmfd, drmModeConnectorPtr props, const char *name, uint64_t *result) {
@@ -99,36 +189,15 @@ static bool connector_get_property_by_name(int drmfd, drmModeConnectorPtr props,
     return false;
 }
 
-static void for_each_active_monitor_output_wayland(gsr_egl *egl, active_monitor_callback callback, void *userdata) {
-    if(!gsr_egl_supports_wayland_capture(egl))
+static void for_each_active_monitor_output_drm(const char *card_path, active_monitor_callback callback, void *userdata) {
+    int fd = open(card_path, O_RDONLY);
+    if(fd == -1) {
+        fprintf(stderr, "gsr error: for_each_active_monitor_output_drm failed, failed to open \"%s\", error: %s\n", card_path, strerror(errno));
         return;
-
-    for(int i = 0; i < egl->wayland.num_outputs; ++i) {
-        if(!egl->wayland.outputs[i].name)
-            continue;
-
-        gsr_monitor monitor = {
-            .name = egl->wayland.outputs[i].name,
-            .name_len = strlen(egl->wayland.outputs[i].name),
-            .pos = { .x = egl->wayland.outputs[i].pos.x, .y = egl->wayland.outputs[i].pos.y },
-            .size = { .x = egl->wayland.outputs[i].size.x, .y = egl->wayland.outputs[i].size.y },
-            .crt_info = NULL,
-            .connector_id = 0
-        };
-        callback(&monitor, userdata);
     }
-}
-
-static void for_each_active_monitor_output_drm(const char *drm_card_path, active_monitor_callback callback, void *userdata) {
-    int fd = open(drm_card_path, O_RDONLY);
-    if(fd == -1)
-        return;
 
     drmSetClientCap(fd, DRM_CLIENT_CAP_ATOMIC, 1);
 
-    drm_connector_type_count type_counts[CONNECTOR_TYPE_COUNTS];
-    int num_type_counts = 0;
-
     char display_name[256];
     drmModeResPtr resources = drmModeGetResources(fd);
     if(resources) {
@@ -137,12 +206,6 @@ static void for_each_active_monitor_output_drm(const char *drm_card_path, active
             if(!connector)
                 continue;
 
-            drm_connector_type_count *connector_type = drm_connector_types_get_index(type_counts, &num_type_counts, connector->connector_type);
-            const char *connection_name = drmModeGetConnectorTypeName(connector->connector_type);
-            const int connection_name_len = strlen(connection_name);
-            if(connector_type)
-                ++connector_type->count;
-
             if(connector->connection != DRM_MODE_CONNECTED) {
                 drmModeFreeConnector(connector);
                 continue;
@@ -152,15 +215,20 @@ static void for_each_active_monitor_output_drm(const char *drm_card_path, active
             connector_get_property_by_name(fd, connector, "CRTC_ID", &crtc_id);
 
             drmModeCrtcPtr crtc = drmModeGetCrtc(fd, crtc_id);
-            if(connector_type && crtc_id > 0 && crtc && connection_name_len + 5 < (int)sizeof(display_name)) {
-                const int display_name_len = snprintf(display_name, sizeof(display_name), "%s-%d", connection_name, connector_type->count);
-                gsr_monitor monitor = {
+            const char *connection_name = drmModeGetConnectorTypeName(connector->connector_type);
+
+            if(connection_name && crtc_id > 0 && crtc) {
+                const int display_name_len = snprintf(display_name, sizeof(display_name), "%s-%u", connection_name, connector->connector_type_id);
+                const int connector_type_index_name = get_connector_type_by_name(display_name);
+
+                const gsr_monitor monitor = {
                     .name = display_name,
                     .name_len = display_name_len,
                     .pos = { .x = crtc->x, .y = crtc->y },
                     .size = { .x = (int)crtc->width, .y = (int)crtc->height },
-                    .crt_info = NULL,
-                    .connector_id = connector->connector_id
+                    .connector_id = connector->connector_id,
+                    .rotation = GSR_MONITOR_ROT_0,
+                    .monitor_identifier = connector_type_index_name != -1 ? monitor_identifier_from_type_and_count(connector_type_index_name, connector->connector_type_id) : 0
                 };
                 callback(&monitor, userdata);
             }
@@ -176,16 +244,14 @@ static void for_each_active_monitor_output_drm(const char *drm_card_path, active
     close(fd);
 }
 
-void for_each_active_monitor_output(void *connection, gsr_connection_type connection_type, active_monitor_callback callback, void *userdata) {
+void for_each_active_monitor_output(const gsr_window *window, const char *card_path, gsr_connection_type connection_type, active_monitor_callback callback, void *userdata) {
     switch(connection_type) {
         case GSR_CONNECTION_X11:
-            for_each_active_monitor_output_x11(connection, callback, userdata);
-            break;
         case GSR_CONNECTION_WAYLAND:
-            for_each_active_monitor_output_wayland(connection, callback, userdata);
+            gsr_window_for_each_active_monitor_output_cached(window, callback, userdata);
             break;
         case GSR_CONNECTION_DRM:
-            for_each_active_monitor_output_drm(connection, callback, userdata);
+            for_each_active_monitor_output_drm(card_path, callback, userdata);
             break;
     }
 }
@@ -195,20 +261,96 @@ static void get_monitor_by_name_callback(const gsr_monitor *monitor, void *userd
     if(!data->found_monitor && strcmp(data->name, monitor->name) == 0) {
         data->monitor->pos = monitor->pos;
         data->monitor->size = monitor->size;
+        data->monitor->connector_id = monitor->connector_id;
+        data->monitor->rotation = monitor->rotation;
+        data->monitor->monitor_identifier = monitor->monitor_identifier;
         data->found_monitor = true;
     }
 }
 
-bool get_monitor_by_name(void *connection, gsr_connection_type connection_type, const char *name, gsr_monitor *monitor) {
+bool get_monitor_by_name(const gsr_egl *egl, gsr_connection_type connection_type, const char *name, gsr_monitor *monitor) {
     get_monitor_by_name_userdata userdata;
     userdata.name = name;
     userdata.name_len = strlen(name);
     userdata.monitor = monitor;
     userdata.found_monitor = false;
-    for_each_active_monitor_output(connection, connection_type, get_monitor_by_name_callback, &userdata);
+    for_each_active_monitor_output(egl->window, egl->card_path, connection_type, get_monitor_by_name_callback, &userdata);
     return userdata.found_monitor;
 }
 
+typedef struct {
+    const gsr_monitor *monitor;
+    gsr_monitor_rotation rotation;
+    vec2i position;
+    bool match_found;
+} get_monitor_by_connector_id_userdata;
+
+static bool vec2i_eql(vec2i a, vec2i b) {
+    return a.x == b.x && a.y == b.y;
+}
+
+static void get_monitor_by_name_and_size_callback(const gsr_monitor *monitor, void *userdata) {
+    get_monitor_by_connector_id_userdata *data = (get_monitor_by_connector_id_userdata*)userdata;
+    if(monitor->name && data->monitor->name && strcmp(monitor->name, data->monitor->name) == 0 && vec2i_eql(monitor->size, data->monitor->size)) {
+        data->rotation = monitor->rotation;
+        data->position = monitor->pos;
+        data->match_found = true;
+    }
+}
+
+static void get_monitor_by_connector_id_callback(const gsr_monitor *monitor, void *userdata) {
+    get_monitor_by_connector_id_userdata *data = (get_monitor_by_connector_id_userdata*)userdata;
+    if(monitor->connector_id == data->monitor->connector_id ||
+        (!monitor->connector_id && monitor->monitor_identifier == data->monitor->monitor_identifier))
+    {
+        data->rotation = monitor->rotation;
+        data->position = monitor->pos;
+        data->match_found = true;
+    }
+}
+
+bool drm_monitor_get_display_server_data(const gsr_window *window, const gsr_monitor *monitor, gsr_monitor_rotation *monitor_rotation, vec2i *monitor_position) {
+    *monitor_rotation = GSR_MONITOR_ROT_0;
+    *monitor_position = (vec2i){0, 0};
+
+    if(gsr_window_get_display_server(window) == GSR_DISPLAY_SERVER_WAYLAND) {
+        {
+            get_monitor_by_connector_id_userdata userdata;
+            userdata.monitor = monitor;
+            userdata.rotation = GSR_MONITOR_ROT_0;
+            userdata.position = (vec2i){0, 0};
+            userdata.match_found = false;
+            gsr_window_for_each_active_monitor_output_cached(window, get_monitor_by_name_and_size_callback, &userdata);
+            if(userdata.match_found) {
+                *monitor_rotation = userdata.rotation;
+                *monitor_position = userdata.position;
+                return true;
+            }
+        }
+        {
+            get_monitor_by_connector_id_userdata userdata;
+            userdata.monitor = monitor;
+            userdata.rotation = GSR_MONITOR_ROT_0;
+            userdata.position = (vec2i){0, 0};
+            userdata.match_found = false;
+            gsr_window_for_each_active_monitor_output_cached(window, get_monitor_by_connector_id_callback, &userdata);
+            *monitor_rotation = userdata.rotation;
+            *monitor_position = userdata.position;
+            return userdata.match_found;
+        }
+    } else {
+        get_monitor_by_connector_id_userdata userdata;
+        userdata.monitor = monitor;
+        userdata.rotation = GSR_MONITOR_ROT_0;
+        userdata.position = (vec2i){0, 0};
+        userdata.match_found = false;
+        gsr_window_for_each_active_monitor_output_cached(window, get_monitor_by_connector_id_callback, &userdata);
+        *monitor_rotation = userdata.rotation;
+        *monitor_position = userdata.position;
+        return userdata.match_found;
+    }
+}
+
 bool gl_get_gpu_info(gsr_egl *egl, gsr_gpu_info *info) {
     const char *software_renderers[] = { "llvmpipe", "SWR", "softpipe", NULL };
     bool supported = true;
@@ -216,6 +358,7 @@ bool gl_get_gpu_info(gsr_egl *egl, gsr_gpu_info *info) {
     const unsigned char *gl_renderer = egl->glGetString(GL_RENDERER);
 
     info->gpu_version = 0;
+    info->is_steam_deck = false;
 
     if(!gl_vendor) {
         fprintf(stderr, "gsr error: failed to get gpu vendor\n");
@@ -235,10 +378,14 @@ bool gl_get_gpu_info(gsr_egl *egl, gsr_gpu_info *info) {
 
     if(strstr((const char*)gl_vendor, "AMD"))
         info->vendor = GSR_GPU_VENDOR_AMD;
+    else if(strstr((const char*)gl_vendor, "Mesa") && gl_renderer && strstr((const char*)gl_renderer, "AMD"))
+        info->vendor = GSR_GPU_VENDOR_AMD;
     else if(strstr((const char*)gl_vendor, "Intel"))
         info->vendor = GSR_GPU_VENDOR_INTEL;
     else if(strstr((const char*)gl_vendor, "NVIDIA"))
         info->vendor = GSR_GPU_VENDOR_NVIDIA;
+    else if(strstr((const char*)gl_vendor, "Broadcom"))
+        info->vendor = GSR_GPU_VENDOR_BROADCOM;
     else {
         fprintf(stderr, "gsr error: unknown gpu vendor: %s\n", gl_vendor);
         supported = false;
@@ -248,53 +395,66 @@ bool gl_get_gpu_info(gsr_egl *egl, gsr_gpu_info *info) {
     if(gl_renderer) {
         if(info->vendor == GSR_GPU_VENDOR_NVIDIA)
             sscanf((const char*)gl_renderer, "%*s %*s %*s %d", &info->gpu_version);
+        info->is_steam_deck = strstr((const char*)gl_renderer, "vangogh") != NULL;
     }
 
     end:
     return supported;
 }
 
-bool gsr_get_valid_card_path(char *output) {
-    for(int i = 0; i < 10; ++i) {
-        drmVersion *ver = NULL;
-        drmModePlaneResPtr planes = NULL;
-        bool found_screen_card = false;
+bool try_card_has_valid_plane(const char *card_path) {
+    drmVersion *ver = NULL;
+    drmModePlaneResPtr planes = NULL;
+    bool found_screen_card = false;
 
-        sprintf(output, DRM_DEV_NAME, DRM_DIR_NAME, i);
-        int fd = open(output, O_RDONLY);
-        if(fd == -1)
-            continue;
-
-        ver = drmGetVersion(fd);
-        if(!ver || strstr(ver->name, "nouveau"))
-            goto next;
+    int fd = open(card_path, O_RDONLY);
+    if(fd == -1)
+        return false;
 
-        drmSetClientCap(fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);
+    ver = drmGetVersion(fd);
+    if(!ver || strstr(ver->name, "nouveau"))
+        goto next;
 
-        planes = drmModeGetPlaneResources(fd);
-        if(!planes)
-            goto next;
+    drmSetClientCap(fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);
 
-        for(uint32_t j = 0; j < planes->count_planes; ++j) {
-            drmModePlanePtr plane = drmModeGetPlane(fd, planes->planes[j]);
-            if(!plane)
-                continue;
+    planes = drmModeGetPlaneResources(fd);
+    if(!planes)
+        goto next;
 
-            if(plane->fb_id)
-                found_screen_card = true;
+    for(uint32_t j = 0; j < planes->count_planes; ++j) {
+        drmModePlanePtr plane = drmModeGetPlane(fd, planes->planes[j]);
+        if(!plane)
+            continue;
 
-            drmModeFreePlane(plane);
-            if(found_screen_card)
-                break;
-        }
+        if(plane->fb_id)
+            found_screen_card = true;
 
-        next:
-        if(planes)
-            drmModeFreePlaneResources(planes);
-        if(ver)
-            drmFreeVersion(ver);
-        close(fd);
+        drmModeFreePlane(plane);
         if(found_screen_card)
+            break;
+    }
+
+    next:
+    if(planes)
+        drmModeFreePlaneResources(planes);
+    if(ver)
+        drmFreeVersion(ver);
+    close(fd);
+    if(found_screen_card)
+        return true;
+
+    return false;
+}
+
+bool gsr_get_valid_card_path(gsr_egl *egl, char *output, bool is_monitor_capture) {
+    if(egl->dri_card_path) {
+        snprintf(output, 128, "%s", egl->dri_card_path);
+        return is_monitor_capture ? try_card_has_valid_plane(output) : true;
+    }
+
+    for(int i = 0; i < 10; ++i) {
+        snprintf(output, 128, DRM_DEV_NAME, DRM_DIR_NAME, i);
+        if(try_card_has_valid_plane(output))
             return true;
     }
     return false;
@@ -307,7 +467,7 @@ bool gsr_card_path_get_render_path(const char *card_path, char *render_path) {
 
     char *render_path_tmp = drmGetRenderDeviceNameFromFd(fd);
     if(render_path_tmp) {
-        strncpy(render_path, render_path_tmp, 128);
+        snprintf(render_path, 128, "%s", render_path_tmp);
         free(render_path_tmp);
         close(fd);
         return true;
@@ -317,6 +477,136 @@ bool gsr_card_path_get_render_path(const char *card_path, char *render_path) {
     return false;
 }
 
-int even_number_ceil(int value) {
-    return value + (value & 1);
+int create_directory_recursive(char *path) {
+    int path_len = strlen(path);
+    char *p = path;
+    char *end = path + path_len;
+    for(;;) {
+        char *slash_p = strchr(p, '/');
+
+        // Skips first '/', we don't want to try and create the root directory
+        if(slash_p == path) {
+            ++p;
+            continue;
+        }
+
+        if(!slash_p)
+            slash_p = end;
+
+        char prev_char = *slash_p;
+        *slash_p = '\0';
+        int err = mkdir(path, S_IRWXU);
+        *slash_p = prev_char;
+
+        if(err == -1 && errno != EEXIST)
+            return err;
+
+        if(slash_p == end)
+            break;
+        else
+            p = slash_p + 1;
+    }
+    return 0;
+}
+
+void setup_dma_buf_attrs(intptr_t *img_attr, uint32_t format, uint32_t width, uint32_t height, const int *fds, const uint32_t *offsets, const uint32_t *pitches, const uint64_t *modifiers, int num_planes, bool use_modifier) {
+    const uint32_t plane_fd_attrs[DRM_NUM_BUF_ATTRS] = {
+        EGL_DMA_BUF_PLANE0_FD_EXT,
+        EGL_DMA_BUF_PLANE1_FD_EXT,
+        EGL_DMA_BUF_PLANE2_FD_EXT,
+        EGL_DMA_BUF_PLANE3_FD_EXT
+    };
+
+    const uint32_t plane_offset_attrs[DRM_NUM_BUF_ATTRS] = {
+        EGL_DMA_BUF_PLANE0_OFFSET_EXT,
+        EGL_DMA_BUF_PLANE1_OFFSET_EXT,
+        EGL_DMA_BUF_PLANE2_OFFSET_EXT,
+        EGL_DMA_BUF_PLANE3_OFFSET_EXT
+    };
+
+    const uint32_t plane_pitch_attrs[DRM_NUM_BUF_ATTRS] = {
+        EGL_DMA_BUF_PLANE0_PITCH_EXT,
+        EGL_DMA_BUF_PLANE1_PITCH_EXT,
+        EGL_DMA_BUF_PLANE2_PITCH_EXT,
+        EGL_DMA_BUF_PLANE3_PITCH_EXT
+    };
+
+    const uint32_t plane_modifier_lo_attrs[DRM_NUM_BUF_ATTRS] = {
+        EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT,
+        EGL_DMA_BUF_PLANE1_MODIFIER_LO_EXT,
+        EGL_DMA_BUF_PLANE2_MODIFIER_LO_EXT,
+        EGL_DMA_BUF_PLANE3_MODIFIER_LO_EXT
+    };
+
+    const uint32_t plane_modifier_hi_attrs[DRM_NUM_BUF_ATTRS] = {
+        EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT,
+        EGL_DMA_BUF_PLANE1_MODIFIER_HI_EXT,
+        EGL_DMA_BUF_PLANE2_MODIFIER_HI_EXT,
+        EGL_DMA_BUF_PLANE3_MODIFIER_HI_EXT
+    };
+
+    size_t img_attr_index = 0;
+
+    img_attr[img_attr_index++] = EGL_LINUX_DRM_FOURCC_EXT;
+    img_attr[img_attr_index++] = format;
+
+    img_attr[img_attr_index++] = EGL_WIDTH;
+    img_attr[img_attr_index++] = width;
+
+    img_attr[img_attr_index++] = EGL_HEIGHT;
+    img_attr[img_attr_index++] = height;
+
+    assert(num_planes <= DRM_NUM_BUF_ATTRS);
+    for(int i = 0; i < num_planes; ++i) {
+        img_attr[img_attr_index++] = plane_fd_attrs[i];
+        img_attr[img_attr_index++] = fds[i];
+
+        img_attr[img_attr_index++] = plane_offset_attrs[i];
+        img_attr[img_attr_index++] = offsets[i];
+
+        img_attr[img_attr_index++] = plane_pitch_attrs[i];
+        img_attr[img_attr_index++] = pitches[i];
+
+        if(use_modifier) {
+            img_attr[img_attr_index++] = plane_modifier_lo_attrs[i];
+            img_attr[img_attr_index++] = modifiers[i] & 0xFFFFFFFFULL;
+
+            img_attr[img_attr_index++] = plane_modifier_hi_attrs[i];
+            img_attr[img_attr_index++] = modifiers[i] >> 32ULL;
+        }
+    }
+
+    img_attr[img_attr_index++] = EGL_NONE;
+    assert(img_attr_index <= 44);
+}
+
+vec2i scale_keep_aspect_ratio(vec2i from, vec2i to) {
+    if(from.x == 0 || from.y == 0)
+        return (vec2i){0, 0};
+
+    const double height_to_width_ratio = (double)from.y / (double)from.x;
+    from.x = to.x;
+    from.y = from.x * height_to_width_ratio;
+
+    if(from.y > to.y) {
+        const double width_height_ratio = (double)from.x / (double)from.y;
+        from.y = to.y;
+        from.x = from.y * width_height_ratio;
+    }
+
+    return from;
+}
+
+unsigned int gl_create_texture(gsr_egl *egl, int width, int height, int internal_format, unsigned int format, int filter) {
+    unsigned int texture_id = 0;
+    egl->glGenTextures(1, &texture_id);
+    egl->glBindTexture(GL_TEXTURE_2D, texture_id);
+    //egl->glTexImage2D(GL_TEXTURE_2D, 0, internal_format, width, height, 0, format, GL_UNSIGNED_BYTE, NULL);
+    egl->glTexStorage2D(GL_TEXTURE_2D, 1, internal_format, width, height);
+
+    egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, filter);
+    egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, filter);
+
+    egl->glBindTexture(GL_TEXTURE_2D, 0);
+    return texture_id;
 }
diff --git a/src/window/wayland.c b/src/window/wayland.c
new file mode 100644
index 0000000..037c85f
--- /dev/null
+++ b/src/window/wayland.c
@@ -0,0 +1,391 @@
+#include "../../include/window/wayland.h"
+
+#include "../../include/vec2.h"
+#include "../../include/defs.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <wayland-client.h>
+#include <wayland-egl.h>
+#include "xdg-output-unstable-v1-client-protocol.h"
+
+#define GSR_MAX_OUTPUTS 32
+
+typedef struct gsr_window_wayland gsr_window_wayland;
+
+typedef struct {
+    uint32_t wl_name;
+    struct wl_output *output;
+    struct zxdg_output_v1 *xdg_output;
+    vec2i pos;
+    vec2i size;
+    int32_t transform;
+    char *name;
+} gsr_wayland_output;
+
+struct gsr_window_wayland {
+    struct wl_display *display;
+    struct wl_egl_window *window;
+    struct wl_registry *registry;
+    struct wl_surface *surface;
+    struct wl_compositor *compositor;
+    gsr_wayland_output outputs[GSR_MAX_OUTPUTS];
+    int num_outputs;
+    struct zxdg_output_manager_v1 *xdg_output_manager;
+};
+
+static void output_handle_geometry(void *data, struct wl_output *wl_output,
+        int32_t x, int32_t y, int32_t phys_width, int32_t phys_height,
+        int32_t subpixel, const char *make, const char *model,
+        int32_t transform) {
+    (void)wl_output;
+    (void)phys_width;
+    (void)phys_height;
+    (void)subpixel;
+    (void)make;
+    (void)model;
+    gsr_wayland_output *gsr_output = data;
+    gsr_output->pos.x = x;
+    gsr_output->pos.y = y;
+    gsr_output->transform = transform;
+}
+
+static void output_handle_mode(void *data, struct wl_output *wl_output, uint32_t flags, int32_t width, int32_t height, int32_t refresh) {
+    (void)wl_output;
+    (void)flags;
+    (void)refresh;
+    gsr_wayland_output *gsr_output = data;
+    gsr_output->size.x = width;
+    gsr_output->size.y = height;
+}
+
+static void output_handle_done(void *data, struct wl_output *wl_output) {
+    (void)data;
+    (void)wl_output;
+}
+
+static void output_handle_scale(void* data, struct wl_output *wl_output, int32_t factor) {
+    (void)data;
+    (void)wl_output;
+    (void)factor;
+}
+
+static void output_handle_name(void *data, struct wl_output *wl_output, const char *name) {
+    (void)wl_output;
+    gsr_wayland_output *gsr_output = data;
+    if(gsr_output->name) {
+        free(gsr_output->name);
+        gsr_output->name = NULL;
+    }
+    gsr_output->name = strdup(name);
+}
+
+static void output_handle_description(void *data, struct wl_output *wl_output, const char *description) {
+    (void)data;
+    (void)wl_output;
+    (void)description;
+}
+
+static const struct wl_output_listener output_listener = {
+    .geometry = output_handle_geometry,
+    .mode = output_handle_mode,
+    .done = output_handle_done,
+    .scale = output_handle_scale,
+    .name = output_handle_name,
+    .description = output_handle_description,
+};
+
+static void registry_add_object(void *data, struct wl_registry *registry, uint32_t name, const char *interface, uint32_t version) {
+    (void)version;
+    gsr_window_wayland *window_wayland = data;
+    if(strcmp(interface, "wl_compositor") == 0) {
+        if(window_wayland->compositor)
+            return;
+
+        window_wayland->compositor = wl_registry_bind(registry, name, &wl_compositor_interface, 1);
+    } else if(strcmp(interface, wl_output_interface.name) == 0) {
+        if(version < 4) {
+            fprintf(stderr, "gsr warning: wl output interface version is < 4, expected >= 4 to capture a monitor\n");
+            return;
+        }
+
+        if(window_wayland->num_outputs == GSR_MAX_OUTPUTS) {
+            fprintf(stderr, "gsr warning: reached maximum outputs (%d), ignoring output %u\n", GSR_MAX_OUTPUTS, name);
+            return;
+        }
+
+        gsr_wayland_output *gsr_output = &window_wayland->outputs[window_wayland->num_outputs];
+        window_wayland->num_outputs++;
+        *gsr_output = (gsr_wayland_output) {
+            .wl_name = name,
+            .output = wl_registry_bind(registry, name, &wl_output_interface, 4),
+            .pos = { .x = 0, .y = 0 },
+            .size = { .x = 0, .y = 0 },
+            .transform = 0,
+            .name = NULL,
+        };
+        wl_output_add_listener(gsr_output->output, &output_listener, gsr_output);
+    } else if(strcmp(interface, zxdg_output_manager_v1_interface.name) == 0) {
+        if(version < 1) {
+            fprintf(stderr, "gsr warning: xdg output interface version is < 1, expected >= 1 to capture a monitor\n");
+            return;
+        }
+
+        if(window_wayland->xdg_output_manager)
+            return;
+
+        window_wayland->xdg_output_manager = wl_registry_bind(registry, name, &zxdg_output_manager_v1_interface, 1);
+    }
+}
+
+static void registry_remove_object(void *data, struct wl_registry *registry, uint32_t name) {
+    (void)data;
+    (void)registry;
+    (void)name;
+    // TODO: Remove output
+}
+
+static struct wl_registry_listener registry_listener = {
+    .global = registry_add_object,
+    .global_remove = registry_remove_object,
+};
+
+static void xdg_output_logical_position(void *data, struct zxdg_output_v1 *zxdg_output_v1, int32_t x, int32_t y) {
+    (void)zxdg_output_v1;
+    gsr_wayland_output *gsr_xdg_output = data;
+    gsr_xdg_output->pos.x = x;
+    gsr_xdg_output->pos.y = y;
+}
+
+static void xdg_output_handle_logical_size(void *data, struct zxdg_output_v1 *xdg_output, int32_t width, int32_t height) {
+    (void)data;
+    (void)xdg_output;
+    (void)width;
+    (void)height;
+}
+
+static void xdg_output_handle_done(void *data, struct zxdg_output_v1 *xdg_output) {
+    (void)data;
+    (void)xdg_output;
+}
+
+static void xdg_output_handle_name(void *data, struct zxdg_output_v1 *xdg_output, const char *name) {
+    (void)data;
+    (void)xdg_output;
+    (void)name;
+}
+
+static void xdg_output_handle_description(void *data, struct zxdg_output_v1 *xdg_output, const char *description) {
+    (void)data;
+    (void)xdg_output;
+    (void)description;
+}
+
+static const struct zxdg_output_v1_listener xdg_output_listener = {
+    .logical_position = xdg_output_logical_position,
+    .logical_size = xdg_output_handle_logical_size,
+    .done = xdg_output_handle_done,
+    .name = xdg_output_handle_name,
+    .description = xdg_output_handle_description,
+};
+
+static void gsr_window_wayland_set_monitor_outputs_from_xdg_output(gsr_window_wayland *self) {
+    if(!self->xdg_output_manager) {
+        fprintf(stderr, "gsr warning: zxdg_output_manager not found. registered monitor positions might be incorrect\n");
+        return;
+    }
+
+    for(int i = 0; i < self->num_outputs; ++i) {
+        self->outputs[i].xdg_output = zxdg_output_manager_v1_get_xdg_output(self->xdg_output_manager, self->outputs[i].output);
+        zxdg_output_v1_add_listener(self->outputs[i].xdg_output, &xdg_output_listener, &self->outputs[i]);
+    }
+
+    // Fetch xdg_output
+    wl_display_roundtrip(self->display);
+}
+
+static void gsr_window_wayland_deinit(gsr_window_wayland *self) {
+    if(self->window) {
+        wl_egl_window_destroy(self->window);
+        self->window = NULL;
+    }
+
+    if(self->surface) {
+        wl_surface_destroy(self->surface);
+        self->surface = NULL;
+    }
+
+    for(int i = 0; i < self->num_outputs; ++i) {
+        if(self->outputs[i].output) {
+            wl_output_destroy(self->outputs[i].output);
+            self->outputs[i].output = NULL;
+        }
+
+        if(self->outputs[i].name) {
+            free(self->outputs[i].name);
+            self->outputs[i].name = NULL;
+        }
+
+        if(self->outputs[i].xdg_output) {
+            zxdg_output_v1_destroy(self->outputs[i].xdg_output);
+            self->outputs[i].xdg_output = NULL;
+        }
+    }
+    self->num_outputs = 0;
+
+    if(self->xdg_output_manager) {
+        zxdg_output_manager_v1_destroy(self->xdg_output_manager);
+        self->xdg_output_manager = NULL;
+    }
+
+    if(self->compositor) {
+        wl_compositor_destroy(self->compositor);
+        self->compositor = NULL;
+    }
+
+    if(self->registry) {
+        wl_registry_destroy(self->registry);
+        self->registry = NULL;
+    }
+
+    if(self->display) {
+        wl_display_disconnect(self->display);
+        self->display = NULL;
+    }
+}
+
+static bool gsr_window_wayland_init(gsr_window_wayland *self) {
+    self->display = wl_display_connect(NULL);
+    if(!self->display) {
+        fprintf(stderr, "gsr error: gsr_window_wayland_init failed: failed to connect to the Wayland server\n");
+        goto fail;
+    }
+
+    self->registry = wl_display_get_registry(self->display); // TODO: Error checking
+    wl_registry_add_listener(self->registry, &registry_listener, self); // TODO: Error checking
+
+    // Fetch globals
+    wl_display_roundtrip(self->display);
+
+    // Fetch wl_output
+    wl_display_roundtrip(self->display);
+
+    gsr_window_wayland_set_monitor_outputs_from_xdg_output(self);
+
+    if(!self->compositor) {
+        fprintf(stderr, "gsr error: gsr_window_wayland_init failed: failed to find compositor\n");
+        goto fail;
+    }
+
+    self->surface = wl_compositor_create_surface(self->compositor);
+    if(!self->surface) {
+        fprintf(stderr, "gsr error: gsr_window_wayland_init failed: failed to create surface\n");
+        goto fail;
+    }
+
+    self->window = wl_egl_window_create(self->surface, 16, 16);
+    if(!self->window) {
+        fprintf(stderr, "gsr error: gsr_window_wayland_init failed: failed to create window\n");
+        goto fail;
+    }
+
+    return true;
+
+    fail:
+    gsr_window_wayland_deinit(self);
+    return false;
+}
+
+static void gsr_window_wayland_destroy(gsr_window *window) {
+    gsr_window_wayland *self = window->priv;
+    gsr_window_wayland_deinit(self);
+    free(self);
+    free(window);
+}
+
+static bool gsr_window_wayland_process_event(gsr_window *window) {
+    gsr_window_wayland *self = window->priv;
+    // TODO: pselect on wl_display_get_fd before doing dispatch
+    const bool events_available = wl_display_dispatch_pending(self->display) > 0;
+    wl_display_flush(self->display);
+    return events_available;
+}
+
+static gsr_display_server gsr_wayland_get_display_server(void) {
+    return GSR_DISPLAY_SERVER_WAYLAND;
+}
+
+static void* gsr_window_wayland_get_display(gsr_window *window) {
+    gsr_window_wayland *self = window->priv;
+    return self->display;
+}
+
+static void* gsr_window_wayland_get_window(gsr_window *window) {
+    gsr_window_wayland *self = window->priv;
+    return self->window;
+}
+
+static gsr_monitor_rotation wayland_transform_to_gsr_rotation(int32_t rot) {
+    switch(rot) {
+        case 0: return GSR_MONITOR_ROT_0;
+        case 1: return GSR_MONITOR_ROT_90;
+        case 2: return GSR_MONITOR_ROT_180;
+        case 3: return GSR_MONITOR_ROT_270;
+    }
+    return GSR_MONITOR_ROT_0;
+}
+
+static void gsr_window_wayland_for_each_active_monitor_output_cached(const gsr_window *window, active_monitor_callback callback, void *userdata) {
+    const gsr_window_wayland *self = window->priv;
+    for(int i = 0; i < self->num_outputs; ++i) {
+        const gsr_wayland_output *output = &self->outputs[i];
+        if(!output->name)
+            continue;
+
+        const int connector_type_index = get_connector_type_by_name(output->name);
+        const int connector_type_id = get_connector_type_id_by_name(output->name);
+        const gsr_monitor monitor = {
+            .name = output->name,
+            .name_len = strlen(output->name),
+            .pos = { .x = output->pos.x, .y = output->pos.y },
+            .size = { .x = output->size.x, .y = output->size.y },
+            .connector_id = 0,
+            .rotation = wayland_transform_to_gsr_rotation(output->transform),
+            .monitor_identifier = (connector_type_index != -1 && connector_type_id != -1) ? monitor_identifier_from_type_and_count(connector_type_index, connector_type_id) : 0
+        };
+        callback(&monitor, userdata);
+    }
+}
+
+gsr_window* gsr_window_wayland_create(void) {
+    gsr_window *window = calloc(1, sizeof(gsr_window));
+    if(!window)
+        return window;
+
+    gsr_window_wayland *window_wayland = calloc(1, sizeof(gsr_window_wayland));
+    if(!window_wayland) {
+        free(window);
+        return NULL;
+    }
+
+    if(!gsr_window_wayland_init(window_wayland)) {
+        free(window_wayland);
+        free(window);
+        return NULL;
+    }
+
+    *window = (gsr_window) {
+        .destroy = gsr_window_wayland_destroy,
+        .process_event = gsr_window_wayland_process_event,
+        .get_event_data = NULL,
+        .get_display_server = gsr_wayland_get_display_server,
+        .get_display = gsr_window_wayland_get_display,
+        .get_window = gsr_window_wayland_get_window,
+        .for_each_active_monitor_output_cached = gsr_window_wayland_for_each_active_monitor_output_cached,
+        .priv = window_wayland
+    };
+
+    return window;
+}
diff --git a/src/window/window.c b/src/window/window.c
new file mode 100644
index 0000000..1c6a24e
--- /dev/null
+++ b/src/window/window.c
@@ -0,0 +1,30 @@
+#include "../../include/window/window.h"
+#include <stddef.h>
+
+void gsr_window_destroy(gsr_window *self);
+
+bool gsr_window_process_event(gsr_window *self) {
+    return self->process_event(self);
+}
+
+XEvent* gsr_window_get_event_data(gsr_window *self) {
+    if(self->get_event_data)
+        return self->get_event_data(self);
+    return NULL;
+}
+
+gsr_display_server gsr_window_get_display_server(const gsr_window *self) {
+    return self->get_display_server();
+}
+
+void* gsr_window_get_display(gsr_window *self) {
+    return self->get_display(self);
+}
+
+void* gsr_window_get_window(gsr_window *self) {
+    return self->get_window(self);
+}
+
+void gsr_window_for_each_active_monitor_output_cached(const gsr_window *self, active_monitor_callback callback, void *userdata) {
+    self->for_each_active_monitor_output_cached(self, callback, userdata);
+}
diff --git a/src/window/x11.c b/src/window/x11.c
new file mode 100644
index 0000000..964422d
--- /dev/null
+++ b/src/window/x11.c
@@ -0,0 +1,162 @@
+#include "../../include/window/x11.h"
+
+#include "../../include/vec2.h"
+#include "../../include/defs.h"
+#include "../../include/utils.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <X11/Xlib.h>
+
+#define GSR_MAX_OUTPUTS 32
+
+typedef struct {
+    char *name;
+    vec2i pos;
+    vec2i size;
+    uint32_t connector_id;
+    gsr_monitor_rotation rotation;
+    uint32_t monitor_identifier; /* crtc id */
+} gsr_x11_output;
+
+typedef struct {
+    Display *display;
+    Window window;
+    gsr_x11_output outputs[GSR_MAX_OUTPUTS];
+    int num_outputs;
+    XEvent xev;
+} gsr_window_x11;
+
+static void store_x11_monitor(const gsr_monitor *monitor, void *userdata) {
+    gsr_window_x11 *window_x11 = userdata;
+    if(window_x11->num_outputs == GSR_MAX_OUTPUTS) {
+        fprintf(stderr, "gsr warning: reached maximum outputs (%d), ignoring output %s\n", GSR_MAX_OUTPUTS, monitor->name);
+        return;
+    }
+
+    char *monitor_name = strdup(monitor->name);
+    if(!monitor_name)
+        return;
+
+    const int index = window_x11->num_outputs;
+    window_x11->outputs[index].name = monitor_name;
+    window_x11->outputs[index].pos = monitor->pos;
+    window_x11->outputs[index].size = monitor->size;
+    window_x11->outputs[index].connector_id = monitor->connector_id;
+    window_x11->outputs[index].rotation = monitor->rotation;
+    window_x11->outputs[index].monitor_identifier = monitor->monitor_identifier;
+    ++window_x11->num_outputs;
+}
+
+static void gsr_window_x11_deinit(gsr_window_x11 *self) {
+    if(self->window) {
+        XDestroyWindow(self->display, self->window);
+        self->window = None;
+    }
+
+    for(int i = 0; i < self->num_outputs; ++i) {
+        if(self->outputs[i].name) {
+            free(self->outputs[i].name);
+            self->outputs[i].name = NULL;
+        }
+    }
+    self->num_outputs = 0;
+}
+
+static bool gsr_window_x11_init(gsr_window_x11 *self) {
+    self->window = XCreateWindow(self->display, DefaultRootWindow(self->display), 0, 0, 16, 16, 0, CopyFromParent, InputOutput, CopyFromParent, 0, NULL);
+    if(!self->window) {
+        fprintf(stderr, "gsr error: gsr_window_x11_init failed: failed to create gl window\n");
+        return false;
+    }
+
+    self->num_outputs = 0;
+    for_each_active_monitor_output_x11_not_cached(self->display, store_x11_monitor, self);
+    return true;
+}
+
+static void gsr_window_x11_destroy(gsr_window *window) {
+    gsr_window_x11 *self = window->priv;
+    gsr_window_x11_deinit(self);
+    free(self);
+    free(window);
+}
+
+static bool gsr_window_x11_process_event(gsr_window *window) {
+    gsr_window_x11 *self = window->priv;
+    if(XPending(self->display)) {
+        XNextEvent(self->display, &self->xev);
+        return true;
+    }
+    return false;
+}
+
+static XEvent* gsr_window_x11_get_event_data(gsr_window *window) {
+    gsr_window_x11 *self = window->priv;
+    return &self->xev;
+}
+
+static gsr_display_server gsr_window_x11_get_display_server(void) {
+    return GSR_DISPLAY_SERVER_X11;
+}
+
+static void* gsr_window_x11_get_display(gsr_window *window) {
+    gsr_window_x11 *self = window->priv;
+    return self->display;
+}
+
+static void* gsr_window_x11_get_window(gsr_window *window) {
+    gsr_window_x11 *self = window->priv;
+    return (void*)self->window;
+}
+
+static void gsr_window_x11_for_each_active_monitor_output_cached(const gsr_window *window, active_monitor_callback callback, void *userdata) {
+    const gsr_window_x11 *self = window->priv;
+    for(int i = 0; i < self->num_outputs; ++i) {
+        const gsr_x11_output *output = &self->outputs[i];
+        const gsr_monitor monitor = {
+            .name = output->name,
+            .name_len = strlen(output->name),
+            .pos = output->pos,
+            .size = output->size,
+            .connector_id = output->connector_id,
+            .rotation = output->rotation,
+            .monitor_identifier = output->monitor_identifier
+        };
+        callback(&monitor, userdata);
+    }
+}
+
+gsr_window* gsr_window_x11_create(Display *display) {
+    gsr_window *window = calloc(1, sizeof(gsr_window));
+    if(!window)
+        return window;
+
+    gsr_window_x11 *window_x11 = calloc(1, sizeof(gsr_window_x11));
+    if(!window_x11) {
+        free(window);
+        return NULL;
+    }
+
+    window_x11->display = display;
+    if(!gsr_window_x11_init(window_x11)) {
+        free(window_x11);
+        free(window);
+        return NULL;
+    }
+
+    *window = (gsr_window) {
+        .destroy = gsr_window_x11_destroy,
+        .process_event = gsr_window_x11_process_event,
+        .get_event_data = gsr_window_x11_get_event_data,
+        .get_display_server = gsr_window_x11_get_display_server,
+        .get_display = gsr_window_x11_get_display,
+        .get_window = gsr_window_x11_get_window,
+        .for_each_active_monitor_output_cached = gsr_window_x11_for_each_active_monitor_output_cached,
+        .priv = window_x11
+    };
+
+    return window;
+}
diff --git a/src/window_texture.c b/src/window_texture.c
index 7448323..ba7212a 100644
--- a/src/window_texture.c
+++ b/src/window_texture.c
@@ -16,6 +16,7 @@ int window_texture_init(WindowTexture *window_texture, Display *display, Window
     window_texture->display = display;
     window_texture->window = window;
     window_texture->pixmap = None;
+    window_texture->image = NULL;
     window_texture->texture_id = 0;
     window_texture->redirected = 0;
     window_texture->egl = egl;
@@ -34,6 +35,11 @@ static void window_texture_cleanup(WindowTexture *self, int delete_texture) {
         self->texture_id = 0;
     }
 
+    if(self->image) {
+        self->egl->eglDestroyImage(self->egl->egl_display, self->image);
+        self->image = NULL;
+    }
+
     if(self->pixmap) {
         XFreePixmap(self->display, self->pixmap);
         self->pixmap = None;
@@ -79,8 +85,6 @@ int window_texture_on_resize(WindowTexture *self) {
         texture_id = self->texture_id;
     }
 
-    self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
-    self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
     self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
     self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
 
@@ -101,14 +105,14 @@ int window_texture_on_resize(WindowTexture *self) {
 
     self->pixmap = pixmap;
     self->texture_id = texture_id;
+    self->image = image;
 
     cleanup:
     self->egl->glBindTexture(GL_TEXTURE_2D, 0);
 
-    if(image)
-        self->egl->eglDestroyImage(self->egl->egl_display, image);
-
     if(result != 0) {
+        if(image)
+            self->egl->eglDestroyImage(self->egl->egl_display, image);
         if(texture_id != 0)
             self->egl->glDeleteTextures(1, &texture_id);
         if(pixmap)
@@ -120,4 +124,4 @@ int window_texture_on_resize(WindowTexture *self) {
 
 unsigned int window_texture_get_opengl_texture_id(WindowTexture *self) {
     return self->texture_id;
-}
-\ No newline at end of file
+}
diff --git a/src/xnvctrl.c b/src/xnvctrl.c
index b738455..af46493 100644
--- a/src/xnvctrl.c
+++ b/src/xnvctrl.c
@@ -15,7 +15,7 @@ bool gsr_xnvctrl_load(gsr_xnvctrl *self, Display *display) {
         return false;
     }
 
-    dlsym_assign required_dlsym[] = {
+    const dlsym_assign required_dlsym[] = {
         { (void**)&self->XNVCTRLQueryExtension, "XNVCTRLQueryExtension" },
         { (void**)&self->XNVCTRLSetTargetAttributeAndGetStatus, "XNVCTRLSetTargetAttributeAndGetStatus" },
         { (void**)&self->XNVCTRLQueryValidTargetAttributeValues, "XNVCTRLQueryValidTargetAttributeValues" },
diff --git a/study/color_space_transform_matrix.png b/study/color_space_transform_matrix.png
new file mode 100644
index 0000000..2b7729e5
--- /dev/null
+++ b/study/color_space_transform_matrix.png
diff --git a/study/create_matrix.py b/study/create_matrix.py
new file mode 100755
index 0000000..1599a12
--- /dev/null
+++ b/study/create_matrix.py
@@ -0,0 +1,48 @@
+#!/usr/bin/env python3
+
+import sys
+
+def usage():
+    print("usage: Kr Kg Kb full|limited")
+    print("examples:")
+    print("  create_matrix.py 0.2126 0.7152 0.0722 full")
+    print("  create_matrix.py 0.2126 0.7152 0.0722 limited")
+    sys.exit(1)
+
+def a(v):
+    if v >= 0:
+        return " %f" % v
+    else:
+        return "%f" % v
+
+def main(argv):
+    if len(argv) != 5:
+        usage()
+
+    Kr = float(argv[1])
+    Kg = float(argv[2])
+    Kb = float(argv[3])
+    color_range = argv[4]
+    luma_offset = 0.0
+    transform_range = 1.0
+
+    if color_range == "limited":
+        transform_range = (235.0 - 16.0) / 255.0
+        luma_offset = 16.0 / 255.0
+    elif color_range != "full":
+        usage()
+
+    matrix = [
+        [Kr,                        Kg,                        Kb],
+        [-0.5 * (Kr / (1.0 - Kb)), -0.5 * (Kg / (1.0 - Kb)),  0.5],
+        [0.5,                      -0.5 * (Kg / (1.0 - Kr)), -0.5 * (Kb / (1.0 - Kr))],
+        [0.0,                       0.5,                      0.5]
+    ]
+
+    # Transform from row major to column major for glsl
+    print("const mat4 RGBtoYUV = mat4(%f, %s, %s, %f,"  % (matrix[0][0] * transform_range, a(matrix[1][0] * transform_range), a(matrix[2][0] * transform_range), 0.0))
+    print("                           %f, %s, %s, %f,"  % (matrix[0][1] * transform_range, a(matrix[1][1] * transform_range), a(matrix[2][1] * transform_range), 0.0))
+    print("                           %f, %s, %s, %f,"  % (matrix[0][2] * transform_range, a(matrix[1][2] * transform_range), a(matrix[2][2] * transform_range), 0.0))
+    print("                           %f, %s, %s, %f);" % (matrix[3][0] + luma_offset,     a(matrix[3][1]),                   a(matrix[3][2]),                   1.0))
+
+main(sys.argv)
diff --git a/uninstall.sh b/uninstall.sh
index 9457d1f..b8aac26 100755
--- a/uninstall.sh
+++ b/uninstall.sh
@@ -1,9 +1,10 @@
-#!/bin/sh
+#!/bin/sh -e
+
+script_dir=$(dirname "$0")
+cd "$script_dir"
 
 [ $(id -u) -ne 0 ] && echo "You need root privileges to run the uninstall script" && exit 1
 
-rm -f "/usr/bin/gsr-kms-server"
-rm -f "/usr/bin/gpu-screen-recorder"
-rm -f "/usr/lib/systemd/user/gpu-screen-recorder.service"
+ninja -C build uninstall
 
-echo "Successfully uninstalled gpu-screen-recorder"
-\ No newline at end of file
+echo "Successfully uninstalled gpu-screen-recorder"