-rw-r--r-- include/cuda.h (renamed from include/CudaLibrary.hpp) 89
-rw-r--r-- src/library_loader.c (renamed from include/LibraryLoader.hpp) 16
-rw-r--r-- study/color_space_transform_matrix.png bin 0 -> 7166 bytes
79 files changed, 9448 insertions, 2014 deletions
diff --git a/.clang-format b/.clang-format
deleted file mode 100644
index 80d3293..0000000
--- a/.clang-format
+++ /dev/null
@@ -1,2 +0,0 @@
-BasedOnStyle: LLVM
-IndentWidth: 4
diff --git a/.gitignore b/.gitignore
index 24fee1f..bc99e58 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,11 +4,22 @@ compile_commands.json
diff --git a/README.md b/README.md
index fcd5898..52cb1ee 100644
--- a/README.md
+++ b/README.md
@@ -1,46 +1,134 @@
# GPU Screen Recorder
-This is a screen recorder that has minimal impact on system performance by recording a window using the GPU only,
+This is a screen recorder that has minimal impact on system performance by recording your monitor using the GPU only,
similar to shadowplay on windows. This is the fastest screen recording tool for Linux.
This screen recorder can be used for recording your desktop offline, for live streaming and for nvidia shadowplay-like instant replay,
-where only the last few seconds are saved.
+where only the last few minutes are saved.
-## Note
-This software works only on x11.\
-Recording a window doesn't work when using picom in glx mode. However it works in xrender mode or when recording the a monitor/screen (which uses NvFBC).\
-If you are using a variable refresh rate monitor, then choose to record "screen-direct". This will allow variable refresh rate to work when recording fullscreen applications. Note that some applications such as mpv will not work in fullscreen mode. A fix is being developed for this.\
-For screen capture to work with PRIME (laptops with a nvidia gpu), you must set the primary GPU to use your dedicated nvidia graphics card. You can do this by selecting "NVIDIA (Performance Mode) in nvidia settings:\
-and then rebooting your laptop.
-screen-direct capture has been temporary disabled as it causes issues with stuttering. This might be a nvfbc bug.
+Supported video codecs:
+* H264 (default)
+* AV1 (not currently supported on NVIDIA if you use GPU Screen Recorder flatpak)
+Supported audio codecs:
+* Opus (default)
+* AAC
+## Note
+This software works with X11 and Wayland, but on Wayland only monitors can be recorded.
+1) screen-direct capture has been temporarily disabled as it causes issues with stuttering. This might be a nvfbc bug.
+2) Videos are in variable framerate format. Use MPV to play such videos, otherwise you might experience stuttering in the video if you are using a buggy video player. You can try saving the video into a .mkv file instead as some software may have better support for .mkv files (such as kdenlive). You can use the "-fm cfr" option to use constant framerate mode.
+3) HDR capture is supported (on wayland), but all GPU drivers have bugs that ignore HDR metadata so the HDR metadata will be missing in the video file. I will eventually patch the video file to work around these GPU driver issues.
+4) FLAC audio codec is disabled at the moment because of temporary issues.
+### AMD/Intel/Wayland root permission
+When recording a window under AMD/Intel no special user permission is required, however when recording a monitor (or when using wayland) the program needs root permission (to access KMS).\
+To make this safer, the part that needs root access has been moved to its own executable (to make it as small as possible).\
+For you as a user this only means that if you installed GPU Screen Recorder as a flatpak then a prompt asking for root password will show up when you start recording.
# Performance
+On a system with an i5 4690k CPU and a GTX 1080 GPU:\
When recording Legend of Zelda Breath of the Wild at 4k, fps drops from 30 to 7 when using OBS Studio + nvenc, however when using this screen recorder the fps remains at 30.\
-When recording GTA V at 4k on highest settings, fps drops from 60 to 23 when using obs-nvfbc + nvenc, however when using this screen recorder the fps only drops to 55. The quality is also much better when using gpu-screen-recorder.\
-It is recommended to save the video to a SSD because of the large file size, which a slow HDD might not be fast enough to handle.\
-Using NvFBC (recording the monitor/screen) is not faster than not using NvFBC (recording a single window) with gpu screen recorder, in fact it might be a tiny bit slower.
+When recording GTA V at 4k on highest settings, fps drops from 60 to 23 when using obs-nvfbc + nvenc, however when using this screen recorder the fps only drops to 58. The quality is also much better when using gpu screen recorder.\
+GPU Screen Recorder also produces much smoother videos than OBS when GPU utilization is close to 100%, see comparison here: [https://www.youtube.com/watch?v=zfj4sNVLLLg](https://www.youtube.com/watch?v=zfj4sNVLLLg).\
+It is recommended to save the video to an SSD because of the large file size, which a slow HDD might not be fast enough to handle. Using variable framerate mode (-fm vfr), which is the default, is also recommended as this reduces encoding load. Ultra quality is also overkill most of the time; very high (the default) or lower quality is usually enough.
+## Note about optimal performance on NVIDIA
+The NVIDIA driver has a "feature" (read: bug) where it will downclock memory transfer rate when a program uses cuda (or nvenc, which uses cuda), such as GPU Screen Recorder. To work around this bug, GPU Screen Recorder can overclock your GPU memory transfer rate to its normal optimal level.\
+To enable overclocking for optimal performance use the `-oc` option when running GPU Screen Recorder. You also need to have "Coolbits" NVIDIA X setting set to "12" to enable overclocking. You can automatically add this option if you run `sudo nvidia-xconfig --cool-bits=12` and then reboot your computer.\
+Note that this only works when Xorg server is running as root, and using this option will only give you a performance boost if the game you are recording is bottlenecked by your GPU.\
+Note! use at your own risk!
+Variable refresh rate should work fine on AMD/Intel X11 or Wayland. On Nvidia X11, G-SYNC only works with the -w screen-direct-force option, but because of bugs in the Nvidia driver this option is not always recommended.
+For example it can cause your computer to freeze when recording certain games.
# Installation
If you are running an Arch Linux based distro, then you can find gpu screen recorder on aur under the name gpu-screen-recorder-git (`yay -S gpu-screen-recorder-git`).\
-If you are running an Ubuntu based distro then run `install_ubuntu.sh` as root: `sudo ./install_ubuntu.sh`. You also need to install the `libnvidia-compute` version that fits your nvidia driver to install libcuda.so to run gpu-screen-recorder and `libnvidia-fbc.so.1` when using nvfbc. But it's recommended that you use the flatpak version of gpu-screen-recorder if you use an older version of ubuntu as the ffmpeg version will be old and wont support the best quality options.\
-If you are running another distro then you can run `install.sh` as root: `sudo ./install.sh`, but you need to manually install the dependencies, as described below.\
-You can also install gpu screen recorder ([the gtk gui version](https://git.dec05eba.com/gpu-screen-recorder-gtk/)) from [flathub](https://flathub.org/apps/details/com.dec05eba.gpu_screen_recorder).
+If you are running another distro then you can run `sudo ./install.sh`, but you need to manually install the dependencies, as described below.\
+You can also install gpu screen recorder ([the gtk gui version](https://git.dec05eba.com/gpu-screen-recorder-gtk/)) from [flathub](https://flathub.org/apps/details/com.dec05eba.gpu_screen_recorder), which is the easiest method
+to install GPU Screen Recorder on non-arch based distros.\
+The only official ways to install GPU Screen Recorder are from source, AUR or flathub. If you install GPU Screen Recorder from somewhere else and have an issue then try installing it
+from one of the official sources before reporting it as an issue.
+If you install the GPU Screen Recorder flatpak, which is the gtk gui version, then you can still run GPU Screen Recorder from the command line by using the flatpak command option, for example `flatpak run --command=gpu-screen-recorder com.dec05eba.gpu_screen_recorder -w screen -f 60 -o video.mp4`. Note that if you want to record your monitor on AMD/Intel then you need to install the flatpak system-wide (like so: `flatpak install flathub --system com.dec05eba.gpu_screen_recorder`).
# Dependencies
-`libgl (libglvnd), ffmpeg, libx11, libxcomposite, libpulse`. You need to additionally have `libcuda.so` installed when you run `gpu-screen-recorder` and `libnvidia-fbc.so.1` when using nvfbc.\
-Recording monitors requires a gpu with NvFBC support (note: this is not required when recording a single window!). Normally only tesla and quadro gpus support this, but by using [nvidia-patch](https://github.com/keylase/nvidia-patch) or [nvlax](https://github.com/illnyang/nvlax) you can do this on all gpus that support nvenc as well (gpus as old as the nvidia 600 series), provided you are not using outdated gpu drivers.
+## AMD
+libglvnd (which provides libgl and libegl)\
+ffmpeg (libavcodec, libavformat, libavutil, libswresample, libavfilter)\
+x11 (libx11, libxcomposite, libxrandr, libxfixes, libxdamage, libxi)\
+vaapi (libva, libva-mesa-driver)\
+## Intel
+libglvnd (which provides libgl and libegl)\
+ffmpeg (libavcodec, libavformat, libavutil, libswresample, libavfilter)\
+x11 (libx11, libxcomposite, libxrandr, libxfixes, libxdamage, libxi)\
+vaapi (libva, intel-media-driver/libva-intel-driver)\
+## NVIDIA
+libglvnd (which provides libgl and libegl)\
+ffmpeg (libavcodec, libavformat, libavutil, libswresample, libavfilter)\
+x11 (libx11, libxcomposite, libxrandr, libxfixes, libxdamage, libxi)\
+cuda runtime (libcuda.so.1) (libnvidia-compute)\
+nvenc (libnvidia-encode)\
+nvfbc (libnvidia-fbc1, when recording the screen on x11)\
+xnvctrl (libxnvctrl0, when using the `-oc` option)
# How to use
-Run `scripts/interactive.sh` or run gpu-screen-recorder directly, for example: `gpu-screen-recorder -w $(xdotool selectwindow) -c mp4 -f 60 -a "$(pactl get-default-sink).monitor" -o test_video.mp4`\
-Then stop the screen recorder with Ctrl+C, which will also save the recording.\
-Send signal SIGUSR1 (`killall -SIGUSR1 gpu-screen-recorder`) to gpu-screen-recorder when in replay mode to save the replay. The paths to the saved files is output to stdout after the recording is saved.\
-You can find the default output audio device (headset, speakers (in other words, desktop audio)) with the command `pactl get-default-sink`. Add `monitor` to the end of that to use that as an audio input in gpu-screen-recorder.\
-You can find the default input audio device (microphone) with the command `pactl get-default-source`. This input should not have `monitor` added to the end when used in gpu-screen-recorder.\
-Example of recording both desktop audio and microphone: `gpu-screen-recorder -w $(xdotool selectwindow) -c mp4 -f 60 -a "$(pactl get-default-sink).monitor" -a "$(pactl get-default-source)" -o test_video.mp4`.\
-Note that if you use multiple audio inputs then they are each recorded into separate audio tracks in the video file. There is currently no option to merge audio tracks, but it's a planned feature.
+Run `gpu-screen-recorder --help` to see all options.
+## Recording
+Here is an example of how to record all monitors and the default audio output: `gpu-screen-recorder -w screen -f 60 -a "$(pactl get-default-sink).monitor" -o ~/Videos/test_video.mp4`. Stop the screen recorder with `Ctrl+C`, which will also save the recording. You can record a single monitor if you change `-w screen` to the name of a monitor, which you can find by running `xrandr`. An example of a monitor name is HDMI-1.
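As a small sketch of the monitor lookup described above, the function below filters `xrandr` output down to the names of connected monitors. The name `list_connected_monitors` is made up for this example and is not part of GPU Screen Recorder; it only assumes the usual `xrandr` layout in which each output line starts with `<name> connected` or `<name> disconnected`.

```shell
#!/bin/sh
# Hypothetical helper (not part of GPU Screen Recorder):
# reads `xrandr` output on stdin and prints the names of connected monitors.
list_connected_monitors() {
    # Output lines look like "HDMI-1 connected primary 1920x1080+0+0 ...",
    # so the second field tells us whether the output is connected.
    awk '$2 == "connected" { print $1 }'
}

# Usage (requires a running X11 session):
#   xrandr | list_connected_monitors
```

Any name it prints, for example `HDMI-1`, can then be passed to `-w` in place of `screen`.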
+## Streaming
+Streaming works the same as recording, but the `-o` argument should be the path to the live streaming service you want to use (including your live streaming key). Take a look at scripts/twitch-stream.sh to see an example of how to stream to twitch.
+## Replay mode
+Run `gpu-screen-recorder` with the `-c mp4` and `-r` option, for example: `gpu-screen-recorder -w screen -f 60 -r 30 -c mp4 -o ~/Videos`. Note that in this case, `-o` should point to a directory.\
+If `-mf yes` is set, replays are saved in folders based on the date.
+To save a video in replay mode, you need to send signal SIGUSR1 to gpu screen recorder. You can do this by running `killall -SIGUSR1 gpu-screen-recorder`.\
+To stop recording send SIGINT to gpu screen recorder. You can do this by running `killall -SIGINT gpu-screen-recorder` or pressing `Ctrl-C` in the terminal that runs gpu screen recorder.\
+To pause/unpause recording send SIGUSR2 to gpu screen recorder. You can do this by running `killall -SIGUSR2 gpu-screen-recorder`. This is only applicable and useful when recording (not streaming nor replay).\
+The file path to the saved replay is output to stdout. All other output from GPU Screen Recorder is output to stderr.\
+The replay buffer is stored in ram (as encoded video), so don't use too long a replay time and/or too high a video quality unless you have enough ram to store it.
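The signals described above can be wrapped in a small script. This is only a sketch: the function names `gsr_signal_for` and `gsr_send` are hypothetical and not shipped with GPU Screen Recorder; the signal meanings are the ones listed in this section.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with GPU Screen Recorder) around the
# signals described in this section.

# Map an action name to the signal gpu-screen-recorder reacts to.
gsr_signal_for() {
    case "$1" in
        save)  echo SIGUSR1 ;;  # save the replay (replay mode)
        pause) echo SIGUSR2 ;;  # pause/unpause (regular recording only)
        stop)  echo SIGINT ;;   # stop recording
        *)     return 1 ;;      # unknown action
    esac
}

# Send the signal for the given action to all running instances.
gsr_send() {
    sig="$(gsr_signal_for "$1")" || { echo "unknown action: $1" >&2; return 1; }
    killall "-$sig" gpu-screen-recorder
}

# Usage:
#   gsr_send save    # same as: killall -SIGUSR1 gpu-screen-recorder
```

Binding `gsr_send save` to a hotkey gives a similar result to running `killall -SIGUSR1 gpu-screen-recorder` by hand.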
+## Finding audio device name
+You can find the default output audio device (headset, speakers (in other words, desktop audio)) with the command `pactl get-default-sink`. Add `monitor` to the end of that to use that as an audio input in gpu screen recorder.\
+You can find the default input audio device (microphone) with the command `pactl get-default-source`. This input should not have `monitor` added to the end when used in gpu screen recorder.\
+Example of recording both desktop audio and microphone: `gpu-screen-recorder -w screen -f 60 -a "$(pactl get-default-sink).monitor" -a "$(pactl get-default-source)" -o ~/Videos/test_video.mp4`.\
+A name (that is visible to pipewire) can be given to an audio input device by prefixing the audio input with `<name>/`, for example `dummy/$(pactl get-default-sink).monitor`.\
+Note that if you use multiple audio inputs then they are each recorded into separate audio tracks in the video file. If you want to merge multiple audio inputs into one audio track then separate the audio inputs by "|" in one -a argument,
+for example `-a "$(pactl get-default-sink).monitor|$(pactl get-default-source)"`.
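When more than two inputs should be merged, building the `|`-separated value by hand gets error-prone. The helper below is a minimal sketch of one way to do it; `merge_audio_inputs` is a hypothetical name, not an option or script provided by GPU Screen Recorder.

```shell
#!/bin/sh
# Hypothetical helper: join audio input names with "|" so that they end up
# merged into a single audio track when passed to one -a argument.
merge_audio_inputs() {
    # "$*" joins all arguments with the first character of IFS.
    # The subshell keeps the IFS change from leaking into the caller.
    (IFS='|'; printf '%s\n' "$*")
}

# Usage:
#   gpu-screen-recorder -w screen -f 60 \
#     -a "$(merge_audio_inputs "$(pactl get-default-sink).monitor" "$(pactl get-default-source)")" \
#     -o ~/Videos/test_video.mp4
```

Keep the result quoted when passing it to `-a`, otherwise the shell treats `|` as a pipe.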
+There is also a gui for the gpu screen recorder called [gpu-screen-recorder-gtk](https://git.dec05eba.com/gpu-screen-recorder-gtk/).
+## Simple way to run replay without gui
+Run the script `scripts/start-replay.sh` to start replay and then `scripts/save-replay.sh` to save a replay and `scripts/stop-replay.sh` to stop the replay. The videos are saved to `$HOME/Videos`.
+You can use these scripts to start replay at system startup if you add `scripts/start-replay.sh` to startup (this can be done differently depending on your desktop environment / window manager) and then go into
+hotkey settings on your system and choose a hotkey to run the script `scripts/save-replay.sh`. Modify `scripts/start-replay.sh` if you want to use other replay options.
+## Run replay on system startup
+If you installed GPU Screen Recorder from AUR or from source and you are running a distro that uses systemd then you will have a systemd service installed that can be started with `systemctl enable --now --user gpu-screen-recorder`. This systemd service runs GPU Screen Recorder on system startup.\
+It's configured with `$HOME/.config/gpu-screen-recorder.env` (create it if it doesn't exist). You can look at [extra/gpu-screen-recorder.env](https://git.dec05eba.com/gpu-screen-recorder/plain/extra/gpu-screen-recorder.env) to see an example.
+You can see which variables you can use in the `gpu-screen-recorder.env` file by looking at the `extra/gpu-screen-recorder.service` file. Note that all of the variables are optional; you only have to set the ones that you are interested in.
+You can use the `scripts/save-replay.sh` script to save a replay and by default the systemd service saves videos in `$HOME/Videos`.\
+If you are using a NVIDIA GPU then it's recommended to set PreserveVideoMemoryAllocations=1 as mentioned in the section below.
+## Examples
+Look at the [scripts](https://git.dec05eba.com/gpu-screen-recorder/tree/scripts) directory for script examples, for example if you want to automatically save a recording/replay into a folder with the same name as the game you are recording.
+# Issues
+Nvidia drivers have an issue where CUDA breaks if CUDA is running when suspend/hibernation happens, and it remains broken until you reload the nvidia driver. To fix this, either disable suspend or tell the NVIDIA driver to preserve video memory on suspend/hibernate by using the `NVreg_PreserveVideoMemoryAllocations=1` option. You can run `sudo extra/install_preserve_video_memory.sh` to automatically add that option to your system.
-There is also a gui for the gpu-screen-recorder called [gpu-screen-recorder-gtk](https://git.dec05eba.com/gpu-screen-recorder-gtk/).
+# Reporting bugs
+Issues are reported on this GitHub page: [https://github.com/dec05eba/gpu-screen-recorder-issues/issues](https://github.com/dec05eba/gpu-screen-recorder-issues/issues)
+# Contributing patches
+See [https://git.dec05eba.com/?p=about](https://git.dec05eba.com/?p=about)
# Demo
[![Click here to watch a demo video on youtube](https://img.youtube.com/vi/n5tm0g01n6A/0.jpg)](https://www.youtube.com/watch?v=n5tm0g01n6A)
@@ -48,17 +136,41 @@ There is also a gui for the gpu-screen-recorder called [gpu-screen-recorder-gtk]
## How is this different from using OBS with nvenc?
OBS only uses the gpu for video encoding, but the window image that is encoded is copied from the GPU to the CPU and then back to the GPU (video encoding unit). These operations are very slow and causes all of the fps drops when using OBS. OBS only uses the GPU efficiently on Windows 10 and Nvidia.\
-This gpu-screen-recorder keeps the window image on the GPU and sends it directly to the video encoding unit on the GPU by using CUDA. This means that CPU usage remains at around 0% when using this screen recorder.
+This gpu screen recorder keeps the window image on the GPU and sends it directly to the video encoding unit on the GPU by using CUDA. This means that CPU usage remains at around 0% when using this screen recorder.
## How is this different from using OBS NvFBC plugin?
The plugin does everything on the GPU and gives the texture to OBS, but OBS does not know how to use the texture directly on the GPU so it copies the texture to the CPU and then back to the GPU (video encoding unit). These operations are very slow and cause a lot of fps drops unless you have a fast CPU. This is especially noticeable when recording at higher resolutions than 1080p.
## How is this different from using FFMPEG with x11grab and nvenc?
FFMPEG only uses the GPU with CUDA when doing transcoding from an input video to an output video, and not when recording the screen when using x11grab. So FFMPEG has the same fps drop issues that OBS has.
+## It tells me that my AMD/Intel GPU is not supported or that my GPU doesn't support h264/hevc, but that's not true!
+Some linux distros (such as manjaro and fedora) disable hardware accelerated h264/hevc on AMD/Intel because of "patent license issues". If you are using an arch-based distro then you can install mesa-git instead of mesa and if you are using another distro then you may have to switch to a better distro. On fedora based distros you can follow this: [Hardware Accelerated Codec](https://rpmfusion.org/Howto/Multimedia).\
+If you installed GPU Screen Recorder flatpak then you can try installing mesa-extra freedesktop runtime by running this command: `flatpak install --system org.freedesktop.Platform.GL.default//23.08-extra`
+## I have an old nvidia GPU that supports nvenc but I get a cuda error when trying to record
+Newer ffmpeg versions don't support older nvidia cards. Try installing GPU Screen Recorder flatpak from [flathub](https://flathub.org/apps/details/com.dec05eba.gpu_screen_recorder) instead. It comes with an older ffmpeg version which might work for your GPU.
+## I get a black screen/glitches while live streaming
+It seems like ffmpeg earlier than version 6.1 has some type of bug. Install ffmpeg 6.1 and then reinstall GPU Screen Recorder to fix this issue. The flatpak version of GPU Screen Recorder comes with ffmpeg 6.1 so no extra steps are needed.
+## I can't play the video in my browser directly or in discord
+Browsers and discord don't support hevc video codec at the moment. Choose h264 video codec instead with the -k h264 option.
+Note that websites such as youtube support hevc so there is no need to choose h264 video codec if you intend to upload the video to youtube or if you want to play the video locally or if you intend to
+edit the video with a video editor. Hevc allows for better video quality (especially at lower file sizes) so hevc (or av1) is recommended for source videos.
+## I get a black bar/distorted colors on the right/bottom in the video
+This is mostly an issue on AMD. For av1 it's a hardware issue, see: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9185. For hevc it's a software issue that has been fixed but not released yet, see: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10985.
+If you get this issue then a workaround is to record with h264 video codec instead (using the -k h264 option).
+## The video is glitched, looks like checkerboard pattern
+This is an issue on some intel integrated gpus on wayland caused by a power saving option. Right now the only way to fix this is to record on X11 instead.
+## The video doesn't display or has a green/yellow overlay
+This can happen if your video player is missing the H264/HEVC video codecs. Either install the codecs or use mpv.
+## I get stutter in the video
+Try recording to an SSD and make sure it's not using NTFS file system. Also record in variable framerate format.
+## I get a black screen when recording
+This can happen if you use software such as prime-run to run GPU Screen Recorder. Such software should not be used to run GPU Screen Recorder.
+To work, GPU Screen Recorder needs to run on the same GPU that you use to display your monitor's graphics.
+# Donations
+If you want to donate you can donate via bitcoin or monero.
+* Bitcoin: bc1qqvuqnwrdyppf707ge27fqz2n9y9gu7lf5ypyuf
+* Monero: 4An9kp2qW1C9Gah7ewv4JzcNFQ5TAX7ineGCqXWK6vQnhsGGcRpNgcn8r9EC3tMcgY7vqCKs3nSRXhejMHBaGvFdN2egYet
-* Support AMD and Intel, using VAAPI. Currently there are a lot of driver bugs with both AMD and Intel that causes video encoding to either fail, performance issues or causes the entire driver to crash.
-libraries at compile-time.
-* Clean up the code!
* Dynamically change bitrate/resolution to match desired fps. This would be helpful when streaming for example, where the encode output speed also depends on upload speed to the streaming service.
-* Show cursor when recording. Currently the cursor is not visible when recording a window or screen-direct.
-* Implement opengl injection to capture texture. This fixes composition issues and (VRR) without having to use NvFBC direct capture.
+* Implement opengl injection to capture texture. This fixes VRR without having to use NvFBC direct capture.
* Always use direct capture with NvFBC once the capture issue in mpv fullscreen has been resolved (maybe detect if direct capture fails in nvfbc and switch to non-direct recording. NvFBC says if direct capture fails).
diff --git a/TODO b/TODO
index ad1b158..39c8c54 100644
--- a/TODO
+++ b/TODO
@@ -1,16 +1,140 @@
Check for reparent.
-Only add window to list if its the window is a topmost window.
-Track window damages and only update then. That is better for output file size.
-Getting the texture of a window when using a compositor is an nvidia specific limitation. When gpu-screen-recorder supports other gpus then this can be ignored.
Quickly changing workspace and back while recording under i3 breaks the screen recorder. i3 probably unmaps windows in other workspaces.
See https://trac.ffmpeg.org/wiki/EncodingForStreamingSites for optimizing streaming.
-Add option to merge audio tracks into one (muxing?) by adding multiple audio streams in one -a arg separated by comma.
Look at VK_EXT_external_memory_dma_buf.
Allow setting a different output resolution than the input resolution.
Use mov+faststart.
Allow recording all monitors/selected monitor without nvfbc by recording the compositor proxy window and only recording the part that matches the monitor(s).
Allow recording a region by recording the compositor proxy window / nvfbc window and copying part of it.
-Resizing the target window to be smaller than the initial size is buggy. The window texture ends up duplicated in the video.
-Handle frames (especially for applications with rounded client-side decorations, such as gnome applications. They are huge).
Use nvenc directly, which allows removing the use of cuda.
-Fallback to nvfbc and window tracking if window capture fails.
\ No newline at end of file
+Handle xrandr monitor change in nvfbc.
+Implement follow focused in drm.
+Support amf and qsv.
+Disable flipping on nvidia? this might fix some stuttering issues on some setups. See NvCtrlGetAttribute/NvCtrlSetAttributeAndGetStatus NV_CTRL_SYNC_TO_VBLANK https://github.com/NVIDIA/nvidia-settings/blob/d5f022976368cbceb2f20b838ddb0bf992f0cfb9/src/gtk%2B-2.x/ctkopengl.c.
+Replays seem to have some issues with audio/video. Why?
+Cleanup unused gl/egl functions, macro, etc.
+Add option to disable overlapping of replays (the old behavior kinda. Remove the whole replay buffer data after saving when doing this).
+Set audio track name to audio device name (if not merge of multiple audio devices).
+Add support for webcam, but only really for amd/intel because amd/intel can get drm fd access to webcam, nvidia cant. This allows us to create an opengl texture directly from the webcam fd for optimal performance.
+Reverse engineer nvapi so we can disable "force p2 state" on linux too (nvapi profile api with the settings id 0x50166c5e).
+Support yuv444p on amd/intel.
+fix yuv444 for hevc.
+Do not allow streaming if yuv444.
+Re-enable yuv444.
+Support 10 bit output because of better gradients. May even be smaller file size. Better supported on hevc (not supported at all on h264 on my gpu).
+Add nvidia/(amd/intel) specific install script for ubuntu. User should run install_ubuntu.sh but it should run different install dep script depending on if /proc/driver/nvidia/version exists or not. But what about switchable graphics setup?
+Test different combinations of switchable graphics. Intel hybrid mode (running intel but possible to run specific applications with prime-run), running pure intel. Detect switchable graphics.
+The video output will be black if the system is suspended on nvidia and NVreg_PreserveVideoMemoryAllocations is not set to 1. This happens because I think that the driver invalidates textures/cuda buffers? To fix this we could try and recreate gsr capture when gsr_capture_capture fails (with timeout to retry again).
+Restore nvfbc screen recording on monitor reconfiguration.
+Window capture doesn't work properly in _control_ game after going from pause menu to in-game (and back to pause menu). There might be some x11 event we need to catch. Same for vr-video-player.
+Monitor capture on steam deck is slightly below the game fps, but only when capturing on the steam deck screen. If capturing on another monitor, there is no issue.
+ Is this related to the dma buf rotation issue? different modifier being slow? does this always happen?
+Fallback to vaapi copy in kms if opengl version fails. This can happen on steam deck for some reason (driver bug?). Also vaapi copy uses less gpu since it uses video codec unit to copy.
+Test if vaapi copy version uses less memory than opengl version.
+Intel is a bit weird with monitor capture and multiple monitors. If one of the monitors is rotated then all the kms will be rotated as well.
+ Is that only the case when the primary monitor is rotated? Also the primary monitor becomes position 0, 0 so crtc (x11 randr) position doesn't match the drm pos. Maybe get monitor position and size from drm instead.
+ How about if multiple monitors are rotated?
+Support vp8/vp9. This is especially important on amd, where on some distros (such as Manjaro) hardware accelerated h264/hevc is disabled in the mesa package.
+Support screen (all monitors) capture on amd/intel and nvidia wayland when no combined plane is found. Right now screen just takes the first output.
+Use separate plane (which has offset and pitch) from combined plane instead of the combined plane.
+Both twitch and youtube support variable bitrate but twitch recommends constant bitrate to reduce stream buffering/dropped frames when going from low motion to high motion: https://help.twitch.tv/s/article/broadcasting-guidelines?language=en_US. Info for youtube: https://support.google.com/youtube/answer/2853702?hl=en#zippy=%2Cvariable-bitrate-with-custom-stream-keys-in-live-control-room%2Ck-p-fps%2Cp-fps.
+Limit fps recording with x damage. This is good when running replay mode 24/7 and being afk or when not much is happening on the screen.
+On nvidia some games apparently appear to stutter (without dropping fps) when recording a monitor, but not
+ when using direct screen capture. Observed in Deus Ex and Apex Legends.
+Support "screen" (all monitors) capture on wayland. This should be done by getting all drm fds and multiple EGL_DMA_BUF_PLANEX_FD_EXT to create one egl image with all fds combined.
+Support pipewire screen capture?
+CPU usage is pretty high on AMD/Intel/(Nvidia(wayland)), why? opening and closing fds, creating egl, cuda association, is slow when done every frame. Test if desktop portal screencast has better performance.
+Capture is broken on amd on wlroots. It's disabled at the moment and instead uses kms capture. Find out why we get a black screen in wlroots.
+Support vulkan video encoding. That might work around the forced p2 state nvidia driver "bug". Ffmpeg supports vulkan video encoding if it's built with --enable-vulkan.
+It may be possible to improve color conversion rgb->yuv shader for color edges by biasing colors to an edge, instead of letting color overlaying with bilinear filtering handle it.
+When webcam is supported mention that nvidia_drm.modeset=1 must be set on nvidia x11 (it's required on wayland so it's not needed there. Or does eglstream work without it??). Check if this really is the case.
+ Support green screen removal, cropping, shader effects in general (circle mask, rounded corners, etc).
+Preset is set to p5 for now but it should ideally be p6 or p7.
+ This change is needed because for certain sizes of a window (or monitor?) such as 971x780 causes encoding to freeze
+ when using h264 codec. This is a new(?) nvidia driver bug.
+ Maybe dont choose p6 or p7 again? it causes micro stutter for some users (?).
+For low latency, see https://developer.download.nvidia.com/compute/nvenc/v4.0/NVENC_VideoEncoder_API_ProgGuide.pdf (section 7.1).
+Remove follow focused option.
+Exit if X11/Wayland killed (if drm plane dead or something?)
+Use SRC_W and SRC_H for screen plane instead of crtc_w and crtc_h.
+Make it possible to select which /dev/dri/card* to use, but that requires opengl to also use the same card. Not sure if that is possible for amd, intel and nvidia without using vulkan instead.
+Support intel display framebuffer compression (I915_FORMAT_MOD_Y_TILED_CCS modifier) (and other power saving modifiers, see https://trac.ffmpeg.org/ticket/8542). The only fix may be to use desktop portal for recording. This issue doesn't appear on x11 since these modifiers are not used by xorg server.
+This issue only appears on some intel iGPUs, such as Intel Iris Xe, see: https://github.com/dec05eba/gpu-screen-recorder-issues/issues/1.
+Intel dedicated GPU (intel arc a750) can have a similar issue, but it's not related to compression. In that case the modifier is I915_FORMAT_MOD_4_TILED.
+Test if p2 state can be worked around by using pure nvenc api and overwriting cuInit/cuCtxCreate* to not do anything. Cuda might be loaded when using nvenc but it might not be used, with certain record options? (such as h264 p5).
+ nvenc uses cuda when using b frames and rgb->yuv conversion, so convert the image ourselves instead.
+Mesa doesn't support global headers (AV_CODEC_FLAG_GLOBAL_HEADER) with h264... which also breaks mkv since mkv requires global header. Right now gpu screen recorder will forcefully set video codec to hevc when h264 is requested for mkv files.
+Drop frames if live streaming can't keep up with the target fps, or dynamically change resolution/quality.
+Support low power option (does it even work with vaapi in ffmpeg??). Would be very useful for steam deck.
+Instead of sending a big list of drm data back to kms client, send the monitor we want to record to kms server and the server should respond with only the matching monitor, and cursor.
+Tonemap hdr to sdr when hdr is enabled and when hevc_hdr/av1_hdr is not used.
+Add 10 bit record option, h264_10bit, hevc_10bit and av1_10bit.
+Rotate cursor texture properly (around top left origin).
+Setup hardware video context so we can query constraints and capabilities for better default and better error messages.
+Use CAP_SYS_NICE in flatpak too on the main gpu screen recorder binary. It makes recording smoother, especially with constant framerate.
+Show error when using compressed kms plane which isn't supported. Also do that in the gui.
+Modify ffmpeg to accept opengl texture for nvenc encoding. Removes extra buffers and copies.
+When vulkan encode is added, mention minimum nvidia driver required. (550.54.14?).
+Support drm plane rotation. Neither X11 nor any Wayland compositor currently rotates drm planes so this might not be needed.
+Investigate if there is a way to do gpu->gpu copy directly without touching system ram to enable video encoding on a different gpu. On nvidia this is possible with cudaMemcpyPeer, but how about from an intel/amd gpu to an nvidia gpu or the other way around or any combination of iGPU and dedicated GPU?
+ Maybe something with clEnqueueMigrateMemObjects? on AMD something with DirectGMA maybe?
+Go back to using pure vaapi without opengl for video encoding? Rotation (transpose) can be done if it's done after (rgb to yuv) color conversion.
+Implement scaling and use lanczos resampling for better quality. Lanczos resampling can also be used for YUV chroma for better color quality on small text.
+Try fixing HDR by passing HDR10+ data as well, and in the packet. Run "ffprobe -loglevel quiet -read_intervals "%+#2" -select_streams v:0 -show_entries side_data video.mp4" to test if the file has correct metadata.
+Flac is disabled because the frame sizes are too large which causes big audio/video desync.
+Add 10-bit capture option. This is good because it reduces banding and improves quality in very dark areas while reducing the file size compared to doing the same thing with 8 bits.
+Enable b-frames.
+Support vfr matching games exact fps all the time. On x11 use damage tracking, on wayland? maybe there is drm plane damage tracking. But that may not be accurate as the compositor may update it every monitor hz anyways. On wayland maybe only support it for desktop portal + pipewire capture.
+Support selecting which gpu to use. This can be done in egl with eglQueryDevicesEXT and then eglGetPlatformDisplayEXT. This will automatically work on AMD and Intel as vaapi uses the same device. On nvidia we need to use eglQueryDeviceAttribEXT with EGL_CUDA_DEVICE_NV.
+ Maybe on glx (nvidia x11 nvfbc) we need to use __NV_PRIME_RENDER_OFFLOAD_PROVIDER and __GLX_VENDOR_LIBRARY_NAME instead.
+Remove is_damaged and clear_damage and return a value from capture function instead that states if the image has been updated or not.
diff --git a/build.sh b/build.sh
deleted file mode 100755
index 8b92d92..0000000
--- a/build.sh
+++ /dev/null
@@ -1,9 +0,0 @@
-#!/bin/sh -e
-dependencies="libavcodec libavformat libavutil x11 xcomposite libpulse libswresample"
-includes="$(pkg-config --cflags $dependencies)"
-libs="$(pkg-config --libs $dependencies) -ldl -pthread -lm"
-g++ -c src/sound.cpp -O2 -g0 -DNDEBUG $includes
-g++ -c src/main.cpp -O2 -g0 -DNDEBUG $includes
-g++ -o gpu-screen-recorder -O2 sound.o main.o -s $libs
-echo "Successfully built gpu-screen-recorder" \ No newline at end of file
diff --git a/extra/gpu-screen-recorder.env b/extra/gpu-screen-recorder.env
new file mode 100644
index 0000000..ce9f223
--- /dev/null
+++ b/extra/gpu-screen-recorder.env
@@ -0,0 +1,11 @@
diff --git a/extra/gpu-screen-recorder.service b/extra/gpu-screen-recorder.service
new file mode 100644
index 0000000..6933f66
--- /dev/null
+++ b/extra/gpu-screen-recorder.service
@@ -0,0 +1,25 @@
+Description=GPU Screen Recorder Service
+ExecStart=/bin/sh -c 'AUDIO="${AUDIO_DEVICE:-$(pactl get-default-sink).monitor}"; gpu-screen-recorder -v no -w $WINDOW -c $CONTAINER -q $QUALITY -k $CODEC -ac $AUDIO_CODEC -a "$AUDIO" -a "$SECONDARY_AUDIO_DEVICE" -f $FRAMERATE -r $REPLAYDURATION -o "$OUTPUTDIR" -mf $MAKEFOLDERS $ADDITIONAL_ARGS -cr $COLOR_RANGE -keyint $KEYINT'
diff --git a/extra/gsr-nvidia.conf b/extra/gsr-nvidia.conf
new file mode 100644
index 0000000..10cbf7d
--- /dev/null
+++ b/extra/gsr-nvidia.conf
@@ -0,0 +1 @@
+options nvidia NVreg_PreserveVideoMemoryAllocations=1
diff --git a/extra/install_preserve_video_memory.sh b/extra/install_preserve_video_memory.sh
new file mode 100755
index 0000000..c5cf658
--- /dev/null
+++ b/extra/install_preserve_video_memory.sh
@@ -0,0 +1,8 @@
+script_dir=$(dirname "$0")
+cd "$script_dir"
+[ $(id -u) -ne 0 ] && echo "You need root privileges to run the install script" && exit 1
+install -Dm644 gsr-nvidia.conf /etc/modprobe.d/gsr-nvidia.conf
diff --git a/extra/meson_post_install.sh b/extra/meson_post_install.sh
new file mode 100755
index 0000000..f1f6a5a
--- /dev/null
+++ b/extra/meson_post_install.sh
@@ -0,0 +1,5 @@
+setcap cap_sys_admin+ep ${MESON_INSTALL_DESTDIR_PREFIX}/bin/gsr-kms-server \
+ || echo "\n!!! Please re-run install as root\n"
+setcap cap_sys_nice+ep ${MESON_INSTALL_DESTDIR_PREFIX}/bin/gpu-screen-recorder
diff --git a/include/GlLibrary.hpp b/include/GlLibrary.hpp
deleted file mode 100644
index 1337ef3..0000000
--- a/include/GlLibrary.hpp
+++ /dev/null
@@ -1,156 +0,0 @@
-#pragma once
-#include "LibraryLoader.hpp"
-#include <X11/X.h>
-#include <X11/Xutil.h>
-#include <dlfcn.h>
-#include <stdio.h>
-typedef XID GLXPixmap;
-typedef XID GLXDrawable;
-typedef XID GLXWindow;
-typedef struct __GLXcontextRec *GLXContext;
-typedef struct __GLXFBConfigRec *GLXFBConfig;
-#define GL_TEXTURE_2D 0x0DE1
-#define GL_RGB 0x1907
-#define GL_UNSIGNED_BYTE 0x1401
-#define GL_COLOR_BUFFER_BIT 0x00004000
-#define GL_TEXTURE_WRAP_S 0x2802
-#define GL_TEXTURE_WRAP_T 0x2803
-#define GL_TEXTURE_MAG_FILTER 0x2800
-#define GL_TEXTURE_MIN_FILTER 0x2801
-#define GL_TEXTURE_WIDTH 0x1000
-#define GL_TEXTURE_HEIGHT 0x1001
-#define GL_NEAREST 0x2600
-#define GL_RENDERER 0x1F01
-#define GLX_BUFFER_SIZE 2
-#define GLX_RED_SIZE 8
-#define GLX_GREEN_SIZE 9
-#define GLX_BLUE_SIZE 10
-#define GLX_ALPHA_SIZE 11
-#define GLX_DEPTH_SIZE 12
-#define GLX_RGBA_BIT 0x00000001
-#define GLX_RENDER_TYPE 0x8011
-#define GLX_FRONT_EXT 0x20DE
-#define GLX_DRAWABLE_TYPE 0x8010
-#define GLX_WINDOW_BIT 0x00000001
-#define GLX_PIXMAP_BIT 0x00000002
-#define GLX_TEXTURE_2D_BIT_EXT 0x00000002
-#define GLX_TEXTURE_2D_EXT 0x20DC
-#define GLX_CONTEXT_FLAGS_ARB 0x2094
-struct GlLibrary {
- GLXPixmap (*glXCreatePixmap)(Display *dpy, GLXFBConfig config, Pixmap pixmap, const int *attribList);
- void (*glXDestroyPixmap)(Display *dpy, GLXPixmap pixmap);
- void (*glXBindTexImageEXT)(Display *dpy, GLXDrawable drawable, int buffer, const int *attrib_list);
- void (*glXReleaseTexImageEXT)(Display *dpy, GLXDrawable drawable, int buffer);
- GLXFBConfig* (*glXChooseFBConfig)(Display *dpy, int screen, const int *attribList, int *nitems);
- XVisualInfo* (*glXGetVisualFromFBConfig)(Display *dpy, GLXFBConfig config);
- GLXContext (*glXCreateContextAttribsARB)(Display *dpy, GLXFBConfig config, GLXContext share_context, Bool direct, const int *attrib_list);
- Bool (*glXMakeContextCurrent)(Display *dpy, GLXDrawable draw, GLXDrawable read, GLXContext ctx);
- void (*glXDestroyContext)(Display *dpy, GLXContext ctx);
- void (*glXSwapBuffers)(Display *dpy, GLXDrawable drawable);
- void (*glXSwapIntervalEXT)(Display *dpy, GLXDrawable drawable, int interval);
- int (*glXSwapIntervalMESA)(unsigned int interval);
- int (*glXSwapIntervalSGI)(int interval);
- void (*glClearTexImage)(unsigned int texture, unsigned int level, unsigned int format, unsigned int type, const void *data);
- unsigned int (*glGetError)(void);
- const unsigned char* (*glGetString)(unsigned int name);
- void (*glClear)(unsigned int mask);
- void (*glGenTextures)(int n, unsigned int *textures);
- void (*glDeleteTextures)(int n, const unsigned int *texture);
- void (*glBindTexture)(unsigned int target, unsigned int texture);
- void (*glTexParameteri)(unsigned int target, unsigned int pname, int param);
- void (*glGetTexLevelParameteriv)(unsigned int target, int level, unsigned int pname, int *params);
- void (*glTexImage2D)(unsigned int target, int level, int internalFormat, int width, int height, int border, unsigned int format, unsigned int type, const void *pixels);
- void (*glCopyImageSubData)(unsigned int srcName, unsigned int srcTarget, int srcLevel, int srcX, int srcY, int srcZ, unsigned int dstName, unsigned int dstTarget, int dstLevel, int dstX, int dstY, int dstZ, int srcWidth, int srcHeight, int srcDepth);
- ~GlLibrary() {
- unload();
- }
- bool load() {
- if(library)
- return true;
- dlerror(); // clear
- void *lib = dlopen("libGL.so.1", RTLD_LAZY);
- if(!lib) {
- fprintf(stderr, "Error: failed to load libGL.so.1, error: %s\n", dlerror());
- return false;
- }
- dlsym_assign optional_dlsym[] = {
- { (void**)&glClearTexImage, "glClearTexImage" },
- { (void**)&glXSwapIntervalEXT, "glXSwapIntervalEXT" },
- { (void**)&glXSwapIntervalMESA, "glXSwapIntervalMESA" },
- { (void**)&glXSwapIntervalSGI, "glXSwapIntervalSGI" },
- { NULL, NULL }
- };
- dlsym_load_list_optional(lib, optional_dlsym);
- dlsym_assign required_dlsym[] = {
- { (void**)&glXCreatePixmap, "glXCreatePixmap" },
- { (void**)&glXDestroyPixmap, "glXDestroyPixmap" },
- { (void**)&glXBindTexImageEXT, "glXBindTexImageEXT" },
- { (void**)&glXReleaseTexImageEXT, "glXReleaseTexImageEXT" },
- { (void**)&glXChooseFBConfig, "glXChooseFBConfig" },
- { (void**)&glXGetVisualFromFBConfig, "glXGetVisualFromFBConfig" },
- { (void**)&glXCreateContextAttribsARB, "glXCreateContextAttribsARB" },
- { (void**)&glXMakeContextCurrent, "glXMakeContextCurrent" },
- { (void**)&glXDestroyContext, "glXDestroyContext" },
- { (void**)&glXSwapBuffers, "glXSwapBuffers" },
- { (void**)&glGetError, "glGetError" },
- { (void**)&glGetString, "glGetString" },
- { (void**)&glClear, "glClear" },
- { (void**)&glGenTextures, "glGenTextures" },
- { (void**)&glDeleteTextures, "glDeleteTextures" },
- { (void**)&glBindTexture, "glBindTexture" },
- { (void**)&glTexParameteri, "glTexParameteri" },
- { (void**)&glGetTexLevelParameteriv, "glGetTexLevelParameteriv" },
- { (void**)&glTexImage2D, "glTexImage2D" },
- { (void**)&glCopyImageSubData, "glCopyImageSubData" },
- { NULL, NULL }
- };
- if(dlsym_load_list(lib, required_dlsym)) {
- library = lib;
- return true;
- } else {
- fprintf(stderr, "Error: missing required symbols in libGL.so.1\n");
- dlclose(lib);
- return false;
- }
- }
- void unload() {
- if(library) {
- dlclose(library);
- library = nullptr;
- }
- }
- void *library = nullptr;
diff --git a/include/NvFBCLibrary.hpp b/include/NvFBCLibrary.hpp
deleted file mode 100644
index dc7db1f..0000000
--- a/include/NvFBCLibrary.hpp
+++ /dev/null
@@ -1,321 +0,0 @@
-#pragma once
-#include "../external/NvFBC.h"
-#include <dlfcn.h>
-#include <string.h>
-#include <stdio.h>
-#include <string.h>
-class NvFBCLibrary {
- ~NvFBCLibrary() {
- if(fbc_handle_created) {
- memset(&destroy_capture_params, 0, sizeof(destroy_capture_params));
- destroy_capture_params.dwVersion = NVFBC_DESTROY_CAPTURE_SESSION_PARAMS_VER;
- nv_fbc_function_list.nvFBCDestroyCaptureSession(nv_fbc_handle, &destroy_capture_params);
- memset(&destroy_params, 0, sizeof(destroy_params));
- destroy_params.dwVersion = NVFBC_DESTROY_HANDLE_PARAMS_VER;
- nv_fbc_function_list.nvFBCDestroyHandle(nv_fbc_handle, &destroy_params);
- }
- if(library)
- dlclose(library);
- }
- bool load() {
- if(library)
- return true;
- dlerror(); // clear
- void *lib = dlopen("libnvidia-fbc.so.1", RTLD_LAZY);
- if(!lib) {
- fprintf(stderr, "Error: failed to load libnvidia-fbc.so.1, error: %s\n", dlerror());
- return false;
- }
- nv_fbc_create_instance = (PNVFBCCREATEINSTANCE)dlsym(lib, "NvFBCCreateInstance");
- if(!nv_fbc_create_instance) {
- fprintf(stderr, "Error: unable to resolve symbol 'NvFBCCreateInstance'\n");
- dlclose(lib);
- return false;
- }
- memset(&nv_fbc_function_list, 0, sizeof(nv_fbc_function_list));
- nv_fbc_function_list.dwVersion = NVFBC_VERSION;
- NVFBCSTATUS status = nv_fbc_create_instance(&nv_fbc_function_list);
- if(status != NVFBC_SUCCESS) {
- fprintf(stderr, "Error: failed to create NvFBC instance (status: %d)\n", status);
- dlclose(lib);
- return false;
- }
- library = lib;
- return true;
- }
- // If |display_to_capture| is "screen", then the entire x11 screen is captured (all displays).
- bool create(const char *display_to_capture, uint32_t fps, /*out*/ uint32_t *display_width, /*out*/ uint32_t *display_height, uint32_t x = 0, uint32_t y = 0, uint32_t width = 0, uint32_t height = 0, bool direct_capture = false) {
- if(!library || !display_to_capture || !display_width || !display_height || fbc_handle_created)
- return false;
- this->fps = fps;
- const bool capture_region = (x > 0 || y > 0 || width > 0 || height > 0);
- bool supports_direct_cursor = false;
- int driver_major_version = 0;
- int driver_minor_version = 0;
- if(direct_capture && get_driver_version(&driver_major_version, &driver_minor_version)) {
- fprintf(stderr, "Info: detected nvidia version: %d.%d\n", driver_major_version, driver_minor_version);
- if(version_at_least(driver_major_version, driver_minor_version, 515, 57) && version_less_than(driver_major_version, driver_minor_version, 520, 56)) {
- direct_capture = false;
- fprintf(stderr, "Warning: \"screen-direct\" has temporary been disabled as it causes stuttering with driver versions >= 515.57 and < 520.56. Please update your driver if possible. Capturing \"screen\" instead.\n");
- }
- // TODO:
- // Cursor capture disabled because moving the cursor doesn't update capture rate to monitor hz and instead captures at 10-30 hz
- /*
- if(direct_capture) {
- if(version_at_least(driver_major_version, driver_minor_version, 515, 57))
- supports_direct_cursor = true;
- else
- fprintf(stderr, "Info: capturing \"screen-direct\" but driver version appears to be less than 515.57. Disabling capture of cursor. Please update your driver if you want to capture your cursor or record \"screen\" instead.\n");
- }
- */
- }
- NVFBC_TRACKING_TYPE tracking_type;
- bool capture_session_created = false;
- uint32_t output_id = 0;
- fbc_handle_created = false;
- memset(&create_params, 0, sizeof(create_params));
- create_params.dwVersion = NVFBC_CREATE_HANDLE_PARAMS_VER;
- status = nv_fbc_function_list.nvFBCCreateHandle(&nv_fbc_handle, &create_params);
- if(status != NVFBC_SUCCESS) {
- // Reverse engineering for interoperability
- const uint8_t enable_key[] = { 0xac, 0x10, 0xc9, 0x2e, 0xa5, 0xe6, 0x87, 0x4f, 0x8f, 0x4b, 0xf4, 0x61, 0xf8, 0x56, 0x27, 0xe9 };
- create_params.privateData = enable_key;
- create_params.privateDataSize = 16;
- status = nv_fbc_function_list.nvFBCCreateHandle(&nv_fbc_handle, &create_params);
- if(status != NVFBC_SUCCESS) {
- fprintf(stderr, "Error: %s\n", nv_fbc_function_list.nvFBCGetLastErrorStr(nv_fbc_handle));
- return false;
- }
- }
- fbc_handle_created = true;
- NVFBC_GET_STATUS_PARAMS status_params;
- memset(&status_params, 0, sizeof(status_params));
- status_params.dwVersion = NVFBC_GET_STATUS_PARAMS_VER;
- status = nv_fbc_function_list.nvFBCGetStatus(nv_fbc_handle, &status_params);
- if(status != NVFBC_SUCCESS) {
- fprintf(stderr, "Error: %s\n", nv_fbc_function_list.nvFBCGetLastErrorStr(nv_fbc_handle));
- goto error_cleanup;
- }
- if(status_params.bCanCreateNow == NVFBC_FALSE) {
- fprintf(stderr, "Error: it's not possible to create a capture session on this system\n");
- goto error_cleanup;
- }
- tracking_type = strcmp(display_to_capture, "screen") == 0 ? NVFBC_TRACKING_SCREEN : NVFBC_TRACKING_OUTPUT;
- if(tracking_type == NVFBC_TRACKING_OUTPUT) {
- if(!status_params.bXRandRAvailable) {
- fprintf(stderr, "Error: the xrandr extension is not available\n");
- goto error_cleanup;
- }
- if(status_params.bInModeset) {
- fprintf(stderr, "Error: the x server is in modeset, unable to record\n");
- goto error_cleanup;
- }
- output_id = get_output_id_from_display_name(status_params.outputs, status_params.dwOutputNum, display_to_capture, display_width, display_height);
- if(output_id == 0) {
- fprintf(stderr, "Error: display '%s' not found\n", display_to_capture);
- goto error_cleanup;
- }
- } else {
- *display_width = status_params.screenSize.w;
- *display_height = status_params.screenSize.h;
- }
- memset(&create_capture_params, 0, sizeof(create_capture_params));
- create_capture_params.dwVersion = NVFBC_CREATE_CAPTURE_SESSION_PARAMS_VER;
- create_capture_params.eCaptureType = NVFBC_CAPTURE_SHARED_CUDA;
- create_capture_params.bWithCursor = (!direct_capture || supports_direct_cursor) ? NVFBC_TRUE : NVFBC_FALSE;
- if(capture_region) {
- create_capture_params.captureBox = { x, y, width, height };
- *display_width = width;
- *display_height = height;
- }
- create_capture_params.eTrackingType = tracking_type;
- create_capture_params.dwSamplingRateMs = 1000 / (fps + 1);
- create_capture_params.bAllowDirectCapture = direct_capture ? NVFBC_TRUE : NVFBC_FALSE;
- create_capture_params.bPushModel = direct_capture ? NVFBC_TRUE : NVFBC_FALSE;
- if(tracking_type == NVFBC_TRACKING_OUTPUT)
- create_capture_params.dwOutputId = output_id;
- status = nv_fbc_function_list.nvFBCCreateCaptureSession(nv_fbc_handle, &create_capture_params);
- if(status != NVFBC_SUCCESS) {
- fprintf(stderr, "Error: %s\n", nv_fbc_function_list.nvFBCGetLastErrorStr(nv_fbc_handle));
- goto error_cleanup;
- }
- capture_session_created = true;
- memset(&setup_params, 0, sizeof(setup_params));
- setup_params.dwVersion = NVFBC_TOCUDA_SETUP_PARAMS_VER;
- setup_params.eBufferFormat = NVFBC_BUFFER_FORMAT_BGRA;
- status = nv_fbc_function_list.nvFBCToCudaSetUp(nv_fbc_handle, &setup_params);
- if(status != NVFBC_SUCCESS) {
- fprintf(stderr, "Error: %s\n", nv_fbc_function_list.nvFBCGetLastErrorStr(nv_fbc_handle));
- goto error_cleanup;
- }
- return true;
- error_cleanup:
- if(fbc_handle_created) {
- if(capture_session_created) {
- memset(&destroy_capture_params, 0, sizeof(destroy_capture_params));
- destroy_capture_params.dwVersion = NVFBC_DESTROY_CAPTURE_SESSION_PARAMS_VER;
- nv_fbc_function_list.nvFBCDestroyCaptureSession(nv_fbc_handle, &destroy_capture_params);
- }
- memset(&destroy_params, 0, sizeof(destroy_params));
- destroy_params.dwVersion = NVFBC_DESTROY_HANDLE_PARAMS_VER;
- nv_fbc_function_list.nvFBCDestroyHandle(nv_fbc_handle, &destroy_params);
- fbc_handle_created = false;
- }
- output_id = 0;
- return false;
- }
- bool capture(/*out*/ void *cu_device_ptr, uint32_t *byte_size) {
- if(!library || !fbc_handle_created || !cu_device_ptr || !byte_size)
- return false;
- memset(&frame_info, 0, sizeof(frame_info));
- memset(&grab_params, 0, sizeof(grab_params));
- grab_params.dwVersion = NVFBC_TOCUDA_GRAB_FRAME_PARAMS_VER;
- grab_params.pFrameGrabInfo = &frame_info;
- grab_params.pCUDADeviceBuffer = cu_device_ptr;
- grab_params.dwTimeoutMs = 0;//1000 / (fps + 10);
- status = nv_fbc_function_list.nvFBCToCudaGrabFrame(nv_fbc_handle, &grab_params);
- if(status != NVFBC_SUCCESS) {
- fprintf(stderr, "Error: capture: %s\n", nv_fbc_function_list.nvFBCGetLastErrorStr(nv_fbc_handle));
- return false;
- }
- *byte_size = frame_info.dwByteSize;
- // TODO: Check bIsNewFrame
- // TODO: Check dwWidth and dwHeight and update size in video output in ffmpeg. This can happen when xrandr is used to change monitor resolution
- return true;
- }
- static char to_upper(char c) {
- if(c >= 'a' && c <= 'z')
- return c - 32;
- else
- return c;
- }
- static bool strcase_equals(const char *str1, const char *str2) {
- for(;;) {
- char c1 = to_upper(*str1);
- char c2 = to_upper(*str2);
- if(c1 != c2)
- return false;
- if(c1 == '\0' || c2 == '\0')
- return true;
- ++str1;
- ++str2;
- }
- }
- // Returns 0 on failure
- static uint32_t get_output_id_from_display_name(NVFBC_RANDR_OUTPUT_INFO *outputs, uint32_t num_outputs, const char *display_name, uint32_t *display_width, uint32_t *display_height) {
- if(!outputs)
- return 0;
- for(uint32_t i = 0; i < num_outputs; ++i) {
- if(strcase_equals(outputs[i].name, display_name)) {
- *display_width = outputs[i].trackedBox.w;
- *display_height = outputs[i].trackedBox.h;
- return outputs[i].dwId;
- }
- }
- return 0;
- }
- // TODO: Test with optimus and open kernel modules
- static bool get_driver_version(int *major, int *minor) {
- *major = 0;
- *minor = 0;
- FILE *f = fopen("/proc/driver/nvidia/version", "rb");
- if(!f) {
- fprintf(stderr, "Warning: failed to get nvidia driver version (failed to read /proc/driver/nvidia/version)\n");
- return false;
- }
- char buffer[2048];
- size_t bytes_read = fread(buffer, 1, sizeof(buffer) - 1, f);
- buffer[bytes_read] = '\0';
- bool success = false;
- const char *p = strstr(buffer, "Kernel Module");
- if(p) {
- p += 13;
- int driver_major_version = 0, driver_minor_version = 0;
- if(sscanf(p, "%d.%d", &driver_major_version, &driver_minor_version) == 2) {
- *major = driver_major_version;
- *minor = driver_minor_version;
- success = true;
- }
- }
- if(!success)
- fprintf(stderr, "Warning: failed to get nvidia driver version\n");
- fclose(f);
- return success;
- }
- static bool version_at_least(int major, int minor, int expected_major, int expected_minor) {
- return major > expected_major || (major == expected_major && minor >= expected_minor);
- }
- static bool version_less_than(int major, int minor, int expected_major, int expected_minor) {
- return major < expected_major || (major == expected_major && minor < expected_minor);
- }
- void *library = nullptr;
- PNVFBCCREATEINSTANCE nv_fbc_create_instance = nullptr;
- NVFBC_API_FUNCTION_LIST nv_fbc_function_list;
- NVFBC_SESSION_HANDLE nv_fbc_handle;
- bool fbc_handle_created = false;
- int fps = 0;
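
The removed header gates "screen-direct" on a driver version range: it reads the version after the "Kernel Module" marker and disables direct capture for versions >= 515.57 and < 520.56. A minimal C sketch of that check is given here, assuming the version text is already in memory. `parse_driver_version` and `direct_capture_stutters` are hypothetical names introduced for illustration; only `version_at_least` and `version_less_than` mirror helpers from the removed file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts "<major>.<minor>" that follows the
   "Kernel Module" marker. Returns false if the marker or numbers are missing. */
static bool parse_driver_version(const char *text, int *major, int *minor) {
    assert(text && major && minor);
    *major = 0;
    *minor = 0;

    const char *p = strstr(text, "Kernel Module");
    if(!p)
        return false;
    p += strlen("Kernel Module");

    int parsed_major = 0, parsed_minor = 0;
    /* %d skips leading whitespace, so the padding after the marker is fine */
    if(sscanf(p, "%d.%d", &parsed_major, &parsed_minor) != 2)
        return false;

    *major = parsed_major;
    *minor = parsed_minor;
    return true;
}

/* Same comparisons as in the removed header */
static bool version_at_least(int major, int minor, int expected_major, int expected_minor) {
    return major > expected_major || (major == expected_major && minor >= expected_minor);
}

static bool version_less_than(int major, int minor, int expected_major, int expected_minor) {
    return major < expected_major || (major == expected_major && minor < expected_minor);
}

/* Hypothetical wrapper: true for the half-open range [515.57, 520.56) */
static bool direct_capture_stutters(int major, int minor) {
    return version_at_least(major, minor, 515, 57)
        && version_less_than(major, minor, 520, 56);
}
```

Treating the range as half-open means 515.57 is affected while 520.56 is not, which matches the wording of the warning message in the removed code.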
diff --git a/include/capture/capture.h b/include/capture/capture.h
new file mode 100644
index 0000000..fbbe767
--- /dev/null
+++ b/include/capture/capture.h
@@ -0,0 +1,70 @@
+#include "../color_conversion.h"
+#include <stdbool.h>
+typedef struct AVCodecContext AVCodecContext;
+typedef struct AVFrame AVFrame;
+typedef void* VADisplay;
+typedef struct _VADRMPRIMESurfaceDescriptor VADRMPRIMESurfaceDescriptor;
+typedef struct gsr_cuda gsr_cuda;
+typedef struct AVFrame AVFrame;
+typedef struct CUgraphicsResource_st *CUgraphicsResource;
+typedef struct CUarray_st *CUarray;
+typedef struct CUctx_st *CUcontext;
+typedef struct CUstream_st *CUstream;
+typedef struct gsr_capture gsr_capture;
+struct gsr_capture {
+ /* These methods should not be called manually. Call gsr_capture_* instead */
+ int (*start)(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame *frame);
+ void (*tick)(gsr_capture *cap, AVCodecContext *video_codec_context); /* can be NULL */
+ bool (*is_damaged)(gsr_capture *cap); /* can be NULL */
+ void (*clear_damage)(gsr_capture *cap); /* can be NULL */
+ bool (*should_stop)(gsr_capture *cap, bool *err); /* can be NULL */
+ int (*capture)(gsr_capture *cap, AVFrame *frame);
+ void (*capture_end)(gsr_capture *cap, AVFrame *frame); /* can be NULL */
+ void (*destroy)(gsr_capture *cap, AVCodecContext *video_codec_context);
+ void *priv; /* can be NULL */
+ bool started;
+typedef struct gsr_capture_base gsr_capture_base;
+struct gsr_capture_base {
+ gsr_egl *egl;
+ unsigned int input_texture;
+ unsigned int target_textures[2];
+ unsigned int cursor_texture;
+ gsr_color_conversion color_conversion;
+ AVCodecContext *video_codec_context;
+typedef struct {
+ gsr_cuda *cuda;
+ CUgraphicsResource *cuda_graphics_resources;
+ CUarray *mapped_arrays;
+} gsr_cuda_context;
+int gsr_capture_start(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame *frame);
+void gsr_capture_tick(gsr_capture *cap, AVCodecContext *video_codec_context);
+bool gsr_capture_should_stop(gsr_capture *cap, bool *err);
+int gsr_capture_capture(gsr_capture *cap, AVFrame *frame);
+void gsr_capture_end(gsr_capture *cap, AVFrame *frame);
+/* Calls |gsr_capture_stop| as well */
+void gsr_capture_destroy(gsr_capture *cap, AVCodecContext *video_codec_context);
+bool gsr_capture_base_setup_vaapi_textures(gsr_capture_base *self, AVFrame *frame, VADisplay va_dpy, VADRMPRIMESurfaceDescriptor *prime, gsr_color_range color_range);
+bool gsr_capture_base_setup_cuda_textures(gsr_capture_base *self, AVFrame *frame, gsr_cuda_context *cuda_context, gsr_color_range color_range, gsr_source_color source_color, bool hdr);
+void gsr_capture_base_stop(gsr_capture_base *self);
+bool drm_create_codec_context(const char *card_path, AVCodecContext *video_codec_context, int width, int height, bool hdr, VADisplay *va_dpy);
+bool cuda_create_codec_context(CUcontext cu_ctx, AVCodecContext *video_codec_context, int width, int height, bool hdr, CUstream *cuda_stream);
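
The new capture.h describes a table of function pointers in which some entries "can be NULL", plus `gsr_capture_*` wrappers that callers are expected to use instead of calling the pointers directly. The diff shows only the declarations, so the following is a sketch of how such wrappers could guard optional callbacks, not the project's implementation. The `demo_capture` type, its reduced set of callbacks and the counter backend are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct demo_capture demo_capture;

/* Hypothetical, reduced version of a capture vtable */
struct demo_capture {
    int (*start)(demo_capture *cap);                   /* required */
    void (*tick)(demo_capture *cap);                   /* can be NULL */
    bool (*should_stop)(demo_capture *cap, bool *err); /* can be NULL */
    int (*capture)(demo_capture *cap);                 /* required */
    void *priv;                                        /* backend state */
    bool started;
};

int demo_capture_start(demo_capture *cap) {
    assert(cap && cap->start);
    if(cap->started)
        return -1; /* refuse to start twice */
    const int res = cap->start(cap);
    if(res == 0)
        cap->started = true;
    return res;
}

void demo_capture_tick(demo_capture *cap) {
    assert(cap);
    if(cap->started && cap->tick) /* optional callback */
        cap->tick(cap);
}

bool demo_capture_should_stop(demo_capture *cap, bool *err) {
    assert(cap);
    if(err)
        *err = false;
    if(cap->started && cap->should_stop) /* optional callback */
        return cap->should_stop(cap, err);
    return false;
}

int demo_capture_capture(demo_capture *cap) {
    assert(cap && cap->capture);
    if(!cap->started)
        return -1; /* capturing before start is an error */
    return cap->capture(cap);
}

/* Hypothetical backend: counts captured frames in an int behind |priv| */
static int counter_start(demo_capture *cap) {
    int *frames = cap->priv;
    *frames = 0;
    return 0;
}

static int counter_capture(demo_capture *cap) {
    int *frames = cap->priv;
    return ++*frames;
}
```

Keeping the NULL checks in the wrappers means a backend only fills in the callbacks it needs, and callers never have to test a pointer themselves.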
diff --git a/include/capture/kms.h b/include/capture/kms.h
new file mode 100644
index 0000000..674813a
--- /dev/null
+++ b/include/capture/kms.h
@@ -0,0 +1,50 @@
+#include "capture.h"
+#include "../../kms/client/kms_client.h"
+#include "../color_conversion.h"
+#include "../vec2.h"
+#include "../defs.h"
+#include <stdbool.h>
+typedef struct AVCodecContext AVCodecContext;
+typedef struct AVMasteringDisplayMetadata AVMasteringDisplayMetadata;
+typedef struct AVContentLightMetadata AVContentLightMetadata;
+typedef struct gsr_capture_kms gsr_capture_kms;
+typedef struct gsr_egl gsr_egl;
+typedef struct AVFrame AVFrame;
+typedef struct {
+ uint32_t connector_ids[MAX_CONNECTOR_IDS];
+ int num_connector_ids;
+} MonitorId;
+struct gsr_capture_kms {
+ gsr_capture_base base;
+ bool should_stop;
+ bool stop_is_error;
+ gsr_kms_client kms_client;
+ gsr_kms_response kms_response;
+ vec2i capture_pos;
+ vec2i capture_size;
+ MonitorId monitor_id;
+ AVMasteringDisplayMetadata *mastering_display_metadata;
+ AVContentLightMetadata *light_metadata;
+ gsr_monitor_rotation monitor_rotation;
+/* Returns 0 on success */
+int gsr_capture_kms_start(gsr_capture_kms *self, const char *display_to_capture, gsr_egl *egl, AVCodecContext *video_codec_context, AVFrame *frame);
+void gsr_capture_kms_stop(gsr_capture_kms *self);
+bool gsr_capture_kms_capture(gsr_capture_kms *self, AVFrame *frame, bool hdr, bool screen_plane_use_modifiers, bool cursor_texture_is_external, bool record_cursor);
+void gsr_capture_kms_cleanup_kms_fds(gsr_capture_kms *self);
+#endif /* GSR_CAPTURE_KMS_H */
diff --git a/include/capture/kms_cuda.h b/include/capture/kms_cuda.h
new file mode 100644
index 0000000..433e053
--- /dev/null
+++ b/include/capture/kms_cuda.h
@@ -0,0 +1,19 @@
+#include "../vec2.h"
+#include "../utils.h"
+#include "../color_conversion.h"
+#include "capture.h"
+typedef struct {
+ gsr_egl *egl;
+ const char *display_to_capture; /* if this is "screen", then the first monitor is captured. A copy is made of this */
+ bool hdr;
+ gsr_color_range color_range;
+ bool record_cursor;
+} gsr_capture_kms_cuda_params;
+gsr_capture* gsr_capture_kms_cuda_create(const gsr_capture_kms_cuda_params *params);
+#endif /* GSR_CAPTURE_KMS_CUDA_H */
diff --git a/include/capture/kms_vaapi.h b/include/capture/kms_vaapi.h
new file mode 100644
index 0000000..bf078b5
--- /dev/null
+++ b/include/capture/kms_vaapi.h
@@ -0,0 +1,19 @@
+#include "../vec2.h"
+#include "../utils.h"
+#include "../color_conversion.h"
+#include "capture.h"
+typedef struct {
+ gsr_egl *egl;
+ const char *display_to_capture; /* if this is "screen", then the first monitor is captured. A copy is made of this */
+ bool hdr;
+ gsr_color_range color_range;
+ bool record_cursor;
+} gsr_capture_kms_vaapi_params;
+gsr_capture* gsr_capture_kms_vaapi_create(const gsr_capture_kms_vaapi_params *params);
diff --git a/include/capture/nvfbc.h b/include/capture/nvfbc.h
new file mode 100644
index 0000000..36bc2b6
--- /dev/null
+++ b/include/capture/nvfbc.h
@@ -0,0 +1,22 @@
+#include "capture.h"
+#include "../vec2.h"
+typedef struct {
+ gsr_egl *egl;
+ const char *display_to_capture; /* if this is "screen", then the entire x11 screen is captured (all displays). A copy is made of this */
+ int fps;
+ vec2i pos;
+ vec2i size;
+ bool direct_capture;
+ bool overclock;
+ bool hdr;
+ gsr_color_range color_range;
+ bool record_cursor;
+} gsr_capture_nvfbc_params;
+gsr_capture* gsr_capture_nvfbc_create(const gsr_capture_nvfbc_params *params);
+#endif /* GSR_CAPTURE_NVFBC_H */
diff --git a/include/capture/xcomposite.h b/include/capture/xcomposite.h
new file mode 100644
index 0000000..27b289a
--- /dev/null
+++ b/include/capture/xcomposite.h
@@ -0,0 +1,58 @@
+#include "capture.h"
+#include "../egl.h"
+#include "../vec2.h"
+#include "../color_conversion.h"
+#include "../window_texture.h"
+#include "../cursor.h"
+typedef struct {
+ gsr_egl *egl;
+ Window window;
+ bool follow_focused; /* If this is set then |window| is ignored */
+ vec2i region_size; /* This is currently only used with |follow_focused| */
+ gsr_color_range color_range;
+ bool record_cursor;
+ bool track_damage;
+} gsr_capture_xcomposite_params;
+typedef struct {
+ gsr_capture_base base;
+ gsr_capture_xcomposite_params params;
+ XEvent xev;
+ bool should_stop;
+ bool stop_is_error;
+ bool window_resized;
+ bool follow_focused_initialized;
+ Window window;
+ vec2i window_size;
+ vec2i texture_size;
+ double window_resize_timer;
+ WindowTexture window_texture;
+ Atom net_active_window_atom;
+ gsr_cursor cursor;
+ int damage_event;
+ int damage_error;
+ XID damage;
+ bool damaged;
+} gsr_capture_xcomposite;
+void gsr_capture_xcomposite_init(gsr_capture_xcomposite *self, const gsr_capture_xcomposite_params *params);
+int gsr_capture_xcomposite_start(gsr_capture_xcomposite *self, AVCodecContext *video_codec_context, AVFrame *frame);
+void gsr_capture_xcomposite_stop(gsr_capture_xcomposite *self);
+void gsr_capture_xcomposite_tick(gsr_capture_xcomposite *self, AVCodecContext *video_codec_context);
+bool gsr_capture_xcomposite_is_damaged(gsr_capture_xcomposite *self);
+void gsr_capture_xcomposite_clear_damage(gsr_capture_xcomposite *self);
+bool gsr_capture_xcomposite_should_stop(gsr_capture_xcomposite *self, bool *err);
+int gsr_capture_xcomposite_capture(gsr_capture_xcomposite *self, AVFrame *frame);
diff --git a/include/capture/xcomposite_cuda.h b/include/capture/xcomposite_cuda.h
new file mode 100644
index 0000000..b93c6de
--- /dev/null
+++ b/include/capture/xcomposite_cuda.h
@@ -0,0 +1,14 @@
+#include "capture.h"
+#include "xcomposite.h"
+typedef struct {
+ gsr_capture_xcomposite_params base;
+ bool overclock;
+} gsr_capture_xcomposite_cuda_params;
+gsr_capture* gsr_capture_xcomposite_cuda_create(const gsr_capture_xcomposite_cuda_params *params);
diff --git a/include/capture/xcomposite_vaapi.h b/include/capture/xcomposite_vaapi.h
new file mode 100644
index 0000000..5d4b338
--- /dev/null
+++ b/include/capture/xcomposite_vaapi.h
@@ -0,0 +1,13 @@
+#include "capture.h"
+#include "xcomposite.h"
+typedef struct {
+ gsr_capture_xcomposite_params base;
+} gsr_capture_xcomposite_vaapi_params;
+gsr_capture* gsr_capture_xcomposite_vaapi_create(const gsr_capture_xcomposite_vaapi_params *params);
diff --git a/include/color_conversion.h b/include/color_conversion.h
new file mode 100644
index 0000000..d05df6a
--- /dev/null
+++ b/include/color_conversion.h
@@ -0,0 +1,58 @@
+#include "shader.h"
+#include "vec2.h"
+#include <stdbool.h>
+typedef enum {
+} gsr_color_range;
+typedef enum {
+} gsr_source_color;
+typedef enum {
+ GSR_DESTINATION_COLOR_NV12, /* YUV420, BT709, 8-bit */
+ GSR_DESTINATION_COLOR_P010 /* YUV420, BT2020, 10-bit */
+} gsr_destination_color;
+typedef struct {
+ int offset;
+ int rotation;
+} gsr_color_uniforms;
+typedef struct {
+ gsr_egl *egl;
+ gsr_source_color source_color;
+ gsr_destination_color destination_color;
+ unsigned int destination_textures[2];
+ int num_destination_textures;
+ gsr_color_range color_range;
+ bool load_external_image_shader;
+} gsr_color_conversion_params;
+typedef struct {
+ gsr_color_conversion_params params;
+ gsr_color_uniforms uniforms[4];
+ gsr_shader shaders[4];
+ unsigned int framebuffers[2];
+ unsigned int vertex_array_object_id;
+ unsigned int vertex_buffer_object_id;
+} gsr_color_conversion;
+int gsr_color_conversion_init(gsr_color_conversion *self, const gsr_color_conversion_params *params);
+void gsr_color_conversion_deinit(gsr_color_conversion *self);
+void gsr_color_conversion_draw(gsr_color_conversion *self, unsigned int texture_id, vec2i source_pos, vec2i source_size, vec2i texture_pos, vec2i texture_size, float rotation, bool external_texture);
+void gsr_color_conversion_clear(gsr_color_conversion *self);
diff --git a/include/CudaLibrary.hpp b/include/cuda.h
index fe99975..fd1f9f9 100644
--- a/include/CudaLibrary.hpp
+++ b/include/cuda.h
@@ -1,13 +1,15 @@
-#pragma once
+#ifndef GSR_CUDA_H
+#define GSR_CUDA_H
-#include "LibraryLoader.hpp"
-#include <dlfcn.h>
-#include <stdio.h>
+#include "overclock.h"
+#include <stddef.h>
+#include <stdbool.h>
// To prevent hwcontext_cuda.h from including cuda.h
#define CUDA_VERSION 11070
+#define CU_CTX_SCHED_AUTO 0
#if defined(_WIN64) || defined(__LP64__)
typedef unsigned long long CUdeviceptr_v2;
@@ -22,7 +24,7 @@ typedef struct CUctx_st *CUcontext;
typedef struct CUstream_st *CUstream;
typedef struct CUarray_st *CUarray;
-static const int CUDA_SUCCESS = 0;
+#define CUDA_SUCCESS 0
typedef enum CUgraphicsMapResourceFlags_enum {
@@ -69,75 +71,38 @@ typedef struct CUDA_MEMCPY2D_st {
-static const int CU_CTX_SCHED_AUTO = 0;
typedef struct CUgraphicsResource_st *CUgraphicsResource;
-struct Cuda {
+typedef struct gsr_cuda gsr_cuda;
+struct gsr_cuda {
+ gsr_overclock overclock;
+ bool do_overclock;
+ void *library;
+ CUcontext cu_ctx;
CUresult (*cuInit)(unsigned int Flags);
CUresult (*cuDeviceGetCount)(int *count);
CUresult (*cuDeviceGet)(CUdevice *device, int ordinal);
CUresult (*cuCtxCreate_v2)(CUcontext *pctx, unsigned int flags, CUdevice dev);
+ CUresult (*cuCtxDestroy_v2)(CUcontext ctx);
CUresult (*cuCtxPushCurrent_v2)(CUcontext ctx);
CUresult (*cuCtxPopCurrent_v2)(CUcontext *pctx);
CUresult (*cuGetErrorString)(CUresult error, const char **pStr);
- CUresult (*cuMemsetD8_v2)(CUdeviceptr dstDevice, unsigned char uc, size_t N);
CUresult (*cuMemcpy2D_v2)(const CUDA_MEMCPY2D *pCopy);
+ CUresult (*cuMemcpy2DAsync_v2)(const CUDA_MEMCPY2D *pcopy, CUstream hStream);
+ CUresult (*cuStreamSynchronize)(CUstream hStream);
- CUresult (*cuGraphicsGLRegisterImage)(CUgraphicsResource *pCudaResource, unsigned int image, unsigned int target, unsigned int Flags);
+ CUresult (*cuGraphicsGLRegisterImage)(CUgraphicsResource *pCudaResource, unsigned int image, unsigned int target, unsigned int flags);
+ CUresult (*cuGraphicsEGLRegisterImage)(CUgraphicsResource *pCudaResource, void *image, unsigned int flags);
CUresult (*cuGraphicsResourceSetMapFlags)(CUgraphicsResource resource, unsigned int flags);
CUresult (*cuGraphicsMapResources)(unsigned int count, CUgraphicsResource *resources, CUstream hStream);
+ CUresult (*cuGraphicsUnmapResources)(unsigned int count, CUgraphicsResource *resources, CUstream hStream);
CUresult (*cuGraphicsUnregisterResource)(CUgraphicsResource resource);
CUresult (*cuGraphicsSubResourceGetMappedArray)(CUarray *pArray, CUgraphicsResource resource, unsigned int arrayIndex, unsigned int mipLevel);
- ~Cuda() {
- if(library)
- dlclose(library);
- }
- bool load() {
- if(library)
- return true;
- dlerror(); // clear
- void *lib = dlopen("libcuda.so.1", RTLD_LAZY);
- if(!lib) {
- lib = dlopen("libcuda.so", RTLD_LAZY);
- if(!lib) {
- fprintf(stderr, "Error: failed to load libcuda.so/libcuda.so.1, error: %s\n", dlerror());
- return false;
- }
- }
- dlsym_assign required_dlsym[] = {
- { (void**)&cuInit, "cuInit" },
- { (void**)&cuDeviceGetCount, "cuDeviceGetCount" },
- { (void**)&cuDeviceGet, "cuDeviceGet" },
- { (void**)&cuCtxCreate_v2, "cuCtxCreate_v2" },
- { (void**)&cuCtxPushCurrent_v2, "cuCtxPushCurrent_v2" },
- { (void**)&cuCtxPopCurrent_v2, "cuCtxPopCurrent_v2" },
- { (void**)&cuGetErrorString, "cuGetErrorString" },
- { (void**)&cuMemsetD8_v2, "cuMemsetD8_v2" },
- { (void**)&cuMemcpy2D_v2, "cuMemcpy2D_v2" },
- { (void**)&cuGraphicsGLRegisterImage, "cuGraphicsGLRegisterImage" },
- { (void**)&cuGraphicsResourceSetMapFlags, "cuGraphicsResourceSetMapFlags" },
- { (void**)&cuGraphicsMapResources, "cuGraphicsMapResources" },
- { (void**)&cuGraphicsUnregisterResource, "cuGraphicsUnregisterResource" },
- { (void**)&cuGraphicsSubResourceGetMappedArray, "cuGraphicsSubResourceGetMappedArray" },
- { NULL, NULL }
- };
- if(dlsym_load_list(lib, required_dlsym)) {
- library = lib;
- return true;
- } else {
- fprintf(stderr, "Error: missing required symbols in libcuda.so\n");
- dlclose(lib);
- return false;
- }
- }
- void *library = nullptr;
};
+bool gsr_cuda_load(gsr_cuda *self, Display *display, bool overclock);
+void gsr_cuda_unload(gsr_cuda *self);
+#endif /* GSR_CUDA_H */
diff --git a/include/cursor.h b/include/cursor.h
new file mode 100644
index 0000000..2f26dfd
--- /dev/null
+++ b/include/cursor.h
@@ -0,0 +1,30 @@
+#ifndef GSR_CURSOR_H
+#define GSR_CURSOR_H
+#include "egl.h"
+#include "vec2.h"
+typedef struct {
+ gsr_egl *egl;
+ Display *display;
+ int x_fixes_event_base;
+ int xi_opcode;
+ unsigned int texture_id;
+ vec2i size;
+ vec2i hotspot;
+ vec2i position;
+ bool cursor_image_set;
+ bool visible;
+ bool cursor_moved;
+} gsr_cursor;
+int gsr_cursor_init(gsr_cursor *self, gsr_egl *egl, Display *display);
+void gsr_cursor_deinit(gsr_cursor *self);
+/* Returns true if the cursor image has updated or if the cursor has moved */
+bool gsr_cursor_update(gsr_cursor *self, XEvent *xev);
+void gsr_cursor_tick(gsr_cursor *self, Window relative_to);
+#endif /* GSR_CURSOR_H */
diff --git a/include/defs.h b/include/defs.h
new file mode 100644
index 0000000..473583c
--- /dev/null
+++ b/include/defs.h
@@ -0,0 +1,28 @@
+#ifndef GSR_DEFS_H
+#define GSR_DEFS_H
+typedef enum {
+} gsr_gpu_vendor;
+typedef struct {
+ gsr_gpu_vendor vendor;
+ int gpu_version; /* 0 if unknown */
+} gsr_gpu_info;
+typedef enum {
+} gsr_monitor_rotation;
+typedef enum {
+} gsr_connection_type;
+#endif /* GSR_DEFS_H */
diff --git a/include/egl.h b/include/egl.h
new file mode 100644
index 0000000..64dd2c6
--- /dev/null
+++ b/include/egl.h
@@ -0,0 +1,288 @@
+#ifndef GSR_EGL_H
+#define GSR_EGL_H
+/* OpenGL EGL library with a hidden window context (to allow using the opengl functions) */
+#include <X11/X.h>
+#include <X11/Xutil.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include "vec2.h"
+#include "defs.h"
+#ifdef _WIN64
+typedef signed long long int khronos_intptr_t;
+typedef unsigned long long int khronos_uintptr_t;
+typedef signed long long int khronos_ssize_t;
+typedef unsigned long long int khronos_usize_t;
+#else
+typedef signed long int khronos_intptr_t;
+typedef unsigned long int khronos_uintptr_t;
+typedef signed long int khronos_ssize_t;
+typedef unsigned long int khronos_usize_t;
+#endif
+typedef void* EGLDisplay;
+typedef void* EGLNativeDisplayType;
+typedef uintptr_t EGLNativeWindowType;
+typedef uintptr_t EGLNativePixmapType;
+typedef void* EGLConfig;
+typedef void* EGLSurface;
+typedef void* EGLContext;
+typedef void* EGLClientBuffer;
+typedef void* EGLImage;
+typedef void* EGLImageKHR;
+typedef void *GLeglImageOES;
+typedef void (*__eglMustCastToProperFunctionPointerType)(void);
+typedef struct __GLXFBConfigRec *GLXFBConfig;
+typedef struct __GLXcontextRec *GLXContext;
+typedef XID GLXDrawable;
+typedef void(*__GLXextFuncPtr)(void);
+#define EGL_SUCCESS 0x3000
+#define EGL_BUFFER_SIZE 0x3020
+#define EGL_RENDERABLE_TYPE 0x3040
+#define EGL_OPENGL_API 0x30A2
+#define EGL_OPENGL_BIT 0x0008
+#define EGL_NONE 0x3038
+#define EGL_BACK_BUFFER 0x3084
+#define EGL_GL_TEXTURE_2D 0x30B1
+#define EGL_TRUE 1
+#define EGL_LINUX_DRM_FOURCC_EXT 0x3271
+#define EGL_WIDTH 0x3057
+#define EGL_HEIGHT 0x3056
+#define EGL_DMA_BUF_PLANE0_FD_EXT 0x3272
+#define EGL_DMA_BUF_PLANE0_PITCH_EXT 0x3274
+#define EGL_LINUX_DMA_BUF_EXT 0x3270
+#define EGL_RED_SIZE 0x3024
+#define EGL_ALPHA_SIZE 0x3021
+#define EGL_BLUE_SIZE 0x3022
+#define EGL_GREEN_SIZE 0x3023
+#define EGL_DEVICE_EXT 0x322C
+#define EGL_DRM_DEVICE_FILE_EXT 0x3233
+#define GL_FLOAT 0x1406
+#define GL_FALSE 0
+#define GL_TRUE 1
+#define GL_TRIANGLES 0x0004
+#define GL_TEXTURE_2D 0x0DE1
+#define GL_RED 0x1903
+#define GL_GREEN 0x1904
+#define GL_BLUE 0x1905
+#define GL_ALPHA 0x1906
+#define GL_RG 0x8227
+#define GL_RGB 0x1907
+#define GL_RGBA 0x1908
+#define GL_RGBA8 0x8058
+#define GL_R8 0x8229
+#define GL_RG8 0x822B
+#define GL_R16 0x822A
+#define GL_RG16 0x822C
+#define GL_UNSIGNED_BYTE 0x1401
+#define GL_COLOR_BUFFER_BIT 0x00004000
+#define GL_TEXTURE_WRAP_S 0x2802
+#define GL_TEXTURE_WRAP_T 0x2803
+#define GL_TEXTURE_MAG_FILTER 0x2800
+#define GL_TEXTURE_MIN_FILTER 0x2801
+#define GL_TEXTURE_WIDTH 0x1000
+#define GL_TEXTURE_HEIGHT 0x1001
+#define GL_NEAREST 0x2600
+#define GL_CLAMP_TO_EDGE 0x812F
+#define GL_LINEAR 0x2601
+#define GL_FRAMEBUFFER 0x8D40
+#define GL_STREAM_DRAW 0x88E0
+#define GL_ARRAY_BUFFER 0x8892
+#define GL_BLEND 0x0BE2
+#define GL_SRC_ALPHA 0x0302
+#define GL_ONE_MINUS_SRC_ALPHA 0x0303
+#define GL_DEBUG_OUTPUT 0x92E0
+#define GL_SCISSOR_TEST 0x0C11
+#define GL_VENDOR 0x1F00
+#define GL_RENDERER 0x1F01
+#define GL_COMPILE_STATUS 0x8B81
+#define GL_INFO_LOG_LENGTH 0x8B84
+#define GL_FRAGMENT_SHADER 0x8B30
+#define GL_VERTEX_SHADER 0x8B31
+#define GL_COMPILE_STATUS 0x8B81
+#define GL_LINK_STATUS 0x8B82
+typedef unsigned int (*FUNC_eglExportDMABUFImageQueryMESA)(EGLDisplay dpy, EGLImageKHR image, int *fourcc, int *num_planes, uint64_t *modifiers);
+typedef unsigned int (*FUNC_eglExportDMABUFImageMESA)(EGLDisplay dpy, EGLImageKHR image, int *fds, int32_t *strides, int32_t *offsets);
+typedef void (*FUNC_glEGLImageTargetTexture2DOES)(unsigned int target, GLeglImageOES image);
+typedef GLXContext (*FUNC_glXCreateContextAttribsARB)(Display *dpy, GLXFBConfig config, GLXContext share_context, Bool direct, const int *attrib_list);
+typedef void (*FUNC_glXSwapIntervalEXT)(Display * dpy, GLXDrawable drawable, int interval);
+typedef int (*FUNC_glXSwapIntervalMESA)(unsigned int interval);
+typedef int (*FUNC_glXSwapIntervalSGI)(int interval);
+typedef void (*GLDEBUGPROC)(unsigned int source, unsigned int type, unsigned int id, unsigned int severity, int length, const char *message, const void *userParam);
+typedef int (*FUNC_eglQueryDisplayAttribEXT)(EGLDisplay dpy, int32_t attribute, intptr_t *value);
+typedef const char* (*FUNC_eglQueryDeviceStringEXT)(void *device, int32_t name);
+#define GSR_MAX_OUTPUTS 32
+typedef struct {
+ Display *dpy;
+ Window window;
+} gsr_x11;
+typedef struct {
+ uint32_t wl_name;
+ void *output;
+ vec2i pos;
+ vec2i size;
+ int32_t transform;
+ char *name;
+} gsr_wayland_output;
+typedef struct {
+ void *dpy;
+ void *window;
+ void *registry;
+ void *surface;
+ void *compositor;
+ gsr_wayland_output outputs[GSR_MAX_OUTPUTS];
+ int num_outputs;
+} gsr_wayland;
+typedef enum {
+} gsr_gl_context_type;
+typedef struct gsr_egl gsr_egl;
+struct gsr_egl {
+ void *egl_library;
+ void *glx_library;
+ void *gl_library;
+ gsr_gl_context_type context_type;
+ EGLDisplay egl_display;
+ EGLSurface egl_surface;
+ EGLContext egl_context;
+ const char *dri_card_path;
+ void *glx_context;
+ void *glx_fb_config;
+ gsr_gpu_info gpu_info;
+ gsr_x11 x11;
+ gsr_wayland wayland;
+ char card_path[128];
+ int32_t (*eglGetError)(void);
+ EGLDisplay (*eglGetDisplay)(EGLNativeDisplayType display_id);
+ unsigned int (*eglInitialize)(EGLDisplay dpy, int32_t *major, int32_t *minor);
+ unsigned int (*eglTerminate)(EGLDisplay dpy);
+ unsigned int (*eglChooseConfig)(EGLDisplay dpy, const int32_t *attrib_list, EGLConfig *configs, int32_t config_size, int32_t *num_config);
+ EGLSurface (*eglCreateWindowSurface)(EGLDisplay dpy, EGLConfig config, EGLNativeWindowType win, const int32_t *attrib_list);
+ EGLContext (*eglCreateContext)(EGLDisplay dpy, EGLConfig config, EGLContext share_context, const int32_t *attrib_list);
+ unsigned int (*eglMakeCurrent)(EGLDisplay dpy, EGLSurface draw, EGLSurface read, EGLContext ctx);
+ EGLImage (*eglCreateImage)(EGLDisplay dpy, EGLContext ctx, unsigned int target, EGLClientBuffer buffer, const intptr_t *attrib_list);
+ unsigned int (*eglDestroyContext)(EGLDisplay dpy, EGLContext ctx);
+ unsigned int (*eglDestroySurface)(EGLDisplay dpy, EGLSurface surface);
+ unsigned int (*eglDestroyImage)(EGLDisplay dpy, EGLImage image);
+ unsigned int (*eglSwapInterval)(EGLDisplay dpy, int32_t interval);
+ unsigned int (*eglSwapBuffers)(EGLDisplay dpy, EGLSurface surface);
+ unsigned int (*eglBindAPI)(unsigned int api);
+ __eglMustCastToProperFunctionPointerType (*eglGetProcAddress)(const char *procname);
+ FUNC_eglExportDMABUFImageQueryMESA eglExportDMABUFImageQueryMESA;
+ FUNC_eglExportDMABUFImageMESA eglExportDMABUFImageMESA;
+ FUNC_glEGLImageTargetTexture2DOES glEGLImageTargetTexture2DOES;
+ FUNC_eglQueryDisplayAttribEXT eglQueryDisplayAttribEXT;
+ FUNC_eglQueryDeviceStringEXT eglQueryDeviceStringEXT;
+ __GLXextFuncPtr (*glXGetProcAddress)(const unsigned char *procName);
+ GLXFBConfig* (*glXChooseFBConfig)(Display *dpy, int screen, const int *attribList, int *nitems);
+ Bool (*glXMakeContextCurrent)(Display *dpy, GLXDrawable draw, GLXDrawable read, GLXContext ctx);
+ // TODO: Remove
+ GLXContext (*glXCreateNewContext)(Display *dpy, GLXFBConfig config, int renderType, GLXContext shareList, Bool direct);
+ void (*glXDestroyContext)(Display *dpy, GLXContext ctx);
+ void (*glXSwapBuffers)(Display *dpy, GLXDrawable drawable);
+ FUNC_glXCreateContextAttribsARB glXCreateContextAttribsARB;
+ /* Optional */
+ FUNC_glXSwapIntervalEXT glXSwapIntervalEXT;
+ FUNC_glXSwapIntervalMESA glXSwapIntervalMESA;
+ FUNC_glXSwapIntervalSGI glXSwapIntervalSGI;
+ unsigned int (*glGetError)(void);
+ const unsigned char* (*glGetString)(unsigned int name);
+ void (*glFlush)(void);
+ void (*glFinish)(void);
+ void (*glClear)(unsigned int mask);
+ void (*glClearColor)(float red, float green, float blue, float alpha);
+ void (*glGenTextures)(int n, unsigned int *textures);
+ void (*glDeleteTextures)(int n, const unsigned int *texture);
+ void (*glBindTexture)(unsigned int target, unsigned int texture);
+ void (*glTexParameteri)(unsigned int target, unsigned int pname, int param);
+ void (*glTexParameteriv)(unsigned int target, unsigned int pname, const int *params);
+ void (*glGetTexLevelParameteriv)(unsigned int target, int level, unsigned int pname, int *params);
+ void (*glTexImage2D)(unsigned int target, int level, int internalFormat, int width, int height, int border, unsigned int format, unsigned int type, const void *pixels);
+ void (*glCopyImageSubData)(unsigned int srcName, unsigned int srcTarget, int srcLevel, int srcX, int srcY, int srcZ, unsigned int dstName, unsigned int dstTarget, int dstLevel, int dstX, int dstY, int dstZ, int srcWidth, int srcHeight, int srcDepth);
+ void (*glClearTexImage)(unsigned int texture, unsigned int level, unsigned int format, unsigned int type, const void *data);
+ void (*glGenFramebuffers)(int n, unsigned int *framebuffers);
+ void (*glBindFramebuffer)(unsigned int target, unsigned int framebuffer);
+ void (*glDeleteFramebuffers)(int n, const unsigned int *framebuffers);
+ void (*glViewport)(int x, int y, int width, int height);
+ void (*glFramebufferTexture2D)(unsigned int target, unsigned int attachment, unsigned int textarget, unsigned int texture, int level);
+ void (*glDrawBuffers)(int n, const unsigned int *bufs);
+ unsigned int (*glCheckFramebufferStatus)(unsigned int target);
+ void (*glBindBuffer)(unsigned int target, unsigned int buffer);
+ void (*glGenBuffers)(int n, unsigned int *buffers);
+ void (*glBufferData)(unsigned int target, khronos_ssize_t size, const void *data, unsigned int usage);
+ void (*glBufferSubData)(unsigned int target, khronos_intptr_t offset, khronos_ssize_t size, const void *data);
+ void (*glDeleteBuffers)(int n, const unsigned int *buffers);
+ void (*glGenVertexArrays)(int n, unsigned int *arrays);
+ void (*glBindVertexArray)(unsigned int array);
+ void (*glDeleteVertexArrays)(int n, const unsigned int *arrays);
+ unsigned int (*glCreateProgram)(void);
+ unsigned int (*glCreateShader)(unsigned int type);
+ void (*glAttachShader)(unsigned int program, unsigned int shader);
+ void (*glBindAttribLocation)(unsigned int program, unsigned int index, const char *name);
+ void (*glCompileShader)(unsigned int shader);
+ void (*glLinkProgram)(unsigned int program);
+ void (*glShaderSource)(unsigned int shader, int count, const char *const*string, const int *length);
+ void (*glUseProgram)(unsigned int program);
+ void (*glGetProgramInfoLog)(unsigned int program, int bufSize, int *length, char *infoLog);
+ void (*glGetShaderiv)(unsigned int shader, unsigned int pname, int *params);
+ void (*glGetShaderInfoLog)(unsigned int shader, int bufSize, int *length, char *infoLog);
+ void (*glDeleteProgram)(unsigned int program);
+ void (*glDeleteShader)(unsigned int shader);
+ void (*glGetProgramiv)(unsigned int program, unsigned int pname, int *params);
+ void (*glVertexAttribPointer)(unsigned int index, int size, unsigned int type, unsigned char normalized, int stride, const void *pointer);
+ void (*glEnableVertexAttribArray)(unsigned int index);
+ void (*glDrawArrays)(unsigned int mode, int first, int count);
+ void (*glEnable)(unsigned int cap);
+ void (*glDisable)(unsigned int cap);
+ void (*glBlendFunc)(unsigned int sfactor, unsigned int dfactor);
+ int (*glGetUniformLocation)(unsigned int program, const char *name);
+ void (*glUniform1f)(int location, float v0);
+ void (*glUniform2f)(int location, float v0, float v1);
+ void (*glDebugMessageCallback)(GLDEBUGPROC callback, const void *userParam);
+ void (*glScissor)(int x, int y, int width, int height);
+};
+bool gsr_egl_load(gsr_egl *self, Display *dpy, bool wayland, bool is_monitor_capture);
+void gsr_egl_unload(gsr_egl *self);
+void gsr_egl_update(gsr_egl *self);
+#endif /* GSR_EGL_H */
diff --git a/include/library_loader.h b/include/library_loader.h
new file mode 100644
index 0000000..47bc9f0
--- /dev/null
+++ b/include/library_loader.h
@@ -0,0 +1,17 @@
+#ifndef GSR_LIBRARY_LOADER_H
+#define GSR_LIBRARY_LOADER_H
+#include <stdbool.h>
+typedef struct {
+ void **func;
+ const char *name;
+} dlsym_assign;
+void* dlsym_print_fail(void *handle, const char *name, bool required);
+/* |dlsyms| should be null terminated */
+bool dlsym_load_list(void *handle, const dlsym_assign *dlsyms);
+/* |dlsyms| should be null terminated */
+void dlsym_load_list_optional(void *handle, const dlsym_assign *dlsyms);
+#endif /* GSR_LIBRARY_LOADER_H */
diff --git a/include/overclock.h b/include/overclock.h
new file mode 100644
index 0000000..d6ff901
--- /dev/null
+++ b/include/overclock.h
@@ -0,0 +1,17 @@
+#ifndef GSR_OVERCLOCK_H
+#define GSR_OVERCLOCK_H
+#include "xnvctrl.h"
+typedef struct {
+ gsr_xnvctrl xnvctrl;
+ int num_performance_levels;
+} gsr_overclock;
+bool gsr_overclock_load(gsr_overclock *self, Display *display);
+void gsr_overclock_unload(gsr_overclock *self);
+bool gsr_overclock_start(gsr_overclock *self);
+void gsr_overclock_stop(gsr_overclock *self);
+#endif /* GSR_OVERCLOCK_H */
diff --git a/include/shader.h b/include/shader.h
new file mode 100644
index 0000000..57d1096
--- /dev/null
+++ b/include/shader.h
@@ -0,0 +1,19 @@
+#ifndef GSR_SHADER_H
+#define GSR_SHADER_H
+typedef struct gsr_egl gsr_egl;
+typedef struct {
+ gsr_egl *egl;
+ unsigned int program_id;
+} gsr_shader;
+/* |vertex_shader| or |fragment_shader| may be NULL */
+int gsr_shader_init(gsr_shader *self, gsr_egl *egl, const char *vertex_shader, const char *fragment_shader);
+void gsr_shader_deinit(gsr_shader *self);
+int gsr_shader_bind_attribute_location(gsr_shader *self, const char *attribute, int location);
+void gsr_shader_use(gsr_shader *self);
+void gsr_shader_use_none(gsr_shader *self);
+#endif /* GSR_SHADER_H */
diff --git a/include/sound.hpp b/include/sound.hpp
index 710533e..77bec99 100644
--- a/include/sound.hpp
+++ b/include/sound.hpp
@@ -31,13 +31,23 @@ struct AudioInput {
std::string description;
+struct MergedAudioInputs {
+ std::vector<AudioInput> audio_inputs;
+};
+typedef enum {
+ S16,
+ S32,
+ F32
+} AudioFormat;
Get a sound device by name, returning the device into the @device parameter.
The device should be closed with @sound_device_close after it has been used
to clean up internal resources.
Returns 0 on success, or a negative value on failure.
-int sound_device_get_by_name(SoundDevice *device, const char *device_name, const char *description, unsigned int num_channels, unsigned int period_frame_size);
+int sound_device_get_by_name(SoundDevice *device, const char *device_name, const char *description, unsigned int num_channels, unsigned int period_frame_size, AudioFormat audio_format);
void sound_device_close(SoundDevice *device);
@@ -45,7 +55,7 @@ void sound_device_close(SoundDevice *device);
Returns the next chunk of audio into @buffer.
Returns the number of frames read, or a negative value on failure.
-int sound_device_read_next_chunk(SoundDevice *device, void **buffer);
+int sound_device_read_next_chunk(SoundDevice *device, void **buffer, double timeout_sec, double *latency_seconds);
std::vector<AudioInput> get_pulseaudio_inputs();
diff --git a/include/utils.h b/include/utils.h
new file mode 100644
index 0000000..c5d659a
--- /dev/null
+++ b/include/utils.h
@@ -0,0 +1,44 @@
+#ifndef GSR_UTILS_H
+#define GSR_UTILS_H
+#include "vec2.h"
+#include "../include/egl.h"
+#include "../include/defs.h"
+#include <stdbool.h>
+#include <stdint.h>
+#include <X11/extensions/Xrandr.h>
+typedef struct {
+ const char *name;
+ int name_len;
+ vec2i pos;
+ vec2i size;
+ XRRCrtcInfo *crt_info; /* Only on x11 */
+ uint32_t connector_id; /* Only on x11 and drm */
+ gsr_monitor_rotation rotation; /* Only on x11 and wayland */
+ uint32_t monitor_identifier; /* Only on drm and wayland */
+} gsr_monitor;
+typedef struct {
+ const char *name;
+ int name_len;
+ gsr_monitor *monitor;
+ bool found_monitor;
+} get_monitor_by_name_userdata;
+double clock_get_monotonic_seconds(void);
+typedef void (*active_monitor_callback)(const gsr_monitor *monitor, void *userdata);
+void for_each_active_monitor_output_x11(Display *display, active_monitor_callback callback, void *userdata);
+void for_each_active_monitor_output(const gsr_egl *egl, gsr_connection_type connection_type, active_monitor_callback callback, void *userdata);
+bool get_monitor_by_name(const gsr_egl *egl, gsr_connection_type connection_type, const char *name, gsr_monitor *monitor);
+gsr_monitor_rotation drm_monitor_get_display_server_rotation(const gsr_egl *egl, const gsr_monitor *monitor);
+bool gl_get_gpu_info(gsr_egl *egl, gsr_gpu_info *info);
+/* |output| should be at least 128 bytes in size */
+bool gsr_get_valid_card_path(gsr_egl *egl, char *output, bool is_monitor_capture);
+/* |render_path| should be at least 128 bytes in size */
+bool gsr_card_path_get_render_path(const char *card_path, char *render_path);
+#endif /* GSR_UTILS_H */
diff --git a/include/vec2.h b/include/vec2.h
new file mode 100644
index 0000000..3e33cfb
--- /dev/null
+++ b/include/vec2.h
@@ -0,0 +1,12 @@
+#ifndef VEC2_H
+#define VEC2_H
+typedef struct {
+ int x, y;
+} vec2i;
+typedef struct {
+ float x, y;
+} vec2f;
+#endif /* VEC2_H */
diff --git a/include/window_texture.h b/include/window_texture.h
new file mode 100644
index 0000000..75bb2a7
--- /dev/null
+++ b/include/window_texture.h
@@ -0,0 +1,27 @@
+#ifndef WINDOW_TEXTURE_H
+#define WINDOW_TEXTURE_H
+#include "egl.h"
+typedef struct {
+ Display *display;
+ Window window;
+ Pixmap pixmap;
+ unsigned int texture_id;
+ int redirected;
+ gsr_egl *egl;
+} WindowTexture;
+/* Returns 0 on success */
+int window_texture_init(WindowTexture *window_texture, Display *display, Window window, gsr_egl *egl);
+void window_texture_deinit(WindowTexture *self);
+/*
+ This should ONLY be called when the target window is resized.
+ Returns 0 on success.
+*/
+int window_texture_on_resize(WindowTexture *self);
+unsigned int window_texture_get_opengl_texture_id(WindowTexture *self);
+#endif /* WINDOW_TEXTURE_H */
diff --git a/include/xnvctrl.h b/include/xnvctrl.h
new file mode 100644
index 0000000..33fc442
--- /dev/null
+++ b/include/xnvctrl.h
@@ -0,0 +1,45 @@
+#ifndef GSR_XNVCTRL_H
+#define GSR_XNVCTRL_H
+#include <stdbool.h>
+#include <stdint.h>
+typedef struct _XDisplay Display;
+typedef struct {
+ int type;
+ union {
+ struct {
+ int64_t min;
+ int64_t max;
+ } range;
+ struct {
+ unsigned int ints;
+ } bits;
+ } u;
+ unsigned int permissions;
+} NVCTRLAttributeValidValuesRec;
+typedef struct {
+ Display *display;
+ void *library;
+ int (*XNVCTRLQueryExtension)(Display *dpy, int *event_basep, int *error_basep);
+ int (*XNVCTRLSetTargetAttributeAndGetStatus)(Display *dpy, int target_type, int target_id, unsigned int display_mask, unsigned int attribute, int value);
+ int (*XNVCTRLQueryValidTargetAttributeValues)(Display *dpy, int target_type, int target_id, unsigned int display_mask, unsigned int attribute, NVCTRLAttributeValidValuesRec *values);
+ int (*XNVCTRLQueryTargetStringAttribute)(Display *dpy, int target_type, int target_id, unsigned int display_mask, unsigned int attribute, char **ptr);
+} gsr_xnvctrl;
+bool gsr_xnvctrl_load(gsr_xnvctrl *self, Display *display);
+void gsr_xnvctrl_unload(gsr_xnvctrl *self);
+#endif /* GSR_XNVCTRL_H */
diff --git a/install.sh b/install.sh
index 1984c2c..ab921fa 100755
--- a/install.sh
+++ b/install.sh
@@ -1,11 +1,14 @@
+#!/bin/sh -e
script_dir=$(dirname "$0")
cd "$script_dir"
[ $(id -u) -ne 0 ] && echo "You need root privileges to run the install script" && exit 1
-install -Dm755 "gpu-screen-recorder" "/usr/local/bin/gpu-screen-recorder"
-install -Dm755 "gpu-screen-recorder" "/usr/bin/gpu-screen-recorder"
+echo "Warning: this install.sh script is deprecated. Use meson directly instead if possible"
+test -d build || meson setup build
+meson configure --prefix=/usr --buildtype=release -Dsystemd=true -Dstrip=true build
+ninja -C build install
echo "Successfully installed gpu-screen-recorder"
diff --git a/install_ubuntu.sh b/install_ubuntu.sh
deleted file mode 100755
index 6a3bb4a..0000000
--- a/install_ubuntu.sh
+++ /dev/null
@@ -1,14 +0,0 @@
-script_dir=$(dirname "$0")
-cd "$script_dir"
-[ $(id -u) -ne 0 ] && echo "You need root privileges to run the install script" && exit 1
-set -e
-apt-get -y install build-essential\
- libswresample-dev libavformat-dev libavcodec-dev libavutil-dev\
- libgl-dev libx11-dev libxcomposite-dev\
- libpulse-dev
diff --git a/kms/client/kms_client.c b/kms/client/kms_client.c
new file mode 100644
index 0000000..869bf81
--- /dev/null
+++ b/kms/client/kms_client.c
@@ -0,0 +1,441 @@
+#include "kms_client.h"
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <unistd.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <fcntl.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/wait.h>
+#include <sys/stat.h>
+#include <sys/capability.h>
+static void cleanup_socket(gsr_kms_client *self, bool kill_server);
+static int gsr_kms_client_replace_connection(gsr_kms_client *self);
+static bool generate_random_characters(char *buffer, int buffer_size, const char *alphabet, size_t alphabet_size) {
+ int fd = open("/dev/urandom", O_RDONLY);
+ if(fd == -1) {
+ perror("/dev/urandom");
+ return false;
+ }
+ if(read(fd, buffer, buffer_size) < buffer_size) {
+ fprintf(stderr, "Failed to read %d bytes from /dev/urandom\n", buffer_size);
+ close(fd);
+ return false;
+ }
+ for(int i = 0; i < buffer_size; ++i) {
+ unsigned char c = *(unsigned char*)&buffer[i];
+ buffer[i] = alphabet[c % alphabet_size];
+ }
+ close(fd);
+ return true;
+}
+static void close_fds(gsr_kms_response *response) {
+ for(int i = 0; i < response->num_fds; ++i) {
+ if(response->fds[i].fd > 0)
+ close(response->fds[i].fd);
+ response->fds[i].fd = 0;
+ }
+}
+static int send_msg_to_server(int server_fd, gsr_kms_request *request) {
+ struct iovec iov;
+ iov.iov_base = request;
+ iov.iov_len = sizeof(*request);
+ struct msghdr response_message = {0};
+ response_message.msg_iov = &iov;
+ response_message.msg_iovlen = 1;
+ char cmsgbuf[CMSG_SPACE(sizeof(int) * 1)];
+ memset(cmsgbuf, 0, sizeof(cmsgbuf));
+ if(request->new_connection_fd > 0) {
+ response_message.msg_control = cmsgbuf;
+ response_message.msg_controllen = sizeof(cmsgbuf);
+ struct cmsghdr *cmsg = CMSG_FIRSTHDR(&response_message);
+ cmsg->cmsg_level = SOL_SOCKET;
+ cmsg->cmsg_type = SCM_RIGHTS;
+ cmsg->cmsg_len = CMSG_LEN(sizeof(int) * 1);
+ int *fds = (int*)CMSG_DATA(cmsg);
+ fds[0] = request->new_connection_fd;
+ response_message.msg_controllen = cmsg->cmsg_len;
+ }
+ return sendmsg(server_fd, &response_message, 0);
+}
+static int recv_msg_from_server(int server_pid, int server_fd, gsr_kms_response *response) {
+ struct iovec iov;
+ iov.iov_base = response;
+ iov.iov_len = sizeof(*response);
+ struct msghdr response_message = {0};
+ response_message.msg_iov = &iov;
+ response_message.msg_iovlen = 1;
+ char cmsgbuf[CMSG_SPACE(sizeof(int) * GSR_KMS_MAX_PLANES)];
+ memset(cmsgbuf, 0, sizeof(cmsgbuf));
+ response_message.msg_control = cmsgbuf;
+ response_message.msg_controllen = sizeof(cmsgbuf);
+ int res = 0;
+ for(;;) {
+ res = recvmsg(server_fd, &response_message, MSG_DONTWAIT);
+ if(res <= 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
+ // If we are replacing the connection and closing the application at the same time
+ // then recvmsg can get stuck (because the server died), so we prevent that by doing
+ // non-blocking recvmsg and checking if the server died
+ int status = 0;
+ int wait_result = waitpid(server_pid, &status, WNOHANG);
+ if(wait_result != 0) {
+ res = -1;
+ break;
+ }
+ usleep(1000);
+ } else {
+ break;
+ }
+ }
+ if(res > 0 && response->num_fds > 0) {
+ struct cmsghdr *cmsg = CMSG_FIRSTHDR(&response_message);
+ if(cmsg) {
+ int *fds = (int*)CMSG_DATA(cmsg);
+ for(int i = 0; i < response->num_fds; ++i) {
+ response->fds[i].fd = fds[i];
+ }
+ } else {
+ close_fds(response);
+ }
+ }
+ return res;
+}
+/* We have to use $HOME because in flatpak there is no simple path that is accessible, read and write, that multiple flatpak instances can access */
+static bool create_socket_path(char *output_path, size_t output_path_size) {
+ const char *home = getenv("HOME");
+ if(!home)
+ home = "/tmp";
+ char random_characters[11];
+ random_characters[10] = '\0';
+ if(!generate_random_characters(random_characters, 10, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789", 62))
+ return false;
+ snprintf(output_path, output_path_size, "%s/.gsr-kms-socket-%s", home, random_characters);
+ return true;
+}
+static void string_copy(char *dst, const char *src, int len) {
+ int src_len = strlen(src);
+ int min_len = src_len;
+ if(len - 1 < min_len)
+ min_len = len - 1;
+ memcpy(dst, src, min_len);
+ dst[min_len] = '\0';
+}
+static bool find_program_in_path(const char *program_name, char *filepath, int filepath_len) {
+ const char *path = getenv("PATH");
+ if(!path)
+ return false;
+ int program_name_len = strlen(program_name);
+ const char *end = path + strlen(path);
+ while(path != end) {
+ const char *part_end = strchr(path, ':');
+ const char *next = part_end;
+ if(part_end) {
+ next = part_end + 1;
+ } else {
+ part_end = end;
+ next = end;
+ }
+ int len = part_end - path;
+ if(len + 1 + program_name_len < filepath_len) {
+ memcpy(filepath, path, len);
+ filepath[len] = '/';
+ memcpy(filepath + len + 1, program_name, program_name_len);
+ filepath[len + 1 + program_name_len] = '\0';
+ if(access(filepath, F_OK) == 0)
+ return true;
+ }
+ path = next;
+ }
+ return false;
+int gsr_kms_client_init(gsr_kms_client *self, const char *card_path) {
+ int result = -1;
+ self->kms_server_pid = -1;
+ self->initial_socket_fd = -1;
+ self->initial_client_fd = -1;
+ self->initial_socket_path[0] = '\0';
+ self->socket_pair[0] = -1;
+ self->socket_pair[1] = -1;
+ struct sockaddr_un local_addr = {0};
+ struct sockaddr_un remote_addr = {0};
+ if(!create_socket_path(self->initial_socket_path, sizeof(self->initial_socket_path))) {
+ fprintf(stderr, "gsr error: gsr_kms_client_init: failed to create path to kms socket\n");
+ return -1;
+ }
+ char server_filepath[PATH_MAX];
+ if(!find_program_in_path("gsr-kms-server", server_filepath, sizeof(server_filepath))) {
+ fprintf(stderr, "gsr error: gsr_kms_client_init: gsr-kms-server is not installed\n");
+ return -1;
+ }
+ const bool inside_flatpak = getenv("FLATPAK_ID") != NULL;
+ const char *home = getenv("HOME");
+ if(!home)
+ home = "/tmp";
+ bool has_perm = false;
+ if(geteuid() == 0) {
+ has_perm = true;
+ } else {
+ cap_t kms_server_cap = cap_get_file(server_filepath);
+ if(kms_server_cap) {
+ cap_flag_value_t res = CAP_CLEAR;
+ cap_get_flag(kms_server_cap, CAP_SYS_ADMIN, CAP_PERMITTED, &res);
+ if(res == CAP_SET) {
+ //fprintf(stderr, "has permission!\n");
+ has_perm = true;
+ } else {
+ //fprintf(stderr, "No permission:(\n");
+ }
+ cap_free(kms_server_cap);
+ } else if(!inside_flatpak) {
+ if(errno == ENODATA)
+ fprintf(stderr, "gsr info: gsr_kms_client_init: gsr-kms-server is missing sys_admin cap and will require root authentication. To bypass this automatically, run: sudo setcap cap_sys_admin+ep '%s'\n", server_filepath);
+ else
+ fprintf(stderr, "gsr info: gsr_kms_client_init: failed to get cap\n");
+ }
+ }
+ if(socketpair(AF_UNIX, SOCK_STREAM, 0, self->socket_pair) == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_init: socketpair failed, error: %s\n", strerror(errno));
+ goto err;
+ }
+ self->initial_socket_fd = socket(AF_UNIX, SOCK_STREAM, 0);
+ if(self->initial_socket_fd == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_init: socket failed, error: %s\n", strerror(errno));
+ goto err;
+ }
+ local_addr.sun_family = AF_UNIX;
+ string_copy(local_addr.sun_path, self->initial_socket_path, sizeof(local_addr.sun_path));
+ const mode_t prev_mask = umask(0000);
+ const int bind_res = bind(self->initial_socket_fd, (struct sockaddr*)&local_addr, sizeof(local_addr.sun_family) + strlen(local_addr.sun_path));
+ umask(prev_mask);
+ if(bind_res == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_init: failed to bind socket, error: %s\n", strerror(errno));
+ goto err;
+ }
+ if(listen(self->initial_socket_fd, 1) == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_init: failed to listen on socket, error: %s\n", strerror(errno));
+ goto err;
+ }
+ pid_t pid = fork();
+ if(pid == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_init: fork failed, error: %s\n", strerror(errno));
+ goto err;
+ } else if(pid == 0) { /* child */
+ if(inside_flatpak) {
+ const char *args[] = { "flatpak-spawn", "--host", "/var/lib/flatpak/app/com.dec05eba.gpu_screen_recorder/current/active/files/bin/kms-server-proxy", self->initial_socket_path, card_path, home, NULL };
+ execvp(args[0], (char *const*)args);
+ } else if(has_perm) {
+ const char *args[] = { server_filepath, self->initial_socket_path, card_path, NULL };
+ execvp(args[0], (char *const*)args);
+ } else {
+ const char *args[] = { "pkexec", server_filepath, self->initial_socket_path, card_path, NULL };
+ execvp(args[0], (char *const*)args);
+ }
+ fprintf(stderr, "gsr error: gsr_kms_client_init: execvp failed, error: %s\n", strerror(errno));
+ _exit(127);
+ } else { /* parent */
+ self->kms_server_pid = pid;
+ }
+ fprintf(stderr, "gsr info: gsr_kms_client_init: waiting for server to connect\n");
+ for(;;) {
+ struct timeval tv;
+ fd_set rfds;
+ FD_ZERO(&rfds);
+ FD_SET(self->initial_socket_fd, &rfds);
+ tv.tv_sec = 0;
+ tv.tv_usec = 100 * 1000; // 100 ms
+ int select_res = select(1 + self->initial_socket_fd, &rfds, NULL, NULL, &tv);
+ if(select_res > 0) {
+ socklen_t sock_len = sizeof(remote_addr);
+ self->initial_client_fd = accept(self->initial_socket_fd, (struct sockaddr*)&remote_addr, &sock_len);
+ if(self->initial_client_fd == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_init: accept failed on socket, error: %s\n", strerror(errno));
+ goto err;
+ }
+ break;
+ } else {
+ int status = 0;
+ int wait_result = waitpid(self->kms_server_pid, &status, WNOHANG);
+ if(wait_result != 0) {
+ int exit_code = -1;
+ if(WIFEXITED(status))
+ exit_code = WEXITSTATUS(status);
+ fprintf(stderr, "gsr error: gsr_kms_client_init: kms server died or never started, exit code: %d\n", exit_code);
+ self->kms_server_pid = -1;
+ if(exit_code != 0)
+ result = exit_code;
+ goto err;
+ }
+ }
+ }
+ fprintf(stderr, "gsr info: gsr_kms_client_init: server connected\n");
+ fprintf(stderr, "gsr info: replacing file-backed unix domain socket with socketpair\n");
+ if(gsr_kms_client_replace_connection(self) != 0)
+ goto err;
+ cleanup_socket(self, false);
+ fprintf(stderr, "gsr info: using socketpair\n");
+ return 0;
+ err:
+ gsr_kms_client_deinit(self);
+ return result;
+void cleanup_socket(gsr_kms_client *self, bool kill_server) {
+ if(self->initial_client_fd != -1) {
+ close(self->initial_client_fd);
+ self->initial_client_fd = -1;
+ }
+ if(self->initial_socket_fd != -1) {
+ close(self->initial_socket_fd);
+ self->initial_socket_fd = -1;
+ }
+ if(kill_server) {
+ for(int i = 0; i < 2; ++i) {
+ if(self->socket_pair[i] > 0) {
+ close(self->socket_pair[i]);
+ self->socket_pair[i] = -1;
+ }
+ }
+ }
+ if(kill_server && self->kms_server_pid != -1) {
+ kill(self->kms_server_pid, SIGKILL);
+ //int status;
+ //waitpid(self->kms_server_pid, &status, 0);
+ self->kms_server_pid = -1;
+ }
+ if(self->initial_socket_path[0] != '\0') {
+ remove(self->initial_socket_path);
+ self->initial_socket_path[0] = '\0';
+ }
+void gsr_kms_client_deinit(gsr_kms_client *self) {
+ cleanup_socket(self, true);
+int gsr_kms_client_replace_connection(gsr_kms_client *self) {
+ gsr_kms_response response;
+ response.version = 0;
+ response.result = KMS_RESULT_FAILED_TO_SEND;
+ response.err_msg[0] = '\0';
+ gsr_kms_request request;
+ request.version = GSR_KMS_PROTOCOL_VERSION;
+ request.new_connection_fd = self->socket_pair[GSR_SOCKET_PAIR_REMOTE];
+ if(send_msg_to_server(self->initial_client_fd, &request) == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_replace_connection: failed to send request message to server\n");
+ return -1;
+ }
+ const int recv_res = recv_msg_from_server(self->kms_server_pid, self->socket_pair[GSR_SOCKET_PAIR_LOCAL], &response);
+ if(recv_res == 0) {
+ fprintf(stderr, "gsr warning: gsr_kms_client_replace_connection: kms server shut down\n");
+ return -1;
+ } else if(recv_res == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_replace_connection: failed to receive response\n");
+ return -1;
+ }
+ if(response.version != GSR_KMS_PROTOCOL_VERSION) {
+ fprintf(stderr, "gsr error: gsr_kms_client_replace_connection: expected gsr-kms-server protocol version to be %u, but it's %u\n", GSR_KMS_PROTOCOL_VERSION, response.version);
+ /*close_fds(response);*/
+ return -1;
+ }
+ return 0;
+int gsr_kms_client_get_kms(gsr_kms_client *self, gsr_kms_response *response) {
+ response->version = 0;
+ response->result = KMS_RESULT_FAILED_TO_SEND;
+ response->err_msg[0] = '\0';
+ gsr_kms_request request;
+ request.version = GSR_KMS_PROTOCOL_VERSION;
+ request.type = KMS_REQUEST_TYPE_GET_KMS;
+ request.new_connection_fd = 0;
+ if(send_msg_to_server(self->socket_pair[GSR_SOCKET_PAIR_LOCAL], &request) == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_get_kms: failed to send request message to server\n");
+ strcpy(response->err_msg, "failed to send");
+ return -1;
+ }
+ const int recv_res = recv_msg_from_server(self->kms_server_pid, self->socket_pair[GSR_SOCKET_PAIR_LOCAL], response);
+ if(recv_res == 0) {
+ fprintf(stderr, "gsr warning: gsr_kms_client_get_kms: kms server shut down\n");
+ strcpy(response->err_msg, "failed to receive");
+ return -1;
+ } else if(recv_res == -1) {
+ fprintf(stderr, "gsr error: gsr_kms_client_get_kms: failed to receive response\n");
+ strcpy(response->err_msg, "failed to receive");
+ return -1;
+ }
+ if(response->version != GSR_KMS_PROTOCOL_VERSION) {
+ fprintf(stderr, "gsr error: gsr_kms_client_get_kms: expected gsr-kms-server protocol version to be %u, but it's %u\n", GSR_KMS_PROTOCOL_VERSION, response->version);
+ /*close_fds(response);*/
+ strcpy(response->err_msg, "mismatching protocol version");
+ return -1;
+ }
+ return 0;
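Note on the fd-passing pattern used by kms_client.c above: the client and server exchange file descriptors as SCM_RIGHTS ancillary data over a unix domain socket. The following is a minimal standalone sketch of that technique, not the code from this commit. The helper names send_fd and recv_fd and the one-byte payload are assumptions made for illustration; the real code sends a gsr_kms_request/gsr_kms_response struct as the payload and can carry several fds at once.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not part of the commit): send |fd| over the unix
   socket |sock| as SCM_RIGHTS ancillary data with a one-byte payload.
   Returns 0 on success, -1 on failure. */
static int send_fd(int sock, int fd) {
    char payload = 'F';
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };

    /* The union keeps the control buffer aligned for struct cmsghdr */
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } control;
    memset(&control, 0, sizeof(control));

    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper (not part of the commit): receive one fd sent by
   send_fd. Returns the new fd, or -1 if no valid SCM_RIGHTS message arrived. */
static int recv_fd(int sock) {
    char payload = 0;
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };

    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } control;
    memset(&control, 0, sizeof(control));

    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if(recvmsg(sock, &msg, 0) <= 0)
        return -1;

    /* Validate the control message before trusting its contents */
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if(!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS
        || cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd = -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor is a new number in the receiving process that refers to the same open file description as the sender's, so the receiver owns it and has to close it when done.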
diff --git a/kms/client/kms_client.h b/kms/client/kms_client.h
new file mode 100644
index 0000000..2d18848
--- /dev/null
+++ b/kms/client/kms_client.h
@@ -0,0 +1,24 @@
+#include "../kms_shared.h"
+#include <sys/types.h>
+#include <limits.h>
+typedef struct gsr_kms_client gsr_kms_client;
+struct gsr_kms_client {
+ pid_t kms_server_pid;
+ int initial_socket_fd;
+ int initial_client_fd;
+ char initial_socket_path[PATH_MAX];
+ int socket_pair[2];
+/* |card_path| should be a path to card, for example /dev/dri/card0 */
+int gsr_kms_client_init(gsr_kms_client *self, const char *card_path);
+void gsr_kms_client_deinit(gsr_kms_client *self);
+int gsr_kms_client_get_kms(gsr_kms_client *self, gsr_kms_response *response);
+#endif /* #define GSR_KMS_CLIENT_H */
diff --git a/kms/kms_shared.h b/kms/kms_shared.h
new file mode 100644
index 0000000..4fa9c38
--- /dev/null
+++ b/kms/kms_shared.h
@@ -0,0 +1,60 @@
+#include <stdint.h>
+#include <stdbool.h>
+#include <drm_mode.h>
+#define GSR_KMS_MAX_PLANES 10
+typedef struct gsr_kms_response_fd gsr_kms_response_fd;
+typedef struct gsr_kms_response gsr_kms_response;
+typedef enum {
+} gsr_kms_request_type;
+typedef enum {
+} gsr_kms_result;
+typedef struct {
+ uint32_t version; /* GSR_KMS_PROTOCOL_VERSION */
+ int type; /* gsr_kms_request_type */
+ int new_connection_fd;
+} gsr_kms_request;
+struct gsr_kms_response_fd {
+ int fd;
+ uint32_t width;
+ uint32_t height;
+ uint32_t pitch;
+ uint32_t offset;
+ uint32_t pixel_format;
+ uint64_t modifier;
+ uint32_t connector_id; /* 0 if unknown */
+ bool is_combined_plane;
+ bool is_cursor;
+ bool has_hdr_metadata;
+ int x;
+ int y;
+ int src_w;
+ int src_h;
+ struct hdr_output_metadata hdr_metadata;
+struct gsr_kms_response {
+ uint32_t version; /* GSR_KMS_PROTOCOL_VERSION */
+ int result; /* gsr_kms_result */
+ char err_msg[128];
+ gsr_kms_response_fd fds[GSR_KMS_MAX_PLANES];
+ int num_fds;
+#endif /* #define GSR_KMS_SHARED_H */
diff --git a/kms/server/.gitignore b/kms/server/.gitignore
new file mode 100644
index 0000000..97420ef
--- /dev/null
+++ b/kms/server/.gitignore
@@ -0,0 +1 @@
diff --git a/kms/server/kms_server.c b/kms/server/kms_server.c
new file mode 100644
index 0000000..2eaa1ed
--- /dev/null
+++ b/kms/server/kms_server.c
@@ -0,0 +1,572 @@
+#include "../kms_shared.h"
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <time.h>
+#include <xf86drm.h>
+#include <xf86drmMode.h>
+#include <drm_mode.h>
+#define MAX_CONNECTORS 32
+typedef struct {
+ int drmfd;
+ drmModePlaneResPtr planes;
+} gsr_drm;
+typedef struct {
+ uint32_t connector_id;
+ uint64_t crtc_id;
+ uint64_t hdr_metadata_blob_id;
+} connector_crtc_pair;
+typedef struct {
+ connector_crtc_pair maps[MAX_CONNECTORS];
+ int num_maps;
+} connector_to_crtc_map;
+static int max_int(int a, int b) {
+ return a > b ? a : b;
+static int send_msg_to_client(int client_fd, gsr_kms_response *response) {
+ struct iovec iov;
+ iov.iov_base = response;
+ iov.iov_len = sizeof(*response);
+ struct msghdr response_message = {0};
+ response_message.msg_iov = &iov;
+ response_message.msg_iovlen = 1;
+ char cmsgbuf[CMSG_SPACE(sizeof(int) * max_int(1, response->num_fds))];
+ memset(cmsgbuf, 0, sizeof(cmsgbuf));
+ if(response->num_fds > 0) {
+ response_message.msg_control = cmsgbuf;
+ response_message.msg_controllen = sizeof(cmsgbuf);
+ struct cmsghdr *cmsg = CMSG_FIRSTHDR(&response_message);
+ cmsg->cmsg_level = SOL_SOCKET;
+ cmsg->cmsg_type = SCM_RIGHTS;
+ cmsg->cmsg_len = CMSG_LEN(sizeof(int) * response->num_fds);
+ int *fds = (int*)CMSG_DATA(cmsg);
+ for(int i = 0; i < response->num_fds; ++i) {
+ fds[i] = response->fds[i].fd;
+ }
+ response_message.msg_controllen = cmsg->cmsg_len;
+ }
+ return sendmsg(client_fd, &response_message, 0);
+static int recv_msg_from_client(int client_fd, gsr_kms_request *request) {
+ struct iovec iov;
+ iov.iov_base = request;
+ iov.iov_len = sizeof(*request);
+ struct msghdr response_message = {0};
+ response_message.msg_iov = &iov;
+ response_message.msg_iovlen = 1;
+ char cmsgbuf[CMSG_SPACE(sizeof(int) * 1)];
+ memset(cmsgbuf, 0, sizeof(cmsgbuf));
+ response_message.msg_control = cmsgbuf;
+ response_message.msg_controllen = sizeof(cmsgbuf);
+ int res = recvmsg(client_fd, &response_message, MSG_WAITALL);
+ if(res <= 0)
+ return res;
+ if(request->new_connection_fd > 0) {
+ struct cmsghdr *cmsg = CMSG_FIRSTHDR(&response_message);
+ if(cmsg) {
+ int *fds = (int*)CMSG_DATA(cmsg);
+ request->new_connection_fd = fds[0];
+ } else {
+ if(request->new_connection_fd > 0) {
+ close(request->new_connection_fd);
+ request->new_connection_fd = 0;
+ }
+ }
+ }
+ return res;
+static bool connector_get_property_by_name(int drmfd, drmModeConnectorPtr props, const char *name, uint64_t *result) {
+ for(int i = 0; i < props->count_props; ++i) {
+ drmModePropertyPtr prop = drmModeGetProperty(drmfd, props->props[i]);
+ if(prop) {
+ if(strcmp(name, prop->name) == 0) {
+ *result = props->prop_values[i];
+ drmModeFreeProperty(prop);
+ return true;
+ }
+ drmModeFreeProperty(prop);
+ }
+ }
+ return false;
+typedef enum {
+ PLANE_PROPERTY_X = 1 << 0,
+ PLANE_PROPERTY_Y = 1 << 1,
+} plane_property_mask;
+/* Returns plane_property_mask */
+static uint32_t plane_get_properties(int drmfd, uint32_t plane_id, int *x, int *y, int *src_x, int *src_y, int *src_w, int *src_h) {
+ *x = 0;
+ *y = 0;
+ *src_x = 0;
+ *src_y = 0;
+ *src_w = 0;
+ *src_h = 0;
+ plane_property_mask property_mask = 0;
+ drmModeObjectPropertiesPtr props = drmModeObjectGetProperties(drmfd, plane_id, DRM_MODE_OBJECT_PLANE);
+ if(!props)
+ return property_mask;
+ // TODO: Don't do this every frame
+ for(uint32_t i = 0; i < props->count_props; ++i) {
+ drmModePropertyPtr prop = drmModeGetProperty(drmfd, props->props[i]);
+ if(!prop)
+ continue;
+ // SRC_* values are 16.16 fixed-point numbers
+ const uint32_t type = prop->flags & (DRM_MODE_PROP_LEGACY_TYPE | DRM_MODE_PROP_EXTENDED_TYPE);
+ if((type & DRM_MODE_PROP_SIGNED_RANGE) && strcmp(prop->name, "CRTC_X") == 0) {
+ *x = (int)props->prop_values[i];
+ property_mask |= PLANE_PROPERTY_X;
+ } else if((type & DRM_MODE_PROP_SIGNED_RANGE) && strcmp(prop->name, "CRTC_Y") == 0) {
+ *y = (int)props->prop_values[i];
+ property_mask |= PLANE_PROPERTY_Y;
+ } else if((type & DRM_MODE_PROP_RANGE) && strcmp(prop->name, "SRC_X") == 0) {
+ *src_x = (int)(props->prop_values[i] >> 16);
+ property_mask |= PLANE_PROPERTY_SRC_X;
+ } else if((type & DRM_MODE_PROP_RANGE) && strcmp(prop->name, "SRC_Y") == 0) {
+ *src_y = (int)(props->prop_values[i] >> 16);
+ property_mask |= PLANE_PROPERTY_SRC_Y;
+ } else if((type & DRM_MODE_PROP_RANGE) && strcmp(prop->name, "SRC_W") == 0) {
+ *src_w = (int)(props->prop_values[i] >> 16);
+ property_mask |= PLANE_PROPERTY_SRC_W;
+ } else if((type & DRM_MODE_PROP_RANGE) && strcmp(prop->name, "SRC_H") == 0) {
+ *src_h = (int)(props->prop_values[i] >> 16);
+ property_mask |= PLANE_PROPERTY_SRC_H;
+ } else if((type & DRM_MODE_PROP_ENUM) && strcmp(prop->name, "type") == 0) {
+ const uint64_t current_enum_value = props->prop_values[i];
+ for(int j = 0; j < prop->count_enums; ++j) {
+ if(prop->enums[j].value == current_enum_value && strcmp(prop->enums[j].name, "Primary") == 0) {
+ property_mask |= PLANE_PROPERTY_IS_PRIMARY;
+ break;
+ } else if(prop->enums[j].value == current_enum_value && strcmp(prop->enums[j].name, "Cursor") == 0) {
+ property_mask |= PLANE_PROPERTY_IS_CURSOR;
+ break;
+ }
+ }
+ }
+ drmModeFreeProperty(prop);
+ }
+ drmModeFreeObjectProperties(props);
+ return property_mask;
+/* Returns NULL if not found */
+static const connector_crtc_pair* get_connector_pair_by_crtc_id(const connector_to_crtc_map *c2crtc_map, uint32_t crtc_id) {
+ for(int i = 0; i < c2crtc_map->num_maps; ++i) {
+ if(c2crtc_map->maps[i].crtc_id == crtc_id)
+ return &c2crtc_map->maps[i];
+ }
+ return NULL;
+static void map_crtc_to_connector_ids(gsr_drm *drm, connector_to_crtc_map *c2crtc_map) {
+ c2crtc_map->num_maps = 0;
+ drmModeResPtr resources = drmModeGetResources(drm->drmfd);
+ if(!resources)
+ return;
+ for(int i = 0; i < resources->count_connectors && c2crtc_map->num_maps < MAX_CONNECTORS; ++i) {
+ drmModeConnectorPtr connector = drmModeGetConnectorCurrent(drm->drmfd, resources->connectors[i]);
+ if(!connector)
+ continue;
+ uint64_t crtc_id = 0;
+ connector_get_property_by_name(drm->drmfd, connector, "CRTC_ID", &crtc_id);
+ uint64_t hdr_output_metadata_blob_id = 0;
+ connector_get_property_by_name(drm->drmfd, connector, "HDR_OUTPUT_METADATA", &hdr_output_metadata_blob_id);
+ c2crtc_map->maps[c2crtc_map->num_maps].connector_id = connector->connector_id;
+ c2crtc_map->maps[c2crtc_map->num_maps].crtc_id = crtc_id;
+ c2crtc_map->maps[c2crtc_map->num_maps].hdr_metadata_blob_id = hdr_output_metadata_blob_id;
+ ++c2crtc_map->num_maps;
+ drmModeFreeConnector(connector);
+ }
+ drmModeFreeResources(resources);
+static void drm_mode_cleanup_handles(int drmfd, drmModeFB2Ptr drmfb) {
+ for(int i = 0; i < 4; ++i) {
+ if(!drmfb->handles[i])
+ continue;
+ bool already_closed = false;
+ for(int j = 0; j < i; ++j) {
+ if(drmfb->handles[i] == drmfb->handles[j]) {
+ already_closed = true;
+ break;
+ }
+ }
+ if(already_closed)
+ continue;
+ drmCloseBufferHandle(drmfd, drmfb->handles[i]);
+ }
+static bool get_hdr_metadata(int drm_fd, uint64_t hdr_metadata_blob_id, struct hdr_output_metadata *hdr_metadata) {
+ drmModePropertyBlobPtr hdr_metadata_blob = drmModeGetPropertyBlob(drm_fd, hdr_metadata_blob_id);
+ if(!hdr_metadata_blob)
+ return false;
+ if(hdr_metadata_blob->length >= sizeof(struct hdr_output_metadata))
+ *hdr_metadata = *(struct hdr_output_metadata*)hdr_metadata_blob->data;
+ drmModeFreePropertyBlob(hdr_metadata_blob);
+ return true;
+static int kms_get_fb(gsr_drm *drm, gsr_kms_response *response, connector_to_crtc_map *c2crtc_map) {
+ int result = -1;
+ response->result = KMS_RESULT_OK;
+ response->err_msg[0] = '\0';
+ response->num_fds = 0;
+ for(uint32_t i = 0; i < drm->planes->count_planes && response->num_fds < GSR_KMS_MAX_PLANES; ++i) {
+ drmModePlanePtr plane = NULL;
+ drmModeFB2Ptr drmfb = NULL;
+ plane = drmModeGetPlane(drm->drmfd, drm->planes->planes[i]);
+ if(!plane) {
+ response->result = KMS_RESULT_FAILED_TO_GET_PLANE;
+ snprintf(response->err_msg, sizeof(response->err_msg), "failed to get drm plane with id %u, error: %s", drm->planes->planes[i], strerror(errno));
+ fprintf(stderr, "kms server error: %s\n", response->err_msg);
+ goto next;
+ }
+ if(!plane->fb_id)
+ goto next;
+ drmfb = drmModeGetFB2(drm->drmfd, plane->fb_id);
+ if(!drmfb) {
+ // Commented out for now because we get here if the cursor is moved to another monitor and we don't care about the cursor
+ //response->result = KMS_RESULT_FAILED_TO_GET_PLANE;
+ //snprintf(response->err_msg, sizeof(response->err_msg), "drmModeGetFB2 failed, error: %s", strerror(errno));
+ //fprintf(stderr, "kms server error: %s\n", response->err_msg);
+ goto next;
+ }
+ if(!drmfb->handles[0]) {
+ response->result = KMS_RESULT_FAILED_TO_GET_PLANE;
+ snprintf(response->err_msg, sizeof(response->err_msg), "drmfb handle is NULL");
+ fprintf(stderr, "kms server error: %s\n", response->err_msg);
+ goto cleanup_handles;
+ }
+ // TODO: Check if dimensions have changed by comparing width and height to previous time this was called.
+ // TODO: Support other plane formats than rgb (with multiple planes, such as direct YUV420 on wayland).
+ int fb_fd = -1;
+ const int ret = drmPrimeHandleToFD(drm->drmfd, drmfb->handles[0], O_RDONLY, &fb_fd);
+ if(ret != 0 || fb_fd == -1) {
+ response->result = KMS_RESULT_FAILED_TO_GET_PLANE;
+ snprintf(response->err_msg, sizeof(response->err_msg), "failed to get fd from drm handle, error: %s", strerror(errno));
+ fprintf(stderr, "kms server error: %s\n", response->err_msg);
+ goto cleanup_handles;
+ }
+ const int fd_index = response->num_fds;
+ int x = 0, y = 0, src_x = 0, src_y = 0, src_w = 0, src_h = 0;
+ plane_property_mask property_mask = plane_get_properties(drm->drmfd, plane->plane_id, &x, &y, &src_x, &src_y, &src_w, &src_h);
+ if((property_mask & PLANE_PROPERTY_IS_PRIMARY) || (property_mask & PLANE_PROPERTY_IS_CURSOR)) {
+ const connector_crtc_pair *crtc_pair = get_connector_pair_by_crtc_id(c2crtc_map, plane->crtc_id);
+ if(crtc_pair && crtc_pair->hdr_metadata_blob_id) {
+ response->fds[fd_index].has_hdr_metadata = get_hdr_metadata(drm->drmfd, crtc_pair->hdr_metadata_blob_id, &response->fds[fd_index].hdr_metadata);
+ } else {
+ response->fds[fd_index].has_hdr_metadata = false;
+ }
+ response->fds[fd_index].fd = fb_fd;
+ response->fds[fd_index].width = drmfb->width;
+ response->fds[fd_index].height = drmfb->height;
+ response->fds[fd_index].pitch = drmfb->pitches[0];
+ response->fds[fd_index].offset = drmfb->offsets[0];
+ response->fds[fd_index].pixel_format = drmfb->pixel_format;
+ response->fds[fd_index].modifier = drmfb->modifier;
+ response->fds[fd_index].connector_id = crtc_pair ? crtc_pair->connector_id : 0;
+ response->fds[fd_index].is_cursor = property_mask & PLANE_PROPERTY_IS_CURSOR;
+ response->fds[fd_index].is_combined_plane = false;
+ if(property_mask & PLANE_PROPERTY_IS_CURSOR) {
+ response->fds[fd_index].x = x;
+ response->fds[fd_index].y = y;
+ response->fds[fd_index].src_w = 0;
+ response->fds[fd_index].src_h = 0;
+ } else {
+ response->fds[fd_index].x = src_x;
+ response->fds[fd_index].y = src_y;
+ response->fds[fd_index].src_w = src_w;
+ response->fds[fd_index].src_h = src_h;
+ }
+ ++response->num_fds;
+ } else {
+ close(fb_fd);
+ }
+ cleanup_handles:
+ drm_mode_cleanup_handles(drm->drmfd, drmfb);
+ next:
+ if(drmfb)
+ drmModeFreeFB2(drmfb);
+ if(plane)
+ drmModeFreePlane(plane);
+ }
+ if(response->num_fds > 0)
+ response->result = KMS_RESULT_OK;
+ if(response->result == KMS_RESULT_OK) {
+ result = 0;
+ } else {
+ for(int i = 0; i < response->num_fds; ++i) {
+ close(response->fds[i].fd);
+ }
+ response->num_fds = 0;
+ }
+ return result;
+static double clock_get_monotonic_seconds(void) {
+ struct timespec ts;
+ ts.tv_sec = 0;
+ ts.tv_nsec = 0;
+ clock_gettime(CLOCK_MONOTONIC, &ts);
+ return (double)ts.tv_sec + (double)ts.tv_nsec * 0.000000001;
+static void string_copy(char *dst, const char *src, int len) {
+ int src_len = strlen(src);
+ int min_len = src_len;
+ if(len - 1 < min_len)
+ min_len = len - 1;
+ memcpy(dst, src, min_len);
+ dst[min_len] = '\0';
+int main(int argc, char **argv) {
+ int res = 0;
+ int socket_fd = 0;
+ gsr_drm drm;
+ drm.drmfd = 0;
+ drm.planes = NULL;
+ if(argc != 3) {
+ fprintf(stderr, "usage: gsr-kms-server <domain_socket_path> <card_path>\n");
+ return 1;
+ }
+ const char *domain_socket_path = argv[1];
+ socket_fd = socket(AF_UNIX, SOCK_STREAM, 0);
+ if(socket_fd == -1) {
+ fprintf(stderr, "kms server error: failed to create socket, error: %s\n", strerror(errno));
+ return 2;
+ }
+ const char *card_path = argv[2];
+ drm.drmfd = open(card_path, O_RDONLY);
+ if(drm.drmfd < 0) {
+ fprintf(stderr, "kms server error: failed to open %s, error: %s\n", card_path, strerror(errno));
+ res = 2;
+ goto done;
+ }
+ if(drmSetClientCap(drm.drmfd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1) != 0) {
+ fprintf(stderr, "kms server error: drmSetClientCap DRM_CLIENT_CAP_UNIVERSAL_PLANES failed, error: %s\n", strerror(errno));
+ res = 2;
+ goto done;
+ }
+ if(drmSetClientCap(drm.drmfd, DRM_CLIENT_CAP_ATOMIC, 1) != 0) {
+ fprintf(stderr, "kms server warning: drmSetClientCap DRM_CLIENT_CAP_ATOMIC failed, error: %s. The wrong monitor may be captured as a result\n", strerror(errno));
+ }
+ drm.planes = drmModeGetPlaneResources(drm.drmfd);
+ if(!drm.planes) {
+ fprintf(stderr, "kms server error: failed to get plane resources, error: %s\n", strerror(errno));
+ res = 2;
+ goto done;
+ }
+ connector_to_crtc_map c2crtc_map;
+ c2crtc_map.num_maps = 0;
+ map_crtc_to_connector_ids(&drm, &c2crtc_map);
+ fprintf(stderr, "kms server info: connecting to the client\n");
+ bool connected = false;
+ const double connect_timeout_sec = 5.0;
+ const double start_time = clock_get_monotonic_seconds();
+ while(clock_get_monotonic_seconds() - start_time < connect_timeout_sec) {
+ struct sockaddr_un remote_addr = {0};
+ remote_addr.sun_family = AF_UNIX;
+ string_copy(remote_addr.sun_path, domain_socket_path, sizeof(remote_addr.sun_path));
+ // TODO: Check if parent disconnected
+ if(connect(socket_fd, (struct sockaddr*)&remote_addr, sizeof(remote_addr.sun_family) + strlen(remote_addr.sun_path)) == -1) {
+ if(errno == ECONNREFUSED || errno == ENOENT) {
+ goto next;
+ } else if(errno == EISCONN) {
+ connected = true;
+ break;
+ }
+ fprintf(stderr, "kms server error: connect failed, error: %s (%d)\n", strerror(errno), errno);
+ res = 2;
+ goto done;
+ }
+ next:
+ usleep(30 * 1000); // 30 milliseconds
+ }
+ if(connected) {
+ fprintf(stderr, "kms server info: connected to the client\n");
+ } else {
+ fprintf(stderr, "kms server error: failed to connect to the client in %f seconds\n", connect_timeout_sec);
+ res = 2;
+ goto done;
+ }
+ for(;;) {
+ gsr_kms_request request;
+ request.version = 0;
+ request.type = -1;
+ request.new_connection_fd = 0;
+ const int recv_res = recv_msg_from_client(socket_fd, &request);
+ if(recv_res == 0) {
+ fprintf(stderr, "kms server info: kms client shutdown, shutting down the server\n");
+ res = 3;
+ goto done;
+ } else if(recv_res == -1) {
+ const int err = errno;
+ fprintf(stderr, "kms server error: failed to read all data in client request (error: %s), ignoring\n", strerror(err));
+ if(err == EBADF) {
+ fprintf(stderr, "kms server error: invalid client fd, shutting down the server\n");
+ res = 3;
+ goto done;
+ }
+ continue;
+ }
+ if(request.version != GSR_KMS_PROTOCOL_VERSION) {
+ fprintf(stderr, "kms server error: expected gpu screen recorder protocol version to be %u, but it's %u\n", GSR_KMS_PROTOCOL_VERSION, request.version);
+ /*
+ if(request.new_connection_fd > 0)
+ close(request.new_connection_fd);
+ */
+ continue;
+ }
+ switch(request.type) {
+ gsr_kms_response response;
+ response.version = GSR_KMS_PROTOCOL_VERSION;
+ response.num_fds = 0;
+ if(request.new_connection_fd > 0) {
+ if(socket_fd > 0)
+ close(socket_fd);
+ socket_fd = request.new_connection_fd;
+ response.result = KMS_RESULT_OK;
+ if(send_msg_to_client(socket_fd, &response) == -1)
+ fprintf(stderr, "kms server error: failed to respond to client KMS_REQUEST_TYPE_REPLACE_CONNECTION request\n");
+ } else {
+ response.result = KMS_RESULT_INVALID_REQUEST;
+ snprintf(response.err_msg, sizeof(response.err_msg), "received invalid connection fd");
+ fprintf(stderr, "kms server error: %s\n", response.err_msg);
+ if(send_msg_to_client(socket_fd, &response) == -1)
+ fprintf(stderr, "kms server error: failed to respond to client request\n");
+ }
+ break;
+ }
+ gsr_kms_response response;
+ response.version = GSR_KMS_PROTOCOL_VERSION;
+ response.num_fds = 0;
+ if(kms_get_fb(&drm, &response, &c2crtc_map) == 0) {
+ if(send_msg_to_client(socket_fd, &response) == -1)
+ fprintf(stderr, "kms server error: failed to respond to client KMS_REQUEST_TYPE_GET_KMS request\n");
+ } else {
+ if(send_msg_to_client(socket_fd, &response) == -1)
+ fprintf(stderr, "kms server error: failed to respond to client KMS_REQUEST_TYPE_GET_KMS request\n");
+ }
+ for(int i = 0; i < response.num_fds; ++i) {
+ close(response.fds[i].fd);
+ }
+ break;
+ }
+ default: {
+ gsr_kms_response response;
+ response.version = GSR_KMS_PROTOCOL_VERSION;
+ response.result = KMS_RESULT_INVALID_REQUEST;
+ response.num_fds = 0;
+ snprintf(response.err_msg, sizeof(response.err_msg), "invalid request type %d, expected %d (%s)", request.type, KMS_REQUEST_TYPE_GET_KMS, "KMS_REQUEST_TYPE_GET_KMS");
+ fprintf(stderr, "kms server error: %s\n", response.err_msg);
+ if(send_msg_to_client(socket_fd, &response) == -1)
+ fprintf(stderr, "kms server error: failed to respond to client request\n");
+ break;
+ }
+ }
+ }
+ done:
+ if(drm.planes)
+ drmModeFreePlaneResources(drm.planes);
+ if(drm.drmfd > 0)
+ close(drm.drmfd);
+ if(socket_fd > 0)
+ close(socket_fd);
+ return res;
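Note on the SRC_* handling in plane_get_properties above: DRM reports the plane source rectangle (SRC_X, SRC_Y, SRC_W, SRC_H) as unsigned 16.16 fixed-point values, which is why the server shifts them right by 16 to get whole pixels. A minimal sketch of that conversion follows. The helper names are hypothetical and do not appear in the commit.

```c
#include <stdint.h>

/* Hypothetical helper: integer (whole pixel) part of an unsigned 16.16
   fixed-point value. The low 16 bits hold the fraction and are dropped. */
static uint32_t fixed_16_16_to_int(uint64_t value) {
    return (uint32_t)(value >> 16);
}

/* Hypothetical helper: pack a whole pixel count into 16.16 fixed-point
   with a zero fractional part. */
static uint64_t int_to_fixed_16_16(uint32_t pixels) {
    return (uint64_t)pixels << 16;
}
```

Dropping the fraction truncates toward zero, which matches the plain right shift used in the server code.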
diff --git a/kms/server/project.conf b/kms/server/project.conf
new file mode 100644
index 0000000..26a1947
--- /dev/null
+++ b/kms/server/project.conf
@@ -0,0 +1,11 @@
+name = "gsr-kms-server"
+type = "executable"
+version = "1.0.0"
+platforms = ["posix"]
+error_on_warning = "true"
+libdrm = ">=2"
diff --git a/meson.build b/meson.build
new file mode 100644
index 0000000..a188f16
--- /dev/null
+++ b/meson.build
@@ -0,0 +1,62 @@
+project('gpu-screen-recorder', ['c', 'cpp'], version : '3.8.0', default_options : ['warning_level=2'])
+add_project_arguments('-Wshadow', language : ['c', 'cpp'])
+if get_option('buildtype') == 'debug'
+ add_project_arguments('-g3', language : ['c', 'cpp'])
+elif get_option('buildtype') == 'release'
+ add_project_arguments('-DNDEBUG', language : ['c', 'cpp'])
+src = [
+ 'kms/client/kms_client.c',
+ 'src/capture/capture.c',
+ 'src/capture/nvfbc.c',
+ 'src/capture/xcomposite.c',
+ 'src/capture/xcomposite_cuda.c',
+ 'src/capture/xcomposite_vaapi.c',
+ 'src/capture/kms_vaapi.c',
+ 'src/capture/kms_cuda.c',
+ 'src/capture/kms.c',
+ 'src/egl.c',
+ 'src/cuda.c',
+ 'src/xnvctrl.c',
+ 'src/overclock.c',
+ 'src/window_texture.c',
+ 'src/shader.c',
+ 'src/color_conversion.c',
+ 'src/utils.c',
+ 'src/library_loader.c',
+ 'src/cursor.c',
+ 'src/sound.cpp',
+ 'src/main.cpp',
+dep = [
+ dependency('libavcodec'),
+ dependency('libavformat'),
+ dependency('libavutil'),
+ dependency('x11'),
+ dependency('xcomposite'),
+ dependency('xrandr'),
+ dependency('xfixes'),
+ dependency('xdamage'),
+ dependency('xi'),
+ dependency('libpulse'),
+ dependency('libswresample'),
+ dependency('libavfilter'),
+ dependency('libva'),
+ dependency('libcap'),
+ dependency('libdrm'),
+ dependency('wayland-egl'),
+ dependency('wayland-client'),
+executable('gsr-kms-server', 'kms/server/kms_server.c', dependencies : dependency('libdrm'), c_args : '-fstack-protector-all', install : true)
+executable('gpu-screen-recorder', src, dependencies : dep, install : true)
+if get_option('systemd') == true
+ install_data(files('extra/gpu-screen-recorder.service'), install_dir : '/usr/lib/systemd/user')
+if get_option('capabilities') == true
+ meson.add_install_script('extra/meson_post_install.sh')
diff --git a/meson_options.txt b/meson_options.txt
new file mode 100644
index 0000000..7286d14
--- /dev/null
+++ b/meson_options.txt
@@ -0,0 +1,2 @@
+option('systemd', type : 'boolean', value : false, description : 'Install systemd service file')
+option('capabilities', type : 'boolean', value : true, description : 'Set binary admin capabilities to remove password prompt and increase performance')
diff --git a/project.conf b/project.conf
index 8ac9d98..a7e2757 100644
--- a/project.conf
+++ b/project.conf
@@ -1,14 +1,28 @@
name = "gpu-screen-recorder"
type = "executable"
-version = "1.2.0"
+version = "3.8.0"
platforms = ["posix"]
+ignore_dirs = ["kms/server", "build"]
+error_on_warning = "true"
libavcodec = ">=58"
libavformat = ">=58"
libavutil = ">=56.2"
x11 = ">=1"
xcomposite = ">=0.2"
+xrandr = ">=1"
+xfixes = ">=2"
+xdamage = ">=1"
+xi = ">=1"
libpulse = ">=13"
libswresample = ">=3"
+libavfilter = ">=5"
+libva = ">=1"
+libcap = ">=2"
+libdrm = ">=2"
+wayland-egl = ">=15"
+wayland-client = ">=1"
\ No newline at end of file
diff --git a/scripts/record-application-name.sh b/scripts/record-application-name.sh
new file mode 100755
index 0000000..cc29255
--- /dev/null
+++ b/scripts/record-application-name.sh
@@ -0,0 +1,6 @@
+window=$(xdotool selectwindow)
+window_name=$(xdotool getwindowclassname "$window" || xdotool getwindowname "$window" || echo "Game")
+window_name="$(echo "$window_name" | tr '/\\' '_')"
+gpu-screen-recorder -w "$window" -f 60 -a "$(pactl get-default-sink).monitor" -o "$HOME/Videos/recording/$window_name/$(date +"Video_%Y-%m-%d_%H-%M-%S.mp4")"
diff --git a/scripts/record-save-application-name.sh b/scripts/record-save-application-name.sh
new file mode 100755
index 0000000..46c51f0
--- /dev/null
+++ b/scripts/record-save-application-name.sh
@@ -0,0 +1,14 @@
+# This script should be passed to gpu-screen-recorder with the -sc option, for example:
+# gpu-screen-recorder -w screen -f 60 -a "$(pactl get-default-sink).monitor" -r 60 -sc scripts/record-save-application-name.sh -c mp4 -o "$HOME/Videos"
+window=$(xdotool getwindowfocus)
+window_name=$(xdotool getwindowclassname "$window" || xdotool getwindowname "$window" || echo "Game")
+window_name="$(echo "$window_name" | tr '/\\' '_')"
+mkdir -p "$video_dir"
+video="$video_dir/$(date +"${window_name}_%Y-%m-%d_%H-%M-%S.mp4")"
+mv "$1" "$video"
+sleep 0.5 && notify-send -t 2000 -u low "GPU Screen Recorder" "Replay saved to $video"
\ No newline at end of file
diff --git a/scripts/replay-application-name.sh b/scripts/replay-application-name.sh
new file mode 100755
index 0000000..18df61a
--- /dev/null
+++ b/scripts/replay-application-name.sh
@@ -0,0 +1,6 @@
+window=$(xdotool selectwindow)
+window_name=$(xdotool getwindowclassname "$window" || xdotool getwindowname "$window" || echo "Game")
+window_name="$(echo "$window_name" | tr '/\\' '_')"
+gpu-screen-recorder -w "$window" -f 60 -c mkv -a "$(pactl get-default-sink).monitor" -r 60 -o "$HOME/Videos/Replays/$window_name"
diff --git a/scripts/replay.sh b/scripts/replay.sh
index cf6c494..2781e1e 100755
--- a/scripts/replay.sh
+++ b/scripts/replay.sh
@@ -3,4 +3,4 @@
[ "$#" -ne 4 ] && echo "usage: replay.sh <window_id> <fps> <replay_time_sec> <output_directory>" && exit 1
active_sink="$(pactl get-default-sink).monitor"
mkdir -p "$4"
-gpu-screen-recorder -w "$1" -c mp4 -f "$2" -a "$active_sink" -r "$3" -o "$4"
+gpu-screen-recorder -w "$1" -c mkv -f "$2" -a "$active_sink" -r "$3" -o "$4"
diff --git a/scripts/save-recording.sh b/scripts/save-recording.sh
new file mode 100755
index 0000000..90fefc1
--- /dev/null
+++ b/scripts/save-recording.sh
@@ -0,0 +1,3 @@
+killall -SIGINT gpu-screen-recorder && sleep 0.5 && notify-send -t 1500 -u low "GPU Screen Recorder" "Recording saved"
diff --git a/scripts/save-replay.sh b/scripts/save-replay.sh
new file mode 100755
index 0000000..f9390aa
--- /dev/null
+++ b/scripts/save-replay.sh
@@ -0,0 +1,3 @@
+#!/bin/sh -e
+killall -SIGUSR1 gpu-screen-recorder && sleep 0.5 && notify-send -t 1500 -u low -- "GPU Screen Recorder" "Replay saved"
diff --git a/scripts/start-replay.sh b/scripts/start-replay.sh
new file mode 100755
index 0000000..e36d59d
--- /dev/null
+++ b/scripts/start-replay.sh
@@ -0,0 +1,5 @@
+mkdir -p "$video_path"
+gpu-screen-recorder -w screen -f 60 -a "$(pactl get-default-sink).monitor" -c mkv -r 30 -o "$video_path"
diff --git a/scripts/stop-replay.sh b/scripts/stop-replay.sh
new file mode 100755
index 0000000..d38da9c
--- /dev/null
+++ b/scripts/stop-replay.sh
@@ -0,0 +1,3 @@
+killall -SIGINT gpu-screen-recorder
diff --git a/scripts/toggle-recording-selected.sh b/scripts/toggle-recording-selected.sh
index 663f360..309e4d1 100755
--- a/scripts/toggle-recording-selected.sh
+++ b/scripts/toggle-recording-selected.sh
@@ -1,9 +1,9 @@
#!/bin/sh -e
-killall -INT gpu-screen-recorder && notify-send -u low 'GPU Screen Recorder' 'Stopped recording' && exit 0;
+killall -SIGINT gpu-screen-recorder && sleep 0.5 && notify-send -t 1500 -u low 'GPU Screen Recorder' 'Stopped recording' && exit 0;
window=$(xdotool selectwindow)
active_sink="$(pactl get-default-sink).monitor"
mkdir -p "$HOME/Videos"
video="$HOME/Videos/$(date +"Video_%Y-%m-%d_%H-%M-%S.mp4")"
-notify-send -u low 'GPU Screen Recorder' "Started recording video to $video"
+notify-send -t 1500 -u low 'GPU Screen Recorder' "Started recording video to $video"
gpu-screen-recorder -w "$window" -c mp4 -f 60 -a "$active_sink" -o "$video"
diff --git a/scripts/twitch-stream-local-copy.sh b/scripts/twitch-stream-local-copy.sh
index dba9d15..4a678e8 100755
--- a/scripts/twitch-stream-local-copy.sh
+++ b/scripts/twitch-stream-local-copy.sh
@@ -4,4 +4,4 @@
[ "$#" -ne 4 ] && echo "usage: twitch-stream-local-copy.sh <window_id> <fps> <livestream_key> <local_file>" && exit 1
active_sink="$(pactl get-default-sink).monitor"
-gpu-screen-recorder -w "$1" -c flv -f "$2" -a "$active_sink" | tee -- "$4" | ffmpeg -i pipe:0 -c copy -f flv -- "rtmp://live.twitch.tv/app/$3"
+gpu-screen-recorder -w "$1" -c flv -f "$2" -q high -a "$active_sink" | tee -- "$4" | ffmpeg -i pipe:0 -c copy -f flv -- "rtmp://live.twitch.tv/app/$3"
diff --git a/scripts/twitch-stream.sh b/scripts/twitch-stream.sh
index cd4737a..aaa5828 100755
--- a/scripts/twitch-stream.sh
+++ b/scripts/twitch-stream.sh
@@ -2,4 +2,4 @@
[ "$#" -ne 3 ] && echo "usage: twitch-stream.sh <window_id> <fps> <livestream_key>" && exit 1
active_sink="$(pactl get-default-sink).monitor"
-gpu-screen-recorder -w "$1" -c flv -f "$2" -a "$active_sink" -o "rtmp://live.twitch.tv/app/$3"
+gpu-screen-recorder -w "$1" -c flv -f "$2" -q high -a "$active_sink" -o "rtmp://live.twitch.tv/app/$3"
diff --git a/scripts/youtube-hls-stream.sh b/scripts/youtube-hls-stream.sh
index 21619af..2f1659e 100755
--- a/scripts/youtube-hls-stream.sh
+++ b/scripts/youtube-hls-stream.sh
@@ -1,11 +1,5 @@
[ "$#" -ne 3 ] && echo "usage: youtube-hls-stream.sh <window_id> <fps> <livestream_key>" && exit 1
-mkdir "youtube_stream"
-cd "youtube_stream"
active_sink="$(pactl get-default-sink).monitor"
-gpu-screen-recorder -w "$1" -c mpegts -f "$2" -a "$active_sink" | ffmpeg -i pipe:0 -c copy -f hls \
- -hls_time 2 -hls_flags independent_segments -hls_flags delete_segments -hls_segment_type mpegts -hls_segment_filename stream%02d.ts -master_pl_name stream.m3u8 out1 &
-echo "Waiting until stream segments are created..."
-sleep 10
-ffmpeg -i stream.m3u8 -c copy -- "https://a.upload.youtube.com/http_upload_hls?cid=$3&copy=0&file=stream.m3u8"
+gpu-screen-recorder -w "$1" -c hls -f "$2" -q high -a "$active_sink" -ac aac -o "https://a.upload.youtube.com/http_upload_hls?cid=$3&copy=0&file=stream.m3u8"
\ No newline at end of file
diff --git a/src/capture/capture.c b/src/capture/capture.c
new file mode 100644
index 0000000..5e1f546
--- /dev/null
+++ b/src/capture/capture.c
@@ -0,0 +1,399 @@
+#include "../../include/capture/capture.h"
+#include "../../include/egl.h"
+#include "../../include/cuda.h"
+#include "../../include/utils.h"
+#include <stdio.h>
+#include <stdint.h>
+#include <va/va.h>
+#include <va/va_drmcommon.h>
+#include <libavutil/frame.h>
+#include <libavutil/hwcontext_vaapi.h>
+#include <libavutil/hwcontext_cuda.h>
+#include <libavcodec/avcodec.h>
+int gsr_capture_start(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame *frame) {
+ if(cap->started)
+ return -1;
+ int res = cap->start(cap, video_codec_context, frame);
+ if(res == 0)
+ cap->started = true;
+ return res;
+void gsr_capture_tick(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ if(!cap->started) {
+ fprintf(stderr, "gsr error: gsr_capture_tick failed: the gsr capture has not been started\n");
+ return;
+ }
+ if(cap->tick)
+ cap->tick(cap, video_codec_context);
+bool gsr_capture_should_stop(gsr_capture *cap, bool *err) {
+ if(!cap->started) {
+ fprintf(stderr, "gsr error: gsr_capture_should_stop failed: the gsr capture has not been started\n");
+ return false;
+ }
+ if(!cap->should_stop)
+ return false;
+ return cap->should_stop(cap, err);
+int gsr_capture_capture(gsr_capture *cap, AVFrame *frame) {
+ if(!cap->started) {
+ fprintf(stderr, "gsr error: gsr_capture_capture failed: the gsr capture has not been started\n");
+ return -1;
+ }
+ return cap->capture(cap, frame);
+void gsr_capture_end(gsr_capture *cap, AVFrame *frame) {
+ if(!cap->started) {
+ fprintf(stderr, "gsr error: gsr_capture_end failed: the gsr capture has not been started\n");
+ return;
+ }
+ if(!cap->capture_end)
+ return;
+ cap->capture_end(cap, frame);
+void gsr_capture_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ cap->destroy(cap, video_codec_context);
+static uint32_t fourcc(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
+ return (d << 24) | (c << 16) | (b << 8) | a;
+bool gsr_capture_base_setup_vaapi_textures(gsr_capture_base *self, AVFrame *frame, VADisplay va_dpy, VADRMPRIMESurfaceDescriptor *prime, gsr_color_range color_range) {
+ const int res = av_hwframe_get_buffer(self->video_codec_context->hw_frames_ctx, frame, 0);
+ if(res < 0) {
+ fprintf(stderr, "gsr error: gsr_capture_base_setup_vaapi_textures: av_hwframe_get_buffer failed: %d\n", res);
+ return false;
+ }
+ VASurfaceID target_surface_id = (uintptr_t)frame->data[3];
+ VAStatus va_status = vaExportSurfaceHandle(va_dpy, target_surface_id, VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_2, VA_EXPORT_SURFACE_WRITE_ONLY | VA_EXPORT_SURFACE_SEPARATE_LAYERS, prime);
+ if(va_status != VA_STATUS_SUCCESS) {
+ fprintf(stderr, "gsr error: gsr_capture_base_setup_vaapi_textures: vaExportSurfaceHandle failed, error: %d\n", va_status);
+ return false;
+ }
+ vaSyncSurface(va_dpy, target_surface_id);
+ self->egl->glGenTextures(1, &self->input_texture);
+ self->egl->glBindTexture(GL_TEXTURE_2D, self->input_texture);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+ self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+ self->egl->glGenTextures(1, &self->cursor_texture);
+ self->egl->glBindTexture(GL_TEXTURE_2D, self->cursor_texture);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+ self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+ const uint32_t formats_nv12[2] = { fourcc('R', '8', ' ', ' '), fourcc('G', 'R', '8', '8') };
+ const uint32_t formats_p010[2] = { fourcc('R', '1', '6', ' '), fourcc('G', 'R', '3', '2') };
+ if(prime->fourcc == VA_FOURCC_NV12 || prime->fourcc == VA_FOURCC_P010) {
+ const uint32_t *formats = prime->fourcc == VA_FOURCC_NV12 ? formats_nv12 : formats_p010;
+ const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+ self->egl->glGenTextures(2, self->target_textures);
+ for(int i = 0; i < 2; ++i) {
+ const int layer = i;
+ const int plane = 0;
+ const uint64_t modifier = prime->objects[prime->layers[layer].object_index[plane]].drm_format_modifier;
+ const intptr_t img_attr[] = {
+ EGL_WIDTH, prime->width / div[i],
+ EGL_HEIGHT, prime->height / div[i],
+ EGL_DMA_BUF_PLANE0_FD_EXT, prime->objects[prime->layers[layer].object_index[plane]].fd,
+ EGL_DMA_BUF_PLANE0_OFFSET_EXT, prime->layers[layer].offset[plane],
+ EGL_DMA_BUF_PLANE0_PITCH_EXT, prime->layers[layer].pitch[plane],
+ };
+ while(self->egl->eglGetError() != EGL_SUCCESS){}
+ EGLImage image = self->egl->eglCreateImage(self->egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr);
+ if(!image) {
+ fprintf(stderr, "gsr error: gsr_capture_base_setup_vaapi_textures: failed to create egl image from drm fd for output drm fd, error: %d\n", self->egl->eglGetError());
+ return false;
+ }
+ self->egl->glBindTexture(GL_TEXTURE_2D, self->target_textures[i]);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+ while(self->egl->glGetError()) {}
+ while(self->egl->eglGetError() != EGL_SUCCESS){}
+ self->egl->glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
+ if(self->egl->glGetError() != 0 || self->egl->eglGetError() != EGL_SUCCESS) {
+ // TODO: Get the error properly
+ fprintf(stderr, "gsr error: gsr_capture_base_setup_vaapi_textures: failed to bind egl image to gl texture, error: %d\n", self->egl->eglGetError());
+ self->egl->eglDestroyImage(self->egl->egl_display, image);
+ self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+ return false;
+ }
+ self->egl->eglDestroyImage(self->egl->egl_display, image);
+ self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+ }
+ gsr_color_conversion_params color_conversion_params = {0};
+ color_conversion_params.color_range = color_range;
+ color_conversion_params.egl = self->egl;
+ color_conversion_params.source_color = GSR_SOURCE_COLOR_RGB;
+ if(prime->fourcc == VA_FOURCC_NV12)
+ color_conversion_params.destination_color = GSR_DESTINATION_COLOR_NV12;
+ else
+ color_conversion_params.destination_color = GSR_DESTINATION_COLOR_P010;
+ color_conversion_params.destination_textures[0] = self->target_textures[0];
+ color_conversion_params.destination_textures[1] = self->target_textures[1];
+ color_conversion_params.num_destination_textures = 2;
+ if(gsr_color_conversion_init(&self->color_conversion, &color_conversion_params) != 0) {
+ fprintf(stderr, "gsr error: gsr_capture_base_setup_vaapi_textures: failed to create color conversion\n");
+ return false;
+ }
+ gsr_color_conversion_clear(&self->color_conversion);
+ return true;
+ } else {
+ fprintf(stderr, "gsr error: gsr_capture_base_setup_vaapi_textures: unexpected fourcc %u for output drm fd, expected nv12 or p010\n", prime->fourcc);
+ return false;
+ }
+static unsigned int gl_create_texture(gsr_egl *egl, int width, int height, int internal_format, unsigned int format) {
+ unsigned int texture_id = 0;
+ egl->glGenTextures(1, &texture_id);
+ egl->glBindTexture(GL_TEXTURE_2D, texture_id);
+ egl->glTexImage2D(GL_TEXTURE_2D, 0, internal_format, width, height, 0, format, GL_UNSIGNED_BYTE, NULL);
+ egl->glBindTexture(GL_TEXTURE_2D, 0);
+ return texture_id;
+static bool cuda_register_opengl_texture(gsr_cuda *cuda, CUgraphicsResource *cuda_graphics_resource, CUarray *mapped_array, unsigned int texture_id) {
+ CUresult res;
+ res = cuda->cuGraphicsGLRegisterImage(cuda_graphics_resource, texture_id, GL_TEXTURE_2D, CU_GRAPHICS_REGISTER_FLAGS_NONE);
+ if (res != CUDA_SUCCESS) {
+ const char *err_str = "unknown";
+ cuda->cuGetErrorString(res, &err_str);
+ fprintf(stderr, "gsr error: cuda_register_opengl_texture: cuGraphicsGLRegisterImage failed, error: %s, texture id: %u\n", err_str, texture_id);
+ return false;
+ }
+ res = cuda->cuGraphicsResourceSetMapFlags(*cuda_graphics_resource, CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
+ res = cuda->cuGraphicsMapResources(1, cuda_graphics_resource, 0);
+ res = cuda->cuGraphicsSubResourceGetMappedArray(mapped_array, *cuda_graphics_resource, 0, 0);
+ return true;
+bool gsr_capture_base_setup_cuda_textures(gsr_capture_base *self, AVFrame *frame, gsr_cuda_context *cuda_context, gsr_color_range color_range, gsr_source_color source_color, bool hdr) {
+ // TODO:
+ const int res = av_hwframe_get_buffer(self->video_codec_context->hw_frames_ctx, frame, 0);
+ if(res < 0) {
+ fprintf(stderr, "gsr error: gsr_capture_base_setup_cuda_textures: av_hwframe_get_buffer failed: %d\n", res);
+ return false;
+ }
+ self->egl->glGenTextures(1, &self->input_texture);
+ self->egl->glBindTexture(GL_TEXTURE_2D, self->input_texture);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+ self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+ self->egl->glGenTextures(1, &self->cursor_texture);
+ self->egl->glBindTexture(GL_TEXTURE_EXTERNAL_OES, self->cursor_texture);
+ self->egl->glBindTexture(GL_TEXTURE_EXTERNAL_OES, 0);
+ const unsigned int internal_formats_nv12[2] = { GL_R8, GL_RG8 };
+ const unsigned int internal_formats_p010[2] = { GL_R16, GL_RG16 };
+ const unsigned int formats[2] = { GL_RED, GL_RG };
+ const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+ for(int i = 0; i < 2; ++i) {
+ self->target_textures[i] = gl_create_texture(self->egl, self->video_codec_context->width / div[i], self->video_codec_context->height / div[i], !hdr ? internal_formats_nv12[i] : internal_formats_p010[i], formats[i]);
+ if(self->target_textures[i] == 0) {
+ fprintf(stderr, "gsr error: gsr_capture_base_setup_cuda_textures: failed to create opengl texture\n");
+ return false;
+ }
+ if(!cuda_register_opengl_texture(cuda_context->cuda, &cuda_context->cuda_graphics_resources[i], &cuda_context->mapped_arrays[i], self->target_textures[i])) {
+ return false;
+ }
+ }
+ gsr_color_conversion_params color_conversion_params = {0};
+ color_conversion_params.color_range = color_range;
+ color_conversion_params.egl = self->egl;
+ color_conversion_params.source_color = source_color;
+ if(!hdr)
+ color_conversion_params.destination_color = GSR_DESTINATION_COLOR_NV12;
+ else
+ color_conversion_params.destination_color = GSR_DESTINATION_COLOR_P010;
+ color_conversion_params.destination_textures[0] = self->target_textures[0];
+ color_conversion_params.destination_textures[1] = self->target_textures[1];
+ color_conversion_params.num_destination_textures = 2;
+ color_conversion_params.load_external_image_shader = true;
+ if(gsr_color_conversion_init(&self->color_conversion, &color_conversion_params) != 0) {
+ fprintf(stderr, "gsr error: gsr_capture_base_setup_cuda_textures: failed to create color conversion\n");
+ return false;
+ }
+ gsr_color_conversion_clear(&self->color_conversion);
+ return true;
+void gsr_capture_base_stop(gsr_capture_base *self) {
+ gsr_color_conversion_deinit(&self->color_conversion);
+ if(self->egl->egl_context) {
+ if(self->input_texture) {
+ self->egl->glDeleteTextures(1, &self->input_texture);
+ self->input_texture = 0;
+ }
+ if(self->cursor_texture) {
+ self->egl->glDeleteTextures(1, &self->cursor_texture);
+ self->cursor_texture = 0;
+ }
+ self->egl->glDeleteTextures(2, self->target_textures);
+ self->target_textures[0] = 0;
+ self->target_textures[1] = 0;
+ }
+ if(self->video_codec_context->hw_device_ctx)
+ av_buffer_unref(&self->video_codec_context->hw_device_ctx);
+ if(self->video_codec_context->hw_frames_ctx)
+ av_buffer_unref(&self->video_codec_context->hw_frames_ctx);
+bool drm_create_codec_context(const char *card_path, AVCodecContext *video_codec_context, int width, int height, bool hdr, VADisplay *va_dpy) {
+ char render_path[128];
+ if(!gsr_card_path_get_render_path(card_path, render_path)) {
+ fprintf(stderr, "gsr error: failed to get /dev/dri/renderDXXX file from %s\n", card_path);
+ return false;
+ }
+ AVBufferRef *device_ctx;
+ if(av_hwdevice_ctx_create(&device_ctx, AV_HWDEVICE_TYPE_VAAPI, render_path, NULL, 0) < 0) {
+ fprintf(stderr, "Error: Failed to create hardware device context\n");
+ return false;
+ }
+ AVBufferRef *frame_context = av_hwframe_ctx_alloc(device_ctx);
+ if(!frame_context) {
+ fprintf(stderr, "Error: Failed to create hwframe context\n");
+ av_buffer_unref(&device_ctx);
+ return false;
+ }
+ AVHWFramesContext *hw_frame_context =
+ (AVHWFramesContext *)frame_context->data;
+ hw_frame_context->width = width;
+ hw_frame_context->height = height;
+ hw_frame_context->sw_format = hdr ? AV_PIX_FMT_P010LE : AV_PIX_FMT_NV12;
+ hw_frame_context->format = video_codec_context->pix_fmt;
+ hw_frame_context->device_ref = device_ctx;
+ hw_frame_context->device_ctx = (AVHWDeviceContext*)device_ctx->data;
+ //hw_frame_context->initial_pool_size = 20;
+ AVVAAPIDeviceContext *vactx =((AVHWDeviceContext*)device_ctx->data)->hwctx;
+ *va_dpy = vactx->display;
+ if (av_hwframe_ctx_init(frame_context) < 0) {
+ fprintf(stderr, "Error: Failed to initialize hardware frame context "
+ "(note: ffmpeg version needs to be > 4.0)\n");
+ av_buffer_unref(&device_ctx);
+ //av_buffer_unref(&frame_context);
+ return false;
+ }
+ video_codec_context->hw_device_ctx = av_buffer_ref(device_ctx);
+ video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
+ return true;
+bool cuda_create_codec_context(CUcontext cu_ctx, AVCodecContext *video_codec_context, int width, int height, bool hdr, CUstream *cuda_stream) {
+ AVBufferRef *device_ctx = av_hwdevice_ctx_alloc(AV_HWDEVICE_TYPE_CUDA);
+ if(!device_ctx) {
+ fprintf(stderr, "gsr error: cuda_create_codec_context failed: failed to create hardware device context\n");
+ return false;
+ }
+ AVHWDeviceContext *hw_device_context = (AVHWDeviceContext*)device_ctx->data;
+ AVCUDADeviceContext *cuda_device_context = (AVCUDADeviceContext*)hw_device_context->hwctx;
+ cuda_device_context->cuda_ctx = cu_ctx;
+ if(av_hwdevice_ctx_init(device_ctx) < 0) {
+ fprintf(stderr, "gsr error: cuda_create_codec_context failed: failed to create hardware device context\n");
+ av_buffer_unref(&device_ctx);
+ return false;
+ }
+ AVBufferRef *frame_context = av_hwframe_ctx_alloc(device_ctx);
+ if(!frame_context) {
+ fprintf(stderr, "gsr error: cuda_create_codec_context failed: failed to create hwframe context\n");
+ av_buffer_unref(&device_ctx);
+ return false;
+ }
+ AVHWFramesContext *hw_frame_context = (AVHWFramesContext*)frame_context->data;
+ hw_frame_context->width = width;
+ hw_frame_context->height = height;
+ hw_frame_context->sw_format = hdr ? AV_PIX_FMT_P010LE : AV_PIX_FMT_NV12;
+ hw_frame_context->format = video_codec_context->pix_fmt;
+ hw_frame_context->device_ref = device_ctx;
+ hw_frame_context->device_ctx = (AVHWDeviceContext*)device_ctx->data;
+ if (av_hwframe_ctx_init(frame_context) < 0) {
+ fprintf(stderr, "gsr error: cuda_create_codec_context failed: failed to initialize hardware frame context "
+ "(note: ffmpeg version needs to be > 4.0)\n");
+ av_buffer_unref(&device_ctx);
+ //av_buffer_unref(&frame_context);
+ return false;
+ }
+ *cuda_stream = cuda_device_context->stream;
+ video_codec_context->hw_device_ctx = av_buffer_ref(device_ctx);
+ video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
+ return true;
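Note on the plane layout in src/capture/capture.c above: the exported NV12 (or P010) surface is imported as two textures, a full-size single-channel luma plane and a half-size two-channel chroma plane, each tagged with a DRM fourcc. The following is a minimal sketch of that mapping for the NV12 case only. The `fourcc` function mirrors the helper in the patch; `plane_desc` and `nv12_plane` are hypothetical names introduced here for illustration and are not part of the commit.

```c
#include <stdint.h>

/* Same packing as the fourcc() helper in capture.c:
   the first character ends up in the lowest byte. */
static uint32_t fourcc(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return (d << 24) | (c << 16) | (b << 8) | a;
}

/* Hypothetical type (not in the patch): format and size of one plane. */
typedef struct {
    uint32_t format;
    int width;
    int height;
} plane_desc;

/* Hypothetical helper (not in the patch).
   plane 0 is luma (R8, full size), plane 1 is chroma (GR88, half size),
   matching formats_nv12 and div[] in the patch. */
static plane_desc nv12_plane(int plane, int width, int height) {
    const uint32_t formats[2] = { fourcc('R', '8', ' ', ' '), fourcc('G', 'R', '8', '8') };
    const int div[2] = { 1, 2 }; /* chroma is subsampled by 2 in both directions */
    plane_desc desc = { formats[plane], width / div[plane], height / div[plane] };
    return desc;
}
```

For a 1920x1080 surface this gives a 1920x1080 R8 plane and a 960x540 GR88 plane. The P010 path in the patch follows the same shape with 16-bit formats.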
diff --git a/src/capture/kms.c b/src/capture/kms.c
new file mode 100644
index 0000000..ec83cab
--- /dev/null
+++ b/src/capture/kms.c
@@ -0,0 +1,397 @@
+#include "../../include/capture/kms.h"
+#include "../../include/capture/capture.h"
+#include "../../include/utils.h"
+#include <string.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <libavcodec/avcodec.h>
+#include <libavutil/mastering_display_metadata.h>
+#define HDMI_EOTF_SMPTE_ST2084 2
+static int max_int(int a, int b) {
+ return a > b ? a : b;
+/* TODO: On monitor reconfiguration, find monitor x, y, width and height again. Do the same for nvfbc. */
+typedef struct {
+ MonitorId *monitor_id;
+ const char *monitor_to_capture;
+ int monitor_to_capture_len;
+ int num_monitors;
+} MonitorCallbackUserdata;
+static void monitor_callback(const gsr_monitor *monitor, void *userdata) {
+ MonitorCallbackUserdata *monitor_callback_userdata = userdata;
+ ++monitor_callback_userdata->num_monitors;
+ if(monitor_callback_userdata->monitor_to_capture_len != monitor->name_len || memcmp(monitor_callback_userdata->monitor_to_capture, monitor->name, monitor->name_len) != 0)
+ return;
+ if(monitor_callback_userdata->monitor_id->num_connector_ids < MAX_CONNECTOR_IDS) {
+ monitor_callback_userdata->monitor_id->connector_ids[monitor_callback_userdata->monitor_id->num_connector_ids] = monitor->connector_id;
+ ++monitor_callback_userdata->monitor_id->num_connector_ids;
+ }
+ if(monitor_callback_userdata->monitor_id->num_connector_ids == MAX_CONNECTOR_IDS)
+ fprintf(stderr, "gsr warning: reached max connector ids\n");
+int gsr_capture_kms_start(gsr_capture_kms *self, const char *display_to_capture, gsr_egl *egl, AVCodecContext *video_codec_context, AVFrame *frame) {
+ memset(self, 0, sizeof(*self));
+ self->base.video_codec_context = video_codec_context;
+ self->base.egl = egl;
+ gsr_monitor monitor;
+ self->monitor_id.num_connector_ids = 0;
+ int kms_init_res = gsr_kms_client_init(&self->kms_client, egl->card_path);
+ if(kms_init_res != 0)
+ return kms_init_res;
+ MonitorCallbackUserdata monitor_callback_userdata = {
+ &self->monitor_id,
+ display_to_capture, strlen(display_to_capture),
+ 0,
+ };
+ for_each_active_monitor_output(egl, GSR_CONNECTION_DRM, monitor_callback, &monitor_callback_userdata);
+ if(!get_monitor_by_name(egl, GSR_CONNECTION_DRM, display_to_capture, &monitor)) {
+ fprintf(stderr, "gsr error: gsr_capture_kms_start: failed to find monitor by name \"%s\"\n", display_to_capture);
+ return -1;
+ }
+ monitor.name = display_to_capture;
+ self->monitor_rotation = drm_monitor_get_display_server_rotation(egl, &monitor);
+ self->capture_pos = monitor.pos;
+ if(self->monitor_rotation == GSR_MONITOR_ROT_90 || self->monitor_rotation == GSR_MONITOR_ROT_270) {
+ self->capture_size.x = monitor.size.y;
+ self->capture_size.y = monitor.size.x;
+ } else {
+ self->capture_size = monitor.size;
+ }
+ /* Disable vsync */
+ egl->eglSwapInterval(egl->egl_display, 0);
+ // TODO: Move this and xcomposite equivalent to a common section unrelated to capture method
+ if(egl->gpu_info.vendor == GSR_GPU_VENDOR_AMD && video_codec_context->codec_id == AV_CODEC_ID_HEVC) {
+ // TODO: Don't do this if ffmpeg reports that it is not needed (AMD driver bug that was fixed recently)
+ self->base.video_codec_context->width = FFALIGN(self->capture_size.x, 64);
+ self->base.video_codec_context->height = FFALIGN(self->capture_size.y, 16);
+ } else if(egl->gpu_info.vendor == GSR_GPU_VENDOR_AMD && video_codec_context->codec_id == AV_CODEC_ID_AV1) {
+ // TODO: Don't do this for VCN 5 and later, which should fix this hardware bug
+ self->base.video_codec_context->width = FFALIGN(self->capture_size.x, 64);
+ // AMD driver has special case handling for 1080 height to set it to 1082 instead of 1088 (1080 aligned to 16).
+ // TODO: Set height to 1082 in this case, but it won't work because it will be aligned to 1088.
+ if(self->capture_size.y == 1080) {
+ self->base.video_codec_context->height = 1080;
+ } else {
+ self->base.video_codec_context->height = FFALIGN(self->capture_size.y, 16);
+ }
+ } else {
+ self->base.video_codec_context->width = FFALIGN(self->capture_size.x, 2);
+ self->base.video_codec_context->height = FFALIGN(self->capture_size.y, 2);
+ }
+ frame->width = self->base.video_codec_context->width;
+ frame->height = self->base.video_codec_context->height;
+ return 0;
+void gsr_capture_kms_stop(gsr_capture_kms *self) {
+ gsr_capture_kms_cleanup_kms_fds(self);
+ gsr_kms_client_deinit(&self->kms_client);
+ gsr_capture_base_stop(&self->base);
+static float monitor_rotation_to_radians(gsr_monitor_rotation rot) {
+ switch(rot) {
+ case GSR_MONITOR_ROT_0: return 0.0f;
+ case GSR_MONITOR_ROT_90: return M_PI_2;
+ case GSR_MONITOR_ROT_180: return M_PI;
+ case GSR_MONITOR_ROT_270: return M_PI + M_PI_2;
+ }
+ return 0.0f;
+/* Prefer non combined planes */
+static gsr_kms_response_fd* find_drm_by_connector_id(gsr_kms_response *kms_response, uint32_t connector_id) {
+ int index_combined = -1;
+ for(int i = 0; i < kms_response->num_fds; ++i) {
+ if(kms_response->fds[i].connector_id == connector_id && !kms_response->fds[i].is_cursor) {
+ if(kms_response->fds[i].is_combined_plane)
+ index_combined = i;
+ else
+ return &kms_response->fds[i];
+ }
+ }
+ if(index_combined != -1)
+ return &kms_response->fds[index_combined];
+ else
+ return NULL;
+static gsr_kms_response_fd* find_first_combined_drm(gsr_kms_response *kms_response) {
+ for(int i = 0; i < kms_response->num_fds; ++i) {
+ if(kms_response->fds[i].is_combined_plane && !kms_response->fds[i].is_cursor)
+ return &kms_response->fds[i];
+ }
+ return NULL;
+static gsr_kms_response_fd* find_largest_drm(gsr_kms_response *kms_response) {
+ if(kms_response->num_fds == 0)
+ return NULL;
+ int64_t largest_size = 0;
+ gsr_kms_response_fd *largest_drm = &kms_response->fds[0];
+ for(int i = 0; i < kms_response->num_fds; ++i) {
+ const int64_t size = (int64_t)kms_response->fds[i].width * (int64_t)kms_response->fds[i].height;
+ if(size > largest_size && !kms_response->fds[i].is_cursor) {
+ largest_size = size;
+ largest_drm = &kms_response->fds[i];
+ }
+ }
+ return largest_drm;
+static gsr_kms_response_fd* find_cursor_drm(gsr_kms_response *kms_response) {
+ for(int i = 0; i < kms_response->num_fds; ++i) {
+ if(kms_response->fds[i].is_cursor)
+ return &kms_response->fds[i];
+ }
+ return NULL;
+static bool hdr_metadata_is_supported_format(const struct hdr_output_metadata *hdr_metadata) {
+ return hdr_metadata->metadata_type == HDMI_STATIC_METADATA_TYPE1 &&
+ hdr_metadata->hdmi_metadata_type1.metadata_type == HDMI_STATIC_METADATA_TYPE1 &&
+ hdr_metadata->hdmi_metadata_type1.eotf == HDMI_EOTF_SMPTE_ST2084;
+static void gsr_kms_set_hdr_metadata(gsr_capture_kms *self, AVFrame *frame, gsr_kms_response_fd *drm_fd) {
+ if(!self->mastering_display_metadata)
+ self->mastering_display_metadata = av_mastering_display_metadata_create_side_data(frame);
+ if(!self->light_metadata)
+ self->light_metadata = av_content_light_metadata_create_side_data(frame);
+ if(self->mastering_display_metadata) {
+ for(int i = 0; i < 3; ++i) {
+ self->mastering_display_metadata->display_primaries[i][0] = av_make_q(drm_fd->hdr_metadata.hdmi_metadata_type1.display_primaries[i].x, 50000);
+ self->mastering_display_metadata->display_primaries[i][1] = av_make_q(drm_fd->hdr_metadata.hdmi_metadata_type1.display_primaries[i].y, 50000);
+ }
+ self->mastering_display_metadata->white_point[0] = av_make_q(drm_fd->hdr_metadata.hdmi_metadata_type1.white_point.x, 50000);
+ self->mastering_display_metadata->white_point[1] = av_make_q(drm_fd->hdr_metadata.hdmi_metadata_type1.white_point.y, 50000);
+ self->mastering_display_metadata->min_luminance = av_make_q(drm_fd->hdr_metadata.hdmi_metadata_type1.min_display_mastering_luminance, 10000);
+ self->mastering_display_metadata->max_luminance = av_make_q(drm_fd->hdr_metadata.hdmi_metadata_type1.max_display_mastering_luminance, 1);
+ self->mastering_display_metadata->has_primaries = self->mastering_display_metadata->display_primaries[0][0].num > 0;
+ self->mastering_display_metadata->has_luminance = self->mastering_display_metadata->max_luminance.num > 0;
+ }
+ if(self->light_metadata) {
+ self->light_metadata->MaxCLL = drm_fd->hdr_metadata.hdmi_metadata_type1.max_cll;
+ self->light_metadata->MaxFALL = drm_fd->hdr_metadata.hdmi_metadata_type1.max_fall;
+ }
+static vec2i swap_vec2i(vec2i value) {
+ int tmp = value.x;
+ value.x = value.y;
+ value.y = tmp;
+ return value;
+bool gsr_capture_kms_capture(gsr_capture_kms *self, AVFrame *frame, bool hdr, bool screen_plane_use_modifiers, bool cursor_texture_is_external, bool record_cursor) {
+ //egl->glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
+ self->base.egl->glClear(0);
+ gsr_capture_kms_cleanup_kms_fds(self);
+ gsr_kms_response_fd *drm_fd = NULL;
+ gsr_kms_response_fd *cursor_drm_fd = NULL;
+ bool capture_is_combined_plane = false;
+ if(gsr_kms_client_get_kms(&self->kms_client, &self->kms_response) != 0) {
+ fprintf(stderr, "gsr error: gsr_capture_kms_capture: failed to get kms, error: %d (%s)\n", self->kms_response.result, self->kms_response.err_msg);
+ return false;
+ }
+ if(self->kms_response.num_fds == 0) {
+ static bool error_shown = false;
+ if(!error_shown) {
+ error_shown = true;
+ fprintf(stderr, "gsr error: no drm found, capture will fail\n");
+ }
+ return false;
+ }
+ for(int i = 0; i < self->monitor_id.num_connector_ids; ++i) {
+ drm_fd = find_drm_by_connector_id(&self->kms_response, self->monitor_id.connector_ids[i]);
+ if(drm_fd)
+ break;
+ }
+ // Will never happen on wayland unless the target monitor has been disconnected
+ if(!drm_fd) {
+ drm_fd = find_first_combined_drm(&self->kms_response);
+ if(!drm_fd)
+ drm_fd = find_largest_drm(&self->kms_response);
+ capture_is_combined_plane = true;
+ }
+ cursor_drm_fd = find_cursor_drm(&self->kms_response);
+ if(!drm_fd)
+ return false;
+ if(!capture_is_combined_plane && cursor_drm_fd && cursor_drm_fd->connector_id != drm_fd->connector_id)
+ cursor_drm_fd = NULL;
+ if(drm_fd->has_hdr_metadata && hdr && hdr_metadata_is_supported_format(&drm_fd->hdr_metadata))
+ gsr_kms_set_hdr_metadata(self, frame, drm_fd);
+ // TODO: This causes a crash sometimes on steam deck, why? is it a driver bug? a vaapi pure version doesn't cause a crash.
+ // Even ffmpeg kmsgrab causes this crash. The error is:
+ // amdgpu: Failed to allocate a buffer:
+ // amdgpu: size : 28508160 bytes
+ // amdgpu: alignment : 2097152 bytes
+ // amdgpu: domains : 4
+ // amdgpu: flags : 4
+ // amdgpu: Failed to allocate a buffer:
+ // amdgpu: size : 28508160 bytes
+ // amdgpu: alignment : 2097152 bytes
+ // amdgpu: domains : 4
+ // amdgpu: flags : 4
+ // EE ../jupiter-mesa/src/gallium/drivers/radeonsi/radeon_vcn_enc.c:516 radeon_create_encoder UVD - Can't create CPB buffer.
+ // [hevc_vaapi @ 0x55ea72b09840] Failed to upload encode parameters: 2 (resource allocation failed).
+ // [hevc_vaapi @ 0x55ea72b09840] Encode failed: -5.
+ // Error: avcodec_send_frame failed, error: Input/output error
+ // Assertion pic->display_order == pic->encode_order failed at libavcodec/vaapi_encode_h265.c:765
+ // kms server info: kms client shutdown, shutting down the server
+ intptr_t img_attr[18] = {
+ EGL_LINUX_DRM_FOURCC_EXT, drm_fd->pixel_format,
+ EGL_WIDTH, drm_fd->width,
+ EGL_HEIGHT, drm_fd->height,
+ EGL_DMA_BUF_PLANE0_FD_EXT, drm_fd->fd,
+ EGL_DMA_BUF_PLANE0_OFFSET_EXT, drm_fd->offset,
+ EGL_DMA_BUF_PLANE0_PITCH_EXT, drm_fd->pitch,
+ };
+ if(screen_plane_use_modifiers) {
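+ img_attr[12] = EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT;
+ img_attr[13] = drm_fd->modifier & 0xFFFFFFFFULL;
+ img_attr[14] = EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT;
+ img_attr[15] = drm_fd->modifier >> 32ULL;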
+ img_attr[16] = EGL_NONE;
+ img_attr[17] = EGL_NONE;
+ } else {
+ img_attr[12] = EGL_NONE;
+ img_attr[13] = EGL_NONE;
+ }
+ EGLImage image = self->base.egl->eglCreateImage(self->base.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr);
+ self->base.egl->glBindTexture(GL_TEXTURE_2D, self->base.input_texture);
+ self->base.egl->glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
+ self->base.egl->eglDestroyImage(self->base.egl->egl_display, image);
+ self->base.egl->glBindTexture(GL_TEXTURE_2D, 0);
+ vec2i capture_pos = self->capture_pos;
+ if(!capture_is_combined_plane)
+ capture_pos = (vec2i){drm_fd->x, drm_fd->y};
+ const float texture_rotation = monitor_rotation_to_radians(self->monitor_rotation);
+ const int target_x = max_int(0, frame->width / 2 - self->capture_size.x / 2);
+ const int target_y = max_int(0, frame->height / 2 - self->capture_size.y / 2);
+ gsr_color_conversion_draw(&self->base.color_conversion, self->base.input_texture,
+ (vec2i){target_x, target_y}, self->capture_size,
+ capture_pos, self->capture_size,
+ texture_rotation, false);
+ if(record_cursor && cursor_drm_fd) {
+ const vec2i cursor_size = {cursor_drm_fd->width, cursor_drm_fd->height};
+ vec2i cursor_pos = {cursor_drm_fd->x, cursor_drm_fd->y};
+ switch(self->monitor_rotation) {
+ case GSR_MONITOR_ROT_0:
+ break;
+ case GSR_MONITOR_ROT_90:
+ cursor_pos = swap_vec2i(cursor_pos);
+ cursor_pos.x = self->capture_size.x - cursor_pos.x;
+ // TODO: Remove this horrible hack
+ cursor_pos.x -= cursor_size.x;
+ break;
+ case GSR_MONITOR_ROT_180:
+ cursor_pos.x = self->capture_size.x - cursor_pos.x;
+ cursor_pos.y = self->capture_size.y - cursor_pos.y;
+ // TODO: Remove this horrible hack
+ cursor_pos.x -= cursor_size.x;
+ cursor_pos.y -= cursor_size.y;
+ break;
+ case GSR_MONITOR_ROT_270:
+ cursor_pos = swap_vec2i(cursor_pos);
+ cursor_pos.y = self->capture_size.y - cursor_pos.y;
+ // TODO: Remove this horrible hack
+ cursor_pos.y -= cursor_size.y;
+ break;
+ }
+ cursor_pos.x += target_x;
+ cursor_pos.y += target_y;
+ const intptr_t img_attr_cursor[] = {
+ EGL_LINUX_DRM_FOURCC_EXT, cursor_drm_fd->pixel_format,
+ EGL_WIDTH, cursor_drm_fd->width,
+ EGL_HEIGHT, cursor_drm_fd->height,
+ EGL_DMA_BUF_PLANE0_FD_EXT, cursor_drm_fd->fd,
+ EGL_DMA_BUF_PLANE0_OFFSET_EXT, cursor_drm_fd->offset,
+ EGL_DMA_BUF_PLANE0_PITCH_EXT, cursor_drm_fd->pitch,
+ EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT, cursor_drm_fd->modifier & 0xFFFFFFFFULL,
+ EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT, cursor_drm_fd->modifier >> 32ULL,
+ EGL_NONE
+ };
+ EGLImage cursor_image = self->base.egl->eglCreateImage(self->base.egl->egl_display, 0, EGL_LINUX_DMA_BUF_EXT, NULL, img_attr_cursor);
+ const int target = cursor_texture_is_external ? GL_TEXTURE_EXTERNAL_OES : GL_TEXTURE_2D;
+ self->base.egl->glBindTexture(target, self->base.cursor_texture);
+ self->base.egl->glEGLImageTargetTexture2DOES(target, cursor_image);
+ self->base.egl->eglDestroyImage(self->base.egl->egl_display, cursor_image);
+ self->base.egl->glBindTexture(target, 0);
+ self->base.egl->glEnable(GL_SCISSOR_TEST);
+ self->base.egl->glScissor(target_x, target_y, self->capture_size.x, self->capture_size.y);
+ gsr_color_conversion_draw(&self->base.color_conversion, self->base.cursor_texture,
+ cursor_pos, cursor_size,
+ (vec2i){0, 0}, cursor_size,
+ texture_rotation, cursor_texture_is_external);
+ self->base.egl->glDisable(GL_SCISSOR_TEST);
+ }
+ self->base.egl->eglSwapBuffers(self->base.egl->egl_display, self->base.egl->egl_surface);
+ //self->base.egl->glFlush();
+ //self->base.egl->glFinish();
+ return true;
+}
+void gsr_capture_kms_cleanup_kms_fds(gsr_capture_kms *self) {
+ for(int i = 0; i < self->kms_response.num_fds; ++i) {
+ if(self->kms_response.fds[i].fd > 0)
+ close(self->kms_response.fds[i].fd);
+ self->kms_response.fds[i].fd = 0;
+ }
+ self->kms_response.num_fds = 0;
+}
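The cursor placement in gsr_capture_kms_capture depends on the monitor rotation. The following is a rough, standalone sketch of that mapping. The names vec2i_sketch, rot_sketch and rotate_cursor_pos are hypothetical and exist only for this illustration; the arithmetic mirrors the switch statement in the patch, including the subtraction of the cursor size that the patch itself marks as a hack.

```c
#include <assert.h>

/* Hypothetical stand-ins for the patch's vec2i and monitor rotation enum. */
typedef struct { int x, y; } vec2i_sketch;
typedef enum { ROT_0, ROT_90, ROT_180, ROT_270 } rot_sketch;

/* Maps the cursor position reported by the cursor plane to a top-left position
   inside the rotated capture area, following the same steps as the patch. */
static vec2i_sketch rotate_cursor_pos(rot_sketch rot, vec2i_sketch pos,
                                      vec2i_sketch cursor_size, vec2i_sketch capture_size) {
    assert(cursor_size.x >= 0 && cursor_size.y >= 0);
    vec2i_sketch out = pos;
    switch(rot) {
        case ROT_0:
            break;
        case ROT_90:
            /* swap x/y, mirror x, then shift by the cursor width */
            out.x = capture_size.x - pos.y - cursor_size.x;
            out.y = pos.x;
            break;
        case ROT_180:
            /* mirror both axes, then shift by the cursor size */
            out.x = capture_size.x - pos.x - cursor_size.x;
            out.y = capture_size.y - pos.y - cursor_size.y;
            break;
        case ROT_270:
            /* swap x/y, mirror y, then shift by the cursor height */
            out.x = pos.y;
            out.y = capture_size.y - pos.x - cursor_size.y;
            break;
    }
    return out;
}
```

For a 1920x1080 capture area and a 64x64 cursor at (100, 200), the 90 degree case yields (1656, 100). The patch then adds target_x and target_y to the result, which this sketch leaves out.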
diff --git a/src/capture/kms_cuda.c b/src/capture/kms_cuda.c
new file mode 100644
index 0000000..a9f1f8e
--- /dev/null
+++ b/src/capture/kms_cuda.c
@@ -0,0 +1,181 @@
+#include "../../include/capture/kms_cuda.h"
+#include "../../include/capture/kms.h"
+#include "../../include/cuda.h"
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <assert.h>
+#include <libavutil/hwcontext.h>
+#include <libavutil/hwcontext_cuda.h>
+#include <libavcodec/avcodec.h>
+typedef struct {
+ gsr_capture_kms kms;
+ gsr_capture_kms_cuda_params params;
+ gsr_cuda cuda;
+ CUgraphicsResource cuda_graphics_resources[2];
+ CUarray mapped_arrays[2];
+ CUstream cuda_stream;
+} gsr_capture_kms_cuda;
+static void gsr_capture_kms_cuda_stop(gsr_capture *cap, AVCodecContext *video_codec_context);
+static int gsr_capture_kms_cuda_start(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame *frame) {
+ gsr_capture_kms_cuda *cap_kms = cap->priv;
+ const int res = gsr_capture_kms_start(&cap_kms->kms, cap_kms->params.display_to_capture, cap_kms->params.egl, video_codec_context, frame);
+ if(res != 0) {
+ gsr_capture_kms_cuda_stop(cap, video_codec_context);
+ return res;
+ }
+ // TODO: overclocking is not supported on wayland...
+ if(!gsr_cuda_load(&cap_kms->cuda, NULL, false)) {
+ fprintf(stderr, "gsr error: gsr_capture_kms_cuda_start: failed to load cuda\n");
+ gsr_capture_kms_cuda_stop(cap, video_codec_context);
+ return -1;
+ }
+ if(!cuda_create_codec_context(cap_kms->cuda.cu_ctx, video_codec_context, video_codec_context->width, video_codec_context->height, cap_kms->params.hdr, &cap_kms->cuda_stream)) {
+ gsr_capture_kms_cuda_stop(cap, video_codec_context);
+ return -1;
+ }
+ gsr_cuda_context cuda_context = {
+ .cuda = &cap_kms->cuda,
+ .cuda_graphics_resources = cap_kms->cuda_graphics_resources,
+ .mapped_arrays = cap_kms->mapped_arrays
+ };
+ if(!gsr_capture_base_setup_cuda_textures(&cap_kms->kms.base, frame, &cuda_context, cap_kms->params.color_range, GSR_SOURCE_COLOR_RGB, cap_kms->params.hdr)) {
+ gsr_capture_kms_cuda_stop(cap, video_codec_context);
+ return -1;
+ }
+ return 0;
+}
+static bool gsr_capture_kms_cuda_should_stop(gsr_capture *cap, bool *err) {
+ gsr_capture_kms_cuda *cap_kms = cap->priv;
+ if(cap_kms->kms.should_stop) {
+ if(err)
+ *err = cap_kms->kms.stop_is_error;
+ return true;
+ }
+ if(err)
+ *err = false;
+ return false;
+}
+static void gsr_capture_kms_unload_cuda_graphics(gsr_capture_kms_cuda *cap_kms) {
+ if(cap_kms->cuda.cu_ctx) {
+ for(int i = 0; i < 2; ++i) {
+ if(cap_kms->cuda_graphics_resources[i]) {
+ cap_kms->cuda.cuGraphicsUnmapResources(1, &cap_kms->cuda_graphics_resources[i], 0);
+ cap_kms->cuda.cuGraphicsUnregisterResource(cap_kms->cuda_graphics_resources[i]);
+ cap_kms->cuda_graphics_resources[i] = 0;
+ }
+ }
+ }
+}
+static int gsr_capture_kms_cuda_capture(gsr_capture *cap, AVFrame *frame) {
+ gsr_capture_kms_cuda *cap_kms = cap->priv;
+ gsr_capture_kms_capture(&cap_kms->kms, frame, cap_kms->params.hdr, true, true, cap_kms->params.record_cursor);
+ const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+ for(int i = 0; i < 2; ++i) {
+ CUDA_MEMCPY2D memcpy_struct;
+ memcpy_struct.srcXInBytes = 0;
+ memcpy_struct.srcY = 0;
+ memcpy_struct.srcMemoryType = CU_MEMORYTYPE_ARRAY;
+ memcpy_struct.dstXInBytes = 0;
+ memcpy_struct.dstY = 0;
+ memcpy_struct.dstMemoryType = CU_MEMORYTYPE_DEVICE;
+ memcpy_struct.srcArray = cap_kms->mapped_arrays[i];
+ memcpy_struct.srcPitch = frame->width / div[i];
+ memcpy_struct.dstDevice = (CUdeviceptr)frame->data[i];
+ memcpy_struct.dstPitch = frame->linesize[i];
+ memcpy_struct.WidthInBytes = frame->width * (cap_kms->params.hdr ? 2 : 1);
+ memcpy_struct.Height = frame->height / div[i];
+ // TODO: Remove this copy if possible
+ cap_kms->cuda.cuMemcpy2DAsync_v2(&memcpy_struct, cap_kms->cuda_stream);
+ }
+ // TODO: needed?
+ cap_kms->cuda.cuStreamSynchronize(cap_kms->cuda_stream);
+ return 0;
+}
+static void gsr_capture_kms_cuda_capture_end(gsr_capture *cap, AVFrame *frame) {
+ (void)frame;
+ gsr_capture_kms_cuda *cap_kms = cap->priv;
+ gsr_capture_kms_cleanup_kms_fds(&cap_kms->kms);
+}
+static void gsr_capture_kms_cuda_stop(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ (void)video_codec_context;
+ gsr_capture_kms_cuda *cap_kms = cap->priv;
+ gsr_capture_kms_unload_cuda_graphics(cap_kms);
+ gsr_cuda_unload(&cap_kms->cuda);
+ gsr_capture_kms_stop(&cap_kms->kms);
+}
+static void gsr_capture_kms_cuda_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ (void)video_codec_context;
+ gsr_capture_kms_cuda *cap_kms = cap->priv;
+ if(cap->priv) {
+ gsr_capture_kms_cuda_stop(cap, video_codec_context);
+ free((void*)cap_kms->params.display_to_capture);
+ cap_kms->params.display_to_capture = NULL;
+ free(cap->priv);
+ cap->priv = NULL;
+ }
+ free(cap);
+}
+gsr_capture* gsr_capture_kms_cuda_create(const gsr_capture_kms_cuda_params *params) {
+ if(!params) {
+ fprintf(stderr, "gsr error: gsr_capture_kms_cuda_create params is NULL\n");
+ return NULL;
+ }
+ gsr_capture *cap = calloc(1, sizeof(gsr_capture));
+ if(!cap)
+ return NULL;
+ gsr_capture_kms_cuda *cap_kms = calloc(1, sizeof(gsr_capture_kms_cuda));
+ if(!cap_kms) {
+ free(cap);
+ return NULL;
+ }
+ const char *display_to_capture = strdup(params->display_to_capture);
+ if(!display_to_capture) {
+ free(cap);
+ free(cap_kms);
+ return NULL;
+ }
+ cap_kms->params = *params;
+ cap_kms->params.display_to_capture = display_to_capture;
+ *cap = (gsr_capture) {
+ .start = gsr_capture_kms_cuda_start,
+ .tick = NULL,
+ .should_stop = gsr_capture_kms_cuda_should_stop,
+ .capture = gsr_capture_kms_cuda_capture,
+ .capture_end = gsr_capture_kms_cuda_capture_end,
+ .destroy = gsr_capture_kms_cuda_destroy,
+ .priv = cap_kms
+ };
+ return cap;
+}
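The copy loop in gsr_capture_kms_cuda_capture transfers two planes per frame. The following is a minimal sketch of the extents it computes. It assumes a two-plane 4:2:0 layout such as NV12, with 16-bit samples when hdr is set; both layouts are assumptions, since the patch only shows the arithmetic. The names plane_copy_extent and plane_extent are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper type: the size of one plane copy. */
typedef struct {
    size_t width_in_bytes;
    size_t height;
} plane_copy_extent;

/* plane 0 is assumed to be luma (Y), plane 1 interleaved chroma (UV).
   The chroma plane has half as many rows. Its rows are as wide in bytes as the
   luma rows because each of the width/2 chroma samples stores two components. */
static plane_copy_extent plane_extent(int frame_width, int frame_height, int plane, bool hdr) {
    assert(plane == 0 || plane == 1);
    assert(frame_width > 0 && frame_height > 0);
    const int div[2] = {1, 2};
    plane_copy_extent extent;
    extent.width_in_bytes = (size_t)frame_width * (hdr ? 2 : 1);
    extent.height = (size_t)(frame_height / div[plane]);
    return extent;
}
```

Under these assumptions a 1920x1080 SDR frame copies 1920 bytes by 1080 rows for plane 0 and 1920 bytes by 540 rows for plane 1; with hdr set the row width doubles to 3840 bytes.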
diff --git a/src/capture/kms_vaapi.c b/src/capture/kms_vaapi.c
new file mode 100644
index 0000000..b9c9ee5
--- /dev/null
+++ b/src/capture/kms_vaapi.c
@@ -0,0 +1,135 @@
+#include "../../include/capture/kms_vaapi.h"
+#include "../../include/capture/kms.h"
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <assert.h>
+#include <libavutil/hwcontext.h>
+#include <libavutil/hwcontext_vaapi.h>
+#include <libavcodec/avcodec.h>
+#include <va/va_drmcommon.h>
+typedef struct {
+ gsr_capture_kms kms;
+ gsr_capture_kms_vaapi_params params;
+ VADisplay va_dpy;
+ VADRMPRIMESurfaceDescriptor prime;
+} gsr_capture_kms_vaapi;
+static void gsr_capture_kms_vaapi_stop(gsr_capture *cap, AVCodecContext *video_codec_context);
+static int gsr_capture_kms_vaapi_start(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame *frame) {
+ gsr_capture_kms_vaapi *cap_kms = cap->priv;
+ int res = gsr_capture_kms_start(&cap_kms->kms, cap_kms->params.display_to_capture, cap_kms->params.egl, video_codec_context, frame);
+ if(res != 0) {
+ gsr_capture_kms_vaapi_stop(cap, video_codec_context);
+ return res;
+ }
+ if(!drm_create_codec_context(cap_kms->params.egl->card_path, video_codec_context, video_codec_context->width, video_codec_context->height, cap_kms->params.hdr, &cap_kms->va_dpy)) {
+ gsr_capture_kms_vaapi_stop(cap, video_codec_context);
+ return -1;
+ }
+ if(!gsr_capture_base_setup_vaapi_textures(&cap_kms->kms.base, frame, cap_kms->va_dpy, &cap_kms->prime, cap_kms->params.color_range)) {
+ gsr_capture_kms_vaapi_stop(cap, video_codec_context);
+ return -1;
+ }
+ return 0;
+}
+static bool gsr_capture_kms_vaapi_should_stop(gsr_capture *cap, bool *err) {
+ gsr_capture_kms_vaapi *cap_kms = cap->priv;
+ if(cap_kms->kms.should_stop) {
+ if(err)
+ *err = cap_kms->kms.stop_is_error;
+ return true;
+ }
+ if(err)
+ *err = false;
+ return false;
+}
+static int gsr_capture_kms_vaapi_capture(gsr_capture *cap, AVFrame *frame) {
+ gsr_capture_kms_vaapi *cap_kms = cap->priv;
+ gsr_capture_kms_capture(&cap_kms->kms, frame, cap_kms->params.hdr, cap_kms->params.egl->gpu_info.vendor == GSR_GPU_VENDOR_INTEL, false, cap_kms->params.record_cursor);
+ return 0;
+}
+static void gsr_capture_kms_vaapi_capture_end(gsr_capture *cap, AVFrame *frame) {
+ (void)frame;
+ gsr_capture_kms_vaapi *cap_kms = cap->priv;
+ gsr_capture_kms_cleanup_kms_fds(&cap_kms->kms);
+}
+static void gsr_capture_kms_vaapi_stop(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ (void)video_codec_context;
+ gsr_capture_kms_vaapi *cap_kms = cap->priv;
+ for(uint32_t i = 0; i < cap_kms->prime.num_objects; ++i) {
+ if(cap_kms->prime.objects[i].fd > 0) {
+ close(cap_kms->prime.objects[i].fd);
+ cap_kms->prime.objects[i].fd = 0;
+ }
+ }
+ gsr_capture_kms_stop(&cap_kms->kms);
+}
+static void gsr_capture_kms_vaapi_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ (void)video_codec_context;
+ gsr_capture_kms_vaapi *cap_kms = cap->priv;
+ if(cap->priv) {
+ gsr_capture_kms_vaapi_stop(cap, video_codec_context);
+ free((void*)cap_kms->params.display_to_capture);
+ cap_kms->params.display_to_capture = NULL;
+ free(cap->priv);
+ cap->priv = NULL;
+ }
+ free(cap);
+}
+gsr_capture* gsr_capture_kms_vaapi_create(const gsr_capture_kms_vaapi_params *params) {
+ if(!params) {
+ fprintf(stderr, "gsr error: gsr_capture_kms_vaapi_create params is NULL\n");
+ return NULL;
+ }
+ gsr_capture *cap = calloc(1, sizeof(gsr_capture));
+ if(!cap)
+ return NULL;
+ gsr_capture_kms_vaapi *cap_kms = calloc(1, sizeof(gsr_capture_kms_vaapi));
+ if(!cap_kms) {
+ free(cap);
+ return NULL;
+ }
+ const char *display_to_capture = strdup(params->display_to_capture);
+ if(!display_to_capture) {
+ /* TODO XCloseDisplay */
+ free(cap);
+ free(cap_kms);
+ return NULL;
+ }
+ cap_kms->params = *params;
+ cap_kms->params.display_to_capture = display_to_capture;
+ *cap = (gsr_capture) {
+ .start = gsr_capture_kms_vaapi_start,
+ .tick = NULL,
+ .should_stop = gsr_capture_kms_vaapi_should_stop,
+ .capture = gsr_capture_kms_vaapi_capture,
+ .capture_end = gsr_capture_kms_vaapi_capture_end,
+ .destroy = gsr_capture_kms_vaapi_destroy,
+ .priv = cap_kms
+ };
+ return cap;
+}
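The nvfbc.c changes in this commit disable "screen-direct" for driver versions >= 515.57 and < 520.56. The following is a small sketch of that range check. version_at_least and version_less_than follow the definitions in the patch; direct_capture_is_affected is a hypothetical wrapper added only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

static bool version_at_least(int major, int minor, int expected_major, int expected_minor) {
    return major > expected_major || (major == expected_major && minor >= expected_minor);
}

static bool version_less_than(int major, int minor, int expected_major, int expected_minor) {
    return major < expected_major || (major == expected_major && minor < expected_minor);
}

/* Hypothetical wrapper: true when the driver version lies in the half-open
   range [515.57, 520.56) that the patch treats as causing stuttering. */
static bool direct_capture_is_affected(int major, int minor) {
    assert(major >= 0 && minor >= 0);
    return version_at_least(major, minor, 515, 57) && version_less_than(major, minor, 520, 56);
}
```

The minor part is compared as an integer, so 515.57 falls inside the range while 515.56 does not.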
diff --git a/src/capture/nvfbc.c b/src/capture/nvfbc.c
new file mode 100644
index 0000000..9eabb18
--- /dev/null
+++ b/src/capture/nvfbc.c
@@ -0,0 +1,535 @@
+#include "../../include/capture/nvfbc.h"
+#include "../../external/NvFBC.h"
+#include "../../include/cuda.h"
+#include "../../include/egl.h"
+#include "../../include/utils.h"
+#include <dlfcn.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include <math.h>
+#include <X11/Xlib.h>
+#include <libavutil/hwcontext.h>
+#include <libavutil/hwcontext_cuda.h>
+#include <libavutil/frame.h>
+#include <libavutil/version.h>
+#include <libavcodec/avcodec.h>
+typedef struct {
+ gsr_capture_base base;
+ gsr_capture_nvfbc_params params;
+ void *library;
+ NVFBC_SESSION_HANDLE nv_fbc_handle;
+ PNVFBCCREATEINSTANCE nv_fbc_create_instance;
+ NVFBC_API_FUNCTION_LIST nv_fbc_function_list;
+ bool fbc_handle_created;
+ bool capture_session_created;
+ gsr_cuda cuda;
+ CUgraphicsResource cuda_graphics_resources[2];
+ CUarray mapped_arrays[2];
+ CUstream cuda_stream; // TODO: asdasdsa
+ NVFBC_TOGL_SETUP_PARAMS setup_params;
+ bool direct_capture;
+ bool supports_direct_cursor;
+ bool capture_region;
+ uint32_t x, y, width, height;
+ NVFBC_TRACKING_TYPE tracking_type;
+ uint32_t output_id;
+ uint32_t tracking_width, tracking_height;
+ bool nvfbc_needs_recreate;
+ double nvfbc_dead_start;
+} gsr_capture_nvfbc;
+#if defined(_WIN64) || defined(__LP64__)
+typedef unsigned long long CUdeviceptr_v2;
+#else
+typedef unsigned int CUdeviceptr_v2;
+#endif
+typedef CUdeviceptr_v2 CUdeviceptr;
+static int max_int(int a, int b) {
+ return a > b ? a : b;
+}
+/* Returns 0 on failure */
+static uint32_t get_output_id_from_display_name(NVFBC_RANDR_OUTPUT_INFO *outputs, uint32_t num_outputs, const char *display_name, uint32_t *width, uint32_t *height) {
+ if(!outputs)
+ return 0;
+ for(uint32_t i = 0; i < num_outputs; ++i) {
+ if(strcmp(outputs[i].name, display_name) == 0) {
+ *width = outputs[i].trackedBox.w;
+ *height = outputs[i].trackedBox.h;
+ return outputs[i].dwId;
+ }
+ }
+ return 0;
+}
+/* TODO: Test with optimus and open kernel modules */
+static bool get_driver_version(int *major, int *minor) {
+ *major = 0;
+ *minor = 0;
+ FILE *f = fopen("/proc/driver/nvidia/version", "rb");
+ if(!f) {
+ fprintf(stderr, "gsr warning: failed to get nvidia driver version (failed to read /proc/driver/nvidia/version)\n");
+ return false;
+ }
+ char buffer[2048];
+ size_t bytes_read = fread(buffer, 1, sizeof(buffer) - 1, f);
+ buffer[bytes_read] = '\0';
+ bool success = false;
+ const char *p = strstr(buffer, "Kernel Module");
+ if(p) {
+ p += 13;
+ int driver_major_version = 0, driver_minor_version = 0;
+ if(sscanf(p, "%d.%d", &driver_major_version, &driver_minor_version) == 2) {
+ *major = driver_major_version;
+ *minor = driver_minor_version;
+ success = true;
+ }
+ }
+ if(!success)
+ fprintf(stderr, "gsr warning: failed to get nvidia driver version\n");
+ fclose(f);
+ return success;
+}
+static bool version_at_least(int major, int minor, int expected_major, int expected_minor) {
+ return major > expected_major || (major == expected_major && minor >= expected_minor);
+}
+static bool version_less_than(int major, int minor, int expected_major, int expected_minor) {
+ return major < expected_major || (major == expected_major && minor < expected_minor);
+}
+static void set_func_ptr(void **dst, void *src) {
+ *dst = src;
+}
+static bool gsr_capture_nvfbc_load_library(gsr_capture *cap) {
+ gsr_capture_nvfbc *cap_nvfbc = cap->priv;
+ dlerror(); /* clear */
+ void *lib = dlopen("libnvidia-fbc.so.1", RTLD_LAZY);
+ if(!lib) {
+ fprintf(stderr, "gsr error: failed to load libnvidia-fbc.so.1, error: %s\n", dlerror());
+ return false;
+ }
+ set_func_ptr((void**)&cap_nvfbc->nv_fbc_create_instance, dlsym(lib, "NvFBCCreateInstance"));
+ if(!cap_nvfbc->nv_fbc_create_instance) {
+ fprintf(stderr, "gsr error: unable to resolve symbol 'NvFBCCreateInstance'\n");
+ dlclose(lib);
+ return false;
+ }
+ memset(&cap_nvfbc->nv_fbc_function_list, 0, sizeof(cap_nvfbc->nv_fbc_function_list));
+ cap_nvfbc->nv_fbc_function_list.dwVersion = NVFBC_VERSION;
+ NVFBCSTATUS status = cap_nvfbc->nv_fbc_create_instance(&cap_nvfbc->nv_fbc_function_list);
+ if(status != NVFBC_SUCCESS) {
+ fprintf(stderr, "gsr error: failed to create NvFBC instance (status: %d)\n", status);
+ dlclose(lib);
+ return false;
+ }
+ cap_nvfbc->library = lib;
+ return true;
+}
+/* TODO: check for glx swap control extension string (GLX_EXT_swap_control, etc) */
+static void set_vertical_sync_enabled(gsr_egl *egl, int enabled) {
+ int result = 0;
+ if(egl->glXSwapIntervalEXT) {
+ egl->glXSwapIntervalEXT(egl->x11.dpy, egl->x11.window, enabled ? 1 : 0);
+ } else if(egl->glXSwapIntervalMESA) {
+ result = egl->glXSwapIntervalMESA(enabled ? 1 : 0);
+ } else if(egl->glXSwapIntervalSGI) {
+ result = egl->glXSwapIntervalSGI(enabled ? 1 : 0);
+ } else {
+ static int warned = 0;
+ if (!warned) {
+ warned = 1;
+ fprintf(stderr, "gsr warning: setting vertical sync not supported\n");
+ }
+ }
+ if(result != 0)
+ fprintf(stderr, "gsr warning: setting vertical sync failed\n");
+}
+static void gsr_capture_nvfbc_destroy_session(gsr_capture_nvfbc *cap_nvfbc) {
+ if(cap_nvfbc->fbc_handle_created && cap_nvfbc->capture_session_created) {
+ NVFBC_DESTROY_CAPTURE_SESSION_PARAMS destroy_capture_params;
+ memset(&destroy_capture_params, 0, sizeof(destroy_capture_params));
+ destroy_capture_params.dwVersion = NVFBC_DESTROY_CAPTURE_SESSION_PARAMS_VER;
+ cap_nvfbc->nv_fbc_function_list.nvFBCDestroyCaptureSession(cap_nvfbc->nv_fbc_handle, &destroy_capture_params);
+ cap_nvfbc->capture_session_created = false;
+ }
+}
+static void gsr_capture_nvfbc_destroy_handle(gsr_capture_nvfbc *cap_nvfbc) {
+ if(cap_nvfbc->fbc_handle_created) {
+ NVFBC_DESTROY_HANDLE_PARAMS destroy_params;
+ memset(&destroy_params, 0, sizeof(destroy_params));
+ destroy_params.dwVersion = NVFBC_DESTROY_HANDLE_PARAMS_VER;
+ cap_nvfbc->nv_fbc_function_list.nvFBCDestroyHandle(cap_nvfbc->nv_fbc_handle, &destroy_params);
+ cap_nvfbc->fbc_handle_created = false;
+ cap_nvfbc->nv_fbc_handle = 0;
+ }
+}
+static void gsr_capture_nvfbc_destroy_session_and_handle(gsr_capture_nvfbc *cap_nvfbc) {
+ gsr_capture_nvfbc_destroy_session(cap_nvfbc);
+ gsr_capture_nvfbc_destroy_handle(cap_nvfbc);
+}
+static int gsr_capture_nvfbc_setup_handle(gsr_capture_nvfbc *cap_nvfbc) {
+ NVFBCSTATUS status;
+ NVFBC_CREATE_HANDLE_PARAMS create_params;
+ memset(&create_params, 0, sizeof(create_params));
+ create_params.dwVersion = NVFBC_CREATE_HANDLE_PARAMS_VER;
+ create_params.bExternallyManagedContext = NVFBC_TRUE;
+ create_params.glxCtx = cap_nvfbc->params.egl->glx_context;
+ create_params.glxFBConfig = cap_nvfbc->params.egl->glx_fb_config;
+ status = cap_nvfbc->nv_fbc_function_list.nvFBCCreateHandle(&cap_nvfbc->nv_fbc_handle, &create_params);
+ if(status != NVFBC_SUCCESS) {
+ // Reverse engineering for interoperability
+ const uint8_t enable_key[] = { 0xac, 0x10, 0xc9, 0x2e, 0xa5, 0xe6, 0x87, 0x4f, 0x8f, 0x4b, 0xf4, 0x61, 0xf8, 0x56, 0x27, 0xe9 };
+ create_params.privateData = enable_key;
+ create_params.privateDataSize = 16;
+ status = cap_nvfbc->nv_fbc_function_list.nvFBCCreateHandle(&cap_nvfbc->nv_fbc_handle, &create_params);
+ if(status != NVFBC_SUCCESS) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle));
+ goto error_cleanup;
+ }
+ }
+ cap_nvfbc->fbc_handle_created = true;
+ NVFBC_GET_STATUS_PARAMS status_params;
+ memset(&status_params, 0, sizeof(status_params));
+ status_params.dwVersion = NVFBC_GET_STATUS_PARAMS_VER;
+ status = cap_nvfbc->nv_fbc_function_list.nvFBCGetStatus(cap_nvfbc->nv_fbc_handle, &status_params);
+ if(status != NVFBC_SUCCESS) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle));
+ goto error_cleanup;
+ }
+ if(status_params.bCanCreateNow == NVFBC_FALSE) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: it's not possible to create a capture session on this system\n");
+ goto error_cleanup;
+ }
+ cap_nvfbc->tracking_width = XWidthOfScreen(DefaultScreenOfDisplay(cap_nvfbc->params.egl->x11.dpy));
+ cap_nvfbc->tracking_height = XHeightOfScreen(DefaultScreenOfDisplay(cap_nvfbc->params.egl->x11.dpy));
+ cap_nvfbc->tracking_type = strcmp(cap_nvfbc->params.display_to_capture, "screen") == 0 ? NVFBC_TRACKING_SCREEN : NVFBC_TRACKING_OUTPUT;
+ if(cap_nvfbc->tracking_type == NVFBC_TRACKING_OUTPUT) {
+ if(!status_params.bXRandRAvailable) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: the xrandr extension is not available\n");
+ goto error_cleanup;
+ }
+ if(status_params.bInModeset) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: the x server is in modeset, unable to record\n");
+ goto error_cleanup;
+ }
+ cap_nvfbc->output_id = get_output_id_from_display_name(status_params.outputs, status_params.dwOutputNum, cap_nvfbc->params.display_to_capture, &cap_nvfbc->tracking_width, &cap_nvfbc->tracking_height);
+ if(cap_nvfbc->output_id == 0) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: display '%s' not found\n", cap_nvfbc->params.display_to_capture);
+ goto error_cleanup;
+ }
+ }
+ return 0;
+ error_cleanup:
+ gsr_capture_nvfbc_destroy_session_and_handle(cap_nvfbc);
+ return -1;
+}
+static int gsr_capture_nvfbc_setup_session(gsr_capture_nvfbc *cap_nvfbc) {
+ NVFBC_CREATE_CAPTURE_SESSION_PARAMS create_capture_params;
+ memset(&create_capture_params, 0, sizeof(create_capture_params));
+ create_capture_params.dwVersion = NVFBC_CREATE_CAPTURE_SESSION_PARAMS_VER;
+ create_capture_params.eCaptureType = NVFBC_CAPTURE_TO_GL;
+ create_capture_params.bWithCursor = (!cap_nvfbc->direct_capture || cap_nvfbc->supports_direct_cursor) ? NVFBC_TRUE : NVFBC_FALSE;
+ if(!cap_nvfbc->params.record_cursor)
+ create_capture_params.bWithCursor = false;
+ if(cap_nvfbc->capture_region)
+ create_capture_params.captureBox = (NVFBC_BOX){ cap_nvfbc->x, cap_nvfbc->y, cap_nvfbc->width, cap_nvfbc->height };
+ create_capture_params.eTrackingType = cap_nvfbc->tracking_type;
+ create_capture_params.dwSamplingRateMs = (uint32_t)ceilf(1000.0f / (float)cap_nvfbc->params.fps);
+ create_capture_params.bAllowDirectCapture = cap_nvfbc->direct_capture ? NVFBC_TRUE : NVFBC_FALSE;
+ create_capture_params.bPushModel = cap_nvfbc->direct_capture ? NVFBC_TRUE : NVFBC_FALSE;
+ create_capture_params.bDisableAutoModesetRecovery = true;
+ if(cap_nvfbc->tracking_type == NVFBC_TRACKING_OUTPUT)
+ create_capture_params.dwOutputId = cap_nvfbc->output_id;
+ NVFBCSTATUS status = cap_nvfbc->nv_fbc_function_list.nvFBCCreateCaptureSession(cap_nvfbc->nv_fbc_handle, &create_capture_params);
+ if(status != NVFBC_SUCCESS) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle));
+ return -1;
+ }
+ cap_nvfbc->capture_session_created = true;
+ memset(&cap_nvfbc->setup_params, 0, sizeof(cap_nvfbc->setup_params));
+ cap_nvfbc->setup_params.dwVersion = NVFBC_TOGL_SETUP_PARAMS_VER;
+ cap_nvfbc->setup_params.eBufferFormat = NVFBC_BUFFER_FORMAT_BGRA;
+ status = cap_nvfbc->nv_fbc_function_list.nvFBCToGLSetUp(cap_nvfbc->nv_fbc_handle, &cap_nvfbc->setup_params);
+ if(status != NVFBC_SUCCESS) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_start failed: %s\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle));
+ gsr_capture_nvfbc_destroy_session(cap_nvfbc);
+ return -1;
+ }
+ return 0;
+}
+static int gsr_capture_nvfbc_start(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame *frame) {
+ gsr_capture_nvfbc *cap_nvfbc = cap->priv;
+ cap_nvfbc->base.video_codec_context = video_codec_context;
+ cap_nvfbc->base.egl = cap_nvfbc->params.egl;
+ if(!gsr_cuda_load(&cap_nvfbc->cuda, cap_nvfbc->params.egl->x11.dpy, cap_nvfbc->params.overclock))
+ return -1;
+ if(!gsr_capture_nvfbc_load_library(cap)) {
+ gsr_cuda_unload(&cap_nvfbc->cuda);
+ return -1;
+ }
+ cap_nvfbc->x = max_int(cap_nvfbc->params.pos.x, 0);
+ cap_nvfbc->y = max_int(cap_nvfbc->params.pos.y, 0);
+ cap_nvfbc->width = max_int(cap_nvfbc->params.size.x, 0);
+ cap_nvfbc->height = max_int(cap_nvfbc->params.size.y, 0);
+ cap_nvfbc->capture_region = (cap_nvfbc->x > 0 || cap_nvfbc->y > 0 || cap_nvfbc->width > 0 || cap_nvfbc->height > 0);
+ cap_nvfbc->supports_direct_cursor = false;
+ bool direct_capture = cap_nvfbc->params.direct_capture;
+ int driver_major_version = 0;
+ int driver_minor_version = 0;
+ if(direct_capture && get_driver_version(&driver_major_version, &driver_minor_version)) {
+ fprintf(stderr, "Info: detected nvidia version: %d.%d\n", driver_major_version, driver_minor_version);
+ // TODO:
+ if(version_at_least(driver_major_version, driver_minor_version, 515, 57) && version_less_than(driver_major_version, driver_minor_version, 520, 56)) {
+ direct_capture = false;
+ fprintf(stderr, "Warning: \"screen-direct\" has temporarily been disabled as it causes stuttering with driver versions >= 515.57 and < 520.56. Please update your driver if possible. Capturing \"screen\" instead.\n");
+ }
+ // TODO:
+ // Cursor capture disabled because moving the cursor doesn't update capture rate to monitor hz and instead captures at 10-30 hz
+ /*
+ if(direct_capture) {
+ if(version_at_least(driver_major_version, driver_minor_version, 515, 57))
+ supports_direct_cursor = true;
+ else
+ fprintf(stderr, "Info: capturing \"screen-direct\" but driver version appears to be less than 515.57. Disabling capture of cursor. Please update your driver if you want to capture your cursor or record \"screen\" instead.\n");
+ }
+ */
+ }
+ if(gsr_capture_nvfbc_setup_handle(cap_nvfbc) != 0) {
+ goto error_cleanup;
+ }
+ if(gsr_capture_nvfbc_setup_session(cap_nvfbc) != 0) {
+ goto error_cleanup;
+ }
+ if(cap_nvfbc->capture_region) {
+ video_codec_context->width = cap_nvfbc->width & ~1;
+ video_codec_context->height = cap_nvfbc->height & ~1;
+ } else {
+ video_codec_context->width = cap_nvfbc->tracking_width & ~1;
+ video_codec_context->height = cap_nvfbc->tracking_height & ~1;
+ }
+ frame->width = video_codec_context->width;
+ frame->height = video_codec_context->height;
+ if(!cuda_create_codec_context(cap_nvfbc->cuda.cu_ctx, video_codec_context, video_codec_context->width, video_codec_context->height, false, &cap_nvfbc->cuda_stream))
+ goto error_cleanup;
+ gsr_cuda_context cuda_context = {
+ .cuda = &cap_nvfbc->cuda,
+ .cuda_graphics_resources = cap_nvfbc->cuda_graphics_resources,
+ .mapped_arrays = cap_nvfbc->mapped_arrays
+ };
+ // TODO: Remove this, it creates resources we don't need
+ if(!gsr_capture_base_setup_cuda_textures(&cap_nvfbc->base, frame, &cuda_context, cap_nvfbc->params.color_range, GSR_SOURCE_COLOR_BGR, cap_nvfbc->params.hdr)) {
+ goto error_cleanup;
+ }
+ /* Disable vsync */
+ set_vertical_sync_enabled(cap_nvfbc->params.egl, 0);
+ return 0;
+ error_cleanup:
+ gsr_capture_nvfbc_destroy_session_and_handle(cap_nvfbc);
+ gsr_capture_base_stop(&cap_nvfbc->base);
+ gsr_cuda_unload(&cap_nvfbc->cuda);
+ return -1;
+}
+static int gsr_capture_nvfbc_capture(gsr_capture *cap, AVFrame *frame) {
+ gsr_capture_nvfbc *cap_nvfbc = cap->priv;
+ const double nvfbc_recreate_retry_time_seconds = 1.0;
+ if(cap_nvfbc->nvfbc_needs_recreate) {
+ const double now = clock_get_monotonic_seconds();
+ if(now - cap_nvfbc->nvfbc_dead_start >= nvfbc_recreate_retry_time_seconds) {
+ cap_nvfbc->nvfbc_dead_start = now;
+ gsr_capture_nvfbc_destroy_session_and_handle(cap_nvfbc);
+ if(gsr_capture_nvfbc_setup_handle(cap_nvfbc) != 0) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_capture failed to recreate nvfbc handle, trying again in %f second(s)\n", nvfbc_recreate_retry_time_seconds);
+ return -1;
+ }
+ if(gsr_capture_nvfbc_setup_session(cap_nvfbc) != 0) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_capture failed to recreate nvfbc session, trying again in %f second(s)\n", nvfbc_recreate_retry_time_seconds);
+ return -1;
+ }
+ cap_nvfbc->nvfbc_needs_recreate = false;
+ } else {
+ return 0;
+ }
+ }
+ NVFBC_FRAME_GRAB_INFO frame_info;
+ memset(&frame_info, 0, sizeof(frame_info));
+ NVFBC_TOGL_GRAB_FRAME_PARAMS grab_params;
+ memset(&grab_params, 0, sizeof(grab_params));
+ grab_params.dwVersion = NVFBC_TOGL_GRAB_FRAME_PARAMS_VER;
+ grab_params.pFrameGrabInfo = &frame_info;
+ grab_params.dwTimeoutMs = 0;
+ NVFBCSTATUS status = cap_nvfbc->nv_fbc_function_list.nvFBCToGLGrabFrame(cap_nvfbc->nv_fbc_handle, &grab_params);
+ if(status != NVFBC_SUCCESS) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_capture failed: %s (%d), recreating session after %f second(s)\n", cap_nvfbc->nv_fbc_function_list.nvFBCGetLastErrorStr(cap_nvfbc->nv_fbc_handle), status, nvfbc_recreate_retry_time_seconds);
+ cap_nvfbc->nvfbc_needs_recreate = true;
+ cap_nvfbc->nvfbc_dead_start = clock_get_monotonic_seconds();
+ return 0;
+ }
+ //cap_nvfbc->params.egl->glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
+ cap_nvfbc->params.egl->glClear(0);
+ gsr_color_conversion_draw(&cap_nvfbc->base.color_conversion, cap_nvfbc->setup_params.dwTextures[grab_params.dwTextureIndex],
+ (vec2i){0, 0}, (vec2i){frame->width, frame->height},
+ (vec2i){0, 0}, (vec2i){frame->width, frame->height},
+ 0.0f, false);
+ cap_nvfbc->params.egl->glXSwapBuffers(cap_nvfbc->params.egl->x11.dpy, cap_nvfbc->params.egl->x11.window);
+ // TODO: HDR is broken
+ const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+ for(int i = 0; i < 2; ++i) {
+ CUDA_MEMCPY2D memcpy_struct;
+ memcpy_struct.srcXInBytes = 0;
+ memcpy_struct.srcY = 0;
+ memcpy_struct.srcMemoryType = CU_MEMORYTYPE_ARRAY;
+ memcpy_struct.dstXInBytes = 0;
+ memcpy_struct.dstY = 0;
+ memcpy_struct.dstMemoryType = CU_MEMORYTYPE_DEVICE;
+ memcpy_struct.srcArray = cap_nvfbc->mapped_arrays[i];
+ memcpy_struct.srcPitch = frame->width / div[i];
+ memcpy_struct.dstDevice = (CUdeviceptr)frame->data[i];
+ memcpy_struct.dstPitch = frame->linesize[i];
+ memcpy_struct.WidthInBytes = frame->width * (cap_nvfbc->params.hdr ? 2 : 1);
+ memcpy_struct.Height = frame->height / div[i];
+ // TODO: Remove this copy if possible
+ cap_nvfbc->cuda.cuMemcpy2DAsync_v2(&memcpy_struct, cap_nvfbc->cuda_stream);
+ }
+ // TODO: needed?
+ cap_nvfbc->cuda.cuStreamSynchronize(cap_nvfbc->cuda_stream);
+ return 0;
+}
+static void gsr_capture_nvfbc_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ (void)video_codec_context;
+ gsr_capture_nvfbc *cap_nvfbc = cap->priv;
+ gsr_capture_nvfbc_destroy_session_and_handle(cap_nvfbc);
+ if(cap_nvfbc) {
+ gsr_capture_base_stop(&cap_nvfbc->base);
+ gsr_cuda_unload(&cap_nvfbc->cuda);
+ dlclose(cap_nvfbc->library);
+ free((void*)cap_nvfbc->params.display_to_capture);
+ cap_nvfbc->params.display_to_capture = NULL;
+ free(cap->priv);
+ cap->priv = NULL;
+ }
+ free(cap);
+}
+gsr_capture* gsr_capture_nvfbc_create(const gsr_capture_nvfbc_params *params) {
+ if(!params) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_create params is NULL\n");
+ return NULL;
+ }
+ if(!params->display_to_capture) {
+ fprintf(stderr, "gsr error: gsr_capture_nvfbc_create params.display_to_capture is NULL\n");
+ return NULL;
+ }
+ gsr_capture *cap = calloc(1, sizeof(gsr_capture));
+ if(!cap)
+ return NULL;
+ gsr_capture_nvfbc *cap_nvfbc = calloc(1, sizeof(gsr_capture_nvfbc));
+ if(!cap_nvfbc) {
+ free(cap);
+ return NULL;
+ }
+ const char *display_to_capture = strdup(params->display_to_capture);
+ if(!display_to_capture) {
+ free(cap);
+ free(cap_nvfbc);
+ return NULL;
+ }
+ cap_nvfbc->params = *params;
+ cap_nvfbc->params.display_to_capture = display_to_capture;
+ cap_nvfbc->params.fps = max_int(cap_nvfbc->params.fps, 1);
+ *cap = (gsr_capture) {
+ .start = gsr_capture_nvfbc_start,
+ .tick = NULL,
+ .should_stop = NULL,
+ .capture = gsr_capture_nvfbc_capture,
+ .capture_end = NULL,
+ .destroy = gsr_capture_nvfbc_destroy,
+ .priv = cap_nvfbc
+ };
+ return cap;
+}
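The recreate logic in `gsr_capture_nvfbc_capture` above is a cooldown timer: after a failed grab the session is marked dead, and a recreate is attempted at most once per `nvfbc_recreate_retry_time_seconds`. The following is a minimal sketch of that pattern in isolation. The names `retry_state`, `retry_mark_failed`, `retry_should_attempt` and `retry_mark_recovered` are hypothetical and not part of this commit, and the caller is assumed to supply the monotonic time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two fields the capture struct uses:
   nvfbc_needs_recreate and nvfbc_dead_start. */
typedef struct {
    bool needs_recreate;
    double dead_start; /* monotonic time (seconds) of the last failure or retry */
} retry_state;

/* Record a failure observed at time `now`. */
static void retry_mark_failed(retry_state *s, double now) {
    s->needs_recreate = true;
    s->dead_start = now;
}

/*
 * Returns true when a recreate attempt is due at time `now`.
 * The cooldown window restarts on every attempt, so an attempt that
 * fails again waits a full interval before the next one.
 */
static bool retry_should_attempt(retry_state *s, double now, double interval) {
    assert(interval >= 0.0);
    if(!s->needs_recreate)
        return false;
    if(now - s->dead_start < interval)
        return false;
    s->dead_start = now;
    return true;
}

/* Call once the recreate succeeded. */
static void retry_mark_recovered(retry_state *s) {
    s->needs_recreate = false;
}
```

In the diff, the equivalent of `retry_should_attempt` returning false maps to `return 0` (skip this frame without reporting an error), which keeps the recording alive while the session is down.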
diff --git a/src/capture/xcomposite.c b/src/capture/xcomposite.c
new file mode 100644
index 0000000..3240ed8
--- /dev/null
+++ b/src/capture/xcomposite.c
@@ -0,0 +1,351 @@
+#include "../../include/capture/xcomposite.h"
+#include "../../include/window_texture.h"
+#include "../../include/utils.h"
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <assert.h>
+#include <X11/Xlib.h>
+#include <X11/extensions/Xdamage.h>
+#include <libavutil/hwcontext.h>
+#include <libavutil/hwcontext.h>
+#include <libavutil/frame.h>
+#include <libavcodec/avcodec.h>
+#include <va/va.h>
+#include <va/va_drmcommon.h>
+static int max_int(int a, int b) {
+ return a > b ? a : b;
+}
+void gsr_capture_xcomposite_init(gsr_capture_xcomposite *self, const gsr_capture_xcomposite_params *params) {
+ memset(self, 0, sizeof(*self));
+ self->params = *params;
+}
+static Window get_focused_window(Display *display, Atom net_active_window_atom) {
+ Atom type;
+ int format = 0;
+ unsigned long num_items = 0;
+ unsigned long bytes_after = 0;
+ unsigned char *properties = NULL;
+ if(XGetWindowProperty(display, DefaultRootWindow(display), net_active_window_atom, 0, 1024, False, AnyPropertyType, &type, &format, &num_items, &bytes_after, &properties) == Success && properties) {
+ Window focused_window = *(unsigned long*)properties;
+ XFree(properties);
+ return focused_window;
+ }
+ return None;
+}
+static void gsr_capture_xcomposite_setup_damage(gsr_capture_xcomposite *self, Window window) {
+ if(self->damage_event == 0)
+ return;
+ if(self->damage) {
+ XDamageDestroy(self->params.egl->x11.dpy, self->damage);
+ self->damage = None;
+ }
+ self->damage = XDamageCreate(self->params.egl->x11.dpy, window, XDamageReportNonEmpty);
+ if(self->damage) {
+ XDamageSubtract(self->params.egl->x11.dpy, self->damage, None, None);
+ } else {
+ fprintf(stderr, "gsr warning: gsr_capture_xcomposite_setup_damage: XDamageCreate failed\n");
+ }
+}
+int gsr_capture_xcomposite_start(gsr_capture_xcomposite *self, AVCodecContext *video_codec_context, AVFrame *frame) {
+ self->base.video_codec_context = video_codec_context;
+ self->base.egl = self->params.egl;
+ if(self->params.follow_focused) {
+ self->net_active_window_atom = XInternAtom(self->params.egl->x11.dpy, "_NET_ACTIVE_WINDOW", False);
+ if(!self->net_active_window_atom) {
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_start failed: failed to get _NET_ACTIVE_WINDOW atom\n");
+ return -1;
+ }
+ self->window = get_focused_window(self->params.egl->x11.dpy, self->net_active_window_atom);
+ } else {
+ self->window = self->params.window;
+ }
+ if(self->params.track_damage) {
+ if(!XDamageQueryExtension(self->params.egl->x11.dpy, &self->damage_event, &self->damage_error)) {
+ fprintf(stderr, "gsr warning: gsr_capture_xcomposite_start: XDamage is not supported by your X11 server\n");
+ self->damage_event = 0;
+ self->damage_error = 0;
+ }
+ } else {
+ self->damage_event = 0;
+ self->damage_error = 0;
+ }
+ self->damaged = true;
+ gsr_capture_xcomposite_setup_damage(self, self->window);
+ /* TODO: Do these in tick, and allow error if follow_focused */
+ XWindowAttributes attr;
+ if(!XGetWindowAttributes(self->params.egl->x11.dpy, self->window, &attr) && !self->params.follow_focused) {
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_start failed: invalid window id: %lu\n", self->window);
+ return -1;
+ }
+ self->window_size.x = max_int(attr.width, 0);
+ self->window_size.y = max_int(attr.height, 0);
+ if(self->params.follow_focused)
+ XSelectInput(self->params.egl->x11.dpy, DefaultRootWindow(self->params.egl->x11.dpy), PropertyChangeMask);
+ // TODO: Get select and add these on top of it and then restore at the end. Also do the same in other xcomposite
+ XSelectInput(self->params.egl->x11.dpy, self->window, StructureNotifyMask | ExposureMask);
+ if(!self->params.egl->eglExportDMABUFImageQueryMESA) {
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_start: could not find eglExportDMABUFImageQueryMESA\n");
+ return -1;
+ }
+ if(!self->params.egl->eglExportDMABUFImageMESA) {
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_start: could not find eglExportDMABUFImageMESA\n");
+ return -1;
+ }
+ /* Disable vsync */
+ self->params.egl->eglSwapInterval(self->params.egl->egl_display, 0);
+ if(window_texture_init(&self->window_texture, self->params.egl->x11.dpy, self->window, self->params.egl) != 0 && !self->params.follow_focused) {
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_start: failed to get window texture for window %lu\n", self->window);
+ return -1;
+ }
+ if(gsr_cursor_init(&self->cursor, self->params.egl, self->params.egl->x11.dpy) != 0) {
+ gsr_capture_xcomposite_stop(self);
+ return -1;
+ }
+ self->texture_size.x = 0;
+ self->texture_size.y = 0;
+ self->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&self->window_texture));
+ self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &self->texture_size.x);
+ self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &self->texture_size.y);
+ self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+ vec2i video_size = self->texture_size;
+ if(self->params.region_size.x > 0 && self->params.region_size.y > 0)
+ video_size = self->params.region_size;
+ if(self->params.egl->gpu_info.vendor == GSR_GPU_VENDOR_AMD && video_codec_context->codec_id == AV_CODEC_ID_HEVC) {
+ // TODO: don't do this if ffmpeg reports that it is not needed (AMD driver bug that was fixed recently)
+ video_codec_context->width = FFALIGN(video_size.x, 64);
+ video_codec_context->height = FFALIGN(video_size.y, 16);
+ } else if(self->params.egl->gpu_info.vendor == GSR_GPU_VENDOR_AMD && video_codec_context->codec_id == AV_CODEC_ID_AV1) {
+ // TODO: Don't do this for VCN 5 and later, which should fix this hardware bug
+ video_codec_context->width = FFALIGN(video_size.x, 64);
+ // AMD driver has special case handling for 1080 height to set it to 1082 instead of 1088 (1080 aligned to 16).
+ // TODO: Set height to 1082 in this case, but it won't work because it will be aligned to 1088.
+ if(video_size.y == 1080) {
+ video_codec_context->height = 1080;
+ } else {
+ video_codec_context->height = FFALIGN(video_size.y, 16);
+ }
+ } else {
+ video_codec_context->width = FFALIGN(video_size.x, 2);
+ video_codec_context->height = FFALIGN(video_size.y, 2);
+ }
+ frame->width = video_codec_context->width;
+ frame->height = video_codec_context->height;
+ self->window_resize_timer = clock_get_monotonic_seconds();
+ return 0;
+}
+void gsr_capture_xcomposite_stop(gsr_capture_xcomposite *self) {
+ if(self->damage) {
+ XDamageDestroy(self->params.egl->x11.dpy, self->damage);
+ self->damage = None;
+ }
+ window_texture_deinit(&self->window_texture);
+ gsr_cursor_deinit(&self->cursor);
+ gsr_capture_base_stop(&self->base);
+}
+void gsr_capture_xcomposite_tick(gsr_capture_xcomposite *self, AVCodecContext *video_codec_context) {
+ (void)video_codec_context;
+ //self->params.egl->glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
+ self->params.egl->glClear(0);
+ bool init_new_window = false;
+ while(XPending(self->params.egl->x11.dpy)) {
+ XNextEvent(self->params.egl->x11.dpy, &self->xev);
+ switch(self->xev.type) {
+ case DestroyNotify: {
+ /* Window died (when not following focused window), so we stop recording */
+ if(!self->params.follow_focused && self->xev.xdestroywindow.window == self->window) {
+ self->should_stop = true;
+ self->stop_is_error = false;
+ }
+ break;
+ }
+ case Expose: {
+ /* Requires window texture recreate */
+ if(self->xev.xexpose.count == 0 && self->xev.xexpose.window == self->window) {
+ self->window_resize_timer = clock_get_monotonic_seconds();
+ self->window_resized = true;
+ }
+ break;
+ }
+ case ConfigureNotify: {
+ /* Window resized */
+ if(self->xev.xconfigure.window == self->window && (self->xev.xconfigure.width != self->window_size.x || self->xev.xconfigure.height != self->window_size.y)) {
+ self->window_size.x = max_int(self->xev.xconfigure.width, 0);
+ self->window_size.y = max_int(self->xev.xconfigure.height, 0);
+ self->window_resize_timer = clock_get_monotonic_seconds();
+ self->window_resized = true;
+ }
+ break;
+ }
+ case PropertyNotify: {
+ /* Focused window changed */
+ if(self->params.follow_focused && self->xev.xproperty.atom == self->net_active_window_atom) {
+ init_new_window = true;
+ }
+ break;
+ }
+ }
+ if(self->damage_event && self->xev.type == self->damage_event + XDamageNotify) {
+ XDamageNotifyEvent *de = (XDamageNotifyEvent*)&self->xev;
+ XserverRegion region = XFixesCreateRegion(self->params.egl->x11.dpy, NULL, 0);
+ // Subtract all the damage, repairing the window
+ XDamageSubtract(self->params.egl->x11.dpy, de->damage, None, region);
+ XFixesDestroyRegion(self->params.egl->x11.dpy, region);
+ self->damaged = true;
+ }
+ if(gsr_cursor_update(&self->cursor, &self->xev)) {
+ if(self->params.record_cursor && self->cursor.visible) {
+ self->damaged = true;
+ }
+ }
+ }
+ if(self->params.follow_focused && !self->follow_focused_initialized) {
+ init_new_window = true;
+ }
+ if(init_new_window) {
+ Window focused_window = get_focused_window(self->params.egl->x11.dpy, self->net_active_window_atom);
+ if(focused_window != self->window || !self->follow_focused_initialized) {
+ self->follow_focused_initialized = true;
+ XSelectInput(self->params.egl->x11.dpy, self->window, 0);
+ self->window = focused_window;
+ XSelectInput(self->params.egl->x11.dpy, self->window, StructureNotifyMask | ExposureMask);
+ XWindowAttributes attr;
+ attr.width = 0;
+ attr.height = 0;
+ if(!XGetWindowAttributes(self->params.egl->x11.dpy, self->window, &attr))
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_tick failed: invalid window id: %lu\n", self->window);
+ self->window_size.x = max_int(attr.width, 0);
+ self->window_size.y = max_int(attr.height, 0);
+ self->window_resized = true;
+ window_texture_deinit(&self->window_texture);
+ window_texture_init(&self->window_texture, self->params.egl->x11.dpy, self->window, self->params.egl); // TODO: Do not do the below window_texture_on_resize after this
+ gsr_capture_xcomposite_setup_damage(self, self->window);
+ }
+ }
+ const double window_resize_timeout = 1.0; // 1 second
+ if(self->window_resized && clock_get_monotonic_seconds() - self->window_resize_timer >= window_resize_timeout) {
+ self->window_resized = false;
+ if(window_texture_on_resize(&self->window_texture) != 0) {
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_tick: window_texture_on_resize failed\n");
+ //self->should_stop = true;
+ //self->stop_is_error = true;
+ return;
+ }
+ self->texture_size.x = 0;
+ self->texture_size.y = 0;
+ self->params.egl->glBindTexture(GL_TEXTURE_2D, window_texture_get_opengl_texture_id(&self->window_texture));
+ self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &self->texture_size.x);
+ self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &self->texture_size.y);
+ self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+ gsr_color_conversion_clear(&self->base.color_conversion);
+ gsr_capture_xcomposite_setup_damage(self, self->window);
+ }
+}
+bool gsr_capture_xcomposite_is_damaged(gsr_capture_xcomposite *self) {
+ return self->damage_event ? self->damaged : true;
+}
+void gsr_capture_xcomposite_clear_damage(gsr_capture_xcomposite *self) {
+ self->damaged = false;
+}
+bool gsr_capture_xcomposite_should_stop(gsr_capture_xcomposite *self, bool *err) {
+ if(self->should_stop) {
+ if(err)
+ *err = self->stop_is_error;
+ return true;
+ }
+ if(err)
+ *err = false;
+ return false;
+}
+int gsr_capture_xcomposite_capture(gsr_capture_xcomposite *self, AVFrame *frame) {
+ (void)frame;
+ const int target_x = max_int(0, frame->width / 2 - self->texture_size.x / 2);
+ const int target_y = max_int(0, frame->height / 2 - self->texture_size.y / 2);
+ const vec2i cursor_pos = {
+ target_x + self->cursor.position.x - self->cursor.hotspot.x,
+ target_y + self->cursor.position.y - self->cursor.hotspot.y
+ };
+ gsr_color_conversion_draw(&self->base.color_conversion, window_texture_get_opengl_texture_id(&self->window_texture),
+ (vec2i){target_x, target_y}, self->texture_size,
+ (vec2i){0, 0}, self->texture_size,
+ 0.0f, false);
+ if(self->params.record_cursor && self->cursor.visible) {
+ gsr_cursor_tick(&self->cursor, self->window);
+ const bool cursor_inside_window =
+ cursor_pos.x + self->cursor.size.x >= target_x &&
+ cursor_pos.x <= target_x + self->texture_size.x &&
+ cursor_pos.y + self->cursor.size.y >= target_y &&
+ cursor_pos.y <= target_y + self->texture_size.y;
+ if(cursor_inside_window) {
+ self->base.egl->glEnable(GL_SCISSOR_TEST);
+ self->base.egl->glScissor(target_x, target_y, self->texture_size.x, self->texture_size.y);
+ gsr_color_conversion_draw(&self->base.color_conversion, self->cursor.texture_id,
+ cursor_pos, self->cursor.size,
+ (vec2i){0, 0}, self->cursor.size,
+ 0.0f, false);
+ self->base.egl->glDisable(GL_SCISSOR_TEST);
+ }
+ }
+ self->params.egl->eglSwapBuffers(self->params.egl->egl_display, self->params.egl->egl_surface);
+ //self->params.egl->glFlush();
+ //self->params.egl->glFinish();
+ return 0;
+}
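`gsr_capture_xcomposite_start` above picks the encoder frame size by rounding the video size up with FFmpeg's `FFALIGN`: 64x16 for AMD HEVC, 64x16 with a 1080-height exception for AMD AV1, and 2x2 otherwise. The sketch below reproduces just that arithmetic so it can be checked by hand. `align_up`, `size2i`, `encoder_kind` and `encoder_frame_size` are hypothetical names introduced for illustration; `align_up` uses the same rounding expression as `FFALIGN` and assumes a power-of-two alignment.

```c
#include <assert.h>

/* Round x up to the next multiple of a (a must be a power of two).
   Same arithmetic as FFmpeg's FFALIGN macro. */
static int align_up(int x, int a) {
    assert(a > 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

typedef struct { int x, y; } size2i; /* hypothetical stand-in for vec2i */

/* Hypothetical enum collapsing the vendor/codec checks from the diff. */
typedef enum { ENC_AMD_HEVC, ENC_AMD_AV1, ENC_OTHER } encoder_kind;

static size2i encoder_frame_size(size2i video, encoder_kind kind) {
    size2i out;
    switch(kind) {
        case ENC_AMD_HEVC:
            out.x = align_up(video.x, 64);
            out.y = align_up(video.y, 16);
            break;
        case ENC_AMD_AV1:
            out.x = align_up(video.x, 64);
            /* 1080 is left unaligned, mirroring the special case in the diff */
            out.y = video.y == 1080 ? 1080 : align_up(video.y, 16);
            break;
        default:
            out.x = align_up(video.x, 2);
            out.y = align_up(video.y, 2);
            break;
    }
    return out;
}
```

For example, a 1366x768 window becomes 1408x768 on the AMD HEVC path, which is why `gsr_capture_xcomposite_capture` centers the texture inside the (possibly larger) frame.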
diff --git a/src/capture/xcomposite_cuda.c b/src/capture/xcomposite_cuda.c
new file mode 100644
index 0000000..c436221
--- /dev/null
+++ b/src/capture/xcomposite_cuda.c
@@ -0,0 +1,167 @@
+#include "../../include/capture/xcomposite_cuda.h"
+#include "../../include/cuda.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <libavutil/frame.h>
+#include <libavcodec/avcodec.h>
+typedef struct {
+ gsr_capture_xcomposite xcomposite;
+ bool overclock;
+ gsr_cuda cuda;
+ CUgraphicsResource cuda_graphics_resources[2];
+ CUarray mapped_arrays[2];
+ CUstream cuda_stream;
+} gsr_capture_xcomposite_cuda;
+static void gsr_capture_xcomposite_cuda_stop(gsr_capture *cap, AVCodecContext *video_codec_context);
+static int gsr_capture_xcomposite_cuda_start(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame *frame) {
+ gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
+ const int res = gsr_capture_xcomposite_start(&cap_xcomp->xcomposite, video_codec_context, frame);
+ if(res != 0) {
+ gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
+ return res;
+ }
+ if(!gsr_cuda_load(&cap_xcomp->cuda, cap_xcomp->xcomposite.params.egl->x11.dpy, cap_xcomp->overclock)) {
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_start: failed to load cuda\n");
+ gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
+ return -1;
+ }
+ if(!cuda_create_codec_context(cap_xcomp->cuda.cu_ctx, video_codec_context, video_codec_context->width, video_codec_context->height, false, &cap_xcomp->cuda_stream)) {
+ gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
+ return -1;
+ }
+ gsr_cuda_context cuda_context = {
+ .cuda = &cap_xcomp->cuda,
+ .cuda_graphics_resources = cap_xcomp->cuda_graphics_resources,
+ .mapped_arrays = cap_xcomp->mapped_arrays
+ };
+ if(!gsr_capture_base_setup_cuda_textures(&cap_xcomp->xcomposite.base, frame, &cuda_context, cap_xcomp->xcomposite.params.color_range, GSR_SOURCE_COLOR_RGB, false)) {
+ gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
+ return -1;
+ }
+ return 0;
+}
+static void gsr_capture_xcomposite_unload_cuda_graphics(gsr_capture_xcomposite_cuda *cap_xcomp) {
+ if(cap_xcomp->cuda.cu_ctx) {
+ for(int i = 0; i < 2; ++i) {
+ if(cap_xcomp->cuda_graphics_resources[i]) {
+ cap_xcomp->cuda.cuGraphicsUnmapResources(1, &cap_xcomp->cuda_graphics_resources[i], 0);
+ cap_xcomp->cuda.cuGraphicsUnregisterResource(cap_xcomp->cuda_graphics_resources[i]);
+ cap_xcomp->cuda_graphics_resources[i] = 0;
+ }
+ }
+ }
+}
+static void gsr_capture_xcomposite_cuda_stop(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ (void)video_codec_context;
+ gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
+ gsr_capture_xcomposite_stop(&cap_xcomp->xcomposite);
+ gsr_capture_xcomposite_unload_cuda_graphics(cap_xcomp);
+ gsr_cuda_unload(&cap_xcomp->cuda);
+}
+static void gsr_capture_xcomposite_cuda_tick(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
+ gsr_capture_xcomposite_tick(&cap_xcomp->xcomposite, video_codec_context);
+}
+static bool gsr_capture_xcomposite_cuda_is_damaged(gsr_capture *cap) {
+ gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
+ return gsr_capture_xcomposite_is_damaged(&cap_xcomp->xcomposite);
+}
+static void gsr_capture_xcomposite_cuda_clear_damage(gsr_capture *cap) {
+ gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
+ gsr_capture_xcomposite_clear_damage(&cap_xcomp->xcomposite);
+}
+static bool gsr_capture_xcomposite_cuda_should_stop(gsr_capture *cap, bool *err) {
+ gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
+ return gsr_capture_xcomposite_should_stop(&cap_xcomp->xcomposite, err);
+}
+static int gsr_capture_xcomposite_cuda_capture(gsr_capture *cap, AVFrame *frame) {
+ gsr_capture_xcomposite_cuda *cap_xcomp = cap->priv;
+ gsr_capture_xcomposite_capture(&cap_xcomp->xcomposite, frame);
+ const int div[2] = {1, 2}; // divide UV texture size by 2 because chroma is half size
+ for(int i = 0; i < 2; ++i) {
+ CUDA_MEMCPY2D memcpy_struct;
+ memcpy_struct.srcXInBytes = 0;
+ memcpy_struct.srcY = 0;
+ memcpy_struct.srcMemoryType = CU_MEMORYTYPE_ARRAY;
+ memcpy_struct.dstXInBytes = 0;
+ memcpy_struct.dstY = 0;
+ memcpy_struct.dstMemoryType = CU_MEMORYTYPE_DEVICE;
+ memcpy_struct.srcArray = cap_xcomp->mapped_arrays[i];
+ memcpy_struct.srcPitch = frame->width / div[i];
+ memcpy_struct.dstDevice = (CUdeviceptr)frame->data[i];
+ memcpy_struct.dstPitch = frame->linesize[i];
+ memcpy_struct.WidthInBytes = frame->width;
+ memcpy_struct.Height = frame->height / div[i];
+ // TODO: Remove this copy if possible
+ cap_xcomp->cuda.cuMemcpy2DAsync_v2(&memcpy_struct, cap_xcomp->cuda_stream);
+ }
+ // TODO: needed?
+ cap_xcomp->cuda.cuStreamSynchronize(cap_xcomp->cuda_stream);
+ return 0;
+}
+static void gsr_capture_xcomposite_cuda_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ if(cap->priv) {
+ gsr_capture_xcomposite_cuda_stop(cap, video_codec_context);
+ free(cap->priv);
+ cap->priv = NULL;
+ }
+ free(cap);
+}
+gsr_capture* gsr_capture_xcomposite_cuda_create(const gsr_capture_xcomposite_cuda_params *params) {
+ if(!params) {
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_cuda_create params is NULL\n");
+ return NULL;
+ }
+ gsr_capture *cap = calloc(1, sizeof(gsr_capture));
+ if(!cap)
+ return NULL;
+ gsr_capture_xcomposite_cuda *cap_xcomp = calloc(1, sizeof(gsr_capture_xcomposite_cuda));
+ if(!cap_xcomp) {
+ free(cap);
+ return NULL;
+ }
+ gsr_capture_xcomposite_init(&cap_xcomp->xcomposite, &params->base);
+ cap_xcomp->overclock = params->overclock;
+ *cap = (gsr_capture) {
+ .start = gsr_capture_xcomposite_cuda_start,
+ .tick = gsr_capture_xcomposite_cuda_tick,
+ .is_damaged = gsr_capture_xcomposite_cuda_is_damaged,
+ .clear_damage = gsr_capture_xcomposite_cuda_clear_damage,
+ .should_stop = gsr_capture_xcomposite_cuda_should_stop,
+ .capture = gsr_capture_xcomposite_cuda_capture,
+ .capture_end = NULL,
+ .destroy = gsr_capture_xcomposite_cuda_destroy,
+ .priv = cap_xcomp
+ };
+ return cap;
+}
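The copy loop in `gsr_capture_xcomposite_cuda_capture` above relies on the NV12 layout: plane 0 holds luma at full resolution, and plane 1 holds interleaved U/V samples at half resolution in both directions. Because each chroma row stores `width / 2` pairs of two samples, its width in bytes equals the luma row width, and only the height is halved (the `div` array in the diff). A minimal sketch of that geometry, with hypothetical names (`plane_extent`, `nv12_plane_extent`, `nv12_total_bytes`) and assuming even dimensions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: the extent of one plane copy. */
typedef struct {
    size_t width_in_bytes;
    size_t height;
} plane_extent;

/* plane 0 = luma (Y), plane 1 = interleaved chroma (UV).
   bytes_per_sample is 1 for 8-bit NV12 and 2 for 10-bit P010. */
static plane_extent nv12_plane_extent(int width, int height, int plane, int bytes_per_sample) {
    assert(plane == 0 || plane == 1);
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    plane_extent e;
    /* Y: width samples per row. UV: width/2 pairs of 2 samples, also width samples per row. */
    e.width_in_bytes = (size_t)width * (size_t)bytes_per_sample;
    e.height = (size_t)height / (plane == 0 ? 1 : 2);
    return e;
}

/* Total payload bytes across both planes (ignores row pitch padding). */
static size_t nv12_total_bytes(int width, int height, int bytes_per_sample) {
    size_t total = 0;
    for(int plane = 0; plane < 2; ++plane) {
        const plane_extent e = nv12_plane_extent(width, height, plane, bytes_per_sample);
        total += e.width_in_bytes * e.height;
    }
    return total;
}
```

The destination pitch is a separate concern: the diff takes it from `frame->linesize[i]`, which may be larger than the row width because of encoder alignment.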
diff --git a/src/capture/xcomposite_vaapi.c b/src/capture/xcomposite_vaapi.c
new file mode 100644
index 0000000..3f27014
--- /dev/null
+++ b/src/capture/xcomposite_vaapi.c
@@ -0,0 +1,121 @@
+#include "../../include/capture/xcomposite_vaapi.h"
+#include "../../include/capture/xcomposite.h"
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <va/va.h>
+#include <va/va_drmcommon.h>
+#include <libavcodec/avcodec.h>
+typedef struct {
+ gsr_capture_xcomposite xcomposite;
+ VADisplay va_dpy;
+ VADRMPRIMESurfaceDescriptor prime;
+} gsr_capture_xcomposite_vaapi;
+static void gsr_capture_xcomposite_vaapi_stop(gsr_capture *cap, AVCodecContext *video_codec_context);
+static int gsr_capture_xcomposite_vaapi_start(gsr_capture *cap, AVCodecContext *video_codec_context, AVFrame *frame) {
+ gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
+ const int res = gsr_capture_xcomposite_start(&cap_xcomp->xcomposite, video_codec_context, frame);
+ if(res != 0) {
+ gsr_capture_xcomposite_vaapi_stop(cap, video_codec_context);
+ return res;
+ }
+ if(!drm_create_codec_context(cap_xcomp->xcomposite.params.egl->card_path, video_codec_context, video_codec_context->width, video_codec_context->height, false, &cap_xcomp->va_dpy)) {
+ gsr_capture_xcomposite_vaapi_stop(cap, video_codec_context);
+ return -1;
+ }
+ if(!gsr_capture_base_setup_vaapi_textures(&cap_xcomp->xcomposite.base, frame, cap_xcomp->va_dpy, &cap_xcomp->prime, cap_xcomp->xcomposite.params.color_range)) {
+ gsr_capture_xcomposite_vaapi_stop(cap, video_codec_context);
+ return -1;
+ }
+ return 0;
+}
+static void gsr_capture_xcomposite_vaapi_tick(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
+ gsr_capture_xcomposite_tick(&cap_xcomp->xcomposite, video_codec_context);
+}
+static bool gsr_capture_xcomposite_vaapi_is_damaged(gsr_capture *cap) {
+ gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
+ return gsr_capture_xcomposite_is_damaged(&cap_xcomp->xcomposite);
+}
+static void gsr_capture_xcomposite_vaapi_clear_damage(gsr_capture *cap) {
+ gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
+ gsr_capture_xcomposite_clear_damage(&cap_xcomp->xcomposite);
+}
+static bool gsr_capture_xcomposite_vaapi_should_stop(gsr_capture *cap, bool *err) {
+ gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
+ return gsr_capture_xcomposite_should_stop(&cap_xcomp->xcomposite, err);
+}
+static int gsr_capture_xcomposite_vaapi_capture(gsr_capture *cap, AVFrame *frame) {
+ gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
+ return gsr_capture_xcomposite_capture(&cap_xcomp->xcomposite, frame);
+}
+static void gsr_capture_xcomposite_vaapi_stop(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ (void)video_codec_context;
+ gsr_capture_xcomposite_vaapi *cap_xcomp = cap->priv;
+ for(uint32_t i = 0; i < cap_xcomp->prime.num_objects; ++i) {
+ if(cap_xcomp->prime.objects[i].fd > 0) {
+ close(cap_xcomp->prime.objects[i].fd);
+ cap_xcomp->prime.objects[i].fd = 0;
+ }
+ }
+ gsr_capture_xcomposite_stop(&cap_xcomp->xcomposite);
+}
+static void gsr_capture_xcomposite_vaapi_destroy(gsr_capture *cap, AVCodecContext *video_codec_context) {
+ (void)video_codec_context;
+ if(cap->priv) {
+ gsr_capture_xcomposite_vaapi_stop(cap, video_codec_context);
+ free(cap->priv);
+ cap->priv = NULL;
+ }
+ free(cap);
+}
+gsr_capture* gsr_capture_xcomposite_vaapi_create(const gsr_capture_xcomposite_vaapi_params *params) {
+ if(!params) {
+ fprintf(stderr, "gsr error: gsr_capture_xcomposite_vaapi_create params is NULL\n");
+ return NULL;
+ }
+ gsr_capture *cap = calloc(1, sizeof(gsr_capture));
+ if(!cap)
+ return NULL;
+ gsr_capture_xcomposite_vaapi *cap_xcomp = calloc(1, sizeof(gsr_capture_xcomposite_vaapi));
+ if(!cap_xcomp) {
+ free(cap);
+ return NULL;
+ }
+ gsr_capture_xcomposite_init(&cap_xcomp->xcomposite, &params->base);
+ *cap = (gsr_capture) {
+ .start = gsr_capture_xcomposite_vaapi_start,
+ .tick = gsr_capture_xcomposite_vaapi_tick,
+ .is_damaged = gsr_capture_xcomposite_vaapi_is_damaged,
+ .clear_damage = gsr_capture_xcomposite_vaapi_clear_damage,
+ .should_stop = gsr_capture_xcomposite_vaapi_should_stop,
+ .capture = gsr_capture_xcomposite_vaapi_capture,
+ .capture_end = NULL,
+ .destroy = gsr_capture_xcomposite_vaapi_destroy,
+ .priv = cap_xcomp
+ };
+ return cap;
+}
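The diff for `src/color_conversion.c` that follows defines full-range and limited-range RGB-to-YUV matrices, and its comments state the relation between them: the limited matrix is the full matrix multiplied by (235-16)/255, with 16/255 added to luma. The sketch below applies that rule to a column-major 4x4 matrix (the layout of a GLSL `mat4`) so the constants can be re-derived. `full_to_limited` and `nearly_equal` are hypothetical helpers, not functions from this commit, and the sketch follows the comment literally, scaling chroma by the same factor as luma.

```c
#include <assert.h>
#include <stdbool.h>

/* Tolerance comparison without pulling in libm. */
static bool nearly_equal(float a, float b, float eps) {
    const float d = a - b;
    return d < eps && d > -eps;
}

/*
 * Matrices are column-major like a GLSL mat4: m[column][row].
 * Columns 0..2 hold the R, G and B weights; column 3 holds the offsets.
 */
static void full_to_limited(const float full[4][4], float limited[4][4]) {
    const float scale = (235.0f - 16.0f) / 255.0f;
    /* Scale the R, G and B weight columns. */
    for(int col = 0; col < 3; ++col) {
        for(int row = 0; row < 4; ++row)
            limited[col][row] = full[col][row] * scale;
    }
    /* Keep the offset column and raise the luma offset by 16/255. */
    for(int row = 0; row < 4; ++row)
        limited[3][row] = full[3][row];
    limited[3][0] += 16.0f / 255.0f;
}
```

Feeding it the `RGB_TO_P010_FULL` values reproduces the `RGB_TO_P010_LIMITED` constants to six decimals, for example 0.262700 * 219/255 = 0.225613 for the red luma weight.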
diff --git a/src/color_conversion.c b/src/color_conversion.c
new file mode 100644
index 0000000..cd0397e
--- /dev/null
+++ b/src/color_conversion.c
@@ -0,0 +1,469 @@
+#include "../include/color_conversion.h"
+#include "../include/egl.h"
+#include <stdio.h>
+#include <string.h>
+#include <math.h>
+#include <assert.h>
+/* TODO: highp instead of mediump? */
+#define MAX_SHADERS 4
+static float abs_f(float v) {
+ return v >= 0.0f ? v : -v;
+}
+#define ROTATE_Z "mat4 rotate_z(in float angle) {\n" \
+ " return mat4(cos(angle), -sin(angle), 0.0, 0.0,\n" \
+ " sin(angle), cos(angle), 0.0, 0.0,\n" \
+ " 0.0, 0.0, 1.0, 0.0,\n" \
+ " 0.0, 0.0, 0.0, 1.0);\n" \
+ "}\n"
+/* https://en.wikipedia.org/wiki/YCbCr, see study/color_space_transform_matrix.png */
+/* ITU-R BT2020, full */
+/* https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.2020-2-201510-I!!PDF-E.pdf */
+#define RGB_TO_P010_FULL "const mat4 RGBtoYUV = mat4(0.262700, -0.139630, 0.500000, 0.000000,\n" \
+ " 0.678000, -0.360370, -0.459786, 0.000000,\n" \
+ " 0.059300, 0.500000, -0.040214, 0.000000,\n" \
+ " 0.000000, 0.500000, 0.500000, 1.000000);"
+/* ITU-R BT2020, limited (full multiplied by (235-16)/255, adding 16/255 to luma) */
+#define RGB_TO_P010_LIMITED "const mat4 RGBtoYUV = mat4(0.225613, -0.119918, 0.429412, 0.000000,\n" \
+ " 0.582282, -0.309494, -0.394875, 0.000000,\n" \
+ " 0.050928, 0.429412, -0.034537, 0.000000,\n" \
+ " 0.062745, 0.500000, 0.500000, 1.000000);"
+/* ITU-R BT709, full, custom values: 0.2110 0.7110 0.0710 */
+/* https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.709-6-201506-I!!PDF-E.pdf */
+#define RGB_TO_NV12_FULL "const mat4 RGBtoYUV = mat4(0.211000, -0.113563, 0.500000, 0.000000,\n" \
+ " 0.711000, -0.382670, -0.450570, 0.000000,\n" \
+ " 0.071000, 0.500000, -0.044994, 0.000000,\n" \
+ " 0.000000, 0.500000, 0.500000, 1.000000);"
+/* ITU-R BT709, limited, custom values: 0.2100 0.7100 0.0700 (full multiplied by (235-16)/255, adding 16/255 to luma) */
+#define RGB_TO_NV12_LIMITED "const mat4 RGBtoYUV = mat4(0.180353, -0.096964, 0.429412, 0.000000,\n" \
+ " 0.609765, -0.327830, -0.385927, 0.000000,\n" \
+ " 0.060118, 0.429412, -0.038049, 0.000000,\n" \
+ " 0.062745, 0.500000, 0.500000, 1.000000);"
+static const char* color_format_range_get_transform_matrix(gsr_destination_color color_format, gsr_color_range color_range) {
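+ switch(color_format) {
+ case GSR_DESTINATION_COLOR_NV12: {
+ switch(color_range) {
+ case GSR_COLOR_RANGE_LIMITED:
+ return RGB_TO_NV12_LIMITED;
+ case GSR_COLOR_RANGE_FULL:
+ return RGB_TO_NV12_FULL;
+ }
+ break;
+ }
+ case GSR_DESTINATION_COLOR_P010: {
+ switch(color_range) {
+ case GSR_COLOR_RANGE_LIMITED:
+ return RGB_TO_P010_LIMITED;
+ case GSR_COLOR_RANGE_FULL:
+ return RGB_TO_P010_FULL;
+ }
+ break;
+ }
+ default:
+ return NULL;
+ }
+ return NULL;
+}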
+static int load_shader_y(gsr_shader *shader, gsr_egl *egl, gsr_color_uniforms *uniforms, gsr_destination_color color_format, gsr_color_range color_range, bool external_texture) {
+ const char *color_transform_matrix = color_format_range_get_transform_matrix(color_format, color_range);
+ char vertex_shader[2048];
+ snprintf(vertex_shader, sizeof(vertex_shader),
+ "#version 300 es \n"
+ "in vec2 pos; \n"
+ "in vec2 texcoords; \n"
+ "out vec2 texcoords_out; \n"
+ "uniform vec2 offset; \n"
+ "uniform float rotation; \n"
+ ROTATE_Z
+ "void main() \n"
+ "{ \n"
+ " texcoords_out = (vec4(texcoords.x - 0.5, texcoords.y - 0.5, 0.0, 0.0) * rotate_z(rotation)).xy + vec2(0.5, 0.5); \n"
+ " gl_Position = vec4(offset.x, offset.y, 0.0, 0.0) + vec4(pos.x, pos.y, 0.0, 1.0); \n"
+ "} \n");
+ char fragment_shader[2048];
+ if(external_texture) {
+ snprintf(fragment_shader, sizeof(fragment_shader),
+ "#version 300 es \n"
+ "#extension GL_OES_EGL_image_external : enable \n"
+ "#extension GL_OES_EGL_image_external_essl3 : require \n"
+ "precision mediump float; \n"
+ "in vec2 texcoords_out; \n"
+ "uniform samplerExternalOES tex1; \n"
+ "out vec4 FragColor; \n"
+ "%s"
+ "void main() \n"
+ "{ \n"
+ " vec4 pixel = texture(tex1, texcoords_out); \n"
+ " FragColor.x = (RGBtoYUV * vec4(pixel.rgb, 1.0)).x; \n"
+ " FragColor.w = pixel.a; \n"
+ "} \n", color_transform_matrix);
+ } else {
+ snprintf(fragment_shader, sizeof(fragment_shader),
+ "#version 300 es \n"
+ "precision mediump float; \n"
+ "in vec2 texcoords_out; \n"
+ "uniform sampler2D tex1; \n"
+ "out vec4 FragColor; \n"
+ "%s"
+ "void main() \n"
+ "{ \n"
+ " vec4 pixel = texture(tex1, texcoords_out); \n"
+ " FragColor.x = (RGBtoYUV * vec4(pixel.rgb, 1.0)).x; \n"
+ " FragColor.w = pixel.a; \n"
+ "} \n", color_transform_matrix);
+ }
+ if(gsr_shader_init(shader, egl, vertex_shader, fragment_shader) != 0)
+ return -1;
+ gsr_shader_bind_attribute_location(shader, "pos", 0);
+ gsr_shader_bind_attribute_location(shader, "texcoords", 1);
+ uniforms->offset = egl->glGetUniformLocation(shader->program_id, "offset");
+ uniforms->rotation = egl->glGetUniformLocation(shader->program_id, "rotation");
+ return 0;
+}
+static int load_shader_uv(gsr_shader *shader, gsr_egl *egl, gsr_color_uniforms *uniforms, gsr_destination_color color_format, gsr_color_range color_range, bool external_texture) {
+ const char *color_transform_matrix = color_format_range_get_transform_matrix(color_format, color_range);
+ char vertex_shader[2048];
+ snprintf(vertex_shader, sizeof(vertex_shader),
+ "#version 300 es \n"
+ "in vec2 pos; \n"
+ "in vec2 texcoords; \n"
+ "out vec2 texcoords_out; \n"
+ "uniform vec2 offset; \n"
+ "uniform float rotation; \n"
+ "void main() \n"
+ "{ \n"
+ " texcoords_out = (vec4(texcoords.x - 0.5, texcoords.y - 0.5, 0.0, 0.0) * rotate_z(rotation)).xy + vec2(0.5, 0.5); \n"
+ " gl_Position = (vec4(offset.x, offset.y, 0.0, 0.0) + vec4(pos.x, pos.y, 0.0, 1.0)) * vec4(0.5, 0.5, 1.0, 1.0) - vec4(0.5, 0.5, 0.0, 0.0); \n"
+ "} \n");
+ char fragment_shader[2048];
+ if(external_texture) {
+ snprintf(fragment_shader, sizeof(fragment_shader),
+ "#version 300 es \n"
+ "#extension GL_OES_EGL_image_external : enable \n"
+ "#extension GL_OES_EGL_image_external_essl3 : require \n"
+ "precision mediump float; \n"
+ "in vec2 texcoords_out; \n"
+ "uniform samplerExternalOES tex1; \n"
+ "out vec4 FragColor; \n"
+ "%s"
+ "void main() \n"
+ "{ \n"
+ " vec4 pixel = texture(tex1, texcoords_out); \n"
+ " FragColor.xy = (RGBtoYUV * vec4(pixel.rgb, 1.0)).yz; \n"
+ " FragColor.w = pixel.a; \n"
+ "} \n", color_transform_matrix);
+ } else {
+ snprintf(fragment_shader, sizeof(fragment_shader),
+ "#version 300 es \n"
+ "precision mediump float; \n"
+ "in vec2 texcoords_out; \n"
+ "uniform sampler2D tex1; \n"
+ "out vec4 FragColor; \n"
+ "%s"
+ "void main() \n"
+ "{ \n"
+ " vec4 pixel = texture(tex1, texcoords_out); \n"
+ " FragColor.xy = (RGBtoYUV * vec4(pixel.rgb, 1.0)).yz; \n"
+ " FragColor.w = pixel.a; \n"
+ "} \n", color_transform_matrix);
+ }
+ if(gsr_shader_init(shader, egl, vertex_shader, fragment_shader) != 0)
+ return -1;
+ gsr_shader_bind_attribute_location(shader, "pos", 0);
+ gsr_shader_bind_attribute_location(shader, "texcoords", 1);
+ uniforms->offset = egl->glGetUniformLocation(shader->program_id, "offset");
+ uniforms->rotation = egl->glGetUniformLocation(shader->program_id, "rotation");
+ return 0;
+static int load_framebuffers(gsr_color_conversion *self) {
+ /* TODO: Only generate the necessary amount of framebuffers (self->params.num_destination_textures) */
+ const unsigned int draw_buffer = GL_COLOR_ATTACHMENT0;
+ self->params.egl->glGenFramebuffers(MAX_FRAMEBUFFERS, self->framebuffers);
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[0]);
+ self->params.egl->glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, self->params.destination_textures[0], 0);
+ self->params.egl->glDrawBuffers(1, &draw_buffer);
+ if(self->params.egl->glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
+ fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to create framebuffer for Y\n");
+ goto err;
+ }
+ if(self->params.num_destination_textures > 1) {
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[1]);
+ self->params.egl->glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, self->params.destination_textures[1], 0);
+ self->params.egl->glDrawBuffers(1, &draw_buffer);
+ if(self->params.egl->glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
+ fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to create framebuffer for UV\n");
+ goto err;
+ }
+ }
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, 0);
+ return 0;
+ err:
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, 0);
+ return -1;
+static int create_vertices(gsr_color_conversion *self) {
+ self->params.egl->glGenVertexArrays(1, &self->vertex_array_object_id);
+ self->params.egl->glBindVertexArray(self->vertex_array_object_id);
+ self->params.egl->glGenBuffers(1, &self->vertex_buffer_object_id);
+ self->params.egl->glBindBuffer(GL_ARRAY_BUFFER, self->vertex_buffer_object_id);
+ self->params.egl->glBufferData(GL_ARRAY_BUFFER, 24 * sizeof(float), NULL, GL_STREAM_DRAW);
+ self->params.egl->glEnableVertexAttribArray(0);
+ self->params.egl->glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 4 * sizeof(float), (void*)0);
+ self->params.egl->glEnableVertexAttribArray(1);
+ self->params.egl->glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 4 * sizeof(float), (void*)(2 * sizeof(float)));
+ self->params.egl->glBindVertexArray(0);
+ return 0;
+int gsr_color_conversion_init(gsr_color_conversion *self, const gsr_color_conversion_params *params) {
+ assert(params);
+ assert(params->egl);
+ memset(self, 0, sizeof(*self));
+ self->params.egl = params->egl;
+ self->params = *params;
+ switch(params->destination_color) {
+ if(self->params.num_destination_textures != 2) {
+ fprintf(stderr, "gsr error: gsr_color_conversion_init: expected 2 destination textures for destination color NV12/P010, got %d destination texture(s)\n", self->params.num_destination_textures);
+ return -1;
+ }
+ if(load_shader_y(&self->shaders[0], self->params.egl, &self->uniforms[0], params->destination_color, params->color_range, false) != 0) {
+ fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load Y shader\n");
+ goto err;
+ }
+ if(load_shader_uv(&self->shaders[1], self->params.egl, &self->uniforms[1], params->destination_color, params->color_range, false) != 0) {
+ fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load UV shader\n");
+ goto err;
+ }
+ if(self->params.load_external_image_shader) {
+ if(load_shader_y(&self->shaders[2], self->params.egl, &self->uniforms[2], params->destination_color, params->color_range, true) != 0) {
+ fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external Y shader\n");
+ goto err;
+ }
+ if(load_shader_uv(&self->shaders[3], self->params.egl, &self->uniforms[3], params->destination_color, params->color_range, true) != 0) {
+ fprintf(stderr, "gsr error: gsr_color_conversion_init: failed to load external UV shader\n");
+ goto err;
+ }
+ }
+ break;
+ }
+ }
+ if(load_framebuffers(self) != 0)
+ goto err;
+ if(create_vertices(self) != 0)
+ goto err;
+ return 0;
+ err:
+ gsr_color_conversion_deinit(self);
+ return -1;
+void gsr_color_conversion_deinit(gsr_color_conversion *self) {
+ if(!self->params.egl)
+ return;
+ if(self->vertex_buffer_object_id) {
+ self->params.egl->glDeleteBuffers(1, &self->vertex_buffer_object_id);
+ self->vertex_buffer_object_id = 0;
+ }
+ if(self->vertex_array_object_id) {
+ self->params.egl->glDeleteVertexArrays(1, &self->vertex_array_object_id);
+ self->vertex_array_object_id = 0;
+ }
+ self->params.egl->glDeleteFramebuffers(MAX_FRAMEBUFFERS, self->framebuffers);
+ for(int i = 0; i < MAX_FRAMEBUFFERS; ++i) {
+ self->framebuffers[i] = 0;
+ }
+ for(int i = 0; i < MAX_SHADERS; ++i) {
+ gsr_shader_deinit(&self->shaders[i]);
+ }
+ self->params.egl = NULL;
+static void gsr_color_conversion_swizzle_texture_source(gsr_color_conversion *self) {
+ if(self->params.source_color == GSR_SOURCE_COLOR_BGR) {
+ const int swizzle_mask[] = { GL_BLUE, GL_GREEN, GL_RED, 1 };
+ self->params.egl->glTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_RGBA, swizzle_mask);
+ }
+static void gsr_color_conversion_swizzle_reset(gsr_color_conversion *self) {
+ if(self->params.source_color == GSR_SOURCE_COLOR_BGR) {
+ const int swizzle_mask[] = { GL_RED, GL_GREEN, GL_BLUE, GL_ALPHA };
+ self->params.egl->glTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_RGBA, swizzle_mask);
+ }
+/* |source_pos| and |source_size| are in pixel coordinates */
+void gsr_color_conversion_draw(gsr_color_conversion *self, unsigned int texture_id, vec2i source_pos, vec2i source_size, vec2i texture_pos, vec2i texture_size, float rotation, bool external_texture) {
+ // TODO: Remove this crap
+ rotation = M_PI*2.0f - rotation;
+ /* TODO: Do not call this every frame? */
+ vec2i dest_texture_size = {0, 0};
+ self->params.egl->glBindTexture(GL_TEXTURE_2D, self->params.destination_textures[0]);
+ self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &dest_texture_size.x);
+ self->params.egl->glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &dest_texture_size.y);
+ self->params.egl->glBindTexture(GL_TEXTURE_2D, 0);
+ const int texture_target = external_texture ? GL_TEXTURE_EXTERNAL_OES : GL_TEXTURE_2D;
+ self->params.egl->glBindTexture(texture_target, texture_id);
+ vec2i source_texture_size = {0, 0};
+ if(external_texture) {
+ source_texture_size = source_size;
+ } else {
+ /* TODO: Do not call this every frame? */
+ self->params.egl->glGetTexLevelParameteriv(texture_target, 0, GL_TEXTURE_WIDTH, &source_texture_size.x);
+ self->params.egl->glGetTexLevelParameteriv(texture_target, 0, GL_TEXTURE_HEIGHT, &source_texture_size.y);
+ }
+ // TODO: Remove this crap
+ if(abs_f(M_PI * 0.5f - rotation) <= 0.001f || abs_f(M_PI * 1.5f - rotation) <= 0.001f) {
+ const int tmp = source_texture_size.x;
+ source_texture_size.x = source_texture_size.y;
+ source_texture_size.y = tmp;
+ }
+ const vec2f pos_norm = {
+ ((float)source_pos.x / (dest_texture_size.x == 0 ? 1.0f : (float)dest_texture_size.x)) * 2.0f,
+ ((float)source_pos.y / (dest_texture_size.y == 0 ? 1.0f : (float)dest_texture_size.y)) * 2.0f,
+ };
+ const vec2f size_norm = {
+ ((float)source_size.x / (dest_texture_size.x == 0 ? 1.0f : (float)dest_texture_size.x)) * 2.0f,
+ ((float)source_size.y / (dest_texture_size.y == 0 ? 1.0f : (float)dest_texture_size.y)) * 2.0f,
+ };
+ const vec2f texture_pos_norm = {
+ (float)texture_pos.x / (source_texture_size.x == 0 ? 1.0f : (float)source_texture_size.x),
+ (float)texture_pos.y / (source_texture_size.y == 0 ? 1.0f : (float)source_texture_size.y),
+ };
+ const vec2f texture_size_norm = {
+ (float)texture_size.x / (source_texture_size.x == 0 ? 1.0f : (float)source_texture_size.x),
+ (float)texture_size.y / (source_texture_size.y == 0 ? 1.0f : (float)source_texture_size.y),
+ };
+ const float vertices[] = {
+ -1.0f + 0.0f, -1.0f + 0.0f + size_norm.y, texture_pos_norm.x, texture_pos_norm.y + texture_size_norm.y,
+ -1.0f + 0.0f, -1.0f + 0.0f, texture_pos_norm.x, texture_pos_norm.y,
+ -1.0f + 0.0f + size_norm.x, -1.0f + 0.0f, texture_pos_norm.x + texture_size_norm.x, texture_pos_norm.y,
+ -1.0f + 0.0f, -1.0f + 0.0f + size_norm.y, texture_pos_norm.x, texture_pos_norm.y + texture_size_norm.y,
+ -1.0f + 0.0f + size_norm.x, -1.0f + 0.0f, texture_pos_norm.x + texture_size_norm.x, texture_pos_norm.y,
+ -1.0f + 0.0f + size_norm.x, -1.0f + 0.0f + size_norm.y, texture_pos_norm.x + texture_size_norm.x, texture_pos_norm.y + texture_size_norm.y
+ };
+ gsr_color_conversion_swizzle_texture_source(self);
+ self->params.egl->glBindVertexArray(self->vertex_array_object_id);
+ self->params.egl->glViewport(0, 0, dest_texture_size.x, dest_texture_size.y);
+ /* TODO: this, also cleanup */
+ self->params.egl->glBindBuffer(GL_ARRAY_BUFFER, self->vertex_buffer_object_id);
+ self->params.egl->glBufferSubData(GL_ARRAY_BUFFER, 0, 24 * sizeof(float), vertices);
+ {
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[0]);
+ //cap_xcomp->params.egl->glClear(GL_COLOR_BUFFER_BIT); // TODO: Do this in a separate clear_ function. We want to do that when using multiple drm to create the final image (multiple monitors for example)
+ const int shader_index = external_texture ? 2 : 0;
+ gsr_shader_use(&self->shaders[shader_index]);
+ self->params.egl->glUniform1f(self->uniforms[shader_index].rotation, rotation);
+ self->params.egl->glUniform2f(self->uniforms[shader_index].offset, pos_norm.x, pos_norm.y);
+ self->params.egl->glDrawArrays(GL_TRIANGLES, 0, 6);
+ }
+ if(self->params.num_destination_textures > 1) {
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[1]);
+ //cap_xcomp->params.egl->glClear(GL_COLOR_BUFFER_BIT);
+ const int shader_index = external_texture ? 3 : 1;
+ gsr_shader_use(&self->shaders[shader_index]);
+ self->params.egl->glUniform1f(self->uniforms[shader_index].rotation, rotation);
+ self->params.egl->glUniform2f(self->uniforms[shader_index].offset, pos_norm.x, pos_norm.y);
+ self->params.egl->glDrawArrays(GL_TRIANGLES, 0, 6);
+ }
+ self->params.egl->glBindVertexArray(0);
+ gsr_shader_use_none(&self->shaders[0]);
+ self->params.egl->glBindTexture(texture_target, 0);
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, 0);
+ gsr_color_conversion_swizzle_reset(self);
+void gsr_color_conversion_clear(gsr_color_conversion *self) {
+ float color1[4] = {0.0f, 0.0f, 0.0f, 1.0f};
+ float color2[4] = {0.0f, 0.0f, 0.0f, 1.0f};
+ switch(self->params.destination_color) {
+ color2[0] = 0.5f;
+ color2[1] = 0.5f;
+ color2[2] = 0.0f;
+ color2[3] = 1.0f;
+ break;
+ }
+ }
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[0]);
+ self->params.egl->glClearColor(color1[0], color1[1], color1[2], color1[3]);
+ self->params.egl->glClear(GL_COLOR_BUFFER_BIT);
+ if(self->params.num_destination_textures > 1) {
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, self->framebuffers[1]);
+ self->params.egl->glClearColor(color2[0], color2[1], color2[2], color2[3]);
+ self->params.egl->glClear(GL_COLOR_BUFFER_BIT);
+ }
+ self->params.egl->glBindFramebuffer(GL_FRAMEBUFFER, 0);
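Note on the Y and UV shaders above: both multiply the sampled RGB value by the `RGBtoYUV` matrix and keep either the `.x` component (luma plane) or the `.yz` components (chroma plane). The same arithmetic can be sketched on the CPU. This is only an illustration: the BT.601 full-range coefficients, the `yuv_f` type and the `rgb_to_yuv` name are assumptions, since the actual matrix comes from `color_format_range_get_transform_matrix`, which is not part of this hunk.

```c
/* Hypothetical CPU-side equivalent of (RGBtoYUV * vec4(pixel.rgb, 1.0)).
   Inputs and outputs are normalized to the 0.0 .. 1.0 range. */
typedef struct {
    float y; /* written to the first destination texture (Y plane) */
    float u; /* written to the second destination texture (UV plane) */
    float v;
} yuv_f;

/* BT.601 full-range coefficients, assumed for illustration only.
   The +0.5 terms are the offset column that the 1.0 in vec4(rgb, 1.0) selects. */
static yuv_f rgb_to_yuv(float r, float g, float b) {
    yuv_f out;
    out.y =  0.299f    * r + 0.587f    * g + 0.114f    * b;
    out.u = -0.168736f * r - 0.331264f * g + 0.5f      * b + 0.5f;
    out.v =  0.5f      * r - 0.418688f * g - 0.081312f * b + 0.5f;
    return out;
}
```

With these coefficients black maps to Y = 0 and U = V = 0.5, which is consistent with `gsr_color_conversion_clear` clearing the second framebuffer to 0.5, 0.5 rather than to zero.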
diff --git a/src/cuda.c b/src/cuda.c
new file mode 100644
index 0000000..6d685b5
--- /dev/null
+++ b/src/cuda.c
@@ -0,0 +1,117 @@
+#include "../include/cuda.h"
+#include "../include/library_loader.h"
+#include <string.h>
+#include <stdio.h>
+#include <dlfcn.h>
+#include <assert.h>
+bool gsr_cuda_load(gsr_cuda *self, Display *display, bool do_overclock) {
+ memset(self, 0, sizeof(gsr_cuda));
+ self->do_overclock = do_overclock;
+ dlerror(); /* clear */
+ void *lib = dlopen("libcuda.so.1", RTLD_LAZY);
+ if(!lib) {
+ lib = dlopen("libcuda.so", RTLD_LAZY);
+ if(!lib) {
+ fprintf(stderr, "gsr error: gsr_cuda_load failed: failed to load libcuda.so/libcuda.so.1, error: %s\n", dlerror());
+ return false;
+ }
+ }
+ dlsym_assign required_dlsym[] = {
+ { (void**)&self->cuInit, "cuInit" },
+ { (void**)&self->cuDeviceGetCount, "cuDeviceGetCount" },
+ { (void**)&self->cuDeviceGet, "cuDeviceGet" },
+ { (void**)&self->cuCtxCreate_v2, "cuCtxCreate_v2" },
+ { (void**)&self->cuCtxDestroy_v2, "cuCtxDestroy_v2" },
+ { (void**)&self->cuCtxPushCurrent_v2, "cuCtxPushCurrent_v2" },
+ { (void**)&self->cuCtxPopCurrent_v2, "cuCtxPopCurrent_v2" },
+ { (void**)&self->cuGetErrorString, "cuGetErrorString" },
+ { (void**)&self->cuMemcpy2D_v2, "cuMemcpy2D_v2" },
+ { (void**)&self->cuMemcpy2DAsync_v2, "cuMemcpy2DAsync_v2" },
+ { (void**)&self->cuStreamSynchronize, "cuStreamSynchronize" },
+ { (void**)&self->cuGraphicsGLRegisterImage, "cuGraphicsGLRegisterImage" },
+ { (void**)&self->cuGraphicsEGLRegisterImage, "cuGraphicsEGLRegisterImage" },
+ { (void**)&self->cuGraphicsResourceSetMapFlags, "cuGraphicsResourceSetMapFlags" },
+ { (void**)&self->cuGraphicsMapResources, "cuGraphicsMapResources" },
+ { (void**)&self->cuGraphicsUnmapResources, "cuGraphicsUnmapResources" },
+ { (void**)&self->cuGraphicsUnregisterResource, "cuGraphicsUnregisterResource" },
+ { (void**)&self->cuGraphicsSubResourceGetMappedArray, "cuGraphicsSubResourceGetMappedArray" },
+ { NULL, NULL }
+ };
+ CUresult res;
+ if(!dlsym_load_list(lib, required_dlsym)) {
+ fprintf(stderr, "gsr error: gsr_cuda_load failed: missing required symbols in libcuda.so/libcuda.so.1\n");
+ goto fail;
+ }
+ res = self->cuInit(0);
+ if(res != CUDA_SUCCESS) {
+ const char *err_str = "unknown";
+ self->cuGetErrorString(res, &err_str);
+ fprintf(stderr, "gsr error: gsr_cuda_load failed: cuInit failed, error: %s (result: %d)\n", err_str, res);
+ goto fail;
+ }
+ int nGpu = 0;
+ self->cuDeviceGetCount(&nGpu);
+ if(nGpu <= 0) {
+ fprintf(stderr, "gsr error: gsr_cuda_load failed: no cuda supported devices found\n");
+ goto fail;
+ }
+ CUdevice cu_dev;
+ res = self->cuDeviceGet(&cu_dev, 0);
+ if(res != CUDA_SUCCESS) {
+ const char *err_str = "unknown";
+ self->cuGetErrorString(res, &err_str);
+ fprintf(stderr, "gsr error: gsr_cuda_load failed: unable to get CUDA device, error: %s (result: %d)\n", err_str, res);
+ goto fail;
+ }
+ res = self->cuCtxCreate_v2(&self->cu_ctx, CU_CTX_SCHED_AUTO, cu_dev);
+ if(res != CUDA_SUCCESS) {
+ const char *err_str = "unknown";
+ self->cuGetErrorString(res, &err_str);
+ fprintf(stderr, "gsr error: gsr_cuda_load failed: unable to create CUDA context, error: %s (result: %d)\n", err_str, res);
+ goto fail;
+ }
+ if(self->do_overclock) {
+ assert(display);
+ if(gsr_overclock_load(&self->overclock, display))
+ gsr_overclock_start(&self->overclock);
+ else
+ fprintf(stderr, "gsr warning: gsr_cuda_load: failed to load xnvctrl, failed to overclock memory transfer rate\n");
+ }
+ self->library = lib;
+ return true;
+ fail:
+ dlclose(lib);
+ memset(self, 0, sizeof(gsr_cuda));
+ return false;
+void gsr_cuda_unload(gsr_cuda *self) {
+ if(self->do_overclock && self->overclock.xnvctrl.library) {
+ gsr_overclock_stop(&self->overclock);
+ gsr_overclock_unload(&self->overclock);
+ }
+ if(self->library) {
+ if(self->cu_ctx) {
+ self->cuCtxDestroy_v2(self->cu_ctx);
+ self->cu_ctx = 0;
+ }
+ dlclose(self->library);
+ }
+ memset(self, 0, sizeof(gsr_cuda));
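Note on `gsr_cuda_load` above: it resolves every CUDA entry point through a NULL-terminated table passed to `dlsym_load_list`, whose definition lives in `library_loader.c` and is not shown in this hunk. A minimal sketch of how such a table-driven loader could work follows. The names `sym_assign` and `load_symbol_list` are hypothetical stand-ins, not the project's own helpers, and the real implementation may differ.

```c
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the dlsym_assign entries used above */
typedef struct {
    void **func;      /* where the resolved address is stored; NULL ends the list */
    const char *name; /* symbol name passed to dlsym */
} sym_assign;

/* Resolves every entry in the list.
   Returns false if at least one symbol could not be found. */
static bool load_symbol_list(void *handle, const sym_assign *list) {
    bool success = true;
    for(size_t i = 0; list[i].func; ++i) {
        *list[i].func = dlsym(handle, list[i].name);
        if(!*list[i].func) {
            fprintf(stderr, "missing symbol: %s\n", list[i].name);
            success = false;
        }
    }
    return success;
}
```

Reporting every missing symbol before failing, instead of stopping at the first one, is a design choice of this sketch; it tends to make driver version mismatches easier to diagnose. On older glibc versions the program may need to be linked with `-ldl`.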
diff --git a/src/cursor.c b/src/cursor.c
new file mode 100644
index 0000000..9825ad2
--- /dev/null
+++ b/src/cursor.c
@@ -0,0 +1,192 @@
+#include "../include/cursor.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+#include <X11/extensions/Xfixes.h>
+#include <X11/extensions/XI2.h>
+#include <X11/extensions/XInput2.h>
+// TODO: Test cursor visibility with XFixesHideCursor
+static bool gsr_cursor_set_from_x11_cursor_image(gsr_cursor *self, XFixesCursorImage *x11_cursor_image, bool *visible) {
+ uint8_t *cursor_data = NULL;
+ uint8_t *out = NULL;
+ *visible = false;
+ if(!x11_cursor_image)
+ goto err;
+ if(!x11_cursor_image->pixels)
+ goto err;
+ self->hotspot.x = x11_cursor_image->xhot;
+ self->hotspot.y = x11_cursor_image->yhot;
+ self->egl->glBindTexture(GL_TEXTURE_2D, self->texture_id);
+ self->size.x = x11_cursor_image->width;
+ self->size.y = x11_cursor_image->height;
+ const unsigned long *pixels = x11_cursor_image->pixels;
+ cursor_data = malloc(self->size.x * self->size.y * 4);
+ if(!cursor_data)
+ goto err;
+ out = cursor_data;
+ /* Un-premultiply alpha */
+ for(int y = 0; y < self->size.y; ++y) {
+ for(int x = 0; x < self->size.x; ++x) {
+ uint32_t pixel = *pixels++;
+ uint8_t *in = (uint8_t*)&pixel;
+ uint8_t alpha = in[3];
+ if(alpha == 0) {
+ alpha = 1;
+ } else {
+ *visible = true;
+ }
+ *out++ = (unsigned)*in++ * 255/alpha;
+ *out++ = (unsigned)*in++ * 255/alpha;
+ *out++ = (unsigned)*in++ * 255/alpha;
+ *out++ = *in++;
+ }
+ }
+ self->egl->glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, self->size.x, self->size.y, 0, GL_RGBA, GL_UNSIGNED_BYTE, cursor_data);
+ free(cursor_data);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+ self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+ XFree(x11_cursor_image);
+ return true;
+ err:
+ self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+ if(x11_cursor_image)
+ XFree(x11_cursor_image);
+ return false;
+static bool xinput_is_supported(Display *dpy, int *xi_opcode) {
+ *xi_opcode = 0;
+ int query_event = 0;
+ int query_error = 0;
+ if(!XQueryExtension(dpy, "XInputExtension", xi_opcode, &query_event, &query_error)) {
+ fprintf(stderr, "gsr error: gsr_cursor_init: X Input extension not available\n");
+ return false;
+ }
+ int major = 2;
+ int minor = 1;
+ int retval = XIQueryVersion(dpy, &major, &minor);
+ if (retval != Success) {
+ fprintf(stderr, "gsr error: gsr_cursor_init: XInput 2.1 is not supported\n");
+ return false;
+ }
+ return true;
+int gsr_cursor_init(gsr_cursor *self, gsr_egl *egl, Display *display) {
+ int x_fixes_error_base = 0;
+ assert(egl);
+ assert(display);
+ memset(self, 0, sizeof(*self));
+ self->egl = egl;
+ self->display = display;
+ self->x_fixes_event_base = 0;
+ if(!XFixesQueryExtension(self->display, &self->x_fixes_event_base, &x_fixes_error_base)) {
+ fprintf(stderr, "gsr error: gsr_cursor_init: your X11 server is missing the XFixes extension\n");
+ gsr_cursor_deinit(self);
+ return -1;
+ }
+ if(!xinput_is_supported(self->display, &self->xi_opcode)) {
+ gsr_cursor_deinit(self);
+ return -1;
+ }
+ unsigned char mask[XIMaskLen(XI_LASTEVENT)];
+ memset(mask, 0, sizeof(mask));
+ XISetMask(mask, XI_RawMotion);
+ XIEventMask xi_masks;
+ xi_masks.deviceid = XIAllMasterDevices;
+ xi_masks.mask_len = sizeof(mask);
+ xi_masks.mask = mask;
+ if(XISelectEvents(self->display, DefaultRootWindow(self->display), &xi_masks, 1) != Success) {
+ fprintf(stderr, "gsr error: gsr_cursor_init: XISelectEvents failed\n");
+ gsr_cursor_deinit(self);
+ return -1;
+ }
+ self->egl->glGenTextures(1, &self->texture_id);
+ XFixesSelectCursorInput(self->display, DefaultRootWindow(self->display), XFixesDisplayCursorNotifyMask);
+ gsr_cursor_set_from_x11_cursor_image(self, XFixesGetCursorImage(self->display), &self->visible);
+ self->cursor_image_set = true;
+ self->cursor_moved = true;
+ return 0;
+void gsr_cursor_deinit(gsr_cursor *self) {
+ if(!self->egl)
+ return;
+ if(self->texture_id) {
+ self->egl->glDeleteTextures(1, &self->texture_id);
+ self->texture_id = 0;
+ }
+ XISelectEvents(self->display, DefaultRootWindow(self->display), NULL, 0);
+ XFixesSelectCursorInput(self->display, DefaultRootWindow(self->display), 0);
+ self->display = NULL;
+ self->egl = NULL;
+bool gsr_cursor_update(gsr_cursor *self, XEvent *xev) {
+ bool updated = false;
+ XGenericEventCookie *cookie = (XGenericEventCookie*)&xev->xcookie;
+ const Bool got_event_data = XGetEventData(self->display, cookie);
+ if(got_event_data && cookie->type == GenericEvent && cookie->extension == self->xi_opcode && cookie->evtype == XI_RawMotion) {
+ updated = true;
+ self->cursor_moved = true;
+ }
+ if(got_event_data)
+ XFreeEventData(self->display, cookie);
+ if(xev->type == self->x_fixes_event_base + XFixesCursorNotify) {
+ XFixesCursorNotifyEvent *cursor_notify_event = (XFixesCursorNotifyEvent*)xev;
+ if(cursor_notify_event->subtype == XFixesDisplayCursorNotify && cursor_notify_event->window == DefaultRootWindow(self->display)) {
+ self->cursor_image_set = false;
+ }
+ }
+ if(!self->cursor_image_set) {
+ self->cursor_image_set = true;
+ gsr_cursor_set_from_x11_cursor_image(self, XFixesGetCursorImage(self->display), &self->visible);
+ updated = true;
+ }
+ return updated;
+void gsr_cursor_tick(gsr_cursor *self, Window relative_to) {
+ if(!self->cursor_moved)
+ return;
+ self->cursor_moved = false;
+ Window dummy_window;
+ int dummy_i;
+ unsigned int dummy_u;
+ XQueryPointer(self->display, relative_to, &dummy_window, &dummy_window, &dummy_i, &dummy_i, &self->position.x, &self->position.y, &dummy_u);
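Note on the "Un-premultiply alpha" loop in `gsr_cursor_set_from_x11_cursor_image` above: each color channel is scaled by `255/alpha`, with an alpha of 0 treated as 1 to avoid a division by zero. The per-channel step can be sketched in isolation. The helper name `unpremultiply_channel` is hypothetical, and the clamp to 255 is an addition of this sketch; the loop above stores the result without clamping, so a channel larger than its alpha would wrap around there.

```c
#include <stdint.h>

/* Converts one premultiplied color channel back to straight (non-premultiplied) alpha.
   An alpha of 0 is treated as 1, mirroring the loop above. */
static uint8_t unpremultiply_channel(uint8_t value, uint8_t alpha) {
    if(alpha == 0)
        alpha = 1;
    const unsigned int result = (unsigned int)value * 255u / alpha;
    /* Assumption of this sketch: clamp instead of letting the value wrap */
    return result > 255u ? 255u : (uint8_t)result;
}
```

For example, a channel value of 64 at alpha 128 becomes 127 (64 * 255 / 128, truncated), and a fully transparent pixel with value 0 stays 0.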
diff --git a/src/egl.c b/src/egl.c
new file mode 100644
index 0000000..552d5f4
--- /dev/null
+++ b/src/egl.c
@@ -0,0 +1,652 @@
+#include "../include/egl.h"
+#include "../include/library_loader.h"
+#include "../include/utils.h"
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <dlfcn.h>
+#include <assert.h>
+#include <wayland-client.h>
+#include <wayland-egl.h>
+#include <unistd.h>
+#include <sys/capability.h>
+// TODO: rename gsr_egl to something else since this includes both egl and eglx and in the future maybe vulkan too
+// TODO: Move this shit to a separate wayland file, and have a separate file for x11.
+static void output_handle_geometry(void *data, struct wl_output *wl_output,
+ int32_t x, int32_t y, int32_t phys_width, int32_t phys_height,
+ int32_t subpixel, const char *make, const char *model,
+ int32_t transform) {
+ (void)wl_output;
+ (void)phys_width;
+ (void)phys_height;
+ (void)subpixel;
+ (void)make;
+ (void)model;
+ gsr_wayland_output *gsr_output = data;
+ gsr_output->pos.x = x;
+ gsr_output->pos.y = y;
+ gsr_output->transform = transform;
+static void output_handle_mode(void *data, struct wl_output *wl_output, uint32_t flags, int32_t width, int32_t height, int32_t refresh) {
+ (void)wl_output;
+ (void)flags;
+ (void)refresh;
+ gsr_wayland_output *gsr_output = data;
+ gsr_output->size.x = width;
+ gsr_output->size.y = height;
+static void output_handle_done(void *data, struct wl_output *wl_output) {
+ (void)data;
+ (void)wl_output;
+static void output_handle_scale(void* data, struct wl_output *wl_output, int32_t factor) {
+ (void)data;
+ (void)wl_output;
+ (void)factor;
+static void output_handle_name(void *data, struct wl_output *wl_output, const char *name) {
+ (void)wl_output;
+ gsr_wayland_output *gsr_output = data;
+ if(gsr_output->name) {
+ free(gsr_output->name);
+ gsr_output->name = NULL;
+ }
+ gsr_output->name = strdup(name);
+static void output_handle_description(void *data, struct wl_output *wl_output, const char *description) {
+ (void)data;
+ (void)wl_output;
+ (void)description;
+static const struct wl_output_listener output_listener = {
+ .geometry = output_handle_geometry,
+ .mode = output_handle_mode,
+ .done = output_handle_done,
+ .scale = output_handle_scale,
+ .name = output_handle_name,
+ .description = output_handle_description,
+static void registry_add_object(void *data, struct wl_registry *registry, uint32_t name, const char *interface, uint32_t version) {
+ (void)version;
+ gsr_egl *egl = data;
+ if (strcmp(interface, "wl_compositor") == 0) {
+ if(egl->wayland.compositor) {
+ wl_compositor_destroy(egl->wayland.compositor);
+ egl->wayland.compositor = NULL;
+ }
+ egl->wayland.compositor = wl_registry_bind(registry, name, &wl_compositor_interface, 1);
+ } else if(strcmp(interface, wl_output_interface.name) == 0) {
+ if(version < 4) {
+ fprintf(stderr, "gsr warning: wl output interface version is < 4, expected >= 4 to capture a monitor. Using KMS capture instead\n");
+ return;
+ }
+ if(egl->wayland.num_outputs == GSR_MAX_OUTPUTS) {
+ fprintf(stderr, "gsr warning: reached maximum outputs (32), ignoring output %u\n", name);
+ return;
+ }
+ gsr_wayland_output *gsr_output = &egl->wayland.outputs[egl->wayland.num_outputs];
+ egl->wayland.num_outputs++;
+ *gsr_output = (gsr_wayland_output) {
+ .wl_name = name,
+ .output = wl_registry_bind(registry, name, &wl_output_interface, 4),
+ .pos = { .x = 0, .y = 0 },
+ .size = { .x = 0, .y = 0 },
+ .transform = 0,
+ .name = NULL,
+ };
+ wl_output_add_listener(gsr_output->output, &output_listener, gsr_output);
+ }
+static void registry_remove_object(void *data, struct wl_registry *registry, uint32_t name) {
+ (void)data;
+ (void)registry;
+ (void)name;
+static struct wl_registry_listener registry_listener = {
+ .global = registry_add_object,
+ .global_remove = registry_remove_object,
+static void reset_cap_nice(void) {
+ cap_t caps = cap_get_proc();
+ if(!caps)
+ return;
+ const cap_value_t cap_to_remove = CAP_SYS_NICE;
+ cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_to_remove, CAP_CLEAR);
+ cap_set_flag(caps, CAP_PERMITTED, 1, &cap_to_remove, CAP_CLEAR);
+ cap_set_proc(caps);
+ cap_free(caps);
+#define GLX_DRAWABLE_TYPE 0x8010
+#define GLX_RENDER_TYPE 0x8011
+#define GLX_RGBA_BIT 0x00000001
+#define GLX_WINDOW_BIT 0x00000001
+#define GLX_PIXMAP_BIT 0x00000002
+#define GLX_TEXTURE_2D_BIT_EXT 0x00000002
+#define GLX_RED_SIZE 8
+#define GLX_GREEN_SIZE 9
+#define GLX_BLUE_SIZE 10
+#define GLX_ALPHA_SIZE 11
+#define GLX_DEPTH_SIZE 12
+#define GLX_RGBA_TYPE 0x8014
+static GLXFBConfig glx_fb_config_choose(gsr_egl *self) {
+ const int glx_visual_attribs[] = {
+ // TODO:
+ None, None
+ };
+ // TODO: Cleanup
+ int c = 0;
+ GLXFBConfig *fb_configs = self->glXChooseFBConfig(self->x11.dpy, DefaultScreen(self->x11.dpy), glx_visual_attribs, &c);
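+ if(!fb_configs)
+ return NULL;
+ /* Copy the first match and release the list returned by glXChooseFBConfig */
+ GLXFBConfig fb_config = c > 0 ? fb_configs[0] : NULL;
+ XFree(fb_configs);
+ return fb_config;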
+// TODO: Create egl context without surface (in other words, x11/wayland agnostic, doesn't require x11/wayland dependency)
+static bool gsr_egl_create_window(gsr_egl *self, bool wayland) {
+ EGLConfig ecfg;
+ int32_t num_config = 0;
+ const int32_t attr[] = {
+ };
+ const int32_t ctxattr[] = {
+ EGL_CONTEXT_PRIORITY_LEVEL_IMG, EGL_CONTEXT_PRIORITY_HIGH_IMG, /* requires cap_sys_nice, ignored otherwise */
+ };
+ if(wayland) {
+ self->wayland.dpy = wl_display_connect(NULL);
+ if(!self->wayland.dpy) {
+ fprintf(stderr, "gsr error: gsr_egl_create_window failed: wl_display_connect failed\n");
+ goto fail;
+ }
+ self->wayland.registry = wl_display_get_registry(self->wayland.dpy); // TODO: Error checking
+ wl_registry_add_listener(self->wayland.registry, &registry_listener, self); // TODO: Error checking
+ // Fetch globals
+ wl_display_roundtrip(self->wayland.dpy);
+ // Fetch wl_output
+ wl_display_roundtrip(self->wayland.dpy);
+ if(!self->wayland.compositor) {
+ fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to find compositor\n");
+ goto fail;
+ }
+ } else {
+ self->x11.window = XCreateWindow(self->x11.dpy, DefaultRootWindow(self->x11.dpy), 0, 0, 16, 16, 0, CopyFromParent, InputOutput, CopyFromParent, 0, NULL);
+ if(!self->x11.window) {
+ fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to create gl window\n");
+ goto fail;
+ }
+ }
+ self->eglBindAPI(EGL_OPENGL_API);
+ self->egl_display = self->eglGetDisplay(self->wayland.dpy ? (EGLNativeDisplayType)self->wayland.dpy : (EGLNativeDisplayType)self->x11.dpy);
+ if(!self->egl_display) {
+ fprintf(stderr, "gsr error: gsr_egl_create_window failed: eglGetDisplay failed\n");
+ goto fail;
+ }
+ if(!self->eglInitialize(self->egl_display, NULL, NULL)) {
+ fprintf(stderr, "gsr error: gsr_egl_create_window failed: eglInitialize failed\n");
+ goto fail;
+ }
+ if(!self->eglChooseConfig(self->egl_display, attr, &ecfg, 1, &num_config) || num_config != 1) {
+ fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to find a matching config\n");
+ goto fail;
+ }
+ self->egl_context = self->eglCreateContext(self->egl_display, ecfg, NULL, ctxattr);
+ if(!self->egl_context) {
+ fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to create egl context\n");
+ goto fail;
+ }
+ if(wayland) {
+ self->wayland.surface = wl_compositor_create_surface(self->wayland.compositor);
+ self->wayland.window = wl_egl_window_create(self->wayland.surface, 16, 16);
+ self->egl_surface = self->eglCreateWindowSurface(self->egl_display, ecfg, (EGLNativeWindowType)self->wayland.window, NULL);
+ } else {
+ self->egl_surface = self->eglCreateWindowSurface(self->egl_display, ecfg, (EGLNativeWindowType)self->x11.window, NULL);
+ }
+ if(!self->egl_surface) {
+ fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to create window surface\n");
+ goto fail;
+ }
+ if(!self->eglMakeCurrent(self->egl_display, self->egl_surface, self->egl_surface, self->egl_context)) {
+ fprintf(stderr, "gsr error: gsr_egl_create_window failed: failed to make egl context current\n");
+ goto fail;
+ }
+ reset_cap_nice();
+ return true;
+ fail:
+ reset_cap_nice();
+ gsr_egl_unload(self);
+ return false;
+static bool gsr_egl_switch_to_glx_context(gsr_egl *self) {
+ // TODO: Cleanup
+ if(self->egl_context) {
+ self->eglMakeCurrent(self->egl_display, NULL, NULL, NULL);
+ self->eglDestroyContext(self->egl_display, self->egl_context);
+ self->egl_context = NULL;
+ }
+ if(self->egl_surface) {
+ self->eglDestroySurface(self->egl_display, self->egl_surface);
+ self->egl_surface = NULL;
+ }
+ if(self->egl_display) {
+ self->eglTerminate(self->egl_display);
+ self->egl_display = NULL;
+ }
+ self->glx_fb_config = glx_fb_config_choose(self);
+ if(!self->glx_fb_config) {
+ fprintf(stderr, "gsr error: gsr_egl_switch_to_glx_context failed: failed to find a suitable fb config\n");
+ goto fail;
+ }
+ // TODO:
+ //self->glx_context = self->glXCreateContextAttribsARB(self->x11.dpy, self->glx_fb_config, NULL, True, context_attrib_list);
+ self->glx_context = self->glXCreateNewContext(self->x11.dpy, self->glx_fb_config, GLX_RGBA_TYPE, NULL, True);
+ if(!self->glx_context) {
+ fprintf(stderr, "gsr error: gsr_egl_switch_to_glx_context failed: failed to create glx context\n");
+ goto fail;
+ }
+ if(!self->glXMakeContextCurrent(self->x11.dpy, self->x11.window, self->x11.window, self->glx_context)) {
+ fprintf(stderr, "gsr error: gsr_egl_switch_to_glx_context failed: failed to make glx context current\n");
+ goto fail;
+ }
+ return true;
+ fail:
+ if(self->glx_context) {
+ self->glXMakeContextCurrent(self->x11.dpy, None, None, NULL);
+ self->glXDestroyContext(self->x11.dpy, self->glx_context);
+ self->glx_context = NULL;
+ self->glx_fb_config = NULL;
+ }
+ return false;
+static bool gsr_egl_load_egl(gsr_egl *self, void *library) {
+ const dlsym_assign required_dlsym[] = {
+ { (void**)&self->eglGetError, "eglGetError" },
+ { (void**)&self->eglGetDisplay, "eglGetDisplay" },
+ { (void**)&self->eglInitialize, "eglInitialize" },
+ { (void**)&self->eglTerminate, "eglTerminate" },
+ { (void**)&self->eglChooseConfig, "eglChooseConfig" },
+ { (void**)&self->eglCreateWindowSurface, "eglCreateWindowSurface" },
+ { (void**)&self->eglCreateContext, "eglCreateContext" },
+ { (void**)&self->eglMakeCurrent, "eglMakeCurrent" },
+ { (void**)&self->eglCreateImage, "eglCreateImage" },
+ { (void**)&self->eglDestroyContext, "eglDestroyContext" },
+ { (void**)&self->eglDestroySurface, "eglDestroySurface" },
+ { (void**)&self->eglDestroyImage, "eglDestroyImage" },
+ { (void**)&self->eglSwapInterval, "eglSwapInterval" },
+ { (void**)&self->eglSwapBuffers, "eglSwapBuffers" },
+ { (void**)&self->eglBindAPI, "eglBindAPI" },
+ { (void**)&self->eglGetProcAddress, "eglGetProcAddress" },
+ { NULL, NULL }
+ };
+ if(!dlsym_load_list(library, required_dlsym)) {
+ fprintf(stderr, "gsr error: gsr_egl_load failed: missing required symbols in libEGL.so.1\n");
+ return false;
+ }
+ return true;
+static bool gsr_egl_proc_load_egl(gsr_egl *self) {
+ self->eglExportDMABUFImageQueryMESA = (FUNC_eglExportDMABUFImageQueryMESA)self->eglGetProcAddress("eglExportDMABUFImageQueryMESA");
+ self->eglExportDMABUFImageMESA = (FUNC_eglExportDMABUFImageMESA)self->eglGetProcAddress("eglExportDMABUFImageMESA");
+ self->glEGLImageTargetTexture2DOES = (FUNC_glEGLImageTargetTexture2DOES)self->eglGetProcAddress("glEGLImageTargetTexture2DOES");
+ self->eglQueryDisplayAttribEXT = (FUNC_eglQueryDisplayAttribEXT)self->eglGetProcAddress("eglQueryDisplayAttribEXT");
+ self->eglQueryDeviceStringEXT = (FUNC_eglQueryDeviceStringEXT)self->eglGetProcAddress("eglQueryDeviceStringEXT");
+ if(!self->glEGLImageTargetTexture2DOES) {
+ fprintf(stderr, "gsr error: gsr_egl_load failed: could not find glEGLImageTargetTexture2DOES\n");
+ return false;
+ }
+ return true;
+static bool gsr_egl_load_glx(gsr_egl *self, void *library) {
+ const dlsym_assign required_dlsym[] = {
+ { (void**)&self->glXGetProcAddress, "glXGetProcAddress" },
+ { (void**)&self->glXChooseFBConfig, "glXChooseFBConfig" },
+ { (void**)&self->glXMakeContextCurrent, "glXMakeContextCurrent" },
+ { (void**)&self->glXCreateNewContext, "glXCreateNewContext" },
+ { (void**)&self->glXDestroyContext, "glXDestroyContext" },
+ { (void**)&self->glXSwapBuffers, "glXSwapBuffers" },
+ { NULL, NULL }
+ };
+ if(!dlsym_load_list(library, required_dlsym)) {
+ fprintf(stderr, "gsr error: gsr_egl_load failed: missing required symbols in libGLX.so.0\n");
+ return false;
+ }
+ self->glXCreateContextAttribsARB = (FUNC_glXCreateContextAttribsARB)self->glXGetProcAddress((const unsigned char*)"glXCreateContextAttribsARB");
+ if(!self->glXCreateContextAttribsARB) {
+ fprintf(stderr, "gsr error: gsr_egl_load_glx failed: could not find glXCreateContextAttribsARB\n");
+ return false;
+ }
+ self->glXSwapIntervalEXT = (FUNC_glXSwapIntervalEXT)self->glXGetProcAddress((const unsigned char*)"glXSwapIntervalEXT");
+ self->glXSwapIntervalMESA = (FUNC_glXSwapIntervalMESA)self->glXGetProcAddress((const unsigned char*)"glXSwapIntervalMESA");
+ self->glXSwapIntervalSGI = (FUNC_glXSwapIntervalSGI)self->glXGetProcAddress((const unsigned char*)"glXSwapIntervalSGI");
+ return true;
+static bool gsr_egl_load_gl(gsr_egl *self, void *library) {
+ const dlsym_assign required_dlsym[] = {
+ { (void**)&self->glGetError, "glGetError" },
+ { (void**)&self->glGetString, "glGetString" },
+ { (void**)&self->glFlush, "glFlush" },
+ { (void**)&self->glFinish, "glFinish" },
+ { (void**)&self->glClear, "glClear" },
+ { (void**)&self->glClearColor, "glClearColor" },
+ { (void**)&self->glGenTextures, "glGenTextures" },
+ { (void**)&self->glDeleteTextures, "glDeleteTextures" },
+ { (void**)&self->glBindTexture, "glBindTexture" },
+ { (void**)&self->glTexParameteri, "glTexParameteri" },
+ { (void**)&self->glTexParameteriv, "glTexParameteriv" },
+ { (void**)&self->glGetTexLevelParameteriv, "glGetTexLevelParameteriv" },
+ { (void**)&self->glTexImage2D, "glTexImage2D" },
+ { (void**)&self->glCopyImageSubData, "glCopyImageSubData" },
+ { (void**)&self->glClearTexImage, "glClearTexImage" },
+ { (void**)&self->glGenFramebuffers, "glGenFramebuffers" },
+ { (void**)&self->glBindFramebuffer, "glBindFramebuffer" },
+ { (void**)&self->glDeleteFramebuffers, "glDeleteFramebuffers" },
+ { (void**)&self->glViewport, "glViewport" },
+ { (void**)&self->glFramebufferTexture2D, "glFramebufferTexture2D" },
+ { (void**)&self->glDrawBuffers, "glDrawBuffers" },
+ { (void**)&self->glCheckFramebufferStatus, "glCheckFramebufferStatus" },
+ { (void**)&self->glBindBuffer, "glBindBuffer" },
+ { (void**)&self->glGenBuffers, "glGenBuffers" },
+ { (void**)&self->glBufferData, "glBufferData" },
+ { (void**)&self->glBufferSubData, "glBufferSubData" },
+ { (void**)&self->glDeleteBuffers, "glDeleteBuffers" },
+ { (void**)&self->glGenVertexArrays, "glGenVertexArrays" },
+ { (void**)&self->glBindVertexArray, "glBindVertexArray" },
+ { (void**)&self->glDeleteVertexArrays, "glDeleteVertexArrays" },
+ { (void**)&self->glCreateProgram, "glCreateProgram" },
+ { (void**)&self->glCreateShader, "glCreateShader" },
+ { (void**)&self->glAttachShader, "glAttachShader" },
+ { (void**)&self->glBindAttribLocation, "glBindAttribLocation" },
+ { (void**)&self->glCompileShader, "glCompileShader" },
+ { (void**)&self->glLinkProgram, "glLinkProgram" },
+ { (void**)&self->glShaderSource, "glShaderSource" },
+ { (void**)&self->glUseProgram, "glUseProgram" },
+ { (void**)&self->glGetProgramInfoLog, "glGetProgramInfoLog" },
+ { (void**)&self->glGetShaderiv, "glGetShaderiv" },
+ { (void**)&self->glGetShaderInfoLog, "glGetShaderInfoLog" },
+ { (void**)&self->glDeleteProgram, "glDeleteProgram" },
+ { (void**)&self->glDeleteShader, "glDeleteShader" },
+ { (void**)&self->glGetProgramiv, "glGetProgramiv" },
+ { (void**)&self->glVertexAttribPointer, "glVertexAttribPointer" },
+ { (void**)&self->glEnableVertexAttribArray, "glEnableVertexAttribArray" },
+ { (void**)&self->glDrawArrays, "glDrawArrays" },
+ { (void**)&self->glEnable, "glEnable" },
+ { (void**)&self->glDisable, "glDisable" },
+ { (void**)&self->glBlendFunc, "glBlendFunc" },
+ { (void**)&self->glGetUniformLocation, "glGetUniformLocation" },
+ { (void**)&self->glUniform1f, "glUniform1f" },
+ { (void**)&self->glUniform2f, "glUniform2f" },
+ { (void**)&self->glDebugMessageCallback, "glDebugMessageCallback" },
+ { (void**)&self->glScissor, "glScissor" },
+ { NULL, NULL }
+ };
+ if(!dlsym_load_list(library, required_dlsym)) {
+ fprintf(stderr, "gsr error: gsr_egl_load failed: missing required symbols in libGL.so.1\n");
+ return false;
+ }
+ return true;
+// #define GL_DEBUG_TYPE_ERROR 0x824C
+// static void debug_callback( unsigned int source,
+// unsigned int type,
+// unsigned int id,
+// unsigned int severity,
+// int length,
+// const char* message,
+// const void* userParam )
+// {
+// (void)source;
+// (void)id;
+// (void)length;
+// (void)userParam;
+// fprintf( stderr, "GL CALLBACK: %s type = 0x%x, severity = 0x%x, message = %s\n",
+// ( type == GL_DEBUG_TYPE_ERROR ? "** GL ERROR **" : "" ),
+// type, severity, message );
+// }
+bool gsr_egl_load(gsr_egl *self, Display *dpy, bool wayland, bool is_monitor_capture) {
+ memset(self, 0, sizeof(gsr_egl));
+ self->x11.dpy = dpy;
+ self->context_type = GSR_GL_CONTEXT_TYPE_EGL;
+ dlerror(); /* clear */
+ self->egl_library = dlopen("libEGL.so.1", RTLD_LAZY);
+ if(!self->egl_library) {
+ fprintf(stderr, "gsr error: gsr_egl_load: failed to load libEGL.so.1, error: %s\n", dlerror());
+ goto fail;
+ }
+ self->glx_library = dlopen("libGLX.so.0", RTLD_LAZY);
+ if(!self->glx_library) {
+ fprintf(stderr, "gsr error: gsr_egl_load: failed to load libGLX.so.0, error: %s\n", dlerror());
+ goto fail;
+ }
+ self->gl_library = dlopen("libGL.so.1", RTLD_LAZY);
+ if(!self->gl_library) {
+ fprintf(stderr, "gsr error: gsr_egl_load: failed to load libGL.so.1, error: %s\n", dlerror());
+ goto fail;
+ }
+ if(!gsr_egl_load_egl(self, self->egl_library))
+ goto fail;
+ if(!gsr_egl_load_glx(self, self->glx_library))
+ goto fail;
+ if(!gsr_egl_load_gl(self, self->gl_library))
+ goto fail;
+ if(!gsr_egl_proc_load_egl(self))
+ goto fail;
+ if(!gsr_egl_create_window(self, wayland))
+ goto fail;
+ if(!gl_get_gpu_info(self, &self->gpu_info))
+ goto fail;
+ if(self->eglQueryDisplayAttribEXT && self->eglQueryDeviceStringEXT) {
+ intptr_t device = 0;
+ if(self->eglQueryDisplayAttribEXT(self->egl_display, EGL_DEVICE_EXT, &device) && device)
+ self->dri_card_path = self->eglQueryDeviceStringEXT((void*)device, EGL_DRM_DEVICE_FILE_EXT);
+ }
+ /* NvFBC requires GLX */
+ if(!wayland && is_monitor_capture && self->gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA) {
+ self->context_type = GSR_GL_CONTEXT_TYPE_GLX;
+ self->dri_card_path = NULL;
+ if(!gsr_egl_switch_to_glx_context(self))
+ goto fail;
+ }
+ self->glEnable(GL_BLEND);
+ //self->glEnable(GL_DEBUG_OUTPUT);
+ //self->glDebugMessageCallback(debug_callback, NULL);
+ return true;
+ fail:
+ gsr_egl_unload(self);
+ return false;
+void gsr_egl_unload(gsr_egl *self) {
+ if(self->egl_context) {
+ self->eglMakeCurrent(self->egl_display, NULL, NULL, NULL);
+ self->eglDestroyContext(self->egl_display, self->egl_context);
+ self->egl_context = NULL;
+ }
+ if(self->egl_surface) {
+ self->eglDestroySurface(self->egl_display, self->egl_surface);
+ self->egl_surface = NULL;
+ }
+ if(self->egl_display) {
+ self->eglTerminate(self->egl_display);
+ self->egl_display = NULL;
+ }
+ if(self->glx_context) {
+ self->glXMakeContextCurrent(self->x11.dpy, None, None, NULL);
+ self->glXDestroyContext(self->x11.dpy, self->glx_context);
+ self->glx_context = NULL;
+ self->glx_fb_config = NULL;
+ }
+ if(self->x11.window) {
+ XDestroyWindow(self->x11.dpy, self->x11.window);
+ self->x11.window = None;
+ }
+ if(self->wayland.window) {
+ wl_egl_window_destroy(self->wayland.window);
+ self->wayland.window = NULL;
+ }
+ if(self->wayland.surface) {
+ wl_surface_destroy(self->wayland.surface);
+ self->wayland.surface = NULL;
+ }
+ for(int i = 0; i < self->wayland.num_outputs; ++i) {
+ if(self->wayland.outputs[i].output) {
+ wl_output_destroy(self->wayland.outputs[i].output);
+ self->wayland.outputs[i].output = NULL;
+ }
+ if(self->wayland.outputs[i].name) {
+ free(self->wayland.outputs[i].name);
+ self->wayland.outputs[i].name = NULL;
+ }
+ }
+ self->wayland.num_outputs = 0;
+ if(self->wayland.compositor) {
+ wl_compositor_destroy(self->wayland.compositor);
+ self->wayland.compositor = NULL;
+ }
+ if(self->wayland.registry) {
+ wl_registry_destroy(self->wayland.registry);
+ self->wayland.registry = NULL;
+ }
+ if(self->wayland.dpy) {
+ wl_display_disconnect(self->wayland.dpy);
+ self->wayland.dpy = NULL;
+ }
+ if(self->egl_library) {
+ dlclose(self->egl_library);
+ self->egl_library = NULL;
+ }
+ if(self->glx_library) {
+ dlclose(self->glx_library);
+ self->glx_library = NULL;
+ }
+ if(self->gl_library) {
+ dlclose(self->gl_library);
+ self->gl_library = NULL;
+ }
+ memset(self, 0, sizeof(gsr_egl));
+void gsr_egl_update(gsr_egl *self) {
+ if(!self->wayland.dpy)
+ return;
+ // TODO: pselect on wl_display_get_fd before doing dispatch
+ wl_display_dispatch(self->wayland.dpy);
diff --git a/include/LibraryLoader.hpp b/src/library_loader.c
index 16dc580..0aeee9b 100644
--- a/include/LibraryLoader.hpp
+++ b/src/library_loader.c
@@ -1,14 +1,10 @@
-#pragma once
+#include "../include/library_loader.h"
#include <dlfcn.h>
+#include <stdbool.h>
#include <stdio.h>
-typedef struct {
- void **func;
- const char *name;
-} dlsym_assign;
-static void* dlsym_print_fail(void *handle, const char *name, bool required) {
+void* dlsym_print_fail(void *handle, const char *name, bool required) {
void *sym = dlsym(handle, name);
char *err_str = dlerror();
@@ -20,7 +16,7 @@ static void* dlsym_print_fail(void *handle, const char *name, bool required) {
/* |dlsyms| should be null terminated */
-static bool dlsym_load_list(void *handle, const dlsym_assign *dlsyms) {
+bool dlsym_load_list(void *handle, const dlsym_assign *dlsyms) {
bool success = true;
for(int i = 0; dlsyms[i].func; ++i) {
*dlsyms[i].func = dlsym_print_fail(handle, dlsyms[i].name, true);
@@ -31,8 +27,8 @@ static bool dlsym_load_list(void *handle, const dlsym_assign *dlsyms) {
/* |dlsyms| should be null terminated */
-static void dlsym_load_list_optional(void *handle, const dlsym_assign *dlsyms) {
+void dlsym_load_list_optional(void *handle, const dlsym_assign *dlsyms) {
for(int i = 0; dlsyms[i].func; ++i) {
*dlsyms[i].func = dlsym_print_fail(handle, dlsyms[i].name, false);
-} \ No newline at end of file
diff --git a/src/main.cpp b/src/main.cpp
index aac777e..bd4be62 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -1,19 +1,13 @@
- Copyright (C) 2020 dec05eba
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program. If not, see <https://www.gnu.org/licenses/>.
+extern "C" {
+#include "../include/capture/nvfbc.h"
+#include "../include/capture/xcomposite_cuda.h"
+#include "../include/capture/xcomposite_vaapi.h"
+#include "../include/capture/kms_vaapi.h"
+#include "../include/capture/kms_cuda.h"
+#include "../include/egl.h"
+#include "../include/utils.h"
+#include "../include/color_conversion.h"
#include <assert.h>
#include <stdio.h>
@@ -26,45 +20,52 @@
#include <map>
#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>
-#include <fcntl.h>
+#include <sys/wait.h>
+#include <libgen.h>
#include "../include/sound.hpp"
-#include "../include/NvFBCLibrary.hpp"
-#include "../include/CudaLibrary.hpp"
-#include "../include/GlLibrary.hpp"
-#include <X11/extensions/Xcomposite.h>
-//#include <X11/Xatom.h>
extern "C" {
#include <libavutil/pixfmt.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
-#include <libavutil/hwcontext.h>
-#include <libavutil/hwcontext_cuda.h>
#include <libavutil/opt.h>
#include <libswresample/swresample.h>
#include <libavutil/avutil.h>
#include <libavutil/time.h>
-extern "C" {
-#include <libavutil/hwcontext.h>
+#include <libavfilter/avfilter.h>
+#include <libavfilter/buffersink.h>
+#include <libavfilter/buffersrc.h>
#include <deque>
#include <future>
+// TODO: If options are not supported then they are returned (allocated) in the options. This should be free'd.
// TODO: Remove LIBAVUTIL_VERSION_MAJOR checks in the future when ubuntu, pop os LTS etc update ffmpeg to >= 5.0
+static const int AUDIO_SAMPLE_RATE = 48000;
static const int VIDEO_STREAM_INDEX = 0;
static thread_local char av_error_buffer[AV_ERROR_MAX_STRING_SIZE];
-static Cuda cuda;
-static GlLibrary gl;
+static void monitor_output_callback_print(const gsr_monitor *monitor, void *userdata) {
+ (void)userdata;
+ fprintf(stderr, " \"%.*s\" (%dx%d+%d+%d)\n", monitor->name_len, monitor->name, monitor->size.x, monitor->size.y, monitor->pos.x, monitor->pos.y);
+typedef struct {
+ const char *output_name;
+} FirstOutputCallback;
+static void get_first_output(const gsr_monitor *monitor, void *userdata) {
+ FirstOutputCallback *first_output = (FirstOutputCallback*)userdata;
+ if(!first_output->output_name)
+ first_output->output_name = strndup(monitor->name, monitor->name_len);
static char* av_error_to_string(int err) {
if(av_strerror(err, av_error_buffer, sizeof(av_error_buffer)) < 0)
@@ -72,30 +73,6 @@ static char* av_error_to_string(int err) {
return av_error_buffer;
-struct ScopedGLXFBConfig {
- ~ScopedGLXFBConfig() {
- if (configs)
- XFree(configs);
- }
- GLXFBConfig *configs = nullptr;
-struct WindowPixmap {
- Pixmap pixmap = None;
- GLXPixmap glx_pixmap = None;
- unsigned int texture_id = 0;
- unsigned int target_texture_id = 0;
- int texture_width = 0;
- int texture_height = 0;
- int texture_real_width = 0;
- int texture_real_height = 0;
- Window composite_window = None;
enum class VideoQuality {
@@ -105,482 +82,235 @@ enum class VideoQuality {
enum class VideoCodec {
- H265
+ AV1,
-static double clock_get_monotonic_seconds() {
- struct timespec ts;
- ts.tv_sec = 0;
- ts.tv_nsec = 0;
- clock_gettime(CLOCK_MONOTONIC, &ts);
- return (double)ts.tv_sec + (double)ts.tv_nsec * 0.000000001;
+enum class AudioCodec {
+ AAC,
-static bool x11_supports_composite_named_window_pixmap(Display *dpy) {
- int extension_major;
- int extension_minor;
- if (!XCompositeQueryExtension(dpy, &extension_major, &extension_minor))
- return false;
+enum class PixelFormat {
+ YUV420,
+ YUV444
- int major_version;
- int minor_version;
- return XCompositeQueryVersion(dpy, &major_version, &minor_version) &&
- (major_version > 0 || minor_version >= 2);
+enum class FramerateMode {
-static int x11_error_handler(Display *dpy, XErrorEvent *ev) {
-#if 0
- char type_str[128];
- XGetErrorText(dpy, ev->type, type_str, sizeof(type_str));
- char major_opcode_str[128];
- XGetErrorText(dpy, ev->type, major_opcode_str, sizeof(major_opcode_str));
- char minor_opcode_str[128];
- XGetErrorText(dpy, ev->type, minor_opcode_str, sizeof(minor_opcode_str));
- fprintf(stderr,
- "X Error of failed request: %s\n"
- "Major opcode of failed request: %d (%s)\n"
- "Minor opcode of failed request: %d (%s)\n"
- "Serial number of failed request: %d\n",
- type_str,
- ev->request_code, major_opcode_str,
- ev->minor_code, minor_opcode_str);
+static int x11_error_handler(Display*, XErrorEvent*) {
return 0;
-static int x11_io_error_handler(Display *dpy) {
+static int x11_io_error_handler(Display*) {
return 0;
-static Window get_compositor_window(Display *display) {
- Window overlay_window = XCompositeGetOverlayWindow(display, DefaultRootWindow(display));
- XCompositeReleaseOverlayWindow(display, DefaultRootWindow(display));
- /*
- Atom xdnd_proxy = XInternAtom(display, "XdndProxy", False);
- if(!xdnd_proxy)
- return None;
- Atom type = None;
- int format = 0;
- unsigned long nitems = 0, after = 0;
- unsigned char *data = nullptr;
- if(XGetWindowProperty(display, overlay_window, xdnd_proxy, 0, 1, False, XA_WINDOW, &type, &format, &nitems, &after, &data) != Success)
- return None;
- fprintf(stderr, "type: %ld, format: %d, num items: %lu\n", type, format, nitems);
- if(type == XA_WINDOW && format == 32 && nitems == 1)
- fprintf(stderr, "Proxy window: %ld\n", *(Window*)data);
- if(data)
- XFree(data);
- */
- Window root_window, parent_window;
- Window *children = nullptr;
- unsigned int num_children = 0;
- if(XQueryTree(display, overlay_window, &root_window, &parent_window, &children, &num_children) == 0)
- return None;
- Window compositor_window = None;
- if(num_children == 1) {
- compositor_window = children[0];
- const int screen_width = XWidthOfScreen(DefaultScreenOfDisplay(display));
- const int screen_height = XHeightOfScreen(DefaultScreenOfDisplay(display));
- XWindowAttributes attr;
- if(!XGetWindowAttributes(display, compositor_window, &attr) || attr.width != screen_width || attr.height != screen_height)
- compositor_window = None;
- }
- if(children)
- XFree(children);
- return compositor_window;
-static void cleanup_window_pixmap(Display *dpy, WindowPixmap &pixmap) {
- if (pixmap.target_texture_id) {
- gl.glDeleteTextures(1, &pixmap.target_texture_id);
- pixmap.target_texture_id = 0;
- }
- if (pixmap.texture_id) {
- gl.glDeleteTextures(1, &pixmap.texture_id);
- pixmap.texture_id = 0;
- pixmap.texture_width = 0;
- pixmap.texture_height = 0;
- pixmap.texture_real_width = 0;
- pixmap.texture_real_height = 0;
- }
- if (pixmap.glx_pixmap) {
- gl.glXDestroyPixmap(dpy, pixmap.glx_pixmap);
- gl.glXReleaseTexImageEXT(dpy, pixmap.glx_pixmap, GLX_FRONT_EXT);
- pixmap.glx_pixmap = None;
- }
- if (pixmap.pixmap) {
- XFreePixmap(dpy, pixmap.pixmap);
- pixmap.pixmap = None;
- }
- if(pixmap.composite_window) {
- XCompositeUnredirectWindow(dpy, pixmap.composite_window, CompositeRedirectAutomatic);
- pixmap.composite_window = None;
- }
-static bool recreate_window_pixmap(Display *dpy, Window window_id,
- WindowPixmap &pixmap, bool fallback_composite_window = true) {
- cleanup_window_pixmap(dpy, pixmap);
- XWindowAttributes attr;
- if (!XGetWindowAttributes(dpy, window_id, &attr)) {
- fprintf(stderr, "Failed to get window attributes\n");
- return false;
- }
- const int pixmap_config[] = {
- None};
- const int pixmap_attribs[] = {GLX_TEXTURE_TARGET_EXT,
- None};
- int c;
- GLXFBConfig *configs = gl.glXChooseFBConfig(dpy, 0, pixmap_config, &c);
- if (!configs) {
- fprintf(stderr, "Failed too choose fb config\n");
- return false;
- }
- ScopedGLXFBConfig scoped_configs;
- scoped_configs.configs = configs;
- bool found = false;
- GLXFBConfig config;
- for (int i = 0; i < c; i++) {
- config = configs[i];
- XVisualInfo *visual = gl.glXGetVisualFromFBConfig(dpy, config);
- if (!visual)
- continue;
- if (attr.depth != visual->depth) {
- XFree(visual);
- continue;
- }
- XFree(visual);
- found = true;
- break;
- }
- if(!found) {
- fprintf(stderr, "No matching fb config found\n");
- return false;
- }
- Pixmap new_window_pixmap = XCompositeNameWindowPixmap(dpy, window_id);
- if (!new_window_pixmap) {
- fprintf(stderr, "Failed to get pixmap for window %ld\n", window_id);
- return false;
- }
- GLXPixmap glx_pixmap = gl.glXCreatePixmap(dpy, config, new_window_pixmap, pixmap_attribs);
- if (!glx_pixmap) {
- fprintf(stderr, "Failed to create glx pixmap\n");
- XFreePixmap(dpy, new_window_pixmap);
- return false;
- }
- pixmap.pixmap = new_window_pixmap;
- pixmap.glx_pixmap = glx_pixmap;
- //glEnable(GL_TEXTURE_2D);
- gl.glGenTextures(1, &pixmap.texture_id);
- gl.glBindTexture(GL_TEXTURE_2D, pixmap.texture_id);
- // glEnable(GL_BLEND);
- gl.glXBindTexImageEXT(dpy, pixmap.glx_pixmap, GLX_FRONT_EXT, NULL);
- gl.glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH,
- &pixmap.texture_width);
- gl.glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT,
- &pixmap.texture_height);
- pixmap.texture_real_width = pixmap.texture_width;
- pixmap.texture_real_height = pixmap.texture_height;
- if(pixmap.texture_width == 0 || pixmap.texture_height == 0) {
- gl.glBindTexture(GL_TEXTURE_2D, 0);
- pixmap.texture_width = attr.width;
- pixmap.texture_height = attr.height;
- pixmap.texture_real_width = pixmap.texture_width;
- pixmap.texture_real_height = pixmap.texture_height;
- if(fallback_composite_window) {
- Window compositor_window = get_compositor_window(dpy);
- if(!compositor_window) {
- fprintf(stderr, "Warning: failed to get texture size. You are probably running an unsupported compositor and recording the selected window doesn't work at the moment. This could also happen if you are trying to record a window with client-side decorations. A black window will be displayed instead. A workaround is to record the whole monitor (which uses NvFBC).\n");
- return false;
- }
- fprintf(stderr, "Warning: failed to get texture size. You are probably trying to record a window with client-side decorations (using GNOME?). Trying to fallback to recording the compositor proxy window\n");
- XCompositeRedirectWindow(dpy, compositor_window, CompositeRedirectAutomatic);
- // TODO: Target texture should be the same size as the target window, not the size of the composite window
- if(recreate_window_pixmap(dpy, compositor_window, pixmap, false)) {
- pixmap.composite_window = compositor_window;
- pixmap.texture_width = attr.width;
- pixmap.texture_height = attr.height;
- return true;
- }
- pixmap.texture_width = attr.width;
- pixmap.texture_height = attr.height;
+static bool video_codec_is_hdr(VideoCodec video_codec) {
+ switch(video_codec) {
+ case VideoCodec::HEVC_HDR:
+ case VideoCodec::AV1_HDR:
+ return true;
+ default:
return false;
- } else {
- fprintf(stderr, "Warning: failed to get texture size. You are probably running an unsupported compositor and recording the selected window doesn't work at the moment. This could also happen if you are trying to record a window with client-side decorations. A black window will be displayed instead. A workaround is to record the whole monitor (which uses NvFBC).\n");
- }
- fprintf(stderr, "texture width: %d, height: %d\n", pixmap.texture_width,
- pixmap.texture_height);
- // Generating this second texture is needed because
- // cuGraphicsGLRegisterImage cant be used with the texture that is mapped
- // directly to the pixmap.
- // TODO: Investigate if it's somehow possible to use the pixmap texture
- // directly, this should improve performance since only less image copy is
- // then needed every frame.
- gl.glGenTextures(1, &pixmap.target_texture_id);
- gl.glBindTexture(GL_TEXTURE_2D, pixmap.target_texture_id);
- gl.glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, pixmap.texture_width,
- pixmap.texture_height, 0, GL_RGB, GL_UNSIGNED_BYTE, NULL);
- unsigned int err2 = gl.glGetError();
- //fprintf(stderr, "error: %d\n", err2);
- // glXBindTexImageEXT(dpy, pixmap.glx_pixmap, GLX_FRONT_EXT, NULL);
- // glGenerateTextureMipmapEXT(glxpixmap, GL_TEXTURE_2D);
- // glGenerateMipmap(GL_TEXTURE_2D);
- // gl.glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE );
- // gl.glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE );
- gl.glBindTexture(GL_TEXTURE_2D, 0);
- return pixmap.texture_id != 0 && pixmap.target_texture_id != 0;
-static Window create_opengl_window(Display *display) {
- const int attr[] = {
- None
- };
- XVisualInfo *visual_info = NULL;
- GLXFBConfig fbconfig = NULL;
- int numfbconfigs = 0;
- GLXFBConfig *fbconfigs = gl.glXChooseFBConfig(display, DefaultScreen(display), attr, &numfbconfigs);
- for(int i = 0; i < numfbconfigs; i++) {
- visual_info = gl.glXGetVisualFromFBConfig(display, fbconfigs[i]);
- if(!visual_info)
- continue;
+struct PacketData {
+ PacketData() {}
+ PacketData(const PacketData&) = delete;
+ PacketData& operator=(const PacketData&) = delete;
- fbconfig = fbconfigs[i];
- break;
+ ~PacketData() {
+ av_free(data.data);
- if(!visual_info) {
- fprintf(stderr, "mgl error: no appropriate visual found\n");
- return -1;
- }
- // TODO: Remove need for 4.2 when copy texture function has been removed
- int context_attribs[] = {
- None
- };
- GLXContext gl_context = gl.glXCreateContextAttribsARB(display, fbconfig, nullptr, True, context_attribs);
- if(!gl_context) {
- fprintf(stderr, "Error: failed to create gl context\n");
- return None;
- }
- Colormap colormap = XCreateColormap(display, DefaultRootWindow(display), visual_info->visual, AllocNone);
- if(!colormap) {
- fprintf(stderr, "Error: failed to create x11 colormap\n");
- gl.glXDestroyContext(display, gl_context);
- }
- XSetWindowAttributes window_attr;
- window_attr.colormap = colormap;
- // TODO: Is there a way to remove the need to create a window?
- Window window = XCreateWindow(display, DefaultRootWindow(display), 0, 0, 1, 1, 0, visual_info->depth, InputOutput, visual_info->visual, CWColormap, &window_attr);
- if(!window) {
- fprintf(stderr, "Error: failed to create gl window\n");
- goto fail;
- }
- if(!gl.glXMakeContextCurrent(display, window, window, gl_context)) {
- fprintf(stderr, "Error: failed to make gl context current\n");
- goto fail;
- }
- return window;
- fail:
- XFreeColormap(display, colormap);
- gl.glXDestroyContext(display, gl_context);
- return None;
-/* TODO: check for glx swap control extension string (GLX_EXT_swap_control, etc) */
-static void set_vertical_sync_enabled(Display *display, Window window, bool enabled) {
- int result = 0;
- if(gl.glXSwapIntervalEXT) {
- gl.glXSwapIntervalEXT(display, window, enabled ? 1 : 0);
- } else if(gl.glXSwapIntervalMESA) {
- result = gl.glXSwapIntervalMESA(enabled ? 1 : 0);
- } else if(gl.glXSwapIntervalSGI) {
- result = gl.glXSwapIntervalSGI(enabled ? 1 : 0);
- } else {
- static int warned = 0;
- if (!warned) {
- warned = 1;
- fprintf(stderr, "Warning: setting vertical sync not supported\n");
- }
- }
- if(result != 0)
- fprintf(stderr, "Warning: setting vertical sync failed\n");
+ AVPacket data;
// |stream| is only required for non-replay mode
-static void receive_frames(AVCodecContext *av_codec_context, int stream_index, AVStream *stream, AVFrame *frame,
+static void receive_frames(AVCodecContext *av_codec_context, int stream_index, AVStream *stream, int64_t pts,
AVFormatContext *av_format_context,
double replay_start_time,
- std::deque<AVPacket> &frame_data_queue,
+ std::deque<std::shared_ptr<PacketData>> &frame_data_queue,
int replay_buffer_size_secs,
bool &frames_erased,
- std::mutex &write_output_mutex) {
+ std::mutex &write_output_mutex,
+ double paused_time_offset) {
for (;;) {
- // TODO: Use av_packet_alloc instead because sizeof(av_packet) might not be future proof(?)
- AVPacket av_packet;
- memset(&av_packet, 0, sizeof(av_packet));
- av_packet.data = NULL;
- av_packet.size = 0;
- int res = avcodec_receive_packet(av_codec_context, &av_packet);
- if (res == 0) { // we have a packet, send the packet to the muxer
- av_packet.stream_index = stream_index;
- av_packet.pts = av_packet.dts = frame->pts;
+ AVPacket *av_packet = av_packet_alloc();
+ if(!av_packet)
+ break;
- if(frame->flags & AV_FRAME_FLAG_DISCARD)
- av_packet.flags |= AV_PKT_FLAG_DISCARD;
+ av_packet->data = NULL;
+ av_packet->size = 0;
+ int res = avcodec_receive_packet(av_codec_context, av_packet);
+ if (res == 0) { // we have a packet, send the packet to the muxer
+ av_packet->stream_index = stream_index;
+ av_packet->pts = pts;
+ av_packet->dts = pts;
std::lock_guard<std::mutex> lock(write_output_mutex);
if(replay_buffer_size_secs != -1) {
- double time_now = clock_get_monotonic_seconds();
+ // TODO: Preallocate all frames data and use those instead.
+ // Why copy the packet? There is a new ffmpeg bug that causes cpu usage to increase over time when
+ // packets are not freed until later. So we copy the packet data, free the packet and then reconstruct
+ // the packet later when we need it, which keeps each packet alive only for a short period.
+ auto new_packet = std::make_shared<PacketData>();
+ new_packet->data = *av_packet;
+ new_packet->data.data = (uint8_t*)av_malloc(av_packet->size);
+ memcpy(new_packet->data.data, av_packet->data, av_packet->size);
+ double time_now = clock_get_monotonic_seconds() - paused_time_offset;
double replay_time_elapsed = time_now - replay_start_time;
- AVPacket new_pack;
- av_packet_move_ref(&new_pack, &av_packet);
- frame_data_queue.push_back(std::move(new_pack));
+ frame_data_queue.push_back(std::move(new_packet));
if(replay_time_elapsed >= replay_buffer_size_secs) {
- av_packet_unref(&frame_data_queue.front());
frames_erased = true;
- av_packet_unref(&av_packet);
} else {
- av_packet_rescale_ts(&av_packet, av_codec_context->time_base, stream->time_base);
- av_packet.stream_index = stream->index;
- int ret = av_interleaved_write_frame(av_format_context, &av_packet);
+ av_packet_rescale_ts(av_packet, av_codec_context->time_base, stream->time_base);
+ av_packet->stream_index = stream->index;
+ // TODO: Is av_interleaved_write_frame needed? Answer: it might be needed for mkv, but don't use it! It causes frames to be inconsistent, skipping and duplicating frames
+ int ret = av_write_frame(av_format_context, av_packet);
if(ret < 0) {
- fprintf(stderr, "Error: Failed to write frame index %d to muxer, reason: %s (%d)\n", av_packet.stream_index, av_error_to_string(ret), ret);
+ fprintf(stderr, "Error: Failed to write frame index %d to muxer, reason: %s (%d)\n", av_packet->stream_index, av_error_to_string(ret), ret);
+ av_packet_free(&av_packet);
} else if (res == AVERROR(EAGAIN)) { // we have no packet
// fprintf(stderr, "No packet!\n");
- av_packet_unref(&av_packet);
+ av_packet_free(&av_packet);
} else if (res == AVERROR_EOF) { // this is the end of the stream
+ av_packet_free(&av_packet);
fprintf(stderr, "End of stream!\n");
- av_packet_unref(&av_packet);
} else {
+ av_packet_free(&av_packet);
fprintf(stderr, "Unexpected error: %d\n", res);
- av_packet_unref(&av_packet);
-static AVCodecContext* create_audio_codec_context(int fps) {
- const AVCodec *codec = avcodec_find_encoder(AV_CODEC_ID_AAC);
+static const char* audio_codec_get_name(AudioCodec audio_codec) {
+ switch(audio_codec) {
+ case AudioCodec::AAC: return "aac";
+ case AudioCodec::OPUS: return "opus";
+ case AudioCodec::FLAC: return "flac";
+ }
+ assert(false);
+ return "";
+static AVCodecID audio_codec_get_id(AudioCodec audio_codec) {
+ switch(audio_codec) {
+ case AudioCodec::AAC: return AV_CODEC_ID_AAC;
+ case AudioCodec::OPUS: return AV_CODEC_ID_OPUS;
+ case AudioCodec::FLAC: return AV_CODEC_ID_FLAC;
+ }
+ assert(false);
+ return AV_CODEC_ID_AAC;
+static AVSampleFormat audio_codec_get_sample_format(AudioCodec audio_codec, const AVCodec *codec, bool mix_audio) {
+ switch(audio_codec) {
+ case AudioCodec::AAC: {
+ return AV_SAMPLE_FMT_FLTP;
+ }
+ case AudioCodec::OPUS: {
+ bool supports_s16 = false;
+ bool supports_flt = false;
+ for(size_t i = 0; codec->sample_fmts && codec->sample_fmts[i] != -1; ++i) {
+ if(codec->sample_fmts[i] == AV_SAMPLE_FMT_S16) {
+ supports_s16 = true;
+ } else if(codec->sample_fmts[i] == AV_SAMPLE_FMT_FLT) {
+ supports_flt = true;
+ }
+ }
+ // Amix only works with float audio
+ if(mix_audio)
+ supports_s16 = false;
+ if(!supports_s16 && !supports_flt) {
+ fprintf(stderr, "Warning: opus audio codec is chosen but your ffmpeg version does not support s16/flt sample format and performance might be slightly worse.\n");
+ fprintf(stderr, " You can either rebuild ffmpeg with libopus instead of the built-in opus, use the flatpak version of gpu screen recorder or record with aac audio codec instead (-ac aac).\n");
+ fprintf(stderr, " Falling back to fltp audio sample format instead.\n");
+ }
+ if(supports_s16)
+ return AV_SAMPLE_FMT_S16;
+ else if(supports_flt)
+ return AV_SAMPLE_FMT_FLT;
+ else
+ return AV_SAMPLE_FMT_FLTP;
+ }
+ case AudioCodec::FLAC: {
+ return AV_SAMPLE_FMT_S32;
+ }
+ }
+ assert(false);
+static int64_t audio_codec_get_get_bitrate(AudioCodec audio_codec) {
+ switch(audio_codec) {
+ case AudioCodec::AAC: return 160000;
+ case AudioCodec::OPUS: return 128000;
+ case AudioCodec::FLAC: return 128000;
+ }
+ assert(false);
+ return 128000;
+static AudioFormat audio_codec_context_get_audio_format(const AVCodecContext *audio_codec_context) {
+ switch(audio_codec_context->sample_fmt) {
+ case AV_SAMPLE_FMT_FLT: return F32;
+ case AV_SAMPLE_FMT_FLTP: return S32;
+ case AV_SAMPLE_FMT_S16: return S16;
+ case AV_SAMPLE_FMT_S32: return S32;
+ default: return S16;
+ }
+static AVSampleFormat audio_format_to_sample_format(const AudioFormat audio_format) {
+ switch(audio_format) {
+ case S16: return AV_SAMPLE_FMT_S16;
+ case S32: return AV_SAMPLE_FMT_S32;
+ case F32: return AV_SAMPLE_FMT_FLT;
+ }
+ assert(false);
+ return AV_SAMPLE_FMT_S16;
+static AVCodecContext* create_audio_codec_context(int fps, AudioCodec audio_codec, bool mix_audio, int audio_bitrate) {
+ (void)fps;
+ const AVCodec *codec = avcodec_find_encoder(audio_codec_get_id(audio_codec));
if (!codec) {
- fprintf(
- stderr,
- "Error: Could not find aac encoder\n");
- exit(1);
+ fprintf(stderr, "Error: Could not find %s audio encoder\n", audio_codec_get_name(audio_codec));
+ _exit(1);
AVCodecContext *codec_context = avcodec_alloc_context3(codec);
assert(codec->type == AVMEDIA_TYPE_AUDIO);
- /*
- codec_context->sample_fmt = (*codec)->sample_fmts
- ? (*codec)->sample_fmts[0]
- */
- codec_context->codec_id = AV_CODEC_ID_AAC;
- codec_context->sample_fmt = AV_SAMPLE_FMT_FLTP;
- //codec_context->bit_rate = 64000;
- codec_context->sample_rate = 48000;
- codec_context->profile = FF_PROFILE_AAC_LOW;
+ codec_context->codec_id = codec->id;
+ codec_context->sample_fmt = audio_codec_get_sample_format(audio_codec, codec, mix_audio);
+ codec_context->bit_rate = audio_bitrate == 0 ? audio_codec_get_get_bitrate(audio_codec) : audio_bitrate;
+ codec_context->sample_rate = AUDIO_SAMPLE_RATE;
+ if(audio_codec == AudioCodec::AAC)
+ codec_context->profile = FF_PROFILE_AAC_LOW;
codec_context->channel_layout = AV_CH_LAYOUT_STEREO;
codec_context->channels = 2;
@@ -590,9 +320,7 @@ static AVCodecContext* create_audio_codec_context(int fps) {
codec_context->time_base.num = 1;
codec_context->time_base.den = codec_context->sample_rate;
- codec_context->framerate.num = fps;
- codec_context->framerate.den = 1;
+ codec_context->thread_count = 1;
codec_context->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
return codec_context;
@@ -600,8 +328,8 @@ static AVCodecContext* create_audio_codec_context(int fps) {
static AVCodecContext *create_video_codec_context(AVPixelFormat pix_fmt,
VideoQuality video_quality,
- int record_width, int record_height,
- int fps, const AVCodec *codec, bool is_livestream) {
+ int fps, const AVCodec *codec, bool is_livestream, gsr_gpu_vendor vendor, FramerateMode framerate_mode,
+ bool hdr, gsr_color_range color_range, float keyint) {
AVCodecContext *codec_context = avcodec_alloc_context3(codec);
@@ -609,31 +337,39 @@ static AVCodecContext *create_video_codec_context(AVPixelFormat pix_fmt,
assert(codec->type == AVMEDIA_TYPE_VIDEO);
codec_context->codec_id = codec->id;
- codec_context->width = record_width & ~1;
- codec_context->height = record_height & ~1;
// Timebase: This is the fundamental unit of time (in seconds) in terms
// of which frame timestamps are represented. For fixed-fps content,
// timebase should be 1/framerate and timestamp increments should be
// identical to 1
codec_context->time_base.num = 1;
- codec_context->time_base.den = fps;
+ codec_context->time_base.den = framerate_mode == FramerateMode::CONSTANT ? fps : AV_TIME_BASE;
codec_context->framerate.num = fps;
codec_context->framerate.den = 1;
codec_context->sample_aspect_ratio.num = 0;
codec_context->sample_aspect_ratio.den = 0;
- // High values reeduce file size but increases time it takes to seek
+ // Higher values reduce file size but increase the time it takes to seek
if(is_livestream) {
codec_context->flags2 |= AV_CODEC_FLAG2_FAST;
//codec_context->gop_size = std::numeric_limits<int>::max();
//codec_context->keyint_min = std::numeric_limits<int>::max();
- codec_context->gop_size = fps * 2;
+ codec_context->gop_size = fps * keyint;
} else {
- codec_context->gop_size = fps * 2;
+ codec_context->gop_size = fps * keyint;
codec_context->max_b_frames = 0;
codec_context->pix_fmt = pix_fmt;
- codec_context->color_range = AVCOL_RANGE_JPEG;
+ codec_context->color_range = color_range == GSR_COLOR_RANGE_LIMITED ? AVCOL_RANGE_MPEG : AVCOL_RANGE_JPEG;
+ if(hdr) {
+ codec_context->color_primaries = AVCOL_PRI_BT2020;
+ codec_context->color_trc = AVCOL_TRC_SMPTE2084;
+ codec_context->colorspace = AVCOL_SPC_BT2020_NCL;
+ } else {
+ codec_context->color_primaries = AVCOL_PRI_BT709;
+ codec_context->color_trc = AVCOL_TRC_BT709;
+ codec_context->colorspace = AVCOL_SPC_BT709;
+ }
+ //codec_context->chroma_sample_location = AVCHROMA_LOC_CENTER;
if(codec->id == AV_CODEC_ID_HEVC)
codec_context->codec_tag = MKTAG('h', 'v', 'c', '1');
switch(video_quality) {
@@ -681,74 +417,199 @@ static AVCodecContext *create_video_codec_context(AVPixelFormat pix_fmt,
codec_context->bit_rate = 0;
+ // 8 bit / 10 bit = 80%, and increase it even more
+ const float quality_multiply = hdr ? (8.0f/10.0f * 0.7f) : 1.0f;
+ if(vendor != GSR_GPU_VENDOR_NVIDIA) {
+ switch(video_quality) {
+ case VideoQuality::MEDIUM:
+ codec_context->global_quality = 180 * quality_multiply;
+ break;
+ case VideoQuality::HIGH:
+ codec_context->global_quality = 140 * quality_multiply;
+ break;
+ case VideoQuality::VERY_HIGH:
+ codec_context->global_quality = 120 * quality_multiply;
+ break;
+ case VideoQuality::ULTRA:
+ codec_context->global_quality = 100 * quality_multiply;
+ break;
+ }
+ }
+ av_opt_set_int(codec_context->priv_data, "b_ref_mode", 0, 0);
+ //av_opt_set_int(codec_context->priv_data, "cbr", true, 0);
+ if(vendor != GSR_GPU_VENDOR_NVIDIA) {
+ // TODO: More options, better options
+ //codec_context->bit_rate = codec_context->width * codec_context->height;
+ av_opt_set(codec_context->priv_data, "rc_mode", "CQP", 0);
+ //codec_context->global_quality = 4;
+ //codec_context->compression_level = 2;
+ }
+ //av_opt_set(codec_context->priv_data, "bsf", "hevc_metadata=colour_primaries=9:transfer_characteristics=16:matrix_coefficients=9", 0);
//codec_context->rc_max_rate = codec_context->bit_rate;
//codec_context->rc_min_rate = codec_context->bit_rate;
//codec_context->rc_buffer_size = codec_context->bit_rate / 10;
+ // TODO: Do this when not using cqp
+ //codec_context->rc_initial_buffer_occupancy = codec_context->bit_rate * 1000;
codec_context->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
return codec_context;
-static const AVCodec* find_h264_encoder() {
- const AVCodec *codec = avcodec_find_encoder_by_name("h264_nvenc");
+static bool vaapi_create_codec_context(AVCodecContext *video_codec_context, const char *card_path) {
+ char render_path[128];
+ if(!gsr_card_path_get_render_path(card_path, render_path)) {
+ fprintf(stderr, "gsr error: failed to get /dev/dri/renderDXXX file from %s\n", card_path);
+ return false;
+ }
+ AVBufferRef *device_ctx;
+ if(av_hwdevice_ctx_create(&device_ctx, AV_HWDEVICE_TYPE_VAAPI, render_path, NULL, 0) < 0) {
+ fprintf(stderr, "Error: Failed to create hardware device context\n");
+ return false;
+ }
+ AVBufferRef *frame_context = av_hwframe_ctx_alloc(device_ctx);
+ if(!frame_context) {
+ fprintf(stderr, "Error: Failed to create hwframe context\n");
+ av_buffer_unref(&device_ctx);
+ return false;
+ }
+ AVHWFramesContext *hw_frame_context =
+ (AVHWFramesContext *)frame_context->data;
+ hw_frame_context->width = video_codec_context->width;
+ hw_frame_context->height = video_codec_context->height;
+ hw_frame_context->sw_format = AV_PIX_FMT_NV12;
+ hw_frame_context->format = video_codec_context->pix_fmt;
+ hw_frame_context->device_ref = device_ctx;
+ hw_frame_context->device_ctx = (AVHWDeviceContext*)device_ctx->data;
+ //hw_frame_context->initial_pool_size = 1;
+ if (av_hwframe_ctx_init(frame_context) < 0) {
+ fprintf(stderr, "Error: Failed to initialize hardware frame context "
+ "(note: ffmpeg version needs to be > 4.0)\n");
+ av_buffer_unref(&device_ctx);
+ //av_buffer_unref(&frame_context);
+ return false;
+ }
+ video_codec_context->hw_device_ctx = av_buffer_ref(device_ctx);
+ video_codec_context->hw_frames_ctx = av_buffer_ref(frame_context);
+ return true;
+static bool check_if_codec_valid_for_hardware(const AVCodec *codec, gsr_gpu_vendor vendor, const char *card_path) {
+ // Do not use AV_PIX_FMT_CUDA because we don't want to do a full check with a hardware context
+ AVCodecContext *codec_context = create_video_codec_context(vendor == GSR_GPU_VENDOR_NVIDIA ? AV_PIX_FMT_YUV420P : AV_PIX_FMT_VAAPI, VideoQuality::VERY_HIGH, 60, codec, false, vendor, FramerateMode::CONSTANT, false, GSR_COLOR_RANGE_LIMITED, 2);
+ if(!codec_context)
+ return false;
+ codec_context->width = 512;
+ codec_context->height = 512;
+ if(vendor != GSR_GPU_VENDOR_NVIDIA) {
+ if(!vaapi_create_codec_context(codec_context, card_path)) {
+ avcodec_free_context(&codec_context);
+ return false;
+ }
+ }
+ bool success = false;
+ success = avcodec_open2(codec_context, codec_context->codec, NULL) == 0;
+ if(codec_context->hw_device_ctx)
+ av_buffer_unref(&codec_context->hw_device_ctx);
+ if(codec_context->hw_frames_ctx)
+ av_buffer_unref(&codec_context->hw_frames_ctx);
+ avcodec_free_context(&codec_context);
+ return success;
+static const AVCodec* find_h264_encoder(gsr_gpu_vendor vendor, const char *card_path) {
+ const AVCodec *codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "h264_nvenc" : "h264_vaapi");
- codec = avcodec_find_encoder_by_name("nvenc_h264");
+ codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "nvenc_h264" : "vaapi_h264");
+ if(!codec)
+ return nullptr;
static bool checked = false;
+ static bool checked_success = true;
if(!checked) {
checked = true;
- // Do not use AV_PIX_FMT_CUDA because we dont want to do full check with hardware context
- AVCodecContext *codec_context = create_video_codec_context(AV_PIX_FMT_YUV420P, VideoQuality::VERY_HIGH, 1920, 1080, 60, codec, false);
- if(codec_context) {
- if (avcodec_open2(codec_context, codec_context->codec, NULL) < 0) {
- avcodec_free_context(&codec_context);
- return nullptr;
- }
- avcodec_free_context(&codec_context);
- }
+ if(!check_if_codec_valid_for_hardware(codec, vendor, card_path))
+ checked_success = false;
- return codec;
+ return checked_success ? codec : nullptr;
-static const AVCodec* find_h265_encoder() {
- const AVCodec *codec = avcodec_find_encoder_by_name("hevc_nvenc");
+static const AVCodec* find_hevc_encoder(gsr_gpu_vendor vendor, const char *card_path) {
+ const AVCodec *codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "hevc_nvenc" : "hevc_vaapi");
- codec = avcodec_find_encoder_by_name("nvenc_hevc");
+ codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "nvenc_hevc" : "vaapi_hevc");
return nullptr;
static bool checked = false;
+ static bool checked_success = true;
if(!checked) {
checked = true;
- // Do not use AV_PIX_FMT_CUDA because we dont want to do full check with hardware context
- AVCodecContext *codec_context = create_video_codec_context(AV_PIX_FMT_YUV420P, VideoQuality::VERY_HIGH, 1920, 1080, 60, codec, false);
- if(codec_context) {
- if (avcodec_open2(codec_context, codec_context->codec, NULL) < 0) {
- avcodec_free_context(&codec_context);
- return nullptr;
- }
- avcodec_free_context(&codec_context);
- }
+ if(!check_if_codec_valid_for_hardware(codec, vendor, card_path))
+ checked_success = false;
+ }
+ return checked_success ? codec : nullptr;
+static const AVCodec* find_av1_encoder(gsr_gpu_vendor vendor, const char *card_path) {
+ // Work around a bug with av1 on nvidia in older ffmpeg versions that crashes the whole application
+ // when avcodec_open2 is called with av1_nvenc
+ return nullptr;
+ }
+ const AVCodec *codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "av1_nvenc" : "av1_vaapi");
+ if(!codec)
+ codec = avcodec_find_encoder_by_name(vendor == GSR_GPU_VENDOR_NVIDIA ? "nvenc_av1" : "vaapi_av1");
+ if(!codec)
+ return nullptr;
+ static bool checked = false;
+ static bool checked_success = true;
+ if(!checked) {
+ checked = true;
+ if(!check_if_codec_valid_for_hardware(codec, vendor, card_path))
+ checked_success = false;
- return codec;
+ return checked_success ? codec : nullptr;
-static AVFrame* open_audio(AVCodecContext *audio_codec_context) {
+static void open_audio(AVCodecContext *audio_codec_context) {
+ AVDictionary *options = nullptr;
+ av_dict_set(&options, "strict", "experimental", 0);
int ret;
- ret = avcodec_open2(audio_codec_context, audio_codec_context->codec, nullptr);
+ ret = avcodec_open2(audio_codec_context, audio_codec_context->codec, &options);
if(ret < 0) {
fprintf(stderr, "failed to open codec, reason: %s\n", av_error_to_string(ret));
- exit(1);
+ _exit(1);
+static AVFrame* create_audio_frame(AVCodecContext *audio_codec_context) {
AVFrame *frame = av_frame_alloc();
if(!frame) {
fprintf(stderr, "failed to allocate audio frame\n");
- exit(1);
+ _exit(1);
+ frame->sample_rate = audio_codec_context->sample_rate;
frame->nb_samples = audio_codec_context->frame_size;
frame->format = audio_codec_context->sample_fmt;
@@ -758,204 +619,323 @@ static AVFrame* open_audio(AVCodecContext *audio_codec_context) {
av_channel_layout_copy(&frame->ch_layout, &audio_codec_context->ch_layout);
- ret = av_frame_get_buffer(frame, 0);
+ int ret = av_frame_get_buffer(frame, 0);
if(ret < 0) {
fprintf(stderr, "failed to allocate audio data buffers, reason: %s\n", av_error_to_string(ret));
- exit(1);
+ _exit(1);
return frame;
-static AVBufferRef* dummy_hw_frame_init(int size) {
- return av_buffer_alloc(size);
-static AVBufferRef* dummy_hw_frame_init(size_t size) {
- return av_buffer_alloc(size);
-static void open_video(AVCodecContext *codec_context,
- WindowPixmap &window_pixmap, AVBufferRef **device_ctx,
- CUgraphicsResource *cuda_graphics_resource, CUcontext cuda_context, bool use_nvfbc, VideoQuality video_quality, bool is_livestream, bool very_old_gpu) {
- int ret;
- *device_ctx = av_hwdevice_ctx_alloc(AV_HWDEVICE_TYPE_CUDA);
- if(!*device_ctx) {
- fprintf(stderr, "Error: Failed to create hardware device context\n");
- exit(1);
- }
- AVHWDeviceContext *hw_device_context = (AVHWDeviceContext *)(*device_ctx)->data;
- AVCUDADeviceContext *cuda_device_context = (AVCUDADeviceContext *)hw_device_context->hwctx;
- cuda_device_context->cuda_ctx = cuda_context;
- if(av_hwdevice_ctx_init(*device_ctx) < 0) {
- fprintf(stderr, "Error: Failed to create hardware device context\n");
- exit(1);
- }
- AVBufferRef *frame_context = av_hwframe_ctx_alloc(*device_ctx);
- if (!frame_context) {
- fprintf(stderr, "Error: Failed to create hwframe context\n");
- exit(1);
- }
- AVHWFramesContext *hw_frame_context =
- (AVHWFramesContext *)frame_context->data;
- hw_frame_context->width = codec_context->width;
- hw_frame_context->height = codec_context->height;
- hw_frame_context->sw_format = AV_PIX_FMT_0RGB32;
- hw_frame_context->format = codec_context->pix_fmt;
- hw_frame_context->device_ref = *device_ctx;
- hw_frame_context->device_ctx = (AVHWDeviceContext *)(*device_ctx)->data;
- if(use_nvfbc) {
- hw_frame_context->pool = av_buffer_pool_init(1, dummy_hw_frame_init);
- hw_frame_context->initial_pool_size = 1;
- }
- if (av_hwframe_ctx_init(frame_context) < 0) {
- fprintf(stderr, "Error: Failed to initialize hardware frame context "
- "(note: ffmpeg version needs to be > 4.0\n");
- exit(1);
- }
- codec_context->hw_device_ctx = *device_ctx;
- codec_context->hw_frames_ctx = frame_context;
- bool supports_p4 = false;
- bool supports_p6 = false;
+static void open_video(AVCodecContext *codec_context, VideoQuality video_quality, bool very_old_gpu, gsr_gpu_vendor vendor, PixelFormat pixel_format, bool hdr) {
+ (void)very_old_gpu;
+ AVDictionary *options = nullptr;
+ // 8 bit / 10 bit = 80%
+ const float qp_multiply = hdr ? 8.0f/10.0f : 1.0f;
+ if(vendor == GSR_GPU_VENDOR_NVIDIA) {
+ // Disable setting the preset since some nvidia gpus (such as the Nvidia RTX A2000) can't handle it well and encoding performance drops greatly (from more than 60 fps to less than 45 fps)
+ #if 0
+ bool supports_p4 = false;
+ bool supports_p5 = false;
+ const AVOption *opt = nullptr;
+ while((opt = av_opt_next(codec_context->priv_data, opt))) {
+ if(opt->type == AV_OPT_TYPE_CONST) {
+ if(strcmp(opt->name, "p4") == 0)
+ supports_p4 = true;
+ else if(strcmp(opt->name, "p5") == 0)
+ supports_p5 = true;
+ }
+ }
+ #endif
- const AVOption *opt = nullptr;
- while((opt = av_opt_next(codec_context->priv_data, opt))) {
- if(opt->type == AV_OPT_TYPE_CONST) {
- if(strcmp(opt->name, "p4") == 0)
- supports_p4 = true;
- else if(strcmp(opt->name, "p6") == 0)
- supports_p6 = true;
+ if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+ switch(video_quality) {
+ case VideoQuality::MEDIUM:
+ av_dict_set_int(&options, "qp", 37 * qp_multiply, 0);
+ break;
+ case VideoQuality::HIGH:
+ av_dict_set_int(&options, "qp", 32 * qp_multiply, 0);
+ break;
+ case VideoQuality::VERY_HIGH:
+ av_dict_set_int(&options, "qp", 28 * qp_multiply, 0);
+ break;
+ case VideoQuality::ULTRA:
+ av_dict_set_int(&options, "qp", 24 * qp_multiply, 0);
+ break;
+ }
+ } else if(codec_context->codec_id == AV_CODEC_ID_H264) {
+ switch(video_quality) {
+ case VideoQuality::MEDIUM:
+ av_dict_set_int(&options, "qp", 34 * qp_multiply, 0);
+ break;
+ case VideoQuality::HIGH:
+ av_dict_set_int(&options, "qp", 30 * qp_multiply, 0);
+ break;
+ case VideoQuality::VERY_HIGH:
+ av_dict_set_int(&options, "qp", 26 * qp_multiply, 0);
+ break;
+ case VideoQuality::ULTRA:
+ av_dict_set_int(&options, "qp", 22 * qp_multiply, 0);
+ break;
+ }
+ } else {
+ switch(video_quality) {
+ case VideoQuality::MEDIUM:
+ av_dict_set_int(&options, "qp", 37 * qp_multiply, 0);
+ break;
+ case VideoQuality::HIGH:
+ av_dict_set_int(&options, "qp", 32 * qp_multiply, 0);
+ break;
+ case VideoQuality::VERY_HIGH:
+ av_dict_set_int(&options, "qp", 28 * qp_multiply, 0);
+ break;
+ case VideoQuality::ULTRA:
+ av_dict_set_int(&options, "qp", 24 * qp_multiply, 0);
+ break;
+ }
- }
- AVDictionary *options = nullptr;
- if(very_old_gpu) {
- switch(video_quality) {
- case VideoQuality::MEDIUM:
- av_dict_set_int(&options, "qp", 37, 0);
- break;
- case VideoQuality::HIGH:
- av_dict_set_int(&options, "qp", 32, 0);
- break;
- case VideoQuality::VERY_HIGH:
- av_dict_set_int(&options, "qp", 27, 0);
- break;
- case VideoQuality::ULTRA:
- av_dict_set_int(&options, "qp", 21, 0);
- break;
+ #if 0
+ if(!supports_p4 && !supports_p5)
+ fprintf(stderr, "Info: your ffmpeg version is outdated. It's recommended that you use the flatpak version of gpu-screen-recorder instead, which you can find at https://flathub.org/apps/details/com.dec05eba.gpu_screen_recorder\n");
+ //if(is_livestream) {
+ // av_dict_set_int(&options, "zerolatency", 1, 0);
+ // //av_dict_set(&options, "preset", "llhq", 0);
+ //}
+ // I want to use a good preset for the gpu but all gpus prefer different
+ // presets. Nvidia and ffmpeg used to support "hq" preset that chose the best preset for the gpu
+ // with pretty good performance but you now have to choose p1-p7, which are gpu agnostic and on
+ // older gpus p5-p7 slow the gpu down to a crawl...
+ // "hq" is now just an alias for p7 in ffmpeg :(
+ // TODO: Temporarily disable because of stuttering?
+ // TODO: Preset is set to p5 for now but it should ideally be p6 or p7.
+ // This change is needed because certain window (or monitor?) sizes, such as 971x780, cause encoding to freeze
+ // when using the h264 codec. This is a new(?) nvidia driver bug.
+ if(very_old_gpu)
+ av_dict_set(&options, "preset", supports_p4 ? "p4" : "medium", 0);
+ else
+ av_dict_set(&options, "preset", supports_p5 ? "p5" : "slow", 0);
+ #endif
+ av_dict_set(&options, "tune", "hq", 0);
+ av_dict_set(&options, "rc", "constqp", 0);
+ if(codec_context->codec_id == AV_CODEC_ID_H264) {
+ switch(pixel_format) {
+ case PixelFormat::YUV420:
+ av_dict_set(&options, "profile", "high", 0);
+ break;
+ case PixelFormat::YUV444:
+ av_dict_set(&options, "profile", "high444p", 0);
+ break;
+ }
+ } else if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+ switch(pixel_format) {
+ case PixelFormat::YUV420:
+ av_dict_set(&options, "rgb_mode", "yuv420", 0);
+ break;
+ case PixelFormat::YUV444:
+ av_dict_set(&options, "rgb_mode", "yuv444", 0);
+ break;
+ }
+ } else {
+ //av_dict_set(&options, "profile", "main10", 0);
+ //av_dict_set(&options, "pix_fmt", "yuv420p16le", 0);
+ if(hdr) {
+ av_dict_set(&options, "profile", "main10", 0);
+ } else {
+ av_dict_set(&options, "profile", "main", 0);
+ }
} else {
- switch(video_quality) {
- case VideoQuality::MEDIUM:
- av_dict_set_int(&options, "qp", 40, 0);
- break;
- case VideoQuality::HIGH:
- av_dict_set_int(&options, "qp", 35, 0);
- break;
- case VideoQuality::VERY_HIGH:
- av_dict_set_int(&options, "qp", 30, 0);
- break;
- case VideoQuality::ULTRA:
- av_dict_set_int(&options, "qp", 24, 0);
- break;
+ if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+ // Using global_quality option
+ } else if(codec_context->codec_id == AV_CODEC_ID_H264) {
+ switch(video_quality) {
+ case VideoQuality::MEDIUM:
+ av_dict_set_int(&options, "qp", 34 * qp_multiply, 0);
+ break;
+ case VideoQuality::HIGH:
+ av_dict_set_int(&options, "qp", 30 * qp_multiply, 0);
+ break;
+ case VideoQuality::VERY_HIGH:
+ av_dict_set_int(&options, "qp", 26 * qp_multiply, 0);
+ break;
+ case VideoQuality::ULTRA:
+ av_dict_set_int(&options, "qp", 22 * qp_multiply, 0);
+ break;
+ }
+ } else {
+ switch(video_quality) {
+ case VideoQuality::MEDIUM:
+ av_dict_set_int(&options, "qp", 37 * qp_multiply, 0);
+ break;
+ case VideoQuality::HIGH:
+ av_dict_set_int(&options, "qp", 32 * qp_multiply, 0);
+ break;
+ case VideoQuality::VERY_HIGH:
+ av_dict_set_int(&options, "qp", 28 * qp_multiply, 0);
+ break;
+ case VideoQuality::ULTRA:
+ av_dict_set_int(&options, "qp", 24 * qp_multiply, 0);
+ break;
+ }
- }
- if(!supports_p4 && !supports_p6) {
- fprintf(stderr, "Info: your ffmpeg version is outdated. It's recommended that you use the flatpak version of gpu-screen-recorder version instead, which you can find at https://flathub.org/apps/details/com.dec05eba.gpu_screen_recorder\n");
+ // TODO: More quality options
+ av_dict_set(&options, "rc_mode", "CQP", 0);
+ //av_dict_set_int(&options, "low_power", 1, 0);
+ if(codec_context->codec_id == AV_CODEC_ID_H264) {
+ av_dict_set(&options, "profile", "high", 0);
+ // Removed because it causes stutter in games for some people
+ //av_dict_set_int(&options, "quality", 5, 0); // quality preset
+ } else if(codec_context->codec_id == AV_CODEC_ID_AV1) {
+ av_dict_set(&options, "profile", "main", 0); // TODO: use professional instead?
+ av_dict_set(&options, "tier", "main", 0);
+ } else {
+ if(hdr) {
+ av_dict_set(&options, "profile", "main10", 0);
+ av_dict_set(&options, "sei", "hdr", 0);
+ } else {
+ av_dict_set(&options, "profile", "main", 0);
+ }
+ }
- //if(is_livestream) {
- // av_dict_set_int(&options, "zerolatency", 1, 0);
- // //av_dict_set(&options, "preset", "llhq", 0);
- //}
- // Fuck nvidia and ffmpeg, I want to use a good preset for the gpu but all gpus prefer different
- // presets. Nvidia and ffmpeg used to support "hq" preset that chose the best preset for the gpu
- // with pretty good performance but you now have to choose p1-p7, which are gpu agnostic and on
- // older gpus p5-p7 slow the gpu down to a crawl...
- // "hq" is now just an alias for p7 in ffmpeg :(
- // TODO: Temporary disable because of stuttering?
- if(very_old_gpu)
- av_dict_set(&options, "preset", supports_p4 ? "p4" : "medium", 0);
- else
- av_dict_set(&options, "preset", supports_p6 ? "p6" : "slow", 0);
- av_dict_set(&options, "tune", "hq", 0);
- av_dict_set(&options, "rc", "constqp", 0);
+ if(codec_context->codec_id == AV_CODEC_ID_H264) {
+ av_dict_set(&options, "coder", "cabac", 0); // TODO: cavlc is faster than cabac but has worse compression. Which to use?
+ }
- if(codec_context->codec_id == AV_CODEC_ID_H264)
- av_dict_set(&options, "profile", "high", 0);
+ av_dict_set(&options, "strict", "experimental", 0);
- ret = avcodec_open2(codec_context, codec_context->codec, &options);
+ int ret = avcodec_open2(codec_context, codec_context->codec, &options);
if (ret < 0) {
- fprintf(stderr, "Error: Could not open video codec: %s\n",
- "blabla"); // av_err2str(ret));
- exit(1);
- }
- if(window_pixmap.target_texture_id != 0) {
- CUresult res;
- CUcontext old_ctx;
- res = cuda.cuCtxPopCurrent_v2(&old_ctx);
- res = cuda.cuCtxPushCurrent_v2(cuda_context);
- res = cuda.cuGraphicsGLRegisterImage(
- cuda_graphics_resource, window_pixmap.target_texture_id, GL_TEXTURE_2D,
- // cuda.cuGraphicsUnregisterResource(*cuda_graphics_resource);
- if (res != CUDA_SUCCESS) {
- const char *err_str;
- cuda.cuGetErrorString(res, &err_str);
- fprintf(stderr,
- "Error: cuda.cuGraphicsGLRegisterImage failed, error %s, texture "
- "id: %u\n",
- err_str, window_pixmap.target_texture_id);
- exit(1);
- }
- res = cuda.cuCtxPopCurrent_v2(&old_ctx);
+ fprintf(stderr, "Error: Could not open video codec: %s\n", av_error_to_string(ret));
+ _exit(1);
-static void close_video(AVStream *video_stream, AVFrame *frame) {
- // avcodec_close(video_stream->codec);
- // av_frame_free(&frame);
+static void usage_header() {
+ const bool inside_flatpak = getenv("FLATPAK_ID") != NULL;
+ const char *program_name = inside_flatpak ? "flatpak run --command=gpu-screen-recorder com.dec05eba.gpu_screen_recorder" : "gpu-screen-recorder";
+ fprintf(stderr, "usage: %s -w <window_id|monitor|focused> [-c <container_format>] [-s WxH] -f <fps> [-a <audio_input>] [-q <quality>] [-r <replay_buffer_size_sec>] [-k h264|hevc|hevc_hdr|av1|av1_hdr] [-ac aac|opus|flac] [-ab <bitrate>] [-oc yes|no] [-fm cfr|vfr|content] [-cr limited|full] [-v yes|no] [-h|--help] [-o <output_file>] [-mf yes|no] [-sc <script_path>] [-cursor yes|no] [-keyint <value>]\n", program_name);
-static void usage() {
- fprintf(stderr, "usage: gpu-screen-recorder -w <window_id> [-c <container_format>] -f <fps> [-a <audio_input>...] [-q <quality>] [-r <replay_buffer_size_sec>] [-o <output_file>]\n");
+static void usage_full() {
+ const bool inside_flatpak = getenv("FLATPAK_ID") != NULL;
+ const char *program_name = inside_flatpak ? "flatpak run --command=gpu-screen-recorder com.dec05eba.gpu_screen_recorder" : "gpu-screen-recorder";
+ usage_header();
+ fprintf(stderr, "\n");
fprintf(stderr, "OPTIONS:\n");
- fprintf(stderr, " -w Window to record or a display, \"screen\" or \"screen-direct\". The display is the display name in xrandr and if \"screen\" or \"screen-direct\" is selected then all displays are recorded and they are recorded in h265 (aka hevc)."
- "\"screen-direct\" skips one texture copy for fullscreen applications so it may lead to better performance and it works with VRR monitors when recording fullscreen application but may break some applications, such as mpv in fullscreen mode. Recording a display requires a gpu with NvFBC support.\n");
- fprintf(stderr, " -s The size (area) to record at in the format WxH, for example 1920x1080. Usually you want to set this to the size of the window. Optional, by default the size of the window (which is passed to -w). This option is only supported when recording a window, not a screen/monitor.\n");
- fprintf(stderr, " -c Container format for output file, for example mp4, or flv. Only required if no output file is specified or if recording in replay buffer mode. If an output file is specified and -c is not used then the container format is determined from the output filename extension.\n");
- fprintf(stderr, " -f Framerate to record at.\n");
- fprintf(stderr, " -a Audio device to record from (pulse audio device). Can be specified multiple times. Each time this is specified a new audio track is added for the specified audio device. A name can be given to the audio input device by prefixing the audio input with <name>/, for example \"dummy/alsa_output.pci-0000_00_1b.0.analog-stereo.monitor\". Optional, no audio track is added by default.\n");
- fprintf(stderr, " -q Video quality. Should be either 'medium', 'high', 'very_high' or 'ultra'. 'high' is the recommended option when live streaming or when you have a slower harddrive. Optional, set to 'very_high' be default.\n");
- fprintf(stderr, " -r Replay buffer size in seconds. If this is set, then only the last seconds as set by this option will be stored"
- " and the video will only be saved when the gpu-screen-recorder is closed. This feature is similar to Nvidia's instant replay feature."
- " This option has be between 5 and 1200. Note that the replay buffer size will not always be precise, because of keyframes. Optional, disabled by default.\n");
- fprintf(stderr, " -k Codec to use. Should be either 'auto', 'h264' or 'h265'. Defaults to 'auto' which defaults to 'h265' unless recording at a higher resolution than 60. Forcefully set to 'h264' if -c is 'flv'.\n");
- fprintf(stderr, " -o The output file path. If omitted then the encoded data is sent to stdout. Required in replay mode (when using -r). In replay mode this has to be an existing directory instead of a file.\n");
+ fprintf(stderr, " -w Window id to record, a display (monitor name), \"screen\", \"screen-direct-force\" or \"focused\".\n");
+ fprintf(stderr, " If this is \"screen\" or \"screen-direct-force\" then all monitors are recorded.\n");
+ fprintf(stderr, " \"screen-direct-force\" is not recommended unless you use a VRR (G-SYNC) monitor on Nvidia X11 and you are aware that using this option can cause games to freeze/crash or other issues because of Nvidia driver issues.\n");
+ fprintf(stderr, " \"screen-direct-force\" option is only available on Nvidia X11. VRR works without this option on other systems.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -c Container format for output file, for example mp4, or flv. Only required if no output file is specified or if recording in replay buffer mode.\n");
+ fprintf(stderr, " If an output file is specified and -c is not used then the container format is determined from the output filename extension.\n");
+ fprintf(stderr, " Only containers that support h264, hevc or av1 are supported, which means that only mp4, mkv, flv (and some others) are supported.\n");
+ fprintf(stderr, " WebM is not supported yet (most hardware doesn't support WebM video encoding).\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -s The size (area) to record at in the format WxH, for example 1920x1080. This option is only supported (and required) when -w is \"focused\".\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -f Frame rate to record at. Recording will only capture frames at this target frame rate.\n");
+ fprintf(stderr, " For constant frame rate mode this option is the frame rate every frame will be captured at and if the capture frame rate is below this target frame rate then the frames will be duplicated.\n");
+ fprintf(stderr, " For variable frame rate mode this option is the max frame rate and if the capture frame rate is below this target frame rate then frames will not be duplicated.\n");
+ fprintf(stderr, " Content frame rate is similar to variable frame rate mode, except the frame rate will match the frame rate of the captured content when possible, but not capturing above the frame rate set in this -f option.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -a Audio device to record from (pulse audio device). Can be specified multiple times. Each time this is specified a new audio track is added for the specified audio device.\n");
+ fprintf(stderr, " A name can be given to the audio input device by prefixing the audio input with <name>/, for example \"dummy/alsa_output.pci-0000_00_1b.0.analog-stereo.monitor\".\n");
+ fprintf(stderr, " Multiple audio devices can be merged into one audio track by using \"|\" as a separator into one -a argument, for example: -a \"alsa_output1|alsa_output2\".\n");
+ fprintf(stderr, " If the audio device is an empty string then the audio device is ignored.\n");
+ fprintf(stderr, " Optional, no audio track is added by default.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -q Video quality. Should be either 'medium', 'high', 'very_high' or 'ultra'. 'high' is the recommended option when live streaming or when you have a slower harddrive.\n");
+ fprintf(stderr, " Optional, set to 'very_high' by default.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -r Replay buffer size in seconds. If this is set, then only the last seconds as set by this option will be stored\n");
+ fprintf(stderr, " and the video will only be saved when the gpu-screen-recorder is closed. This feature is similar to Nvidia's instant replay feature.\n");
+ fprintf(stderr, " This option has to be between 5 and 1200. Note that the replay buffer size will not always be precise, because of keyframes. Optional, disabled by default.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -k Video codec to use. Should be either 'auto', 'h264', 'hevc', 'av1', 'hevc_hdr' or 'av1_hdr'. Defaults to 'auto' which defaults to 'h264'.\n");
+ fprintf(stderr, " Forcefully set to 'h264' if the file container type is 'flv'.\n");
+ fprintf(stderr, " The 'hevc_hdr' and 'av1_hdr' options are not available on X11.\n");
+ fprintf(stderr, " Note: hdr metadata is not included in the video when recording with 'hevc_hdr'/'av1_hdr' because of bugs in AMD, Intel and NVIDIA drivers (amazin', they are all bugged).\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -ac Audio codec to use. Should be either 'aac', 'opus' or 'flac'. Defaults to 'opus' for .mp4/.mkv files, otherwise defaults to 'aac'.\n");
+ fprintf(stderr, " 'opus' and 'flac' are only supported by .mp4/.mkv files. 'opus' is recommended for best performance and smallest audio size.\n");
+ fprintf(stderr, " The flac audio codec option is disabled at the moment because of a temporary issue.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -ab Audio bitrate to use. Optional, by default the bitrate is 128000 for opus and flac and 160000 for aac.\n");
+ fprintf(stderr, " If this is set to 0 then it's the same as if it's absent, in which case the bitrate is determined automatically depending on the audio codec.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -oc Overclock memory transfer rate to the maximum performance level. This only applies to NVIDIA on X11 and exists to overcome a bug in NVIDIA driver where performance level\n");
+ fprintf(stderr, " is dropped when you record a game. Only needed if you are recording a game that is bottlenecked by GPU. The same issue exists on Wayland but overclocking is not possible on Wayland.\n");
+ fprintf(stderr, " Works only if you have \"Coolbits\" set to \"12\" in NVIDIA X settings, see README for more information. Note! use at your own risk! Optional, disabled by default.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -fm Framerate mode. Should be either 'cfr' (constant frame rate), 'vfr' (variable frame rate) or 'content'. Defaults to 'vfr'.\n");
+ fprintf(stderr, " 'vfr' is recommended for recording since it causes fewer issues under very high system load, but some applications such as video editors may not support it properly.\n");
+ fprintf(stderr, " 'content' is currently only supported when recording a single window, on X11. The 'content' option matches the recording frame rate to the captured content.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -cr Color range. Should be either 'limited' (aka mpeg) or 'full' (aka jpeg). Defaults to 'limited'.\n");
+ fprintf(stderr, " Limited color range means that colors are in range 16-235 (4112-60395 for hdr) while full color range means that colors are in range 0-255 (0-65535 for hdr).\n");
+ fprintf(stderr, " Note that some buggy video players (such as vlc) are unable to correctly display videos in full color range.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -v Prints fps updates once per second. Optional, set to 'yes' by default.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -h, --help\n");
+ fprintf(stderr, " Show this help.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -mf Organise replays in folders based on the current date.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -sc Run a script on the saved video file (non-blocking). The first argument to the script is the filepath to the saved video file and the second argument is the recording type (either \"regular\" or \"replay\").\n");
+ fprintf(stderr, " Not applicable for live streams.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " -cursor\n");
+ fprintf(stderr, " Record cursor. Defaults to 'yes'.\n");
+ fprintf(stderr, " -keyint\n");
+ fprintf(stderr, " Specifies the keyframe interval in seconds, the max amount of time to wait to generate a keyframe. Keyframes can be generated more often than this.\n");
+ fprintf(stderr, " This also affects seeking in the video and may affect how the replay video is cut. If this is set to 10 for example then you can only seek in 10-second chunks in the video.\n");
+ fprintf(stderr, " Setting this to a higher value reduces the video file size if you are ok with the previously described downside. This option is expected to be a floating point number.\n");
+ fprintf(stderr, " By default this value is set to 2.0.\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, " --list-supported-video-codecs\n");
+ fprintf(stderr, " Lists supported video codecs and exits. Prints h264, hevc, hevc_hdr, av1 and av1_hdr (if supported).\n");
+ fprintf(stderr, "\n");
+ //fprintf(stderr, " -pixfmt The pixel format to use for the output video. yuv420 is the most common format and is best supported, but the color is compressed, so colors can look washed out and certain colors of text can look bad. Use yuv444 for no color compression, but the video may not work everywhere and it may not work with hardware video decoding. Optional, defaults to yuv420\n");
+ fprintf(stderr, " -o The output file path. If omitted then the encoded data is sent to stdout. Required in replay mode (when using -r).\n");
+ fprintf(stderr, " In replay mode this has to be a directory instead of a file.\n");
+ fprintf(stderr, " The directory containing the file is created (recursively) if it doesn't already exist.\n");
+ fprintf(stderr, "\n");
fprintf(stderr, "NOTES:\n");
- fprintf(stderr, " Send signal SIGINT (Ctrl+C) to gpu-screen-recorder to stop and save the recording (when not using replay mode).\n");
- fprintf(stderr, " Send signal SIGUSR1 (killall -SIGUSR1 gpu-screen-recorder) to gpu-screen-recorder to save a replay.\n");
- exit(1);
+ fprintf(stderr, " Send signal SIGINT to gpu-screen-recorder (Ctrl+C, or killall -SIGINT gpu-screen-recorder) to stop and save the recording. When in replay mode this stops recording without saving.\n");
+ fprintf(stderr, " Send signal SIGUSR1 to gpu-screen-recorder (killall -SIGUSR1 gpu-screen-recorder) to save a replay (when in replay mode).\n");
+ fprintf(stderr, " Send signal SIGUSR2 to gpu-screen-recorder (killall -SIGUSR2 gpu-screen-recorder) to pause/unpause recording. Only applicable and useful when recording (not streaming nor replay).\n");
+ fprintf(stderr, "\n");
+ fprintf(stderr, "EXAMPLES:\n");
+ fprintf(stderr, " %s -w screen -f 60 -a \"$(pactl get-default-sink).monitor\" -o \"$HOME/Videos/video.mp4\"\n", program_name);
+ fprintf(stderr, " %s -w screen -f 60 -a \"$(pactl get-default-sink).monitor|$(pactl get-default-source)\" -o \"$HOME/Videos/video.mp4\"\n", program_name);
+ fprintf(stderr, " %s -w screen -f 60 -a \"$(pactl get-default-sink).monitor\" -c mkv -r 60 -o \"$HOME/Videos\"\n", program_name);
+ //fprintf(stderr, " gpu-screen-recorder -w screen -f 60 -q ultra -pixfmt yuv444 -o video.mp4\n");
+ _exit(1);
+static void usage() {
+ usage_header();
+ _exit(1);
static sig_atomic_t running = 1;
static sig_atomic_t save_replay = 0;
+static sig_atomic_t toggle_pause = 0;
-static void int_handler(int) {
+static void stop_handler(int) {
running = 0;
@@ -963,37 +943,35 @@ static void save_replay_handler(int) {
save_replay = 1;
-struct Arg {
- std::vector<const char*> values;
- bool optional = false;
- bool list = false;
- const char* value() const {
- if(values.empty())
- return nullptr;
- return values.front();
- }
+static void toggle_pause_handler(int) {
+ toggle_pause = 1;
static bool is_hex_num(char c) {
return (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f') || (c >= '0' && c <= '9');
static bool contains_non_hex_number(const char *str) {
+ bool hex_start = false;
size_t len = strlen(str);
if(len >= 2 && memcmp(str, "0x", 2) == 0) {
str += 2;
len -= 2;
+ hex_start = true;
+ bool is_hex = false;
for(size_t i = 0; i < len; ++i) {
char c = str[i];
if(c == '\0')
return false;
return true;
+ if((c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f'))
+ is_hex = true;
- return false;
+ return is_hex && !hex_start;
static std::string get_date_str() {
@@ -1004,11 +982,27 @@ static std::string get_date_str() {
return str;
+static std::string get_date_only_str() {
+ char str[128];
+ time_t now = time(NULL);
+ struct tm *t = localtime(&now);
+ strftime(str, sizeof(str)-1, "%Y-%m-%d", t);
+ return str;
+static std::string get_time_only_str() {
+ char str[128];
+ time_t now = time(NULL);
+ struct tm *t = localtime(&now);
+ strftime(str, sizeof(str)-1, "%H-%M-%S", t);
+ return str;
static AVStream* create_stream(AVFormatContext *av_format_context, AVCodecContext *codec_context) {
AVStream *stream = avformat_new_stream(av_format_context, nullptr);
if (!stream) {
fprintf(stderr, "Error: Could not allocate stream\n");
- exit(1);
+ _exit(1);
stream->id = av_format_context->nb_streams - 1;
stream->time_base = codec_context->time_base;
@@ -1016,23 +1010,125 @@ static AVStream* create_stream(AVFormatContext *av_format_context, AVCodecContex
return stream;
-struct AudioTrack {
- AVCodecContext *codec_context = nullptr;
- AVFrame *frame = nullptr;
- AVStream *stream = nullptr;
+static void run_recording_saved_script_async(const char *script_file, const char *video_file, const char *type) {
+ char script_file_full[PATH_MAX];
+ script_file_full[0] = '\0';
+ if(!realpath(script_file, script_file_full)) {
+ fprintf(stderr, "Error: script file not found: %s\n", script_file);
+ return;
+ }
+ const char *args[6];
+ const bool inside_flatpak = getenv("FLATPAK_ID") != NULL;
+ if(inside_flatpak) {
+ args[0] = "flatpak-spawn";
+ args[1] = "--host";
+ args[2] = script_file_full;
+ args[3] = video_file;
+ args[4] = type;
+ args[5] = NULL;
+ } else {
+ args[0] = script_file_full;
+ args[1] = video_file;
+ args[2] = type;
+ args[3] = NULL;
+ }
+ pid_t pid = fork();
+ if(pid == -1) {
+ perror(script_file_full);
+ return;
+ } else if(pid == 0) { // child
+ setsid();
+ signal(SIGHUP, SIG_IGN);
+ pid_t second_child = fork();
+ if(second_child == 0) { // child
+ execvp(args[0], (char* const*)args);
+ perror(script_file_full);
+ _exit(127);
+ } else if(second_child != -1) { // parent
+ _exit(0);
+ }
+ } else { // parent
+ waitpid(pid, NULL, 0);
+ }
+static double audio_codec_get_desired_delay(AudioCodec audio_codec, int fps) {
+ const double fps_inv = 1.0 / (double)fps;
+ const double base = 0.01 + 1.0/165.0;
+ switch(audio_codec) {
+ case AudioCodec::OPUS:
+ return std::max(0.0, base - fps_inv);
+ case AudioCodec::AAC:
+ return std::max(0.0, (base + 0.008) * 2.0 - fps_inv);
+ case AudioCodec::FLAC:
+ // TODO: Test
+ return std::max(0.0, base - fps_inv);
+ }
+ assert(false);
+ return std::max(0.0, base - fps_inv);
+struct AudioDevice {
SoundDevice sound_device;
+ AudioInput audio_input;
+ AVFilterContext *src_filter_ctx = nullptr;
+ AVFrame *frame = nullptr;
std::thread thread; // TODO: Instead of having a thread for each track, have one thread for all threads and read the data with non-blocking read
+// TODO: Cleanup
+struct AudioTrack {
+ AVCodecContext *codec_context = nullptr;
+ AVStream *stream = nullptr;
+ std::vector<AudioDevice> audio_devices;
+ AVFilterGraph *graph = nullptr;
+ AVFilterContext *sink = nullptr;
int stream_index = 0;
- AudioInput audio_input;
+ int64_t pts = 0;
static std::future<void> save_replay_thread;
-static std::vector<AVPacket> save_replay_packets;
+static std::vector<std::shared_ptr<PacketData>> save_replay_packets;
static std::string save_replay_output_filepath;
-static void save_replay_async(AVCodecContext *video_codec_context, int video_stream_index, std::vector<AudioTrack> &audio_tracks, const std::deque<AVPacket> &frame_data_queue, bool frames_erased, std::string output_dir, const char *container_format, const std::string &file_extension, std::mutex &write_output_mutex) {
+static int create_directory_recursive(char *path) {
+ int path_len = strlen(path);
+ char *p = path;
+ char *end = path + path_len;
+ for(;;) {
+ char *slash_p = strchr(p, '/');
+ // Skips first '/', we don't want to try and create the root directory
+ if(slash_p == path) {
+ ++p;
+ continue;
+ }
+ if(!slash_p)
+ slash_p = end;
+ char prev_char = *slash_p;
+ *slash_p = '\0';
+ int err = mkdir(path, S_IRWXU);
+ *slash_p = prev_char;
+ if(err == -1 && errno != EEXIST)
+ return err;
+ if(slash_p == end)
+ break;
+ else
+ p = slash_p + 1;
+ }
+ return 0;
+static void save_replay_async(AVCodecContext *video_codec_context, int video_stream_index, std::vector<AudioTrack> &audio_tracks, std::deque<std::shared_ptr<PacketData>> &frame_data_queue, bool frames_erased, std::string output_dir, const char *container_format, const std::string &file_extension, std::mutex &write_output_mutex, bool make_folders) {
@@ -1044,7 +1140,7 @@ static void save_replay_async(AVCodecContext *video_codec_context, int video_str
std::lock_guard<std::mutex> lock(write_output_mutex);
start_index = (size_t)-1;
for(size_t i = 0; i < frame_data_queue.size(); ++i) {
- const AVPacket &av_packet = frame_data_queue[i];
+ const AVPacket &av_packet = frame_data_queue[i]->data;
if((av_packet.flags & AV_PKT_FLAG_KEY) && av_packet.stream_index == video_stream_index) {
start_index = i;
@@ -1055,11 +1151,11 @@ static void save_replay_async(AVCodecContext *video_codec_context, int video_str
if(frames_erased) {
- video_pts_offset = frame_data_queue[start_index].pts;
+ video_pts_offset = frame_data_queue[start_index]->data.pts;
// Find the next audio packet to use as audio pts offset
for(size_t i = start_index; i < frame_data_queue.size(); ++i) {
- const AVPacket &av_packet = frame_data_queue[i];
+ const AVPacket &av_packet = frame_data_queue[i]->data;
if(av_packet.stream_index != video_stream_index) {
audio_pts_offset = av_packet.pts;
@@ -1071,18 +1167,23 @@ static void save_replay_async(AVCodecContext *video_codec_context, int video_str
for(size_t i = 0; i < frame_data_queue.size(); ++i) {
- av_packet_ref(&save_replay_packets[i], &frame_data_queue[i]);
+ save_replay_packets[i] = frame_data_queue[i];
- save_replay_output_filepath = output_dir + "/Replay_" + get_date_str() + "." + file_extension;
+ if (make_folders) {
+ std::string output_folder = output_dir + '/' + get_date_only_str();
+ create_directory_recursive(&output_folder[0]);
+ save_replay_output_filepath = output_folder + "/Replay_" + get_time_only_str() + "." + file_extension;
+ } else {
+ create_directory_recursive(&output_dir[0]);
+ save_replay_output_filepath = output_dir + "/Replay_" + get_date_str() + "." + file_extension;
+ }
save_replay_thread = std::async(std::launch::async, [video_stream_index, container_format, start_index, video_pts_offset, audio_pts_offset, video_codec_context, &audio_tracks]() mutable {
AVFormatContext *av_format_context;
avformat_alloc_output_context2(&av_format_context, nullptr, container_format, nullptr);
- av_format_context->flags |= AVFMT_FLAG_GENPTS;
- av_format_context->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
AVStream *video_stream = create_stream(av_format_context, video_codec_context);
avcodec_parameters_from_context(video_stream->codecpar, video_codec_context);
@@ -1100,14 +1201,27 @@ static void save_replay_async(AVCodecContext *video_codec_context, int video_str
- ret = avformat_write_header(av_format_context, nullptr);
+ AVDictionary *options = nullptr;
+ av_dict_set(&options, "strict", "experimental", 0);
+ ret = avformat_write_header(av_format_context, &options);
if (ret < 0) {
fprintf(stderr, "Error occurred when writing header to output file: %s\n", av_error_to_string(ret));
for(size_t i = start_index; i < save_replay_packets.size(); ++i) {
- AVPacket &av_packet = save_replay_packets[i];
+ // TODO: Check if successful
+ AVPacket av_packet;
+ memset(&av_packet, 0, sizeof(av_packet));
+ //av_packet_from_data(av_packet, save_replay_packets[i]->data.data, save_replay_packets[i]->data.size);
+ av_packet.data = save_replay_packets[i]->data.data;
+ av_packet.size = save_replay_packets[i]->data.size;
+ av_packet.stream_index = save_replay_packets[i]->data.stream_index;
+ av_packet.pts = save_replay_packets[i]->data.pts;
+ av_packet.dts = save_replay_packets[i]->data.pts;
+ av_packet.flags = save_replay_packets[i]->data.flags;
+ //av_packet.duration = save_replay_packets[i]->data.duration;
AVStream *stream = video_stream;
AVCodecContext *codec_context = video_codec_context;
@@ -1127,9 +1241,11 @@ static void save_replay_async(AVCodecContext *video_codec_context, int video_str
av_packet.stream_index = stream->index;
av_packet_rescale_ts(&av_packet, codec_context->time_base, stream->time_base);
- int ret = av_interleaved_write_frame(av_format_context, &av_packet);
+ ret = av_write_frame(av_format_context, &av_packet);
if(ret < 0)
fprintf(stderr, "Error: Failed to write frame index %d to muxer, reason: %s (%d)\n", stream->index, av_error_to_string(ret), ret);
+ //av_packet_free(&av_packet);
if (av_write_trailer(av_format_context) != 0)
@@ -1137,6 +1253,7 @@ static void save_replay_async(AVCodecContext *video_codec_context, int video_str
+ av_dict_free(&options);
for(AudioTrack &audio_track : audio_tracks) {
audio_track.stream = nullptr;
@@ -1144,15 +1261,34 @@ static void save_replay_async(AVCodecContext *video_codec_context, int video_str
-static AudioInput parse_audio_input_arg(const char *str) {
- AudioInput audio_input;
- audio_input.name = str;
- const size_t index = audio_input.name.find('/');
- if(index != std::string::npos) {
- audio_input.description = audio_input.name.substr(0, index);
- audio_input.name.erase(audio_input.name.begin(), audio_input.name.begin() + index + 1);
+static void split_string(const std::string &str, char delimiter, std::function<bool(const char*,size_t)> callback) {
+ size_t index = 0;
+ while(index < str.size()) {
+ size_t end_index = str.find(delimiter, index);
+ if(end_index == std::string::npos)
+ end_index = str.size();
+ if(!callback(&str[index], end_index - index))
+ break;
+ index = end_index + 1;
- return audio_input;
+static std::vector<AudioInput> parse_audio_input_arg(const char *str) {
+ std::vector<AudioInput> audio_inputs;
+ split_string(str, '|', [&audio_inputs](const char *sub, size_t size) {
+ AudioInput audio_input;
+ audio_input.name.assign(sub, size);
+ const size_t index = audio_input.name.find('/');
+ if(index != std::string::npos) {
+ audio_input.description = audio_input.name.substr(0, index);
+ audio_input.name.erase(audio_input.name.begin(), audio_input.name.begin() + index + 1);
+ }
+ audio_inputs.push_back(std::move(audio_input));
+ return true;
+ });
+ return audio_inputs;
// TODO: Does this match all livestreaming cases?
@@ -1166,13 +1302,391 @@ static bool is_livestream_path(const char *str) {
return false;
+// TODO: Proper cleanup
+static int init_filter_graph(AVCodecContext *audio_codec_context, AVFilterGraph **graph, AVFilterContext **sink, std::vector<AVFilterContext*> &src_filter_ctx, size_t num_sources) {
+ char ch_layout[64];
+ int err = 0;
+ ch_layout[0] = '\0';
+ AVFilterGraph *filter_graph = avfilter_graph_alloc();
+ if (!filter_graph) {
+ fprintf(stderr, "Unable to create filter graph.\n");
+ }
+ for(size_t i = 0; i < num_sources; ++i) {
+ const AVFilter *abuffer = avfilter_get_by_name("abuffer");
+ if (!abuffer) {
+ fprintf(stderr, "Could not find the abuffer filter.\n");
+ }
+ AVFilterContext *abuffer_ctx = avfilter_graph_alloc_filter(filter_graph, abuffer, NULL);
+ if (!abuffer_ctx) {
+ fprintf(stderr, "Could not allocate the abuffer instance.\n");
+ }
+ av_get_channel_layout_string(ch_layout, sizeof(ch_layout), 0, AV_CH_LAYOUT_STEREO);
+ #else
+ av_channel_layout_describe(&audio_codec_context->ch_layout, ch_layout, sizeof(ch_layout));
+ #endif
+ av_opt_set (abuffer_ctx, "channel_layout", ch_layout, AV_OPT_SEARCH_CHILDREN);
+ av_opt_set (abuffer_ctx, "sample_fmt", av_get_sample_fmt_name(audio_codec_context->sample_fmt), AV_OPT_SEARCH_CHILDREN);
+ av_opt_set_q (abuffer_ctx, "time_base", audio_codec_context->time_base, AV_OPT_SEARCH_CHILDREN);
+ av_opt_set_int(abuffer_ctx, "sample_rate", audio_codec_context->sample_rate, AV_OPT_SEARCH_CHILDREN);
+ av_opt_set_int(abuffer_ctx, "bit_rate", audio_codec_context->bit_rate, AV_OPT_SEARCH_CHILDREN);
+ err = avfilter_init_str(abuffer_ctx, NULL);
+ if (err < 0) {
+ fprintf(stderr, "Could not initialize the abuffer filter.\n");
+ return err;
+ }
+ src_filter_ctx.push_back(abuffer_ctx);
+ }
+ const AVFilter *mix_filter = avfilter_get_by_name("amix");
+ if (!mix_filter) {
+ av_log(NULL, AV_LOG_ERROR, "Could not find the mix filter.\n");
+ }
+ char args[512];
+ snprintf(args, sizeof(args), "inputs=%d", (int)num_sources);
+ AVFilterContext *mix_ctx;
+ err = avfilter_graph_create_filter(&mix_ctx, mix_filter, "amix", args, NULL, filter_graph);
+ if (err < 0) {
+ av_log(NULL, AV_LOG_ERROR, "Cannot create audio amix filter\n");
+ return err;
+ }
+ const AVFilter *abuffersink = avfilter_get_by_name("abuffersink");
+ if (!abuffersink) {
+ fprintf(stderr, "Could not find the abuffersink filter.\n");
+ }
+ AVFilterContext *abuffersink_ctx = avfilter_graph_alloc_filter(filter_graph, abuffersink, "sink");
+ if (!abuffersink_ctx) {
+ fprintf(stderr, "Could not allocate the abuffersink instance.\n");
+ }
+ err = avfilter_init_str(abuffersink_ctx, NULL);
+ if (err < 0) {
+ fprintf(stderr, "Could not initialize the abuffersink instance.\n");
+ return err;
+ }
+ err = 0;
+ for(size_t i = 0; i < src_filter_ctx.size(); ++i) {
+ AVFilterContext *src_ctx = src_filter_ctx[i];
+ if (err >= 0)
+ err = avfilter_link(src_ctx, 0, mix_ctx, i);
+ }
+ if (err >= 0)
+ err = avfilter_link(mix_ctx, 0, abuffersink_ctx, 0);
+ if (err < 0) {
+ av_log(NULL, AV_LOG_ERROR, "Error connecting filters\n");
+ return err;
+ }
+ err = avfilter_graph_config(filter_graph, NULL);
+ if (err < 0) {
+ av_log(NULL, AV_LOG_ERROR, "Error configuring the filter graph\n");
+ return err;
+ }
+ *graph = filter_graph;
+ *sink = abuffersink_ctx;
+ return 0;
+static void xwayland_check_callback(const gsr_monitor *monitor, void *userdata) {
+ bool *xwayland_found = (bool*)userdata;
+ if(monitor->name_len >= 8 && strncmp(monitor->name, "XWAYLAND", 8) == 0)
+ *xwayland_found = true;
+ else if(memmem(monitor->name, monitor->name_len, "X11", 3))
+ *xwayland_found = true;
+static bool is_xwayland(Display *display) {
+ int opcode, event, error;
+ if(XQueryExtension(display, "XWAYLAND", &opcode, &event, &error))
+ return true;
+ bool xwayland_found = false;
+ for_each_active_monitor_output_x11(display, xwayland_check_callback, &xwayland_found);
+ return xwayland_found;
+static void list_supported_video_codecs() {
+ bool wayland = false;
+ Display *dpy = XOpenDisplay(nullptr);
+ if (!dpy) {
+ wayland = true;
+ fprintf(stderr, "Warning: failed to connect to the X server. Assuming wayland is running without Xwayland\n");
+ }
+ XSetErrorHandler(x11_error_handler);
+ XSetIOErrorHandler(x11_io_error_handler);
+ if(!wayland)
+ wayland = is_xwayland(dpy);
+ gsr_egl egl;
+ if(!gsr_egl_load(&egl, dpy, wayland, false)) {
+ fprintf(stderr, "gsr error: failed to load opengl\n");
+ _exit(1);
+ }
+ char card_path[128];
+ card_path[0] = '\0';
+ if(wayland || egl.gpu_info.vendor != GSR_GPU_VENDOR_NVIDIA) {
+ // TODO: Allow specifying another card, and in other places
+ if(!gsr_get_valid_card_path(&egl, card_path, false)) {
+ fprintf(stderr, "Error: no /dev/dri/cardX device found. If you are running GPU Screen Recorder with prime-run then try running without it. Also make sure that you have at least one connected monitor or record a single window instead on X11\n");
+ _exit(2);
+ }
+ }
+ av_log_set_level(AV_LOG_FATAL);
+ // TODO: Output hdr
+ if(find_h264_encoder(egl.gpu_info.vendor, card_path))
+ puts("h264");
+ if(find_hevc_encoder(egl.gpu_info.vendor, card_path))
+ puts("hevc");
+ if(find_av1_encoder(egl.gpu_info.vendor, card_path))
+ puts("av1");
+ fflush(stdout);
+ gsr_egl_unload(&egl);
+ if(dpy)
+ XCloseDisplay(dpy);
+static gsr_capture* create_capture_impl(const char *window_str, const char *screen_region, bool wayland, gsr_egl &egl, int fps, bool overclock, VideoCodec video_codec, gsr_color_range color_range, bool record_cursor, bool track_damage) {
+ vec2i region_size = { 0, 0 };
+ Window src_window_id = None;
+ bool follow_focused = false;
+ gsr_capture *capture = nullptr;
+ if(strcmp(window_str, "focused") == 0) {
+ if(wayland) {
+ fprintf(stderr, "Error: GPU Screen Recorder window capture only works in a pure X11 session. Xwayland is not supported. You can record a monitor instead on wayland\n");
+ _exit(2);
+ }
+ if(!screen_region) {
+ fprintf(stderr, "Error: option -s is required when using -w focused\n");
+ usage();
+ }
+ if(sscanf(screen_region, "%dx%d", &region_size.x, &region_size.y) != 2) {
+ fprintf(stderr, "Error: invalid value for option -s '%s', expected a value in format WxH\n", screen_region);
+ usage();
+ }
+ if(region_size.x <= 0 || region_size.y <= 0) {
+ fprintf(stderr, "Error: invalid value for option -s '%s', expected width and height to be greater than 0\n", screen_region);
+ usage();
+ }
+ follow_focused = true;
+ } else if(contains_non_hex_number(window_str)) {
+ if(wayland || egl.gpu_info.vendor != GSR_GPU_VENDOR_NVIDIA) {
+ if(strcmp(window_str, "screen") == 0) {
+ FirstOutputCallback first_output;
+ first_output.output_name = NULL;
+ for_each_active_monitor_output(&egl, GSR_CONNECTION_DRM, get_first_output, &first_output);
+ if(first_output.output_name) {
+ window_str = first_output.output_name;
+ } else {
+ fprintf(stderr, "Error: no available output found\n");
+ _exit(1);
+ }
+ }
+ gsr_monitor gmon;
+ if(!get_monitor_by_name(&egl, GSR_CONNECTION_DRM, window_str, &gmon)) {
+ fprintf(stderr, "gsr error: display \"%s\" not found, expected one of:\n", window_str);
+ fprintf(stderr, " \"screen\"\n");
+ for_each_active_monitor_output(&egl, GSR_CONNECTION_DRM, monitor_output_callback_print, NULL);
+ _exit(1);
+ }
+ } else {
+ if(strcmp(window_str, "screen") != 0 && strcmp(window_str, "screen-direct") != 0 && strcmp(window_str, "screen-direct-force") != 0) {
+ gsr_monitor gmon;
+ if(!get_monitor_by_name(&egl, GSR_CONNECTION_X11, window_str, &gmon)) {
+ const int screens_width = XWidthOfScreen(DefaultScreenOfDisplay(egl.x11.dpy));
+ const int screens_height = XHeightOfScreen(DefaultScreenOfDisplay(egl.x11.dpy));
+ fprintf(stderr, "gsr error: display \"%s\" not found, expected one of:\n", window_str);
+ fprintf(stderr, " \"screen\" (%dx%d+%d+%d)\n", screens_width, screens_height, 0, 0);
+ fprintf(stderr, " \"screen-direct\" (%dx%d+%d+%d)\n", screens_width, screens_height, 0, 0);
+ fprintf(stderr, " \"screen-direct-force\" (%dx%d+%d+%d)\n", screens_width, screens_height, 0, 0);
+ for_each_active_monitor_output(&egl, GSR_CONNECTION_X11, monitor_output_callback_print, NULL);
+ _exit(1);
+ }
+ }
+ }
+ if(egl.gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA) {
+ if(wayland) {
+ gsr_capture_kms_cuda_params kms_params;
+ kms_params.egl = &egl;
+ kms_params.display_to_capture = window_str;
+ kms_params.hdr = video_codec_is_hdr(video_codec);
+ kms_params.color_range = color_range;
+ kms_params.record_cursor = record_cursor;
+ capture = gsr_capture_kms_cuda_create(&kms_params);
+ if(!capture)
+ _exit(1);
+ } else {
+ const char *capture_target = window_str;
+ bool direct_capture = strcmp(window_str, "screen-direct") == 0;
+ if(direct_capture) {
+ capture_target = "screen";
+ // TODO: Temporarily disable direct capture because the push model causes stuttering when it's direct capturing. This might be a nvfbc bug. This does not happen when using a compositor.
+ direct_capture = false;
+ fprintf(stderr, "Warning: screen-direct has temporarily been disabled as it causes stuttering. This is likely a NvFBC bug. Falling back to \"screen\".\n");
+ }
+ if(strcmp(window_str, "screen-direct-force") == 0) {
+ direct_capture = true;
+ capture_target = "screen";
+ }
+ gsr_capture_nvfbc_params nvfbc_params;
+ nvfbc_params.egl = &egl;
+ nvfbc_params.display_to_capture = capture_target;
+ nvfbc_params.fps = fps;
+ nvfbc_params.pos = { 0, 0 };
+ nvfbc_params.size = { 0, 0 };
+ nvfbc_params.direct_capture = direct_capture;
+ nvfbc_params.overclock = overclock;
+ nvfbc_params.hdr = video_codec_is_hdr(video_codec);
+ nvfbc_params.color_range = color_range;
+ nvfbc_params.record_cursor = record_cursor;
+ capture = gsr_capture_nvfbc_create(&nvfbc_params);
+ if(!capture)
+ _exit(1);
+ }
+ } else {
+ gsr_capture_kms_vaapi_params kms_params;
+ kms_params.egl = &egl;
+ kms_params.display_to_capture = window_str;
+ kms_params.hdr = video_codec_is_hdr(video_codec);
+ kms_params.color_range = color_range;
+ kms_params.record_cursor = record_cursor;
+ capture = gsr_capture_kms_vaapi_create(&kms_params);
+ if(!capture)
+ _exit(1);
+ }
+ } else {
+ if(wayland) {
+ fprintf(stderr, "Error: GPU Screen Recorder window capture only works in a pure X11 session. Xwayland is not supported. You can record a monitor instead on wayland\n");
+ _exit(2);
+ }
+ errno = 0;
+ src_window_id = strtol(window_str, nullptr, 0);
+ if(src_window_id == None || errno == EINVAL) {
+ fprintf(stderr, "Invalid window number %s\n", window_str);
+ usage();
+ }
+ }
+ if(!capture) {
+ switch(egl.gpu_info.vendor) {
+ gsr_capture_xcomposite_vaapi_params xcomposite_params;
+ xcomposite_params.base.egl = &egl;
+ xcomposite_params.base.window = src_window_id;
+ xcomposite_params.base.follow_focused = follow_focused;
+ xcomposite_params.base.region_size = region_size;
+ xcomposite_params.base.color_range = color_range;
+ xcomposite_params.base.record_cursor = record_cursor;
+ xcomposite_params.base.track_damage = track_damage;
+ capture = gsr_capture_xcomposite_vaapi_create(&xcomposite_params);
+ if(!capture)
+ _exit(1);
+ break;
+ }
+ gsr_capture_xcomposite_cuda_params xcomposite_params;
+ xcomposite_params.base.egl = &egl;
+ xcomposite_params.base.window = src_window_id;
+ xcomposite_params.base.follow_focused = follow_focused;
+ xcomposite_params.base.region_size = region_size;
+ xcomposite_params.base.color_range = color_range;
+ xcomposite_params.base.record_cursor = record_cursor;
+ xcomposite_params.base.track_damage = track_damage;
+ xcomposite_params.overclock = overclock;
+ capture = gsr_capture_xcomposite_cuda_create(&xcomposite_params);
+ if(!capture)
+ _exit(1);
+ break;
+ }
+ }
+ }
+ return capture;
+struct Arg {
+ std::vector<const char*> values;
+ bool optional = false;
+ bool list = false;
+ const char* value() const {
+ if(values.empty())
+ return nullptr;
+ return values.front();
+ }
int main(int argc, char **argv) {
- signal(SIGINT, int_handler);
+ signal(SIGINT, stop_handler);
signal(SIGUSR1, save_replay_handler);
+ signal(SIGUSR2, toggle_pause_handler);
+ // Stop nvidia driver from buffering frames
+ setenv("__GL_MaxFramesAllowed", "1", true);
+ // If this is set to 1 then cuGraphicsGLRegisterImage will fail for egl context with error: invalid OpenGL or DirectX context,
+ // so we overwrite it
+ setenv("__GL_THREADED_OPTIMIZATIONS", "0", true);
+ // Some people set this to nvidia (for nvdec) or vdpau (for nvidia vdpau), which breaks gpu screen recorder since
+ // nvidia doesn't support vaapi and nvidia-vaapi-driver doesn't support encoding yet.
+ // Let vaapi find the matching vaapi driver instead of forcing a specific one.
+ unsetenv("LIBVA_DRIVER_NAME");
+ // Some people set this to force all applications to vsync on nvidia, but this makes eglSwapBuffers never return.
+ unsetenv("__GL_SYNC_TO_VBLANK");
+ // Same as above, but for amd/intel
+ unsetenv("vblank_mode");
+ if(argc <= 1)
+ usage_full();
+ if(argc == 2 && (strcmp(argv[1], "-h") == 0 || strcmp(argv[1], "--help") == 0))
+ usage_full();
+ if(argc == 2 && strcmp(argv[1], "--list-supported-video-codecs") == 0) {
+ list_supported_video_codecs();
+ _exit(0);
+ }
+ //av_log_set_level(AV_LOG_TRACE);
std::map<std::string, Arg> args = {
{ "-w", Arg { {}, false, false } },
- //{ "-s", Arg { nullptr, true } },
{ "-c", Arg { {}, true, false } },
{ "-f", Arg { {}, false, false } },
{ "-s", Arg { {}, true, false } },
@@ -1180,10 +1694,22 @@ int main(int argc, char **argv) {
{ "-q", Arg { {}, true, false } },
{ "-o", Arg { {}, true, false } },
{ "-r", Arg { {}, true, false } },
- { "-k", Arg { {}, true, false } }
+ { "-k", Arg { {}, true, false } },
+ { "-ac", Arg { {}, true, false } },
+ { "-ab", Arg { {}, true, false } },
+ { "-oc", Arg { {}, true, false } },
+ { "-fm", Arg { {}, true, false } },
+ { "-pixfmt", Arg { {}, true, false } },
+ { "-v", Arg { {}, true, false } },
+ { "-mf", Arg { {}, true, false } },
+ { "-sc", Arg { {}, true, false } },
+ { "-cr", Arg { {}, true, false } },
+ { "-cursor", Arg { {}, true, false } },
+ { "-gopm", Arg { {}, true, false } }, // deprecated, use -keyint instead
+ { "-keyint", Arg { {}, true, false } },
- for(int i = 1; i < argc - 1; i += 2) {
+ for(int i = 1; i < argc; i += 2) {
auto it = args.find(argv[i]);
if(it == args.end()) {
fprintf(stderr, "Invalid argument '%s'\n", argv[i]);
@@ -1195,6 +1721,11 @@ int main(int argc, char **argv) {
+ if(i + 1 >= argc) {
+ fprintf(stderr, "Missing value for argument '%s'\n", argv[i]);
+ usage();
+ }
it->second.values.push_back(argv[i + 1]);
@@ -1205,81 +1736,225 @@ int main(int argc, char **argv) {
- VideoCodec video_codec;
- const char *codec_to_use = args["-k"].value();
- if(!codec_to_use)
- codec_to_use = "auto";
+ VideoCodec video_codec = VideoCodec::HEVC;
+ const char *video_codec_to_use = args["-k"].value();
+ if(!video_codec_to_use)
+ video_codec_to_use = "auto";
- if(strcmp(codec_to_use, "h264") == 0) {
+ if(strcmp(video_codec_to_use, "h264") == 0) {
video_codec = VideoCodec::H264;
- } else if(strcmp(codec_to_use, "h265") == 0) {
- video_codec = VideoCodec::H265;
- } else if(strcmp(codec_to_use, "auto") != 0) {
- fprintf(stderr, "Error: -k should either be either 'auto', 'h264' or 'h265', got: '%s'\n", codec_to_use);
+ } else if(strcmp(video_codec_to_use, "h265") == 0 || strcmp(video_codec_to_use, "hevc") == 0) {
+ video_codec = VideoCodec::HEVC;
+ } else if(strcmp(video_codec_to_use, "hevc_hdr") == 0) {
+ video_codec = VideoCodec::HEVC_HDR;
+ } else if(strcmp(video_codec_to_use, "av1") == 0) {
+ video_codec = VideoCodec::AV1;
+ } else if(strcmp(video_codec_to_use, "av1_hdr") == 0) {
+ video_codec = VideoCodec::AV1_HDR;
+ } else if(strcmp(video_codec_to_use, "auto") != 0) {
+ fprintf(stderr, "Error: -k should be either 'auto', 'h264', 'hevc', 'hevc_hdr', 'av1' or 'av1_hdr', got: '%s'\n", video_codec_to_use);
+ usage();
+ }
+ AudioCodec audio_codec = AudioCodec::OPUS;
+ const char *audio_codec_to_use = args["-ac"].value();
+ if(!audio_codec_to_use)
+ audio_codec_to_use = "opus";
+ if(strcmp(audio_codec_to_use, "aac") == 0) {
+ audio_codec = AudioCodec::AAC;
+ } else if(strcmp(audio_codec_to_use, "opus") == 0) {
+ audio_codec = AudioCodec::OPUS;
+ } else if(strcmp(audio_codec_to_use, "flac") == 0) {
+ audio_codec = AudioCodec::FLAC;
+ } else {
+ fprintf(stderr, "Error: -ac should be either 'aac', 'opus' or 'flac', got: '%s'\n", audio_codec_to_use);
+ usage();
+ }
+ if(audio_codec == AudioCodec::FLAC) {
+ fprintf(stderr, "Warning: flac audio codec is temporarily disabled, using opus audio codec instead\n");
+ audio_codec_to_use = "opus";
+ audio_codec = AudioCodec::OPUS;
+ }
+ int audio_bitrate = 0;
+ const char *audio_bitrate_str = args["-ab"].value();
+ if(audio_bitrate_str) {
+ if(sscanf(audio_bitrate_str, "%d", &audio_bitrate) != 1) {
+ fprintf(stderr, "Error: -ab argument \"%s\" is not an integer\n", audio_bitrate_str);
+ usage();
+ }
+ }
+ float keyint = 2.0;
+ const char *gopm_str = args["-gopm"].value();
+ const char *keyint_str = args["-keyint"].value();
+ if(keyint_str) {
+ if(sscanf(keyint_str, "%f", &keyint) != 1) {
+ fprintf(stderr, "Error: -keyint argument \"%s\" is not a floating point number\n", keyint_str);
+ usage();
+ }
+ if(keyint < 0) {
+ fprintf(stderr, "Error: -keyint is expected to be 0 or larger\n");
+ usage();
+ }
+ } else if(gopm_str) {
+ if(sscanf(gopm_str, "%f", &keyint) != 1) {
+ fprintf(stderr, "Error: -gopm argument \"%s\" is not a floating point number\n", gopm_str);
+ usage();
+ }
+ if(keyint < 0) {
+ fprintf(stderr, "Error: -gopm is expected to be 0 or larger\n");
+ usage();
+ }
+ fprintf(stderr, "Warning: -gopm argument is deprecated, use -keyint instead\n");
+ }
+ bool overclock = false;
+ const char *overclock_str = args["-oc"].value();
+ if(!overclock_str)
+ overclock_str = "no";
+ if(strcmp(overclock_str, "yes") == 0) {
+ overclock = true;
+ } else if(strcmp(overclock_str, "no") == 0) {
+ overclock = false;
+ } else {
+ fprintf(stderr, "Error: -oc should be either 'yes' or 'no', got: '%s'\n", overclock_str);
+ usage();
+ }
+ bool verbose = true;
+ const char *verbose_str = args["-v"].value();
+ if(!verbose_str)
+ verbose_str = "yes";
+ if(strcmp(verbose_str, "yes") == 0) {
+ verbose = true;
+ } else if(strcmp(verbose_str, "no") == 0) {
+ verbose = false;
+ } else {
+ fprintf(stderr, "Error: -v should be either 'yes' or 'no', got: '%s'\n", verbose_str);
+ usage();
+ }
+ bool record_cursor = true;
+ const char *record_cursor_str = args["-cursor"].value();
+ if(!record_cursor_str)
+ record_cursor_str = "yes";
+ if(strcmp(record_cursor_str, "yes") == 0) {
+ record_cursor = true;
+ } else if(strcmp(record_cursor_str, "no") == 0) {
+ record_cursor = false;
+ } else {
+ fprintf(stderr, "Error: -cursor should be either 'yes' or 'no', got: '%s'\n", record_cursor_str);
+ usage();
+ }
+ bool make_folders = false;
+ const char *make_folders_str = args["-mf"].value();
+ if(!make_folders_str)
+ make_folders_str = "no";
+ if(strcmp(make_folders_str, "yes") == 0) {
+ make_folders = true;
+ } else if(strcmp(make_folders_str, "no") == 0) {
+ make_folders = false;
+ } else {
+ fprintf(stderr, "Error: -mf should be either 'yes' or 'no', got: '%s'\n", make_folders_str);
+ usage();
+ }
+ const char *recording_saved_script = args["-sc"].value();
+ if(recording_saved_script) {
+ struct stat buf;
+ if(stat(recording_saved_script, &buf) == -1 || !S_ISREG(buf.st_mode)) {
+ fprintf(stderr, "Error: Script \"%s\" either doesn't exist or it's not a file\n", recording_saved_script);
+ usage();
+ }
+ if(!(buf.st_mode & S_IXUSR)) {
+ fprintf(stderr, "Error: Script \"%s\" is not executable\n", recording_saved_script);
+ usage();
+ }
+ }
+ PixelFormat pixel_format = PixelFormat::YUV420;
+ const char *pixfmt = args["-pixfmt"].value();
+ if(!pixfmt)
+ pixfmt = "yuv420";
+ if(strcmp(pixfmt, "yuv420") == 0) {
+ pixel_format = PixelFormat::YUV420;
+ } else if(strcmp(pixfmt, "yuv444") == 0) {
+ pixel_format = PixelFormat::YUV444;
+ } else {
+ fprintf(stderr, "Error: -pixfmt should be either 'yuv420' or 'yuv444', got: '%s'\n", pixfmt);
const Arg &audio_input_arg = args["-a"];
- const std::vector<AudioInput> audio_inputs = get_pulseaudio_inputs();
- std::vector<AudioInput> requested_audio_inputs;
+ std::vector<AudioInput> audio_inputs;
+ if(!audio_input_arg.values.empty())
+ audio_inputs = get_pulseaudio_inputs();
+ std::vector<MergedAudioInputs> requested_audio_inputs;
+ bool uses_amix = false;
// Manually check if the audio inputs we give exist. This is only needed for pipewire, not pulseaudio.
for(const char *audio_input : audio_input_arg.values) {
- requested_audio_inputs.push_back(parse_audio_input_arg(audio_input));
- AudioInput &request_audio_input = requested_audio_inputs.back();
+ if(!audio_input || audio_input[0] == '\0')
+ continue;
- bool match = false;
- for(const auto &existing_audio_input : audio_inputs) {
- if(strcmp(request_audio_input.name.c_str(), existing_audio_input.name.c_str()) == 0) {
- if(request_audio_input.description.empty())
- request_audio_input.description = "gsr-" + existing_audio_input.description;
+ requested_audio_inputs.push_back({parse_audio_input_arg(audio_input)});
+ if(requested_audio_inputs.back().audio_inputs.size() > 1)
+ uses_amix = true;
- match = true;
- break;
- }
- }
- if(!match) {
- fprintf(stderr, "Error: Audio input device '%s' is not a valid audio device. Expected one of:\n", request_audio_input.name.c_str());
+ for(AudioInput &request_audio_input : requested_audio_inputs.back().audio_inputs) {
+ bool match = false;
for(const auto &existing_audio_input : audio_inputs) {
- fprintf(stderr, " %s\n", existing_audio_input.name.c_str());
- }
- exit(2);
- }
- }
+ if(strcmp(request_audio_input.name.c_str(), existing_audio_input.name.c_str()) == 0) {
+ if(request_audio_input.description.empty())
+ request_audio_input.description = "gsr-" + existing_audio_input.description;
- uint32_t region_x = 0;
- uint32_t region_y = 0;
- uint32_t region_width = 0;
- uint32_t region_height = 0;
+ match = true;
+ break;
+ }
+ }
- /*
- TODO: Fix this. Doesn't work for some reason
- const char *screen_region = args["-s"].value();
- if(screen_region) {
- if(sscanf(screen_region, "%ux%u+%u+%u", &region_x, &region_y, &region_width, &region_height) != 4) {
- fprintf(stderr, "Invalid value for -s '%s', expected a value in format WxH+X+Y\n", screen_region);
- return 1;
+ if(!match) {
+ fprintf(stderr, "Error: Audio input device '%s' is not a valid audio device, expected one of:\n", request_audio_input.name.c_str());
+ for(const auto &existing_audio_input : audio_inputs) {
+ fprintf(stderr, " %s (%s)\n", existing_audio_input.name.c_str(), existing_audio_input.description.c_str());
+ }
+ _exit(2);
+ }
- */
const char *container_format = args["-c"].value();
+ if(container_format && strcmp(container_format, "mkv") == 0)
+ container_format = "matroska";
int fps = atoi(args["-f"].value());
if(fps == 0) {
fprintf(stderr, "Invalid fps argument: %s\n", args["-f"].value());
- return 1;
+ _exit(1);
if(fps < 1)
fps = 1;
+ VideoQuality quality = VideoQuality::VERY_HIGH;
const char *quality_str = args["-q"].value();
quality_str = "very_high";
- VideoQuality quality;
if(strcmp(quality_str, "medium") == 0) {
quality = VideoQuality::MEDIUM;
} else if(strcmp(quality_str, "high") == 0) {
@@ -1299,112 +1974,138 @@ int main(int argc, char **argv) {
replay_buffer_size_secs = atoi(replay_buffer_size_secs_str);
if(replay_buffer_size_secs < 5 || replay_buffer_size_secs > 1200) {
fprintf(stderr, "Error: option -r has to be between 5 and 1200, was: %s\n", replay_buffer_size_secs_str);
- return 1;
+ _exit(1);
- replay_buffer_size_secs += 5; // Add a few seconds to account of lost packets because of non-keyframe packets skipped
+ replay_buffer_size_secs += std::ceil(keyint); // Add the keyframe interval, rounded up, to account for packets lost because non-keyframe packets are skipped
- if(!cuda.load()) {
- fprintf(stderr, "Error: failed to load cuda\n");
- return 2;
+ const char *window_str = strdup(args["-w"].value());
+ bool wayland = false;
+ Display *dpy = XOpenDisplay(nullptr);
+ if (!dpy) {
+ wayland = true;
+ fprintf(stderr, "Warning: failed to connect to the X server. Assuming wayland is running without Xwayland\n");
- CUresult res;
+ XSetErrorHandler(x11_error_handler);
+ XSetIOErrorHandler(x11_io_error_handler);
- res = cuda.cuInit(0);
- if(res != CUDA_SUCCESS) {
- const char *err_str;
- cuda.cuGetErrorString(res, &err_str);
- fprintf(stderr, "Error: cuInit failed, error %s (result: %d)\n", err_str, res);
- return 1;
- }
+ if(!wayland)
+ wayland = is_xwayland(dpy);
- int nGpu = 0;
- cuda.cuDeviceGetCount(&nGpu);
- if (nGpu <= 0) {
- fprintf(stderr, "Error: no cuda supported devices found\n");
- return 1;
+ if(video_codec_is_hdr(video_codec) && !wayland) {
+ fprintf(stderr, "Error: hdr video codec option %s is not available on X11\n", video_codec_to_use);
+ _exit(1);
- CUdevice cu_dev;
- res = cuda.cuDeviceGet(&cu_dev, 0);
- if(res != CUDA_SUCCESS) {
- const char *err_str;
- cuda.cuGetErrorString(res, &err_str);
- fprintf(stderr, "Error: unable to get CUDA device, error: %s (result: %d)\n", err_str, res);
- return 1;
+ const bool is_monitor_capture = strcmp(window_str, "focused") != 0 && contains_non_hex_number(window_str);
+ gsr_egl egl;
+ if(!gsr_egl_load(&egl, dpy, wayland, is_monitor_capture)) {
+ fprintf(stderr, "gsr error: failed to load opengl\n");
+ _exit(1);
- CUcontext cu_ctx;
- res = cuda.cuCtxCreate_v2(&cu_ctx, CU_CTX_SCHED_AUTO, cu_dev);
- if(res != CUDA_SUCCESS) {
- const char *err_str;
- cuda.cuGetErrorString(res, &err_str);
- fprintf(stderr, "Error: unable to create CUDA context, error: %s (result: %d)\n", err_str, res);
- return 1;
- }
+ bool very_old_gpu = false;
- const char *record_area = args["-s"].value();
+ if(egl.gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA && egl.gpu_info.gpu_version != 0 && egl.gpu_info.gpu_version < 900) {
+ fprintf(stderr, "Info: your gpu appears to be very old (older than maxwell architecture). Switching to lower preset\n");
+ very_old_gpu = true;
+ }
- uint32_t window_width = 0;
- uint32_t window_height = 0;
- int window_x = 0;
- int window_y = 0;
+ if(egl.gpu_info.vendor != GSR_GPU_VENDOR_NVIDIA && overclock) {
+ fprintf(stderr, "Info: overclock option has no effect on amd/intel, ignoring option\n");
+ }
- NvFBCLibrary nv_fbc_library;
+ if(egl.gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA && overclock && wayland) {
+ fprintf(stderr, "Info: overclocking is not possible on nvidia on wayland, ignoring option\n");
+ }
- const char *window_str = args["-w"].value();
- Window src_window_id = None;
- if(contains_non_hex_number(window_str)) {
- if(record_area) {
- fprintf(stderr, "Option -s is not supported when recording a monitor/screen\n");
- usage();
+ egl.card_path[0] = '\0';
+ if(wayland || egl.gpu_info.vendor != GSR_GPU_VENDOR_NVIDIA) {
+ // TODO: Allow specifying another card, and in other places
+ if(!gsr_get_valid_card_path(&egl, egl.card_path, is_monitor_capture)) {
+ fprintf(stderr, "Error: no /dev/dri/cardX device found. If you are running GPU Screen Recorder with prime-run then try running without it. Also make sure that you have at least one connected monitor or record a single window instead on X11\n");
+ _exit(2);
+ }
- if(!nv_fbc_library.load())
- return 1;
+ // TODO: Fix constant framerate not working properly on amd/intel because capture framerate gets locked to the same framerate as
+ // game framerate, which doesn't work well when you need to encode multiple duplicate frames (AMD/Intel is slow at encoding!).
+ // It also appears to skip audio frames on nvidia wayland? why? that should be fine, but it causes video stuttering because of audio/video sync.
+ FramerateMode framerate_mode = FramerateMode::VARIABLE;
+ const char *framerate_mode_str = args["-fm"].value();
+ if(!framerate_mode_str)
+ framerate_mode_str = "vfr";
+ if(strcmp(framerate_mode_str, "cfr") == 0) {
+ framerate_mode = FramerateMode::CONSTANT;
+ } else if(strcmp(framerate_mode_str, "vfr") == 0) {
+ framerate_mode = FramerateMode::VARIABLE;
+ } else if(strcmp(framerate_mode_str, "content") == 0) {
+ framerate_mode = FramerateMode::CONTENT;
+ } else {
+ fprintf(stderr, "Error: -fm should be either 'cfr', 'vfr' or 'content', got: '%s'\n", framerate_mode_str);
+ usage();
+ }
- const char *capture_target = window_str;
- bool direct_capture = strcmp(window_str, "screen-direct") == 0;
- if(direct_capture) {
- capture_target = "screen";
- // TODO: Temporary disable direct capture because push model causes stuttering when it's direct capturing. This might be a nvfbc bug. This does not happen when using a compositor.
- direct_capture = false;
- fprintf(stderr, "Warning: screen-direct has temporary been disabled as it causes stuttering. This is likely a NvFBC bug. Falling back to \"screen\".\n");
- }
+ if(framerate_mode == FramerateMode::CONTENT && (wayland || is_monitor_capture)) {
+ fprintf(stderr, "Error: -fm 'content' is currently only supported on X11 and when capturing a single window.\n");
+ usage();
+ }
- if(!nv_fbc_library.create(capture_target, fps, &window_width, &window_height, region_x, region_y, region_width, region_height, direct_capture))
- return 1;
+ gsr_color_range color_range = GSR_COLOR_RANGE_LIMITED;
+ const char *color_range_str = args["-cr"].value();
+ if(!color_range_str)
+ color_range_str = "limited";
+ if(strcmp(color_range_str, "limited") == 0) {
+ color_range = GSR_COLOR_RANGE_LIMITED;
+ } else if(strcmp(color_range_str, "full") == 0) {
+ color_range = GSR_COLOR_RANGE_FULL;
} else {
- errno = 0;
- src_window_id = strtol(window_str, nullptr, 0);
- if(src_window_id == None || errno == EINVAL) {
- fprintf(stderr, "Invalid window number %s\n", window_str);
- usage();
- }
+ fprintf(stderr, "Error: -cr should be either 'limited' or 'full', got: '%s'\n", color_range_str);
+ usage();
- int record_width = window_width;
- int record_height = window_height;
- if(record_area) {
- if(sscanf(record_area, "%dx%d", &record_width, &record_height) != 2) {
- fprintf(stderr, "Invalid value for -s '%s', expected a value in format WxH\n", record_area);
- return 1;
- }
+ const char *screen_region = args["-s"].value();
+ if(screen_region && strcmp(window_str, "focused") != 0) {
+ fprintf(stderr, "Error: option -s is only available when using -w focused\n");
+ usage();
+ bool is_livestream = false;
const char *filename = args["-o"].value();
if(filename) {
- if(replay_buffer_size_secs != -1) {
- if(!container_format) {
- fprintf(stderr, "Error: option -c is required when using option -r\n");
- usage();
+ is_livestream = is_livestream_path(filename);
+ if(is_livestream) {
+ if(replay_buffer_size_secs != -1) {
+ fprintf(stderr, "Error: replay mode is not applicable to live streaming\n");
+ _exit(1);
+ } else {
+ if(replay_buffer_size_secs == -1) {
+ char directory_buf[PATH_MAX];
+ snprintf(directory_buf, sizeof(directory_buf), "%s", filename); // bounded copy: filename may be longer than PATH_MAX
+ char *directory = dirname(directory_buf);
+ if(strcmp(directory, ".") != 0 && strcmp(directory, "/") != 0) {
+ if(create_directory_recursive(directory) != 0) {
+ fprintf(stderr, "Error: failed to create directory for output file: %s\n", filename);
+ _exit(1);
+ }
+ }
+ } else {
+ if(!container_format) {
+ fprintf(stderr, "Error: option -c is required when using option -r\n");
+ usage();
+ }
- struct stat buf;
- if(stat(filename, &buf) == -1 || !S_ISDIR(buf.st_mode)) {
- fprintf(stderr, "Error: directory \"%s\" does not exist or is not a directory\n", filename);
- usage();
+ struct stat buf;
+ if(stat(filename, &buf) != -1 && !S_ISDIR(buf.st_mode)) {
+ fprintf(stderr, "Error: File \"%s\" exists but it's not a directory\n", filename);
+ usage();
+ }
} else {
@@ -1421,98 +2122,17 @@ int main(int argc, char **argv) {
- const double target_fps = 1.0 / (double)fps;
- Display *dpy = XOpenDisplay(nullptr);
- if (!dpy) {
- fprintf(stderr, "Error: Failed to open display\n");
- return 1;
- }
- XSetErrorHandler(x11_error_handler);
- XSetIOErrorHandler(x11_io_error_handler);
- WindowPixmap window_pixmap;
- Window window = None;
- if(src_window_id) {
- bool has_name_pixmap = x11_supports_composite_named_window_pixmap(dpy);
- if (!has_name_pixmap) {
- fprintf(stderr, "Error: XCompositeNameWindowPixmap is not supported by "
- "your X11 server\n");
- return 1;
- }
- XWindowAttributes attr;
- if (!XGetWindowAttributes(dpy, src_window_id, &attr)) {
- fprintf(stderr, "Error: Invalid window id: %lu\n", src_window_id);
- return 1;
- }
- window_width = std::max(0, attr.width);
- window_height = std::max(0, attr.height);
- window_x = attr.x;
- window_y = attr.y;
- Window c;
- XTranslateCoordinates(dpy, src_window_id, DefaultRootWindow(dpy), 0, 0, &window_x, &window_y, &c);
- XCompositeRedirectWindow(dpy, src_window_id, CompositeRedirectAutomatic);
- if(!gl.load()) {
- fprintf(stderr, "Error: Failed to load opengl\n");
- return 1;
- }
- window = create_opengl_window(dpy);
- if(!window)
- return 1;
- set_vertical_sync_enabled(dpy, window, false);
- recreate_window_pixmap(dpy, src_window_id, window_pixmap);
- if(!record_area) {
- record_width = window_pixmap.texture_width;
- record_height = window_pixmap.texture_height;
- fprintf(stderr, "Record size: %dx%d\n", record_width, record_height);
- }
- } else {
- window_pixmap.texture_id = 0;
- window_pixmap.target_texture_id = 0;
- window_pixmap.texture_width = window_width;
- window_pixmap.texture_height = window_height;
- }
- bool very_old_gpu = false;
- bool gl_loaded = window;
- if(!gl_loaded) {
- if(!gl.load()) {
- fprintf(stderr, "Error: Failed to load opengl\n");
- return 1;
- }
- }
- const unsigned char *gl_renderer = gl.glGetString(GL_RENDERER);
- if(gl_renderer) {
- int gpu_num = 1000;
- sscanf((const char*)gl_renderer, "%*s %*s %*s %d", &gpu_num);
- if(gpu_num < 900) {
- fprintf(stderr, "Info: your gpu appears to be very old (older than maxwell architecture). Switching to lower preset\n");
- very_old_gpu = true;
- }
- }
- if(!gl_loaded)
- gl.unload();
AVFormatContext *av_format_context;
// The output format is automatically guessed by the file extension
avformat_alloc_output_context2(&av_format_context, nullptr, container_format, filename);
if (!av_format_context) {
- fprintf(stderr, "Error: Failed to deduce container format from file extension\n");
- return 1;
+ if(container_format)
+ fprintf(stderr, "Error: Container format '%s' (argument -c) is not valid\n", container_format);
+ else
+ fprintf(stderr, "Error: Failed to deduce container format from file extension\n");
+ _exit(1);
- av_format_context->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
- av_format_context->flags |= AVFMT_FLAG_GENPTS;
const AVOutputFormat *output_format = av_format_context->oformat;
std::string file_extension = output_format->extensions;
@@ -1522,83 +2142,295 @@ int main(int argc, char **argv) {
file_extension = file_extension.substr(0, comma_index);
- if(strcmp(codec_to_use, "auto") == 0) {
- const AVCodec *h265_codec = find_h265_encoder();
+ const bool force_no_audio_offset = is_livestream || (file_extension != "mp4" && file_extension != "mkv" && file_extension != "webm");
+ switch(audio_codec) {
+ case AudioCodec::AAC: {
+ if(file_extension == "webm") {
+ audio_codec_to_use = "opus";
+ audio_codec = AudioCodec::OPUS;
+ fprintf(stderr, "Warning: .webm files only support opus audio codec, changing audio codec from aac to opus\n");
+ }
+ break;
+ }
+ case AudioCodec::OPUS: {
+ // TODO: Also check mpegts?
+ if(file_extension != "mp4" && file_extension != "mkv" && file_extension != "webm") {
+ audio_codec_to_use = "aac";
+ audio_codec = AudioCodec::AAC;
+ fprintf(stderr, "Warning: opus audio codec is only supported by .mp4, .mkv and .webm files, falling back to aac instead\n");
+ }
+ break;
+ }
+ case AudioCodec::FLAC: {
+ // TODO: Also check mpegts?
+ if(file_extension == "webm") {
+ audio_codec_to_use = "opus";
+ audio_codec = AudioCodec::OPUS;
+ fprintf(stderr, "Warning: .webm files only support opus audio codec, changing audio codec from flac to opus\n");
+ } else if(file_extension != "mp4" && file_extension != "mkv") {
+ audio_codec_to_use = "aac";
+ audio_codec = AudioCodec::AAC;
+ fprintf(stderr, "Warning: flac audio codec is only supported by .mp4 and .mkv files, falling back to aac instead\n");
+ } else if(uses_amix) {
+ // TODO: remove this? is it true anymore?
+ audio_codec_to_use = "opus";
+ audio_codec = AudioCodec::OPUS;
+ fprintf(stderr, "Warning: flac audio codec is not supported when mixing audio sources, falling back to opus instead\n");
+ }
+ break;
+ }
+ }
+ const double target_fps = 1.0 / (double)fps;
- // h265 generally allows recording at a higher resolution than h264 on nvidia cards. On a gtx 1080 4k is the max resolution for h264 but for h265 it's 8k.
- // Another important info is that when recording at a higher fps than.. 60? h265 has very bad performance. For example when recording at 144 fps the fps drops to 1
- // while with h264 the fps doesn't drop.
- if(!h265_codec) {
- fprintf(stderr, "Info: using h264 encoder because a codec was not specified and your gpu does not support h265\n");
- codec_to_use = "h264";
+ const bool video_codec_auto = strcmp(video_codec_to_use, "auto") == 0;
+ if(video_codec_auto) {
+ const AVCodec *h264_codec = find_h264_encoder(egl.gpu_info.vendor, egl.card_path);
+ if(!h264_codec) {
+ fprintf(stderr, "Info: using hevc encoder because a codec was not specified and your gpu does not support h264\n");
+ video_codec_to_use = "hevc";
+ video_codec = VideoCodec::HEVC;
+ } else {
+ fprintf(stderr, "Info: using h264 encoder because a codec was not specified\n");
+ video_codec_to_use = "h264";
video_codec = VideoCodec::H264;
- } else if(fps > 60) {
- fprintf(stderr, "Info: using h264 encoder because a codec was not specified and fps is more than 60\n");
- codec_to_use = "h264";
+ }
+ }
+ // TODO: Allow hevc, vp9 and av1 in (enhanced) flv (supported since ffmpeg 6.1)
+ const bool is_flv = strcmp(file_extension.c_str(), "flv") == 0;
+ if(is_flv) {
+ if(video_codec != VideoCodec::H264) {
+ video_codec_to_use = "h264";
video_codec = VideoCodec::H264;
- } else {
- fprintf(stderr, "Info: using h265 encoder because a codec was not specified\n");
- codec_to_use = "h265";
- video_codec = VideoCodec::H265;
+ fprintf(stderr, "Warning: hevc/av1 is not compatible with flv, falling back to h264 instead.\n");
+ }
+ if(audio_codec != AudioCodec::AAC) {
+ audio_codec_to_use = "aac";
+ audio_codec = AudioCodec::AAC;
+ fprintf(stderr, "Warning: flv only supports aac, falling back to aac instead.\n");
- //bool use_hevc = strcmp(window_str, "screen") == 0 || strcmp(window_str, "screen-direct") == 0;
- if(video_codec != VideoCodec::H264 && strcmp(file_extension.c_str(), "flv") == 0) {
- video_codec = VideoCodec::H264;
- fprintf(stderr, "Warning: h265 is not compatible with flv, falling back to h264 instead.\n");
+ const bool is_hls = strcmp(file_extension.c_str(), "m3u8") == 0;
+ if(is_hls) {
+ if(video_codec == VideoCodec::AV1 || video_codec == VideoCodec::AV1_HDR) {
+ video_codec_to_use = "hevc";
+ video_codec = VideoCodec::HEVC;
+ fprintf(stderr, "Warning: av1 is not compatible with hls (m3u8), falling back to hevc instead.\n");
+ }
+ if(audio_codec != AudioCodec::AAC) {
+ audio_codec_to_use = "aac";
+ audio_codec = AudioCodec::AAC;
+ fprintf(stderr, "Warning: hls (m3u8) only supports aac, falling back to aac instead.\n");
+ }
const AVCodec *video_codec_f = nullptr;
switch(video_codec) {
case VideoCodec::H264:
- video_codec_f = find_h264_encoder();
+ video_codec_f = find_h264_encoder(egl.gpu_info.vendor, egl.card_path);
+ break;
+ case VideoCodec::HEVC:
+ case VideoCodec::HEVC_HDR:
+ video_codec_f = find_hevc_encoder(egl.gpu_info.vendor, egl.card_path);
- case VideoCodec::H265:
- video_codec_f = find_h265_encoder();
+ case VideoCodec::AV1:
+ case VideoCodec::AV1_HDR:
+ video_codec_f = find_av1_encoder(egl.gpu_info.vendor, egl.card_path);
+ if(!video_codec_auto && !video_codec_f && !is_flv) {
+ switch(video_codec) {
+ case VideoCodec::H264: {
+ fprintf(stderr, "Warning: selected video codec h264 is not supported, trying hevc instead\n");
+ video_codec_to_use = "hevc";
+ video_codec = VideoCodec::HEVC;
+ video_codec_f = find_hevc_encoder(egl.gpu_info.vendor, egl.card_path);
+ break;
+ }
+ case VideoCodec::HEVC:
+ case VideoCodec::HEVC_HDR: {
+ fprintf(stderr, "Warning: selected video codec hevc is not supported, trying h264 instead\n");
+ video_codec_to_use = "h264";
+ video_codec = VideoCodec::H264;
+ video_codec_f = find_h264_encoder(egl.gpu_info.vendor, egl.card_path);
+ break;
+ }
+ case VideoCodec::AV1:
+ case VideoCodec::AV1_HDR: {
+ fprintf(stderr, "Warning: selected video codec av1 is not supported, trying h264 instead\n");
+ video_codec_to_use = "h264";
+ video_codec = VideoCodec::H264;
+ video_codec_f = find_h264_encoder(egl.gpu_info.vendor, egl.card_path);
+ break;
+ }
+ }
+ }
if(!video_codec_f) {
- fprintf(stderr, "Error: your gpu does not support '%s' video codec\n", video_codec == VideoCodec::H264 ? "h264" : "h265");
- exit(2);
+ const char *video_codec_name = "";
+ switch(video_codec) {
+ case VideoCodec::H264: {
+ video_codec_name = "h264";
+ break;
+ }
+ case VideoCodec::HEVC:
+ case VideoCodec::HEVC_HDR: {
+ video_codec_name = "hevc";
+ break;
+ }
+ case VideoCodec::AV1:
+ case VideoCodec::AV1_HDR: {
+ video_codec_name = "av1";
+ break;
+ }
+ }
+ fprintf(stderr, "Error: your gpu does not support '%s' video codec. If you are sure that your gpu does support '%s' video encoding and you are using an AMD/Intel GPU,\n"
+ " then make sure you have installed the GPU specific vaapi packages (intel-media-driver, libva-intel-driver or libva-mesa-driver).\n"
+ " It's also possible that your distro has disabled hardware accelerated video encoding for '%s' video codec.\n"
+ " This may be the case on corporate distros such as Manjaro, Fedora or OpenSUSE.\n"
+ " You can test this by running 'vainfo | grep VAEntrypointEncSlice' to see if it matches any H264/HEVC profile.\n"
+ " On such distros, you need to manually install mesa from source to enable H264/HEVC hardware acceleration, or use a more user friendly distro. Alternatively record with AV1 if supported by your GPU.\n"
+ " You can alternatively use the flatpak version of GPU Screen Recorder (https://flathub.org/apps/com.dec05eba.gpu_screen_recorder) which bypasses system issues with patented H264/HEVC codecs.\n"
+ " Make sure you have mesa-extra freedesktop runtime installed when using the flatpak (this should be the default), which can be installed with this command:\n"
+ " flatpak install --system org.freedesktop.Platform.GL.default//23.08-extra", video_codec_name, video_codec_name, video_codec_name);
+ _exit(2);
- const bool is_livestream = is_livestream_path(filename);
+ gsr_capture *capture = create_capture_impl(window_str, screen_region, wayland, egl, fps, overclock, video_codec, color_range, record_cursor, framerate_mode == FramerateMode::CONTENT);
// (Some?) livestreaming services require at least one audio track to work.
// If no audio is provided then create one silent audio track.
if(is_livestream && requested_audio_inputs.empty()) {
fprintf(stderr, "Info: live streaming but no audio track was added. Adding a silent audio track\n");
- requested_audio_inputs.push_back({ "", "gsr-silent" });
+ MergedAudioInputs mai;
+ mai.audio_inputs.push_back({ "", "gsr-silent" });
+ requested_audio_inputs.push_back(std::move(mai));
+ }
+ if(is_livestream && recording_saved_script) {
+ fprintf(stderr, "Warning: live stream detected, -sc script is ignored\n");
+ recording_saved_script = nullptr;
AVStream *video_stream = nullptr;
std::vector<AudioTrack> audio_tracks;
+ const bool hdr = video_codec_is_hdr(video_codec);
- AVCodecContext *video_codec_context = create_video_codec_context(AV_PIX_FMT_CUDA, quality, record_width, record_height, fps, video_codec_f, is_livestream);
+ AVCodecContext *video_codec_context = create_video_codec_context(egl.gpu_info.vendor == GSR_GPU_VENDOR_NVIDIA ? AV_PIX_FMT_CUDA : AV_PIX_FMT_VAAPI, quality, fps, video_codec_f, is_livestream, egl.gpu_info.vendor, framerate_mode, hdr, color_range, keyint);
if(replay_buffer_size_secs == -1)
video_stream = create_stream(av_format_context, video_codec_context);
- AVBufferRef *device_ctx;
- CUgraphicsResource cuda_graphics_resource;
- open_video(video_codec_context, window_pixmap, &device_ctx, &cuda_graphics_resource, cu_ctx, !src_window_id, quality, is_livestream, very_old_gpu);
+ AVFrame *video_frame = av_frame_alloc();
+ if(!video_frame) {
+ fprintf(stderr, "Error: Failed to allocate video frame\n");
+ _exit(1);
+ }
+ video_frame->format = video_codec_context->pix_fmt;
+ video_frame->width = video_codec_context->width;
+ video_frame->height = video_codec_context->height;
+ video_frame->color_range = video_codec_context->color_range;
+ video_frame->color_primaries = video_codec_context->color_primaries;
+ video_frame->color_trc = video_codec_context->color_trc;
+ video_frame->colorspace = video_codec_context->colorspace;
+ video_frame->chroma_location = video_codec_context->chroma_sample_location;
+ int capture_result = gsr_capture_start(capture, video_codec_context, video_frame);
+ if(capture_result != 0) {
+ fprintf(stderr, "gsr error: gsr_capture_start failed\n");
+ _exit(capture_result);
+ }
+ open_video(video_codec_context, quality, very_old_gpu, egl.gpu_info.vendor, pixel_format, hdr);
avcodec_parameters_from_context(video_stream->codecpar, video_codec_context);
+ int audio_max_frame_size = 1024;
int audio_stream_index = VIDEO_STREAM_INDEX + 1;
- for(const AudioInput &audio_input : requested_audio_inputs) {
- AVCodecContext *audio_codec_context = create_audio_codec_context(fps);
+ for(const MergedAudioInputs &merged_audio_inputs : requested_audio_inputs) {
+ const bool use_amix = merged_audio_inputs.audio_inputs.size() > 1;
+ AVCodecContext *audio_codec_context = create_audio_codec_context(fps, audio_codec, use_amix, audio_bitrate);
AVStream *audio_stream = nullptr;
if(replay_buffer_size_secs == -1)
audio_stream = create_stream(av_format_context, audio_codec_context);
- AVFrame *audio_frame = open_audio(audio_codec_context);
+ open_audio(audio_codec_context);
avcodec_parameters_from_context(audio_stream->codecpar, audio_codec_context);
- audio_tracks.push_back({ audio_codec_context, audio_frame, audio_stream, {}, {}, audio_stream_index, audio_input });
+ const int num_channels = audio_codec_context->channels;
+ #else
+ const int num_channels = audio_codec_context->ch_layout.nb_channels;
+ #endif
+ //audio_frame->sample_rate = audio_codec_context->sample_rate;
+ std::vector<AVFilterContext*> src_filter_ctx;
+ AVFilterGraph *graph = nullptr;
+ AVFilterContext *sink = nullptr;
+ if(use_amix) {
+ int err = init_filter_graph(audio_codec_context, &graph, &sink, src_filter_ctx, merged_audio_inputs.audio_inputs.size());
+ if(err < 0) {
+ fprintf(stderr, "Error: failed to create audio filter\n");
+ _exit(1);
+ }
+ }
+ // TODO: Cleanup above
+ const double audio_fps = (double)audio_codec_context->sample_rate / (double)audio_codec_context->frame_size;
+ const double timeout_sec = 1000.0 / audio_fps / 1000.0;
+ const double audio_startup_time_seconds = force_no_audio_offset ? 0 : audio_codec_get_desired_delay(audio_codec, fps);// * ((double)audio_codec_context->frame_size / 1024.0);
+ const double num_audio_frames_shift = audio_startup_time_seconds / timeout_sec;
+ std::vector<AudioDevice> audio_devices;
+ for(size_t i = 0; i < merged_audio_inputs.audio_inputs.size(); ++i) {
+ auto &audio_input = merged_audio_inputs.audio_inputs[i];
+ AVFilterContext *src_ctx = nullptr;
+ if(use_amix)
+ src_ctx = src_filter_ctx[i];
+ AudioDevice audio_device;
+ audio_device.audio_input = audio_input;
+ audio_device.src_filter_ctx = src_ctx;
+ if(audio_input.name.empty()) {
+ audio_device.sound_device.handle = NULL;
+ audio_device.sound_device.frames = 0;
+ } else {
+ if(sound_device_get_by_name(&audio_device.sound_device, audio_input.name.c_str(), audio_input.description.c_str(), num_channels, audio_codec_context->frame_size, audio_codec_context_get_audio_format(audio_codec_context)) != 0) {
+ fprintf(stderr, "Error: failed to get \"%s\" sound device\n", audio_input.name.c_str());
+ _exit(1);
+ }
+ }
+ audio_device.frame = create_audio_frame(audio_codec_context);
+ audio_device.frame->pts = -audio_codec_context->frame_size * num_audio_frames_shift;
+ audio_devices.push_back(std::move(audio_device));
+ }
+ AudioTrack audio_track;
+ audio_track.codec_context = audio_codec_context;
+ audio_track.stream = audio_stream;
+ audio_track.audio_devices = std::move(audio_devices);
+ audio_track.graph = graph;
+ audio_track.sink = sink;
+ audio_track.stream_index = audio_stream_index;
+ audio_track.pts = -audio_codec_context->frame_size * num_audio_frames_shift;
+ audio_tracks.push_back(std::move(audio_track));
+ audio_max_frame_size = std::max(audio_max_frame_size, audio_codec_context->frame_size);
//av_dump_format(av_format_context, 0, filename, 1);
@@ -1607,460 +2439,334 @@ int main(int argc, char **argv) {
int ret = avio_open(&av_format_context->pb, filename, AVIO_FLAG_WRITE);
if (ret < 0) {
fprintf(stderr, "Error: Could not open '%s': %s\n", filename, av_error_to_string(ret));
- return 1;
+ _exit(1);
- //video_stream->duration = AV_TIME_BASE * 15;
- //audio_stream->duration = AV_TIME_BASE * 15;
- //av_format_context->duration = AV_TIME_BASE * 15;
if(replay_buffer_size_secs == -1) {
- int ret = avformat_write_header(av_format_context, nullptr);
+ AVDictionary *options = nullptr;
+ av_dict_set(&options, "strict", "experimental", 0);
+ //av_dict_set_int(&av_format_context->metadata, "video_full_range_flag", 1, 0);
+ int ret = avformat_write_header(av_format_context, &options);
if (ret < 0) {
fprintf(stderr, "Error occurred when writing header to output file: %s\n", av_error_to_string(ret));
- return 1;
+ _exit(1);
- }
- // av_frame_free(&rgb_frame);
- // avcodec_close(av_codec_context);
- if(src_window_id)
- XSelectInput(dpy, src_window_id, StructureNotifyMask | ExposureMask);
- /*
- int damage_event;
- int damage_error;
- if (!XDamageQueryExtension(dpy, &damage_event, &damage_error)) {
- fprintf(stderr, "Error: XDamage is not supported by your X11 server\n");
- return 1;
- }
- Damage damage = XDamageCreate(dpy, src_window_id, XDamageReportNonEmpty);
- XDamageSubtract(dpy, damage,None,None);
- */
- const double start_time_pts = clock_get_monotonic_seconds();
- CUcontext old_ctx;
- CUarray mapped_array;
- if(src_window_id) {
- res = cuda.cuCtxPopCurrent_v2(&old_ctx);
- res = cuda.cuCtxPushCurrent_v2(cu_ctx);
- // Get texture
- res = cuda.cuGraphicsResourceSetMapFlags(
- cuda_graphics_resource, CU_GRAPHICS_MAP_RESOURCE_FLAGS_READ_ONLY);
- res = cuda.cuGraphicsMapResources(1, &cuda_graphics_resource, 0);
- // Map texture to cuda array
- res = cuda.cuGraphicsSubResourceGetMappedArray(&mapped_array,
- cuda_graphics_resource, 0, 0);
+ av_dict_free(&options);
- // Release texture
- // res = cuGraphicsUnmapResources(1, &cuda_graphics_resource, 0);
- double start_time = clock_get_monotonic_seconds();
- double frame_timer_start = start_time;
- double window_resize_timer = start_time;
- bool window_resized = false;
+ double fps_start_time = clock_get_monotonic_seconds();
+ double frame_timer_start = fps_start_time - target_fps; // We want to capture the first frame immediately
int fps_counter = 0;
+ int damage_fps_counter = 0;
- AVFrame *frame = av_frame_alloc();
- if (!frame) {
- fprintf(stderr, "Error: Failed to allocate frame\n");
- exit(1);
- }
- frame->format = video_codec_context->pix_fmt;
- frame->width = video_codec_context->width;
- frame->height = video_codec_context->height;
- if(src_window_id) {
- if (av_hwframe_get_buffer(video_codec_context->hw_frames_ctx, frame, 0) < 0) {
- fprintf(stderr, "Error: av_hwframe_get_buffer failed\n");
- exit(1);
- }
- } else {
- frame->hw_frames_ctx = av_buffer_ref(video_codec_context->hw_frames_ctx);
- frame->buf[0] = av_buffer_pool_get(((AVHWFramesContext*)video_codec_context->hw_frames_ctx->data)->pool);
- frame->extended_data = frame->data;
- }
- frame->color_range = AVCOL_RANGE_JPEG;
- if(window_pixmap.texture_width < record_width)
- frame->width = window_pixmap.texture_width & ~1;
- else
- frame->width = record_width & ~1;
- if(window_pixmap.texture_height < record_height)
- frame->height = window_pixmap.texture_height & ~1;
- else
- frame->height = record_height & ~1;
+ bool paused = false;
+ double paused_time_offset = 0.0;
+ double paused_time_start = 0.0;
std::mutex write_output_mutex;
+ std::mutex audio_filter_mutex;
const double record_start_time = clock_get_monotonic_seconds();
- std::deque<AVPacket> frame_data_queue;
+ std::deque<std::shared_ptr<PacketData>> frame_data_queue;
bool frames_erased = false;
- const size_t audio_buffer_size = 1024 * 2 * 2; // 2 bytes/sample, 2 channels
+ const size_t audio_buffer_size = audio_max_frame_size * 4 * 2; // max 4 bytes/sample, 2 channels
uint8_t *empty_audio = (uint8_t*)malloc(audio_buffer_size);
if(!empty_audio) {
fprintf(stderr, "Error: failed to create empty audio\n");
- exit(1);
+ _exit(1);
memset(empty_audio, 0, audio_buffer_size);
for(AudioTrack &audio_track : audio_tracks) {
- audio_track.thread = std::thread([record_start_time, replay_buffer_size_secs, &frame_data_queue, &frames_erased, &audio_track, empty_audio](AVFormatContext *av_format_context, std::mutex *write_output_mutex) mutable {
- const int num_channels = audio_track.codec_context->channels;
- #else
- const int num_channels = audio_track.codec_context->ch_layout.nb_channels;
- #endif
- if(audio_track.audio_input.name.empty()) {
- audio_track.sound_device.handle = NULL;
- audio_track.sound_device.frames = 0;
- } else {
- if(sound_device_get_by_name(&audio_track.sound_device, audio_track.audio_input.name.c_str(), audio_track.audio_input.description.c_str(), num_channels, audio_track.codec_context->frame_size) != 0) {
- fprintf(stderr, "failed to get 'pulse' sound device\n");
- exit(1);
+ for(AudioDevice &audio_device : audio_track.audio_devices) {
+ audio_device.thread = std::thread([&]() mutable {
+ const AVSampleFormat sound_device_sample_format = audio_format_to_sample_format(audio_codec_context_get_audio_format(audio_track.codec_context));
+ // TODO: Always do conversion for now. This fixes an issue with stuttering audio on pulseaudio with opus + multiple audio sources merged
+ const bool needs_audio_conversion = true;//audio_track.codec_context->sample_fmt != sound_device_sample_format;
+ SwrContext *swr = nullptr;
+ if(needs_audio_conversion) {
+ swr = swr_alloc();
+ if(!swr) {
+ fprintf(stderr, "Failed to create SwrContext\n");
+ _exit(1);
+ }
+ av_opt_set_channel_layout(swr, "in_channel_layout", AV_CH_LAYOUT_STEREO, 0);
+ av_opt_set_channel_layout(swr, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
+ av_opt_set_chlayout(swr, "in_chlayout", &audio_track.codec_context->ch_layout, 0);
+ av_opt_set_chlayout(swr, "out_chlayout", &audio_track.codec_context->ch_layout, 0);
+ #else
+ av_opt_set_chlayout(swr, "in_channel_layout", &audio_track.codec_context->ch_layout, 0);
+ av_opt_set_chlayout(swr, "out_channel_layout", &audio_track.codec_context->ch_layout, 0);
+ #endif
+ av_opt_set_int(swr, "in_sample_rate", audio_track.codec_context->sample_rate, 0);
+ av_opt_set_int(swr, "out_sample_rate", audio_track.codec_context->sample_rate, 0);
+ av_opt_set_sample_fmt(swr, "in_sample_fmt", sound_device_sample_format, 0);
+ av_opt_set_sample_fmt(swr, "out_sample_fmt", audio_track.codec_context->sample_fmt, 0);
+ swr_init(swr);
- }
- SwrContext *swr = swr_alloc();
- if(!swr) {
- fprintf(stderr, "Failed to create SwrContext\n");
- exit(1);
- }
- av_opt_set_int(swr, "in_channel_layout", AV_CH_LAYOUT_STEREO, 0);
- av_opt_set_int(swr, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
- av_opt_set_int(swr, "in_sample_rate", audio_track.codec_context->sample_rate, 0);
- av_opt_set_int(swr, "out_sample_rate", audio_track.codec_context->sample_rate, 0);
- av_opt_set_sample_fmt(swr, "in_sample_fmt", AV_SAMPLE_FMT_S16, 0);
- av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_FLTP, 0);
- swr_init(swr);
- int64_t pts = 0;
- const double target_audio_hz = 1.0 / (double)audio_track.codec_context->sample_rate;
- double received_audio_time = clock_get_monotonic_seconds();
- const int64_t timeout_ms = std::round((1000.0 / (double)audio_track.codec_context->sample_rate) * 1000.0);
- while(running) {
- void *sound_buffer;
- int sound_buffer_size = -1;
- if(audio_track.sound_device.handle)
- sound_buffer_size = sound_device_read_next_chunk(&audio_track.sound_device, &sound_buffer);
- const bool got_audio_data = sound_buffer_size >= 0;
- const double this_audio_frame_time = clock_get_monotonic_seconds();
- if(got_audio_data)
- received_audio_time = this_audio_frame_time;
- int ret = av_frame_make_writable(audio_track.frame);
- if (ret < 0) {
- fprintf(stderr, "Failed to make audio frame writable\n");
- break;
- }
+ const double audio_fps = (double)audio_track.codec_context->sample_rate / (double)audio_track.codec_context->frame_size;
+ const int64_t timeout_ms = std::round(1000.0 / audio_fps);
+ const double timeout_sec = 1000.0 / audio_fps / 1000.0;
+ bool first_frame = true;
+ int64_t num_received_frames = 0;
+ while(running) {
+ void *sound_buffer;
+ int sound_buffer_size = -1;
+ //const double time_before_read_seconds = clock_get_monotonic_seconds();
+ if(audio_device.sound_device.handle) {
+ // TODO: use this instead of calculating time to read. But this can fluctuate and we don't want to go back in time,
+ // also it's 0.0 for some users???
+ double latency_seconds = 0.0;
+ sound_buffer_size = sound_device_read_next_chunk(&audio_device.sound_device, &sound_buffer, timeout_sec * 2.0, &latency_seconds);
+ }
- int64_t num_missing_frames = std::round((this_audio_frame_time - received_audio_time) / target_audio_hz / (int64_t)audio_track.frame->nb_samples);
- if(got_audio_data)
- num_missing_frames = std::max((int64_t)0, num_missing_frames - 1);
- if(!audio_track.sound_device.handle)
- num_missing_frames = std::max((int64_t)1, num_missing_frames);
- if(num_missing_frames >= 5 || !audio_track.sound_device.handle) {
- // TODO:
- //audio_track.frame->data[0] = empty_audio;
- received_audio_time = this_audio_frame_time;
- swr_convert(swr, &audio_track.frame->data[0], audio_track.frame->nb_samples, (const uint8_t**)&empty_audio, audio_track.codec_context->frame_size);
- // TODO: Check if duplicate frame can be saved just by writing it with a different pts instead of sending it again
- for(int i = 0; i < num_missing_frames; ++i) {
- audio_track.frame->pts = pts;
- pts += audio_track.frame->nb_samples;
- ret = avcodec_send_frame(audio_track.codec_context, audio_track.frame);
- if(ret >= 0){
- receive_frames(audio_track.codec_context, audio_track.stream_index, audio_track.stream, audio_track.frame, av_format_context, record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, *write_output_mutex);
- } else {
- fprintf(stderr, "Failed to encode audio!\n");
- }
+ const bool got_audio_data = sound_buffer_size >= 0;
+ //fprintf(stderr, "got audio data: %s\n", got_audio_data ? "yes" : "no");
+ //const double time_after_read_seconds = clock_get_monotonic_seconds();
+ //const double time_to_read_seconds = time_after_read_seconds - time_before_read_seconds;
+ //fprintf(stderr, "time to read: %f, %s, %f\n", time_to_read_seconds, got_audio_data ? "yes" : "no", timeout_sec);
+ const double this_audio_frame_time = clock_get_monotonic_seconds() - paused_time_offset;
+ if(paused) {
+ if(!audio_device.sound_device.handle)
+ usleep(timeout_ms * 1000);
+ continue;
- }
- if(!audio_track.sound_device.handle)
- usleep(timeout_ms * 1000);
+ int ret = av_frame_make_writable(audio_device.frame);
+ if (ret < 0) {
+ fprintf(stderr, "Failed to make audio frame writable\n");
+ break;
+ }
- if(got_audio_data) {
- // TODO: Instead of converting audio, get float audio from alsa. Or does alsa do conversion internally to get this format?
- swr_convert(swr, &audio_track.frame->data[0], audio_track.frame->nb_samples, (const uint8_t**)&sound_buffer, audio_track.codec_context->frame_size);
+ // TODO: Is this |received_audio_time| really correct?
+ const int64_t num_expected_frames = std::round((this_audio_frame_time - record_start_time) / timeout_sec);
+ int64_t num_missing_frames = std::max((int64_t)0LL, num_expected_frames - num_received_frames);
+ if(got_audio_data)
+ num_missing_frames = std::max((int64_t)0LL, num_missing_frames - 1);
+ if(!audio_device.sound_device.handle)
+ num_missing_frames = std::max((int64_t)1, num_missing_frames);
+ // Is there a better way to do this? The goal is simply to keep video and audio in sync.
+ // This workaround is needed because we want to produce constant frame rate videos instead of variable frame rate
+ // videos, because some software such as video editing software and VLC does not support variable frame rate videos,
+ // despite nvidia shadowplay and xbox game bar producing variable frame rate videos.
+ // So we have to make sure we produce audio frames at the same relative rate as the video.
+ if((num_missing_frames >= 1 && got_audio_data) || num_missing_frames >= 5 || !audio_device.sound_device.handle) {
+ // TODO:
+ //audio_track.frame->data[0] = empty_audio;
+ if(first_frame || num_missing_frames >= 5) {
+ if(needs_audio_conversion)
+ swr_convert(swr, &audio_device.frame->data[0], audio_track.codec_context->frame_size, (const uint8_t**)&empty_audio, audio_track.codec_context->frame_size);
+ else
+ audio_device.frame->data[0] = empty_audio;
+ }
+ first_frame = false;
+ // TODO: Check if duplicate frame can be saved just by writing it with a different pts instead of sending it again
+ std::lock_guard<std::mutex> lock(audio_filter_mutex);
+ for(int i = 0; i < num_missing_frames; ++i) {
+ if(audio_track.graph) {
+ // TODO: av_buffersrc_add_frame
+ if(av_buffersrc_write_frame(audio_device.src_filter_ctx, audio_device.frame) < 0) {
+ fprintf(stderr, "Error: failed to add audio frame to filter\n");
+ }
+ } else {
+ ret = avcodec_send_frame(audio_track.codec_context, audio_device.frame);
+ if(ret >= 0) {
+ // TODO: Move to separate thread because this could write to network (for example when livestreaming)
+ receive_frames(audio_track.codec_context, audio_track.stream_index, audio_track.stream, audio_device.frame->pts, av_format_context, record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, write_output_mutex, paused_time_offset);
+ } else {
+ fprintf(stderr, "Failed to encode audio!\n");
+ }
+ }
+ audio_device.frame->pts += audio_track.codec_context->frame_size;
+ num_received_frames++;
+ }
+ }
- audio_track.frame->pts = pts;
- pts += audio_track.frame->nb_samples;
+ if(!audio_device.sound_device.handle)
+ usleep(timeout_ms * 1000);
+ if(got_audio_data) {
+ // TODO: Instead of converting audio, get float audio from alsa. Or does alsa do conversion internally to get this format?
+ if(needs_audio_conversion)
+ swr_convert(swr, &audio_device.frame->data[0], audio_track.codec_context->frame_size, (const uint8_t**)&sound_buffer, audio_track.codec_context->frame_size);
+ else
+ audio_device.frame->data[0] = (uint8_t*)sound_buffer;
+ first_frame = false;
+ if(audio_track.graph) {
+ std::lock_guard<std::mutex> lock(audio_filter_mutex);
+ // TODO: av_buffersrc_add_frame
+ if(av_buffersrc_write_frame(audio_device.src_filter_ctx, audio_device.frame) < 0) {
+ fprintf(stderr, "Error: failed to add audio frame to filter\n");
+ }
+ } else {
+ ret = avcodec_send_frame(audio_track.codec_context, audio_device.frame);
+ if(ret >= 0) {
+ // TODO: Move to separate thread because this could write to network (for example when livestreaming)
+ receive_frames(audio_track.codec_context, audio_track.stream_index, audio_track.stream, audio_device.frame->pts, av_format_context, record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, write_output_mutex, paused_time_offset);
+ } else {
+ fprintf(stderr, "Failed to encode audio!\n");
+ }
+ }
- ret = avcodec_send_frame(audio_track.codec_context, audio_track.frame);
- if(ret >= 0){
- receive_frames(audio_track.codec_context, audio_track.stream_index, audio_track.stream, audio_track.frame, av_format_context, record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, *write_output_mutex);
- } else {
- fprintf(stderr, "Failed to encode audio!\n");
+ audio_device.frame->pts += audio_track.codec_context->frame_size;
+ num_received_frames++;
- }
- sound_device_close(&audio_track.sound_device);
- swr_free(&swr);
- }, av_format_context, &write_output_mutex);
+ if(swr)
+ swr_free(&swr);
+ });
+ }
// Set update_fps to 24 to test if duplicate/delayed frames cause video/audio desync or too fast/slow video.
const double update_fps = fps + 190;
+ bool should_stop_error = false;
+ AVFrame *aframe = av_frame_alloc();
int64_t video_pts_counter = 0;
+ int64_t video_prev_pts = 0;
- XEvent e;
- while (running) {
+ while(running) {
double frame_start = clock_get_monotonic_seconds();
- if(window)
- gl.glClear(GL_COLOR_BUFFER_BIT);
- if(src_window_id) {
- if (XCheckTypedWindowEvent(dpy, src_window_id, DestroyNotify, &e)) {
- running = 0;
- }
- if (XCheckTypedWindowEvent(dpy, src_window_id, Expose, &e) && e.xexpose.count == 0) {
- window_resize_timer = clock_get_monotonic_seconds();
- window_resized = true;
- }
+ gsr_capture_tick(capture, video_codec_context);
+ should_stop_error = false;
+ if(gsr_capture_should_stop(capture, &should_stop_error)) {
+ running = 0;
+ break;
+ }
- if (XCheckTypedWindowEvent(dpy, src_window_id, ConfigureNotify, &e) && e.xconfigure.window == src_window_id) {
- while(XCheckTypedWindowEvent(dpy, src_window_id, ConfigureNotify, &e)) {}
- window_x = e.xconfigure.x;
- window_y = e.xconfigure.y;
- Window c;
- XTranslateCoordinates(dpy, src_window_id, DefaultRootWindow(dpy), 0, 0, &window_x, &window_y, &c);
- // Window resize
- if(e.xconfigure.width != (int)window_width || e.xconfigure.height != (int)window_height) {
- window_width = std::max(0, e.xconfigure.width);
- window_height = std::max(0, e.xconfigure.height);
- window_resize_timer = clock_get_monotonic_seconds();
- window_resized = true;
+ // TODO: Move to another thread, since this shouldn't be locked to video encoding fps
+ {
+ std::lock_guard<std::mutex> lock(audio_filter_mutex);
+ for(AudioTrack &audio_track : audio_tracks) {
+ if(!audio_track.sink)
+ continue;
+ int err = 0;
+ while ((err = av_buffersink_get_frame(audio_track.sink, aframe)) >= 0) {
+ aframe->pts = audio_track.pts;
+ err = avcodec_send_frame(audio_track.codec_context, aframe);
+ if(err >= 0){
+ // TODO: Move to separate thread because this could write to network (for example when livestreaming)
+ receive_frames(audio_track.codec_context, audio_track.stream_index, audio_track.stream, aframe->pts, av_format_context, record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, write_output_mutex, paused_time_offset);
+ } else {
+ fprintf(stderr, "Failed to encode audio!\n");
+ }
+ av_frame_unref(aframe);
+ audio_track.pts += audio_track.codec_context->frame_size;
+ }
- const double window_resize_timeout = 1.0; // 1 second
- if(window_resized && clock_get_monotonic_seconds() - window_resize_timer >= window_resize_timeout) {
- window_resized = false;
- fprintf(stderr, "Resize window!\n");
- recreate_window_pixmap(dpy, src_window_id, window_pixmap);
- // Resolution must be a multiple of two
- //video_stream->codec->width = window_pixmap.texture_width & ~1;
- //video_stream->codec->height = window_pixmap.texture_height & ~1;
- cuda.cuGraphicsUnregisterResource(cuda_graphics_resource);
- res = cuda.cuGraphicsGLRegisterImage(
- &cuda_graphics_resource, window_pixmap.target_texture_id, GL_TEXTURE_2D,
- if (res != CUDA_SUCCESS) {
- const char *err_str;
- cuda.cuGetErrorString(res, &err_str);
- fprintf(stderr,
- "Error: cuda.cuGraphicsGLRegisterImage failed, error %s, texture "
- "id: %u\n",
- err_str, window_pixmap.target_texture_id);
- running = false;
- break;
- }
- res = cuda.cuGraphicsResourceSetMapFlags(
- cuda_graphics_resource, CU_GRAPHICS_MAP_RESOURCE_FLAGS_READ_ONLY);
- res = cuda.cuGraphicsMapResources(1, &cuda_graphics_resource, 0);
- res = cuda.cuGraphicsSubResourceGetMappedArray(&mapped_array, cuda_graphics_resource, 0, 0);
- av_frame_free(&frame);
- frame = av_frame_alloc();
- if (!frame) {
- fprintf(stderr, "Error: Failed to allocate frame\n");
- running = false;
- break;
- }
- frame->format = video_codec_context->pix_fmt;
- frame->width = video_codec_context->width;
- frame->height = video_codec_context->height;
- if (av_hwframe_get_buffer(video_codec_context->hw_frames_ctx, frame, 0) < 0) {
- fprintf(stderr, "Error: av_hwframe_get_buffer failed\n");
- running = false;
- break;
- }
- if(window_pixmap.texture_width < record_width)
- frame->width = window_pixmap.texture_width & ~1;
- else
- frame->width = record_width & ~1;
- if(window_pixmap.texture_height < record_height)
- frame->height = window_pixmap.texture_height & ~1;
- else
- frame->height = record_height & ~1;
- // Make the new completely black to clear unused parts
- // TODO: cuMemsetD32?
- cuda.cuMemsetD8_v2((CUdeviceptr)frame->data[0], 0, record_width * record_height * 4);
- }
+ const bool damaged = !capture->is_damaged || capture->is_damaged(capture);
+ if(damaged) {
+ ++damage_fps_counter;
double time_now = clock_get_monotonic_seconds();
double frame_timer_elapsed = time_now - frame_timer_start;
- double elapsed = time_now - start_time;
+ double elapsed = time_now - fps_start_time;
if (elapsed >= 1.0) {
- fprintf(stderr, "update fps: %d\n", fps_counter);
- start_time = time_now;
+ if(verbose) {
+ fprintf(stderr, "update fps: %d, damage fps: %d\n", fps_counter, damage_fps_counter);
+ }
+ fps_start_time = time_now;
fps_counter = 0;
+ damage_fps_counter = 0;
double frame_time_overflow = frame_timer_elapsed - target_fps;
- if (frame_time_overflow >= 0.0) {
+ if (frame_time_overflow >= 0.0 && damaged) {
+ if(capture->clear_damage)
+ capture->clear_damage(capture);
+ frame_time_overflow = std::min(frame_time_overflow, target_fps);
frame_timer_start = time_now - frame_time_overflow;
- if(src_window_id) {
- // TODO: Use a framebuffer instead. glCopyImageSubData requires
- // opengl 4.2
- int source_x = 0;
- int source_y = 0;
- int source_width = window_pixmap.texture_width;
- int source_height = window_pixmap.texture_height;
- bool clamped = false;
+ const double this_video_frame_time = clock_get_monotonic_seconds() - paused_time_offset;
+ const int64_t expected_frames = std::round((this_video_frame_time - record_start_time) / target_fps);
+ const int num_frames = framerate_mode == FramerateMode::CONSTANT ? std::max((int64_t)0LL, expected_frames - video_pts_counter) : 1;
- if(window_pixmap.composite_window) {
- source_x = window_x;
- source_y = window_y;
- int underflow_x = 0;
- int underflow_y = 0;
- if(source_x < 0) {
- underflow_x = -source_x;
- source_x = 0;
- source_width += source_x;
- }
+ if(num_frames > 0 && !paused) {
+ gsr_capture_capture(capture, video_frame);
- if(source_y < 0) {
- underflow_y = -source_y;
- source_y = 0;
- source_height += source_y;
- }
- const int clamped_source_width = std::max(0, window_pixmap.texture_real_width - source_x - underflow_x);
- const int clamped_source_height = std::max(0, window_pixmap.texture_real_height - source_y - underflow_y);
- if(clamped_source_width < source_width) {
- source_width = clamped_source_width;
- clamped = true;
- }
- if(clamped_source_height < source_height) {
- source_height = clamped_source_height;
- clamped = true;
+ // TODO: Check if duplicate frame can be saved just by writing it with a different pts instead of sending it again
+ for(int i = 0; i < num_frames; ++i) {
+ if(framerate_mode == FramerateMode::CONSTANT) {
+ video_frame->pts = video_pts_counter + i;
+ } else {
+ video_frame->pts = (this_video_frame_time - record_start_time) * (double)AV_TIME_BASE;
+ const bool same_pts = video_frame->pts == video_prev_pts;
+ video_prev_pts = video_frame->pts;
+ if(same_pts)
+ continue;
- }
- if(clamped) {
- // Requires opengl 4.4... TODO: Replace with earlier opengl if opengl < 4.2
- if(gl.glClearTexImage)
- gl.glClearTexImage(window_pixmap.target_texture_id, 0, GL_RGB, GL_UNSIGNED_BYTE, nullptr);
- }
- // Requires opengl 4.2... TODO: Replace with earlier opengl if opengl < 4.2
- gl.glCopyImageSubData(
- window_pixmap.texture_id, GL_TEXTURE_2D, 0, source_x, source_y, 0,
- window_pixmap.target_texture_id, GL_TEXTURE_2D, 0, 0, 0, 0,
- source_width, source_height, 1);
- unsigned int err = gl.glGetError();
- if(err != 0) {
- static bool error_shown = false;
- if(!error_shown) {
- error_shown = true;
- fprintf(stderr, "Error: glCopyImageSubData failed, gl error: %d\n", err);
+ int ret = avcodec_send_frame(video_codec_context, video_frame);
+ if(ret == 0) {
+ // TODO: Move to separate thread because this could write to network (for example when livestreaming)
+ receive_frames(video_codec_context, VIDEO_STREAM_INDEX, video_stream, video_frame->pts, av_format_context,
+ record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, write_output_mutex, paused_time_offset);
+ } else {
+ fprintf(stderr, "Error: avcodec_send_frame failed, error: %s\n", av_error_to_string(ret));
- gl.glXSwapBuffers(dpy, window);
- // int err = gl.glGetError();
- // fprintf(stderr, "error: %d\n", err);
- // TODO: Remove this copy, which is only possible by using nvenc directly and encoding window_pixmap.target_texture_id
- frame->linesize[0] = frame->width * 4;
- CUDA_MEMCPY2D memcpy_struct;
- memcpy_struct.srcXInBytes = 0;
- memcpy_struct.srcY = 0;
- memcpy_struct.srcMemoryType = CUmemorytype::CU_MEMORYTYPE_ARRAY;
- memcpy_struct.dstXInBytes = 0;
- memcpy_struct.dstY = 0;
- memcpy_struct.dstMemoryType = CUmemorytype::CU_MEMORYTYPE_DEVICE;
+ gsr_capture_end(capture, video_frame);
+ video_pts_counter += num_frames;
+ }
+ }
- memcpy_struct.srcArray = mapped_array;
- memcpy_struct.dstDevice = (CUdeviceptr)frame->data[0];
- memcpy_struct.dstPitch = frame->linesize[0];
- memcpy_struct.WidthInBytes = frame->width * 4;
- memcpy_struct.Height = frame->height;
- cuda.cuMemcpy2D_v2(&memcpy_struct);
+ if(toggle_pause == 1) {
+ const bool new_paused_state = !paused;
+ if(new_paused_state) {
+ paused_time_start = clock_get_monotonic_seconds();
+ fprintf(stderr, "Paused\n");
} else {
- // TODO: Check when src_cu_device_ptr changes and re-register resource
- frame->linesize[0] = frame->width * 4;
- uint32_t byte_size = 0;
- CUdeviceptr src_cu_device_ptr = 0;
- nv_fbc_library.capture(&src_cu_device_ptr, &byte_size);
- frame->data[0] = (uint8_t*)src_cu_device_ptr;
+ paused_time_offset += (clock_get_monotonic_seconds() - paused_time_start);
+ fprintf(stderr, "Unpaused\n");
- // res = cuda.cuCtxPopCurrent_v2(&old_ctx);
- const double this_video_frame_time = clock_get_monotonic_seconds();
- const int64_t expected_frames = std::round((this_video_frame_time - start_time_pts) / target_fps);
- const int num_frames = std::max(0L, expected_frames - video_pts_counter);
- frame->flags &= ~AV_FRAME_FLAG_DISCARD;
- // TODO: Check if duplicate frame can be saved just by writing it with a different pts instead of sending it again
- for(int i = 0; i < num_frames; ++i) {
- if(i > 0)
- frame->flags |= AV_FRAME_FLAG_DISCARD;
- frame->pts = video_pts_counter + i;
- if (avcodec_send_frame(video_codec_context, frame) >= 0) {
- receive_frames(video_codec_context, VIDEO_STREAM_INDEX, video_stream, frame, av_format_context,
- record_start_time, frame_data_queue, replay_buffer_size_secs, frames_erased, write_output_mutex);
- } else {
- fprintf(stderr, "Error: avcodec_send_frame failed\n");
- }
- }
- video_pts_counter += num_frames;
+ toggle_pause = 0;
+ paused = !paused;
if(save_replay_thread.valid() && save_replay_thread.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
+ fflush(stdout);
+ if(recording_saved_script)
+ run_recording_saved_script_async(recording_saved_script, save_replay_output_filepath.c_str(), "replay");
+ std::lock_guard<std::mutex> lock(write_output_mutex);
if(save_replay == 1 && !save_replay_thread.valid() && replay_buffer_size_secs != -1) {
save_replay = 0;
- save_replay_async(video_codec_context, VIDEO_STREAM_INDEX, audio_tracks, frame_data_queue, frames_erased, filename, container_format, file_extension, write_output_mutex);
+ save_replay_async(video_codec_context, VIDEO_STREAM_INDEX, audio_tracks, frame_data_queue, frames_erased, filename, container_format, file_extension, write_output_mutex, make_folders);
- // av_frame_free(&frame);
double frame_end = clock_get_monotonic_seconds();
double frame_sleep_fps = 1.0 / update_fps;
double sleep_time = frame_sleep_fps - (frame_end - frame_start);
@@ -2068,17 +2774,27 @@ int main(int argc, char **argv) {
usleep(sleep_time * 1000.0 * 1000.0);
- running = 0;
+ running = 0;
if(save_replay_thread.valid()) {
+ fflush(stdout);
+ if(recording_saved_script)
+ run_recording_saved_script_async(recording_saved_script, save_replay_output_filepath.c_str(), "replay");
+ std::lock_guard<std::mutex> lock(write_output_mutex);
+ save_replay_packets.clear();
for(AudioTrack &audio_track : audio_tracks) {
- audio_track.thread.join();
+ for(AudioDevice &audio_device : audio_track.audio_devices) {
+ audio_device.thread.join();
+ sound_device_close(&audio_device.sound_device);
+ }
+ av_frame_free(&aframe);
if (replay_buffer_size_secs == -1 && av_write_trailer(av_format_context) != 0) {
fprintf(stderr, "Failed to write trailer\n");
@@ -2086,8 +2802,24 @@ int main(int argc, char **argv) {
if(replay_buffer_size_secs == -1 && !(output_format->flags & AVFMT_NOFILE))
- if(dpy)
- XCloseDisplay(dpy);
+ gsr_capture_destroy(capture, video_codec_context);
+ if(replay_buffer_size_secs == -1 && recording_saved_script)
+ run_recording_saved_script_async(recording_saved_script, filename, "regular");
+ if(dpy) {
+ // TODO: This causes a crash, why? Maybe some other library calls dlclose on xlib and that also unloads it for us?
+ //XCloseDisplay(dpy);
+ }
+ //av_frame_free(&video_frame);
+ free((void*)window_str);
+ // We call _exit here because CUDA registers an atexit handler that does _something_ that causes the program to freeze,
+ // but only on some nvidia driver versions on some gpus (RTX?), and _exit exits the program without calling
+ // the functions registered with atexit.
+ // CUDA (the cuvid library in this case) seems to be waiting for a thread that never finishes execution.
+ // Maybe this happens because we don't clean up all ffmpeg resources?
+ // TODO: Investigate this.
+ _exit(should_stop_error ? 3 : 0);
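Review note on the pause handling in the hunk above: the patch records when a pause begins and, on unpause, adds the paused duration to `paused_time_offset`, which is then passed to `receive_frames`. A minimal sketch of that bookkeeping follows. The `pause_clock` type and both function names are hypothetical (the patch keeps these values as local variables), and subtracting the offset from elapsed time is an assumption about how the offset is consumed, since `receive_frames` is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type; the patch keeps these as local variables in main(). */
typedef struct {
    bool paused;
    double paused_time_start;  /* monotonic time at which the current pause began */
    double paused_time_offset; /* total seconds spent paused so far */
} pause_clock;

/* Flip the paused state at monotonic time |now| (in seconds). */
static void pause_clock_toggle(pause_clock *self, double now) {
    assert(self);
    if(!self->paused)
        self->paused_time_start = now;                             /* entering pause */
    else
        self->paused_time_offset += now - self->paused_time_start; /* leaving pause */
    self->paused = !self->paused;
}

/* Seconds recorded at |now|, excluding paused intervals (assumed use of the offset). */
static double pause_clock_recorded_seconds(const pause_clock *self, double record_start_time, double now) {
    assert(self);
    double paused_total = self->paused_time_offset;
    if(self->paused)
        paused_total += now - self->paused_time_start; /* count the pause still in progress */
    return (now - record_start_time) - paused_total;
}
```

Keeping the running total separate from the start of the current pause means repeated pause/unpause cycles accumulate without any history being stored.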
diff --git a/src/overclock.c b/src/overclock.c
new file mode 100644
index 0000000..2cba623
--- /dev/null
+++ b/src/overclock.c
@@ -0,0 +1,281 @@
+#include "../include/overclock.h"
+#include <X11/Xlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+// HACK!!!: When a program uses cuda (including nvenc) then the nvidia driver drops to performance level 2 (memory transfer rate is dropped and possibly graphics clock).
+// Nvidia does this because in some very extreme cases of cuda there can be memory corruption when running at max memory transfer rate.
+// So to get around this we overclock memory transfer rate (maybe this should also be done for graphics clock?) to the best performance level while GPU Screen Recorder is running.
+// TODO: Does it always drop to performance level 2?
+static int min_int(int a, int b) {
+ return a < b ? a : b;
+// Fields are 0 if not set
+typedef struct {
+ int perf;
+ int nv_clock;
+ int nv_clock_min;
+ int nv_clock_max;
+ int mem_clock;
+ int mem_clock_min;
+ int mem_clock_max;
+ int mem_transfer_rate;
+ int mem_transfer_rate_min;
+ int mem_transfer_rate_max;
+} NVCTRLPerformanceLevel;
+typedef struct {
+ NVCTRLPerformanceLevel performance_level[MAX_PERFORMANCE_LEVELS];
+ int num_performance_levels;
+} NVCTRLPerformanceLevelQuery;
+typedef void (*split_callback)(const char *str, size_t size, void *userdata);
+static void split_by_delimiter(const char *str, size_t size, char delimiter, split_callback callback, void *userdata) {
+ const char *it = str;
+ while(it < str + size) {
+ const char *prev_it = it;
+ it = memchr(it, delimiter, (str + size) - it);
+ if(!it)
+ it = str + size;
+ callback(prev_it, it - prev_it, userdata);
+ it += 1; // skip delimiter
+ }
+typedef enum {
+} NvCTRLAttributeType;
+static unsigned int attribute_type_to_attribute_param(NvCTRLAttributeType attribute_type) {
+ switch(attribute_type) {
+ }
+ return 0;
+static unsigned int attribute_type_to_attribute_param_all_levels(NvCTRLAttributeType attribute_type) {
+ switch(attribute_type) {
+ }
+ return 0;
+// Returns 0 on error
+static int xnvctrl_get_attribute_max_value(gsr_xnvctrl *xnvctrl, int num_performance_levels, NvCTRLAttributeType attribute_type) {
+ NVCTRLAttributeValidValuesRec valid;
+ if(xnvctrl->XNVCTRLQueryValidTargetAttributeValues(xnvctrl->display, NV_CTRL_TARGET_TYPE_GPU, 0, 0, attribute_type_to_attribute_param_all_levels(attribute_type), &valid)) {
+ return valid.u.range.max;
+ }
+ if(num_performance_levels > 0 && xnvctrl->XNVCTRLQueryValidTargetAttributeValues(xnvctrl->display, NV_CTRL_TARGET_TYPE_GPU, 0, num_performance_levels - 1, attribute_type_to_attribute_param(attribute_type), &valid)) {
+ return valid.u.range.max;
+ }
+ return 0;
+static bool xnvctrl_set_attribute_offset(gsr_xnvctrl *xnvctrl, int num_performance_levels, int offset, NvCTRLAttributeType attribute_type) {
+ bool success = false;
+ // NV_CTRL_GPU_MEM_TRANSFER_RATE_OFFSET_ALL_PERFORMANCE_LEVELS works (or at least used to?) without Xorg running as root
+ // so we try that first. NV_CTRL_GPU_MEM_TRANSFER_RATE_OFFSET_ALL_PERFORMANCE_LEVELS also only works with GTX 1000+.
+ // TODO: Reverse engineer NVIDIA Xorg driver so we can set this always without root access.
+ if(xnvctrl->XNVCTRLSetTargetAttributeAndGetStatus(xnvctrl->display, NV_CTRL_TARGET_TYPE_GPU, 0, 0, attribute_type_to_attribute_param_all_levels(attribute_type), offset))
+ success = true;
+ for(int i = 0; i < num_performance_levels; ++i) {
+ success |= xnvctrl->XNVCTRLSetTargetAttributeAndGetStatus(xnvctrl->display, NV_CTRL_TARGET_TYPE_GPU, 0, i, attribute_type_to_attribute_param(attribute_type), offset);
+ }
+ return success;
+static void strip(const char **str, int *size) {
+ const char *str_d = *str;
+ int s_d = *size;
+ const char *start = str_d;
+ const char *end = start + s_d;
+ while(str_d < end) {
+ char c = *str_d;
+ if(c != ' ' && c != '\t' && c != '\n')
+ break;
+ ++str_d;
+ }
+ int start_offset = str_d - start;
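+ while(s_d > start_offset) {
+ char c = start[s_d - 1]; // last character that is still inside the range
+ if(c != ' ' && c != '\t' && c != '\n')
+ break;
+ --s_d;
+ }
+ *str = str_d;
+ *size = s_d - start_offset; // length of the stripped range, not its end offset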
+static void attribute_callback(const char *str, size_t size, void *userdata) {
+ if(size > 255 - 1)
+ return;
+ int size_i = size;
+ strip(&str, &size_i);
+ char attribute[255];
+ memcpy(attribute, str, size_i);
+ attribute[size_i] = '\0';
+ const char *sep = strchr(attribute, '=');
+ if(!sep)
+ return;
+ const char *attribute_name = attribute;
+ size_t attribute_name_len = sep - attribute_name;
+ const char *attribute_value_str = sep + 1;
+ int attribute_value = 0;
+ if(sscanf(attribute_value_str, "%d", &attribute_value) != 1)
+ return;
+ NVCTRLPerformanceLevel *performance_level = userdata;
+ if(attribute_name_len == 4 && memcmp(attribute_name, "perf", 4) == 0)
+ performance_level->perf = attribute_value;
+ else if(attribute_name_len == 7 && memcmp(attribute_name, "nvclock", 7) == 0)
+ performance_level->nv_clock = attribute_value;
+ else if(attribute_name_len == 10 && memcmp(attribute_name, "nvclockmin", 10) == 0)
+ performance_level->nv_clock_min = attribute_value;
+ else if(attribute_name_len == 10 && memcmp(attribute_name, "nvclockmax", 10) == 0)
+ performance_level->nv_clock_max = attribute_value;
+ else if(attribute_name_len == 8 && memcmp(attribute_name, "memclock", 8) == 0)
+ performance_level->mem_clock = attribute_value;
+ else if(attribute_name_len == 11 && memcmp(attribute_name, "memclockmin", 11) == 0)
+ performance_level->mem_clock_min = attribute_value;
+ else if(attribute_name_len == 11 && memcmp(attribute_name, "memclockmax", 11) == 0)
+ performance_level->mem_clock_max = attribute_value;
+ else if(attribute_name_len == 15 && memcmp(attribute_name, "memTransferRate", 15) == 0)
+ performance_level->mem_transfer_rate = attribute_value;
+ else if(attribute_name_len == 18 && memcmp(attribute_name, "memTransferRatemin", 18) == 0)
+ performance_level->mem_transfer_rate_min = attribute_value;
+ else if(attribute_name_len == 18 && memcmp(attribute_name, "memTransferRatemax", 18) == 0)
+ performance_level->mem_transfer_rate_max = attribute_value;
+static void attribute_line_callback(const char *str, size_t size, void *userdata) {
+ NVCTRLPerformanceLevelQuery *query = userdata;
+ if(query->num_performance_levels >= MAX_PERFORMANCE_LEVELS)
+ return;
+ NVCTRLPerformanceLevel *current_performance_level = &query->performance_level[query->num_performance_levels];
+ memset(current_performance_level, 0, sizeof(NVCTRLPerformanceLevel));
+ ++query->num_performance_levels;
+ split_by_delimiter(str, size, ',', attribute_callback, current_performance_level);
+static bool xnvctrl_get_performance_levels(gsr_xnvctrl *xnvctrl, NVCTRLPerformanceLevelQuery *query) {
+ bool success = false;
+ memset(query, 0, sizeof(NVCTRLPerformanceLevelQuery));
+ char *attributes = NULL;
+ if(!xnvctrl->XNVCTRLQueryTargetStringAttribute(xnvctrl->display, NV_CTRL_TARGET_TYPE_GPU, 0, 0, NV_CTRL_STRING_PERFORMANCE_MODES, &attributes)) {
+ success = false;
+ goto done;
+ }
+ split_by_delimiter(attributes, strlen(attributes), ';', attribute_line_callback, query);
+ success = true;
+ done:
+ if(attributes)
+ XFree(attributes);
+ return success;
+static int compare_mem_transfer_rate_max_asc(const void *a, const void *b) {
+ const NVCTRLPerformanceLevel *perf_a = a;
+ const NVCTRLPerformanceLevel *perf_b = b;
+ return perf_a->mem_transfer_rate_max - perf_b->mem_transfer_rate_max;
+bool gsr_overclock_load(gsr_overclock *self, Display *display) {
+ memset(self, 0, sizeof(gsr_overclock));
+ self->num_performance_levels = 0;
+ return gsr_xnvctrl_load(&self->xnvctrl, display);
+void gsr_overclock_unload(gsr_overclock *self) {
+ gsr_xnvctrl_unload(&self->xnvctrl);
+bool gsr_overclock_start(gsr_overclock *self) {
+ int basep = 0;
+ int errorp = 0;
+ if(!self->xnvctrl.XNVCTRLQueryExtension(self->xnvctrl.display, &basep, &errorp)) {
+ fprintf(stderr, "gsr warning: gsr_overclock_start: xnvctrl is not supported on your system, failed to overclock memory transfer rate\n");
+ return false;
+ }
+ NVCTRLPerformanceLevelQuery query;
+ if(!xnvctrl_get_performance_levels(&self->xnvctrl, &query) || query.num_performance_levels == 0) {
+ fprintf(stderr, "gsr warning: gsr_overclock_start: failed to get performance levels for overclocking\n");
+ return false;
+ }
+ self->num_performance_levels = query.num_performance_levels;
+ qsort(query.performance_level, query.num_performance_levels, sizeof(NVCTRLPerformanceLevel), compare_mem_transfer_rate_max_asc);
+ int target_transfer_rate_offset = xnvctrl_get_attribute_max_value(&self->xnvctrl, query.num_performance_levels, NVCTRL_ATTRIB_GPU_MEM_TRANSFER_RATE);
+ if(query.num_performance_levels > 1) {
+ const int transfer_rate_max_diff = query.performance_level[query.num_performance_levels - 1].mem_transfer_rate_max - query.performance_level[query.num_performance_levels - 2].mem_transfer_rate_max;
+ target_transfer_rate_offset = min_int(target_transfer_rate_offset, transfer_rate_max_diff);
+ if(target_transfer_rate_offset >= 0 && xnvctrl_set_attribute_offset(&self->xnvctrl, self->num_performance_levels, target_transfer_rate_offset, NVCTRL_ATTRIB_GPU_MEM_TRANSFER_RATE)) {
+ fprintf(stderr, "gsr info: gsr_overclock_start: successfully set memory transfer rate offset to %d\n", target_transfer_rate_offset);
+ } else {
+ fprintf(stderr, "gsr info: gsr_overclock_start: failed to overclock memory transfer rate offset to %d\n", target_transfer_rate_offset);
+ }
+ }
+ // TODO: Sort by nv_clock_max
+ // TODO: Enable. Crashes on my system (gtx 1080) so it's disabled for now. Seems to crash even if graphics clock is increased by 1, let alone 1200
+ /*
+ int target_nv_clock_offset = xnvctrl_get_attribute_max_value(&self->xnvctrl, query.num_performance_levels, NVCTRL_GPU_NVCLOCK);
+ if(query.num_performance_levels > 1) {
+ const int nv_clock_max_diff = query.performance_level[query.num_performance_levels - 1].nv_clock_max - query.performance_level[query.num_performance_levels - 2].nv_clock_max;
+ target_nv_clock_offset = min_int(target_nv_clock_offset, nv_clock_max_diff);
+ if(target_nv_clock_offset >= 0 && xnvctrl_set_attribute_offset(&self->xnvctrl, self->num_performance_levels, target_nv_clock_offset, NVCTRL_GPU_NVCLOCK)) {
+ fprintf(stderr, "gsr info: gsr_overclock_start: successfully set nv clock offset to %d\n", target_nv_clock_offset);
+ } else {
+ fprintf(stderr, "gsr info: gsr_overclock_start: failed to overclock nv clock offset to %d\n", target_nv_clock_offset);
+ }
+ }
+ */
+ XSync(self->xnvctrl.display, False);
+ return true;
+void gsr_overclock_stop(gsr_overclock *self) {
+ xnvctrl_set_attribute_offset(&self->xnvctrl, self->num_performance_levels, 0, NVCTRL_ATTRIB_GPU_MEM_TRANSFER_RATE);
+ //xnvctrl_set_attribute_offset(&self->xnvctrl, self->num_performance_levels, 0, NVCTRL_GPU_NVCLOCK);
+ XSync(self->xnvctrl.display, False);
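Review note on `gsr_overclock_start`: the offset it applies is the smaller of the maximum offset the driver reports and the gap between the two highest `mem_transfer_rate_max` values, after sorting the levels in ascending order. A minimal sketch of that selection, using plain integers instead of `NVCTRLPerformanceLevel`, is below. The function names are hypothetical, and returning 0 when there are fewer than two levels is a stand-in for the patch skipping the overclock in that case.

```c
#include <assert.h>
#include <stdlib.h>

/* Ascending comparison that avoids the overflow risk of plain subtraction. */
static int compare_int_asc(const void *a, const void *b) {
    const int lhs = *(const int*)a;
    const int rhs = *(const int*)b;
    return (lhs > rhs) - (lhs < rhs);
}

/*
 * Sorts |max_rates| in place and returns the offset to apply:
 * the gap between the two highest rates, capped at |max_allowed_offset|.
 * Returns 0 (no offset) when there are fewer than two levels.
 */
static int pick_transfer_rate_offset(int *max_rates, int num_levels, int max_allowed_offset) {
    assert(max_rates && num_levels > 0);
    qsort(max_rates, (size_t)num_levels, sizeof(int), compare_int_asc);
    if(num_levels < 2)
        return 0;
    const int diff = max_rates[num_levels - 1] - max_rates[num_levels - 2];
    return diff < max_allowed_offset ? diff : max_allowed_offset;
}
```

Capping at the gap between the top two levels means the offset only makes up for the drop to the next-lower performance level and never pushes past the highest stock rate.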
diff --git a/src/shader.c b/src/shader.c
new file mode 100644
index 0000000..dcb956b
--- /dev/null
+++ b/src/shader.c
@@ -0,0 +1,143 @@
+#include "../include/shader.h"
+#include "../include/egl.h"
+#include <stdio.h>
+#include <assert.h>
+static int min_int(int a, int b) {
+ return a < b ? a : b;
+static unsigned int loader_shader(gsr_egl *egl, unsigned int type, const char *source) {
+ unsigned int shader_id = egl->glCreateShader(type);
+ if(shader_id == 0) {
+ fprintf(stderr, "gsr error: loader_shader: failed to create shader, error: %d\n", egl->glGetError());
+ return 0;
+ }
+ egl->glShaderSource(shader_id, 1, &source, NULL);
+ egl->glCompileShader(shader_id);
+ int compiled = 0;
+ egl->glGetShaderiv(shader_id, GL_COMPILE_STATUS, &compiled);
+ if(!compiled) {
+ int info_length = 0;
+ egl->glGetShaderiv(shader_id, GL_INFO_LOG_LENGTH, &info_length);
+ if(info_length > 1) {
+ char info_log[4096];
+ egl->glGetShaderInfoLog(shader_id, min_int(4096, info_length), NULL, info_log);
+ fprintf(stderr, "gsr error: loader_shader: failed to compile shader, error:\n%s\nshader source:\n%s\n", info_log, source);
+ }
+ egl->glDeleteShader(shader_id);
+ return 0;
+ }
+ return shader_id;
+static unsigned int load_program(gsr_egl *egl, const char *vertex_shader, const char *fragment_shader) {
+ unsigned int vertex_shader_id = 0;
+ unsigned int fragment_shader_id = 0;
+ unsigned int program_id = 0;
+ int linked = 0;
+ if(vertex_shader) {
+ vertex_shader_id = loader_shader(egl, GL_VERTEX_SHADER, vertex_shader);
+ if(vertex_shader_id == 0)
+ goto err;
+ }
+ if(fragment_shader) {
+ fragment_shader_id = loader_shader(egl, GL_FRAGMENT_SHADER, fragment_shader);
+ if(fragment_shader_id == 0)
+ goto err;
+ }
+ program_id = egl->glCreateProgram();
+ if(program_id == 0) {
+ fprintf(stderr, "gsr error: load_program: failed to create shader program, error: %d\n", egl->glGetError());
+ goto err;
+ }
+ if(vertex_shader_id)
+ egl->glAttachShader(program_id, vertex_shader_id);
+ if(fragment_shader_id)
+ egl->glAttachShader(program_id, fragment_shader_id);
+ egl->glLinkProgram(program_id);
+ egl->glGetProgramiv(program_id, GL_LINK_STATUS, &linked);
+ if(!linked) {
+ int info_length = 0;
+ egl->glGetProgramiv(program_id, GL_INFO_LOG_LENGTH, &info_length);
+ if(info_length > 1) {
+ char info_log[4096];
+ egl->glGetProgramInfoLog(program_id, min_int(4096, info_length), NULL, info_log);
+ fprintf(stderr, "gsr error: load_program: linking shader program failed, error:\n%s\n", info_log);
+ }
+ goto err;
+ }
+ if(fragment_shader_id)
+ egl->glDeleteShader(fragment_shader_id);
+ if(vertex_shader_id)
+ egl->glDeleteShader(vertex_shader_id);
+ return program_id;
+ err:
+ if(program_id)
+ egl->glDeleteProgram(program_id);
+ if(fragment_shader_id)
+ egl->glDeleteShader(fragment_shader_id);
+ if(vertex_shader_id)
+ egl->glDeleteShader(vertex_shader_id);
+ return 0;
+int gsr_shader_init(gsr_shader *self, gsr_egl *egl, const char *vertex_shader, const char *fragment_shader) {
+ assert(egl);
+ self->egl = egl;
+ self->program_id = 0;
+ if(!vertex_shader && !fragment_shader) {
+ fprintf(stderr, "gsr error: gsr_shader_init: vertex shader and fragment shader can't be NULL at the same time\n");
+ return -1;
+ }
+ self->program_id = load_program(self->egl, vertex_shader, fragment_shader);
+ if(self->program_id == 0)
+ return -1;
+ return 0;
+void gsr_shader_deinit(gsr_shader *self) {
+ if(!self->egl)
+ return;
+ if(self->program_id) {
+ self->egl->glDeleteProgram(self->program_id);
+ self->program_id = 0;
+ }
+ self->egl = NULL;
+int gsr_shader_bind_attribute_location(gsr_shader *self, const char *attribute, int location) {
+ while(self->egl->glGetError()) {}
+ self->egl->glBindAttribLocation(self->program_id, location, attribute);
+ return self->egl->glGetError();
+void gsr_shader_use(gsr_shader *self) {
+ self->egl->glUseProgram(self->program_id);
+void gsr_shader_use_none(gsr_shader *self) {
+ self->egl->glUseProgram(0);
diff --git a/src/sound.cpp b/src/sound.cpp
index 794d3ea..53000bd 100644
--- a/src/sound.cpp
+++ b/src/sound.cpp
@@ -1,21 +1,7 @@
- Copyright (C) 2020 dec05eba
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program. If not, see <https://www.gnu.org/licenses/>.
#include "../include/sound.hpp"
+extern "C" {
+#include "../include/utils.h"
#include <stdlib.h>
#include <stdio.h>
@@ -43,14 +29,6 @@
} \
} while(false);
-static double clock_get_monotonic_seconds() {
- struct timespec ts;
- ts.tv_sec = 0;
- ts.tv_nsec = 0;
- clock_gettime(CLOCK_MONOTONIC, &ts);
- return (double)ts.tv_sec + (double)ts.tv_nsec * 0.000000001;
struct pa_handle {
pa_context *context;
pa_stream *stream;
@@ -63,6 +41,7 @@ struct pa_handle {
size_t output_index, output_length;
int operation_success;
+ double latency_seconds;
static void pa_sound_device_free(pa_handle *s) {
@@ -101,8 +80,9 @@ static pa_handle* pa_sound_device_new(const char *server,
p->read_data = NULL;
p->read_length = 0;
p->read_index = 0;
+ p->latency_seconds = 0.0;
- const int buffer_size = attr->maxlength;
+ const int buffer_size = attr->fragsize;
void *buffer = malloc(buffer_size);
if(!buffer) {
fprintf(stderr, "failed to allocate buffer for audio\n");
@@ -175,20 +155,21 @@ fail:
return NULL;
-// Returns a negative value on failure or if |p->output_length| data is not available within the time frame specified by the sample rate
-static int pa_sound_device_read(pa_handle *p) {
+static int pa_sound_device_read(pa_handle *p, double timeout_seconds) {
- const int64_t timeout_ms = std::round((1000.0 / (double)pa_stream_get_sample_spec(p->stream)->rate) * 1000.0);
const double start_time = clock_get_monotonic_seconds();
bool success = false;
int r = 0;
int *rerror = &r;
+ pa_usec_t latency = 0;
+ int negative = 0;
CHECK_DEAD_GOTO(p, rerror, fail);
while (p->output_index < p->output_length) {
- if((clock_get_monotonic_seconds() - start_time) * 1000 >= timeout_ms)
+ if(clock_get_monotonic_seconds() - start_time >= timeout_seconds)
return -1;
if(!p->read_data) {
@@ -217,6 +198,15 @@ static int pa_sound_device_read(pa_handle *p) {
CHECK_DEAD_GOTO(p, rerror, fail);
+ pa_operation_unref(pa_stream_update_timing_info(p->stream, NULL, NULL));
+ // TODO: Deal with one pa_stream_peek not being enough. In that case we need to add multiple of these together(?)
+ if(pa_stream_get_latency(p->stream, &latency, &negative) >= 0) {
+ p->latency_seconds = negative ? -(double)latency : latency;
+ if(p->latency_seconds < 0.0)
+ p->latency_seconds = 0.0;
+ p->latency_seconds *= 0.000001; // pa_usec_t is in microseconds
+ }
const size_t space_free_in_output_buffer = p->output_length - p->output_index;
@@ -249,20 +239,40 @@ static int pa_sound_device_read(pa_handle *p) {
return success ? 0 : -1;
-int sound_device_get_by_name(SoundDevice *device, const char *device_name, const char *description, unsigned int num_channels, unsigned int period_frame_size) {
+static pa_sample_format_t audio_format_to_pulse_audio_format(AudioFormat audio_format) {
+ switch(audio_format) {
+ case S16: return PA_SAMPLE_S16LE;
+ case S32: return PA_SAMPLE_S32LE;
+ case F32: return PA_SAMPLE_FLOAT32LE;
+ }
+ assert(false);
+ return PA_SAMPLE_S16LE;
+static int audio_format_to_get_bytes_per_sample(AudioFormat audio_format) {
+ switch(audio_format) {
+ case S16: return 2;
+ case S32: return 4;
+ case F32: return 4;
+ }
+ assert(false);
+ return 2;
+int sound_device_get_by_name(SoundDevice *device, const char *device_name, const char *description, unsigned int num_channels, unsigned int period_frame_size, AudioFormat audio_format) {
pa_sample_spec ss;
- ss.format = PA_SAMPLE_S16LE;
+ ss.format = audio_format_to_pulse_audio_format(audio_format);
ss.rate = 48000;
ss.channels = num_channels;
- int error;
pa_buffer_attr buffer_attr;
+ buffer_attr.fragsize = period_frame_size * audio_format_to_get_bytes_per_sample(audio_format) * num_channels; // 2/4 bytes/sample, @num_channels channels
buffer_attr.tlength = -1;
buffer_attr.prebuf = -1;
buffer_attr.minreq = -1;
- buffer_attr.maxlength = period_frame_size * 2 * num_channels; // 2 bytes/sample, @num_channels channels
- buffer_attr.fragsize = buffer_attr.maxlength;
+ buffer_attr.maxlength = buffer_attr.fragsize;
+ int error = 0;
pa_handle *handle = pa_sound_device_new(nullptr, description, device_name, description, &ss, &buffer_attr, &error);
if(!handle) {
fprintf(stderr, "pa_sound_device_new() failed: %s. Audio input device %s might not be valid\n", pa_strerror(error), description);
@@ -280,13 +290,15 @@ void sound_device_close(SoundDevice *device) {
device->handle = NULL;
-int sound_device_read_next_chunk(SoundDevice *device, void **buffer) {
+int sound_device_read_next_chunk(SoundDevice *device, void **buffer, double timeout_sec, double *latency_seconds) {
pa_handle *pa = (pa_handle*)device->handle;
- if(pa_sound_device_read(pa) < 0) {
+ if(pa_sound_device_read(pa, timeout_sec) < 0) {
//fprintf(stderr, "pa_simple_read() failed: %s\n", pa_strerror(error));
+ *latency_seconds = 0.0;
return -1;
*buffer = pa->output_data;
+ *latency_seconds = pa->latency_seconds;
return device->frames;
@@ -311,6 +323,7 @@ static void pa_state_cb(pa_context *c, void *userdata) {
static void pa_sourcelist_cb(pa_context *ctx, const pa_source_info *source_info, int eol, void *userdata) {
+ (void)ctx;
if(eol > 0)
@@ -359,4 +372,4 @@ std::vector<AudioInput> get_pulseaudio_inputs() {
return inputs;
-} \ No newline at end of file
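Review note on the audio changes above: the fragment size is now computed as frames per period times bytes per sample times channel count, and the stream latency (a `pa_usec_t`, which counts microseconds) is converted to seconds with negative values clamped to zero. A minimal sketch of both calculations, with no PulseAudio dependency, follows. The `sample_format` enum and the function names are hypothetical stand-ins for `AudioFormat` and the patch's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the AudioFormat enum used by the patch. */
typedef enum { FMT_S16, FMT_S32, FMT_F32 } sample_format;

static unsigned int bytes_per_sample(sample_format format) {
    switch(format) {
        case FMT_S16: return 2;
        case FMT_S32: return 4;
        case FMT_F32: return 4;
    }
    assert(0 && "unknown sample format");
    return 2;
}

/* Size in bytes of one capture fragment holding |period_frame_size| frames. */
static unsigned int fragment_size_bytes(unsigned int period_frame_size, sample_format format, unsigned int num_channels) {
    return period_frame_size * bytes_per_sample(format) * num_channels;
}

/* Convert a latency in microseconds to seconds; a negative latency is reported as zero. */
static double latency_usec_to_seconds(uint64_t latency_usec, int negative) {
    if(negative)
        return 0.0;
    return (double)latency_usec / 1000000.0;
}
```

For example, 1024 frames of 16-bit stereo audio occupy 4096 bytes, and the same period in 32-bit float occupies 8192 bytes.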
diff --git a/src/utils.c b/src/utils.c
new file mode 100644
index 0000000..e00f3c5
--- /dev/null
+++ b/src/utils.c
@@ -0,0 +1,482 @@
+#include "../include/utils.h"
+#include <time.h>
+#include <string.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <xf86drmMode.h>
+#include <xf86drm.h>
+#include <stdlib.h>
+#include <X11/Xatom.h>
+double clock_get_monotonic_seconds(void) {
+ struct timespec ts;
+ ts.tv_sec = 0;
+ ts.tv_nsec = 0;
+ clock_gettime(CLOCK_MONOTONIC, &ts);
+ return (double)ts.tv_sec + (double)ts.tv_nsec * 0.000000001;
+static const XRRModeInfo* get_mode_info(const XRRScreenResources *sr, RRMode id) {
+ for(int i = 0; i < sr->nmode; ++i) {
+ if(sr->modes[i].id == id)
+ return &sr->modes[i];
+ }
+ return NULL;
+static gsr_monitor_rotation x11_rotation_to_gsr_rotation(int rot) {
+ switch(rot) {
+ case RR_Rotate_0: return GSR_MONITOR_ROT_0;
+ case RR_Rotate_90: return GSR_MONITOR_ROT_90;
+ case RR_Rotate_180: return GSR_MONITOR_ROT_180;
+ case RR_Rotate_270: return GSR_MONITOR_ROT_270;
+ }
+ return GSR_MONITOR_ROT_0;
+static gsr_monitor_rotation wayland_transform_to_gsr_rotation(int32_t rot) {
+ switch(rot) {
+ case 0: return GSR_MONITOR_ROT_0;
+ case 1: return GSR_MONITOR_ROT_90;
+ case 2: return GSR_MONITOR_ROT_180;
+ case 3: return GSR_MONITOR_ROT_270;
+ }
+ return GSR_MONITOR_ROT_0;
+static uint32_t x11_output_get_connector_id(Display *dpy, RROutput output, Atom randr_connector_id_atom) {
+ Atom type = 0;
+ int format = 0;
+ unsigned long bytes_after = 0;
+ unsigned long nitems = 0;
+ unsigned char *prop = NULL;
+ XRRGetOutputProperty(dpy, output, randr_connector_id_atom, 0, 128, false, false, AnyPropertyType, &type, &format, &nitems, &bytes_after, &prop);
+ long result = 0;
+ if(type == XA_INTEGER && format == 32)
+ result = *(long*)prop;
+ if(prop)
+ XFree(prop);
+ return result;
+void for_each_active_monitor_output_x11(Display *display, active_monitor_callback callback, void *userdata) {
+ XRRScreenResources *screen_res = XRRGetScreenResources(display, DefaultRootWindow(display));
+ if(!screen_res)
+ return;
+ const Atom randr_connector_id_atom = XInternAtom(display, "CONNECTOR_ID", False);
+ char display_name[256];
+ for(int i = 0; i < screen_res->noutput; ++i) {
+ XRROutputInfo *out_info = XRRGetOutputInfo(display, screen_res, screen_res->outputs[i]);
+ if(out_info && out_info->crtc && out_info->connection == RR_Connected) {
+ XRRCrtcInfo *crt_info = XRRGetCrtcInfo(display, screen_res, out_info->crtc);
+ if(crt_info && crt_info->mode) {
+ const XRRModeInfo *mode_info = get_mode_info(screen_res, crt_info->mode);
+ if(mode_info && out_info->nameLen < (int)sizeof(display_name)) {
+ memcpy(display_name, out_info->name, out_info->nameLen);
+ display_name[out_info->nameLen] = '\0';
+ const gsr_monitor monitor = {
+ .name = display_name,
+ .name_len = out_info->nameLen,
+ .pos = { .x = crt_info->x, .y = crt_info->y },
+ .size = { .x = (int)crt_info->width, .y = (int)crt_info->height },
+ .crt_info = crt_info,
+ .connector_id = x11_output_get_connector_id(display, screen_res->outputs[i], randr_connector_id_atom),
+ .rotation = x11_rotation_to_gsr_rotation(crt_info->rotation),
+ .monitor_identifier = 0
+ };
+ callback(&monitor, userdata);
+ }
+ }
+ if(crt_info)
+ XRRFreeCrtcInfo(crt_info);
+ }
+ if(out_info)
+ XRRFreeOutputInfo(out_info);
+ }
+ XRRFreeScreenResources(screen_res);
+typedef struct {
+ int type;
+ int count;
+ int count_active;
+} drm_connector_type_count;
+static drm_connector_type_count* drm_connector_types_get_index(drm_connector_type_count *type_counts, int *num_type_counts, int connector_type) {
+ for(int i = 0; i < *num_type_counts; ++i) {
+ if(type_counts[i].type == connector_type)
+ return &type_counts[i];
+ }
+ if(*num_type_counts == CONNECTOR_TYPE_COUNTS)
+ return NULL;
+ const int index = *num_type_counts;
+ type_counts[index].type = connector_type;
+ type_counts[index].count = 0;
+ type_counts[index].count_active = 0;
+ ++*num_type_counts;
+ return &type_counts[index];
+static bool connector_get_property_by_name(int drmfd, drmModeConnectorPtr props, const char *name, uint64_t *result) {
+ for(int i = 0; i < props->count_props; ++i) {
+ drmModePropertyPtr prop = drmModeGetProperty(drmfd, props->props[i]);
+ if(prop) {
+ if(strcmp(name, prop->name) == 0) {
+ *result = props->prop_values[i];
+ drmModeFreeProperty(prop);
+ return true;
+ }
+ drmModeFreeProperty(prop);
+ }
+ }
+ return false;
+/* TODO: Support more connector types*/
+static int get_connector_type_by_name(const char *name) {
+ int len = strlen(name);
+ if(len >= 5 && strncmp(name, "HDMI-", 5) == 0)
+ return 1;
+ else if(len >= 3 && strncmp(name, "DP-", 3) == 0)
+ return 2;
+ else if(len >= 12 && strncmp(name, "DisplayPort-", 12) == 0)
+ return 3;
+ else if(len >= 4 && strncmp(name, "eDP-", 4) == 0)
+ return 4;
+ else
+ return -1;
+static uint32_t monitor_identifier_from_type_and_count(int monitor_type_index, int monitor_type_count) {
+ return ((uint32_t)monitor_type_index << 16) | ((uint32_t)monitor_type_count);
+static void for_each_active_monitor_output_wayland(const gsr_egl *egl, active_monitor_callback callback, void *userdata) {
+ drm_connector_type_count type_counts[CONNECTOR_TYPE_COUNTS];
+ int num_type_counts = 0;
+ for(int i = 0; i < egl->wayland.num_outputs; ++i) {
+ const gsr_wayland_output *output = &egl->wayland.outputs[i];
+ if(!output->name)
+ continue;
+ const int connector_type_index = get_connector_type_by_name(output->name);
+ drm_connector_type_count *connector_type = NULL;
+ if(connector_type_index != -1)
+ connector_type = drm_connector_types_get_index(type_counts, &num_type_counts, connector_type_index);
+ if(connector_type) {
+ ++connector_type->count;
+ ++connector_type->count_active;
+ }
+ const gsr_monitor monitor = {
+ .name = output->name,
+ .name_len = strlen(output->name),
+ .pos = { .x = output->pos.x, .y = output->pos.y },
+ .size = { .x = output->size.x, .y = output->size.y },
+ .crt_info = NULL,
+ .connector_id = 0,
+ .rotation = wayland_transform_to_gsr_rotation(output->transform),
+ .monitor_identifier = connector_type ? monitor_identifier_from_type_and_count(connector_type_index, connector_type->count_active) : 0
+ };
+ callback(&monitor, userdata);
+ }
+static void for_each_active_monitor_output_drm(const gsr_egl *egl, active_monitor_callback callback, void *userdata) {
+ int fd = open(egl->card_path, O_RDONLY);
+ if(fd == -1)
+ return;
+ drmSetClientCap(fd, DRM_CLIENT_CAP_ATOMIC, 1);
+ drm_connector_type_count type_counts[CONNECTOR_TYPE_COUNTS];
+ int num_type_counts = 0;
+ char display_name[256];
+ drmModeResPtr resources = drmModeGetResources(fd);
+ if(resources) {
+ for(int i = 0; i < resources->count_connectors; ++i) {
+ drmModeConnectorPtr connector = drmModeGetConnectorCurrent(fd, resources->connectors[i]);
+ if(!connector)
+ continue;
+ drm_connector_type_count *connector_type = drm_connector_types_get_index(type_counts, &num_type_counts, connector->connector_type);
+ const char *connection_name = drmModeGetConnectorTypeName(connector->connector_type);
+ const int connection_name_len = strlen(connection_name);
+ if(connector_type)
+ ++connector_type->count;
+ if(connector->connection != DRM_MODE_CONNECTED) {
+ drmModeFreeConnector(connector);
+ continue;
+ }
+ if(connector_type)
+ ++connector_type->count_active;
+ uint64_t crtc_id = 0;
+ connector_get_property_by_name(fd, connector, "CRTC_ID", &crtc_id);
+ drmModeCrtcPtr crtc = drmModeGetCrtc(fd, crtc_id);
+ if(connector_type && crtc_id > 0 && crtc && connection_name_len + 5 < (int)sizeof(display_name)) {
+ const int display_name_len = snprintf(display_name, sizeof(display_name), "%s-%d", connection_name, connector_type->count);
+ const int connector_type_index_name = get_connector_type_by_name(display_name);
+ const gsr_monitor monitor = {
+ .name = display_name,
+ .name_len = display_name_len,
+ .pos = { .x = crtc->x, .y = crtc->y },
+ .size = { .x = (int)crtc->width, .y = (int)crtc->height },
+ .crt_info = NULL,
+ .connector_id = connector->connector_id,
+ .rotation = GSR_MONITOR_ROT_0,
+ .monitor_identifier = connector_type_index_name != -1 ? monitor_identifier_from_type_and_count(connector_type_index_name, connector_type->count_active) : 0
+ };
+ callback(&monitor, userdata);
+ }
+ if(crtc)
+ drmModeFreeCrtc(crtc);
+ drmModeFreeConnector(connector);
+ }
+ drmModeFreeResources(resources);
+ }
+ close(fd);
+}
+void for_each_active_monitor_output(const gsr_egl *egl, gsr_connection_type connection_type, active_monitor_callback callback, void *userdata) {
+ switch(connection_type) {
+ for_each_active_monitor_output_x11(egl->x11.dpy, callback, userdata);
+ break;
+ for_each_active_monitor_output_wayland(egl, callback, userdata);
+ break;
+ for_each_active_monitor_output_drm(egl, callback, userdata);
+ break;
+ }
+}
+static void get_monitor_by_name_callback(const gsr_monitor *monitor, void *userdata) {
+ get_monitor_by_name_userdata *data = (get_monitor_by_name_userdata*)userdata;
+ if(!data->found_monitor && strcmp(data->name, monitor->name) == 0) {
+ data->monitor->pos = monitor->pos;
+ data->monitor->size = monitor->size;
+ data->monitor->connector_id = monitor->connector_id;
+ data->monitor->rotation = monitor->rotation;
+ data->monitor->monitor_identifier = monitor->monitor_identifier;
+ data->found_monitor = true;
+ }
+}
+bool get_monitor_by_name(const gsr_egl *egl, gsr_connection_type connection_type, const char *name, gsr_monitor *monitor) {
+ get_monitor_by_name_userdata userdata;
+ userdata.name = name;
+ userdata.name_len = strlen(name);
+ userdata.monitor = monitor;
+ userdata.found_monitor = false;
+ for_each_active_monitor_output(egl, connection_type, get_monitor_by_name_callback, &userdata);
+ return userdata.found_monitor;
+}
+typedef struct {
+ const gsr_monitor *monitor;
+ gsr_monitor_rotation rotation;
+ bool match_found;
+} get_monitor_by_connector_id_userdata;
+static bool vec2i_eql(vec2i a, vec2i b) {
+ return a.x == b.x && a.y == b.y;
+}
+static void get_monitor_by_name_and_size_callback(const gsr_monitor *monitor, void *userdata) {
+ get_monitor_by_connector_id_userdata *data = (get_monitor_by_connector_id_userdata*)userdata;
+ if(monitor->name && data->monitor->name && strcmp(monitor->name, data->monitor->name) == 0 && vec2i_eql(monitor->size, data->monitor->size)) {
+ data->rotation = monitor->rotation;
+ data->match_found = true;
+ }
+}
+static void get_monitor_by_connector_id_callback(const gsr_monitor *monitor, void *userdata) {
+ get_monitor_by_connector_id_userdata *data = (get_monitor_by_connector_id_userdata*)userdata;
+ if(monitor->connector_id == data->monitor->connector_id ||
+ (!monitor->connector_id && monitor->monitor_identifier == data->monitor->monitor_identifier))
+ {
+ data->rotation = monitor->rotation;
+ data->match_found = true;
+ }
+}
+gsr_monitor_rotation drm_monitor_get_display_server_rotation(const gsr_egl *egl, const gsr_monitor *monitor) {
+ if(egl->wayland.dpy) {
+ {
+ get_monitor_by_connector_id_userdata userdata;
+ userdata.monitor = monitor;
+ userdata.rotation = GSR_MONITOR_ROT_0;
+ userdata.match_found = false;
+ for_each_active_monitor_output_wayland(egl, get_monitor_by_name_and_size_callback, &userdata);
+ if(userdata.match_found)
+ return userdata.rotation;
+ }
+ {
+ get_monitor_by_connector_id_userdata userdata;
+ userdata.monitor = monitor;
+ userdata.rotation = GSR_MONITOR_ROT_0;
+ userdata.match_found = false;
+ for_each_active_monitor_output_wayland(egl, get_monitor_by_connector_id_callback, &userdata);
+ return userdata.rotation;
+ }
+ } else {
+ get_monitor_by_connector_id_userdata userdata;
+ userdata.monitor = monitor;
+ userdata.rotation = GSR_MONITOR_ROT_0;
+ userdata.match_found = false;
+ for_each_active_monitor_output_x11(egl->x11.dpy, get_monitor_by_connector_id_callback, &userdata);
+ return userdata.rotation;
+ }
+ return GSR_MONITOR_ROT_0;
+}
+bool gl_get_gpu_info(gsr_egl *egl, gsr_gpu_info *info) {
+ const char *software_renderers[] = { "llvmpipe", "SWR", "softpipe", NULL };
+ bool supported = true;
+ const unsigned char *gl_vendor = egl->glGetString(GL_VENDOR);
+ const unsigned char *gl_renderer = egl->glGetString(GL_RENDERER);
+ info->gpu_version = 0;
+ if(!gl_vendor) {
+ fprintf(stderr, "gsr error: failed to get gpu vendor\n");
+ supported = false;
+ goto end;
+ }
+ if(gl_renderer) {
+ for(int i = 0; software_renderers[i]; ++i) {
+ if(strstr((const char*)gl_renderer, software_renderers[i])) {
+ fprintf(stderr, "gsr error: your opengl environment is not set up properly. It's using %s (software rendering) for opengl instead of your graphics card. Please make sure your graphics driver is properly installed\n", software_renderers[i]);
+ supported = false;
+ goto end;
+ }
+ }
+ }
+ if(strstr((const char*)gl_vendor, "AMD"))
+ info->vendor = GSR_GPU_VENDOR_AMD;
+ else if(strstr((const char*)gl_vendor, "Intel"))
+ info->vendor = GSR_GPU_VENDOR_INTEL;
+ else if(strstr((const char*)gl_vendor, "NVIDIA"))
+ info->vendor = GSR_GPU_VENDOR_NVIDIA;
+ else {
+ fprintf(stderr, "gsr error: unknown gpu vendor: %s\n", gl_vendor);
+ supported = false;
+ goto end;
+ }
+ if(gl_renderer) {
+ if(info->vendor == GSR_GPU_VENDOR_NVIDIA)
+ sscanf((const char*)gl_renderer, "%*s %*s %*s %d", &info->gpu_version);
+ }
+ end:
+ return supported;
+}
+static bool try_card_has_valid_plane(const char *card_path) {
+ drmVersion *ver = NULL;
+ drmModePlaneResPtr planes = NULL;
+ bool found_screen_card = false;
+ int fd = open(card_path, O_RDONLY);
+ if(fd == -1)
+ return false;
+ ver = drmGetVersion(fd);
+ if(!ver || strstr(ver->name, "nouveau"))
+ goto next;
+ planes = drmModeGetPlaneResources(fd);
+ if(!planes)
+ goto next;
+ for(uint32_t j = 0; j < planes->count_planes; ++j) {
+ drmModePlanePtr plane = drmModeGetPlane(fd, planes->planes[j]);
+ if(!plane)
+ continue;
+ if(plane->fb_id)
+ found_screen_card = true;
+ drmModeFreePlane(plane);
+ if(found_screen_card)
+ break;
+ }
+ next:
+ if(planes)
+ drmModeFreePlaneResources(planes);
+ if(ver)
+ drmFreeVersion(ver);
+ close(fd);
+ if(found_screen_card)
+ return true;
+ return false;
+}
+static void string_copy(char *dst, const char *src, int len) {
+ int src_len = strlen(src);
+ int min_len = src_len;
+ if(len - 1 < min_len)
+ min_len = len - 1;
+ memcpy(dst, src, min_len);
+ dst[min_len] = '\0';
+}
+bool gsr_get_valid_card_path(gsr_egl *egl, char *output, bool is_monitor_capture) {
+ if(egl->dri_card_path) {
+ string_copy(output, egl->dri_card_path, 127);
+ return is_monitor_capture ? try_card_has_valid_plane(output) : true;
+ }
+ for(int i = 0; i < 10; ++i) {
+ snprintf(output, 127, DRM_DEV_NAME, DRM_DIR_NAME, i);
+ if(try_card_has_valid_plane(output))
+ return true;
+ }
+ return false;
+}
+bool gsr_card_path_get_render_path(const char *card_path, char *render_path) {
+ int fd = open(card_path, O_RDONLY);
+ if(fd == -1)
+ return false;
+ char *render_path_tmp = drmGetRenderDeviceNameFromFd(fd);
+ if(render_path_tmp) {
+ string_copy(render_path, render_path_tmp, 127);
+ free(render_path_tmp);
+ close(fd);
+ return true;
+ }
+ close(fd);
+ return false;
+}
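A note on the `string_copy` helper added in this file: it copies at most `len - 1` bytes and always NUL-terminates, so an over-long card or render path is truncated rather than overflowing the destination. The standalone sketch below mirrors that behaviour under a hypothetical name, `bounded_copy`. The return value and the `assert` are additions for illustration and are not part of the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for string_copy: copies at most dst_size - 1 bytes
   of src into dst and always NUL-terminates the result.
   Returns the number of bytes copied, not counting the terminator. */
static size_t bounded_copy(char *dst, const char *src, size_t dst_size) {
    assert(dst_size > 0); /* there must be room for at least the terminator */
    size_t n = strlen(src);
    if(n > dst_size - 1)
        n = dst_size - 1; /* truncate instead of overflowing dst */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

With an 8-byte buffer, copying `"/dev/dri/card0"` keeps only the first 7 bytes. A caller that needs to detect truncation can compare the return value against `strlen(src)`.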
diff --git a/src/window_texture.c b/src/window_texture.c
new file mode 100644
index 0000000..0f4aa2c
--- /dev/null
+++ b/src/window_texture.c
@@ -0,0 +1,123 @@
+#include "../include/window_texture.h"
+#include <X11/extensions/Xcomposite.h>
+static int x11_supports_composite_named_window_pixmap(Display *display) {
+ int extension_major;
+ int extension_minor;
+ if(!XCompositeQueryExtension(display, &extension_major, &extension_minor))
+ return 0;
+ int major_version;
+ int minor_version;
+ return XCompositeQueryVersion(display, &major_version, &minor_version) && (major_version > 0 || minor_version >= 2);
+}
+int window_texture_init(WindowTexture *window_texture, Display *display, Window window, gsr_egl *egl) {
+ window_texture->display = display;
+ window_texture->window = window;
+ window_texture->pixmap = None;
+ window_texture->texture_id = 0;
+ window_texture->redirected = 0;
+ window_texture->egl = egl;
+ if(!x11_supports_composite_named_window_pixmap(display))
+ return 1;
+ XCompositeRedirectWindow(display, window, CompositeRedirectAutomatic);
+ window_texture->redirected = 1;
+ return window_texture_on_resize(window_texture);
+}
+static void window_texture_cleanup(WindowTexture *self, int delete_texture) {
+ if(delete_texture && self->texture_id) {
+ self->egl->glDeleteTextures(1, &self->texture_id);
+ self->texture_id = 0;
+ }
+ if(self->pixmap) {
+ XFreePixmap(self->display, self->pixmap);
+ self->pixmap = None;
+ }
+}
+void window_texture_deinit(WindowTexture *self) {
+ if(self->redirected) {
+ XCompositeUnredirectWindow(self->display, self->window, CompositeRedirectAutomatic);
+ self->redirected = 0;
+ }
+ window_texture_cleanup(self, 1);
+}
+int window_texture_on_resize(WindowTexture *self) {
+ window_texture_cleanup(self, 0);
+ int result = 0;
+ Pixmap pixmap = None;
+ unsigned int texture_id = 0;
+ EGLImage image = NULL;
+ const intptr_t pixmap_attrs[] = {
+ };
+ pixmap = XCompositeNameWindowPixmap(self->display, self->window);
+ if(!pixmap) {
+ result = 2;
+ goto cleanup;
+ }
+ if(self->texture_id == 0) {
+ self->egl->glGenTextures(1, &texture_id);
+ if(texture_id == 0) {
+ result = 3;
+ goto cleanup;
+ }
+ self->egl->glBindTexture(GL_TEXTURE_2D, texture_id);
+ } else {
+ self->egl->glBindTexture(GL_TEXTURE_2D, self->texture_id);
+ texture_id = self->texture_id;
+ }
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
+ self->egl->glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
+ while(self->egl->glGetError()) {}
+ while(self->egl->eglGetError() != EGL_SUCCESS) {}
+ image = self->egl->eglCreateImage(self->egl->egl_display, NULL, EGL_NATIVE_PIXMAP_KHR, (EGLClientBuffer)pixmap, pixmap_attrs);
+ if(!image) {
+ result = 4;
+ goto cleanup;
+ }
+ self->egl->glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
+ if(self->egl->glGetError() != 0 || self->egl->eglGetError() != EGL_SUCCESS) {
+ result = 5;
+ goto cleanup;
+ }
+ self->pixmap = pixmap;
+ self->texture_id = texture_id;
+ cleanup:
+ self->egl->glBindTexture(GL_TEXTURE_2D, 0);
+ if(image)
+ self->egl->eglDestroyImage(self->egl->egl_display, image);
+ if(result != 0) {
+ if(texture_id != 0)
+ self->egl->glDeleteTextures(1, &texture_id);
+ if(pixmap)
+ XFreePixmap(self->display, pixmap);
+ }
+ return result;
+}
+unsigned int window_texture_get_opengl_texture_id(WindowTexture *self) {
+ return self->texture_id;
+}
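The version test in `x11_supports_composite_named_window_pixmap` accepts Composite 0.2 or newer, the protocol version that introduced `NameWindowPixmap`. Pulled out as a pure function it can be checked without an X server. The name `composite_supports_named_pixmap` is hypothetical and exists only for this sketch.

```c
#include <stdbool.h>

/* Hypothetical helper: true when the reported Composite extension version
   is at least 0.2, the same condition the commit applies to the result of
   XCompositeQueryVersion. */
static bool composite_supports_named_pixmap(int major_version, int minor_version) {
    return major_version > 0 || minor_version >= 2;
}
```

Version 0.1 is rejected, while 0.2, 0.4 and any 1.x version are accepted.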
diff --git a/src/xnvctrl.c b/src/xnvctrl.c
new file mode 100644
index 0000000..b738455
--- /dev/null
+++ b/src/xnvctrl.c
@@ -0,0 +1,46 @@
+#include "../include/xnvctrl.h"
+#include "../include/library_loader.h"
+#include <string.h>
+#include <stdio.h>
+#include <dlfcn.h>
+bool gsr_xnvctrl_load(gsr_xnvctrl *self, Display *display) {
+ memset(self, 0, sizeof(gsr_xnvctrl));
+ self->display = display;
+ dlerror(); /* clear */
+ void *lib = dlopen("libXNVCtrl.so.0", RTLD_LAZY);
+ if(!lib) {
+ fprintf(stderr, "gsr error: gsr_xnvctrl_load failed: failed to load libXNVCtrl.so.0, error: %s\n", dlerror());
+ return false;
+ }
+ dlsym_assign required_dlsym[] = {
+ { (void**)&self->XNVCTRLQueryExtension, "XNVCTRLQueryExtension" },
+ { (void**)&self->XNVCTRLSetTargetAttributeAndGetStatus, "XNVCTRLSetTargetAttributeAndGetStatus" },
+ { (void**)&self->XNVCTRLQueryValidTargetAttributeValues, "XNVCTRLQueryValidTargetAttributeValues" },
+ { (void**)&self->XNVCTRLQueryTargetStringAttribute, "XNVCTRLQueryTargetStringAttribute" },
+ { NULL, NULL }
+ };
+ if(!dlsym_load_list(lib, required_dlsym)) {
+ fprintf(stderr, "gsr error: gsr_xnvctrl_load failed: missing required symbols in libXNVCtrl.so.0\n");
+ goto fail;
+ }
+ self->library = lib;
+ return true;
+ fail:
+ dlclose(lib);
+ memset(self, 0, sizeof(gsr_xnvctrl));
+ return false;
+}
+void gsr_xnvctrl_unload(gsr_xnvctrl *self) {
+ if(self->library) {
+ dlclose(self->library);
+ memset(self, 0, sizeof(gsr_xnvctrl));
+ }
+}
diff --git a/study/color_space_transform_matrix.png b/study/color_space_transform_matrix.png
new file mode 100644
index 0000000..2b7729e5
--- /dev/null
+++ b/study/color_space_transform_matrix.png
Binary files differ
diff --git a/study/create_matrix.py b/study/create_matrix.py
new file mode 100755
index 0000000..1599a12
--- /dev/null
+++ b/study/create_matrix.py
@@ -0,0 +1,48 @@
+#!/usr/bin/env python3
+import sys
+def usage():
+    print("usage: create_matrix.py Kr Kg Kb full|limited")
+    print("examples:")
+    print(" create_matrix.py 0.2126 0.7152 0.0722 full")
+    print(" create_matrix.py 0.2126 0.7152 0.0722 limited")
+    sys.exit(1)
+# Pad non-negative values with a leading space so they line up with negative ones
+def a(v):
+    if v >= 0:
+        return " %f" % v
+    else:
+        return "%f" % v
+def main(argv):
+    if len(argv) != 5:
+        usage()
+    Kr = float(argv[1])
+    Kg = float(argv[2])
+    Kb = float(argv[3])
+    color_range = argv[4]
+    luma_offset = 0.0
+    transform_range = 1.0
+    if color_range == "full":
+        pass
+    elif color_range == "limited":
+        transform_range = (235.0 - 16.0) / 255.0
+        luma_offset = 16.0 / 255.0
+    else:
+        usage()
+    matrix = [
+        [Kr, Kg, Kb],
+        [-0.5 * (Kr / (1.0 - Kb)), -0.5 * (Kg / (1.0 - Kb)), 0.5],
+        [0.5, -0.5 * (Kg / (1.0 - Kr)), -0.5 * (Kb / (1.0 - Kr))],
+        [0.0, 0.5, 0.5]
+    ]
+    # Transform from row major to column major for glsl
+    print("const mat4 RGBtoYUV = mat4(%f, %s, %s, %f," % (matrix[0][0] * transform_range, a(matrix[1][0] * transform_range), a(matrix[2][0] * transform_range), 0.0))
+    print(" %f, %s, %s, %f," % (matrix[0][1] * transform_range, a(matrix[1][1] * transform_range), a(matrix[2][1] * transform_range), 0.0))
+    print(" %f, %s, %s, %f," % (matrix[0][2] * transform_range, a(matrix[1][2] * transform_range), a(matrix[2][2] * transform_range), 0.0))
+    print(" %f, %s, %s, %f);" % (matrix[3][0] + luma_offset, a(matrix[3][1]), a(matrix[3][2]), 1.0))
+main(sys.argv)
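The arithmetic in `study/create_matrix.py` can also be sketched in C, which may help when checking the generated GLSL constant against values computed at runtime. The function name `rgb_to_yuv_matrix` and the `out[16]` column-major layout are assumptions made for this sketch and are not part of the commit. It mirrors the script as written, including scaling the chroma rows by the same `(235 - 16) / 255` factor as luma in limited range.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring study/create_matrix.py.
   Fills out[16] with a column-major 4x4 RGB -> YUV matrix built from the
   luma coefficients kr, kg, kb. The fourth column holds the offsets. */
static void rgb_to_yuv_matrix(double kr, double kg, double kb, bool limited, double out[16]) {
    const double scale = limited ? (235.0 - 16.0) / 255.0 : 1.0;
    const double luma_offset = limited ? 16.0 / 255.0 : 0.0;
    /* Row-major coefficients: Y, U and V as functions of R, G, B */
    const double rows[3][3] = {
        { kr, kg, kb },
        { -0.5 * kr / (1.0 - kb), -0.5 * kg / (1.0 - kb), 0.5 },
        { 0.5, -0.5 * kg / (1.0 - kr), -0.5 * kb / (1.0 - kr) },
    };
    /* Transpose into column-major order, as GLSL's mat4 constructor expects */
    for(int col = 0; col < 3; ++col) {
        for(int row = 0; row < 3; ++row)
            out[col * 4 + row] = rows[row][col] * scale;
        out[col * 4 + 3] = 0.0;
    }
    out[12] = luma_offset; /* Y offset */
    out[13] = 0.5;         /* U offset */
    out[14] = 0.5;         /* V offset */
    out[15] = 1.0;
}
```

Calling it with the BT.709 coefficients from the script's usage text (`0.2126 0.7152 0.0722`) and `limited = false` leaves the luma weights unscaled. With `limited = true` the Y offset becomes `16 / 255`.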
diff --git a/uninstall.sh b/uninstall.sh
new file mode 100755
index 0000000..b8aac26
--- /dev/null
+++ b/uninstall.sh
@@ -0,0 +1,10 @@
+#!/bin/sh -e
+script_dir=$(dirname "$0")
+cd "$script_dir"
+[ "$(id -u)" -ne 0 ] && echo "You need root privileges to run the uninstall script" && exit 1
+ninja -C build uninstall
+echo "Successfully uninstalled gpu-screen-recorder"