1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
|
Check for reparent.
Quickly changing workspace and back while recording under i3 breaks the screen recorder. i3 probably unmaps windows in other workspaces.
See https://trac.ffmpeg.org/wiki/EncodingForStreamingSites for optimizing streaming.
Look at VK_EXT_external_memory_dma_buf.
Allow setting a different output resolution than the input resolution.
Use mov+faststart.
Allow recording all monitors/selected monitor without nvfbc by recording the compositor proxy window and only recording the part that matches the monitor(s).
Allow recording a region by recording the compositor proxy window / nvfbc window and copying part of it.
Use nvenc directly, which allows removing the use of cuda.
Handle xrandr monitor change in nvfbc.
Implement follow focused in drm.
Support amf and qsv.
Disable flipping on nvidia? this might fix some stuttering issues on some setups. See NvCtrlGetAttribute/NvCtrlSetAttributeAndGetStatus NV_CTRL_SYNC_TO_VBLANK https://github.com/NVIDIA/nvidia-settings/blob/d5f022976368cbceb2f20b838ddb0bf992f0cfb9/src/gtk%2B-2.x/ctkopengl.c.
Replays seem to have some issues with audio/video. Why?
Cleanup unused gl/egl functions, macro, etc.
Add option to disable overlapping of replays (the old behavior kinda. Remove the whole replay buffer data after saving when doing this).
Set audio track name to audio device name (if not merge of multiple audio devices).
Add support for webcam, but only really for amd/intel because amd/intel can get drm fd access to webcam, nvidia cant. This allows us to create an opengl texture directly from the webcam fd for optimal performance.
Reverse engineer nvapi so we can disable "force p2 state" on linux too (nvapi profile api with the settings id 0x50166c5e).
Support yuv444p on amd/intel.
fix yuv444 for hevc.
Do not allow streaming if yuv444.
Re-enable yuv444.
Support 10 bit output because of better gradients. May even be smaller file size. Better supported on hevc (not supported at all on h264 on my gpu).
Add nvidia/(amd/intel) specific install script for ubuntu. User should run install_ubuntu.sh but it should run different install dep script depending on if /proc/driver/nvidia/version exists or not. But what about switchable graphics setup?
Test different combinations of switchable graphics. Intel hybrid mode (running intel but possible to run specific applications with prime-run), running pure intel. Detect switchable graphics.
https://web.archive.org/web/20210306020203/https://forums.developer.nvidia.com/t/performance-power-management-problem-on-shared-vgpu/161986
https://djdallmann.github.io/GamingPCSetup/CONTENT/RESEARCH/FINDINGS/registrykeys_displayadapter_class_4d36e968-e325-11ce-bfc1-08002be10318.txt
The video output will be black if if the system is suspended on nvidia and NVreg_PreserveVideoMemoryAllocations is not set to 1. This happens because I think that the driver invalidates textures/cuda buffers? To fix this we could try and recreate gsr capture when gsr_capture_capture fails (with timeout to retry again).
NVreg_RegistryDwords.
Restore nvfbc screen recording on monitor reconfiguration.
Window capture doesn't work properly in _control_ game after going from pause menu to in-game (and back to pause menu). There might be some x11 event we need to catch. Same for vr-video-player.
Monitor capture on steam deck is slightly below the game fps, but only when capturing on the steam deck screen. If capturing on another monitor, there is no issue.
Is this related to the dma buf rotation issue? different modifier being slow? does this always happen?
Fallback to vaapi copy in kms if opengl version fails. This can happen on steam deck for some reason (driver bug?). Also vaapi copy uses less gpu since it uses video codec unit to copy.
Test if vaapi copy version uses less memory than opengl version.
Intel is a bit weird with monitor capture and multiple monitors. If one of the monitors is rotated then all the kms will be rotated as well.
Is that only the case when the primary monitor is rotated? Also the primary monitor becomes position 0, 0 so crtc (x11 randr) position doesn't match the drm pos. Maybe get monitor position and size from drm instead.
How about if multiple monitors are rotated?
Support vp8/vp9. This is especially important on amd which on some distros (such as Manjaro) where hardware accelerated h264/hevc is disabled in the mesa package.
Support screen (all monitors) capture on amd/intel and nvidia wayland when no combined plane is found. Right now screen just takes the first output.
Use separate plane (which has offset and pitch) from combined plane instead of the combined plane.
Both twitch and youtube support variable bitrate but twitch recommends constant bitrate to reduce stream buffering/dropped frames when going from low motion to high motion: https://help.twitch.tv/s/article/broadcasting-guidelines?language=en_US. Info for youtube: https://support.google.com/youtube/answer/2853702?hl=en#zippy=%2Cvariable-bitrate-with-custom-stream-keys-in-live-control-room%2Ck-p-fps%2Cp-fps.
Limit fps recording with x damage. This is good when running replay mode 24/7 and being afk or when not much is happening on the screen.
On nvidia some games apparently causes the game to appear to stutter (without dropping fps) when recording a monitor but not using
when using direct screen capture. Observed in Deus Ex and Apex Legends.
Support "screen" (all monitors) capture on wayland. This should be done by getting all drm fds and multiple EGL_DMA_BUF_PLANEX_FD_EXT to create one egl image with all fds combined.
Support pipewire screen capture?
CPU usage is pretty high on AMD/Intel/(Nvidia(wayland)), why? opening and closing fds, creating egl, cuda association, is slow when done every frame. Test if desktop portal screencast has better performance.
Capture is broken on amd on wlroots. It's disabled at the moment and instead uses kms capture. Find out why we get a black screen in wlroots.
Support vulkan video encoding. That might workaround forced p2 state nvidia driver "bug". Ffmpeg supports vulkan video encoding if it's encoding with --enable-vulkan
It may be possible to improve color conversion rgb->yuv shader for color edges by biasing colors to an edge, instead of letting color overlaying with bilinear filtering handle it.
When webcam is supported mention that nvidia_drm.modeset=1 must be set on nvidia x11 (it's required on wayland so it's not needed there. Or does eglstream work without it??). Check if this really is the case.
Support green screen removal, cropping, shader effects in general (circle mask, rounded corners, etc).
Preset is set to p5 for now but it should ideally be p6 or p7.
This change is needed because for certain sizes of a window (or monitor?) such as 971x780 causes encoding to freeze
when using h264 codec. This is a new(?) nvidia driver bug.
Maybe dont choose p6 or p7 again? it causes micro stutter for some users (?).
For low latency, see https://developer.download.nvidia.com/compute/nvenc/v4.0/NVENC_VideoEncoder_API_ProgGuide.pdf (section 7.1).
Remove follow focused option.
Exit if X11/Wayland killed (if drm plane dead or something?)
Use SRC_W and SRC_H for screen plane instead of crtc_w and crtc_h.
Make it possible to select which /dev/dri/card* to use, but that requires opengl to also use the same card. Not sure if that is possible for amd, intel and nvidia without using vulkan instead.
Support intel display framebuffer compression (I915_FORMAT_MOD_Y_TILED_CCS modifier) (and other power saving modifiers, see https://trac.ffmpeg.org/ticket/8542). The only fix may be to use desktop portal for recording. This issue doesn't appear on x11 since these modifiers are not used by xorg server.
This issue only appears on some intel iGPUs, such as Intel Iris Xe, see: https://github.com/dec05eba/gpu-screen-recorder-issues/issues/1.
Intel dedicated GPU (intel arc a750) can have a similar issue, but it's not related to compression. In that case the modifier is I915_FORMAT_MOD_4_TILED.
Test if p2 state can be worked around by using pure nvenc api and overwriting cuInit/cuCtxCreate* to not do anything. Cuda might be loaded when using nvenc but it might not be used, with certain record options? (such as h264 p5).
nvenc uses cuda when using b frames and rgb->yuv conversion, so convert the image ourselves instead.-
Mesa doesn't support global headers (AV_CODEC_FLAG_GLOBAL_HEADER) with h264... which also breaks mkv since mkv requires global header. Right now gpu screen recorder will forcefully set video codec to hevc when h264 is requested for mkv files.
Drop frames if live streaming cant keep up with target fps, or dynamically change resolution/quality.
Support low power option (does it even work with vaapi in ffmpeg??). Would be very useful for steam deck.
Instead of sending a big list of drm data back to kms client, send the monitor we want to record to kms server and the server should respond with only the matching monitor, and cursor.
Tonemap hdr to sdr when hdr is enabled and when hevc_hdr/av1_hdr is not used.
Add 10 bit record option, h264_10bit, hevc_10bit and av1_10bit.
Rotate cursor texture properly (around top left origin).
Setup hardware video context so we can query constraints and capabilities for better default and better error messages.
Use CAP_SYS_NICE in flatpak too on the main gpu screen recorder binary. It makes recording smoother, especially with constant framerate.
Show error when using compressed kms plane which isn't supported. Also do that in the gui.
Modify ffmpeg to accept opengl texture for nvenc encoding. Removes extra buffers and copies.
When vulkan encode is added, mention minimum nvidia driver required. (550.54.14?).
Support drm plane rotation. Neither X11 nor any Wayland compositor currently rotates drm planes so this might not be needed.
Investigate if there is a way to do gpu->gpu copy directly without touching system ram to enable video encoding on a different gpu. On nvidia this is possible with cudaMemcpyPeer, but how about from an intel/amd gpu to an nvidia gpu or the other way around or any combination of iGPU and dedicated GPU?
Maybe something with clEnqueueMigrateMemObjects? on AMD something with DirectGMA maybe?
Go back to using pure vaapi without opengl for video encoding? rotation (transpose) can be done if its done after (rgb to yuv) color conversion.
Implement scaling and use lanczos resampling for better quality. Lanczos resampling can also be used for YUV chroma for better color quality on small text.
Try fixing HDR by passing HDR+10 data as well, and in the packet. Run "ffprobe -loglevel quiet -read_intervals "%+#2" -select_streams v:0 -show_entries side_data video.mp4" to test if the file has correct metadata.
Flac is disabled because the frame sizes are too large which causes big audio/video desync.
Add 10-bit capture option. This is good because it reduces banding and quality in very dark areas while reducing the file size compared to doing the same thing with 8-bits.
Enable b-frames.
|