File tuned_latency-balanced.conf of Package pipewire
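The file below can be fetched straight from the build service with the osc client, roughly like this (a sketch, assuming osc is installed and pointed at this OBS instance):

    osc cat home:X0F:branches:multimedia pipewire tuned_latency-balanced.conf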
#
# tuned configuration
#
[main]
summary=Less power-hungry version of latency-performance profile
#include=latency-performance

[variables]
# use 'rd.fstab=1 fstab=yes' to force systemd to actually use root's mount options
# 'tsc=noirqtime,unstable,nowatchdog' & 'hpet=force clocksource=hpet highres=on' may be mandatory for a reliable system timer,
# maybe add 'lapic=notscdeadline,nowatchdog x2apic_phys apicpmtimer acpi_use_timer_override x86_intel_mid_timer=lapic_and_apbt' for Intel CPUs
# and 'align_va_addr=on' which is the default on AMD CPUs
# 'zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=25' to enable transparently compressed swap which is viable on NVMe but the kernel may bug-out and hang on untested regressions
# zswap is a cache for disk swap, zram should be used for actual compressed RAM-based swap: https://linuxreviews.org/Comparison_of_Compression_Algorithms
# is 'nohz_full=[core_range]' still required even with NO_HZ_FULL=y, as the kernel doc still says ?
# 'threadirqs' may also still be required despite IRQ_FORCED_THREADING=y
# see https://forum.osdev.org/viewtopic.php?f=1&t=40408 for 'cpu_init_udelay=X'
# on desktop the watchdog is only good for triggering accidentally and doing a hot reset that corrupts all your data, so disable it with 'acpi_no_watchdog nmi_watchdog=0 nowatchdog'
# defragging and such on hugepages can lead to random latency spikes, so it can be disabled with 'hugetlb_free_vmemmap=off'
cmdline_main=sysrq_always_enabled noautogroup numa_balancing=disable gbpages transparent_hugepage=madvise nmi_watchdog=0 x2apic_phys

# recommended RCU setup, which is somehow not default, is: 'rcu_nocbs=all rcu_nocb_poll rcutree.kthread_prio=1 rcutree.use_softirq=0 rcutree.rcu_kick_kthreads rcupdate.rcu_normal_after_boot rcutree.jiffies_till_sched_qs=333'
# "jiffie"-based values should be adjusted for the HZ-rate of the kernel
# see https://lwn.net/Articles/777214/ and https://lkml.org/lkml/2020/9/17/1243 also https://docs.kernel.org/RCU/stallwarn.html and https://www.uwsg.indiana.edu/hypermail/linux/kernel/2012.1/03981.html
cmdline_rcu=rcu_nocbs=all rcutree.kthread_prio=25 rcutree.use_softirq=0 rcutree.rcu_kick_kthreads=1 rcutree.blimit=8 rcutree.qlowmark=24 rcutree.qhimark=256 rcutree.jiffies_till_first_fqs=1 rcutree.rcu_idle_gp_delay=4 rcupdate.rcu_task_ipi_delay=10 rcutree.rcu_idle_lazy_gp_delay=3333

# IOMMU stuff: 'iommu=on,nopt,noforce,noagp,nofullflush,memaper=3,merge iommu.forcedac=1 amd_iommu=on intel_iommu=on,sm_on'
# also use some "RAM cache" for TLB with 'swiotlb=<slabs>' where slabs are 2KB in size for some reason, so 8192=16MB, 131072=256MB, 524288=1GB
# but since v6.6 the kernel increases swiotlb at runtime, so it's better to keep it minimal by default
cmdline_io=swiotlb=1024 iommu=noagp,nofullflush,memaper=3,merge amd_iommu=pgtbl_v2 intel-ishtp.ishtp_use_dma=1 ioatdma.ioat_pending_level=4 ioatdma.completion_timeout=999 ioatdma.idle_timeout=10000 idxd.sva=1 idxd.tc_override=1

# PCI[e] optimizations: 'pci=ecrc=on,ioapicreroute,assign-busses,check_enable_amd_mmconf,realloc,pcie_bus_perf,pcie_scan_all pcie_ports=auto/native'
cmdline_pci=pcie_ports=auto pci=noaer,ecrc=on,ioapicreroute,assign-busses,check_enable_amd_mmconf,realloc,pcie_bus_perf

# ASPM is known to trigger critical issues while providing little power-saving, if any. Same goes for PCIe PM, so do 'pcie_aspm=off pcie_port_pm=off', but 'pcie_aspm.policy=performance' may be a gentler option
# other ASPM options are: 'default' (developer's default, NOT config's default/unchanged !), 'powersave' and 'powersupersave'
# the idiocy of ASPM handling is in the fact that to actually disable it, you may have to 'force'-enable (!) it in the kernel (with policy=performance) to allow it to do anything, because geniuses at Intel decided "to match Windows" for "less risk"…
# BIOS may limit power-states of the CPU, or the kernel may be too "shy" to use them unless directly asked by BIOS, so let's force them with 'processor.ignore_ppc=1 processor.ignore_tpc=1'. that may be detrimental for laptops
# also, see https://github.com/torvalds/linux/commit/25de5718356e264820625600a9edca1df5ff26f8 - the kernel was made more aggressive in lowering frequency, which is bad for latency-critical tasks
# https://bugzilla.redhat.com/show_bug.cgi?id=463285#c12 - 2 is default, 1 may give much better powersaving and 3 - better latency guarantees
# hacked BIOS' for locked Intel CPUs may contain microcode that unlocks some higher performance states but that requires disabling the OS microcode override with 'dis_ucode_ldr'
# some Intel hacks: "processor.ignore_ppc=1 processor.ignore_tpc=1 processor.latency_factor=3 processor.max_cstate=7 intel_pstate=percpu_perf_limits,force intel_idle.use_acpi=1 skew_tick=1"
cmdline_power=processor.ignore_ppc=1 processor.ignore_tpc=1 pcie_aspm.policy=powersave

# threaded interrupts via 'nvme.use_threaded_interrupts=1' are supposed to be better for performance and latency but they may fail miserably instead
# the HW max for nvme.io_queue_depth is supposed to be 65535 but high values may increase NVMe I/O latency; it may help to set nvme.max_host_mem_size_mb to be no less than GPU GART (256-1024 MB)
# however, since kernel 5.14.3 nvme.io_queue_depth=65535 hangs at boot, it seems it was limited to 4095: https://lists.infradead.org/pipermail/linux-nvme/2014-July/001064.html and https://lore.kernel.org/all/31c4dc69-5d10-cc6a-4295-e42bbc0993d0@protonmail.com/
# see https://docs.microsoft.com/en-us/answers/questions/71713/a-good-queue-length-figure-for-premium-ssd.html for discerning optimal depth and nr_requests
# the idiotic poll-based nvme access can only be enabled at boot time BUT you HAVE to guess the correct amount of queues for your specific device or it will silently fail
# use something small like 'nvme.poll_queues=4 nvme.write_queues=2' by default
# See https://lists.openwrt.org/pipermail/linux-nvme/2017-July/011956.html and 'NVM Express 1.3d, Section 4.4 ("Scatter Gather List (SGL)")'
# nvme.sgl_threshold=1 should enable "Scatter-Gather List" on all I/O, 4096 - over 4KB, 65536 - over 64KB and 2097152 - over 2MB which is x86_64's hugepage size; nvme_core.streams is disabled by default, nvme_core.default_ps_max_latency_us=5500 disables the deepest power-saving mode
# some more options like: nvme.use_cmb_sqes=1 nvme_core.streams=1 nvme_core.default_ps_max_latency_us=5500 ?
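# to double-check which of the nvme options above the running kernel actually picked up, something like this works
# (a sketch; parameter visibility varies by kernel build, and modinfo shows less when nvme is built in):
#   modinfo -p nvme
#   grep -r . /sys/module/nvme/parameters/ /sys/module/nvme_core/parameters/ 2>/dev/null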
cmdline_storage=nvme.use_threaded_interrupts=1 nvme.use_cmb_sqes=1 nvme.io_queue_depth=65 nvme.max_host_mem_size_mb=3072 nvme.poll_queues=8 nvme.write_queues=4 nvme.sgl_threshold=65536 nvme_core.streams=1 nvme_core.admin_timeout=900 nvme_core.io_timeout=4294967295 nvme_core.shutdown_timeout=60

# MSI-capable NVMe running on an old Intel system may result in the boot failure "INTR-REMAP:[fault reason 38]" (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/605686), the workaround is 'intremap=nosid'
# amdgpu tends to completely shit itself on init on some systems when 64-bit PCIe addressing (AKA "above 4GB decoding") is enabled and "host bridge windows from ACPI" are "used" in PCI[e], so 'pci=nocrs' needs to be added
# amdgpu's "new output framework" shits the bed in pure UEFI video mode by permanently disabling the output port that was used by UEFI, so disable it with 'amdgpu.dc=0'
# although, it can also be worked around by using UEFI CSM compatibility and legacy VBIOS since amdgpu forces a full modeset reinit either way
# 'intremap=nosid,no_x2apic_optout pci=nocrs,big_root_window,skip_isa_align iomem=relaxed acpi_enforce_resources=lax amd_iommu_intr=vapic'
# 'efi=runtime' might be needed to stop disabling BIOS' background trojans
cmdline_workarounds=intremap=nosid,no_x2apic_optout pci=big_root_window,skip_isa_align mem_encrypt=off mce=recovery iomem=relaxed acpi_enforce_resources=lax

# ignore vulnerabilities (all with 'mitigations=off') and disable mitigations that result in a huge loss in performance ? most danger comes from virtual machines with untrusted guests
cmdline_vulnerabilities=nospectre_v1 tsx_async_abort=off l1tf=off kvm-intel.vmentry_l1d_flush=cond mds=off

# mirror module options from /etc/modprobe.d/90-HSF.conf for potentially built-in modules
cmdline_options=drm_kms_helper.poll=0 amd_pmf.force_load=1 virtio_blk.queue_depth=33 virtio_blk.poll_queues=8 virtio_blk.num_request_queues=8

[bootloader]
cmdline=${cmdline_main} ${cmdline_rcu} ${cmdline_io} ${cmdline_pci} ${cmdline_power} ${cmdline_storage} ${cmdline_workarounds} ${cmdline_vulnerabilities} ${cmdline_options}

[script]
# script for autoprobing sensors and configuring network devices
script=script.sh

[vm]
# https://www.kernel.org/doc/Documentation/vm/transhuge.txt
# use THP (transparent hugepages of 2M instead of the default 4K on x86) to speed up RAM management by avoiding fragmentation & memory controller overload, for the price of more size overhead.
# 'always' forces them by default, use only with a lot of RAM to spare. Better used in conjunction with the kernel boot option 'transparent_hugepage=always' for early allocation.
# 'madvise' relies on software explicitly requesting them, which is the safe option
# this became incompatible with a kernel built with PREEMPT_RT=y
transparent_hugepages=always
# also try putting something like 'GLIBC_TUNABLES=glibc.malloc.hugetlb=1' into /etc/environment
# or install libhugetlbfs and preload it

# this seems broken
#[video]
#radeon_powersave=dpm-balanced,auto

[cpu]
# see /usr/lib/python*/site-packages/tuned/plugins/plugin_cpu.py
#force_latency=5
load_threshold=0.33
latency_low=1
latency_high=99
pm_qos_resume_latency_us=111
# this setting is possibly no longer applicable.
# 'schedutil' is considered the universal future-proof solution but it's too undercooked yet and might properly work only on AMD.
# for some reason on newer kernels it decides to max-out frequency all the time, just like 'performance'
# 'ondemand' is a safe default for all cases.
# on Intel CPUs only "performance" and "powersave" may be available, and the former completely disables frequency scaling
# '|' can be used for "if absent then try next"
# 'userspace' may be viable with things like thermald
governor=schedutil|ondemand|conservative|powersave
# https://lore.kernel.org/patchwork/patch/655892/ & https://lkml.org/lkml/2019/3/18/612
# dumbasses from Intel decided to hardcode this to 'normal' during kernel init and disregard BIOS settings: https://patchwork.kernel.org/project/linux-pm/patch/6369897.qxlu8PgE1t@house/
# bias is set from 0 to 15 where performance=0, balance-performance=4, normal=6, balance-power=8, power=15
# or numbers can be used directly; for desktop you may want 'performance' and for laptop - 'balance-power'
energy_perf_bias=performance|balance-performance
# see /sys/devices/system/cpu/cpufreq/policy*/energy_performance_available_preferences
energy_performance_preference=performance|balance-performance
sampling_down_factor=3
#min_perf_pct=63
#max_perf_pct=133
#hwp_dynamic_boost=1
#energy_efficiency=0
#no_turbo=0

# once again, this is completely broken because there is no such thing as a 'perf' python module and pyperf does not count
#[scheduler]
# sysctl settings that were imprisoned in debugfs by some I-will-happily-impose-my-favourite-default-settings-unto-whole-world guys
# while Linux's default latency for high-load multi-subsystem multi-thread multimedia apps like games is still terrible
# https://lkml.org/lkml/2021/10/21/663
# https://github.com/zen-kernel/zen-kernel/issues/238
# https://serverfault.com/questions/925815/cpu-utilization-impact-due-to-granularity-kernel-parameter-rhel6-vs-rhel7
# https://dev.to/satorutakeuchi/the-linux-s-sysctl-parameters-about-process-scheduler-1dh5
# previously, 100k was the lowest achievable value and defaults were 10 times that
# for desktop you may want ≤10k and for laptop - ≥100k
#sched_min_granularity_ns=3111
#sched_idle_min_granularity_ns=96333
#sched_latency_ns=6333
#sched_wakeup_granularity_ns=1333
# "sched_tunable_scaling controls whether the scheduler can adjust sched_latency_ns. The values are 0 = do not adjust, 1 = logarithmic adjustment, and 2 = linear adjustment.
# The adjustment made is based on the number of CPUs, and increases logarithmically or linearly as implied in the available values.
# This is due to the fact that with more CPUs there's an apparent reduction in perceived latency."
#sched_tunable_scaling=0
# default is 500000 (0.5 ms) which causes a migration seesaw, try 3-33 ms
#sched_migration_cost_ns=111333
# supposedly, faster and bigger migration should do more useful work on CPUs
# but it has been shown to create conflicts for realtime multimedia tasks by preempting them too much
#sched_nr_migrate=4
#numa_balancing_scan_delay_ms=1666
#numa_balancing_scan_period_min_ms=3333
#numa_balancing_scan_period_max_ms=9999
#numa_balancing_scan_size_mb=16

[sysfs]
/sys/kernel/debug/sched/min_granularity_ns=13111
# this seems to be missing from the dedicated section
/sys/kernel/debug/sched/idle_min_granularity_ns=333111
/sys/kernel/debug/sched/latency_ns=99666
/sys/kernel/debug/sched/latency_warn_once=0
/sys/kernel/debug/sched/latency_warn_ms=16
/sys/kernel/debug/sched/wakeup_granularity_ns=3311
/sys/kernel/debug/sched/tunable_scaling=0
/sys/kernel/debug/sched/migration_cost_ns=11333
/sys/kernel/debug/sched/nr_migrate=16
/sys/kernel/debug/sched/numa_balancing/scan_delay_ms=1666
/sys/kernel/debug/sched/numa_balancing/scan_period_min_ms=1111
/sys/kernel/debug/sched/numa_balancing/scan_period_max_ms=9999
/sys/kernel/debug/sched/numa_balancing/scan_size_mb=16
# none / voluntary / full
#/sys/kernel/debug/sched/preempt=full
# N / Y
#/sys/kernel/debug/sched/verbose=Y

# frequency control stuff
# https://www.phoronix.com/forums/forum/software/mobile-linux/1151560-the-xanmod-kernel-is-working-well-to-boost-ubuntu-desktop-workstation-performance/page3
# 0 / 1 / 3 / 9 / 18 / 33 / 50 / 111 / 500 / 1111 / 3333 / 6333 ?
/sys/devices/system/cpu/cpufreq/schedutil/rate_limit_us=33
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate=111
#/sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor=4
/sys/devices/system/cpu/cpufreq/ondemand/up_threshold=77
/sys/devices/system/cpu/cpufreq/ondemand/down_threshold=33
/sys/devices/system/cpu/cpufreq/ondemand/powersave_bias=11
/sys/devices/system/cpu/cpufreq/ondemand/io_is_busy=1
/sys/devices/system/cpu/cpufreq/conservative/freq_step=11
/sys/devices/system/cpu/cpufreq/conservative/down_threshold=33
/sys/devices/system/cpu/cpufreq/conservative/sampling_down_factor=4
# intel_pstate hacks from https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt
/sys/kernel/debug/pstate_snb/sample_rate_ms=3
/sys/kernel/debug/pstate_snb/setpoint=80
/sys/kernel/debug/pstate_snb/p_gain_pct=25

# https://www.kernel.org/doc/Documentation/x86/x86_64/machinecheck
/sys/devices/system/machinecheck/machinecheck*/tolerant=2

# https://www.kernel.org/doc/html/latest/admin-guide/mm/ksm.html
# don't run KSM too often
/sys/kernel/mm/ksm/pages_to_scan=4096
/sys/kernel/mm/ksm/sleep_millisecs=111
/sys/kernel/mm/ksm/stable_node_chains_prune_millisecs=3333
/sys/kernel/mm/ksm/max_page_sharing=512
/sys/kernel/mm/ksm/merge_across_nodes=0
/sys/kernel/mm/ksm/use_zero_pages=1
/sys/kernel/mm/ksm/run=1

# accompanying options for THP
# get some 1GB-sized mega-pages on x86_64 too, 2MB ones are set via vm.nr_overcommit_hugepages
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_overcommit_hugepages=8
# tmpfs mounts should also have the 'huge=' option with 'always', 'within_size' or 'advise', like in /etc/systemd/system/{dev-shm,tmp}.mount
/sys/kernel/mm/transparent_hugepage/shmem_enabled=advise
# 'defer+madvise' is the most preferred option but it has a small risk of delays breaking realtime, 'always' breaks realtime for sure and creates bad stutters.
# however, tuned seems to shit itself and refuses to use 'defer'.
/sys/kernel/mm/transparent_hugepage/defrag=defer+madvise
# the separate periodic defrag call also breaks realtime and creates bad stutters
#/sys/kernel/mm/transparent_hugepage/khugepaged/defrag=0
# make it often and not long to avoid stutters
# 512/1000/10000 values seem tolerable on weak old CPUs
/sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan=512
/sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs=111
/sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs=3333
# for x86_64 arch
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none=65536
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_swap=4096

# NVMe I/O scheduling optimizations for minimal random r/w latency
# accommodates the kernel NVMe module options above
# https://www.kernel.org/doc/Documentation/block/
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-storage_and_file_systems-configuration_tools
# is there any sense in using any I/O scheduler other than 'noop' ? maybe only 'mq-deadline' but it seems to hurt direct i/o read performance
# see https://github.com/pop-os/default-settings/pull/149 and https://www.phoronix.com/forums/forum/software/general-linux-open-source/1437578-mq-deadline-scheduler-optimized-for-much-better-scalability
# never use mq-deadline: https://lore.kernel.org/linux-block/0ca63d05-fc5b-4e6a-a828-52eb24305545@acm.org/
/sys/block/nvme*n*/queue/scheduler=kyber
# add all NVMe devices as entropy sources ? this is very effective but may worsen I/O access latency
/sys/block/nvme*n*/queue/add_random=1

## NOOP options
# NVMe is supposed to support a queue depth of 65535 but Linux did not allow me to set more than (nvme.io_queue_depth-1) which is 4094 max, and 256 is default
# this is limited by nvme.io_queue_depth and 8/16/32 may be good for latency
# HOWEVER, that number is requests per application's I/O "turn" and not overall per device queue
# this might be capped to <nvme.io_queue_depth>-1
# setting below the default 1024 likes to cause kernel hangs with xfs and i/o scheduling kernel threads
/sys/block/nvme*n*/queue/nr_requests=8
# device-specific, the maximum value must be equal to max_hw_sectors_kb but a smaller one supposedly improves I/O latency
# for me, sometimes it decided to have 2048 or 256
# 2048kb is equal to a hugepage on x86_64 and AMD GPUs, 4kb is the default filesystem block size
# queue memory is 2*nr_requests*max_sectors_kb so 2*8192*64KB=1GB
/sys/block/nvme*n*/queue/max_sectors_kb=2048
# how this works is a mystery. you would think that anything >0 would ruin latency on SSD/NVMe but it seems 0 ruins throughput instead
# try 4/8/16/64 or 1024/2048/8192
# see https://github.com/Azure-Samples/cassandra-on-azure-vms-performance-experiments/blob/master/docs/cassandra-read-ahead.md
# and https://tracker.ceph.com/projects/ceph/wiki/Tuning_for_All_Flash_Deployments
/sys/block/nvme*n*/queue/read_ahead_kb=2048
# filesystem driver's internal read_ahead
/sys/class/bdi/btrfs-*/read_ahead_kb=2048
# https://events.static.linuxfound.org/sites/events/files/slides/lemoal-nvme-polling-vault-2017-final_0.pdf
# -1 is default (high CPU-load);
# 0 is "adaptive hybrid polling" (best latency with moderate CPU-load);
# >0 is fixed-time hybrid polling in ns (device-specific; 2 may be close to 0, 4 - minimal CPU load with latency as high as -1)
/sys/block/nvme*n*/queue/io_poll_delay=0
# actually enable it, if possible
#/sys/block/nvme*n*/queue/io_poll=1
# don't use the CPU for the work of the NVMe controller ?
# 1 means 'simple merges', 2 is to actually disable them
/sys/block/nvme*n*/queue/nomerges=2
# supposedly decreases CPU load due to caching. 0 to disable, 1 for "CPU group", 2 is strict CPU binding
/sys/block/nvme*n*/queue/rq_affinity=2
# this throttles writes if it detects latency above this, 0 is "disable"
# default seems to be 2000, so you may try things like 333/666/999/1333/3333/6666/7333/9999
/sys/block/nvme*n*/queue/wbt_lat_usec=333
/sys/block/nvme*n*/queue/throttle_sample_time=1
## KYBER options: defaults are 2000000 / 10000000
/sys/block/nvme*n*/queue/iosched/read_lat_nsec=333111
/sys/block/nvme*n*/queue/iosched/write_lat_nsec=333111111
## MQ-DEADLINE options
/sys/block/nvme*n*/queue/iosched/front_merges=0
# must be lower for better latency
/sys/block/nvme*n*/queue/iosched/fifo_batch=4
/sys/block/nvme*n*/queue/iosched/writes_starved=4
/sys/block/nvme*n*/queue/iosched/read_expire=100
/sys/block/nvme*n*/queue/iosched/write_expire=1000

# same for legacy SATA ?
/sys/block/sd?/queue/nr_requests=64
/sys/block/sd?/queue/max_sectors_kb=2048
/sys/block/sd?/queue/read_ahead_kb=2048
/sys/block/sd?/queue/io_poll_delay=0
/sys/block/sd?/queue/io_poll=1
#/sys/block/sd?/queue/rq_affinity=2
#/sys/block/sd?/queue/wbt_lat_usec=33111
#/sys/block/sd?/queue/throttle_sample_time=11
/sys/block/sd?/device/ncq_prio_enable=1
# https://ata.wiki.kernel.org/index.php/Libata_FAQ
#/sys/block/sd?/device/queue_depth=31
# https://www.spinics.net/lists/linux-scsi/msg80506.html
# https://www.ibm.com/docs/en/linux-on-systems?topic=wsd-setting-queue-depth
/sys/block/sd?/device/queue_ramp_up_period=1111
# https://www.ibm.com/docs/en/linux-on-systems?topic=wsd-displaying-information-1
#/sys/block/sd?/device/queue_type=none
## KYBER options: defaults are 2000000 / 10000000
/sys/block/sd?/queue/iosched/read_lat_nsec=1111333
/sys/block/sd?/queue/iosched/write_lat_nsec=3311111

# this needs /etc/udev/rules.d/61-io-schedulers.rules like this:
# HDD / SSD
#ACTION=="add|change", KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"
#ACTION=="add|change", KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="kyber"
# SDcard / eMMC
#ACTION=="add|change", KERNEL=="mmcblk?", ATTR{removable}=="1", ATTR{queue/scheduler}="bfq"
#ACTION=="add|change", KERNEL=="mmcblk?", ATTR{removable}=="0", ATTR{queue/scheduler}="kyber"

# remove wasteful operations on zram/zswap (should complement the sysctl settings)
/sys/block/zram?/queue/nr_requests=1024
# 4 or 2048 ?
/sys/block/zram?/queue/max_sectors_kb=2048
# 4 / 64 / 2048 ?
/sys/block/zram?/queue/read_ahead_kb=2048
/sys/block/zram?/queue/io_poll_delay=0
#/sys/block/zram?/queue/io_poll=1
/sys/block/zram?/queue/nomerges=1
/sys/block/zram?/queue/rq_affinity=1
/sys/block/zram?/queue/throttle_sample_time=8

## attempt at controlling PWM fans
# we need to make sure that all this is done after module probing and autoloading
# it would be nice to set it to not spin lower than 2 times per second (120 RPM) BUT if that's not 0 then the fan alarm will lose its shit
#/sys/class/hwmon/hwmon?/fan?_min=0
# SHUT UP
/sys/class/hwmon/hwmon?/temp?_beep=0
/sys/class/hwmon/hwmon?/fan?_beep=0
/sys/class/hwmon/hwmon?/fan?_alarm=0
# modern CPUs and their thermal pastes are designed to be no hotter than 70 degrees
# attempt to force a "use fan at 32-70 degrees Celsius where 70 is critical, try to use ~80% at initialization and ~100% fan speed at 60 degrees" rule on the motherboard
/sys/class/hwmon/hwmon?/temp?_min=22000
/sys/class/hwmon/hwmon?/temp?_max=77000
#/sys/class/hwmon/hwmon?/temp?_crit_hyst=87000
#/sys/class/hwmon/hwmon?/temp?_crit=90000
# 23437 is a default safe value (anything above the hearing frequency of 22 KHz, about 25 KHz), lower ones (like 10190) may work better but they also produce noticeable noise
#/sys/class/hwmon/hwmon?/pwm?_freq=46875
# apparently, 0="full auto" (temperature-based ?), 1="open-loop" (manual, based on value 0-255 in pwm?) and 2="closed-loop" (based on target RPM or some other weird crap)
# on my system with the broken it87 module, setting '0' forces constant 100%, '1' forces a constant hardcoded value (and a wrong one at that) and '2' gets stuck near the last used value
# '2' may require fan IDs to be consistent with temperature sensor IDs which is unrealistic (for me they are reversed)
#/sys/class/hwmon/hwmon?/pwm?=145
# this may behave weirdly if applied to all fans at once, so do it only for the CPU one
#/sys/class/hwmon/hwmon?/pwm1_enable=2
# the AMD GPU is locked to about 20% for some reason by default, try to force it to spin at 100% close to 65 degrees and set about 60% by default, so it would, at worst, be stuck on a useful speed
#/sys/class/drm/card?/device/hwmon/hwmon?/fan?_enable=1
#/sys/class/drm/card?/device/hwmon/hwmon?/pwm?=128
# this also broke on recent kernels
#/sys/class/drm/card?/device/hwmon/hwmon?/pwm?_min=32
#/sys/class/drm/card?/device/hwmon/hwmon?/pwm?_max=160
# 0 seems to force constant 100% and 1 is fully manual
# on '2' amdgpu gets stuck at 13XX and on 0 it's just 100%, meaning that this whole thing is broken
#/sys/class/drm/card?/device/hwmon/hwmon?/pwm?_enable=2
#/sys/class/drm/card?/device/hwmon/hwmon?/temp?_min=42000
#/sys/class/drm/card?/device/hwmon/hwmon?/temp?_max=67000
# GPUs are usually built to withstand 10-20 more degrees than CPUs but it still degrades them
#/sys/class/drm/card?/device/hwmon/hwmon?/temp?_crit_hyst=77000
#/sys/class/drm/card?/device/hwmon/hwmon?/temp?_crit=80000

[sysctl]
# it may actually hurt load balancing:
# https://unix.stackexchange.com/questions/277505/why-is-nice-level-ignored-between-different-login-sessions-honoured-if-star
kernel.sched_autogroup_enabled=0
# seems to be absent in newer kernels ! - https://forum.endeavouros.com/t/sysctl-output-changed-from-kernel-5-10-to-5-13-why/17097
# inspired by https://forums.gentoo.org/viewtopic-p-8001720.html and https://probablydance.com/2019/12/30/measuring-mutexes-spinlocks-and-how-bad-the-linux-scheduler-really-is/
#kernel.sched_latency_ns=100000
# Minimal preemption granularity for CPU-bound tasks:
# (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds)
#kernel.sched_min_granularity_ns=100000
#kernel.sched_wakeup_granularity_ns=100000
# aggressiveness of task migration between CPUs
# https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_MRG/1.0/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-Realtime_Specific_Tuning-Using_sched_nr_migrate_to_limit_SCHED_OTHER_processes..html
# RT-optimized is 2, default is 8, 32 seems safe even for RT, and for modern high-thread CPUs and many-thread operating systems much more may be viable
#kernel.sched_nr_migrate=32
# The total time the scheduler will consider a migrated process
# "cache hot" and thus less likely to be re-migrated
# (system default is 500000, i.e. 0.5 ms)
#kernel.sched_migration_cost_ns=6666666
# https://www.suse.com/documentation/sles-12/book_sle_tuning/data/sec_tuning_taskscheduler_cfs.html
#kernel.sched_time_avg_ms=8
#kernel.sched_tunable_scaling=1
# https://bugzilla.redhat.com/show_bug.cgi?id=1797629
kernel.timer_migration=0
kernel.sched_cfs_bandwidth_slice_us=10000
kernel.sched_child_runs_first=1
#kernel.sched_energy_aware=0
kernel.sched_deadline_period_max_us=111111
kernel.sched_deadline_period_min_us=1111
# for better realtime guarantees
# this doesn't work with CONFIG_RT_GROUP_SCHED and must be -1 (unlimited) which is dangerous due to system lock-ups
kernel.sched_rt_runtime_us=50000
kernel.sched_rt_period_us=100000
# https://wiki.linuxfoundation.org/realtime/documentation/technical_basics/sched_policy_prio/start
# this is a very specific parameter: the time-slice for RR processes of the same priority
kernel.sched_rr_timeslice_ms=1
# https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#sched-util-clamp-min
kernel.sched_util_clamp_max=999
# unfortunately, RT limits depend on the generic limits
kernel.sched_util_clamp_min=1024
kernel.sched_util_clamp_min_rt_default=1024

# https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
# this only leads to random memory errors
#vm.overcommit_memory=2
#vm.overcommit_ratio=90
# give 16GB to the dynamic hugepage pool by setting '8192' for the 2MB pages of the x86_64 arch ?
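# that works out to 8192 * 2MB = 16GB of overcommitable 2MB pages; whether the pool is actually being drawn on can be
# checked with something like 'grep -i huge /proc/meminfo' (HugePages_Surp / Hugetlb show what is currently handed out)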
vm.nr_overcommit_hugepages=8192
vm.admin_reserve_kbytes=262144
# try the gid of the 'users' group which is likely '100' by default
vm.hugetlb_shm_group=100
# the default of this is based on RAM-size (66 MB with 16 GB for me, for example) and works well enough but we want more
vm.min_free_kbytes=262144
# disable for lower RAM access latency (may cause severe fragmentation) or enable for efficiency (may hurt realtime tasks)
# default is '1' for "enabled"
#vm.compact_unevictable_allowed=0
# default is 20
vm.compaction_proactiveness=5
#vm.hugepages_treat_as_movable=1
# clustered servers may want to keep it at 0 but 3 could be a safe compromise between latency and efficiency… if it worked BUT in reality it only wastes CPU and brings I/O to a crawl with no benefit
# https://blogs.dropbox.com/tech/2017/09/optimizing-web-servers-for-high-throughput-and-low-latency/
#vm.zone_reclaim_mode=0
# how aggressive reclaim is, default is 1%
#vm.min_unmapped_ratio=3
# enable to keep processes in RAM that is controlled by the CPU that is running them
# multi-socket systems must have this enabled
# once again, this critical option seems to have been nuked
#kernel.numa_balancing=0
#kernel.numa_balancing_scan_period_min_ms=64
#kernel.numa_balancing_scan_period_max_ms=10000
#kernel.numa_balancing_scan_delay_ms=256
#kernel.numa_balancing_scan_size_mb=16

# better clock:
# http://wiki.linuxaudio.org/wiki/system_configuration
dev.hpet.max-user-freq=4096

# may already be selected as the kernel's default
# https://wiki.gentoo.org/wiki/Traffic_shaping#Theory
# sfq is simple and safe, fq_codel is complex and robust, cake is even more complex but lacks some tunable parameters
net.core.default_qdisc=fq_codel

# some stuff inspired by the stock tuned rt profile
# see https://www.kernel.org/doc/Documentation/sysctl/net.txt
# https://github.com/leandromoreira/linux-network-performance-parameters
kernel.hung_task_check_interval_secs=5
kernel.hung_task_timeout_secs=30
vm.stat_interval=10
net.ipv4.tcp_fastopen=3

# default is 64 with 1/1 for rx/tx, 39 seems safe for rx/tx bias 3/2, 472 seems fine
# set weight to [ netdev_budget / weight_rx_bias ] ?
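# a worked example of that relation, assuming the kernel derives dev_rx_weight = dev_weight * dev_weight_rx_bias and
# dev_tx_weight = dev_weight * dev_weight_tx_bias as the sysctl docs describe: with the 8 / 3 / 2 set below, one NAPI
# poll cycle gets a budget of about 8*3=24 rx and 8*2=16 tx packets, while netdev_budget=128 caps the whole softirq round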
net.core.dev_weight=8
net.core.dev_weight_rx_bias=3
net.core.dev_weight_tx_bias=2
# this may hang the ethernet interface in a mysterious way
net.core.busy_read=0
# should be 100 for maximum network performance but anything ≥50 is a CPU strain
net.core.busy_poll=0
net.core.netdev_max_backlog=1024
# 700-871 may be optimal but will eat a lot of CPU-time
# <200 may lead to dropped packets even on slow links but 117 seems safe
net.core.netdev_budget=128
# should be ≥5000 for maximum network performance, small values may result in dropped frames & packets
# 751 seems safe, 999 may be optimal, 1999 is decent for networking performance
net.core.netdev_budget_usecs=999
#net.core.netdev_tstamp_prequeue=0

# https://github.com/leandromoreira/linux-network-performance-parameters
# attempt at getting a minimal buffer by default with a window for up to 10G
net.core.optmem_max=1048576
net.core.rmem_default=1048576
net.core.rmem_max=134217728
net.core.wmem_default=1048576
net.core.wmem_max=134217728
# tcp/udp_mem is supposedly in 4K pages
# so set it to min / pressure / max as 128MB / 768MB / 1GB
# too low may result in randomly dropped packets
net.ipv4.tcp_mem=32768 196608 262144
net.ipv4.tcp_rmem=16384 1048576 134217728
net.ipv4.tcp_wmem=16384 1048576 134217728
net.ipv4.udp_mem=32768 196608 262144
net.ipv4.udp_rmem_min=16384
net.ipv4.udp_wmem_min=16384
# https://www.kernel.org/doc/Documentation/networking/scaling.txt
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-networking-configuration_tools#sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Configuration_tools-Configuring_Receive_Packet_Steering_RPS
# net.core.rps_sock_flow_entries should be the number of maximum expected system-wide simultaneous connections
net.core.rps_sock_flow_entries=1024

# BPF
#net.core.bpf_jit_enable=1
#net.core.bpf_jit_harden=1
net.core.bpf_jit_kallsyms=1
net.core.bpf_jit_limit=1073741824

# in RAM-caching we prefer an extremely large read cache and a fairly low write cache
# If a workload mostly uses anonymous memory and it hits this limit, the entire working set is buffered for I/O, and any more write buffering would require swapping, so it's time to throttle writes until I/O can catch up. Workloads that mostly use file mappings may be able to use even higher values.
# The generator of dirty data starts writeback at this percentage (system default is 20%)
vm.dirty_ratio=24
# Start background writeback (via writeback threads) at this percentage (system default is 10%)
vm.dirty_background_ratio=16
# give it a minute
vm.dirty_expire_centisecs=12000
vm.dirty_writeback_centisecs=6000
# https://unix.stackexchange.com/questions/30286/can-i-configure-my-linux-system-for-more-aggressive-file-system-caching
# 100 is default, <100 is "retain more fs directory caches" (may be better when accessing big directories or otherwise a lot of files at once), >100 up to 1000 is "retain less"
vm.vfs_cache_pressure=111
# The swappiness parameter controls the tendency of the kernel to move processes out of physical memory and onto the swap disk.
# 0 tells the kernel to avoid swapping processes out of physical memory for as long as possible.
# 100 tells the kernel to aggressively swap processes out of physical memory and move them to swap cache.
# 0-10 is good for systems with normal swap but with usage of zswap (compressed swap in RAM) higher values may be more desirable.
# this seems to have changed in recent years: <100 is "prefer dropping caches to avoid swapping" and >100-200 is "prefer swapping stale process memory"
# so 150-180 may be preferred nowadays with zram+nvme (the swap file should have a lower priority than 10 in fstab)
vm.swappiness=133
# higher values (4-5) may be better on systems with CPU power to spare for I/O BUT they may increase latencies on SSDs
# see https://www.reddit.com/r/Fedora/comments/mzun99/new_zram_tuning_benchmarks/ which says that anything >0 produces too much latency
# but 9 might result in 2048K read-ahead (swap read-ahead is 4K * 2^page-cluster, so 4 gives 64K) which would coincide with most NVMEs' sector size and the desired read-ahead
vm.page-cluster=1
# aggressiveness of swap in freeing memory, 100 means try to keep 1% of memory free, 1000 is the maximum
vm.watermark_scale_factor=333
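# a minimal usage sketch for this file (the profile name 'latency-balanced' is just an example; depending on the tuned
# version, custom profiles live in /etc/tuned/<name>/ or /etc/tuned/profiles/<name>/):
#   mkdir -p /etc/tuned/latency-balanced
#   cp tuned_latency-balanced.conf /etc/tuned/latency-balanced/tuned.conf
#   tuned-adm profile latency-balanced
#   tuned-adm active    # confirm it became the active profile
#   tuned-adm verify    # let tuned check whether the values were actually applied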