File tuned_latency-balanced.conf of Package pipewire
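The file below can be fetched straight from the build service with the osc client, roughly like this (a sketch, assuming osc is installed and pointed at this OBS instance):

    osc cat home:X0F:branches:multimedia pipewire tuned_latency-balanced.conf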
#
# tuned configuration
#
[main]
summary=Less power-hungry version of latency-performance profile
#include=latency-performance

[variables]
# use 'rd.fstab=1 fstab=yes' to force systemd to actually use root's mount options
# 'tsc=noirqtime,unstable,nowatchdog' & 'hpet=force clocksource=hpet highres=on' may be mandatory for a reliable system timer,
# maybe add 'lapic=notscdeadline,nowatchdog x2apic_phys apicpmtimer acpi_use_timer_override x86_intel_mid_timer=lapic_and_apbt' for Intel CPUs
# and 'align_va_addr=on' which is the default on AMD CPUs
# 'zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=25' to enable transparently compressed swap which is viable on NVMe but the kernel may bug-out and hang on untested regressions
# zswap is a cache for disk swap, zram should be used for actual compressed RAM-based swap: https://linuxreviews.org/Comparison_of_Compression_Algorithms
# is 'nohz_full=[core_range]' still required even with NO_HZ_FULL=y, as the kernel doc still says ?
# 'threadirqs' may also still be required despite IRQ_FORCED_THREADING=y
# see https://forum.osdev.org/viewtopic.php?f=1&t=40408 for 'cpu_init_udelay=X'
# on desktop the watchdog is only good for triggering accidentally and doing a hot reset that corrupts all your data, so disable it with 'acpi_no_watchdog nmi_watchdog=0 nowatchdog'
# defragging and such on hugepages can lead to random latency spikes, so it can be disabled with 'hugetlb_free_vmemmap=off'
cmdline_main=sysrq_always_enabled noautogroup numa_balancing=disable gbpages transparent_hugepage=madvise nmi_watchdog=0 x2apic_phys

# recommended RCU setup, which is somehow not default, is: 'rcu_nocbs=all rcu_nocb_poll rcutree.kthread_prio=1 rcutree.use_softirq=0 rcutree.rcu_kick_kthreads rcupdate.rcu_normal_after_boot rcutree.jiffies_till_sched_qs=333'
# "jiffie"-based values should be adjusted for the HZ-rate of the kernel
# see https://lwn.net/Articles/777214/ and https://lkml.org/lkml/2020/9/17/1243 also https://docs.kernel.org/RCU/stallwarn.html and https://www.uwsg.indiana.edu/hypermail/linux/kernel/2012.1/03981.html
cmdline_rcu=rcu_nocbs=all rcutree.kthread_prio=25 rcutree.use_softirq=0 rcutree.rcu_kick_kthreads=1 rcutree.blimit=8 rcutree.qlowmark=24 rcutree.qhimark=256 rcutree.jiffies_till_first_fqs=1 rcutree.rcu_idle_gp_delay=4 rcupdate.rcu_task_ipi_delay=10 rcutree.rcu_idle_lazy_gp_delay=3333

# IOMMU stuff: 'iommu=on,nopt,noforce,noagp,nofullflush,memaper=3,merge iommu.forcedac=1 amd_iommu=on intel_iommu=on,sm_on'
# also use some "RAM cache" for TLB with 'swiotlb=<slabs>' where slabs are 2KB in size for some reason, so 8192=16MB, 131072=256MB, 524288=1GB
# but since v6.6 the kernel increases swiotlb at runtime, so it's better to keep it minimal by default
cmdline_io=swiotlb=1024 iommu=noagp,nofullflush,memaper=3,merge amd_iommu=pgtbl_v2 intel-ishtp.ishtp_use_dma=1 ioatdma.ioat_pending_level=4 ioatdma.completion_timeout=999 ioatdma.idle_timeout=10000 idxd.sva=1 idxd.tc_override=1

# PCI[e] optimizations: 'pci=ecrc=on,ioapicreroute,assign-busses,check_enable_amd_mmconf,realloc,pcie_bus_perf,pcie_scan_all pcie_ports=auto/native'
cmdline_pci=pcie_ports=auto pci=noaer,ecrc=on,ioapicreroute,assign-busses,check_enable_amd_mmconf,realloc,pcie_bus_perf

# ASPM is known to trigger critical issues while providing little power-saving, if any. Same goes for PCIe PM, so do 'pcie_aspm=off pcie_port_pm=off', but 'pcie_aspm.policy=performance' may be a gentler option
# other ASPM options are: 'default' (developer's default, NOT config's default/unchanged !), 'powersave' and 'powersupersave'
# the idiocy of ASPM handling is in the fact that to actually disable it, you may have to 'force'-enable (!) it in the kernel (with policy=performance) to allow it to do anything, because geniuses at Intel decided "to match Windows" for "less risk"…
# BIOS may limit power-states of the CPU, or the kernel may be too "shy" to use them unless directly asked by BIOS, so let's force them with 'processor.ignore_ppc=1 processor.ignore_tpc=1'. that may be detrimental for laptops
# also, see https://github.com/torvalds/linux/commit/25de5718356e264820625600a9edca1df5ff26f8 - the kernel was made more aggressive in lowering frequency, which is bad for latency-critical tasks
# https://bugzilla.redhat.com/show_bug.cgi?id=463285#c12 - 2 is default, 1 may give much better powersaving and 3 - better latency guarantees
# hacked BIOS' for locked Intel CPUs may contain microcode that unlocks some higher performance states but that requires disabling the OS microcode override with 'dis_ucode_ldr'
# some Intel hacks: "processor.ignore_ppc=1 processor.ignore_tpc=1 processor.latency_factor=3 processor.max_cstate=7 intel_pstate=percpu_perf_limits,force intel_idle.use_acpi=1 skew_tick=1"
cmdline_power=processor.ignore_ppc=1 processor.ignore_tpc=1 pcie_aspm.policy=powersave

# threaded interrupts via 'nvme.use_threaded_interrupts=1' are supposed to be better for performance and latency but they may fail miserably instead
# the HW max for nvme.io_queue_depth is supposed to be 65535 but high values may increase NVMe I/O latency; it may help to set nvme.max_host_mem_size_mb to be no less than GPU GART (256-1024 MB)
# however, since kernel 5.14.3 nvme.io_queue_depth=65535 hangs at boot, it seems it was limited to 4095: https://lists.infradead.org/pipermail/linux-nvme/2014-July/001064.html and https://lore.kernel.org/all/31c4dc69-5d10-cc6a-4295-e42bbc0993d0@protonmail.com/
# see https://docs.microsoft.com/en-us/answers/questions/71713/a-good-queue-length-figure-for-premium-ssd.html for discerning optimal depth and nr_requests
# the idiotic poll-based nvme access can only be enabled at boot time BUT you HAVE to guess the correct amount of queues for your specific device or it will silently fail
# use something small like 'nvme.poll_queues=4 nvme.write_queues=2' by default
# See https://lists.openwrt.org/pipermail/linux-nvme/2017-July/011956.html and 'NVM Express 1.3d, Section 4.4 ("Scatter Gather List (SGL)")'
# nvme.sgl_threshold=1 should enable "Scatter-Gather List" on all I/O, 4096 - over 4KB, 65536 - over 64KB and 2097152 - over 2MB which is x86_64's hugepage size; nvme_core.streams is disabled by default, nvme_core.default_ps_max_latency_us=5500 disables the deepest power-saving mode
# some more options like: nvme.use_cmb_sqes=1 nvme_core.streams=1 nvme_core.default_ps_max_latency_us=5500 ?
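# to double-check which of the nvme options above the running kernel actually picked up, something like this works
# (a sketch; parameter visibility varies by kernel build, and modinfo shows less when nvme is built in):
#   modinfo -p nvme
#   grep -r . /sys/module/nvme/parameters/ /sys/module/nvme_core/parameters/ 2>/dev/null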
cmdline_storage=nvme.use_threaded_interrupts=1 nvme.use_cmb_sqes=1 nvme.io_queue_depth=65 nvme.max_host_mem_size_mb=3072 nvme.poll_queues=8 nvme.write_queues=4 nvme.sgl_threshold=65536 nvme_core.streams=1 nvme_core.admin_timeout=900 nvme_core.io_timeout=4294967295 nvme_core.shutdown_timeout=60

# MSI-capable NVMe running on an old Intel system may result in the boot failure "INTR-REMAP:[fault reason 38]" (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/605686), the workaround is 'intremap=nosid'
# amdgpu tends to completely shit itself on init on some systems when 64-bit PCIe addressing (AKA "above 4GB decoding") is enabled and "host bridge windows from ACPI" are "used" in PCI[e], so 'pci=nocrs' needs to be added
# amdgpu's "new output framework" shits the bed in pure UEFI video mode by permanently disabling the output port that was used by UEFI, so disable it with 'amdgpu.dc=0'
# although, it can also be worked around by using UEFI CSM compatibility and legacy VBIOS since amdgpu forces a full modeset reinit either way
# 'intremap=nosid,no_x2apic_optout pci=nocrs,big_root_window,skip_isa_align iomem=relaxed acpi_enforce_resources=lax amd_iommu_intr=vapic'
# 'efi=runtime' might be needed to stop disabling BIOS' background trojans
cmdline_workarounds=intremap=nosid,no_x2apic_optout pci=big_root_window,skip_isa_align mem_encrypt=off mce=recovery iomem=relaxed acpi_enforce_resources=lax

# ignore vulnerabilities (all with 'mitigations=off') and disable mitigations that result in a huge loss in performance ? most danger comes from virtual machines with untrusted guests
cmdline_vulnerabilities=nospectre_v1 tsx_async_abort=off l1tf=off kvm-intel.vmentry_l1d_flush=cond mds=off

# mirror module options from /etc/modprobe.d/90-HSF.conf for potentially built-in modules
cmdline_options=drm_kms_helper.poll=0 amd_pmf.force_load=1 virtio_blk.queue_depth=33 virtio_blk.poll_queues=8 virtio_blk.num_request_queues=8

[bootloader]
cmdline=${cmdline_main} ${cmdline_rcu} ${cmdline_io} ${cmdline_pci} ${cmdline_power} ${cmdline_storage} ${cmdline_workarounds} ${cmdline_vulnerabilities} ${cmdline_options}

[script]
# script for autoprobing sensors and configuring network devices
script=script.sh

[vm]
# https://www.kernel.org/doc/Documentation/vm/transhuge.txt
# use THP (transparent hugepages of 2M instead of the default 4K on x86) to speed up RAM management by avoiding fragmentation & memory controller overload, for the price of more size overhead.
# 'always' forces them by default, use only with a lot of RAM to spare. Better used in conjunction with the kernel boot option 'transparent_hugepage=always' for early allocation.
# 'madvise' relies on software explicitly requesting them, which is the safe option
# this became incompatible with a kernel built with PREEMPT_RT=y
transparent_hugepages=always
# also try putting something like 'GLIBC_TUNABLES=glibc.malloc.hugetlb=1' into /etc/environment
# or install libhugetlbfs and preload it

# this seems broken
#[video]
#radeon_powersave=dpm-balanced,auto

[cpu]
# see /usr/lib/python*/site-packages/tuned/plugins/plugin_cpu.py
#force_latency=5
load_threshold=0.33
latency_low=1
latency_high=99
pm_qos_resume_latency_us=111
# this setting is possibly no longer applicable.
# 'schedutil' is considered the universal future-proof solution but it's too undercooked yet and might properly work only on AMD.
# for some reason on newer kernels it decides to max-out frequency all the time, just like 'performance'
# 'ondemand' is a safe default for all cases.
# on Intel CPUs only "performance" and "powersave" may be available, and the former completely disables frequency scaling
# '|' can be used for "if absent then try next"
# 'userspace' may be viable with things like thermald
governor=schedutil|ondemand|conservative|powersave
# https://lore.kernel.org/patchwork/patch/655892/ & https://lkml.org/lkml/2019/3/18/612
# dumbasses from Intel decided to hardcode this to 'normal' during kernel init and disregard BIOS settings: https://patchwork.kernel.org/project/linux-pm/patch/6369897.qxlu8PgE1t@house/
# bias is set from 0 to 15 where performance=0, balance-performance=4, normal=6, balance-power=8, power=15
# or numbers can be used directly; for desktop you may want 'performance' and for laptop - 'balance-power'
energy_perf_bias=performance|balance-performance
# see /sys/devices/system/cpu/cpufreq/policy*/energy_performance_available_preferences
energy_performance_preference=performance|balance-performance
sampling_down_factor=3
#min_perf_pct=63
#max_perf_pct=133
#hwp_dynamic_boost=1
#energy_efficiency=0
#no_turbo=0

# once again, this is completely broken because there is no such thing as a 'perf' python module and pyperf does not count
#[scheduler]
# sysctl settings that were imprisoned in debugfs by some I-will-happily-impose-my-favourite-default-settings-unto-whole-world guys
# while Linux's default latency for high-load multi-subsystem multi-thread multimedia apps like games is still terrible
# https://lkml.org/lkml/2021/10/21/663
# https://github.com/zen-kernel/zen-kernel/issues/238
# https://serverfault.com/questions/925815/cpu-utilization-impact-due-to-granularity-kernel-parameter-rhel6-vs-rhel7
# https://dev.to/satorutakeuchi/the-linux-s-sysctl-parameters-about-process-scheduler-1dh5
# previously, 100k was the lowest achievable value and defaults were 10 times that
# for desktop you may want ≤10k and for laptop - ≥100k
#sched_min_granularity_ns=3111
#sched_idle_min_granularity_ns=96333
#sched_latency_ns=6333
#sched_wakeup_granularity_ns=1333
# "sched_tunable_scaling controls whether the scheduler can adjust sched_latency_ns. The values are 0 = do not adjust, 1 = logarithmic adjustment, and 2 = linear adjustment.
# The adjustment made is based on the number of CPUs, and increases logarithmically or linearly as implied in the available values.
# This is due to the fact that with more CPUs there's an apparent reduction in perceived latency."
#sched_tunable_scaling=0
# default is 500000 (0.5 ms) which causes a migration seesaw, try 3-33 ms
#sched_migration_cost_ns=111333
# supposedly, faster and bigger migration should do more useful work on CPUs
# but it has been shown to create conflicts for realtime multimedia tasks by preempting them too much
#sched_nr_migrate=4
#numa_balancing_scan_delay_ms=1666
#numa_balancing_scan_period_min_ms=3333
#numa_balancing_scan_period_max_ms=9999
#numa_balancing_scan_size_mb=16

[sysfs]
/sys/kernel/debug/sched/min_granularity_ns=13111
# this seems to be missing from the dedicated section
/sys/kernel/debug/sched/idle_min_granularity_ns=333111
/sys/kernel/debug/sched/latency_ns=99666
/sys/kernel/debug/sched/latency_warn_once=0
/sys/kernel/debug/sched/latency_warn_ms=16
/sys/kernel/debug/sched/wakeup_granularity_ns=3311
/sys/kernel/debug/sched/tunable_scaling=0
/sys/kernel/debug/sched/migration_cost_ns=11333
/sys/kernel/debug/sched/nr_migrate=16
/sys/kernel/debug/sched/numa_balancing/scan_delay_ms=1666
/sys/kernel/debug/sched/numa_balancing/scan_period_min_ms=1111
/sys/kernel/debug/sched/numa_balancing/scan_period_max_ms=9999
/sys/kernel/debug/sched/numa_balancing/scan_size_mb=16
# none / voluntary / full
#/sys/kernel/debug/sched/preempt=full
# N / Y
#/sys/kernel/debug/sched/verbose=Y

# frequency control stuff
# https://www.phoronix.com/forums/forum/software/mobile-linux/1151560-the-xanmod-kernel-is-working-well-to-boost-ubuntu-desktop-workstation-performance/page3
# 0 / 1 / 3 / 9 / 18 / 33 / 50 / 111 / 500 / 1111 / 3333 / 6333 ?
/sys/devices/system/cpu/cpufreq/schedutil/rate_limit_us=33
/sys/devices/system/cpu/cpufreq/ondemand/sampling_rate=111
#/sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor=4
/sys/devices/system/cpu/cpufreq/ondemand/up_threshold=77
/sys/devices/system/cpu/cpufreq/ondemand/down_threshold=33
/sys/devices/system/cpu/cpufreq/ondemand/powersave_bias=11
/sys/devices/system/cpu/cpufreq/ondemand/io_is_busy=1
/sys/devices/system/cpu/cpufreq/conservative/freq_step=11
/sys/devices/system/cpu/cpufreq/conservative/down_threshold=33
/sys/devices/system/cpu/cpufreq/conservative/sampling_down_factor=4
# intel_pstate hacks from https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt
/sys/kernel/debug/pstate_snb/sample_rate_ms=3
/sys/kernel/debug/pstate_snb/setpoint=80
/sys/kernel/debug/pstate_snb/p_gain_pct=25

# https://www.kernel.org/doc/Documentation/x86/x86_64/machinecheck
/sys/devices/system/machinecheck/machinecheck*/tolerant=2

# https://www.kernel.org/doc/html/latest/admin-guide/mm/ksm.html
# don't run KSM too often
/sys/kernel/mm/ksm/pages_to_scan=4096
/sys/kernel/mm/ksm/sleep_millisecs=111
/sys/kernel/mm/ksm/stable_node_chains_prune_millisecs=3333
/sys/kernel/mm/ksm/max_page_sharing=512
/sys/kernel/mm/ksm/merge_across_nodes=0
/sys/kernel/mm/ksm/use_zero_pages=1
/sys/kernel/mm/ksm/run=1

# accompanying options for THP
# get some 1GB-sized mega-pages on x86_64 too, 2MB ones are set via vm.nr_overcommit_hugepages
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_overcommit_hugepages=8
# tmpfs mounts should also have the 'huge=' option with 'always', 'within_size' or 'advise', like in /etc/systemd/system/{dev-shm,tmp}.mount
/sys/kernel/mm/transparent_hugepage/shmem_enabled=advise
# 'defer+madvise' is the most preferred option but it has a small risk of delays breaking realtime, 'always' breaks realtime for sure and creates bad stutters.
# however, tuned seems to shit itself and refuses to use 'defer'.
/sys/kernel/mm/transparent_hugepage/defrag=defer+madvise
# the separate periodic defrag call also breaks realtime and creates bad stutters
#/sys/kernel/mm/transparent_hugepage/khugepaged/defrag=0
# make it often and not long to avoid stutters
# 512/1000/10000 values seem tolerable on weak old CPUs
/sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan=512
/sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs=111
/sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs=3333
# for x86_64 arch
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none=65536
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_swap=4096

# NVMe I/O scheduling optimizations for minimal random r/w latency
# accommodates the kernel NVMe module options above
# https://www.kernel.org/doc/Documentation/block/
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-storage_and_file_systems-configuration_tools
# is there any sense in using any I/O scheduler other than 'noop' ? maybe only 'mq-deadline' but it seems to hurt direct i/o read performance
# see https://github.com/pop-os/default-settings/pull/149 and https://www.phoronix.com/forums/forum/software/general-linux-open-source/1437578-mq-deadline-scheduler-optimized-for-much-better-scalability
# never use mq-deadline: https://lore.kernel.org/linux-block/0ca63d05-fc5b-4e6a-a828-52eb24305545@acm.org/
/sys/block/nvme*n*/queue/scheduler=kyber
# add all NVMe devices as entropy sources ? this is very effective but may worsen I/O access latency
/sys/block/nvme*n*/queue/add_random=1

## NOOP options
# NVMe is supposed to support a queue depth of 65535 but Linux did not allow me to set more than (nvme.io_queue_depth-1) which is 4094 max, and 256 is default
# this is limited by nvme.io_queue_depth and 8/16/32 may be good for latency
# HOWEVER, that number is requests per application's I/O "turn" and not overall per device queue
# this might be capped to <nvme.io_queue_depth>-1
# setting below the default 1024 likes to cause kernel hangs with xfs and i/o scheduling kernel threads
/sys/block/nvme*n*/queue/nr_requests=8
# device-specific, the maximum value must be equal to max_hw_sectors_kb but a smaller one supposedly improves I/O latency
# for me, sometimes it decided to have 2048 or 256
# 2048kb is equal to a hugepage on x86_64 and AMD GPUs, 4kb is the default filesystem block size
# queue memory is 2*nr_requests*max_sectors_kb so 2*8192*64KB=1GB
/sys/block/nvme*n*/queue/max_sectors_kb=2048
# how this works is a mystery. you would think that anything >0 would ruin latency on SSD/NVMe but it seems 0 ruins throughput instead
# try 4/8/16/64 or 1024/2048/8192
# see https://github.com/Azure-Samples/cassandra-on-azure-vms-performance-experiments/blob/master/docs/cassandra-read-ahead.md
# and https://tracker.ceph.com/projects/ceph/wiki/Tuning_for_All_Flash_Deployments
/sys/block/nvme*n*/queue/read_ahead_kb=2048
# filesystem driver's internal read_ahead
/sys/class/bdi/btrfs-*/read_ahead_kb=2048
# https://events.static.linuxfound.org/sites/events/files/slides/lemoal-nvme-polling-vault-2017-final_0.pdf
# -1 is default (high CPU-load);
# 0 is "adaptive hybrid polling" (best latency with moderate CPU-load);
# >0 is fixed-time hybrid polling in ns (device-specific; 2 may be close to 0, 4 - minimal CPU load with latency as high as -1)
/sys/block/nvme*n*/queue/io_poll_delay=0
# actually enable it, if possible
#/sys/block/nvme*n*/queue/io_poll=1
# don't use the CPU for the work of the NVMe controller ?
# 1 means 'simple merges', 2 is to actually disable them
/sys/block/nvme*n*/queue/nomerges=2
# supposedly decreases CPU load due to caching. 0 to disable, 1 for "CPU group", 2 is strict CPU binding
/sys/block/nvme*n*/queue/rq_affinity=2
# this throttles writes if it detects latency above this, 0 is "disable"
# default seems to be 2000, so you may try things like 333/666/999/1333/3333/6666/7333/9999
/sys/block/nvme*n*/queue/wbt_lat_usec=333
/sys/block/nvme*n*/queue/throttle_sample_time=1
## KYBER options: defaults are 2000000 / 10000000
/sys/block/nvme*n*/queue/iosched/read_lat_nsec=333111
/sys/block/nvme*n*/queue/iosched/write_lat_nsec=333111111
## MQ-DEADLINE options
/sys/block/nvme*n*/queue/iosched/front_merges=0
# must be lower for better latency
/sys/block/nvme*n*/queue/iosched/fifo_batch=4
/sys/block/nvme*n*/queue/iosched/writes_starved=4
/sys/block/nvme*n*/queue/iosched/read_expire=100
/sys/block/nvme*n*/queue/iosched/write_expire=1000

# same for legacy SATA ?
/sys/block/sd?/queue/nr_requests=64
/sys/block/sd?/queue/max_sectors_kb=2048
/sys/block/sd?/queue/read_ahead_kb=2048
/sys/block/sd?/queue/io_poll_delay=0
/sys/block/sd?/queue/io_poll=1
#/sys/block/sd?/queue/rq_affinity=2
#/sys/block/sd?/queue/wbt_lat_usec=33111
#/sys/block/sd?/queue/throttle_sample_time=11
/sys/block/sd?/device/ncq_prio_enable=1
# https://ata.wiki.kernel.org/index.php/Libata_FAQ
#/sys/block/sd?/device/queue_depth=31
# https://www.spinics.net/lists/linux-scsi/msg80506.html
# https://www.ibm.com/docs/en/linux-on-systems?topic=wsd-setting-queue-depth
/sys/block/sd?/device/queue_ramp_up_period=1111
# https://www.ibm.com/docs/en/linux-on-systems?topic=wsd-displaying-information-1
#/sys/block/sd?/device/queue_type=none
## KYBER options: defaults are 2000000 / 10000000
/sys/block/sd?/queue/iosched/read_lat_nsec=1111333
/sys/block/sd?/queue/iosched/write_lat_nsec=3311111

# this needs /etc/udev/rules.d/61-io-schedulers.rules like this:
# HDD / SSD
#ACTION=="add|change", KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"
#ACTION=="add|change", KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="kyber"
# SDcard / eMMC
#ACTION=="add|change", KERNEL=="mmcblk?", ATTR{removable}=="1", ATTR{queue/scheduler}="bfq"
#ACTION=="add|change", KERNEL=="mmcblk?", ATTR{removable}=="0", ATTR{queue/scheduler}="kyber"

# remove wasteful operations on zram/zswap (should complement the sysctl settings)
/sys/block/zram?/queue/nr_requests=1024
# 4 or 2048 ?
/sys/block/zram?/queue/max_sectors_kb=2048
# 4 / 64 / 2048 ?
/sys/block/zram?/queue/read_ahead_kb=2048
/sys/block/zram?/queue/io_poll_delay=0
#/sys/block/zram?/queue/io_poll=1
/sys/block/zram?/queue/nomerges=1
/sys/block/zram?/queue/rq_affinity=1
/sys/block/zram?/queue/throttle_sample_time=8

## attempt at controlling PWM fans
# we need to make sure that all this is done after module probing and autoloading
# it would be nice to set it to not spin lower than 2 times per second (120 RPM) BUT if that's not 0 then the fan alarm will lose its shit
#/sys/class/hwmon/hwmon?/fan?_min=0
# SHUT UP
/sys/class/hwmon/hwmon?/temp?_beep=0
/sys/class/hwmon/hwmon?/fan?_beep=0
/sys/class/hwmon/hwmon?/fan?_alarm=0
# modern CPUs and their thermal pastes are designed to be no hotter than 70 degrees
# attempt to force a "use fan at 32-70 degrees Celsius where 70 is critical, try to use ~80% at initialization and ~100% fan speed at 60 degrees" rule on the motherboard
/sys/class/hwmon/hwmon?/temp?_min=22000
/sys/class/hwmon/hwmon?/temp?_max=77000
#/sys/class/hwmon/hwmon?/temp?_crit_hyst=87000
#/sys/class/hwmon/hwmon?/temp?_crit=90000
# 23437 is a default safe value (anything above the hearing frequency of 22 KHz, about 25 KHz), lower ones (like 10190) may work better but they also produce noticeable noise
#/sys/class/hwmon/hwmon?/pwm?_freq=46875
# apparently, 0="full auto" (temperature-based ?), 1="open-loop" (manual, based on value 0-255 in pwm?) and 2="closed-loop" (based on target RPM or some other weird crap)
# on my system with the broken it87 module, setting '0' forces constant 100%, '1' forces a constant hardcoded value (and a wrong one at that) and '2' gets stuck near the last used value
# '2' may require fan IDs to be consistent with temperature sensor IDs which is unrealistic (for me they are reversed)
#/sys/class/hwmon/hwmon?/pwm?=145
# this may behave weirdly if applied to all fans at once, so do it only for the CPU one
#/sys/class/hwmon/hwmon?/pwm1_enable=2
# the AMD GPU is locked to about 20% for some reason by default, try to force it to spin at 100% close to 65 degrees and set about 60% by default, so it would, at worst, be stuck on a useful speed
#/sys/class/drm/card?/device/hwmon/hwmon?/fan?_enable=1
#/sys/class/drm/card?/device/hwmon/hwmon?/pwm?=128
# this also broke on recent kernels
#/sys/class/drm/card?/device/hwmon/hwmon?/pwm?_min=32
#/sys/class/drm/card?/device/hwmon/hwmon?/pwm?_max=160
# 0 seems to force constant 100% and 1 is fully manual
# on '2' amdgpu gets stuck at 13XX and on 0 it's just 100%, meaning that this whole thing is broken
#/sys/class/drm/card?/device/hwmon/hwmon?/pwm?_enable=2
#/sys/class/drm/card?/device/hwmon/hwmon?/temp?_min=42000
#/sys/class/drm/card?/device/hwmon/hwmon?/temp?_max=67000
# GPUs are usually built to withstand 10-20 more degrees than CPUs but it still degrades them
#/sys/class/drm/card?/device/hwmon/hwmon?/temp?_crit_hyst=77000
#/sys/class/drm/card?/device/hwmon/hwmon?/temp?_crit=80000

[sysctl]
# it may actually hurt load balancing:
# https://unix.stackexchange.com/questions/277505/why-is-nice-level-ignored-between-different-login-sessions-honoured-if-star
kernel.sched_autogroup_enabled=0
# seems to be absent in newer kernels ! - https://forum.endeavouros.com/t/sysctl-output-changed-from-kernel-5-10-to-5-13-why/17097
# inspired by https://forums.gentoo.org/viewtopic-p-8001720.html and https://probablydance.com/2019/12/30/measuring-mutexes-spinlocks-and-how-bad-the-linux-scheduler-really-is/
#kernel.sched_latency_ns=100000
# Minimal preemption granularity for CPU-bound tasks:
# (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds)
#kernel.sched_min_granularity_ns=100000
#kernel.sched_wakeup_granularity_ns=100000
# aggressiveness of task migration between CPUs
# https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_MRG/1.0/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-Realtime_Specific_Tuning-Using_sched_nr_migrate_to_limit_SCHED_OTHER_processes..html
# RT-optimized is 2, default is 8, 32 seems safe even for RT, and for modern high-thread CPUs and many-thread operating systems much more may be viable
#kernel.sched_nr_migrate=32
# The total time the scheduler will consider a migrated process
# "cache hot" and thus less likely to be re-migrated
# (system default is 500000, i.e. 0.5 ms)
#kernel.sched_migration_cost_ns=6666666
# https://www.suse.com/documentation/sles-12/book_sle_tuning/data/sec_tuning_taskscheduler_cfs.html
#kernel.sched_time_avg_ms=8
#kernel.sched_tunable_scaling=1
# https://bugzilla.redhat.com/show_bug.cgi?id=1797629
kernel.timer_migration=0
kernel.sched_cfs_bandwidth_slice_us=10000
kernel.sched_child_runs_first=1
#kernel.sched_energy_aware=0
kernel.sched_deadline_period_max_us=111111
kernel.sched_deadline_period_min_us=1111
# for better realtime guarantees
# this doesn't work with CONFIG_RT_GROUP_SCHED and must be -1 (unlimited) which is dangerous due to system lock-ups
kernel.sched_rt_runtime_us=50000
kernel.sched_rt_period_us=100000
# https://wiki.linuxfoundation.org/realtime/documentation/technical_basics/sched_policy_prio/start
# this is a very specific parameter: the time-slice for RR processes of the same priority
kernel.sched_rr_timeslice_ms=1
# https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#sched-util-clamp-min
kernel.sched_util_clamp_max=999
# unfortunately, RT limits depend on the generic limits
kernel.sched_util_clamp_min=1024
kernel.sched_util_clamp_min_rt_default=1024

# https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
# this only leads to random memory errors
#vm.overcommit_memory=2
#vm.overcommit_ratio=90
# give 16GB to the dynamic hugepage pool by setting '8192' for the 2MB pages of the x86_64 arch ?
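# that works out to 8192 * 2MB = 16GB of overcommitable 2MB pages; whether the pool is actually being drawn on can be
# checked with something like 'grep -i huge /proc/meminfo' (HugePages_Surp / Hugetlb show what is currently handed out)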
vm.nr_overcommit_hugepages=8192
vm.admin_reserve_kbytes=262144
# try the gid of the 'users' group which is likely '100' by default
vm.hugetlb_shm_group=100
# the default of this is based on RAM-size (66 MB with 16 GB for me, for example) and works well enough but we want more
vm.min_free_kbytes=262144
# disable for lower RAM access latency (may cause severe fragmentation) or enable for efficiency (may hurt realtime tasks)
# default is '1' for "enabled"
#vm.compact_unevictable_allowed=0
# default is 20
vm.compaction_proactiveness=5
#vm.hugepages_treat_as_movable=1
# clustered servers may want to keep it at 0 but 3 could be a safe compromise between latency and efficiency… if it worked BUT in reality it only wastes CPU and brings I/O to a crawl with no benefit
# https://blogs.dropbox.com/tech/2017/09/optimizing-web-servers-for-high-throughput-and-low-latency/
#vm.zone_reclaim_mode=0
# how aggressive reclaim is, default is 1%
#vm.min_unmapped_ratio=3
# enable to keep processes in RAM that is controlled by the CPU that is running them
# multi-socket systems must have this enabled
# once again, this critical option seems to have been nuked
#kernel.numa_balancing=0
#kernel.numa_balancing_scan_period_min_ms=64
#kernel.numa_balancing_scan_period_max_ms=10000
#kernel.numa_balancing_scan_delay_ms=256
#kernel.numa_balancing_scan_size_mb=16

# better clock:
# http://wiki.linuxaudio.org/wiki/system_configuration
dev.hpet.max-user-freq=4096

# may already be selected as the kernel's default
# https://wiki.gentoo.org/wiki/Traffic_shaping#Theory
# sfq is simple and safe, fq_codel is complex and robust, cake is even more complex but lacks some tunable parameters
net.core.default_qdisc=fq_codel

# some stuff inspired by the stock tuned rt profile
# see https://www.kernel.org/doc/Documentation/sysctl/net.txt
# https://github.com/leandromoreira/linux-network-performance-parameters
kernel.hung_task_check_interval_secs=5
kernel.hung_task_timeout_secs=30
vm.stat_interval=10
net.ipv4.tcp_fastopen=3

# default is 64 with 1/1 for rx/tx, 39 seems safe for rx/tx bias 3/2, 472 seems fine
# set weight to [ netdev_budget / weight_rx_bias ] ?
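# a worked example of that relation, assuming the kernel derives dev_rx_weight = dev_weight * dev_weight_rx_bias and
# dev_tx_weight = dev_weight * dev_weight_tx_bias as the sysctl docs describe: with the 8 / 3 / 2 set below, one NAPI
# poll cycle gets a budget of about 8*3=24 rx and 8*2=16 tx packets, while netdev_budget=128 caps the whole softirq round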
net.core.dev_weight=8
net.core.dev_weight_rx_bias=3
net.core.dev_weight_tx_bias=2
# this may hang the ethernet interface in a mysterious way
net.core.busy_read=0
# should be 100 for maximum network performance but anything ≥50 is a CPU strain
net.core.busy_poll=0
net.core.netdev_max_backlog=1024
# 700-871 may be optimal but will eat a lot of CPU-time
# <200 may lead to dropped packets even on slow links but 117 seems safe
net.core.netdev_budget=128
# should be ≥5000 for maximum network performance, small values may result in dropped frames & packets
# 751 seems safe, 999 may be optimal, 1999 is decent for networking performance
net.core.netdev_budget_usecs=999
#net.core.netdev_tstamp_prequeue=0

# https://github.com/leandromoreira/linux-network-performance-parameters
# attempt at getting a minimal buffer by default with a window for up to 10G
net.core.optmem_max=1048576
net.core.rmem_default=1048576
net.core.rmem_max=134217728
net.core.wmem_default=1048576
net.core.wmem_max=134217728
# tcp/udp_mem is supposedly in 4K pages
# so set it to min / pressure / max as 128MB / 768MB / 1GB
# too low may result in randomly dropped packets
net.ipv4.tcp_mem=32768 196608 262144
net.ipv4.tcp_rmem=16384 1048576 134217728
net.ipv4.tcp_wmem=16384 1048576 134217728
net.ipv4.udp_mem=32768 196608 262144
net.ipv4.udp_rmem_min=16384
net.ipv4.udp_wmem_min=16384
# https://www.kernel.org/doc/Documentation/networking/scaling.txt
# https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/performance_tuning_guide/sect-red_hat_enterprise_linux-performance_tuning_guide-networking-configuration_tools#sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Configuration_tools-Configuring_Receive_Packet_Steering_RPS
# net.core.rps_sock_flow_entries should be the number of maximum expected system-wide simultaneous connections
net.core.rps_sock_flow_entries=1024

# BPF
#net.core.bpf_jit_enable=1
#net.core.bpf_jit_harden=1
net.core.bpf_jit_kallsyms=1
net.core.bpf_jit_limit=1073741824

# in RAM-caching we prefer an extremely large read cache and a fairly low write cache
# If a workload mostly uses anonymous memory and it hits this limit, the entire working set is buffered for I/O, and any more write buffering would require swapping, so it's time to throttle writes until I/O can catch up. Workloads that mostly use file mappings may be able to use even higher values.
# The generator of dirty data starts writeback at this percentage (system default is 20%)
vm.dirty_ratio=24
# Start background writeback (via writeback threads) at this percentage (system default is 10%)
vm.dirty_background_ratio=16
# give it a minute
vm.dirty_expire_centisecs=12000
vm.dirty_writeback_centisecs=6000
# https://unix.stackexchange.com/questions/30286/can-i-configure-my-linux-system-for-more-aggressive-file-system-caching
# 100 is default, <100 is "retain more fs directory caches" (may be better when accessing big directories or otherwise a lot of files at once), >100 up to 1000 is "retain less"
vm.vfs_cache_pressure=111
# The swappiness parameter controls the tendency of the kernel to move processes out of physical memory and onto the swap disk.
# 0 tells the kernel to avoid swapping processes out of physical memory for as long as possible.
# 100 tells the kernel to aggressively swap processes out of physical memory and move them to swap cache.
# 0-10 is good for systems with normal swap but with usage of zswap (compressed swap in RAM) higher values may be more desirable.
# this seems to have changed in recent years: <100 is "prefer dropping caches to avoid swapping" and >100-200 is "prefer swapping stale process memory"
# so 150-180 may be preferred nowadays with zram+nvme (the swap file should have a lower priority than 10 in fstab)
vm.swappiness=133
# higher values (4-5) may be better on systems with CPU power to spare for I/O BUT they may increase latencies on SSDs
# see https://www.reddit.com/r/Fedora/comments/mzun99/new_zram_tuning_benchmarks/ which says that anything >0 produces too much latency
# but 9 might result in 2048K read-ahead (swap read-ahead is 4K * 2^page-cluster, so 4 gives 64K) which would coincide with most NVMEs' sector size and the desired read-ahead
vm.page-cluster=1
# aggressiveness of swap in freeing memory, 100 means try to keep 1% of memory free, 1000 is the maximum
vm.watermark_scale_factor=333
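# a minimal usage sketch for this file (the profile name 'latency-balanced' is just an example; depending on the tuned
# version, custom profiles live in /etc/tuned/<name>/ or /etc/tuned/profiles/<name>/):
#   mkdir -p /etc/tuned/latency-balanced
#   cp tuned_latency-balanced.conf /etc/tuned/latency-balanced/tuned.conf
#   tuned-adm profile latency-balanced
#   tuned-adm active    # confirm it became the active profile
#   tuned-adm verify    # let tuned check whether the values were actually applied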