Discussion:
Bug#945879: linux: slab memory leak around 30MB/hour causing OOM on OVH VPS
Aurelien Jarno
2019-11-30 13:10:02 UTC
Package: src:linux
Version: 4.19.67-2+deb10u2
Severity: important

Dear Maintainer,

Since the latest Debian stable release, I observe a slab memory leak of
about 30MB/hour when running the kernel 4.19.67-2+deb10u2 on an OVH VPS,
which causes all applications to slowly move to swap after a few days,
and eventually an OOM. You'll find a typical munin memory plot attached
to the bug report.
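For anyone wanting to confirm the same growth without munin, a minimal sketch (assuming the Linux /proc/meminfo interface) that samples the kernel's total slab usage, in kB:

```shell
# Sample total slab usage from /proc/meminfo; logging this periodically
# (e.g. once a minute from cron) reproduces the growth curve shown in
# the attached munin plot.
slab_kb=$(awk '/^Slab:/ {print $2}' /proc/meminfo)
printf '%s slab_kb=%s\n' "$(date +%s)" "$slab_kb"
```

At the reported ~30MB/hour, consecutive one-minute samples should differ by roughly 500 kB.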

This is the typical output of "slabtop -s c -o" at boot time:

| Active / Total Objects (% used) : 174642 / 180042 (97,0%)
| Active / Total Slabs (% used) : 6885 / 6885 (100,0%)
| Active / Total Caches (% used) : 97 / 135 (71,9%)
| Active / Total Size (% used) : 44380,79K / 45976,49K (96,5%)
| Minimum / Average / Maximum Object : 0,01K / 0,25K / 15,94K
|
| OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
| 7725 7613 98% 1,05K 515 15 8240K ext4_inode_cache
| 10816 10199 94% 0,58K 832 13 6656K inode_cache
| 8666 8589 99% 0,57K 619 14 4952K radix_tree_node
| 24885 21700 87% 0,19K 1185 21 4740K dentry
| 18120 18120 100% 0,13K 604 30 2416K kernfs_node_cache
| 11229 11160 99% 0,20K 591 19 2364K vm_area_struct
| 584 568 97% 4,00K 73 8 2336K kmalloc-4k
| 4160 3743 89% 0,25K 260 16 1040K filp
| 256 256 100% 3,75K 32 8 1024K task_struct
| 1188 1114 93% 0,69K 108 11 864K shmem_inode_cache
| 848 838 98% 1,00K 106 8 848K kmalloc-1k
| 368 368 100% 2,00K 46 8 736K kmalloc-2k
| 10944 10625 97% 0,06K 171 64 684K anon_vma_chain
| 703 703 100% 0,81K 37 19 592K sock_inode_cache
| 1176 1176 100% 0,50K 147 8 588K kmalloc-512
| 5733 5674 98% 0,10K 147 39 588K buffer_head
| 816 805 98% 0,65K 68 12 544K proc_inode_cache
| 5980 5980 100% 0,09K 130 46 520K anon_vma
| 240 240 100% 2,06K 16 15 512K sighand_cache
| 1134 1134 100% 0,44K 126 9 504K kmem_cache
| 375 375 100% 1,06K 25 15 400K signal_cache
| 5952 5753 96% 0,06K 93 64 372K kmalloc-64
| 322 322 100% 1,12K 23 14 368K mm_struct
| 352 352 100% 1,00K 44 8 352K UNIX
| 10368 10368 100% 0,03K 81 128 324K kmalloc-32
| 36 36 100% 8,00K 9 4 288K kmalloc-8k
| 135 60 44% 2,05K 9 15 288K request_queue
| 7344 7344 100% 0,04K 72 102 288K ext4_extent_status
| 1092 1092 100% 0,19K 52 21 208K cred_jar
| 4080 4080 100% 0,05K 48 85 192K ftrace_event_field
| 966 966 100% 0,19K 46 21 184K kmalloc-192
| 720 720 100% 0,25K 45 16 180K kmalloc-256
| 1344 1344 100% 0,12K 42 32 168K pid
| 40 40 100% 4,00K 5 8 160K biovec-max
| 624 560 89% 0,25K 39 16 156K skbuff_head_cache
| 108 108 100% 1,25K 9 12 144K UDPv6
| 1472 1472 100% 0,09K 32 46 128K trace_event_file
| 1680 1680 100% 0,07K 30 56 120K Acpi-Operand
| 1218 1218 100% 0,09K 29 42 116K kmalloc-96
| 928 928 100% 0,12K 29 32 116K kmalloc-128
| 348 348 100% 0,31K 29 12 116K mnt_cache
| 154 154 100% 0,69K 14 11 112K files_cache
| 6400 6400 100% 0,02K 25 256 100K kmalloc-16
| 39 39 100% 2,31K 3 13 96K TCPv6
| 399 399 100% 0,19K 19 21 76K proc_dir_entry
| 1152 1152 100% 0,06K 18 64 72K kmem_cache_node
| 867 867 100% 0,08K 17 51 68K task_delay_info
| 16 16 100% 4,00K 2 8 64K names_cache
| 1632 1632 100% 0,04K 16 102 64K pde_opener
| 28 28 100% 2,19K 2 14 64K TCP
| 7168 7168 100% 0,01K 14 512 56K kmalloc-8
| 896 896 100% 0,06K 14 64 56K kmalloc-rcl-64
| 168 113 67% 0,31K 14 12 56K nf_conntrack
| 504 463 91% 0,09K 12 42 48K kmalloc-rcl-96
| 768 768 100% 0,06K 12 64 48K vmap_area
| 616 616 100% 0,07K 11 56 44K eventpoll_pwq
| 189 189 100% 0,19K 9 21 36K dmaengine-unmap-16
| 1360 1360 100% 0,02K 8 170 32K numa_policy
| 15 15 100% 2,06K 1 15 32K dmaengine-unmap-256
| 112 112 100% 0,25K 7 16 28K pool_workqueue
| 36 36 100% 0,62K 3 12 24K task_group
| 168 168 100% 0,14K 6 28 24K ext4_groupinfo_4k
| 160 160 100% 0,12K 5 32 20K scsi_sense_cache
| 19 19 100% 0,81K 1 19 16K bdev_cache
| 15 15 100% 1,06K 1 15 16K dmaengine-unmap-128
| 16 16 100% 1,00K 2 8 16K biovec-64
| 8 8 100% 2,00K 1 8 16K biovec-128
| 13 13 100% 1,19K 1 13 16K RAWv6
| 136 136 100% 0,12K 4 34 16K jbd2_journal_head
| 408 408 100% 0,04K 4 102 16K ext4_system_zone
| 96 96 100% 0,12K 3 32 12K kmalloc-rcl-128
| 153 153 100% 0,08K 3 51 12K Acpi-State
| 24 24 100% 0,50K 3 8 12K skbuff_fclone_cache
| 23 23 100% 0,34K 1 23 8K taskstats
| 64 64 100% 0,12K 2 32 8K skbuff_ext_cache
| 78 78 100% 0,10K 2 39 8K blkdev_ioc
| 10 10 100% 0,75K 1 10 8K dax_cache
| 13 13 100% 0,60K 1 13 8K hugetlbfs_inode_cache
| 8 8 100% 1,00K 1 8 8K RAW
| 8 8 100% 0,94K 1 8 8K mqueue_inode_cache
| 21 21 100% 0,19K 1 21 4K kmalloc-rcl-192
| 73 73 100% 0,05K 1 73 4K Acpi-Parse
| 128 128 100% 0,03K 1 128 4K fsnotify_mark_connector
| 18 18 100% 0,21K 1 18 4K file_lock_cache
| 36 36 100% 0,11K 1 36 4K khugepaged_mm_slot
| 16 16 100% 0,25K 1 16 4K dquot
| 13 13 100% 0,30K 1 13 4K request_sock_TCP
| 13 13 100% 0,30K 1 13 4K request_sock_TCPv6
| 16 16 100% 0,24K 1 16 4K tw_sock_TCPv6
| 22 22 100% 0,18K 1 22 4K ip6-frags
| 85 85 100% 0,05K 1 85 4K fscrypt_ctx
| 128 128 100% 0,03K 1 128 4K jbd2_revoke_record_s
| 256 256 100% 0,02K 1 256 4K jbd2_revoke_table_s
| 73 73 100% 0,05K 1 73 4K mbcache
| 64 64 100% 0,06K 1 64 4K ext4_io_end
| 32 32 100% 0,12K 1 32 4K ext4_allocation_context
| 34 34 100% 0,12K 1 34 4K xt_hashlimit
| 0 0 0% 0,01K 0 512 0K kmalloc-rcl-8
| 0 0 0% 0,02K 0 256 0K kmalloc-rcl-16
| 0 0 0% 0,03K 0 128 0K kmalloc-rcl-32
| 0 0 0% 0,25K 0 16 0K kmalloc-rcl-256
| 0 0 0% 0,50K 0 8 0K kmalloc-rcl-512
| 0 0 0% 1,00K 0 8 0K kmalloc-rcl-1k
| 0 0 0% 2,00K 0 8 0K kmalloc-rcl-2k
| 0 0 0% 4,00K 0 8 0K kmalloc-rcl-4k
| 0 0 0% 8,00K 0 4 0K kmalloc-rcl-8k
| 0 0 0% 0,09K 0 42 0K dma-kmalloc-96
| 0 0 0% 0,19K 0 21 0K dma-kmalloc-192
| 0 0 0% 0,01K 0 512 0K dma-kmalloc-8
| 0 0 0% 0,02K 0 256 0K dma-kmalloc-16
| 0 0 0% 0,03K 0 128 0K dma-kmalloc-32
| 0 0 0% 0,06K 0 64 0K dma-kmalloc-64
| 0 0 0% 0,12K 0 32 0K dma-kmalloc-128
| 0 0 0% 0,25K 0 16 0K dma-kmalloc-256
| 0 0 0% 0,50K 0 8 0K dma-kmalloc-512
| 0 0 0% 1,00K 0 8 0K dma-kmalloc-1k
| 0 0 0% 2,00K 0 8 0K dma-kmalloc-2k
| 0 0 0% 4,00K 0 8 0K dma-kmalloc-4k
| 0 0 0% 8,00K 0 4 0K dma-kmalloc-8k
| 0 0 0% 0,43K 0 9 0K uts_namespace
| 0 0 0% 0,12K 0 34 0K iint_cache
| 0 0 0% 4,81K 0 6 0K net_namespace
| 0 0 0% 0,52K 0 15 0K user_namespace
| 0 0 0% 0,24K 0 16 0K tw_sock_TCP
| 0 0 0% 0,94K 0 8 0K PING
| 0 0 0% 0,69K 0 11 0K xfrm_state
| 0 0 0% 0,20K 0 20 0K ip4-frags
| 0 0 0% 0,20K 0 19 0K pid_namespace
| 0 0 0% 0,03K 0 128 0K dnotify_struct
| 0 0 0% 0,19K 0 21 0K userfaultfd_ctx_cache
| 0 0 0% 1,19K 0 13 0K PINGv6
| 0 0 0% 4,06K 0 7 0K x86_fpu
| 0 0 0% 0,16K 0 24 0K kvm_mmu_page_header
| 0 0 0% 15,94K 0 2 0K kvm_vcpu
| 0 0 0% 0,13K 0 30 0K kvm_async_pf


and the same output after a few hours of uptime:

| Active / Total Objects (% used) : 860957 / 882873 (97,5%)
| Active / Total Slabs (% used) : 25917 / 25917 (100,0%)
| Active / Total Caches (% used) : 100 / 137 (73,0%)
| Active / Total Size (% used) : 146417,38K / 150674,02K (97,2%)
| Minimum / Average / Maximum Object : 0,01K / 0,17K / 15,94K
|
| OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
| 130378 130271 99% 0,20K 6862 19 27448K vm_area_struct
| 5064 5057 99% 4,00K 633 8 20256K kmalloc-4k
| 8775 8546 97% 1,05K 585 15 9360K ext4_inode_cache
| 15054 14454 96% 0,58K 1158 13 9264K inode_cache
| 41706 35124 84% 0,19K 1986 21 7944K dentry
| 75972 66943 88% 0,10K 1948 39 7792K buffer_head
| 25376 24909 98% 0,25K 1586 16 6344K filp
| 10598 10104 95% 0,57K 757 14 6056K radix_tree_node
| 95808 95552 99% 0,06K 1497 64 5988K anon_vma_chain
| 7087 7087 100% 0,81K 373 19 5968K sock_inode_cache
| 4942 4942 100% 1,12K 353 14 5648K mm_struct
| 36690 36690 100% 0,13K 1223 30 4892K kernfs_node_cache
| 10386 10386 100% 0,44K 1154 9 4616K kmem_cache
| 51704 51704 100% 0,09K 1124 46 4496K anon_vma
| 2984 2984 100% 1,00K 373 8 2984K UNIX
| 3192 2367 74% 0,65K 266 12 2128K proc_inode_cache
| 54144 54144 100% 0,03K 423 128 1692K kmalloc-32
| 25472 25472 100% 0,06K 398 64 1592K kmalloc-64
| 1544 1536 99% 1,00K 193 8 1544K kmalloc-1k
| 8064 8064 100% 0,19K 384 21 1536K cred_jar
| 744 744 100% 2,00K 93 8 1488K kmalloc-2k
| 11904 11904 100% 0,12K 372 32 1488K pid
| 175616 175616 100% 0,01K 343 512 1372K kmalloc-8
| 336 336 100% 3,75K 42 8 1344K task_struct
| 10920 10920 100% 0,09K 260 42 1040K kmalloc-96
| 2008 2008 100% 0,50K 251 8 1004K kmalloc-512
| 1155 1024 88% 0,69K 105 11 840K shmem_inode_cache
| 10432 10432 100% 0,06K 163 64 652K kmem_cache_node
| 300 300 100% 2,06K 20 15 640K sighand_cache
| 555 555 100% 1,06K 37 15 592K signal_cache
| 180 129 71% 2,05K 12 15 384K request_queue
| 1392 1392 100% 0,25K 87 16 348K kmalloc-256
| 8364 5802 69% 0,04K 82 102 328K ext4_extent_status
| 80 44 55% 4,00K 10 8 320K biovec-max
| 1596 1596 100% 0,19K 76 21 304K kmalloc-192
| 36 36 100% 8,00K 9 4 288K kmalloc-8k
| 4480 4413 98% 0,06K 70 64 280K vmap_area
| 896 832 92% 0,25K 56 16 224K skbuff_head_cache
| 13824 13824 100% 0,02K 54 256 216K kmalloc-16
| 4080 4080 100% 0,05K 48 85 192K ftrace_event_field
| 144 144 100% 1,25K 12 12 192K UDPv6
| 1120 1120 100% 0,12K 35 32 140K kmalloc-128
| 187 187 100% 0,69K 17 11 136K files_cache
| 1472 1472 100% 0,09K 32 46 128K trace_event_file
| 52 52 100% 2,31K 4 13 128K TCPv6
| 1680 1680 100% 0,07K 30 56 120K Acpi-Operand
| 348 348 100% 0,31K 29 12 116K mnt_cache
| 1176 1094 93% 0,09K 28 42 112K kmalloc-rcl-96
| 546 433 79% 0,19K 26 21 104K dmaengine-unmap-16
| 816 680 83% 0,12K 24 34 96K jbd2_journal_head
| 1071 1071 100% 0,08K 21 51 84K task_delay_info
| 1216 1172 96% 0,06K 19 64 76K kmalloc-rcl-64
| 399 399 100% 0,19K 19 21 76K proc_dir_entry
| 1734 1734 100% 0,04K 17 102 68K pde_opener
| 16 16 100% 4,00K 2 8 64K names_cache
| 28 28 100% 2,19K 2 14 64K TCP
| 168 168 100% 0,31K 14 12 56K nf_conntrack
| 949 949 100% 0,05K 13 73 52K mbcache
| 616 616 100% 0,07K 11 56 44K eventpoll_pwq
| 1700 1700 100% 0,02K 10 170 40K numa_policy
| 408 408 100% 0,08K 8 51 32K Acpi-State
| 15 15 100% 2,06K 1 15 32K dmaengine-unmap-256
| 26 26 100% 1,19K 2 13 32K RAWv6
| 26 26 100% 1,19K 2 13 32K PINGv6
| 112 112 100% 0,25K 7 16 28K pool_workqueue
| 896 896 100% 0,03K 7 128 28K jbd2_revoke_record_s
| 36 36 100% 0,62K 3 12 24K task_group
| 168 168 100% 0,14K 6 28 24K ext4_groupinfo_4k
| 160 160 100% 0,12K 5 32 20K scsi_sense_cache
| 19 19 100% 0,81K 1 19 16K bdev_cache
| 15 15 100% 1,06K 1 15 16K dmaengine-unmap-128
| 8 8 100% 2,00K 1 8 16K biovec-128
| 16 16 100% 1,00K 2 8 16K RAW
| 16 16 100% 0,94K 2 8 16K PING
| 408 408 100% 0,04K 4 102 16K ext4_system_zone
| 96 70 72% 0,12K 3 32 12K kmalloc-rcl-128
| 24 24 100% 0,50K 3 8 12K skbuff_fclone_cache
| 42 42 100% 0,19K 2 21 8K kmalloc-rcl-192
| 23 23 100% 0,34K 1 23 8K taskstats
| 64 64 100% 0,12K 2 32 8K skbuff_ext_cache
| 8 8 100% 1,00K 1 8 8K biovec-64
| 78 78 100% 0,10K 2 39 8K blkdev_ioc
| 10 10 100% 0,75K 1 10 8K dax_cache
| 32 32 100% 0,25K 2 16 8K dquot
| 13 13 100% 0,60K 1 13 8K hugetlbfs_inode_cache
| 8 8 100% 0,94K 1 8 8K mqueue_inode_cache
| 128 128 100% 0,06K 2 64 8K ext4_io_end
| 73 73 100% 0,05K 1 73 4K Acpi-Parse
| 128 128 100% 0,03K 1 128 4K fsnotify_mark_connector
| 18 18 100% 0,21K 1 18 4K file_lock_cache
| 36 36 100% 0,11K 1 36 4K khugepaged_mm_slot
| 13 13 100% 0,30K 1 13 4K request_sock_TCP
| 16 16 100% 0,24K 1 16 4K tw_sock_TCP
| 13 13 100% 0,30K 1 13 4K request_sock_TCPv6
| 16 16 100% 0,24K 1 16 4K tw_sock_TCPv6
| 22 22 100% 0,18K 1 22 4K ip6-frags
| 85 85 100% 0,05K 1 85 4K fscrypt_ctx
| 256 256 100% 0,02K 1 256 4K jbd2_revoke_table_s
| 32 32 100% 0,12K 1 32 4K ext4_allocation_context
| 34 34 100% 0,12K 1 34 4K xt_hashlimit
| 0 0 0% 0,01K 0 512 0K kmalloc-rcl-8
| 0 0 0% 0,02K 0 256 0K kmalloc-rcl-16
| 0 0 0% 0,03K 0 128 0K kmalloc-rcl-32
| 0 0 0% 0,25K 0 16 0K kmalloc-rcl-256
| 0 0 0% 0,50K 0 8 0K kmalloc-rcl-512
| 0 0 0% 1,00K 0 8 0K kmalloc-rcl-1k
| 0 0 0% 2,00K 0 8 0K kmalloc-rcl-2k
| 0 0 0% 4,00K 0 8 0K kmalloc-rcl-4k
| 0 0 0% 8,00K 0 4 0K kmalloc-rcl-8k
| 0 0 0% 0,09K 0 42 0K dma-kmalloc-96
| 0 0 0% 0,19K 0 21 0K dma-kmalloc-192
| 0 0 0% 0,01K 0 512 0K dma-kmalloc-8
| 0 0 0% 0,02K 0 256 0K dma-kmalloc-16
| 0 0 0% 0,03K 0 128 0K dma-kmalloc-32
| 0 0 0% 0,06K 0 64 0K dma-kmalloc-64
| 0 0 0% 0,12K 0 32 0K dma-kmalloc-128
| 0 0 0% 0,25K 0 16 0K dma-kmalloc-256
| 0 0 0% 0,50K 0 8 0K dma-kmalloc-512
| 0 0 0% 1,00K 0 8 0K dma-kmalloc-1k
| 0 0 0% 2,00K 0 8 0K dma-kmalloc-2k
| 0 0 0% 4,00K 0 8 0K dma-kmalloc-4k
| 0 0 0% 8,00K 0 4 0K dma-kmalloc-8k
| 0 0 0% 0,43K 0 9 0K uts_namespace
| 0 0 0% 0,12K 0 34 0K iint_cache
| 0 0 0% 4,81K 0 6 0K net_namespace
| 0 0 0% 0,52K 0 15 0K user_namespace
| 0 0 0% 0,69K 0 11 0K xfrm_state
| 0 0 0% 0,20K 0 20 0K ip4-frags
| 0 0 0% 0,20K 0 19 0K pid_namespace
| 0 0 0% 0,03K 0 128 0K dnotify_struct
| 0 0 0% 0,19K 0 21 0K userfaultfd_ctx_cache
| 0 0 0% 4,06K 0 7 0K x86_fpu
| 0 0 0% 0,16K 0 24 0K kvm_mmu_page_header
| 0 0 0% 15,94K 0 2 0K kvm_vcpu
| 0 0 0% 0,13K 0 30 0K kvm_async_pf
| 0 0 0% 2,57K 0 12 0K dm_uevent
| 0 0 0% 3,23K 0 9 0K kcopyd_job

The problem was introduced in Debian in kernel 4.19.67-1; upstream, it
first appeared in the 4.19.66 stable release. I have been able to
bisect the issue on the upstream stable branch and found that it was
introduced by this commit:

| commit 4340d175b89896d069c1e875f5b98c80a408f680
| Author: Tejun Heo <***@kernel.org>
| Date: Fri May 31 10:38:58 2019 -0700
|
| cgroup: Include dying leaders with live threads in PROCS iterations
|
| commit c03cd7738a83b13739f00546166969342c8ff014 upstream.
|
| CSS_TASK_ITER_PROCS currently iterates live group leaders; however,
| this means that a process with dying leader and live threads will be
| skipped. IOW, cgroup.procs might be empty while cgroup.threads isn't,
| which is confusing to say the least.
|
| Fix it by making cset track dying tasks and include dying leaders with
| live threads in PROCS iteration.
|
| Signed-off-by: Tejun Heo <***@kernel.org>
| Reported-and-tested-by: Topi Miettinen <***@gmail.com>
| Cc: Oleg Nesterov <***@redhat.com>
| Signed-off-by: Greg Kroah-Hartman <***@linuxfoundation.org>
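The mechanics of such a bisect can be sketched on a throwaway repository (all names here are illustrative; in the real bisect the endpoints were the upstream stable tags v4.19.65, good, and v4.19.66, bad, and each step meant building and booting a kernel, then watching slab usage):

```shell
# Build a toy history of 5 commits, where "commit 3" stands in for the
# change that introduced the leak.
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config user.email "you@example.com"
git config user.name "You"
for i in 1 2 3 4 5; do
    echo "$i" > f
    git add f
    git commit -q -m "commit $i"
done

# Bad tip, good base; in the kernel case: git bisect start v4.19.66 v4.19.65
git bisect start HEAD HEAD~4

# Automate the good/bad verdict (exit 0 = good, non-zero = bad); in the
# real bisect this was a manual build-boot-observe cycle per step.
verdict=$(git bisect run sh -c 'test "$(cat f)" -lt 3')
echo "$verdict" | grep 'is the first bad commit'
git bisect reset >/dev/null
```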

I am also able to reproduce the problem with the latest kernel from
experimental, i.e. version 5.4-1~exp1. On the other hand, I use the
Debian stable kernel on many other machines, and they are not affected
by the issue (but none of them are OVH VPSes).

-- Package-specific info:
** Kernel log: boot messages should be attached

** Model information
sys_vendor: OpenStack Foundation
product_name: OpenStack Nova
product_version: 14.1.1
chassis_vendor: QEMU
chassis_version: pc-i440fx-xenial
bios_vendor: SeaBIOS
bios_version: 2:1.10.2-58953eb7

** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02)
Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0

00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] [8086:7000]
Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

00:01.1 IDE interface [0101]: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] [8086:7010] (prog-if 80 [ISA Compatibility mode-only controller, supports bus mastering])
Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Region 0: [virtual] Memory at 000001f0 (32-bit, non-prefetchable) [size=8]
Region 1: [virtual] Memory at 000003f0 (type 3, non-prefetchable)
Region 2: [virtual] Memory at 00000170 (32-bit, non-prefetchable) [size=8]
Region 3: [virtual] Memory at 00000370 (type 3, non-prefetchable)
Region 4: I/O ports at c0e0 [size=16]
Kernel driver in use: ata_piix
Kernel modules: ata_piix, ata_generic

00:01.2 USB controller [0c03]: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] [8086:7020] (rev 01) (prog-if 00 [UHCI])
Subsystem: Red Hat, Inc QEMU Virtual Machine [1af4:1100]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin D routed to IRQ 11
Region 4: I/O ports at c080 [size=32]
Kernel driver in use: uhci_hcd
Kernel modules: uhci_hcd

00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI [8086:7113] (rev 03)
Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 9
Kernel driver in use: piix4_smbus
Kernel modules: i2c_piix4

00:02.0 VGA compatible controller [0300]: Cirrus Logic GD 5446 [1013:00b8] (prog-if 00 [VGA controller])
Subsystem: Red Hat, Inc QEMU Virtual Machine [1af4:1100]
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Region 0: Memory at fc000000 (32-bit, prefetchable) [size=32M]
Region 1: Memory at febd0000 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at 000c0000 [disabled] [size=128K]
Kernel driver in use: cirrus
Kernel modules: cirrusfb, cirrus

00:03.0 Ethernet controller [0200]: Red Hat, Inc Virtio network device [1af4:1000]
Subsystem: Red Hat, Inc Virtio network device [1af4:0001]
Physical Slot: 3
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at c0a0 [size=32]
Region 1: Memory at febd1000 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at feb80000 [disabled] [size=256K]
Capabilities: <access denied>
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci

00:04.0 SCSI storage controller [0100]: Red Hat, Inc Virtio SCSI [1af4:1004]
Subsystem: Red Hat, Inc Virtio SCSI [1af4:0008]
Physical Slot: 4
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at c000 [size=64]
Region 1: Memory at febd2000 (32-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci

00:05.0 SCSI storage controller [0100]: Red Hat, Inc Virtio block device [1af4:1001]
Subsystem: Red Hat, Inc Virtio block device [1af4:0002]
Physical Slot: 5
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at c040 [size=64]
Region 1: Memory at febd3000 (32-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci

00:06.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon [1af4:1002]
Subsystem: Red Hat, Inc Virtio memory balloon [1af4:0005]
Physical Slot: 6
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at c0c0 [size=32]
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci


** USB devices:
Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub


-- System Information:
Debian Release: 10.2
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.4.0-trunk-amd64 (SMP w/1 CPU core)
Kernel taint flags: TAINT_UNSIGNED_MODULE
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8), LANGUAGE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages linux-image-4.19.0-6-amd64 depends on:
ii initramfs-tools [linux-initramfs-tool] 0.133+deb10u1
ii kmod 26-1
ii linux-base 4.6

Versions of packages linux-image-4.19.0-6-amd64 recommends:
pn apparmor <none>
pn firmware-linux-free <none>

Versions of packages linux-image-4.19.0-6-amd64 suggests:
pn debian-kernel-handbook <none>
ii grub-pc 2.02+dfsg1-20
pn linux-doc-4.19 <none>

Versions of packages linux-image-4.19.0-6-amd64 is related to:
pn firmware-amd-graphics <none>
pn firmware-atheros <none>
pn firmware-bnx2 <none>
pn firmware-bnx2x <none>
pn firmware-brcm80211 <none>
pn firmware-cavium <none>
pn firmware-intel-sound <none>
pn firmware-intelwimax <none>
pn firmware-ipw2x00 <none>
pn firmware-ivtv <none>
pn firmware-iwlwifi <none>
pn firmware-libertas <none>
pn firmware-linux-nonfree <none>
pn firmware-misc-nonfree <none>
pn firmware-myricom <none>
pn firmware-netxen <none>
pn firmware-qlogic <none>
pn firmware-realtek <none>
pn firmware-samsung <none>
pn firmware-siano <none>
pn firmware-ti-connectivity <none>
pn xen-hypervisor <none>

-- no debconf information
Aurelien Jarno
2019-12-02 22:30:02 UTC
control: reassign -1 openssh/1:7.9p1-10+deb10u1
control: retitle -1 openssh-server: cgroup leftovers with socket activation when key exchange fails
Aurelien Jarno wrote:
| Package: src:linux
| Version: 4.19.67-2+deb10u2
| Severity: important
|
| Dear Maintainer,
|
| Since the latest Debian stable release, I observe a slab memory leak of
| about 30MB/hour when running the kernel 4.19.67-2+deb10u2 on an OVH VPS,
| which causes all applications to slowly move to swap after a few days,
| and eventually an OOM. You'll find a typical munin memory plot attached
| to the bug report.
| [...]
| The problem was introduced in Debian in kernel 4.19.67-1; upstream, it
| first appeared in the 4.19.66 stable release.
It turns out that the underlying problem is caused by SSH; this new
kernel version just makes it far more visible.

When using systemd socket activation, the OpenSSH daemon sometimes does
not remove the cgroup created for a connection after the key exchange
has failed. Normally this happens rarely, in less than 1% of
connections. However, on a single-CPU system (e.g. a VM with a single
vCPU), it happens 100% of the time.

To reproduce the problem using a VM:
- Reduce the number of vCPUs to 1.
- Switch the OpenSSH daemon to systemd socket activation using systemctl
  enable ssh.socket, followed by a reboot.
- Try to connect to the system with a key exchange algorithm not
  supported on buster, for example:
  ssh -o KexAlgorithms=diffie-hellman-group-exchange-sha1 host
- Look at /sys/fs/cgroup/memory/system.slice/system-ssh.slice. Each
  connection leaves an entry in that directory, and each entry pins
  some kernel memory.
- Depending on the available memory and swap, after a few thousand
  connections the OOM killer starts killing processes, making
the system unusable.
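The steps above can be sketched as a loop ("host" is a placeholder for the socket-activated test VM, and the loop count is arbitrary); each failed key exchange should leave one cgroup behind:

```shell
# Trigger failed key exchanges against the test VM.
for i in 1 2 3; do
    ssh -o KexAlgorithms=diffie-hellman-group-exchange-sha1 \
        -o ConnectTimeout=5 host true 2>/dev/null
done

# Count the leftover per-connection cgroups (0 on an unaffected system).
leftover=$(ls -d /sys/fs/cgroup/memory/system.slice/system-ssh.slice/* \
    2>/dev/null | wc -l | tr -d ' ')
echo "leftover cgroups: $leftover"
```

On an affected single-vCPU system the count should grow by one per failed connection.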
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
***@aurel32.net http://www.aurel32.net