RE: [Bus error] huge_pages default value (try) not fall back
From | Fan Liu |
---|---|
Subject | RE: [Bus error] huge_pages default value (try) not fall back |
Date | |
Msg-id | VI1PR0702MB372637DB70D9AAB11124891A9E830@VI1PR0702MB3726.eurprd07.prod.outlook.com |
In response to | Re: [Bus error] huge_pages default value (try) not fall back (Odin Ugedal <odin@ugedal.com>) |
List | pgsql-bugs |
Thank you so much for the information.

BRs,
Fan Liu
ADP Document Database PG

>>-----Original Message-----
>>From: Odin Ugedal <odin@ugedal.com>
>>Sent: June 9, 2020 23:23
>>To: Fan Liu <fan.liu@ericsson.com>
>>Cc: Dmitry Dolgov <9erthalion6@gmail.com>; PostgreSQL mailing lists <pgsql-bugs@lists.postgresql.org>
>>Subject: Re: [Bus error] huge_pages default value (try) not fall back
>>
>>Hi,
>>
>>I stumbled upon this issue while working on the related Kubernetes issue
>>referenced a few mails back. From what I understand, this issue is (or may
>>be) a result of how the hugetlb cgroup enforces the "limit_in_bytes" limit
>>for huge pages. Under normal circumstances a process should theoretically
>>not segfault like this when using memory obtained from a successful mmap.
>>The value set in "limit_in_bytes" is only enforced during page allocation,
>>and _not_ when mapping pages with mmap. As a result, an mmap of n huge
>>pages succeeds as long as the system has n free huge pages, even if that
>>size exceeds "limit_in_bytes". The process then reserves the huge page
>>memory, making it inaccessible to other processes.
>>
>>The real issue is when postgres tries to write to the memory it received
>>from mmap, and the kernel tries to allocate the reserved huge page memory.
>>Since the cgroup does not allow it to do so, the process segfaults.
>>
>>This issue has been fixed in Linux by this patch
>>https://lkml.org/lkml/2020/2/3/1153, which adds a new element of control to
>>the cgroup that will fix this issue. There are however no container
>>runtimes that use it yet, and only 5.7+ (afaik.) kernels support it, but
>>the progress can be tracked here:
>>https://github.com/opencontainers/runtime-spec/issues/1050. The fix for the
>>upstream Kubernetes issue
>>(https://github.com/opencontainers/runtime-spec/issues/1050), which made
>>kubernetes set the wrong value for the top-level "limit_in_bytes" when the
>>pre-allocated page count increased after kubernetes (kubelet) startup, will
>>hopefully land in Kubernetes 1.19 (or 1.20). Fingers crossed!
>>
>>Hopefully this makes some sense, and gives some insight into the issue...
>>
>>Best regards,
>>Odin Ugedal