
Java Process Exit

Published: 2022/10/1 0:13:23

JVM startup parameters:

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/ -XX:+PrintGCDetails -Xloggc:/home/app/logs/web/gc.log -XX:+PrintGCDateStamps

 

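1. Killed by Linux

1) A memory leak;
2) The process needs too much memory. For a Java process, besides the maximum heap size set by -Xmx, you also have to account for metaspace, off-heap memory, and direct memory;
3) Another process needs a lot of resources, but the OOM Killer picks the current process as its victim.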
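
To illustrate point 2, the following is a rough sketch that adds up the main memory areas of a JVM to estimate how much RAM the process may need beyond -Xmx. The function name jvm_budget_mb and the sample figures are hypothetical, and real usage also includes the code cache, GC structures, and native allocations, so the result is best read as a lower bound.

```shell
#!/usr/bin/env bash
# Rough estimate (in MB) of the memory a JVM may need: a sketch, not an exact model.
# Arguments (all hypothetical sample inputs):
#   $1 max heap in MB            (-Xmx)
#   $2 max metaspace in MB       (-XX:MaxMetaspaceSize)
#   $3 max direct memory in MB   (-XX:MaxDirectMemorySize)
#   $4 number of threads
#   $5 thread stack size in KB   (-Xss)
jvm_budget_mb() {
  local heap=$1 metaspace=$2 direct=$3 threads=$4 stack_kb=$5
  local stacks_mb=$(( threads * stack_kb / 1024 ))
  echo $(( heap + metaspace + direct + stacks_mb ))
}

# Example: 4 GB heap, 256 MB metaspace, 512 MB direct memory, 200 threads with 1 MB stacks.
jvm_budget_mb 4096 256 512 200 1024   # prints 5064
```

If the estimate comes close to the physical memory of the host (or of the container), the process is a likely OOM Killer candidate even when the heap itself never fills up.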

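OOM Killer mechanism:

The Linux kernel has a mechanism called the OOM killer (Out Of Memory killer). When the system runs short of memory, it selects a process, typically one that occupies a large amount of memory, and kills it automatically so that memory is not exhausted completely. The way the kernel detects the shortage, then picks and kills a process, can be followed in the kernel source file linux/mm/oom_kill.c: when memory runs out, out_of_memory() is triggered, and it calls select_bad_process() to choose a "bad" process to kill. How is a "bad" process identified? Linux scores each candidate by calling oom_badness(), and the idea behind the algorithm is simple: the "baddest" process is the one that uses the most memory, with the result shifted by the process's oom_score_adj value.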
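
The kernel exposes the score it currently computes for each process in /proc/<pid>/oom_score, so the likely victims can be inspected before anything is killed. The sketch below lists the highest-scoring processes; the function name top_oom_candidates is hypothetical, and the second argument exists only so the function can be pointed at a directory other than /proc for testing.

```shell
#!/usr/bin/env bash
# List processes by their current OOM score, highest (most likely victim) first.
# Output columns: score pid name
# $1: number of entries to show (default 5)
# $2: proc root (default /proc; overridable only to make the sketch testable)
top_oom_candidates() {
  local count=${1:-5} root=${2:-/proc} dir pid score name
  for dir in "$root"/[0-9]*; do
    [ -r "$dir/oom_score" ] || continue
    pid=${dir##*/}
    # A process may exit while we iterate, so tolerate read failures.
    score=$(cat "$dir/oom_score" 2>/dev/null) || continue
    name=$(cat "$dir/comm" 2>/dev/null) || name="?"
    echo "$score $pid $name"
  done | sort -rn | head -n "$count"
}
```

Running top_oom_candidates 5 on a busy host shows which processes the kernel would consider first if memory ran out at that moment.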


Checking the logs

more /var/log/messages

egrep -i -r 'killed process' /var/log

dmesg -T

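Find the approximate time at which the process was killed and check for the keywords Out of memory and Kill process xxx; the kernel log excerpt further below shows such an entry.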
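
As a small convenience on top of the commands above, the sketch below pulls just the PID and process name out of every "Killed process" line in a log file. The function name oom_kills is hypothetical; the pattern assumes the kernel message format "Killed process <pid> (<name>)", which is what the 3.10 kernel in this article prints.

```shell
#!/usr/bin/env bash
# Print "pid name" for every OOM kill recorded in the given log file.
# $1: log file to scan, e.g. /var/log/messages
oom_kills() {
  local logfile=$1
  # -n with the trailing p prints only the lines where the substitution matched.
  sed -nE 's/.*[Kk]illed process ([0-9]+) \(([^)]*)\).*/\1 \2/p' "$logfile"
}
```

For example, oom_kills /var/log/messages prints one line per victim, which makes it easy to see whether the same service is being killed repeatedly.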


```
Sep 30 08:56:11 ecs-hn1-app-007 kernel: AliYunDun invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: AliYunDun cpuset=/ mems_allowed=0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: CPU: 3 PID: 7859 Comm: AliYunDun Tainted: G           OE  ------------ T 3.10.0-1160.25.1.el7.x86_64 #1
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Call Trace:
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8f18311a>] dump_stack+0x19/0x1b
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8f17da3a>] dump_header+0x90/0x229
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8eb06642>] ? ktime_get_ts64+0x52/0xf0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8eb5de2f>] ? delayacct_end+0x8f/0xb0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ebc232d>] oom_kill_process+0x2cd/0x490
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ebc1d1d>] ? oom_unkillable_task+0xcd/0x120
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ebc2a1a>] out_of_memory+0x31a/0x500
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8f17e557>] __alloc_pages_slowpath+0x5db/0x729
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ebc8f96>] __alloc_pages_nodemask+0x436/0x450
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ec18c88>] alloc_pages_current+0x98/0x110
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ebbdde7>] __page_cache_alloc+0x97/0xb0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ebc0d80>] filemap_fault+0x270/0x420
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffffc03a2756>] ext4_filemap_fault+0x36/0x50 [ext4]
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ebee08a>] __do_fault.isra.61+0x8a/0x100
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8eae61d1>] ? put_prev_entity+0x31/0x400
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ebee63c>] do_read_fault.isra.63+0x4c/0x1b0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8ebf5e80>] handle_mm_fault+0xa20/0xfb0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8eac9ead>] ? hrtimer_start_range_ns+0x1fd/0x3c0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8f190653>] __do_page_fault+0x213/0x500
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8f190a26>] trace_do_page_fault+0x56/0x150
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8f18ffa2>] do_async_page_fault+0x22/0xf0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: [<ffffffff8f18c7a8>] async_page_fault+0x28/0x30
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Mem-Info:
Sep 30 08:56:12 ecs-hn1-app-007 kernel: active_anon:3848363 inactive_anon:154 isolated_anon:0#012 active_file:3374 inactive_file:6125 isolated_file:20#012 unevictable:0 dirty:22 writeback:16 unstable:0#012 slab_reclaimable:15257 slab_unreclaimable:7621#012 mapped:326 shmem:239 pagetables:9772 bounce:0#012 free:33797 free_pcp:0 free_cma:0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Node 0 DMA free:15908kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Sep 30 08:56:12 ecs-hn1-app-007 kernel: lowmem_reserve[]: 0 2830 15869 15869
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Node 0 DMA32 free:64028kB min:12044kB low:15052kB high:18064kB active_anon:2770000kB inactive_anon:120kB active_file:4412kB inactive_file:7456kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129216kB managed:2898784kB mlocked:0kB dirty:12kB writeback:0kB mapped:288kB shmem:180kB slab_reclaimable:14368kB slab_unreclaimable:4704kB kernel_stack:1776kB pagetables:6652kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:18456 all_unreclaimable? yes
Sep 30 08:56:12 ecs-hn1-app-007 kernel: lowmem_reserve[]: 0 0 13038 13038
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Node 0 Normal free:55252kB min:55468kB low:69332kB high:83200kB active_anon:12623452kB inactive_anon:496kB active_file:9084kB inactive_file:17044kB unevictable:0kB isolated(anon):0kB isolated(file):80kB present:13631488kB managed:13351488kB mlocked:0kB dirty:76kB writeback:64kB mapped:1016kB shmem:776kB slab_reclaimable:46660kB slab_unreclaimable:25780kB kernel_stack:10192kB pagetables:32436kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:43384 all_unreclaimable? yes
Sep 30 08:56:12 ecs-hn1-app-007 kernel: lowmem_reserve[]: 0 0 0 0
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Node 0 DMA32: 1026*4kB (UEM) 589*8kB (UEM) 885*16kB (UEM) 539*32kB (UEM) 235*64kB (UEM) 50*128kB (UEM) 6*256kB (EM) 2*512kB (EM) 0*1024kB 0*2048kB 0*4096kB = 64224kB
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Node 0 Normal: 5898*4kB (UEM) 1677*8kB (UEM) 1170*16kB (UEM) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 55728kB
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Sep 30 08:56:12 ecs-hn1-app-007 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Sep 30 08:56:13 ecs-hn1-app-007 kernel: 9982 total pagecache pages
Sep 30 08:56:13 ecs-hn1-app-007 kernel: 0 pages in swap cache
Sep 30 08:56:13 ecs-hn1-app-007 kernel: Swap cache stats: add 0, delete 0, find 0/0
Sep 30 08:56:13 ecs-hn1-app-007 kernel: Free swap  = 0kB
Sep 30 08:56:13 ecs-hn1-app-007 kernel: Total swap = 0kB
Sep 30 08:56:13 ecs-hn1-app-007 kernel: 4194174 pages RAM
Sep 30 08:56:13 ecs-hn1-app-007 kernel: 0 pages HighMem/MovableOnly
Sep 30 08:56:13 ecs-hn1-app-007 kernel: 127629 pages reserved
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [  380]     0   380    35369      101      75        0             0 systemd-journal
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [  404]     0   404    11332      135      23        0         -1000 systemd-udevd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [  510]     0   510    13883      127      28        0         -1000 auditd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [  562]   999   562   153089     2162      64        0             0 polkitd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [  566]    81   566    14559      180      34        0          -900 dbus-daemon
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [  578]     0   578     6702      214      18        0             0 systemd-logind
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [  582]   998   582    29483      148      29        0             0 chronyd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [  815]     0   815    25736      514      48        0             0 dhclient
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [  884]     0   884   143570     3348      96        0             0 tuned
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ 1124]     0  1124    31605      180      17        0             0 crond
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ 1127]     0  1127     6477       50      18        0             0 atd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ 1131]     0  1131    27552       42      10        0             0 agetty
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ 1132]     0  1132    27552       42      10        0             0 agetty
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [23747]     0 23747   185468      413     218        0             0 rsyslogd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [26575]     0 26575    28235      275      59        0         -1000 sshd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [31369]   997 31369    19777      217      39        0             0 zabbix_agentd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [31370]   997 31370    19777      309      39        0             0 zabbix_agentd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [31371]   997 31371    20339      372      43        0             0 zabbix_agentd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [31372]   997 31372    20339      372      43        0             0 zabbix_agentd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [31373]   997 31373    20339      372      43        0             0 zabbix_agentd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [31374]   997 31374    20341      260      43        0             0 zabbix_agentd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [16737]     0 16737   187420     8093      67        0          -999 containerd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [20647]     0 20647   207923    15091     116        0          -500 dockerd
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [19807]     0 19807    12241      383      20        0             0 ilogtail
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [19808]     0 19808   102293    12131      88        0             0 ilogtail
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [10576]     0 10576   109338      260      29        0             0 AliSecGuard
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [22495]     0 22495     5970       90      16        0             0 argusagent
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [22497]     0 22497   280326    41489     136        0             0 /usr/local/clou
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [32483]     0 32483  1572349   658453    1464        0             0 java
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ 9304]     0  9304   201546      950      13        0             0 aliyun-service
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ 9471]     0  9471     4469      121      12        0             0 assist_daemon
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ 5993]     0  5993    10614      378      22        0             0 AliYunDunUpdate
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ 7826]     0  7826    34877     1708      68        0             0 AliYunDun
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [ 6340]     0  6340  3083814  2140635    4582        0             0 java
Sep 30 08:56:13 ecs-hn1-app-007 kernel: [19473]     0 19473  2193507   954577    2113        0             0 java
Sep 30 08:56:13 ecs-hn1-app-007 kernel: Out of memory: Kill process 6340 (java) score 511 or sacrifice child
Sep 30 08:56:13 ecs-hn1-app-007 kernel: Killed process 6340 (java), UID 0, total-vm:12335256kB, anon-rss:8562504kB, file-rss:36kB, shmem-rss:0kB
```
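
When reading the process table in this dump, note that the total_vm and rss columns are counted in pages rather than kB. The sketch below converts the page counts from the log into MiB, assuming the usual 4 kB page size on x86_64 (an assumption; getconf PAGESIZE reports the actual value in bytes). The function name pages_to_mib is hypothetical.

```shell
#!/usr/bin/env bash
# Convert a page count from the OOM process table to MiB (integer division).
# $1: number of pages
# $2: page size in kB (assumed 4 here)
pages_to_mib() {
  local pages=$1 page_kb=${2:-4}
  echo $(( pages * page_kb / 1024 ))
}

pages_to_mib 2140635                            # rss of the killed java process (pid 6340): prints 8361
pages_to_mib $(( 658453 + 2140635 + 954577 ))   # rss of the three java processes together: prints 14662
pages_to_mib 4194174                            # "pages RAM" reported by the kernel: prints 16383
```

The first figure is consistent with the kill message: 2140635 pages of 4 kB are 8562540 kB, which equals anon-rss plus file-rss (8562504 kB + 36 kB). Taken together, the three java processes held roughly 14.3 GiB of a machine with about 16 GiB of RAM and no swap, which explains why the kernel ran out of memory.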

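How do I keep the OOM Killer from killing my business process by mistake?

Ways to avoid the OOM killer
1. Write to /proc/<pid>/oom_score_adj directly, setting it to -1000
This used to be controlled through /proc/<pid>/oom_adj, but newer Linux kernels use oom_score_adj in place of the old oom_adj (/proc/<pid>/oom_score is the read-only score computed by the kernel)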

Reference: https://github.com/tinganho/linux-kernel/blob/master/Documentation/feature-removal-schedule.txt#L171
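
A minimal sketch of method 1 follows. The function name set_oom_score_adj is hypothetical, and the third argument exists only so the function can be tested against a directory other than /proc. Lowering the value normally requires root, and the setting is lost when the process restarts, so it has to be applied again after every start.

```shell
#!/usr/bin/env bash
# Set oom_score_adj for one process: a sketch, to be run as root for negative values.
# $1: pid
# $2: adjustment in the range -1000..1000 (default -1000, which exempts the process)
# $3: proc root (default /proc; overridable only to make the sketch testable)
set_oom_score_adj() {
  local pid=$1 adj=${2:--1000} root=${3:-/proc}
  local file="$root/$pid/oom_score_adj"
  if [ "$adj" -lt -1000 ] || [ "$adj" -gt 1000 ]; then
    echo "adjustment must be between -1000 and 1000" >&2
    return 1
  fi
  [ -w "$file" ] || { echo "cannot write $file" >&2; return 1; }
  echo "$adj" > "$file"
}
```

For example, set_oom_score_adj 6340 would have exempted the java process from the log above. Keep in mind that exempting one large process only moves the problem: the kernel then has to kill something else.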

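2. Disable the oom-killer outright

Note: /proc/sys/vm/oom-kill is only present on some older kernels, so check that the file exists on your system before relying on this method.

Disable
echo "0" > /proc/sys/vm/oom-kill

Enable

echo "1" > /proc/sys/vm/oom-kill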

JVM OOM

When the JVM itself runs out of memory, you can add startup parameters so that a dump file is produced at the moment of failure

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/

Analyze the dump with Eclipse Memory Analyzer or JProfiler
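
Writing heap dumps to / can fill the root filesystem, and the dump fails silently if the target is not writable, so a dedicated directory is usually the safer choice. The sketch below builds the two flags for a given directory and refuses directories that are missing or not writable. The function name heap_dump_opts is hypothetical.

```shell
#!/usr/bin/env bash
# Build the heap-dump flags for a given dump directory, refusing directories that
# are missing or not writable (otherwise the problem only shows up at the next OOM).
# $1: directory that should receive the .hprof files
heap_dump_opts() {
  local dir=$1
  if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
    echo "heap dump directory not usable: $dir" >&2
    return 1
  fi
  echo "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$dir"
}
```

A possible use is java $(heap_dump_opts /home/app/logs/web) -jar app.jar, where app.jar is a placeholder and the directory is the one already used for the GC log in the startup parameters. When HeapDumpPath names a directory, the JVM writes a java_pid<pid>.hprof file into it.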

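JVM crashes

When the JVM itself hits a fatal error, it generates an hs_err_pid<pid>.log file containing information about the failure. By default the file is saved in the working directory of the process; the path can be changed with a JVM parameter, in which %p is replaced by the process ID

-XX:ErrorFile=/var/log/hs_err_pid%p.log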

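The file contains:

  • Log header
  • Information about the thread that caused the crash
  • Information about all threads
  • Safepoint and lock information
  • Heap information
  • Native code cache
  • Compilation events
  • GC-related records
  • JVM memory mappings
  • JVM startup arguments
  • Server (host) information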
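
The log header is usually the quickest part to check, since it names the signal or error and the problematic frame. The sketch below prints only that header, assuming the usual HotSpot layout in which the file begins with a block of lines starting with "#". The function name hs_err_header is hypothetical.

```shell
#!/usr/bin/env bash
# Print the header of a HotSpot fatal error log: the leading block of lines
# that start with "#". Stops at the first line that does not start with "#".
# $1: path to an hs_err_pid<pid>.log file
hs_err_header() {
  awk '/^#/ { print; next } { exit }' "$1"
}
```

For example, hs_err_header /var/log/hs_err_pid6340.log (a hypothetical file name) shows the error summary without scrolling through the thread and memory-map sections.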

