We usually deal with OutOfMemoryError problems caused by heap or permgen sizing mistakes.
But not all of the JVM's memory is heap or permgen.
As far as I understand, it can also go to thread stacks, native JVM code, and so on.
Using pmap I can see the process has 9.3G allocated, which means 3.3G of off-heap memory usage.
I wonder what the possibilities are to monitor and tune this extra off-heap memory consumption.
I do not use direct off-heap memory access (MaxDirectMemorySize is at its 64m default).
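For the part of the off-heap usage that the JVM itself tracks, the platform MXBeans can at least be polled over JMX. A minimal sketch (it will not account for native allocations the JVM does not report, such as memory-mapped Lucene index files or allocator overhead):

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class OffHeapProbe {
    public static void main(String[] args) {
        // Non-heap as seen by the JVM: permgen, code cache, ...
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.println("non-heap: " + mem.getNonHeapMemoryUsage());

        // NIO buffer pools: "direct" (capped by MaxDirectMemorySize) and "mapped"
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%s buffers: count=%d used=%d bytes%n",
                    pool.getName(), pool.getCount(), pool.getMemoryUsed());
        }
    }
}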
Context: Load testing
Application: Solr/Lucene server
OS: Ubuntu
Thread count: 700 (see the stack footprint sketch after this block)
Virtualization: vSphere (run by us, no external hosting)
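One quick sanity check on the thread-stack contribution: with the 700 threads above and HotSpot's default 1 MB stack on 64-bit Linux, stacks alone can reserve on the order of 700 MB outside the heap. A small sketch under that assumption (the 1 MB figure is assumed; check the effective -Xss / ThreadStackSize value):

import java.lang.management.ManagementFactory;

public class StackFootprint {
    public static void main(String[] args) {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long assumedStackKb = 1024; // assumption: 64-bit HotSpot default, override if -Xss is set
        System.out.printf("%d threads x %d KB stack = about %d MB off-heap%n",
                threads, assumedStackKb, threads * assumedStackKb / 1024);
    }
}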
JVM
java version "1.7.0_09"
Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) 64-Bit Server VM (build 23.5-b02, mixed mode)
Tuning
-Xms6g
-Xmx6g
-XX:MaxPermSize=128m
-XX:-UseGCOverheadLimit
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:+CMSClassUnloadingEnabled
-XX:+OptimizeStringConcat
-XX:+UseCompressedStrings
-XX:+UseStringCache
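A way to double-check which of these options the running JVM actually accepted, and what values are in effect, is the HotSpot diagnostic MXBean (HotSpot-specific; getVMOption throws IllegalArgumentException for a flag this JVM build does not know, and the flag names below are only examples):

import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class FlagCheck {
    public static void main(String[] args) {
        HotSpotDiagnosticMXBean hs =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // A few flags relevant to off-heap sizing; adjust the list as needed
        for (String flag : new String[] {"MaxPermSize", "ThreadStackSize", "MaxDirectMemorySize"}) {
            System.out.println(hs.getVMOption(flag)); // prints value and origin (default vs command line)
        }
    }
}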
Memory maps:
https://gist.github.com/slorber/5629214
vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 1743 381 4 1150 1 1 60 92 2 0 1 0 99 0
free
total used free shared buffers cached
Mem: 7986 7605 381 0 4 1150
-/+ buffers/cache: 6449 1536
Swap: 4091 1743 2348
Top
top - 11:15:49 up 42 days, 1:34, 2 users, load average: 1.44, 2.11, 2.46
Tasks: 104 total, 1 running, 103 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.5%us, 0.2%sy, 0.0%ni, 98.9%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8178412k total, 7773356k used, 405056k free, 4200k buffers
Swap: 4190204k total, 1796368k used, 2393836k free, 1179380k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17833 jmxtrans 20 0 2458m 145m 2488 S 1 1.8 206:56.06 java
1237 logstash 20 0 2503m 142m 2468 S 1 1.8 354:23.19 java
11348 tomcat 20 0 9184m 5.6g 2808 S 1 71.3 642:25.41 java
1 root 20 0 24324 1188 656 S 0 0.0 0:01.52 init
2 root 20 0 0 0 0 S 0 0.0 0:00.26 kthreadd
...
df -> tmpfs
Filesystem 1K-blocks Used Available Use% Mounted on
tmpfs 1635684 272 1635412 1% /run
The main problem we have:
- The server has 8G of physical memory
- The heap of Solr takes only 6G
- There is 1.5G of swap
- Swappiness=0
- The heap consumption seems appropriately tuned
- Running on the server: only Solr and some monitoring stuff
- The average response time is acceptable
- We sometimes see abnormally long pauses, up to 20 seconds
I guess the pauses could be a full GC on a swapped-out heap, right?
Why is there so much swap?
I don't even know whether it is the JVM that makes the server swap or something hidden that I can't see. Perhaps the OS page cache? But I'm not sure why the OS would create page cache entries if that causes swapping.
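One way to check whether it is really the JVM process that is swapped out is to read its /proc status. A Linux-specific sketch (run inside the JVM via /proc/self/status, or point it at the Solr PID's /proc/<pid>/status):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SwapCheck {
    public static void main(String[] args) throws IOException {
        // VmRSS = resident pages, VmSwap = this process's pages currently swapped out
        for (String line : Files.readAllLines(Paths.get("/proc/self/status"), StandardCharsets.UTF_8)) {
            if (line.startsWith("VmRSS") || line.startsWith("VmSwap")) {
                System.out.println(line);
            }
        }
    }
}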
I am considering testing the mlockall trick used in some popular Java-based storage/NoSQL systems like ElasticSearch, Voldemort or Cassandra: see "Make JVM/Solr not swap, using mlockall".
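For reference, the mlockall trick in those projects is essentially a JNA call into libc. A minimal sketch, assuming JNA is on the classpath, Linux values for the MCL_* constants, and a suitable ulimit -l (or CAP_IPC_LOCK) for the process:

import com.sun.jna.Native;

public final class MemoryLock {
    private static final int MCL_CURRENT = 1; // Linux values from <sys/mman.h>
    private static final int MCL_FUTURE  = 2;

    // JNA direct mapping to libc's mlockall(2)
    private static native int mlockall(int flags);

    static {
        Native.register("c");
    }

    public static void lockAll() {
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
            System.err.println("mlockall failed; check ulimit -l / CAP_IPC_LOCK");
        }
    }
}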
Edit:
Here you can see max heap, used heap (blue) and used swap (red). They seem somewhat correlated.
With Graphite I can see many ParNew GCs occurring regularly, and a few CMS GCs that correspond to the significant heap decreases in the picture.
The pauses don't seem to be correlated with the heap decreases but are regularly distributed between 10:00 and 11:30, so I guess they may be related to the ParNew GCs.
During the load test I can see some disk activity and also some swap I/O activity, which calms down when the test ends.
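The same ParNew/CMS counters that jmxtrans feeds into Graphite can also be read directly from the GC MXBeans, which makes it easy to correlate a pause window with collection counts and accumulated time (a small sketch, not the exact metrics jmxtrans exports):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCounters {
    public static void main(String[] args) {
        // With this GC configuration the beans are typically named "ParNew" and "ConcurrentMarkSweep"
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d totalTime=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}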