Configuring Netty's direct memory size

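I recently ran into a Netty OutOfDirectMemoryError: a direct memory allocation failed because not enough memory was left. According to the error message, the requested allocation was 16 MB (16777216 bytes) and the remaining space was too small. The max direct memory here was about 7 GB (7549747200 bytes), which raised a question: where does that value come from?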

Code analysis

The Netty version in use is 4.1.14.Final. Below is the call stack from the error; the class to focus on is PlatformDependent.

Caused by: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 7532970287, max: 7549747200)
    at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:618)
    at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:572)
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:764)
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:740)
    at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:244)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:226)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:146)
    at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:324)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:181)
    at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:117)

Line 572 of PlatformDependent is inside the allocateDirectNoCleaner method, which allocates a new ByteBuffer with the given capacity:

/**
 * Allocate a new {@link ByteBuffer} with the given {@code capacity}. {@link ByteBuffer}s allocated with
 * this method <strong>MUST</strong> be deallocated via {@link #freeDirectNoCleaner(ByteBuffer)}.
 */
public static ByteBuffer allocateDirectNoCleaner(int capacity) {
    assert USE_DIRECT_BUFFER_NO_CLEANER;

    // Line 572
    // Increase the used-memory counter; throws immediately if memory is insufficient
    incrementMemoryCounter(capacity);
    try {
        // Allocate the memory
        return PlatformDependent0.allocateDirectNoCleaner(capacity);
    } catch (Throwable e) {
        // Allocation failed, so roll the used-memory counter back
        decrementMemoryCounter(capacity);
        throwException(e);
        return null;
    }
}

Line 572 calls incrementMemoryCounter, which increases the used-memory counter:

private static void incrementMemoryCounter(int capacity) {
    if (DIRECT_MEMORY_COUNTER != null) {
        for (;;) {
            // Read the current used-memory count
            long usedMemory = DIRECT_MEMORY_COUNTER.get();

            // Compute the new used-memory count
            long newUsedMemory = usedMemory + capacity;

            // Over the maximum capacity limit: throw
            if (newUsedMemory > DIRECT_MEMORY_LIMIT) {
                throw new OutOfDirectMemoryError("failed to allocate " + capacity
                        + " byte(s) of direct memory (used: " + usedMemory + ", max: " + DIRECT_MEMORY_LIMIT + ')');
            }

            // Update the counter with CAS
            if (DIRECT_MEMORY_COUNTER.compareAndSet(usedMemory, newUsedMemory)) {
                break;
            }
        }
    }
}

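As the code shows, the used-memory counter is updated with CAS. Before the update, the new value is checked against the DIRECT_MEMORY_LIMIT cap; if it would exceed the cap, an exception is thrown right away. In other words, the error is raised before any memory is actually allocated.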
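The check-then-CAS pattern can be sketched in isolation. The class below is a hypothetical stand-in (BoundedMemoryCounter and its method names are made up for illustration, and it throws IllegalStateException instead of Netty's OutOfDirectMemoryError); the figures in main are the ones from the error message above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the counter logic; this is not Netty code. */
final class BoundedMemoryCounter {
    private final AtomicLong used;
    private final long limit;

    BoundedMemoryCounter(long initialUsed, long limit) {
        this.used = new AtomicLong(initialUsed);
        this.limit = limit;
    }

    /** Reserves capacity bytes, or throws if the reservation would exceed the limit. */
    void reserve(int capacity) {
        for (;;) {
            long current = used.get();
            long updated = current + capacity;
            if (updated > limit) {
                // Only the bookkeeping check failed; nothing has been allocated yet.
                throw new IllegalStateException("failed to reserve " + capacity
                        + " byte(s) (used: " + current + ", max: " + limit + ')');
            }
            // Retry if another thread changed the counter in the meantime.
            if (used.compareAndSet(current, updated)) {
                return;
            }
        }
    }

    /** Returns previously reserved bytes to the budget. */
    void release(int capacity) {
        used.addAndGet(-capacity);
    }

    long used() {
        return used.get();
    }

    public static void main(String[] args) {
        // used and max as reported in the stack trace; the request is one 16 MB chunk.
        BoundedMemoryCounter counter = new BoundedMemoryCounter(7532970287L, 7549747200L);
        try {
            counter.reserve(16777216);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With those figures, used + capacity comes to 7549747503, which is 303 bytes over the limit, so the reservation is rejected even though the counter itself is still below the cap.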

That raises the question: how is DIRECT_MEMORY_LIMIT set?

Searching the code shows that it is set in the static initializer of PlatformDependent:

// Here is how the system property is used:
//
// * <  0  - Don't use cleaner, and inherit max direct memory from java. In this case the
//           "practical max direct memory" would be 2 * max memory as defined by the JDK.
// * == 0  - Use cleaner, Netty will not enforce max memory, and instead will defer to JDK.
// * >  0  - Don't use cleaner. This will limit Netty's total direct memory
//           (note: that JDK's direct memory limit is independent of this).
// Not set by default, so maxDirectMemory is -1
long maxDirectMemory = SystemPropertyUtil.getLong("io.netty.maxDirectMemory", -1);

if (maxDirectMemory == 0 || !hasUnsafe() || !PlatformDependent0.hasDirectBufferNoCleanerConstructor()) {
    USE_DIRECT_BUFFER_NO_CLEANER = false;
    DIRECT_MEMORY_COUNTER = null;
} else {
    USE_DIRECT_BUFFER_NO_CLEANER = true;
    if (maxDirectMemory < 0) {
        // This is where the value is determined
        maxDirectMemory = maxDirectMemory0();
        if (maxDirectMemory <= 0) {
            DIRECT_MEMORY_COUNTER = null;
        } else {
            DIRECT_MEMORY_COUNTER = new AtomicLong();
        }
    } else {
        DIRECT_MEMORY_COUNTER = new AtomicLong();
    }
}
DIRECT_MEMORY_LIMIT = maxDirectMemory;
logger.debug("-Dio.netty.maxDirectMemory: {} bytes", maxDirectMemory);

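The code first reads the io.netty.maxDirectMemory property. Its value is interpreted as follows:

  • < 0: do not use the cleaner; inherit the max direct memory setting from Java
  • == 0: use the cleaner; Netty does not enforce a maximum and defers to the JDK setting
  • > 0: do not use the cleaner; the value is Netty's own max direct memory limit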
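To make the three cases concrete, here is a simplified model of that decision. It is a sketch only: the class and field names are hypothetical, and it leaves out the hasUnsafe() and hasDirectBufferNoCleanerConstructor() checks that the real initializer also performs.

```java
/** Simplified, hypothetical model of the static initializer; ignores the Unsafe checks. */
final class DirectMemoryLimitDecision {
    final boolean useNoCleaner;   // models USE_DIRECT_BUFFER_NO_CLEANER
    final boolean counterEnabled; // models DIRECT_MEMORY_COUNTER != null
    final long limit;             // models DIRECT_MEMORY_LIMIT

    private DirectMemoryLimitDecision(boolean useNoCleaner, boolean counterEnabled, long limit) {
        this.useNoCleaner = useNoCleaner;
        this.counterEnabled = counterEnabled;
        this.limit = limit;
    }

    /**
     * @param nettyProperty value of io.netty.maxDirectMemory (-1 when unset)
     * @param jvmMaxDirectMemory what maxDirectMemory0() would return
     */
    static DirectMemoryLimitDecision decide(long nettyProperty, long jvmMaxDirectMemory) {
        if (nettyProperty == 0) {
            // Cleaner path: Netty keeps no counter and the JDK enforces its own limit.
            return new DirectMemoryLimitDecision(false, false, 0);
        }
        if (nettyProperty < 0) {
            // Inherit the JVM value; a non-positive result disables the counter.
            return new DirectMemoryLimitDecision(true, jvmMaxDirectMemory > 0, jvmMaxDirectMemory);
        }
        // An explicit positive value becomes Netty's own limit.
        return new DirectMemoryLimitDecision(true, true, nettyProperty);
    }

    public static void main(String[] args) {
        long jvmValue = 20L * 1024 * 1024; // e.g. -XX:MaxDirectMemorySize=20m
        System.out.println(decide(-1, jvmValue).limit);
        System.out.println(decide(1024, jvmValue).limit);
        System.out.println(decide(0, jvmValue).counterEnabled);
    }
}
```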

The default is -1, so execution reaches the line maxDirectMemory = maxDirectMemory0(). That method is implemented as follows:

private static final Pattern MAX_DIRECT_MEMORY_SIZE_ARG_PATTERN = Pattern.compile(
        "\\s*-XX:MaxDirectMemorySize\\s*=\\s*([0-9]+)\\s*([kKmMgG]?)\\s*$");

private static long maxDirectMemory0() {
    long maxDirectMemory = 0;
    ClassLoader systemClassLoader = null;
    try {
        // 1. Call sun.misc.VM.maxDirectMemory() via reflection
        systemClassLoader = getSystemClassLoader();
        Class<?> vmClass = Class.forName("sun.misc.VM", true, systemClassLoader);
        Method m = vmClass.getDeclaredMethod("maxDirectMemory");
        maxDirectMemory = ((Number) m.invoke(null)).longValue();
    } catch (Throwable ignored) {
        // Ignore
    }

    if (maxDirectMemory > 0) {
        return maxDirectMemory;
    }

    try {
        // 2. Read the -XX:MaxDirectMemorySize setting from the runtime MXBean;
        //    reflection is used because Android does not have these classes
        Class<?> mgmtFactoryClass = Class.forName(
                "java.lang.management.ManagementFactory", true, systemClassLoader);
        Class<?> runtimeClass = Class.forName(
                "java.lang.management.RuntimeMXBean", true, systemClassLoader);

        Object runtime = mgmtFactoryClass.getDeclaredMethod("getRuntimeMXBean").invoke(null);

        @SuppressWarnings("unchecked")
        List<String> vmArgs = (List<String>) runtimeClass.getDeclaredMethod("getInputArguments").invoke(runtime);
        for (int i = vmArgs.size() - 1; i >= 0; i --) {
            Matcher m = MAX_DIRECT_MEMORY_SIZE_ARG_PATTERN.matcher(vmArgs.get(i));
            if (!m.matches()) {
                continue;
            }

            maxDirectMemory = Long.parseLong(m.group(1));
            switch (m.group(2).charAt(0)) {
                case 'k': case 'K':
                    maxDirectMemory *= 1024;
                    break;
                case 'm': case 'M':
                    maxDirectMemory *= 1024 * 1024;
                    break;
                case 'g': case 'G':
                    maxDirectMemory *= 1024 * 1024 * 1024;
                    break;
            }
            break;
        }
    } catch (Throwable ignored) {
        // Ignore
    }

    if (maxDirectMemory <= 0) {
        // 3. Still nothing: fall back to the Runtime value
        maxDirectMemory = Runtime.getRuntime().maxMemory();
        logger.debug("maxDirectMemory: {} bytes (maybe)", maxDirectMemory);
    } else {
        logger.debug("maxDirectMemory: {} bytes", maxDirectMemory);
    }

    return maxDirectMemory;
}

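Its logic is:

  1. Call sun.misc.VM.maxDirectMemory() via reflection; if that yields a positive value, return it
  2. Otherwise, look for the -XX:MaxDirectMemorySize setting among the JVM arguments; if found, return it
  3. Otherwise, fall back to Runtime.getRuntime().maxMemory()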
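Step 2 can be tried on its own with nothing but the JDK. The sketch below reuses the regular expression from the quoted source; the class name and the parse helper are hypothetical, and unlike the quoted code it handles a missing unit suffix explicitly instead of relying on the surrounding catch block.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that mirrors step 2; not part of Netty. */
final class MaxDirectMemoryArgParser {
    // Same pattern as in the Netty source quoted above.
    private static final Pattern PATTERN = Pattern.compile(
            "\\s*-XX:MaxDirectMemorySize\\s*=\\s*([0-9]+)\\s*([kKmMgG]?)\\s*$");

    /** Returns the size in bytes, or -1 if the argument is not a MaxDirectMemorySize flag. */
    static long parse(String vmArg) {
        Matcher m = PATTERN.matcher(vmArg);
        if (!m.matches()) {
            return -1;
        }
        long size = Long.parseLong(m.group(1));
        String unit = m.group(2);
        if (unit.isEmpty()) {
            return size; // plain byte count, no suffix
        }
        switch (unit.charAt(0)) {
            case 'k': case 'K':
                return size << 10;
            case 'm': case 'M':
                return size << 20;
            default: // 'g' or 'G', the only remaining match
                return size << 30;
        }
    }

    public static void main(String[] args) {
        // Scan from the last argument backwards, as the quoted code does,
        // so that the last occurrence of the flag is the one reported.
        List<String> vmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (int i = vmArgs.size() - 1; i >= 0; i--) {
            long bytes = parse(vmArgs.get(i));
            if (bytes >= 0) {
                System.out.println("-XX:MaxDirectMemorySize = " + bytes + " bytes");
                return;
            }
        }
        System.out.println("-XX:MaxDirectMemorySize not set");
    }
}
```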

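Conclusion

Based on the analysis above, the maximum direct memory capacity can be set either through Netty's io.netty.maxDirectMemory property or through the JVM's -XX:MaxDirectMemorySize flag; the former takes precedence.

By default, when neither is configured, the maximum direct memory capacity is obtained through sun.misc.VM.maxDirectMemory().

Testing

The following program verifies the conclusion above: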

import java.lang.reflect.Field;

public class DirectMemoryLimit {
    public static void main(String[] args) throws Exception {
        long directMemoryLimit = getDirectMemoryLimit();

        System.out.println("directMemoryLimit: " + directMemoryLimit + " byte");
    }

    private static long getDirectMemoryLimit()
            throws ClassNotFoundException, NoSuchFieldException, IllegalAccessException {
        // Class.forName and getDeclaredField throw instead of returning null,
        // so no null checks are needed. Loading the class runs its static initializer.
        Class<?> clazz = Class.forName("io.netty.util.internal.PlatformDependent");
        Field directMemoryLimit = clazz.getDeclaredField("DIRECT_MEMORY_LIMIT");
        directMemoryLimit.setAccessible(true);

        // Read the value of the static DIRECT_MEMORY_LIMIT field
        return directMemoryLimit.getLong(null);
    }
}

Setting it through the JVM

Set the JVM argument -XX:MaxDirectMemorySize=20m, run the program, and check the result:

directMemoryLimit: 20971520 byte

Here 20 MB = 20 * 1024 * 1024 bytes = 20971520 bytes.

Setting it through Netty

Setting the io.netty.maxDirectMemory property overrides the size configured with -XX:MaxDirectMemorySize.

Add the JVM argument -Dio.netty.maxDirectMemory=1024, run the program, and check the result:

directMemoryLimit: 1024 byte

Removing both settings

Remove the two settings above, run the program again, and check the result:

directMemoryLimit: 1908932608 byte

Add the following line to the program:

System.out.println("maxMemory: " + Runtime.getRuntime().maxMemory());

Run it and check the output:

maxMemory: 1908932608

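So by default, the DIRECT_MEMORY_LIMIT value obtained through sun.misc.VM.maxDirectMemory() is the same as Runtime.getRuntime().maxMemory(). The reason is visible in the decompiled sun.misc.VM.saveAndRemoveProperties method: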

public static void saveAndRemoveProperties(Properties var0) {
    if (booted) {
        throw new IllegalStateException("System initialization has completed");
    } else {
        savedProps.putAll(var0);
        String var1 = (String)var0.remove("sun.nio.MaxDirectMemorySize");
        if (var1 != null) {
            if (var1.equals("-1")) {
                // sun.nio.MaxDirectMemorySize is -1: use the maximum heap size
                directMemory = Runtime.getRuntime().maxMemory();
            } else {
                long var2 = Long.parseLong(var1);
                if (var2 > -1L) {
                    directMemory = var2;
                }
            }
        }

        var1 = (String)var0.remove("sun.nio.PageAlignDirectMemory");
        if ("true".equals(var1)) {
            pageAlignDirectMemory = true;
        }

        var1 = var0.getProperty("sun.lang.ClassLoader.allowArraySyntax");
        allowArraySyntax = var1 == null ? defaultAllowArraySyntax : Boolean.parseBoolean(var1);
        var0.remove("java.lang.Integer.IntegerCache.high");
        var0.remove("sun.zip.disableMemoryMapping");
        var0.remove("sun.java.launcher.diag");
        var0.remove("sun.cds.enableSharedLookupCache");
    }
}
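The part of this method that matters here is the handling of sun.nio.MaxDirectMemorySize. As a rough sketch, it can be modelled as a pure function; the class name, the method name, and the idea of passing the previous value and the max heap size in as parameters are all assumptions made for illustration, not JDK API.

```java
/** Hypothetical model of how the sun.nio.MaxDirectMemorySize property is interpreted. */
final class DirectMemoryProperty {
    /**
     * @param propertyValue raw value of sun.nio.MaxDirectMemorySize, or null if absent
     * @param currentValue  the direct memory value held before the property is read
     * @param maxHeap       what Runtime.getRuntime().maxMemory() would return
     */
    static long resolve(String propertyValue, long currentValue, long maxHeap) {
        if (propertyValue == null) {
            return currentValue; // property absent: keep whatever was there
        }
        if (propertyValue.equals("-1")) {
            return maxHeap; // -1 means "same as the maximum heap size"
        }
        long parsed = Long.parseLong(propertyValue);
        return parsed > -1L ? parsed : currentValue;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        // With the value -1 the result equals maxMemory(), matching the test output above.
        System.out.println(resolve("-1", 0L, maxHeap) == maxHeap);
        System.out.println(resolve("20971520", 0L, maxHeap));
    }
}
```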

Runtime.getRuntime().maxMemory()

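This is a native method. According to its Javadoc, it returns the maximum amount of memory that the Java virtual machine will attempt to use. Because the survivor spaces in the young generation use a copying algorithm, only one survivor space is usable at any time, so the total usable Java heap is: old generation size + young generation size - one survivor space size.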

/**
* Returns the maximum amount of memory that the Java virtual machine will
* attempt to use. If there is no inherent limit then the value {@link
* java.lang.Long#MAX_VALUE} will be returned.
*
* @return the maximum amount of memory that the virtual machine will
* attempt to use, measured in bytes
* @since 1.4
*/
public native long maxMemory();
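The survivor-space arithmetic described above can be written out as a small worked example. The generation sizes below are hypothetical round numbers chosen for illustration; they are not measurements from the machine used in the tests.

```java
/** Worked example of "old + young - one survivor"; all sizes are hypothetical. */
final class UsableHeap {
    static final long MIB = 1024L * 1024L;

    /** Usable heap when only one of the two survivor spaces can hold objects at a time. */
    static long usableHeapBytes(long oldGen, long youngGen, long oneSurvivor) {
        return oldGen + youngGen - oneSurvivor;
    }

    public static void main(String[] args) {
        long oldGen = 1024 * MIB;     // assumed old generation size
        long youngGen = 512 * MIB;    // assumed young generation: eden + two survivor spaces
        long oneSurvivor = 64 * MIB;  // assumed size of a single survivor space

        long usable = usableHeapBytes(oldGen, youngGen, oneSurvivor);
        System.out.println("usable heap: " + usable / MIB + " MiB");
        System.out.println("maxMemory(): " + Runtime.getRuntime().maxMemory() / MIB + " MiB");
    }
}
```

With these assumed sizes the usable heap is 1472 MiB out of a nominal 1536 MiB, which is why maxMemory() can come out lower than the configured heap size.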

The native implementation of this method is in the JDK source (jdk/src/share/native/java/lang/Runtime.c):

JNIEXPORT jlong JNICALL
Java_java_lang_Runtime_maxMemory(JNIEnv *env, jobject this)
{
    return JVM_MaxMemory();
}

That is only an entry point; the actual implementation is in the HotSpot source (hotspot/src/share/vm/prims/jvm.cpp):

JVM_ENTRY_NO_ENV(jlong, JVM_MaxMemory(void))
  JVMWrapper("JVM_MaxMemory");
  // Compute MaxMemory
  size_t n = Universe::heap()->max_capacity();
  return convert_size_t_to_jlong(n);
JVM_END

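The implementation of heap() is in Universe.hpp. It returns _collectedHeap, a static field of type CollectedHeap that is initialized in initialize_heap(). Depending on the GC policy, _collectedHeap is initialized to a different implementation.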

// The particular choice of collected heap.
static CollectedHeap* heap() { return _collectedHeap; }

// Initialization code
jint Universe::initialize_heap() {
  if (UseParallelGC) {
#if INCLUDE_ALL_GCS
    Universe::_collectedHeap = new ParallelScavengeHeap();
#else  // INCLUDE_ALL_GCS
    fatal("UseParallelGC not supported in this VM.");
#endif // INCLUDE_ALL_GCS

  } else if (UseG1GC) {
#if INCLUDE_ALL_GCS
    G1CollectorPolicyExt* g1p = new G1CollectorPolicyExt();
    g1p->initialize_all();
    G1CollectedHeap* g1h = new G1CollectedHeap(g1p);
    Universe::_collectedHeap = g1h;
#else  // INCLUDE_ALL_GCS
    fatal("UseG1GC not supported in java kernel vm.");
#endif // INCLUDE_ALL_GCS

  } else {
    GenCollectorPolicy *gc_policy;

    if (UseSerialGC) {
      gc_policy = new MarkSweepPolicy();
    } else if (UseConcMarkSweepGC) {
#if INCLUDE_ALL_GCS
      if (UseAdaptiveSizePolicy) {
        gc_policy = new ASConcurrentMarkSweepPolicy();
      } else {
        gc_policy = new ConcurrentMarkSweepPolicy();
      }
#else  // INCLUDE_ALL_GCS
      fatal("UseConcMarkSweepGC not supported in this VM.");
#endif // INCLUDE_ALL_GCS
    } else { // default old generation
      gc_policy = new MarkSweepPolicy();
    }
    gc_policy->initialize_all();

    // Implementation used with the CMS policy (the serial policy also ends up here)
    Universe::_collectedHeap = new GenCollectedHeap(gc_policy);
  }

Open GenCollectedHeap and find the implementation of max_capacity(). It sums the maximum capacities of all generations.

size_t GenCollectedHeap::max_capacity() const {
  size_t res = 0;
  for (int i = 0; i < _n_gens; i++) {
    res += _gens[i]->max_capacity();
  }
  return res;
}

I haven't worked through the rest of the source yet; I'll add it when I have time.


References

https://stackoverflow.com/questions/52980629/runtime-getruntime-maxmemory-calculate-method