# Native Memory Allocator Verification
This document describes how to verify the native memory allocator on Android.
This procedure should be followed when upgrading or moving to a new allocator.
A minor upgrade might not need to run all of the benchmarks; however,
at least the
[SQL Allocation Trace Benchmark](#sql-allocation-trace-benchmark),
[Memory Replay Benchmarks](#memory-replay-benchmarks) and
[Performance Trace Benchmarks](#performance-trace-benchmarks) should be run.

It is important to note that there are two modes for a native allocator
to run in on Android. The first is the normal allocator, the second is
called the svelte config, which is designed to run on memory-constrained
systems and be a bit slower, but take less RSS. To enable the svelte config,
add this line to the `BoardConfig.mk` for the given target:

    MALLOC_SVELTE := true

The `BoardConfig.mk` file is usually found in the directory
`device/<DEVICE_NAME>/` or in a subdirectory.

When evaluating a native allocator, make sure that you benchmark both
versions.

## Android Extensions
Android supports a few non-standard functions and mallopt controls that
a native allocator needs to implement.

### Iterator Functions
These are functions that are used to implement a memory leak detector
called `libmemunreachable`.

#### malloc\_disable
This function, when called, should pause all threads that are making a
call to an allocation function (malloc/free/etc). When a call
is made to `malloc_enable`, the paused threads should start running again.

#### malloc\_enable
This function, when called, does nothing unless there was a previous call
to `malloc_disable`. This call will unpause any thread which is making
a call to an allocation function (malloc/free/etc) when `malloc_disable`
was called previously.

#### malloc\_iterate
This function enumerates all of the allocations currently live in the
system.
It is meant to be called after a call to `malloc_disable` to
prevent further allocations while this call is being executed. To
see what is expected for this function, the best description is the
tests for this function in `bionic/tests/malloc_iterate_test.cpp`.

### Mallopt Extensions
These are mallopt options that Android requires for a native allocator
to work efficiently.

#### M\_DECAY\_TIME
When set to zero, `mallopt(M_DECAY_TIME, 0)`, it is expected that an
allocator will attempt to purge and release any unused memory back to the
kernel on free calls. This is important in Android to avoid consuming extra
RSS.

When set to non-zero, `mallopt(M_DECAY_TIME, 1)`, an allocator can delay the
purge and release action. The amount of delay is up to the allocator
implementation, but it should be a reasonable amount of time. The jemalloc
allocator was implemented to have a one second delay.

The drawback to this option is that most allocators do not have a separate
thread to handle the purge, so the decay is only handled when an
allocation operation occurs. For server processes, this can mean that
RSS is slightly higher when the server is waiting for the next connection
and no other allocation calls are made. The `M_PURGE` option is used to
force a purge in this case.

For all applications on Android, the call `mallopt(M_DECAY_TIME, 1)` is
made by default. The idea is that it allows application frees to run a
bit faster, while only increasing RSS a bit.

#### M\_PURGE
When called, `mallopt(M_PURGE, 0)`, an allocator should purge and release
any unused memory immediately. The argument for this call is ignored. If
possible, this call should clear thread cached memory if it exists. The
idea is that this can be called to purge memory that has not been
purged when `M_DECAY_TIME` is set to one. This is useful if you have a
server application that does a lot of native allocations and the
application wants to purge that memory before waiting for the next connection.
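
As an illustration of how `M_DECAY_TIME` and `M_PURGE` fit together, here is
a minimal C sketch of a server-style process that handles a request and then
asks the allocator to purge before going idle. The helper names
(`enable_decay`, `purge_before_idle`, `handle_request`) are hypothetical and
are not part of bionic. The `#if defined(...)` guards are an assumption added
so that the sketch also compiles against a libc that does not provide these
Android-specific options.

```c
#include <malloc.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: ask the allocator to delay purging freed memory.
 * Returns the mallopt result (1 on success, 0 on failure), or -1 when the
 * option is not available in this libc. */
int enable_decay(void) {
#if defined(M_DECAY_TIME)
  return mallopt(M_DECAY_TIME, 1);
#else
  return -1;
#endif
}

/* Hypothetical helper: force an immediate purge of unused memory.
 * The second argument is ignored by the allocator. */
int purge_before_idle(void) {
#if defined(M_PURGE)
  return mallopt(M_PURGE, 0);
#else
  return -1;
#endif
}

/* Hypothetical request handler: allocate a scratch buffer, use it, free it.
 * Returns 1 if the request was handled, 0 if the allocation failed. */
int handle_request(size_t scratch_bytes) {
  char* scratch = malloc(scratch_bytes);
  if (scratch == NULL) {
    return 0;
  }
  memset(scratch, 0, scratch_bytes); /* Touch the memory so it is resident. */
  free(scratch);
  return 1;
}
```

A server following this pattern would call `enable_decay()` once at startup,
call `handle_request()` for each connection, and call `purge_before_idle()`
just before blocking on the next connection, since no further allocation
calls will arrive to trigger the decay.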

## Correctness Tests
These are the tests that should be run to verify an allocator is
working properly according to Android.

### Bionic Unit Tests
The bionic unit tests contain a small number of allocator tests. These
tests are primarily verifying Android extensions and non-standard behavior
of allocation routines such as what happens when a non-power of two alignment
is passed to memalign.

To run all of the compliance tests:

    adb shell /data/nativetest64/bionic-unit-tests/bionic-unit-tests --gtest_filter="malloc*"
    adb shell /data/nativetest/bionic-unit-tests/bionic-unit-tests --gtest_filter="malloc*"

The allocation tests are not meant to be complete, so it is expected
that a native allocator will have its own set of tests that can be run.

### Libmemunreachable Tests
The libmemunreachable tests verify that the iterator functions are working
properly.

To run all of the tests:

    adb shell /data/nativetest64/memunreachable_binder_test/memunreachable_binder_test
    adb shell /data/nativetest/memunreachable_binder_test/memunreachable_binder_test
    adb shell /data/nativetest64/memunreachable_test/memunreachable_test
    adb shell /data/nativetest/memunreachable_test/memunreachable_test
    adb shell /data/nativetest64/memunreachable_unit_test/memunreachable_unit_test
    adb shell /data/nativetest/memunreachable_unit_test/memunreachable_unit_test

### CTS Entropy Test
In addition to the bionic tests, there is also a CTS test that is designed
to verify that the addresses returned by malloc are sufficiently randomized
to help defeat potential security bugs.

Run this test thusly:

    atest AslrMallocTest

If there are multiple devices connected to the system, use `-s <SERIAL>`
to specify a device.

## Performance
There are multiple ways to evaluate the performance of a native
allocator on Android. One is allocation speed in various scenarios,
another is total RSS taken by the allocator.

The last is virtual address space consumed in 32 bit applications. There is
a limited amount of address space available in 32 bit apps, and there have
been allocator bugs that cause memory failures when too much virtual
address space is consumed. For 64 bit executables, this can be ignored.

### Bionic Benchmarks
These are the microbenchmarks that are part of the bionic benchmarks suite.
These benchmarks can be built using this command:

    mmma -j bionic/benchmarks

These benchmarks are only used to verify the speed of the allocator and
ignore anything related to RSS and virtual address space consumed.

For all of these benchmark runs, it can be useful to add these two options:

    --benchmark_repetitions=XX
    --benchmark_report_aggregates_only=true

This will run the benchmark XX times and then give a mean, median, and stddev,
which helps to get a number that can be compared to the new allocator.

In addition, there is another option:

    --bionic_cpu=XX

This option locks the benchmark to run only on core XX. It also avoids
any issue related to the code migrating from one core to another
with different characteristics. For example, on a big-little cpu, if the
benchmark moves from big to little or vice-versa, this can cause scores
to fluctuate in indeterminate ways.

For most runs, the best set of options to add is:

    --benchmark_repetitions=10 --benchmark_report_aggregates_only=true --bionic_cpu=3

On most phones with a big-little cpu, the third core is the little core.
Choosing to run on the little core can tend to highlight any performance
differences.

#### Allocate/Free Benchmarks
These are the benchmarks to verify the allocation speed of a loop doing a
single allocation, touching every page in the allocation to make it resident
and then freeing the allocation.

To run the benchmarks with `mallopt(M_DECAY_TIME, 0)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_free_default
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_free_default

To run the benchmarks with `mallopt(M_DECAY_TIME, 1)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_free_decay1
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_free_decay1

The last value in the output is the size of the allocation in bytes.
It is
useful to look at these kinds of benchmarks to make sure that there are
no outliers, but these numbers should not be used to make a final decision.
If these numbers are slightly worse than the current allocator, the
single thread numbers from trace data are a better representation of
real world situations.

#### Multiple Allocations Retained Benchmarks
These are the benchmarks that examine how the allocator handles multiple
allocations of the same size at the same time.

The first set of these benchmarks does a set number of 8192 byte allocations
in one loop, and then frees all of the allocations at the end of the loop.
Only the time it takes to do the allocations is recorded, the frees are not
counted. The value of 8192 was chosen since the jemalloc native allocator
had issues with this size. It is possible other sizes might show different
results, but, as mentioned before, these microbenchmark numbers should
not be used as absolutes for determining if an allocator is worth using.
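
The measurement pattern described above can be sketched in plain C. This is
not the bionic benchmark code, which is built on the Google Benchmark library;
it is a simplified illustration in which `time_multiple_allocs` is a
hypothetical helper and the standard `clock()` function stands in for the
benchmark framework's timer.

```c
#include <stdlib.h>
#include <time.h>

#define ALLOC_SIZE 8192 /* Allocation size used by the first benchmark set. */

/* Hypothetical helper: perform num_allocs allocations of ALLOC_SIZE bytes and
 * return the CPU time in seconds spent in the allocation loop only.
 * Returns -1.0 if any allocation fails. */
double time_multiple_allocs(size_t num_allocs) {
  void** ptrs = calloc(num_allocs, sizeof(void*));
  if (ptrs == NULL) {
    return -1.0;
  }

  /* Timed region: only the allocations are measured. */
  size_t done = 0;
  clock_t start = clock();
  for (; done < num_allocs; done++) {
    ptrs[done] = malloc(ALLOC_SIZE);
    if (ptrs[done] == NULL) {
      break;
    }
  }
  clock_t end = clock();

  /* Untimed region: the frees happen after the clock has stopped. */
  for (size_t i = 0; i < done; i++) {
    free(ptrs[i]);
  }
  free(ptrs);

  if (done != num_allocs) {
    return -1.0;
  }
  return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Keeping all of the pointers in an array until the timed loop ends is what
makes the allocations live at the same time, which is the condition this
benchmark is meant to exercise.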

This benchmark is designed to verify that there is no performance issue
related to having multiple allocations alive at the same time.

To run the benchmarks with `mallopt(M_DECAY_TIME, 0)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_multiple_8192_allocs_default
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_multiple_8192_allocs_default

To run the benchmarks with `mallopt(M_DECAY_TIME, 1)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_multiple_8192_allocs_decay1
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_multiple_8192_allocs_decay1

For these benchmarks, the last parameter is the total number of allocations to
do in each loop.

The other variation of this benchmark is to always do forty allocations in
each loop, but vary the size of the forty allocations.
As with the other
benchmark, only the time it takes to do the allocations is tracked, the
frees are not counted. Forty allocations is an arbitrary number that could
be modified in the future. It was chosen because a version of the native
allocator, jemalloc, showed a problem at forty allocations.

To run the benchmarks with `mallopt(M_DECAY_TIME, 0)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_forty_default
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_forty_default

To run the benchmarks with `mallopt(M_DECAY_TIME, 1)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_forty_decay1
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_forty_decay1

For these benchmarks, the last parameter in the output is the size of the
allocation in bytes.

As with the other microbenchmarks, an allocator with numbers close to
the current values is usually sufficient to consider making
a switch. The trace benchmarks are more important than these benchmarks
since they simulate real world allocation profiles.

#### SQL Allocation Trace Benchmark
This benchmark is a trace of the allocations performed when running
the SQLite BenchMark app.

This benchmark is designed to verify that the allocator will be performant
in a real world allocation scenario. SQL operations were chosen as a
benchmark because these operations tend to do lots of malloc/realloc/free
calls, and they tend to be on the critical path of applications.

To run the benchmarks with `mallopt(M_DECAY_TIME, 0)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_sql_trace_default
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_sql_trace_default

To run the benchmarks with `mallopt(M_DECAY_TIME, 1)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_sql_trace_decay1
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_sql_trace_decay1

These numbers should be on par with those of the current allocator.

#### mallinfo Benchmark
This benchmark only verifies that mallinfo is still close to the performance
of the current allocator.

To run the benchmark, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_mallinfo
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_mallinfo

Calls to mallinfo are used in ART so a new allocator is required to be
nearly as performant as the current allocator.

#### mallopt M\_PURGE Benchmark
This benchmark tracks the cost of calling `mallopt(M_PURGE, 0)`. As with the
mallinfo benchmark, it's not necessary for this to be better than the previous
allocator, only that the performance be in the same order of magnitude.

To run the benchmark, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_mallopt_purge
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_mallopt_purge

These calls are used to free unused memory pages back to the kernel.

### Memory Trace Benchmarks
These benchmarks measure all three axes of a native allocator: RSS, virtual
address space consumed, and speed of allocation. They are designed to
run on a trace of the allocations from a real world application or system
process.

To build this benchmark:

    mmma -j system/extras/memory_replay

This will build two executables:

    /system/bin/memory_replay32
    /system/bin/memory_replay64

And these two benchmark executables:

    /data/benchmarktest64/trace_benchmark/trace_benchmark
    /data/benchmarktest/trace_benchmark/trace_benchmark

#### Memory Replay Benchmarks
These benchmarks display RSS, virtual memory consumed (VA space), and do a
bit of performance testing on actual traces taken from running applications.

The trace data includes what thread does each operation, so the replay
mechanism will simulate this by creating threads and replaying the operations
on a thread as if it was rerunning the real trace. The only issue is that
this is a worst case scenario for allocations happening at the same time
in all threads since it collapses all of the allocation operations to occur
one after another. This will cause a lot of threads allocating at the same
time. The trace data does not include timestamps,
so it is not possible to create a completely accurate replay.

To generate these traces, see the [Malloc Debug documentation](https://android.googlesource.com/platform/bionic/+/main/libc/malloc_debug/README.md),
the option [record\_allocs](https://android.googlesource.com/platform/bionic/+/main/libc/malloc_debug/README.md#record_allocs_total_entries).

To run these benchmarks, first copy the trace files to the target using
this command:

    adb push system/extras/memory_replay/traces /data/local/tmp

Since all of the traces come from applications, the `memory_replay` program
will always call `mallopt(M_DECAY_TIME, 1)` before running the trace.

Run the benchmark thusly:

    adb shell memory_replay64 /data/local/tmp/traces/XXX.zip
    adb shell memory_replay32 /data/local/tmp/traces/XXX.zip

Where XXX.zip is the name of a zipped trace file. The `memory_replay`
program also can process text files, but all trace files are currently
checked in as zip files.

Every 100000 allocation operations, a dump of the RSS and VA space will be
performed. At the end, a final RSS and VA space number will be printed.
For the most part, the intermediate data can be ignored, but it is always
a good idea to look over the data to verify that no strange spikes are
occurring.
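
To make the two reported quantities concrete, here is a minimal C sketch that
reads the virtual address space size and RSS of the calling process. It is an
assumption-laden illustration, not the code used by `memory_replay`: the
helper `read_process_memory` is hypothetical, and it reads whole-process
totals from `/proc/self/statm` on Linux rather than accounting only for the
maps owned by the native allocator.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: read the total virtual address space and resident set
 * size of the current process, both in bytes.
 * Returns 0 on success and -1 on failure. */
int read_process_memory(unsigned long* va_bytes, unsigned long* rss_bytes) {
  FILE* fp = fopen("/proc/self/statm", "r");
  if (fp == NULL) {
    return -1;
  }

  /* The first two fields are total program size and resident size, in pages. */
  unsigned long va_pages = 0;
  unsigned long rss_pages = 0;
  int fields = fscanf(fp, "%lu %lu", &va_pages, &rss_pages);
  fclose(fp);
  if (fields != 2) {
    return -1;
  }

  long page_size = sysconf(_SC_PAGESIZE);
  if (page_size <= 0) {
    return -1;
  }

  *va_bytes = va_pages * (unsigned long)page_size;
  *rss_bytes = rss_pages * (unsigned long)page_size;
  return 0;
}
```

Sampling these two values at a fixed interval of allocation operations, as
the replay tool does, is what makes a sudden spike in either number visible.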

The performance number is a measure of the time it takes to perform all of
the allocation calls (malloc/memalign/posix_memalign/realloc/free/etc).
For any call that allocates a pointer, the time for the call and the time
it takes to make the pointer completely resident in memory is included.

The performance numbers for these runs tend to have a wide variability so
they should not be used as absolute values for comparison against the
current allocator. But, they should be in the same range as the current
values.

When evaluating an allocator, one of the most important traces is the
camera.txt trace. The camera application does very large allocations,
and some allocators might leave large virtual address maps around
rather than delete them. When that happens, it can lead to allocation
failures and would cause the camera app to abort/crash. It is
important to verify that when running this trace using the 32 bit replay
executable, the virtual address space consumed is not much larger than the
current allocator. A small increase (on the order of a few MBs) would be okay.

There is no specific benchmark for memory fragmentation; instead, the RSS
when running the memory traces acts as a proxy for this. An allocator that
is fragmenting badly will show an increase in RSS. The best trace for
tracking fragmentation is system\_server.txt, which is an extremely long
trace (~13 million operations). The total number of live allocations goes
up and down a bit but stays mostly the same, so an allocator that fragments
badly would likely show an abnormal increase in RSS on this trace.

NOTE: When a native allocator calls mmap, it is expected that the allocator
will name the map using the call:

    prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, <PTR>, <SIZE>, "libc_malloc");

If the native allocator creates a different name, then it is necessary to
modify the file:

    system/extras/memory_replay/NativeInfo.cpp

The `GetNativeInfo` function needs to be modified to include the names
of the maps that this allocator creates.
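
As a minimal sketch of the naming call in the note above, the following C
function maps an anonymous region and labels it. The function name
`map_named_region` is hypothetical, and the fallback `#define` values are
supplied only for header sets that do not provide these constants. The
sketch treats naming as best effort, which is an assumption made here for
portability: on a kernel without anonymous VMA naming support the `prctl`
call fails and the map simply stays unnamed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/prctl.h>

/* Older headers may not define these; the values match the Linux UAPI. */
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

/* Hypothetical helper: map anonymous memory and label it "libc_malloc". */
static void* map_named_region(size_t size) {
  assert(size > 0);
  void* ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (ptr == MAP_FAILED) return NULL;
  /* Best effort: the result is ignored so that a kernel without naming
   * support still gets a usable mapping. The string literal has static
   * storage, so it stays valid for the lifetime of the mapping. */
  (void)prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, (unsigned long)ptr,
              (unsigned long)size, (unsigned long)"libc_malloc");
  return ptr;
}
```

When the naming succeeds, the region appears as `[anon:libc_malloc]` in
`/proc/<pid>/maps`.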

In addition, in order for the frameworks code to keep track of the memory
of a process, any named maps must be added to the file:

    frameworks/base/core/jni/android_os_Debug.cpp

Modify the `load_maps` function and add a check for the new expected name.

#### Performance Trace Benchmarks
This is a benchmark that treats the trace data as if all allocations
occurred in a single thread. This is the scenario that could
happen if all of the allocations are spaced out in time so that no thread
ever performs an allocation at the same time as another thread.

Run these benchmarks as follows:

    adb shell /data/benchmarktest64/trace_benchmark/trace_benchmark
    adb shell /data/benchmarktest/trace_benchmark/trace_benchmark

When run without any arguments, the benchmark will run over all of the
traces and display data. These runs take many minutes to complete because
the benchmark runs long enough to get as accurate a number as possible.
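
The single-threaded replay idea can be illustrated with a small C sketch.
It is not the `trace_benchmark` implementation: the `TraceOp` record layout
and the `replay_single_threaded` function are hypothetical, and the sketch
handles only `malloc` and `free` operations. Each operation names a slot in
a caller-provided pointer table, and the whole sequence is replayed and
timed on one thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical trace record: which operation, which pointer slot, what size. */
typedef enum { OP_MALLOC, OP_FREE } OpType;

typedef struct {
  OpType type;
  size_t slot; /* index into the pointer table */
  size_t size; /* used only by OP_MALLOC */
} TraceOp;

/* Replays every operation in order on the calling thread and returns the
 * elapsed time in nanoseconds. Pointers still live afterwards remain in
 * `slots` and must be freed by the caller. */
static uint64_t replay_single_threaded(const TraceOp* ops, size_t num_ops,
                                       void** slots, size_t num_slots) {
  struct timespec start, end;
  clock_gettime(CLOCK_MONOTONIC, &start);
  for (size_t i = 0; i < num_ops; i++) {
    assert(ops[i].slot < num_slots);
    void** slot = &slots[ops[i].slot];
    if (ops[i].type == OP_MALLOC) {
      *slot = malloc(ops[i].size);
    } else {
      free(*slot);
      *slot = NULL;
    }
  }
  clock_gettime(CLOCK_MONOTONIC, &end);
  int64_t ns = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
               (int64_t)(end.tv_nsec - start.tv_nsec);
  return (uint64_t)ns;
}
```

Because no other thread allocates during the replay, the measured time
reflects the allocator's uncontended fast path rather than its locking
behavior.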