# Native Memory Allocator Verification
This document describes how to verify the native memory allocator on Android.
This procedure should be followed when upgrading or moving to a new allocator.
A minor upgrade might not need to run all of the benchmarks; however,
at least the
[SQL Allocation Trace Benchmark](#sql-allocation-trace-benchmark),
[Memory Replay Benchmarks](#memory-replay-benchmarks) and
[Performance Trace Benchmarks](#performance-trace-benchmarks) should be run.

It is important to note that there are two modes for a native allocator
to run in on Android. The first is the normal allocator; the second, called
the svelte config, is designed to run on memory-constrained systems: it is
a bit slower, but takes less RSS. To enable the svelte config,
add this line to the `BoardConfig.mk` for the given target:

    MALLOC_SVELTE := true

The `BoardConfig.mk` file is usually found in the directory
`device/<DEVICE_NAME>/` or in a subdirectory.

When evaluating a native allocator, make sure that you benchmark both
versions.
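
Because both configurations need to be benchmarked, it can help to confirm
which one a given board config selects before building. The following is a
minimal sketch, not part of the build system: the helper name
`is_svelte_config` is hypothetical, and it only looks for an uncommented
`MALLOC_SVELTE := true` line in the single file it is given, so it will miss
a setting that comes from an included makefile.

```shell
#!/bin/sh
# is_svelte_config FILE
# Succeeds (exit status 0) when FILE contains an uncommented
# "MALLOC_SVELTE := true" assignment. The helper name is hypothetical.
is_svelte_config() {
    grep -Eq '^[[:space:]]*MALLOC_SVELTE[[:space:]]*:=[[:space:]]*true[[:space:]]*$' "$1"
}

# Example use, with the path to the target's BoardConfig.mk as the argument:
#   if is_svelte_config path/to/BoardConfig.mk; then
#       echo "svelte config"
#   else
#       echo "normal config"
#   fi
```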

## Android Extensions
Android supports a few non-standard functions and mallopt controls that
a native allocator needs to implement.

### Iterator Functions
These are functions that are used to implement a memory leak detector
called `libmemunreachable`.

#### malloc\_disable
This function, when called, should pause all threads that are making a
call to an allocation function (malloc/free/etc). When a call
is made to `malloc_enable`, the paused threads should start running again.

#### malloc\_enable
This function, when called, does nothing unless there was a previous call
to `malloc_disable`. This call will unpause any thread that was paused in
a call to an allocation function (malloc/free/etc) by a previous call to
`malloc_disable`.

#### malloc\_iterate
This function enumerates all of the allocations currently live in the
system. It is meant to be called after a call to `malloc_disable` to
prevent further allocations while this call is being executed. The best
description of what is expected from this function is the set of tests
for it in `bionic/tests/malloc_iterate_test.cpp`.

### Mallopt Extensions
These are mallopt options that Android requires for a native allocator
to work efficiently.

#### M\_DECAY\_TIME
When set to zero, `mallopt(M_DECAY_TIME, 0)`, it is expected that an
allocator will attempt to purge and release any unused memory back to the
kernel on free calls. This is important in Android to avoid consuming extra
RSS.

When set to non-zero, `mallopt(M_DECAY_TIME, 1)`, an allocator can delay the
purge and release action. The amount of delay is up to the allocator
implementation, but it should be a reasonable amount of time. The jemalloc
allocator was implemented to have a one second delay.

The drawback to this option is that most allocators do not have a separate
thread to handle the purge, so the decay is only handled when an
allocation operation occurs. For server processes, this can mean that
RSS is slightly higher when the server is waiting for the next connection
and no other allocation calls are made. The `M_PURGE` option is used to
force a purge in this case.

For all applications on Android, the call `mallopt(M_DECAY_TIME, 1)` is
made by default. The idea is that it allows application frees to run a
bit faster, while only increasing RSS a bit.

#### M\_PURGE
When called, `mallopt(M_PURGE, 0)`, an allocator should purge and release
any unused memory immediately. The argument for this call is ignored. If
possible, this call should clear thread cached memory if it exists. The
idea is that this can be called to purge memory that has not been
purged when `M_DECAY_TIME` is set to one. This is useful if you have a
server application that does a lot of native allocations and the
application wants to purge that memory before waiting for the next connection.

## Correctness Tests
These are the tests that should be run to verify an allocator is
working properly according to Android.

### Bionic Unit Tests
The bionic unit tests contain a small number of allocator tests. These
tests primarily verify Android extensions and non-standard behavior
of allocation routines, such as what happens when a non-power-of-two
alignment is passed to memalign.

To run all of the compliance tests:

    adb shell /data/nativetest64/bionic-unit-tests/bionic-unit-tests --gtest_filter="malloc*"
    adb shell /data/nativetest/bionic-unit-tests/bionic-unit-tests --gtest_filter="malloc*"

The allocation tests are not meant to be complete, so it is expected
that a native allocator will have its own set of tests that can be run.

### Libmemunreachable Tests
The libmemunreachable tests verify that the iterator functions are working
properly.

To run all of the tests:

    adb shell /data/nativetest64/memunreachable_binder_test/memunreachable_binder_test
    adb shell /data/nativetest/memunreachable_binder_test/memunreachable_binder_test
    adb shell /data/nativetest64/memunreachable_test/memunreachable_test
    adb shell /data/nativetest/memunreachable_test/memunreachable_test
    adb shell /data/nativetest64/memunreachable_unit_test/memunreachable_unit_test
    adb shell /data/nativetest/memunreachable_unit_test/memunreachable_unit_test

### CTS Entropy Test
In addition to the bionic tests, there is also a CTS test that is designed
to verify that the addresses returned by malloc are sufficiently randomized
to help defeat potential security bugs.

Run this test thusly:

    atest AslrMallocTest

If there are multiple devices connected to the system, use `-s <SERIAL>`
to specify a device.

## Performance
There are several ways to evaluate the performance of a native
allocator on Android. One is allocation speed in various scenarios;
another is total RSS taken by the allocator.

The last is virtual address space consumed in 32 bit applications. There is
a limited amount of address space available in 32 bit apps, and there have
been allocator bugs that cause memory failures when too much virtual
address space is consumed. For 64 bit executables, this can be ignored.

### Bionic Benchmarks
These are the microbenchmarks that are part of the bionic benchmarks suite.
These benchmarks can be built using this command:

    mmma -j bionic/benchmarks

These benchmarks are only used to verify the speed of the allocator and
ignore anything related to RSS and virtual address space consumed.

For all of these benchmark runs, it can be useful to add these two options:

    --benchmark_repetitions=XX
    --benchmark_report_aggregates_only=true

This will run the benchmark XX times and then report only the mean, median,
and stddev, which gives a more stable number to compare against the new
allocator.

In addition, there is another option:

    --bionic_cpu=XX

This locks the benchmark to run only on core XX, which avoids
any issue related to the code migrating from one core to another
with different characteristics. For example, on a big-little cpu, if the
benchmark moves from big to little or vice-versa, this can cause scores
to fluctuate in indeterminate ways.

For most runs, the best set of options to add is:

    --benchmark_repetitions=10 --benchmark_report_aggregates_only=true --bionic_cpu=3

On most phones with a big-little cpu, core 3 is a little core.
Choosing to run on the little core tends to highlight any performance
differences.
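
The recommended options can be appended to every benchmark command in the
rest of this section. As a rough sketch, a small helper can assemble the
full command line so that the 32 bit and 64 bit runs use identical
settings. The function name `bionic_bench_cmd` and its argument order are
hypothetical; the paths and options are the ones described above, with the
core and repetition count defaulting to 3 and 10. The helper only prints
the command and does not run it.

```shell
#!/bin/sh
# bionic_bench_cmd BITS FILTER [CPU] [REPETITIONS]
# Prints the adb command for one bionic benchmark run.
# The function name and argument order are hypothetical.
bionic_bench_cmd() {
    bits="$1"
    filter="$2"
    cpu="${3:-3}"      # core to pin to; 3 is the value suggested above
    reps="${4:-10}"    # number of repetitions; 10 is the value suggested above
    case "$bits" in
        64) dir=/data/benchmarktest64 ;;
        32) dir=/data/benchmarktest ;;
        *)  echo "bits must be 32 or 64" >&2; return 1 ;;
    esac
    # echo joins its arguments with single spaces, giving one command line.
    echo "adb shell ${dir}/bionic-benchmarks/bionic-benchmarks" \
        "--benchmark_filter=${filter}" \
        "--benchmark_repetitions=${reps}" \
        "--benchmark_report_aggregates_only=true" \
        "--bionic_cpu=${cpu}"
}
```

For example, `bionic_bench_cmd 64 BM_mallinfo` prints the 64 bit command
with the default options, and `bionic_bench_cmd 32 BM_mallinfo 0 20` pins
the 32 bit run to core 0 with 20 repetitions.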

#### Allocate/Free Benchmarks
These are the benchmarks to verify the allocation speed of a loop doing a
single allocation, touching every page in the allocation to make it resident
and then freeing the allocation.

To run the benchmarks with `mallopt(M_DECAY_TIME, 0)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_free_default
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_free_default

To run the benchmarks with `mallopt(M_DECAY_TIME, 1)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_free_decay1
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_free_decay1

The last value in the output is the size of the allocation in bytes. It is
useful to look at these kinds of benchmarks to make sure that there are
no outliers, but these numbers should not be used to make a final decision.
If these numbers are slightly worse than the current allocator, the
single thread numbers from trace data are more representative of
real world situations.

#### Multiple Allocations Retained Benchmarks
These are the benchmarks that examine how the allocator handles multiple
allocations of the same size at the same time.

The first set of these benchmarks does a set number of 8192 byte allocations
in one loop, and then frees all of the allocations at the end of the loop.
Only the time it takes to do the allocations is recorded; the frees are not
counted. The value of 8192 was chosen since the jemalloc native allocator
had issues with this size. It is possible other sizes might show different
results, but, as mentioned before, these microbenchmark numbers should
not be used as absolutes for determining if an allocator is worth using.

This benchmark is designed to verify that there is no performance issue
related to having multiple allocations alive at the same time.

To run the benchmarks with `mallopt(M_DECAY_TIME, 0)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_multiple_8192_allocs_default
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_multiple_8192_allocs_default

To run the benchmarks with `mallopt(M_DECAY_TIME, 1)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_multiple_8192_allocs_decay1
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_multiple_8192_allocs_decay1

For these benchmarks, the last parameter is the total number of allocations to
do in each loop.

The other variation of this benchmark is to always do forty allocations in
each loop, but vary the size of the forty allocations. As with the other
benchmark, only the time it takes to do the allocations is tracked; the
frees are not counted. Forty allocations is an arbitrary number that could
be modified in the future. It was chosen because a version of the native
allocator, jemalloc, showed a problem at forty allocations.

To run the benchmarks with `mallopt(M_DECAY_TIME, 0)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_forty_default
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_forty_default

To run the benchmarks with `mallopt(M_DECAY_TIME, 1)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_forty_decay1
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=stdlib_malloc_forty_decay1

For these benchmarks, the last parameter in the output is the size of the
allocation in bytes.

As with the other microbenchmarks, an allocator with numbers close to
the current values is usually sufficient to consider making
a switch. The trace benchmarks are more important than these benchmarks
since they simulate real world allocation profiles.

#### SQL Allocation Trace Benchmark
This benchmark is a trace of the allocations performed when running
the SQLite BenchMark app.

This benchmark is designed to verify that the allocator will be performant
in a real world allocation scenario. SQL operations were chosen as a
benchmark because these operations tend to do lots of malloc/realloc/free
calls, and they tend to be on the critical path of applications.

To run the benchmarks with `mallopt(M_DECAY_TIME, 0)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_sql_trace_default
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_sql_trace_default

To run the benchmarks with `mallopt(M_DECAY_TIME, 1)`, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_sql_trace_decay1
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=malloc_sql_trace_decay1

The results should be on par with those of the current allocator.

#### mallinfo Benchmark
This benchmark only verifies that mallinfo is still close to the performance
of the current allocator.

To run the benchmark, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_mallinfo
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_mallinfo

Calls to mallinfo are used in ART, so a new allocator is required to be
nearly as performant as the current allocator.

#### mallopt M\_PURGE Benchmark
This benchmark tracks the cost of calling `mallopt(M_PURGE, 0)`. As with the
mallinfo benchmark, it's not necessary for this to be better than the previous
allocator, only that the performance be in the same order of magnitude.

To run the benchmark, use these commands:

    adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_mallopt_purge
    adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_mallopt_purge

These calls are used to free unused memory pages back to the kernel.

### Memory Trace Benchmarks
These benchmarks measure all three axes of a native allocator: RSS, virtual
address space consumed, and speed of allocation. They are designed to
run on a trace of the allocations from a real world application or system
process.

To build this benchmark:

    mmma -j system/extras/memory_replay

This will build two executables:

    /system/bin/memory_replay32
    /system/bin/memory_replay64

And these two benchmark executables:

    /data/benchmarktest64/trace_benchmark/trace_benchmark
    /data/benchmarktest/trace_benchmark/trace_benchmark

#### Memory Replay Benchmarks
These benchmarks display RSS, virtual memory consumed (VA space), and do a
bit of performance testing on actual traces taken from running applications.

The trace data includes what thread does each operation, so the replay
mechanism will simulate this by creating threads and replaying the operations
on a thread as if it were rerunning the real trace. The only issue is that
this is a worst case scenario for allocations happening at the same time
in all threads since it collapses all of the allocation operations to occur
one after another. This causes many threads to allocate at the same
time. The trace data does not include timestamps,
so it is not possible to create a completely accurate replay.

To generate these traces, see the [Malloc Debug documentation](https://android.googlesource.com/platform/bionic/+/main/libc/malloc_debug/README.md),
specifically the [record\_allocs](https://android.googlesource.com/platform/bionic/+/main/libc/malloc_debug/README.md#record_allocs_total_entries) option.

To run these benchmarks, first copy the trace files to the target using
this command:

    adb push system/extras/memory_replay/traces /data/local/tmp

Since all of the traces come from applications, the `memory_replay` program
will always call `mallopt(M_DECAY_TIME, 1)` before running the trace.

Run the benchmark thusly:

    adb shell memory_replay64 /data/local/tmp/traces/XXX.zip
    adb shell memory_replay32 /data/local/tmp/traces/XXX.zip

Where XXX.zip is the name of a zipped trace file. The `memory_replay`
program also can process text files, but all trace files are currently
checked in as zip files.
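
When every checked-in trace needs to be replayed, writing out each command
by hand gets tedious. The following is a minimal sketch that lists the
replay commands for all zipped traces in a local directory, assuming the
traces were pushed to `/data/local/tmp/traces` as shown above. The function
name `replay_cmds` is hypothetical, and the function only prints the
commands so that they can be reviewed or saved to a script file before
being run.

```shell
#!/bin/sh
# replay_cmds LOCAL_TRACE_DIR
# Prints one memory_replay64 and one memory_replay32 command for every
# *.zip file found in LOCAL_TRACE_DIR. The function name is hypothetical.
replay_cmds() {
    for trace in "$1"/*.zip; do
        # With no zip files the glob stays literal, so skip that entry.
        [ -e "$trace" ] || continue
        name="$(basename "$trace")"
        for replay in memory_replay64 memory_replay32; do
            echo "adb shell ${replay} /data/local/tmp/traces/${name}"
        done
    done
}
```

For example, `replay_cmds system/extras/memory_replay/traces > replay_all.sh`
writes the full list to a file (the file name is arbitrary) that can then
be run with `sh replay_all.sh`.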
345*8d67ca89SAndroid Build Coastguard Worker
346*8d67ca89SAndroid Build Coastguard WorkerEvery 100000 allocation operations, a dump of the RSS and VA space will be
347*8d67ca89SAndroid Build Coastguard Workerperformed. At the end, a final RSS and VA space number will be printed.
348*8d67ca89SAndroid Build Coastguard WorkerFor the most part, the intermediate data can be ignored, but it is always
349*8d67ca89SAndroid Build Coastguard Workera good idea to look over the data to verify that no strange spikes are
350*8d67ca89SAndroid Build Coastguard Workeroccurring.
351*8d67ca89SAndroid Build Coastguard Worker
The performance number is a measure of the time it takes to perform all of
the allocation calls (malloc/memalign/posix_memalign/realloc/free/etc).
For any call that allocates a pointer, both the time for the call and the
time it takes to make the pointer completely resident in memory are included.

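As a rough illustration of what "time for the call plus time to make the
pointer resident" means, the following is a minimal sketch and not the actual
`memory_replay` code. The helper names `MakeResident` and `TimedMalloc` are
hypothetical. The sketch assumes that writing one byte to every page is enough
to force the kernel to back the allocation with physical memory.

```cpp
#include <unistd.h>

#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: writes one byte per page (plus the last byte) so that
// every page backing the allocation becomes resident.
void MakeResident(void* ptr, size_t size, size_t page_size) {
  assert(page_size > 0);
  volatile uint8_t* data = static_cast<volatile uint8_t*>(ptr);
  for (size_t i = 0; i < size; i += page_size) {
    data[i] = 1;
  }
  if (size > 0) {
    // The allocation may not start on a page boundary, so the last byte can
    // sit on a page that the loop above did not touch.
    data[size - 1] = 1;
  }
}

// Hypothetical helper: returns the nanoseconds spent in malloc plus the time
// needed to make the returned memory resident. The pointer is handed back
// through out_ptr so that the caller can free it.
uint64_t TimedMalloc(size_t size, void** out_ptr) {
  const size_t page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  const auto start = std::chrono::steady_clock::now();
  void* ptr = malloc(size);
  if (ptr != nullptr) {
    MakeResident(ptr, size, page_size);
  }
  const auto end = std::chrono::steady_clock::now();
  *out_ptr = ptr;
  return static_cast<uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count());
}
```

Including the page touches in the timed region means an allocator that hands
out untouched memory cheaply does not look artificially fast compared to one
that reuses pages that are already resident.
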
The performance numbers for these runs tend to vary widely, so they should
not be used as absolute values for comparison against the current allocator.
However, they should be in the same range as the current values.

When evaluating an allocator, one of the most important traces is the
camera.txt trace. The camera application does very large allocations,
and some allocators might leave large virtual address maps around
rather than delete them. When that happens, it can lead to allocation
failures that would cause the camera app to abort or crash. It is
important to verify that, when running this trace using the 32-bit replay
executable, the virtual address space consumed is not much larger than that
consumed by the current allocator. A small increase (on the order of a few
MB) would be okay.

There is no specific benchmark for memory fragmentation; instead, the RSS
when running the memory traces acts as a proxy for this. An allocator that
is fragmenting badly will show an increase in RSS. The best trace for
tracking fragmentation is system\_server.txt, which is an extremely long
trace (~13 million operations). The total number of live allocations goes
up and down a bit but stays mostly the same, so an allocator that fragments
badly would likely show an abnormal increase in RSS on this trace.

NOTE: When a native allocator calls `mmap`, it is expected that the allocator
will name the map using the call:

    prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, <PTR>, <SIZE>, "libc_malloc");

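The following is a minimal sketch of how an allocator might combine `mmap`
with the naming call above. The function name `MapNamedRegion` is hypothetical,
and the fallback `#define` values are only there for C libraries whose headers
do not provide the constants. The sketch treats naming as best effort, on the
assumption that some kernels do not support anonymous map names.

```cpp
#include <sys/mman.h>
#include <sys/prctl.h>

#include <cstddef>

// Fallback definitions for C libraries whose headers lack these constants.
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

// Hypothetical helper: maps `size` bytes of anonymous memory and attempts to
// name the mapping. Returns nullptr if the mmap call fails. `name` should be
// a string literal (or otherwise outlive the mapping), since some kernels may
// keep a reference to the string rather than copying it.
void* MapNamedRegion(size_t size, const char* name) {
  void* ptr = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (ptr == MAP_FAILED) {
    return nullptr;
  }
  // Best effort: the return value is ignored because a kernel without
  // support for naming anonymous maps fails this call, and the mapping is
  // still usable without a name.
  prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, reinterpret_cast<unsigned long>(ptr),
        static_cast<unsigned long>(size), reinterpret_cast<unsigned long>(name));
  return ptr;
}
```

A call such as `MapNamedRegion(size, "libc_malloc")` would then show up in
`/proc/<pid>/maps` as `[anon:libc_malloc]` on kernels that support the feature.
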
If the native allocator creates a different name, then it is necessary to
modify the file:

    system/extras/memory_replay/NativeInfo.cpp

The `GetNativeInfo` function needs to be modified to include the names
of the maps that this allocator creates.

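To show the kind of check involved, here is a hedged sketch of matching map
names against a line in `/proc/<pid>/maps` format, where named anonymous maps
appear as `[anon:<name>]`. This is not the actual `GetNativeInfo` code; the
function `IsAllocatorMap` and the name `my_allocator` are hypothetical and
only illustrate the idea.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns true if a line in /proc/<pid>/maps format ends
// with one of the given anonymous map names, e.g. "[anon:libc_malloc]".
bool IsAllocatorMap(const std::string& maps_line,
                    const std::vector<std::string>& names) {
  // Ignore any trailing newline or spaces after the map name.
  const size_t last = maps_line.find_last_not_of(" \n");
  if (last == std::string::npos) {
    return false;
  }
  const size_t len = last + 1;
  for (const std::string& name : names) {
    const std::string tag = "[anon:" + name + "]";
    if (len >= tag.size() &&
        maps_line.compare(len - tag.size(), tag.size(), tag) == 0) {
      return true;
    }
  }
  return false;
}
```

With a list such as `{"libc_malloc", "my_allocator"}`, any map that the new
allocator names `my_allocator` would be counted alongside the default name.
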
In addition, in order for the frameworks code to keep track of the memory
of a process, any named maps must be added to the file:

    frameworks/base/core/jni/android_os_Debug.cpp

Modify the `load_maps` function and add a check for the new expected name.

#### Performance Trace Benchmarks
This is a benchmark that treats the trace data as if all allocations
occurred in a single thread. This is the scenario that could
happen if all of the allocations are spaced out in time so that no thread
ever does an allocation at the same time as another thread.

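The idea of replaying a multi-threaded trace on one thread can be sketched as
follows. This is an illustration only and not the `trace_benchmark`
implementation: the `TraceOp` structure, its fields, and
`ReplaySingleThreaded` are hypothetical and do not reflect the real trace
file format.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

enum class OpType { kMalloc, kFree };

// Hypothetical in-memory form of one trace entry; not the real trace format.
struct TraceOp {
  int tid;      // Thread that made the call originally (ignored on replay).
  OpType type;  // Which allocation call to perform.
  size_t slot;  // Index identifying the pointer this operation refers to.
  size_t size;  // Allocation size; unused for kFree.
};

// Replays every operation in trace order on the calling thread, regardless
// of which thread issued it originally. Returns the number of allocations
// still live at the end of the trace, then frees them.
size_t ReplaySingleThreaded(const std::vector<TraceOp>& ops, size_t num_slots) {
  std::vector<void*> slots(num_slots, nullptr);
  for (const TraceOp& op : ops) {
    if (op.slot >= num_slots) {
      continue;  // Skip malformed entries.
    }
    if (op.type == OpType::kMalloc) {
      free(slots[op.slot]);  // Avoid a leak if a slot is reused without a free.
      slots[op.slot] = malloc(op.size);
    } else {
      free(slots[op.slot]);
      slots[op.slot] = nullptr;
    }
  }
  size_t live = 0;
  for (void*& ptr : slots) {
    if (ptr != nullptr) {
      ++live;
      free(ptr);
      ptr = nullptr;
    }
  }
  return live;
}
```

Because the `tid` field is ignored, there is never any lock contention between
threads, so the timing of such a loop reflects only the single-threaded cost of
the allocator.
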
Run these benchmarks as follows:

    adb shell /data/benchmarktest64/trace_benchmark/trace_benchmark
    adb shell /data/benchmarktest/trace_benchmark/trace_benchmark

When run without any arguments, the benchmark will run over all of the
traces and display data. These runs take many minutes to complete so that
the numbers are as accurate as possible.