xref: /aosp_15_r20/external/perfetto/docs/design-docs/heapprofd-wire-protocol.md (revision 6dbdd20afdafa5e3ca9b8809fa73465d530080dc)
1*6dbdd20aSAndroid Build Coastguard Worker# heapprofd: Shared Memory
2*6dbdd20aSAndroid Build Coastguard Worker
3*6dbdd20aSAndroid Build Coastguard Worker_**Status**: Implemented_
4*6dbdd20aSAndroid Build Coastguard Worker_**Authors**: fmayer_
5*6dbdd20aSAndroid Build Coastguard Worker_**Reviewers**: rsavitski, primiano_
6*6dbdd20aSAndroid Build Coastguard Worker_**Last Updated**: 2019-02-11_
7*6dbdd20aSAndroid Build Coastguard Worker
8*6dbdd20aSAndroid Build Coastguard Worker## Objective
9*6dbdd20aSAndroid Build Coastguard WorkerRestructure heapprofd to make use of the <code>[SharedRingBuffer](https://cs.android.com/android/platform/superproject/main/+/main:external/perfetto/src/profiling/memory/shared_ring_buffer.cc)</code>.
10*6dbdd20aSAndroid Build Coastguard Worker
11*6dbdd20aSAndroid Build Coastguard Worker
12*6dbdd20aSAndroid Build Coastguard Worker## Overview
13*6dbdd20aSAndroid Build Coastguard WorkerInstead of using a socket pool for sending callstacks and frees to heapprofd, we use a single shared memory buffer and signaling socket. The client writes the record describing mallocs or frees into the shared memory buffer, and then sends a single byte on the signalling socket to wake up the service.
14*6dbdd20aSAndroid Build Coastguard Worker
15*6dbdd20aSAndroid Build Coastguard Worker![](/docs/images/heapprofd-design/shmem-overview.png)
16*6dbdd20aSAndroid Build Coastguard Worker
17*6dbdd20aSAndroid Build Coastguard Worker## High-level design
18*6dbdd20aSAndroid Build Coastguard WorkerUsing a shared memory buffer between the client and heapprofd removes the need to drain the socket as fast as possible in the service, which we needed previously to make sure we do not block the client's malloc calls. This allows us to simplify the threading design of heapprofd.
19*6dbdd20aSAndroid Build Coastguard Worker
20*6dbdd20aSAndroid Build Coastguard WorkerThe _Main Thread_ has the Perfetto producer connection to traced, and handles incoming client connections for `/dev/socket/heapprofd`. It executes the logic for finding processes matching an incoming TraceConfig, matching processes to client configurations, and does the handshake with the clients. During this handshake the shared memory buffer is created by the service. After the handshake is done, the socket for a client is handed off to a particular _Unwinder Thread_.
21*6dbdd20aSAndroid Build Coastguard Worker
22*6dbdd20aSAndroid Build Coastguard WorkerAfter the handshake is completed, the sockets are handled by the assigned _Unwinder Thread's_ eventloop. The unwinder thread owns the metadata required for unwinding (the `/proc/pid/{mem,maps}` FDs, derived libunwindstack objects and the shared memory buffer). When data is received on the signaling socket, the _Unwinding Thread_ unwinds the callstack provided by the client and posts a task to the _Main Thread_ to apply to bookkeeping. This is repeated until no more records are pending in the buffer.
23*6dbdd20aSAndroid Build Coastguard Worker
24*6dbdd20aSAndroid Build Coastguard WorkerTo shut down a tracing session, the _Main Thread_ posts a task on the corresponding _Unwinding Thread_ to shut down the connection. When the client has disconnected, the _Unwinding Thread_ posts a task on the _Main Thread_ to inform it about the disconnect. The same happens for unexpected disconnects.
25*6dbdd20aSAndroid Build Coastguard Worker
26*6dbdd20aSAndroid Build Coastguard Worker![](/docs/images/heapprofd-design/shmem-detail.png)
27*6dbdd20aSAndroid Build Coastguard Worker
28*6dbdd20aSAndroid Build Coastguard Worker### Ownership
29*6dbdd20aSAndroid Build Coastguard WorkerAt every point in time, every object is owned by exactly one thread. No references or pointers are shared between different threads.
30*6dbdd20aSAndroid Build Coastguard Worker
31*6dbdd20aSAndroid Build Coastguard Worker**_Main Thread:_**
32*6dbdd20aSAndroid Build Coastguard Worker
33*6dbdd20aSAndroid Build Coastguard Worker*   Signaling sockets before handshake was completed.
34*6dbdd20aSAndroid Build Coastguard Worker*   Bookkeeping data.
35*6dbdd20aSAndroid Build Coastguard Worker*   Set of connected processes and TraceConfigs (in `ProcessMatcher` class).
36*6dbdd20aSAndroid Build Coastguard Worker
37*6dbdd20aSAndroid Build Coastguard Worker**_Unwinder Thread, for each process:_**
38*6dbdd20aSAndroid Build Coastguard Worker
39*6dbdd20aSAndroid Build Coastguard Worker*   Signaling sockets after handshake was completed.
40*6dbdd20aSAndroid Build Coastguard Worker*   libunwindstack objects for `/proc/pid/{mem,maps}`.
41*6dbdd20aSAndroid Build Coastguard Worker*   Shared memory buffer.
42*6dbdd20aSAndroid Build Coastguard Worker
43*6dbdd20aSAndroid Build Coastguard Worker
44*6dbdd20aSAndroid Build Coastguard Worker## Detailed design
45*6dbdd20aSAndroid Build Coastguard WorkerRefer to the following phases in the sequence diagram below:
46*6dbdd20aSAndroid Build Coastguard Worker
47*6dbdd20aSAndroid Build Coastguard Worker### 1. Handshake
48*6dbdd20aSAndroid Build Coastguard WorkerThe _Main Thread_ receives a `TracingConfig` from traced containing a `HeapprofdConfig`. It adds the processes expected to connect, and their `ClientConfiguration` to the `ProcessMatcher`. It then finds matching processes (by PID or cmdline) and sends the heapprofd RT signal to trigger initialization.
49*6dbdd20aSAndroid Build Coastguard Worker
50*6dbdd20aSAndroid Build Coastguard WorkerThe processes receiving this configuration connect to `/dev/socket/heapprofd` and sends `/proc/self/{map,mem}` fds. The _Main Thread_ finds the matching configuration in the `ProcessMatcher`, creates a new shared memory buffer, and sends the two over the signaling socket. The client uses those to finish initializing its internal state. The _Main Thread_ hands off (`RemoveFiledescriptorWatch` + `AddFiledescriptorWatch`) the signaling socket to an _Unwinding Thread_. It also hands off the `ScopedFile`s for the `/proc` fds. These are used to create `UnwindingMetadata`.
51*6dbdd20aSAndroid Build Coastguard Worker
52*6dbdd20aSAndroid Build Coastguard Worker
53*6dbdd20aSAndroid Build Coastguard Worker### 2. Sampling
54*6dbdd20aSAndroid Build Coastguard WorkerNow that the handshake is done, all communication is between the _Client_ and its corresponding _Unwinder Thread_.
55*6dbdd20aSAndroid Build Coastguard Worker
56*6dbdd20aSAndroid Build Coastguard WorkerFor every malloc, the client decides whether to sample the allocation, and if it should, write the `AllocMetadata` + raw stack onto the shared memory buffer, and then sends a byte over the signaling socket to wake up the _Unwinder Thread_. The _Unwinder Thread_ uses `DoUnwind` to get an `AllocRecord` (metadata like size, address, etc + a vector of frames).  It then posts a task to the _Main Thread_ to apply this to the bookkeeping.
57*6dbdd20aSAndroid Build Coastguard Worker
58*6dbdd20aSAndroid Build Coastguard Worker
59*6dbdd20aSAndroid Build Coastguard Worker### 3. Dump / concurrent sampling
60*6dbdd20aSAndroid Build Coastguard WorkerA dump request can be triggered by two cases:
61*6dbdd20aSAndroid Build Coastguard Worker
62*6dbdd20aSAndroid Build Coastguard Worker*   a continuous dump
63*6dbdd20aSAndroid Build Coastguard Worker*   a flush request from traced
64*6dbdd20aSAndroid Build Coastguard Worker
65*6dbdd20aSAndroid Build Coastguard WorkerBoth of these cases are handled the same way. The _Main Thread_ dumps the relevant processes' bookkeeping and flushes the buffers to traced.
66*6dbdd20aSAndroid Build Coastguard Worker
67*6dbdd20aSAndroid Build Coastguard WorkerIn general, _Unwinder Threads_ will receive concurrent records from clients. They will continue unwinding and posting tasks for bookkeeping to be applied. The bookkeeping will be applied after the dump is done, as the bookkeeping data cannot be concurrently modified.
68*6dbdd20aSAndroid Build Coastguard Worker
69*6dbdd20aSAndroid Build Coastguard Worker
70*6dbdd20aSAndroid Build Coastguard Worker### 4. Disconnect
71*6dbdd20aSAndroid Build Coastguard Workertraced sends a `StopDataSource` IPC. The _Main Thread_ posts a task to the _Unwinder Thread_ to ask it to disconnect from the client. It unmaps the shared memory, closes the memfd, and then closes the signaling socket.
72*6dbdd20aSAndroid Build Coastguard Worker
73*6dbdd20aSAndroid Build Coastguard WorkerThe client receives an `EPIPE` on the next attempt to send data over that socket, and then tears down the client.
74*6dbdd20aSAndroid Build Coastguard Worker
75*6dbdd20aSAndroid Build Coastguard Worker![shared memory sequence diagram](/docs/images/heapprofd-design/Shared-Memory0.png "shared memory sequence diagram")
76*6dbdd20aSAndroid Build Coastguard Worker
77*6dbdd20aSAndroid Build Coastguard Worker
78*6dbdd20aSAndroid Build Coastguard Worker## Changes to client
79*6dbdd20aSAndroid Build Coastguard WorkerThe client will no longer need a socket pool, as all operations are done on the same shared memory buffer and the single signaling socket. Instead, the data is written to the shared memory buffer, and then a single byte is sent on the signaling socket in nonblocking mode.
80*6dbdd20aSAndroid Build Coastguard Worker
81*6dbdd20aSAndroid Build Coastguard WorkerWe need to be careful about which operation we use to copy the callstack to the shared memory buffer, as `memcpy(3)` can crash on the stack frame guards due to source hardening.
82*6dbdd20aSAndroid Build Coastguard Worker
83*6dbdd20aSAndroid Build Coastguard Worker
84*6dbdd20aSAndroid Build Coastguard Worker## Advantages over current design
85*6dbdd20aSAndroid Build Coastguard Worker*   Clear ownership semantics between threads.
86*6dbdd20aSAndroid Build Coastguard Worker*   No references or pointers are passed between threads.
87*6dbdd20aSAndroid Build Coastguard Worker*   More efficient client using fewer locks.
88*6dbdd20aSAndroid Build Coastguard Worker
89*6dbdd20aSAndroid Build Coastguard Worker
90*6dbdd20aSAndroid Build Coastguard Worker## Disadvantages over current design
91*6dbdd20aSAndroid Build Coastguard Worker*   Inflates target process PSS / RSS with the shared memory buffer.
92*6dbdd20aSAndroid Build Coastguard Worker*   TaskRunners are unbounded queues. This has the potential of enqueueing a lot of bookkeeping work for a pathologically behaving process. As applying the bookkeeping information is a relatively cheap operation, we accept that risk.
93