============================
Userspace block driver(ublk)
============================

Introduction
============

This is the userspace daemon part (ublksrv) of the ublk framework. The other
part is the ``ublk driver`` [#ublk_driver]_, which supports multiple queues.

The two parts communicate through io_uring's IORING_OP_URING_CMD, with one
per-queue shared cmd buffer for storing io commands. The buffer is read-only
for ublksrv, and each io command can be indexed directly by io request tag.
The ublk driver writes the command, and ublksrv reads it after being
notified by the ublk driver.
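As a rough illustration of tag-based indexing, the sketch below computes where
the io command for a given tag would sit inside one queue's cmd buffer. It
assumes each command is a fixed-size 24-byte descriptor (two 32-bit fields
followed by two 64-bit fields) stored back to back. ``DESC_SIZE``,
``desc_offset`` and ``buf_bytes`` are names made up for this example; the real
layout is defined by the ublk driver's UAPI header.

```shell
#!/usr/bin/env bash
# Assumption: one io command descriptor occupies 24 bytes.
DESC_SIZE=24

# desc_offset TAG: byte offset of the descriptor for TAG inside
# one queue's cmd buffer, assuming descriptors are packed by tag.
desc_offset() {
    local tag=$1
    echo $(( tag * DESC_SIZE ))
}

# buf_bytes DEPTH: minimum bytes needed to hold DEPTH descriptors.
buf_bytes() {
    local depth=$1
    echo $(( depth * DESC_SIZE ))
}

desc_offset 5     # prints 120
buf_bytes 128     # prints 3072
```

Because the offset depends only on the tag, ublksrv can locate a command
without searching once it knows which tag was notified.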

For example, when a READ io request is submitted to the ublk block driver,
the ublk driver first stores the io command in the cmd buffer, then
completes one IORING_OP_URING_CMD to notify ublksrv. ublksrv issues that
URING_CMD to the ublk driver beforehand so that it can be notified of any
new io request. Each URING_CMD is associated with one io request by tag, so
the depth for URING_CMD is the same as the queue depth of the ublk block
device.

After ublksrv gets the io command, it translates and handles the ublk io
request. For the ublk-loop target, for example, ublksrv translates the
request into the same request on another file or disk, like the kernel loop
block driver. In ublksrv's implementation, the target io is also handled by
io_uring and shares the same ring with the IORING_OP_URING_CMD commands.
When the target io request is done, the same IORING_OP_URING_CMD is issued
to the ublk driver again, both to commit the io request result and to get
notified of the next io request.

So far, the ublk driver needs to copy the io request pages into the
userspace buffer (pages) for WRITE before notifying ublksrv of the request,
and to copy the userspace buffer (pages) back to the io request pages after
ublksrv handles READ. It also appears that linux-mm cannot support zero copy
for this case yet. [#zero_copy]_

More ublk targets will be added to this framework in the future, even though
only ublk-loop and ublk-null are implemented now.

libublksrv is also built, and it helps to integrate ublk into existing
projects. demo_null is provided as an example of how to create a ublk
device with libublksrv.

Quick start
===========

how to build ublksrv:
---------------------

.. code-block:: console

    autoreconf -i
    ./configure   # pkg-config and libtool are usually needed
    make

note: ``./configure`` requires the liburing 2.2 package to be installed. If
liburing 2.2 isn't available in your distribution, please configure with the
following command, or refer to ``build_with_liburing_src``
[#build_with_liburing_src]_

.. code-block:: console

    PKG_CONFIG_PATH=${LIBURING_DIR} \
    ./configure \
    CFLAGS="-I${LIBURING_DIR}/src/include" \
    CXXFLAGS="-I${LIBURING_DIR}/src/include" \
    LDFLAGS="-L${LIBURING_DIR}/src"

LIBURING_DIR points to the directory of the liburing source code, and
liburing needs to be built before running the above command. Also,
IORING_SETUP_SQE128 has to be supported in the liburing source.
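One way to check the IORING_SETUP_SQE128 requirement before configuring is to
search the liburing headers for the symbol. The sketch below is only a
heuristic: ``has_sqe128`` is a helper name invented here, and the default
``$HOME/liburing`` location is an assumption, not something the build system
expects.

```shell
#!/usr/bin/env bash
# has_sqe128 DIR: succeed if the liburing tree at DIR mentions
# IORING_SETUP_SQE128 anywhere under src/include.
has_sqe128() {
    local dir=$1
    [ -d "$dir/src/include" ] || return 1
    grep -rq 'IORING_SETUP_SQE128' "$dir/src/include"
}

# Hypothetical default location; point LIBURING_DIR at your own checkout.
LIBURING_DIR=${LIBURING_DIR:-$HOME/liburing}
if has_sqe128 "$LIBURING_DIR"; then
    echo "liburing at $LIBURING_DIR mentions IORING_SETUP_SQE128"
else
    echo "liburing at $LIBURING_DIR is missing or lacks IORING_SETUP_SQE128"
fi
```

A match only shows that the headers define the flag; the library still has to
be built in that directory before ``./configure`` can link against it.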

C++20 is required for building the ublk utility, but libublksrv,
demo_null.c and demo_event.c can be built independently:

- build libublksrv ::

    make -C lib/

- build demo_null && demo_event ::

    make -C lib/
    make demo_null demo_event

help
----

- ublk help

add one ublk-null disk
----------------------

- ublk add -t null


add one ublk-loop disk
----------------------

- ublk add -t loop -f /dev/vdb

or

- ublk add -t loop -f 1.img


add one qcow2 disk
------------------

- ublk add -t qcow2 -f test.qcow2

note: qcow2 support is experimental, see details in qcow2 status [#qcow2_status]_
and readme [#qcow2_readme]_


remove one ublk disk
--------------------

- ublk del -n 0 #remove /dev/ublkb0

- ublk del -a #remove all ublk devices

list ublk devices
-----------------

- ublk list

- ublk list -v #with all device info dumped


unprivileged mode
=================

A typical use case is a container [#stefan_container]_ in which a user
can manage their own devices that are not exposed to other containers.

By default, controlling a ublk device requires a privileged user, since
/dev/ublk-control is accessible to the administrator only. This is
called privileged mode.

For unprivileged mode, /dev/ublk-control needs to be accessible to
all users, so the following udev rule needs to be added: ::

    KERNEL=="ublk-control", MODE="0666", OPTIONS+="static_node=ublk-control"

Also, when a new ublk device is added, ublk needs to change the device
ownership to the device's real owner, so the following rules are
needed: ::

    KERNEL=="ublkc*",RUN+="ublk_chown.sh %k"
    KERNEL=="ublkb*",RUN+="ublk_chown.sh %k"

``ublk_chown.sh`` can be found under ``utils/`` too.

``utils/ublk_dev.rules`` includes the above rules.
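Installing those two files could look roughly like the sketch below.
``install_ublk_udev`` is a helper name invented for this example, and the
destination directories are parameters because the right locations depend on
the distribution.

```shell
#!/usr/bin/env bash
# install_ublk_udev UTILS_DIR RULES_DIR HELPER_DIR
# Copy ublk_dev.rules and ublk_chown.sh out of a ublksrv utils/ directory.
install_ublk_udev() {
    local utils=$1 rules_dir=$2 helper_dir=$3
    local f
    # Refuse to do anything if either source file is missing.
    for f in ublk_dev.rules ublk_chown.sh; do
        if [ ! -f "$utils/$f" ]; then
            echo "missing $utils/$f" >&2
            return 1
        fi
    done
    mkdir -p "$rules_dir" "$helper_dir" || return 1
    cp "$utils/ublk_dev.rules" "$rules_dir/" || return 1
    cp "$utils/ublk_chown.sh" "$helper_dir/" || return 1
    chmod 0644 "$rules_dir/ublk_dev.rules" || return 1
    chmod 0755 "$helper_dir/ublk_chown.sh"
}
```

On a systemd-based host this might be run as root as
``install_ublk_udev utils /etc/udev/rules.d /usr/lib/udev``, followed by
``udevadm control --reload-rules`` so that udev picks up the new file. Both
target directories are assumptions, so check where your distribution expects
local rules and helper programs.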

With the above two administrator changes, an unprivileged user can
create/delete/list/use ublk devices, and anyone who isn't permitted
can't access or control these ublk devices (ublkc*/ublkb*).

An unprivileged user can pass ``--unprivileged`` to ``ublk add`` to create
an unprivileged ublk device. The created ublk device is then available
only to its owner and the administrator.

use unprivileged ublk in docker
-------------------------------

- install the following udev rules on the host machine: ::

    ACTION=="add",KERNEL=="ublk[bc]*",RUN+="/usr/local/sbin/ublk_chown_docker.sh %k 'add' '%M' '%m'"
    ACTION=="remove",KERNEL=="ublk[bc]*",RUN+="/usr/local/sbin/ublk_chown_docker.sh %k 'remove' '%M' '%m'"

``ublk_chown_docker.sh`` can be found under ``utils/``.

- run one container and install ublk & its dependency packages

.. code-block:: console

    docker run \
    --name fedora \
    --hostname=ublk-docker.example.com \
    --device=/dev/ublk-control \
    --device-cgroup-rule='a *:* rmw' \
    --tmpfs /tmp \
    --tmpfs /run \
    --volume /sys/fs/cgroup:/sys/fs/cgroup:ro \
    -ti \
    fedora:38

.. code-block:: console

    #run the following commands inside the above container
    dnf install -y git libtool automake autoconf g++ liburing-devel
    git clone https://github.com/ming1/ubdsrv.git
    cd ubdsrv
    autoreconf -i && ./configure && make -j 4 && make install

- add/delete ublk device inside container by unprivileged user

.. code-block:: console

    docker exec -u 1001:1001 -ti fedora /bin/bash

.. code-block:: console

    #run the following commands inside the above container
    bash-5.2$ ublk add -t null --unprivileged
    dev id 0: nr_hw_queues 1 queue_depth 128 block size 512 dev_capacity 524288000
    max rq size 524288 daemon pid 178 flags 0x62 state LIVE
    ublkc: 237:0 ublkb: 259:1 owner: 1001:1001
    queue 0: tid 179 affinity(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 )
    target {"dev_size":268435456000,"name":"null","type":0}

    bash-5.2$ ls -l /dev/ublk*
    crw-rw-rw-. 1 root root 10, 123 May 1 04:35 /dev/ublk-control
    brwx------. 1 1001 1001 259, 1 May 1 04:36 /dev/ublkb0
    crwx------. 1 1001 1001 237, 0 May 1 04:36 /dev/ublkc0

    bash-5.2$ ublk del -n 0
    bash-5.2$ ls -l /dev/ublk*
    crw-rw-rw-. 1 root root 10, 123 May 1 04:35 /dev/ublk-control

- example of ublk in docker: ``tests/debug/ublk_docker``

test
====

run all built tests
-------------------

make test T=all


run test group
--------------

make test T=null

make test T=loop

make test T=generic


run single test
---------------

make test T=generic/001

make test T=null/001

make test T=loop/001
...

run specified tests or test groups
----------------------------------

make test T=generic:loop/001:null
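To make the ``T=`` format concrete, the sketch below splits a selector on
``:`` and labels each item as a group (no ``/``) or a single test (contains
``/``). It only illustrates how the examples above read; it is not the logic
used by the project's test runner, and ``classify_selector`` is a name
invented here.

```shell
#!/usr/bin/env bash
# classify_selector SELECTOR: split a T= value on ':' and label each item.
classify_selector() {
    local items item
    IFS=':' read -r -a items <<< "$1"
    for item in "${items[@]}"; do
        case "$item" in
            */*) echo "test $item" ;;
            *)   echo "group $item" ;;
        esac
    done
}

classify_selector 'generic:loop/001:null'
# prints:
#   group generic
#   test loop/001
#   group null
```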


Debug
=====

ublksrv runs as a daemon process, so most debug messages won't be
shown in the terminal. If any issue is observed, please collect the log with
the command ``journalctl | grep ublksrvd``.

``./configure --enable-debug`` builds a debug version of ublk, which
dumps lots of runtime debug messages. It is for debugging only and should
not be used in a production environment. For the debug version of
ublksrv, ``ublk add --debug_mask=0x{MASK}`` controls which kinds of
debug log are dumped; see ``UBLK_DBG_*`` defined in include/ublksrv_utils.h
for each kind of debug log.
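If the ``UBLK_DBG_*`` values are single-bit flags (an assumption; check the
header), a mask can be composed from bit positions as sketched below.
``mask_from_bits`` is a helper name invented here, and the bit positions in
the example are placeholders rather than specific debug categories.

```shell
#!/usr/bin/env bash
# mask_from_bits BIT...: OR together the given bit positions and print
# the result as a 0x-prefixed hex value.
mask_from_bits() {
    local mask=0 bit
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    printf '0x%x\n' "$mask"
}

mask_from_bits 0 2    # prints 0x5
```

The output already carries the ``0x`` prefix, so it could be passed as
``ublk add -t null --debug_mask=$(mask_from_bits 0 2)``.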

libublksrv API doc
==================

The API is documented in include/ublksrv.h. Doxygen docs can be generated
by running ``make doxygen_doc``; the generated html docs are in doc/html.

Contributing
============

Any kind of contribution is welcome!

Development is done on GitHub.


Todo:
=====

libublk
-------

Move libublksrv out of the ublksrv project, and make it a standalone repo
named libublk.

This is planned for when ublk driver UAPI changes (feature additions) have
slowed down.

License
=======

nlohmann (include/nlohmann/json.hpp) is from [#nlohmann]_, which is covered
by the MIT license.

The library functions (all code in the lib/ directory and include/ublksrv.h)
are dual-licensed under LGPL and MIT, see COPYING.LGPL and LICENSE.

The qcow2 target code is covered by GPL-2.0, see COPYING.

All other source code is dual-licensed under GPL and MIT, see
COPYING and LICENSE.

References
==========

.. [#ublk_driver] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/block/ublk_drv.c?h=v6.0
.. [#zero_copy] https://lore.kernel.org/all/[email protected]/
.. [#nlohmann] https://github.com/nlohmann/json
.. [#qcow2_status] https://github.com/ming1/ubdsrv/blob/master/qcow2/STATUS.rst
.. [#qcow2_readme] https://github.com/ming1/ubdsrv/blob/master/qcow2/README.rst
.. [#build_with_liburing_src] https://github.com/ming1/ubdsrv/blob/master/build_with_liburing_src
.. [#stefan_container] https://lore.kernel.org/linux-block/[email protected]/