Searched refs:CUDA (Results 1–25 of 656) sorted by relevance

/aosp_15_r20/external/pytorch/docs/cpp/source/notes/
tensor_cuda_stream.rst
1 Tensor CUDA Stream API
4 A `CUDA Stream`_ is a linear sequence of execution that belongs to a specific CUDA device.
5 The PyTorch C++ API supports CUDA streams with the CUDAStream class and useful helper functions to …
6 …hem in `CUDAStream.h`_. This note provides more details on how to use PyTorch C++ CUDA Stream APIs.
12 Acquiring CUDA stream
15 PyTorch's C++ API provides the following ways to acquire a CUDA stream:
17 1. Acquire a new stream from the CUDA stream pool; streams are preallocated from the pool and retur…
26 by setting device index (defaulting to the current CUDA stream's device index).
28 2. Acquire the default CUDA stream for the passed CUDA device, or for the current device if no devi…
38 3. Acquire the current CUDA stream, for the CUDA device with index ``device_index``, or for the cur…
[all …]
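The three acquisition routes excerpted above correspond to helpers declared in CUDAStream.h. A minimal sketch of calling them (assuming a CUDA-enabled LibTorch build; the device index is illustrative):

#include <ATen/cuda/CUDAContext.h>

void stream_demo() {
  // 1. A fresh stream from the pre-allocated pool,
  //    optionally requesting a high-priority stream.
  auto pool_stream = at::cuda::getStreamFromPool(/*isHighPriority=*/false);

  // 2. The default stream of a given device (device 0 here).
  auto default_stream = at::cuda::getDefaultCUDAStream(/*device_index=*/0);

  // 3. Whatever stream is currently set for the current device.
  auto current_stream = at::cuda::getCurrentCUDAStream();
}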
/aosp_15_r20/prebuilts/cmake/linux-x86/share/cmake-3.22/Modules/
FindCUDAToolkit.cmake
10 This script locates the NVIDIA CUDA toolkit and the associated libraries, but
11 does not require the ``CUDA`` language be enabled for a given project. This
12 module does not search for the NVIDIA CUDA Samples.
20 The CUDA Toolkit search behavior uses the following order:
22 1. If the ``CUDA`` language has been enabled we will use the directory
41 the desired path in the event that multiple CUDA Toolkits are installed.
48 candidate is found, this is used. The default CUDA Toolkit install locations
54 | macOS | ``/Developer/NVIDIA/CUDA-X.Y`` |
58 | Windows | ``C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\vX.Y`` |
61 Where ``X.Y`` would be a specific version of the CUDA Toolkit, such as
[all …]
FindCUDA.cmake
7 It is no longer necessary to use this module or call ``find_package(CUDA)``
8 for compiling CUDA code. Instead, list ``CUDA`` among the languages named
10 :command:`enable_language` command with ``CUDA``.
11 Then one can add CUDA (``.cu``) sources directly to targets similar to other
15 To find and use the CUDA toolkit libraries manually, use the
17 ``CUDA`` language being enabled.
22 Tools for building CUDA C files: libraries and build dependencies.
24 This script locates the NVIDIA CUDA C tools. It should work on Linux,
25 Windows, and macOS and should be reasonably up to date with CUDA C
33 acceptable version of CUDA was found.
[all …]
CMakeDetermineCUDACompiler.cmake
10 message(FATAL_ERROR "CUDA language not currently supported by \"${CMAKE_GENERATOR}\" generator")
38 _cmake_find_compiler(CUDA)
41 _cmake_find_compiler_path(CUDA)
56 set(CMAKE_CUDA_ARCHITECTURES "$ENV{CUDAARCHS}" CACHE STRING "CUDA architectures")
75 CMAKE_DETERMINE_COMPILER_ID_VENDOR(CUDA "--version")
78 … message(FATAL_ERROR "Clang with CUDA is not yet supported on Windows. See CMake issue #20776.")
81 …# Find the CUDA toolkit. We store the CMAKE_CUDA_COMPILER_TOOLKIT_ROOT and CMAKE_CUDA_COMPILER_LIB…
122 # - macOS: /Developer/NVIDIA/CUDA-X.Y
123 # - Windows: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\vX.Y
133 set(platform_base "/Developer/NVIDIA/CUDA-")
[all …]
/aosp_15_r20/external/pytorch/cmake/Modules/
FindCUDAToolkit.cmake
13 This script locates the NVIDIA CUDA toolkit and the associated libraries, but
14 does not require the ``CUDA`` language be enabled for a given project. This
15 module does not search for the NVIDIA CUDA Samples.
23 The CUDA Toolkit search behavior uses the following order:
25 1. If the ``CUDA`` language has been enabled we will use the directory
44 the desired path in the event that multiple CUDA Toolkits are installed.
51 candidate is found, this is used. The default CUDA Toolkit install locations
57 | macOS | ``/Developer/NVIDIA/CUDA-X.Y`` |
61 | Windows | ``C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\vX.Y`` |
64 Where ``X.Y`` would be a specific version of the CUDA Toolkit, such as
[all …]
/aosp_15_r20/external/tensorflow/tensorflow/tools/dockerfiles/partials/ubuntu/
devel-nvidia.partial.Dockerfile
2 ARG CUDA=11.2
3 FROM nvidia/cuda${ARCH:+-$ARCH}:${CUDA}.1-base-ubuntu${UBUNTU_VERSION} as base
4 # ARCH and CUDA are specified again because the FROM directive resets ARGs
7 ARG CUDA
21 cuda-command-line-tools-${CUDA/./-} \
22 libcublas-${CUDA/./-} \
23 libcublas-dev-${CUDA/./-} \
24 cuda-nvprune-${CUDA/./-} \
25 cuda-nvrtc-${CUDA/./-} \
26 cuda-nvrtc-dev-${CUDA/./-} \
[all …]
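A note on the package names above: ${CUDA/./-} is ordinary shell parameter substitution, evaluated when the RUN command executes with the CUDA build argument in its environment. It rewrites the dotted version (for example 11.2) into the hyphenated form used by NVIDIA's apt packages, so libcublas-${CUDA/./-} resolves to libcublas-11-2.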
nvidia.partial.Dockerfile
2 ARG CUDA=11.2
3 FROM nvidia/cuda${ARCH:+-$ARCH}:${CUDA}.1-base-ubuntu${UBUNTU_VERSION} as base
4 # ARCH and CUDA are specified again because the FROM directive resets ARGs
7 ARG CUDA
23 cuda-command-line-tools-${CUDA/./-} \
24 libcublas-${CUDA/./-} \
25 cuda-nvrtc-${CUDA/./-} \
26 libcufft-${CUDA/./-} \
27 libcurand-${CUDA/./-} \
28 libcusolver-${CUDA/./-} \
[all …]
/aosp_15_r20/external/pytorch/cmake/public/
cuda.cmake
28 # Find CUDA.
29 find_package(CUDA)
32 "Caffe2: CUDA cannot be found. Depending on whether you are building "
39 # Enable CUDA language support
42 # Must be done before CUDA language is enabled, see
47 enable_language(CUDA)
64 message(FATAL_ERROR "Found two conflicting CUDA versions:\n"
69 message(STATUS "Caffe2: CUDA detected: " ${CUDA_VERSION})
70 message(STATUS "Caffe2: CUDA nvcc is: " ${CUDA_NVCC_EXECUTABLE})
71 message(STATUS "Caffe2: CUDA toolkit directory: " ${CUDA_TOOLKIT_ROOT_DIR})
[all …]
/aosp_15_r20/external/tensorflow/tensorflow/tools/dockerfiles/dockerfiles/ppc64le/
devel-gpu-ppc64le.Dockerfile
25 ARG CUDA=11.2
26 FROM nvidia/cuda${ARCH:+-$ARCH}:${CUDA}.1-base-ubuntu${UBUNTU_VERSION} as base
27 # ARCH and CUDA are specified again because the FROM directive resets ARGs
30 ARG CUDA
44 cuda-command-line-tools-${CUDA/./-} \
45 libcublas-${CUDA/./-} \
46 libcublas-dev-${CUDA/./-} \
47 cuda-nvprune-${CUDA/./-} \
48 cuda-nvrtc-${CUDA/./-} \
49 cuda-nvrtc-dev-${CUDA/./-} \
[all …]
devel-gpu-ppc64le-jupyter.Dockerfile
25 ARG CUDA=11.2
26 FROM nvidia/cuda${ARCH:+-$ARCH}:${CUDA}.1-base-ubuntu${UBUNTU_VERSION} as base
27 # ARCH and CUDA are specified again because the FROM directive resets ARGs
30 ARG CUDA
44 cuda-command-line-tools-${CUDA/./-} \
45 libcublas-${CUDA/./-} \
46 libcublas-dev-${CUDA/./-} \
47 cuda-nvprune-${CUDA/./-} \
48 cuda-nvrtc-${CUDA/./-} \
49 cuda-nvrtc-dev-${CUDA/./-} \
[all …]
gpu-ppc64le.Dockerfile
25 ARG CUDA=11.2
26 FROM nvidia/cuda${ARCH:+-$ARCH}:${CUDA}.1-base-ubuntu${UBUNTU_VERSION} as base
27 # ARCH and CUDA are specified again because the FROM directive resets ARGs
30 ARG CUDA
46 cuda-command-line-tools-${CUDA/./-} \
47 libcublas-${CUDA/./-} \
48 cuda-nvrtc-${CUDA/./-} \
49 libcufft-${CUDA/./-} \
50 libcurand-${CUDA/./-} \
51 libcusolver-${CUDA/./-} \
[all …]
/aosp_15_r20/external/tensorflow/tensorflow/tools/dockerfiles/dockerfiles/
devel-gpu.Dockerfile
25 ARG CUDA=11.2
26 FROM nvidia/cuda${ARCH:+-$ARCH}:${CUDA}.1-base-ubuntu${UBUNTU_VERSION} as base
27 # ARCH and CUDA are specified again because the FROM directive resets ARGs
30 ARG CUDA
44 cuda-command-line-tools-${CUDA/./-} \
45 libcublas-${CUDA/./-} \
46 libcublas-dev-${CUDA/./-} \
47 cuda-nvprune-${CUDA/./-} \
48 cuda-nvrtc-${CUDA/./-} \
49 cuda-nvrtc-dev-${CUDA/./-} \
[all …]
devel-gpu-jupyter.Dockerfile
25 ARG CUDA=11.2
26 FROM nvidia/cuda${ARCH:+-$ARCH}:${CUDA}.1-base-ubuntu${UBUNTU_VERSION} as base
27 # ARCH and CUDA are specified again because the FROM directive resets ARGs
30 ARG CUDA
44 cuda-command-line-tools-${CUDA/./-} \
45 libcublas-${CUDA/./-} \
46 libcublas-dev-${CUDA/./-} \
47 cuda-nvprune-${CUDA/./-} \
48 cuda-nvrtc-${CUDA/./-} \
49 cuda-nvrtc-dev-${CUDA/./-} \
[all …]
gpu.Dockerfile
25 ARG CUDA=11.2
26 FROM nvidia/cuda${ARCH:+-$ARCH}:${CUDA}.1-base-ubuntu${UBUNTU_VERSION} as base
27 # ARCH and CUDA are specified again because the FROM directive resets ARGs
30 ARG CUDA
46 cuda-command-line-tools-${CUDA/./-} \
47 libcublas-${CUDA/./-} \
48 cuda-nvrtc-${CUDA/./-} \
49 libcufft-${CUDA/./-} \
50 libcurand-${CUDA/./-} \
51 libcusolver-${CUDA/./-} \
[all …]
/aosp_15_r20/external/pytorch/aten/src/ATen/native/cuda/
LinearAlgebraStubs.cpp
55 cholesky_stub(DeviceType::CUDA, input, info, upper); in lazy_cholesky_kernel()
60 return cholesky_inverse_stub(DeviceType::CUDA, result, infos, upper); in lazy_cholesky_inverse_kernel()
65 lu_factor_stub(DeviceType::CUDA, input, pivots, infos, compute_pivots); in lazy_lu_factor()
70 triangular_solve_stub(DeviceType::CUDA, A, B, left, upper, transpose, unitriangular); in lazy_triangular_solve_kernel()
75 return orgqr_stub(DeviceType::CUDA, result, tau); in lazy_orgqr_kernel()
80 ormqr_stub(DeviceType::CUDA, input, tau, other, left, transpose); in lazy_ormqr_kernel()
85 geqrf_stub(DeviceType::CUDA, input, tau); in lazy_geqrf_kernel()
90 linalg_eigh_stub(DeviceType::CUDA, eigenvalues, eigenvectors, infos, upper, compute_eigenvectors); in lazy_linalg_eigh_kernel()
95 linalg_eig_stub(DeviceType::CUDA, eigenvalues, eigenvectors, infos, input, compute_eigenvectors); in lazy_linalg_eig_kernel()
107 svd_stub(DeviceType::CUDA, A, full_matrices, compute_uv, driver, U, S, Vh, info); in lazy_svd_kernel()
[all …]
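Each of these lazy_* wrappers re-dispatches through a per-backend stub once the real CUDA linear-algebra library has been loaded. A stub here is essentially a function-pointer table indexed by DeviceType. A minimal standalone sketch of that idea (plain C++; this is not PyTorch's actual DispatchStub machinery, and the names are hypothetical):

#include <array>
#include <cstddef>
#include <cstdio>

enum class DeviceType { CPU, CUDA, COUNT };

// A "stub": one function pointer per backend, filled in at registration
// time and invoked with an explicit DeviceType, as in the calls above.
struct CholeskyStub {
  using Fn = void (*)(bool upper);
  std::array<Fn, static_cast<std::size_t>(DeviceType::COUNT)> table{};

  void operator()(DeviceType d, bool upper) {
    table[static_cast<std::size_t>(d)](upper);  // assumes a kernel was registered
  }
};

CholeskyStub cholesky_stub;

void cholesky_cuda(bool upper) {
  std::printf("CUDA cholesky, upper=%d\n", upper);
}

int main() {
  // Registration normally happens when the backend library is loaded.
  cholesky_stub.table[static_cast<std::size_t>(DeviceType::CUDA)] = &cholesky_cuda;
  cholesky_stub(DeviceType::CUDA, /*upper=*/true);
}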
/aosp_15_r20/external/llvm/docs/
CompileCudaWithLLVM.rst
2 Compiling CUDA C/C++ with LLVM
11 This document contains the user guides and the internals of compiling CUDA
12 C/C++ with LLVM. It is aimed at both users who want to compile CUDA with LLVM
14 familiarity with CUDA. Information about CUDA programming can be found in the
15 `CUDA programming guide
18 How to Build LLVM with CUDA Support
21 CUDA support is still in development and works best in the trunk version
52 How to Compile CUDA C/C++ with LLVM
55 We assume you have installed the CUDA driver and runtime. Consult the `NVIDIA
56 CUDA installation guide
[all …]
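This guide builds around a small axpy kernel; a self-contained version of that example follows, with an illustrative clang invocation in the comments (the GPU architecture and CUDA install path are assumptions to adjust for your system):

// axpy.cu
// Illustrative compile command (arch and paths are assumptions):
//   clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_35 \
//       -L/usr/local/cuda/lib64 -lcudart
#include <cstdio>

__global__ void axpy(float a, float* x, float* y) {
  y[threadIdx.x] = a * x[threadIdx.x];
}

int main() {
  const int n = 4;
  float host_x[n] = {1.0f, 2.0f, 3.0f, 4.0f};
  float host_y[n];

  float *x, *y;
  cudaMalloc(&x, n * sizeof(float));
  cudaMalloc(&y, n * sizeof(float));
  cudaMemcpy(x, host_x, n * sizeof(float), cudaMemcpyHostToDevice);

  axpy<<<1, n>>>(2.0f, x, y);  // one block, n threads

  cudaMemcpy(host_y, y, n * sizeof(float), cudaMemcpyDeviceToHost);
  for (int i = 0; i < n; ++i) std::printf("%g\n", host_y[i]);

  cudaFree(x);
  cudaFree(y);
  return 0;
}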
/aosp_15_r20/external/pytorch/aten/src/ATen/core/boxing/impl/
kernel_lambda_test.cpp
59 … .kernel(DispatchKey::CUDA, [] (Tensor, int64_t) -> int64_t {EXPECT_TRUE(false); return 0;})) in TEST()
61 … .kernel(DispatchKey::CUDA, [] (Tensor, int64_t) -> int64_t {EXPECT_TRUE(false); return 0;})); in TEST()
67 … .kernel(DispatchKey::CUDA, [] (Tensor, int64_t) -> int64_t {EXPECT_TRUE(false); return 0;})); in TEST()
69 … .kernel(DispatchKey::CUDA, [] (Tensor, int64_t) -> int64_t {EXPECT_TRUE(false); return 0;})); in TEST()
80 auto m_cuda = MAKE_TORCH_LIBRARY_IMPL(_test, CUDA); in TEST()
81 m_cuda.impl("my_op", DispatchKey::CUDA, [] (Tensor, int64_t i) {return i-1;}); in TEST()
85 expectCallsDecrement(DispatchKey::CUDA); in TEST()
90 expectDoesntFindKernel("_test::my_op", DispatchKey::CUDA); in TEST()
141 .kernel(DispatchKey::CUDA, [] (const Tensor& a) {return a;})); in TEST()
150 result = callOp(*op, dummyTensor(DispatchKey::CUDA)); in TEST()
[all …]
make_boxed_from_unboxed_functor_test.cpp
62 … .kernel<ErrorKernel>(DispatchKey::CUDA)) in TEST()
64 … .kernel<ErrorKernel>(DispatchKey::CUDA)); in TEST()
70 … .kernel<ErrorKernel>(DispatchKey::CUDA)); in TEST()
72 … .kernel<ErrorKernel>(DispatchKey::CUDA)); in TEST()
140 … .kernel<KernelWithTensorOutput>(DispatchKey::CUDA)); in TEST()
149 result = callOp(*op, dummyTensor(DispatchKey::CUDA)); in TEST()
151 EXPECT_EQ(DispatchKey::CUDA, extractDispatchKey(result[0].toTensor())); in TEST()
162 … -> Tensor[]", RegisterOperators::options().kernel<KernelWithTensorListOutput>(DispatchKey::CUDA)); in TEST()
167 …auto result = callOp(*op, dummyTensor(DispatchKey::CPU), dummyTensor(DispatchKey::CUDA), dummyTens… in TEST()
171 EXPECT_EQ(DispatchKey::CUDA, extractDispatchKey(result[0].toTensorVector()[1])); in TEST()
[all …]
kernel_function_test.cpp
79 … .kernel<decltype(errorKernel), &errorKernel>(DispatchKey::CUDA)) in TEST()
81 … .kernel<decltype(errorKernel), &errorKernel>(DispatchKey::CUDA)); in TEST()
87 … .kernel<decltype(errorKernel), &errorKernel>(DispatchKey::CUDA)); in TEST()
89 … .kernel<decltype(errorKernel), &errorKernel>(DispatchKey::CUDA)); in TEST()
101 auto m_cuda = MAKE_TORCH_LIBRARY_IMPL(_test, CUDA); in TEST()
102 m_cuda.impl("my_op", DispatchKey::CUDA, TORCH_FN(decrementKernel)); in TEST()
106 expectCallsDecrement(DispatchKey::CUDA); in TEST()
111 expectDoesntFindKernel("_test::my_op", DispatchKey::CUDA); in TEST()
174 … .kernel<decltype(kernelWithTensorOutput), &kernelWithTensorOutput>(DispatchKey::CUDA)); in TEST()
183 result = callOp(*op, dummyTensor(DispatchKey::CUDA)); in TEST()
[all …]
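These tests register per-backend kernels through a test-local harness (MAKE_TORCH_LIBRARY_IMPL); the public equivalent is the TORCH_LIBRARY / TORCH_LIBRARY_IMPL macro pair. A minimal sketch, modeled on the my_op decrement kernel the tests use (schema and kernel body are illustrative):

#include <ATen/ATen.h>
#include <torch/library.h>

// Illustrative kernel, mirroring the tests' decrement behavior.
int64_t my_op_cuda(const at::Tensor& /*dummy*/, int64_t input) {
  return input - 1;
}

// Declare the operator schema once for the namespace...
TORCH_LIBRARY(_test, m) {
  m.def("my_op(Tensor dummy, int input) -> int");
}

// ...then attach a kernel to the CUDA dispatch key.
TORCH_LIBRARY_IMPL(_test, CUDA, m) {
  m.impl("my_op", TORCH_FN(my_op_cuda));
}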
/aosp_15_r20/external/pytorch/torch/csrc/distributed/c10d/
Ops.cpp
82 IMPL_SEND(CUDA) in IMPL_SEND() argument
97 IMPL_RECV(CUDA)
111 IMPL_RECV_ANY_SOURCE(CUDA)
134 IMPL_REDUCE(CUDA)
159 IMPL_BROADCAST(CUDA)
183 IMPL_ALLREDUCE(CUDA)
201 IMPL_ALLREDUCE_COALESCED(CUDA)
225 IMPL_ALLGATHER(CUDA)
245 IMPL__ALLGATHER_BASE(CUDA)
261 IMPL_ALLGATHER_COALESCED(CUDA)
[all …]
/aosp_15_r20/external/pytorch/docs/source/
cuda_environment_variables.rst
3 CUDA Environment Variables
5 For more information on CUDA runtime environment variables, see `CUDA Environment Variables <https:…
15 …- If set to ``1``, disables caching of memory allocations in CUDA. This can be useful for debuggin…
19CUDA is available, PyTorch will use NVML to check if the CUDA driver is functional instead of usin…
33 **CUDA Runtime and Libraries Environment Variables**
41 …- Comma-separated list of GPU device IDs that should be made available to CUDA runtime. If set to …
43 - If set to ``1``, makes CUDA calls synchronous. This can be useful for debugging.
bottleneck.rst
25 Due to the asynchronous nature of CUDA kernels, when running against
26 CUDA code, the cProfile output and CPU-mode autograd profilers may
32 In these cases where timings are incorrect, the CUDA-mode autograd profiler
36 To decide which (CPU-only-mode or CUDA-mode) autograd profiler output to
38 ("CPU total time is much greater than CUDA total time").
42 looking for responsible CUDA operators in the output of the CUDA-mode
55 If you are profiling CUDA code, the first profiler that ``bottleneck`` runs
56 (cProfile) will include the CUDA startup time (CUDA buffer allocation cost)
58 in code much slower than the CUDA startup time.
/aosp_15_r20/external/pytorch/c10/test/core/
DeviceGuard_test.cpp
15 FakeGuardImpl<DeviceType::CUDA> cuda_impl; in TEST()
17 FakeGuardImpl<DeviceType::CUDA>::setDeviceIndex(0); in TEST()
19 DeviceGuard g(Device(DeviceType::CUDA, 1), &cuda_impl); in TEST()
21 ASSERT_EQ(FakeGuardImpl<DeviceType::CUDA>::getDeviceIndex(), 0); in TEST()
30 FakeGuardImpl<DeviceType::CUDA> cuda_impl; in TEST()
32 FakeGuardImpl<DeviceType::CUDA>::setDeviceIndex(0); in TEST()
35 g.reset_device(Device(DeviceType::CUDA, 1), &cuda_impl); in TEST()
37 ASSERT_EQ(FakeGuardImpl<DeviceType::CUDA>::getDeviceIndex(), 0); in TEST()
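FakeGuardImpl exists so the test can exercise the guard without a physical GPU. In application code the same RAII pattern is simply (a minimal sketch, assuming a CUDA build with at least two devices):

#include <c10/core/Device.h>
#include <c10/core/DeviceGuard.h>

void run_on_device_1() {
  // Sets the current device to CUDA:1 for the lifetime of `guard`.
  c10::DeviceGuard guard(c10::Device(c10::DeviceType::CUDA, 1));
  // ... enqueue work on device 1 ...
}  // destructor restores the previously current device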
/aosp_15_r20/external/pytorch/third_party/tensorflow_cuda_bazel_build/cuda/
build_defs.bzl
1 # Macros for building CUDA code.
3 """Shorthand for select()'ing on whether we're building with CUDA.
6 with CUDA enabled. Otherwise, the select statement evaluates to if_false.
16 """Default options for all CUDA compilations."""
20 """Returns true if CUDA was enabled during the configure process."""
24 """Tests if the CUDA was enabled during the configure process.
27 --config=cuda. Used to allow non-CUDA code to depend on CUDA libraries.
/aosp_15_r20/prebuilts/cmake/linux-x86/share/cmake-3.22/Help/prop_gbl/
CMAKE_CUDA_KNOWN_FEATURES.rst
6 List of CUDA features known to this version of CMake.
9 CUDA compiler. If the feature is available with the C++ compiler, it will
20 Compiler mode is at least CUDA/C++ 03.
23 Compiler mode is at least CUDA/C++ 11.
26 Compiler mode is at least CUDA/C++ 14.
29 Compiler mode is at least CUDA/C++ 17.
32 Compiler mode is at least CUDA/C++ 20.
37 Compiler mode is at least CUDA/C++ 23.
