cudaFuncAttributes(3)

version 6.0
8 Feb 2019
Aliases: binaryVersion(3), cacheModeCA(3), constSizeBytes(3), localSizeBytes(3), maxDynamicSharedSizeBytes(3), numRegs(3), preferredShmemCarveout(3), ptxVersion(3), sharedSizeBytes(3)

nvidia-cuda-dev

NVIDIA CUDA development files

cuda

NVIDIA's GPU programming toolkit

NAME

cudaFuncAttributes - CUDA function attributes

SYNOPSIS

Data Fields

int binaryVersion
int cacheModeCA
size_t constSizeBytes
size_t localSizeBytes
int maxDynamicSharedSizeBytes
int maxThreadsPerBlock
int numRegs
int preferredShmemCarveout
int ptxVersion
size_t sharedSizeBytes

Detailed Description

CUDA function attributes

Field Documentation

int cudaFuncAttributes::binaryVersion

The binary architecture version for which the function was compiled. This value is the major binary version * 10 + the minor binary version, so a function compiled for binary version 1.3 has the value 13.
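To illustrate the encoding, the following C++ sketch splits such a value back into its major and minor parts. The names ArchVersion, decodeArchVersion, and encodeArchVersion are hypothetical helpers, not part of the CUDA runtime API, and the sketch assumes a non-negative value with a single-digit minor version. The value itself would come from a populated cudaFuncAttributes structure.

```cpp
// Hypothetical helper type: not part of the CUDA runtime API.
struct ArchVersion {
    int majorPart;
    int minorPart;
};

// Splits an encoded value (major * 10 + minor) back into its parts.
// Assumes a non-negative value and a single-digit minor version.
constexpr ArchVersion decodeArchVersion(int encoded) {
    return ArchVersion{encoded / 10, encoded % 10};
}

// Re-encodes a version, for example to compare it against binaryVersion.
constexpr int encodeArchVersion(int majorPart, int minorPart) {
    return majorPart * 10 + minorPart;
}
```

For example, decodeArchVersion(13) yields major part 1 and minor part 3, matching the 1.3 case described above.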

int cudaFuncAttributes::cacheModeCA

Indicates whether the function was compiled with the user-specified option '-Xptxas --dlcm=ca'.

size_t cudaFuncAttributes::constSizeBytes

The size in bytes of user-allocated constant memory required by this function.

size_t cudaFuncAttributes::localSizeBytes

The size in bytes of local memory used by each thread of this function.

int cudaFuncAttributes::maxDynamicSharedSizeBytes

The maximum size in bytes of dynamic shared memory per block for this function. Any launch must have a dynamic shared memory size smaller than this value.

int cudaFuncAttributes::maxThreadsPerBlock

The maximum number of threads per block, beyond which a launch of the function would fail. This number depends on both the function and the device on which the function is currently loaded.
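A launch configuration can be kept within this limit along the following lines. This is a minimal C++ sketch: clampBlockSize and blocksNeeded are hypothetical helpers, and the maxThreadsPerBlock argument is assumed to have been read from a populated cudaFuncAttributes structure.

```cpp
// Hypothetical helper: caps a desired block size at the per-function limit.
// Assumes both arguments are positive.
constexpr int clampBlockSize(int desiredThreads, int maxThreadsPerBlock) {
    return desiredThreads < maxThreadsPerBlock ? desiredThreads
                                               : maxThreadsPerBlock;
}

// Hypothetical helper: number of blocks needed to cover elementCount items
// with threadsPerBlock threads each (ceiling division).
constexpr int blocksNeeded(int elementCount, int threadsPerBlock) {
    return (elementCount + threadsPerBlock - 1) / threadsPerBlock;
}
```

With a limit of 768, a desired block size of 1024 is reduced to 768, while a desired size of 256 is left unchanged.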

int cudaFuncAttributes::numRegs

The number of registers used by each thread of this function.

int cudaFuncAttributes::preferredShmemCarveout

On devices where the L1 cache and shared memory use the same hardware resources, this sets the shared memory carveout preference as a percentage of the maximum shared memory. Refer to cudaDevAttrMaxSharedMemoryPerMultiprocessor. This is only a hint, and the driver can choose a different ratio if required to execute the function. See cudaFuncSetAttribute.
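As a rough illustration of the percentage semantics, the C++ sketch below converts a carveout preference into a byte count. carveoutBytes is a hypothetical helper, the percentage is assumed to lie between 0 and 100, and the maximum shared memory per multiprocessor is assumed to have been queried separately (for example through cudaDevAttrMaxSharedMemoryPerMultiprocessor). Because the preference is only a hint, the result is an estimate rather than a guarantee.

```cpp
#include <cstddef>

// Hypothetical helper: shared memory implied by a carveout preference.
// percent           : assumed to be in the range 0..100
// maxSharedPerSmBytes: maximum shared memory per multiprocessor, in bytes
constexpr std::size_t carveoutBytes(int percent,
                                    std::size_t maxSharedPerSmBytes) {
    return maxSharedPerSmBytes * static_cast<std::size_t>(percent) / 100;
}
```

For instance, a preference of 50 on a device with 98304 bytes of maximum shared memory per multiprocessor corresponds to 49152 bytes.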

int cudaFuncAttributes::ptxVersion

The PTX virtual architecture version for which the function was compiled. This value is the major PTX version * 10 + the minor PTX version, so a function compiled for PTX version 1.3 has the value 13.

size_t cudaFuncAttributes::sharedSizeBytes

The size in bytes of statically-allocated shared memory per block required by this function. This does not include dynamically-allocated shared memory requested by the user at runtime.
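Since the static and dynamic parts are accounted for separately, the total shared memory a block needs is their sum. The C++ sketch below shows that arithmetic; totalSharedBytes and fitsSharedLimit are hypothetical helpers, and the per-block limit is an assumed input that would be obtained from the device.

```cpp
#include <cstddef>

// Hypothetical helper: total shared memory per block for one launch.
// staticBytes  : sharedSizeBytes from cudaFuncAttributes
// dynamicBytes : dynamic shared memory requested at launch time
constexpr std::size_t totalSharedBytes(std::size_t staticBytes,
                                       std::size_t dynamicBytes) {
    return staticBytes + dynamicBytes;
}

// Hypothetical helper: true if the combined request fits an assumed
// per-block limit supplied by the caller.
constexpr bool fitsSharedLimit(std::size_t staticBytes,
                               std::size_t dynamicBytes,
                               std::size_t perBlockLimitBytes) {
    return totalSharedBytes(staticBytes, dynamicBytes) <= perBlockLimitBytes;
}
```

A function with 4096 bytes of static shared memory launched with 8192 bytes of dynamic shared memory needs 12288 bytes per block in total.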

Author

Generated automatically by Doxygen from the source code.