RMM  23.12
RAPIDS Memory Manager
rmm::device_scalar< T > Class Template Reference

Container for a single object of type T in device memory.

#include <device_scalar.hpp>

Public Types

using value_type = typename device_uvector< T >::value_type
 T, the type of the scalar element.
 
using reference = typename device_uvector< T >::reference
 value_type&
 
using const_reference = typename device_uvector< T >::const_reference
 const value_type&
 
using pointer = typename device_uvector< T >::pointer
 The type of the pointer returned by data()
 
using const_pointer = typename device_uvector< T >::const_pointer
 The type of the pointer returned by data() const
 

Public Member Functions

RMM_EXEC_CHECK_DISABLE device_scalar (device_scalar &&) noexcept=default
 Default move constructor.
 
device_scalar & operator= (device_scalar &&) noexcept=default
 Default move assignment operator.
 
 device_scalar (device_scalar const &)=delete
 Copy constructor is deleted as it doesn't allow a stream argument.
 
device_scalar & operator= (device_scalar const &)=delete
 Copy assignment is deleted as it doesn't allow a stream argument.
 
 device_scalar ()=delete
 Default constructor is deleted as it doesn't allow a stream argument.
 
 device_scalar (cuda_stream_view stream, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Construct a new uninitialized device_scalar.
 
 device_scalar (value_type const &initial_value, cuda_stream_view stream, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Construct a new device_scalar with an initial value.
 
 device_scalar (device_scalar const &other, cuda_stream_view stream, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Construct a new device_scalar by deep copying the contents of another device_scalar, using the specified stream and memory resource.
 
value_type value (cuda_stream_view stream) const
 Copies the value from device to host, synchronizes, and returns the value.
 
void set_value_async (value_type const &value, cuda_stream_view stream)
 Sets the value of the device_scalar to value.
 
void set_value_async (value_type &&, cuda_stream_view)=delete
 
void set_value_to_zero_async (cuda_stream_view stream)
 Sets the value of the device_scalar to zero on the specified stream.
 
pointer data () noexcept
 Returns pointer to object in device memory.
 
const_pointer data () const noexcept
 Returns const pointer to object in device memory.
 
cuda_stream_view stream () const noexcept
 Stream associated with the device memory allocation.
 
void set_stream (cuda_stream_view stream) noexcept
 Sets the stream to be used for deallocation.
 

Detailed Description

template<typename T>
class rmm::device_scalar< T >

Container for a single object of type T in device memory.

T must be trivially copyable.

Template Parameters
T  The object's type
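
The trivially-copyable requirement matters because the value travels between host and device as raw bytes. The following is a minimal host-only sketch of checking a candidate type with the standard `std::is_trivially_copyable` trait. The `Point` and `Label` types and the `usable_as_scalar` helper are hypothetical names used for illustration; they are not part of RMM.

```cpp
#include <string>
#include <type_traits>

// Hypothetical plain-data type: every member is trivially copyable, so the
// whole object can be copied byte for byte.
struct Point {
  float x;
  float y;
};

// Hypothetical type that owns heap memory through std::string; a bytewise
// copy would not duplicate the owned buffer.
struct Label {
  std::string text;
};

// Hypothetical helper that mirrors the documented requirement on T.
template <typename T>
constexpr bool usable_as_scalar = std::is_trivially_copyable<T>::value;

static_assert(usable_as_scalar<int>, "fundamental types qualify");
static_assert(usable_as_scalar<Point>, "plain aggregates qualify");
static_assert(!usable_as_scalar<Label>, "types with non-trivial copies do not");
```

A check like this can be placed next to the type definition so that a violation is reported at compile time rather than surfacing as corrupted data.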

Member Typedef Documentation

◆ const_pointer

template<typename T >
using rmm::device_scalar< T >::const_pointer = typename device_uvector<T>::const_pointer

The type of the pointer returned by data() const

Constructor & Destructor Documentation

◆ device_scalar() [1/3]

template<typename T >
rmm::device_scalar< T >::device_scalar ( cuda_stream_view  stream,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)
inline explicit

Construct a new uninitialized device_scalar.

Does not synchronize the stream.

Note
This device_scalar is only safe to access in kernels and copies on the specified CUDA stream, or on another stream only if a dependency is enforced (e.g. using cudaStreamWaitEvent()).
Exceptions
rmm::bad_alloc  if allocating the device memory fails.
Parameters
stream  Stream on which to perform asynchronous allocation.
mr  Optional, resource with which to allocate.

◆ device_scalar() [2/3]

template<typename T >
rmm::device_scalar< T >::device_scalar ( value_type const &  initial_value,
cuda_stream_view  stream,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)
inline explicit

Construct a new device_scalar with an initial value.

Does not synchronize the stream.

Note
This device_scalar is only safe to access in kernels and copies on the specified CUDA stream, or on another stream only if a dependency is enforced (e.g. using cudaStreamWaitEvent()).
Exceptions
rmm::bad_alloc  if allocating the device memory for initial_value fails.
rmm::cuda_error  if copying initial_value to device memory fails.
Parameters
initial_value  The initial value of the object in device memory.
stream  Stream on which to perform allocation and copy.
mr  Optional, resource with which to allocate.

◆ device_scalar() [3/3]

template<typename T >
rmm::device_scalar< T >::device_scalar ( device_scalar< T > const &  other,
cuda_stream_view  stream,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)
inline

Construct a new device_scalar by deep copying the contents of another device_scalar, using the specified stream and memory resource.

Exceptions
rmm::bad_alloc  if creating the new allocation fails.
rmm::cuda_error  if copying from other fails.
Parameters
other  The device_scalar whose contents will be copied
stream  The stream to use for the allocation and copy
mr  The resource to use for allocating the new device_scalar
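
The constructor set documented above follows one rule: any operation that allocates or copies must be told which stream to use, so the stream-less default and copy constructors are deleted while moves remain available. The sketch below models that rule on the host only. `fake_stream` and `scalar_like` are hypothetical stand-ins, not RMM types, and no device memory is involved.

```cpp
#include <type_traits>

// Hypothetical stand-in for a stream handle; not an RMM type.
struct fake_stream {
  int id;
};

// Hypothetical host-only class that mirrors the documented constructor set.
class scalar_like {
 public:
  scalar_like() = delete;                               // no stream to allocate on
  scalar_like(scalar_like const&) = delete;             // no stream to copy on
  scalar_like& operator=(scalar_like const&) = delete;  // no stream to copy on
  scalar_like(scalar_like&&) noexcept = default;
  scalar_like& operator=(scalar_like&&) noexcept = default;

  // Mirrors overload [2/3]: an initial value plus the stream to work on.
  constexpr scalar_like(int initial_value, fake_stream stream)
    : value_{initial_value}, stream_{stream} {}

  // Mirrors overload [3/3]: a deep copy that names its stream explicitly.
  constexpr scalar_like(scalar_like const& other, fake_stream stream)
    : value_{other.value_}, stream_{stream} {}

  constexpr int value() const { return value_; }
  constexpr fake_stream stream() const { return stream_; }

 private:
  int value_;
  fake_stream stream_;
};

static_assert(!std::is_default_constructible<scalar_like>::value, "needs a stream");
static_assert(!std::is_copy_constructible<scalar_like>::value, "plain copy is deleted");
static_assert(std::is_nothrow_move_constructible<scalar_like>::value, "moves are allowed");
static_assert(std::is_constructible<scalar_like, scalar_like const&, fake_stream>::value,
              "copying with an explicit stream is allowed");
```

With this shape, a copy always states the stream it runs on, and the new object records that stream instead of inheriting the source's.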

Member Function Documentation

◆ data() [1/2]

template<typename T >
const_pointer rmm::device_scalar< T >::data ( ) const
inline noexcept

Returns const pointer to object in device memory.

Note
If the returned device pointer is used on a CUDA stream different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()), otherwise there may be a race condition.
Returns
Const pointer to underlying device memory

◆ data() [2/2]

template<typename T >
pointer rmm::device_scalar< T >::data ( )
inline noexcept

Returns pointer to object in device memory.

Note
If the returned device pointer is used on a CUDA stream different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()), otherwise there may be a race condition.
Returns
Pointer to underlying device memory

◆ operator=()

template<typename T >
device_scalar& rmm::device_scalar< T >::operator= ( device_scalar< T > &&  )
default noexcept

Default move assignment operator.

Returns
device_scalar& A reference to the assigned-to object

◆ set_stream()

template<typename T >
void rmm::device_scalar< T >::set_stream ( cuda_stream_view  stream)
inline noexcept

Sets the stream to be used for deallocation.

Parameters
stream  Stream to be used for deallocation

◆ set_value_async()

template<typename T >
void rmm::device_scalar< T >::set_value_async ( value_type const &  value,
cuda_stream_view  stream 
)
inline

Sets the value of the device_scalar to value.

For fundamental types, this function is optimized to use cudaMemsetAsync when value is zero.
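
The zero-value optimization can be pictured as a small dispatch decision made before any transfer is issued. The sketch below is host-only and models just that choice. It is not RMM's implementation, and `transfer_kind`, `choose_transfer`, and `pair_like` are hypothetical names.

```cpp
#include <type_traits>

// Hypothetical labels for the two transfer paths described above.
enum class transfer_kind { memset_zero, memcpy_value };

// Hypothetical non-fundamental type used to exercise the fallback path.
struct pair_like {
  int a;
  int b;
};

// Non-fundamental T: always a byte copy (no operator== is assumed to exist).
template <typename T>
constexpr transfer_kind choose_transfer(T const&, std::false_type) {
  return transfer_kind::memcpy_value;
}

// Fundamental T: a zero value can be written with a memset instead of a copy.
template <typename T>
constexpr transfer_kind choose_transfer(T const& value, std::true_type) {
  return value == T{0} ? transfer_kind::memset_zero : transfer_kind::memcpy_value;
}

// Entry point: selects one of the overloads above from std::is_fundamental.
template <typename T>
constexpr transfer_kind choose_transfer(T const& value) {
  return choose_transfer(value, std::is_fundamental<T>{});
}

static_assert(choose_transfer(0) == transfer_kind::memset_zero, "zero int");
static_assert(choose_transfer(42) == transfer_kind::memcpy_value, "non-zero int");
static_assert(choose_transfer(pair_like{0, 0}) == transfer_kind::memcpy_value,
              "non-fundamental types are always copied");
```

Tag dispatch keeps the comparison `value == T{0}` out of the non-fundamental path, so types without an equality operator still compile.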

Note
If the stream specified to this function is different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()) before and after calling this function, otherwise there may be a race condition.

This function does not synchronize stream before returning. Therefore, the object referenced by value should not be destroyed or modified until stream has been synchronized. Otherwise, behavior is undefined.

Note
This function incurs a host-to-device memcpy or a device memset and should be used carefully.

Example:

rmm::device_scalar<int> s{stream};

int v{42};
// Copies 42 to device storage on `stream`. Does _not_ synchronize
s.set_value_async(v, stream);
...
cudaStreamSynchronize(stream);
// Synchronization is required before `v` can be modified
v = 13;
Exceptions
rmm::cuda_error  if copying value to device memory fails.
Parameters
value  The host value which will be copied to device
stream  CUDA stream on which to perform the copy
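
The member list also declares `set_value_async (value_type &&, cuda_stream_view)=delete`. This appears intended to reject temporaries, which would be destroyed before an asynchronous copy could read them. A host-only sketch of the idiom follows. `fake_stream`, `scalar_like`, and `can_set` are hypothetical names, not RMM types, and no device work is performed.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a stream handle; not an RMM type.
struct fake_stream {};

// Hypothetical host-only class that mimics the overload pair shown above.
template <typename T>
class scalar_like {
 public:
  // Accepts lvalues: the caller keeps the object alive until the copy is done.
  void set_value_async(T const& value, fake_stream) { pending_ = &value; }

  // Rejects rvalues: a temporary would be gone before an asynchronous copy ran.
  void set_value_async(T&&, fake_stream) = delete;

  // Address that a deferred copy would read from.
  T const* pending() const { return pending_; }

 private:
  T const* pending_{nullptr};
};

// Detects at compile time whether set_value_async(Arg, fake_stream) is callable.
template <typename Arg, typename = void>
struct can_set : std::false_type {};

template <typename Arg>
struct can_set<Arg,
               decltype(std::declval<scalar_like<int>&>().set_value_async(
                 std::declval<Arg>(), fake_stream{}))> : std::true_type {};

static_assert(can_set<int&>::value, "lvalues are accepted");
static_assert(can_set<int const&>::value, "const lvalues are accepted");
static_assert(!can_set<int>::value, "temporaries are rejected");
```

Because overload resolution prefers the `T&&` overload for rvalues, a call such as `s.set_value_async(42, stream)` fails to compile instead of silently binding the literal to the `const&` parameter.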

◆ set_value_to_zero_async()

template<typename T >
void rmm::device_scalar< T >::set_value_to_zero_async ( cuda_stream_view  stream)
inline

Sets the value of the device_scalar to zero on the specified stream.

Note
If the stream specified to this function is different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()) before and after calling this function, otherwise there may be a race condition.

This function does not synchronize stream before returning.

Note
This function incurs a device memset and should be used carefully.
Parameters
stream  CUDA stream on which to perform the memset

◆ stream()

template<typename T >
cuda_stream_view rmm::device_scalar< T >::stream ( ) const
inline noexcept

Stream associated with the device memory allocation.

Returns
Stream associated with the device memory allocation

◆ value()

template<typename T >
value_type rmm::device_scalar< T >::value ( cuda_stream_view  stream) const
inline

Copies the value from device to host, synchronizes, and returns the value.

Synchronizes stream after copying the data from device to host.

Note
If the stream specified to this function is different from the stream specified to the constructor, then an appropriate dependency must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()) before calling this function, otherwise there may be a race condition.
Exceptions
rmm::cuda_error  if the copy fails.
rmm::cuda_error  if synchronizing stream fails.
Returns
T  The value of the scalar.
Parameters
stream  CUDA stream on which to perform the copy and synchronize.
