bnmf-algs
Classes | Typedefs | Enumerations | Functions
bnmf_algs::util Namespace Reference

Namespace for general purpose utility functions to be used by all bnmf_algs functions. More...

Classes

class  ComputationIterator
 A template iterator that generates values via computation using the previous value. More...
 
class  Generator
 A template generator that generates values from an initial value by applying a computation to the previous value a given number of times. More...
 

Typedefs

using gsl_rng_wrapper = std::unique_ptr< gsl_rng, decltype(&gsl_rng_free)>
 RAII wrapper around gsl_rng types. More...
 

Enumerations

enum  NormType { NormType::L1, NormType::L2, NormType::Inf }
 Type of the norm to be used by bnmf_algs::util::normalize and bnmf_algs::util::normalized. More...
 

Functions

template<typename RandomIterator >
RandomIterator choice (RandomIterator cum_prob_begin, RandomIterator cum_prob_end, const bnmf_algs::util::gsl_rng_wrapper &gsl_rng)
 Choose a random sample according to the given cumulative probability distribution. More...
 
template<typename Real , typename Integer >
void multinomial_mode (Integer num_trials, const vector_t< Real > &prob, vector_t< Integer > &count, double eps=1e-50)
 Find the mode of a multinomial distribution using Finucan's algorithm. More...
 
template<typename T >
matrix_t< T > dirichlet_mat (const matrix_t< T > &gamma_shape_mat, size_t norm_axis)
 Construct a matrix containing Dirichlet distribution samples on its rows or columns. More...
 
template<typename T >
util::Generator< std::pair< int, int >, details::SampleOnesNoReplaceComputer< T > > sample_ones_noreplace (const matrix_t< T > &X)
 Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix without replacement by using details::SampleOnesNoReplaceComputer as its Computer. More...
 
template<typename T >
util::Generator< std::pair< int, int >, details::SampleOnesReplaceComputer< T > > sample_ones_replace (const matrix_t< T > &X, size_t num_samples)
 Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix with replacement by using details::SampleOnesReplaceComputer as its Computer. More...
 
template<typename Function , typename Tuple , size_t... I>
auto call (Function f, Tuple t, std::index_sequence< I... > seq)
 Forward the selected elements of a tuple to a function and return the result. More...
 
template<typename Function , typename Tuple >
auto call (Function f, Tuple t)
 Forward the elements of a tuple to a function and return the result. More...
 
template<typename T >
double sparseness (const tensor_t< T, 3 > &S)
 Compute the sparseness of the given 3D tensor. More...
 
template<typename Scalar , int N>
void normalize (tensor_t< Scalar, N > &input, size_t axis, NormType type=NormType::L1, double eps=1e-50)
 Normalize a tensor in-place along the given axis. More...
 
template<typename Scalar , int N>
tensor_t< Scalar, N > normalized (const tensor_t< Scalar, N > &input, size_t axis, NormType type=NormType::L1, double eps=1e-50)
 Return a normalized copy of a tensor along the given axis. More...
 
template<typename Real >
Real psi_appr (Real x)
 Calculate digamma function approximately using the first 8 terms of the asymptotic expansion of digamma function. More...
 
template<typename Integer >
std::vector< std::pair< size_t, size_t > > partition_change_indices (Integer n, Integer k)
 Compute the sequence of indices to be incremented and decremented to generate all partitions of number n into k bins. More...
 
template<typename Integer >
std::vector< std::vector< Integer > > all_partitions (Integer n, Integer k)
 Compute all possible k length partitions of given number n. More...
 

Detailed Description

Namespace for general purpose utility functions to be used by all bnmf_algs functions.

Typedef Documentation

using bnmf_algs::util::gsl_rng_wrapper = typedef std::unique_ptr<gsl_rng, decltype(&gsl_rng_free)>

RAII wrapper around gsl_rng types.

Enumeration Type Documentation

Type of the norm to be used by bnmf_algs::util::normalize and bnmf_algs::util::normalized.

Enumerator
L1 

Normalize using \(L_1\) norm \(\left(\|x\|_1 = \sum_i |x_i|\right)\)

L2 

Normalize using \(L_2\) norm \(\left(\|x\|_2 = \sqrt{\sum_i x_i^2}\right)\)

Inf 

Normalize using \(L_{\infty}\) norm \(\left(\|x\|_{\infty} = \max\{|x_1|, \dots, |x_n|\}\right)\)

Function Documentation

template<typename Integer >
std::vector<std::vector<Integer> > bnmf_algs::util::all_partitions ( Integer  n,
Integer  k 
)

Compute all possible k length partitions of given number n.

This function computes all possible k length partitions of number n. There are exactly {N + k - 1}{k - 1} many k length partitions of number n.

The returned partition sequence has a special property: All consecutive partitions differ exactly by 1, i.e. only one of the bins are decremented by 1 and only one of the bins are incremented by 1. All the other bins remain unchanged.

When called with \(n = 2, k = 3\), this function returns:

\[ (2, 0, 0) (1, 1, 0) (0, 2, 0) (0, 1, 1) (1, 0, 1) (0, 0, 2) \]

in the exacty same order.

Template Parameters
IntegerInteger type to use.
Parameters
nNumber to divide into k partitions.
kNumber of partitions.
Returns
std::vector containing all k length partitions as std::vector types. All the partition vectors are of length k.
Exceptions
assertionerror if k is less than or equal to 0, if n is less than or equal to 0.
template<typename Function , typename Tuple , size_t... I>
auto bnmf_algs::util::call ( Function  f,
Tuple  t,
std::index_sequence< I... >  seq 
)

Forward the selected elements of a tuple to a function and return the result.

This function takes a function pointer and a (std::tuple, std::array, std::pair) and invokes the function with the chosen elements of the tuple using a std::index_sequence.

Template Parameters
FunctionFunction pointer type.
Tuplestd::tuple, std::array, std::pair and so on such that std::get<I>(Tuple) gets the Ith element of the pack.
IScalars that can be used when indexing (integer, short, etc.)
Parameters
fFunction pointer to invoke.
tTuple variant that contains the parameters to the function.
seqstd::index_sequence Indices of tuple elements to forward to the function.
Returns
Return value of the function after being invoked with the given parameters.
See also
https://stackoverflow.com/questions/7858817/unpacking-a-tuple-to-call-a-matching-function-pointer
template<typename Function , typename Tuple >
auto bnmf_algs::util::call ( Function  f,
Tuple  t 
)

Forward the elements of a tuple to a function and return the result.

This function takes a function pointer and a (std::tuple, std::array, std::pair) and invokes the function with the elements of the tuple. Parameter list of the function and the template types of tuple elements must be the same; otherwise, a compiler error would be issued.

Template Parameters
FunctionFunction pointer type.
Tuplestd::tuple, std::array, std::pair and so on such that std::get<I>(Tuple) gets the Ith element of the pack.
Parameters
fFunction pointer to invoke.
tTuple variant that contains the parameters to the function.
Returns
Return value of the function after being invoked with the given parameters.
See also
https://stackoverflow.com/questions/7858817/unpacking-a-tuple-to-call-a-matching-function-pointer
template<typename RandomIterator >
RandomIterator bnmf_algs::util::choice ( RandomIterator  cum_prob_begin,
RandomIterator  cum_prob_end,
const bnmf_algs::util::gsl_rng_wrapper gsl_rng 
)

Choose a random sample according to the given cumulative probability distribution.

This function chooses a sample (index) from the given cumulative probability distribution. Given distribution does not have to be normalized and it is not normalized inside this function. A sample is drawn uniformly from the range [0, max_cum_prob] and then the corresponding index is found using binary search.

Template Parameters
RandomIteratorAn iterator that satisfies RandomIterator interface. This function runs in O(logn) time with RandomIterators due to the efficiency in finding difference of RandomIterator types.
Parameters
cum_prob_beginBeginning of the cumulative probability range.
cum_prob_endEnd of the cumulative probability range.
gsl_rngGSL random number generator object.
Returns
Iterator to the cumulative distribution value of the chosen value. If no value was chosen, returns cum_prob_end.
Remarks
Complexity is \(\Theta(logn)\).
If all the values in the given cumulative distribution is 0, this function always returns the given end iterator. This means that no values were chosen.
If the given cumulative distribution is empty (begin == end), then this function always returns end iterator.
template<typename T >
matrix_t<T> bnmf_algs::util::dirichlet_mat ( const matrix_t< T > &  gamma_shape_mat,
size_t  norm_axis 
)

Construct a matrix containing Dirichlet distribution samples on its rows or columns.

This function constructs a Dirichlet distribution matrix using Gamma distribution shape parameters given as a parameter. This is done by sampling each entry of the result, \(D_{ij}\), from a Gamma distribution, \(\mathcal{G}(X_{ij}, 1)\) where \(X\) is the given Gamma shape matrix. Afterwards, rows or columns are normalized (sum to 1) according to the given normalization parameter.

Template Parameters
TType of the input and output matrix entries.
Parameters
gamma_shape_matMatrix containing the \((i, j)^{th}\) shape parameter.
norm_axisNormalization axis. If 0, then columns of the matrix is normalized. Therefore, each column is a Dirichlet sample. If 1, then rows of the matrix is normalized. Therefore, each rows is a Dirichlet sample.
Returns
Matrix of the same shape containing Dirichlet samples on its rows or columns depending on the given normalization parameter.
Remarks
Throws assertion error if norm_axis is different than 0 or 1.
template<typename Real , typename Integer >
void bnmf_algs::util::multinomial_mode ( Integer  num_trials,
const vector_t< Real > &  prob,
vector_t< Integer > &  count,
double  eps = 1e-50 
)

Find the mode of a multinomial distribution using Finucan's algorithm.

This function computes the mode of a multinomial distribution, \(\mathcal{M}(N, p)\), where \(N\) is the number of trials and \(p\) is the probability distribution of events, i.e. \(p_i\) is the probability of \(i^{th}\) event happening. Mode of a multinomial distribution is defined as the optimal allocation of occurrence counts to events such that the likelihood of the sample is maximized. This operation is performed according to the procedure defined by Finucan [4] .

In particular, the procedure implemented in this function computes an initial allocation. Afterwards, if the number of allocated trials are different than the given number, it iteratively adds/removes trials one-by-one according to normalized differences using a heap based procedure.

Template Parameters
RealFloating point types used to define the probabilities.
IntegerInteger types used to define event occurrence counts.
Parameters
num_trialsNumber of trials to allocate to different events. Must be nonnegative.
probProbability distribution of the underlying multinomial distribution.
countOutput count vector that will store the occurrence count of each event.
epsEpsilon value used to prevent division by zero errors.
Remarks
Throws assertion error if prob and count have different sizes, if num_trials is negative.
template<typename Scalar , int N>
void bnmf_algs::util::normalize ( tensor_t< Scalar, N > &  input,
size_t  axis,
NormType  type = NormType::L1,
double  eps = 1e-50 
)

Normalize a tensor in-place along the given axis.

Template Parameters
ScalarType of the tensor entries
NDimension of the tensor. Must be known at compile time.
Parameters
inputInput tensor to normalize in-place.
axisAxis along which the tensor will be normalized.
typeNormalization type. See bnmf_algs::util::NormType for details.
epsFloating point epsilon value to prevent division by 0 errors.
template<typename Scalar , int N>
tensor_t<Scalar, N> bnmf_algs::util::normalized ( const tensor_t< Scalar, N > &  input,
size_t  axis,
NormType  type = NormType::L1,
double  eps = 1e-50 
)

Return a normalized copy of a tensor along the given axis.

Template Parameters
ScalarType of the tensor entries
NDimension of the tensor. Must be known at compile time.
Parameters
inputInput tensor to normalize.
axisAxis along which the tensor will be normalized.
typeNormalization type. See bnmf_algs::util::NormType for details.
epsFloating point epsilon value to prevent division by 0 errors.
Returns
Normalized copy of the input tensor along the given axis
template<typename Integer >
std::vector<std::pair<size_t, size_t> > bnmf_algs::util::partition_change_indices ( Integer  n,
Integer  k 
)

Compute the sequence of indices to be incremented and decremented to generate all partitions of number n into k bins.

This function computes an index pair for each k length partition of number n. The pair holds the indices of the value to decrement by 1 and increment by 1, respectively. Starting from the initial partition, \((n, 0, \dots, 0)\), all k length partitions of number n can be generated by iteratively applying the decrement and increment operators on the returned indices. For example, this fuction returns the following index sequence when called with \(n = 2, k = 3\).

\[ (0, 1) (0, 1) (1, 2) (1, 0) (0, 2) \]

Starting from initial partition, \((2, 0, 0)\), if we decrement the value at the 0th index and increment the value at 1st index, we generate all partitions:

\[ (2, 0, 0) (1, 1, 0) (0, 2, 0) (0, 1, 1) (1, 0, 1) (0, 0, 2) \]

Template Parameters
IntegerInteger type to use.
Parameters
nNumber to divide into k partitions.
kNumber of partitions.
Returns
std::vector containing indices to decrement and increment as std::pair types. std::pair::first contains the index of element to decrement, and std::pair::second contains the index of element to increment.
Exceptions
assertionerror if k is less than or equal to 0, if n is less than or equal to 0.
template<typename Real >
Real bnmf_algs::util::psi_appr ( Real  x)

Calculate digamma function approximately using the first 8 terms of the asymptotic expansion of digamma function.

Device function to return psi_appr of a real number.

Digamma function is defined as

\[ \psi(x) = \frac{\Gamma'(x)}{\Gamma(x)} \]

This function computes an approximation for \(\psi(x)\) using the below formula:

\[ \psi(x) \approx \ln(x) - \frac{1}{2x} - \frac{1}{12x^2} + \frac{1}{120x^4} - \frac{1}{252x^6} + \frac{1}{240x^8} - \frac{5}{660x^{10}} + \frac{691}{32760x^{12}} - \frac{1}{12x^{14}} \]

This approximation is more accurate for larger values of \(x\). When computing \(\psi(x)\) for \(x < 6\), the below recurrence relation is used to shift the \(x\) value to use in the approximation formula to a value greater than \(6\):

\[ \psi(x + 1) = \frac{1}{x} + \psi(x) \]

Template Parameters
RealA real scalar value such as double and float.
Parameters
xParameter to \(\psi(x)\).
Returns
\(\psi(x)\).
See also
Appendix C.1 of [1] for a discussion of this method.
template<typename T >
util::Generator<std::pair<int, int>, details::SampleOnesNoReplaceComputer<T> > bnmf_algs::util::sample_ones_noreplace ( const matrix_t< T > &  X)

Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix without replacement by using details::SampleOnesNoReplaceComputer as its Computer.

This function samples the given matrix X one by one until all the entries of the matrix are sampled. The returned util::Generator object will return the indices of the next sample one at a time. Since the return value is a generator, only a single sample is stored in the memory at a given time. To learn the details of how the samples are generated, refer to details::SampleOnesNoReplaceComputer documentation.

Parameters
XMatrix to sample one by one without replacement.
Returns
A bnmf_algs::util::Generator type that can be used to generate a sequence of samples from the given matrix.
Exceptions
std::invalid_argumentif matrix \(X\) has no element, if matrix \(X\) contains negative entries.
template<typename T >
util::Generator<std::pair<int, int>, details::SampleOnesReplaceComputer<T> > bnmf_algs::util::sample_ones_replace ( const matrix_t< T > &  X,
size_t  num_samples 
)

Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix with replacement by using details::SampleOnesReplaceComputer as its Computer.

This function considers the given matrix X as a probability mass function and samples random indices according to the magnitude of the entries. Entries with higher values are more likely to be sampled more often. The returned util::Generator object will return the indices of the next sample one at a time. Since the return value is a generator, only a single sample is stored in the memory at a given time.

Parameters
XMatrix to sample one by one with replacement.
num_samplesNumber of samples to generate.
Returns
A bnmf_algs::util::Generator type that can be used to generate a sequence of samples from the given matrix.
Exceptions
std::invalid_argumentif matrix \(X\) has no element, if matrix \(X\) contains negative entries.
template<typename T >
double bnmf_algs::util::sparseness ( const tensor_t< T, 3 > &  S)

Compute the sparseness of the given 3D tensor.

Sparseness of a 3D tensor \(S_{x \times y \times z}\) is defined as

\[ \frac{\sqrt{xyz} - S_{+++}/\|S\|_F}{\sqrt{xyz} - 1} \]

where \(\|S\|_F = \sqrt{\sum_{ijk}(S_{ijk})^2}\) is the Frobenius norm of tensor \(S\) and \(S_{+++} = \sum_{ijk}S_{ijk}\) is the sum of elements of \(S\).

Template Parameters
TType of the entries of the tensor parameter.
Parameters
S3D tensor \(S\).
Returns
Sparseness of \(S\).
Remarks
If all elements of \(S\) are 0, then the function returns std::numeric_limits<double>::max().