Namespace for general purpose utility functions to be used by all bnmf_algs functions. More...

Classes
class	ComputationIterator
	A template iterator that generates values via computation using the previous value. More...

class	Generator
	A template generator that generates values from an initial value by applying a computation to the previous value a given number of times. More...

Typedefs
using	gsl_rng_wrapper = std::unique_ptr< gsl_rng, decltype(&gsl_rng_free)>
	RAII wrapper around gsl_rng types. More...

Enumerations
enum	NormType { NormType::L1, NormType::L2, NormType::Inf }
	Type of the norm to be used by bnmf_algs::util::normalize and bnmf_algs::util::normalized. More...

Functions
template<typename RandomIterator >
RandomIterator	choice (RandomIterator cum_prob_begin, RandomIterator cum_prob_end, const bnmf_algs::util::gsl_rng_wrapper &gsl_rng)
	Choose a random sample according to the given cumulative probability distribution. More...

template<typename Real , typename Integer >
void	multinomial_mode (Integer num_trials, const vector_t< Real > &prob, vector_t< Integer > &count, double eps=1e-50)
	Find the mode of a multinomial distribution using Finucan's algorithm. More...

template<typename T >
matrix_t< T >	dirichlet_mat (const matrix_t< T > &gamma_shape_mat, size_t norm_axis)
	Construct a matrix containing Dirichlet distribution samples on its rows or columns. More...

template<typename T >
util::Generator< std::pair< int, int >, details::SampleOnesNoReplaceComputer< T > >	sample_ones_noreplace (const matrix_t< T > &X)
	Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix without replacement by using details::SampleOnesNoReplaceComputer as its Computer. More...

template<typename T >
util::Generator< std::pair< int, int >, details::SampleOnesReplaceComputer< T > >	sample_ones_replace (const matrix_t< T > &X, size_t num_samples)
	Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix with replacement by using details::SampleOnesReplaceComputer as its Computer. More...

template<typename Function , typename Tuple , size_t... I>
auto	call (Function f, Tuple t, std::index_sequence< I... > seq)
	Forward the selected elements of a tuple to a function and return the result. More...

template<typename Function , typename Tuple >
auto	call (Function f, Tuple t)
	Forward the elements of a tuple to a function and return the result. More...

template<typename T >
double	sparseness (const tensor_t< T, 3 > &S)
	Compute the sparseness of the given 3D tensor. More...

template<typename Scalar , int N>
void	normalize (tensor_t< Scalar, N > &input, size_t axis, NormType type=NormType::L1, double eps=1e-50)
	Normalize a tensor in-place along the given axis. More...

template<typename Scalar , int N>
tensor_t< Scalar, N >	normalized (const tensor_t< Scalar, N > &input, size_t axis, NormType type=NormType::L1, double eps=1e-50)
	Return a normalized copy of a tensor along the given axis. More...

template<typename Real >
Real	psi_appr (Real x)
	Calculate digamma function approximately using the first 8 terms of the asymptotic expansion of digamma function. More...

template<typename Integer >
std::vector< std::pair< size_t, size_t > >	partition_change_indices (Integer n, Integer k)
	Compute the sequence of indices to be incremented and decremented to generate all partitions of number n into k bins. More...

template<typename Integer >
std::vector< std::vector< Integer > >	all_partitions (Integer n, Integer k)
	Compute all possible k length partitions of given number n. More...

Detailed Description

Namespace for general purpose utility functions to be used by all bnmf_algs functions.

Typedef Documentation

using bnmf_algs::util::gsl_rng_wrapper = typedef std::unique_ptr<gsl_rng, decltype(&gsl_rng_free)>

RAII wrapper around gsl_rng types.

Enumeration Type Documentation

enum bnmf_algs::util::NormType

strong

Type of the norm to be used by bnmf_algs::util::normalize and bnmf_algs::util::normalized.

Enumerator
L1	Normalize using \(L_1\) norm \(\left(\\|x\\|_1 = \sum_i \|x_i\|\right)\)
L2	Normalize using \(L_2\) norm \(\left(\\|x\\|_2 = \sqrt{\sum_i x_i^2}\right)\)
Inf	Normalize using \(L_{\infty}\) norm \(\left(\\|x\\|_{\infty} = \max\{\|x_1\|, \dots, \|x_n\|\}\right)\)

Function Documentation

template<typename Integer >

std::vector<std::vector<Integer> > bnmf_algs::util::all_partitions	(	Integer	n,
		Integer	k
	)

Compute all possible k length partitions of given number n.

This function computes all possible k length partitions of number n. There are exactly {N + k - 1}{k - 1} many k length partitions of number n.

The returned partition sequence has a special property: All consecutive partitions differ exactly by 1, i.e. only one of the bins are decremented by 1 and only one of the bins are incremented by 1. All the other bins remain unchanged.

When called with \(n = 2, k = 3\), this function returns:

\[ (2, 0, 0) (1, 1, 0) (0, 2, 0) (0, 1, 1) (1, 0, 1) (0, 0, 2) \]

in the exacty same order.

Template Parameters

Integer Integer type to use.

Parameters

n	Number to divide into k partitions.
k	Number of partitions.

Returns: std::vector containing all k length partitions as std::vector types. All the partition vectors are of length k.

Exceptions

assertion error if k is less than or equal to 0, if n is less than or equal to 0.

template<typename Function , typename Tuple , size_t... I>

auto bnmf_algs::util::call	(	Function	f,
		Tuple	t,
		std::index_sequence< I... >	seq
	)

Forward the selected elements of a tuple to a function and return the result.

This function takes a function pointer and a (std::tuple, std::array, std::pair) and invokes the function with the chosen elements of the tuple using a std::index_sequence.

Template Parameters

Function	Function pointer type.
Tuple	std::tuple, std::array, std::pair and so on such that std::get<I>(Tuple) gets the Ith element of the pack.
I	Scalars that can be used when indexing (integer, short, etc.)

Parameters

f	Function pointer to invoke.
t	Tuple variant that contains the parameters to the function.
seq	std::index_sequence Indices of tuple elements to forward to the function.

Returns: Return value of the function after being invoked with the given parameters.

See also: https://stackoverflow.com/questions/7858817/unpacking-a-tuple-to-call-a-matching-function-pointer

template<typename Function , typename Tuple >

auto bnmf_algs::util::call	(	Function	f,
		Tuple	t
	)

Forward the elements of a tuple to a function and return the result.

This function takes a function pointer and a (std::tuple, std::array, std::pair) and invokes the function with the elements of the tuple. Parameter list of the function and the template types of tuple elements must be the same; otherwise, a compiler error would be issued.

Template Parameters

Function	Function pointer type.
Tuple	std::tuple, std::array, std::pair and so on such that std::get<I>(Tuple) gets the Ith element of the pack.

Parameters

f	Function pointer to invoke.
t	Tuple variant that contains the parameters to the function.

Returns: Return value of the function after being invoked with the given parameters.

See also: https://stackoverflow.com/questions/7858817/unpacking-a-tuple-to-call-a-matching-function-pointer

template<typename RandomIterator >

RandomIterator bnmf_algs::util::choice	(	RandomIterator	cum_prob_begin,
		RandomIterator	cum_prob_end,
		const bnmf_algs::util::gsl_rng_wrapper &	gsl_rng
	)

Choose a random sample according to the given cumulative probability distribution.

This function chooses a sample (index) from the given cumulative probability distribution. Given distribution does not have to be normalized and it is not normalized inside this function. A sample is drawn uniformly from the range [0, max_cum_prob] and then the corresponding index is found using binary search.

Template Parameters

RandomIterator An iterator that satisfies RandomIterator interface. This function runs in O(logn) time with RandomIterators due to the efficiency in finding difference of RandomIterator types.

Parameters

cum_prob_begin	Beginning of the cumulative probability range.
cum_prob_end	End of the cumulative probability range.
gsl_rng	GSL random number generator object.

Returns: Iterator to the cumulative distribution value of the chosen value. If no value was chosen, returns cum_prob_end.

Remarks: Complexity is \(\Theta(logn)\).; If all the values in the given cumulative distribution is 0, this function always returns the given end iterator. This means that no values were chosen.; If the given cumulative distribution is empty (begin == end), then this function always returns end iterator.

template<typename T >

matrix_t<T> bnmf_algs::util::dirichlet_mat	(	const matrix_t< T > &	gamma_shape_mat,
		size_t	norm_axis
	)

Construct a matrix containing Dirichlet distribution samples on its rows or columns.

This function constructs a Dirichlet distribution matrix using Gamma distribution shape parameters given as a parameter. This is done by sampling each entry of the result, \(D_{ij}\), from a Gamma distribution, \(\mathcal{G}(X_{ij}, 1)\) where \(X\) is the given Gamma shape matrix. Afterwards, rows or columns are normalized (sum to 1) according to the given normalization parameter.

Template Parameters

T	Type of the input and output matrix entries.

Parameters

gamma_shape_mat	Matrix containing the \((i, j)^{th}\) shape parameter.
norm_axis	Normalization axis. If 0, then columns of the matrix is normalized. Therefore, each column is a Dirichlet sample. If 1, then rows of the matrix is normalized. Therefore, each rows is a Dirichlet sample.

Returns: Matrix of the same shape containing Dirichlet samples on its rows or columns depending on the given normalization parameter.

Remarks: Throws assertion error if norm_axis is different than 0 or 1.

template<typename Real , typename Integer >

void bnmf_algs::util::multinomial_mode	(	Integer	num_trials,
		const vector_t< Real > &	prob,
		vector_t< Integer > &	count,
		double	eps = `1e-50`
	)

Find the mode of a multinomial distribution using Finucan's algorithm.

This function computes the mode of a multinomial distribution, \(\mathcal{M}(N, p)\), where \(N\) is the number of trials and \(p\) is the probability distribution of events, i.e. \(p_i\) is the probability of \(i^{th}\) event happening. Mode of a multinomial distribution is defined as the optimal allocation of occurrence counts to events such that the likelihood of the sample is maximized. This operation is performed according to the procedure defined by Finucan [4] .

In particular, the procedure implemented in this function computes an initial allocation. Afterwards, if the number of allocated trials are different than the given number, it iteratively adds/removes trials one-by-one according to normalized differences using a heap based procedure.

Template Parameters

Real	Floating point types used to define the probabilities.
Integer	Integer types used to define event occurrence counts.

Parameters

num_trials	Number of trials to allocate to different events. Must be nonnegative.
prob	Probability distribution of the underlying multinomial distribution.
count	Output count vector that will store the occurrence count of each event.
eps	Epsilon value used to prevent division by zero errors.

Remarks: Throws assertion error if prob and count have different sizes, if num_trials is negative.

template<typename Scalar , int N>

void bnmf_algs::util::normalize	(	tensor_t< Scalar, N > &	input,
		size_t	axis,
		NormType	type = `NormType::L1`,
		double	eps = `1e-50`
	)

Normalize a tensor in-place along the given axis.

Template Parameters

Scalar	Type of the tensor entries
N	Dimension of the tensor. Must be known at compile time.

Parameters

input	Input tensor to normalize in-place.
axis	Axis along which the tensor will be normalized.
type	Normalization type. See bnmf_algs::util::NormType for details.
eps	Floating point epsilon value to prevent division by 0 errors.

template<typename Scalar , int N>

tensor_t<Scalar, N> bnmf_algs::util::normalized	(	const tensor_t< Scalar, N > &	input,
		size_t	axis,
		NormType	type = `NormType::L1`,
		double	eps = `1e-50`
	)

Return a normalized copy of a tensor along the given axis.

Template Parameters

Scalar	Type of the tensor entries
N	Dimension of the tensor. Must be known at compile time.

Parameters

input	Input tensor to normalize.
axis	Axis along which the tensor will be normalized.
type	Normalization type. See bnmf_algs::util::NormType for details.
eps	Floating point epsilon value to prevent division by 0 errors.

Returns: Normalized copy of the input tensor along the given axis

template<typename Integer >

std::vector<std::pair<size_t, size_t> > bnmf_algs::util::partition_change_indices	(	Integer	n,
		Integer	k
	)

Compute the sequence of indices to be incremented and decremented to generate all partitions of number n into k bins.

This function computes an index pair for each k length partition of number n. The pair holds the indices of the value to decrement by 1 and increment by 1, respectively. Starting from the initial partition, \((n, 0, \dots, 0)\), all k length partitions of number n can be generated by iteratively applying the decrement and increment operators on the returned indices. For example, this fuction returns the following index sequence when called with \(n = 2, k = 3\).

\[ (0, 1) (0, 1) (1, 2) (1, 0) (0, 2) \]

Starting from initial partition, \((2, 0, 0)\), if we decrement the value at the 0th index and increment the value at 1st index, we generate all partitions:

\[ (2, 0, 0) (1, 1, 0) (0, 2, 0) (0, 1, 1) (1, 0, 1) (0, 0, 2) \]

Template Parameters

Integer Integer type to use.

Parameters

n	Number to divide into k partitions.
k	Number of partitions.

Returns: std::vector containing indices to decrement and increment as std::pair types. std::pair::first contains the index of element to decrement, and std::pair::second contains the index of element to increment.

Exceptions

assertion error if k is less than or equal to 0, if n is less than or equal to 0.

template<typename Real >

Real bnmf_algs::util::psi_appr ( Real x )

Calculate digamma function approximately using the first 8 terms of the asymptotic expansion of digamma function.

Device function to return psi_appr of a real number.

Digamma function is defined as

\[ \psi(x) = \frac{\Gamma'(x)}{\Gamma(x)} \]

This function computes an approximation for \(\psi(x)\) using the below formula:

\[ \psi(x) \approx \ln(x) - \frac{1}{2x} - \frac{1}{12x^2} + \frac{1}{120x^4} - \frac{1}{252x^6} + \frac{1}{240x^8} - \frac{5}{660x^{10}} + \frac{691}{32760x^{12}} - \frac{1}{12x^{14}} \]

This approximation is more accurate for larger values of \(x\). When computing \(\psi(x)\) for \(x < 6\), the below recurrence relation is used to shift the \(x\) value to use in the approximation formula to a value greater than \(6\):

\[ \psi(x + 1) = \frac{1}{x} + \psi(x) \]

Template Parameters

Real	A real scalar value such as double and float.

Parameters

x	Parameter to \(\psi(x)\).

Returns: \(\psi(x)\).

See also: Appendix C.1 of [1] for a discussion of this method.

template<typename T >

util::Generator<std::pair<int, int>, details::SampleOnesNoReplaceComputer<T> > bnmf_algs::util::sample_ones_noreplace ( const matrix_t< T > & X )

Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix without replacement by using details::SampleOnesNoReplaceComputer as its Computer.

This function samples the given matrix X one by one until all the entries of the matrix are sampled. The returned util::Generator object will return the indices of the next sample one at a time. Since the return value is a generator, only a single sample is stored in the memory at a given time. To learn the details of how the samples are generated, refer to details::SampleOnesNoReplaceComputer documentation.

Parameters

X	Matrix to sample one by one without replacement.

Returns: A bnmf_algs::util::Generator type that can be used to generate a sequence of samples from the given matrix.

Exceptions

std::invalid_argument if matrix \(X\) has no element, if matrix \(X\) contains negative entries.

template<typename T >

util::Generator<std::pair<int, int>, details::SampleOnesReplaceComputer<T> > bnmf_algs::util::sample_ones_replace	(	const matrix_t< T > &	X,
		size_t	num_samples
	)

Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix with replacement by using details::SampleOnesReplaceComputer as its Computer.

This function considers the given matrix X as a probability mass function and samples random indices according to the magnitude of the entries. Entries with higher values are more likely to be sampled more often. The returned util::Generator object will return the indices of the next sample one at a time. Since the return value is a generator, only a single sample is stored in the memory at a given time.

Parameters

X	Matrix to sample one by one with replacement.
num_samples	Number of samples to generate.

Returns: A bnmf_algs::util::Generator type that can be used to generate a sequence of samples from the given matrix.

Exceptions

std::invalid_argument if matrix \(X\) has no element, if matrix \(X\) contains negative entries.

template<typename T >

double bnmf_algs::util::sparseness ( const tensor_t< T, 3 > & S )

Compute the sparseness of the given 3D tensor.

Sparseness of a 3D tensor \(S_{x \times y \times z}\) is defined as

\[ \frac{\sqrt{xyz} - S_{+++}/\|S\|_F}{\sqrt{xyz} - 1} \]

where \(\|S\|_F = \sqrt{\sum_{ijk}(S_{ijk})^2}\) is the Frobenius norm of tensor \(S\) and \(S_{+++} = \sum_{ijk}S_{ijk}\) is the sum of elements of \(S\).

Template Parameters

T	Type of the entries of the tensor parameter.

Parameters

S	3D tensor \(S\).

Returns: Sparseness of \(S\).

Remarks: If all elements of \(S\) are 0, then the function returns std::numeric_limits<double>::max().

Classes

Typedefs

Enumerations

Functions

Detailed Description

Typedef Documentation

Enumeration Type Documentation

Function Documentation