bnmf-algs
|
Namespace for general purpose utility functions to be used by all bnmf_algs functions. More...
Classes | |
class | ComputationIterator |
A template iterator that generates values via computation using the previous value. More... | |
class | Generator |
A template generator that generates values from an initial value by applying a computation to the previous value a given number of times. More... | |
Typedefs | |
using | gsl_rng_wrapper = std::unique_ptr< gsl_rng, decltype(&gsl_rng_free)> |
RAII wrapper around gsl_rng types. More... | |
Enumerations | |
enum | NormType { NormType::L1, NormType::L2, NormType::Inf } |
Type of the norm to be used by bnmf_algs::util::normalize and bnmf_algs::util::normalized. More... | |
Functions | |
template<typename RandomIterator > | |
RandomIterator | choice (RandomIterator cum_prob_begin, RandomIterator cum_prob_end, const bnmf_algs::util::gsl_rng_wrapper &gsl_rng) |
Choose a random sample according to the given cumulative probability distribution. More... | |
template<typename Real , typename Integer > | |
void | multinomial_mode (Integer num_trials, const vector_t< Real > &prob, vector_t< Integer > &count, double eps=1e-50) |
Find the mode of a multinomial distribution using Finucan's algorithm. More... | |
template<typename T > | |
matrix_t< T > | dirichlet_mat (const matrix_t< T > &gamma_shape_mat, size_t norm_axis) |
Construct a matrix containing Dirichlet distribution samples on its rows or columns. More... | |
template<typename T > | |
util::Generator< std::pair< int, int >, details::SampleOnesNoReplaceComputer< T > > | sample_ones_noreplace (const matrix_t< T > &X) |
Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix without replacement by using details::SampleOnesNoReplaceComputer as its Computer. More... | |
template<typename T > | |
util::Generator< std::pair< int, int >, details::SampleOnesReplaceComputer< T > > | sample_ones_replace (const matrix_t< T > &X, size_t num_samples) |
Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix with replacement by using details::SampleOnesReplaceComputer as its Computer. More... | |
template<typename Function , typename Tuple , size_t... I> | |
auto | call (Function f, Tuple t, std::index_sequence< I... > seq) |
Forward the selected elements of a tuple to a function and return the result. More... | |
template<typename Function , typename Tuple > | |
auto | call (Function f, Tuple t) |
Forward the elements of a tuple to a function and return the result. More... | |
template<typename T > | |
double | sparseness (const tensor_t< T, 3 > &S) |
Compute the sparseness of the given 3D tensor. More... | |
template<typename Scalar , int N> | |
void | normalize (tensor_t< Scalar, N > &input, size_t axis, NormType type=NormType::L1, double eps=1e-50) |
Normalize a tensor in-place along the given axis. More... | |
template<typename Scalar , int N> | |
tensor_t< Scalar, N > | normalized (const tensor_t< Scalar, N > &input, size_t axis, NormType type=NormType::L1, double eps=1e-50) |
Return a normalized copy of a tensor along the given axis. More... | |
template<typename Real > | |
Real | psi_appr (Real x) |
Calculate digamma function approximately using the first 8 terms of the asymptotic expansion of digamma function. More... | |
template<typename Integer > | |
std::vector< std::pair< size_t, size_t > > | partition_change_indices (Integer n, Integer k) |
Compute the sequence of indices to be incremented and decremented to generate all partitions of number n into k bins. More... | |
template<typename Integer > | |
std::vector< std::vector< Integer > > | all_partitions (Integer n, Integer k) |
Compute all possible k length partitions of given number n. More... | |
Namespace for general purpose utility functions to be used by all bnmf_algs functions.
using bnmf_algs::util::gsl_rng_wrapper = typedef std::unique_ptr<gsl_rng, decltype(&gsl_rng_free)> |
RAII wrapper around gsl_rng types.
|
strong |
Type of the norm to be used by bnmf_algs::util::normalize and bnmf_algs::util::normalized.
std::vector<std::vector<Integer> > bnmf_algs::util::all_partitions | ( | Integer | n, |
Integer | k | ||
) |
Compute all possible k length partitions of given number n.
This function computes all possible k length partitions of number n. There are exactly {N + k - 1}{k - 1} many k length partitions of number n.
The returned partition sequence has a special property: All consecutive partitions differ exactly by 1, i.e. only one of the bins are decremented by 1 and only one of the bins are incremented by 1. All the other bins remain unchanged.
When called with \(n = 2, k = 3\), this function returns:
\[ (2, 0, 0) (1, 1, 0) (0, 2, 0) (0, 1, 1) (1, 0, 1) (0, 0, 2) \]
in the exacty same order.
Integer | Integer type to use. |
n | Number to divide into k partitions. |
k | Number of partitions. |
assertion | error if k is less than or equal to 0, if n is less than or equal to 0. |
auto bnmf_algs::util::call | ( | Function | f, |
Tuple | t, | ||
std::index_sequence< I... > | seq | ||
) |
Forward the selected elements of a tuple to a function and return the result.
This function takes a function pointer and a (std::tuple, std::array, std::pair) and invokes the function with the chosen elements of the tuple using a std::index_sequence.
Function | Function pointer type. |
Tuple | std::tuple, std::array, std::pair and so on such that std::get<I>(Tuple) gets the Ith element of the pack. |
I | Scalars that can be used when indexing (integer, short, etc.) |
f | Function pointer to invoke. |
t | Tuple variant that contains the parameters to the function. |
seq | std::index_sequence Indices of tuple elements to forward to the function. |
auto bnmf_algs::util::call | ( | Function | f, |
Tuple | t | ||
) |
Forward the elements of a tuple to a function and return the result.
This function takes a function pointer and a (std::tuple, std::array, std::pair) and invokes the function with the elements of the tuple. Parameter list of the function and the template types of tuple elements must be the same; otherwise, a compiler error would be issued.
Function | Function pointer type. |
Tuple | std::tuple, std::array, std::pair and so on such that std::get<I>(Tuple) gets the Ith element of the pack. |
f | Function pointer to invoke. |
t | Tuple variant that contains the parameters to the function. |
RandomIterator bnmf_algs::util::choice | ( | RandomIterator | cum_prob_begin, |
RandomIterator | cum_prob_end, | ||
const bnmf_algs::util::gsl_rng_wrapper & | gsl_rng | ||
) |
Choose a random sample according to the given cumulative probability distribution.
This function chooses a sample (index) from the given cumulative probability distribution. Given distribution does not have to be normalized and it is not normalized inside this function. A sample is drawn uniformly from the range [0, max_cum_prob] and then the corresponding index is found using binary search.
RandomIterator | An iterator that satisfies RandomIterator interface. This function runs in O(logn) time with RandomIterators due to the efficiency in finding difference of RandomIterator types. |
cum_prob_begin | Beginning of the cumulative probability range. |
cum_prob_end | End of the cumulative probability range. |
gsl_rng | GSL random number generator object. |
matrix_t<T> bnmf_algs::util::dirichlet_mat | ( | const matrix_t< T > & | gamma_shape_mat, |
size_t | norm_axis | ||
) |
Construct a matrix containing Dirichlet distribution samples on its rows or columns.
This function constructs a Dirichlet distribution matrix using Gamma distribution shape parameters given as a parameter. This is done by sampling each entry of the result, \(D_{ij}\), from a Gamma distribution, \(\mathcal{G}(X_{ij}, 1)\) where \(X\) is the given Gamma shape matrix. Afterwards, rows or columns are normalized (sum to 1) according to the given normalization parameter.
T | Type of the input and output matrix entries. |
gamma_shape_mat | Matrix containing the \((i, j)^{th}\) shape parameter. |
norm_axis | Normalization axis. If 0, then columns of the matrix is normalized. Therefore, each column is a Dirichlet sample. If 1, then rows of the matrix is normalized. Therefore, each rows is a Dirichlet sample. |
void bnmf_algs::util::multinomial_mode | ( | Integer | num_trials, |
const vector_t< Real > & | prob, | ||
vector_t< Integer > & | count, | ||
double | eps = 1e-50 |
||
) |
Find the mode of a multinomial distribution using Finucan's algorithm.
This function computes the mode of a multinomial distribution, \(\mathcal{M}(N, p)\), where \(N\) is the number of trials and \(p\) is the probability distribution of events, i.e. \(p_i\) is the probability of \(i^{th}\) event happening. Mode of a multinomial distribution is defined as the optimal allocation of occurrence counts to events such that the likelihood of the sample is maximized. This operation is performed according to the procedure defined by Finucan [4] .
In particular, the procedure implemented in this function computes an initial allocation. Afterwards, if the number of allocated trials are different than the given number, it iteratively adds/removes trials one-by-one according to normalized differences using a heap based procedure.
Real | Floating point types used to define the probabilities. |
Integer | Integer types used to define event occurrence counts. |
num_trials | Number of trials to allocate to different events. Must be nonnegative. |
prob | Probability distribution of the underlying multinomial distribution. |
count | Output count vector that will store the occurrence count of each event. |
eps | Epsilon value used to prevent division by zero errors. |
void bnmf_algs::util::normalize | ( | tensor_t< Scalar, N > & | input, |
size_t | axis, | ||
NormType | type = NormType::L1 , |
||
double | eps = 1e-50 |
||
) |
Normalize a tensor in-place along the given axis.
Scalar | Type of the tensor entries |
N | Dimension of the tensor. Must be known at compile time. |
input | Input tensor to normalize in-place. |
axis | Axis along which the tensor will be normalized. |
type | Normalization type. See bnmf_algs::util::NormType for details. |
eps | Floating point epsilon value to prevent division by 0 errors. |
tensor_t<Scalar, N> bnmf_algs::util::normalized | ( | const tensor_t< Scalar, N > & | input, |
size_t | axis, | ||
NormType | type = NormType::L1 , |
||
double | eps = 1e-50 |
||
) |
Return a normalized copy of a tensor along the given axis.
Scalar | Type of the tensor entries |
N | Dimension of the tensor. Must be known at compile time. |
input | Input tensor to normalize. |
axis | Axis along which the tensor will be normalized. |
type | Normalization type. See bnmf_algs::util::NormType for details. |
eps | Floating point epsilon value to prevent division by 0 errors. |
std::vector<std::pair<size_t, size_t> > bnmf_algs::util::partition_change_indices | ( | Integer | n, |
Integer | k | ||
) |
Compute the sequence of indices to be incremented and decremented to generate all partitions of number n into k bins.
This function computes an index pair for each k length partition of number n. The pair holds the indices of the value to decrement by 1 and increment by 1, respectively. Starting from the initial partition, \((n, 0, \dots, 0)\), all k length partitions of number n can be generated by iteratively applying the decrement and increment operators on the returned indices. For example, this fuction returns the following index sequence when called with \(n = 2, k = 3\).
\[ (0, 1) (0, 1) (1, 2) (1, 0) (0, 2) \]
Starting from initial partition, \((2, 0, 0)\), if we decrement the value at the 0th index and increment the value at 1st index, we generate all partitions:
\[ (2, 0, 0) (1, 1, 0) (0, 2, 0) (0, 1, 1) (1, 0, 1) (0, 0, 2) \]
Integer | Integer type to use. |
n | Number to divide into k partitions. |
k | Number of partitions. |
assertion | error if k is less than or equal to 0, if n is less than or equal to 0. |
Real bnmf_algs::util::psi_appr | ( | Real | x | ) |
Calculate digamma function approximately using the first 8 terms of the asymptotic expansion of digamma function.
Device function to return psi_appr of a real number.
Digamma function is defined as
\[ \psi(x) = \frac{\Gamma'(x)}{\Gamma(x)} \]
This function computes an approximation for \(\psi(x)\) using the below formula:
\[ \psi(x) \approx \ln(x) - \frac{1}{2x} - \frac{1}{12x^2} + \frac{1}{120x^4} - \frac{1}{252x^6} + \frac{1}{240x^8} - \frac{5}{660x^{10}} + \frac{691}{32760x^{12}} - \frac{1}{12x^{14}} \]
This approximation is more accurate for larger values of \(x\). When computing \(\psi(x)\) for \(x < 6\), the below recurrence relation is used to shift the \(x\) value to use in the approximation formula to a value greater than \(6\):
\[ \psi(x + 1) = \frac{1}{x} + \psi(x) \]
Real | A real scalar value such as double and float. |
x | Parameter to \(\psi(x)\). |
util::Generator<std::pair<int, int>, details::SampleOnesNoReplaceComputer<T> > bnmf_algs::util::sample_ones_noreplace | ( | const matrix_t< T > & | X | ) |
Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix without replacement by using details::SampleOnesNoReplaceComputer as its Computer.
This function samples the given matrix X one by one until all the entries of the matrix are sampled. The returned util::Generator object will return the indices of the next sample one at a time. Since the return value is a generator, only a single sample is stored in the memory at a given time. To learn the details of how the samples are generated, refer to details::SampleOnesNoReplaceComputer documentation.
X | Matrix to sample one by one without replacement. |
std::invalid_argument | if matrix \(X\) has no element, if matrix \(X\) contains negative entries. |
util::Generator<std::pair<int, int>, details::SampleOnesReplaceComputer<T> > bnmf_algs::util::sample_ones_replace | ( | const matrix_t< T > & | X, |
size_t | num_samples | ||
) |
Return a bnmf_algs::util::Generator that will generate a sequence of samples from the given matrix with replacement by using details::SampleOnesReplaceComputer as its Computer.
This function considers the given matrix X as a probability mass function and samples random indices according to the magnitude of the entries. Entries with higher values are more likely to be sampled more often. The returned util::Generator object will return the indices of the next sample one at a time. Since the return value is a generator, only a single sample is stored in the memory at a given time.
X | Matrix to sample one by one with replacement. |
num_samples | Number of samples to generate. |
std::invalid_argument | if matrix \(X\) has no element, if matrix \(X\) contains negative entries. |
double bnmf_algs::util::sparseness | ( | const tensor_t< T, 3 > & | S | ) |
Compute the sparseness of the given 3D tensor.
Sparseness of a 3D tensor \(S_{x \times y \times z}\) is defined as
\[ \frac{\sqrt{xyz} - S_{+++}/\|S\|_F}{\sqrt{xyz} - 1} \]
where \(\|S\|_F = \sqrt{\sum_{ijk}(S_{ijk})^2}\) is the Frobenius norm of tensor \(S\) and \(S_{+++} = \sum_{ijk}S_{ijk}\) is the sum of elements of \(S\).
T | Type of the entries of the tensor parameter. |
S | 3D tensor \(S\). |