bnmf-algs
bnmf-algs is a header-only (except for the benchmark and CUDA parts) C++14 library providing parallelized implementations of nonnegative matrix factorization (NMF) and allocation model algorithms [1]. bnmf-algs is built on top of Eigen3 and uses Eigen matrix and tensor types in its API and internal computations. The library can be built with OpenMP support to enable the parallel implementations of the algorithms. Additionally, bnmf-algs supports CUDA 8 and above for computing large matrix/tensor operations on an NVIDIA GPU for additional speed-up.
The allocation model was proposed by M. B. Kurutmaz, A. T. Cemgil, U. Şimşekli, and S. Yıldırım (see [1]). The model is closely related to NMF and admits similar inference algorithms. The problem of finding the optimal solution to an allocation model is called the Best Latent Decomposition (BLD) problem in [1]. This repository contains parallelized versions of BLD algorithms. All the theoretical ideas behind the implemented BLD solutions were obtained through personal communication with M. B. Kurutmaz and A. T. Cemgil.
All the implemented algorithms/functions/classes are extensively documented, tested with edge-case and correctness tests, and benchmarked. You can reach the test results on Travis and the documentation on readthedocs by following the links at the top of this page. Benchmark code is provided together with the library; to run the benchmarks, you first need to compile them. Instructions for doing so are given in the build section.
Below we introduce the namespaces and main algorithms provided by bnmf-algs.
The library is divided into sections using C++ namespaces. The main namespace of the library is bnmf_algs. The remaining large namespaces are as follows:
This namespace contains allocation-model-related parameter classes and functions. Currently, a class to store model parameters is provided, along with functions for sampling prior matrices and tensors and for calculating the log marginal value of a sample according to the allocation model.
This namespace contains the various BLD solver algorithms. Each algorithm has a similar signature: it takes the matrix to allocate into the model tensor, the model parameters, and various algorithm parameters such as the iteration count and epsilon values. For algorithmic descriptions, please refer to the bnmf-algs library documentation.
cuda namespace contains utility functions and classes that make it easier to use CUDA from C++ code.
nmf namespace contains a parallelized implementation of the nonnegative matrix factorization algorithm, implemented with respect to the beta divergence. The algorithm is parallelized via parallel Eigen matrix multiplications.
util namespace contains various utility functions and classes.
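Putting the namespaces together, a typical use of the library might look like the sketch below. The umbrella header name, the `matrix_t` typedef, and the exact `nmf::nmf` signature are assumptions made for illustration; consult the library documentation for the real API:

```cpp
#include <bnmf_algs.hpp> // assumed umbrella header; check the docs

int main() {
    using namespace bnmf_algs;

    // Hypothetical: construct a nonnegative data matrix X
    // (matrix_t is assumed to alias an Eigen matrix type).
    matrix_t X(100, 50);
    X.setRandom();
    X = X.cwiseAbs();

    // Hypothetical call: factorize X with rank r = 10 under the
    // beta divergence (beta = 1, i.e. KL), for at most 500 iterations.
    const size_t r = 10;
    auto factors = nmf::nmf(X, r, /*beta=*/1.0, /*max_iter=*/500);
}
```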
To use bnmf-algs in your project, follow the steps described in this section. You can check the example programs, compiled with and without CUDA support, to get an idea of how to use bnmf-algs in your project. Additionally, we describe the main build script used to build the benchmarks and tests.
Two example programs are provided, in the example and example_with_cuda directories. They provide the same functionality, i.e. read a matrix from a given file, factorize it, and write the resulting matrices/tensors to predefined files. The only difference is that one of them is built with CUDA and thus offers CUDA support, whereas the other uses only OpenMP for parallelization. An example CMakeLists.txt is provided in each directory to show the CMake configuration needed to use the bnmf-algs library in your project.
You can easily build both of the example projects by running the provided build script.
This is the file used to generate the makefile that compiles the main file in the example_with_cuda directory. A brief explanation of certain parts follows:
Building the library without CUDA support is much simpler, as you don't need to separately compile any part of the bnmf-algs library. Simply including it and linking against GSL is enough. An explanation of the CMakeLists.txt follows:
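As a rough sketch, a minimal CMakeLists.txt for a CUDA-free project might look like the following. The target name and the include path are hypothetical; adapt them to your project layout:

```cmake
cmake_minimum_required(VERSION 3.2)
project(my_bnmf_user CXX)

# bnmf-algs is a C++14 library.
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(my_app main.cpp)

# Hypothetical path to the bnmf-algs headers (header-only usage).
target_include_directories(my_app PRIVATE /path/to/bnmf-algs/include)

# GSL is required; link it together with its CBLAS companion.
find_package(GSL REQUIRED)
target_link_libraries(my_app GSL::gsl GSL::gslcblas)
```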
Benchmarks and tests can be built using the main build script provided in the build.sh file under the project root. To configure the build process, you need to edit the build.config file. This file contains the CMake flags to pass to the cmake executable. Each option is described in the comments above it. For example, to enable OpenMP support, set
Similarly, to enable CUDA, set
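The exact option names are documented in the comments inside build.config itself; the flag names below are hypothetical placeholders showing the general shape of such settings:

```
# Hypothetical build.config fragment; the real option names are
# documented in the comments inside build.config.
-DBUILD_OPENMP=ON   # enable OpenMP parallelization
-DBUILD_CUDA=ON     # enable CUDA support
```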
After configuring the build process, to build the tests run:
This command builds the tests in release mode. To run the tests after building, type
In order to build the benchmarks, the Celero library must be installed on your system. After installing Celero, you need to specify its include and link paths in the build.config file. Finally, you can build the benchmarks using the following command:
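The variable names used for the Celero paths are listed in build.config itself; the entries below are hypothetical placeholders illustrating the idea:

```
# Hypothetical build.config entries for Celero; the real variable
# names are documented in the comments inside build.config.
-DCELERO_INCLUDE_DIR=/path/to/celero/include
-DCELERO_LIBRARY_DIR=/path/to/celero/lib
```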
This will create a benchmark executable in build/benchmark.
To see the list of benchmarks, type
To run a specific benchmark with name X, type
To generate the doxygen documentation, run
in the project root directory. Then, you can view the documentation in HTML format by opening doc/build/html/index.html in your browser.
You can also view the documentation online. Online documentation is automatically built from the latest commit to the master branch.
You can clean the built tests, benchmarks and documentation by running
This command removes all CMake-related folders/files and all previously built targets.