indptr: stands for index pointer and returns an array of row starts.In the diagram below, the first non-zero value occurs at row 0 column 5, hence 5 appears as the first value in the indices array, followed by 1 (row 1, column 1). indices: an array of column indices - starting from the first row (from left to right), we identify non-zero positions and return their indices in that row. data: the values of the non-zero values - these are the non-zero values that are stored within the sparse matrix.To efficiently represent a sparse matrix, CSR uses three numpy arrays to store some relevant information, including: compressed sparse column) is used for write-once-read-many tasks. Compressed Sparse Row (CSR)Įven though there are many types of sparse matrices in SciPy such as Dictionary of Keys (DOK) and List of List (LIL), I will only touch on Compressed Sparse Rows (CSR) because it is the most used and widely known format.ĬSR (and also CSC, a.k.a. of 7 runs, 10000 loops each) > 56Ĭlearly, the best performance in terms of both time and space is obtained when we store a sparse matrix with the sparse module. of 7 runs, 100 loops each) > 2880112 %timeit square(B_sparse) print(getsizeof(B_sparse)) > 187 µs ± 5.24 µs per loop (mean ± std. of 7 runs, 1 loop each) > 56 %timeit square(B_full) print(getsizeof(B_full)) > 2.18 ms ± 56.5 µs per loop (mean ± std. of 7 runs, 100 loops each) > 2880112 %timeit square(A_sparse) print(getsizeof(A_sparse)) > 409 ms ± 11.5 ms per loop (mean ± std. %timeit square(A_full) print(getsizeof(A_full)) > 6.91 ms ± 84.5 µs per loop (mean ± std. We then time these different matrices stored in these different forms and also how much memory they use. B_sparse = sparse.csc_matrix(B_full) # Create a square function to return the square of the matrix def square(A): return np.power(A, 2) B_full = np.diag(np.random.rand(600)) # Matrix 4: Store B_full as a sparse matrix. A_sparse = sparse.csc_matrix(A_full) # Matrix 3: Create a sparse matrix (stored as a full matrix). A_full = np.random.rand(600, 600) # Matrix 2: Store A_full as a sparse matrix (though it is dense). import numpy as np from scipy import sparse from sys import getsizeof # Matrix 1: Create a dense matrix (stored as a full matrix). using their row and column indices).īefore we dive into what CSR is, let’s compare the efficiency difference in using numpy arrays versus sparse matrices in both time and space complexity. The idea behind the implementation is simple: Instead of storing all values in a dense matrix, let’s just store the non-zero values in some format (e.g. In Python, sparse data structures are efficiently implemented in the scipy.sparse module, which are mostly based on Numpy arrays. How then do we represent these matrices? Introducing… SciPy’s Sparse Module We shall illustrate using an example below. In addition to space complexity, dense matrices exacerbate our runtimes as well. From a simple logic standpoint, it simply doesn’t make sense to store so many zeros!įrom a mathematical standpoint, if we have a 100,000 x 100,000 matrix, this will require us to have 100,000 x 100,000 x 8 = 80 GB worth of memory to store this matrix (since each double uses 8 bytes)! Time Complexity This is because a full array occupies a block of memory for each entry, so a n x m array requires n x m blocks of memory. When dealing with sparse matrices, storing them as a full matrix (from this point referred to as a dense matrix) is simply inefficient. To formalize these two constraints, they are known as time and space complexity (memory). Install the cross compiling dependencies.Mac’s Activity Monitor (Source by Author) This will download and build LLVM and CLANG into a local directory. If you name it "deps", you won't have to change any variables later.Ĭonfigure with the FreeMat dependencies script Make a build directory somewhere.Ĭreate a subdirectory for the unfound dependencies (LLVM and CLANG in Yum install ncurses-devel clang-devel rpm-buildĪt the moment, LLVM and CLANG do not install correctly in Fedora. Yum install ufsparse-devel fftw-devel llvm-devel pcre-devel gcc-c++ libffi-devel Yum install qt4 qt4-devel portaudio-devel atlas-devel atlas Here are some general instructions on building FreeMat from scratch on Freemat provides visualization, image manipulation, and plotting as well as parallel programming.(src: ) Freemat is an interpreted, matrix-oriented development environment for engineering and scientific applications, similar to the commercial package MATLAB.
0 Comments
Leave a Reply. |