by tremguy on 1/1/21, 6:11 AM with 19 comments
by Const-me on 1/1/21, 10:40 AM
Can be a single tree for the matrix where keys are tuples of two integers. Or one tree per row, where keys are single integers, column index.
Unlike CSR, inserting elements is Log(n). RAM overhead is larger than CSR but still reasonable, much smaller than hash maps or red-black trees would be. Similar to CSR, reading rows is mostly sequential RAM reads i.e. pretty fast.
by layoutIfNeeded on 1/1/21, 2:38 PM
No, this is not the mathematical definition. An n-by-n matrix is usually considered sparse if the number of nonzero elements is O(n). Which means that the ratio of nonzero elements goes to zero as the matrix grows.
by mihevc on 1/1/21, 9:33 AM
[1] https://arrow.apache.org/docs/cpp/api/tensor.html?highlight=...
[2] https://github.com/apache/arrow/blob/master/python/pyarrow/t...
by eggie5 on 1/1/21, 12:08 PM
by golovashkin on 1/1/21, 1:52 PM
by mihevc on 1/1/21, 7:44 PM
It's a generalized CSR/CSC (to ndim > 2) format that uses a tree data structure where each path from root to leaf encodes a nonzero element.
by mindv0rtex on 1/1/21, 2:44 PM