We consider constrained optimization problems of the kind $\min_x f(x)$ subject to $x \in \mathcal{P}$ (1), where the feasibility region $\mathcal{P}$ is a polytope, i.e., $\mathcal{P}$ is the set of $x \in \mathbb{R}^n$ such that $Ax \le b$ and $Cx = d$, where $A, C$ are real matrices of size $m \times n$ and $p \times n$, respectively, and $b \in \mathbb{R}^m$, $d \in \mathbb{R}^p$ are column vectors. Equivalently, we can rewrite the constraints of (1) as $\langle a_i, x \rangle \le b_i$ for $i = 1, \dots, m$ and $\langle c_j, x \rangle = d_j$ for $j = 1, \dots, p$, where $a_i, b_i$ are the $i$-th row of $A$ and $b$, respectively, and $\langle \cdot, \cdot \rangle$ denotes the scalar product. In this post we…
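As a minimal sketch of the row-by-row view above, the membership test $x \in \mathcal{P}$ amounts to checking $\langle a_i, x \rangle \le b_i$ and $\langle c_j, x \rangle = d_j$ for every row. The matrices and symbols below are illustrative placeholders, not the post's actual data:

```python
import numpy as np

# Hypothetical polytope P = {x : A x <= b, C x = d}; symbols assumed,
# the post's actual notation may differ.
A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])  # inequality matrix
b = np.array([1.0, 0.0, 0.0])                          # inequality bounds
C = np.array([[1.0, -1.0]])                            # equality matrix
d = np.array([0.0])                                    # equality right-hand side

def in_polytope(x, A, b, C, d, tol=1e-9):
    """Test x in P row by row: <a_i, x> <= b_i and <c_j, x> = d_j."""
    return bool(np.all(A @ x <= b + tol) and np.all(np.abs(C @ x - d) <= tol))

print(in_polytope(np.array([0.25, 0.25]), A, b, C, d))  # feasible point
print(in_polytope(np.array([1.0, 1.0]), A, b, C, d))    # violates x1 + x2 <= 1
```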
Consider a linear system of complex-valued equations of the kind $AX = B$, where $A, B$ are complex square and rectangular matrices, respectively, and $X$ is the unknown complex matrix. To compute $X$, a natural option is to take any algorithm available in the literature originally conceived for real systems, such as Gaussian elimination or LU/Cholesky decomposition, and translate its operations to complex arithmetic. However, there may…
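A quick sketch of the two routes the teaser hints at: running a real-system solver in complex arithmetic directly, versus embedding $AX = B$ into an equivalent real system of twice the size. The symbols $A$, $B$, $X$ follow the teaser; the sizes are arbitrary:

```python
import numpy as np

# Solve the complex system A X = B two ways and compare.
rng = np.random.default_rng(0)
n, m = 4, 2
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))

# Option 1: complex arithmetic directly (LU factorization under the hood).
X_complex = np.linalg.solve(A, B)

# Option 2: real embedding. Rewrite A X = B as the 2n x 2n real system
#   [Re A  -Im A] [Re X]   [Re B]
#   [Im A   Re A] [Im X] = [Im B]
A_real = np.block([[A.real, -A.imag], [A.imag, A.real]])
B_real = np.vstack([B.real, B.imag])
Y = np.linalg.solve(A_real, B_real)
X_embed = Y[:n] + 1j * Y[n:]

print(np.allclose(X_complex, X_embed))  # both solve the same system
```

The real embedding is convenient when only a real-valued solver is available, at the cost of a larger system.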
In signal processing, a classic problem is estimating a signal, in the form of a complex column vector $x$, by observing a related signal $y$, which has been produced by multiplying the unknown $x$ by a known matrix $A$ and adding noise $n$: $y = Ax + n$. 1. Assumptions: The random signal $x$ and noise $n$ are independent of each other, Gaussian…
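Under the stated assumptions (independent, Gaussian $x$ and $n$), the MMSE estimator is linear in $y$. A sketch with assumed zero means and identity-like covariances, which are illustrative choices and not necessarily the post's:

```python
import numpy as np

# Linear MMSE estimator for y = A x + n; covariances Cx, Cn are assumptions.
rng = np.random.default_rng(1)
n_dim, m_dim = 3, 5                     # signal and observation sizes
A = rng.standard_normal((m_dim, n_dim)) + 1j * rng.standard_normal((m_dim, n_dim))
Cx = np.eye(n_dim)                      # signal covariance (assumed identity)
Cn = 0.1 * np.eye(m_dim)                # noise covariance (assumed)

def lmmse(y):
    """x_hat = Cx A^H (A Cx A^H + Cn)^{-1} y; for Gaussian x, n this
    linear estimator achieves the minimum mean squared error."""
    G = Cx @ A.conj().T @ np.linalg.inv(A @ Cx @ A.conj().T + Cn)
    return G @ y

# Monte-Carlo sanity check: the estimator should beat the trivial guess x_hat = 0.
N = 2000
X = (rng.standard_normal((n_dim, N)) + 1j * rng.standard_normal((n_dim, N))) / np.sqrt(2)
Nz = np.sqrt(0.1 / 2) * (rng.standard_normal((m_dim, N)) + 1j * rng.standard_normal((m_dim, N)))
Y = A @ X + Nz
err_est = np.mean(np.abs(lmmse(Y) - X) ** 2)
err_zero = np.mean(np.abs(X) ** 2)
print(err_est < err_zero)
```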
Kolmogorov-Arnold networks (KANs) are generating significant interest in the AI community due to their potential for accuracy and interpretability. We implement KANs (and, incidentally, MLPs) from scratch in plain Python (no Torch / TensorFlow). The code is available on GitHub.
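The distinctive idea, a learnable univariate function on every edge rather than a scalar weight, can be sketched in a few lines. This toy layer uses a fixed tanh-power basis purely for illustration (KANs typically use B-splines, and the post's GitHub implementation may differ):

```python
import numpy as np

def edge_basis(x, K=4):
    """Toy basis for one KAN edge: powers of tanh(x). A stand-in for the
    B-spline bases real KAN implementations use."""
    t = np.tanh(x)
    return np.stack([t ** k for k in range(1, K + 1)], axis=-1)  # (..., K)

class ToyKANLayer:
    """Each edge (i -> j) carries its own learnable 1-D function
    phi_ij(x) = sum_k W[i, j, k] * basis_k(x); each output node simply
    sums the edge functions feeding into it."""
    def __init__(self, d_in, d_out, K=4, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_in, d_out, K)) * 0.1  # edge coefficients

    def __call__(self, x):                # x: (batch, d_in)
        phi = edge_basis(x)               # (batch, d_in, K)
        # contract basis coefficients per edge, then sum over input nodes
        return np.einsum('bik,iok->bo', phi, self.W)

layer = ToyKANLayer(d_in=3, d_out=2)
out = layer(np.ones((5, 3)))
print(out.shape)  # (5, 2)
```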
Chain rule to the rescue! To train a deep neural network (DNN), one needs to repeatedly compute the gradient of the empirical loss $\mathcal{L}$ with respect to all the DNN parameters $\theta$. Since their number $P$ can be huge, it is crucial that the cost of gradient computation grows “slowly” with $P$. Back-propagation addresses this successfully by shrewdly exploiting…
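The chain-rule bookkeeping can be made concrete on a tiny network. This sketch (shapes and names are illustrative, not the post's) computes all parameter gradients by propagating the error backwards through cached forward quantities, then checks one entry against a finite difference:

```python
import numpy as np

# Tiny 1-hidden-layer network; gradients of the squared loss w.r.t. all
# parameters via the chain rule (back-propagation).
rng = np.random.default_rng(0)
x = rng.standard_normal(3)               # input
t = rng.standard_normal(2)               # target
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))

def forward(W1, W2, x):
    z = W1 @ x                           # pre-activation
    h = np.tanh(z)                       # hidden activation
    y = W2 @ h                           # output
    return z, h, y

def loss(W1, W2):
    _, _, y = forward(W1, W2, x)
    return 0.5 * np.sum((y - t) ** 2)

# backward pass: reuse forward intermediates, propagate dL/dy backwards
z, h, y = forward(W1, W2, x)
dy = y - t                               # dL/dy
dW2 = np.outer(dy, h)                    # dL/dW2
dh = W2.T @ dy                           # chain rule through W2
dz = dh * (1 - np.tanh(z) ** 2)          # through tanh
dW1 = np.outer(dz, x)                    # dL/dW1

# numerical check on one entry of W1
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
num = (loss(W1p, W2) - loss(W1, W2)) / eps
print(np.isclose(num, dW1[0, 0], atol=1e-4))
```

One forward and one backward sweep cost about the same, which is why the total gradient cost grows only linearly rather than quadratically with the parameter count.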