Machine Learning without tears

Mathy stuff, the way I would have liked to learn it

  • The Proximal Policy Optimization (PPO) algorithm is arguably the default choice in modern reinforcement learning (RL) libraries. In this post we derive PPO from first principles. First, we brush up on the underlying Markov Decision Process (MDP) model. 1. Preliminaries on Markov Decision Processes (MDPs) In an MDP, an agent (say,…
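
    As a preview of where the derivation lands, here is a minimal NumPy sketch (my own illustration, not code from the post) of the clipped surrogate objective that PPO maximizes; `ratio` and `advantage` are hypothetical per-sample arrays:

    ```python
    import numpy as np

    def ppo_clip_objective(ratio, advantage, eps=0.2):
        """Clipped surrogate objective of PPO (to be maximized).

        ratio     -- pi_new(a|s) / pi_old(a|s), shape (batch,)
        advantage -- advantage estimates A(s, a), shape (batch,)
        eps       -- clipping parameter (0.2 in the original PPO paper)
        """
        unclipped = ratio * advantage
        clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
        # PPO takes the pessimistic minimum of the two surrogates, which
        # removes any incentive to push the ratio outside [1-eps, 1+eps].
        return np.minimum(unclipped, clipped).mean()
    ```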


  • This post explores Gauss's divergence theorem through intuitive and visual reasoning. To engage the reader's imagination, we use water flux as our running example, although the reasoning applies to any vector field, e.g., an electric, magnetic, heat, or gravity field. Moreover, to keep things simple we work in two dimensions, although the same principles…
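
    As a quick numerical sanity check of the two-dimensional statement (a sketch of my own, not taken from the post): for the field $F(x, y) = (x + y, y)$, whose divergence is 2 everywhere, the outward flux through the unit circle must equal the integral of the divergence over the unit disk, namely $2\pi$:

    ```python
    import numpy as np

    # F(x, y) = (x + y, y) has divergence d(x+y)/dx + d(y)/dy = 2 everywhere,
    # so the area integral of div F over the unit disk is 2 * area = 2 * pi.
    t = np.linspace(0.0, 2.0 * np.pi, 200_000, endpoint=False)
    dt = t[1] - t[0]

    # On the unit circle the outward normal is (cos t, sin t) and ds = dt.
    Fx, Fy = np.cos(t) + np.sin(t), np.sin(t)
    flux = np.sum(Fx * np.cos(t) + Fy * np.sin(t)) * dt  # boundary flux of F

    print(flux, 2.0 * np.pi)  # both ≈ 6.28319, as the theorem predicts
    ```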


  • We consider constrained optimization problems of the kind: $\min_x f(x)$ s.t. $x \in \mathcal{P}$ (1), where the feasibility region $\mathcal{P}$ is a polytope, i.e., $\mathcal{P}$ is the set of $x \in \mathbb{R}^n$ such that: $Ax \le b$, $Cx = d$, where $A, C$ are real matrices of size $m \times n$ and $p \times n$, respectively, and $b, d$ are column vectors. Equivalently, we can rewrite (1) as: $\min_x f(x)$ s.t. $\langle a_i, x \rangle \le b_i$ for $i = 1, \dots, m$ and $\langle c_j, x \rangle = d_j$ for $j = 1, \dots, p$, where $a_i, c_j$ are the $i$-th and $j$-th row of $A$ and $C$, respectively, and $\langle \cdot, \cdot \rangle$ denotes the scalar product. In this post we…
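
    To make the setup concrete, here is a small sketch (my own, assuming SciPy is available and taking a linear objective $f(x) = \langle c, x \rangle$ with toy data for $A, b, C, d$) of a problem of the form (1):

    ```python
    import numpy as np
    from scipy.optimize import linprog

    # min <c, x>  subject to  A x <= b  and  C x = d: a polytope as in (1).
    c = np.array([1.0, 2.0])        # objective: f(x) = x1 + 2*x2
    A = np.array([[-1.0, 0.0],      # -x1 <= 0   (i.e., x1 >= 0)
                  [0.0, -1.0],      # -x2 <= 0   (i.e., x2 >= 0)
                  [1.0, 1.0]])      # x1 + x2 <= 4
    b = np.array([0.0, 0.0, 4.0])
    C = np.array([[1.0, -1.0]])     # x1 - x2 = 1
    d = np.array([1.0])

    res = linprog(c, A_ub=A, b_ub=b, A_eq=C, b_eq=d, bounds=(None, None))
    print(res.x, res.fun)           # optimizer (1, 0) with value 1
    ```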


  • Consider a linear system of complex-valued equations of the kind $AX = B$, where $A \in \mathbb{C}^{n \times n}$ and $B \in \mathbb{C}^{n \times m}$ are complex square and rectangular matrices, respectively, and $X$ is the unknown complex matrix. To compute $X$, a natural option is to take any algorithm available in the literature initially conceived for real systems, such as Gaussian elimination or LU/Cholesky decomposition, and translate its operations to complex arithmetic. However, there may…
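
    One standard alternative (a sketch of mine, not necessarily the route the post takes) is to avoid complex arithmetic altogether: split $A$ into its real and imaginary parts, stack the real and imaginary parts of $X$ and $B$, and solve an equivalent real block system of twice the size:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 4, 2
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    B = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))

    # Real 2n x 2n system equivalent to A X = B:
    # [ Re A  -Im A ] [ Re X ]   [ Re B ]
    # [ Im A   Re A ] [ Im X ] = [ Im B ]
    M = np.block([[A.real, -A.imag],
                  [A.imag,  A.real]])
    rhs = np.vstack([B.real, B.imag])

    Z = np.linalg.solve(M, rhs)   # purely real arithmetic throughout
    X = Z[:n] + 1j * Z[n:]

    print(np.allclose(A @ X, B))  # True: recovers the complex solution
    ```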


  • In signal processing, a classic problem consists in estimating a signal, in the form of a complex column vector $x$, by observing a related signal $y$, which has been produced by multiplying the unknown $x$ by a known matrix $H$ and adding noise $n$: $y = Hx + n$. 1. Assumptions The random signal $x$ and noise $n$ are independent of each other, Gaussian…
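
    Under these Gaussian and independence assumptions, the MMSE estimator of $x$ is linear in $y$. A minimal NumPy sketch (my own, with hypothetical covariances $R_x$ and $R_n$) of the estimator $\hat{x} = R_x H^H (H R_x H^H + R_n)^{-1} y$:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n, m = 3, 5                          # dimensions of x and of the observation y

    Rx = np.eye(n, dtype=complex)        # covariance of the zero-mean signal x
    Rn = 0.1 * np.eye(m, dtype=complex)  # covariance of the zero-mean noise n
    H = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

    # Draw one realization of the model y = H x + n.
    x = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    noise = np.sqrt(0.05) * (rng.standard_normal(m) + 1j * rng.standard_normal(m))
    y = H @ x + noise

    # Linear MMSE estimate: x_hat = Rx H^H (H Rx H^H + Rn)^{-1} y.
    G = Rx @ H.conj().T @ np.linalg.inv(H @ Rx @ H.conj().T + Rn)
    x_hat = G @ y
    print(np.linalg.norm(x - x_hat))     # small residual estimation error
    ```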