In this appendix, I prove two statements:
$$
\frac{d}{dA}\,\mathrm{tr}(AB) = B^T \tag{1}
$$

$$
\frac{d}{dA}\,\mathrm{tr}(ABA^T) = 2AB \quad \text{(for symmetric } B\text{)} \tag{2}
$$
Let $A$ be an $m \times n$ matrix and $B$ an $n \times m$ matrix. Their product is:
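$$
AB =
\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}
\begin{bmatrix} b_{11} & \cdots & b_{1m} \\ \vdots & \ddots & \vdots \\ b_{n1} & \cdots & b_{nm} \end{bmatrix}
=
\begin{bmatrix}
\sum_{i=1}^{n} a_{1i} b_{i1} & \cdots & \sum_{i=1}^{n} a_{1i} b_{im} \\
\vdots & \ddots & \vdots \\
\sum_{i=1}^{n} a_{mi} b_{i1} & \cdots & \sum_{i=1}^{n} a_{mi} b_{im}
\end{bmatrix}
$$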
The trace of AB is the sum of the main diagonal:
$$
\mathrm{tr}(AB) = \sum_{i=1}^{n} a_{1i} b_{i1} + \cdots + \sum_{i=1}^{n} a_{mi} b_{im} = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ji} b_{ij}
$$
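The double-sum form of the trace can be checked numerically. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the helper names `matmul` and `trace` are introduced here for illustration and are not part of the original text.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

def trace(M):
    """Sum of the main diagonal of a square matrix."""
    return sum(M[d][d] for d in range(len(M)))

random.seed(0)
m, n = 3, 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]  # m x n
B = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]  # n x m

# Left side: trace of the explicit m x m product AB.
lhs = trace(matmul(A, B))
# Right side: the double sum over i = 1..n and j = 1..m of a_ji * b_ij.
rhs = sum(A[j][i] * B[i][j] for i in range(n) for j in range(m))

assert abs(lhs - rhs) < 1e-12
```

The two values agree up to floating-point rounding, which is all the tolerance is meant to absorb.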
To differentiate, use the definition of the gradient of a scalar function $f(X)$ with respect to an $m \times n$ matrix $X$:
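$$
\frac{\partial f(X)}{\partial X} =
\begin{bmatrix}
\frac{\partial f(X)}{\partial x_{11}} & \cdots & \frac{\partial f(X)}{\partial x_{1n}} \\
\vdots & \ddots & \vdots \\
\frac{\partial f(X)}{\partial x_{m1}} & \cdots & \frac{\partial f(X)}{\partial x_{mn}}
\end{bmatrix}
$$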
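Applying this to $\mathrm{tr}(AB)$: the only term of the double sum that contains $a_{pq}$ is $a_{pq} b_{qp}$, so $\partial\,\mathrm{tr}(AB) / \partial a_{pq} = b_{qp}$. Therefore:

$$
\begin{aligned}
\frac{\partial\,\mathrm{tr}(AB)}{\partial A}
&=
\begin{bmatrix}
\frac{\partial}{\partial a_{11}} \left( \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ji} b_{ij} \right) & \cdots & \frac{\partial}{\partial a_{1n}} \left( \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ji} b_{ij} \right) \\
\vdots & \ddots & \vdots \\
\frac{\partial}{\partial a_{m1}} \left( \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ji} b_{ij} \right) & \cdots & \frac{\partial}{\partial a_{mn}} \left( \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ji} b_{ij} \right)
\end{bmatrix} \\
&=
\begin{bmatrix} b_{11} & \cdots & b_{n1} \\ \vdots & \ddots & \vdots \\ b_{1m} & \cdots & b_{nm} \end{bmatrix}
= B^T
\end{aligned}
$$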
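As a sanity check on statement (1), the gradient can be estimated entry by entry with central finite differences and compared against $B^T$. This is only a sketch under the assumption that matrices are lists of rows; `trace_of_product` and `numerical_gradient` are hypothetical helper names, and the step size `h` is an arbitrary choice.

```python
import random

def trace_of_product(A, B):
    """tr(AB) written as the double sum over a_ji * b_ij."""
    return sum(A[j][i] * B[i][j] for j in range(len(A)) for i in range(len(B)))

def numerical_gradient(f, A, h=1e-6):
    """Central-difference estimate of df/dA, perturbing one entry at a time."""
    grad = [[0.0] * len(A[0]) for _ in A]
    for p in range(len(A)):
        for q in range(len(A[0])):
            original = A[p][q]
            A[p][q] = original + h
            f_plus = f(A)
            A[p][q] = original - h
            f_minus = f(A)
            A[p][q] = original  # restore the entry before moving on
            grad[p][q] = (f_plus - f_minus) / (2 * h)
    return grad

random.seed(1)
m, n = 3, 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]  # m x n
B = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]  # n x m

grad = numerical_gradient(lambda X: trace_of_product(X, B), A)
B_T = [[B[i][j] for i in range(n)] for j in range(m)]  # transpose, m x n

max_err = max(abs(grad[p][q] - B_T[p][q]) for p in range(m) for q in range(n))
assert max_err < 1e-6
```

Because $\mathrm{tr}(AB)$ is linear in $A$, the central difference carries no truncation error here, so the remaining discrepancy is floating-point rounding only.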
For the second statement, let $A$ be an $m \times n$ matrix and $B$ an $n \times n$ matrix. Then:
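$$
\begin{aligned}
ABA^T
&=
\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}
\begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & \ddots & \vdots \\ b_{n1} & \cdots & b_{nn} \end{bmatrix}
\begin{bmatrix} a_{11} & \cdots & a_{m1} \\ \vdots & \ddots & \vdots \\ a_{1n} & \cdots & a_{mn} \end{bmatrix} \\
&=
\begin{bmatrix}
\sum_{i=1}^{n} a_{1i} b_{i1} & \cdots & \sum_{i=1}^{n} a_{1i} b_{in} \\
\vdots & \ddots & \vdots \\
\sum_{i=1}^{n} a_{mi} b_{i1} & \cdots & \sum_{i=1}^{n} a_{mi} b_{in}
\end{bmatrix}
\begin{bmatrix} a_{11} & \cdots & a_{m1} \\ \vdots & \ddots & \vdots \\ a_{1n} & \cdots & a_{mn} \end{bmatrix} \\
&=
\begin{bmatrix}
\sum_{j=1}^{n} \sum_{i=1}^{n} a_{1i} b_{ij} a_{1j} & \cdots & \sum_{j=1}^{n} \sum_{i=1}^{n} a_{1i} b_{ij} a_{mj} \\
\vdots & \ddots & \vdots \\
\sum_{j=1}^{n} \sum_{i=1}^{n} a_{mi} b_{ij} a_{1j} & \cdots & \sum_{j=1}^{n} \sum_{i=1}^{n} a_{mi} b_{ij} a_{mj}
\end{bmatrix}
\end{aligned}
$$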
The trace of $ABA^T$ is the sum of the main diagonal:
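$$
\mathrm{tr}(ABA^T) = \sum_{j=1}^{n} \sum_{i=1}^{n} a_{1i} b_{ij} a_{1j} + \cdots + \sum_{j=1}^{n} \sum_{i=1}^{n} a_{mi} b_{ij} a_{mj} = \sum_{k=1}^{m} \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ki} b_{ij} a_{kj}
$$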
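Each entry $a_{pq}$ appears only in the terms with $k = p$: once as the left factor $a_{ki}$ (with $i = q$) and once as the right factor $a_{kj}$ (with $j = q$). By the product rule, $\partial\,\mathrm{tr}(ABA^T) / \partial a_{pq} = \sum_{j=1}^{n} b_{qj} a_{pj} + \sum_{i=1}^{n} a_{pi} b_{iq}$, so:

$$
\begin{aligned}
\frac{\partial\,\mathrm{tr}(ABA^T)}{\partial A}
&=
\begin{bmatrix}
\frac{\partial}{\partial a_{11}} \left( \sum_{k=1}^{m} \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ki} b_{ij} a_{kj} \right) & \cdots & \frac{\partial}{\partial a_{1n}} \left( \sum_{k=1}^{m} \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ki} b_{ij} a_{kj} \right) \\
\vdots & \ddots & \vdots \\
\frac{\partial}{\partial a_{m1}} \left( \sum_{k=1}^{m} \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ki} b_{ij} a_{kj} \right) & \cdots & \frac{\partial}{\partial a_{mn}} \left( \sum_{k=1}^{m} \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ki} b_{ij} a_{kj} \right)
\end{bmatrix} \\
&=
\begin{bmatrix}
\sum_{j=1}^{n} b_{1j} a_{1j} + \sum_{i=1}^{n} a_{1i} b_{i1} & \cdots & \sum_{j=1}^{n} b_{nj} a_{1j} + \sum_{i=1}^{n} a_{1i} b_{in} \\
\vdots & \ddots & \vdots \\
\sum_{j=1}^{n} b_{1j} a_{mj} + \sum_{i=1}^{n} a_{mi} b_{i1} & \cdots & \sum_{j=1}^{n} b_{nj} a_{mj} + \sum_{i=1}^{n} a_{mi} b_{in}
\end{bmatrix} \\
&=
\begin{bmatrix}
\sum_{j=1}^{n} a_{1j} b_{1j} & \cdots & \sum_{j=1}^{n} a_{1j} b_{nj} \\
\vdots & \ddots & \vdots \\
\sum_{j=1}^{n} a_{mj} b_{1j} & \cdots & \sum_{j=1}^{n} a_{mj} b_{nj}
\end{bmatrix}
+
\begin{bmatrix}
\sum_{i=1}^{n} a_{1i} b_{i1} & \cdots & \sum_{i=1}^{n} a_{1i} b_{in} \\
\vdots & \ddots & \vdots \\
\sum_{i=1}^{n} a_{mi} b_{i1} & \cdots & \sum_{i=1}^{n} a_{mi} b_{in}
\end{bmatrix} \\
&=
\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}
\begin{bmatrix} b_{11} & \cdots & b_{n1} \\ \vdots & \ddots & \vdots \\ b_{1n} & \cdots & b_{nn} \end{bmatrix}
+
\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}
\begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & \ddots & \vdots \\ b_{n1} & \cdots & b_{nn} \end{bmatrix} \\
&= AB^T + AB
\end{aligned}
$$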
If $B$ is symmetric, $B = B^T$, so:

$$
\frac{\partial\,\mathrm{tr}(ABA^T)}{\partial A} = AB^T + AB = AB + AB = 2AB
$$
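Both the general result $AB^T + AB$ and the symmetric special case $2AB$ can be checked with the same finite-difference idea. The sketch below is plain Python under the assumption that matrices are lists of rows; all helper names are hypothetical, and the symmetric test matrix `S` is built as $(B + B^T)/2$ purely for illustration.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def trace_abat(A, B):
    """tr(A B A^T) as the triple sum over a_ki * b_ij * a_kj."""
    n = len(B)
    return sum(A[k][i] * B[i][j] * A[k][j]
               for k in range(len(A)) for i in range(n) for j in range(n))

def numerical_gradient(f, A, h=1e-5):
    """Central-difference estimate of df/dA, perturbing one entry at a time."""
    grad = [[0.0] * len(A[0]) for _ in A]
    for p in range(len(A)):
        for q in range(len(A[0])):
            original = A[p][q]
            A[p][q] = original + h
            f_plus = f(A)
            A[p][q] = original - h
            f_minus = f(A)
            A[p][q] = original  # restore the entry before moving on
            grad[p][q] = (f_plus - f_minus) / (2 * h)
    return grad

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

random.seed(2)
m, n = 3, 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]  # m x n
B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]  # n x n, not symmetric
S = [[(B[i][j] + B[j][i]) / 2 for j in range(n)] for i in range(n)]  # symmetric

# General case: the gradient should match A B^T + A B.
grad_general = numerical_gradient(lambda X: trace_abat(X, B), A)
err_general = max_abs_diff(grad_general, add(matmul(A, transpose(B)), matmul(A, B)))

# Symmetric case: the gradient should match 2 A S.
grad_symmetric = numerical_gradient(lambda X: trace_abat(X, S), A)
two_AS = [[2 * v for v in row] for row in matmul(A, S)]
err_symmetric = max_abs_diff(grad_symmetric, two_AS)

assert err_general < 1e-6
assert err_symmetric < 1e-6
```

Since $\mathrm{tr}(ABA^T)$ is quadratic in $A$, the central difference is exact apart from rounding, which is why a loose tolerance of `1e-6` is comfortably met.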