Linear Transform Intuition
March 14, 2025
Example Calculations and Intuition
Linear Transform Intuition Given Eigenvectors and Eigenvalues
To find eigenvalues and eigenvectors for $A$ as given
\[A = \begin{bmatrix} 2 & 1\\ 1 & 2 \end{bmatrix}\]Solution:
\[\begin{align*} |A - \lambda I| &= \Bigg| \begin{bmatrix} 2 & 1\\ 1 & 2 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix} \Bigg|=\Bigg| \begin{matrix} 2-\lambda & 1\\ 1 & 2-\lambda \end{matrix}\Bigg| \\ &= 3 - 4 \lambda + \lambda^2 \end{align*}\]hence $\lambda_1 = 1$ and $\lambda_2 = 3$.
For the eigenvectors:
\[(A - \lambda_1 I) \bold{v}_1 = \begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}=\begin{bmatrix} 0 \\ 0 \end{bmatrix}\]which gives
\[\bold{v}_{\lambda_1} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}\]The same calculation applied with $\lambda_2 = 3$ gives
\[\bold{v}_{\lambda_2} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\]Geometrically, the transformation matrix $A$ scales by a factor of $1$ along the $\bold{v}_{\lambda_1}$ direction and by a factor of $3$ along the $\bold{v}_{\lambda_2}$ direction.
For example, consider the following points $\bold{x}_i$ and their images $A\bold{x}_i$:
- $\bold{x}_1=(1,3)$: $A\bold{x}_1=(5,7)$
- $\bold{x}_2=(1,2)$: $A\bold{x}_2=(4,5)$
- $\bold{x}_3=(1,1)$: $A\bold{x}_3=(3,3)$, exactly scaled by $\lambda_2=3$
- $\bold{x}_4=(1,0)$: $A\bold{x}_4=(2,1)$
- $\bold{x}_5=(1,-1)$: $A\bold{x}_5=(1,-1)$, exactly scaled by $\lambda_1=1$
- $\bold{x}_6=(1,-2)$: $A\bold{x}_6=(0,-3)$
- $\bold{x}_7=(1,-3)$: $A\bold{x}_7=(-1,-5)$

</br>
In conclusion, the larger an eigenvalue, the more strongly the linear transform stretches space along the corresponding eigenvector direction. If a point sits exactly on an eigenvector, it is simply scaled by that eigenvalue along the eigenvector direction.
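The script below animates the points $\bold{x}_i$ moving to their transformed positions $A\bold{x}_i$, plots the eigenvector directions, and saves the animation as a GIF.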
```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# Define the transformation matrix A
A = np.array([[2, 1], [1, 2]])

# Define the original points
points = np.array([
    [1, 3],
    [1, 2],
    [1, 1],
    [1, 0],
    [1, -1],
    [1, -2],
    [1, -3]
])

# Compute transformed points
transformed_points = np.dot(points, A.T)

# Eigenvectors and eigenvalues
eigenvalues, eigenvectors = np.linalg.eig(A)

# Animation setup
fig, ax = plt.subplots()
ax.set_xlim(-2, 12)
ax.set_ylim(-6, 6)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.grid()

# Plot x and y axes
ax.axhline(0, color='black', linewidth=1)
ax.axvline(0, color='black', linewidth=1)

# Plot eigenvectors (scaled by their eigenvalues for better visualization)
for i in range(2):
    vec = eigenvectors[:, i] * eigenvalues[i]
    ax.plot([-vec[0], vec[0]], [-vec[1], vec[1]], color='plum', linestyle='dashed',
            label="Eigenvector [%.3f %.3f], Eigenvalue %.3f" %
                  (eigenvectors[0, i], eigenvectors[1, i], eigenvalues[i]))

# Plot original points
scatter_original, = ax.plot(points[:, 0], points[:, 1], 'o', color='lightblue', label="Original Points")
scatter_transformed, = ax.plot([], [], 'o', color='lightgreen', label="Transforming Points")
ax.legend(loc='lower right')

# Animate transformation by interpolating between original and transformed points
def update(frame):
    t = frame / 30  # interpolation factor (0 to 1)
    intermediate_points = (1 - t) * points + t * transformed_points
    scatter_transformed.set_data(intermediate_points[:, 0], intermediate_points[:, 1])
    return scatter_transformed,

ani = animation.FuncAnimation(fig, update, frames=31, interval=50, blit=True)

# Save the animation as a GIF
ani.save("linear_transform_example.gif", writer="pillow", fps=15)
plt.show()
```
Calculate a Matrix Raised to a Matrix Power
\[X = \begin{bmatrix} 1 & 0\\ -3 & 2 \end{bmatrix}^{ \begin{bmatrix} 2 & -1\\ -3 & 2 \end{bmatrix}^{-1}}\]Solution:
Calculate the inverse of the exponent matrix:
\[\begin{bmatrix} 1 & 0\\ -3 & 2 \end{bmatrix}^{ \begin{bmatrix} 2 & 1\\ 3 & 2 \end{bmatrix}}\]Rewrite using the matrix exponential and logarithm, $M^{B}=e^{ln(M)B}$:
\[e^{ ln(\begin{bmatrix} 1 & 0\\ -3 & 2 \end{bmatrix}) \begin{bmatrix} 2 & 1\\ 3 & 2 \end{bmatrix} }\]To compute the matrix logarithm, get the eigenvalues and eigenvectors of the base matrix $\begin{bmatrix} 1 & 0\\ -3 & 2 \end{bmatrix}$:
\[\lambda_1=1, \bold{v}_{\lambda_1}=\begin{bmatrix} 1 \\ 3 \end{bmatrix}; \qquad \lambda_2=2, \bold{v}_{\lambda_2}=\begin{bmatrix} 0 \\ 1 \end{bmatrix}\]thus,
\[ln(\begin{bmatrix} 1 & 0\\ -3 & 2 \end{bmatrix})=\begin{bmatrix} 1 & 0\\ 3 & 1 \end{bmatrix}\begin{bmatrix} ln(1) & 0\\ 0 & ln(2) \end{bmatrix}\begin{bmatrix} 1 & 0\\ 3 & 1 \end{bmatrix}^{-1}\]thus
\[ln(\begin{bmatrix} 1 & 0\\ -3 & 2 \end{bmatrix})=ln(2)\begin{bmatrix} 0 & 0\\ -3 & 1 \end{bmatrix}\]Consider the original equation
\[e^{ln(\begin{bmatrix} 1 & 0\\ -3 & 2 \end{bmatrix}) \begin{bmatrix} 2 & 1\\ 3 & 2 \end{bmatrix}}=e^{ ln(2)\begin{bmatrix} 0 & 0\\ -3 & 1 \end{bmatrix}\begin{bmatrix} 2 & 1\\ 3 & 2 \end{bmatrix}}\]then
\[e^{ln(\begin{bmatrix} 1 & 0\\ -3 & 2 \end{bmatrix}) \begin{bmatrix} 2 & 1\\ 3 & 2 \end{bmatrix}}= e^{ln(2)\begin{bmatrix} 0 & 0\\ -3 & -1 \end{bmatrix}}\]Again, get the eigenvalues and eigenvectors, this time of the exponent matrix $ln(2)\begin{bmatrix} 0 & 0\\ -3 & -1 \end{bmatrix}$:
\[\lambda_1=0, \bold{v}_{\lambda_1}=\begin{bmatrix} 1 \\ -3 \end{bmatrix}; \qquad \lambda_2=-ln(2), \bold{v}_{\lambda_2}=\begin{bmatrix} 0 \\1 \end{bmatrix}\]thus
\[e^{ ln(2) \begin{bmatrix} 0 & 0\\ -3 & -1 \end{bmatrix} }= \begin{bmatrix} 1 & 0\\ -3 & 1 \end{bmatrix} \begin{bmatrix} e^{0} & 0\\ 0 & e^{-ln(2)} \end{bmatrix} \begin{bmatrix} 1 & 0\\ -3 & 1 \end{bmatrix}^{-1}\]thus, the final solution is derived:
\[X = \begin{bmatrix} 1 & 0\\ -3 & 2 \end{bmatrix}^{ \begin{bmatrix} 2 & -1\\ -3 & 2 \end{bmatrix}^{-1}}= e^{ ln(2) \begin{bmatrix} 0 & 0\\ -3 & -1 \end{bmatrix}}=\begin{bmatrix} 1 & 0\\ -3/2 & 1/2 \end{bmatrix}\]
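As a sanity check, here is a minimal sketch (assuming SciPy is available) that reproduces the result numerically via the same identity $M^{B}=e^{ln(M)B}$:

```python
import numpy as np
from scipy.linalg import expm, logm

M = np.array([[1.0, 0.0], [-3.0, 2.0]])                  # base matrix
B = np.linalg.inv(np.array([[2.0, -1.0], [-3.0, 2.0]]))  # exponent: inverse of the given matrix

# Matrix raised to a matrix power via X = exp(log(M) @ B)
X = expm(logm(M) @ B)
print(np.round(X.real, 6))  # expected: [[ 1.   0. ]  [-1.5  0.5]]
```

Covariance Matrix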
A $2 \times 2$ covariance matrix is defined as
\[\Sigma = \begin{bmatrix} \sigma(x,x) & \sigma(x,y) \\ \sigma(y,x) & \sigma(y,y) \end{bmatrix}\]in which \(\sigma(x,y) = E [ \big(x - E(x) \big) \big(y - E(y)\big) ]\)
where $x$ and $y$ are sample vectors, hence $\sigma(x,y)$ is a scalar.

</br>
The eigenvectors of the covariance matrix give the orientations of the point cloud, and the eigenvalues give its thickness (spread) along those directions, such as the two arrows shown below.
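A minimal sketch (using made-up sample data, not from this article) of how the covariance matrix's eigenvectors and eigenvalues describe a point cloud:

```python
import numpy as np

# Hypothetical correlated 2D point cloud (illustrative data only)
rng = np.random.default_rng(0)
points = rng.multivariate_normal(mean=[0, 0], cov=[[3, 2], [2, 2]], size=500)

# Covariance matrix of the samples (rows are observations)
sigma = np.cov(points, rowvar=False)

# Eigenvectors give the cloud's orientations, eigenvalues its spread
eigenvalues, eigenvectors = np.linalg.eigh(sigma)
print("covariance matrix:\n", sigma)
print("eigenvalues (spread):", eigenvalues)
print("eigenvectors (orientations, as columns):\n", eigenvectors)
```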

</br>
Determinant And Trace (Indicative of Transform Volume)
Determinant
The determinant of a square matrix $A$ representing a linear transformation is a scalar value that quantifies the factor by which the transformation scales volumes in space.
$|\text{det}(A)|>1$ Expansion
\[A\bold{x}=\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 7 \end{bmatrix}\]Here $\text{det}(A)=-2$, so areas are expanded by a factor of $|\text{det}(A)|=2$ (the negative sign indicates an orientation flip).
$|\text{det}(A)|<1$ Contraction
\[A\bold{x}=\begin{bmatrix} 0.1 & 0.2 \\ 0.3 & 0.4 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.3 \\ 0.7 \end{bmatrix}\]Here $|\text{det}(A)|=0.02$, so areas shrink.
$\text{det}(A)=1$ Volume Preservation/Pure Rotation
\[A\bold{x}=\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}\]The vector $\bold{x}$ is rotated by $90$ degrees counterclockwise.
$\text{det}(A)=0$ Collapse
$\text{det}(A)=0$ happens when $\text{rank}(A)$ is not full.
\[\begin{align*} A\bold{x}_1&=\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} \\ A\bold{x}_2&=\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} \\ A\bold{x}_3&=\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} \end{align*}\]All $\bold{x}_i$ are collapsed onto the line $x_2=x_1$.
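A quick check of the four cases with `numpy.linalg.det` (the variable names are only illustrative):

```python
import numpy as np

# Determinants of the example matrices above
A_expand   = np.array([[1, 2], [3, 4]])          # |det| = 2    -> expansion
A_contract = np.array([[0.1, 0.2], [0.3, 0.4]])  # |det| = 0.02 -> contraction
A_rotate   = np.array([[0, -1], [1, 0]])         # det  = 1     -> volume preserved
A_collapse = np.array([[1, 1], [1, 1]])          # det  = 0     -> collapse

for name, M in [("expansion", A_expand), ("contraction", A_contract),
                ("rotation", A_rotate), ("collapse", A_collapse)]:
    print(f"{name}: det = {np.linalg.det(M):.4f}")
```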
Trace
Trace of a matrix is defined as
\[\text{tr}(A) = \sum^n_{i=1} a_{ii}\]A matrix trace equals both the sum of its diagonal entries and the sum of its eigenvalues.
\[\sum^n_{i=1}\lambda_i=\text{tr}(A)\]Since the trace is the sum of the eigenvalues, it gives a rough overview of the total eigenvalue “energy”.
The determinant describes in more detail how a matrix transformation expands or contracts volume, but it is much more expensive to compute. The trace comes to the rescue as an alternative characterized by easy computation.
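A one-line check that the trace equals the sum of the eigenvalues, using the matrix $A$ from the first example:

```python
import numpy as np

A = np.array([[2, 1], [1, 2]])
eigenvalues = np.linalg.eigvals(A)

# Trace = sum of diagonal entries = sum of eigenvalues
print(np.trace(A))        # 4
print(eigenvalues.sum())  # 4.0
```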
Infinitesimal Transformations and Linear Approximation
For a transformation close to the identity, $I+\epsilon A$, where $\epsilon$ is a small amount, the first-order approximation of the determinant is $\text{det}(I+\epsilon A)\approx 1+\epsilon \text{tr}(A)$. Because $\epsilon$ is small, the higher-order terms can be dropped/ignored.
Thus, $\text{tr}(A)$ approximates the volume change rate for small $\epsilon$.
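A small numerical sketch of this approximation (the value of $\epsilon$ is chosen arbitrarily):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
eps = 1e-4

exact  = np.linalg.det(np.eye(2) + eps * A)  # det(I + eps*A)
approx = 1 + eps * np.trace(A)               # 1 + eps*tr(A)
print(exact, approx)  # both approximately 1.0004
```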
Example: Rate of Continuous Dynamical Linear Systems
Consider a linear continuous dynamical system defined by $\frac{d\bold{x}}{dt}=A\bold{x}$, whose integrated solution is $\bold{x}(t)=e^{At}\bold{x}(0)$.
The volume scaling factor over time $t$ is $\text{det}(e^{At})=e^{\text{tr}(A)t}$. Differentiating at $t=0$, the instantaneous rate of volume change is $\frac{d}{dt}\text{det}(e^{At})\big|_{t=0}=\text{tr}(A)$.
Take iterative steps to update the dynamical system by $t_{+1}=t_{0}+\delta t$, and note that $\frac{d\bold{x}}{dt}=A\bold{x}$ is evaluated at the current state $\bold{x}$ (the observed change $A\bold{x}$ differs at each observation timestamp $t_0$). When $\delta t\rightarrow 0$ is small enough, the dynamical system can be viewed as continuous across every step $t_{0}\rightarrow t_{+1}$, with volume change rate $\text{tr}(A)$.
In conclusion, $\text{tr}(A)$ is the first-order/linear approximation of the volume change rate at every system update step $t_{+1}=t_{0}+\delta t$.
Volume Growth/Decay (see the numerical check after this list):
- If $\text{tr}(A)>0$: Volume expands exponentially.
- If $\text{tr}(A)<0$: Volume contracts exponentially.
- If $\text{tr}(A)=0$: Volume is preserved (e.g., Hamiltonian systems).
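A short sketch (the matrices are chosen only to illustrate the three cases) confirming $\text{det}(e^{At})=e^{\text{tr}(A)t}$ with SciPy:

```python
import numpy as np
from scipy.linalg import expm

t = 2.0
cases = [np.array([[ 0.5, 0.0], [ 0.0,  0.2]]),   # tr(A) > 0: volume expands
         np.array([[-0.5, 0.0], [ 0.0, -0.2]]),   # tr(A) < 0: volume contracts
         np.array([[ 0.0, 1.0], [-1.0,  0.0]])]   # tr(A) = 0: volume preserved
for A in cases:
    print(np.linalg.det(expm(A * t)), np.exp(np.trace(A) * t))  # the two values match
```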
Eigenvector and Orthogonality
If $A^{\top}A=I$, matrix $A$ is termed an orthogonal matrix.
An orthogonal matrix preserves vector lengths and inner products (they are invariant) under the linear transformation.
For example, a rotation matrix is orthogonal.
\[R (\theta) = \begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \\ \end{bmatrix}\]Eigen-Decomposition (Diagonalization)
Eigen-decomposition (diagonalization) breaks a square matrix $A$ into eigenvectors and eigenvalues.
\[A=Q\Lambda Q^{-1}\]where
- $\Lambda$: a diagonal matrix whose diagonal entries are the eigenvalues of $A$
- $Q$: Matrix whose columns are eigenvectors of $A$
In general $Q$ is merely invertible; only when the eigenvectors can be chosen orthonormal (for example, when $A$ is real symmetric) does $Q$ satisfy $Q^{\top}Q=I$, so that $Q^{-1}=Q^{\top}$.
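A small sketch (the example matrix is chosen arbitrarily) showing that $A=Q\Lambda Q^{-1}$ reconstructs $A$, while $Q$ is orthogonal only in special cases:

```python
import numpy as np

# A non-symmetric but diagonalizable matrix (illustrative only)
A = np.array([[1.0, 2.0], [0.0, 3.0]])
eigenvalues, Q = np.linalg.eig(A)
Lambda = np.diag(eigenvalues)

# The reconstruction A = Q @ Lambda @ Q^{-1} holds for any diagonalizable A...
print(np.allclose(A, Q @ Lambda @ np.linalg.inv(Q)))  # True
# ...but Q is generally not orthogonal: Q^T @ Q != I
print(np.allclose(Q.T @ Q, np.eye(2)))                # False
```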
There are a few scenarios and conditions under which the eigenvectors produced by an eigen-decomposition are orthogonal.
Real Symmetry and Eigenvector Orthogonality
A matrix is real symmetric if $A^{\top}=A\in\mathbb{R}^{n \times n}$.
By the Spectral Theorem, if $A$ is a real symmetric matrix, then:
- All eigenvalues of $A$ are real
- The eigenvectors of $A$ can be chosen to be orthogonal and normalized (orthonormal).
- $A$ can be orthogonally diagonalized as $A=Q\Lambda Q^{\top}$, where 1) $\Lambda$ is a diagonal matrix containing the eigenvalues of $A$, and 2) the columns of $Q$ are the orthonormal eigenvectors of $A$ (see the numerical check below).
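A minimal check of the Spectral Theorem with numpy, reusing the symmetric matrix $A$ from the first example:

```python
import numpy as np

# Symmetric matrix from the first example
A = np.array([[2.0, 1.0], [1.0, 2.0]])

# eigh is specialized for symmetric/Hermitian matrices
eigenvalues, Q = np.linalg.eigh(A)
Lambda = np.diag(eigenvalues)

print(eigenvalues)                       # real eigenvalues: [1. 3.]
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: columns of Q are orthonormal
print(np.allclose(A, Q @ Lambda @ Q.T))  # True: A = Q Lambda Q^T
```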