[Mathematical Foundations of AI] Jacobian and Hessian Matrices

Jacobian Matrix

Let $f: \mathbb{R}^n \to \mathbb{R}^m$ be a vector-valued function:

$$
f(x_1, x_2, \ldots, x_n) = \begin{bmatrix} f_1(x_1, x_2, \ldots, x_n) \\ f_2(x_1, x_2, \ldots, x_n) \\ \vdots \\ f_m(x_1, x_2, \ldots, x_n) \end{bmatrix}
$$

Then its Jacobian matrix is:

$$
J_f = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{bmatrix}_{m \times n}
$$
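To make the definition concrete, the sketch below approximates the Jacobian numerically with central differences, using only the Python standard library. The helper `numerical_jacobian` and the test function `example_f` are illustrative names chosen here, not part of the original post, and the result is only an approximation of the analytic partial derivatives.

```python
import math


def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the m x n Jacobian of f at x with central differences."""
    n = len(x)
    m = len(f(x))
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Perturb only the j-th input to fill the j-th column.
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def example_f(x):
    """Hypothetical f: R^2 -> R^3 used only for illustration."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x2 ** 2]


# Analytic Jacobian at (1, 2): [[2, 1], [cos(1), 0], [0, 4]]
J = numerical_jacobian(example_f, [1.0, 2.0])
for row in J:
    print(row)
```

Row $i$ holds the partial derivatives of $f_i$ and column $j$ corresponds to $x_j$, so the result has shape $m \times n$ (here $3 \times 2$).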

Hessian Matrix

Let $f: \mathbb{R}^n \to \mathbb{R}$ be a scalar-valued function.

Its Jacobian matrix is:

$$
J_f = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_n} \end{bmatrix}_{1 \times n}
$$

The Hessian matrix is obtained by differentiating every entry of a row of the Jacobian (a scalar-valued $f$ has only one row) with respect to every variable, which yields all second-order partial derivatives of $f$:

Equivalently, the Hessian is "the Jacobian of the transpose of the Jacobian":

$$
H_f = J(\nabla f) = J(J_f^T)
$$

$$
H_f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}_{n \times n}
$$
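The identity $H_f = J(\nabla f)$ can be checked numerically. The following is a minimal sketch, assuming nested central differences are accurate enough for a smooth test function; `gradient`, `jacobian`, `hessian`, and `example_f` are hypothetical helper names introduced here for illustration.

```python
def _shifted(x, j, delta):
    """Return a copy of x with delta added to coordinate j."""
    p = list(x)
    p[j] += delta
    return p


def gradient(f, x, eps=1e-4):
    """Central-difference gradient of a scalar function f at x."""
    return [
        (f(_shifted(x, j, eps)) - f(_shifted(x, j, -eps))) / (2 * eps)
        for j in range(len(x))
    ]


def jacobian(F, x, eps=1e-4):
    """Central-difference Jacobian of a vector-valued function F at x."""
    m = len(F(x))
    jac = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        f_plus = F(_shifted(x, j, eps))
        f_minus = F(_shifted(x, j, -eps))
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def hessian(f, x, eps=1e-4):
    """Hessian computed as the Jacobian of the gradient: H_f = J(grad f)."""
    return jacobian(lambda p: gradient(f, p, eps), x, eps)


def example_f(x):
    """Hypothetical scalar function f(x1, x2) = x1^2 * x2 + x2^3."""
    x1, x2 = x
    return x1 ** 2 * x2 + x2 ** 3


# Analytic Hessian: [[2*x2, 2*x1], [2*x1, 6*x2]], i.e. [[4, 2], [2, 12]] at (1, 2)
H = hessian(example_f, [1.0, 2.0])
for row in H:
    print(row)
```

The two off-diagonal entries agree up to rounding error, which is what one expects for a function with continuous second-order partial derivatives.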
TIP

If $f$ is a vector-valued function ($f: \mathbb{R}^n \to \mathbb{R}^m$), its Hessian is an $m \times n \times n$ tensor: one $n \times n$ Hessian for each component $f_i$.
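One way to picture the $m \times n \times n$ tensor is to stack the Hessian of each component function. The sketch below does this with second-order finite differences on nested Python lists; `hessian_of_scalar`, `hessian_tensor`, and `example_F` are names assumed here for illustration only.

```python
def hessian_of_scalar(f, x, eps=1e-4):
    """Approximate the n x n Hessian of a scalar function with finite differences."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f with coordinate i moved by di and coordinate j by dj.
                p = list(x)
                p[i] += di
                p[j] += dj
                return f(p)

            hess[i][j] = (
                shifted(eps, eps) - shifted(eps, -eps)
                - shifted(-eps, eps) + shifted(-eps, -eps)
            ) / (4 * eps * eps)
    return hess


def hessian_tensor(F, x, eps=1e-4):
    """Stack the Hessians of the m components of F into an m x n x n nested list."""
    m = len(F(x))
    return [
        hessian_of_scalar(lambda p, i=i: F(p)[i], x, eps)
        for i in range(m)
    ]


def example_F(x):
    """Hypothetical F: R^2 -> R^3 used only for illustration."""
    x1, x2 = x
    return [x1 * x2, x1 ** 2, x2 ** 3]


T = hessian_tensor(example_F, [1.0, 2.0])
shape = (len(T), len(T[0]), len(T[0][0]))
print(shape)
```

For this example, slice `T[i]` approximates the Hessian of the component $f_i$, so the first index runs over the $m$ outputs and the remaining two over the $n$ inputs.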

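Properties of the Hessian Matrix

  • The Hessian is symmetric: $H_f^T = H_f$ (this holds whenever the second-order partial derivatives of $f$ are continuous; it can fail otherwise)
  • At a critical point of $f$ (where $\nabla f = 0$), the eigenvalues of the Hessian can be used to classify the point:
    • If all eigenvalues of $H_f$ are greater than 0 (positive definite), $f$ has a local minimum at that point
    • If all eigenvalues of $H_f$ are less than 0 (negative definite), $f$ has a local maximum at that point
    • If $H_f$ has both positive and negative eigenvalues (indefinite), the point is a saddle point
    • If $H_f$ has a zero eigenvalue and its nonzero eigenvalues all share the same sign, or all eigenvalues are zero (singular), further analysis is needed to classify the point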
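The eigenvalue rules above can be sketched in code. To stay within the standard library, this example is restricted to $2 \times 2$ symmetric Hessians, whose eigenvalues have a closed form; the function names and the tolerance `tol` are assumptions made for illustration, and the input is assumed to be the Hessian evaluated at a critical point.

```python
import math


def symmetric_2x2_eigenvalues(h):
    """Eigenvalues (smallest, largest) of a symmetric 2 x 2 matrix [[a, b], [b, d]]."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius


def classify_critical_point(h, tol=1e-9):
    """Classify a critical point from the Hessian h evaluated at that point."""
    smallest, largest = symmetric_2x2_eigenvalues(h)
    if smallest > tol:
        return "local minimum"      # all eigenvalues positive
    if largest < -tol:
        return "local maximum"      # all eigenvalues negative
    if smallest < -tol and largest > tol:
        return "saddle point"       # mixed signs
    return "inconclusive"           # some eigenvalue is (numerically) zero


# Hessians at the origin of a few simple functions:
print(classify_critical_point([[2.0, 0.0], [0.0, 2.0]]))    # x^2 + y^2
print(classify_critical_point([[-2.0, 0.0], [0.0, -2.0]]))  # -x^2 - y^2
print(classify_critical_point([[2.0, 0.0], [0.0, -2.0]]))   # x^2 - y^2
print(classify_critical_point([[0.0, 0.0], [0.0, 0.0]]))    # x^4 + y^4
```

The last case shows why the singular case needs further analysis: $x^4 + y^4$ does have a minimum at the origin, but its Hessian there is the zero matrix, so the eigenvalue test alone cannot decide.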
https://a1kari8.github.io/posts/ai_math/grad/
Author
A1kari8
Published on
2026-04-25
License
CC BY-NC-SA 4.0