
Detailed Derivation of the Gradient Descent Formula for Logistic Regression

2024-07-10 09:08 | Source: web compilation

The Gradient Descent Formula for Logistic Regression

The cost function of logistic regression is:

$$J(\theta)=-\frac{1}{m}\left[\sum_{i=1}^{m} y^{(i)} \log h_{\theta}\left(x^{(i)}\right)+\left(1-y^{(i)}\right) \log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\right]$$
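The cost above can be evaluated directly with NumPy. The sketch below is a minimal illustration (the function names `sigmoid` and `cost` are my own, not from the original article); it assumes `X` is an m×n design matrix, `y` holds labels in {0, 1}, and `h_theta(x) = sigmoid(theta^T x)`.

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Logistic-regression cost J(theta):
    -(1/m) * sum_i [ y_i * log h(x_i) + (1 - y_i) * log(1 - h(x_i)) ]."""
    m = len(y)
    h = sigmoid(X @ theta)  # h_theta(x^{(i)}) for every sample at once
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))
```

With `theta = 0`, every prediction is 0.5, so the cost reduces to `log 2` regardless of the labels, which is a handy sanity check.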

Its gradient descent update rule is:

$$\theta_{j}:=\theta_{j}-\frac{\alpha}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)}$$

(The factor $\frac{1}{m}$ comes from the cost function; some texts absorb it into the learning rate $\alpha$.)
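The update rule can be sketched as a vectorized batch loop: the per-component sum $\sum_i (h_\theta(x^{(i)}) - y^{(i)})\,x_j^{(i)}$ becomes `X.T @ (h - y)`. This is a minimal sketch, not the article's code; the helper name `gradient_descent` and the hyperparameter defaults are my own assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.5, iters=2000):
    """Batch gradient descent on the logistic cost.
    Each step applies theta_j -= (alpha/m) * sum_i (h(x_i) - y_i) * x_ij,
    vectorized over all j as theta -= (alpha/m) * X.T @ (h - y)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)               # predictions for all m samples
        theta -= (alpha / m) * (X.T @ (h - y))
    return theta
```

On a small linearly separable set (first column of `X` being the intercept term), the learned `theta` should classify every training point correctly.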

Detailed Derivation

Formula Recap

The sigmoid function and its derivative (a detailed derivation is linked in the references):

$$g(x)=\frac{1}{1+e^{-x}}$$

$$g^{\prime}(x)=g(x)(1-g(x))$$
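The identity $g'(x) = g(x)(1 - g(x))$ is easy to verify numerically by comparing it against a central finite difference. The sketch below is illustrative only; the helper names are my own.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Analytic derivative: g'(z) = g(z) * (1 - g(z))."""
    g = sigmoid(z)
    return g * (1.0 - g)

def numeric_grad(f, z, h=1e-6):
    """Central-difference approximation of f'(z)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)
```

At $z = 0$, $g(0) = 0.5$, so the derivative is exactly $0.25$, the sigmoid's maximum slope.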

The derivation proceeds as follows:

$$\begin{aligned}
\frac{\partial J(\theta)}{\partial \theta_{j}}
&=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)} \frac{1}{h_{\theta}\left(x^{(i)}\right)} \frac{\partial h_{\theta}\left(x^{(i)}\right)}{\partial \theta_{j}}-\left(1-y^{(i)}\right) \frac{1}{1-h_{\theta}\left(x^{(i)}\right)} \frac{\partial h_{\theta}\left(x^{(i)}\right)}{\partial \theta_{j}}\right) \\
&=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)} \frac{1}{g\left(\theta^{T} x^{(i)}\right)}-\left(1-y^{(i)}\right) \frac{1}{1-g\left(\theta^{T} x^{(i)}\right)}\right) \cdot \frac{\partial g\left(\theta^{T} x^{(i)}\right)}{\partial \theta_{j}} \\
&=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)} \frac{1}{g\left(\theta^{T} x^{(i)}\right)}-\left(1-y^{(i)}\right) \frac{1}{1-g\left(\theta^{T} x^{(i)}\right)}\right) \cdot g\left(\theta^{T} x^{(i)}\right)\left(1-g\left(\theta^{T} x^{(i)}\right)\right) x_{j}^{(i)} \\
&=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)}\left(1-g\left(\theta^{T} x^{(i)}\right)\right)-\left(1-y^{(i)}\right) g\left(\theta^{T} x^{(i)}\right)\right) \cdot x_{j}^{(i)} \\
&=-\frac{1}{m} \sum_{i=1}^{m}\left(y^{(i)}-g\left(\theta^{T} x^{(i)}\right)\right) \cdot x_{j}^{(i)} \\
&=\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) \cdot x_{j}^{(i)}
\end{aligned}$$
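The final expression $\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_i (h_\theta(x^{(i)}) - y^{(i)})\,x_j^{(i)}$ can be double-checked with a gradient check: compare the analytic gradient against central differences of the cost itself. This is a minimal sketch under the same conventions as above (intercept in the first column of `X`; the function names are my own).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """J(theta) as defined at the top of the article."""
    m = len(y)
    h = sigmoid(X @ theta)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def grad(theta, X, y):
    """Analytic gradient from the derivation: (1/m) * X.T @ (h - y)."""
    m = len(y)
    return (X.T @ (sigmoid(X @ theta) - y)) / m
```

If the derivation is correct, each component of `grad` should match the finite-difference slope of `cost` to within the truncation error of the central difference.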

References

Essential mathematical formulas reference: https://blog.csdn.net/zhaohongfei_358/article/details/106039576

Derivation of the sigmoid function's derivative: https://blog.csdn.net/zhaohongfei_358/article/details/119274445


