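Formula: Adadelta is an extension of Adagrad. Instead of accumulating all past squared gradients, it keeps an exponentially decaying average of them.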

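$$
\begin{aligned}
&g_t = \nabla J(\theta_{t-1}) \\
&G_t = \gamma G_{t-1} + (1-\gamma)\, g_t \odot g_t \\
&\nabla \theta_t = -\frac{\sqrt{\nabla_{t-1}+\epsilon}}{\sqrt{G_t + \epsilon}} \odot g_t \\
&\nabla_t = \gamma \nabla_{t-1} + (1-\gamma)\, \nabla\theta_t \odot \nabla\theta_t \\
&\theta_t = \theta_{t-1} + \nabla\theta_t
\end{aligned}
$$

Here $\odot$ is the element-wise product, and the square roots and the division are also element-wise. $G_t$ is the decaying average of squared gradients, $\nabla\theta_t$ is the parameter update, $\nabla_t$ is the decaying average of squared updates, $\gamma$ is the decay rate, and $\epsilon$ is a small constant for numerical stability.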
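As a rough illustration, the standard Adadelta step (the negative gradient scaled by the ratio of the running RMS of past updates to the running RMS of gradients) can be sketched in plain Python. The function names `adadelta_step` and `run_on_quadratic`, the values `gamma=0.9` and `eps=1e-6`, the zero initialization of both accumulators, and the toy objective $J(\theta)=\sum_i \theta_i^2$ are all assumptions made for this example, not part of the original text.

```python
import math


def adadelta_step(theta, grad, G, D, gamma=0.9, eps=1e-6):
    """Apply one Adadelta update to lists of floats.

    G holds the decaying average of squared gradients (G_t in the formula),
    D holds the decaying average of squared updates (the accumulated
    squared-update term in the formula). Returns new (theta, G, D).
    """
    new_theta, new_G, new_D = [], [], []
    for th, g, G_prev, D_prev in zip(theta, grad, G, D):
        # Decaying average of squared gradients.
        G_t = gamma * G_prev + (1.0 - gamma) * g * g
        # Update: negative gradient scaled by RMS(updates) / RMS(gradients).
        step = -math.sqrt(D_prev + eps) / math.sqrt(G_t + eps) * g
        # Decaying average of squared updates.
        D_t = gamma * D_prev + (1.0 - gamma) * step * step
        new_theta.append(th + step)
        new_G.append(G_t)
        new_D.append(D_t)
    return new_theta, new_G, new_D


def run_on_quadratic(theta0, steps):
    """Run Adadelta on the toy objective J(theta) = sum(theta_i ** 2)."""
    theta = list(theta0)
    G = [0.0] * len(theta)  # assumed zero initialization
    D = [0.0] * len(theta)  # assumed zero initialization
    for _ in range(steps):
        grad = [2.0 * th for th in theta]  # gradient of sum(theta_i ** 2)
        theta, G, D = adadelta_step(theta, grad, G, D)
    return theta


if __name__ == "__main__":
    print(run_on_quadratic([1.0, -2.0], steps=100))
```

Note that no learning rate appears anywhere: the step size comes from the ratio of the two running averages. With zero-initialized accumulators the first steps are very small (on the order of $\sqrt{\epsilon}$), so this sketch moves slowly at the start.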
