Gated Recurrent Units (GRUs) are a variant of recurrent neural networks that address some of the challenges of vanilla RNN implementations, such as vanishing and exploding gradients. The issue with vanilla RNNs is that the same set of weights is applied at every timestep, so as the error signal is sent back through the network during backpropagation, the gradient is repeatedly multiplied by the same factors and can quickly shrink to zero. In a similar vein, gradients can become exponentially large, leading to exploding gradients. The problem of exploding gradients, however, can easily be handled by gradient clipping, whereby if the value of a gradient is larger than a predetermined threshold, its value is set to that threshold.
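
As a minimal sketch of element-wise gradient clipping (framework-agnostic; the NumPy implementation and the threshold value are illustrative choices, not taken from the original post):

```python
import numpy as np

def clip_gradients(grads, threshold=5.0):
    """Element-wise clipping: any gradient component whose magnitude
    exceeds `threshold` is set to +threshold or -threshold."""
    return [np.clip(g, -threshold, threshold) for g in grads]

# Example: gradients that have blown up during backpropagation
grads = [np.array([0.3, -120.0, 7.2]),
         np.array([[55.0, -0.1],
                   [2.0, -9.9]])]
clipped = clip_gradients(grads, threshold=5.0)
print(clipped[0])  # components beyond +/-5 are now exactly +/-5
```

In practice, deep learning libraries also commonly clip by the global norm of all gradients rather than element-wise, which preserves the direction of the update.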

GRUs tackle the problem of vanishing gradients by introducing a gate that determines whether the internal state should be remembered or forgotten. The equation for a gated recurrent unit that determines the internal state h_t is presented below, where ⊙ denotes element-wise multiplication, σ is the sigmoid function, x_t is the input at timestep t, and h_{t-1} is the previous state:

h_t = z_t ⊙ h_{t-1} + (1 − z_t) ⊙ tanh(W_h x_t + U_h (r_t ⊙ h_{t-1}))

Where z_t represents the update gate and is given as:

z_t = σ(W_z x_t + U_z h_{t-1})

While r_t represents the reset gate and is represented mathematically as:

r_t = σ(W_r x_t + U_r h_{t-1})

Notice how the value of the update gate z_t determines which portions of the equation contribute to the value of h_t. A value of 0 for z_t would knock out the first term of the equation for h_t, while a value of 1 would erase the second term. This is analogous to how humans remember information they deem important while throwing away irrelevant pieces of information.
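
To make the gating concrete, below is a minimal sketch of a single GRU step in NumPy, following the equations above; the dimensions, random initialization, and omission of bias terms are simplifying assumptions for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU timestep following the equations above (biases omitted)."""
    # Update gate: how much of the previous state to keep
    z_t = sigmoid(Wz @ x_t + Uz @ h_prev)
    # Reset gate: how much of the previous state feeds the candidate
    r_t = sigmoid(Wr @ x_t + Ur @ h_prev)
    # Candidate state built from the current input and the reset-scaled past
    h_cand = np.tanh(Wh @ x_t + Uh @ (r_t * h_prev))
    # z_t interpolates between remembering h_prev and adopting h_cand
    return z_t * h_prev + (1.0 - z_t) * h_cand

# Toy example: 3-dimensional input, 4-dimensional hidden state
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
Wz, Wr, Wh = (rng.normal(scale=0.1, size=(n_hid, n_in)) for _ in range(3))
Uz, Ur, Uh = (rng.normal(scale=0.1, size=(n_hid, n_hid)) for _ in range(3))
x_t = rng.normal(size=n_in)
h_prev = np.zeros(n_hid)
h_t = gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh)
print(h_t.shape)  # (4,)
```

Because z_t can stay close to 1 across many timesteps, the previous state is carried through nearly unchanged, which is what lets gradients flow over long spans without vanishing as quickly as in a vanilla RNN.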
