[PYTHON] "Deep Learning from scratch" Self-study memo (No. 10-2) Initial value of weight

While reading "Deep Learning from scratch" (written by Koki Saito, published by O'Reilly Japan), I am making notes of the sites I referred to.

Part 10 ← → Part 11

I slightly modified ch06/optimizer_compare_mnist.py, the source code the book uses to compare update methods on the MNIST dataset, and tried several ways of setting the initial values of the weights.

# coding: utf-8
import os
import sys
sys.path.append(os.pardir)  #Settings for importing files in the parent directory
import numpy as np  # needed for np.random.choice and np.argmax below
import matplotlib.pyplot as plt
from dataset.mnist import load_mnist
from common.util import smooth_curve
from common.multi_layer_net import MultiLayerNet
from common.optimizer import *

# 0:Read MNIST data==========
(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True)

train_size = x_train.shape[0]
batch_size = 128
max_iterations = 2000

# 1:Experiment settings==========
optimizers = {}
optimizers['SGD'] = SGD()
optimizers['Momentum'] = Momentum()
optimizers['AdaGrad'] = AdaGrad()
optimizers['Adam'] = Adam()
#optimizers['RMSprop'] = RMSprop()

networks = {}
train_loss = {}
for key in optimizers.keys():
    networks[key] = MultiLayerNet(
        input_size=784, hidden_size_list=[100, 100, 100, 100],
        output_size=10,
        activation='relu', weight_init_std='relu',  # changed for each experiment below
        weight_decay_lambda=0)
    train_loss[key] = []

# 2:Start of training==========
for i in range(max_iterations):
    batch_mask = np.random.choice(train_size, batch_size)
    x_batch = x_train[batch_mask]
    t_batch = t_train[batch_mask]
    
    for key in optimizers.keys():
        grads = networks[key].gradient(x_batch, t_batch)
        optimizers[key].update(networks[key].params, grads)
    
        loss = networks[key].loss(x_batch, t_batch)
        train_loss[key].append(loss)

#Evaluation with test data
x = x_test
t = t_test

for key in optimizers.keys():
    network = networks[key]

    y = network.predict(x)

    accuracy_cnt = 0
    for i in range(len(x)):
        p = np.argmax(y[i])  # index of the highest score = predicted digit
        if p == t[i]:
            accuracy_cnt += 1

    print(key + " Accuracy:" + str(float(accuracy_cnt) / len(x)))
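
The counting loop above can also be written as a single NumPy expression. This is only a sketch: the helper name accuracy and the small demo arrays are made up for illustration, and it assumes the labels are integer class indices (not one-hot), as in the script above.

```python
import numpy as np

def accuracy(y, t):
    """Fraction of rows in y whose arg-max equals the integer label in t.

    y: (N, C) array of scores, t: (N,) array of integer labels.
    """
    return float(np.mean(np.argmax(y, axis=1) == t))

# Tiny made-up example: 3 samples, 3 classes
y_demo = np.array([[0.1, 0.8, 0.1],
                   [0.7, 0.2, 0.1],
                   [0.2, 0.3, 0.5]])
t_demo = np.array([1, 0, 1])

# Predictions are [1, 0, 2], so two of the three labels match
print(accuracy(y_demo, t_demo))
```

With the real network, this would be called as accuracy(network.predict(x_test), t_test).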

Specify 'relu' as the activation function and use the "He initial value".

activation='relu', weight_init_std='he',

Result of processing test data

SGD Accuracy:0.9325
Momentum Accuracy:0.966
AdaGrad Accuracy:0.9707
Adam Accuracy:0.972
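
As a reminder of what the He initial value means, here is a minimal sketch. It assumes the usual definition, a normal distribution with standard deviation sqrt(2/n) where n is the number of nodes in the previous layer; the helper he_std is a name I made up, and the layer sizes are the ones from the script above.

```python
import numpy as np

def he_std(n_in):
    """Standard deviation of the He initial value for a layer with n_in inputs."""
    return np.sqrt(2.0 / n_in)

# Layer sizes of the network in this memo: 784 -> 100 x 4 -> 10
layer_sizes = [784, 100, 100, 100, 100, 10]

rng = np.random.default_rng(0)
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    # Draw weights from N(0, he_std(n_in)^2) and compare with the target
    W = he_std(n_in) * rng.standard_normal((n_in, n_out))
    print(f"{n_in:4d} -> {n_out:3d}: target std {he_std(n_in):.4f}, "
          f"sample std {W.std():.4f}")
```

For the 100-node hidden layers the standard deviation comes out around 0.14, more than ten times the fixed value 0.01 tried later in this memo.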

Specify 'sigmoid' as the activation function and use the "Xavier initial value".

activation='sigmoid', weight_init_std='xavier',

Result of processing test data

SGD Accuracy:0.1135
Momentum Accuracy:0.1028
AdaGrad Accuracy:0.9326
Adam Accuracy:0.9558

SGD and Momentum had poor recognition rates, so I increased the number of mini-batch iterations to 10000.

SGD Accuracy:0.1135
Momentum Accuracy:0.9262
AdaGrad Accuracy:0.9617
Adam Accuracy:0.9673

Momentum's recognition rate rose substantially (from 0.1028 to 0.9262), but SGD stayed at 0.1135 and did not improve at all.
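
To get a feel for what the Xavier initial value does with sigmoid, the following sketch pushes random data through five sigmoid layers and prints the spread of the activations. It assumes the simplified form with standard deviation sqrt(1/n), where n is the number of nodes in the previous layer; the input data, the layer width of 100 and the helper names are my own choices for illustration, not taken from the MNIST script.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def xavier_std(n_in):
    """Standard deviation of the (simplified) Xavier initial value."""
    return np.sqrt(1.0 / n_in)

rng = np.random.default_rng(0)
a = rng.standard_normal((1000, 100))  # 1000 random samples with 100 features

layer_stds = []
for layer in range(1, 6):
    W = xavier_std(100) * rng.standard_normal((100, 100))
    a = sigmoid(a @ W)  # biases are left at zero
    layer_stds.append(float(a.std()))
    print(f"layer {layer}: mean {a.mean():.3f}, std {a.std():.3f}")
```

If the activations keep a reasonable spread in every layer, the initial values are not the obvious culprit, which fits the observation that AdaGrad and Adam learn fine with this setting. Why plain SGD stays stuck here is not something this sketch explains.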

Specify 'relu' as the activation function and use a normal distribution with a standard deviation of 0.01 as the initial value.

activation='relu', weight_init_std=0.01,

Result of processing test data

SGD Accuracy:0.1135
Momentum Accuracy:0.1135
AdaGrad Accuracy:0.9631
Adam Accuracy:0.9713

SGD and Momentum do not seem to be learning at all: both stay at 0.1135, roughly chance level for a 10-class problem.
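
One plausible reading of this result is that with a standard deviation of 0.01 the ReLU activations shrink layer by layer, so the gradients become tiny. The sketch below compares the spread of the activations for std=0.01 and for the He initial value. It is only an illustration under my own assumptions: the input is uniform random data standing in for normalized pixels, biases are zero, and relu_activation_stds is a made-up helper, not part of the book's code.

```python
import numpy as np

def relu_activation_stds(scale_fn, layer_sizes, n_samples=1000, seed=0):
    """Push random inputs through ReLU layers; return the std of each layer's output."""
    rng = np.random.default_rng(seed)
    a = rng.random((n_samples, layer_sizes[0]))  # stand-in for pixels in [0, 1)
    stds = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = scale_fn(n_in) * rng.standard_normal((n_in, n_out))
        a = np.maximum(0.0, a @ W)  # ReLU, biases left at zero
        stds.append(float(a.std()))
    return stds

sizes = [784, 100, 100, 100, 100]  # input plus the four hidden layers
small = relu_activation_stds(lambda n: 0.01, sizes)
he = relu_activation_stds(lambda n: np.sqrt(2.0 / n), sizes)

for i, (s, h) in enumerate(zip(small, he), start=1):
    print(f"hidden layer {i}: std=0.01 -> {s:.6f}   He -> {h:.4f}")
```

In this sketch the std=0.01 activations get smaller with every layer while the He activations keep roughly the same scale. AdaGrad and Adam rescale each update by the gradient history, which may be why they still learn with this setting while SGD and Momentum do not, but I have not verified that here.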
