
VGG


What is VGG-Net?

It is a classic deep Convolutional Neural Network (CNN) design with numerous layers, and the abbreviation VGG stands for Visual Geometry Group. The term "deep" describes the number of layers, with VGG-16 and VGG-19 having 16 and 19 weight layers (convolutional and fully connected), respectively.

The VGG architecture serves as the basis for ground-breaking object-recognition models. Developed as a deep neural network, VGGNet also surpasses baselines on a variety of tasks and datasets beyond ImageNet, and it remains one of the most commonly used image-recognition architectures today.

Figure: AlexNet (above) versus VGGNet (below) architectures.

VGG-16

VGG16, also known as VGGNet, is the convolutional neural network model in this family with 16 weight layers. It was developed by A. Zisserman and K. Simonyan from the University of Oxford and introduced in their research paper "Very Deep Convolutional Networks for Large-Scale Image Recognition".

On ImageNet, the VGG16 model achieves a top-5 test accuracy of about 92.7 per cent. ImageNet is a dataset of over 14 million images; the ILSVRC subset used for this benchmark covers 1,000 object classes. VGG16 was also one of the best-performing models submitted to ILSVRC-2014. It improves significantly on AlexNet by replacing the large kernel-sized filters (11×11 and 5×5 in AlexNet's first two convolutional layers) with stacks of 3×3 filters. The VGG16 model was trained over several weeks on Nvidia Titan Black GPUs.
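To see why this substitution helps, here is a quick back-of-the-envelope comparison in the spirit of the original paper's argument (the paper compares a single 7×7 layer with a stack of three 3×3 layers; the channel count C = 512 below is just an illustrative choice):

```python
# Weight count for one large-kernel conv layer versus a stack of three 3x3 conv layers,
# assuming C input channels and C output channels (biases ignored for simplicity).
C = 512
single_7x7 = 7 * 7 * C * C          # one 7x7 convolutional layer
stacked_3x3 = 3 * (3 * 3 * C * C)   # three stacked 3x3 layers (same 7x7 receptive field)
print(f"7x7: {single_7x7 / 1e6:.1f}M weights, three 3x3: {stacked_3x3 / 1e6:.1f}M weights")
# 7x7: 12.8M weights, three 3x3: 7.1M weights
```

The stack also interleaves extra ReLU non-linearities, which makes the decision function more discriminative.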

As discussed above, VGGNet-16 has 16 weight layers and can classify images into 1,000 different object categories, including keyboard, animals, pencil, mouse, and so on. The model accepts input images with a resolution of 224 by 224 pixels.

Figure: The VGG-16 architecture.

VGG-19

The VGG19 model (also known as VGGNet-19) follows the same basic design as VGG16, except that it has 19 layers. The numbers "16" and "19" refer to the number of weight layers in the model (convolutional and fully connected layers); VGG19 contains three more convolutional layers than VGG16. In the final section of this article, we go into greater detail on the characteristics of the VGG16 and VGG19 networks.

VGG-Net Architecture

The VGG network is built from very small convolutional filters. VGG-16 consists of 13 convolutional layers and three fully connected layers.

Figure: The VGG-Net architecture.

Let’s quickly examine VGG’s architecture:

- Inputs: VGGNet accepts 224×224-pixel images as input. To maintain a consistent input size for the ImageNet competition, the model's developers cropped the central 224×224 patch from each image.
- Convolutional layers: VGG's convolutional layers use the smallest feasible receptive field, 3×3, that still captures left-to-right and top-to-bottom context. Some configurations additionally use 1×1 convolution filters, which act as a linear transformation of the input channels. Each convolution is followed by a ReLU unit, a technique popularised by AlexNet that shortens training time; the rectified linear unit (ReLU) is a piecewise linear function that outputs the input if it is positive and zero otherwise. The convolution stride is fixed at 1 pixel so that spatial resolution is preserved after convolution (the stride is the number of pixels the filter shifts over the input matrix). A minimal sketch of such a block follows this list.
- Hidden layers: all of the VGG network's hidden layers use ReLU. Local Response Normalization (LRN) is generally not used with VGG, as it increases memory usage and training time without improving overall accuracy.
- Fully connected layers: VGGNet contains three fully connected layers. The first two have 4,096 channels each, while the third has 1,000 channels, one for each class.
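The bullet points above translate directly into layer arguments. Below is a minimal sketch of one VGG-style block, assuming TensorFlow's Keras API (the filter count of 64 matches the first block):

```python
# One VGG-style block: 3x3 kernels, stride 1, "same" padding (spatial resolution preserved),
# ReLU activations, followed by a 2x2 max pool that halves the height and width.
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 224, 224, 3))   # a single fake RGB image
x = layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu")(x)
x = layers.Conv2D(64, (3, 3), strides=1, padding="same", activation="relu")(x)
print(x.shape)                            # (1, 224, 224, 64): resolution unchanged
x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)
print(x.shape)                            # (1, 112, 112, 64): height and width halved
```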

Understanding VGG-16

The number 16 in the name VGG-16 (VGGNet-16) refers to the deep neural network's 16 weight layers. This makes VGG16 quite a large network, with roughly 138 million parameters in total, sizeable even by today's standards. What makes the VGGNet16 architecture appealing, however, is its simplicity: the network is strikingly uniform.

A pooling layer follows every few convolutional layers and reduces the height and width of the feature maps. The number of filters starts at 64, doubles to 128, doubles again to 256, and reaches 512 in the final convolutional blocks.

Figure: The VGG-16 architecture.

VGG Configuration, Training, and Results

The VGG network has five configurations, named A to E. The depth of the configurations increases from left (A) to right (E), with more layers added at each step. The original paper summarises all of the network configurations in a table.

All configurations adhere to the same design and differ only in depth: for example, network A has 11 weight layers (8 convolutional and 3 fully connected layers), whereas network E has 19 weight layers (16 convolutional and 3 fully connected layers). The convolutional layers use a relatively small number of channels, starting at 64 in the first layer and doubling after each max-pooling layer up to a maximum of 512. The paper also reports the total number of parameters (in millions) for each configuration:
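Since the summary images from the original post are not reproduced here, the sketch below recomputes those parameter counts analytically. The compact cfgs notation (numbers are 3×3 conv output channels, "M" marks a 2×2 max pool) is borrowed from common VGG implementations rather than the paper itself; configuration C, which adds 1×1 convolutions, is omitted for brevity:

```python
# VGG configurations A, B, D and E expressed as lists of 3x3 conv widths and "M" (max pool).
cfgs = {
    "A": [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],           # VGG-11
    "B": [64, 64, "M", 128, 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"],  # VGG-13
    "D": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
          512, 512, 512, "M", 512, 512, 512, "M"],                                   # VGG-16
    "E": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
          512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],                          # VGG-19
}

def count_parameters(cfg, in_channels=3, num_classes=1000):
    """Count weights and biases for the 3x3 conv stack plus the three fully connected layers."""
    total, channels, spatial = 0, in_channels, 224
    for v in cfg:
        if v == "M":
            spatial //= 2                        # each 2x2 max pool halves height and width
        else:
            total += (3 * 3 * channels + 1) * v  # 3x3 kernel weights plus one bias per filter
            channels = v
    flat = channels * spatial * spatial          # 512 * 7 * 7 = 25088 features after five pools
    for out in (4096, 4096, num_classes):        # FC-4096, FC-4096, FC-1000
        total += (flat + 1) * out
        flat = out
    return total

for name, cfg in cfgs.items():
    print(f"Configuration {name}: {count_parameters(cfg) / 1e6:.0f}M parameters")
# Configuration A: 133M, B: 133M, D: 138M, E: 144M
```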

Figure: Summary table of VGG-16.

Limitations of VGG-16

- It is very slow to train (the original VGG model was trained on Nvidia Titan GPUs for two to three weeks).
- The VGG-16 weights trained on ImageNet are about 528 MB in size, so the model consumes a lot of disk space and bandwidth, which makes it inefficient to ship and deploy.
- Its 138 million parameters can lead to exploding-gradient problems during training.

Implementation of VGGNet-16

Importing Libraries, Defining the Input Image Shape, and Building the VGG-16 Model
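The original post's code listing is not reproduced here; the following is a minimal sketch of that step using the Keras Sequential API (the build_vgg16 helper and the layer names are illustrative choices, not from the original article):

```python
# Build VGG-16 layer by layer: five conv blocks (3x3 kernels, "same" padding, ReLU,
# 2x2 max pooling) followed by two FC-4096 layers and a 1000-way softmax classifier.
from tensorflow import keras
from tensorflow.keras import layers

INPUT_SHAPE = (224, 224, 3)  # 224x224 RGB input, as described above

def build_vgg16(num_classes=1000):
    model = keras.Sequential(name="vgg16_from_scratch")
    model.add(keras.Input(shape=INPUT_SHAPE))
    # Filter counts double across blocks: 64 -> 128 -> 256, then stay at 512.
    for block, (filters, n_convs) in enumerate(
        [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)], start=1
    ):
        for i in range(n_convs):
            model.add(layers.Conv2D(filters, (3, 3), padding="same", activation="relu",
                                    name=f"block{block}_conv{i + 1}"))
        model.add(layers.MaxPooling2D((2, 2), strides=(2, 2), name=f"block{block}_pool"))
    model.add(layers.Flatten(name="flatten"))
    model.add(layers.Dense(4096, activation="relu", name="fc1"))
    model.add(layers.Dense(4096, activation="relu", name="fc2"))
    model.add(layers.Dense(num_classes, activation="softmax", name="predictions"))
    return model

model = build_vgg16()
model.summary()  # should report roughly 138 million trainable parameters
```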

Working with a Pretrained Model

The Keras library also provides pre-trained VGG models, so one can load the saved model weights and use them for different purposes: transfer learning, image feature extraction, and object detection. We can load the model architecture provided by the library and then load the pre-trained weights into the corresponding layers. Before using the pre-trained model, we write a few helper functions for making predictions: first load some images, then pre-process them. Using pretrained weights saves a great deal of training time.
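A short sketch of that workflow, assuming a recent TensorFlow/Keras installation; the filename "elephant.jpg" is only a placeholder for whatever test image you have on disk:

```python
# Load VGG16 with pre-trained ImageNet weights and classify a single image.
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from tensorflow.keras.utils import load_img, img_to_array

model = VGG16(weights="imagenet")            # downloads roughly 528 MB of weights on first use

img = load_img("elephant.jpg", target_size=(224, 224))  # resize to the expected 224x224 input
x = img_to_array(img)
x = np.expand_dims(x, axis=0)                # add a batch dimension: (1, 224, 224, 3)
x = preprocess_input(x)                      # channel reordering and mean subtraction used by VGG

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])   # [(class_id, class_name, probability), ...]
```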

Code credit: "VGGNet-16 Architecture: A Complete Guide" by Paras Varshney.

And with that, we have covered the VGG-Net architecture. Anyone who would like to understand this architecture in more depth can read the published paper (the link has been added in the introduction part of this blog). Keep learning.


