Matlab – Deep Learning Toolbox – Getting Started Guide

Authors: Mark Hudson Beale, Martin T. Hagan, Howard B. Demuth
8 October 2021

Contents
Acknowledgments
Getting Started
  Deep Learning Toolbox Product Description
  Get Started with Deep Network Designer
  Try Deep Learning in 10 Lines of MATLAB Code
  Classify Image Using Pretrained Network
  Get Started with Transfer Learning
  Create Simple Image Classification Network
  Create Simple Sequence Classification Network Using Deep Network Designer
  Shallow Networks for Pattern Recognition, Clustering and Time Series
    Shallow Network Apps and Functions in Deep Learning Toolbox
    Deep Learning Toolbox Applications
    Shallow Neural Network Design Steps
  Fit Data with a Shallow Neural Network
    Defining a Problem
    Using the Neural Network Fitting App
    Using Command-Line Functions
  Classify Patterns with a Shallow Neural Network
    Defining a Problem
    Using the Neural Network Pattern Recognition App
    Using Command-Line Functions
  Cluster Data with a Self-Organizing Map
    Defining a Problem
    Using the Neural Network Clustering App
    Using Command-Line Functions
  Shallow Neural Network Time-Series Prediction and Modeling
    Defining a Problem
    Using the Neural Network Time Series App
    Using Command-Line Functions
  Train Shallow Networks on CPUs and GPUs
    Parallel Computing Toolbox
    Parallel CPU Workers
    GPU Computing
    Multiple GPU/CPU Computing
    Cluster Computing with MATLAB Parallel Server
    Load Balancing, Large Problems, and Beyond
  Sample Data Sets for Shallow Neural Networks
Shallow Neural Networks Glossary

Shallow Neural Networks Glossary
ADALINE Acronym for a linear neuron: ADAptive LINear Element.
adaption Training method that proceeds through the specified sequence of
inputs, calculating the output, error, and network adjustment for each
input vector in the sequence as the inputs are presented.
adaptive filter Network that contains delays and whose weights are adjusted after
each new input vector is presented. The network adapts to changes in
the input signal properties if they occur. This kind of filter is used in
long distance telephone lines to cancel echoes.
adaptive learning rate Learning rate that is adjusted according to an algorithm during
training to minimize training time.
architecture Description of the number of the layers in a neural network, each
layer’s transfer function, the number of neurons per layer, and the
connections between layers.
backpropagation learning rule Learning rule in which weights and biases are adjusted by
error-derivative (delta) vectors backpropagated through the network.
Backpropagation is commonly applied to feedforward multilayer
networks. Sometimes this rule is called the generalized delta rule.
backtracking search Line search routine that begins with a step multiplier of 1 and then
backtracks until an acceptable reduction in performance is obtained.
batch Matrix of input (or target) vectors applied to the network
simultaneously. Changes to the network weights and biases are made
just once for the entire set of vectors in the input matrix. (The term
batch is being replaced by the more descriptive expression
“concurrent vectors.”)
batching Process of presenting a set of input vectors for simultaneous
calculation of a matrix of output vectors and/or new weights and
biases.
Bayesian framework Assumes that the weights and biases of the network are random
variables with specified distributions.
BFGS quasi-Newton algorithm Variation of Newton’s optimization algorithm, in which an
approximation of the Hessian matrix is obtained from gradients
computed at each iteration of the algorithm.
bias Neuron parameter that is summed with the neuron’s weighted inputs
and passed through the neuron’s transfer function to generate the
neuron’s output.
bias vector Column vector of bias values for a layer of neurons.
Brent’s search Line search that is a hybrid of the golden section search and a
quadratic interpolation.
cascade-forward network Layered network in which each layer only receives inputs from
previous layers.
Charalambous’ search Hybrid line search that uses a cubic interpolation together with a type
of sectioning.
classification Association of an input vector with a particular target vector.
competitive layer Layer of neurons in which only the neuron with maximum net input
has an output of 1 and all other neurons have an output of 0. Neurons
compete with each other for the right to respond to a given input
vector.
competitive learning Unsupervised training of a competitive layer with the instar rule or
Kohonen rule. Individual neurons learn to become feature detectors.
After training, the layer categorizes input vectors among its neurons.
competitive transfer function Accepts a net input vector for a layer and returns neuron outputs of 0
for all neurons except for the winner, the neuron associated with the
most positive element of the net input n.
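The winner-take-all behavior described above can be sketched in plain Python. This is an illustration rather than the toolbox implementation, and breaking ties in favor of the lowest index is an assumption.

```python
def compet(net_input):
    """Return 1 for the neuron with the largest net input and 0 for all others."""
    # max() returns the first index on ties (an assumed tie-breaking rule).
    winner = max(range(len(net_input)), key=lambda i: net_input[i])
    return [1 if i == winner else 0 for i in range(len(net_input))]


print(compet([0.2, -1.0, 0.7, 0.1]))  # [0, 0, 1, 0]
```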
concurrent input vectors Name given to a matrix of input vectors that are to be presented to a
network simultaneously. All the vectors in the matrix are used in
making just one set of changes in the weights and biases.
conjugate gradient algorithm In the conjugate gradient algorithms, a search is performed along
conjugate directions, which produces generally faster convergence
than a search along the steepest descent directions.
connection One-way link between neurons in a network.
connection strength Strength of a link between two neurons in a network. The strength,
often called weight, determines the effect that one neuron has on
another.
cycle Single presentation of an input vector, calculation of output, and new
weights and biases.
dead neuron Competitive layer neuron that never won any competition during
training and so has not become a useful feature detector. Dead
neurons do not respond to any of the training vectors.
decision boundary Line, determined by the weight and bias vectors, for which the net
input n is zero.
delta rule See Widrow-Hoff learning rule.
delta vector The delta vector for a layer is the derivative of a network’s output
error with respect to that layer’s net input vector.
distance Distance between neurons, calculated from their positions with a
distance function.
distance function Particular way of calculating distance, such as the Euclidean distance
between two vectors.
early stopping Technique based on dividing the data into three subsets. The first
subset is the training set, used for computing the gradient and
updating the network weights and biases. The second subset is the
validation set. When the validation error increases for a specified
number of iterations, the training is stopped, and the weights and
biases at the minimum of the validation error are returned. The third
subset is the test set. It is used to verify the network design.
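A minimal sketch of the stopping decision, assuming the validation error is recorded once per epoch. The names `early_stopping` and `max_fail` are hypothetical, and counting epochs that fail to improve on the best validation error seen so far is one common reading of the rule, not necessarily the toolbox's exact criterion.

```python
def early_stopping(validation_errors, max_fail=6):
    """Return (best_epoch, stop_epoch) for a list of per-epoch validation errors.

    Training stops once the validation error has failed to improve on its
    best value for max_fail consecutive epochs. The weights saved at
    best_epoch would be the ones returned to the user.
    """
    best_error = float("inf")
    best_epoch = None
    fails = 0
    for epoch, err in enumerate(validation_errors):
        if err < best_error:
            best_error, best_epoch, fails = err, epoch, 0
        else:
            fails += 1
            if fails >= max_fail:
                return best_epoch, epoch
    # The error sequence ended before the stopping rule triggered.
    return best_epoch, len(validation_errors) - 1


print(early_stopping([1.0, 0.8, 0.6, 0.7, 0.65, 0.9, 0.5], max_fail=3))  # (2, 5)
```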
epoch Presentation of the set of training (input and/or target) vectors to a
network and the calculation of new weights and biases. Note that
training vectors can be presented one at a time or all together in a
batch.
error jumping Sudden increase in a network’s sum-squared error during training.
This is often due to too large a learning rate.
error ratio Training parameter used with adaptive learning rate and momentum
training of backpropagation networks.
error vector Difference between a network’s output vector in response to an input
vector and an associated target output vector.
feedback network Network with connections from a layer’s output to that layer’s input.
The feedback connection can be direct or pass through several layers.
feedforward network Layered network in which each layer only receives inputs from
previous layers.
Fletcher-Reeves update Method for computing a set of conjugate directions. These directions
are used as search directions as part of a conjugate gradient
optimization procedure.
function approximation Task performed by a network trained to respond to inputs with an
approximation of a desired function.
generalization Attribute of a network whose output for a new input vector tends to be
close to outputs for similar input vectors in its training set.
generalized regression network Approximates a continuous function to an arbitrary accuracy, given a
sufficient number of hidden neurons.
global minimum Lowest value of a function over the entire range of its input
parameters. Gradient descent methods adjust weights and biases in
order to find the global minimum of error for a network.
golden section search Line search that does not require the calculation of the slope. The
interval containing the minimum of the performance is subdivided at
each iteration of the search, and one subdivision is eliminated at each
iteration.
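The interval-shrinking idea can be sketched as follows, assuming the function has a single minimum on the starting interval. `golden_section_search` and `tol` are illustrative names, not toolbox identifiers.

```python
import math


def golden_section_search(f, a, b, tol=1e-6):
    """Locate the minimum of a unimodal function f on [a, b] without derivatives."""
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]: drop (d, b] and reuse c as the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]: drop [a, c) and reuse d as the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


x_min = golden_section_search(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

Each iteration needs only one new function evaluation, because one of the two interior points is carried over.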
gradient descent Process of making changes to weights and biases, where the changes
are proportional to the derivatives of network error with respect to
those weights and biases. This is done to minimize network error.
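As an illustration, the sketch below applies gradient descent to a simple quadratic error rather than to a real network. The function names and the example error surface are assumptions made for the demonstration.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly move the parameters against the gradient of the error."""
    w = list(w0)
    for _ in range(steps):
        g = grad(w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w


def grad_quadratic(w):
    # Gradient of E(w) = (w[0] - 3)^2 + (w[1] + 1)^2, which is minimized at (3, -1).
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]


w_min = gradient_descent(grad_quadratic, [0.0, 0.0])
```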
hard-limit transfer function Transfer function that maps inputs greater than or equal to 0 to 1, and
all other values to 0.
Hebb learning rule Historically the first proposed learning rule for neurons. Weights are
adjusted proportional to the product of the outputs of pre- and
postweight neurons.
hidden layer Layer of a network that is not connected to the network output (for
instance, the first layer of a two-layer feedforward network).
home neuron Neuron at the center of a neighborhood.
hybrid bisection-cubic search Line search that combines bisection and cubic interpolation.
initialization Process of setting the network weights and biases to their original
values.
input layer Layer of neurons receiving inputs directly from outside the network.
input space Range of all possible input vectors.
input vector Vector presented to the network.
input weight vector Row vector of weights going to a neuron.
input weights Weights connecting network inputs to layers.
Jacobian matrix Contains the first derivatives of the network errors with respect to the
weights and biases.
Kohonen learning rule Learning rule that trains a selected neuron’s weight vectors to take on
the values of the current input vector.
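A minimal sketch of one Kohonen update, assuming the usual form in which the weight vector moves a fraction of the way toward the input. `kohonen_update` and `learning_rate` are illustrative names.

```python
def kohonen_update(weights, p, learning_rate=0.5):
    """Move the selected neuron's weight vector toward the input vector p."""
    return [w + learning_rate * (x - w) for w, x in zip(weights, p)]


# Halfway between the old weights [0, 1] and the input [1, 0].
print(kohonen_update([0.0, 1.0], [1.0, 0.0], learning_rate=0.5))  # [0.5, 0.5]
```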
layer Group of neurons having connections to the same inputs and sending
outputs to the same destinations.
layer diagram Network architecture figure showing the layers and the weight
matrices connecting them. Each layer’s transfer function is indicated
with a symbol. Sizes of input, output, bias, and weight matrices are
shown. Individual neurons and connections are not shown.
layer weights Weights connecting layers to other layers. Such weights need to have
nonzero delays if they form a recurrent connection (i.e., a loop).
learning Process by which weights and biases are adjusted to achieve some
desired network behavior.
learning rate Training parameter that controls the size of weight and bias changes
during learning.
learning rule Method of deriving the next changes that might be made in a network
or a procedure for modifying the weights and biases of a network.
Levenberg-Marquardt Algorithm that trains a neural network 10 to 100 times faster than the
usual gradient descent backpropagation method. It always computes
the approximate Hessian matrix, which has dimensions n-by-n.
line search function Procedure for searching along a given search direction (line) to locate
the minimum of the network performance.
linear transfer function Transfer function that produces its input as its output.
link distance Number of links, or steps, that must be taken to get to the neuron
under consideration.
local minimum Minimum of a function over a limited range of input values. A local
minimum might not be the global minimum.
log-sigmoid transfer function Squashing function of the form shown below that maps the input to
the interval (0,1). (The toolbox function is logsig.)
$f(n) = \frac{1}{1 + e^{-n}}$
Manhattan distance The Manhattan distance between two vectors x and y is calculated as
D = sum(abs(x-y))
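The same calculation in plain Python, as a sketch. The function name is illustrative, and the length check is an added safeguard not mentioned in the definition.

```python
def manhattan_distance(x, y):
    """Sum of absolute coordinate differences between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


print(manhattan_distance([1, 2, 3], [4, 0, 3]))  # 5
```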
maximum performance increase Maximum amount by which the performance is allowed to increase in
one iteration of the variable learning rate training algorithm.
maximum step size Maximum step size allowed during a line search. The magnitude of
the weight vector is not allowed to increase by more than this
maximum step size in one iteration of a training algorithm.
mean square error function Performance function that calculates the average squared error
between the network outputs a and the target outputs t.
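A short sketch of this performance measure for flat lists of targets t and outputs a. The function name is illustrative.

```python
def mean_squared_error(targets, outputs):
    """Average of the squared differences between targets t and outputs a."""
    errors = [t - a for t, a in zip(targets, outputs)]
    return sum(e * e for e in errors) / len(errors)


mse_value = mean_squared_error([1.0, 0.0, 1.0], [0.0, 0.0, 3.0])  # (1 + 0 + 4) / 3
```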
momentum Technique often used to make it less likely for a backpropagation
network to get caught in a shallow minimum.
momentum constant Training parameter that controls how much momentum is used.
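One common form of the momentum update is sketched below. The blending formula, including the `(1 - mc)` factor on the gradient term, and the names `momentum_step` and `mc` are assumptions for illustration; other formulations omit that factor.

```python
def momentum_step(w, grad, prev_delta, learning_rate=0.1, mc=0.9):
    """Blend the previous weight change with the current gradient step.

    mc = 0 gives plain gradient descent; mc = 1 ignores the gradient and
    simply repeats the previous change.
    """
    delta = [mc * pd - (1.0 - mc) * learning_rate * g
             for pd, g in zip(prev_delta, grad)]
    new_w = [wi + di for wi, di in zip(w, delta)]
    return new_w, delta
```

Because part of the previous change carries over, a small bump in the error surface is less likely to stop the search.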
mu parameter Initial value for the scalar µ.
neighborhood Group of neurons within a specified distance of a particular neuron.
The neighborhood is specified by the indices for all the neurons that
lie within a radius d of the winning neuron i*:
$N_i(d) = \{\, j : d_{ij} \le d \,\}$
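The set definition translates directly into a list comprehension. In this sketch the distance matrix is assumed to be precomputed, and the four-neuron line layout is a made-up example.

```python
def neighborhood(distances, i, d):
    """Indices j of all neurons whose distance from neuron i is at most d."""
    return [j for j, dij in enumerate(distances[i]) if dij <= d]


# Four neurons arranged in a line, with distance |i - j| between neurons i and j.
line_distances = [[abs(i - j) for j in range(4)] for i in range(4)]
print(neighborhood(line_distances, 1, 1))  # [0, 1, 2]
```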
net input vector Combination, in a layer, of all the layer’s weighted input vectors with
its bias.
neuron Basic processing element of a neural network. Includes weights and
bias, a summing junction, and an output transfer function. Artificial
neurons, such as those simulated and trained with this toolbox, are
abstractions of biological neurons.
neuron diagram Network architecture figure showing the neurons and the weights
connecting them. Each neuron’s transfer function is indicated with a
symbol.
ordering phase Period of training during which neuron weights are expected to order
themselves in the input space consistent with the associated neuron
positions.
output layer Layer whose output is passed to the world outside the network.
output vector Output of a neural network. Each element of the output vector is the
output of a neuron.
output weight vector Column vector of weights coming from a neuron or input. (See also
outstar learning rule.)
outstar learning rule Learning rule that trains a neuron’s (or input’s) output weight vector
to take on the values of the current output vector of the postweight
layer. Changes in the weights are proportional to the neuron’s output.
overfitting Case in which the error on the training set is driven to a very small
value, but when new data is presented to the network, the error is
large.
pass Each traverse through all the training input and target vectors.
pattern A vector.
pattern association Task performed by a network trained to respond with the correct
output vector for each input vector presented.
pattern recognition Task performed by a network trained to respond when an input vector
close to a learned vector is presented. The network “recognizes” the
input as one of the original target vectors.
perceptron Single-layer network with a hard-limit transfer function. This network
is often trained with the perceptron learning rule.
perceptron learning rule Learning rule for training single-layer hard-limit networks. It is
guaranteed to result in a perfectly functioning network in finite time,
given that the network is capable of doing so.
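A minimal sketch of the rule for a single hard-limit neuron, trained on the logical AND problem (which is linearly separable, so the rule converges). The function names and the `max_epochs` cap are illustrative assumptions.

```python
def hardlim(n):
    """Hard-limit transfer function: 1 for n >= 0, otherwise 0."""
    return 1 if n >= 0 else 0


def train_perceptron(inputs, targets, max_epochs=100):
    """Train one hard-limit neuron with the perceptron learning rule."""
    w = [0.0] * len(inputs[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for p, t in zip(inputs, targets):
            a = hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)
            e = t - a  # error is -1, 0, or +1
            if e != 0:
                w = [wi + e * pi for wi, pi in zip(w, p)]
                b += e
                mistakes += 1
        if mistakes == 0:  # every training vector is classified correctly
            break
    return w, b


and_inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
and_targets = [0, 0, 0, 1]
w_and, b_and = train_perceptron(and_inputs, and_targets)
```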
performance Behavior of a network.
performance function Commonly the mean squared error of the network outputs. However,
the toolbox also considers other performance functions. Type help
nnperformance for a list of performance functions.
Polak-Ribière update Method for computing a set of conjugate directions. These directions
are used as search directions as part of a conjugate gradient
optimization procedure.
positive linear transfer function Transfer function that produces an output of zero for negative inputs
and an output equal to the input for positive inputs.
postprocessing Converts normalized outputs back into the same units that were used
for the original targets.
Powell-Beale restarts Method for computing a set of conjugate directions. These directions
are used as search directions as part of a conjugate gradient
optimization procedure. This procedure also periodically resets the
search direction to the negative of the gradient.
preprocessing Transformation of the input or target data before it is presented to the
neural network.
principal component analysis Procedure that orthogonalizes the components of network input vectors.
It can also reduce the dimension of the input vectors by
eliminating redundant components.
quasi-Newton algorithm Class of optimization algorithm based on Newton’s method. An
approximate Hessian matrix is computed at each iteration of the
algorithm based on the gradients.
radial basis networks Neural network that can be designed directly by fitting special
response elements where they will do the most good.
radial basis transfer function The transfer function for a radial basis neuron is
$\mathrm{radbas}(n) = e^{-n^2}$
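The formula can be checked numerically with a few lines of Python. This is a sketch of the formula only, not the toolbox function.

```python
import math


def radbas(n):
    """Radial basis transfer function: exp(-n^2), which peaks at 1 when n = 0."""
    return math.exp(-n * n)


# The output falls to 0.5 when n = sqrt(ln 2), roughly 0.8326.
half_point = math.sqrt(math.log(2.0))
```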
regularization Modification of the performance function, which is normally chosen to
be the sum of squares of the network errors on the training set, by
adding some fraction of the squares of the network weights.
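A sketch of the modified performance function, assuming the penalty is simply added to the sum of squared errors. The name `regularized_performance` and the `ratio` parameter are hypothetical, and the toolbox may weight the two terms differently.

```python
def regularized_performance(errors, weights, ratio=0.1):
    """Sum of squared errors plus a fraction of the sum of squared weights."""
    sse = sum(e * e for e in errors)
    ssw = sum(w * w for w in weights)
    return sse + ratio * ssw


print(regularized_performance([1.0, 2.0], [3.0], ratio=0.5))  # 9.5
```

Penalizing large weights favors smoother network responses, which is why this modification is used against overfitting.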
resilient backpropagation Training algorithm that eliminates the harmful effect of having a small
slope at the extreme ends of the sigmoid squashing transfer functions.
saturating linear transfer function Function that is linear in the interval (0,1) and saturates outside
this interval to 0 or 1. (The toolbox function is satlin.)
scaled conjugate gradient algorithm Avoids the time-consuming line search of the standard conjugate
gradient algorithm.
sequential input vectors Set of vectors that are to be presented to a network one after the
other. The network weights and biases are adjusted on the
presentation of each input vector.
sigma parameter Determines the change in weight for the calculation of the
approximate Hessian matrix in the scaled conjugate gradient
algorithm.
sigmoid Monotonic S-shaped function that maps numbers in the interval (-∞,∞)
to a finite interval such as (-1,+1) or (0,1).
simulation Takes the network input p, and the network object net, and returns
the network outputs a.
spread constant Distance an input vector must be from a neuron’s weight vector to
produce an output of 0.5.
squashing function Monotonically increasing function that takes input values between -∞
and +∞ and returns values in a finite interval.
instar learning rule Learning rule that trains a neuron’s weight vector to take on the
values of the current input vector. Changes in the weights are
proportional to the neuron’s output.
sum-squared error Sum of squared differences between the network targets and actual
outputs for a given input vector or set of vectors.
supervised learning Learning process in which changes in a network’s weights and biases
are due to the intervention of an external teacher. The teacher
typically provides output targets.
symmetric hard-limit transfer function Transfer function that maps inputs greater than or equal to 0 to +1, and all
other values to -1.
symmetric saturating linear transfer function Produces the input as its output as long as the input is in the range -1
to 1. Below that range the output is -1; above it, the output is +1.
tan-sigmoid transfer function Squashing function of the form shown below that maps the input to
the interval (-1,1). (The toolbox function is tansig.)
$f(n) = \frac{2}{1 + e^{-2n}} - 1$
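The two sigmoid transfer functions can be compared in a short Python sketch. The names follow the toolbox functions, but the code is an illustration only. The tan-sigmoid is written as a rescaled log-sigmoid, which is mathematically the hyperbolic tangent, and the branch on the sign of n is an added safeguard against overflow for large negative inputs.

```python
import math


def logsig(n):
    """Log-sigmoid: maps any real n into the interval (0, 1)."""
    if n >= 0:
        return 1.0 / (1.0 + math.exp(-n))
    z = math.exp(n)  # avoids overflow of exp(-n) when n is very negative
    return z / (1.0 + z)


def tansig(n):
    """Tan-sigmoid: maps any real n into (-1, 1); equal to tanh(n)."""
    return 2.0 * logsig(2.0 * n) - 1.0
```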
tapped delay line Sequential set of delays with outputs available at each delay output.
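A tapped delay line can be sketched with a fixed-length queue. The class name, the zero initial state, and the convention that `step` returns the delayed samples before storing the new one are all assumptions for illustration.

```python
from collections import deque


class TappedDelayLine:
    """Holds the last num_delays samples; each tap exposes one delayed value."""

    def __init__(self, num_delays, initial=0.0):
        self.taps = deque([initial] * num_delays, maxlen=num_delays)

    def step(self, x):
        """Return [x(t-1), x(t-2), ..., x(t-N)], then shift in the new sample x(t)."""
        outputs = list(self.taps)
        self.taps.appendleft(x)  # the oldest sample drops off the right end
        return outputs


tdl = TappedDelayLine(3)
history = [tdl.step(x) for x in [1, 2, 3, 4]]
```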
target vector Desired output vector for a given input vector.
test vectors Set of input vectors (not used directly in training) that is used to test
the trained network.
topology functions Ways to arrange the neurons in a grid, box, hexagonal, or random
topology.
training Procedure whereby a network is adjusted to do a particular job.
Commonly viewed as an offline job, as opposed to an adjustment made
during each time interval, as is done in adaptive training.
training vector Input and/or target vector used to train a network.
transfer function Function that maps a neuron’s (or layer’s) net input n to its actual
output.
tuning phase Period of SOFM training during which weights are expected to spread
out relatively evenly over the input space while retaining their
topological order found during the ordering phase.
underdetermined system System that has more variables than constraints.
unsupervised learning Learning process in which changes in a network’s weights and biases
are not due to the intervention of any external teacher. Commonly
changes are a function of the current network input vectors, output
vectors, and previous weights and biases.
update Make a change in weights and biases. The update can occur after
presentation of a single input vector or after accumulating changes
over several input vectors.
validation vectors Set of input vectors (not used directly in training) that is used to
monitor training progress so as to keep the network from overfitting.
weight function Weight functions apply weights to an input to get weighted inputs, as
specified by a particular function.
weight matrix Matrix containing connection strengths from a layer’s inputs to its
neurons. The element $w_{i,j}$ of a weight matrix W refers to the connection
strength from input j to neuron i.
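The indexing convention can be illustrated with a layer's net input, where row i of W holds the weights into neuron i. This is a plain-Python sketch with 0-based indices; the function name and the example values are made up.

```python
def layer_net_input(W, p, b):
    """Net input of each neuron: n[i] = sum over j of W[i][j] * p[j], plus b[i]."""
    return [sum(wij * pj for wij, pj in zip(row, p)) + bi
            for row, bi in zip(W, b)]


W = [[1.0, 2.0],   # weights from inputs 0 and 1 to neuron 0
     [3.0, 4.0],   # weights to neuron 1
     [0.0, -1.0]]  # weights to neuron 2
print(layer_net_input(W, [1.0, 1.0], [0.0, 1.0, 0.5]))  # [3.0, 8.0, -0.5]
```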
weighted input vector Result of applying a weight to a layer’s input, whether it is a network
input or the output of another layer.
Widrow-Hoff learning rule Learning rule used to train single-layer linear networks. This rule is
the predecessor of the backpropagation rule and is sometimes
referred to as the delta rule.
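A minimal sketch of the rule for one linear neuron, assuming the update form "weight change = learning rate × error × input" (some texts include an extra factor of 2). The function name, learning rate, and training data are illustrative; the data are generated from t = 2p + 1.

```python
def lms_update(w, b, p, t, learning_rate):
    """One Widrow-Hoff step for a single linear neuron."""
    a = sum(wi * pi for wi, pi in zip(w, p)) + b  # linear output
    e = t - a                                     # error for this sample
    new_w = [wi + learning_rate * e * pi for wi, pi in zip(w, p)]
    new_b = b + learning_rate * e
    return new_w, new_b


samples = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0), ([3.0], 7.0)]
w_lin, b_lin = [0.0], 0.0
for _ in range(1000):
    for p, t in samples:
        w_lin, b_lin = lms_update(w_lin, b_lin, p, t, learning_rate=0.05)
```

After training, `w_lin[0]` approaches 2 and `b_lin` approaches 1, the slope and offset used to generate the data.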