# MATLAB – Deep Learning Toolbox – Getting Started Guide

**Authors:** Mark Hudson Beale, Martin T. Hagan, Howard B. Demuth

## Contents

- Acknowledgments
- Getting Started
  - Deep Learning Toolbox Product Description
  - Get Started with Deep Network Designer
  - Try Deep Learning in 10 Lines of MATLAB Code
  - Classify Image Using Pretrained Network
  - Get Started with Transfer Learning
  - Create Simple Image Classification Network
  - Create Simple Sequence Classification Network Using Deep Network Designer
  - Shallow Networks for Pattern Recognition, Clustering and Time Series
    - Shallow Network Apps and Functions in Deep Learning Toolbox
    - Deep Learning Toolbox Applications
    - Shallow Neural Network Design Steps
  - Fit Data with a Shallow Neural Network
    - Defining a Problem
    - Using the Neural Network Fitting App
    - Using Command-Line Functions
  - Classify Patterns with a Shallow Neural Network
    - Defining a Problem
    - Using the Neural Network Pattern Recognition App
    - Using Command-Line Functions
  - Cluster Data with a Self-Organizing Map
    - Defining a Problem
    - Using the Neural Network Clustering App
    - Using Command-Line Functions
  - Shallow Neural Network Time-Series Prediction and Modeling
    - Defining a Problem
    - Using the Neural Network Time Series App
    - Using Command-Line Functions
  - Train Shallow Networks on CPUs and GPUs
    - Parallel Computing Toolbox
    - Parallel CPU Workers
    - GPU Computing
    - Multiple GPU/CPU Computing
    - Cluster Computing with MATLAB Parallel Server
    - Load Balancing, Large Problems, and Beyond
  - Sample Data Sets for Shallow Neural Networks
- Shallow Neural Networks Glossary
## Shallow Neural Networks Glossary

- **ADALINE**: Acronym for a linear neuron: ADAptive LINear Element.
- **adaption**: Training method that proceeds through the specified sequence of inputs, calculating the output, error, and network adjustment for each input vector in the sequence as the inputs are presented.
- **adaptive filter**: Network that contains delays and whose weights are adjusted after each new input vector is presented. The network adapts to changes in the input signal properties if such occur. This kind of filter is used in long-distance telephone lines to cancel echoes.
- **adaptive learning rate**: Learning rate that is adjusted according to an algorithm during training to minimize training time.
- **architecture**: Description of the number of layers in a neural network, each layer's transfer function, the number of neurons per layer, and the connections between layers.
- **backpropagation learning rule**: Learning rule in which weights and biases are adjusted by error derivative (delta) vectors backpropagated through the network. Backpropagation is commonly applied to feedforward multilayer networks. Sometimes this rule is called the generalized delta rule.
- **backtracking search**: Linear search routine that begins with a step multiplier of 1 and then backtracks until an acceptable reduction in performance is obtained.
- **batch**: Matrix of input (or target) vectors applied to the network simultaneously. Changes to the network weights and biases are made just once for the entire set of vectors in the input matrix. (The term batch is being replaced by the more descriptive expression "concurrent vectors.")
- **batching**: Process of presenting a set of input vectors for simultaneous calculation of a matrix of output vectors and/or new weights and biases.
- **Bayesian framework**: Assumes that the weights and biases of the network are random variables with specified distributions.
- **BFGS quasi-Newton algorithm**: Variation of Newton's optimization algorithm, in which an approximation of the Hessian matrix is obtained from gradients computed at each iteration of the algorithm.
- **bias**: Neuron parameter that is summed with the neuron's weighted inputs and passed through the neuron's transfer function to generate the neuron's output.
- **bias vector**: Column vector of bias values for a layer of neurons.
- **Brent's search**: Linear search that is a hybrid of the golden section search and a quadratic interpolation.
- **cascade-forward network**: Layered network in which each layer only receives inputs from previous layers.
- **Charalambous' search**: Hybrid line search that uses a cubic interpolation together with a type of sectioning.
- **classification**: Association of an input vector with a particular target vector.
- **competitive layer**: Layer of neurons in which only the neuron with maximum net input has an output of 1 and all other neurons have an output of 0. Neurons compete with each other for the right to respond to a given input vector.
- **competitive learning**: Unsupervised training of a competitive layer with the instar rule or Kohonen rule. Individual neurons learn to become feature detectors. After training, the layer categorizes input vectors among its neurons.
- **competitive transfer function**: Accepts a net input vector for a layer and returns neuron outputs of 0 for all neurons except for the winner, the neuron associated with the most positive element of the net input n.
- **concurrent input vectors**: Name given to a matrix of input vectors that are to be presented to a network simultaneously. All the vectors in the matrix are used in making just one set of changes in the weights and biases.
- **conjugate gradient algorithm**: In the conjugate gradient algorithms, a search is performed along conjugate directions, which produces generally faster convergence than a search along the steepest descent directions.
- **connection**: One-way link between neurons in a network.
- **connection strength**: Strength of a link between two neurons in a network. The strength, often called weight, determines the effect that one neuron has on another.
- **cycle**: Single presentation of an input vector, calculation of output, and new weights and biases.
- **dead neuron**: Competitive layer neuron that never won any competition during training and so has not become a useful feature detector. Dead neurons do not respond to any of the training vectors.
- **decision boundary**: Line, determined by the weight and bias vectors, for which the net input n is zero.
- **delta rule**: See Widrow-Hoff learning rule.
- **delta vector**: The delta vector for a layer is the derivative of a network's output error with respect to that layer's net input vector.
- **distance**: Distance between neurons, calculated from their positions with a distance function.
- **distance function**: Particular way of calculating distance, such as the Euclidean distance between two vectors.
- **early stopping**: Technique based on dividing the data into three subsets. The first subset is the training set, used for computing the gradient and updating the network weights and biases. The second subset is the validation set. When the validation error increases for a specified number of iterations, the training is stopped, and the weights and biases at the minimum of the validation error are returned. The third subset is the test set. It is used to verify the network design.
- **epoch**: Presentation of the set of training (input and/or target) vectors to a network and the calculation of new weights and biases. Note that training vectors can be presented one at a time or all together in a batch.
- **error jumping**: Sudden increase in a network's sum-squared error during training. This is often due to too large a learning rate.
- **error ratio**: Training parameter used with adaptive learning rate and momentum training of backpropagation networks.
- **error vector**: Difference between a network's output vector in response to an input vector and an associated target output vector.
- **feedback network**: Network with connections from a layer's output to that layer's input. The feedback connection can be direct or pass through several layers.
- **feedforward network**: Layered network in which each layer only receives inputs from previous layers.
- **Fletcher-Reeves update**: Method for computing a set of conjugate directions. These directions are used as search directions as part of a conjugate gradient optimization procedure.
- **function approximation**: Task performed by a network trained to respond to inputs with an approximation of a desired function.
- **generalization**: Attribute of a network whose output for a new input vector tends to be close to outputs for similar input vectors in its training set.
- **generalized regression network**: Approximates a continuous function to an arbitrary accuracy, given a sufficient number of hidden neurons.
- **global minimum**: Lowest value of a function over the entire range of its input parameters. Gradient descent methods adjust weights and biases in order to find the global minimum of error for a network.
- **golden section search**: Linear search that does not require the calculation of the slope. The interval containing the minimum of the performance is subdivided at each iteration of the search, and one subdivision is eliminated at each iteration.
- **gradient descent**: Process of making changes to weights and biases, where the changes are proportional to the derivatives of network error with respect to those weights and biases. This is done to minimize network error.
- **hard-limit transfer function**: Transfer function that maps inputs greater than or equal to 0 to 1, and all other values to 0.
- **Hebb learning rule**: Historically the first proposed learning rule for neurons. Weights are adjusted proportional to the product of the outputs of pre- and postweight neurons.
- **hidden layer**: Layer of a network that is not connected to the network output (for instance, the first layer of a two-layer feedforward network).
- **home neuron**: Neuron at the center of a neighborhood.
- **hybrid bisection-cubic search**: Line search that combines bisection and cubic interpolation.
- **initialization**: Process of setting the network weights and biases to their original values.
- **input layer**: Layer of neurons receiving inputs directly from outside the network.
- **input space**: Range of all possible input vectors.
- **input vector**: Vector presented to the network.
- **input weight vector**: Row vector of weights going to a neuron.
- **input weights**: Weights connecting network inputs to layers.
- **Jacobian matrix**: Contains the first derivatives of the network errors with respect to the weights and biases.
- **Kohonen learning rule**: Learning rule that trains a selected neuron's weight vectors to take on the values of the current input vector.
- **layer**: Group of neurons having connections to the same inputs and sending outputs to the same destinations.
- **layer diagram**: Network architecture figure showing the layers and the weight matrices connecting them. Each layer's transfer function is indicated with a symbol. Sizes of input, output, bias, and weight matrices are shown. Individual neurons and connections are not shown.
- **layer weights**: Weights connecting layers to other layers. Such weights need to have nonzero delays if they form a recurrent connection (i.e., a loop).
- **learning**: Process by which weights and biases are adjusted to achieve some desired network behavior.
- **learning rate**: Training parameter that controls the size of weight and bias changes during learning.
- **learning rule**: Method of deriving the next changes that might be made in a network, or a procedure for modifying the weights and biases of a network.
- **Levenberg-Marquardt**: Algorithm that trains a neural network 10 to 100 times faster than the usual gradient descent backpropagation method. It always computes the approximate Hessian matrix, which has dimensions n-by-n.
- **line search function**: Procedure for searching along a given search direction (line) to locate the minimum of the network performance.
- **linear transfer function**: Transfer function that produces its input as its output.
- **link distance**: Number of links, or steps, that must be taken to get to the neuron under consideration.
- **local minimum**: Minimum of a function over a limited range of input values. A local minimum might not be the global minimum.
- **log-sigmoid transfer function**: Squashing function that maps the input to the interval (0,1). (The toolbox function is `logsig`.) `f(n) = 1/(1 + e^(-n))`
- **Manhattan distance**: The Manhattan distance between two vectors x and y is calculated as `D = sum(abs(x-y))`.
- **maximum performance increase**: Maximum amount by which the performance is allowed to increase in one iteration of the variable learning rate training algorithm.
- **maximum step size**: Maximum step size allowed during a linear search. The magnitude of the weight vector is not allowed to increase by more than this maximum step size in one iteration of a training algorithm.
- **mean square error function**: Performance function that calculates the average squared error between the network outputs a and the target outputs t.
- **momentum**: Technique often used to make it less likely for a backpropagation network to get caught in a shallow minimum.
- **momentum constant**: Training parameter that controls how much momentum is used.
- **mu parameter**: Initial value for the scalar µ.
- **neighborhood**: Group of neurons within a specified distance of a particular neuron. The neighborhood is specified by the indices of all neurons that lie within a radius d of the winning neuron i*: `Ni(d) = {j : dij <= d}`
- **net input vector**: Combination, in a layer, of all the layer's weighted input vectors with its bias.
- **neuron**: Basic processing element of a neural network. Includes weights and bias, a summing junction, and an output transfer function. Artificial neurons, such as those simulated and trained with this toolbox, are abstractions of biological neurons.
- **neuron diagram**: Network architecture figure showing the neurons and the weights connecting them. Each neuron's transfer function is indicated with a symbol.
- **ordering phase**: Period of training during which neuron weights are expected to order themselves in the input space consistent with the associated neuron positions.
- **output layer**: Layer whose output is passed to the world outside the network.
- **output vector**: Output of a neural network. Each element of the output vector is the output of a neuron.
- **output weight vector**: Column vector of weights coming from a neuron or input. (See also outstar learning rule.)
- **outstar learning rule**: Learning rule that trains a neuron's (or input's) output weight vector to take on the values of the current output vector of the postweight layer. Changes in the weights are proportional to the neuron's output.
- **overfitting**: Case in which the error on the training set is driven to a very small value, but when new data is presented to the network, the error is large.
- **pass**: Each traverse through all the training input and target vectors.
- **pattern**: A vector.
- **pattern association**: Task performed by a network trained to respond with the correct output vector for each input vector presented.
- **pattern recognition**: Task performed by a network trained to respond when an input vector close to a learned vector is presented. The network "recognizes" the input as one of the original target vectors.
- **perceptron**: Single-layer network with a hard-limit transfer function. This network is often trained with the perceptron learning rule.
- **perceptron learning rule**: Learning rule for training single-layer hard-limit networks. It is guaranteed to result in a perfectly functioning network in finite time, given that the network is capable of doing so.
- **performance**: Behavior of a network.
- **performance function**: Commonly the mean squared error of the network outputs. However, the toolbox also considers other performance functions. Type `help nnperformance` for a list of performance functions.
- **Polak-Ribière update**: Method for computing a set of conjugate directions. These directions are used as search directions as part of a conjugate gradient optimization procedure.
- **positive linear transfer function**: Transfer function that produces an output of zero for negative inputs and an output equal to the input for positive inputs.
- **postprocessing**: Converts normalized outputs back into the same units that were used for the original targets.
- **Powell-Beale restarts**: Method for computing a set of conjugate directions. These directions are used as search directions as part of a conjugate gradient optimization procedure. This procedure also periodically resets the search direction to the negative of the gradient.
- **preprocessing**: Transformation of the input or target data before it is presented to the neural network.
- **principal component analysis**: Orthogonalizes the components of network input vectors. This procedure can also reduce the dimension of the input vectors by eliminating redundant components.
- **quasi-Newton algorithm**: Class of optimization algorithm based on Newton's method. An approximate Hessian matrix is computed at each iteration of the algorithm based on the gradients.
- **radial basis networks**: Neural network that can be designed directly by fitting special response elements where they will do the most good.
- **radial basis transfer function**: The transfer function for a radial basis neuron is `radbas(n) = e^(-n^2)`.
- **regularization**: Modification of the performance function, which is normally chosen to be the sum of squares of the network errors on the training set, by adding some fraction of the squares of the network weights.
- **resilient backpropagation**: Training algorithm that eliminates the harmful effect of having a small slope at the extreme ends of the sigmoid squashing transfer functions.
- **saturating linear transfer function**: Function that is linear in the interval (-1,+1) and saturates outside this interval to -1 or +1. (The toolbox function is `satlin`.)
- **scaled conjugate gradient algorithm**: Avoids the time-consuming line search of the standard conjugate gradient algorithm.
- **sequential input vectors**: Set of vectors that are to be presented to a network one after the other. The network weights and biases are adjusted on the presentation of each input vector.
- **sigma parameter**: Determines the change in weight for the calculation of the approximate Hessian matrix in the scaled conjugate gradient algorithm.
- **sigmoid**: Monotonic S-shaped function that maps numbers in the interval (-∞,∞) to a finite interval such as (-1,+1) or (0,1).
- **simulation**: Takes the network input p, and the network object net, and returns the network outputs a.
- **spread constant**: Distance an input vector must be from a neuron's weight vector to produce an output of 0.5.
- **squashing function**: Monotonically increasing function that takes input values between -∞ and +∞ and returns values in a finite interval.
- **star learning rule**: Learning rule that trains a neuron's weight vector to take on the values of the current input vector. Changes in the weights are proportional to the neuron's output.
- **sum-squared error**: Sum of squared differences between the network targets and actual outputs for a given input vector or set of vectors.
- **supervised learning**: Learning process in which changes in a network's weights and biases are due to the intervention of an external teacher. The teacher typically provides output targets.
- **symmetric hard-limit transfer function**: Transfer function that maps inputs greater than or equal to 0 to +1, and all other values to -1.
- **symmetric saturating linear transfer function**: Produces the input as its output as long as the input is in the range -1 to 1. Outside that range the output is -1 and +1, respectively.
- **tan-sigmoid transfer function**: Squashing function that maps the input to the interval (-1,1). (The toolbox function is `tansig`.) `f(n) = 2/(1 + e^(-2n)) - 1`
- **tapped delay line**: Sequential set of delays with outputs available at each delay output.
- **target vector**: Desired output vector for a given input vector.
- **test vectors**: Set of input vectors (not used directly in training) that is used to test the trained network.
- **topology functions**: Ways to arrange the neurons in a grid, box, hexagonal, or random topology.
- **training**: Procedure whereby a network is adjusted to do a particular job. Commonly viewed as an offline job, as opposed to an adjustment made during each time interval, as is done in adaptive training.
- **training vector**: Input and/or target vector used to train a network.
- **transfer function**: Function that maps a neuron's (or layer's) net input n to its actual output.
- **tuning phase**: Period of SOFM (self-organizing feature map) training during which weights are expected to spread out relatively evenly over the input space while retaining the topological order found during the ordering phase.
- **underdetermined system**: System that has more variables than constraints.
- **unsupervised learning**: Learning process in which changes in a network's weights and biases are not due to the intervention of an external teacher. Commonly changes are a function of the current network input vectors, output vectors, and previous weights and biases.
- **update**: Make a change in weights and biases. The update can occur after presentation of a single input vector or after accumulating changes over several input vectors.
- **validation vectors**: Set of input vectors (not used directly in training) that is used to monitor training progress so as to keep the network from overfitting.
- **weight function**: Weight functions apply weights to an input to get weighted inputs, as specified by a particular function.
- **weight matrix**: Matrix containing connection strengths from a layer's inputs to its neurons. The element w(i,j) of a weight matrix W refers to the connection strength from input j to neuron i.
- **weighted input vector**: Result of applying a weight to a layer's input, whether it is a network input or the output of another layer.
- **Widrow-Hoff learning rule**: Learning rule used to train single-layer linear networks. This rule is the predecessor of the backpropagation rule and is sometimes referred to as the delta rule.
