Elements of Artificial Neural Networks: Solution Manual
This is not the case with discontinuous functions such as the step function. Recall that a function is continuous if small changes in its inputs produce corresponding small changes in its output.
With the step function shown in figure 1., even an infinitesimally small change in the net input near the threshold flips the output from one extreme value to the other. Biological systems are subject to noise, and a neuron with a discontinuous node function may potentially be activated by a small amount of noise, implying that this node is biologically implausible. Another feature of the step function is that its output "saturates," i.e., it remains at a fixed maximum (or minimum) value no matter how large (or small) the net input becomes.
This is desirable because we cannot expect biological or electronic hardware to produce excessively high voltages. The outputs of the step function may be interpreted as class identifiers: we may conclude that an input sample belongs to one class if and only if the net input exceeds a certain value.
This interpretation of the step-functional neuron appears simplistic when a network contains more than one neuron. It is sometimes possible to interpret nodes in the interior of the network as identifying features of the input, while the output neurons compute the application-specific output based on the inputs received from these feature-identifying intermediate nodes.
Ramp functions The ramp function is shown in figure 1. This node function also implies the existence of a threshold c which must be exceeded by the net weighted input in order to activate the node. The node output also saturates, i.e., it is bounded above and below by fixed maximum and minimum values.
But unlike the step function, the ramp is continuous; small variations in the net weighted input cause correspondingly small variations (or none at all) in the output. Sigmoid functions The most popular node functions used in neural nets are "sigmoid" (S-shaped) functions, whose output is illustrated in figure 1.
The advantage of these functions is that their smoothness makes it easy to devise learning algorithms and understand the behavior of large networks whose nodes compute such functions. Experimental observations of biological neurons demonstrate that the neuronal firing rate is roughly sigmoidal when plotted against the net input to a neuron. But the Brooklyn Bridge can be sold easily to anyone who believes that biological neurons perform any precise mathematical operation such as exponentiation.
From the viewpoint of hardware or software implementation, exponentiation is an expensive computational task, and one may question whether such extensive calculations make a real difference for practical neural networks.
Piecewise linear functions Piecewise linear functions are combinations of various linear functions, where the choice of the linear function depends on the relevant region of the input space.
Piecewise linear functions are easier to compute than general nonlinear functions such as sigmoid functions, and have been used as approximations of the same, as shown in figure 1. Gaussian functions Bell-shaped curves such as the one shown in figure 1. are known as Gaussian functions. Algebraically, a Gaussian function of the net weighted input to a node is a bell-shaped exponential function, maximal when the net input equals the function's center and decaying smoothly as the net input moves away from it. Gaussian node functions are used in Radial Basis Function networks, discussed in chapter 4. The way nodes are connected determines how computations proceed and constitutes an important early design decision by a neural network developer.
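The node functions described above are easy to state in code. The sketch below is purely illustrative; the parameter names (threshold, center, width) and default values are our own choices, not the book's notation.

```python
import math

def step(net, threshold=0.0):
    # Discontinuous: jumps from 0 to 1 as net crosses the threshold.
    return 1.0 if net > threshold else 0.0

def ramp(net, low=0.0, high=1.0):
    # Linear between the bounds, saturating at both ends; continuous.
    return max(low, min(high, net))

def sigmoid(net):
    # Smooth, S-shaped, saturating; note that it requires exponentiation.
    return 1.0 / (1.0 + math.exp(-net))

def gaussian(net, center=0.0, width=1.0):
    # Bell-shaped: maximal when net equals the center, decaying with distance.
    return math.exp(-((net - center) ** 2) / (width ** 2))
```

Comparing `step` and `sigmoid` near the threshold makes the continuity argument concrete: a tiny change in `net` can flip `step`'s output, while `sigmoid`'s output changes only slightly.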
A brief discussion of biological neural networks is relevant, prior to examining artificial neural network architectures. Different parts of the central nervous system are structured differently; hence it is incorrect to claim that a single architecture models all neural processing.
The cerebral cortex, where most processing is believed to occur, consists of five to seven layers of neurons with each layer supplying inputs into the next.
However, layer boundaries are not strict and connections that cross layers are known to exist. Feedback pathways are also known to exist. Each neuron is connected with many, but not all, of the neighboring neurons within the same layer. Most of these connections are excitatory, but some are inhibitory. There are some "veto" neurons that have the overwhelming power of neutralizing the effects of a large number of excitatory inputs to a neuron.
Some amount of indirect self-excitation also occurs. In the following subsections, we discuss artificial neural network architectures, some of which derive inspiration from biological neural networks. The fully connected network is the most general neural net architecture imaginable, and every other architecture can be seen to be a special case of it, obtained by setting some weights to zero.
In a fully connected asymmetric network, the connection from one node to another may carry a different weight than the connection from the second node to the first, as shown in figure 1. This architecture is seldom used despite its generality and conceptual simplicity, due to the large number of parameters.
In a network with n nodes, there are n² weights. It is difficult to devise fast learning schemes that can produce fully connected networks that generalize well. It is practically never the case that every node has direct influence on every other node.
Fully connected networks are also biologically implausible—neurons rarely establish synapses with geographically distant neurons. A special case of fully connected architecture is one in which the weight that connects one node to another is equal to its symmetric reverse, as shown in figure 1. Note that node I is an input node as well as an output node.
In chapter 6, we consider these networks for associative memory tasks. In the figure, some nodes are shown as "Input" nodes, some as "Output" nodes, and all others are considered "Hidden" nodes whose interaction with the external environment is indirect.
A "hidden node" is any node that is neither an input node nor an output node. Some nodes may not receive external inputs, as in some recurrent networks considered in chapter 4. Some nodes may receive an input as well as generate an output, as seen in node I of figure 1.
We adopt the convention that a single input arrives at and is distributed to other nodes by each node of the "input layer" or "layer 0"; no other computation occurs at nodes in layer 0, and there are no intra-layer connections among nodes in this layer.
Networks that are not acyclic are referred to as recurrent networks. Layered feedforward networks are succinctly described by a sequence of numbers indicating the number of nodes in each layer. For instance, the network shown in figure 1. These networks, generally with no more than four such layers, are among the most common neural nets in use, so much so that some users take the phrase "neural networks" to mean only feedforward networks. Conceptually, nodes in successively higher layers abstract successively higher-level features from preceding layers.
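A layered feedforward network described by a sequence of layer sizes (e.g., a 3-2-1 network) can be evaluated as a chain of weighted sums followed by node functions. This sketch is our own illustration, not code from the book; the random weight initialization and the use of a sigmoid at every non-input node are assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_network(sizes, rng):
    # One weight matrix per layer transition; each row holds the
    # incoming weights of one destination node.
    return [[[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(n)]
            for m, n in zip(sizes, sizes[1:])]

def forward(weights, x):
    # Layer 0 only distributes its inputs; every later node applies a
    # sigmoid to its net weighted input from the previous layer.
    for layer in weights:
        x = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in layer]
    return x

rng = random.Random(0)
net = make_network([3, 2, 1], rng)   # a "3-2-1" network
output = forward(net, [0.5, -0.2, 0.8])
```

Because computation flows strictly from layer 0 upward, no node's output ever feeds back into an earlier layer, which is exactly what distinguishes this architecture from a recurrent network.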
In the literature on neural networks, the term "feedforward" has sometimes been used to refer to layered or acyclic networks. Modularity allows the neural network developer to solve smaller tasks separately using small neural network modules and then combine these modules in a logical manner.
Modules can be organized in several different ways, some of which are illustrated in figure 1. How does this learning occur? What are possible mathematical models of learning? In this section, we summarize some of the basic theories of biological learning and their adaptations for artificial neural networks.
In artificial neural networks, learning refers to the method of modifying the weights of connections between the nodes of a specified network. When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased. For artificial neural networks, this implies a gradual increase in strength of connections among nodes having similar outputs when presented with the same input.
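This Hebbian principle is often written as a one-line weight update. The formulation below, with a learning rate `eta` and the product of presynaptic and postsynaptic activities, is a common rendering of the rule and an assumption on our part, not necessarily the book's exact notation.

```python
def hebbian_update(w, x_pre, x_post, eta=0.1):
    # Hebb's rule: strengthen the connection in proportion to the
    # correlation of presynaptic and postsynaptic activity.
    return w + eta * x_pre * x_post
```

Note that the weight grows only when both nodes are active together; if either activity is zero, the weight is left unchanged, which is how the connection strength comes to reflect the correlation between the two nodes' outputs.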
The strength of connections between neurons eventually comes to represent the correlation between their outputs. Many modifications of this rule have been developed and are widely used in artificial neural network models. Networks that use this type of learning are described in chapter 6. The competitive process involves self-excitation and mutual inhibition among nodes, until a single winner emerges.
This leads to the development of networks in which each node specializes to be the winner for a set of similar patterns. This process has been observed in biological systems, and artificial neural networks that conduct this process are discussed in chapter 5. Competition may be viewed as the consequence of resources being limited, drawing from the analogy of ecological systems.
In the brain, maintaining synapses and high connection strengths requires resources, which are limited. These resources would be wasted if a large number of neurons were to respond in identical ways to input patterns.
A competitive mechanism can be viewed as a way of ensuring selective neural responses to various input stimuli. Resource conservation is also achieved by allowing connection strengths to decay with time. The converse of competition is cooperation, found in some neural network models. Cooperative activity can occur in several different ways. Different nodes may specialize in different subtasks, so that together they accomplish a much bigger task.
Alternatively, several nodes may learn the same or a similar subtask, providing for fault tolerance: errors made by a single node may then be compensated for by other nodes. Connections may exist from each member of such a set of nodes to another higher-level node, so that the higher-level node comes to represent an abstract concept or generalization that combines the concepts represented by the members of the lower-level nodes.
Each interaction with the environment can be viewed as measuring the performance of the system, and results in a small change in the system's behavior such that performance improves in the future. If moving limbs in one direction leads towards food (positive feedback), this reinforces the animal's behavior in response to a presented input.
The same principle forms the basis of much of machine learning. In the context of neural networks, for instance, if increasing a particular weight leads to diminished performance or larger error, then that weight is decreased as the network is trained to perform better. The amount of change made at every step is very small in most networks to ensure that a network does not stray too far from its partially evolved state, and so that the network withstands some mistakes made by the teacher, feedback, or performance evaluation mechanism.
If the incremental change is infinitesimal, however, the neural network will require excessively large training times. Some training methods cleverly vary the rate at which a network is modified.
Practically every non-mechanical task performed by animals requires the interaction of neural networks. Perception, recognition, memory, conscious thought, dreams, sensorimotor control—the list goes on.
The desire to simulate some of these tasks has motivated the development of artificial neural networks. In this section, we present the reasons for studying neural networks from the viewpoint of the computational tasks for which they can be used.
For each task, we identify performance measures that can be used to judge the degree of success of a neural network in performing the task. At a high level, the tasks performed using neural networks can be classified as those requiring supervised or unsupervised learning. In supervised learning, a teacher is available to indicate whether a system is performing correctly, or to indicate a desired response, or to validate the acceptability of a system's responses, or to indicate the amount of error in system performance.
This is in contrast with unsupervised learning, where no teacher is available and learning must rely on guidance obtained heuristically by the system examining different sample data or the environment.
A concrete example of supervised learning is provided by "classification" problems, whereas "clustering" provides an example of unsupervised learning. The distinction between supervised and unsupervised learning is illustrated in the following examples. Consider an archaeologist who must determine whether a newly discovered skeleton belonged to a man or a woman. In doing this, the archaeologist is guided by many past examples of male and female skeletons. Examination of these past examples (called the training set) allows the archaeologist to learn about the distinctions between male and female skeletons.
This learning process is an example of supervised learning, and the result of the learning process can be applied to determine whether the newly discovered skeleton belonged to a man or a woman. Now consider the different task of determining, from a collection of skeletal fragments, whether they come from one species or several. For this task, no previous data may be available to clearly identify the species for each skeleton fragment. The archaeologist has to determine whether the skeletons that can be reconstructed from the fragments are sufficiently similar to belong to the same species, or if the differences between these skeletons are large enough to warrant grouping them into different species.
This is an unsupervised learning process, which involves estimating the magnitudes of differences between the skeletons. One archaeologist may believe the skeletons belong to different species, while another may disagree, and there is no absolute criterion to determine who is correct.
A well-known example is the iris flower data. This data consists of four measurements: the lengths and widths of sepals and petals of iris flowers. Class membership of each data vector is indicated in the fifth column of the data. This information is used in supervised learning.
But if we remove the fifth column of the data, all we have is a set of vectors of widths and lengths of petals and sepals of iris flowers. To separate all vectors into different groups of iris flowers, we would use procedures that depend only on the four values in each vector, and the relative proximity of different vectors. Such training is unsupervised because no a priori information is used regarding class membership, i.e., regarding which vectors belong to which group. In classification problems, by contrast, we are provided with a "training set" consisting of sample patterns that are representative of all classes, along with class membership information for each pattern.
Using the training set, we deduce rules for membership in each class and create a classifier, which can then be used to assign other patterns to their respective classes according to these rules. Neural networks have been used to classify samples, i. For instance, each output node can stand for one class.
In some networks, an additional constraint is that the magnitude of that output node must exceed a minimal threshold, say 0. For two-class problems, feedforward networks with a single output node are adequate. Neural networks have been used successfully in a large number of practical classification tasks, such as the following:
1. Recognizing printed or handwritten characters
2. Classifying loan applications into credit-worthy and non-credit-worthy groups
3. Analyzing sonar and radar data to determine the nature of the source of a signal
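The one-output-node-per-class decision rule described above can be sketched as follows. The function name and the specific threshold value are illustrative assumptions, not the book's specification.

```python
def predicted_class(outputs, threshold=0.5):
    # One output node per class: choose the node with the largest
    # output, but only if it exceeds the minimal threshold;
    # otherwise reject the sample as unclassified (None).
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return best if outputs[best] > threshold else None
```

For a two-class problem, a single output node suffices: one class corresponds to outputs above the threshold and the other to outputs below it.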
In clustering problems, on the other hand, all that is available is a set of samples and distance relationships that can be derived from the sample descriptions. For example, flowers may be clustered using features such as color and number of petals. Most clustering mechanisms are based on some distance measure.
Each object is represented by an ordered set (vector) of features. Thus, one would like to group samples so as to minimize intra-cluster distances while maximizing inter-cluster distances, subject to constraints on the number of clusters that can be formed.
One way to measure intra-cluster distance would be to find the average distance of different samples in a cluster from the cluster center. Similarly, inter-cluster distance could be measured using the distance between the centers of different clusters. The number of clusters depends on the problem, but should be as small as possible.
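The two cluster-quality measures just described (average distance from the cluster center, and distance between cluster centers) can be computed directly; the sketch below assumes Euclidean distance and represents each cluster as a list of feature vectors.

```python
import math

def centroid(cluster):
    # Component-wise mean of the cluster's sample vectors.
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def intra_cluster(cluster):
    # Average distance of the cluster's samples from its center.
    c = centroid(cluster)
    return sum(euclidean(p, c) for p in cluster) / len(cluster)

def inter_cluster(c1, c2):
    # Distance between the centers of two clusters.
    return euclidean(centroid(c1), centroid(c2))
```

A good clustering drives `intra_cluster` down for every cluster while keeping `inter_cluster` large for every pair, subject to the constraint on the number of clusters.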
Some neural networks accomplish clustering by the following method. Initially, each node reacts randomly to the presentation of input samples. Nodes with higher outputs to an input sample learn to react even more strongly to that sample and to other input samples geographically near that sample. This method is analogous to the statistical approach of k-nearest-neighbor clustering, in which each sample is placed in the same cluster as the majority of its immediate neighbors.
Each input sample is associated with the weight vector to which its Euclidean distance is smallest. Vector quantization is the process of dividing up space into several connected regions (called "Voronoi regions"), a task similar to clustering. Each region is represented using a single vector (called a "codebook vector"). Every point in the input space belongs to one of these regions, and is mapped to the corresponding nearest codebook vector.
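The nearest-codebook mapping at the heart of vector quantization is a one-line minimization; this sketch is our own illustration under the assumption of Euclidean distance.

```python
import math

def quantize(x, codebook):
    # Map an input vector to the index of its nearest codebook vector.
    # All inputs that map to the same index lie in one Voronoi region.
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return min(range(len(codebook)), key=lambda i: dist(x, codebook[i]))
```

Replacing every input vector by the index (or value) of its nearest codebook vector is what makes the codebook a compressed representation of the data.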
The set of codebook vectors is a compressed form of the set of input data vectors, since many different input data vectors may be mapped to the same codebook vector. For two-dimensional input spaces, the boundaries of Voronoi regions are obtained by sketching the perpendicular bisectors of the lines joining neighboring codebook vectors. In auto-association or associative memory tasks (see figure 1.), the network must reproduce a stored pattern when presented with a noisy or incomplete version of it. In hetero-association (see figure 1.), the output pattern associated with an input pattern differs from the input pattern. An example of an auto-associative task is the generation of a complete uncorrupted image, such as a face, from a corrupted version.
An example of hetero-association is the generation of a name when the image of a face is presented as input. For hetero-associative recall, a second layer of nodes is needed to generate the output pattern corresponding to an input pattern. These concepts are discussed in greater detail in chapter 6. The outputs corresponding to some input vectors may be known from training data, but we may not know the mathematical function describing the actual process that generates the outputs from the input vectors.
Function approximation is the task of learning or constructing a function that generates approximately the same outputs from input vectors as the process being modeled, based on available training data. Continuity and smoothness of the function are almost always required. Following established scientific practice, an important criterion is that of simplicity of the model, i.e., a function that can be described with as few parameters as possible.
These criteria sometimes oppose the performance criterion of minimizing error, as shown in figure 1. This set of samples contains one outlier whose behavior deviates significantly from other samples. The same is true in the example in figure 1. Among the latter, fa is certainly desirable because it is smoother and can be represented by a network with fewer parameters. Implicit in such comparisons is the assumption that the given samples themselves might contain some errors due to the method used in obtaining them, or due to environmental factors.
Function approximation can be performed using the networks described in chapters 3 and 4. Many industrial or manufacturing problems involve stabilizing the behavior of an object, or tracking the behavior of a moving object. These can also be viewed as function approximation problems in which the desired function is the time-varying behavior of the object in question. An example task is that of predicting the behavior of stock market indices.
Weigend and Huberman observe that prediction hinges on two types of knowledge: knowledge of underlying laws, a very powerful and accurate means of prediction, and the discovery of strong empirical regularities in observations of a given system.
However, laws underlying the behavior of a system are not easily discovered, and empirical regularities or periodicities are not always evident, and can often be masked by noise. Though perfect prediction is hardly ever possible, neural networks can be used to obtain reasonably good predictions in a number of cases. For instance, neural nets have succeeded in learning the 11-year cycle in sunspot data. At a high level, the prediction problem is a special case of function approximation problems, in which the function values are represented using time series.
A time series is a sequence of values measured over time, in discrete or continuous time units, e.g., a sequence of daily stock-index values. Consider a network that is to make predictions based upon the d most recent values of the series. At each step in the training phase, a d-tuple of input data (the recent history) is presented to the network. The network attempts to predict the next value in the time sequence. In this way, the forecasting problem reduces to a function approximation problem. In forecasting problems, it is important to consider both short-term ("one-lag") and long-term ("multilag") predictions.
In one-lag prediction, we forecast the next value based only on actual past values. In multilag prediction, on the other hand, some predicted values are also used to predict future values. Multilag prediction is required, for example, if we want to predict the value of a variable six months from today, not knowing the values for the next five months.
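The distinction between one-lag and multilag prediction is easy to see in code: one-lag always consumes actual history, while multilag feeds its own predictions back in. The `predict` argument stands in for whatever trained model is used and is an assumption of this sketch.

```python
def one_lag(series, d, predict):
    # Forecast each next value from the d most recent ACTUAL values.
    return [predict(series[i - d:i]) for i in range(d, len(series))]

def multilag(series, d, steps, predict):
    # Forecast several steps ahead, feeding predicted values back in
    # as inputs once actual values are no longer available.
    history = list(series[-d:])
    out = []
    for _ in range(steps):
        y = predict(history[-d:])
        out.append(y)
        history.append(y)  # a prediction now counts as "history"
    return out
```

In multilag prediction, errors can compound: a poor prediction at step one contaminates the inputs used at step two, which is why long-term forecasts are generally less reliable than one-lag forecasts.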
A better understanding of difficult problems is often obtained by studying many related variables together rather than by studying just one variable. A multivariate time series consists of sequences of values of several variables concurrently changing with time. The variables being measured may be significantly correlated, e. Values for each variable may then be predicted with greater accuracy if variations in the other variables are also taken into account. To be successful, forecasting must be based on all available correlations and empirical interdependencies among different temporal sequences.
Feedforward as well as recurrent networks have been used for forecasting and are discussed in chapters 3 and 4. Control addresses the task of determining the values for input variables in order to achieve desired values for output variables. This is also a function approximation problem, for which feedforward, recurrent, and some specialized neural networks have been used successfully. Adaptive control techniques have been developed for systems subject to large variations in parameter values, environmental conditions, and signal inputs.
Neural networks can be employed in adaptive control systems to provide fast response, without requiring human intervention. A simple example of a static control system is one that maps input voltages into mechanical displacements of a robotic arm.
Irrespective of the history of the system, an input voltage will always generate the same output displacement. By contrast, the inverted pendulum control task, where the behavior of the pendulum depends on time-dependent input variables such as velocity, is a dynamic control task.
Neural networks have been used for two tasks associated with control; in both tasks, learning is supervised because the system's behavior dictates what the neural network is to accomplish. These tasks, illustrated in figure 1. System forward identification is the task of approximating the behavior of a system using a neural network or other learning method.
Inverse identification is the task of learning the inverse of the behavior of a system, possibly using neural networks. For instance, given the amount of force applied to a robotic arm system, the system's behavior results in displacement by a certain amount. The inverse problem in this case consists of determining the force required to produce a desired amount of displacement.
If the neural network has been successfully trained to perform inverse system identification, it can generate values for system inputs needed to obtain desired system outputs.
If the system's input-output mapping is not completely known, or varies with time, a neural network may have to be continually trained to track changes in system behavior. A feedback control system is then appropriate, in which the error between actual and desired system output is used to modify the input signal, in order to produce the desired behavior.
Feedforward neural networks discussed in chapter 3 have been applied to many control problems, such as pole-balancing [Tolat and Widrow], robot arm control [Guez, Eilbert, and Kam], truck backing-up [Nguyen and Widrow], and inverse robot kinematics [Josin, Charney, and White]. These systems have been successful due to the following features:
1. Realization of fast decision making and control by parallel computation
2. Ability to adapt to a large number of parameters
3. Natural fault tolerance due to the distributed representation of information
4. Robustness to variations in parameters not modeled, due to the generalization properties of networks
Neural networks have also been applied to optimization problems. An example is the task of arranging components on a circuit board such that the total length of wires is minimized, with additional constraints that require certain components to be connected to certain others, as shown in figure 1.
Some such problems can also be solved using neural networks. The best solutions for complex optimization problems are often obtained using networks with stochastic behavior, in which network behavior has a significant random or probabilistic component. Chapter 7 discusses neural networks and stochastic algorithms for optimization. Many problems discussed in preceding subsections and solved using neural networks can be viewed as search problems: for instance, each "state" in a neural network is a possible weight matrix, each "move" is the possible change in the weight matrix that may be made by the learning algorithm, and the "goal" state is one for which the mean squared error is at a local minimum.
Neural networks can also be applied to certain other search problems often solved using symbolic reasoning systems. In attempting to solve a search problem using neural networks, the first and possibly the hardest task is to obtain a suitable representation for the search space in terms of nodes and weights in a network.
For instance, in a neural network to be used for game playing, the inputs to the network describe the current state of the board game, and the desired output pattern identifies the best possible move to be made. The weights in the network can be trained based on an evaluation of the quality of previous moves made by the network in response to various input patterns.
How well does the learner perform on the data on which the learner has been trained, i.e., how closely does it reproduce the desired outputs for the training samples? How well does the learner perform on new data not used for training the learner? What are the computational resources (time, space, effort) required by the learner? A commonly used error measure is a distance, such as the Euclidean distance, between desired and actual outputs. Such distance measures give equal emphasis to all dimensions of input data. Sometimes, it is more meaningful to use weighted distance measures that attach different degrees of importance to different dimensions.
For instance, if in a two-dimensional input vector the first dimension is height, measured in meters, and the second dimension is weight, measured in kilograms, the unweighted Euclidean distance between two samples is dominated by the weight coordinate, whose numerical values are far larger than those of the height coordinate. The nature of the problem sometimes dictates the choice of the error measure. In classification problems, in addition to the Euclidean distance, another possible error measure is the fraction of misclassified samples. Intra-cluster distance may be expressed as the sum or maximum of distances between all possible pairs of objects in each cluster, or between the centroid and other objects in each cluster.
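A weighted distance measure addresses the height-versus-weight problem above by rescaling each dimension before measuring distance. The function below is an illustrative sketch; the per-dimension weights are chosen by the user, not prescribed by the book.

```python
import math

def weighted_euclidean(a, b, weights):
    # Scale each squared coordinate difference by a per-dimension
    # weight, so that a dimension with large numerical magnitudes
    # (e.g., kilograms vs. meters) does not dominate the distance.
    return math.sqrt(sum(w * (ai - bi) ** 2
                         for w, ai, bi in zip(weights, a, b)))
```

Setting all weights to 1 recovers the ordinary Euclidean distance; setting a weight to 0 ignores that dimension entirely.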
Similarly, inter-cluster distances may be computed by adding the distances between centroids or pairs of objects in different clusters. But good generalizability is also necessary, i.e., good performance on data not used in training. Consider a child learning addition of one-digit numbers. However, would the child be able to add two numbers for which the answer was not provided earlier? This can occur if the child has learned how to extrapolate addition to larger numbers, rather than merely memorized the answers provided for training examples. The same distinction between learning and memorization is also relevant for neural networks.
In network development, therefore, available data is separated into two parts, of which one part is the training data and the other part is the test data. It has been observed that excessive training on the training data sometimes decreases performance on the test data.
One way to avoid this danger of "overtraining" is by constant evaluation of the system using test data as learning proceeds. After each small step of learning (in which performance of the network on training data improves), one must examine whether performance on test data also improves. If there is a succession of training steps in which performance improves only for the training data and not for the test data, overtraining is considered to have occurred, and the training process should be terminated.
However, training the networks or applying a learning algorithm can take a very long time, sometimes requiring many hours or days. Training time increases rapidly with the size of the networks and complexity of the problem. For fast training, it helps significantly to break down the problem into smaller subproblems and solve each one separately; modular networks that use such problem decomposition techniques were described in section 1.
The capabilities of a network are limited by its size. At the same time, the use of large networks increases training time and reduces generalizability. The size of a network can be measured in terms of the number of nodes, connections, and layers in a network. Complexity of node functions, possibly estimated as the number of bits needed to represent the functions, also contributes to network complexity measures.
Most learning algorithms, however, have been devised without clear hardware implementation considerations. For instance, neural hardware can generally allow only weights whose magnitudes are severely limited. Most chips also have limited fanout (pin count), so networks with very high connectivity are difficult to implement.
More biologically plausible are asynchronous networks in which any node or group of nodes may perform its computations at any instant. The time sequence of computations becomes especially relevant in recurrent or cyclic networks. When there are intra-layer connections, it is assumed that the previous output of one node is fed into the other node at the same layer, and vice versa.
Asynchrony is essential for proving the correctness properties of some networks. Parallelizability is another important consideration for neural networks. Depending on the hardware, synchronous or asynchronous computation models may be appropriate. Most existing learning methods have been devised for serial hardware, although some parallel implementations have also been developed. In this chapter, we have examined some of the main principles underlying neural network systems. Most neural network models combine a highly interconnected network architecture with a simple neuron model.
The use of a learning algorithm is assumed, which stores the knowledge specific to a problem in the weights of the connections between neurons. Neural networks have been used for several different kinds of tasks, such as classification, clustering, function approximation, and optimization. Each task requires a different kind of network and learning algorithm, whose details are discussed in the chapters that follow.
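A minimal illustration of knowledge being stored in the weights is the classic perceptron learning rule, here applied to the AND function (the learning rate and epoch count are illustrative choices, not values from the text):

```python
# Sketch: perceptron learning on the AND function. After training,
# everything the network "knows" about the task resides in w and b.

def train_perceptron(samples, epochs=20, lr=0.1):
    """samples: list of ((x1, x2), target) pairs with targets 0 or 1."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), t in samples:
            y = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0   # step node
            err = t - y
            w[0] += lr * err * x1   # the task knowledge accumulates
            w[1] += lr * err * x2   # in these weights and the bias
            b += lr * err
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
```

Because AND is linearly separable, this simple rule converges to weights that classify all four inputs correctly.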
So far, we have only examined the commonalities between these networks and attempted to present a general framework within which various neural network models can be understood.
Where possible, we will point out the relations to other well-known statistical and mathematical procedures, often cloaked under the cover of neural network terminology. Part of the appeal of neural networks is that they are non-parametric estimators, making no assumptions about input distributions, and that they use non-linear node functions; comparable non-parametric, non-linear statistical procedures are relatively complex and harder to implement than neural networks.
However, statistical procedures are useful in suggesting the limitations of different neural network models, and also point to directions for future research in neural networks. For instance, results of statistical analyses can be interpreted carefully, and measures of confidence can be formulated for results generated by statistical procedures; such measures are unavailable for neural network procedures. There are some problems for which the alternative to a neural network is a rule-based expert system.
The latter is suitable if expert knowledge is readily available. Neural networks can be applied more easily if raw data is available and it is difficult to find an expert whose knowledge can be codified into rules.
Again, the amount of external or expert assistance available may dictate the preferential use of the non-neural alternatives, although hardware or parallel implementations weigh in favor of neural nets.
Many books and thousands of papers have been written on the subject of neural networks since the decade that has been called the "renaissance" period for the subject, probably beginning with the publication of the book Parallel Distributed Processing (D. Rumelhart and J. McClelland). Anderson and Rosenfeld, and Cowan, are good sources that discuss the history of neural networks. A series of important network models is cataloged in Artificial Neural Systems (Simpson). For the interested and mathematically inclined reader, Neural Networks: A Comprehensive Foundation by Haykin, and Introduction to the Theory of Neural Computation by Hertz, Krogh, and Palmer, should serve as useful follow-ups to this text.
Books have also appeared on specific applications of neural networks, in areas such as chemistry and geography. Research papers appear in journals such as Neural Networks, IEEE Transactions on Neural Networks, Neurocomputing, Journal of Artificial Neural Networks, and Biological Cybernetics; important papers also appear in many other journals in the areas of artificial intelligence, statistics, pattern recognition, physics, biology, parallel computing, and cognitive science.
Many neural network tools are also freely available on the Internet. It is easy to get lost in this immense literature on neural networks. Not all networks are described here, nor anywhere else, for that matter: new networks, algorithms, and heuristics appear in the literature every month.
We hope that this book will help in the newcomer's first forays into this field and whet the reader's appetite for more details.

For each of the following problems, indicate whether it can be considered a problem of classification, clustering, pattern association, optimization, forecasting, function approximation, or another appropriate category.
For which of these problems would it be appropriate to use neural networks, as opposed to rule-based expert systems, statistical procedures, or other traditional techniques, and why? In each case, certain "inputs" specific to a particular situation may be available, as well as some knowledge stored in a neural network or some other system.
(a) Decide whether a person is smiling, based on an analysis of the visual input you receive.
(b) Remember the name of a person.
(c) Determine whether a person is suffering from malaria.
(d) Determine what actions will make a person leave you alone.

How useful is it to have a clustering system where the number of clusters equals the number of input patterns? Give an example of a classification task where clustering is not helpful.