9.9 Exercises

9.1 The following table consists of training data from an employee database. The data have been generalized. For example, “31…35” for age represents the age range of 31 to 35. For a given row entry, count represents the number of data tuples having the values for department, status, age, and salary given in that row.

[Table not reproduced. Columns: department, status, age, salary, count.]

Let status be the class-label attribute.

(a) Design a multilayer feed-forward neural network for the given data. Label the nodes in the input and output layers.

(b) Using the multilayer feed-forward neural network obtained in (a), show the weight values after one iteration of the backpropagation algorithm, given the training instance “(sales, senior, 31…35, 46K…50K)”. Indicate your initial weight values and biases and the learning rate used.
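
As a way to check hand calculations for part (b), one backpropagation iteration can be sketched as below. This is a minimal sketch, not a prescribed solution: it assumes one hidden layer, sigmoid units, and an input tuple already encoded as a numeric vector. The function name `backprop_step`, its parameter names, and the weight layout (`w_ih[i][j]` from input i to hidden unit j, `w_ho[j][k]` from hidden unit j to output k) are hypothetical choices made for illustration.

```python
import math


def sigmoid(x):
    """Logistic activation."""
    return 1.0 / (1.0 + math.exp(-x))


def backprop_step(x, target, w_ih, b_h, w_ho, b_o, lr):
    """Apply one backpropagation update for a single training tuple.

    Returns the updated (w_ih, b_h, w_ho, b_o) and the network output
    computed before the update. Inputs are not modified.
    """
    n_in, n_h, n_o = len(x), len(b_h), len(b_o)

    # Forward pass: net input plus bias, squashed by the sigmoid.
    h = [sigmoid(sum(x[i] * w_ih[i][j] for i in range(n_in)) + b_h[j])
         for j in range(n_h)]
    o = [sigmoid(sum(h[j] * w_ho[j][k] for j in range(n_h)) + b_o[k])
         for k in range(n_o)]

    # Backward pass: error at the output units, then at the hidden units.
    err_o = [o[k] * (1 - o[k]) * (target[k] - o[k]) for k in range(n_o)]
    err_h = [h[j] * (1 - h[j]) * sum(err_o[k] * w_ho[j][k] for k in range(n_o))
             for j in range(n_h)]

    # Weight and bias updates scaled by the learning rate.
    new_w_ho = [[w_ho[j][k] + lr * err_o[k] * h[j] for k in range(n_o)]
                for j in range(n_h)]
    new_b_o = [b_o[k] + lr * err_o[k] for k in range(n_o)]
    new_w_ih = [[w_ih[i][j] + lr * err_h[j] * x[i] for j in range(n_h)]
                for i in range(n_in)]
    new_b_h = [b_h[j] + lr * err_h[j] for j in range(n_h)]
    return new_w_ih, new_b_h, new_w_ho, new_b_o, o
```

How the training instance is encoded (for example, one input unit per attribute value) is part of the design asked for in (a) and is deliberately left open here.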

9.2 The support vector machine is a highly accurate classification method. However, SVM classifiers suffer from slow processing when training with a large set of data tuples. Discuss how to overcome this difficulty and develop a scalable SVM algorithm for efficient SVM classification in large data sets.

9.3 Compare and contrast associative classification and discriminative frequent pattern–based classification. Why is classification based on frequent patterns able to achieve higher classification accuracy in many cases than a classic decision tree method?

9.4 Compare the advantages and disadvantages of eager classification (e.g., decision tree, Bayesian, neural network) versus lazy classification (e.g., k-nearest neighbor, case-based reasoning).

9.5 Write an algorithm for k-nearest-neighbor classification given k, the number of nearest neighbors, and n, the number of attributes describing each tuple.
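
One possible starting point is sketched below. It assumes all n attributes are numeric and already normalized, uses Euclidean distance, and decides by simple majority vote; these are assumptions, not requirements of the exercise. The name `knn_classify` and the `(attribute_vector, label)` training format are hypothetical.

```python
import math
from collections import Counter


def knn_classify(training, query, k):
    """Label `query` by majority vote among its k nearest training tuples.

    training: list of (attribute_vector, label) pairs, each vector of length n
    query:    attribute vector of length n
    """
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the number of training tuples")

    # Euclidean distance from the query to every training tuple.
    by_distance = sorted(
        ((math.dist(vector, query), label) for vector, label in training),
        key=lambda pair: pair[0],
    )

    # Majority vote over the labels of the k closest tuples.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```

A fuller answer might also address categorical attributes, missing values, distance-weighted voting, and tie handling, none of which this sketch covers.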

9.6 Briefly describe the classification processes using (a) genetic algorithms, (b) rough sets, and (c) fuzzy sets.

9.7 Example 9.3 showed a use of error-correcting codes for a multiclass classification problem having four classes.

(a) Suppose that, given an unknown tuple to label, the seven trained binary classifiers collectively output the codeword 0101110, which does not match a codeword for any of the four classes. Using error correction, what class label should be assigned to the tuple?

(b) Explain why using a 4-bit vector for the codewords is insufficient for error correction.
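
Both parts of this exercise rest on Hamming distance, which can be explored with a short sketch. The codeword table `CODEWORDS` below is an assumption made for illustration, since Example 9.3 is not reproduced here; substitute the codewords from that example before relying on any result. The function names are likewise hypothetical.

```python
from itertools import combinations

# ASSUMED 7-bit codewords for four classes; replace with those of Example 9.3.
CODEWORDS = {
    "C1": "1111111",
    "C2": "0000111",
    "C3": "0011001",
    "C4": "0101010",
}


def hamming(a, b):
    """Number of bit positions in which two equal-length codewords differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(a, b))


def decode(output, codewords):
    """Return the class whose codeword is closest to the classifiers' output."""
    return min(codewords, key=lambda label: hamming(output, codewords[label]))


def min_distance(codewords):
    """Smallest Hamming distance between any two class codewords."""
    return min(hamming(a, b) for a, b in combinations(codewords.values(), 2))
```

A code with minimum distance h can correct up to ⌊(h − 1)/2⌋ bit errors, so computing `min_distance` for a candidate 4-bit code is one way to approach part (b).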

9.8 Semi-supervised classification, active learning, and transfer learning are useful for situations in which unlabeled data are abundant.

(a) Describe semi-supervised classification, active learning, and transfer learning. Elaborate on applications for which they are useful, as well as the challenges of these approaches to classification.

(b) Research and describe an approach to semi-supervised classification other than self-training and cotraining.

(c) Research and describe an approach to active learning other than pool-based learning.

(d) Research and describe an alternative approach to instance-based transfer learning.
