4. We have established a model selection method based on entire decision performance evaluation, which provides a theoretical foundation and technical support for model selection in knowledge discovery.
We have established a class of model selection methods based on entire decision performance evaluation. In the context of complete information systems, we have proposed three decision performance parameters of a complete decision-rule set: the entire certainty measure, the entire consistency measure, and the entire support measure. In the context of incomplete information systems, we have first characterized incomplete decision rules using maximal consistent blocks and have then given the corresponding three parameters. To evaluate the entire decision performance of dominance rules from ordered decision information systems, we have also developed three evaluation parameters: the entire certainty measure, the entire consistency measure, and the cover measure. These results show that the proposed evaluation methods are more effective than existing methods based on approximation accuracy and approximation quality, and they provide a theoretical foundation and technical support for model selection and scientific decision making on a specific issue.
In summary, from the viewpoint of the granulation cognitive mechanism, this paper has obtained a series of important results at four stages: information granulation, granulation uncertainty, modeling strategy, and model selection. These results initially establish a data modeling theory and method architecture based on the granulation mechanism, which has important theoretical significance for complex data modeling and practical application value for improving the efficiency of mass information processing.
Keywords: Complex data; Data modeling; Granular computing; Information granulation; Granular space; Information granularity; Multigranulation; Dynamic granulation; Ordered granulation; Model selection
Chapter Two
2.1 Neural Networks
Consider a supervised learning problem where we have access to labeled training examples $(x^{(i)}, y^{(i)})$. Neural networks give a way of defining a complex, non-linear form of hypotheses $h_{W,b}(x)$, with parameters $W, b$ that we can fit to our data.
To describe neural networks, we will begin by describing the simplest possible neural network, one which comprises a single "neuron." We will use the following diagram to denote a single neuron:
This "neuron" is a computational unit that takes as input $x_1, x_2, x_3$ (and a $+1$ intercept term), and outputs $h_{W,b}(x) = f(W^{T}x + b) = f\left(\sum_{i=1}^{3} W_i x_i + b\right)$, where $f : \mathbb{R} \mapsto \mathbb{R}$ is called the activation function. In these notes, we will choose $f(\cdot)$ to be the sigmoid function:
$$f(z) = \frac{1}{1 + \exp(-z)}.$$
Thus, our single neuron corresponds exactly to the input-output mapping defined by logistic regression.
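To make this mapping concrete, here is a minimal Python (NumPy) sketch of the single-neuron computation described above; the input values, weights, and intercept are arbitrary numbers chosen for illustration, not values from the text.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, W, b):
    """Single-neuron output: h_{W,b}(x) = f(W^T x + b)."""
    return sigmoid(np.dot(W, x) + b)

# Hypothetical inputs, weights, and intercept for illustration.
x = np.array([1.0, 2.0, 3.0])    # inputs x1, x2, x3
W = np.array([0.5, -0.25, 0.1])  # weights (assumed values)
b = -0.5                         # intercept term, kept separate from W
print(neuron_output(x, W, b))    # a value in (0, 1), as in logistic regression
```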
Although these notes will use the sigmoid function, it is worth noting that another common choice for $f$ is the hyperbolic tangent, or tanh, function:
$$f(z) = \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}.$$
Here are plots of the sigmoid and tanh functions:
The $\tanh(z)$ function is a rescaled version of the sigmoid, and its output range is $[-1, 1]$ instead of $[0, 1]$.
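The rescaling can be made precise with a short derivation using only the two definitions above (here $\sigma$ denotes the sigmoid; this step is added for clarity and is not part of the original notes):
$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} = \frac{1 - e^{-2z}}{1 + e^{-2z}} = \frac{2}{1 + e^{-2z}} - 1 = 2\,\sigma(2z) - 1,$$
i.e., tanh is the sigmoid evaluated at $2z$, stretched from $[0, 1]$ to $[-1, 1]$.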
Note that unlike some other venues (including the Open Classroom videos, and parts of CS229), we are not using the convention here of $x_0 = 1$. Instead, the intercept term is handled separately by the parameter $b$.
Finally, one identity that'll be useful later: if $f(z) = 1/(1 + \exp(-z))$ is the sigmoid function, then its derivative is given by $f'(z) = f(z)(1 - f(z))$. (If $f$ is the tanh function, then its derivative is given by $f'(z) = 1 - (f(z))^{2}$.) You can derive this yourself using the definition of the sigmoid (or tanh) function.
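Both identities are easy to sanity-check numerically. The sketch below compares each analytic derivative against a central finite difference; the test point and step size are illustrative choices, not part of the original notes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def central_difference(f, z, eps=1e-6):
    """Numerical derivative of f at z via a central finite difference."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

z = 0.7  # arbitrary test point
# Sigmoid identity: f'(z) = f(z) * (1 - f(z))
print(central_difference(sigmoid, z), sigmoid(z) * (1.0 - sigmoid(z)))
# Tanh identity: f'(z) = 1 - f(z)^2
print(central_difference(np.tanh, z), 1.0 - np.tanh(z) ** 2)
```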
2.2 Neural Network Model
A neural network is put together by hooking together many of our simple "neurons," so that the output of a neuron can be the input of another. For example, here is a small neural network:
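Since the referenced diagram is not reproduced here, the following sketch assumes a small network with three inputs, one hidden layer of three sigmoid neurons, and a single sigmoid output unit; the layer sizes and random weights are illustrative assumptions, not prescribed by the text. It shows the composition described above: each neuron's output feeds the next layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Forward pass: each layer computes a = f(W a_prev + b)."""
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

rng = np.random.default_rng(0)
# Assumed architecture: 3 inputs -> 3 hidden sigmoid units -> 1 output unit.
layers = [
    (rng.standard_normal((3, 3)), np.zeros(3)),  # hidden layer (W1, b1)
    (rng.standard_normal((1, 3)), np.zeros(1)),  # output layer (W2, b2)
]
x = np.array([1.0, 2.0, 3.0])
print(forward(x, layers))  # network output h_{W,b}(x)
```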