The trained models (classifiers) produced from the learning phase will then be used to disambiguate unseen and unlabeled examples in the testing phase. That is, during the learning phase, the constructed feature vectors of the training instances will be used as labeled examples to train classifiers. The classifiers will then be used to disambiguate unseen and unlabeled examples in the application phase. One of the main strengths of this method is that discriminating features are selected for learning and classification.
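To make the learn-then-apply workflow concrete, the following is a minimal sketch in Python. It assumes the feature vectors of the training instances have already been constructed; the use of scikit-learn and of a Naive Bayes classifier is an illustrative assumption, as the text does not prescribe a particular learner.

```python
# Minimal sketch of the learning and application phases described above.
# Assumptions: feature vectors are already built; scikit-learn and a
# Naive Bayes classifier are illustrative choices, not prescribed by the text.
from sklearn.naive_bayes import MultinomialNB

# X_train: one feature vector per labeled training instance
# y_train: the sense label of each instance (e.g., "s1" or "s2")
X_train = [[2, 0, 1], [0, 3, 0], [1, 0, 2], [0, 2, 1]]
y_train = ["s1", "s2", "s1", "s2"]

clf = MultinomialNB()
clf.fit(X_train, y_train)      # learning phase: train on labeled examples

X_unseen = [[1, 1, 0]]         # application phase: an unseen, unlabeled example
print(clf.predict(X_unseen))   # predicted sense for the unseen instance
```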

Feature Selection

The features selected from the training examples have a great impact on the effectiveness of the machine learning technique. Extensive research efforts have been devoted to feature selection in machine learning research [18–21].

The labeled training instances will be used to extract the word features for the feature vectors. Suppose the word wx has two senses s1 and s2; let C1 be the set of wx instances labeled with s1, and let C2 be the set of wx instances labeled with s2. Each instance of wx labeled with sense s1 or s2 (i.e., in the set C1 or in the set C2) can then be viewed as

$$p_n \cdots p_3\, p_2\, p_1\; w_x\; f_1\, f_2\, f_3 \cdots f_n, \quad (1)$$

where the words p1, p2, ..., pn and f1, f2, ..., fn are the context words surrounding this instance and n is the window size. Next, we collect all the context words pi and fi of all instances in C1 and C2 into one set W (i.e., W = {w1, w2, ..., wm}). Each context word wi ∈ W may occur in the contexts of instances labeled with s1, with s2, or with both, and in any distribution.
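As an illustration of how the context words pi and fi might be gathered, here is a small sketch. The function names and the representation of each labeled instance as a (tokens, target_index) pair are assumptions made for illustration, not part of the original method description.

```python
# Sketch of collecting the context words p_1..p_n and f_1..f_n around each
# labeled instance of w_x, and pooling them into the set W.
# The (tokens, target_index) representation is an assumption for illustration.
def context_window(tokens, target_index, n):
    """Return the n words before and the n words after the target position."""
    preceding = tokens[max(0, target_index - n):target_index]   # p_n ... p_1
    following = tokens[target_index + 1:target_index + 1 + n]   # f_1 ... f_n
    return preceding + following

def collect_context_words(instances, n):
    """instances: list of (tokens, target_index) pairs drawn from C1 and C2."""
    W = set()
    for tokens, idx in instances:
        W.update(context_window(tokens, idx, n))
    return W

# Example: one instance of the ambiguous word at position 3, window size n = 2
tokens = ["the", "expression", "of", "p53", "protein", "in", "cells"]
print(context_window(tokens, 3, 2))   # ['expression', 'of', 'protein', 'in']
```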

We want to determine, when we see a context word wq in an ambiguous instance/example, to what extent this occurrence of wq suggests that the example belongs to C1 or to C2. Thus, we use as features those context words wi that can highly discriminate between C1 and C2. For that, we use feature selection techniques such as mutual information (MI) [19, 20] as follows. For each context word wi ∈ W in the labeled training examples, we compute four values a, b, c, and d:

a = number of occurrences of wi in C1,
b = number of occurrences of wi in C2,
c = number of examples of C1 that do not contain wi,
d = number of examples of C2 that do not contain wi.

The mutual information (MI) can then be defined as

$$\mathrm{MI} = \frac{N \cdot a}{(a + b)\,(a + c)}, \quad (2)$$

where N is the total number of training examples. MI is a well-known concept in information theory and statistical learning; it is a measure of the interaction and common information between two variables [22]. In this work, we adapted MI to represent the interaction between the context words wi and the class label, based on the values a through d defined above.
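A small sketch of computing the a, b, c, d counts and the MI score of equation (2) for each candidate context word follows. Representing each training example as the set of its context words, and the sample data, are assumptions made for illustration.

```python
# Sketch of the a/b/c/d counts and the MI score from equation (2).
# Each training example is represented here as a set of its context words;
# this representation and the sample data are assumptions for illustration.
def mi_score(word, C1, C2):
    a = sum(1 for ex in C1 if word in ex)   # occurrences of w_i in C1
    b = sum(1 for ex in C2 if word in ex)   # occurrences of w_i in C2
    c = len(C1) - a                         # examples of C1 without w_i
    d = len(C2) - b                         # examples of C2 without w_i
    N = len(C1) + len(C2)                   # total number of training examples
    denom = (a + b) * (a + c)
    return (N * a) / denom if denom else 0.0

C1 = [{"gene", "expression", "tumor"}, {"protein", "expression"}]
C2 = [{"river", "bank"}, {"bank", "money", "expression"}]

# Rank candidate context words by how well they discriminate C1 from C2.
for w in ["expression", "gene", "bank"]:
    print(w, mi_score(w, C1, C2))
```

Words that occur mostly in one of the two sets (such as "gene" above) receive higher scores than words spread across both, which is exactly the discriminative behavior the feature selection step relies on.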
