Hyponym harvesting has attracted considerable interest, and a large number of approaches have been developed so far. They can be divided into pattern-based, kernel-based, and document-clustering-based [17] methods. A typical text pattern consists of a sequence of words or punctuation marks and two placeholder variables labeled hypernym and hyponym. An example is the Hearst pattern [18] “hyponym and other hypernym.” If it is matched against the sentence “The secretary and other politicians criticized the law,” the placeholder hyponym is bound to secretary and the placeholder hypernym to politician, so the correct hyponymy relation “secretary is a hyponym of politician” is extracted. Such a surface-based approach is easy to implement but quite limited. It fails, for instance, if an additional clause is inserted, as in “The secretary and, according to our information, a lot of other politicians criticized the law.” In this case, the given surface pattern can no longer extract the above-mentioned relation.
This problem can be overcome by employing graph-based representations such as dependency trees, in which case the patterns are dependency subtrees. An approach to learn such patterns automatically was devised by Snow et al. [15]. For each pair of nouns, the path that connects them in the dependency tree is extracted. To account for keywords that indicate a hyponymy relation, for example the word such (see the first Hearst pattern), they also add to the path the links to the words on either side of the two nouns, if these are not yet contained in it. Frequently occurring paths are then learned as patterns indicating a hyponymy relation.
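The behavior of the surface pattern “hyponym and other hypernym” on the two example sentences can be illustrated with a minimal sketch. It is not the implementation used in [18]. The regular expression, the function names extract_hyponym_pairs and singularize, the restriction to single-word nouns, and the naive plural stripping are all assumptions made for illustration.

```python
import re

# Surface pattern "HYPONYM and other HYPERNYM".
# Assumption: both placeholders are single words (no multi-word noun phrases).
HEARST_AND_OTHER = re.compile(r"\b(?P<hyponym>\w+) and other (?P<hypernym>\w+)\b")


def singularize(noun):
    """Naive plural stripping (assumption); a real system would use a lemmatizer."""
    if noun.endswith("s") and not noun.endswith("ss"):
        return noun[:-1]
    return noun


def extract_hyponym_pairs(sentence):
    """Return (hyponym, hypernym) pairs found by the surface pattern."""
    return [
        (m.group("hyponym").lower(), singularize(m.group("hypernym").lower()))
        for m in HEARST_AND_OTHER.finditer(sentence)
    ]


simple = "The secretary and other politicians criticized the law"
with_insertion = ("The secretary and, according to our information, "
                  "a lot of other politicians criticized the law")

print(extract_hyponym_pairs(simple))          # one pair is found
print(extract_hyponym_pairs(with_insertion))  # the inserted clause breaks the match
```

On the first sentence the sketch yields the pair (secretary, politician). On the second it yields nothing, because the words “and” and “other” are no longer adjacent, which is the limitation that motivates patterns defined over dependency trees.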
Another frequently used approach relies on structure kernels, usually tree and sequence kernels. Assume that a relation R1 = R(a1, a2) is to be compared with a relation R2 = R(a1', a2'). Then the following kernels might be applied to estimate the similarity of the surrounding tree structures:
Hyponym extraction approaches based on document clustering usually try to convert the document hierarchy, as determined by a hierarchical clustering method, into a taxonomy [17,20].