11.6 Architecture

This section gives a brief overview of the complete hyponymy extraction process.

1. In the first step, the corpus containing the hyponyms (here, the German Wikipedia) is parsed by the deep linguistic parser WOCADI [23]. To do so, WOCADI uses the semantic lexicon HaGenLex [24] and a given knowledge base KB. For each sentence, the WOCADI analysis outputs a token list, a dependency tree, and a semantic network.
2. Shallow extraction rules (similar to Hearst patterns) are applied to the token list.
3. Deep extraction rules are applied to the semantic network representation.
4. A validation module filters out incorrect hypotheses by examining their semantic properties [25].
5. Not all of the hypotheses that pass this filter are actually correct. Therefore, a support vector machine is additionally applied to validate the accepted hypotheses. Validation scores are calculated for all hypotheses and stored together with them in the hypotheses knowledge base HKB.
6. The best-scoring hypotheses in HKB are stored in the knowledge base KB after manual inspection.
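
As a rough illustration of step 2, a shallow rule in the style of a Hearst pattern can be sketched as a match over a token list. The English pattern "X and other Y", the function name extract_and_other, and the restriction to single-token concepts are all assumptions made for this example; the actual rules operate on the token lists WOCADI produces for German text and are not reproduced here.

```python
from typing import List, Tuple


def extract_and_other(tokens: List[str]) -> List[Tuple[str, str]]:
    """Return (hyponym, hypernym) candidates for the pattern 'X and other Y'.

    Toy assumption: X and Y are single tokens directly adjacent to the
    trigger words; a real rule would handle multi-word noun phrases.
    """
    pairs = []
    # i indexes the token "and"; it needs one token before and two after.
    for i in range(1, len(tokens) - 2):
        if tokens[i].lower() == "and" and tokens[i + 1].lower() == "other":
            pairs.append((tokens[i - 1], tokens[i + 2]))
    return pairs


if __name__ == "__main__":
    sentence = "He plays violins and other instruments"
    print(extract_and_other(sentence.split()))
```

Each pair returned this way is only a hypothesis; in the process described above it would still have to pass the validation steps before being considered for KB.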

The entire acquisition process is illustrated in Figure 11.2.

Figure 11.2 Activity diagram for the hyponym acquisition process.
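
The later stages of this process (filtering, scoring, and shortlisting for manual inspection) can be sketched as follows. This is a minimal sketch under stated assumptions: the Hypothesis class, the validate_and_rank function, and the two callables are hypothetical names, the score callable merely stands in for the support vector machine, and scoring only the hypotheses that pass the filter is one possible reading of the description.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Hypothesis:
    hyponym: str
    hypernym: str
    score: float = 0.0


def validate_and_rank(
    candidates: Iterable[Hypothesis],
    passes_filter: Callable[[Hypothesis], bool],  # stand-in for the semantic filter
    score: Callable[[Hypothesis], float],         # stand-in for the SVM score
    top_n: int,
) -> Tuple[List[Hypothesis], List[Hypothesis]]:
    """Return (HKB contents, shortlist proposed for manual inspection)."""
    hkb: List[Hypothesis] = []
    for hyp in candidates:
        if not passes_filter(hyp):
            continue  # rejected by the filter, never scored
        hyp.score = score(hyp)
        hkb.append(hyp)
    # Highest score first; the shortlist still requires manual inspection.
    hkb.sort(key=lambda h: h.score, reverse=True)
    return hkb, hkb[:top_n]


if __name__ == "__main__":
    cands = [Hypothesis("dog", "animal"), Hypothesis("dog", "dog"),
             Hypothesis("oak", "tree")]
    hkb, shortlist = validate_and_rank(
        cands,
        passes_filter=lambda h: h.hyponym != h.hypernym,  # toy filter
        score=lambda h: len(h.hypernym) / 10,             # toy score
        top_n=1,
    )
    print(hkb, shortlist)
```

Keeping the filter and the scorer as separate callables mirrors the two-stage validation in the list above: a cheap filter first, then a score for ranking.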


A deep extraction rule consists of a conclusion sub0(a1, a2), where sub0 denotes the hyponymy, instance-of, or troponymy relation, and a premise, which is a semantic network in which two of the nodes are labeled with the variables a1 and a2. The system tries to match this pattern network against the semantic network of the sentence; the variables can be bound to arbitrary concepts. The instantiated conclusion, in which a1 and a2 are replaced by the concepts they were bound to, is the extracted hypothesis. Several example rules are given in Table 11.1. The extraction rules are partly specified manually and partly learned automatically from a collection of annotated semantic networks using the Minimum Description Length principle [26], essentially following the approach of Cook and Holder [12].
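
To make the matching step concrete, the following sketch treats a semantic network as a set of labeled edges and matches a premise against it by simple backtracking. The edge representation, the function names, and the relation labels r1 and r2 are placeholders invented for this illustration; they are not the relations or the matching algorithm actually used by the system.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Edge = Tuple[str, str, str]  # (relation label, first node, second node)


def unify(term: str, node: str, variables: Set[str],
          binding: Dict[str, str]) -> bool:
    """Bind a premise term to a network node, or check an existing binding."""
    if term not in variables:
        return term == node          # constants must match exactly
    if term in binding:
        return binding[term] == node
    binding[term] = node
    return True


def match(premise: List[Edge], network: Set[Edge], variables: Set[str],
          binding: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Yield every variable binding under which all premise edges occur."""
    binding = binding or {}
    if not premise:
        yield dict(binding)
        return
    rel, x, y = premise[0]
    for r, u, v in network:
        if r != rel:
            continue
        trial = dict(binding)        # copy so failed attempts leave no trace
        if unify(x, u, variables, trial) and unify(y, v, variables, trial):
            yield from match(premise[1:], network, variables, trial)


def apply_rule(premise: List[Edge], network: Set[Edge],
               variables: Set[str]) -> Set[Tuple[str, str, str]]:
    """Instantiate the conclusion sub0(a1, a2) for every successful match."""
    return {("sub0", b["a1"], b["a2"])
            for b in match(premise, network, variables)}


if __name__ == "__main__":
    network = {("r1", "n1", "violin"), ("r2", "n1", "instrument"),
               ("r1", "n2", "oak")}
    premise = [("r1", "x", "a1"), ("r2", "x", "a2")]
    print(apply_rule(premise, network, {"x", "a1", "a2"}))
```

In this toy network only node n1 carries both an r1 and an r2 edge, so the premise matches once and yields a single hypothesis relating "violin" to "instrument".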

Table 11.1 A Selection of Deep Patterns.

