Preliminary Recommendations



It is not necessary here to provide a general motivation justifying the need for standards in the field of Language Engineering. We can simply state that standards are needed in the area of Computational Lexicons in order to:

Following the work done within the EAGLES Lexicon Working Group in the area of Morphosyntax on standardising lexical information at the morphosyntactic layer (EAGLES, 1996), the obvious next step was to move on to the syntactic layer and to deal with subcategorisation information. Subcategorisation (restricted to verbs for the moment) is broadly understood here as referring to typical collocations sanctioned by strong syntactic/semantic selection (the head/complement relation), thus leaving out other collocation types such as head/modifier, head/specifier, etc.

We note here a number of general issues that help to situate the work done by the Lexicon/Syntax group. The first issue concerns how the EAGLES Lexicon work is characterised with respect to the ways in which one can try to define standards. One can hypothesise different approaches to the task of reaching consensus, by providing standards at the levels of:

From this viewpoint, the most important concern for EAGLES is linguistic substance. Consequently, the group is building on the results of the ET-7 feasibility study (Heid & McNaught, 1991), which recommended the following methodology: to break up complex descriptive devices into "minimal observable facts" in order to arrive at the most fine-grained, common set of features underlying different theoretical frameworks or systems. EAGLES results are therefore based on a careful and detailed analysis of different linguistic theories and frameworks, while aiming to reach a consensus at the level of these "minimal observable facts".

Connected with this basic objective is the approach chosen to achieve it, which can be described as looking for an edited union of the features proposed in the various major theories and systems. This approach tries to capture all the relevant distinctions made by the major lexical theories/systems without taking a theoretical stand, and therefore gives features labels that are as neutral as possible. By employing this approach, the Lexicon/Syntax group is using the same methodology as that already established within the Lexicon/Morphosyntax group, thus adhering to a coherent path.
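The idea of an edited union over "minimal observable facts" can be illustrated with a small sketch. Everything in it is hypothetical: the two "theories", the attribute names and the values are invented for illustration and are not taken from the EAGLES recommendations. It only shows the general shape of the method, assuming each theory's description of a verb frame has already been decomposed into attribute/value pairs.

```python
# Hypothetical illustration only: theory names, attributes and values
# are invented for this sketch and are not EAGLES feature labels.

# Two theory-specific descriptions of the same verb frame, each already
# broken down into minimal facts expressed as (attribute, value) pairs.
theory_a = {
    ("complement_count", 2),
    ("complement_1_category", "NP"),
    ("complement_2_category", "PP"),
}
theory_b = {
    ("complement_count", 2),
    ("complement_1_category", "NP"),
    ("complement_2_category", "PP"),
    ("complement_2_preposition", "to"),
}


def edited_union(*descriptions, exclude=frozenset()):
    """Merge all minimal facts, then drop any attributes that the
    editors decided to leave out of the common feature set."""
    merged = set().union(*descriptions)
    return {(attr, val) for attr, val in merged if attr not in exclude}


# Facts shared by both descriptions (the strict common core).
common_core = theory_a & theory_b

# The edited union keeps every distinction made by either description.
features = edited_union(theory_a, theory_b)

for attr, val in sorted(features, key=lambda pair: pair[0]):
    print(f"{attr} = {val}")
```

In this sketch the union retains the preposition distinction made by only one description, whereas the intersection would lose it; the `exclude` parameter stands in for the editorial step, which in practice is a matter of expert judgement rather than a mechanical filter.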

As background principles to our work, we aim to be:

In an attempt to be as theory-compatible as possible, a few choices were left open, especially for those aspects of grammatical description which tend to be more theory-bound (e.g. grammatical relations and control). This decision has practical drawbacks, especially with regard to the implementation of the proposed standard, but, at least in this first phase, more importance was given to avoiding commitment to specific theories of lexical description. We recognise that there is a tension between the decision to be flexible and open to more than one choice and the real and effective usability of the proposed standards. Without abandoning the principle of flexibility and openness, we can provide an indication of usage by exemplifying the implementation of critical choices.

The last point we want to mention concerns the methodology of work within EAGLES. In general, EAGLES results are achieved dynamically, through a cyclical process of revision after one or more phases of testing and feedback, possibly in large projects. It is worth pointing out how the European approach differs from other approaches to standards, with the caveat that this describes a general tendency. While in, say, the USA, a sort of de facto standard is made available to the community through the provision of publicly available data, in Europe we try to arrive at consensually agreed standards. This implies a considerable effort to involve the relevant experts in the different areas of concern, either in the phase of producing the standards or at least in the subsequent phases of testing the proposals and providing feedback. This approach also involves considerable overhead in the activities and work needed to arrive at a consensus, and it reaches the intended results more slowly.

