Annotation Model of the morphosyntactic component of the SUSANNE scheme as
applied to the British English SUSANNE corpus (Sampson, 1995), also
covering the simplified SUSANNE tag set as used by the TnT Tagger
(Brants 2000).
Brants, T. (2000),
TnT--a statistical part-of-speech tagger,
In Proc. ANLP 2000
Sampson, G. (1995), English for the computer:
The SUSANNE corpus and analytic scheme,
Oxford University Press
SUSANNE tag AT pertains to articles and other determiners, i.e., "every", the indefinite article, the negative determiner and the definite article.
(Sampson 1995, p.105)
SUSANNE tags beginning with NN represent common nouns and direction nouns.
Also, ND has been included here as it is comparable with temporal situating as in NNa, NNp, NNT....
The Conjunction class tags conjunctions and subsumes the classes BTO, coordinating conjunctions, subordinating conjunctions and pre-coordinators.
The SUSANNE tags begin with C...; for logical reasons, the pre-coordinator tag (LE) is added here.
SUSANNE tag used for "every" (Sampson 1995, p. 105), according to http://www.ilc.cnr.it/EAGLES96/morphsyn/node454.html#SECTION00084100000000000000, an indefinite determiner
SUSANNE tags beginning with FW... apply to foreign words which cannot be allocated a more specific tag by reference to their English context.
(Sampson 1995, p. 108)
GG is the SUSANNE tag used for the Germanic genitive inflection +'s, or +' after a plural stem and certain other stems ending in -s.
(Sampson 1995, p. 108)
SUSANNE tags beginning with VV... are used for non-modal verbs or non-auxiliary verbs.
LexicalVerbPastForm is a class for past tense verbforms.
(Sampson 1995, p. 118f)
Tags which begin with VV... are used for non-modal verbs or non-auxiliary verbs.
LexicalVerbPresentParticiple is a class for present participle verbforms.
(Sampson 1995: 118f)
Tags which begin with VV... are used for non-modal verbs or non-auxiliary verbs.
LexicalVerbThirdPerson is used for the third person form of full verbs.
(Sampson 1995, p. 119)
The NonPOS class subsumes separated morphological elements, formulae, equations, foreign words etc.
Also, GG (~ FA) and ZZ (single letter) are added here.
MD is the SUSANNE tag used for ordinal numbers, whether used as ordinal adjective or adverb or as fraction e.g. "third", "fourth".
(Sampson 1995, p. 111)
Organization nouns are common nouns that denote an organization and that are used as the head word in the names of organizations of that kind. (Sampson 1995, p. 94)
The SUSANNE tag LE applies to pre-coordinators, i.e., the first element of paired coordination markers (cf. http://www.ilc.cnr.it/EAGLES96/morphsyn/node622.html#SECTION000124200000000000000, §3.71).
In the Susanne Corpus "pre-co-ordinator" means the first part of paired co-ordinating markers, e.g. in "both... and", "neither... nor".
(Sampson 1995, p.110)
SUSANNE tags beginning with ICS apply to prepositions (I) that can also occur as subordinating conjunctions (CS), e.g.,
"after", "before", "since" (ICSt), "considering" (ICS), "save" (ICSx).
This ambiguity is not resolved in SUSANNE.
(Sampson 1995, p.108f.)
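The unresolved ambiguity can be made explicit when the tags are processed. The following is a minimal Python sketch, assuming tags are handled as plain strings; the names `ICS_EXAMPLES` and `possible_readings` are hypothetical and not part of the SUSANNE scheme or of any tagger. Only the example words and tags are taken from the description above.

```python
# Example words and their tags, as listed in the description above.
ICS_EXAMPLES = {
    "after": "ICSt",
    "before": "ICSt",
    "since": "ICSt",
    "considering": "ICS",
    "save": "ICSx",
}


def possible_readings(tag):
    """Return the readings that an ICS... tag leaves open (hypothetical helper).

    Tags that do not begin with "ICS" yield an empty tuple.
    """
    if tag.startswith("ICS"):
        # SUSANNE does not decide between the two readings.
        return ("preposition", "subordinating conjunction")
    return ()
```

A consumer of the tagged data would have to disambiguate the two readings itself, for instance from syntactic context.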
The Present class contains the SUSANNE tags VBM, VBZ, VDZ, VHZ and VVZ.
SUSANNE V...-tags which contain "M" represent first person singular of "be", those which contain "Z" represent third person singular verbs.
Other present verbs are tagged as BaseForm.
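The membership rule for the Present class can be sketched in Python as follows. The names `PRESENT_TAGS`, `is_present` and `present_reading` are hypothetical illustrations; only the five tags and the meaning of "M" and "Z" come from the description above. Exact string matching is an assumption: if the tags in the data carry further suffix letters, a prefix test may be more appropriate.

```python
# The five tags listed for the Present class in the description above.
PRESENT_TAGS = {"VBM", "VBZ", "VDZ", "VHZ", "VVZ"}


def is_present(tag):
    """Return True if `tag` is one of the listed Present tags (exact match assumed)."""
    return tag in PRESENT_TAGS


def present_reading(tag):
    """Return the person reading of a Present tag, or None for any other tag."""
    if not is_present(tag):
        return None
    # "M" marks first person singular of "be"; "Z" marks third person singular.
    if "M" in tag:
        return "first person singular of 'be'"
    return "third person singular"
```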
In SUSANNE terminology, a "qualifier" is an adverb that modifies an adjective or another adverb.
Tags beginning with RG... apply to qualifiers which have no other adverbial use.
(Sampson 1995, p.116)
SUSANNE tags beginning with D... form a very heterogeneous class of expressions, including quantifiers, DB and determiners.
According to the examples, the prototypical forms which are collected here can be used as determiners. Some are ambiguous with pronouns.
SUSANNE tags beginning with CS... represent subordinating conjunctions ("although", "how", "if").
(Sampson 1995, p.106)
The tags BTO and TO are also included here because, according to Van Valin and LaPolla (1997), English "to" (+ infinitive) is a subordinating
conjunction.
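As a rough illustration, the membership of this class can be written as a prefix test plus two explicitly listed tags. This is a sketch under the assumption that tags are plain strings; the function name `is_subordinating_conjunction` is hypothetical and does not belong to the annotation model itself.

```python
# Tags added to the class in addition to the CS... tags, per the description above.
EXTRA_SUBORDINATORS = {"BTO", "TO"}


def is_subordinating_conjunction(tag):
    """Return True for CS... tags and for the additionally included BTO and TO."""
    return tag.startswith("CS") or tag in EXTRA_SUBORDINATORS
```

Note that tags beginning with ICS are not matched by this test, since the prefix check looks only at the start of the tag.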
SUSANNE tag for "in order" introducing infinitive (cf. TO);
according to http://www.ilc.cnr.it/EAGLES96/morphsyn/node623.html#SECTION000124300000000000000 this is a subordinating conjunction
These are substitutive possessive pronouns. The SUSANNE tags begin with PPG... (G stands for "genitive", as English possessives are derived from genitive forms of personal pronouns).
(Sampson 1995, p. 115)
SUSANNE tags which begin with PNQ... represent SubstitutiveWHPronouns. This class contains PNQV ("whosever", "whomever", "whoever"), SubstitutiveInterrogativePronoun and SubstitutiveRelativePronoun.
SUSANNE TagSet (Sampson 1995) augmented with
English examples from the Susanne corpus and from "Morphosyntactic
Phenomena Encoded in Lexicons and Corpora A Common Proposal and
Applications to European Languages EAG---CLWG---MORPHSYN/R
Version of 31st Aug, 1996"
(http://www.ilc.cnr.it/EAGLES96/morphsyn/)
Common nouns of style and title. An S term is a status-indicating item which either accompanies one or more individual names within the full title of a person or is used to address a person of appropriate status, or both. (Sampson 1995, p. 95, 112-113)
Following the EAGLES recommendations, the Unique class subsumes different 'particles'.
These are isolated, uninflectable forms which have not been included anywhere else
(see http://www.ilc.cnr.it/EAGLES96/morphsyn/node700.html#SECTION000153100000000000000 for a list).
Includes "to" (+ Infin.) (UI), negative "not", "n't" (UN), and existential "there" (UX)
Unit nouns, i.e., units of measurement, whether written in full ("inch", "kilogram") or abbreviated either alphabetically or symbolically, as head of a noun phrase denoting a measured quantity. (Sampson 1995, p. 97)
SUSANNE tags beginning with V... represent verbs.
Note that the subclassification follows the following EAGLES recommendation:
Since it is impractical, however, given the current capabilities of tagging software,
to resolve automatically the ambiguity of these six morphological functions, it is a
common practice to assign a single value to the base form, or else to assign two values,
one for the finite and one for the non-finite functions. Because of this, the tables below
show two tagsets: one tagset representing the 6 attribute-values above, and a reduced tagset
(`RTags'), which resembles most tagsets so far used for the English language in reducing the
six values to two.
http://www.ilc.cnr.it/EAGLES96/morphsyn/node150.html#SECTION00054000000000000000