OLiA Annotation Model for the SFB632 Annotation Guidelines
(Dipper et al. 2007) for Morphology and Syntax
Stefanie Dipper, Michael Götze and Stavros Skopeteas (2007),
Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology,
Syntax, Semantics, and Information Structure.
In: Interdisciplinary Studies on Information Structure (ISIS)
Working papers of the SFB 632; vol. 7.
Universität Potsdam
Accusative case is the case in nominative-accusative languages that marks certain syntactic functions, usually direct objects. (http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsAccusativeCase.htm)
This is the Adjectival Phrase (AP). In general, adjectives are not annotated at the syntactic layer. However, there are two exceptions: adjectives (or APs) that function as nominal predicates are annotated with AP. The head of the AP is not labeled; this information can be retrieved from the POS layer.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 3.3.4)
These are adjectives, e.g. Spanish "aburrido" (boring).
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.3)
The category adjunct (ADJ) is assigned to those constituents that appear as optional additions, be it to the main verb or to a given noun. This means that they can be left out freely without a change in
grammaticality or a significant change in meaning.
In "John called Mary (from school) (with his cell phone)", the phrases "from school" and "with his cell phone" are optional additions that can be left out freely.
Adjuncts are generally used to convey additional information about the time, place, manner, or cause of the event or situation described by the clause (see below). That is, they restrict the class of events/situations described by the clause to a subset. If required, the category ADJ can be split up into semantic sub-categories, which are annotated at the semantic role layer (time, location, etc.).
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.3)
These are adpositions: preposition/postposition/X-positions, e.g. "before" in "before two years", "ago" in "two years ago", cf. German "um ... willen" in "um unseres Vaters willen".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.5)
These are adverbs, e.g. "soon".
So-called pronominal adverbs in German are also annotated as ADV, e.g. "darueber".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.4)
These are subordinate clauses with adverbial function which are annotated as ADV, e.g. "Tom sleeps when the sun rises."
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.6)
NPs that refer to the entities that cause actions, either animates or inanimates,
are annotated as agents.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.2)
The category ARG is assigned to those syntactic constituents that appear as obligatory complements to the main verb. This means that they cannot be left out without a change in grammaticality or a significant change in meaning.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.3)
If you need an attributive paradigm of pronouns, then append AT. Attributive pronouns function as determiners. E.g. "your" is tagged as PRONPOS-AT.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
These are auxiliary verbs, e.g. "have", cf. "be" in "be destroyed", cf. German "haben" in the formation of perfect tenses ("gegessen haben").
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.2)
Cause indicates the reason why something happens and is often expressed by a
PP (because of, with, through etc.). Sometimes this role is close to the role of
Instrument. The criterion for choosing the tag CAUSE is whether the expression can
be paraphrased by a causal subordinate clause.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.10)
This is a sentence/ clause (S). It marks both main clauses and subordinate clauses. The root S symbol also covers the final punctuation mark.
Dependent verb forms (infinitives, gerunds, participles, etc.) are labeled as S. Also infinitival complements of lexical verbs are annotated as S.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 3.3.5)
Comitative applies to an animate entity that accompanies a participant of the
action.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.12)
Comitative case is a case expressing accompaniment.
It carries the meaning "with" or "accompanied by."
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsComitativeCase.htm
NCOM is used for common nouns, e.g. "house".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.1)
The comparative is the form of an adjective or adverb which denotes the degree or grade by which a person, thing, or other entity has a property or quality greater or less in extent than that of another.
http://en.wikipedia.org/wiki/Comparative
A complementizer is a conjunction which marks a complement clause.
(http://www.sil.org/linguistics/GlossaryofLinguisticTerms/WhatIsAComplementizer.htm 22.07.07)
This tag is used if you need to indicate complementizers or adverbial subordinating conjunctions separately, e.g. "that", "when".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.7)
This is the Constituent structure layer. Multiple layers are used, named 'CS1', 'CS2', ..., 'CSn'.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 3.1)
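The naming scheme for the constituent structure layers can be sketched in a few lines of Python. This is only an illustration: the function name `cs_layer_names` is hypothetical and not part of the guidelines; only the pattern 'CS1', 'CS2', ..., 'CSn' is taken from the text.

```python
from typing import List


def cs_layer_names(n: int) -> List[str]:
    """Return the layer names CS1 .. CSn (hypothetical helper)."""
    if n < 1:
        raise ValueError("at least one constituent structure layer is needed")
    # Layers are numbered from 1, following the 'CS1', 'CS2', ... pattern.
    return [f"CS{i}" for i in range(1, n + 1)]


print(cs_layer_names(3))
```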
These are coordinating conjunctions, e.g. "and", "or".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.7)
A copula is an intransitive verb which links a subject to a noun phrase, an adjective, or other
constituent which expresses the predicate.
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsACopula.htm
These are copula verbs, e.g. "be" in "be happy".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.2)
Dative case is a case that marks indirect objects (for languages in which they are held to exist) or nouns having the role of recipient (as of things given), beneficiary of an action, or possessor of an item.
(http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsDativeCase.htm)
These are demonstrative pronouns, e.g. "this". Notice that German has a demonstrative pronoun that is in most cases homonymous with the definite article.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
Determiners include articles ("the") and cardinal numerals ("two") used as determiners (see 5.3.5; 5.3.8). They do not include demonstratives or quantifiers (cf. 5.3.8).
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.6)
A diminutive is a formation of a word used to convey a slight degree of the root meaning, smallness of the object or quality named, encapsulation, intimacy, or endearment. (http://en.wikipedia.org/wiki/Diminutive 22.07.07)
The category Direct Object (DO) belongs to the extended scheme in the guidelines and is assigned to the second argument of a transitive verb, which is not designated in the sense that it is less prominent than the subject. This rule of thumb makes the Nominal Phrase "Bill" in "The boys like Bill" the DO, since it does not agree with the main verb in number. Like subjects, DOs are assigned structural case (ACC/PAR or ABS) in case-assigning languages. Like subjects, DOs have a default base position relative to verb and subject in languages that do not assign case: In English and French, the DO follows the main verb (and the subject). DOs are generally taken
to stand in close syntactic relation with the main verb, which is reflected by the fact that they can be displaced together: "[Den Mann gerufen] haben wir." Apart from this, DOs are often only identifiable on the basis of the absence of properties typical for subjects.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.4)
A distal is a distinction in place deixis that indicates location far from the speaker or other deictic
center. It is a kind of a proximal-distal dimension. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsADistal.htm
These are ditransitive verbs, e.g. "give".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.2)
Ergative case is the case of nouns in ergative-absolutive languages that would generally be the subjects of transitive verbs in the translation equivalents of nominative-accusative languages such as English. Ergative case is more likely to be formally marked on the noun than absolutive case is.
(http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsErgativeCase.htm)
Experiencer is the sentient being that participates in a state/event of emotion (love, hate, etc.), volition (wish, want, etc.), cognition (think, remember, etc.), perception (see, hear, etc.) or bodily sensation (feel cold, feel hungry, etc.).
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.4)
Expletive pronouns (also called 'impersonal pronouns', 'pleonastic pronouns') are pronouns which do not have any meaning but are syntactically required, e.g. "there is a man".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
Feminine gender is a grammatical gender that marks nouns that have human or animal female referents, and often marks nouns that have referents that do not carry distinctions of sex. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsFeminineGender.htm
The focus of a sentence is the portion that presents salient information of high communicative interest. The rest of the sentence is extrafocal and contains presupposed information.
(http://www.uni-erfurt.de/sprachwissenschaft/proxy.php?port=8080&file=lido/servlet/Lido_Servlet Fokus 18.06.07)
This is the grammatical function layer. This layer encodes the syntactic relations that various syntactic constituents in a clause (NP, PP, AP, S) entertain with respect to the main verb of that clause.
Relevant information at this layer relates to the questions of
(i) whether a syntactic constituent is an obligatory addition to the verb (argument), or whether it is an optional addition that could be easily left out (adjunct),
(ii) whether the relative status of the different arguments differs and, if so, which of the arguments of a verb (if any) has a prominent status with respect to grammatical processes such as agreement, binding, focus marking etc. Only constituents that are annotated at the CS layers may be labeled for grammatical functions.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.1)
Future tense is an absolute tense that refers to a time after the moment of utterance. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsFutureTense.htm
Genitive case is a case in which the referent of the marked noun is the possessor of the referent of another noun. In some languages, genitive case may express an associative relation between the marked noun and another noun.
(http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsGenitiveCase.htm)
Goal is a general term covering the notions of recipient (an entity that receives something), benefactive (an entity to whose advantage an action is performed; or malefactive, an entity to whose disadvantage an action is performed), and purpose (the intention for which an action is performed).
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.5)
Habitual aspect is an imperfective aspect that expresses the occurrence of an event or state as characteristic of a period of time. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsHabitualAspect.htm
Imperative mood is mood that signals directive modality, especially in commands. Its use may be extended to signal permission. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsImperativeMood.htm
Imperfective aspect is an aspect that expresses an event or state, with respect to its internal structure, instead of expressing it as a simple whole. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsImperfectiveAspect.htm
Indefiniteness is a kind of definiteness indicating that the referent(s) of an expression are not presumed to be identifiable. The referent is not identifiable because of a lack of shared knowledge or situation, including no previous mention of the referent. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsIndefiniteness.htm
The indicative mood is used for factual statements and positive beliefs. All intentions in speaking that a particular language does not put into another mood use the indicative. It is the most commonly used mood and is found in all languages. Example: "Paul is reading a book" or "Paul reads books".
http://en.wikipedia.org/wiki/Indicative
The category Indirect Object (IO) belongs to the extended scheme in the guidelines and is assigned to that argument of a (ditransitive) verb that is assigned neither the status of Subject (SUBJ) nor that of Direct Object (DO). In case languages, IOs are often assigned the Dative. Semantically, the IO is often used to express the receiver or beneficiary/maleficiary of an event, such as the NP "John" in "Mary gave John a book/kiss". Unlike SUBJs and DOs, IOs seem to always refer to individuals and must be expressed by a nominal constituent.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.4)
An infinitive is the base form of a verb. It is unmarked for inflectional categories aspect, modality, number, person and tense. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsAnInfinitive.htm
Instruments are means with the help of which the action is carried out.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.6)
Instrumental case is a case indicating that the referent of the noun it marks is the means of the
accomplishment of the action expressed by the clause.
(http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsInstrumentalCase.htm)
These are intransitive verbs, e.g. "sleep".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.2)
These are lexical verbs, e.g. "walk", cf. German "wollen" in "ich will ein Eis".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.2)
Location covers the spatial relations of static spatial location, direction of movement, source (the origin of movement), and path (a place through which the movement takes place).
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
Locative case is a case that expresses location at the referent of the noun it marks.
The term adessive case, a synonym of locative case, is used especially in studies of Finno-Ugric grammar.
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsLocativeCase.htm
These are main clauses, e.g. "John sleeps.".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.6)
Manner applies to constituents that denote how something is carried out.
Adverbs may also denote manner; however, they are not annotated at any of the
syntactic layers.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.11)
Masculine gender is a grammatical gender that marks nouns having human or animal male referents, and often marks nouns having referents that do not have distinctions of sex.
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsMasculineGender.htm
Medial/ Immediacy is a distinction in place deixis that indicates location at a distance intermediate
between locations considered proximal and distal. It is a kind of a proximal-distal dimension.
(according to: http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsImmediacy.htm)
These are modal verbs, e.g. "can", cf. German "wollen" in "ich will gehen".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.2)
The negative mood expresses a negated action. In many languages, this is not a distinct mood. Negation is expressed by adding a particle before the verb phrase, as in Spanish "No está en casa", or after it, as in archaic and dialectal English "Thou remembrest not" or Dutch "Ik zie hem niet", or both, as in French "Je ne sais pas" or Afrikaans "Hy kan nie Afrikaans praat nie".
http://en.wikipedia.org/wiki/Negative_mood#Negative
Neuter gender is a grammatical gender that includes those nouns having referents which do not have
distinctions of sex, and often includes some which do have a natural sex distinction.
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsNeuterGender.htm
Nominative case is the case that identifies clause subjects in nominative-accusative languages. Nouns used in isolation have this case. Nominative case is not often formally marked in nominative-accusative languages.
(http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsNominativeCase.htm)
The tag N is used for the general case of a noun, e.g. "water".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.1)
A Noun Phrase (NP) consists of a head plus any modifying or determining material, i.e. adjectives, relative clauses, determiners, demonstratives, etc.
NPs typically occur as complements to verbs or prepositions/postpositions.
Substantive pronouns (he, she, it, this, that, someone, anyone, etc.) and expletive subjects are NPs.
NPs can also be embedded within another NP.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 3.3.1)
An oblique case (Latin: casus generalis) in linguistics is a noun case of synthetic languages that is used generally when a noun is the object of a sentence or a preposition. An oblique case can appear in any case relationship except the nominative case of a sentence subject or the vocative case of direct address.
(http://en.wikipedia.org/wiki/Oblique_case)
A participle is a lexical item, derived from a verb, that has some of the characteristics and functions of both verbs and adjectives. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsAParticiple.htm
These are particles, e.g. German "ja". Interjections are also annotated as particles, e.g. "oh".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.9)
Passive voice is a voice that indicates that the subject is the patient or recipient of the action denoted
by the verb.
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsPassiveVoice.htm
Past tense is an absolute tense that refers to a time before the moment of utterance. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsPastTense.htm
Perfective aspect is an aspect that expresses a temporal view of an event or state as a simple whole, apart from the consideration of the internal structure of the time in which it occurs.
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsPerfectiveAspect.htm
These are personal pronouns, e.g. "you".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
Plural number is number that expresses reference to a quantity greater than that expressed by the largest specific number category in a language, such as "more than one" in English, and "more than two" in some other languages. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsPluralNumber.htm
These are possessive pronouns, e.g. "your".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
A Possessor is the entity who owns something.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.7)
This is a nominal predicate (noun or adjective), either with or without copula. The term nominal predicate may be used for the complements of further copulative verbs (cf. small clauses), e.g. consider, call, etc.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.5)
A Prepositional Phrase (PP) consists of a prepositional/ postpositional head and its NP-complement, plus optional modifiers. Pronominal adverbs are also PP constituents.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 3.3.2)
Present tense is an absolute tense that refers to the moment of utterance. It often refers to events or
states that do not merely coincide with the moment of utterance, such as those that are continuous, habitual, or lawlike.
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsPresentTense.htm
This is the class of pronouns which also includes the quantifiers.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
NPRP is used for proper nouns, e.g. "Peter".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.1)
A proximal-distal dimension is a distinction in place deixis that indicates distance from the speaker or other deictic center.
(http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsAProximalDistalDimension.htm)
A proximal is a distinction in place deixis that indicates location close to the speaker or other
deictic center. It is a kind of a proximal-distal dimension. http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsAProximal.htm
These are quantifiers, e.g. German "jeder" (every), "alle" (all).
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
PRONRFL is used for reflexive pronouns, e.g. "myself" (4.3.10.5).
This category should be used only if the language possesses pronouns which are always used as reflexives, e.g. the English reflexive pronouns (not the German pronouns of the type "ich schaeme mich", where the ambiguity personal/reflexive is resolved in the argument structure of the given verb).
These are relative clauses which are annotated as ATTR, e.g. "I saw the boy who ate the mango."
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.6)
These are relative pronouns, e.g. "which".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
This is the semantic Role layer. Lexical heads not only require a certain number of arguments but also determine the semantic properties of these arguments depending on how these are involved in the state of affairs described by the lexical head. This means that the syntactic arguments enter certain semantic (also called thematic or theta-) roles, which are pre-established by the selecting properties of the lexical head. The relationship between a lexical head and its arguments can be explained by the use of a small finite set of universally applicable notions which indicate whether a certain argument is the performer of an action, just undergoes an action etc. Note that only constituents that are annotated at the CS and FUNCTION layers may be labeled for semantic role.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.1)
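The layer dependencies stated in the guidelines (grammatical functions only on constituents annotated at a CS layer; semantic roles only on constituents annotated at the CS and FUNCTION layers) can be checked mechanically. The following Python sketch is one possible way to do so, not the guidelines' own tooling. The function name, the dictionary representation of a constituent, and the layer name "ROLE" are assumptions made for illustration; "CSn" and "FUNCTION" follow the layer names used in the text.

```python
import re
from typing import Dict, List

# Layer names of the form CS1, CS2, ..., CSn.
CS_LAYER = re.compile(r"^CS[1-9][0-9]*$")


def layer_violations(annotations: Dict[str, str]) -> List[str]:
    """Check one constituent, given as a mapping from layer name to label.

    "ROLE" is an assumed name for the semantic role layer.
    """
    has_cs = any(CS_LAYER.match(layer) for layer in annotations)
    has_function = "FUNCTION" in annotations
    has_role = "ROLE" in annotations

    problems = []
    if has_function and not has_cs:
        problems.append("FUNCTION label without a CS annotation")
    if has_role and not (has_cs and has_function):
        problems.append("ROLE label without CS and FUNCTION annotations")
    return problems


# A fully annotated subject NP passes; a role without a function does not.
print(layer_violations({"CS1": "NP", "FUNCTION": "SUBJ", "ROLE": "AGENT"}))
print(layer_violations({"CS1": "NP", "ROLE": "AGENT"}))
```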
Singular number is number that refers to one member of a designated class.
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsSingularNumber.htm
The category Subject (SUBJ) belongs to the extended scheme in the guidelines and is assigned to a designated argument that is prominent with respect to a number of grammatical relations such as (i) constituency with the verb, (ii) agreement, and (iii) binding, etc. This prominence is often taken to correspond to a prominent position in the syntactic structure of the clause.
(i) Unlike direct objects, subjects do not seem to form a constituent with the verb as shown by the fact that the two cannot be topicalised together in
" *[Johann gesehen] hat den Mann nicht" vs. "[Den Mann gesehen] hat Johann nicht."
(ii) In agreement languages, the subject is that argument that the verb always agrees with (in some languages the verb additionally agrees with the object as well): "Johann (sg.) sleeps (sg.)" vs. "*The boys (pl.) sleeps (sg.)"
(iii) Subjects can bind reflexive pronouns: "Peter blamed himself" vs. "*Heself blamed Peter." Subjects are most often expressed by nominal constituents (NPs), but sentential subjects as in
"[That Peter won the race] surprised me" are also possible with certain verbs. Often the subject has the semantic role of AGENT, but this is not a 1:1-correspondence.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.4)
Subjunctive mood is a mood that typically signals irrealis meanings, such as potentiality, uncertainty,
prediction, obligation, and desire. It most typically occurs in a subordinate clause, but may occur outside of one.
http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsSubjunctiveMood.htm
These are subordinating conjunctions, e.g. "if", "that", "when".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.7)
If you need a substantive paradigm of pronouns, then append SU. Substantive pronouns replace the whole NP. E.g. "yours" is tagged as PRONPOS-SU.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.8)
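The suffix convention for pronoun paradigms (AT for attributive, SU for substantive) amounts to simple tag composition. A minimal Python sketch follows; the helper name `with_paradigm` and its validation are hypothetical, while the tags PRONPOS, AT and SU and the hyphenated form are taken from the examples in the text.

```python
from typing import Optional

# Paradigm suffixes named in the guidelines text.
PARADIGM_SUFFIXES = {"AT", "SU"}


def with_paradigm(pronoun_tag: str, paradigm: Optional[str] = None) -> str:
    """Append an optional paradigm suffix to a pronoun POS tag (hypothetical helper)."""
    if paradigm is None:
        # The paradigm distinction is optional; the bare tag stays as it is.
        return pronoun_tag
    if paradigm not in PARADIGM_SUFFIXES:
        raise ValueError(f"unknown paradigm suffix: {paradigm!r}")
    return f"{pronoun_tag}-{paradigm}"


print(with_paradigm("PRONPOS", "AT"))  # attributive use, e.g. "your"
print(with_paradigm("PRONPOS", "SU"))  # substantive use, e.g. "yours"
```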
The superlative of an adjective or adverb is a form of adjective or adverb which indicates that something has some feature to a greater degree than anything it is being compared to in a given context.
http://en.wikipedia.org/wiki/Superlative
Theme is a general term covering the notions of patient (an entity affected by the action), result (an entity effected by the action, i.e. one that emerges out of the action), and theme in the narrow sense.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.3)
Time covers a point or an interval of time at which the action takes place.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.9)
A topic of a sentence is a syntagm that contains reference points for the predication contained in the rest of the sentence.
(http://www.uni-erfurt.de/sprachwissenschaft/proxy.php?port=8080&file=lido/servlet/Lido_Servlet Topik 18.06.07)
These are transitive verbs, e.g. "buy".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.2)
This category belongs to the extended scheme in the guidelines. Prepositional objects are annotated with the generic label OBJ.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 4.3.4)
This tag is used for the general case of a verb, e.g. "sleep".
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 5.3.2)
A Verb (V) at the syntax layer is either a lexical (VLEX) or a copula verb (VCOP) at the POS layer. Modal verbs and auxiliaries are not annotated in the constituent structure. The verb and its arguments are placed at the same CSn layer. Raising and control verbs are treated like ordinary verbs. They subcategorize for a sentential complement.
(Information Structure in Cross-Linguistic Corpora:
Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure 3.3.3)
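The relation between the POS layer and the syntax layer for verbs can be sketched as a small lookup in Python. This is an illustration under stated assumptions: the function name is hypothetical, and the tag names VAUX and VMOD for auxiliaries and modal verbs are assumed here because the text above does not name them; VLEX, VCOP and V are taken from the text.

```python
from typing import Optional

# POS tags that surface as V in the constituent structure (from the text).
ANNOTATED_AS_V = {"VLEX", "VCOP"}


def syntax_label_for_verb(pos_tag: str) -> Optional[str]:
    """Return "V" for lexical and copula verbs, None for everything else.

    None means that the token receives no verb label at the syntax layer,
    as is the case for modal verbs and auxiliaries.
    """
    return "V" if pos_tag in ANNOTATED_AS_V else None


print(syntax_label_for_verb("VLEX"))
print(syntax_label_for_verb("VCOP"))
print(syntax_label_for_verb("VAUX"))  # assumed tag name for auxiliaries
print(syntax_label_for_verb("VMOD"))  # assumed tag name for modal verbs
```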
These are verbal nouns. Some of the Chadic languages have morphologically opaque verbal noun stems in the progressive aspect, i.e. it is not obvious from the morphology that we are dealing with a deverbal noun rather than a verb proper. In such cases, use the tag VN.
N and V are not yet defined as disjoint in the EAGLES categorization, so we assign VN to both nouns and verbs (possibly a specific verb form? a participle?).