Ontology - multext-east
- Abstract
- OWL/DL Ontology for MULTEXT-East morphosyntactic specifications -- OLiA annotation model for the morphosyntactic specifications of MULTEXT-East v. 4. (Erjavec 2010)
http://nl.ijs.si/ME/owl/
Christian Chiarcos, 2010-2011
Licence:
The ontologies are distributed under the Creative Commons Attribution 3.0 Unported (CC BY 3.0) licence. You are free to copy, distribute and transmit the work, to adapt the work and to make commercial use of the work under the condition that you make a reference to:
Christian Chiarcos and Tomaz Erjavec (2011), OWL/DL formalization of the MULTEXT-East morphosyntactic specifications. In: Proceedings of the 5th Linguistic Annotation Workshop (LAW-V), held in conjunction with the ACL-HLT 2011, June 2011, Portland, Oregon, USA, p. 11--20.
Please note that these ontologies are still under development, and that more detailed and precise definitions will be added incrementally.
Sources:
Unless marked otherwise, all comments refer to Erjavec (2010). Additionally, Qasemizadeh & Rahimi (2006), Dimitrova et al. (2009) and Derzhanski & Kotsyba (2009) were consulted for clarification. Email communication with Tomaž Erjavec, Serge Sharoff, Dan Tufis, Ivan A. Derzhanski, Natalia Kotsyba, Csaba Oravecz and Hamidreza Kobdani represents the third source of information consulted for this ontology.
References:
Ivan Derzhanski, Natalia Kotsyba (2009), Towards a Consistent Morphological Tagset for Slavic Languages: Extending MULTEXT-East for Polish, Ukrainian and Belarusian, In: Proc. MONDILEX Third Open Workshop Bratislava, Slovakia, 15–16 April, 2009, p. 9-26
Ludmila Dimitrova, Radovan Garabík, Daniela Majchráková (2009), Comparing Bulgarian and Slovak Multext-East morphology tagset, In: Proceedings of MONDILEX Second Open Workshop, Kyiv, Ukraine, 2–4 February, 2009, p. 38-46
Tomaž Erjavec (ed., 2010), MULTEXT-East Morphosyntactic Specifications Version 4. 2010-05-12, http://nl.ijs.si/ME/V4/msd/html/index.html
Behrang Qasemizadeh and Saeed Rahimi (2006), Persian in MULTEXT-East Framework, in T. Salakoski et al. (eds.): FinTAL 2006, LNAI 4139, pp. 541–551, 2006.
- Latest Version
- http://purl.org/olia/mte/multext-east.owl#
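The class and property names listed below are local names within this namespace. A minimal sketch of how a full identifier could be formed, assuming (as is usual for hash namespaces, but not stated explicitly here) that the IRI is simply the namespace followed by the local name; the helper `mte_iri` is hypothetical and not part of the ontology distribution:

```python
# Namespace as given under "Latest Version".
MTE_NS = "http://purl.org/olia/mte/multext-east.owl#"


def mte_iri(local_name: str) -> str:
    """Return namespace + local name (assumed IRI scheme, see lead-in)."""
    if not local_name or any(ch.isspace() for ch in local_name):
        raise ValueError("local name must be non-empty and contain no whitespace")
    return MTE_NS + local_name


# Names taken from the class and property lists of this document.
print(mte_iri("AbessiveCase"))
print(mte_iri("hasCase"))
```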
Classes - Overview
Properties - Overview
- hasAdjectiveFormation
- hasAdpositionFormation
- hasAnimacy
- hasAspect
- hasCase
- hasClitic
- hasConjunctionFormation
- hasCourtesy
- hasDefiniteness
- hasDegree
- hasFeature
- hasFormation
- hasGender
- hasHumanness
- hasInterjectionFormation
- hasModificationType
- hasNegation
- hasNumber
- hasNumeralForm
- hasOwnedNumber
- hasOwnerGender
- hasOwnerNumber
- hasOwnerPerson
- hasParticleFormation
- hasPerson
- hasPronounForm
- hasQuantifier
- hasSubCase
- hasSyntacticType
- hasTense
- hasTransitivity
- hasVerbForm
- hasVoice
- hasWHType
Classes
Abbreviation | ||
---|---|---|
Abstract | e.g., ??? ??? ???? ??? ????? ?? ??? ??? ??? ???? ???? ??? ????? ??? ???? ??? ??? ?? ???? ????? (uk) | |
SubClass Of | ||
AbessiveCase | ||
Abstract | Case="abessive" (Estonian) | |
SubClass Of | ||
AblativeCase | ||
Abstract | Case="ablative" (Estonian, Hungarian) | |
SubClass Of | ||
AccusativeCase | ||
Abstract | e.g., -i/el, -l/el, -le/el, -ne/noi, -o/el, -te/tu, i-/el, l-/el, l/el (ro) | |
SubClass Of | ||
ActiveVoice | ||
Abstract | Voice="active" Macedonian has two types of (adjectival) participles: active and passive. Active corresponds to Macedonian L-form and passive to verbal adjective, neuter gender, singular. For example, nosel is encoded as VForm=Participle, Voice=Active, nosen as VForm=participle, Voice=Passive. (MTE v4) |
|
SubClass Of | ||
AdessiveCase | ||
Abstract | Case="adessive" (Estonian, Hungarian) | |
SubClass Of | ||
AditiveCase | ||
Abstract | Case="aditive" (Estonian) | |
SubClass Of | ||
Adjectival | ||
Abstract | Pronoun/Syntactic_Type="adjectival" (Slavic), Abbreviation/Syntactic_Type="adjectival" Pronouns can be distinguished between having a (syntactically) nominal and (syntactically) adjectival function. All pronominal types except the demonstrative and possessive one can be nominal, and all except for the personal one can be adjectival. (MTE v4) |
|
SubClass Of | ||
Sub-Classes | ||
AdjectivalAdverb | ||
Abstract | Adverb/Type="adjectival" (Serbian, Macedonian, Bulgarian) Bulgarian AdjectivalAdverbs have the same form as adjectives in Gender = neuter, Person = 3, Number = singular. (MTE v4) |
|
SubClass Of | ||
Adjective | ||
Abstract | For some MTE languages, Adjective also includes adjectival participles (Ukrainian), adverbs in certain positions (Macedonian mnogu, malku, nekolku are also considered adjectives in cases they are used before nouns, because they can have definiteness in their inflectional paradigm), and pronominal forms (Resian 'sw?j' / own is considered an adjective, not a pronoun). | |
SubClass Of | ||
Sub-Classes | ||
AdjectiveFormation | ||
Abstract | Adjective with feature "Formation" The Formation attribute distinguishes a nominal (short) form from a so-called compound (long) form of an Adjective in Czech. The nominal form can be used in the predicative function only. It is specified for nominative and accusative Case only. (MTE v4) This corresponds to the use of Definiteness (i.e., ReductionFeature) for Polish. |
|
SubClass Of | ||
Sub-Classes | ||
Adposition | ||
Abstract | e.g., a ??????????/=, ??? ???? ??????? ? ??? ?????? ?????? ??? ????? ????????? ??? ?? ? ????? ??????? ?????? ????? ?????? ???? ?????, ? ?? ????? ??? ???? ?? ??? ?????????? ?????????? ? ?? ????? ??? ?? ???? ???? ????? ????? ?????? ???????, ? ?? ? ?? ????? ??? ? ?? ???, ???????? ??????? ???????? ??????? ????????, ? ?? ?? ??? ???? ??? ???? ???? ????? ?????? ??? ???? ???? ?????? ????? ????? ?????? ??????? ????? ????, ?? ?? ?? ??? ?-??? ?-?? ?-??? ?-???????? ?-??? ?-????? ?-??? ?-???????? ?-???? ?-????? ?-?????? ?-????? ?-????? ?-??????? ?-????? ?-??????? ?-?????? ?-?????, ?????? ?, ?? ????????? (uk) | |
SubClass Of | ||
Sub-Classes | ||
AdpositionFormation | ||
Abstract | Adposition/Formation Czech: A preposition can be contracted with a pronoun; such a preposition has Formation=c(ompound). (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
Adverb | ||
Abstract | Adverb and Adjective overlap, and in MTE v4, different language-specific resolution strategies are applied: Polish post-prepositional adjectives like (po) polsku are treated as adverbs, Macedonian pre-nominal adverbs like mnogu, malku, nekolku are considered adjectives (like adjectives, they can have definiteness in their inflectional paradigm). Also, definitions for Adverb and Particle may overlap, see Adverb/Type="particle" as used for Hungarian and Romanian MTE v4. In Slovak MTE v4, however, Particles form a separate part of speech category as is customary in Slovak grammars. As for the overlap between Adverb and Conjunction, Adverb/Type="portmanteau" was introduced for Romanian to cover a few words which can be both adverbs and conjunctions (with the adverbial reading more frequent). Romanian Conjunction/Type="portmanteau" applies only to the word "și" which can be both a coordinating conjunction and an adverb. The distinction among these interpretations is rather tricky for the average native speaker and was a constant source of noise in automatic tagging. Therefore, for the sake of automatic processing we defined this "portmanteau" type value. (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
Adverbial | ||
Abstract | Pronoun/Syntactic_Type="adverbial" (Polish, Serbian, Russian, Ukrainian), Abbreviation/Syntactic_Type="adverbial" | |
SubClass Of | ||
AffirmativeParticle | ||
Abstract | Particle/Type="affirmative" | |
SubClass Of | ||
AgglutinantClitic | ||
Abstract | Clitic="agglutinant" (Verb, Pronoun: Polish) Polish: The agglutination phenomenon in Polish is similar to Czech clitic_s for pronouns, but has a wider scope and can be found in more parts of speech. It is encoded as a more general "Clitic(y/n/a/d)" attribute and is specified, e.g., for the indicative VForm with Tense=pa(s)t, corresponding to the "praet" flexeme in the IPIC, to differentiate between forms like gniótł (clitic="n") and gniotł- (clitic="d"), where the latter not only demands a clitic but also has a different form. The value "(a)gglutinant" indicates the clitic itself, e.g., -em in gniotłem. Values "y" and "n" are left to enable showing that a graphical word, i.e., delimited by white spaces, is a combination of a (d)emanding (or free) segment and an (a)gglutinant in case the word segmentation should be revised in the future. Prepositionality is encoded as Clitic with values "y(es)" for nią, niego etc., "n(o)" for ją, go etc., "a(gglutinant)" for -ń. Cf. the Clitic value "bound" for Slovene pronouns like zate which refers to the whole cluster, formally a combination of a preposition and a pronoun. This coding can be used for similar phenomena in Polish, e.g., dlań (for him), given the word segmentation is revised towards a more traditional one. |
|
SubClass Of | ||
AllativeCase | ||
Abstract | Case="allative" (Estonian, Hungarian) | |
SubClass Of | ||
AmbiguousAdjective | ||
Abstract | Adjective subcategories that refer to different phenomena in different languages | |
SubClass Of | ||
Sub-Classes | ||
AmbiguousCliticness | ||
SubClass Of | ||
Sub-Classes | ||
AmbiguousDefinitenessFeature | ||
SubClass Of | ||
Sub-Classes | ||
Animacy | ||
SubClass Of | ||
Sub-Classes | ||
Animate | ||
Abstract | Animate="yes" (Slavic Noun, Pronoun; Czech verb) Slovak (like most other Slavic languages) distinguishes masculine animate (Animate="yes") and masculine inanimate (Animate="no") gender. Masculine inanimate nouns always have the same form in the nominative and accusative case, whereas masculine animate nouns have predominantly the same form in the genitive and accusative case. Masculine animate nouns and masculine inanimate nouns differ in accusative singular, nominative (vocative) and accusative plural only (Slovak MTE v4). In Resian, Animacy can also be marked on neuter singular accusative Nouns. The feminine declension masculine noun has only one Ncmsa, that is marked as animate: o?o / father is Ncmsa--y. (Resian MTE v4). |
|
SubClass Of | ||
AoristTense | ||
Abstract | In Bulgarian, there is a language specific Tense="aorist" value for the Tense attribute. Past perfect tense "aorist" expresses a past action (event) carried out or completed in a given moment or during a given period and finished before the state of speaking. (Dimitrova et al. 2009) In Resian, the aorist is encountered sporadically in historical texts only. (MTE v4) |
|
SubClass Of | ||
ApproximateNumeral | ||
Abstract | Bulgarian has Numeral/Form=approx(a), used for approximate numerals (десетина /about ten/, стотина /about a hundred/) (Dimitrova et al. 2009) | |
SubClass Of | ||
Article | ||
Abstract | Article and Determiner are independent top-level concepts that may, however, overlap for some languages: For Romanian, the distribution of articles is fixed, while that of determiners is not. Also, the determiners are Person marked while the articles are not (Dan Tufis, email 2010/06/09). Determiner/Type="article" as used for Persian thus means that the token is in the intersection between Article and Determiner. In Persian, there are different types of determiners namely demonstrative, indefinite, interrogative, exclamative, and article. As defined here, there is just one article in Farsi; i.e, '?( '???yek). It is homonymous with ?? ???? which is a number. (Qasemizadeh and Rahimi 2006) The Persian article marks specificity rather than definiteness. (Ivan A. Derzhanski, email 2010/06/18) MTE distinguishes four subclasses of Article, fully instantiated for Romanian only. Unlike in most of the European languages, the article in Romanian has four types. Beside the types definite and indefinite which have the generally known semantic value, Romanian uses two additional types of articles, which are semantically subordinated to the definite article but which have special forms and meanings: (1) the possessive article (also called genitival article) is an element in the structure of the possessive pronoun, of the ordinal numeral (e.g. al meu (mine) and al treilea (the third)), and of the indefinite genitive forms of the nouns (e.g. capitol al cărții (chapter of the book)). (2) the demonstrative article links a definite noun to its determinants, links a numeral or an adjective to a noun, and it is a constituent part of the relative superlative (e.g. fata cea mare (the elder girl), cel lenes, (the lazy), respectively prietenul cel mai bun (the best friend)). (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
Aspect | ||
SubClass Of | ||
Sub-Classes | ||
AspectParticle | ||
Abstract | Particle/Type="aspect" (Romanian) A verbal particle with Particle/Type="aspect" modifies the verbs and carries information on the verb form, i.e., on its aspect (Dan Tufis, email 2010/06/09) | |
SubClass Of | ||
AttributivePronoun | ||
Abstract | Pronoun/Referent_Type="attributive" (Bulgarian) and Pronoun/Syntactic_Type="adjectival" (other Slavic languages) are essentially the same thing (though in the Bulgarian tagset the value also contrasts with "possessive") (Ivan A Derzhanski, email 2010/06/09) | |
SubClass Of | ||
AuxiliaryVerb | ||
Abstract | Type="auxiliary" In Persian, future tense is made by the help of Auxiliary verbs. In order to make progressive form in Farsi, verbs are inflected with the prefix '?( '???m?). Perfective forms of verbs are usually made using auxiliary verbs '? ?( '??? ????am, ast, ?). Passive form of the verbs in Farsi are made by the help of Auxiliary verbs. Passive form of the verb is made of Past Participle + Auxiliary verb '?( '?????odan). (Qasemizadeh and Rahimi 2006) Note that the extension of AuxiliaryVerb may be defined differently. For many languages, AuxiliaryVerb generally does not include forms of "to be" (e.g., Czech být, Slovak byť, Macedonian bi), or modal verbs (e.g., Czech, Slovak, Macedonian). In other languages, however, verbs are classified according to their function rather than their form, e.g., Resian byt / to be can be main, copula or auxiliary. (MTE v4) |
|
SubClass Of | ||
BaseVerb | ||
Abstract | Type="base" (English) | |
SubClass Of | ||
Biaspectual | ||
Abstract | Aspect="biaspectual" (Verb: Slovene, Russian, Ukrainian; Adjective: Ukrainian), identified with Aspect="ambivalent" (Verb: Slovak) Every Russian verb form is either perfective or imperfective. This applies to the majority of Russian verbs, but there is a small number of biaspectual verbs. A biaspectual verb can either take a perfective or an imperfective value, depending on the context. Examples of biaspectual verbs are: использовать 'utilize', казнить 'execute', жениться 'get married'. (Feldman & Arshavskaya 2007) In the Slovak MTE v4, Aspect="ambivalent" represents a special class of verbs that have the same form in perfective and imperfective/progressive aspect (the difference is only semantic/syntactic, not morphological). (Dimitrova et al. 2009) Anna Feldman & Katya Arshavskaya (2007), English and Russian event annotation: A pilot study. Studies in Variation, Contacts and Change in English 1, http://www.helsinki.fi/varieng/journal/volumes/01/feldman_arshavskaya/ | |
SubClass Of | ||
BothNumeral | ||
Abstract | Form="both" (Romanian) A combination of digit (or roman) and letter representation, cf. English "2nd". | |
SubClass Of | ||
BoundClitic | ||
Abstract | Clitic="bound" (Slovene and Resian pronoun) Clitic="bound" appears in Slovene and in fact indicates the whole cluster, e.g. "zame, pome", a combination of a preposition and a pronoun. So, ontologically, "bound" is rather ElementWithClitic for Slovene. (Natalia Kotsyba, email 2010/06/21) In Resian, however, "bound" seems to be a CliticElement. At least, the Resian MSD index lists for nas/m? both Clitic=bound (Pp1-pa--b-n) and Clitic=no (Pp1-pa--n-n). This is a real problem, because the only proper generalization over both uses would be to specify it as being ambiguous between CliticElement (for Resian) and ElementWithClitic (for Slovene). |
|
SubClass Of | ||
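The two Resian MSDs quoted above differ in a single position. The following is a minimal sketch of a position-wise comparison of two MSD strings; the helper `msd_diff` is hypothetical, and padding the shorter string with "-" (treating omitted trailing positions as unset) is an assumption rather than something stated in this document. The sketch deliberately does not name the attribute at each position, since the position table is not reproduced here.

```python
from typing import List, Tuple


def msd_diff(a: str, b: str) -> List[Tuple[int, str, str]]:
    """Return (zero-based position, value in a, value in b) for every differing position."""
    width = max(len(a), len(b))
    # Assumption: missing trailing positions count as unset ("-").
    a, b = a.ljust(width, "-"), b.ljust(width, "-")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Clitic=bound vs. Clitic=no, as listed in the Resian MSD index.
print(msd_diff("Pp1-pa--b-n", "Pp1-pa--n-n"))  # [(8, 'b', 'n')]
```

Only the ninth character differs (zero-based position 8), which is where the b(ound) and n(o) values of the Clitic attribute appear in these two tags.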
CardinalNumeral | ||
Abstract | Cardinal numerals signify a numerical (quantitative) property of objects, e.g., Slovak jeden dom, dve ženy, tri knihy; Bulgarian един дом, две жени, три книги /one home, two women, three books/. (Dimitrova et al. 2009) Traditional Romanian grammars usually distinguish seven numeral types, where five of them have specific forms and the other two are obtained by composition. The first group is made up of the following numeral types: cardinal (trei-three), ordinal (al treilea-the third), fractional (treime-one third), multiple (întreit-trine), collective (amândoi-both). The second group contains the numeral types which are composed by means of other parts of speech: distributive (câte trei-...each three...), adverbial (de trei ori-thrice) and again the collective numeral which also has compound forms (toți trei-all three). Nonetheless, as the numerals of the second group have a weak syntactic cohesion, namely each composition element may be regarded as an element of the sentence, with its own grammatical function, these last numeral types are irrelevant for the morphosyntactic annotation. (Romanian MTE v4) |
|
SubClass Of | ||
Case | ||
Abstract | feature of Noun (http://nl.ijs.si/ME/V4/msd/html/msd.N.html) and Verb (Russian and Estonian, http://nl.ijs.si/ME/V4/msd/html/msd.V.html) | |
SubClass Of | ||
Sub-Classes |
|
|
CausalAdverb | ||
Abstract | Adverb/Type="causal" is used in the Hungarian MTE v4, but no examples are provided. | |
SubClass Of | ||
CausalisCase | ||
Abstract | Case="causalis" (Hungarian) | |
SubClass Of | ||
Clitic | ||
Abstract | Clitic="yes" (Noun/Adjective: Romanian; Verb: Romanian, Polish, Serbian, Persian) Slovak Pronoun: Type=reflexive encompasses all reflexive pronouns (sa, sebe, si, svoj, seba) as well as "sa" in its role as the obligatory particle of reflexive verbs. Personal and possessive reflexives are further distinguished via the Referent_Type attribute. "sa" in all its roles will be marked as the reflexive personal clitic pronoun. The Clitic attribute distinguishes clitical vs. nonclitical pronominal forms, e.g. "ti" vs. "tebe". Polish Pronoun: Prepositionality is encoded as Clitic with values "y(es)" for nią, niego etc., "n(o)" for ją, go etc., "a(gglutinant)" for -ń. Cf. the Clitic value "bound" for Slovene pronouns like zate which refers to the whole cluster, formally a combination of a preposition and a pronoun. This coding can be used for similar phenomena in Polish, e.g., dlań (for him), given the word segmentation is revised towards a more traditional one. Hungarian Adverb: The modifier -e question word (the only Hungarian clitic) is attached to the preceding word with a hyphen. Romanian Verb, Noun, Adjective: The cliticization phenomenon in Romanian is not restricted to the verb-pronoun relationship, but may also be observed with the (main) verb and the auxiliary, noun or adjective with pronoun, noun or adjective with copula, pronoun with auxiliary, preposition with (indefinite) article, numeral or (indefinite) pronoun, negative adverb with verb, auxiliary or pronoun, and some others (mainly created through the contracted forms of the verb "a fi"-to be). We restrict ourselves to considering only the graphically marked cliticizations. In such cases, the two, three or (sometimes) four constituents of a cliticized word-form are always separated by a hyphen. Omitting the hyphen in such cases is an unacceptable error in written Romanian. Romanian Article: Note that the definite article has only enclitic forms, except for one proclitical form (lui + proper noun: lui Ion). The inflected forms of foreign-origin words (mainly nouns) that are not fully assimilated are usually written with a hyphen between the base-form and the inflectional ending. In our encoding, we classified these endings (which are supposed to be split by the segmenter) as clitic articles (clitic attribute is always "y") which can be either definite (type=f, "-istul") or indefinite (type=i, "ist") and are characterised by gender (gender=m, "ist"; gender=f, "istă"), number (number=s, "ist"; number=p, "iști") and case (case=r, "istul"; case=o, "istului"). |
|
SubClass Of | ||
CliticDefiniteDeterminer | ||
SubClass Of | ||
Sub-Classes | ||
CliticDeterminerType | ||
SubClass Of | ||
Sub-Classes | ||
CliticDistalDeterminer | ||
Abstract | Definiteness="distal" (Noun/Adjective/Pronoun: Macedonian) For Macedonian, the definiteness attributes can take the values: non definite (no), generally definite (yes), definite at short visible distance (proximal), and definite at longer visible distance (distal). (MTE v4) | |
SubClass Of | ||
CliticElement | ||
SubClass Of | ||
Sub-Classes | ||
CliticIndefiniteDeterminer | ||
SubClass Of | ||
Cliticness | ||
Abstract | feature "Clitic" This feature is used in different ways in different MTE tagsets. Therefore, no definition is attempted, but rather, different functions are represented in subconcepts. A proper ontological model would have to clarify the situation further. The attribute Clitic means either "hasClitic" (if applied to Noun) or "isClitic" (if applied to Article): This is similar to Case, which on Adpositions means "requiresCase" rather than hasCase. Definitely something to think about; is it better to be formally correct or have a small set of attributes? (Tomaz Erjavec, email 2010/06/09) [Romanian] Clitic feature denotes: 1) a character elision: i-am dat = îi am dat (î is deleted) - I gave him 2) insertion: ducându-mă = ducând+U+mă (U is inserted for phonological reasons) - carrying myself 3) or both: mâncându-l = mâncând+U+_e_l (U is inserted and e is deleted) - eating it 4) "fast speaking", that is, pronunciation of two words as if a single word: maică-mea - my mother; it is interesting to note that in "normal speaking" this would be "maica mea" where the word "maica" is in definite form while in the previous form it was in indefinite form (notice the final ă). It is always signaled by a hyphen between the content word and the functional word that are cliticized. (Dan Tufis, email 2010/06/09) Polish Pronoun: Prepositionality is encoded as Clitic with values "y(es)" for nią, niego etc., "n(o)" for ją, go etc., "a(gglutinant)" for -ń. Cf. the Clitic value "bound" for Slovene pronouns like zate which refers to the whole cluster, formally a combination of a preposition and a pronoun. This coding can be used for similar phenomena in Polish, e.g., dlań (for him), given the word segmentation is revised towards a more traditional one. (MTE, v4.0) Czech Pronoun: The Clitic attribute distinguishes clitical vs. nonclitical pronominal forms, e.g. "ti" vs. "tobě". (MTE, v4.0) |
|
SubClass Of | ||
Sub-Classes | ||
CliticProximalDeterminer | ||
Abstract | Definiteness="proximal" (Noun/Adjective/Pronoun Macedonian) For Macedonian, the definiteness attributes can take the values: non definite (no), generally definite (yes), definite at short visible distance (proximal), and definite at longer visible distance (distal). (MTE v4) | |
SubClass Of | ||
CliticSpecificDeterminer | ||
Abstract | Persian does have an article, but it marks specificity rather than definiteness. The Persian article is similar to the Balkan one (a clitic of pronominal origin that's written together with the word), except that it isn't exactly definite (you can even see it described as an indefinite article). (Ivan A. Derzhanski, emails 2010/06/18) | |
SubClass Of | ||
Collective | ||
Abstract | Collective plurals are usually considered derivation rather than inflection, but are modelled as a number feature in the MTE schema of Resian (Slovene dialect in Italy). | |
SubClass Of | ||
CollectiveNumber | ||
Abstract | Collective plurals, though usually considered derivation rather than inflection, are modelled as a number feature in the MTE schema of Resian (Slovene dialect in Italy). | |
SubClass Of | ||
CollectiveNumeral | ||
Abstract | Numeral/Type="collect" (Romanian) In traditional Romanian grammars, expressions like amândoi-both, toți trei-all three are referred to as collective numerals. (MTE v4) |
|
SubClass Of | ||
Collocation | ||
SubClass Of | ||
Sub-Classes | ||
ComitativeCase | ||
Abstract | Case="komitative" (Estonian) | |
SubClass Of | ||
CommonGender | ||
Abstract | The Gender value "common" is assigned to nouns that can combine with adjectives in either feminine or masculine gender, e.g., Ukrainian ??????, or in either neuter or masculine gender, e.g., Ukrainian ?????. In Russian, Gender=common is used for words such as ?????, ???????, ?????, ????, ??????, etc. (MTE v4) | |
SubClass Of | ||
CommonNoun | ||
Abstract | Noun/Type=Common | |
SubClass Of | ||
ComparativeDegree | ||
Abstract | e.g., ??????????/????????, ????????????/????????, ????????????/????????, ????????????/????????, ??????????/????????, ????????????/????????, ????????????/????????, ????????????/????????, ????????????/???????? (mk) | |
SubClass Of | ||
ComparativeParticle | ||
Abstract | In the Bulgarian MTE v4 specs, Particle/Type="comparative" is used for particles used to create comparatives or superlatives (по, най). In other Slavic languages, e.g., Slovak, comparatives are formed through a morphological suffix, and naj- is written together with superlatives (although this could be considered just a difference in orthography), so that ComparativeParticle is applied to Bulgarian only. (Dimitrova et al. 2009) | |
SubClass Of | ||
CompoundAdjective | ||
Abstract | Formation="compound" (Czech) | |
SubClass Of | ||
CompoundAdposition | ||
Abstract | Adposition/Formation="compound" In several languages, there is a distinct class of compound prepositions. Each of them forms a formal and semantic unit, although graphically they stay unfused, e.g. Romanian de la, pe la, de pe, etc., or Resian ta-na / in (MTE v4) |
|
SubClass Of | ||
CompoundConjunction | ||
Abstract | Conjunction/Formation="compound" In Romanian, CompoundConjunction refers to conjunctions formed periphrastically, with some word/phrase combined by a conjunction: din moment ce, fără să, față de cum etc. Also compare Resian za wojo ki / because. (MTE v4) |
|
SubClass Of | ||
CompoundInterjection | ||
Abstract | Interjection/Formation="compound" | |
SubClass Of | ||
CompoundParticle | ||
Abstract | Particle/Formation="compound" | |
SubClass Of | ||
Conditional | ||
Abstract | e.g., bude/by?, budem/by?, budeme/by?, budete/by?, bude?/by?, bud?/by?, by, neposta?ia/nesta?i?, neposta??/nesta?i? (sk) | |
SubClass Of | ||
Conjunction | ||
Abstract | e.g., as, either, neither (en) | |
SubClass Of | ||
Sub-Classes | ||
ConjunctionFormation | ||
Abstract | Conjunction/Formation refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word. (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
CoordinatingConjunction | ||
Abstract | Conjunction/Type="coordinating" | |
SubClass Of | ||
Sub-Classes | ||
CoordinatingConjunction_ConjunctType | ||
Abstract | CoordinatingConjunction originally conflated two different classifications of conjunctions: (a) according to position and distribution (CoordinatingConjunction_PositionalType) (b) according to the elements conjoined (CoordinatingConjunction_ConjunctType). | |
SubClass Of | ||
Sub-Classes | ||
CoordinatingConjunction_PositionalType | ||
Abstract | CoordinatingConjunction originally conflated two different classifications of conjunctions: (a) according to position and distribution (CoordinatingConjunction_PositionalType) (b) according to the elements conjoined (CoordinatingConjunction_ConjunctType). | |
SubClass Of | ||
Sub-Classes | ||
CopulaVerb | ||
Abstract | Verb/Type="copula" Depending on the language-specific classification of verbs in terms of their forms or functions, Verb/Type="copula" is either applied to forms of the verb "to be" in all its functions (e.g., Czech "b?t", Slovak "by?"), or restricted to occasions where it serves as a copula (e.g., Resian "byt"). (MTE v4) Copula verbs are inflected, so that the corresponding Macedonian word bi is considered as particle, rather than verb copula (in contrast to other copula it doesn't inflect for person / number) (Tomaz Erjavec, email 2010/06/09) |
|
SubClass Of | ||
CorrelativeCoordinatingConjunction | ||
Abstract | Conjunction/Coord_Type="correlat" (Romanian). In Romanian, there are three kinds of conjunctions depending on their usage, as such or together with other conjunctions or adverbs: (1) simple, between conjuncts: Ion ori Maria (John or Mary); (2) repetitive, before each conjunct: fie Ion fie Maria fie... (either John or Mary or...); (3) correlative, before a conjoined phrase, requiring specific coordinators between conjuncts: atât mama cât și tata (both mother and father). (MTE v4) | |
SubClass Of | ||
CountNumber | ||
Abstract | Number="count" (Nouns in Serbian, Macedonian, Bulgarian), e.g., Bulgarian ???/??, ???????/??????, ???/??, ??????/?????, ??????/??????, ?????/???? | |
SubClass Of | ||
Courtesy | ||
Abstract | feature "Courtesy" In Resian, the attribute Courtesy is relevant for the 2nd person plural, where forms in '-ta' refer to a plural subject and '-t?' to a singular subject. For Slovene this attribute is not used, even though the distinction is made in a similar manner. (MTE v4) In Persian, instead of the singular form of the verb, the plural form is used to refer to a singular subject for reasons of courtesy. In fact, such attributes for Farsi are not found in traditional grammar books. (Qasemizadeh and Rahimi 2006) |
|
SubClass Of | ||
Sub-Classes | ||
DativeCase | ||
Abstract | e.g., desater?m/desater?, devaten?ct?/devaten?ct?, devaten?ct?mu/devaten?ct?, devaten?ct?m/devaten?ct?, druh?/druh?, druh?mu/druh?, druh?m/druh?, dvoj?, dv?ma/dva (cs) | |
SubClass Of | ||
Definite | ||
Abstract | Definiteness="yes" (Noun/Adjective: Romanian, Macedonian, Bulgarian, Persian; Verb: Bulgarian, Hungarian; Pronoun: Resian, Macedonian, Bulgarian) In Romanian, nouns can be marked for definiteness with the enclitic definite article. In noun-adjective constructions, the definite article may attach enclitically to either the adjective or the modified noun (never to both of them). If present, the definite article attaches to the right of the first word in the sequence, e.g. Bunul om (The kind man) vs. Omul bun (The kind man). (MTE v4) For Macedonian, the definiteness attributes can take the values: non definite (no), generally definite (yes), definite at short visible distance (proximal), and definite at longer visible distance (distal). (MTE v4) Persian does have an article, but it marks specificity rather than definiteness. (Ivan A. Derzhanski, email 2010/06/18) According to Qasemizadeh & Rahimi's (2006) description of tokenization, Definiteness of Nouns etc. thus refers to an orthographically non-separated definite (specificity-marking) article. |
|
SubClass Of | ||
DefiniteArticle | ||
Abstract | Article/Type="definite" is used in the Romanian, Hungarian and Resian MTE v4 specs. Hungarian has three articles: a, az and egy. a and az are definite. These may not have number and case. The word 'az' may have them, but it is a pronoun in those cases. (MTE v4) The definite article in Resian is 'te ta t?' and formally distinct from the demonstrative pronoun from which it derived: 'jte jta jt?'. (MTE v4) |
|
SubClass Of | ||
Definiteness | ||
Abstract | Like Cliticness, Definiteness assembles a number of *different* phenomena grouped together under a single MTE attribute.
Definiteness refers to the definite and indefinite article in English, Bulgarian and Romanian, to the difference between long and short adjective inflection in other Slavic languages, and to verbal agreement features of the Hungarian verb.
Bulgarian Definiteness attribute: One of the most important grammatical characteristics of the new Bulgarian language which sets it apart from the rest of the Slavic languages is the existence of a definite article. The definite article is a morphological indicator of the grammatical category determination (definiteness). The definite article is not a particle (particles are a separate category of words – parts-of-speech, while the article is not a separate word), nor is it a simple suffix, but a meaningful compound part of the word. It is a word-forming morpheme, which is placed at the end of words in order to express definiteness, familiarity, acquaintance (Bulgarian Grammar, 1993). In Bulgarian, nouns, adjectives, numerals, and full-forms of the possessive pronouns and participles can acquire an article. For the singular masculine article, there are two forms: a full article(f)[l.s.] and a short article(s)[l.s.]. The full article is used when a singular masculine form is the syntactic subject of the clause, otherwise a short one is used – a purely orthographic rule. The distinction of full vs. short is not made for feminine, neuter and plural forms, and we use just the yes(y) or no(n) to mark definiteness or respectively lack thereof. Therefore, the definiteness attribute can take overall 4 different values: indefinite(n), definite(y), short article(s), full article(f). (Dimitrova et al. 2009)
In Polish, the vocalicity of (a)gglutinated forms like -em vs -m is mapped on the Definiteness attribute with its values "(f)ull-art" and "(s)hort-art" respectively, meaning "full form" and "short form". The IPIC flexeme winien and predicatives like rad are treated as short adjectives?Definiteness="short-art". The terms are very artificial, but this category is used due to the similarity of the phenomenon. (MTE v4) One of MTE v.3?s most perplexing choices is that it uses the same binary feature Definiteness of the part of speech Verb to indicate, in Bulgarian, that a participle bears a definite article (?????????? ?the ones who talked?), and in Hungarian, that a finite form of a transitive verb has a definite 3rd person direct object (tanulom ?I learn it?). Thus two totally dissimilar (not to mention unrelated) phenomena are handled alike merely because their names in the respective grammatical traditions happen to mean the same. In MTE v.4 the tagset for Persian encodes izafet as Case=genitive (i.e., practically the opposite!) in an effort to avoid introducing a language-specific feature. (Derzhanski and Kotsyba 2009) Hungarian Definiteness (of verbs): In simple terms, it means that the verb takes a definite object, which is reflected in the type of verb conjugation. Eg. in Hungarian there will be two forms of the verb 'see' here 1. I can see an elephant. 2. I can see the elephant. depending on the definiteness of the object, 'l?tok' vs. 'l?tom'. The above 1s2s form of verbs takes a 1st person singular subject and 2nd person definite object (which in actual fact can also be plural not only singular). Both subject and object can be (pro)dropped. Eg. I can see you -> (?n) l?tlak (t?ged/titeket) I see you_sg/you_pl (Csaba Oravecz, email 2010/06/15) Persian: In Farsi, Nouns are inflected for number and Definiteness. ... Farsi adjectives are inflected for degree and definiteness. (Qasemizadeh and Rahimi 2006) |
|
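The four Bulgarian values described above can be pictured as a small lookup table. The following Python sketch is illustrative only: the one-letter codes and the value names ("no", "yes", "short-art", "full-art") are taken from the descriptions in this abstract, while the dictionary and function names are hypothetical and are not part of the ontology or of any MTE tool.

```python
# Illustrative sketch, not part of the ontology.
# Codes and value names follow the Bulgarian description above;
# BG_DEFINITENESS and both helper names are hypothetical.
BG_DEFINITENESS = {
    "n": "no",         # indefinite
    "y": "yes",        # definite (feminine, neuter, plural forms)
    "s": "short-art",  # singular masculine, short article
    "f": "full-art",   # singular masculine, full article
}


def definiteness_value(code: str) -> str:
    """Return the value name for a one-letter Bulgarian Definiteness code."""
    if code not in BG_DEFINITENESS:
        raise ValueError(f"unknown Definiteness code: {code!r}")
    return BG_DEFINITENESS[code]


def carries_article(code: str) -> bool:
    """True for every value that marks the presence of a definite article."""
    return definiteness_value(code) != "no"
```

Note that such a table only covers the Bulgarian reading of the attribute; the Polish and Hungarian uses described above reinterpret the same attribute and would need their own tables.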
SubClass Of | ||
Sub-Classes | ||
Degree | ||
Abstract | A feature of adjectives. In some languages, e.g., Hungarian and Slovak, also some adverbs may have degree. (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
DelativeCase | ||
Abstract | Case="delative" (Hungarian) | |
SubClass Of | ||
DemandingClitic | ||
Abstract | Clitic="demanding" (Verb: Polish) An element that contains a clitic which is, however, represented as a separate token. Polish: Particles were extracted from the IPIC particle-adverbs category manually along with adverbs, pronouns and interjections and a few conjunctions. The Clitic attribute enables differentiating particles that are agglutinated to non-particles (value="a"), e.g., by, że. The value "y" labels a composite particle such as niechby when treated as one word; alternatively it may be encoded as a sequence of two particles, the optionally demanding niech with Clitic="d" and the agglutinant by with Clitic="a". (MTE v4) This can be a subclass of ElementWithoutClitic. They are default though and won't be encoded in most cases. We only use them in some cases for Polish verbs. (Natalia Kotsyba, email 2010/06/21) |
|
SubClass Of | ||
DemonstrativeArticle | ||
Abstract | The demonstrative article (Article/Type="demonstrative") in Romanian links a definite noun to its determinants, links a numeral or an adjective to a noun, and it is a constituent part of the relative superlative (e.g. fata cea mare (the elder girl), cel leneș (the lazy), respectively prietenul cel mai bun (the best friend)). (MTE v4) | |
SubClass Of | ||
DemonstrativeDeterminer | ||
Abstract | Determiner/Type="demonstrative" (English, Romanian, Persian) | |
SubClass Of | ||
DemonstrativePronoun | ||
Abstract | Pronoun/Type="demonstrative" | |
SubClass Of | ||
DemonstrativeQuantifier | ||
Abstract | In the Czech and Slovak MTE v4 specs, Numeral/Class="demonstrative" are items meaning `this many/much', etc. Strictly speaking, they are pronumerals (pro-quantifiers), but traditional descriptions don't recognise such a category, so they are described variously as pronouns (because they contain a demonstrative element) or as numerals (because their syntactic distribution is that of numerals, or very close). (Ivan A Derzhanski, email 2010/06/11) | |
SubClass Of | ||
DeterminalPronoun | ||
Abstract | DeterminalPronoun refers to an Estonian intensifier that is formally identical with the reflexive pronoun. Thus, Pronoun/Type="determinal" in the Estonian MTE v4 specs is used for the emphatic/reflexive pronouns _ise_, _end(a)_ `(one)self'. Note that DeterminalPronoun is not to be confused with English pronominal determiners (attributive pronouns). (Ivan A. Derzhanski, email 2010/06/15; Heiki-Jaan Kaalep, email 2010/06/21; Gülzow 2006, p.258) Insa Gülzow (2006), The acquisition of intensifiers: Emphatic reflexives in English and German child language, Mouton de Gruyter, Berlin, p. 258 |
|
SubClass Of | ||
Determiner | ||
Abstract | e.g., ?????? ?????? ??, ????, ???/?? ?? ??/??, ?????? ??, ???????? ?????? ???, ?????/???? ???? ??????/??? (fa) | |
SubClass Of | ||
Sub-Classes | ||
DigitNumeral | ||
Abstract | Form="digit" | |
SubClass Of | ||
Diminuitive | ||
Abstract | The value 'diminutive' for Degree is used in Resian only, for derived adjectives that end with the suffix '-i?' (MTE v.4). In MTE, diminutive was modelled as a value of Degree. This is, however, misplaced, as there are languages where degree and diminutivity are independent. In Latvian, for example, the diminutive suffix may be attached to an adjective not only in the positive but also in the comparative and superlative degrees (Ruke-Dravina 1953). DiminuitiveDegree was thus renamed to Diminuitive and removed from Degree. Velta Ruke-Dravina (1953), Adjectival Diminutives in Latvian. The Slavonic and East European Review 31(77): 452-465 | |
SubClass Of | ||
DirectCase | ||
Abstract | Case="direct" (Romanian) In the Romanian case system the value 'direct' conflates 'nominative' and 'accusative'. |
|
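When Romanian annotations are compared with languages that keep nominative and accusative apart, the conflation can be made explicit by expanding 'direct' into the set of cases it covers. The sketch below is an assumption-laden illustration: only the fact that 'direct' conflates 'nominative' and 'accusative' comes from the description above; the names CONFLATED_CASES, expand_case and cases_compatible are hypothetical.

```python
# Illustrative sketch, not part of the ontology.
# Only the direct = nominative + accusative conflation is taken from
# the description above; all identifiers are hypothetical.
CONFLATED_CASES = {
    "direct": frozenset({"nominative", "accusative"}),
}


def expand_case(value: str) -> frozenset:
    """Return the set of elementary cases covered by a Case value."""
    return CONFLATED_CASES.get(value, frozenset({value}))


def cases_compatible(a: str, b: str) -> bool:
    """True if two Case values share at least one elementary case."""
    return bool(expand_case(a) & expand_case(b))
```

With this expansion, a Romanian Case="direct" form is treated as compatible with either a nominative or an accusative form in another language, but not with a genitive one.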
SubClass Of | ||
DistributiveCase | ||
Abstract | Case="distributive" (Hungarian) | |
SubClass Of | ||
DualNumber | ||
Abstract | In Czech, the dual Number manifests itself in the instrumental Case of several Nouns denoting dual parts of the human body. (MTE v4) | |
SubClass Of | ||
DualQuantifier | ||
Abstract | Numeral/Class="definite2" (Czech) Some feminine and neuter body parts in Czech have preserved dual forms, and if the noun is dual, so are its attributes (adjectives, pronouns). So the agreement of the numeral 2 differs formally from 3-4 (Ivan A. Derzhanski, email 2010/06/16) |
|
SubClass Of | ||
ElativeCase | ||
Abstract | Case="elative" (Estonian, Hungarian) | |
SubClass Of | ||
ElativeDegree | ||
Abstract | Degree="elative" (Adjective: Resian, Serbian, Macedonian) In Semitic languages, ElativeDegree refers to the "adjective of superiority." In some languages such as Arabic, the concepts of comparative and superlative degree of an adjective are merged into a single form, the elative. How this form is understood or translated depends upon context and definiteness. In the absence of comparison, the elative conveys the notion of "greatest", "supreme." The elative of ???? (kab?:r, "big") is ???? (??kbar, "bigger/biggest", "greater/greatest"). (http://en.wiktionary.org/wiki/elative) In Slavic languages, as well, it is pretty standard. I do agree with the definition though, that "the elative conveys the notion of 'greatest', 'supreme.'" So, Slovene "lep" is beautiful, "prelep" is very (or supremely) beautiful; I guess the "pre-" prefix could be roughly translated as "over-". Used in Resian, Serbian, Macedonian. In Slovenian, we banished it, as even "ordinary" degrees are borderline inflection / derivation, but, I think, elative is definitely not inflection. (Tomaž Erjavec, email 2010/06/21) |
|
SubClass Of | ||
ElementWithClitic | ||
SubClass Of | ||
Sub-Classes | ||
ElementWithoutClitic | ||
SubClass Of | ||
Sub-Classes | ||
EmphaticDeterminer | ||
Abstract | Determiner/Type="emphatic" (Romanian) In Romanian, there are specific forms for the so-called emphatic determiner, which may accompany both a noun and a personal pronoun: fata însăși (the girl herself), also ea însăși (she herself). |
|
SubClass Of | ||
EmphaticPronoun | ||
Abstract | In the Ukrainian MTE v4 specs, Pronoun/Type="emphatic" is used for pronoun forms ??"????, ??"????, ??"???, ??"????, etc., with complex meanings like "there is nobody/nothing (to do sth/to use for doing sth, etc.)". Orthographically these are identical forms of negative nominal pronouns ?????, ???? "nobody, nothing" in oblique cases, however, with differing accent. They are referred to as either separate pronoun lexemes or predicatives in grammars. All Ukrainian emphatic pronoun forms include negation. (MTE v4) | |
SubClass Of | ||
EssiveCase | ||
Abstract | Case="essive" (Hungarian, Estonian) In Estonian the essive case means such things as `(I played golf) as a student', `(I worked) as a bartender', `(you look) tired', `(he's very good) as a dancing partner', `(we parted) as friends'. This doesn't sound like the definition you quoted, but is similar (though not identical) to the meaning of the Hungarian form. (Ivan A. Derzhanski, email 2010/06/15) Hungarian has two essive cases, essive-formalis (formativus, e.g., emberként "as people") and essive-modalis (essivus-modalis, e.g., emberül from ember "people") (Nose 2003, p. 108) The essive-modal case in Hungarian language can express the state, capacity, task in which somebody is or which somebody has (Essive case, e.g. "as a reward", "for example"), or the manner in which an action is carried out, an event happens, or the language which somebody knows (Modal case, e.g. "sloppily", "unexpectedly", "speak German"). An example of this would be in the sentence "Beszélek magyarul." (I speak Hungarian.) The sentence denotes the ability of being able to speak the Hungarian language. According to vowel harmony rules, ul becomes ül in cases such as "Beszélek németül." (I speak German.) because the word for "German", német is composed completely of median and/or frontal vowels. (http://en.wikipedia.org/wiki/Essive-modal_case) |
|
SubClass Of | ||
EssiveFormalCase | ||
Abstract | Case="essive-formal" (Hungarian) e.g., Hungarian 'katonaként' -> [serves] as a soldier. (Csaba Oravecz, email 2010/06/15) The Hungarian "formativus, or essivus-formalis `-ként' ... usually expresses a position, task and manner of the person or the thing." (Nose 2003) "Haspelmath & Buchholz (1998:321) explained the function of the essive case as ``role phrases''. Role phrases represent the role of the function in which a participant appears. They regard the role phrases as adverbial." (Nose 2003, p. 117) In the Hungarian language this case combines the Essive case and the Formal case, and it can express the position, task, state (e.g. "as a tourist"), or the manner (e.g. "like a hunted animal"). The status of the suffix -ként in the declension system is disputed for several reasons. First, in general, Hungarian case suffixes are absolute word-final, while -ként permits further suffixation by the locative suffix -i. Second, most Hungarian case endings participate in vowel harmony, while -ként does not. For these reasons, many modern analyses of the Hungarian case system, starting with László Antal's "A magyar esetrendszer" (1961) do not consider the essive/formal to be a case. (http://en.wikipedia.org/wiki/Essive-formal_case) cf. Masahiko Nose (2003), Adverbial Usage of the Hungarian Essive Case |
|
SubClass Of | ||
ExclamativeDeterminer | ||
Abstract | Determiner/Type="exclamative" was introduced for Persian ?????? ?? | |
SubClass Of | ||
ExclamativePronoun | ||
SubClass Of | ||
ExistentialThere | ||
Abstract | English existential there is specified as a subtype of pronoun in MTE v4, i.e., Pronoun/Type="ex-there" | |
SubClass Of | ||
FactiveCase | ||
Abstract | Case="factive" (Hungarian) | |
SubClass Of | ||
FeminineGender | ||
Abstract | e.g., hentej/hent?, hentie/hent?, hentou/hent?, hent?, hent?/hent?, hent?ch/hent?, hent?m/hent?, hent?mi/hent?, ko?k?tiek/ko?k?tka (sk) | |
SubClass Of | ||
FirstPerson | ||
Abstract | e.g., ??, ???/????, ????, ?????/????, ???/???, ?????/???, ??/??, ???/??, ????/?? (bg) | |
SubClass Of | ||
FirstSgSecondSg | ||
Abstract | Definiteness="1s2s" (Verb: Hungarian) Hungarian: 1s2s is a special form for definiteness, in which the speaker's person is first singular (I) and the target of the transitivity is second singular (you). (MTE v4) Hungarian Definiteness (of verbs): In simple terms, it means that the verb takes a definite object, which is reflected in the type of verb conjugation. E.g., in Hungarian there will be two forms of the verb 'see' here 1. I can see an elephant. 2. I can see the elephant. depending on the definiteness of the object, 'látok' vs. 'látom'. The above 1s2s form of verbs takes a 1st person singular subject and 2nd person definite object (which in actual fact can also be plural not only singular). Both subject and object can be (pro)dropped. E.g., I can see you -> (én) látlak (téged/titeket) I see you_sg/you_pl (Csaba Oravecz, email 2010/06/15) |
|
SubClass Of | ||
Foreign | ||
Abstract | In the Slovene MTE v4 specs, Residual/Type="foreign" marks a word in a stretch of foreign-language text. (MTE v4) | |
SubClass Of | ||
FormalCase | ||
Abstract | (from the discussion of [Hungarian] EssiveFormalCase) "`formal' in `essive-formal' is not an indication of register: there is another form, which in some descriptions is simply called `formal', with the affix _-képp(en)_ and a similar meaning (`in the form of ...', they probably meant when they came up with the term). The line between a case ending, an adverb formative and a postposition is a thin one in Hungarian." (Ivan A. Derzhanski, email 2010/06/15) http://en.wikipedia.org/wiki/Essive-formal_case (2010/06/15): "In the Hungarian language this case combines the Essive case and the Formal case, and it can express the position, task, state (e.g. "as a tourist"), or the manner (e.g. "like a hunted animal")." |
|
SubClass Of | ||
Formation | ||
Abstract | Formation refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word. (MTE v4 on Conjunction/Formation) | |
SubClass Of | ||
Sub-Classes | ||
FractalNumeral | ||
Abstract | Numeral/Form="fractional" (Romanian) In traditional Romanian grammars, FractionalNumeral refers to expressions like treime (one third). (MTE v4) |
|
SubClass Of | ||
FullArticle | ||
Abstract | Definiteness="full-art" (Noun: Bulgarian; Verb: Polish, Russian, Bulgarian; Adjective: Polish, Russian, Ukrainian, Bulgarian; Pronoun: Polish, Bulgarian) In Bulgarian, the singular masculine article has two forms: a full article(f)[l.s.] and a short article(s)[l.s.]. The full article is used when a singular masculine form is the syntactic subject of the clause, otherwise a short one is used (a purely orthographic rule). The distinction of full vs. short is not made for feminine, neuter and plural forms, and we use just the yes(y) or no(n) to mark definiteness or respectively lack thereof. Therefore, the definiteness attribute can take overall 4 different values: indefinite(n), definite(y), short article(s), full article(f) e.g., ???, ????, ????? /a man, the man[short], the man [full]/ (Dimitrova et al. 2009) For Polish, the Vocalicity of (a)gglutinated forms like -em vs -m is mapped on the Definiteness attribute with its values "(f)ull-art" and "(s)hort-art" respectively, meaning "full form" and "short form". The terms are very artificial, but this category is used due to the similarity of the phenomenon. The IPIC flexeme winien and predicatives like rad are thus treated as short adjectives (Definiteness="short-art"). (MTE v4) |
|
SubClass Of | ||
FutureParticle | ||
Abstract | Particle/Type="future" (Romanian) A verbal particle with Particle/Type="future" modifies the verb and carries the information that the verb is in the future tense (Dan Tufis, email 2010/06/09) | |
SubClass Of | ||
FutureTense | ||
Abstract | Tense="future" Czech/Slovak verbs normally form the future Tense periphrastically by auxiliary "být" (E. "to be") plus infinitive of the main Verb. In addition to the copula, there are, however, some Verbs which form future Tense non-periphrastically, i.e. synthetically (Verbs of motion). Such verbal forms are marked as Tense=f. (MTE v4) |
|
SubClass Of | ||
Gender | ||
Abstract | Gender is reduced to the traditional set of "masculine, feminine, neutral" (and mergers between these). Animacy and Humanness that constitute subgenders of masculine accusative forms in Polish and other Slavic languages are represented as individual attributes. (Polish MTE v4) The Gender value "common" is assigned to nouns that can combine with adjectives in either feminine or masculine, e.g., Ukrainian ??????, or either neutral or masculine gender, e.g., Ukrainian ?????. (Ukrainian MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
GeneralAdjective | ||
Abstract | In the Slovene MTE v4 specs, Adjective/Type="general" conflates qualificative adjectives and ordinal adjectives (distinguished in MTE v3). (Tomaz Erjavec, email 2010/06/09) | |
SubClass Of | ||
Sub-Classes | ||
GeneralAdverb | ||
Abstract | Adverbials can be sub-classified in different ways, Adverb/Type="general" refers to the prototypical case of adverbs per language. Its definition, however, depends on the respective (language-specific) taxonomy of adverbs. For example the distinction proposed for Romanian considers the principal syntactic properties of the adverbs. For Romanian, the general type includes most of the pronominal adverbs (demonstrative: aici (here), indefinite: oriunde (anywhere)). A distinct negative value is needed for adverbs as well (nicăieri - nowhere, niciodată - never). The particle type covers those adverbs which can dislocate verbal compound forms (ex. Ea a tot cântat -- She has ever sung) or mark degrees (ex. circa (about), foarte (very), prea (too)). Such adverbs are cam, mai, prea, și, tot, foarte etc. A useful distinction in Romanian considers the adverbs which can have predicative role, that is they can govern a subordinate sentence (ex. Firește că o știu -- Certainly I know it). Here (for uniformity within a multilingual environment), they are squeezed into the modifier class. No formal distinction is made between the interrogative adverbs and the relative ones. The "portmanteau" type of adverb was introduced to cover some few words which can be both adverbs and conjunctions (with adverbial reading more frequent). This was necessary for tagging purposes. (MTE v4) | |
SubClass Of | ||
GeneralDeterminer | ||
Abstract | Determiner/Type="general" (English) | |
SubClass Of | ||
GeneralParticle | ||
Abstract | In the Bulgarian MTE v4 specs, Particle/Type="general" is applied to non-specialised particles that do not fall in any of the other classes, i.e., negative, comparative, verbal, interrogative, or modal particle. (Dimitrova et al. 2009) | |
SubClass Of | ||
GeneralPronoun | ||
Abstract | Pronoun/Type="general" in the English and Slavic MTE v4 specs refers to pronouns not grouped together with any of the other subcategories of Pronoun defined for the respective language. In Slovak, for example, "general" pronouns are pronouns like "všetci" [E. "all"], "každý" [E. "every"], etc. (MTE v4) | |
SubClass Of | ||
GenitiveCase | ||
Abstract | e.g., me/mina, meie/mina, meiegi/mina, minu/mina, minugi/mina, mu/mina, nende/tema, nendegi/tema, Pauluste/Paulus (et) | |
SubClass Of | ||
Gerund | ||
Abstract | "Gerund" as defined here is fully ambiguous | |
SubClass Of | ||
GerundOrAdverbialParticiple | ||
Abstract | The problem is that the English term _gerund_ is ambiguous: with respect to Latin, in whose grammatical tradition it originates, it refers to a deverbal noun, and is needed in this function for Polish as well; in descriptions of some other languages, however, it has been used for an adverbial participle. The two meanings have nothing in common, except that the English _ing_-form can translate both. (Ivan A Derzhanski, email 2010/06/09) | |
SubClass Of | ||
Sub-Classes | ||
GerundProper | ||
Abstract | The concept GerundProper is introduced here to refer to "gerund" in the sense of a deverbal noun. The term _gerund_, however, is ambiguous: with respect to Latin, in whose grammatical tradition it originates, it refers to a deverbal noun, and is needed in this function for Polish as well; in descriptions of some other languages, however, it has been used for an adverbial participle. The two meanings have nothing in common, except that the English _ing_-form can translate both. (Ivan A Derzhanski, email 2010/06/09) Polish proper gerunds (deverbal nouns) are encoded as common nouns. Since they are very frequent in Polish, it seems expedient to add a type for them, i.e., Noun/Type="gerund". (Derzhanski and Kotsyba 2009) |
|
Human | ||
Abstract | Human="yes" | |
SubClass Of | ||
Humanness | ||
Abstract | The attribute "Human" is added to express derogativity in Polish. The Polish derogatives are a class of plural forms of nouns which are [-Human] in the nominative but [+Human] in the accusative. (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
IllativeCase | ||
Abstract | Case="illative" (Estonian, Hungarian) | |
SubClass Of | ||
Imperative | ||
Abstract | e.g., accompagnaimu/akompanj?t, amaite/am?t, contemplaimo/kontempl?t, D?ite/d?t, gauo/tet, gau?/tet, hovarnajte/govarn?t, hrishi/gri?it, lashi/l?gat (sl-rozaj) | |
SubClass Of | ||
ImperfectTense | ||
Abstract | Tense="imperfect" (Romanian, Croatian, Serbian, Macedonian, Bulgarian, Estonian) | |
SubClass Of | ||
Impersonal | ||
Abstract | VForm="impersonal" (Polish, Ukrainian) In Ukrainian, the impersonal VForm (o) is characterized by the ending -??/-??. It exists in other Slavic languages as well, although in most of them it coincides with the neutral form of the passive adjectival participle and is classified as such. In Ukrainian, as well as in Polish, the attributive form is different from the predicative one, cf. in Ukrainian ?????? ??????? (a written rule) vs ?????? ??????? (a rule was/is written). (MTE v4) |
|
SubClass Of | ||
Inanimate | ||
Abstract | Animate="no" (Slavic Noun/Pronoun http://nl.ijs.si/ME/V4/msd/html/msd.N.html; Czech verb) Slovak (like most other Slavic languages) distinguishes masculine animate (Animate="yes") and masculine inanimate (Animate="no") gender. Masculine inanimate nouns always have the same form in the nominative and accusative case, whereas masculine animate nouns have predominantly the same form in the genitive and accusative case. Masculine animate nouns and masculine inanimate nouns differ in accusative singular, nominative (vocative) and accusative plural only (Slovak MTE v4). |
|
SubClass Of | ||
Indefinite | ||
Abstract | Definiteness="no" (Noun/Adjective: Romanian, Macedonian, Bulgarian, Persian; Verb: Bulgarian, Hungarian; Pronoun: Resian, Macedonian, Bulgarian) For Bulgarian, Romanian and Macedonian, Definiteness="no" marks the absence of the definite article at the element in the clause that is expected to carry it (usually the first word of an NP). Also, Persian does have an article, but it marks specificity rather than definiteness. (Ivan A. Derzhanski, email 2010/06/18) According to Qasemizadeh & Rahimi's (2006) description of tokenization, Definiteness of Nouns etc. thus refers to an orthographically non-separated definite (specificity-marking) article. |
|
SubClass Of | ||
IndefiniteAdjective | ||
SubClass Of | ||
IndefiniteArticle | ||
Abstract | Article/Type="indefinite" is used in the Romanian, Resian and Hungarian MTE v4 specs. Hungarian, for example, has three articles: a, az and egy. egy is indefinite. These may not have number and case. In Resian, the indefinite article is 'din na n?' and formally distinct from the numeral from which it derived: 'dyn dn? dn?'. (MTE v4) | |
SubClass Of | ||
IndefiniteDeterminer | ||
Abstract | Determiner/Type="indefinite" (English, Romanian, Persian) | |
SubClass Of | ||
IndefinitePronoun | ||
Abstract | Pronoun/Type="indefinite" For some languages, IndefinitePronoun also covers negative pronouns. In Romanian, however, it is worth differentiating the negative pronoun from other indefinite pronouns: a negative pronoun cannot be an argument for a verb unless the verb itself is negated too (e.g. Nu am văzut pe nimeni / *Am văzut pe nimeni). (MTE v4) |
|
SubClass Of | ||
IndefiniteQuantifier | ||
SubClass Of | ||
Indicative | ||
Abstract | e.g., ?????/????, ?????/????, ??????/????, ??????/????, ????/????, ?????/????, ?????/??????, ?????/????, ??????/???? (ru) | |
SubClass Of | ||
InessiveCase | ||
Abstract | Case="inessive" (Hungarian, Estonian) | |
SubClass Of | ||
Infinitive | ||
Abstract | e.g., ??????????/= ??????????/?????????? ?????????/= ?????????/????????? ???????/= ???????/??????? ?????????/= ?????????/????????? ???????????/= ???????????/??????????? ????????????/= ????????????/???????????? ??????????????/= ??????????????/?????????????? ????????????/= ????????????/???????????? ?????????????/= ?????????????/????????????? ???????????/= ???????????/???????????, ???????????/= ???????????/??????????? ?????????/= ?????????/????????? ???????????/= ???????????/??????????? ??????????/= ??????????/?????????? ??????????/= ??????????/?????????? ??????????????/= ??????????????/?????????????? ????????????????/= ????????????????/???????????????? ???????????/= ???????????/??????????? ?????????????/= ?????????????/????????????? ????????????/= ????????????/????????????, ????????/= ????????/???????? ???????/= ???????/??????? ??????????/= ??????????/?????????? ??????/= ??????/?????? ?????????/= ?????????/????????? ???????????/= ???????????/??????????? ????????/= ????????/???????? ???????/= ???????/??????? ?????????/= ?????????/????????? ???????/= ???????/???????, ????/= ????/???? (uk) | |
SubClass Of | ||
InfinitiveParticle | ||
Abstract | Particle/Type="infinitive" (Romanian) A verbal particle with Particle/Type="infinitive" modifies the verb and carries the information that the verb is in the infinitive (Dan Tufis, email 2010/06/09) | |
SubClass Of | ||
InitialCoordinatingConjunction | ||
Abstract | In the English MTE v4 specs, Conjunction/Coord_Type="initial" designates the initial component of a complex conjunction consisting of multiple words (see CorrelativeCoordinatingConjunction), e.g., "either" in "either ... or ..." and "neither" in "neither ... nor ...". | |
SubClass Of | ||
InstrumentalCase | ||
Abstract | e.g., ????/??, ????/??, ???????????/???????????, ?????/???, ????/??, ????/??, ?????/??????, ???/??, ?????/???? (ru) | |
SubClass Of | ||
Interjection | ||
Abstract | e.g., ach/ach, ho/ho, och/och (pl) | |
SubClass Of | ||
Sub-Classes | ||
InterjectionFormation | ||
Abstract | Interjection/Formation refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word. (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
InterrogativeAdverb | ||
Abstract | Adverb/Type="interrogative" is used in the Hungarian MTE v4 specs. Corresponds to Adverb/Wh_Type="question" in English MTE v4. | |
SubClass Of | ||
InterrogativeDeterminer | ||
Abstract | Determiner/Type="interrogative" (Persian MTE v4), corresponds to Determiner/Type="question" in English MTE v4. | |
SubClass Of | ||
InterrogativeOrRelativeAdverb | ||
Abstract | Adverb/Type="int-rel" for Romanian applies to interrogative and relative adverbs that are not formally distinguished in Romanian. (MTE v4) Corresponds to the presence of Adverb/Wh_Type in English MTE v4. | |
SubClass Of | ||
Sub-Classes | ||
InterrogativeOrRelativeDeterminer | ||
Abstract | Determiner/Type="int-rel" (Romanian MTE v4), corresponds to the presence of Determiner/Wh_Type in English MTE v4. | |
SubClass Of | ||
Sub-Classes | ||
InterrogativeOrRelativePronoun | ||
Abstract | Pronoun/Type="int-rel" (Romanian), corresponds to the presence of Pronoun/Wh_Type in English (MTE v4). | |
SubClass Of | ||
Sub-Classes | ||
InterrogativeParticle | ||
Abstract | Particle/Type="interrogative" (Croatian, Serbian, Bulgarian). In Bulgarian, this category is applied to particles used to form yes/no-questions or exclamations (??, ????, ????, ????, ?????) (Dimitrova et al. 2009) | |
SubClass Of | ||
InterrogativePronoun | ||
Abstract | Pronoun/Wh_Type="question" (English MTE v4), Pronoun/Type="interrogative" (other languages). | |
SubClass Of | ||
InterrogativeQuantifier | ||
SubClass Of | ||
Intransitive | ||
Abstract | Transitive="no" (Persian) | |
SubClass Of | ||
LetterNumeral | ||
Abstract | Form="letter" | |
SubClass Of | ||
LightVerb | ||
Abstract | In linguistics, a light verb is a verb participating in complex predication that has little semantic content of its own, but provides through inflection some details on the event semantics, such as aspect, mood, or tense. The semantics of the compound, as well as its argument structure, are determined by the head or primary component of the compound, which may be a verb or noun (V+V or V+N compounds). Other names for "light verb" include: vector verb or explicator verb, emphasising its role within the compound; or thin verb or semantically weak verb, emphasising (as with "light") its lack of semantics. A "semantically weak" verb is not to be confused with a "weak verb" as in the Germanic weak inflection. Light verbs are similar to auxiliary verbs in some ways. Most English light verbs occur in V+N forms sometimes called "stretched verbs": for example, take in take a nap, where the primary sense is provided by "nap", and "take" is the light verb. The light verbs most common in these constructions are also common in phrasal verbs. A verb which is "light" in one context may be "heavy" in another: as with "take" in I will take a book to read. Examples in other languages include the Yiddish geb in geb a helf (literally give a help, "help"); the French faire in faire semblant (lit. make seeming, "pretend"); the Hindi nikal paRA (lit. leave fall, "start to leave"); and the bǎ construction in Chinese. Some verbs are found in many such expressions; to reuse an earlier example, take is found in take a nap, take a shower, take a sip, take a bow, take turns, and so on. Light verbs are extremely common in Indo-Iranian languages, Japanese, and other languages in which verb compounding is a primary mechanism for marking aspectual distinctions. (http://en.wikipedia.org/wiki/Light_verb) | |
SubClass Of | ||
LocativeCase | ||
Abstract | e.g., belim/beo, Bratstvu/Bratstvo, dvojim/dvoje, Hajd, jednoj/jedan, jednom/jedan, jednome/jedan, jednomu/jedan, Jevrejkama/Jevrejka (sr) | |
SubClass Of | ||
MainVerb | ||
Abstract | Type="main" Depending on the tagging strategy (see AuxiliaryVerb), MainVerb refers either to lexical verbs that have neither auxiliary, nor modal, nor copular function, or to verbs other than "to be", "to have", etc. |
|
SubClass Of | ||
MasculineGender | ||
Abstract | e.g., Hrvatoma/Hrvat, katerih/kateri, katerim/kateri, katerimi/kateri, pilatusema/pilatus, tistih/tisti, treh/trije, trem/trije, tremi/trije (sl) | |
SubClass Of | ||
MedialVoice | ||
Abstract | Voice="medial" (Russian), e.g., ?????????/???????, ???????????/?????????, ???????????/????????? | |
SubClass Of | ||
MFormNumeral | ||
Abstract | The Bulgarian MTE v4 specs have an additional subtype of Numeral, Numeral/Form=m_form.
This signifies a special form of cardinal numbers for persons of masculine gender for `two', `three', `four', `five' and `six', formed with suffix -(?)??: ?????, ?????, ?????? /two(people), three(people), five(people)/
(Dimitrova et al. 2009; Earl 2000, p. 153). They go beyond six, though the higher the number, the less natural they sound. `Seven Brides for Seven Brothers' is _Sedem nevesti za sedmina bratja_, always. Otoh, _The Seven Samurai_ is _Sedemte samurai_, not _Sedminata samurai_. It's a stylistic choice. (Ivan A Derzhanski, email 2010/06/20) Lily Earl (2000), A comprehensive Bulgarian grammar for foreign learners, Daniela Ubenova, Sofia |
|
SubClass Of | ||
ModalParticle | ||
Abstract | Particle/Type="modal" (Croatian, Serbian, Bulgarian). In the Bulgarian MTE, Type=modal refers to particles that express urge or order, mostly homonymous with other types of particles, for instance ??, ????, ????, ?????. (Dimitrova et al. 2009) | |
SubClass Of | ||
ModalVerb | ||
Abstract | Type="modal" Persian Modal verbs change the aspect of verbs to Subjunctive. Usually they come before Main verbs in present subjunctive form so the Main verb will have normal inflectional attributes. But if the Main verb appears in past 3rd person form, then the construction will be impersonal. Modal verbs usually are not inflected by number and person. However, there is an exception for the verb '?( '????????tav?nestan) that can be inflected for person and number. (Qasemizadeh and Saeed Rahimi 2006) |
|
SubClass Of | ||
ModificationType | ||
Abstract | Determiner/Modific_Type refers to the prenominal or postnominal positions of Determiners which distinguish different forms in Romanian. (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
ModifierAdverb | ||
Abstract | Adverb/Type="modifier" is used in the English, Romanian and Hungarian MTE v4 specs. For Romanian, Adverb/Type="modifier" applies to adverbs which can have a predicative role, that is, they can govern a subordinate sentence (ex. Firește că o știu -- Certainly I know it). Here (for uniformity within a multilingual environment), they are squeezed into the modifier class. (MTE v4) | |
SubClass Of | ||
MoodInterjection | ||
Abstract | Interjection/Type="mood" (Hungarian) | |
SubClass Of | ||
MorphologicalDerivation | ||
Sub-Classes | ||
MorphologicalFormOfNumeral | ||
SubClass Of | ||
Sub-Classes | ||
MorphosyntacticCategory | ||
Abstract | Top-level categories as specified under http://nl.ijs.si/ME/V4/msd/html/msd.cats.html. Subordinate categories reflect "Type" and related attributes. | |
SubClass Of | ||
Sub-Classes | ||
MorphosyntacticFeature | ||
Abstract | Morphosyntactic features as specified under http://nl.ijs.si/ME/V4/msd/html. Note that attributes like "type" are represented as subcategories of MorphosyntacticCategory, cf. remarks there. | |
Sub-Classes | ||
MultipleNumeral | ||
Abstract | A Multiple Numeral serves to define a complex whole, with respect to the number of its parts. In English, a Multiple Numeral is formed by adding the syllable "-fold" to the stem of a numeral. (Joseph Gostwick [1878], English language -- Grammar, Historical, London, Longmans, Green, and Co., http://www.archive.org/details/englishgrammarhi00gostrich) | |
SubClass Of | ||
MultiplicativeCase | ||
Abstract | Case="multiplicative" (Hungarian) | |
SubClass Of | ||
Negated | ||
Abstract | Negation="yes" Negative="yes" encodes negative verbal word-forms in Slavic languages and Estonian. (MTE v4) In Slovak, for example, verbs form the negative with the prefix 'ne-', with the exception of the verb "byť" (E. "to be"), which forms the negative in the indicative with the separate particle "nie", e.g. "nie je" (is not). Here, Slovak "je" would be marked as negative, despite having positive form. In Resian, negative is always marked as 'n' except for two verbs: 'n?man' / not to have, 'n?si' / not to be. (MTE v4) |
|
SubClass Of | ||
Negation | ||
Abstract | Negative="yes" encodes negative verbal word-forms in Slavic languages and Estonian. (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
NegativeAdverb | ||
Abstract | Adverb/Type="negative" is used in the Serbian and Romanian MTE v4 specs, e.g., for Romanian nicăieri - nowhere, niciodată - never. (MTE v4) | |
SubClass Of | ||
NegativeDeterminer | ||
Abstract | Determiner/Type="negative" (Romanian) In Romanian the negative determiner is expressed by the unit nici + indefinite article (e.g. nici un, nici o). (MTE v4) |
|
SubClass Of | ||
NegativeParticle | ||
Abstract | e.g., ??, ???? (bg) | |
SubClass Of | ||
NegativePronoun | ||
Abstract | Pronoun/Type="negative" (Romanian, Slavic) Overlap with EmphaticPronoun: In the Ukrainian MTE v4 specs, the emphatic Type of Pronoun is used for pronoun forms ??"????, ??"????, ??"???, ??"????, etc., with complex meanings like "there is nobody/nothing (to do sth/to use for doing sth, etc.)". Orthographically these are identical forms of negative nominal pronouns ?????, ???? "nobody, nothing" in oblique cases, however, with differing accent. They are referred to as either separate pronoun lexemes or predicatives in grammars. All Ukrainian emphatic pronoun forms include negation. Overlap with IndefinitePronoun: In Romanian it is worth differentiating the negative pronoun from other indefinite pronouns: a negative pronoun cannot be an argument for a verb unless the verb itself is negated too (e.g. Nu am văzut pe nimeni / *Am văzut pe nimeni). |
|
SubClass Of | ||
NegativeSubordinatingConjunction | ||
Abstract | Conjunction/Sub_Type="negative" (Romanian, Serbian, Russian) In Romanian, each conjunction requires another mood, so that the diversity may be controlled by subcategorisation rules. The attribute Sub_Type distinguishes among the positive and negative conjunctions, providing means to control verbal double negation (as in the case of the negative pronouns, determiners and adverbs): nici NU am venit, nimeni NU vorbește, nici_un tren N-a trecut, nicăieri N-am văzut (MTE v4) | |
SubClass Of | ||
NeuterGender | ||
Abstract | In Romanian the declension of a neuter noun always follows in singular a masculine paradigm and in plural a feminine one. Specific implementations could take advantage of this rule and by organizing the paradigmatic space in partial paradigms (masc-sing, masc-pl, fem-sing, fem-pl) to get rid of neuter value for the gender attribute. (MTE v4) | |
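The rule described above (Romanian neuter nouns decline like masculines in the singular and like feminines in the plural) can be illustrated with a minimal sketch. The function name and the paradigm labels are hypothetical and only mirror the four partial paradigms named in the abstract; this is not part of the MTE v4 specification.

```python
# Hypothetical labels for the four partial paradigms mentioned in the abstract.
GENDER_ABBR = {"masculine": "masc", "feminine": "fem"}
NUMBER_ABBR = {"singular": "sing", "plural": "pl"}


def partial_paradigm(gender: str, number: str) -> str:
    """Map a (gender, number) pair to a partial paradigm without a neuter value.

    Assumption: a neuter noun follows the masculine paradigm in the singular
    and the feminine paradigm in the plural, as stated in the abstract.
    """
    if gender == "neuter":
        gender = "masculine" if number == "singular" else "feminine"
    return f"{GENDER_ABBR[gender]}-{NUMBER_ABBR[number]}"


print(partial_paradigm("neuter", "singular"))
print(partial_paradigm("neuter", "plural"))
```

With such a mapping, an implementation could store only masculine and feminine paradigms and resolve neuter nouns at lookup time.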
SubClass Of | ||
NoClitic | ||
Abstract | Clitic="no" (Noun/Adjective: Romanian; Verb: Romanian, Polish, Serbian, Persian) Slovak Pronoun: The Clitic attribute distinguishes clitical vs. nonclitical pronominal forms, e.g. "ti" vs. "tebe". Romanian Verb, Noun, Adjective: The cliticization phenomenon in Romanian is not restricted to the verb-pronoun relationship, but may also be observed with the (main) verb and the auxiliary, the noun or adjective with pronoun, with noun or adjective with copula, pronoun with auxiliary, preposition with (indefinite) article, numeral or (indefinite) pronoun, negative adverb with verb, auxiliary or pronoun, and some others (mainly created through the contracted forms of the verb "a fi"-to be). We restrict ourselves to considering only the graphically marked cliticizations. In such cases, the two, three or (sometimes) four constituents of a cliticized word-form are always separated by a hyphen. Omitting the hyphen in such cases is an unacceptable error in written Romanian. Romanian Article: Note that the definite article has only enclitic forms, except for one proclitical form (lui + proper noun: lui Ion). The inflected forms of the foreign-origin words (mainly nouns) not fully assimilated, are usually written with a hyphen between the base-form and the inflectional ending. In our encoding, we classified these endings (which are supposed to be split by the segmenter) as clitic articles (clitic attribute is always "y") which can be either definite (type=f, "-istul") or indefinite (type=i, "ist") and are characterised by gender (gender=m, "ist"; gender=f, "istă"), number (number=s, "ist"; number=p, "iști") and case (case=r, "istul"; case=o, "istului"). |
|
SubClass Of | ||
NoHuman | ||
Abstract | Human="no" | |
SubClass Of | ||
Nominal | ||
Abstract | Pronoun/Syntactic_Type="nominal" (Slavic), Abbreviation/Syntactic_Type="nominal" Slovak Pronoun: Pronouns are distinguished between having a (syntactically) nominal and (syntactically) adjectival function. All pronominal types except the demonstrative and possessive one can be nominal, and all except for the personal one can be adjectival. |
|
SubClass Of | ||
NominalAdjective | ||
Abstract | Formation="nominal" (Czech) | |
SubClass Of | ||
NominativeCase | ||
Abstract | e.g., eu, tu (ro) | |
SubClass Of | ||
NoncliticElement | ||
SubClass Of | ||
NonInitialCoordinatingConjunction | ||
Abstract | In the English MTE v4 specs, Conjunction/Coord_Type="non-initial" designates coordinating conjunctions that are not InitialCoordinatingConjunctions, i.e., the second element of a complex conjunction consisting of multiple words (see CorrelativeCoordinatingConjunction), e.g., "or" in "either ... or ..." and "nor" in "neither ... nor ...", but also single-word conjunctions such as "than", "but", and "and". | |
SubClass Of | ||
NonNegated | ||
Abstract | Negation="no" Non-negated verbs carry no morphological marks of negation. In Resian, negative is always marked as 'no' except for two verbs: 'n?man' / not to have, 'n?si' / not to be. In Slovak, verbs form the negative with the prefix 'ne-', with the exception of the verb "byť" (E. "to be"), which forms the negative in the indicative with the separate particle "nie", e.g. "nie je" (is not). Here, "je" would be marked as negative, despite having positive form. (MTE v4) |
|
SubClass Of | ||
NonspecificPronoun | ||
Abstract | In the Russian MTE v4 specs, Pronoun/Type="nonspecific" marks the following Russian words: весь 'all', всякий 'any, every', сам 'oneself', самый 'the very', каждый 'every, each', иной 'other', любой 'any', другой 'other'. The name "nonspecific" follows Halliday (1985, Section 6.2.1.1). (MTE v4) A nonspecific pronoun refers to an unidentified or general entity (e.g., "I saw *someone*", "I saw *everyone*"). A nonspecific pronoun is not, therefore, a personal pronoun, but an indefinite one. (Andrews 2003). Andrews, Richard J. (2003), Introduction to Classical Nahuatl. University of Oklahoma Press. Halliday, M.A.K. (1985), An introduction to Functional Grammar, London: Edward Arnold | |
SubClass Of | ||
Noun | ||
Abstract | Nouns are characterized by features such as gender and number (for all MTE v4 languages, http://nl.ijs.si/ME/V4/msd/html/msd.N.html) As for the overlap between Noun and Adjective, Slovak adjectival nouns (gazdiná, hostinský) are classified as nouns. Sometimes the distinction between noun and adjective is not as clear as we want (obchodný cestujúci). (MTE v4) As for the overlap between Noun and Verb, Ukrainian gerunds are not differentiated, but could be treated as a special class of nouns, nota bene: they possess aspect. (MTE v4) |
|
SubClass Of | ||
Sub-Classes | ||
Number | ||
Abstract | Hungarian has three types of number in the nominal inflection: 1. The number of the noun. 2. The number of owners that own the noun. 3. The number of the context given referent, which is some possession of the noun, i.e. belongs to the noun (anaphoric possessive). (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
Numeral | ||
Abstract | In Romanian (as in many other languages) several numerals have noun behaviour (some grammarians classify such numerals as nouns) with gender and declension of their own, which they preserve even in the composition of the superior order numerals; these are, for instance, sută (hundred), mie (thousand), milion (million) and miliard (billion). In a sentence most numerals may fulfill the function of other parts of speech like noun, determiner or adverb. (Romanian MTE v.4) In some languages, a division of numerals according to their nominal, adjectival or adverbial function is, however, not usual, and numerals have not been subsumed under adjectives, pronouns, determiners, etc. because the internal structure of complex numerals is idiosyncratic and because of their specific syntactic distribution. (Resian, English, Czech, MTE v.4). |
|
SubClass Of | ||
Sub-Classes | ||
NumeralAgreementClass | ||
Abstract | In most Slavic languages, Numerals and Quantifiers involve specific agreement patterns, e.g., in Russian: (b) PaucalQuantifier (MTE v4: Numeral/Class="definite234"): requires noun in genitive singular, e.g., два/три/четыре года "two/three/four years" (c) PluralQuantifier (MTE v4: Numeral/Class="definite"): requires noun in genitive plural, e.g., пять/много/сколько/столько лет "five/many/how many/that many years" Bulgarian has done away with the distinction between 4 and 5, and generalised the 2-4 form to all numerals (and some other quantifiers), but the others generally keep it. Also Slovene has a living dual (both Sorbians likewise, but they haven't been MTEd). Some Czech feminine and neuter body parts have preserved dual forms, and if the noun is dual, so are its attributes (adjectives, pronouns). So 2 differs formally from 3-4. The corresponding agreement pattern is a DualQuantifier (MTE v4: Numeral/Class="definite2"). (Ivan A. Derzhanski & Christian Chiarcos) |
|
SubClass Of | ||
Sub-Classes | ||
NumeralForm | ||
Abstract | NumeralForm conflates two different aspects that are made explicit here: - OrthographicRepresentationOfNumeral: the orthographical representation of Numerals, and - MorphologicalFormOfNumeral: morphological subclasses of Numeral as defined by their derivational morphology | |
SubClass Of | ||
Sub-Classes | ||
NumeralThreeOrFour | ||
Abstract | Numeral/Class="definite34" (Polish, Czech). Agreement pattern as prototypically manifested for numerals "three" and "four" (in Czech and Polish). | |
SubClass Of | ||
NumeralTwoToFour | ||
Abstract | Numeral/Class="definite234" (Slovak) Agreement pattern as prototypically manifested for the numerals two to four. |
|
SubClass Of | ||
ObliqueCase | ||
Abstract | Case="oblique" (Romanian, Macedonian) In the Romanian case system the value 'oblique' conflates 'genitive' and 'dative'. In the Macedonian case system the value 'oblique' conflates archaic forms of 'genitive', 'dative' and 'accusative'. |
|
SubClass Of | ||
OrdinalAdjective | ||
Abstract | "Ordinal adjective" is applied to Slovenian vrstni pridevniki and Ukrainian відносні прикметники. A more appropriate term would be relative adjective (Derzhanski and Kotsyba 2009). In Macedonian MTE v.4, the term "ordinal adjective" designates ordinal numerals, e.g., prv, vtor (eng. first, second) (note on MTE v4 adjectives). This category is thus ambiguous between "RelationalAdjective" and "OrdinalNumeral and Adjective". | |
SubClass Of | ||
OrdinalNumeral | ||
Abstract | Numeral/Type="ordinal". Ordinal (qualitative) numerals have an enumerating property, through which one can determine the consecutive position of an object in an ensemble of homogeneous objects, e.g., Slovak prvý deň, druhý mesiac, tretia sekunda; Bulgarian първи ден, втори месец, трета секунда /first day, second month, third second/. (Dimitrova et al. 2009) Ordinal numerals often have the same inflectional characteristics as adjectives (Macedonian MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
OrthographicalRepresentationOfNumeral | ||
SubClass Of | ||
Sub-Classes | ||
OtherInterjection | ||
Abstract | Interjection/Type="other" (Hungarian, as compared to Type="mood") | |
SubClass Of | ||
owl:Thing | ||
Namespace | http://www.w3.org/2002/07/owl# | |
Sub-Classes | ||
Participle | ||
Abstract | The concept Participle merges MTE v4 Verb/VForm="participle" and Adjective/Type="participle" that are used with partially overlapping extent: Overlap with Adjective: Czech adjectival active and passive participles, e.g. "stojící" (E. "standing") or "udělaný" (E. "performed" or "done", cf. Note 4 above) are classified as adjectives. (MTE v4) Slovak: The 'past participle' in Slovak is used for expressing compound active past Tense and is encoded as: Type=p(articiple), Tense=pa(s)t. (MTE v4) Slovak/Bulgarian: Vform=participle(p) corresponds to Slovak L-participle, in Bulgarian called just the participle and is used to form the past tense or the conditional. In Bulgarian, it also includes past participle (????????) /spoken/). (Dimitrova et al. 2009) Macedonian: The passive participle is used in verbal forms with the auxiliary ima / nema (eng. to have, to have not). The verbal adjective, in case it is used out of this construction, is considered as separate lemma. (MTE v4) Romanian participle and gerund mood permit an adjectival use. However, the adjectival use of gerund is extremely rare (o m?n? tremurnd? - a shaking hand). (MTE v4) |
|
SubClass Of | ||
Sub-Classes | ||
ParticipleAdverb | ||
Abstract | Adverb/Type="participle" is used in the Slovene MTE v4 specs, e.g., 'leže' / lying. Slovenian adverbial participles are, however, not attested for Resian. (MTE v4) | |
SubClass Of | ||
Particle | ||
Abstract | Particle has an overlap with Adverb, see Adverb/Type="particle" as used for Hungarian and Romanian MTE v4. In Slovak MTE v4, however, Particles form a separate part of speech category as is customary in Slovak grammars. In the Slovak MTE tagset, we simplified our task enormously by resigning the classification attempts (which can be analysed ad nauseam to an arbitrary precision (Šimková, 2004)), and all the articles have the same simple tag P. (Dimitrova et al. 2009) | |
SubClass Of | ||
Sub-Classes | ||
ParticleAdverb | ||
Abstract | Adverb/Type="particle" is used in the Romanian and Hungarian MTE v4 specs. In Romanian, the particle type covers those adverbs which can dislocate verbal compound forms (ex. Ea a tot c?ntat -- She has ever sung) or mark degrees (ex. circa (about), foarte (very), prea (too)). Such adverbs are cam, mai, prea, ?i, tot, foarte etc. (MTE v4) | |
SubClass Of | ||
ParticleFormation | ||
Abstract | Particle/Formation refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word. (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
PartitiveCase | ||
Abstract | Case="partitive" (Estonian) | |
SubClass Of | ||
PartOfFixedExpression | ||
Abstract | Some forms can only be used in a fixed context, e.g., polsku in po polsku. They are classified as special kinds of adjectives in the IPIC. In the MTE version this information is preserved in the status of a "burkinostka". This term is devised by Magdalena Derwojedowa and refers to dependent words like Burkina which only make sense and can be morphosyntactically identified in a fixed combination (Burkina Faso). | |
SubClass Of | ||
PassiveVoice | ||
Abstract | Voice="passive" In Macedonian, two types of (adjectival) participles exist: active and passive. Active corresponds to Macedonian L-form and passive to verbal adjective, neuter gender, singular. For example, nosel is encoded as VForm=Participle, Voice=Active, nosen as VForm=participle, Voice=Passive. (MTE v4) |
|
SubClass Of | ||
PastTense | ||
Abstract | Tense="past" (Ukrainian also for adjectives) | |
SubClass Of | ||
PaucalNumber | ||
Abstract | Number="paucal" (Serbian Verb) PaucalNumber is a form used with numerals from 2 to 4 (cf. PaucalQuantifier). (Ivan A. Derzhanski, email 2010/06/16) |
|
SubClass Of | ||
PaucalQuantifier | ||
Abstract | In many Slavic languages, numerals between 2 and 4 (and some quantifiers) involve a specific agreement pattern that is different from that of smaller and greater numbers. In Russian, for example, genitive singular is required. These numerals and quantifiers with the same characteristics are referred to here as "paucal quantifiers". (cf. David Pesetsky, http://www.uni-leipzig.de/~jtrommer/Harvard/pesetsky.pdf) | |
SubClass Of | ||
Sub-Classes | ||
PerfectiveAspect | ||
Abstract | Aspect="perfective" (Noun: Polish; Verb: Slavic; Adjective: Polish, Ukrainian) In Slavic aspectology, Russian aspect is usually considered to be a binary category, i.e., every Russian verb form is either perfective or imperfective. Only imperfective verbs can combine with the auxiliary быть 'be' in the analytic future construction, e.g. (1) Когда-нибудь я буду писать как Набоков. Sometime I will write like Nabokov 'I will write like Nabokov one day.' The synthetic future tense is restricted to perfective verbs, e.g. (2) Я позвоню тебе. I callFUTURE you 'I'll call you.' There is the syntactic restriction that only imperfective verbs combine with phase verbs. In Russian, verbs like начинать / начать 'begin', продолжать / продолжить 'continue' or кончать / кончить 'stop/finish' can only take an imperfective infinitive as a complement. (Feldman & Arshavskaya 2007) Anna Feldman & Katya Arshavskaya (2007), English and Russian event annotation: A pilot study. Studies in Variation, Contacts and Change in English 1, http://www.helsinki.fi/varieng/journal/volumes/01/feldman_arshavskaya/ | |
SubClass Of | ||
Person | ||
SubClass Of | ||
Sub-Classes | ||
PersonalPronoun | ||
Abstract | Pronoun/Type="personal" and Pronoun/Referent_Type="personal" | |
SubClass Of | ||
PersonOfObject | ||
Abstract | Hungarian verbs ... [have] two conjugations: definite and indefinite. The indefinite conjugation is used: 1. With an intransitive verb 2. With an indefinite object including an indefinite pronoun object 3. With most question words as the object 4. With a relative pronoun as the object 5. With a 1st or 2nd person pronoun as the object, whether stated or unstated The definite conjugation is used: 1. With a definite object 2. With a following clause with hogy ("that") 3. With questions with melyik and h?nyadik ("which") as the object 4. With a 3rd person pronoun as the object, whether stated or unstated (http://en.wikipedia.org/wiki/Hungarian_grammar_(verbs)#Definite_and_indefinite_conjugations, 2010/06/18) The term `conjugation', while traditional, is confusing here: it normally refers to a paradigmatic class, not to part of a lexeme's paradigm. What Hungarian has in fact is limited marking of the person of the direct object (object agreement) in the verb, with the caveat that a 3rd person object is only marked if it is definite, a 2nd person object is only marked if the subject is 1st person singular, and a 1st person object is never marked. (Ivan A. Derzhanski, email 2010/06/18) |
|
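The object-agreement description quoted from Ivan A. Derzhanski above can be read as a small decision rule. The following is a minimal sketch under that reading only; the function name and parameters are hypothetical and do not correspond to any MTE v4 attribute or tool.

```python
def object_person_is_marked(
    subject_person: int,
    subject_number: str,
    object_person: int,
    object_definite: bool,
) -> bool:
    """Return True if the Hungarian verb marks the person of its direct object.

    Assumptions (taken from the description in the abstract):
    - a 3rd person object is marked only if it is definite;
    - a 2nd person object is marked only if the subject is 1st person singular;
    - a 1st person object is never marked.
    """
    if object_person == 3:
        return object_definite
    if object_person == 2:
        return subject_person == 1 and subject_number == "singular"
    return False


# A definite 3rd person object is marked on the verb.
print(object_person_is_marked(3, "singular", 3, True))
# A 2nd person object with a 3rd person subject is not.
print(object_person_is_marked(3, "singular", 2, True))
```

The sketch deliberately ignores intransitive verbs and clausal objects, which the Wikipedia-based list in the abstract treats separately.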
SubClass Of | ||
Sub-Classes | ||
PluperfectTense | ||
Abstract | Tense="pluperfect" | |
SubClass Of | ||
PluralNumber | ||
Abstract | e.g., -astea/?sta, -istele, -istelor, -i?tii, -i?tilor, -nsutitelor/?nsutit, -nsuti?ilor/?nsutit, -ntregi/?ntreg, -ntreitelor/?ntreit (ro) | |
SubClass Of | ||
PluralQuantifier | ||
Abstract | A PluralQuantifier is a Quantifier (or Numeral) that specifies a large multitude of entities. The agreement pattern of a plural quantifier is different from that of a singular quantifier, but as opposed to DualQuantifier and PaucalQuantifier, PluralQuantifier includes quantifiers that denote arbitrarily large sets of entities. (Chiarcos) The corresponding category in Czech, Polish and Slovak MTE v4 specs is Numeral/Class="definite", which refers to numerals larger than four. (MTE v4) | |
SubClass Of | ||
PortmanteauAdverb | ||
Abstract | For Romanian, Adverb/Type="portmanteau" was introduced to cover some few words which can be both adverbs and conjunctions (with adverbial reading more frequent). This was necessary for tagging purposes. (MTE v4) | |
SubClass Of | ||
PortmanteauConjunction | ||
Abstract | Conjunction/Type="portmanteau" (Romanian) Romanian: The "portmanteau" type of conjunction applies only to the word "și" which can be both a coordinating conjunction and an adverb. The distinction among these interpretations is rather tricky for the average native speaker and was a constant source of noise in automatic tagging. Therefore, for the sake of automatic processing we defined this "portmanteau" type value. (MTE v4) |
|
SubClass Of | ||
PositiveDegree | ||
Abstract | Degree="positive" designates PositiveDegree, but also the absence of morphological degree markers. In English, for example, many comparatives and superlatives are formed with more/most, so that "positive" in this context cannot be interpreted as "neither comparative nor superlative". (MTE v4) | |
SubClass Of | ||
PositiveSubordinatingConjunction | ||
Abstract | Conjunction/Sub_Type="positive" (Romanian, Serbian, Russian) In Romanian, each conjunction requires another mood, so that the diversity may be controlled by subcategorisation rules. The attribute Sub_Type distinguishes among the positive and negative conjunctions, providing means to control verbal double negation (as in the case of the negative pronouns, determiners and adverbs): nici NU am venit, nimeni NU vorbește, nici_un tren N-a trecut, nicăieri N-am văzut (MTE v4) | |
SubClass Of | ||
PossessiveAdjective | ||
Abstract | Adjective/Type="possessive" are denominal, not pronominal expressions of possession (Ivan A Derzhanski, email 2010/06/09). Therefore not to be confused with Pronoun/Type=adjectival(a) (Bulgarian only), for words like ???? /cleverly, wisely, sensibly/, which are derived from adjectives. (Dimitrova et al. 2009) | |
SubClass Of | ||
PossessiveArticle | ||
Abstract | In Romanian, the possessive article (also called genitival article) is an element in the structure of the possessive pronoun, of the ordinal numeral (e.g. al meu (mine) and al treilea (the third)), and of the indefinite genitive forms of the nouns (e.g. capitol al cărții (chapter of the book)). (MTE v4) | |
SubClass Of | ||
PossessiveDeterminer | ||
Abstract | Determiner/Type="possessive" (English, Romanian, Persian) | |
SubClass Of | ||
PossessivePronoun | ||
Abstract | Pronoun/Type="possessive" and Pronoun/Referent_Type="possessive", e.g., Macedonian moj, tvoj (eng. my, your). (MTE v4) | |
SubClass Of | ||
PostnominalModification | ||
Abstract | Determiner/Modific_Type="postnomin" (Romanian) The Modific_Type attribute is relevant for some Romanian pronouns and determiners. The prenominal determiner always precedes the noun (e.g. acest băiat - this boy), whereas the postnominal determiner appears only after the noun (e.g. băiatul acesta - this boy). (MTE v4) |
|
SubClass Of | ||
Postposition | ||
Abstract | Adposition/Type="postposition" English: Postpositions are rare in English. "possessive" 's and ' might be considered postpositions, especially if the alternative is to assign them to the unique membership class (where by definition they would be unrelated). (MTE v4) Farsi has several prepositions but there is only one postposition '?( '???r?). It is an overt marker for direct object. (Qasemizadeh and Saeed Rahimi 2006) |
|
SubClass Of | ||
PremodifyingOrdinalNumeral | ||
Abstract | Numeral/Type="ordinal2" (Persian) In Persian, a number can be inflected by two different suffixes to express ordinal meaning. These suffixes are "om" and "omin". They both have more or less the same meaning; however, they differ morphosyntactically. For example, the English phrase "first person" can be translated to Persian as follows: (1) nafar yekom (2) yekomin nafar "nafar" in Persian means person, "yek" in Persian means one. (1) yek + om = yekom (ordinal) (2) yek + omin = yekomin (ordinal2) Ordinal2 refers to the premodifying variant in (2), e.g., ?????/???? ???/?? ???/?? (MTE v4; Behrang Qasemizadeh, email 2010/06/26) |
|
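The two ordinal formations in the "yek" example above amount to simple suffixation on a transliterated stem. A minimal sketch, assuming plain concatenation as shown in that example; the function name is hypothetical, and irregular or phonologically adjusted forms are not handled.

```python
def persian_ordinal_forms(cardinal_stem: str) -> dict:
    """Build both ordinal variants from a transliterated cardinal stem.

    Assumption: the suffixes "om" and "omin" attach by plain concatenation,
    as in the yek -> yekom / yekomin example from the abstract.
    """
    return {
        "ordinal": cardinal_stem + "om",     # postmodifying: nafar yekom
        "ordinal2": cardinal_stem + "omin",  # premodifying: yekomin nafar
    }


forms = persian_ordinal_forms("yek")
print(forms["ordinal"], forms["ordinal2"])
```

The dictionary keys follow the MTE v4 values Numeral/Type="ordinal" and Numeral/Type="ordinal2" named in the abstract.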
SubClass Of | ||
PrenominalModification | ||
Abstract | Determiner/Modific_Type="prenomin" (Romanian) The Modific_Type attribute is relevant for some Romanian pronouns and determiners. The prenominal determiner always precedes the noun (e.g. acest băiat - this boy), whereas the postnominal determiner appears only after the noun (e.g. băiatul acesta - this boy). (MTE v4) |
|
SubClass Of | ||
Sub-Classes | ||
Preposition | ||
Abstract | Type="preposition" | |
SubClass Of | ||
PrepositionalCase | ||
SubClass Of | ||
PresentTense | ||
Abstract | Tense="present" (Ukrainian also for adjectives) | |
SubClass Of | ||
Program | ||
Abstract | In the Slovene MTE v4 specs, Residual/Type="program" marks where the tokenisation program made a mistake. (MTE v4) | |
SubClass Of | ||
ProgressiveAspect | ||
Abstract | Aspect="progressive" (Noun: Polish; Verb: Slavic and Persian; Adjective: Polish, Ukrainian) In Slavic aspectology, Russian aspect is usually considered to be a binary category, i.e., every Russian verb form is either perfective or imperfective. Only imperfective verbs can combine with the auxiliary быть 'be' in the analytic future construction, e.g. (1) Когда-нибудь я буду писать как Набоков. Sometime I will write like Nabokov 'I will write like Nabokov one day.' The synthetic future tense is restricted to perfective verbs, e.g. (2) Я позвоню тебе. I callFUTURE you 'I'll call you.' There is the syntactic restriction that only imperfective verbs combine with phase verbs. In Russian, verbs like начинать / начать 'begin', продолжать / продолжить 'continue' or кончать / кончить 'stop/finish' can only take an imperfective infinitive as a complement. (Feldman & Arshavskaya 2007) Anna Feldman & Katya Arshavskaya (2007), English and Russian event annotation: A pilot study. Studies in Variation, Contacts and Change in English 1, http://www.helsinki.fi/varieng/journal/volumes/01/feldman_arshavskaya/ | |
SubClass Of | ||
Pronominal | ||
Abstract | Abbreviation/Syntactic_Type="pronominal" (Romanian) | |
SubClass Of | ||
Pronoun | ||
Abstract | The English MTE v4 specs distinguish five major classes of Pronoun, i.e., PersonalPronoun, PossessivePronoun, DemonstrativePronoun, ReflexivePronoun and GeneralPronoun. "General" pronouns are those which are not personal, possessive, demonstrative or reflexive. The choice of these four categories is based on distributional facts, though at a rather high level of abstraction. They enter into anaphoric dependencies which are signalled morphosyntactically and are therefore (in principle) more amenable to automatic detection. Most general pronouns do not, although they too sometimes encode number information.
(MTE v4) Bulgarian marks definiteness for pronouns, but it is present only for the possessive and reflexive types of pronouns, and for some general pronouns. Examples include: Possessive: ??? ? ??? - ???? /my/, ???? - ???? ? ????? /your, 2 p. sing/, ????? ? ??????? ? ???????? /his/ Reflexive: ???? ? ???? ? ?????, ???? ? ??????, ????-??????, ???? - ?????? /his, her, its, their own/ (Dimitrova et al. 2009) |
|
SubClass Of | ||
Sub-Classes | ||
PronounForm | ||
Abstract | The feature "Pronoun_Form" distinguishes weak and strong pronouns in Romanian. For Romanian we need an attribute (called Pronoun_Form) to make the distinction between strong and weak forms of the same personal pronoun or reflexive pronoun. All the weak forms can be adjoined to the adjacent words either proclitically or enclitically. In such cases the junction is always graphically marked by a hyphen between the pronoun and the neighboring word. The hyphen also marks possible elisions from either the pronoun or the adjacent word. (MTE v4) |
|
SubClass Of | ||
Sub-Classes | ||
ProperNoun | ||
Abstract | Noun/Type=Proper | |
SubClass Of | ||
ProQuantifier | ||
Abstract | ProQuantifier represents Pronoun/Referent_type="quantitative" and Numeral/Class="demonstrative" (etc.) The MTE v4 categories Numeral/Class="interrogative", Numeral/Class="demonstrative", Numeral/Class="indefinite" and Numeral/Class="relative" are items meaning `how many/much', `this many/much', `several/some', `as many/much' etc. Strictly speaking, they are pronumerals (pro-quantifiers), but traditional descriptions don't recognise such a category, so they are described variously as pronouns (because they can be interrogative, demonstrative etc., as proforms other than personal or possessive ones can) or as numerals (because their syntactic distribution is that of numerals, or very close). (Ivan A. Derzhanski, email 2010/06/11) The difference between pronouns and these classes of numerals is, however, fuzzy and many are indeed classified as pronouns (Slovak MTE v.4) The Bulgarian equivalents of [Slovak] demonstrative, indefinite, interrogative and relative "numeral" are thus classified as pronouns with Referent_type="quantitative", e.g. ??????? ??????? /a few students/ ? indefinite pronoun + noun. or sometimes as adverbs. (Dimitrova et al. 2009) |
|
SubClass Of | ||
Sub-Classes | ||
QualificativeAdjective | ||
Abstract | In Czech and Slovak MTE v4, Adjective/Type=qualificative conflates the deverbative adjectival participles, i.e., past active participle (Czech only), passive participle and present active participle, e.g., Slovak "stojaci" (E. "standing") or "urobený" (E. "made" or "done"). For Romanian, one could make the distinction between qualificative and determinative adjectives, although this is not common practice in Romanian linguistics. (MTE v4) | |
SubClass Of | ||
Quantifier | ||
Abstract | A quantifier is a pronoun or a determiner that expresses a referent's definite or indefinite number or amount. (adapted from http://linguistics-ontology.org/gold/2008/Quantifier, quoting Crystal 1997, 317; extended in order to cover ProQuantifier, as suggested by Ivan A. Derzhanski). In MTE v4, Quantifier corresponds to the top-class Numeral that has been extended to cover non-numerical quantifiers. In Slavic languages, (Pro)Quantifiers and Numerals have a specific syntactic distribution where different quantities are correlated with certain agreement patterns. The Slovak MTE, for example, distinguishes the following agreement patterns: definite1 for 'one', definite2 for 'two', definite34 for 'three' or 'four', definite for 'five or more'. Definite1, definite2, definite34 and definite are separated according to the syntactic structures the numerals impose on the governed nouns: definite1 requires the corresponding noun to be in nominative singular, definite2 in nominative plural, definite34 in nominative plural, definite in genitive plural. (Dimitrova et al. 2009) These agreement patterns are specified in the subconcept NumeralAgreementClass. See the subconcepts Numeral and ProQuantifier for their respective semantic sub-classification. Overlap with Pronoun: Bulgarian equivalents of [Slovak] demonstrative, indefinite, interrogative numerals (ProQuantifiers) are classified as pronouns of a respective Type (including relative), e.g. няколко студенти /a few students/ (indefinite pronoun + noun), or sometimes as adverbs. (Dimitrova et al. 2009) |
|
SubClass Of | ||
Sub-Classes | ||
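The Slovak agreement patterns described for Quantifier can be illustrated with a small lookup. This is only a sketch under stated assumptions: the names `AGREEMENT` and `numeral_class` are hypothetical and not part of the ontology, and the function assumes that the class follows directly from the value of a simple cardinal, as in the description quoted from Dimitrova et al. (2009).

```python
# Illustrative only: AGREEMENT and numeral_class are hypothetical names,
# not identifiers from the ontology or from the MTE specifications.

# Case and number that each Slovak numeral class requires on the governed noun
# (as summarised from Dimitrova et al. 2009 in the Quantifier abstract).
AGREEMENT = {
    "definite1": ("nominative", "singular"),
    "definite2": ("nominative", "plural"),
    "definite34": ("nominative", "plural"),
    "definite": ("genitive", "plural"),
}


def numeral_class(n: int) -> str:
    """Return the agreement class for a simple positive cardinal n.

    Assumption: the class depends only on the value itself
    (1, 2, 3-4, 5 or more); compound numerals are not modelled.
    """
    if n < 1:
        raise ValueError("expected a positive cardinal")
    if n == 1:
        return "definite1"
    if n == 2:
        return "definite2"
    if n in (3, 4):
        return "definite34"
    return "definite"


if __name__ == "__main__":
    for n in (1, 2, 4, 5):
        cls = numeral_class(n)
        case, number = AGREEMENT[cls]
        print(n, cls, case, number)
```

In the ontology itself these classes are modelled as subconcepts of NumeralAgreementClass; the dictionary above merely restates the textual description in executable form.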
Question | ||
Abstract | WHType="question" (English) | |
SubClass Of | ||
Quotative | ||
Abstract | VForm="quotative" (Estonian) A quotative is a grammatical device to mark reported speech in some languages (http://en.wikipedia.org/wiki/Quotative), e.g., in Estonian. "Reportedly, while he was going (in his boat), he turned over." Ta olevat oma paadiga ümber läinud He was_QUOTATIVE his_own boat_WITH over gone. (Estonian translation of an example given under http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsAQuotativeEvidential.htm) (Heiki-Jaan Kaalep, email 2010/06/22) |
|
SubClass Of | ||
ReciprocalPronoun | ||
Abstract | Pronoun/Type="reciprocal" (Persian, Estonian, Hungarian) | |
SubClass Of | ||
ReductionFeature | ||
Abstract | For Polish, the Vocalicity of (a)gglutinated forms like -em vs -m is mapped on the MTE Definiteness attribute with its values "(f)ull-art" and "(s)hort-art" respectively, meaning "full form" and "short form". The terms are very artificial, but this category is used due to the similarity of the phenomenon. The IPIC flexeme winien and predicatives like rad are treated as short adjectives (Definiteness="short-art"). (MTE v4) Etymologically speaking, CliticDeterminerType and ReductionFeature are the same: the ending of the full form was originally a cliticised demonstrative pronoun (just like the article in Bg or Ro), and the semantic distinction was [+/- definite], but it has shifted to [attributive:predicative] or some such on some occasions. However, keeping them together wouldn't be correct: Bulgarian has preserved (to a limited extent) the old long form, and has a fourfold opposition of, say, _nov : novi : novija : novijat_, the first two members of which have counterparts in several Slavic languages (although the functions differ), while the second two are restricted to the Balkan sprachbund. I'd call them [-article short], [-article full], [+article short] and [+article full] respectively. (Ivan A. Derzhanski, emails 2010/06/18) [T]he suffixation of an actual pronoun to the adjective ... for the Balto-Slavic definite adjective inflection ... was used with definite nouns ... the Balto-Slavic forms show clear evidence of a well-attested IE pronoun suffixed to the adjective (McFadden 2004) ReductionFeature corresponds to AdjectiveFormation in Czech. Thomas McFadden (2004), On the pronominal origins of the Germanic strong adjective inflection, http://ifla.uni-stuttgart.de/institut/mitarbeiter/tom/downloads/gmcadj.pdf (to appear in Münchner Studien zur Sprachwissenschaft) |
|
SubClass Of | ||
ReflexivePronoun | ||
Abstract | In the Czech and Slovak MTE v4 specs, Pronoun/Type="reflexive" encompasses all reflexive pronouns (Slovak sa, sebe, si, svoj, seba, Czech se, sebe, si, svůj) as well as Slovak "sa" and Czech "se" in its role as the obligatory particle of reflexive verbs. Personal and possessive reflexives are further distinguished via the Referent_Type attribute. "sa" in all its roles will be marked as the reflexive personal clitic pronoun. (MTE v4) | |
SubClass Of | ||
RelationalAdjective | ||
Abstract | The Slovene adjective expresses three main ideas: quality (qualitative adjectives, kakovostni pridevniki), relation (relational adjectives, vrstni pridevniki) and possession (possessive adjectives, svojilni pridevniki). Relational adjectives express type, class or numerical sequence of a noun. For instance: kemijska in fizikalna sprememba (chemical and physical change), fotografski aparat (photographic device (=camera)). (http://en.wikipedia.org/wiki/Slovene_grammar) | |
SubClass Of | ||
Relative | ||
Abstract | WHType="relative" (English) | |
SubClass Of | ||
RelativeAdverb | ||
Abstract | e.g., 'Ow, 'ow, how, where, whereupon, when (en) | |
SubClass Of | ||
RelativeDeterminer | ||
Abstract | Determiner/Wh_Type="relative" (English MTE v4) | |
SubClass Of | ||
RelativePronoun | ||
Abstract | Pronoun/Wh_Type="relative" (English) or Pronoun/Type="relative" (other languages). | |
SubClass Of | ||
RelativeQuantifier | ||
SubClass Of | ||
RepetitiveCoordinatingConjunction | ||
Abstract | Conjunction/Coord_Type="repetit" (Romanian). In Romanian, there are three kinds of conjunctions depending on their usage: as such or together with other conjunctions or adverbs: (1) simple, between conjuncts: Ion ori Maria (John or Mary); (2) repetitive, before each conjunct: fie Ion fie Maria fie... (either John or Mary or...) (3) correlative, before a conjoined phrase, it requires specific coordinators between conjuncts: atât mama cât și tata (both mother and father). (MTE v4) | |
SubClass Of | ||
Residual | ||
Abstract | Residual refers to otherwise non-classified parts of speech. In Slovak, special 'adverb prepositions' (po, na, do), encountered in expressions like po anglicky, na zeleno, do modra are classified as residuals. Traditional Slovak grammars do not like to consider them separate words, but rather see them as a different part of speech, mostly an adverb (see interjections above), with a space inside. In corresponding Bulgarian expressions (e.g. по английски), the residual will be classified as Sp (preposition). This is however just a difference in grammar description, not an inherent difference in the languages. (Dimitrova et al. 2009) |
|
SubClass Of | ||
Sub-Classes | ||
RomanNumeral | ||
Abstract | Form="roman" | |
SubClass Of | ||
SecondPerson | ||
Abstract | e.g., budete/byť, buďte/byť, majte/mať, máte/mať, nepostačíte/nestačiť, nepostačíš/nestačiť, nepovlečiete/nevliecť, nepožalujete/nežalovať, nepoženiete/nehnať (sk) | |
SubClass Of | ||
SentenceCoordinatingConjunction | ||
Abstract | Conjunction/Coord_Type="sentence" (Serbian, Russian, Hungarian). | |
SubClass Of | ||
ShortArticle | ||
Abstract | Definiteness="short-art" (Noun: Bulgarian; Verb: Polish, Russian, Bulgarian; Adjective: Polish, Russian, Ukrainian, Bulgarian; Pronoun: Polish, Bulgarian) The Bulgarian singular masculine article has two forms: a full article(f)[l.s.] and a short article(s)[l.s.]. The full article is used when a singular masculine form is the syntactic subject of the clause, otherwise a short one is used; this is a purely orthographic rule. The distinction of full vs. short is not made for feminine, neuter and plural forms, and we use just the yes(y) or no(n) to mark definiteness or respectively lack thereof. Therefore, the definiteness attribute can take overall 4 different values: indefinite(n), definitive(y), short article(s), full article(f) e.g., мъж, мъжа, мъжът /a man, the man[short], the man [full]/ (Dimitrova et al. 2009) For Polish, the Vocalicity of (a)gglutinated forms like -em vs -m is mapped on the Definiteness attribute with its values "(f)ull-art" and "(s)hort-art" respectively, meaning "full form" and "short form". The terms are very artificial, but this category is used due to the similarity of the phenomenon. The IPIC flexeme winien and predicatives like rad are thus treated as short adjectives (Definiteness="short-art"). (MTE v.4) |
|
SubClass Of | ||
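The four Bulgarian Definiteness values described for ShortArticle can be written down as a small decision function. This is a minimal sketch: the function name and its parameters are hypothetical, and it assumes that gender, number and subjecthood are already known for the word form, which the MTE specifications do not provide by themselves.

```python
# Illustrative only: bulgarian_definiteness is a hypothetical helper,
# not part of the ontology or of the MTE v4 specifications.

def bulgarian_definiteness(definite: bool, gender: str, number: str,
                           is_subject: bool) -> str:
    """Return the one-letter Definiteness value: n, y, s or f.

    Rule as described by Dimitrova et al. (2009): only singular masculine
    forms distinguish a full article (syntactic subject) from a short
    article (any other position); all other definite forms are simply 'y'.
    """
    if not definite:
        return "n"  # indefinite
    if gender == "masculine" and number == "singular":
        return "f" if is_subject else "s"  # full vs. short article
    return "y"  # definite, no full/short distinction


if __name__ == "__main__":
    print(bulgarian_definiteness(True, "masculine", "singular", True))   # f
    print(bulgarian_definiteness(True, "masculine", "singular", False))  # s
    print(bulgarian_definiteness(True, "feminine", "singular", True))    # y
    print(bulgarian_definiteness(False, "masculine", "singular", True))  # n
```

The Polish use of the same attribute values (full vs. short agglutinated forms) is a different phenomenon and is not covered by this sketch.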
SimpleAdposition | ||
Abstract | Adposition/Formation="simple", i.e., non-compound adposition. | |
SubClass Of | ||
SimpleConjunction | ||
Abstract | Formation="simple" Romanian: As with prepositions, we can distinguish two kinds of conjunctions in Romanian: (1) simple conjunctions: e.g. și, dar, deși etc. (2) conjunctions formed periphrastically, with some word/phrase combined by a conjunction: din moment ce, fără să, față de cum etc. (MTE v4) |
|
SubClass Of | ||
SimpleCoordinatingConjunction | ||
Abstract | In the Romanian MTE v4 specs, Conjunction/Coord_Type="simple" is defined in contrast to repetitive and correlative coordinating conjunctions. In Romanian, there are three kinds of conjunctions depending on their usage: as such or together with other conjunctions or adverbs: (1) simple, between conjuncts: Ion ori Maria (John or Mary); (2) repetitive, before each conjunct: fie Ion fie Maria fie... (either John or Mary or...) (3) correlative, before a conjoined phrase, it requires specific coordinators between conjuncts: atât mama cât și tata (both mother and father). (MTE v4) | |
SubClass Of | ||
SimpleInterjection | ||
Abstract | Interjection/Formation="simple" | |
SubClass Of | ||
SimpleParticle | ||
Abstract | Particle/Formation="simple" | |
SubClass Of | ||
SingularNumber | ||
Abstract | e.g., chce?/cht?t, cht?j/cht?t, country, desater?, desater?m/desater?, des?t?/des?t?, devaten?ct?/devaten?ct?, devaten?ct?mu/devaten?ct?, devaten?ct?m/devaten?ct? (cs) | |
SubClass Of | ||
SingularQuantifier | ||
Abstract | A singular quantifier is a quantifier or a numeral that specifies a single referent from a set. (Chiarcos) In Czech and Slovak MTE v4 specs, the corresponding category Numeral/Class="definite1" is applied to the numeral "one". (MTE v4) | |
SubClass Of | ||
SociativeCase | ||
Abstract | Case="sociative" (Hungarian) | |
SubClass Of | ||
SpecialNumeral | ||
Abstract | Numeral/Type="special" | |
SubClass Of | ||
SpecifierAdverb | ||
Abstract | Adverb/Type="specifier" (English) | |
SubClass Of | ||
StrongPronoun | ||
Abstract | Pronoun_Form="strong" (Romanian) For Romanian we need an attribute (called Pronoun_Form) to make the distinction between strong and weak forms of the same personal pronoun or reflexive pronoun. All the weak forms can be adjoined to the adjacent words either proclitically or enclitically. In such cases the junction is always graphically marked by a hyphen between the pronoun and the neighboring word. The hyphen also marks possible elisions from either the pronoun or the adjacent word. (MTE v4) |
|
SubClass Of | ||
Subjunctive | ||
Abstract | In Resian, the subjunctive is formally identical to the imperative, with one form for the three persons in the singular and the forms for the 2nd and 3rd person plural being identical. (MTE v4) | |
SubClass Of | ||
SubjunctiveParticle | ||
Abstract | Particle/Type="subjunctive" (Romanian) A verbal particle with Particle/Type="subjunctive" modifies the verb and marks it as being subjunctive (Dan Tufis, email 2010/06/09) | |
SubClass Of | ||
SublativeCase | ||
Abstract | Case="sublative" (Hungarian) | |
SubClass Of | ||
SubordinatingConjunction | ||
Abstract | Conjunction/Type="subordinating", for Romanian, Serbian and Russian further sub-classified by the Sub_Type attribute. In Romanian, each conjunction requires another mood, so that the diversity may be controlled by subcategorisation rules. The attribute Sub_Type distinguishes among the positive and negative conjunctions, providing means to control verbal double negation (as in the case of the negative pronouns, determiners and adverbs): nici NU am venit, nimeni NU vorbește, nici_un tren N-a trecut, nicăieri N-am văzut (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
SuperessiveCase | ||
Abstract | Case="superessive" (Hungarian) | |
SubClass Of | ||
SuperlativeDegree | ||
Abstract | e.g., aktuaalseimatesse/aktuaalseim, kiireimale/kiireim, kõrgeimale/kõrgeim, parimail/parim, parimale/parim, parimalt/parim, parimana/parim, parimateks/parim, parimatele/parim (et) | |
SubClass Of | ||
Supine | ||
Abstract | VForm="supine" (Slovene, Estonian) In Romanian, Supine appears mostly with a preposition, except for a few intransitive verbs when they are subordinated to the impersonal verb a trebui (must). Only the preposition allows for differentiating a supine from a participle-masculine-singular. It was thus left out from Romanian MTE v4 specs. (MTE v4) |
|
SubClass Of | ||
SyntacticType | ||
Abstract | Syntactic_Type is used to distinguish the nominal and adjectival function of Pronouns in Croatian, Slovak, Resian and Czech. In Czech and Slovak, pronouns are distinguished between having a (syntactically) nominal and (syntactically) adjectival function, e.g. Slovak ktorý, môj. All pronominal types except the demonstrative and possessive one can be nominal, and all except for the personal one can be adjectival. Furthermore, in Slovene and Serbian, the adverbial function of certain Pronouns is distinguished. (MTE v4) This distinction is absent in the Bulgarian language (there are no adjectival pronouns of this type). Slovak also has several quasi-adjectival pronouns classified as Syntactic_Type="adjectival" (e.g. tvoj), equivalents of which do exist in Bulgarian as well, but due to the lack of a clear distinction of the adjectival paradigm it was not felt necessary to introduce this value in Bulgarian MTE. (Dimitrova et al. 2009) Also used in Abbreviations to signal the Part of Speech of the abbreviation in Romanian and Estonian. Although the values for this attribute could range over the part of speech categories in the language, in Romanian most of the abbreviations fall into the noun class. (MTE v4) |
|
SubClass Of | ||
Sub-Classes | ||
TemporalisCase | ||
Abstract | Case="temporalis" (Hungarian) | |
SubClass Of | ||
Tense | ||
SubClass Of | ||
Sub-Classes | ||
TerminativeCase | ||
Abstract | Case="terminative" (Estonian, Hungarian) | |
SubClass Of | ||
ThirdPerson | ||
Abstract | Although in traditional grammar books the demonstrative, indefinite and int_rel determiners are not characterised by person, in the Romanian MTE dictionaries they are recorded (for reasons beyond morpho-lexical encoding) as 3rd person (the same as nouns). However, for the automatic tagging this value has been marked as irrelevant. (MTE v4) | |
SubClass Of | ||
Transgressive | ||
Abstract | e.g., nezvestujúc/nezvestovať (sk) | |
SubClass Of | ||
Transitive | ||
Abstract | Transitive="yes" (Persian) | |
SubClass Of | ||
Transitivity | ||
Abstract | feature "Transitive" | |
SubClass Of | ||
Sub-Classes | ||
TranslativeCase | ||
Abstract | Case="translative" (Estonian) | |
SubClass Of | ||
Typo | ||
Abstract | In the Slovene MTE v4 specs, Residual/Type="typo" marks a mis-typed word. (MTE v4) | |
SubClass Of | ||
UniquitiveDeterminer | ||
Abstract | Determiner/Type="exceptional" is applied to the Persian uniquitive determiner ???? i.e., "the only" (MTE v4; Hamidreza Kobdani, email 2010/06/15) | |
SubClass Of | ||
Verb | ||
Abstract | e.g., dostali/dostat (cs) | |
SubClass Of | ||
Sub-Classes | ||
Verbal | ||
Abstract | Abbreviation/Syntactic_Type="verbal" | |
SubClass Of | ||
VerbalAdverb | ||
Abstract | Adverb/Type="verbal" applies to adverbs derived from verbs (verbal adverbs) in the Serbian, Macedonian and Hungarian MTE v4 specs. Macedonian verbal adverbs (gerunds) like odejkji are thus not considered as verbal forms, but as Adverb/Type="verbal". (MTE v4) | |
SubClass Of | ||
VerbalParticle | ||
Abstract | A verbal particle modifies the verb and carries information on the verb form (e.g., finiteness, tense and aspect). (Dimitrova et al. 2009, Dan Tufis, email 2010/06/09). In the Bulgarian MTE specs, Particle/Type=verbal(v) is used to form different types of verbal syntactic relationships, e.g. to create future tense (?? ???????), or particles like ??, ??. (Dimitrova et al. 2009) The Romanian MTE v4 specs provide a more fine-grained subclassification of (verbal) particles. (MTE v4) | |
SubClass Of | ||
Sub-Classes | ||
VerbForm | ||
Abstract | Feature "VForm" of verbs | |
SubClass Of | ||
Sub-Classes | ||
VocativeCase | ||
Abstract | Macedonian: Two vocative forms exist with same MSD, e.g. narode / narodu (narod) are both Ncmsvn. Slovak: Slovak distinguishes 7 cases, the locative case being obligatorily prepositional. Vocative is identical with nominative, with the exception of several nouns and (substandard usage of) some proper names. Here, vocative is marked according to its syntactic role. 'ty' (E. 'you') is usually vocative. Many other pronouns can be marked as vocative because of their syntactic position, e.g. in 'môj bože' (E. 'my god'), 'môj' is vocative. |
|
SubClass Of | ||
Voice | ||
SubClass Of | ||
Sub-Classes | ||
WeakPronoun | ||
Abstract | Pronoun_Form="weak" (Romanian) For Romanian we need an attribute (called Pronoun_Form) to make the distinction between strong and weak forms of the same personal pronoun or reflexive pronoun. All the weak forms can be adjoined to the adjacent words either proclitically or enclitically. In such cases the junction is always graphically marked by a hyphen between the pronoun and the neighboring word. The hyphen also marks possible elisions from either the pronoun or the adjacent word. (MTE v4) |
|
SubClass Of | ||
WHType | ||
SubClass Of | ||
Sub-Classes | ||
WithCliticS | ||
Abstract | Clitic_s="yes" (Czech) In Czech the 2nd Person singular present Tense of the auxiliary Verb "být" (i.e. the form "jsi") can be cliticised as -s on certain non-finite verb forms and pronouns. There is no intermediate hyphen between the verbal form and the 's' morpheme. Its presence is indicated by the positive value of the binary feature Clitic_s of the parts of speech Verb and Pronoun. The feature Clitic_s='yes' thus denotes Czech pronouns having the clitic morpheme 's' appended as a suffix. (Derzhanski and Kotsyba 2009; MTE v4) | |
SubClass Of | ||
WithCourtesy | ||
Abstract | Courtesy="yes" (Slovene/Resian, Persian) | |
SubClass Of | ||
WithoutCliticS | ||
Abstract | feature Clitic_s: the 'yes' value of the Clitic_s attribute denotes Czech pronouns having the clitic morpheme 's' appended as a suffix.
Czech: The 'yes' value of the Clitic_s attribute denotes a verbal form having the clitic morpheme 's' appended as a suffix. This 's' morpheme expresses 2nd Person singular present Tense of the auxiliary Verb "být" (i.e. the form "jsi"). There is no intermediate hyphen between the verbal form and the 's' morpheme.
The Clitic_s attribute is specified for VForm=infinitive (VForm=n) and VForm=p(articiple) only.
(MTE v.4) In Czech the 2nd person singular present tense form of the copula jsi can be cliticised as -s on certain non-finite verb forms and pronouns, and its presence is indicated by the positive value of the binary feature Clitic_s of the parts of speech Verb and Pronoun. Essentially the same phenomenon exists in Polish, but it involves four cliticised forms of the copula (1sg -m, 1pl -śmy, 2sg -ś, 2pl -ście), and they float more freely (the host can be any content word, e.g. świniaś 'thou art a pig', dobryś 'thou art good'). (Derzhanski and Kotsyba 2009) Therefore modeled here as subClass of Clitic. Clitic_s="no" (Czech) |
|
SubClass Of | ||
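The copula clitics mentioned for Czech (2sg -s) and Polish (1sg -m, 1pl -śmy, 2sg -ś, 2pl -ście) can be summarised in a lookup table. The sketch below is illustrative only: `COPULA_CLITICS` and `attach_clitic` are hypothetical names, and the function performs plain concatenation without a hyphen, ignoring any morphophonological adjustments or restrictions on possible hosts.

```python
# Illustrative only: COPULA_CLITICS and attach_clitic are hypothetical names,
# not part of the ontology or of the MTE v4 specifications.

# Cliticised copula forms keyed by language, then by (person, number).
COPULA_CLITICS = {
    "czech": {(2, "sg"): "s"},
    "polish": {
        (1, "sg"): "m",
        (1, "pl"): "śmy",
        (2, "sg"): "ś",
        (2, "pl"): "ście",
    },
}


def attach_clitic(host: str, language: str, person: int, number: str) -> str:
    """Append the copula clitic to the host word, with no intermediate hyphen.

    Assumption: simple concatenation is enough for illustration; no check is
    made that the host is of a category that actually allows the clitic.
    """
    try:
        clitic = COPULA_CLITICS[language][(person, number)]
    except KeyError:
        raise ValueError(
            f"no copula clitic for {language} {person}{number}"
        ) from None
    return host + clitic


if __name__ == "__main__":
    # Polish examples from Derzhanski and Kotsyba (2009).
    print(attach_clitic("dobry", "polish", 2, "sg"))   # dobryś
    print(attach_clitic("świnia", "polish", 2, "sg"))  # świniaś
```

A word form built this way would carry Clitic_s="yes" in the Czech specifications, while forms without the clitic carry Clitic_s="no".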
WithoutCourtesy | ||
Abstract | Courtesy="no" (Resian, Persian) | |
SubClass Of | ||
WordsCoordinatingConjunction | ||
Abstract | Conjunction/Coord_Type="words" (Serbian, Russian, Hungarian) | |
SubClass Of |
Object Properties
hasAdjectiveFormation | ||
---|---|---|
Range | ||
Domain | ||
hasAdpositionFormation | ||
Range | ||
Domain | ||
hasAnimacy | ||
Range | ||
Domain | ||
hasAspect | ||
Abstract | also applicable to Polish nouns, and Polish and Ukrainian adjectives | |
Range | ||
Domain | ||
hasCase | ||
Sub-Properties | ||
Range | ||
Domain | ||
hasClitic | ||
Range | ||
Domain | ||
hasConjunctionFormation | ||
Range | ||
Domain | ||
hasCourtesy | ||
Range | ||
Domain | ||
hasDefiniteness | ||
Range | ||
Domain | ||
hasDegree | ||
Range | ||
Domain | ||
hasFeature | ||
Sub-Properties |
|
|
Range | ||
Domain | ||
hasFormation | ||
Abstract | Formation: refers to the graphical components: simple, i.e. consisting of one word; compound, i.e. consisting of more than one word. | |
Sub-Properties | ||
Range | ||
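The simple/compound distinction described for hasFormation is purely graphical, so it can be approximated by counting whitespace-separated words. This is a minimal sketch; the function name `formation` is hypothetical, and treating whitespace as the only word boundary is an assumption.

```python
# Illustrative only: formation is a hypothetical helper, not part of the ontology.

def formation(expression: str) -> str:
    """Classify an expression as 'simple' (one word) or 'compound' (several words).

    Assumption: words are separated by whitespace; hyphens and other
    punctuation are not treated as word boundaries.
    """
    words = expression.split()
    if not words:
        raise ValueError("expected a non-empty expression")
    return "simple" if len(words) == 1 else "compound"


if __name__ == "__main__":
    print(formation("dar"))            # simple
    print(formation("din moment ce"))  # compound
```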
hasGender | ||
Range | ||
Domain | ||
hasHumanness | ||
Range | ||
Domain | ||
hasInterjectionFormation | ||
Range | ||
Domain | ||
hasModificationType | ||
Range | ||
Domain | ||
hasNegation | ||
Range | ||
Domain | ||
hasNumber | ||
Range | ||
Domain | ||
hasNumeralForm | ||
Range | ||
Domain | ||
hasOwnedNumber | ||
Abstract | feature "Owned_Number" (Pronoun and PronominalAdjectives; Hungarian Noun, Numeral) Owned_Number: in the Hungarian system, different word-forms are distinguished for nominals on the basis of so called 'anaphoric possessive' number, i.e. the number of the thing(s) possessed by the nominal in question. | |
Range | ||
Domain | ||
hasOwnerGender | ||
Abstract | Pronoun/Owner_Gender Owner_Gender: used to encode the Gender of the possessor in Pronouns and (in Romanian) Determiners. | |
Range | ||
Domain | ||
hasOwnerNumber | ||
Abstract | feature "Owner_Number" (Pronoun, for Hungarian also Noun and Adjective) Owner_Number: used to specify the possessor number in Pronouns, as well as (in Romanian) in Determiners, and (in Hungarian) in Adjectives and Nouns. | |
Range | ||
Domain | ||
hasOwnerPerson | ||
Abstract | feature "Owner_Person" (Pronoun (and pronominal Adjectives), Noun: Hungarian) Owner_Person: used to specify the possessor person in Hungarian in Adjectives and Nouns. | |
Range | ||
Domain | ||
hasParticleFormation | ||
Range | ||
Domain | ||
hasPerson | ||
Range | ||
Domain | ||
hasPronounForm | ||
Abstract | Pronoun/Pronoun_Form | |
Range | ||
Domain | ||
hasQuantifier | ||
Range | ||
Domain | ||
hasSubCase | ||
Abstract | A SubCase refers to non-standard cases, i.e., a grammatical differentiation that occurs in only a few inflection paradigms and that is regularly expressed by a single case. Some Russian nouns take the non-standard ending "-у, -ю" in the genitive to express partitive meaning ("????? ???????? ???") or in the prepositional (locative) to express locative meaning ("?? ?????"). (MTE 4; Serge Sharoff) | |
Range | ||
Domain | ||
hasSyntacticType | ||
Abstract | Pronoun/Syntactic_Type and Abbreviation/Syntactic_Type | |
Range | ||
Domain | ||
hasTense | ||
Abstract | also applied to Ukrainian Adjective | |
Range | ||
Domain | ||
hasTransitivity | ||
Range | ||
Domain | ||
hasVerbForm | ||
Range | ||
Domain | ||
hasVoice | ||
Abstract | applied to Polish and Hungarian adjective | |
Range | ||
Domain | ||
hasWHType | ||
Range | ||
Domain |