PREFIX Ontology Name # Classes # Properties # Individuals # Imports Modified
ancorra ancorra 43 0 29 1
  Annotation Model for AnnCorra : Guidelines For POS And Chunk Annotation For Indian Languages as described by Bharati et al. (2006). Unless marked otherwise, all comments here are quotes from this document. Bharati et al. (2006) claim to provide a tagset applicable to all Indian languages. They explicitly mention Hindi, Bangla, Marathi, Telugu, and Tamil, and the tagset can be assumed to be applicable to at least these languages. The document is, however, mostly working with examples from Hindi and Bangla. Akshar Bharati, Dipti Misra Sharma, Lakshmi Bai, Rajeev Sangal (2006), AnnCorra : Annotating Corpora. Guidelines For POS And Chunk Annotation For Indian Languages, Tech. Rep., Language Technologies Research Centre IIIT, Hyderabad, version of 15-12-2006, http://ltrc.iiit.ac.in/tr031/posguidelines.pdf
ancorra-link ancorra-link 46 0 0 2
  2011/08/15 provisional semiautomated linking for morphosyntax, manually revised (no properties linked, yet) Christian Chiarcos, chiarcos@uni-potsdam.de
arabic_khoja arabic_khoja 62 6 69 1
  OLiA Annotation Model for morphosyntactic annotation of Arabic, following Khoja et al. (2001) Arabic grammar has been studied for centuries, and the principles of describing the language already exist. Since so much knowledge is readily available, it is logical to derive our tagset from this wealth of information. The alternative to this is to base the Arabic tagset on an Indo-European one, but by doing this we may lose a lot of the information that an Arabic tagset would give us. Also, by moulding Arabic to fit an Indo-European language, we might distort the way Arabic is perceived by its native speakers. (Khoja et al, 2001) The prototype tagger reported in (Khoja 2003) was based on a lexicon of under 10,000 word-types, extracted from a corpus of about 50,000 word-tokens. The initial 50,000-word training corpus was extracted from the Saudi Al-Jazirah newspaper (date 03/03/1999); initial tagging experiments were done on other newspaper texts, and a social science paper. (Atwell 2007) Unless specified otherwise, all comments are quotes from Khoja et al. (2001). References: Khoja, S, Garside, R, and Knowles, G (2001) A tagset for the morphosyntactic tagging of Arabic. Paper given at the Corpus Linguistics 2001 conference, Lancaster, http://zeus.cs.pacificu.edu/shereen/CL2001.pdf Eric Atwell (2007), Development of tag sets for part-of-speech tagging, Corpus Linguistics Conference 2007, Birmingham, http://www.comp.leeds.ac.uk/eric/atwell07clih.pdf
basque basque 66 0 61 1
  OLiA Annotation Model for the morphosyntactic annotation of Basque as produced by Morfeus/EusTagger, a morphological analyzer and a lemmatizer/tagger for Basque (Ezeiza et al. 1998, details: http://ixa.si.ehu.es/Ixa/Produktuak/1274695175, http://ixa.si.ehu.es/Ixa/Produktuak/1273217967; demo: http://ixa2.si.ehu.es/demo/analisianali.jsp). Probably the same categories were used for the annotation of the Basque AnCora corpus (http://ixa.si.ehu.es/Ixa/Produktuak/1274695486), a 155.000 word sub-corpus of the EPEC (Reference Corpus for the Processing of Basque). Unless stated otherwise, the information provided in the ontology is taken from http://ixa2.si.ehu.es/edblkontsulta/labur-eus.htm. For translation and interpretation of the tagset documentation, we consulted Sagüés (2011), http://en.wikipedia.org/wiki/Basque_language and Google Translate. References: Ezeiza N., Aduriz I., Alegria I., Arriola J.M., Urizar R. 1998. Combining Stochastic and Rule-Based Methods for Disambiguation in Agglutinative Languages COLING-ACL'98. Pgs. 380 - 384. Vol 1. Montreal (Canada). August 10-14, 1998. Miguel Sagüés (2011), Gramática Elemental Vasca, Txertoa, Donostia (San Sebastian), Spain
brown brown 70 2 0 1
  Annotation Model for morphosyntactic (pos) annotations of the Brown Corpus 2008/05/23 created 2010/02/16 updated Christian Chiarcos, chiarcos@uni-potsdam.de
brown-link brown-link 0 0 0 2
  2010/02/18 redesigned Christian Chiarcos, chiarcos@uni-potsdam.de
connexor connexor 53 13 139 1
  annotation model for connexor machinese syntax
2008/03/27 created (for the English scheme under http://193.185.105.50/demo/machinese/doc/enfdg3-tags.html, pos tags only) 2010/01/08 linked with system.owl 2010/01/25 added morphological features: subclassification of verbs, aspect, gender, revised participle, etc.; distinction between hasTag, hasTagContaining etc. 2010/11/25 added syntax tags and missing tags from http://193.185.105.50/demo/machinese/doc/MS-all-tags.html Christian Chiarcos, chiarcos@uni-potsdam.de
connexor_link connexor_link 4 0 0 2
  2010/11/26 created Christian Chiarcos, chiarcos@uni-potsdam.de
dzongkha dzongkha 58 0 66 1
  2010-09-23 created
2010-10-12 NominalNoun and QuantifierNoun updated according to Jurmey's email
2011-08-16 linking with system.owl Christian Chiarcos, chiarcos@uni-potsdam.de
dzongkha-link dzongkha-link 57 0 0 2
  2011/08/16 provisional semiautomated linking for morphosyntax, manually revised Christian Chiarcos, chiarcos@uni-potsdam.de
eagles eagles 163 16 206 1
  OLiA annotation model for the EAGLES recommendations for the annotation of morphosyntax. Unless specified otherwise, all comments are quotes from Leech & Wilson (1996) Originally applied to the following languages: Catalan, Danish, Dutch, English, French, German, Greek, Irish, Portuguese, Spanish, Swedish (http://www.ilc.cnr.it/EAGLES96/morphsyn/node4.html#SECTION00022000000000000000) G. Leech & A. Wilson (1996), EAGLES Recommendations for the Morphosyntactic Annotation of Corpora (EAG--TCWG--MAC/R, Version of Mar, 1996), http://www.ilc.cnr.it/EAGLES/annotate/annotate.html
eagles-link eagles-link 166 16 0 2
  2011/08/10 provisional semiautomated linking 2011/08/11 manual revision and extension Christian Chiarcos, chiarcos@uni-potsdam.de
emille emille 143 11 153 1
  OLiA Annotation Model for the morphosyntactic annotation of the Urdu section of the EMILLE corpus (Hardie 2003, 2004). Unless marked otherwise, all comments are quotes from Hardie (2004), Chapter 3. The tagset discussed here was created in accordance with the EAGLES guidelines for morphosyntactic annotation of corpora. Although these guidelines were written to cover the languages of the European Union, they can be applied fairly easily to Urdu, which, coming as it does from another branch of the Indo-European family, is structurally quite similar. They can also be extended to deal with the idiosyncrasies presented by Urdu grammar. (Hardie 2003) The first stage of the work was to develop a tagset for use in Urdu texts and corpora, an area which has not been researched extensively heretofore. The next stage, now underway, is to test the tagset’s usability in manual tagging, and build up a set of tagged texts to serve as training data for the final phase of this part of the project. This will be to automate the tagging and subsequently tag the whole of the EMILLE Urdu corpus. (Hardie 2003) References Hardie, A (2003) Developing a tagset for automated part-of-speech tagging in Urdu. In: Corpus Linguistics 2003, 2003-03-01, Lancaster. http://eprints.lancs.ac.uk/103/ Hardie, Andrew (2004) The computational analysis of morphosyntactic categories in Urdu. Other thesis, Lancaster University. http://eprints.lancs.ac.uk/106/ Ruth Laila Schmidt (1999) Urdu, an essential grammar, Routledge, London.
emille-link emille-link 147 18 0 2
  2011/08/15 provisional semiautomated linking for morphosyntax 2011/08/16 manually revised Christian Chiarcos, chiarcos@uni-potsdam.de
french french 60 2 186 1
  OLiA Annotation Model for the morphosyntactic annotations of the French Le Monde corpus (Abeillé et al. 2000). Unless specified otherwise, all comments (mostly examples) are quotes from Abeillé and Clément (2003). References Abeille A., Clement L., Kinyon A. (2000) “Building a treebank for French”, In Proc. LREC 2000 Abeillé, A. and Clément, L. (2003), "Annotation Morpho-syntaxique. Les mots simples - Les mots composés. Corpus Le Monde", ms., version of 10 janvier 2003, http://www.llf.cnrs.fr/Gens/Abeille/guide-morpho-synt.02.pdf
french-tt french-tt 35 0 32 1
  OLiA annotation model for French part-of-speech tags as produced by the TreeTagger. Tagset courtesy of Achim Stein. Unless specified otherwise, all comments are literal quotes from Stein (2003). References: Achim Stein (2003), French TreeTagger Part-of-Speech Tags, version of April 2003, http://www.ims.uni-stuttgart.de/~schmid/french-tagset.html
genia genia 40 2 0 0
  Annotation scheme of the GENIA corpus (Kim et al. 2003).
Kim, J.D. and Ohta, T. and Tateisi, Y. and Tsujii, J. (2003), GENIA corpus-a semantically annotated corpus for bio-textmining, Bioinformatics 19(1):180-182
genia-link genia-link 39 0 0 2
  Provisional GENIA-OLiA-Linking
iiit iiit 29 0 29 1
  OLiA Annotation Model for a Part of Speech Tagger for Indian Languages (IIIT 2007). Languages mentioned in the document include Hindi, Marathi, and Telugu. To a certain extent, IIIT (2007) seems to be a revision of http://ltrc.iiit.ac.in/tr031/posguidelines.pdf that was developed at the same institute. Unless marked otherwise, all comments are quotes from IIIT (2007). IIIT (2007), A Part of Speech Tagger for Indian Languages (POS tagger), Tagset developed at IIIT - Hyderabad after consultations with several institutions through two workshops. available under http://shiva.iiit.ac.in/SPSAL2007/iiit_tagset_guidelines.pdf
iiit-link iiit-link 25 0 0 2
  2011/08/15 provisional semiautomated linking for morphosyntax, manually revised Christian Chiarcos, chiarcos@uni-potsdam.de
ilposts ilposts 116 18 0 1
  Annotation Model for the IL-POSTS tagset, a pan-Indian annotation scheme (Baskaran et al. 2008), primarily applied to Bangla, Hindi, Kannada, Malayalam, Marathi, Sanskrit, Tamil and Telugu. Unless marked otherwise, all comments refer to Baskaran et al. (2008). "There are four main language families found in India, viz., Austro-Asiatic, Dravidian, Indo-Aryan and Tibeto-Burman, of which Dravidian and Indo-Aryan (IA) form the largest group of languages spoken in the sub-continent. This framework concentrates on Dravidian and IA language families for two main reasons: (i) practical issues of manageability, (ii) the fact that of the 22 official languages in India a large majority belonged to these two language families. However, the detailed linguistic analysis and discussions that led to the design of this framework leads us to believe that it is broad enough to cover Indian Languages from the other language families as well." Sankaran Baskaran, Kalika Bali, Tanmoy Bhattacharya, Pushpak Bhattacharyya, Monojit Choudhury, Girish Nath Jha, Rajendran S., Saravanan K., Sobha L., and KVS Subbarao (2008), A Common Parts-of-Speech Tagset Framework for Indian Languages, In Proceedings of LREC 2008, p. 1331-1337, http://www.lrec-conf.org/proceedings/lrec2008/pdf/337_paper.pdf
ilposts-link ilposts-link 116 0 0 2
  2011/08/12 provisional semiautomated linking for morphosyntax, manual revision and extension (no properties linked, yet) 2011/08/15 OWL/DL validation (FACT++) Christian Chiarcos, chiarcos@uni-potsdam.de
morphisto morphisto 43 15 0 1
  Tags used for Morphisto morphological analyses, extracted from the analysis of the Potsdam Commentary Corpus. 10/02/01 created, Christian Chiarcos, chiarcos@uni-potsdam.de
morphisto-link morphisto-link 72 7 0 2
  2010/02/02 provisional morphisto-OLiA linking Christian Chiarcos, chiarcos@uni-potsdam.de
msd-bg-link msd-bg-link 0 0 0 4
msd-cs-link msd-cs-link 0 0 0 4
msd-en-link msd-en-link 0 0 0 4
msd-et-link msd-et-link 0 0 0 4
msd-fa-link msd-fa-link 0 0 0 4
msd-hu-link msd-hu-link 0 0 0 4
msd-mk-link msd-mk-link 0 0 0 4
msd-pl-link msd-pl-link 0 0 0 4
msd-ro-link msd-ro-link 0 0 0 4
msd-ru-link msd-ru-link 0 0 0 4
msd-sk-link msd-sk-link 0 0 0 4
msd-sl-link msd-sl-link 0 0 0 4
msd-sl-rozaj-link msd-sl-rozaj-link 0 0 0 4
msd-sr-link msd-sr-link 0 0 0 4
msd-uk-link msd-uk-link 0 0 0 4
multext-east multext-east 302 34 0 0
  OWL/DL Ontology for MULTEXT-East morphosyntactic specifications -- OLiA annotation model for the morphosyntactic specifications of MULTEXT-East v. 4. (Erjavec 2010)
http://nl.ijs.si/ME/owl/
Christian Chiarcos, 2010-2011

Licence:
The ontologies are distributed under the Creative Commons Attribution 3.0 Unported (CC BY 3.0) licence. You are free to copy, distribute and transmit the work, to adapt the work and to make commercial use of the work under the condition that you make a reference to:
Christian Chiarcos and Tomaz Erjavec (2011), OWL/DL formalization of the MULTEXT-East morphosyntactic specifications. In: Proceedings of the 5th Linguistic Annotation Workshop (LAW-V), held in conjunction with the ACL-HLT 2011, June 2011, Portland, Oregon, USA, p. 11--20.
Please note that these ontologies are still under development, and that more detailed and precise definitions will be added incrementally.

Sources:
Unless marked otherwise, all comments refer to Erjavec (2010). Additionally, Qasemizadeh & Rahimi (2006), Dimitrova et al. (2009) and Derzhanski & Kotsyba (2009) were consulted for clarification. Email communication with Tomaž Erjavec, Serge Sharoff, Dan Tufis, Ivan A. Derzhanski, Natalia Kotsyba, Csaba Oravecz and Hamidreza Kobdani represents the third source of information consulted for this ontology.
References:
Ivan Derzhanski, Natalia Kotsyba (2009), Towards a Consistent Morphological Tagset for Slavic Languages: Extending MULTEXT-East for Polish, Ukrainian and Belarusian, In: Proc. MONDILEX Third Open Workshop Bratislava, Slovakia, 15–16 April, 2009, p. 9-26
Ludmila Dimitrova, Radovan Garabík, Daniela Majchráková (2009), Comparing Bulgarian and Slovak Multext-East morphology tagset, In: Proceedings of MONDILEX Second Open Workshop, Kyiv, Ukraine, 2–4 February, 2009, p. 38-46
Tomaž Erjavec (ed., 2010), MULTEXT-East Morphosyntactic Specifications Version 4. 2010-05-12, http://nl.ijs.si/ME/V4/msd/html/index.html
Behrang Qasemizadeh and Saeed Rahimi (2006), Persian in MULTEXT-East Framework, in T. Salakoski et al. (eds.): FinTAL 2006, LNAI 4139, pp. 541 – 551, 2006.
olia olia 858 50 0 2
  OLiA Reference Model for Morphology, Morphosyntax and Syntax (originally based on the EAGLES recommendations, with modifications in accordance to DCR (ISOcat, June 2013), TDS ontology, GOLD v.03, the SFB 632 annotation guidelines, the MULTEXT-East ontology and various annotation schemes)
olia-top olia-top 62 0 0 1
  Top categories of the OLiA Reference Model
2010/01/19 created
2010/04/08 removed NPFunction (=> SyntacticRole)
2010/04/13 added MorphologicalProcess, MorphologicalFeature, DiscourseFeature, AnimacyFeature, ReferentTypeFeature, RegisterFeature, UsageAndFrequencyFeature
2010/04/14 validation, PossessiveFeature removed (see olia:hasOwnerNumber), moved olia:NarrativeType and olia:PolarityFeature here
2010/04/15 additions in accordance to the PTB Bracketing Guidelines: NullElement, SentenceTypeFeature (Santorini 1991, Bies et al. 1995)
2010/11/30 added TopologicalField in accordance to the TueBa-D/Z annotation guidelines (Telljohann et al. 2009)
2011/07/29 replace url by purl
2011/07/31 added ProximityFeature
2011/08/03 added SpecificityFeature
2011/08/04 SubordTypeFeature, CoordTypeFeature deprecated, added NumeralAgreementClass
2011/08/11 StrengthFeature recast as MorphologicalFeature rather than MorphosyntacticFeature
2011/08/15 EmphasisFeature added
2011/08/15 PhonologicalProcess added (for Elision and Apocope, formerly both classified as MorphologicalProcess)
2013/06/25 EvidentialityFeature, ClusivityFeature added (from ISOcat), intensity as new label to EmphasisFeature LexicalRelation for labels for relations holding between lexemes
2013/06/27 AgreementFeature (from ISOcat, as superclass of NominalAgreementClass, Person, Gender, Number; not as a relation between words)
2013/06/28 EvaluativeFeature (for ISOcat PreferredEvaluative and PejorativeEvaluative), ModalityFeature (Modality and Mood distinction revised)
2016/04/18 fixed minor validity warnings
Christian Chiarcos, chiarcos@uni-potsdam.de
olia_system olia_system 5 9 0 0
  OLiA core concepts for linguistic annotations.
p1 p1 1 0 0 3
  Represents the linking between the treetagger owls (currently treetagger-german.owl only) and the OLiA reference model. POS-tag linking is imported.
parole_es_cat parole_es_cat 94 9 115 1
  OLiA Annotation Model for Spanish morphosyntax as used by the FreeLing Tagger (http://nlp.lsi.upc.edu/freeling/) following the PAROLE/EAGLES 2.0 guidelines Unless marked otherwise, all comments are quotes from http://nlp.lsi.upc.edu/freeling/doc/userman/parole-es.html
pctb pctb 40 0 32 1
  OLiA Annotation Model for the morphosyntactic annotations of the Penn Chinese Treebank (PCTB) "This document is designed for the Penn Chinese Treebank Project [XPX+00]. The goal of the project is the creation of a 100-thousand word corpus of Mandarin Chinese text with syntactic bracketing. The annotation consists of two stages: the first phrase is word segmentation and part-of-speech (POS) tagging and the second phrase is syntactic bracketing. ... We have chosen syntactic distribution as the main criterion for our POS tagging because it complies with the principles adopted in contemporary linguistics theories, such as the notion of head projections in the X-bar theory and the GB theory." (Xia 2000, p.4f) Unless specified otherwise, all comments are quotes from Xia (2000) Fei Xia (2000), The Part-Of-Speech Tagging Guidelines for the Penn Chinese Treebank (3.0), version of October 17, 2000, http://www.cis.upenn.edu/~chinese/posguide.3rd.ch.pdf
penn penn 55 2 51 0
  OLiA Annotation Model for Penn Treebank (PTB) part-of-speech annotation (Santorini 1990) Unless specified otherwise, all comments are taken from Santorini (1990). References Beatrice Santorini (1990), Part-of-Speech tagging guidelines for the Penn Treebank Project, 3rd revision, 2nd printing, ftp://ftp.cis.upenn.edu/pub/treebank/doc/tagguide.ps.gz
penn-link penn-link 7 0 0 2
  provisional Penn-OLiA-Linking
penn-syntax penn-syntax 91 5 0 0
  Penn Syntax annotations according to Santorini (1991) and Bies et al. (1995), the edge/node classification follows TIGERSearch conventions. Beatrice Santorini (1991), Bracketing Guidelines for the Penn Treebank Project *** DRAFT VERSION ****, May 15, 1991 (ftp://ftp.cis.upenn.edu/pub/treebank/doc/old-bktguide.ps.gz) Ann Bies, Mark Ferguson, Karen Katz, and Robert MacIntyre (1995), Bracketing Guidelines for Treebank II Style Penn Treebank Project, January 1995 (ftp://ftp.cis.upenn.edu/pub/treebank/doc/manual/root.ps.gz)
penn-syntax-link penn-syntax-link 145 0 0 2
  Provisional Penn-OLiA-Linking for syntax (concepts only, no properties) Ch. Chiarcos, 10/04/18, chiarcos@uni-potsdam.de
qtag-link qtag-link 45 0 0 2
  Provisional linking between the English QTag tagset and the OLiA Reference Model
russ russ 83 23 0 1
  Ontology for the morphosyntactic annotations of the Uppsala corpus, a corpus of 20th century Russian prose 2008/05/23 created Christian Chiarcos, Angelika Adam 2010/02/16 updated Christian Chiarcos, chiarcos@uni-potsdam.de
russ-link russ-link 92 8 0 2
russleeds russleeds 22 2 0 0
  Annotation scheme for part of speech annotation used by Serge Sharoff's TreeTagger module, cf. Sharoff et al. (2008).
Sharoff, S. and Kopotev, M. and Erjavec, T. and Feldman, A. and Divjak, D. (2008), Designing and evaluating Russian tagsets. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, May 2008
russleeds-link russleeds-link 22 0 0 2
  Provisional linking between the Russian TreeTagger module (developed by Serge Sharoff at the University of Leeds) and the OLiA Reference Model
sfb632 sfb632 140 18 0 0
  OLiA Annotation Model for the SFB632 Annotation Guidelines (Dipper et al. 2007) for Morphology and Syntax
Stefanie Dipper, Michael Götze and Stavros Skopeteas (2007), Information Structure in Cross-Linguistic Corpora: Annotation Guidelines for Phonology, Morphology, Syntax, Semantics, and Information Structure. In: Interdisciplinary Studies on Information Structure (ISIS) Working papers of the SFB 632; vol. 7. Universität Potsdam
sfb632-link sfb632-link 68 11 0 2
  Provisional linking between SFB632 Annotation Model and OLiA Reference Model (morphology only)
stanford stanford 63 0 64 1
  OLiA Annotation Model of Stanford Parser dependency labels (de Marneffe and Manning 2011) Unless specified otherwise, all comments are taken from de Marneffe and Manning (2011) References: Marie-Catherine de Marneffe and Christopher D. Manning (2011), Stanford typed dependencies manual, September 2008, revised for Stanford Parser v. 1.6.9 in September 2011, http://nlp.stanford.edu/software/dependencies_manual.pdf
stanford-link stanford-link 53 0 0 2
  2010/02/18 provisional linking Christian Chiarcos, chiarcos@uni-potsdam.de
stts stts 76 2 0 0
  Annotation Model for the Stuttgart-Tübingen Tagset (STTS, Schiller et al. 1999) of part of speech annotation. 2006 created 2006-2008 maintained by Angelika Adam 2010/01/04 system.owl integration 2010/12/07 removed cardinality restriction of hasTag Christian Chiarcos, chiarcos@uni-potsdam.de
stts-link stts-link 4 0 0 2
  Provisional linking between OLiA reference model and the STTS annotation model 2005 created, Christian Chiarcos and Angelika Adam 10/01/04 updated 10/03/17 modal verbs linked Christian Chiarcos, chiarcos@uni-potsdam.de
susa susa 123 12 0 0
  Annotation Model of the morphosyntactic component of the SUSANNE scheme as applied to the British English SUSANNE corpus (Sampson, 1995), also covering the simplified SUSANNE tag set as used by the TnT Tagger (Brants 2000).
Brants, T. (2000), TnT--a statistical part-of-speech tagger, In Proc. ANLP 2000
Sampson, G. (1995), English for the computer: The SUSANNE corpus and analytic scheme, Oxford University Press
susa-link susa-link 74 9 0 2
tcodex tcodex 127 2 107 1
  OLiA Annotation Model for the Tatian Corpus of Deviating Examples (T-CODEX, Petrova et al. 2009) and other resources for Old High German (OHG) assembled by project B4 of the Collaborative Research Center (SFB) 632 "Information Structure" (2003-2015), Universität Potsdam, HU Berlin. The present corpus, the Tatian Corpus of Deviating Examples T-CODEX 2.0, provides morpho-syntactic and information structural annotation of parts of the Old High German translation attested in the MS St. Gallen Cod. 56, traditionally called the OHG Tatian, one of the largest prose texts from the classical OHG period. This corpus was designed and annotated by Project B4 of Collaborative Research Center on Information Structure at Humboldt University Berlin. (Petrova and Odebrecht 2011) Unless marked otherwise, all comments are authored by Svetlana Petrova from the 2008 edition of this ontology. Additional resources consulted for the development of this ontology include Petrova and Odebrecht (2011). References: Svetlana Petrova and Carolin Odebrecht (2011), Tatian Corpus of Deviating Examples. T-CODEX 2.1 Corpus Description (version of 21-Mar-2011). Technical Report, HU Berlin, https://korpling.german.hu-berlin.de/~annis/T-CODEX/corpus_description_tatian2.1.pdf Petrova, Svetlana, Solf, Michael, Ritz, Julia, Chiarcos, Christian, Zeldes, Amir (2009) Building and using a richly annotated interlinear diachronic corpus: the case of Old High German Tatian. Traitement Automatique des Langues 50 (2): 47-71.
tcodex-link tcodex-link 119 0 0 2
  2011/08/11 provisional semiautomated linking for morphosyntax, manual revision and extension Christian Chiarcos, chiarcos@uni-potsdam.de
tibet-link tibet-link 29 0 0 2
  Speculative (!) linking between OLiA Reference Model and the Tibetan Annotation Model
tibet_old tibet_old 41 2 0 0
  Experimental draft for an OLiA Annotation Model for the Tibetan corpus described by Wagner and Zeisler (2004)
Wagner, A. and Zeisler, B. (2004), A syntactically annotated corpus of Tibetan. In Proc. Fourth International Conference on Language Resources and Evaluation (LREC 2004), Lisboa, Portugal, May 2004
tiger tiger 91 13 0 2
  TIGER morphosyntax (parts of speech = modified STTS) and morphology July 2007 created by Christian Chiarcos and Angelika Adam 2010/01/13 system.owl references updated, Christian Chiarcos, chiarcos@uni-potsdam.de
tiger-syntax tiger-syntax 11 4 0 1
  prototype ontology for syntactic annotation. Important note: Here, an ontology of labels used in syntactic annotation is provided, but not an ontology of _structures_. Especially, the ontology itself does not perform a conversion between dependency trees and constituency analyses, though its representations may be used by corresponding converters.
treetagger-german treetagger-german 0 0 0 2
  Annotation Model for German TreeTagger tagset (STTS + Chunk labels) 10/01/25 created by Christian Chiarcos, chiarcos@uni-potsdam.de
tueba tueba 37 12 88 1
  Annotation Model of morphosyntactic, morphological and syntactic annotations of the TüBa-D/Z corpus (v5) following Telljohann et al. (2009), unless marked otherwise Heike Telljohann, Erhard W. Hinrichs, Sandra Kübler, Heike Zinsmeister, Kathrin Beck (2009), Stylebook for the Tübingen Treebank of Written German (TüBa-D/Z), Tech. rep. Universität Tübingen, Seminar für Sprachwissenschaft, version of November 2009 (TüBa-D/Z v5)
tueba-link tueba-link 69 6 0 4
  Represents the linking between the TueBa-D/Z syntax and morphology annotations and the OLiA reference model. Note that TueBa-D/Z uses the STTS tag set for part of speech annotation, and hence, the treatment of POS tags is imported from stts-link.rdf
turkish turkish 101 0 100 1
  OLiA Annotation Model for Turkish morphosyntax Unless specified otherwise, all comments are quotes from Oflazer et al. (2003) Kemal Oflazer, Bilge Say, Dilek Zeynep Hakkani-Tür, Gökhan Tür (2003), Building a Turkish treebank, Treebanks: Building and Using Parsed Corpora (20): 261--277, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.9.9280&rep=rep1&type=pdf
ubyCat ubyCat 53 62 134 2
  OLiA Annotation Model for Uby Parts of Speech (Gurevych et al, 2012) extracted from the Uby DTD (http://purl.org/olia/ubyCat.owl, version of Nov 21st, 2012). References Iryna Gurevych, Judith Eckle-Kohler, Silvana Hartmann, Michael Matuschek, Christian M. Meyer and Christian Wirth, 2012, Uby - A Large-Scale Unified Lexical-Semantic Resource, Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon, France. The DTD is made available under a Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license which is available at http://creativecommons.org/licenses/by-sa/3.0/ You are free to share (copy, distribute and transmit) the work, to develop your own extensions (adapt, remix) of the work, and to make commercial use of the work.
ubyCat-link ubyCat-link 66 0 72 4
  Uby POS Linking Model
ubyPos ubyPos 64 10 24 1
  OLiA Annotation Model for Uby Parts of Speech (Gurevych et al, 2012) extracted from the Uby DTD (http://code.google.com/p/uby/source/browse/de.tudarmstadt.ukp.uby/tags/de.tudarmstadt.ukp.uby-0.2.0/de.tudarmstadt.ukp.uby.lmf.model-asl/src/main/resources/dtd/UBY_LMF.dtd, version of Nov 21st, 2012). References Iryna Gurevych, Judith Eckle-Kohler, Silvana Hartmann, Michael Matuschek, Christian M. Meyer and Christian Wirth, 2012, Uby - A Large-Scale Unified Lexical-Semantic Resource, Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon, France. The DTD is made available under a Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license which is available at http://creativecommons.org/licenses/by-sa/3.0/ You are free to share (copy, distribute and transmit) the work, to develop your own extensions (adapt, remix) of the work, and to make commercial use of the work.
ubyPos-link ubyPos-link 66 0 24 4
  Uby POS Linking Model
urdu urdu 50 0 42 1
  OLiA annotation model for morphosyntactic and morphological annotations of Urdu following Sajjad (2007). Unless marked otherwise, all comments are quoted from this document. Hassan Sajjad (2007), Urdu Part of Speech Tagset, version 1.0.0.0, 07-12-2007, Center for research in Urdu Language Processing. National University of Computer and Emerging Sciences, Lahore, Pakistan, http://www.crulp.org/Downloads/langproc/UrduPOStagger/UrduPOStagset.pdf
urdu-link urdu-link 49 0 0 2
  2011/08/15 provisional semiautomated linking for morphosyntax, manually revised Christian Chiarcos, chiarcos@uni-potsdam.de
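
The statistics rows in this listing follow a regular pattern: prefix, ontology name, then four integer counts (classes, properties, individuals, imports). As a minimal sketch, assuming that this layout holds for every row and that no description or changelog line has the same shape, the rows could be separated from the free text as follows. The names `OntologyRow` and `parse_row` are hypothetical and not part of OLiA.

```python
import re
from typing import NamedTuple, Optional


class OntologyRow(NamedTuple):
    """One statistics row of the listing (hypothetical helper type)."""
    prefix: str
    name: str
    classes: int
    properties: int
    individuals: int
    imports: int


# Assumed row shape: prefix, name, then exactly four non-negative integers.
ROW_PATTERN = re.compile(r"^(\S+) (\S+) (\d+) (\d+) (\d+) (\d+)$")


def parse_row(line: str) -> Optional[OntologyRow]:
    """Return the parsed row, or None for description/changelog lines."""
    match = ROW_PATTERN.match(line.rstrip())
    if match is None:
        return None
    prefix, name, *counts = match.groups()
    return OntologyRow(prefix, name, *(int(c) for c in counts))


# Sample lines copied from the listing above.
sample = [
    "olia olia 858 50 0 2",
    "ubyPos-link ubyPos-link 66 0 24 4",
    "  Uby POS Linking Model",
]
rows = [row for row in map(parse_row, sample) if row is not None]
for row in rows:
    print(row.prefix, row.classes, row.imports)
```

Indented description lines and dated changelog lines do not match the pattern and are skipped, so only the two statistics rows in the sample are returned.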