Swedish in a Multilingual Setting (SMS)

The CLARIN Knowledge Centre for Swedish in a Multilingual Setting (CLARIN-SMS) offers expertise in linguistic processing of text, especially for Swedish and/or when multiple languages are involved. In addition, CLARIN-SMS offers expertise in the application of language technology to Swedish Sign Language.

CLARIN-SMS is primarily directed at researchers in the humanities and social sciences with a need for analysis, annotation or mining of Swedish or multilingual text, and additionally at researchers with a need for corpora or tools for Swedish Sign Language.

CLARIN-SMS makes corpora and linguistic processing tools available to researchers in the humanities and social sciences. The corpora are monolingual (mainly Swedish) and multilingual and span several domains. The tools cover basic text processing, including tokenization, morphological analysis, part-of-speech tagging, syntactic parsing, and named entity recognition. CLARIN-SMS offers special expertise in the following areas:

– Processing of parallel and comparable corpora, including alignment and machine translation
– Cross-linguistically consistent annotation within the framework of Universal Dependencies
– Computation and evaluation of measures of text complexity
– Language technology for Swedish Sign Language

CLARIN-SMS is a distributed Knowledge Centre which includes the following partners:

– Linköping University, Department of Computer and Information Science. Contact: Lars Ahrenberg, lars.ahrenberg@liu.se
– Stockholm University, Department of Linguistics. Contact: Mats Wirén, mats.wiren@ling.su.se
– Uppsala University, Department of Linguistics and Philology. Contacts: Joakim Nivre, joakim.nivre@lingfil.uu.se, Eva Pettersson, eva.pettersson@lingfil.uu.se

Help Desk Contact 

Mats Wirén, mats.wiren@ling.su.se

Publications

Lars Ahrenberg (2015). Converting an English–Swedish Parallel Treebank to Universal Dependencies. Proc. Third International Conference on Dependency Linguistics (DepLing 2015), Association for Computational Linguistics, pages 10–19. ACL Anthology W15-2103.

Marco Kuhlmann and Stephan Oepen (2016). Towards a Catalogue of Linguistic Graph Banks. Computational Linguistics, 42, 4, 819–827. ISSN 0891-2017, E-ISSN 1530-9312.

Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič, Christopher D. Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty, and Daniel Zeman (2016). Universal Dependencies v1: A Multilingual Treebank Collection. Proc. Tenth International Conference on Language Resources and Evaluation (LREC 2016).

Robert Östling (2018). Part of Speech Tagging: Shallow or Deep Learning? Northern European Journal of Language Technology, Volume 5, Article 1.

Robert Östling, Carl Börstell, Moa Gärdenfors and Mats Wirén (2017). Universal Dependencies for Swedish Sign Language. Proc. 21st Nordic Conference on Computational Linguistics, pages 303–308. Linköping.

Aaron Smith, Bernd Bohnet, Miryam de Lhoneux, Joakim Nivre, and Sara Stymne (2018). 82 Treebanks, 34 Models: Universal Dependency Parsing with Multi-Treebank Models. Proc. CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 113–123.