 Workshop on NLP and Pseudonymisation

NoDaLiDa 2019, Turku, Finland

September 30, 2019



The Workshop on NLP and Pseudonymisation, to be held at the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa 2019), invites submissions of papers. The workshop will take place on September 30, 2019, on the campus of the University of Turku in Finland.



The goal of making research data freely available and the goal of protecting personal data constitute a classic conflict of interests. This is particularly the case for language data under the newly adopted General Data Protection Regulation (GDPR). While research, as a public interest, may process personal data, the GDPR requires appropriate safeguards to be in place. Consent is one such safeguard, but it cannot always be obtained or be general enough. In such cases anonymisation or pseudonymisation can be applied, with the intended effect that real individuals can no longer be identified from the language data.

Many disciplines in the humanities and social sciences, in medicine, and not least in NLP use language data in their research. This includes data as varied as emails, student essays, personal stories, text parts of medical records, court proceedings and decisions, interviews, and chat data. NLP has a double role here: as a user of data and as a technology to support pseudonymisation.

The aim of the workshop is to bring together researchers working on NLP and pseudonymisation (broadly interpreted) and get a comprehensive view of problem areas, approaches and methods from current research. We invite papers on completed work as well as work in progress. Workshop contributions may include, but are not limited to, the following topics:

  •  + applications of NLP to support pseudonymisation of language data (text, speech or sign language)
  •  + methods to recognize personal data in text, speech or sign language
  •  + reports on projects using or planning to use pseudonymisation for releasing language data for research
  •  + evaluation of pseudonymisation support systems
  •  + requirements on and evaluation of NLP tools for the purposes of pseudonymisation
  •  + realistic data generation for pseudonymisation
  •  + system evaluation from legal and/or ethical perspectives



  • + Sunday, June 30, 2019: Extended deadline for submission of papers (originally Thursday, June 20, 2019)
  • + Friday, August 16, 2019: Notification of Acceptance
  • + Friday, August 30, 2019: Camera-Ready Manuscripts
  • + Monday, September 30, 2019: Workshop



We are pleased to welcome our invited speaker Martin Krallinger, a leading researcher in Life Sciences and Text Mining at the Barcelona Supercomputing Center and former Head of the Biological Text Mining Unit at the Spanish National Cancer Research Centre (CNIO).
The abstract of the talk is here.



Allison Adams, Eric Aili, Daniel Aioanei, Rebecca Jonsson, Lina Mickelsson, Dagmar Mikmekova, Fred Roberts, Javier Fernandez Valencia and Rocher Wechsler: AnonyMate: A Toolkit for Anonymizing Unstructured Chat Data. 

Hanna Berg and Hercules Dalianis: Augmenting a De-identification System for Swedish Clinical Text Using Open Resources and Deep Learning.

Hercules Dalianis: Pseudonymisation of Swedish Electronic Patient Records Using a Rule-Based Approach. 



14:00-14:15 Introduction (Lars Ahrenberg and Beáta Megyesi)

14:15-15:05 Invited talk by Martin Krallinger 

15:05-15:30 Coffee Break

15:30-15:55 Allison Adams, Eric Aili, Daniel Aioanei, Rebecca Jonsson, Lina Mickelsson, Dagmar Mikmekova, Fred Roberts, Javier Fernandez Valencia and Rocher Wechsler: AnonyMate: A Toolkit for Anonymizing Unstructured Chat Data

15:55-16:20 Hercules Dalianis: Pseudonymisation of Swedish Electronic Patient Records Using a Rule-Based Approach

16:20-16:45 Hanna Berg and Hercules Dalianis: Augmenting a De-identification System for Swedish Clinical Text Using Open Resources and Deep Learning

16:45-16:50 Closing (Lars Ahrenberg and Beáta Megyesi)



The workshop proceedings are published in the NEALT Proceedings Series, No. 41, in parallel by: 

Linköping Electronic Press:

ACL Anthology:



The call for papers is here.



    We invite paper submissions in two distinct tracks:

    • + regular papers of 8-10 pages (excluding references) on substantial, original, and unpublished research, including evaluation results where appropriate;
    • + short papers of 4-6 pages (excluding references) on smaller, focused contributions, work in progress, negative results, surveys, opinion pieces, or system demonstrations.

    Accepted papers will be presented as talks, posters, or demos; the format will be decided by the program committee.

    Papers accepted for presentation at the conference will be included in the NLP for Pseudonymization 2019 proceedings, which are published as part of the Northern European Association for Language Technology (NEALT) Proceedings Series by Linköping University Electronic Press (ECP), as freely available Gold Open Access.



    Reviewing of submissions and selection of the conference program will be managed by the Program Committee. All submissions will receive at least two double-blind reviews by experts in the field.

    • + Lars Ahrenberg (program co-chair), Linköping University, Sweden
    • + Beáta Megyesi (program co-chair), Uppsala University, Sweden
    • + Hercules Dalianis, Stockholm University, Sweden
    • + Koenraad de Smedt, University of Bergen, Norway
    • + Cyril Grouin, LIMSI, CNRS, Université Paris-Saclay, France
    • + Dimitrios Kokkinakis, University of Gothenburg, Sweden
    • + Krister Lindén, University of Helsinki, Finland
    • + Aurélie Névéol, LIMSI, CNRS, Université Paris-Saclay, France
    • + Sumithra Velupillai, King's College London, UK
    • + Sussi Olsen, CST, University of Copenhagen, Denmark
    • + Elena Volodina, University of Gothenburg, Sweden
    • + Mats Wirén, Stockholm University, Sweden