
SwissDial: Parallel Multidialectal Corpus of Spoken Swiss German


Date

2021-03-21

Publication Type

Working Paper

ETH Bibliography

yes

Abstract

Swiss German is a dialect continuum whose natively acquired dialects differ significantly from the formal variety of the language. These dialects are mostly used for spoken communication and do not have a standard orthography. This has led to a lack of annotated datasets, rendering the use of many NLP methods infeasible. In this paper, we introduce the first annotated parallel corpus of spoken Swiss German across 8 major dialects, plus a Standard German reference. Our goal has been to create and make available a basic dataset for data-driven NLP applications in Swiss German. We present our data collection procedure in detail and validate the quality of our corpus by conducting experiments with recent neural models for speech synthesis.

Publication status

published

Pages / Article No.

2103.11401

Publisher

Cornell University

Subject

Computation and Language

Organisational unit

03420 - Gross, Markus (emeritus)
02154 - Media Technology Center (MTC)
