Lux-ASR: Automatic Speech Recognition for Luxembourgish

We are switching to a new interface and a new address. Try it out here: https://luxasr.uni.lu

Use Lux-ASR to transcribe your Luxembourgish recordings into text. Either upload an audio or video file (currently supported formats: wav, mp3, m4a or mp4) or use the microphone to record your speech. Hit ‘Transcribe!’ and the transcription will appear after a short wait; the estimated transcription time is displayed as a timer. Longer recordings with durations of up to several hours can be uploaded (depending on the file size). Lux-ASR is fast: it can transcribe up to 170 words per second. You can also try the examples.

With this interface, we are giving access to our best-performing tool for automatic speech recognition of Luxembourgish (speech-to-text). It has been trained on 150+ hours of carefully controlled pairs of audio and transcription snippets and achieves a word error rate below 10%, i.e. fewer than 10 errors per 100 running words (punctuation and case included 😛). We provide this tool to facilitate the transcription of Luxembourgish audio recordings into written text for research purposes, but also for general public use. The resulting text follows the current spelling rules of 2019.
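As an illustration of the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the recognized text into the reference text, divided by the number of reference words. The sketch below is a minimal version of that definition; the function name, the whitespace tokenization and the example sentence are assumptions made for illustration and not the evaluation code used for Lux-ASR.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Tokens are split on whitespace only (an assumption), so punctuation
    and letter case count as part of each word.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a missing comma and a lower-cased word are two errors.
wer = word_error_rate("Moien, wéi geet et Iech haut?",
                      "Moien wéi geet et iech haut?")
print(f"{wer:.1%}")
```

Because punctuation and case are included, a transcript can be fully intelligible and still accumulate errors under this measure.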

Available options

Several audio input languages are available (default: Luxembourgish). If the recording contains more than one speaker, setting diarization to ‘On’ will separate the text of each speaker in the recording and add time codes for their turns. Note that diarization adds some extra time to the recognition process. Four output formats are available: plain text (txt), SubRip Subtitles (srt), JSON (with or without time codes for words) and Praat TextGrid. These files can be downloaded through the link below the transcription. Recognition takes up to 5% of the audio file’s duration. Once the recognition process has started, an estimated time and a timer are displayed to keep track of the progress.
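For readers who want to process a downloaded srt file further, the following sketch reads the general SubRip layout (a counter line, a time-code line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, then the text, with blank lines between entries). The names `Cue` and `parse_srt` and the sample text are hypothetical; how speaker labels are written into the text when diarization is on is not described here, so the text is kept unchanged.

```python
import re
from typing import NamedTuple


class Cue(NamedTuple):
    index: int
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    text: str


_TIME_CODE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    match = _TIME_CODE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT time code: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> list[Cue]:
    """Split SubRip text into cues; entries are separated by blank lines."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # skip fragments without a time-code line
        start, end = lines[1].split("-->")
        cues.append(Cue(int(lines[0].strip()),
                        _to_seconds(start),
                        _to_seconds(end),
                        "\n".join(lines[2:])))
    return cues


# Made-up sample in the general SubRip layout, not actual Lux-ASR output.
sample = """1
00:00:00,000 --> 00:00:02,500
Moien alleguer.

2
00:00:02,500 --> 00:00:05,120
Wéi geet et?
"""
for cue in parse_srt(sample):
    print(cue.index, cue.start, cue.end, cue.text)
```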

As an experimental feature, translation of the recognized text into other languages has been added; it outputs the text in English, German, Portuguese, Spanish or French. Note that these translations take more time to run (around 1/3 of the audio’s duration). The quality of these translations may vary.
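To get a feeling for these figures, the sketch below turns them into a rough waiting-time estimate: up to 5% of the audio duration for recognition, plus around one third of the audio duration when translation is switched on. The function name is hypothetical, treating the two durations as additive is an assumption, and the unspecified extra time for diarization is left out.

```python
def estimate_processing_seconds(audio_seconds: float,
                                translate: bool = False) -> float:
    """Rough estimate of the waiting time, based on the figures quoted above."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must not be negative")
    recognition = 0.05 * audio_seconds                     # "up to 5%": upper bound
    translation = audio_seconds / 3 if translate else 0.0  # "around 1/3"
    # Assumption: translation time is added on top of recognition time.
    return recognition + translation


one_hour = 60 * 60
print(estimate_processing_seconds(one_hour) / 60)                  # minutes, recognition only
print(estimate_processing_seconds(one_hour, translate=True) / 60)  # minutes, with translation
```

For a one-hour recording this gives at most about 3 minutes for recognition alone and roughly 23 minutes with translation.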

The maximum upload size is 500 MB. The preferred file format for audio files is ‘wav’ with a sampling frequency of 16,000 Hz.
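A file can be checked against these two recommendations before uploading. The sketch below uses only the Python standard library; the function name is hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and the `wave` module reads uncompressed PCM wav files only.

```python
import os
import wave

MAX_UPLOAD_BYTES = 500 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
PREFERRED_RATE_HZ = 16_000


def check_wav_for_upload(path: str) -> list[str]:
    """Return a list of issues found; an empty list means none were found."""
    issues = []

    size = os.path.getsize(path)
    if size > MAX_UPLOAD_BYTES:
        issues.append(f"file is {size} bytes, above the 500 MB upload limit")

    # wave.open raises wave.Error for files that are not PCM wav.
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
    if rate != PREFERRED_RATE_HZ:
        issues.append(f"sampling frequency is {rate} Hz, preferred is 16000 Hz")

    return issues
```

A sampling frequency other than 16,000 Hz is reported only as a deviation from the preferred format, since the text above states a preference rather than a requirement.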

Disclaimer

Note that the transcription and the translation are run on a dedicated server at the University of Luxembourg. All data thus stays within Luxembourg and the University’s network. Nobody has access to the uploaded audio or the text output. The audio data is streamed to this server and no files are stored on this server or in the network. No data is used to further train the model and no data is transferred to third parties.

Contact

For more information about the setup and functioning of Lux-ASR, see here. Lux-ASR is under constant development by Peter Gilles, Léopold Hillah, and Nina Hosseini-Kivanani at the University of Luxembourg and is supported by the Chambre des Députés du Grand-Duché de Luxembourg. Contact us for more information and API access.
