
Table 1 Word error rate and time-needed-to-correct of Vink-generated transcripts

From: From voice to ink (Vink): development and assessment of an automated, free-of-charge transcription tool

| Language | Audio length (mm:ss) | Number of speakers | Sex | Background noise¹ | Time-needed-to-correct (minutes) | Total words | WER | Substitutions | Insertions | Deletions |
|---|---|---|---|---|---|---|---|---|---|---|
| American English | 06:50 | 2 | F, M | Medium | 17 | 854 | 6.6% | 7 | 50 | 0 |
| Arabic (Classical Arabic) | 03:06 | 1 | F | Low | 27.5 | 363 | 15.2% | 7 | 20 | 28 |
| Bahasa Indonesia | 05:12 | 2 | F, F | Medium | 10 | 465 | 7.95% | 10 | 22 | 5 |
| Burmese | 05:05 | 3 | M, M, F | High | Transcript is nonsensical | | | | | |
| Chinese | 05:01 | 1 | F | Low | 12 | 950 | 0.95% | 8 | 1 | 0 |
| Filipino | 5:00 | 2 | F, GNB² | Medium | 19 | 1343 | 7.80% | 56 | 5 | 45 |
| French | 04:09 | 2 | F, M | Medium | 19:57 | 611 | 24% | 15 | 12 | 122 |
| German | 05:00 | 2 | F, F | Low | 9:40 | 676 | 4.28% | 9 | 2 | 18 |
| Malagasy | 04:41 | 2 | F, M | Medium | 62 | 351 | 41% | 134 | 12 | 5 |
| Portuguese Brazilian | 02:19 | 2 | F, M | Medium | 4 | 209 | 1.4% | 2 | 1 | 0 |
| Spanish Colombian | 06:31 | 2 | F, F | Low | 36:46 | 1111 | 14.5% | 34 | 21 | 107 |
| Tamil | 04:32 | 1 | M | Low | 72 | 221 | 79.8% | 45 | 103 | 54 |
| Turkish | 03:19 | 1 | F | Low | 8 | 232 | 4.3% | 3 | 1 | 6 |
| Yoruba | 5:56 | 2 | F, M | Medium | 20 | 528 | 46% | 164 | 36 | 45 |

¹ Background noise levels were classified ‘low’ in case of close to no background noise, ‘medium’ in case of occasional or faint background noises, and ‘high’ if background noises notably impaired understandability of speakers.
² GNB: gender non-binary
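
For readers who want to relate the error counts in Table 1 to the reported WER, the conventional definition is WER = (substitutions + insertions + deletions) / N, where N is the number of words in the reference transcript. The sketch below is illustrative only: it assumes that the "Total words" column is N, which the table does not state, and the function name `word_error_rate` is hypothetical. The exact computation and rounding used in the paper may differ.

```python
def word_error_rate(substitutions: int, insertions: int,
                    deletions: int, reference_words: int) -> float:
    """Return WER as a fraction, using the conventional (S + I + D) / N."""
    if reference_words <= 0:
        raise ValueError("reference_words must be positive")
    return (substitutions + insertions + deletions) / reference_words


# Counts taken from the Chinese and Turkish rows of Table 1.
# Assumption: "Total words" is the reference word count N.
chinese = word_error_rate(substitutions=8, insertions=1, deletions=0,
                          reference_words=950)
turkish = word_error_rate(substitutions=3, insertions=1, deletions=6,
                          reference_words=232)

print(f"Chinese: {chinese:.2%}")  # 0.95%
print(f"Turkish: {turkish:.1%}")  # 4.3%
```

Under this assumption, both rows reproduce the WER values listed in the table.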