Commit ba3af7b by asahi417 (1 parent: 4d213e6)

Update README.md

Files changed (1)
  1. README.md +19 -49
README.md CHANGED
@@ -6,9 +6,6 @@ tags:
  - audio
  - automatic-speech-recognition
  - hf-asr-leaderboard
- metrics:
- - wer
- - cer
  widget:
  - example_title: CommonVoice 8.0 (Test Split)
  src: >-
@@ -20,45 +17,6 @@ widget:
  src: >-
  https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test/resolve/main/sample.flac
  pipeline_tag: automatic-speech-recognition
- model-index:
- - name: kotoba-tech/kotoba-whisper-v1.1
- results:
- - task:
- type: automatic-speech-recognition
- dataset:
- name: CommonVoice_8.0 (Japanese)
- type: japanese-asr/ja_asr.common_voice_8_0
- metrics:
- - type: WER
- value: 59.27
- name: WER
- - type: CER
- value: 9.44
- name: CER
- - task:
- type: automatic-speech-recognition
- dataset:
- name: ReazonSpeech (Test)
- type: japanese-asr/ja_asr.reazonspeech_test
- metrics:
- - type: WER
- value: 56.62
- name: WER
- - type: CER
- value: 12.6
- name: CER
- - task:
- type: automatic-speech-recognition
- dataset:
- name: JSUT Basic5000
- type: japanese-asr/ja_asr.jsut_basic5000
- metrics:
- - type: WER
- value: 64.36
- name: WER
- - type: CER
- value: 8.48
- name: CER
  datasets:
  - japanese-asr/whisper_transcriptions.reazonspeech.large
  - japanese-asr/whisper_transcriptions.reazonspeech.large.wer_10.0
@@ -77,13 +35,25 @@ Following table presents the raw CER (unlike usual CER where the punctuations ar
  along with the.


- | model | CommonVoice 8.0 (Japanese) | JSUT Basic 5000 | ReazonSpeech Test |
- |:---------------------------------------------------------|---------------------------------------:|-------------------------------------:|----------------------------------------:|
- | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator + stable-ts) | 13.7 | 11.2 | 17.4 |
- | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator) | 13.9 | 11.4 | 18 |
- | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (stable-ts) | 15.7 | 15 | 17.7 |
- | [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0) | 15.6 | 15.2 | 17.8 |
- | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 12.9 | 13.4 | 20.6 |
+ | model | [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
+ |:--------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------:|
+ | [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0) | 17.6 | 15.4 | 17.4 |
+ | [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) | 17.7 | 15.4 | 17 |
+ | [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (punctuator + stable-ts) | 17.7 | 15.4 | 17 |
+ | [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (punctuator) | 17.7 | 15.4 | 17 |
+ | [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (stable-ts) | 17.7 | 15.4 | 17 |
+ | [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0) | 17.8 | 15.2 | 17.8 |
+ | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) | 17.9 | 15 | 17.8 |
+ | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator + stable-ts) | 17.9 | 15 | 17.8 |
+ | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator) | 17.9 | 15 | 17.8 |
+ | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (stable-ts) | 17.9 | 15 | 17.8 |
+ | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 15.3 | 13.4 | 20.5 |
+ | [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 15.9 | 10.6 | 34.6 |
+ | [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 16.6 | 11.3 | 40.7 |
+ | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 17.9 | 13.1 | 39.3 |
+ | [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 34.5 | 26.4 | 76 |
+ | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 21.5 | 18.9 | 48.1 |
+ | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 58.8 | 38.3 | 153.3 |

  Regarding to the normalized CER, since those update from v1.1 will be removed by the normalization, kotoba-tech/kotoba-whisper-v1.1 marks the same CER values as [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0).