Hi,
I was trying to use AutoModelForCTC.from_pretrained("facebook/w2v-bert-2.0") to load the w2v-bert model, but I always get the error:
File "/home/jcsilva/huggingsound/.venv/lib/python3.11/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 1185, in __init__
raise ValueError(
ValueError: You are trying to instantiate <class 'transformers.models.wav2vec2_bert.modeling_wav2vec2_bert.Wav2Vec2BertForCTC'> with a configuration that does not define the vocabulary size of the language model head. Please instantiate the model as follows: Wav2Vec2BertForCTC.from_pretrained(..., vocab_size=vocab_size). or define vocab_size of your model's configuration.
Investigating the issue, I saw two possible causes:
- The vocab_size param defined in the config file at config.json · facebook/w2v-bert-2.0 at main is equal to null. @reach-vb or @ylacombe, would it be possible to remove this param (vocab_size) from the model config file? If not, what do you think about setting it to a valid value (e.g. 32, as in config.json · facebook/wav2vec2-large-xlsr-53 at main)?
- The vocab_size default value for the W2VBert model is None (please see it here), but it is 32 for Wav2Vec2 models, as you can see here. Could both vocab_size defaults be 32? That way, the ValueError I mentioned in this ticket would no longer be raised when using AutoModelForCTC.
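In the meantime, for anyone hitting the same error, the workaround hinted at in the error message seems to be passing vocab_size explicitly when loading. Below is a minimal sketch of that idea, not an official fix. The helper names resolve_vocab_size and load_ctc_model are my own (hypothetical), and 32 is only a placeholder borrowed from the Wav2Vec2 default; the right value is the size of your own tokenizer's vocabulary.

```python
import json


def resolve_vocab_size(config_json: str, fallback: int) -> int:
    """Return vocab_size from a config.json string, or `fallback` if it is missing or null."""
    config = json.loads(config_json)
    value = config.get("vocab_size")  # JSON null is parsed as Python None
    return fallback if value is None else value


def load_ctc_model(vocab_size: int, model_id: str = "facebook/w2v-bert-2.0"):
    """Load the checkpoint with an explicit vocab_size, as the error message suggests."""
    # Imported lazily so resolve_vocab_size can be used without transformers installed.
    from transformers import AutoModelForCTC

    # Keyword arguments that match config attributes override the loaded config,
    # so vocab_size=None from config.json is replaced before the CTC head is built.
    return AutoModelForCTC.from_pretrained(model_id, vocab_size=vocab_size)


# Usage (downloads the checkpoint, so it is left commented out here):
# model = load_ctc_model(vocab_size=32)
```

Note that the CTC head created this way is freshly initialized, so the model still has to be fine-tuned before its outputs mean anything.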
Thank you