stat214-lab3-bert-lora-r4-maxlen256
LoRA adapter for bert-base-uncased, fine-tuned on transcripts from the
Huth Lab fMRI story-listening dataset for the Stat 214 (Spring 2026)
final project at UC Berkeley.
The adapter is used to extract context-aware word embeddings that are then fed into a per-voxel ridge regression to predict whole-brain BOLD signal from spoken-story stimuli.
Configuration
| Hyperparameter | Value |
|---|---|
| Base model | bert-base-uncased |
| LoRA rank r | 4 |
| LoRA alpha | 8 |
| LoRA dropout | 0.1 |
| Target modules | query, value |
| Training objective | Masked Language Modeling (MLM, 15%) |
| Training stories | 86 (Huth Lab podcast transcripts) |
| MLM max sequence length | 256 |
| Epochs | 3 |
| Optimizer | AdamW, lr=2e-4 |
| Batch size | 16 |
| Final MLM training loss | — |
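To give a sense of how small this adapter is, the following back-of-the-envelope sketch counts the weights that rank-4 LoRA matrices add to the query and value projections. It assumes the standard bert-base-uncased shape (12 encoder layers, hidden size 768) and the usual LoRA scaling of alpha / r; it counts only the LoRA A and B matrices, not any other parameters that may have been trained or saved with the adapter.

```python
# Rough count of the weights added by the LoRA adapter described above.
# Assumptions: bert-base-uncased has 12 encoder layers with hidden size 768,
# and each targeted projection is a square 768 x 768 linear layer.
HIDDEN_SIZE = 768
NUM_LAYERS = 12
RANK = 4                              # LoRA rank r (from the table)
ALPHA = 8                             # LoRA alpha (from the table)
TARGET_MODULES = ("query", "value")   # from the table


def lora_params_per_module(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) next to a frozen weight."""
    return r * d_in + d_out * r


per_module = lora_params_per_module(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
total = per_module * len(TARGET_MODULES) * NUM_LAYERS
scaling = ALPHA / RANK  # standard LoRA scaling applied to the B @ A update

print(f"LoRA weights per adapted projection: {per_module}")
print(f"LoRA weights across all layers:      {total}")
print(f"Update scaling (alpha / r):          {scaling}")
```

Under these assumptions the adapter adds 6,144 weights per projection and 147,456 in total, with the low-rank update scaled by 2.0.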
Encoding-model performance
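Per-word embeddings were extracted from this adapter using ±10-word context windows, then Lanczos-downsampled and stacked at 4 TR delays. Per-voxel ridge regression was fit on these features for Subjects 2 and 3. The table reports the resulting per-voxel correlation coefficients (CC):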
| Subject | Mean CC | Top 5% CC | Top 1% CC | Top-1 voxel |
|---|---|---|---|---|
| Subject 2 | 0.0643 | 0.2143 | 0.2906 | 0.4736 |
| Subject 3 | 0.0660 | 0.2176 | 0.3043 | 0.5159 |
(See the full project repository for ridge weights, evaluation code, and SHAP / LIME word-importance analyses.)
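The encoding step can be illustrated with a minimal NumPy sketch on synthetic data. This is not the project's evaluation code: the function names, the delay values (1 to 4 TRs), the ridge penalty, the train/test split, and the reading of "Top 5% CC" and "Top 1% CC" as the mean CC over the best 5% and 1% of voxels are all assumptions made for illustration. Lanczos downsampling is omitted, so the features are assumed to be already at the TR rate.

```python
import numpy as np


def make_delayed(features, delays):
    """Stack copies of (n_trs, n_dims) features, each shifted down by d TRs."""
    n_trs, n_dims = features.shape
    out = np.zeros((n_trs, n_dims * len(delays)))
    for k, d in enumerate(delays):
        out[d:, k * n_dims:(k + 1) * n_dims] = features[:n_trs - d]
    return out


def fit_ridge(X, Y, alpha):
    """Closed-form ridge, W = (X^T X + alpha I)^-1 X^T Y, for all voxels at once."""
    n_feat = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_feat), X.T @ Y)


def voxelwise_cc(Y_true, Y_pred):
    """Pearson correlation between measured and predicted signal, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return (yt * yp).sum(axis=0) / denom


def summarize(cc):
    """Mean CC, mean over the top 5% and top 1% of voxels, and the best voxel."""
    s = np.sort(cc)[::-1]
    n5 = max(1, int(round(0.05 * len(s))))
    n1 = max(1, int(round(0.01 * len(s))))
    return {"mean": s.mean(), "top5": s[:n5].mean(),
            "top1": s[:n1].mean(), "best": s[0]}


# Synthetic stand-ins: 300 TRs, 8-dim embeddings, 50 voxels (all hypothetical).
rng = np.random.default_rng(0)
features = rng.standard_normal((300, 8))
delayed = make_delayed(features, delays=(1, 2, 3, 4))   # assumed delays
true_weights = rng.standard_normal((delayed.shape[1], 50))
bold = delayed @ true_weights + rng.standard_normal((300, 50))

# Fit on the first 200 TRs, evaluate on the held-out last 100.
W = fit_ridge(delayed[:200], bold[:200], alpha=1.0)
cc = voxelwise_cc(bold[200:], delayed[200:] @ W)
stats = summarize(cc)
print({name: round(float(value), 3) for name, value in stats.items()})
```

Because the synthetic voxels are generated from the features themselves, the correlations here are far higher than the values in the table; the sketch only shows the shape of the computation.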
Loading the adapter
from transformers import BertForMaskedLM, BertTokenizerFast
from peft import PeftModel
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
base = BertForMaskedLM.from_pretrained("bert-base-uncased")
model = PeftModel.from_pretrained(base, "RheaTinghe/stat214-lab3-bert-lora-r4-maxlen256")
model.eval()
# Extract per-word embeddings via ±10 word context windows
# (see scripts/run_bert_pretrained.py in the project repo for the
# complete extraction pipeline)
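As a rough illustration of the ±10-word context windows mentioned above, the sketch below builds one window per word from a transcript. The helper name `context_windows` and the symmetric, edge-clipped window are assumptions; how each window is tokenized and which hidden states are pooled into the word embedding is not specified in this card, so that part is left to the project script.

```python
def context_windows(words, half_width=10):
    """Yield (target position inside the window, window words) for each word.

    The window holds up to `half_width` words on each side of the target
    and is clipped at the start and end of the transcript.
    """
    for i in range(len(words)):
        start = max(0, i - half_width)
        end = min(len(words), i + half_width + 1)
        yield i - start, words[start:end]


# Tiny demonstration with a half-width of 2 so the windows are easy to read.
words = "the quick brown fox jumps over the lazy dog".split()
windows = list(context_windows(words, half_width=2))

for target_pos, window in windows[:3]:
    print(target_pos, window)
```

Each window could then be joined into a string, passed through `tokenizer` and `model`, and reduced to a single vector for the target word.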
Citation
@misc{stat214lab3,
author = {Galloro, Drew and Wang, Ruihang and Khothsombath, Benjamin and Zhang, Rhea},
title = {Stat 214 Lab 3: BERT-LoRA encoding model for fMRI},
year = {2026},
note = {UC Berkeley Spring 2026},
}