
Commit f910a39

schnamosfluegel05 and sfluegel authored
changing learning rate back to default of 1e-3 (#156)
* changing learning rate back to the default of 1e-3; however, we recommend 1e-4 for OPT fine-tuning experiments
* delete duplicate config, fix vocab size
* removed some obsolete config files from OPT experiments

---------

Co-authored-by: sfluegel <simon.fluegel@uni-osnabrueck.de>
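For OPT fine-tuning experiments, the recommended 1e-4 need not touch the shared defaults; it can live in a small local model config instead. A minimal sketch, assuming the same structure as configs/model/electra.yml below (the file name is hypothetical):

# hypothetical local config, e.g. configs/model/electra_opt.yml
class_path: chebai.models.Electra
init_args:
  model_type: classification
  optimizer_kwargs:
    lr: 1e-4  # recommended for OPT fine-tuning; 1e-3 is the project-wide default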
1 parent 9a6c7bf commit f910a39

File tree

7 files changed: +7 −51 lines


configs/model/OPT_experiments/electra_LR.yml

Lines changed: 0 additions & 12 deletions
This file was deleted.

configs/model/OPT_experiments/electra_tox_expl.yml

Lines changed: 0 additions & 15 deletions
This file was deleted.

configs/model/electra-for-pretraining.yml

Lines changed: 3 additions & 3 deletions
@@ -4,16 +4,16 @@ init_args:
     class_path: chebai.loss.pretraining.ElectraPreLoss
   out_dim: null
   optimizer_kwargs:
-    lr: 1e-4
+    lr: 1e-3
   config:
     generator:
-      vocab_size: 1400
+      vocab_size: 4400
       max_position_embeddings: 1800
       num_attention_heads: 8
       num_hidden_layers: 6
       type_vocab_size: 1
     discriminator:
-      vocab_size: 1400
+      vocab_size: 4400
       max_position_embeddings: 1800
       num_attention_heads: 8
       num_hidden_layers: 6

configs/model/electra.yml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ class_path: chebai.models.Electra
 init_args:
   model_type: classification
   optimizer_kwargs:
-    lr: 1e-4
+    lr: 1e-3
   config:
     vocab_size: 4400
     max_position_embeddings: 1800

configs/model/electra300.yml

Lines changed: 2 additions & 1 deletion
@@ -1,9 +1,10 @@
 class_path: chebai.models.Electra
 init_args:
+  model_type: classification
   optimizer_kwargs:
     lr: 1e-3
   config:
-    vocab_size: 1400
+    vocab_size: 4400
     max_position_embeddings: 301
     num_attention_heads: 8
     num_hidden_layers: 6

configs/model/electra_pretraining.yml

Lines changed: 0 additions & 18 deletions
This file was deleted.

configs/model/electra_tox.yml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ class_path: chebai.models.Electra
 init_args:
   model_type: classification
   optimizer_kwargs:
-    lr: 1e-4
+    lr: 1e-3 # we recommend 1e-4 for OPT finetuning, however, 1e-3 is the default
   config:
     vocab_size: 1400
     max_position_embeddings: 1800
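For a one-off OPT fine-tuning run, the recommended rate could also be applied without editing the file at all. A sketch assuming a LightningCLI-style entry point with dotted overrides (the exact flag path is inferred from the YAML structure above, not confirmed by this commit):

# hypothetical invocation; the dotted path mirrors init_args.optimizer_kwargs.lr
python -m chebai fit --model=configs/model/electra_tox.yml --model.init_args.optimizer_kwargs.lr=1e-4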
