Tuning Large PLMs for Event Extraction
We provide an example script for tuning large pre-trained language models (PLMs) on event extraction tasks. We use BMTrain as the distributed training engine. BMTrain is an efficient large model training toolkit, see BMTrain and ModelCenter for more details. We adapt the code of ModelCenter for event extraction and place the code in OmniEvent/utils.
Setup
Install the code in OmniEvent/utils/ModelCenter
:
cd utils/ModelCenter
pip install .
Easy Start
Run bash train.sh
to train MT5-xxl. You can modify the config and the important hyper-parameters are as follows:
NNODES # number of nodes
GPUS_PER_NODE # gpus use on one node
model-config # We only support T5 and MT5
The original ModelCenter repo doesn’t support inference method (i.e. generate
) for decoder PLMs. We provide
beam_search.py
for inference.