sheffieldnlp/AMR2Text-summ

Work in progress

Guided Neural Language Generation for Abstractive Summarization Using AMR

This repository contains the code for our EMNLP 2018 paper "Guided Neural Language Generation for Abstractive Summarization Using AMR".

Obtaining the Dataset

We used the Abstract Meaning Representation Annotation Release 2.0 (LDC2017T10), which contains manually annotated document and summary AMRs.

Preprocessing the Data

For preprocessing, clone the AMR preprocessing repository.

```
git clone https://github.com/sheffieldnlp/AMR-Preprocessing
```

Run the AMR linearization on the raw AMR dataset ($F_TRAIN, $F_TEST, and $F_DEV below). The test split is sufficient for generation alone; also preprocess the training and validation splits if you want to train the model. Later, the same script linearizes the system summary AMRs from Liu et al.'s summarizer ($F) with the raw document AMRs ($AMR) as side information.

```
export F_TRAIN=//amr-release-2.0-amrs-training.txt
export F_TEST=//amr-release-2.0-amrs-test.txt
export F_DEV=//amr-release-2.0-amrs-dev.txt
export OUTPUT=//
python var_free_amrs.py -f $F_TRAIN -output_path $OUTPUT --custom_parentheses -no_semantics --delete_amr_var
python var_free_amrs.py -f $F_TEST -output_path $OUTPUT --custom_parentheses -no_semantics --delete_amr_var
python var_free_amrs.py -f $F_DEV -output_path $OUTPUT --custom_parentheses -no_semantics --delete_amr_var
```

For each split (train, test, and dev), the script produces two parallel files: the sentences (.sent) and their respective linearized AMRs (.tf).
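A minimal sketch of how the two output files line up, one sentence per line in the .sent file and its variable-free linearized AMR on the same line of the .tf file (the contents below are invented for illustration, not taken from the actual dataset):

```python
def check_parallel(sent_lines, tf_lines):
    """Return aligned (sentence, linearized_amr) pairs, or raise if the
    two files have a different number of lines."""
    if len(sent_lines) != len(tf_lines):
        raise ValueError(
            f"misaligned files: {len(sent_lines)} sentences vs {len(tf_lines)} AMRs"
        )
    return list(zip(sent_lines, tf_lines))

# Toy example of the expected shape of a .sent / .tf pair.
sents = ["The dog barked ."]
amrs = ["( bark-01 :ARG0 ( dog ) )"]
for sent, amr in check_parallel(sents, amrs):
    print(sent, "|||", amr)
```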

Training a New Model

First, build the training data from the linearized train and dev files:

```
export SRC=//all_amr-release-2.0-amrs-training.txt.tf
export TGT=//all_amr-release-2.0-amrs-training.txt.sent
export SRC_VALID=//all_amr-release-2-0-amrs-dev-all.txt.tf
export TGT_VALID=//all_amr-release-2-0-amrs-dev-all.txt.sent
export SAVE=//

python preprocess.py -train_src $SRC -train_tgt $TGT -valid_src $SRC_VALID -valid_tgt $TGT_VALID -save_data $SAVE -src_seq_length 1000 -tgt_seq_length 1000 -shuffle 1
```

To linearize the system summary AMRs ($F) with the document AMRs ($AMR) as side information:

```
export F=//summ_ramp_10_passes_len_edges_exp_0
export OUTPUT=/
export AMR=//amr-release-2.0-amrs-test-proxy.txt
python var_free_amrs.py -is_dir -f $F -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var --with_side -side_file $AMR
```

Then train the model:

```
python $WORK/train.py -data $PREPROCESS/van_noord/no_filter_amr_2/data -save_model $MODEL/$TYPE -rnn_size 500 -layers 2 -epochs 2000 -optim sgd -learning_rate 1 -learning_rate_decay 0.8 -encoder_type brnn -global_attention general -seed 1 -dropout 0.5 -batch_size 32
```
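With -optim sgd -learning_rate 1 -learning_rate_decay 0.8, each decay step multiplies the current learning rate by 0.8 (in OpenNMT-style trainers the decay is typically triggered when validation perplexity stops improving or after a start-decay epoch; the trigger here is left abstract). A quick illustration of the resulting schedule:

```python
# Learning-rate schedule implied by -learning_rate 1 -learning_rate_decay 0.8:
# each decay step multiplies the current rate by the decay factor.
lr, decay = 1.0, 0.8
schedule = []
for step in range(5):
    schedule.append(round(lr, 4))
    lr *= decay
print(schedule)  # [1.0, 0.8, 0.64, 0.512, 0.4096]
```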

Generation with the New Model

```
python $WORK/translate.py -src $file -output $INPUT/gen/summ_rigotrio_fluent_side/$(basename $file).system -model $MODEL/rse/sprint_1/acc_53.28_ppl_46.79_e126.pt -replace_unk -side_src $INPUT/processed/rigotrio/body$(basename $file).s -side_tgt $INPUT/processed/rigotrio/body_$(basename $file).sent.s -beam_size 5 -max_length 100 -n_best 1 -batch_size 1 -verbose -psi 0.95 -theta 2.5 -k 15
```
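The $file variable in the command above suggests a loop over the linearized summary-AMR files. A sketch of how one might assemble that invocation per file (the paths, model name, and side-file naming scheme below are placeholders mirroring the command above, not the repository's actual layout):

```python
import os


def build_translate_cmd(amr_file, model_path, out_dir, side_dir):
    """Assemble the translate.py argument list for one linearized
    summary-AMR file; the "body_" side-file prefix is an assumption."""
    base = os.path.basename(amr_file)
    return [
        "python", "translate.py",
        "-src", amr_file,
        "-output", os.path.join(out_dir, base + ".system"),
        "-model", model_path,
        "-replace_unk",
        "-side_src", os.path.join(side_dir, "body_" + base + ".s"),
        "-side_tgt", os.path.join(side_dir, "body_" + base + ".sent.s"),
        "-beam_size", "5", "-max_length", "100",
        "-n_best", "1", "-batch_size", "1",
        "-psi", "0.95", "-theta", "2.5", "-k", "15",
    ]


cmd = build_translate_cmd("proxy_0001.tf", "model.pt", "gen", "processed")
print(" ".join(cmd))
```

In practice each returned argument list would be run with subprocess.run(cmd), once per .tf file in the output directory of the linearization step.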

About

AMR-to-Text Generator with Side Information

License

MIT license
