M-ABSA

This repo contains the data and code for our EMNLP-2025 paper M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis.

Data Description:

M-ABSA is a dataset for multilingual aspect-based sentiment analysis (ABSA) with triplet extraction.

All datasets are stored in the data/ folder.

The data covers 7 domains, each forming one dataset:

domains = ["coursera", "hotel", "laptop", "restaurant", "phone", "sight", "food"]

Each dataset contains 21 languages.

langs = ["ar", "da", "de", "en", "es", "fr", "hi", "hr", "id", "ja", "ko", "nl", "pt", "ru", "sk", "sv", "sw", "th", "tr", "vi", "zh"]

Each line holds one sentence and its labels, separated by "####": the first part is the sentence and the second part is the list of corresponding triplets. Every triplet has the form [aspect term, aspect category, sentiment polarity]. Here is an example:

This coffee brews up a nice medium roast with exotic floral and berry notes .####[['coffee', 'food quality', 'positive']]
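The line format above can be read with a few lines of Python. This is a minimal sketch, not the repository's own loader: the function name parse_line is hypothetical, and it assumes the label part is always a valid Python list literal, as in the example.

```python
import ast


def parse_line(line: str):
    """Split one dataset line into (sentence, list of triplets)."""
    # Split on the last "####" so the label part is always the final field.
    sentence, sep, label_part = line.rstrip("\n").rpartition("####")
    if not sep:
        raise ValueError(f"missing '####' separator: {line!r}")
    # The label part looks like a Python list literal; literal_eval parses
    # it without executing arbitrary code.
    triplets = [tuple(t) for t in ast.literal_eval(label_part)]
    return sentence.strip(), triplets


example = (
    "This coffee brews up a nice medium roast with exotic floral "
    "and berry notes .####[['coffee', 'food quality', 'positive']]"
)
sentence, triplets = parse_line(example)
print(sentence)
print(triplets)
```

Each parsed triplet is a tuple of (aspect term, aspect category, sentiment polarity).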

Each dataset is divided into training, validation, and test sets.
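With 7 domains, 21 languages, and three splits, it can help to enumerate every expected file before training. The sketch below does that under loud assumptions: the directory layout data/<domain>/<lang>/<split>.txt and the split names train, dev, and test are hypothetical, so adjust split_path to match the actual contents of the data/ folder.

```python
from itertools import product
from pathlib import Path

domains = ["coursera", "hotel", "laptop", "restaurant", "phone", "sight", "food"]
langs = ["ar", "da", "de", "en", "es", "fr", "hi", "hr", "id", "ja", "ko",
         "nl", "pt", "ru", "sk", "sv", "sw", "th", "tr", "vi", "zh"]
splits = ["train", "dev", "test"]  # assumed split names


def split_path(root, domain, lang, split):
    # Hypothetical layout; change this to mirror the real data/ folder.
    return Path(root) / domain / lang / f"{split}.txt"


def missing_files(root="data"):
    """List every expected split file that does not exist under root."""
    missing = []
    for domain, lang, split in product(domains, langs, splits):
        path = split_path(root, domain, lang, split)
        if not path.is_file():
            missing.append(path)
    return missing


print(len(domains) * len(langs), "domain-language combinations")
print(len(missing_files()), "expected files not found under data/")
```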

Code Requirements

We recommend installing the following package versions:

transformers==4.0.0
sentencepiece==0.1.91
pytorch_lightning==0.8.1
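To confirm that the environment matches these pins, a small check with the standard-library importlib.metadata module can be used. This is only a sketch: the helper name check_pins is hypothetical and not part of the repository.

```python
from importlib import metadata

# Pinned versions copied from the list above.
PINS = {
    "transformers": "4.0.0",
    "sentencepiece": "0.1.91",
    "pytorch_lightning": "0.8.1",
}


def check_pins(pins=PINS, get_version=metadata.version):
    """Return {package: (pinned, installed)} for every mismatch.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


for name, (pinned, installed) in check_pins().items():
    print(f"{name}: expected {pinned}, found {installed}")
```

An empty result means all three packages are installed at the recommended versions.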

Quick Start for the Baseline

Set up the environment as described in the section above.
Download the pre-trained mT5-base model from https://huggingface.co/google/mt5-base and place it in the mT5-base/ folder.
Run bash run.sh, which trains the model on the source language for the UABSA or TASD task.
Run bash evaluate.sh, which tests the model on the target language for the UABSA or TASD task.

Detailed Usage: The paper reports experiments on two ABSA subtasks with the M-ABSA dataset. You can switch between them by changing the parameters in run.sh:

task: tasd for triplet extraction, uabsa for (aspect term - sentiment polarity) pair extraction
dataset: one of the seven datasets in [food, restaurant, coursera, laptop, sight, phone, hotel]

python main.py --task tasd \
               --dataset hotel \
               --model_name_or_path mt5-base \
               --paradigm extraction \
               --n_gpu 0 \
               --do_train \
               --do_direct_eval \
               --train_batch_size 16 \
               --gradient_accumulation_steps 2 \
               --eval_batch_size 16 \
               --learning_rate 3e-4 \
               --num_train_epochs 5
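The two tasks differ only in what is extracted: tasd targets the full triplet, while uabsa targets the (aspect term, sentiment polarity) pair. The sketch below illustrates that relationship by dropping the aspect category from each triplet. It is an illustration only; the function name triplets_to_pairs is hypothetical, and whether the repository derives its UABSA labels exactly this way (including the de-duplication step) is an assumption.

```python
def triplets_to_pairs(triplets):
    """Drop the aspect category from TASD triplets to get UABSA-style pairs."""
    pairs = []
    for term, _category, polarity in triplets:
        pair = (term, polarity)
        # Keep the original order and skip pairs that became duplicates
        # once the category was removed (an assumption of this sketch).
        if pair not in pairs:
            pairs.append(pair)
    return pairs


tasd_label = [["coffee", "food quality", "positive"]]
print(triplets_to_pairs(tasd_label))
```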

Quick Start for the LLM Evaluation

Set up the environment as described in the section above.
Download the LLMs from Hugging Face and enter each model's path in the "TODO" placeholder of the corresponding Python file for LLM evaluation.

Detailed Usage: The paper reports experiments on two ABSA subtasks with the M-ABSA dataset. You can set the parameters directly on the command line:

model: one of the models in [gemma, llama, mistral, qwen]
task: tasd for triplet extraction, uabsa for (aspect term - sentiment polarity) pair extraction
test_lang: one of the languages in [ar, da, de, en, es, fr, hi, hr, id, ja, ko, nl, pt, ru, sk, sv, sw, th, tr, vi, zh]
type: one of the seven datasets in [food, restaurant, coursera, laptop, sight, phone, hotel]

python {model}_{task}.py --test_lang "en" --type "food"
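Since the script name is composed from the model and task, the full set of evaluation commands for one language and dataset can be generated programmatically. The following is a sketch under the naming pattern shown above; build_command is a hypothetical helper, and the commands are only printed, not executed.

```python
import shlex
from itertools import product

models = ["gemma", "llama", "mistral", "qwen"]
tasks = ["tasd", "uabsa"]


def build_command(model, task, test_lang, dataset):
    """Build the argument list for one LLM evaluation run."""
    script = f"{model}_{task}.py"  # follows the {model}_{task}.py pattern
    return ["python", script, "--test_lang", test_lang, "--type", dataset]


# Print one command per model and task for English on the food dataset.
for model, task in product(models, tasks):
    print(shlex.join(build_command(model, task, "en", "food")))
```

Each argument list can be passed to subprocess.run from the folder that holds the evaluation scripts once the model paths have been filled in.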

Citation

If the code or dataset is used in your research, please star our repo and cite our paper as follows:

@inproceedings{wu-etal-2025-absa,
    title = "{M}-{ABSA}: A Multilingual Dataset for Aspect-Based Sentiment Analysis",
    author = "Wu, ChengYan  and
      Ma, Bolei  and
      Liu, Yihong  and
      Zhang, Zheyu  and
      Deng, Ningyuan  and
      Li, Yanshu  and
      Chen, Baolan  and
      Zhang, Yi  and
      Xue, Yun  and
      Plank, Barbara",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-main.128/",
    doi = "10.18653/v1/2025.emnlp-main.128",
    pages = "2530--2557",
    ISBN = "979-8-89176-332-6",
}
