Citation:
@inproceedings{chen-etal-2023-places,
title = "{PLACES}: Prompting Language Models for Social Conversation Synthesis",
author = "Chen, Maximillian and
Papangelis, Alexandros and
Tao, Chenyang and
Kim, Seokhwan and
Rosenbaum, Andy and
Liu, Yang and
Yu, Zhou and
Hakkani-Tur, Dilek",
booktitle = "Findings of the Association for Computational Linguistics: EACL 2023",
month = may,
year = "2023",
address = "Dubrovnik, Croatia",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.findings-eacl.63",
pages = "844--868",
}
Below is a version of the dyadic data generated using PLACES with GPT 3.5-Turbo as the backbone:
https://raw.githubusercontent.com/maxlchen/PLACES-GPT3.5/main/PLACES-GPT3.5-Dyadic.jsonlist
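Once the file is downloaded, it can be read with a few lines of Python. This is a minimal sketch, assuming the .jsonlist extension means one JSON value per line; the fields of each record are not described here, so the sketch makes no assumption about them. load_jsonlist is an illustrative helper name and is not part of the PLACES code.

```python
import json
from pathlib import Path
from typing import Any, List


def load_jsonlist(path: str) -> List[Any]:
    """Parse a file holding one JSON value per line, skipping blank lines.

    Assumption: ".jsonlist" follows the JSON Lines convention.
    """
    records: List[Any] = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # ignore empty lines between records
                records.append(json.loads(line))
    return records
```

For example, after saving the file locally, load_jsonlist("PLACES-GPT3.5-Dyadic.jsonlist") should return a list with one entry per line; inspecting the first entry is a quick way to see the actual record layout.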
A multiparty version of the data is coming shortly!
PLACES-GPT3.5 is also featured in DialogStudio: https://github.com/salesforce/DialogStudio#loading-data
This code can be used to recreate the conversations from the paper, which used a list of reference topics from FITS. Feel free to try out different topic and prompt inputs!
Follow these steps to generate synthetic conversations with PLACES.
PLACES can use existing conversations as examples in the prompt, or hand-crafted ones. In the paper, we use 3 datasets: TopicalChat [1], DailyDialog [2], and FITS [3].
Download TopicalChat from here.
Download DailyDialog from here.
Download FITS data from ParlAI.
First, install ParlAI: pip install parlai
Then run: parlai display_data -t fits
This command will tell you where the FITS data is stored on your local machine.
It may take some time to download the data the first time you call it.
We've tested our code with Python 3.8 and transformers 4.26.0, but it should work with earlier versions of transformers too. After creating a virtual environment, you can install the requirements:
pip install transformers
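To compare your environment against the tested versions, a small check like the following may help. It is only a sketch: installed_version is an illustrative helper name, and it relies on importlib.metadata from the standard library (available from Python 3.8).

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the interpreter version and the transformers version (None if missing).
    print("Python:", ".".join(str(part) for part in sys.version_info[:3]))
    print("transformers:", installed_version("transformers"))
```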
If you want to use Topical Chat conversations as prompts, you need to first parse Topical-Chat into a simpler format:
python parse_topical_chat.py --tc_path <PATH_TO_TOPICAL_CHAT>
This will write a .jsonlist file to the prompts directory.
The general command to run PLACES is:
python conversation_synthesis.py <ARGUMENTS>
For example:
python conversation_synthesis.py --fits_path <PATH_TO_FITS>
Or:
python conversation_synthesis.py --fits_path <PATH_TO_FITS> \
    --in_context_dataset "daily_dialog" \
    --in_context_dataset_path <PATH_TO_DAILY_DIALOG>
To generate triadic conversations, add the --triadic flag.
For example:
python conversation_synthesis.py --fits_path <PATH_TO_FITS> \
    --triadic
Or:
python conversation_synthesis.py --fits_path <PATH_TO_FITS> \
    --in_context_dataset "daily_dialog" \
    --in_context_dataset_path <PATH_TO_DAILY_DIALOG> \
    --triadic
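If you prefer to launch these runs from Python (for instance, to loop over several configurations), the commands above can be assembled programmatically. The sketch below uses only the flags shown in this README; build_command and run_places are illustrative helper names, not part of the PLACES code, and the script is assumed to be in the current working directory.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(
    fits_path: str,
    in_context_dataset: Optional[str] = None,
    in_context_dataset_path: Optional[str] = None,
    triadic: bool = False,
) -> List[str]:
    """Assemble the argument list for conversation_synthesis.py."""
    cmd = [sys.executable, "conversation_synthesis.py", "--fits_path", fits_path]
    if in_context_dataset is not None:
        cmd += ["--in_context_dataset", in_context_dataset]
    if in_context_dataset_path is not None:
        cmd += ["--in_context_dataset_path", in_context_dataset_path]
    if triadic:
        cmd.append("--triadic")
    return cmd


def run_places(cmd: List[str]) -> None:
    """Run the assembled command and raise if the script exits with an error."""
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.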
While we haven't tested multi-party conversations with more than 3 participants, it should be possible to do so by creating the appropriate prompts in utils.py.
[1] Karthik Gopalakrishnan, Behnam Hedayatnia, Qinlang Chen, Anna Gottardi, Sanjeev Kwatra, Anushree Venkatesh, Raefer Gabriel, and Dilek Hakkani-Tür. Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations. Interspeech 2019.
[2] Yanran Li, Hui Su, Xiaoyu Shen, Wenjie Li, Ziqiang Cao, and Shuzi Niu. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset. IJCNLP 2017.
[3] Jing Xu, Megan Ung, Mojtaba Komeili, Kushal Arora, Y-Lan Boureau, and Jason Weston. Learning New Skills after Deployment: Improving Open-Domain Internet-Driven Dialogue with Human Feedback. arXiv preprint, 2022.