Open Information Extraction (OIE) Resources

A curated list of Open Information Extraction (OIE) resources: research papers, code, data, applications, etc. The list is not limited to Open Information Extraction systems exclusively. It also includes work highly related to OIE, such as taxonomizing open relations and using OIE in downstream applications.

All of the original work was done by Kiril Gashteovski at https://github.com/gkiril/oie-resources. This fork updates the list with newer papers.

Introduction to OIE

Open Information Extraction (OIE) systems aim to extract previously unseen relations and their arguments from unstructured text in an unsupervised manner. In its simplest form, given a natural language sentence, an OIE system extracts information as a triple consisting of a subject (S), a relation (R), and an object (O).

Suppose we have the following input sentence:

AMD, which is based in U.S., is a technology company.

An OIE system aims to make the following extractions:

("AMD"; "is based in"; "U.S.")
("AMD"; "is"; "technology company")
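
As a minimal sketch of the output format only, the two extractions above can be held in a small data structure. The `Triple` class and its field names are assumptions made for illustration and do not come from any particular OIE system; the triples are written out by hand, and no extractor is run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One OIE extraction (hypothetical container): subject, relation, object."""
    subject: str
    relation: str
    obj: str

    def __str__(self) -> str:
        # Render in the ("S"; "R"; "O") notation used in this list.
        return f'("{self.subject}"; "{self.relation}"; "{self.obj}")'


# The example sentence and its two extractions, entered manually.
sentence = "AMD, which is based in U.S., is a technology company."
extractions = [
    Triple("AMD", "is based in", "U.S."),
    Triple("AMD", "is", "technology company"),
]

for triple in extractions:
    print(triple)
```

Keeping each slot as a plain surface string mirrors the open setting: the relation is a phrase taken from the sentence, not an identifier from a fixed schema.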

Papers sorted in chronological order

2006

"Machine Reading" - AAAI 2006

Oren Etzioni, Michele Banko, Michael J. Cafarella

2007

"Open Information Extraction from the Web" - IJCAI 2007

Michele Banko, Michael J. Cafarella, Stephen Soderland, Matthew Broadhead, Oren Etzioni
"Unsupervised Resolution of Objects and Relations on the Web" - NAACL 2007

Alexander Yates, Oren Etzioni
"TextRunner: Open Information Extraction on the Web" - HLT-NAACL 2007

Alexander Yates, Michele Banko, Matthew Broadhead, Michael J. Cafarella, Oren Etzioni, Stephen Soderland

2008

"The Tradeoffs between Open and Traditional Relation Extraction" - ACL 2008

Michele Banko, Oren Etzioni
"Open Knowledge Extraction through Compositional Language Processing" - STEP 2008

Benjamin Van Durme, Lenhart K. Schubert
"Open Information Extraction from the Web" - Commun. ACM 2008

Oren Etzioni, Michele Banko, Stephen Soderland, Daniel S. Weld

2009

"Using Wikipedia to Bootstrap Open Information Extraction" - SIGMOD 2009

Daniel S. Weld, Raphael Hoffmann, Fei Wu

2010

"Open Information Extraction Using Wikipedia" - ACL 2010

Fei Wu, Daniel S. Weld
"Identifying Functional Relations in Web Text" - EMNLP 2010

Thomas Lin, Mausam, Oren Etzioni
"Adapting Open Information Extraction to Domain-Specific Relations" - AI Magazine (31), 2010

Stephen Soderland, Brendan Roof, Bo Qin, Shi Xu, Mausam, Oren Etzioni

2011

"Open Information Extraction: The Second Generation" - IJCAI 2011 (slides)

Oren Etzioni, Anthony Fader, Janara Christensen, Stephen Soderland, Mausam
"Identifying Relations for Open Information Extraction" - EMNLP 2011 (resources (code, data))

Anthony Fader, Stephen Soderland, Oren Etzioni
"Filtering and Clustering Relations for Unsupervised Information Extraction in Open Domain" - CIKM 2011

Wei Wang, Romaric Besançon, Olivier Ferret, Brigitte Grau
"An Analysis of Open Information Extraction based on Semantic Role Labeling" - K-CAP 2011

Janara Christensen, Mausam, Stephen Soderland, Oren Etzioni

2012

"Open Language Learning for Information Extraction" - EMNLP-CoNLL 2012 (resources (code, data, binaries))

Mausam, Michael Schmitz, Stephen Soderland, Robert Bart, Oren Etzioni
"PATTY: A Taxonomy of Relational Patterns with Semantic Types" - EMNLP-CoNLL 2012

Ndapandula Nakashole, Gerhard Weikum, Fabian M. Suchanek
"Ensemble Semantics for Large-scale Unsupervised Relation Extraction" - EMNLP-CoNLL 2012

Bonan Min, Shuming Shi, Ralph Grishman, Chin-Yew Lin
"WiSeNet: building a wikipedia-based semantic network with ontologized relations" - CIKM 2012 (resources)

Andrea Moro, Roberto Navigli
"Open Information Extraction for SOV Language Based on Entity-Predicate Pair Detection" - COLING 2012

Woong-Ki Lee, Yeon-Su Lee, Hyoung-Gyu Lee, Won-Ho Ryu, Hae-Chang Rim
"A Weighting Scheme for Open Information Extraction" - HLT-NAACL 2012

Yuval Merhav
"Dependency-based open information extraction" - Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP @ ACL 2012

Pablo Gamallo, Marcos Garcia
"KrakeN: N-ary Facts in Open Information Extraction" - AKBC-WEKEX@NAACL-HLT 2012

Alan Akbik, Alexander Löser
"Improving Open Information Extraction for Informal Web Documents with Ripple-Down Rules" - PKAW 2012

Myung Hee Kim, Paul Compton

2013

"ClausIE: clause-based open information extraction" - WWW 2013 (slides, code, all resources)

Luciano Del Corro, Rainer Gemulla
"Integrating Syntactic and Semantic Analysis into the Open Information Extraction Paradigm" - IJCAI 2013 (resources)

Andrea Moro, Roberto Navigli
"Effectiveness and Efficiency of Open Relation Extraction" - EMNLP 2013 (code)

Filipe de Sá Mesquita, Jordan Schmidek, Denilson Barbosa
"Open Information Extraction with Tree Kernels" - HLT-NAACL 2013

Ying Xu, Mi-Young Kim, Kevin Quinn, Randy Goebel, Denilson Barbosa
"Relation Extraction with Matrix Factorization and Universal Schemas" - HLT-NAACL 2013

Sebastian Riedel, Limin Yao, Andrew McCallum, Benjamin M. Marlin
"Open Information Extraction via Contextual Sentence Decomposition" - ICSC 2013

Hannah Bast, Elmar Haussmann
"Integrating Open and Closed Information Extraction: Challenges and First Steps" - NLP-DBPEDIA@ISWC 2013

Arnab Dutta, Christian Meilicke, Mathias Niepert, Simone Paolo Ponzetto
"Open Information Extraction to KBP Relations in 3 Hours" - TAC 2013

Stephen Soderland, John Gilmer, Robert Bart, Oren Etzioni, Daniel S. Weld

2014

"ReNoun: Fact Extraction for Nominal Attributes" - EMNLP 2014

Mohamed Yahya, Steven Whang, Rahul Gupta, Alon Y. Halevy
"ZORE: A Syntax-based System for Chinese Open Relation Extraction" - EMNLP 2014

Likun Qiu, Yue Zhang
"Canonicalizing Open Knowledge Bases" - CIKM 2014

Luis Galárraga, Geremy Heitz, Kevin Murphy, Fabian M. Suchanek
"Focused Entailment Graphs for Open IE Propositions" - CoNLL 2014

Omer Levy, Ido Dagan, Jacob Goldberger
"Boosting Open Information Extraction with Noun-Based Relations" - LREC 2014

Clarissa Castellã Xavier, Vera Lúcia Strube de Lima
"Improving Open Relation Extraction via Sentence Re-Structuring" - LREC 2014

Jordan Schmidek, Denilson Barbosa
"More Informative Open Information Extraction via Simple Inference" - ECIR 2014

Hannah Bast, Elmar Haussmann
"Semantifying Triples from Open Information Extraction Systems" - STAIRS 2014

Arnab Dutta, Christian Meilicke, Heiner Stuckenschmidt
"Entity-Centric Coreference Resolution of Person Entities for Open Information Extraction" - Procesamiento del Lenguaje Natural (2014)

Marcos García, Pablo Gamallo
"Open Information Extraction for Spanish Language based on Syntactic Constraints" - ACL (Student Research Workshop) (2014)

Alisa Zhila, Alexander Gelbukh

2015

"Leveraging Linguistic Structure For Open Domain Information Extraction" - ACL 2015 (code (Java), code (Python))

Gabor Angeli, Melvin Jose Johnson Premkumar, Christopher D. Manning
"Open IE as an Intermediate Structure for Semantic Tasks" - ACL 2015

Gabriel Stanovsky, Ido Dagan, Mausam
"Large-Scale Information Extraction from Textual Definitions through Deep Syntactic and Semantic Analysis" - TACL 2015 (resources)

Claudio Delli Bovi, Luca Telesca, Roberto Navigli
"Inferring Binary Relation Schemas for Open Information Extraction" - EMNLP 2015

Kangqi Luo, Xusheng Luo, Kenny Qili Zhu
"Knowledge Base Unification via Sense Embeddings and Disambiguation" - EMNLP 2015 (resources)

Claudio Delli Bovi, Luis Espinosa Anke, Roberto Navigli
"CORE: Context-Aware Open Relation Extraction with Factorization Machines" - EMNLP 2015 (code)

Fabio Petroni, Luciano Del Corro, Rainer Gemulla
"Multilingual Open Relation Extraction Using Cross-lingual Projection" - HLT-NAACL 2015

Manaal Faruqui, Shankar Kumar
"Enriching Structured Knowledge with Open Information" - WWW 2015

Arnab Dutta, Christian Meilicke, Heiner Stuckenschmidt
"SRDF: Korean Open Information Extraction using Singleton Property" - ISWC 2015

Sangha Nam, YoungGyun Hahm, Sejin Nam, Key-Sun Choi
"Multilingual Open Information Extraction" - EPIA 2015

Pablo Gamallo, Marcos García
"Open information extraction based on lexical semantics" - J. Braz. Comp. Soc. 21 2015

Clarissa Castellã Xavier, Vera Lúcia Strube de Lima, Marlo Souza

2016

"Nested Propositions in Open Information Extraction" - EMNLP 2016 (talk)

Nikita Bhutani, H. V. Jagadish, Dragomir R. Radev
"Creating a Large Benchmark for Open Information Extraction" - EMNLP 2016 (code, talk)

Gabriel Stanovsky, Ido Dagan
"Porting an Open Information Extraction System from English to German" - EMNLP 2016 (code)

Tobias Falke, Gabriel Stanovsky, Iryna Gurevych, Ido Dagan
"Relation Schema Induction using Tensor Factorization with Side Information" - EMNLP 2016

Madhav Nimishakavi, Uday Singh Saini, Partha P. Talukdar
"Open Information Extraction Systems and Downstream Applications" - IJCAI 2016

Mausam
"Demonyms and Compound Relational Nouns in Nominal Open IE" - AKBC@NAACL-HLT 2016

Harinder Pal, Mausam
"A Rule Based Open Information Extraction Method Using Cascaded Finite-State Transducer" - PAKDD 2016

Hailun Lin, Yuanzhuo Wang, Peng Zhang, Weiping Wang, Yinliang Yue, Zheng Lin
"Getting More Out Of Syntax with PropS" - CoRR (2016)

Gabriel Stanovsky, Jessica Ficler, Ido Dagan, Yoav Goldberg
"Improving Open Information Extraction for Semantic Web Tasks" - Trans. Computational Collective Intelligence 21, 2016

Cheikh Kacfah Emani, Catarina Ferreira Da Silva, Bruno Fiés, Parisa Ghodous
"An Informativeness Approach to Open IE Evaluation" - CICLing 2016 (slides, code + data)

William Léchelle, Philippe Langlais

2017

"MinIE: Minimizing Facts in Open Information Extraction" - EMNLP 2017 (code, poster, all resources)

Kiril Gashteovski, Rainer Gemulla, Luciano Del Corro
"Answering Complex Questions Using Open Information Extraction" - ACL 2017

Tushar Khot, Ashish Sabharwal, Peter Clark
"Pocket Knowledge Base Population" - ACL 2017

Travis Wolfe, Mark Dredze, Benjamin Van Durme
"Bootstrapping for Numerical Open IE" - ACL 2017

Swarnadeep Saha, Harinder Pal, Mausam
"MT/IE: Cross-lingual Open Information Extraction with Neural Sequence-to-Sequence Models" - EACL 2017 (code)

Kevin Duh, Benjamin Van Durme, Sheng Zhang
"Open Relation Extraction for Support Passage Retrieval: Merit and Open Issues" - SIGIR 2017

Amina Kadry, Laura Dietz
"Syntactic Representation Learning for Open Information Extraction on Web" - WWW 2017

Chengsen Ru, Jintao Tang, Shasha Li, Ting Wang
"MetaPAD: Meta Pattern Discovery from Massive Text Corpora" (code) - KDD 2017

Meng Jiang, Jingbo Shang, Taylor Cassidy, Xiang Ren, Lance M. Kaplan, Timothy P. Hanratty, Jiawei Han
"RelVis: Benchmarking OpenIE Systems" - ISWC 2017

Rudolf Schneider, Tom Oberhauser, Tobias Klatt, Felix A. Gers, Alexander Löser
"A Consolidated Open Knowledge Representation for Multiple Texts" - LSDSem@EACL 2017

Rachel Wities, Vered Shwartz, Gabriel Stanovsky, Meni Adler, Ori Shapira, Shyam Upadhyay, Dan Roth, Eugenio Martínez-Cámara, Iryna Gurevych, Ido Dagan
"Open Relation Extraction and Grounding" - IJCNLP 2017

Dian Yu, Lifu Huang, Heng Ji
"Selective Decoding for Cross-lingual Open Information Extraction" - IJCNLP(1) 2017

Sheng Zhang, Kevin Duh, Benjamin Van Durme
"An assessment of open relation extraction systems for the semantic web" - Inf. Syst. 71, 2017

Amal Zouaq, Michel Gagnon, Ludovic Jean-Louis
"An Evaluation of PredPatt and Open IE via Stage 1 Semantic Role Labeling" - IWCS 2017

Sheng Zhang, Rachel Rudinger, Benjamin Van Durme
"Discovering Relational Phrases for Qualia Roles Through Open Information Extraction" - KESW 2017

Giovanni Siragusa, Valentina Leone, Luigi Di Caro, Claudio Schifanella
"Open Relation Extraction Based on Core Dependency Phrase Clustering" - DSC 2017

Chengsen Ru, Shasha Li, Jintao Tang, Yi Gao, Ting Wang
"Analysing Errors of Open Information Extraction Systems" - Workshop on Building Linguistically Generalizable NLP Systems @ EMNLP 2017

Rudolf Schneider, Tom Oberhauser, Tobias Klatt, Felix A. Gers, Alexander Löser

2018

"Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction" - WSDM 2018

Mingming Sun, Xu Li, Xin Wang, Miao Fan, Yue Feng, Ping Li
"Assertion-Based QA With Question-Aware Open Information Extraction" - AAAI 2018

Zhao Yan, Duyu Tang, Nan Duan, Shujie Liu, Wendi Wang, Daxin Jiang, Ming Zhou, Zhoujun Li
"Neural Open Information Extraction" - ACL 2018

Lei Cui, Furu Wei, Ming Zhou
"Supervised Open Information Extraction" - NAACL-HLT 2018

Gabriel Stanovsky, Julian Michael, Luke Zettlemoyer, Ido Dagan
"Logician and Orator: Learning from the Duality between Language and Knowledge in Open Domain" - EMNLP 2018

Mingming Sun, Xu Li, Ping Li
"Open Information Extraction from Conjunctive Sentences" - COLING 2018

Swarnadeep Saha, Mausam
"Graphene: Semantically-Linked Propositions in Open Information Extraction" - COLING 2018 (code, documentation)

Matthias Cetto, Christina Niklaus, André Freitas, Siegfried Handschuh
"Open Information Extraction on Scientific Text: An Evaluation" - COLING 2018

Paul T. Groth, Michael Lauruhn, Antony Scerri, Ron Daniel
"A Survey on Open Information Extraction" - COLING 2018

Christina Niklaus, Matthias Cetto, André Freitas, Siegfried Handschuh
"StuffIE: Semantic Tagging of Unlabeled Facets Using Fine-Grained Information Extraction" - CIKM 2018

Radityo Eko Prasojo, Mouna Kacimi, Werner Nutt
"Towards Practical Open Knowledge Base Canonicalization" - CIKM 2018

Tien-Hsuan Wu, Zhiyong Wu, Ben Kao, Pengcheng Yin
"Open Information Extraction with Global Structure Constraints" - WWW 2018

Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Frank F. Xu, Jiawei Han
"CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information" - WWW 2018 (code)

Shikhar Vashishth, Prince Jain, Partha Talukdar
"Revisiting the Task of Scoring Open IE Relations" (poster) - LREC 2018

William Léchelle, Philippe Langlais
"Employing Semantic Context for Sparse Information Extraction Assessment" - TKDD 2018 (resources)

Pei-Pei Li, Haixun Wang, Hongsong Li, Xindong Wu
"Open Information Extraction with Meta-pattern Discovery in Biomedical Literature" - BCB 2018

Xuan Wang, Yu Zhang, Qi Li, Yinyin Chen, Jiawei Han
"Modeling and Summarizing News Events Using Semantic Triples" - ESWC 2018

Radityo Eko Prasojo, Mouna Kacimi, Werner Nutt
"Disambiguating Open IE: Identifying Semantic Similarity in Relation Extraction by Word Embeddings" - PROPOR 2018

Leandro M. P. Sanches, Victor S. Cardel, Larissa S. Machado, Marlo Souza, Laís do Nascimento Salvador
"Task-Oriented Evaluation of Dependency Parsing with Open Information Extraction" - PROPOR 2018

Pablo Gamallo, Marcos Garcia
"Challenges of an Annotation Task for Open Information Extraction in Portuguese" - PROPOR 2018

Rafael Glauber, Leandro Souza de Oliveira, Cleiton Fernando Lima Sena, Daniela Barreiro Claro, Marlo Souza
"A systematic mapping study on open information extraction" - Expert Syst. Appl. 2018

Rafael Glauber, Daniela Barreiro Claro
"Self-training on refined clause patterns for relation extraction" - Inf. Process. Manage. 54(4): 686-706 (2018)

Duc-Thuan Vo, Ebrahim Bagheri
"Supervised Neural Models Revitalize the Open Relation Extraction" - CoRR 2018

Shengbin Jia, Yang Xiang, Xiaojun Chen
"Chinese Open Relation Extraction and Knowledge Base Establishment" - ACM Trans. Asian & Low-Resource Lang. Inf. Process. 2018 (slides, code)

Shengbin Jia, Shijia E, Maozhen Li, Yang Xiang
"Rule-based Indonesian Open Information Extraction" - ICAICTA 2018

Ade Romadhony, Ayu Purwarianti, Dwi H. Widyantoro
"WiRe57 : A Fine-Grained Benchmark for Open Information Extraction" - CoRR 2018

William Léchelle, Fabrizio Gotti, Philippe Langlais
"Uncovering Algorithmic Approaches in Open Information Extraction: A Literature Review" - 30th Benelux Conference on Artificial Intelligence 2018

Injy Sarhan, Marco Spruit

2019

"OPIEC: An Open Information Extraction Corpus" - AKBC 2019 (data + resources, code (data reading), code (pipeline))

Kiril Gashteovski, Sebastian Wanner, Sven Hertling, Samuel Broscheit, Rainer Gemulla
"MinScIE: Citation-centered Open Information Extraction" - JCDL 2019 (code)

Anne Lauscher, Yide Song, Kiril Gashteovski
"EAL: A Toolkit and Dataset for Entity-Aspect Linking" - JCDL 2019 (data, code, demo)

Federico Nanni, Jingyi Zhang, Ferdinand Betz, Kiril Gashteovski
"Integrating Local Context and Global Cohesiveness for Open Information Extraction" - WSDM 2019 (code)

Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Ahmed El-Kishky, Jiawei Han
"Open Information Extraction from Question-Answer Pairs" - NAACL 2019

Nikita Bhutani, Yoshihiko Suhara, Wang-Chiew Tan, Alon Halevy and H V Jagadish
"OpenKI: Integrating Open Information Extraction and Knowledge Bases with Relation Inference" - NAACL 2019 (data)

Dongxu Zhang, Subhabrata Mukherjee, Colin Lockard, Xin Luna Dong, Andrew McCallum
"OpenCeres: When Open Information Extraction Meets the Semi-Structured Web" - NAACL 2019 (video, slides, data)

Colin Lockard, Prashant Shiralkar and Xin Luna Dong
"Improving Open Information Extraction via Iterative Rank-Aware Learning" - ACL 2019 (code)

Zhengbao Jiang, Pengcheng Yin and Graham Neubig
"Open Relation Extraction: Relational Knowledge Transfer from Supervised Data to Unsupervised Data" - EMNLP 2019

Ruidong Wu, Yuan Yao, Xu Han, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin and Maosong Sun
"Supervising Unsupervised Open Information Extraction Models" - EMNLP 2019

Arpita Roy, Youngja Park, Taesung Lee and Shimei Pan
"CaRB: A Crowdsourced Benchmark for Open IE" - EMNLP 2019 (code and data)

Sangnie Bhardwaj, Samarth Aggarwal and Mausam
"CaRe: Open Knowledge Graph Embeddings" - EMNLP 2019 (code)

Swapnil Gupta, Sreyash Kenkre, Partha Talukdar
"Collaborative Policy Learning for Open Knowledge Graph Reasoning" - EMNLP 2019 (code)

Cong Fu, Tong Chen, Meng Qu, Woojeong Jin, Xiang Ren
"Multi-Input Multi-Output Sequence Labeling for Joint Extraction of Fact and Condition Tuples from Scientific Text" - EMNLP 2019

Tianwen Jiang, Tong Zhao, Bing Qin, Ting Liu, Nitesh Chawla, Meng Jiang
"The Role of "Condition": A Novel Scientific Knowledge Graph Representation and Construction Model" - KDD 2019

Tianwen Jiang, Tong Zhao, Bing Qin, Ting Liu, Nitesh V. Chawla, Meng Jiang
"Canonicalization of Open Knowledge Bases with Side Information from the Source Text" - ICDE 2019

Xueling Lin, Lei Chen
"Open Relation Extraction for Chinese Noun Phrases" - TKDE 2019

Chengyu Wang, Xiaofeng He, Aoying Zhou
"Divide and Extract – Disentangling Clause Splitting and Proposition Extraction" - RANLP 2019

Darina Gold, Torsten Zesch
"Exploiting Open IE for Deriving Multiple Premises Entailment Corpus" - RANLP 2019

Martin Víta, Jakub Klímek
"Lexicon-Grammar based Open Information Extraction from Natural Language Sentences in Italian" - Expert Systems with Applications 2019

Raffaele Guarasci, Emanuele Damiano, Aniello Minutolo, Massimo Esposito, Giuseppe De Pietro
"Weakly Supervised, Data-Driven Acquisition of Rules for Open Information Extraction" - CAIAC 2019

Fabrizio Gotti, Philippe Langlais
"Aligning Open IE Relations and KB Relations using a Siamese Network Based on Word Embedding" - ICCS 2019

Rifki Afina Putri, Giwon Hong, Sung-Hyon Myaeng
"Contextualized Word Embeddings in a Neural Open Information Extraction Model" - NLDB 2019

Injy Sarhan, Marco R. Spruit
"Multilingual Open Information Extraction: Challenges and Opportunities" - Information 10(7): 228, 2019

Daniela Barreiro Claro, Marlo Souza, Clarissa Castellã Xavier, Leandro Souza de Oliveira
"CTGA: Graph-based Biomedical Literature Search" - IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2019

Tianwen Jiang, Zhihan Zhang, Tong Zhao, Bing Qin, Ting Liu, Nitesh V. Chawla, Meng Jiang
"When Lexicon-Grammar Meets Open Information Extraction: a Computational Experiment for Italian Sentences" - CLiC-it 2019

Raffaele Guarasci, Emanuele Damiano, Aniello Minutolo, Massimo Esposito
"Towards a gold standard dataset for Open Information Extraction in Italian" - SNAMS 2019

Raffaele Guarasci, Emanuele Damiano, Aniello Minutolo, Massimo Esposito
"Co-Clustering Triples from Open Information Extraction" - COMAD 2019

Koninika Pal, Vinh Thinh Ho, Gerhard Weikum
"Coherence and Salience-Based Multi-Document Relationship Mining" - APWeb-WAIM 2019

Yongpan Sheng, Zenglin Xu
"Learning Open Information Extraction of Implicit Relations from Reading Comprehension Datasets" - CoRR 2019

Jacob Beckerman, Theodore Christakis

2020

"Systematic Comparison of Neural Architectures and Training Approaches for Open Information Extraction" - EMNLP 2020

Patrick Hohenecker, Frank Mtumbuka, Vid Kocijan, Thomas Lukasiewicz
"A Predicate-Function-Argument Annotation of Natural Language for Open-Domain Information eXpression" - EMNLP 2020 (resources)

Mingming Sun, Wenyue Hua, Zoey Liu, Xin Wang, Kangjie Zheng, Ping Li
"SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction" - EMNLP 2020

Xuming Hu, Chenwei Zhang, Yusong Xu, Lijie Wen, Philip S. Yu
"OpenIE6: Iterative Grid Labeling and Coordination Analysis for Open Information Extraction" (code) - EMNLP 2020

Keshav Kolluru, Vaibhav Adlakha, Samarth Aggarwal, Mausam, Soumen Chakrabarti
"Multi2OIE: Multilingual Open Information Extraction based on Multi-Head Attention with BERT" (code) - EMNLP 2020

Youngbin Ro, Yukyung Lee, Pilsung Kang
"On Aligning OpenIE Extractions with Knowledge Bases: A Case Study" (video, slides, resources) - Eval4NLP@EMNLP 2020

Kiril Gashteovski, Rainer Gemulla, Bhushan Kotnis, Sven Hertling, Christian Meilicke
"IMoJIE: Iterative Memory-Based Joint Open Information Extraction" (code, video) - ACL 2020

Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti
"Can We Predict New Facts with Open Knowledge Graph Embeddings? A Benchmark for Open Link Prediction" (resources, video) - ACL 2020

Samuel Broscheit, Kiril Gashteovski, Yanjie Wang, Rainer Gemulla
"Learning Interpretable Relationships between Entities, Relations and Concepts via Bayesian Structure Learning on Open Domain Facts" (video) - ACL 2020

Jingyuan Zhang, Mingming Sun, Yue Feng, Ping Li
"In Layman’s Terms: Semi-Open Relation Extraction from Scientific Texts" (code, video) - ACL 2020

Ruben Kruiper, Julian Vincent, Jessica Chen-Burger, Marc Desmulliez, Ioannis Konstas
"Span Model for Open Information Extraction on Accurate Corpus" (code) - AAAI 2020

Junlang Zhan, Hai Zhao
"LOREM: Language-consistent Open Relation Extraction from Unstructured Text" (code) - WWW 2020

Tom Harting, Sepideh Mesbah, Christoph Lofi
"Extracting Knowledge from Web Text with Monte Carlo Tree Search" - WWW 2020

Guiliang Liu, Xu Li, Jiakang Wang, Mingming Sun, Ping Li
"MULCE: Multi-level Canonicalization with Embeddings of Open Knowledge Bases" - WISE 2020

Tien-Hsuan Wu, Ben Kao, Zhiyong Wu, Xiyang Feng, Qianli Song, Cheng Chen
"An Advantage Actor-Critic Algorithm with Confidence Exploration for Open Information Extraction" - SDM 2020

Guiliang Liu, Xu Li, Mingming Sun, Ping Li
"Chinese Open Relation Extraction with Pointer-Generator Networks" - DSC 2020

Ziheng Cheng, Xu Wu, Xiaqing Xie, Jingchen Wu
"Explainable OpenIE Classifier with Morpho-syntactic Rules" - HI4NLP@ECAI 2020

Bruno Cabral, Marlo Souza, Daniela Barreiro Claro
"Language Models are Open Knowledge Graphs" - CoRR 2020

Chenguang Wang, Xiao Liu, Dawn Song
"Hybrid Neural Tagging Model for Open Relation Extraction" - CoRR 2020 (data)

Shengbin Jia, Yang Xiang
"Canonicalizing Open Knowledge Bases with Multi-Layered Meta-Graph Neural Network" - CoRR 2020

Tianwen Jiang, Tong Zhao, Bing Qin, Ting Liu, Nitesh V. Chawla, Meng Jiang
"Tag and Correct: Question Aware Open Information Extraction with Two-Stage Decoding" - CoRR 2020

Martin Kuo, Yaobo Liang, Lei Ji, Nan Duan, Linjun Shou, Ming Gong, Peng Chen
"Abstractive Query Focused Summarization with Query-Free Resources" - CoRR 2020

Yumo Xu, Mirella Lapata
"Can We Survive without Labelled Data in NLP? Transfer Learning for Open Information Extraction" - Applied Sciences 2020

Injy Sarhan, Marco Spruit
"Open Information Extraction for Knowledge Graph Construction" - DEXA 2020

Iqra Muhammad, Anna Kearney, Carrol Gamble, Frans Coenen, Paula Williamson

2021

"CoRI: Collective Relation Integration with Data Augmentation for Open Information Extraction" - ACL 2021

Zhengbao Jiang, Jialong Han, Bunyamin Sisman, Xin Luna Dong
"DocOIE: A Document-level Context-Aware Dataset for OpenIE" - ACL 2021

Kuicai Dong, Zhao Yilin, Aixin Sun, Jung-Jae Kim, Xiaoli Li
"OKGIT: Open Knowledge Graph Link Prediction with Implicit Types" - ACL 2021

Chandrahas, Partha Pratim Talukdar
"Maximal Clique Based Non-Autoregressive Open Information Extraction" - EMNLP 2021

Bowen Yu, Yucheng Wang, Tingwen Liu, Hongsong Zhu, Limin Sun, Bin Wang
"Zero-Shot Information Extraction as a Unified Text to Triple Translation" - EMNLP 2021 (code)

Chenguang Wang, Xiao Liu, Zui Chen, Haoyun Hong, Jie Tang, Dawn Song
"Open Knowledge Graphs Canonicalization using Variational Autoencoders" - EMNLP 2021 (code)

Sarthak Dash, Gaetano Rossiello, Nandana Mihindukulasooriya, Sugato Bagchi, Alfio Gliozzo
"LSOIE: A Large-Scale Dataset for Supervised Open Information Extraction" - EACL 2021 (code and data)

Jacob Solawetz, Stefan Larson
"Open Hierarchical Relation Extraction" - NAACL 2021 (code)

Kai Zhang, Yuan Yao, Ruobing Xie, Xu Han, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun
"Semi-Open Information Extraction" - WWW 2021

Bowen Yu, Zhenyu Zhang, Jiawei Sheng, Tingwen Liu, Yubin Wang, Yucheng Wang, Bin Wang
"Joint Open Knowledge Base Canonicalization and Linking" - SIGMOD 2021

Yinan Liu, Wei Shen, Yuanfei Wang, Jianyong Wang, Zhenglu Yang, Xiaojie Yuan
"TENET: Joint Entity and Relation Linking with Coherence Relaxation" - SIGMOD 2021

Xueling Lin, Lei Chen, Chaorui Zhang
"Multi-Grained Dependency Graph Neural Network for Chinese Open Information Extraction" - PAKDD 2021

Zhiheng Lyu, Kaijie Shi, Xin Li, Lei Hou, Juanzi Li, Binheng Song
"CaSIE: Canonicalize and Informative Selection of the OpenIE system" - ICDE 2021

Hao Xin, Xueling Lin, Lei Chen
"PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation" - Student Research Workshop @ EACL 2021

Dimitris Papadopoulos, Nikolaos Papadakis, Nikolaos Matsatsinis
"Open-CyKG: An Open Cyber Threat Intelligence Knowledge Graph" - Knowledge-Based Systems 2021

Injy Sarhan, Marco Spruit
"Universal Dependencies for Multilingual Open Information Extraction" - 3rd Conference on Language, Data and Knowledge, LDK 2021

Massinissa Atmani, Mathieu Lafourcade

2022

"BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation" - ACL 2022 (code)

Kiril Gashteovski, Mingying Yu, Bhushan Kotnis, Carolin Lawrence, Mathias Niepert, Goran Glavaš
"MILIE: Modular & Iterative Multilingual Open Information Extraction" - ACL 2022

Bhushan Kotnis, Kiril Gashteovski, Daniel Rubio, Ammar Shaker, Vanesa Rodriguez-Tembras, Makoto Takamoto, Mathias Niepert, Carolin Lawrence
"Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction" - ACL 2022 (code)

Keshav Kolluru, Muqeeth Mohammed, Shubham Mittal, Soumen Chakrabarti, Mausam
"OIE@OIA: an Adaptable and Efficient Open Information Extraction Framework" - ACL 2022

Xin Wang, Minlong Peng, Mingming Sun, Ping Li
"Open Relation Modeling: Learning to Define Relations between Entities" - ACL 2022 (code)

Jie Huang, Kevin Chang, Jinjun Xiong, Wen-mei Hwu
"DeepStruct: Pretraining of Language Models for Structure Prediction" - ACL 2022 (code)

Chenguang Wang, Xiao Liu, Zui Chen, Haoyun Hong, Jie Tang, Dawn Song
"AnnIE: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark" - ACL 2022 (code)

Niklas Friedrich, Kiril Gashteovski, Mingying Yu, Bhushan Kotnis, Carolin Lawrence, Mathias Niepert, Goran Glavaš
"CompactIE: Compact Facts in Open Information Extraction" - NAACL 2022 (code)

Farima Fatahi Bayat, Nikita Bhutani, H. V. Jagadish
"DetIE: Multilingual Open Information Extraction Inspired by Object Detection" - AAAI 2022 (code)

Michael Vasilkovsky, Anton Alekseev, Valentin Malykh, Ilya Shenbin, Elena Tutubalina, Dmitriy Salikhov, Mikhail Stepnov, Andrey Chertok, Sergey I. Nikolenko
"A Survey on Neural Open Information Extraction: Current Status and Future Directions" - IJCAI 2022

Shaowen Zhou, Bowen Yu, Aixin Sun, Cheng Long, Jingyang Li, Jian Sun
"Open Information Extraction from 2007 to 2022 – A Survey" - CoRR 2022

Pai Liu, Wenyang Gao, Wenjie Dong, Songfang Huang, Yue Zhang
"PortNOIE: A Neural Framework for Open Information Extraction for the Portuguese Language" - PROPOR 2022: 15th International Conference on Computational Processing of Portuguese

Bruno Cabral, Marlo Souza, Daniela Barreiro Claro
"LILLIE: Information extraction and database integration using linguistics and learning-based algorithms" - Information Systems 2022

Ellery Smith, Dimitris Papadopoulos, Martin Braschler, Kurt Stockinger

Papers grouped by category

Surveys

"Open Information Extraction Systems and Downstream Applications" - IJCAI 2016

Mausam
"A Survey on Open Information Extraction" - COLING 2018

Christina Niklaus, Matthias Cetto, André Freitas, Siegfried Handschuh
"A systematic mapping study on open information extraction" - Expert Syst. Appl. 2018

Rafael Glauber, Daniela Barreiro Claro
"Uncovering Algorithmic Approaches in Open Information Extraction: A Literature Review" - 30th Benelux Conference on Artificial Intelligence 2018

Injy Sarhan, Marco Spruit
"Multilingual Open Information Extraction: Challenges and Opportunities" - Information 10(7): 228, 2019

Daniela Barreiro Claro, Marlo Souza, Clarissa Castellã Xavier, Leandro Souza de Oliveira
"A Survey on Neural Open Information Extraction: Current Status and Future Directions" - IJCAI 2022

Shaowen Zhou, Bowen Yu, Aixin Sun, Cheng Long, Jingyang Li, Jian Sun
"Open Information Extraction from 2007 to 2022 – A Survey" - CoRR 2022

Pai Liu, Wenyang Gao, Wenjie Dong, Songfang Huang, Yue Zhang

Evaluation

"Creating a Large Benchmark for Open Information Extraction" - EMNLP 2016 (code, talk)

Gabriel Stanovsky, Ido Dagan
"An Informativeness Approach to Open IE Evaluation" - CICLing 2016 (slides, code + data)

William Léchelle, Philippe Langlais
"An Evaluation of PredPatt and Open IE via Stage 1 Semantic Role Labeling" - IWCS 2017

Sheng Zhang, Rachel Rudinger, Benjamin Van Durme
"An assessment of open relation extraction systems for the semantic web" - Inf. Syst. 71, 2017

Amal Zouaq, Michel Gagnon, Ludovic Jean-Louis
"RelVis: Benchmarking OpenIE Systems" - ISWC 2017

Rudolf Schneider, Tom Oberhauser, Tobias Klatt, Felix A. Gers, Alexander Löser
"Analysing Errors of Open Information Extraction Systems" - Workshop on Building Linguistically Generalizable NLP Systems @ EMNLP 2017

Rudolf Schneider, Tom Oberhauser, Tobias Klatt, Felix A. Gers, Alexander Löser
"Open Information Extraction on Scientific Text: An Evaluation" - COLING 2018

Paul T. Groth, Michael Lauruhn, Antony Scerri, Ron Daniel
"WiRe57 : A Fine-Grained Benchmark for Open Information Extraction" - CoRR 2018

William Léchelle, Fabrizio Gotti, Philippe Langlais
"CaRB: A Crowdsourced Benchmark for Open IE" - EMNLP 2019 (code and data)

Sangnie Bhardwaj, Samarth Aggarwal and Mausam
"Towards a gold standard dataset for Open Information Extraction in Italian" - SNAMS 2019

Raffaele Guarasci, Emanuele Damiano, Aniello Minutolo, Massimo Esposito
"Systematic Comparison of Neural Architectures and Training Approaches for Open Information Extraction" - EMNLP 2020

Patrick Hohenecker, Frank Mtumbuka, Vid Kocijan, Thomas Lukasiewicz
"On Aligning OpenIE Extractions with Knowledge Bases: A Case Study" (video, slides, resources) - Eval4NLP@EMNLP 2020

Kiril Gashteovski, Rainer Gemulla, Bhushan Kotnis, Sven Hertling, Christian Meilicke
"BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation" - ACL 2022 (code)

Kiril Gashteovski, Mingying Yu, Bhushan Kotnis, Carolin Lawrence, Mathias Niepert, Goran Glavaš

OIE for downstream applications

OIE output has been shown to be useful input for many downstream tasks. This section lists several downstream tasks that have benefited from it.

Question Answering

"Triple-Fact Retriever: An explainable reasoning retrieval model for multi-hop QA problem" - ICDE 2022

Chengmin Wu, Enrui Hu, Ke Zhan, Lan Luo, Xinyu Zhang, Hao Jiang, Qirui Wang, Zhao Cao, Fan Yu, Lei Chen
"Guiding the Growth: Difficulty-Controllable Question Generation through Step-by-Step Rewriting" - ACL 2021

Yi Cheng, Siyao Li, Bang Liu, Ruihui Zhao, Sujian Li, Chenghua Lin, Yefeng Zheng
"Using Local Knowledge Graph Construction to Scale Seq2Seq Models to Multi-Document Inputs" - EMNLP 2019

Angela Fan, Claire Gardent, Chloé Braud, Antoine Bordes
"Assertion-based QA with Question-Aware Open Information Extraction" - AAAI 2018

Zhao Yan, Duyu Tang, Nan Duan, Shujie Liu, Wendi Wang, Daxin Jiang, Ming Zhou, Zhoujun Li
"Answering Complex Questions Using Open Information Extraction" - ACL 2017

Tushar Khot, Ashish Sabharwal, Peter Clark
"Paraphrase-Driven Learning for Open Question Answering" - ACL 2013

Anthony Fader, Luke S. Zettlemoyer, Oren Etzioni

Slot Filling

"Open Information Extraction to KBP Relations in 3 Hours" - TAC 2013

Stephen Soderland, John Gilmer, Robert Bart, Oren Etzioni, Daniel S. Weld
"Leveraging Linguistic Structure For Open Domain Information Extraction" - ACL 2015 (code (Java), code (Python))

Gabor Angeli, Melvin Jose Johnson Premkumar, Christopher D. Manning
"University of Washington System for 2015 KBP Cold Start Slot Filling" - TAC 2015

Stephen Soderland, Natalie Hawkins, Gene L. Kim, Daniel S. Weld
"Combining Open IE and Distant Supervision for KBP Slot Filling" - TAC 2015

Stephen Soderland, Natalie Hawkins, John Gilmer, Daniel S. Weld
"Open Relation Extraction and Grounding" - IJCNLP 2017

Dian Yu, Lifu Huang, Heng Ji

Event Extraction

"Generating Coherent Event Schemas at Scale" - EMNLP 2013

Niranjan Balasubramanian, Stephen Soderland, Mausam, Oren Etzioni
"Cross-document Event Identity via Dense Annotation" - CoNLL 2021

Adithya Pratapa, Zhengzhong Liu, Kimihiro Hasegawa, Linwei Li, Yukari Yamakawa, Shikun Zhang, Teruko Mitamura

Relation Linking

"TENET: Joint Entity and Relation Linking with Coherence Relaxation" - SIGMOD 2021

Xueling Lin, Lei Chen, Chaorui Zhang
"Capturing Knowledge in Semantically-typed Relational Patterns to Enhance Relation Linking" - K-CAP 2017

Kuldeep Singh, Isaiah Onando Mulang', Ioanna Lytra, Mohamad Yaser Jaradeh, Ahmad Sakor, Maria-Esther Vidal, Christoph Lange, Sören Auer

Open Link Prediction

"OKGIT: Open Knowledge Graph Link Prediction with Implicit Types" - ACL 2021

Chandrahas, Partha Pratim Talukdar
"Can We Predict New Facts with Open Knowledge Graph Embeddings? A Benchmark for Open Link Prediction" (resources, video) - ACL 2020

Samuel Broscheit, Kiril Gashteovski, Yanjie Wang, Rainer Gemulla

Relation Extraction

"RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information" - EMNLP 2018

Shikhar Vashishth, Rishabh Joshi, Sai Suman Prayaga, Chiranjib Bhattacharyya, Partha Talukdar

Relating Entities

"Relating Legal Entities via Open Information Extraction" - MTSR 2018

Giovanni Siragusa, Rohan Nanda, Valeria De Paiva, Luigi Di Caro

Story Comprehension

"Enhanced Story Comprehension for Large Language Models through Dynamic Document-Based Knowledge Graphs" - AAAI 2022

Berkeley Andrus, Yeganeh Nasiri, Jay Cui, Ben Cullen, Nancy Fulda

Video Grounding

"Interventional Video Grounding With Dual Contrastive Learning" - CVPR 2021

Guoshun Nan, Rui Qiao, Yao Xiao, Jun Liu, Sicong Leng, Hao Zhang, Wei Lu

OIE in Different Languages

Most OIE systems focus on extracting from English text. However, some OIE systems either target a language other than English or are multilingual. This section lists both kinds of systems.

Multilingual OIE Systems

"MILIE: Modular & Iterative Multilingual Open Information Extraction" - ACL 2022

Bhushan Kotnis, Kiril Gashteovski, Daniel Rubio, Ammar Shaker, Vanesa Rodriguez-Tembras, Makoto Takamoto, Mathias Niepert, Carolin Lawrence
"Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction" - ACL 2022 (code)

Keshav Kolluru, Muqeeth Mohammed, Shubham Mittal, Soumen Chakrabarti, Mausam
"DetIE: Multilingual Open Information Extraction Inspired by Object Detection" - AAAI 2022 (code)

Michael Vasilkovsky, Anton Alekseev, Valentin Malykh, Ilya Shenbin, Elena Tutubalina, Dmitriy Salikhov, Mikhail Stepnov, Andrey Chertok, Sergey I. Nikolenko
"Multi2OIE: Multilingual Open Information Extraction based on Multi-Head Attention with BERT" (code) - EMNLP 2020

Youngbin Ro, Yukyung Lee, Pilsung Kang
"LOREM: Language-consistent Open Relation Extraction from Unstructured Text" (code) - WWW 2020

Tom Harting, Sepideh Mesbah, Christoph Lofi
"Explainable OpenIE Classifier with Morpho-syntactic Rules" - HI4NLP@ECAI 2020

Bruno Cabral, Marlo Souza, Daniela Barreiro Claro
"Multilingual Open Information Extraction: Challenges and Opportunities" - Information 10(7): 228, 2019

Daniela Barreiro Claro, Marlo Souza, Clarissa Castellã Xavier, Leandro Souza de Oliveira
"Multilingual Open Relation Extraction Using Cross-lingual Projection" - HLT-NAACL 2015

Manaal Faruqui, Shankar Kumar
"MT/IE: Cross-lingual Open Information Extraction with Neural Sequence-to-Sequence Models" - EACL 2017 (code)

Kevin Duh, Benjamin Van Durme, Sheng Zhang
"Multilingual Open Information Extraction" - EPIA 2015

Pablo Gamallo, Marcos García

OIE Systems for German Language

"GerIE - An Open Information Extraction System for the German Language" - J. UCS 2018

Akim Bassa, Mark Kröll, Roman Kern
"Porting an Open Information Extraction System from English to German" - EMNLP 2016 (code)

Tobias Falke, Gabriel Stanovsky, Iryna Gurevych, Ido Dagan

OIE Systems for Portuguese Language

"Challenges of an Annotation Task for Open Information Extraction in Portuguese" - PROPOR 2018

Rafael Glauber, Leandro Souza de Oliveira, Cleiton Fernando Lima Sena, Daniela Barreiro Claro, Marlo Souza
"Inference Approach to Enhance a Portuguese Open Information Extraction" - ICEIS 2017

Cleiton Fernando Lima Sena, Rafael Glauber, Daniela Barreiro Claro
"DependentIE: An Open Information Extraction system on Portuguese by a Dependence Analysis" - ENIAC 2017

Leandro Souza de Oliveira, Rafael Glauber, Daniela Barreiro Claro
"PortNOIE: A Neural Framework for Open Information Extraction for the Portuguese Language" - PROPOR 2022

Bruno Cabral, Marlo Souza, Daniela Barreiro Claro

OIE Systems for Spanish Language

"Open Information Extraction for Spanish Language based on Syntactic Constraints" - ACL (Student Research Workshop) (2014)

Alisa Zhila, Alexander Gelbukh

OIE Systems for Chinese Language

"ZORE: A Syntax-based System for Chinese Open Relation Extraction" - EMNLP 2014

Likun Qiu, Yue Zhang
"Chinese Open Relation Extraction and Knowledge Base Establishment" - ACM Trans. Asian & Low-Resource Lang. Inf. Process. 2018 (slides, code)

Shengbin Jia, Shijia E, Maozhen Li, Yang Xiang
"Open Relation Extraction for Chinese Noun Phrases" - TKDE 2019

Chengyu Wang, Xiaofeng He, Aoying Zhou
"Chinese Open Relation Extraction with Pointer-Generator Networks" - DSC 2020

Ziheng Cheng, Xu Wu, Xiaqing Xie, Jingchen Wu
"Multi-Grained Dependency Graph Neural Network for Chinese Open Information Extraction" - PAKDD 2021

Zhiheng Lyu, Kaijie Shi, Xin Li, Lei Hou, Juanzi Li, Binheng Song

OIE Systems for Persian Language

"RePersian: An Efficient Open Information Extraction Tool in Persian" - ICWR 2020

Raana Saheb-Nassagh, Majid Asgari, Behrouz Minaei-Bidgoli
"A recursive algorithm for open information extraction from Persian texts" - IJCAT 2018

Mahmoud Rahat, Alireza Talebpour, Seyedamin Monemian
"Open information extraction as an intermediate semantic structure for Persian text summarization" - Int. J. on Digital Libraries (2018)

Mahmoud Rahat, Alireza Talebpour
"Parsa: An open information extraction system for Persian" - DSH 2018

Mahmoud Rahat, Alireza Talebpour

OIE Systems for Italian Language

"Lexicon-Grammar based Open Information Extraction from Natural Language Sentences in Italian" - Expert Systems and Applications 2019

Raffaele Guarasci, Emanuele Damiano, Aniello Minutolo, Massimo Esposito, Giuseppe De Pietro
"Towards a gold standard dataset for Open Information Extraction in Italian" - SNAMS 2019

Raffaele Guarasci, Emanuele Damiano, Aniello Minutolo, Massimo Esposito

OIE Systems for Indonesian Language

"Rule-based Indonesian Open Information Extraction" - ICAICTA 2018

Ade Romadhony, Ayu Purwarianti, Dwi H. Widyantoro

OIE Systems for Greek Language

"PENELOPIE: Enabling Open Information Extraction for the Greek Language through Machine Translation" - Student Research Workshop @ EACL

Dimitris Papadopoulos, Nikolaos Papadakis, Nikolaos Matsatsinis

Supervised OIE

"Supervised Open Information Extraction" - NAACL-HLT 2018

Gabriel Stanovsky, Julian Michael, Luke Zettlemoyer, Ido Dagan
"Neural Open Information Extraction" - ACL 2018

Lei Cui, Furu Wei, Ming Zhou
"Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction" - WSDM 2018

Mingming Sun, Xu Li, Xin Wang, Miao Fan, Yue Feng, Ping Li
"Logician and Orator: Learning from the Duality between Language and Knowledge in Open Domain" - EMNLP 2018

Mingming Sun, Xu Li, Ping Li
"Supervising Unsupervised Open Information Extraction Models" - EMNLP 2019

Arpita Roy, Youngja Park, Taesung Lee, Shimei Pan
"Contextualized Word Embeddings in a Neural Open Information Extraction Model" - NLDB 2019

Injy Sarhan, Marco R. Spruit
"Weakly Supervised, Data-Driven Acquisition of Rules for Open Information Extraction" - CAIAC 2019

Fabrizio Gotti, Philippe Langlais
"Learning Open Information Extraction of Implicit Relations from Reading Comprehension Datasets" - CoRR 2019

Jacob Beckerman, Theodore Christakis
"Span Model for Open Information Extraction on Accurate Corpus" (code) - AAAI 2020

Junlang Zhan, Hai Zhao
"Extracting Knowledge from Web Text with Monte Carlo Tree Search" - WWW 2020

Guiliang Liu, Xu Li, Jiakang Wang, Mingming Sun, Ping Li
"An Advantage Actor-Critic Algorithm with Confidence Exploration for Open Information Extraction" - SDM 2020

Guiliang Liu, Xu Li, Mingming Sun, Ping Li
"Hybrid Neural Tagging Model for Open Relation Extraction" - CoRR 2020 (data)

Shengbin Jia, Yang Xiang
"IMoJIE: Iterative Memory-Based Joint Open Information Extraction" (code) - ACL 2020

Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti
"OpenIE6: Iterative Grid Labeling and Coordination Analysis for Open Information Extraction" (code) - EMNLP 2020

Keshav Kolluru, Vaibhav Adlakha, Samarth Aggarwal, Mausam, Soumen Chakrabarti
"Systematic Comparison of Neural Architectures and Training Approaches for Open Information Extraction" - EMNLP 2020

Patrick Hohenecker, Frank Mtumbuka, Vid Kocijan, Thomas Lukasiewicz
"Multi-Grained Dependency Graph Neural Network for Chinese Open Information Extraction" - PAKDD 2021

Zhiheng Lyu, Kaijie Shi, Xin Li, Lei Hou, Juanzi Li, Binheng Song

Canonicalization of OIE

"Canonicalizing Open Knowledge Bases" - CIKM 2014

Luis Galárraga, Geremy Heitz, Kevin Murphy, Fabian M. Suchanek
"CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information" - WWW 2018 (code)

Shikhar Vashishth, Prince Jain, Partha Talukdar
"Towards Practical Open Knowledge Base Canonicalization" - CIKM 2018

Tien-Hsuan Wu, Zhiyong Wu, Ben Kao, Pengcheng Yin
"CaRe: Open Knowledge Graph Embeddings" - EMNLP 2019 (code)

Swapnil Gupta, Sreyash Kenkre, Partha Talukdar
"Canonicalization of Open Knowledge Bases with Side Information from the Source Text" - ICDE 2019

Xueling Lin, Lei Chen
"MULCE: Multi-level Canonicalization with Embeddings of Open Knowledge Bases" - WISE 2020

Tien-Hsuan Wu, Ben Kao, Zhiyong Wu, Xiyang Feng, Qianli Song, Cheng Chen
"Canonicalizing Open Knowledge Bases with Multi-Layered Meta-Graph Neural Network" - CoRR 2020

Tianwen Jiang, Tong Zhao, Bing Qin, Ting Liu, Nitesh V. Chawla, Meng Jiang
"Open Knowledge Graphs Canonicalization using Variational Autoencoders" - EMNLP 2021 (code)

Sarthak Dash, Gaetano Rossiello, Nandana Mihindukulasooriya, Sugato Bagchi, Alfio Gliozzo
"Joint Open Knowledge Base Canonicalization and Linking" - SIGMOD 2021

Yinan Liu, Wei Shen, Yuanfei Wang, Jianyong Wang, Zhenglu Yang, Xiaojie Yuan
"CaSIE: Canonicalize and Informative Selection of the OpenIE system" - ICDE 2021

Hao Xin, Xueling Lin, Lei Chen
"Multi-View Clustering for Open Knowledge Base Canonicalization" - KDD 2022

Wei Shen, Yang Yang, Yinan Liu

Slides

[pdf] "Compact Open Information Extraction on Large Corpora". Talk by Kiril Gashteovski given at NEC Labs Europe GmbH, 2019.
[pdf] "(Information Extraction) Lecture 10 – Ontological and Open IE": A lecture on Open IE, which is part of the course "Information Extraction", by Prof. Dr. Alexander Fraser, from LMU München
Open IE Tutorial: Open Information Extraction for QA by André Freitas. The tutorial was presented at OKBQA 2018
[pdf] "Chinese Open Relation Extraction and Knowledge Base Establishment", 2018
[pdf] "Brief Introduction and Review of Open Information Extraction (Open-IE) Systems". Project Presentation by Sina Miran.
[pdf] "Open Information Extraction Systems and Downstream Applications" by Prof. Mausam. The talk was presented at IJCAI 2016
[pptx] "Open Information Extraction from the Web", presented by Prof. Oren Etzioni. The tutorial was presented at AKBC-WEKEX 2012
[pdf] "ClausIE: Clause-Based Open Information Extraction" by Luciano del Corro.
[pdf] "Open Information Extraction: the Second Generation"
[pdf] "Open Information Extraction: Where Are We Going?" by Claudio Delli Bovi, 2016
[pdf] "An Informativeness Approach to Open Information Extraction Evaluation" by William Léchelle, 2016

Talks

[video] "Open Information Extraction from the Web", by Prof. Oren Etzioni, presented at AKBC-WEKEX 2012. Slides: [pptx]
[video] "Open Information Extraction: Where Are We Going?", by Claudio Delli Bovi. The talk was given at AI2 in 2016. Slides: [pdf]
[video] "Nested Propositions in Open Information Extraction" by Nikita Bhutani at EMNLP 2016
[video] "Creating a Large Benchmark for Open Information Extraction" by Gabriel Stanovsky at EMNLP 2016
[video] "OpenCeres: When Open Information Extraction Meets the Semi-Structured Web" by Colin Lockard at NAACL 2019. Slides: [pdf]

Code

MinIE: Open Information Extraction System
- MinIE: originally written in Java
- Python wrapper for MinIE
- MinScIE - an Open Information Extraction system which provides structured knowledge enriched with semantic information about citations (based on MinIE).
- SalIE - Salient Open Information Extraction (based on MinIE)
ClausIE: Clause-based OIE
- ClausIE: originally written in Java
- ClausIE (mavenized version)
- ClausIEpy: Python wrapper for ClausIE
OpenIE at IIT Delhi:
- OpenIE6
- IMoJIE: a BERT-based OpenIE system
- OpenIE5
OpenIE at UW:
- OLLIE
- ReVerb
Stanford's OpenIE:
- Stanford OpenIE: Stanford's OpenIE system.
- Stanford OpenIE Spider: Extract Information from WebCorpus using Stanford Open Information Extraction.
- Python wrapper for Stanford OpenIE: The unofficial cross-platform Python wrapper for the state-of-the-art information extraction library from Stanford University.
Graphene: OpenIE system containing coreference resolution, simplification and open relation extraction pipeline
EXEMPLAR
DefIE: Open information extraction from textual definitions
ReMine: Integrating Local and Global Cohesiveness for Open Information Extraction
OIE systems for languages other than English or cross-lingual systems:
- Zhopenie - Chinese OIE: OIE system for Chinese language written in Python.
- Open Relation Extraction for Chinese: Knowledge triples extraction (entities and relations extraction) and knowledge base construction based on dependency syntax for open domain text (for Chinese)
- Baaz: Open information extraction from Persian web (Python)
- MT/IE: Cross-lingual Open IE. Attention-based sequence-to-sequence model for cross-lingual open IE. Written in Python
- Relation Extraction on German Websites: This repository holds a collection of three Open Information Extraction approaches for the German language
- DptOIE: A Portuguese Open Information Extraction system based on Dependency Analysis
- PragmaticOIE: a rule-based approach to extract facts in Portuguese in a first pragmatic level
CORE: Context-Aware Open Relation Extraction with Factorization Machines
CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information
IMPLIE: IMPLIE (IMPLicit relation Information Extraction) is a program that extracts binary relations from English sentences where the relationship between the two entities is not explicitly stated in the text.
Ranking: Iterative Rank-Aware Open IE (confidence score).
Zero-Shot Information Extraction as a Unified Text-to-Triple Translation: Implementation of Zero-Shot Information Extraction
Open-CyKG: An Open Cyber Threat Intelligence Knowledge Graph: Implementation of attention-based OIE, KG construction and canonicalization.
BenchIE: Open Information Extraction Evaluation Based on Facts, Not Tokens: Implementation of Benchmark for Open Information Extraction.
DetIE: Multilingual Open Information Extraction Inspired by Object Detection: Implementation of Multilingual Open Information Extraction Inspired by Object Detection.
LILLIE: Information extraction and database integration using linguistics and learning-based algorithms: Implementation of LILLIE: Information extraction and database integration using linguistics and learning-based algorithms.
CompactIE: Compact Facts in Open Information Extraction: Implementation of CompactIE.
DeepStruct: Pretraining of Language Models for Structure Prediction: Implementation of DeepStruct: Pretraining of Language Models for Structure Prediction.
Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction: Implementation of Alignment-Augmented Consistent Translation for Multilingual Open Information Extraction.

Data

OIE output serves as input to many downstream tasks, such as question answering, event schema induction, and inference-rule generation. It can also be used as "fuel" to derive further resources. Here, the data is organized into two major categories: 1) OIE corpora; 2) Resources derived from OIE output.

OIE corpora

OPIEC: An Open Information Extraction Corpus: the largest OIE corpus to date, containing more than 341M triples extracted from the entire English Wikipedia. Each triple in the corpus comes with rich metadata: each token from the subj / obj / rel along with NLP annotations (POS tag, NER tag, ...), provenance sentence along with the dependency parse, original (golden) links from Wikipedia, sentence order, space / time, etc.
[.gz] ReVerb extractions: 15 million high-precision OIE extractions (826MB compressed) from the OIE system ReVerb. The extractions were made from the ClueWeb09 corpus. The data contains (subject, relation, object) triples, accompanied by a confidence score (estimating the likelihood of whether the triple was correctly extracted) and provenance information (the link of the web-page where the triple was extracted from).
ReVerb extractions (linked): 3 million triples with linked argument (a subset of the 15 M high-precision ReVerb extractions). The links (to Freebase) are provided by an entity linker. The data fields are: argument 1, relation phrase, argument 2, freebase ID for argument 1 link, corresponding freebase entity name, link score, link ambiguity score
PATTY: PATTY is a system that takes open relations between two arguments, structures them into relational synsets and then organizes the synsets into a taxonomy. This resource contains over 15M triples with disambiguated arguments (links to Wikipedia articles) and the relation synset ID between them. Additionally, the resource contains: 1) relation pattern synsets with type signatures; 2) relation pattern subsumptions; 3) relation paraphrases; 4) evaluation data.
WiseNet (1.0 and 2.0): similarly to PATTY, WiseNet 1.0/2.0 is a resource of OIE triples in which the arguments are disambiguated and the open relations are organized into relation synsets and then taxonomized. One of the main differences between PATTY and WiseNet is that WiseNet contains "golden links" for the arguments (annotated by humans) by keeping the original links from the Wikipedia articles.
KB-Unify: KB-Unify takes several OIE corpora as input and unifies them into a single disambiguated OIE repository. The open relations are organized into relational synsets and the arguments are disambiguated with Babelfy.
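
As a rough illustration of how the linked ReVerb records described above might be consumed, here is a minimal Python sketch. It assumes one record per line with the seven fields in the order listed (argument 1, relation phrase, argument 2, Freebase ID, Freebase entity name, link score, link ambiguity score) separated by tabs. The separator, the class name `LinkedTriple`, and the function name `parse_linked_triples` are assumptions made for this example, not part of the ReVerb distribution, so check the downloaded file before relying on them.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class LinkedTriple:
    """One linked extraction; field order follows the list in this README."""
    arg1: str
    relation: str
    arg2: str
    freebase_id: str      # Freebase ID linked to argument 1
    freebase_name: str    # name of the linked Freebase entity
    link_score: float
    link_ambiguity: float


def parse_linked_triples(lines: Iterable[str], sep: str = "\t") -> Iterator[LinkedTriple]:
    """Yield a LinkedTriple per well-formed line and skip malformed lines.

    The tab separator is an assumption; pass a different `sep` if needed.
    """
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) != 7:
            continue  # wrong number of columns
        try:
            yield LinkedTriple(*fields[:5], float(fields[5]), float(fields[6]))
        except ValueError:
            continue  # scores were not numeric


if __name__ == "__main__":
    # Made-up record for illustration only; the Freebase ID is a placeholder.
    demo = ["AMD\tis based in\tU.S.\t/m/placeholder\tAdvanced Micro Devices\t0.9\t0.1\n"]
    for triple in parse_linked_triples(demo):
        print(triple.arg1, "|", triple.relation, "|", triple.arg2)
```

Skipping malformed lines rather than raising keeps a long pass over millions of records from aborting on a single bad row; a stricter variant could count or log the skipped lines instead.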

Resources derived from OIE output

Functional relations: 10K Functional relations. This resource comes from the paper "Identifying Functional Relations in Web Text", published at EMNLP 2010.
Entailment rules: 10M predicative entailment rules learned using local and global algorithms. From the documentation: "This resource of predicative entailment rules contains three resources in two formats – shallow and syntactic. Resources are learned over the REVERB data set and using the local and global algorithms described in Chapter 5 of Jonathan Berant’s thesis (which is part of the package)."
Entailment rules: 36K high precision entailment rules (data and code). The resource is the result of the work of Prachi Jain and Mausam, "Knowledge-Guided Linguistic Rewrites for Inference Rule Verification", published at NAACL-HLT 2016.

PhD theses

"Compact Open Information Extraction: Methods, Corpora, Analysis" by Kiril Gashteovski, University of Mannheim, Germany, 2020
"Constructing Lexicons of Relational Phrases" by Adam Grycner, University of Saarland, Germany, 2017
"Methods for open information extraction and sense disambiguation on natural language text" by Luciano Del Corro, University of Saarland, Germany, 2016
"Automated Knowledge Base Extension Using Open Information" by Arnab Kumar Dutta, University of Mannheim, Germany, 2015
"Exploiting Knowledge in Unsupervised Open Information Extraction" by Yuval Merhav, Illinois Institute of Technology, USA, 2012
"Open Information Extraction for the Web" by Michele Banko, University of Washington, USA, 2009

Demos

ClausIE: Demo for ClausIE, an OIE system.
Fact retrieval: Fact retrieval with OpenIE on large corpora.
