A collection of important Open Knowledge Graph (OKG) resources: research papers, code, data, applications, etc. The collection covers not only Open Knowledge Graphs themselves but also their downstream applications.
We will try to keep this list up to date. If you find an error or a missing paper, please don't hesitate to open an issue or a pull request.
- Introduction to Open Knowledge Graphs
- Papers by venues
- Papers by categories
- Slides
- Talks
- Code
- Data
- Other Repos
Open Information Extraction (OIE) systems aim to extract unseen relations and their arguments from unstructured text in an unsupervised manner. In its simplest form, given a natural language sentence, an OIE system extracts information in the form of a triple consisting of a subject (S), a relation (R), and an object (O).
Suppose we have the following input sentence:
AMD, which is based in U.S., is a technology company.
An OIE system aims to make the following extractions:
("AMD"; "is based in"; "U.S.")
("AMD"; "is"; "technology company")
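As a minimal sketch, the triples above can be held in a small Python data structure. The `Triple` class and its field names are hypothetical and chosen only for illustration; the extractions are written by hand to mirror the example, and no OIE system is run.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One OIE extraction in (subject; relation; object) form (hypothetical helper)."""
    subj: str
    rel: str
    obj: str

    def __str__(self) -> str:
        # Render in the same notation used in the example above.
        return f'("{self.subj}"; "{self.rel}"; "{self.obj}")'


sentence = "AMD, which is based in U.S., is a technology company."

# Hand-written extractions matching the example; not produced by a real extractor.
extractions = [
    Triple("AMD", "is based in", "U.S."),
    Triple("AMD", "is", "technology company"),
]

for triple in extractions:
    print(triple)
```

Keeping subject, relation, and object as separate fields makes it straightforward to group extractions by argument or by relation phrase later, for example during canonicalization.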
[SIGKDD' 2022] Multi-View Clustering for Open Knowledge Base Canonicalization [Paper | Code | Video]
Wei Shen, Yang Yang, Yinan Liu
[SIGMOD' 2021] Joint Open Knowledge Base Canonicalization and Linking [Paper | Code | Video]
Yinan Liu, Wei Shen, Yuanfei Wang, Jianyong Wang, Zhenglu Yang, Xiaojie Yuan
[ICDE' 2021] CaSIE: Canonicalize and Informative Selection of the OpenIE system [Paper]
Hao Xin, Xueling Lin, Lei Chen
[ACL' 2021] OKGIT: Open Knowledge Graph Link Prediction with Implicit Types [Paper | Code | Video]
Chandrahas, Partha Pratim Talukdar
[EMNLP' 2021] Open Knowledge Graphs Canonicalization using Variational Autoencoders [Paper | Code | Video]
Sarthak Dash, Gaetano Rossiello, Nandana Mihindukulasooriya, Sugato Bagchi, Alfio Gliozzo
[VLDB' 2020] KBPearl: A Knowledge Base Population System Supported by Joint Entity and Relation Linking [Paper | Code | Readme]
Xueling Lin, Haoyang Li, Hao Xin, Zijian Li, Lei Chen
[ACL' 2020] Can We Predict New Facts with Open Knowledge Graph Embeddings? A Benchmark for Open Link Prediction [Paper | Code | Resources | Video]
Samuel Broscheit, Kiril Gashteovski, Yanjie Wang, Rainer Gemulla
[Eval4NLP@EMNLP' 2020] On Aligning OpenIE Extractions with Knowledge Bases: A Case Study [Paper | Resources | Video | Slides]
Kiril Gashteovski, Rainer Gemulla, Bhushan Kotnis, Sven Hertling, Christian Meilicke
[WISE' 2020] MULCE: Multi-level Canonicalization with Embeddings of Open Knowledge Bases [Paper]
Tien-Hsuan Wu, Ben Kao, Zhiyong Wu, Xiyang Feng, Qianli Song, Cheng Chen
[CoRR' 2020] Canonicalizing Open Knowledge Bases with Multi-Layered Meta-Graph Neural Network [Paper]
Tianwen Jiang, Tong Zhao, Bing Qin, Ting Liu, Nitesh V. Chawla, Meng Jiang
[CoRR' 2020] Language Models are Open Knowledge Graphs [Paper]
Chenguang Wang, Xiao Liu, Dawn Song
[ICDE' 2019] Canonicalization of Open Knowledge Bases with Side Information from the Source Text [Paper | Code | Dataset Description]
Xueling Lin, Lei Chen
[EMNLP' 2019] CaRe: Open Knowledge Graph Embeddings [Paper | Code]
Swapnil Gupta, Sreyash Kenkre, Partha Talukdar
[EMNLP' 2019] Collaborative Policy Learning for Open Knowledge Graph Reasoning [Paper | Code]
Cong Fu, Tong Chen, Meng Qu, Woojeong Jin, Xiang Ren
[AKBC' 2019] OPIEC: An Open Information Extraction Corpus [Paper | Data + Resources | Code (data reading) | Code (pipeline)]
Kiril Gashteovski, Sebastian Wanner, Sven Hertling, Samuel Broscheit, Rainer Gemulla
[WWW' 2018] CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information [Paper | Code]
Shikhar Vashishth, Prince Jain, Partha Talukdar
[CIKM' 2018] Towards Practical Open Knowledge Base Canonicalization [Paper]
Tien-Hsuan Wu, Zhiyong Wu, Ben Kao, Pengcheng Yin
[CIKM' 2014] Canonicalizing Open Knowledge Bases [Paper]
Luis Galárraga, Geremy Heitz, Kevin Murphy, Fabian M. Suchanek
[SIGKDD' 2022] Multi-View Clustering for Open Knowledge Base Canonicalization [Paper | Code | Video]
Wei Shen, Yang Yang, Yinan Liu
[SIGMOD' 2021] Joint Open Knowledge Base Canonicalization and Linking [Paper | Code | Video]
Yinan Liu, Wei Shen, Yuanfei Wang, Jianyong Wang, Zhenglu Yang, Xiaojie Yuan
[EMNLP' 2021] Open Knowledge Graphs Canonicalization using Variational Autoencoders [Paper | Code | Video]
Sarthak Dash, Gaetano Rossiello, Nandana Mihindukulasooriya, Sugato Bagchi, Alfio Gliozzo
[WISE' 2020] MULCE: Multi-level Canonicalization with Embeddings of Open Knowledge Bases [Paper]
Tien-Hsuan Wu, Ben Kao, Zhiyong Wu, Xiyang Feng, Qianli Song, Cheng Chen
[CoRR' 2020] Canonicalizing Open Knowledge Bases with Multi-Layered Meta-Graph Neural Network [Paper]
Tianwen Jiang, Tong Zhao, Bing Qin, Ting Liu, Nitesh V. Chawla, Meng Jiang
[ICDE' 2019] Canonicalization of Open Knowledge Bases with Side Information from the Source Text [Paper | Code | Dataset Description]
Xueling Lin, Lei Chen
[WWW' 2018] CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information [Paper | Code]
Shikhar Vashishth, Prince Jain, Partha Talukdar
[CIKM' 2018] Towards Practical Open Knowledge Base Canonicalization [Paper]
Tien-Hsuan Wu, Zhiyong Wu, Ben Kao, Pengcheng Yin
[CIKM' 2014] Canonicalizing Open Knowledge Bases [Paper]
Luis Galárraga, Geremy Heitz, Kevin Murphy, Fabian M. Suchanek
[ACL' 2021] OKGIT: Open Knowledge Graph Link Prediction with Implicit Types [Paper | Code | Video]
Chandrahas, Partha Pratim Talukdar
[ACL' 2020] Can We Predict New Facts with Open Knowledge Graph Embeddings? A Benchmark for Open Link Prediction [Paper | Code | Resources | Video]
Samuel Broscheit, Kiril Gashteovski, Yanjie Wang, Rainer Gemulla
[EMNLP' 2019] CaRe: Open Knowledge Graph Embeddings [Paper | Code]
Swapnil Gupta, Sreyash Kenkre, Partha Talukdar
[EMNLP' 2019] Collaborative Policy Learning for Open Knowledge Graph Reasoning [Paper | Code]
Cong Fu, Tong Chen, Meng Qu, Woojeong Jin, Xiang Ren
[Eval4NLP@EMNLP' 2020] On Aligning OpenIE Extractions with Knowledge Bases: A Case Study [Paper | Resources | Video | Slides]
Kiril Gashteovski, Rainer Gemulla, Bhushan Kotnis, Sven Hertling, Christian Meilicke
OKG output has been shown to be a useful input for many downstream tasks. This section lists several downstream tasks that have benefited from OKG output.
[ICDE' 2021] CaSIE: Canonicalize and Informative Selection of the OpenIE system [Paper]
Hao Xin, Xueling Lin, Lei Chen
[VLDB' 2020] KBPearl: A Knowledge Base Population System Supported by Joint Entity and Relation Linking [Paper | Code | Readme]
Xueling Lin, Haoyang Li, Hao Xin, Zijian Li, Lei Chen
- [pdf] "Compact Open Information Extraction on Large Corpora". Talk by Kiril Gashteovski given at NEC Labs Europe GmbH, 2019.
- [video] "Open Information Extraction from the Web", by Prof. Oren Etzioni, presented at AKBC-WEKEX 2012.
OIE output serves as a useful input for many downstream tasks, such as question answering, event schema induction, or generating inference rules. Moreover, OIE output can be used as "fuel" to derive further resources. Here, the data is organized into two major categories: 1) OIE corpora; 2) resources derived from OIE output.
- OPIEC: An Open Information Extraction Corpus: the largest OIE corpus to date, containing more than 341M triples extracted from the entire English Wikipedia. Each triple in the corpus comes with rich metadata: every token of the subject / relation / object along with its NLP annotations (POS tag, NER tag, ...), the provenance sentence along with its dependency parse, the original (golden) links from Wikipedia, sentence order, space / time information, etc.
- [.gz] ReVerb extractions: 15 million high-precision OIE extractions (826MB compressed) from the OIE system ReVerb. The extractions were made from the ClueWeb09 corpus. The data contains (subject, relation, object) triples, each accompanied by a confidence score (estimating the likelihood that the triple was correctly extracted) and provenance information (the link to the web page from which the triple was extracted).
- ReVerb extractions (linked): 3 million triples with linked arguments (a subset of the 15M high-precision ReVerb extractions). The links (to Freebase) are provided by an entity linker. The data fields are: argument 1, relation phrase, argument 2, Freebase ID for the argument 1 link, corresponding Freebase entity name, link score, link ambiguity score.
- PATTY: PATTY is a system that takes open relations between two arguments, structures them into relational synsets, and then organizes the synsets into a taxonomy. This resource contains over 15M triples with disambiguated arguments (links to Wikipedia articles) and the relation synset ID between them. Additionally, the resource contains: 1) relation pattern synsets with type signatures; 2) relation pattern subsumptions; 3) relation paraphrases; 4) evaluation data.
- WiseNet (1.0 and 2.0): similarly to PATTY, WiseNet 1.0/2.0 is a resource containing OIE triples, where the arguments are disambiguated and the open relations are organized into relation synsets and then taxonomized. One of the main differences between PATTY and WiseNet is that WiseNet contains "golden links" for the arguments (annotated by humans), obtained by keeping the original links from the Wikipedia articles.
- KB-Unify: KB-Unify takes several OIE corpora as input and unifies them into a single disambiguated OIE repository. The open relations are organized into relational synsets and the arguments are disambiguated with Babelfy.
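The field list given for the linked ReVerb extractions suggests a simple record layout. The sketch below reads such records under several assumptions that are not confirmed by this list: that each line is tab-separated, that the seven fields appear in the order named above, and that both scores parse as floats. The class name, field names, sample line, and the 0.5 threshold are all hypothetical.

```python
import csv
import io
from typing import Iterator, NamedTuple


class LinkedExtraction(NamedTuple):
    """One linked triple; field names are illustrative, not the dataset's own."""
    arg1: str
    relation: str
    arg2: str
    freebase_id: str
    freebase_name: str
    link_score: float
    link_ambiguity: float


def read_linked_extractions(handle) -> Iterator[LinkedExtraction]:
    """Yield one record per line, ASSUMING tab-separated fields in the listed order."""
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 7:
            continue  # skip lines that do not match the assumed layout
        a1, rel, a2, fb_id, fb_name, score, ambiguity = row
        yield LinkedExtraction(a1, rel, a2, fb_id, fb_name,
                               float(score), float(ambiguity))


# Made-up sample line for illustration only; it is not taken from the real data.
sample = io.StringIO(
    "Example Corp\tis based in\tExampleland\tid-placeholder\tExample Corp\t0.9\t0.1\n"
)
records = list(read_linked_extractions(sample))

# Keep only records whose link score clears a hypothetical threshold.
confident = [r for r in records if r.link_score >= 0.5]
print(confident)
```

To use it on a downloaded file, pass an open text handle instead of the `io.StringIO` sample, and check the actual delimiter and field order against the dataset's own documentation first.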
- Awesome-LLM-KG, created by Linhao Luo from Monash University.
- Awesome-Knowledge-Graph-Reasoning, created by Ke Liang from National University of Defense Technology.
- oie-resources, created by Kiril Gashteovski from NEC Laboratories Europe.
- knowledge-graphs, created by Shaoxiong Ji from University of Helsinki.