Fact Summarization for Personalized Knowledge Graphs
Danai Koutra, Davide Mottin, Jing Zhu
Personal Knowledge Graphs (PKGs): Methodology, tools and applications
TL;DR
Personal Knowledge Graphs contain information about a single user. Being able to summarize their content is instrumental for both efficiency and usability.
Abstract
Summarization for personalized KGs has several key challenges that need to be addressed when designing new computational approaches:
- Data volume. KGs tend to be massive because they contain facts about the world. Although summarization aims to reduce the size of the data, summarization approaches must themselves be carefully designed to scale and process these graphs efficiently.
- Data complexity. KGs encode heterogeneous semantic information and contain a large number of entity and relationship types. For example, widely used encyclopedic knowledge graphs such as DBpedia, YAGO, and Freebase contain over ten million triples, spanning topics like music, movies, and sports. Moreover, because KGs are constructed automatically, they tend to contain various types of errors or to be missing information, which can affect the summarization process, the quality of the summary, and subsequently the performance in the downstream task of interest.
- Data evolution. Some facts in KGs are static (e.g., the Eiffel Tower is in Paris), but many facts change over time at different rates (e.g., the score of a sports game changes over the span of a couple of hours, while the president of a country usually changes every 4 years). Summarization methods should account for these changes, which should be reflected in the resultant summaries in order to reduce errors in the downstream tasks.
- Evaluation. Depending on the downstream task, different evaluation metrics may be appropriate for each KG summarization method. This makes the comparison of different methods challenging. Moreover, due to privacy concerns, there is a lack of publicly available personalized KGs or historical queries per individual, which hinders the evaluation of proposed methods in real settings.
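To make the data evolution and personalization points concrete, here is a minimal Python sketch of a budgeted, time-aware summary. It is an illustration under stated assumptions, not a method from the chapter: the `Fact` record, its `valid_from`/`valid_to` fields, the `summarize` function, and the scoring rule (counting how often a user's past queries mention a fact's entities) are all hypothetical, and the data is a toy example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """A KG triple with a hypothetical validity interval (years)."""
    subject: str
    predicate: str
    obj: str
    valid_from: int
    valid_to: Optional[int] = None  # None means the fact is still valid


def is_valid(fact: Fact, at: int) -> bool:
    """True if the fact holds at time `at` (half-open interval)."""
    return fact.valid_from <= at and (fact.valid_to is None or at < fact.valid_to)


def summarize(facts, query_log, budget, at):
    """Keep at most `budget` facts that are valid at `at`, ranked by how
    often the user's past queries mention their subject or object.
    This scoring rule is an assumption made for illustration only."""
    interest = Counter(query_log)
    current = [f for f in facts if is_valid(f, at)]
    ranked = sorted(
        current,
        key=lambda f: interest[f.subject] + interest[f.obj],
        reverse=True,
    )
    return ranked[:budget]


# Toy data: two static facts and one fact that changes over time.
facts = [
    Fact("Eiffel Tower", "locatedIn", "Paris", 1889),
    Fact("France", "president", "Hollande", 2012, 2017),
    Fact("France", "president", "Macron", 2017),
    Fact("Louvre", "locatedIn", "Paris", 1793),
]
query_log = ["France", "Paris", "France"]  # entities from past user queries

summary_2020 = summarize(facts, query_log, budget=2, at=2020)
summary_2015 = summarize(facts, query_log, budget=2, at=2015)
```

Filtering by validity before ranking keeps outdated facts out of the summary, and the budget caps its size. A realistic approach would also have to cope with the scale, noise, and evaluation issues listed above.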