Exploring the Data Wilderness through Examples

Davide Mottin, Matteo Lissandrini, Yannis Velegrakis, Themis Palpanas

SIGMOD 2019

TL;DR

This tutorial explores

  • Methods for exploring large datasets using examples
  • Algorithmic solutions to search without query languages
  • Interactive methods and user-in-the-loop feedback
  • Machine learning for adaptive, online methods

Abstract

Exploration is one of the primordial ways to accrue knowledge about the world and its nature. As we accumulate data, mostly automatically, at unprecedented volume and speed, our datasets have become complex and hard to understand. In this context, exploratory search provides a handy tool for progressively gathering the necessary knowledge: the user starts from a tentative query that hopefully leads to answers that are at least partially relevant and that provide cues about the next queries to issue.

An exploratory query should be simple enough to avoid complicated declarative languages (such as SQL) and mechanisms, while retaining the flexibility and expressiveness of such languages. Recently, we have witnessed a rediscovery of so-called example-based methods, in which the user or analyst circumvents query languages by using examples as input. This shift in semantics has led to a number of methods that receive as a query a set of example members of the answer set. The search system then infers the entire answer set based on the given examples and any additional information provided by the underlying database.
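
To make the idea concrete, here is a minimal sketch of an example-based query over a toy relational table. It is an illustration under stated assumptions, not one of the algorithms covered in the tutorial: the function name `infer_answer_set`, the `MOVIES` table, and the inference rule (keep the attribute values on which all examples agree, then return every row that matches them) are all hypothetical choices made for this sketch.

```python
from typing import Any

Row = dict[str, Any]


def infer_answer_set(table: list[Row], examples: list[Row]) -> list[Row]:
    """Return every row sharing the attribute values common to all examples.

    Hypothetical inference rule for illustration only: the "query" is the
    conjunction of (attribute = value) pairs on which every example agrees.
    """
    if not examples:
        return []

    # Keep only the (attribute, value) pairs that all examples have in common.
    first = examples[0]
    shared = {
        attr: value
        for attr, value in first.items()
        if all(ex.get(attr) == value for ex in examples)
    }

    # If the examples agree on nothing, this simple rule cannot infer a query.
    if not shared:
        return []

    # The answer set is every row that satisfies the inferred conjunction.
    return [
        row
        for row in table
        if all(row.get(attr) == value for attr, value in shared.items())
    ]


# Toy table, assumed for this sketch.
MOVIES: list[Row] = [
    {"title": "Alien", "director": "Ridley Scott", "genre": "sci-fi"},
    {"title": "Blade Runner", "director": "Ridley Scott", "genre": "sci-fi"},
    {"title": "Gladiator", "director": "Ridley Scott", "genre": "drama"},
    {"title": "Solaris", "director": "Andrei Tarkovsky", "genre": "sci-fi"},
    {"title": "The Martian", "director": "Ridley Scott", "genre": "sci-fi"},
]

# The user supplies two example members of the desired answer set.
examples = [MOVIES[0], MOVIES[1]]
answers = infer_answer_set(MOVIES, examples)
print([row["title"] for row in answers])
```

The two examples agree on director and genre but not on title, so the sketch returns the third row that matches both, without the user writing any SQL. Real example-based systems have to cope with ambiguity, since many different queries can be consistent with the same examples, which this simple rule ignores.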

In this tutorial, we present an excursus over the main example-based methods for exploratory analysis. We show how different data types require different techniques, and present algorithms that are specifically designed for relational, textual, and graph data. We conclude by providing a unifying view of this query paradigm and identifying exciting new research directions.

Outline

The tutorial covers

  1. Unified framework for data exploration by example
  2. Example-based methods
    1. Example methods in relational databases
    2. Example methods in textual data
    3. Example methods in graphs
  3. Learning methods based on examples

Cite us

TBD