Workload-Aware Self-Tuning Histograms of String Data

Conference Proceedings (fully refereed)
N. Zoulis, E. Mavroudi, A. Lykoura, A. Charalambidis, S. Konstantopoulos
In this paper we extend STHoles, a very successful algorithm that uses query results to build and maintain multi-dimensional histograms of numerical data. Our contribution is the formal definition of extensions of all relevant concepts, such that they are independent of the domain of the data but subsume the STHoles concepts as their numerical specialization. We also derive specializations for the string domain and implement them in a prototype that we use to empirically validate our approach. Our current implementation uses string prefixes as the machinery for describing string ranges. Although weaker than regular expressions, prefixes can be applied very efficiently and can capture interesting ranges in hierarchically structured string domains, such as those of filesystem pathnames and URIs. We base the empirical validation of the approach on existing, publicly available Semantic Web data, where we demonstrate convergence to accurate and efficient histograms.
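The idea of describing string ranges by prefixes can be illustrated with a minimal sketch. It is not the paper's algorithm and not STHoles: the function names (`longest_matching_prefix`, `build_counts`, `estimate`) and the flat dictionary of buckets are hypothetical, chosen only to show how a prefix acts as a range over hierarchically structured strings such as URIs.

```python
# Illustrative sketch only: prefix-defined buckets over strings.
# All names here are hypothetical and not taken from the paper or its prototype.

def longest_matching_prefix(prefixes, value):
    """Return the most specific bucket prefix that covers value, or None."""
    matches = [p for p in prefixes if value.startswith(p)]
    return max(matches, key=len) if matches else None


def build_counts(prefixes, values):
    """Assign each value to its most specific bucket and count per bucket."""
    counts = {p: 0 for p in prefixes}
    for v in values:
        p = longest_matching_prefix(prefixes, v)
        if p is not None:
            counts[p] += 1
    return counts


def estimate(counts, query_prefix):
    """Sum the counts of buckets that lie entirely inside the query range.

    A bucket lies inside the range when its own prefix starts with the
    query prefix. Buckets that only partially overlap the query (their
    prefix is shorter than the query prefix) are ignored in this sketch,
    so the result can underestimate the true cardinality.
    """
    return sum(c for p, c in counts.items() if p.startswith(query_prefix))
```

Under these assumptions, a query such as "all strings starting with `http://a.org/`" is answered by adding up the buckets nested under that prefix, which mirrors how nested ranges work in a hierarchical string domain.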
Software and Knowledge Engineering Laboratory (SKEL)
Conference Short Name: DEXA 2015
Conference Country:
Conference Venue:
Conference Date(s): Tue, 01/09/2015 - Fri, 04/09/2015
Conference Level: DEXA 2015
