Symbolic event recognition systems often rely on knowledge bases of event definitions, expressed in first-order logic, to detect event occurrences over time. Logical frameworks for representing and reasoning about events provide robust temporal reasoning and enable the automated discovery of event rules via Inductive Logic Programming (ILP). Although existing structure learning approaches ease the discovery of such rules in noisy data streams, they assume the existence of fully labelled training sequences, which is unrealistic for most real-life applications. In this thesis we address the issue of scalable semi-supervised learning for event recognition. We propose two novel techniques for inferring the missing supervision in training sequences, thereby enabling the learning of event rules in the Event Calculus. First, we propose SPLICE, a framework that employs a graph-based method to derive labels for unlabelled data based on their distance to their labelled counterparts. To adapt the graph-based method to first-order logic, we use a structural distance that measures the distance between sets of logical atoms. Labelling is performed online (in a single pass) by means of a caching mechanism and the Hoeffding bound, which is used to filter contradicting examples. However, SPLICE labelling may be compromised, since its structural measure is agnostic to the feature semantics. Moreover, there is no guarantee about the quality of the labelling found in the local graphs that are built as the data streams in. To address these limitations, we also propose SPLICE+, a second method that improves upon SPLICE by employing a hybrid measure that combines an optimised structural distance with a data-driven one. The former is guided by feature selection, while the latter is a mass-based dissimilarity. In addition, SPLICE+ improves the graph construction process by storing a synopsis of the past, in order to achieve more informed labelling on the local graphs.
We evaluate our approach on the task of complex event recognition by using a benchmark dataset for human activity recognition, a dataset for maritime monitoring, as well as a dataset for fleet management.
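
As an illustration of the Hoeffding bound mentioned above, the following is a minimal sketch of how such a bound could be used to filter contradicting examples held in a cache. The function names, the 0.5 majority threshold, and the default confidence parameter are assumptions made for this example only; the abstract does not specify the exact criterion used by SPLICE.

```python
import math


def hoeffding_epsilon(value_range: float, delta: float, n: int) -> float:
    """Return the Hoeffding bound epsilon.

    With probability at least 1 - delta, the true mean of a variable with
    the given value range lies within epsilon of the mean observed over
    n independent samples.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def keep_example(majority_fraction: float, n: int, delta: float = 0.05) -> bool:
    """Hypothetical filter (an assumption, not the thesis's exact rule).

    Keep a cached example only if the fraction of observations carrying its
    majority label exceeds 0.5 by more than the Hoeffding epsilon, i.e. the
    majority label is unlikely to be an artefact of too few observations.
    """
    epsilon = hoeffding_epsilon(1.0, delta, n)  # label fractions lie in [0, 1]
    return majority_fraction - 0.5 > epsilon


if __name__ == "__main__":
    # The bound tightens as more observations arrive.
    for n in (10, 100, 1000):
        print(n, round(hoeffding_epsilon(1.0, 0.05, n), 4))
    print(keep_example(0.9, 100))   # clear majority over 100 observations
    print(keep_example(0.55, 100))  # majority too weak to be trusted
```

Because epsilon shrinks with the number of observations, an example seen only a few times needs a much stronger label majority to be kept than one seen many times, which is what makes the bound suitable for single-pass processing.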
