A study on emoji redundancy in Twitter content, carried out as part of the thesis project of Mrs. Giulia Donato, will be presented on Thursday, 4 April 2019, at the Institute of Informatics & Telecommunications, in the Aigaio lecture room.
Title: Investigating Redundancy in Emoji Use: Study on a Twitter Based Corpus
Abstract:
The popularity of emoji is on the rise. In recent years we have witnessed the recognition of this visual language by institutions such as the Oxford Dictionary, along with a steady growth in the number of studies on the subject.
Researchers point out that the spread of emoji is very likely to change many communicative conventions in online interactions, even well-established ones. For example, we might witness the disappearance of emoticons and of other communicative strategies, such as acronyms and character repetition, in favour of emoji.
Most of the work on emoji so far has focused on their most evident characteristic: their effectiveness in expressing emotions. Nevertheless, emoji can represent much more than feelings, since the set of pictographs is constantly being updated with new icons representing objects, actions and concepts. Given their pervasiveness, it seems time for linguistics-related research to examine the purposes for which non-emotional emoji are employed in the language of social media.
With this aim, we had a set of tweets labelled by human annotators, who were asked to judge whether each emoji was used in a redundant way, in a non-redundant way, or as a substitute for an actual word. We then used the annotated corpus in a set of machine learning experiments, training a number of classifiers to distinguish among these different informative behaviours.
We found the task to be difficult for humans, who reached only moderate agreement. However, the classifier, provided with specific features, obtained satisfactory results and improved over the two proposed baselines.
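As a rough illustration of how agreement between two annotators on such a three-way labelling could be quantified, the sketch below computes Cohen's kappa. The abstract does not say which agreement measure the study used, so the choice of kappa is an assumption, and the label names and the two annotation lists are invented for illustration only; they are not data from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: derived from each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations (not taken from the study's corpus).
annotator_1 = ["redundant", "redundant", "non_redundant",
               "substitute", "redundant", "non_redundant"]
annotator_2 = ["redundant", "non_redundant", "non_redundant",
               "substitute", "redundant", "redundant"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one label (for example, redundant use) is far more frequent than the others.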
Speaker Bio:
Giulia Donato is a computational linguist, currently based in Athens and working on conversational interfaces.
Her thesis project, defended at the University of Copenhagen in 2017, was presented at the WASSA workshop during the EMNLP conference in Copenhagen (Denmark, 2017) and at the main LREC conference in Miyazaki (Japan, 2018).