Domain Adaptation for Affect in Tweets
This paper describes the best-performing system for the SemEval-2018 Affect in Tweets (English) sub-tasks. The system focuses on the ordinal classification and regression sub-tasks for valence and emotion. For ordinal classification, valence is classified into 7 classes ranging from -3 to 3, whereas emotion is classified into 4 classes, 0 to 3, separately for each of anger, fear, joy and sadness. The regression sub-tasks estimate the intensity of valence and of each emotion. The system performs domain adaptation of 4 different models and creates an ensemble to give the final prediction. The proposed system achieved 1st position out of 75 teams that participated in the aforementioned sub-tasks. We outperform the baseline model by margins ranging from 49.2% to 76.4%, pushing the state of the art significantly.
SemEval @ NAACL 2018 [http://alt.qcri.org/semeval2018/index.php]
Agree to Disagree: Improving Disagreement Detection with Dual GRUs
This paper presents models for detecting agreement/disagreement in online discussions. In this work we show that by using a Siamese-inspired architecture to encode the discussions, we no longer need to rely on hand-crafted features to exploit the meta thread structure. We evaluate our model on the existing online discussion corpora ABCD, IAC and AWTP. Experimental results on the ABCD dataset show that by fusing lexical and word embedding features, our model achieves state-of-the-art performance with an average F1 score of 0.804. We also show that the model trained on the ABCD dataset performs competitively on relatively smaller annotated datasets (IAC and AWTP).
ESSEM @ ACII 2017 [http://acii2017.org/]
Ensemble of Deep Neural Networks for Acoustic Scene Classification
Deep neural networks (DNNs) have recently achieved great success in a multitude of classification tasks. Ensembles of DNNs have been shown to improve the performance. In this paper, we explore the recent state-of-the-art DNNs used for image classification. We modified these DNNs and applied them to the task of acoustic scene classification. We conducted a number of experiments on the TUT Acoustic Scenes 2017 dataset to empirically compare these methods. Finally, we show that the ensemble of these DNNs improves the baseline score for DCASE-2017 Task 1 by 10%.
Detection and Classification of Acoustic Scenes and Events 2017
Seernet at EmoInt-2017: Tweet Emotion Intensity Estimator
The paper describes experiments on estimating emotion intensity in tweets using a generalized regressor system. The system combines lexical, syntactic and pre-trained word embedding features, trains general regressors on them and finally combines the best-performing models to create an ensemble. The proposed system ranked 3rd out of 22 systems on the leaderboard of the WASSA-2017 Shared Task on Emotion Intensity.
WASSA @ EMNLP 2017 [http://optima.jrc.it/wassa2017/]
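As a rough illustration of the "combine the best-performing models" step in the Seernet abstract above, the sketch below picks the top-k regressors by validation score and averages their predictions. The abstract does not say how the ensemble is formed, so plain averaging, the function name top_k_ensemble, the parameter k and the model names are all assumptions made for this example.

```python
from statistics import mean


def top_k_ensemble(val_scores, predictions, k=2):
    """Average the predictions of the k models with the best validation score.

    val_scores:  dict of model name -> validation score (higher is better)
    predictions: dict of model name -> list of predicted intensities,
                 one per tweet, in the same order for every model
    Averaging is an assumed combination rule, not the paper's stated method.
    """
    # Rank model names by validation score, best first, and keep the top k.
    best = sorted(val_scores, key=val_scores.get, reverse=True)[:k]
    # zip(*...) groups the per-tweet predictions of the selected models.
    averaged = [mean(column) for column in zip(*(predictions[name] for name in best))]
    return best, averaged


if __name__ == "__main__":
    # Hypothetical models and scores, for illustration only.
    scores = {"svr": 0.60, "gbm": 0.70, "ridge": 0.50}
    preds = {
        "svr": [0.25, 0.50],
        "gbm": [0.75, 1.00],
        "ridge": [0.90, 0.90],
    }
    print(top_k_ensemble(scores, preds, k=2))
```

A weighted average or a stacked meta-regressor would be equally plausible choices for the combination step.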
“Attention” for Detecting Unreliable News in the Information Age
Unreliable news is any piece of information which is false or misleading, deliberately spread to promote political, ideological and financial agendas. Recently the problem of unreliable news has received a lot of attention, as the number of instances of using news and social media outlets for propaganda has increased rapidly. This poses a serious threat to society, which calls for technology to automatically and reliably identify unreliable news sources. This paper is an effort in this direction to build systems for detecting unreliable news articles. In this paper, various NLP algorithms were built and evaluated on the Unreliable News Data 2017 dataset. Variants of hierarchical attention networks (HAN) are presented for encoding and classifying news articles, which achieve the best result of 0.944 ROC-AUC. Finally, attention layer weights are visualized to understand and give insight into the decisions made by HANs.
AICS @ AAAI 18 [https://aaai.org/]
Hierarchical Ensemble for Indian Native Language Identification
Native Language Identification has played an important role in forensics, primarily for author profiling and identification. In this work, we discuss our approach to the shared task of Indian Native Language Identification. The task is to identify the native language of the writer from a given XML file which contains a set of Facebook comments in the English language. We propose a hierarchical ensemble approach which combines various machine learning techniques with language-agnostic feature extraction to perform the final classification. Our hierarchical ensemble improves the TF-IDF based baseline accuracy by 3.9%. The proposed system ranked 3rd across unique team submissions.
INLI @ FIRE-2017 [http://fire.irsi.res.in/fire/2017/home]
Anger Detection in Social Media for Resource Scarce Languages
Emotion Detection from text is a recent field of research that is closely related to Sentiment Analysis. Emotion Analysis aims to detect and recognize different types of feelings expressed in text, such as anger, disgust, fear, happiness, sadness and surprise. Identifying emotion information from social media, news articles and other user-generated content has many applications. Current techniques depend heavily on emotion and polarity lexicons; however, such lexicons are available only in a few resource-rich languages, and this hinders research for resource-scarce languages. Also, social media texts in Indian languages have distinct features such as Romanization, code mixing, and grammatical and spelling mistakes, which make the task of classification even harder. This research addresses this task by training a deep learning architecture on a large amount of data available on social media platforms like Twitter, using emojis as a proxy for emotions. The model's performance is then evaluated on a manually annotated dataset. This work is focused on the Hindi language, but the techniques used are language-agnostic and can be used for other languages as well.
LREC 2018 [http://lrec-conf.org/lrec2018]
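The idea of using emojis as a proxy for emotions, mentioned in the anger detection abstract above, can be sketched as a simple labelling function. This is only a minimal illustration: the emoji sets, the binary anger/not-anger labels, the rule for discarding ambiguous tweets and the name proxy_label are assumptions made here, not details taken from the paper.

```python
# Hypothetical emoji sets; the paper's actual emoji-to-emotion mapping is not given.
ANGER_EMOJIS = {"😠", "😡", "🤬"}
OTHER_EMOJIS = {"😊", "😂", "😢"}


def proxy_label(tweet):
    """Return (text_without_emojis, label) or None if no usable label exists.

    label is 1 when the tweet contains only anger emojis and 0 when it
    contains only non-anger emojis. Tweets with no known emoji, or with
    emojis from both sets, are skipped (an assumed filtering rule).
    """
    has_anger = any(e in tweet for e in ANGER_EMOJIS)
    has_other = any(e in tweet for e in OTHER_EMOJIS)
    if has_anger == has_other:
        return None
    # Remove the emojis so a model cannot simply read the label off the input.
    cleaned = tweet
    for emoji in ANGER_EMOJIS | OTHER_EMOJIS:
        cleaned = cleaned.replace(emoji, "")
    # Collapse any whitespace left behind by the removal.
    return " ".join(cleaned.split()), int(has_anger)


if __name__ == "__main__":
    for text in ["yeh kya bakwas hai 😡", "bahut accha 😊", "kal milte hain"]:
        print(repr(text), "->", proxy_label(text))
```

The pairs produced this way could serve as noisy training data, with a manually annotated set kept aside for evaluation as the abstract describes.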