24 Jan Developing Anupam: A SentiWordNet for Hindi in use
(Presented at the CALA 2019: Revitalisation and Representation, January 23 to 25, 2019, at Paññāsāstra University of Cambodia, Siem Reap, Cambodia)
Towards developing ‘Anupam’: A SentiWordNet for Hindi in use
SentiWordNets are the prime resources used for Sentiment Analysis or Opinion Mining. With the help of a SentiWordNet, the sentiment expressed in a linguistic utterance can be assigned (a) Positive, (b) Negative and (c) Objective scores (1).
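As a rough illustration of how such three-way scores can be used, the following Python sketch looks up words in a toy lexicon and labels a token list by its dominant average score. It follows the SentiWordNet 3.0 convention that the Positive, Negative and Objective scores of an entry sum to 1. The lexicon entries, their score values, and the names `SentiScore`, `TOY_LEXICON` and `classify` are hypothetical and chosen only for illustration; they are not taken from Anupam or from any existing resource, and the averaging rule is an assumption rather than the method of this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentiScore:
    """Positive and negative scores for one lexicon entry."""
    pos: float
    neg: float

    @property
    def obj(self) -> float:
        # SentiWordNet 3.0 convention: pos + neg + obj = 1
        return 1.0 - self.pos - self.neg


# Hypothetical entries with made-up scores, for illustration only.
TOY_LEXICON = {
    "अच्छा": SentiScore(pos=0.75, neg=0.0),  # "good"
    "बुरा": SentiScore(pos=0.0, neg=0.75),   # "bad"
    "किताब": SentiScore(pos=0.0, neg=0.0),   # "book"
}


def classify(tokens, lexicon=TOY_LEXICON):
    """Label a token list by the largest average score of its known words."""
    scored = [lexicon[t] for t in tokens if t in lexicon]
    if not scored:
        # No known sentiment-bearing word: treat the input as objective.
        return "objective"
    n = len(scored)
    averages = [
        ("positive", sum(s.pos for s in scored) / n),
        ("negative", sum(s.neg for s in scored) / n),
        ("objective", sum(s.obj for s in scored) / n),
    ]
    # max() keeps the first label in case of a tie.
    return max(averages, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(classify(["अच्छा", "किताब"]))
    print(classify(["बुरा"]))
```

A word-level average like this ignores negation, intensifiers and language-specific sentiment devices, which is one reason a lexicon borrowed from another language can score Hindi text poorly.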
There have been several attempts to develop SentiWordNets for Indian languages automatically and semi-automatically, using various computational techniques and drawing on existing resources in other languages (2). However, no Hindi resource developed to date has been created with the socio-cultural and linguistic nuances of Hindi in mind.
Dependence on the resources of other languages poses various problems, such as loose or no handling of indigenous concepts, erroneous scoring due to linguistic divergence, and no coverage of language-specific sentiment expression devices, to name a few.
As for recent trends, with the evolution of social media platforms and mass media, Hindi, along with other Indian languages, has undergone many changes at different levels, which directly affect how a Hindi speaker's intent is conveyed. Excluding the devices used in this domain therefore reduces accuracy, which in turn defeats the purpose of developing such resources.
In this regard, this paper presents a resource under development for the Hindi language and the justification for creating it rather than relying on the currently available resources.
1. Baccianella, Stefano, Andrea Esuli, and Fabrizio Sebastiani. “SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining.” Proceedings of LREC 2010. 2010.
2. Das, Amitava, and Sivaji Bandyopadhyay. “SentiWordNet for Indian Languages.” Proceedings of the Eighth Workshop on Asian Language Resources. 2010.
Keywords: SentiWordNet, Sentiment Analysis, Hindi