IIIT-H Researcher Sreekavitha Parupalli Develops Largest Computational Version of Telugu Dictionary

In conversation with Sreekavitha Parupalli, an MS by Research graduate from the Centre for Exact Humanities, who speaks of her research work, inspired by the vision of the late Prof. Navjyoti Singh, and how it has enabled her to contribute to her own community.

For researchers working in the areas of language processing, machine translation, word sense disambiguation, culture studies, language teaching, dictionary compilation, and other areas of linguistics and language technology, plenty of English-language resources exist online, notably the lexical database WordNet. In the case of Indian languages, however, the resources available online are far fewer. And for languages such as Telugu, the challenges are even greater. Even though a Telugu WordNet exists, it has limited coverage and researchers face a few issues while using it. “Most researchers prefer working on existing resources to enrich the systems rather than creating one online from scratch. While there are students working on creating online resources for Kannada and Malayalam among other South Indian languages, very few from our campus seemed to work on the Telugu language”, says Sreekavitha Parupalli, one of the youngest researchers to receive her Master’s degree this year. By manually annotating 8483 verbs, 253 adverbs and 1673 adjectives from an authentic Telugu-Telugu dictionary, Sreekavitha has contributed to the enrichment of OntoSenseNet, a verb-centric ontological resource for Indian languages.

Telugu Bidda

Hailing from Khammam in Telangana, Sreekavitha studied Telugu as her first language during her schooling, which gave her an edge: she could not only converse fluently in Telugu, but also read and write it with ease. While she credits the late Prof. Navjyoti Singh for having nudged her into this research area, Sreekavitha is candid enough to admit that her first reaction to Telugu research was a horrified “No”. “When everyone else was doing fancy stuff, working on cutting-edge technology like cloud computing and data analytics, I was embarrassed to say I was doing Telugu research,” she says. It was when Sreekavitha was assisting a senior in creating a Telugu database for names of colours that she discovered her proclivity towards her own roots and heritage. “I sat with my grandmother and noted down different names we use for different colours in Telugu, such as turmeric yellow, a yellow used to describe a particular kind of a flower and so on. It was not only fun, and interesting, but I also realised that it wasn’t so difficult after all”, she laughs.

Telugu Linguistic Heritage

Starting out with the mind-numbing task of typing out 21,000-odd Telugu words along with their meanings from an authentic dictionary, with help from her mother, Vijaya Lakshmi, Sreekavitha went on to build a database of Telugu words in their usage as verbs, adverbs and adjectives. Explaining how many more researchers can now actively engage in Telugu research, Sreekavitha says that her work aims to preserve and promote the usage of an authentic Telugu dictionary by developing a computational version of it. “With the developed Telugu dictionary in hand, native speakers can perform better annotation tasks as both the word and its meaning are in a language they are familiar with. And any computer science expert or researcher can design algorithms that can work on top of the annotated language data. My research work has added numerous words to the existing resources such as the Telugu WordNet”, she says.

Building Block

Sreekavitha presented her paper at the 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2018) in Hanoi, Vietnam. She mentions ruefully that her “beloved Professor” Dr. Navjyoti Singh unfortunately passed away following an illness before she could break the news of having published and later presented the paper. Confronted with a dilemma on whether to continue in the same field of research or to change tracks, she says, “I thought a lot about it and realised that Prof. wanted me to focus on Telugu research. And I continued in the same field, albeit under a new advisor, Prof. Radhika Mamidi.” Only this time, Sreekavitha turned her focus towards the area of sentiment analysis. “Using the resource (dictionary) that I developed and the proposed approaches, we found some improvement in the accuracy of sentiment analysis task in Telugu, compared with the other resources available”, she says. She presented another two papers at the Annual Meeting of the Association for Computational Linguistics (ACL 2018) in Melbourne, Australia, and two more remotely at the conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2018) and the 27th International Conference on Computational Linguistics (COLING 2018). “All of this is new and pioneering work since no one else has done it before. And the multiple papers I published are basically experiments that adopt the dictionary developed and the ontology proposed by Navjyoti sir,” explains Sreekavitha.

Learnings from IIIT-H

With a self-confessed single agenda of “making my parents proud”, Sreekavitha followed her father’s advice of applying to IIIT-H and joined as an Integrated B.Tech student in CS and MS by Research in Exact Humanities. “Being an educationist by profession, he believed that it is not only technology that is necessary but one should be in touch with the Humanities side of things too”, says Sreekavitha about the vision of her father, Usha Kiran Kumar. Stating emphatically that IIIT-H has taught her everything, Sreekavitha says, “All the thinking and analytical ability and the confidence I have in myself, I owe it to IIIT-H”. She adds that the papers she was able to publish and the associated travel undertaken are the result of the curriculum and research culture at IIIT-H.

Well Rounded

It was not all research and lab work for Sreekavitha, though. Currently working as a Product Engineer at Sprinklr (a social media management system for enterprises) in Gurugram, Sreekavitha credits her placement to the internship she did at Siemens in blockchain technology last summer. She also attended the Stanford Summer School, where she took short courses in Psychology, Technology and Innovation, and Decision-Making and, as a marketing intern, gained hands-on experience with digital marketing during a project with Hewlett-Packard Enterprise (HP). Sreekavitha was awarded a special mention for promoting female entrepreneurship on campus as one of the coordinators of the Entrepreneurial Cell and co-founder of the Lean In movement at IIIT-H. When not watching movies or travelling with friends, Sreekavitha maintains a travel blog documenting her journeys to the places she has visited. She has also acted in a short film, which helped her understand the process of filmmaking, a subject of immense interest to her. On prospects of further research, Sreekavitha signs off saying, “While I worked on automatically identifying the primary sense of the verb in Telugu OntoSenseNet and enhanced Sentiment Analysis, more work needs to be done on the Ontology proposed for adjectives and adverbs. Perhaps someone from the junior batches of IIIT-H will take it up as their area of research to explore the true potential of the theory”.