Research and Interests

I am interested in Artificial Intelligence, particularly in fields related to:

  • Machine Learning
  • Data Mining
  • Natural Language Processing
  • Information Retrieval

I have completed courses on Data Mining, Natural Language Processing, and Pattern Recognition as electives in the BTech curriculum, and I am taking Computational Intelligence as an elective this semester. Apart from these, I have completed the following online courses:
    
  • Natural Language Processing - Dan Jurafsky and Christopher Manning
  • Machine Learning - Andrew Ng
  • Introduction to Artificial Intelligence - Sebastian Thrun and Peter Norvig
However, I do not have a certificate of completion for any of these, as I could not meet the deadlines while balancing the BTech curriculum with these online courses.
    
I have worked on the following projects in these fields:
    
  • Explicit Semantic Analysis for Computing Semantic Relatedness of Biomedical Text: Using Explicit Semantic Analysis to develop a method for computing the semantic relatedness of text in the biomedical domain that outperforms the current state of the art in this domain.
  • Location Prediction With Sparse GPS Data: Applying location prediction techniques developed for large, dense GPS datasets to small, sparse datasets, and addressing the problem of sparsity in the process. I worked on this during my internship as a Research Scholar at the Information Sciences Institute, University of Southern California, under Professor Craig A. Knoblock.
  • Coreference Resolution for Keyphrase Extraction: Trying to improve keyphrase extraction by applying coreference resolution. I worked on this project with Nicolai Erbs of Technische Universität Darmstadt. Applying coreference resolution did not worsen the accuracy of keyphrase extraction on the datasets that we tried. Work on this project is ongoing, and the code is yet to be tested on more datasets, but I am not currently working on it.
  • Coreference Resolution System: A coreference resolution system, based on "A Multi-Pass Sieve for Coreference Resolution" by Raghunathan et al., built with Anunay Bhargava. The system was implemented from scratch in Python, but also used Stanford Parser, JPype (a Python-to-Java bridge, used here to call Stanford Parser from Python), and Stanford NER. This was a course project for the Natural Language Processing theory course.
  • Handwritten Character Recognizer: A handwritten character recognizer based on an artificial neural network, developed using OpenCV in Python. This was a course project for the Pattern Recognition theory course.
  • Language Identifier: A Naive Bayes language identifier, built from scratch in Java, that can identify the language of a document from among 10 European languages. More languages can be added to the identifier without changing the code. This was a small independent project.
In the future, I plan to explore other fields of AI as well. My long-term goals include pursuing higher studies and doing research in AI.