Projects
Computer Vision
- AI Image Analysis of HP's Large-scale Printing Presses (2024) Partnered with HP to automate their million-dollar printing press by characterizing the PDF documents to be printed. We framed document characterization as a semantic image segmentation problem, using models such as DeepLab, U-Net, and vision transformers, which yielded a significant increase in both speed and accuracy over their classical system.
View code on GitHub
View the report
Natural Language Processing
- Commonsense Reasoning in Large Language Models (2024) This project investigates and aims to enhance the reasoning capabilities of LLMs. We work with problems that require deduction and/or multi-hop reasoning to reach a valid conclusion. We require the LLMs to formulate every reasoning problem as a Constraint Satisfaction Problem (CSP), which helps them make the constraints explicit and leads to better deductions and calculations.
- Mitigating Bias in Downstream NLP Models (2023) Research on debiasing in downstream tasks has mainly focused on a single bias dimension, and the resulting methods frequently do not transfer to other dimensions. We aimed to fill this gap by introducing a generalized adversarial technique for debiasing downstream tasks. We worked with hate speech detection and experimented with three bias dimensions: gender, race, and religion. Our investigation suggests that adversarial training has the potential to serve as a generalized debiasing technique.
View the report
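To illustrate what formulating a reasoning problem as a CSP can look like in the commonsense reasoning project above, here is a minimal sketch. The race puzzle, the `solve_csp` function, and the constraint encoding are hypothetical examples chosen for illustration; they are not taken from the project's code or prompts.

```python
def solve_csp(variables, domains, constraints, assignment=None):
    """Backtracking search over a small CSP (illustrative sketch).

    constraints is a list of (scope, check) pairs, where scope names the
    variables a check needs and check(assignment) returns True or False.
    Returns the first complete assignment found, or None.
    """
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return dict(assignment)

    # Pick the next unassigned variable in declaration order.
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        # Only evaluate constraints whose variables are all assigned.
        consistent = all(
            check(assignment)
            for scope, check in constraints
            if all(name in assignment for name in scope)
        )
        if consistent:
            result = solve_csp(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]  # undo and try the next value
    return None


# Hypothetical puzzle: three runners finish in places 1 to 3.
runners = ["Alice", "Bob", "Carol"]
places = {name: [1, 2, 3] for name in runners}
rules = [
    (("Alice", "Bob"), lambda a: a["Alice"] != a["Bob"]),
    (("Alice", "Carol"), lambda a: a["Alice"] != a["Carol"]),
    (("Bob", "Carol"), lambda a: a["Bob"] != a["Carol"]),
    (("Alice", "Bob"), lambda a: a["Alice"] < a["Bob"]),  # Alice beat Bob
    (("Carol",), lambda a: a["Carol"] != 3),              # Carol was not last
    (("Alice",), lambda a: a["Alice"] != 1),              # Alice did not win
]

solution = solve_csp(runners, places, rules)
print(solution)
```

Stating each fact as an explicit constraint over named variables is what lets a solver, or a model reasoning step by step, rule out inconsistent conclusions mechanically.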
Data Science
- Black Student Success Initiative @ Oregon State University (2023) We devised data-driven strategic interventions, based on both descriptive and predictive analytics, to reduce stop-out among Black students at Oregon State University and to raise their graduation rate. If implemented as policies, the interventions will help identify at-risk students early and automatically. They will also streamline the process of getting students the help they need before they reach a critical level.
- Missing Values Imputation with Variational Autoencoder (2023) In this project, we studied a paper reporting that deep neural networks (DNNs) outperform both conventional linear and nonlinear methods on matrix completion. Building on this idea, we used a Variational Autoencoder (VAE) to exploit its generative ability for imputing the missing entries in matrices. Our experimental results show VAEs outperforming both DNNs and CNNs, with robustness as an additional advantage.
- Topic Modeling with Polya Distribution (2023) In this project, we estimated the parameters of the Polya distribution, concentrating on the beta-binomial model. Through mathematical analysis, we derived and computed the Fisher Information Matrix (FIM) and the Cramér-Rao Lower Bound (CRLB) under well-defined assumptions. We also applied Maximum Likelihood Estimation (MLE) and the Method of Moments to estimate the parameters, addressing challenges such as the absence of closed-form solutions and the limitations of fixed-point iteration methods.
View code on GitHub
View code on GitHub
Presentation on Google Slides
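As a rough illustration of the Method of Moments step mentioned in the Polya distribution project, the sketch below computes the standard closed-form moment estimates for a beta-binomial model. It assumes every observation shares the same known number of trials `n`; the function name and the sample data are hypothetical and do not come from the project.

```python
def beta_binomial_mom(counts, n):
    """Method-of-moments estimates (alpha, beta) for a beta-binomial model.

    counts: observed success counts, each between 0 and n.
    n: number of trials per observation (assumed fixed and known).
    Uses the first two raw sample moments m1 and m2.
    """
    size = len(counts)
    m1 = sum(counts) / size
    m2 = sum(x * x for x in counts) / size

    # The denominator is shared by both estimates. It is zero or negative
    # when the sample is not overdispersed relative to a binomial, in which
    # case the moment estimates are not valid.
    denom = n * (m2 / m1 - m1 - 1) + m1
    if denom <= 0:
        raise ValueError("sample is not overdispersed; estimates undefined")

    alpha = (n * m1 - m2) / denom
    beta = (n - m1) * (n - m2 / m1) / denom
    return alpha, beta


# Hypothetical sample whose moments match alpha=2, beta=3 with n=10.
alpha_hat, beta_hat = beta_binomial_mom([1, 4, 7], n=10)
print(alpha_hat, beta_hat)
```

Unlike the maximum likelihood estimates, which have no closed form for this model and need an iterative solver, the moment estimates can be computed directly and are often used as a starting point for that iteration.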