Reducing Misclassification Risk in Dynamic Graph Neural Networks through Abstention
@inproceedings{bib_Redu_2025, AUTHOR = {Jayadratha Gayen and Himanshu Pal and Naresh Manwani and Charu Sharma}, TITLE = {Reducing Misclassification Risk in Dynamic Graph Neural Networks through Abstention}, BOOKTITLE = {IEEE International Conference on Advances in Social Networks Analysis and Mining}, YEAR = {2025}}
Many real-world systems can be modeled as dynamic graphs, where nodes and edges evolve over time, requiring specialized models to capture their evolving dynamics in risk-sensitive applications effectively. Graph neural networks (GNNs) for temporal graphs are one such category of specialized models. For the first time, our approach integrates a reject option strategy within the framework of GNNs for continuous time dynamic graphs (CTDGs). This allows the model to strategically abstain from making predictions when the uncertainty is high and confidence is low, thus minimizing the risk of critical misclassification and enhancing the results and reliability. We propose a coverage-based abstention prediction model to implement the reject option that maximizes prediction within a specified coverage. It improves the prediction score for link prediction and node classification tasks. Temporal GNNs deal with extremely skewed datasets for the next state prediction or node classification task. In the case of class imbalance, our method can be further tuned to provide a higher weight to the minority class. Exhaustive experiments are presented on four datasets for dynamic link prediction and two datasets for dynamic node classification tasks. This demonstrates the effectiveness of our approach in improving the reliability and area under the curve (AUC)/average precision (AP) scores for predictions in dynamic graph scenarios. The results highlight our model’s ability to efficiently handle the trade-offs between prediction confidence and coverage, making it a dependable solution for applications requiring high precision in dynamic and uncertain environments.
@inproceedings{bib_Conf_2025, AUTHOR = {Jayadratha Gayen and Himanshu Pal and Naresh Manwani and Charu Sharma}, TITLE = {Confidence First: Reliability-Driven Temporal Graph Neural Networks}, BOOKTITLE = {KNOWLEDGE DISCOVERY AND DATA MINING WORKSHOPS}, YEAR = {2025}}
Many real-world systems can be modeled as dynamic graphs, where nodes and edges evolve over time, requiring specialized models to capture their evolving dynamics in risk-sensitive applications effectively. Graph neural networks (GNNs) for temporal graphs are one such category of specialized models. For the first time, our approach integrates a reject option strategy within the framework of GNNs for continuous-time dynamic graphs (CTDGs). This allows the model to strategically abstain from making predictions when the uncertainty is high and confidence is low, thus minimizing the risk of critical misclassification and enhancing the results and reliability. We propose a coverage-based abstention prediction model to implement the reject option that maximizes prediction within a specified coverage. It improves the prediction score for link prediction and node classification tasks. Temporal GNNs deal with extremely skewed datasets for the next state prediction or node classification task. In the case of class imbalance, our method can be further tuned to provide a higher weight to the minority class. Exhaustive experiments are presented on four datasets for dynamic link prediction and two datasets for dynamic node classification tasks. This demonstrates the effectiveness of our approach in improving the reliability and area under the curve (AUC)/average precision (AP) scores for predictions in dynamic graph scenarios. The results highlight our model's ability to efficiently handle the trade-offs between prediction confidence and coverage, making it a dependable solution for applications requiring high precision in dynamic and uncertain environments.
@inproceedings{bib_Pseu_2025, AUTHOR = {Darshana S and Naresh Manwani and Vineet Gandhi}, TITLE = {Pseudo-labelling meets Label Smoothing for Noisy Partial Label Learning}, BOOKTITLE = {Computer Vision and Pattern Recognition Conference workshops}, YEAR = {2025}}
We motivate weakly supervised learning as an effective learning paradigm for problems where curating perfectly annotated datasets is expensive and may require domain expertise such as fine-grained classification. We focus on Partial Label Learning (PLL), a weakly-supervised learning paradigm where each training instance is paired with a set of candidate labels (partial label), one of which is the true label. Noisy PLL (NPLL) relaxes this constraint by allowing some partial labels to not contain the true label, enhancing the practicality of the problem. Our work centres on NPLL and presents a framework that initially assigns pseudo-labels to images by exploiting the noisy partial labels through a weighted nearest neighbour algorithm. These pseudo-label and image pairs are then used to train a deep neural network classifier with label smoothing. The classifier's features and predictions are subsequently employed to refine and enhance the accuracy of pseudo-labels. We perform thorough experiments on seven datasets and compare against nine NPLL and PLL methods. We achieve state-of-the-art results in all studied settings from the prior literature, obtaining substantial gains in the simulated fine-grained benchmarks. Further, we show the promising generalisation capability of our framework in realistic, fine-grained, crowd-sourced datasets.
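As a rough illustration of the label-smoothing step mentioned above, the sketch below turns a hard pseudo-label into a smoothed target distribution. It uses the standard uniform label-smoothing formula; the function name `smooth_pseudo_label` and the smoothing rate `eps` are assumptions made for illustration, and the abstract does not specify the exact training setup.

```python
def smooth_pseudo_label(label, num_classes, eps=0.1):
    """Standard uniform label smoothing: weight (1 - eps) on the pseudo-label,
    with eps spread evenly over all classes (eps is a hypothetical setting)."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    return [(1.0 - eps) * (k == label) + eps / num_classes
            for k in range(num_classes)]


# Smoothed target for pseudo-label 2 in a 4-class problem.
target = smooth_pseudo_label(label=2, num_classes=4, eps=0.1)
print(target)
```

The smoothed target still sums to one and keeps most of its mass on the pseudo-label, so an incorrect pseudo-label is penalised less harshly than with a one-hot target.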
@inproceedings{bib_Node_2025, AUTHOR = {Uday Bhaskar K and Jayadratha Gayen and Charu Sharma and Naresh Manwani}, TITLE = {Node Classification With Reject Option}, BOOKTITLE = {Transactions in Machine Learning Research}, YEAR = {2025}}
One of the key tasks in graph learning is node classification. While Graph neural networks have been used for various applications, their adaptivity to reject option settings has not been previously explored. In this paper, we propose NCwR, a novel approach to node classification in Graph Neural Networks (GNNs) with an integrated reject option. This allows the model to abstain from making predictions when uncertainty is high. We propose cost-based and coverage-based methods for classification with abstention in node classification settings using GNNs. We perform experiments using our method on three standard citation network datasets Cora, Citeseer and Pubmed and compare with relevant baselines. We also model the Legal judgment prediction problem on the ILDC dataset as a node classification problem, where nodes represent legal cases and edges represent citations. We further interpret the model by analyzing the cases in which it abstains from predicting and visualizing which part of the input features influenced this decision.
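The two abstention styles named in the abstract can be illustrated with a minimal sketch. This is not the NCwR implementation: it assumes per-node class probabilities are already available (for example from a GNN softmax), and the function names, the Chow-style threshold `1 - d`, and the use of `-1` to mark abstention are all illustrative assumptions.

```python
def cost_based_reject(probs, d):
    """Chow-style rule (an assumption, not NCwR itself): abstain when the
    top class probability is below 1 - d, where d is the rejection cost."""
    out = []
    for p in probs:
        top = max(range(len(p)), key=lambda k: p[k])
        out.append(top if p[top] >= 1.0 - d else -1)  # -1 marks abstention
    return out


def coverage_based_reject(probs, coverage):
    """Predict on the `coverage` fraction of most confident nodes,
    abstain (-1) on the rest."""
    tops = [max(range(len(p)), key=lambda k: p[k]) for p in probs]
    conf = [p[t] for p, t in zip(probs, tops)]
    n_keep = int(round(coverage * len(probs)))
    order = sorted(range(len(probs)), key=lambda i: -conf[i])
    keep = set(order[:n_keep])
    return [t if i in keep else -1 for i, t in enumerate(tops)]


# Hypothetical class probabilities for four nodes.
probs = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.5, 0.5]]
print(cost_based_reject(probs, d=0.3))
print(coverage_based_reject(probs, coverage=0.5))
```

The cost-based rule lets the abstention rate vary with the data, while the coverage-based rule fixes the fraction of nodes on which a prediction is made.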
@inproceedings{bib_Achi_2025, AUTHOR = {Vidhi Rathore and Naresh Manwani}, TITLE = {Achieving Fair PCA using Joint Eigenvalue Decomposition}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}, YEAR = {2025}}
Principal Component Analysis (PCA) is a widely used method for dimensionality reduction, but it often overlooks fairness, especially when working with data that includes demographic characteristics. This can lead to biased representations that disproportionately affect certain groups. To address this issue, our approach incorporates Joint Eigenvalue Decomposition (JEVD), a technique that enables the simultaneous diagonalization of multiple matrices, ensuring both fair and efficient representations. We formally show that the optimal solution of JEVD leads to a fair PCA solution. By integrating JEVD with PCA, we strike an optimal balance between preserving data structure and promoting fairness across diverse groups. We demonstrate that our method outperforms existing baseline approaches in fairness and representational quality on various datasets. It retains the core advantages of PCA while ensuring that sensitive demographic attributes do not create disparities in the reduced representation.
@inproceedings{bib_Node_2025, AUTHOR = {Uday Bhaskar K and Jayadratha Gayen and Charu Sharma and Naresh Manwani}, TITLE = {Node Classification With Integrated Reject Option For Legal Judgement Prediction}, BOOKTITLE = {Association for the Advancement of Artificial Intelligence Workshop}, YEAR = {2025}}
One of the key tasks in graph learning is node classification. While graph neural networks have been used for various applications, their adaptivity to the reject option setting has not been previously explored. In this paper, we propose NCwR, a novel approach to node classification in Graph Neural Networks (GNNs) with an integrated reject option, which allows the model to abstain from making predictions when uncertainty is high. We propose both cost-based and coverage-based methods for classification with abstention in the node classification setting using GNNs. We perform experiments using our method on three standard citation network datasets (Cora, Citeseer and Pubmed) and compare with relevant baselines. We also model the legal judgment prediction problem on the ILDC dataset as a node classification problem, where nodes represent legal cases and edges represent citations. We further interpret the model by analyzing the cases in which it abstains from predicting and visualizing which part of the input features influenced this decision.
Text Representation Models based on the Spatial Distributional Properties of Word Embeddings
@inproceedings{bib_Text_2024, AUTHOR = {Narendra Babu Unnam and Krishna Reddy Polepalli and Amit Pandey and Naresh Manwani}, TITLE = {Text Representation Models based on the Spatial Distributional Properties of Word Embeddings}, BOOKTITLE = {India Joint International Conference on Data Science \& Management of Data}, YEAR = {2024}}
In the current digital era, about 80% of the digital data being generated is unstructured and unlabeled natural language text. In the development cycle of information retrieval and text mining applications, text representation is the most fundamental and critical step, as its effectiveness directly impacts the application’s performance. The existing traditional text representation frameworks are mostly frequency distribution-based. In this work, we explore the spatial distribution of word embeddings and propose two text representation models. The experimental results demonstrate that the proposed models perform consistently better at text mining tasks than baseline methods.
@inproceedings{bib_Towa_2024, AUTHOR = {Vrund Shah and Tejas Kiran Chaudhari and Naresh Manwani}, TITLE = {Towards Calibrated Losses for Adversarial Robust Reject Option Classification}, BOOKTITLE = {Asian Conference on Machine Learning}, YEAR = {2024}}
Robustness towards adversarial attacks is a vital property for classifiers in several applications such as autonomous driving, medical diagnosis, etc. Also, in such scenarios, where the cost of misclassification is very high, knowing when to abstain from prediction becomes crucial. A natural question is which surrogates can be used to ensure learning in scenarios where the input points are adversarially perturbed and the classifier can abstain from prediction? This paper aims to characterize and design surrogates calibrated in the \enquote{Adversarial Robust Reject Option} setting. First, we propose an adversarial robust reject option loss $\ell_{d}^{\gamma}$ and analyze it for the hypothesis set of linear classifiers ($\mathcal{H}_{\textrm{lin}}$). Next, we provide a complete characterization result for any surrogate to be $(\ell_{d}^{\gamma},\mathcal{H}_{\textrm{lin}})$-calibrated. To demonstrate the difficulty in designing surrogates to $\ell_{d}^{\gamma}$, we show negative calibration results for convex surrogates and quasi-concave conditional risk cases (these gave positive calibration in adversarial setting without reject option). We also empirically argue that Shifted Double Ramp Loss (DRL) and Shifted Double Sigmoid Loss (DSL) satisfy the calibration conditions. Finally, we demonstrate the robustness of shifted DRL and shifted DSL against adversarial perturbations on a synthetically generated dataset.
@inproceedings{bib_EQUI_2024, AUTHOR = {Tejas Kiran Chaudhari and Naresh Manwani}, TITLE = {EQUISCALE: Equitable Scaling for Abstention Learning}, BOOKTITLE = {Pacific Rim International Conference on Artificial Intelligence}, YEAR = {2024}}
We propose an approach to train a fair cost-based abstain option classifier. Existing literature on fairness in classification with abstention is limited, covering only coverage-based abstention models. In coverage-based abstention models, the target coverage is decided beforehand and is kept the same for all the groups. In contrast, cost-based approaches introduce a cost for abstention, which can cause uneven abstention rates between different groups, leading to an unfair system. We extend the independence and separation fairness criteria to consider abstention. We provide a model-agnostic in-processing algorithm to incorporate these constraints in the models. We demonstrate the algorithm’s efficacy by experimenting with two different cost-based abstain option classifiers. Additionally, we explore mixing constraints from independence and separation criteria into one model, which is impossible in a binary classification task.
ILAEDA: An Imitation Learning Based Approach for Automatic Exploratory Data Analysis
@inproceedings{bib_ILAE_2024, AUTHOR = {Manatkar Abhijit Devendra and Devarsh Patel and Hima Patel and Naresh Manwani}, TITLE = {ILAEDA: An Imitation Learning Based Approach for Automatic Exploratory Data Analysis}, BOOKTITLE = {Association for Computing Machinery}, YEAR = {2024}}
Automating end-to-end Exploratory Data Analysis (AutoEDA) is a challenging open problem, often tackled through Reinforcement Learning (RL) by learning to predict a sequence of analysis operations (FILTER, GROUP, etc.). Defining rewards for each operation is a challenging task, and existing methods rely on various \emph{interestingness measures} to craft reward functions that capture the importance of each operation. In this work, we argue that not all of the essential features of what makes an operation important can be accurately captured mathematically using rewards. We propose an AutoEDA model trained through imitation learning from expert EDA sessions, bypassing the need for manually defined interestingness measures. Our method, based on generative adversarial imitation learning (GAIL), generalizes well across datasets, even with limited expert data. We also introduce a novel approach for generating synthetic EDA demonstrations for training. Our method outperforms the existing state-of-the-art end-to-end EDA approach on benchmarks by up to 3x, showing strong performance and generalization, while naturally capturing diverse interestingness measures in generated EDA sessions.
CDAN: Cost Dependent Deep Abstention Network
Bhavya Kalra,Naresh Manwani
Pacific Rim International Conference on Artificial Intelligence, PRICAI, 2023
@inproceedings{bib_CDAN_2023, AUTHOR = {Bhavya Kalra and Naresh Manwani}, TITLE = {CDAN: Cost Dependent Deep Abstention Network}, BOOKTITLE = {Pacific Rim International Conference on Artificial Intelligence}, YEAR = {2023}}
This paper proposes deep architectures for learning instance-specific abstain (reject) option multiclass classifiers. The proposed approach uses a novel bounded multiclass abstention loss for multiclass classification as a performance measure. This approach uses rejection cost as the rejection parameter, in contrast to coverage-based approaches. To show the effectiveness of the proposed approach, we experiment with several real-world datasets and compare it with state-of-the-art coverage-based and cost-of-rejection-based techniques. The experimental results show that the proposed method improves performance over the state-of-the-art approaches.
Features Normalisation and Standardisation (FNS): An Unsupervised Approach for Detecting Adversarial Attacks for Medical Images.
M Sreenivasan,Naresh Manwani
International Conference on Agents and Artificial Intelligence, ICAART, 2023
@inproceedings{bib_Feat_2023, AUTHOR = {M Sreenivasan and Naresh Manwani}, TITLE = {Features Normalisation and Standardisation (FNS): An Unsupervised Approach for Detecting Adversarial Attacks for Medical Images.}, BOOKTITLE = {International Conference on Agents and Artificial Intelligence}, YEAR = {2023}}
Deep learning systems have shown state-of-the-art performance in clinical prediction tasks. However, current research suggests that carefully crafted adversarial images can trick these systems, which has raised questions about the practical deployment of deep learning-based medical image classification algorithms. To address this problem, we provide an unsupervised learning technique for detecting adversarial attacks on medical images. Without identifying the attackers or reducing classification performance, our suggested strategy, FNS (Features Normalization and Standardization), can detect adversarial attacks more effectively than earlier methods.
RPL-SVM: Making SVM Robust Against Missing Values and Partial Labels
M Sreenivasan,Naresh Manwani
Pacific Rim International Conference on Artificial Intelligence, PRICAI, 2023
@inproceedings{bib_RPL-_2023, AUTHOR = {M Sreenivasan and Naresh Manwani}, TITLE = {RPL-SVM: Making SVM Robust Against Missing Values and Partial Labels}, BOOKTITLE = {Pacific Rim International Conference on Artificial Intelligence}, YEAR = {2023}}
We present a novel second-order cone programming framework to learn robust classifiers that can tolerate uncertainty in the observations of partially labeled multiclass classification problems. We call it RPL-SVM. Our formulation is based on a chance-constrained framework. Experimental results show that RPL-SVM efficiently learns multiclass classifiers with missing values in a partial label setting.
Delaytron: Efficient Learning of Multiclass Classifiers with Delayed Bandit Feedbacks
Naresh Manwani,Mudit Agarwal
International Joint Conference on Neural Networks, IJCNN, 2023
@inproceedings{bib_Dela_2023, AUTHOR = {Naresh Manwani and Mudit Agarwal}, TITLE = {Delaytron: Efficient Learning of Multiclass Classifiers with Delayed Bandit Feedbacks}, BOOKTITLE = {International Joint Conference on Neural Networks}, YEAR = {2023}}
In this paper, we present an online algorithm called {\it Delaytron} for learning multiclass classifiers using delayed bandit feedbacks. The sequence of feedback delays $\{d_t\}_{t=1}^T$ is unknown to the algorithm. At the $t$-th round, the algorithm observes an example $\mathbf{x}_t$, predicts a label $\tilde{y}_t$, and receives the bandit feedback $\mathbb{I}[\tilde{y}_t=y_t]$ only $d_t$ rounds later. When $t+d_t>T$, we consider that the feedback for the $t$-th round is missing. We show that the proposed algorithm achieves regret of $\mathcal{O}\left(\sqrt{\frac{2K}{\gamma}\left[\frac{T}{2}+\left(2+\frac{L^2}{R^2\Vert W\Vert_F^2}\right)\sum_{t=1}^T d_t\right]}\right)$ when the loss for each missing sample is upper bounded by $L$. In the case when the loss for missing samples is not upper bounded, the regret achieved by Delaytron is $\mathcal{O}\left(\sqrt{\frac{2K}{\gamma}\left[\frac{T}{2}+2\sum_{t=1}^T d_t+\vert\mathcal{M}\vert T\right]}\right)$, where $\mathcal{M}$ is the set of missing samples in $T$ rounds. These bounds were achieved with a constant step size, which requires the knowledge of $T$ and $\sum_{t=1}^T d_t$. For the case when $T$ and $\sum_{t=1}^T d_t$ are unknown, we use a doubling trick for online learning and propose Adaptive Delaytron. We show that Adaptive Delaytron achieves a regret bound of $\mathcal{O}\left(\sqrt{T+\sum_{t=1}^T d_t}\right)$. We experimentally show that the proposed approach can learn efficient classifiers even with delayed bandit feedbacks and that the accuracy does not degrade much due to delays in feedbacks.
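The delayed-feedback protocol described in this abstract can be made concrete with a small bookkeeping sketch. It only models when feedback arrives and which rounds end up missing; it does not implement the Delaytron update, and the function name `feedback_schedule` and the example delays are hypothetical.

```python
def feedback_schedule(delays):
    """delays[t-1] holds d_t for rounds t = 1..T.

    Returns (arrivals, missing): arrivals[s] lists the rounds whose feedback
    is revealed at round s, and missing lists rounds with t + d_t > T."""
    T = len(delays)
    arrivals = {s: [] for s in range(1, T + 1)}
    missing = []
    for t, d in enumerate(delays, start=1):
        if t + d > T:
            missing.append(t)          # feedback never arrives within T rounds
        else:
            arrivals[t + d].append(t)  # feedback for round t shows up d rounds later
    return arrivals, missing


# Hypothetical delay sequence for T = 5 rounds.
delays = [0, 2, 1, 3, 0]
arrivals, missing = feedback_schedule(delays)
print(arrivals)
print(missing)
print(sum(delays))  # the total delay, which appears in the regret bounds
```

In this example the feedback for rounds 2 and 3 arrives together at round 4, and round 4 itself is missing because its feedback would arrive after the horizon.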
ALBIF: Active Learning with BandIt Feedbacks
Mudit Agarwal,Naresh Manwani
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2022
@inproceedings{bib_ALBI_2022, AUTHOR = {Mudit Agarwal and Naresh Manwani}, TITLE = {ALBIF: Active Learning with BandIt Feedbacks}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}, YEAR = {2022}}
Online active learning algorithms reduce human labeling costs by querying only a subset of informative incoming instances from the data stream to update the classification model. Active learning for online multiclass classification under complete information has been well addressed; however, it remains unaddressed for the bandit setting. In this paper, we investigate online active learning techniques under the bandit feedback setting. We propose an efficient algorithm for learning a multiclass classifier with bandit feedbacks under the active learning setting. The proposed algorithm enjoys a regret bound of the order O(log T) in the active learning setting as well as in the standard (non-active) bandit feedback setting. We show the effectiveness of the proposed approach using extensive experiments on several benchmark
Advances in Exploratory Data Analysis, Visualisation and Quality for Data Centric AI Systems
Hima Patel,Shanmukha Guttula,Ruhi Sharma Mittal,Naresh Manwani,Laure Berti-Equille,Abhijit Manatkar
KNOWLEDGE DISCOVERY AND DATA MINING, KDD, 2022
@inproceedings{bib_Adva_2022, AUTHOR = {Hima Patel and Shanmukha Guttula and Ruhi Sharma Mittal and Naresh Manwani and Laure Berti-Equille and Abhijit Manatkar}, TITLE = {Advances in Exploratory Data Analysis, Visualisation and Quality for Data Centric AI Systems}, BOOKTITLE = {KNOWLEDGE DISCOVERY AND DATA MINING}, YEAR = {2022}}
It is widely accepted that data preparation is one of the most time-consuming steps of the machine learning (ML) lifecycle. It is also one of the most important steps, as the quality of data directly influences the quality of a model. In this tutorial, we will discuss the importance and the role of exploratory data analysis (EDA) and data visualisation techniques in finding data quality issues and in data preparation, relevant to building ML pipelines. We will also discuss the latest advances in these fields and bring out areas that need innovation. To make the tutorial actionable for practitioners, we will also discuss the most popular open-source packages that one can get started with, along with their strengths and weaknesses. Finally, we will discuss the challenges posed by industry workloads and the gaps to be addressed to make data-centric AI real in industry settings.
RoLNiP: Robust Learning Using Noisy Pairwise Comparisons
Samartha S M,Naresh Manwani
Asian Conference on Machine Learning, ACML, 2022
@inproceedings{bib_RoLN_2022, AUTHOR = {Samartha S M and Naresh Manwani}, TITLE = {RoLNiP: Robust Learning Using Noisy Pairwise Comparisons}, BOOKTITLE = {Asian Conference on Machine Learning}, YEAR = {2022}}
Momentum Iterative Gradient Sign Method outperforms PGD Attacks
M Sreenivasan,Naresh Manwani,Durga Prasad Dhulipudi
@inproceedings{bib_Mome_2022, AUTHOR = {M Sreenivasan and Naresh Manwani and Durga Prasad Dhulipudi}, TITLE = {Momentum Iterative Gradient Sign Method outperforms PGD Attacks}, BOOKTITLE = {}, YEAR = {2022}}
ALBIF: Active Learning With BandIt Feedbacks
Mudit Agarwal,Naresh Manwani
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2022
@inproceedings{bib_ALBI_2022, AUTHOR = {Mudit Agarwal and Naresh Manwani}, TITLE = {ALBIF: Active Learning With BandIt Feedbacks}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}, YEAR = {2022}}
Journey to the center of the words: Word weighting scheme based on the geometry of word embeddings
Narendra Babu Unnam,Krishna Reddy Polepalli,Amit Pandey,Naresh Manwani
International Conference on Scientific and Statistical Database Management, SSDBM, 2022
@inproceedings{bib_Jour_2022, AUTHOR = {Narendra Babu Unnam and Krishna Reddy Polepalli and Amit Pandey and Naresh Manwani}, TITLE = {Journey to the center of the words: Word weighting scheme based on the geometry of word embeddings}, BOOKTITLE = {International Conference on Scientific and Statistical Database Management}, YEAR = {2022}}
Cooperative Monitoring of Malicious Activity in Stock Exchanges
Bhavya Kalra,MUNNANGI BALA SAI KRISHNA REDDY,Majmundar Kushal Alpeshkumar,Naresh Manwani,Praveen Paruchuri
Pacific Asia Conference on Knowledge Discovery and Data Mining Workshops, PAKDD-W, 2021
@inproceedings{bib_Coop_2021, AUTHOR = {Bhavya Kalra and MUNNANGI BALA SAI KRISHNA REDDY and Majmundar Kushal Alpeshkumar and Naresh Manwani and Praveen Paruchuri}, TITLE = {Cooperative Monitoring of Malicious Activity in Stock Exchanges}, BOOKTITLE = {Pacific Asia Conference on Knowledge Discovery and Data Mining Workshops}, YEAR = {2021}}
Learning Multiclass Classifier Under Noisy Bandit Feedback
Mudit Agarwal,Naresh Manwani
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2021
@inproceedings{bib_Lear_2021, AUTHOR = {Mudit Agarwal and Naresh Manwani}, TITLE = {Learning Multiclass Classifier Under Noisy Bandit Feedback}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}, YEAR = {2021}}
This paper addresses the problem of multiclass classification with corrupted or noisy bandit feedback. In this setting, the learner may not receive true feedback. Instead, it receives feedback that has been flipped with some non-zero probability. We propose a novel approach to deal with noisy bandit feedback based on the unbiased estimator technique. We further offer a method that can efficiently estimate the noise rates, thus providing an end-to-end framework. The proposed algorithm enjoys a mistake bound of the order of $O(\sqrt{T})$ in the high noise case and of the order of $O(T^{2/3})$ in the worst case. We show our approach’s effectiveness using extensive experiments on several benchmark datasets.
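To give a feel for the unbiased estimator idea, here is a minimal sketch under a simplifying assumption that is not stated in the abstract: the binary feedback bit is flipped with a single known, symmetric probability `rho < 0.5`. The paper's estimator and its noise model may differ, and the function names are hypothetical.

```python
def unbiased_feedback(noisy_bit, rho):
    """Debias a 0/1 feedback bit that was flipped with probability rho
    (symmetric noise with known rho is an illustrative assumption)."""
    if not 0.0 <= rho < 0.5:
        raise ValueError("rho must lie in [0, 0.5)")
    return (noisy_bit - rho) / (1.0 - 2.0 * rho)


def expected_estimate(true_bit, rho):
    """Exact expectation of the debiased feedback over the random flip."""
    flipped = 1 - true_bit
    return ((1.0 - rho) * unbiased_feedback(true_bit, rho)
            + rho * unbiased_feedback(flipped, rho))


# In expectation the debiased value recovers the true feedback bit.
print(expected_estimate(1, rho=0.2))
print(expected_estimate(0, rho=0.2))
```

Individual debiased values fall outside [0, 1], which is the usual price of unbiasedness: the variance grows as `rho` approaches 0.5.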
RISAN: Robust instance specific deep abstention network
Bhavya Kalra,Kulin Shah,Naresh Manwani
Uncertainty in Artificial Intelligence, UAI, 2021
@inproceedings{bib_RISA_2021, AUTHOR = {Bhavya Kalra and Kulin Shah and Naresh Manwani}, TITLE = {RISAN: Robust instance specific deep abstention network}, BOOKTITLE = {Uncertainty in Artificial Intelligence}, YEAR = {2021}}
In this paper, we propose deep architectures for learning instance-specific abstain (reject option) binary classifiers. The proposed approach uses the double sigmoid loss function, as described by Kulin Shah and Naresh Manwani ("Online Active Learning of Reject Option Classifiers", AAAI, 2020), as a performance measure. We show that the double sigmoid loss is classification calibrated. We also show that the excess risk of 0-d-1 loss is upper bounded by the excess risk of double sigmoid loss. We derive the generalization error bounds for the proposed architecture for reject option classifiers. To show the effectiveness of the proposed approach, we experiment with several real-world datasets. We observe that the proposed approach not only performs comparably to the state-of-the-art approaches but is also robust against label noise. We also provide visualizations to observe the important features learned by the network corresponding to the abstaining decision.
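The 0-d-1 loss referred to above is the standard evaluation loss for reject option classifiers: 0 for a correct prediction, d for abstaining, and 1 for a mistake. The sketch below shows it for a binary classifier that abstains inside a band of width `rho` around zero. The band-based prediction rule and the names `predict_with_reject` and `zero_d_one_loss` are illustrative assumptions; this is not the double sigmoid surrogate or the RISAN architecture.

```python
def predict_with_reject(score, rho):
    """Return +1 or -1, or 0 to abstain when the score lies in [-rho, rho]."""
    if score > rho:
        return 1
    if score < -rho:
        return -1
    return 0


def zero_d_one_loss(y, score, rho, d):
    """0-d-1 loss: 0 if correct, d if the classifier abstains, 1 if wrong."""
    pred = predict_with_reject(score, rho)
    if pred == 0:
        return d
    return 0.0 if pred == y else 1.0


# True label +1 with a confident correct, a confident wrong, and an unsure score.
for score in (0.9, -0.9, 0.1):
    print(score, zero_d_one_loss(y=1, score=score, rho=0.5, d=0.2))
```

Because abstaining costs d rather than 1, rejecting is only worthwhile on examples where the classifier would otherwise be wrong with probability above d.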
The Curious case of Convex Neural networks
Sivaprasad S,Ankur Singh,Naresh Manwani,Vineet Gandhi
The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Da, ECML PKDD, 2021
@inproceedings{bib_The__2021, AUTHOR = {Sivaprasad S and Ankur Singh and Naresh Manwani and Vineet Gandhi}, TITLE = {The Curious case of Convex Neural networks}, BOOKTITLE = {The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Da}, YEAR = {2021}}
Multiclass Classification using dilute bandit feedback
Gaurav Batra,Naresh Manwani
Pacific Rim International Conference on Artificial Intelligence, PRICAI, 2021
@inproceedings{bib_Mult_2021, AUTHOR = {Gaurav Batra and Naresh Manwani}, TITLE = {Multiclass Classification using dilute bandit feedback}, BOOKTITLE = {Pacific Rim International Conference on Artificial Intelligence}, YEAR = {2021}}
This paper introduces a new online learning framework for multiclass classification called learning with diluted bandit feedback. At every time step, the algorithm predicts a candidate label set instead of a single label for the observed example. It then receives feedback from the environment whether the actual label lies in this candidate label set or not. This feedback is called "diluted bandit feedback". Learning in this setting is even more challenging than the bandit feedback setting, as there is more uncertainty in the supervision. We propose an algorithm for multiclass classification using dilute bandit feedback (MC-DBF), which uses the exploration-exploitation strategy to predict the candidate set in each trial. We show that the proposed algorithm achieves $O(T^{1-\frac{1}{m+2}})$ mistake bound if candidate label set size (in each step) is $m$. We demonstrate the effectiveness of the proposed approach with extensive simulations.
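The feedback protocol and the exponent in the stated mistake bound can be illustrated with a short sketch. It is not MC-DBF: the candidate set is picked greedily from hypothetical scores with no exploration step, and all function names are assumptions.

```python
from fractions import Fraction


def top_m_candidates(scores, m):
    """Greedy candidate set: the m highest-scoring labels (no exploration)."""
    order = sorted(range(len(scores)), key=lambda k: -scores[k])
    return set(order[:m])


def diluted_feedback(candidate_set, true_label):
    """1 if the true label lies in the predicted candidate set, else 0."""
    return 1 if true_label in candidate_set else 0


def mistake_bound_exponent(m):
    """Exponent of T in the stated bound, 1 - 1/(m + 2)."""
    return 1 - Fraction(1, m + 2)


# Hypothetical scores for four labels, candidate sets of size m = 2.
candidates = top_m_candidates([0.2, 1.5, 0.7, -0.3], m=2)
print(candidates, diluted_feedback(candidates, true_label=2))
print([mistake_bound_exponent(m) for m in (1, 2, 3)])
```

The exponent grows with m, reflecting that a positive answer about a larger candidate set reveals less about the true label.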
Exact Passive Aggressive Algorithm for Multiclass Classification Using Partial Labels
Maanik Arora,Naresh Manwani
ACM IKDD Conference on Data Sciences, IKDD-CDS, 2021
@inproceedings{bib_Exac_2021, AUTHOR = {Maanik Arora and Naresh Manwani}, TITLE = {Exact Passive Aggressive Algorithm for Multiclass Classification Using Partial Labels}, BOOKTITLE = {ACM IKDD Conference on Data Sciences}, YEAR = {2021}}
Complete information about the class label is not given in many real-world classification problems. For example, in a multiclass setting, instead of the ground-truth label, we can be given a set of candidate labels, assuming that the true label belongs to this set. This type of setting is called learning under partial labels. In this paper, we propose exact passive-aggressive online algorithms for multiclass classification using only partial labels. For updating the weights, we find the exact solution of a quadratic optimization problem under multiple class separability constraints. We obtain this by finding the active constraints using the KKT conditions of the optimization problem. The set of support classes for which the weight vector is to be updated is determined by these constraints. The proposed algorithms are called PA, PA-I, and PA-II. We also provide a thorough theoretical analysis of the proposed algorithms, including regret bounds. We provide extensive simulation results to show the effectiveness of the proposed approaches.
The Curious Case of Convex Networks
Sarath S,Naresh Manwani,Vineet Gandhi
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databa, PKDD/ECML, 2021
@inproceedings{bib_The__2021, AUTHOR = {Sarath S and Naresh Manwani and Vineet Gandhi}, TITLE = {The Curious Case of Convex Networks}, BOOKTITLE = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databa}, YEAR = {2021}}
In this paper, we investigate a constrained formulation of neural networks where the output is a convex function of the input. We show that the convexity constraints can be enforced on both fully connected and convolutional layers, making them applicable to most architectures. The convexity constraints include restricting the weights (for all but the first layer) to be non-negative and using a non-decreasing convex activation function. Albeit simple, these constraints have profound implications on the generalization abilities of the network. We draw three valuable insights: (a) Input Output Convex Networks (IOC-NN) self-regularize and almost uproot the problem of overfitting; (b) although heavily constrained, they come close to the performance of the base architectures; and (c) the ensemble of convex networks can match or outperform the non-convex counterparts. We demonstrate the efficacy of the proposed idea using thorough experiments and ablation studies on MNIST, CIFAR10, and CIFAR100 datasets with three different neural network architectures. The code for this project is publicly available at: https://github.com/sarathsp1729/Convex-Networks.
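The two constraints stated in this abstract (non-negative weights in all but the first layer, and a non-decreasing convex activation) can be checked numerically with a tiny fully connected network. This is a minimal sketch, not the authors' released code: the layer sizes, the random initialisation, ReLU as the activation, and all function names are assumptions.

```python
import random


def relu(v):
    """ReLU is convex and non-decreasing, as the constraints require."""
    return [max(0.0, a) for a in v]


def affine(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def make_ioc_net(sizes, rng):
    """Random network whose weights are non-negative in all but the first layer."""
    layers = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        if i > 0:
            W = [[abs(w) for w in row] for row in W]  # enforce non-negativity
        b = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((W, b))
    return layers


def forward(layers, x):
    h = x
    for i, (W, b) in enumerate(layers):
        h = affine(W, b, h)
        if i < len(layers) - 1:
            h = relu(h)  # no activation on the scalar output layer
    return h[0]


def midpoint_convex(layers, x, y, tol=1e-9):
    """Check f((x + y) / 2) <= (f(x) + f(y)) / 2 for one pair of inputs."""
    mid = [(a + b) / 2.0 for a, b in zip(x, y)]
    return forward(layers, mid) <= 0.5 * (forward(layers, x) + forward(layers, y)) + tol


rng = random.Random(0)
net = make_ioc_net([3, 8, 8, 1], rng)
pairs = [([rng.uniform(-3, 3) for _ in range(3)],
          [rng.uniform(-3, 3) for _ in range(3)]) for _ in range(200)]
print(all(midpoint_convex(net, x, y) for x, y in pairs))
```

Convexity follows because each hidden unit is a convex non-decreasing function applied to a non-negative combination of convex functions of the input, and the first layer is affine, so its weights can take any sign.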
Robust Deep Ordinal Regression under Label Noise
Bhanu Garg,Naresh Manwani
Asian Conference on Machine Learning, ACML, 2020
@inproceedings{bib_Robu_2020, AUTHOR = {Bhanu Garg and Naresh Manwani}, TITLE = {Robust Deep Ordinal Regression under Label Noise}, BOOKTITLE = {Asian Conference on Machine Learning}, YEAR = {2020}}
Real-world data is often susceptible to label noise, which might limit the effectiveness of existing state-of-the-art algorithms for ordinal regression. Existing works on ordinal regression do not take label noise into account. We propose a theoretically grounded approach for class-conditional label noise in ordinal regression problems. We present a deep learning implementation of two commonly used loss functions for ordinal regression that is both 1) robust to label noise and 2) rank consistent for a good ranking rule. We verify these properties of the algorithm empirically and show robustness to label noise on real data and rank consistency. To the best of our knowledge, this is the first approach for robust ordinal regression models.
Exact Passive-Aggressive Algorithms for Multiclass Classification Using Bandit Feedbacks
Maanik Arora,Naresh Manwani
Asian Conference on Machine Learning, ACML, 2020
@inproceedings{bib_Exac_2020, AUTHOR = {Maanik Arora and Naresh Manwani}, TITLE = {Exact Passive-Aggressive Algorithms for Multiclass Classification Using Bandit Feedbacks}, BOOKTITLE = {Asian Conference on Machine Learning}, YEAR = {2020}}
In many real-life classification problems, we may not get exact class labels for training samples. One such example is bandit feedback in multiclass classification. In this setting, we only get to know whether our predicted label is correct or not. As a result, we remain uncertain about the actual class label whenever we predict the wrong class. This paper proposes exact passive-aggressive online algorithms for multiclass classification under bandit feedback (EPABF). The proposed approach uses an exploration-exploitation strategy to guess the class label in every trial. To update the weights, we solve a quadratic optimization problem under multiple class separability constraints and find the exact solution. We do this by finding active constraints using the KKT conditions of the optimization problem. These constraints form a support set that determines the classes for which the weight vector needs to be updated. We propose three variants of the weight update rule, which differ in how aggressively they correct a mistake. These are called EPABF, EPABF-I, and EPABF-II. We also provide mistake bounds for the proposed EPABF, EPABF-I, and EPABF-II. Experiments demonstrate that our proposed algorithms perform better than other bandit feedback-based approaches and comparably to the full-information approaches.
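The bandit feedback setting described above can be sketched in a few lines of Python. The abstract does not specify the exploration distribution, so the epsilon-greedy-style mixture below (greedy label with probability 1 - gamma plus uniform exploration) is an assumption borrowed from common practice in bandit multiclass learning, and the names `predict_with_exploration`, `bandit_feedback`, and `gamma` are hypothetical.

```python
import numpy as np

def predict_with_exploration(scores, gamma, rng):
    """Sample a label: mostly the top-scoring class, sometimes a random one.

    Assumption: uniform exploration with rate `gamma`; not taken from the paper.
    """
    k = len(scores)
    probs = np.full(k, gamma / k)                 # exploration mass, spread uniformly
    probs[int(np.argmax(scores))] += 1.0 - gamma  # exploitation mass on the greedy label
    label = int(rng.choice(k, p=probs))
    return label, probs

def bandit_feedback(predicted, true_label):
    """The learner only observes whether its prediction was correct."""
    return predicted == true_label

rng = np.random.default_rng(0)
scores = np.array([0.1, 2.0, -1.0])  # hypothetical per-class scores for one example
label, probs = predict_with_exploration(scores, gamma=0.3, rng=rng)
correct = bandit_feedback(label, true_label=1)
```

When `correct` is false, the learner knows only that `label` is wrong, which is the uncertainty about the true class that the abstract refers to.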
Online Active Learning of Reject Option Classifiers
Shah Kulin Nitinkumar,Naresh Manwani
American Association for Artificial Intelligence, AAAI, 2020
@inproceedings{bib_Onli_2020, AUTHOR = {Shah Kulin Nitinkumar and Naresh Manwani}, TITLE = {Online Active Learning of Reject Option Classifiers}, BOOKTITLE = {American Association for Artificial Intelligence}, YEAR = {2020}}
Active learning is an important technique to reduce the number of labeled examples in supervised learning. Active learning for binary classification has been well addressed in machine learning. However, active learning of reject option classifiers remains unaddressed. In this paper, we propose novel algorithms for active learning of reject option classifiers. We develop an active learning algorithm using the double ramp loss function. We provide mistake bounds for this algorithm. We also propose a new loss function for the reject option, called the double sigmoid loss function, and a corresponding active learning algorithm. We offer a convergence guarantee for this algorithm. We provide extensive experimental results to show the effectiveness of the proposed algorithms. The proposed algorithms efficiently reduce the number of labeled examples required.
Robust Learning of Multi-Label Classifiers under Label Noise
Himanshu Kumar,Naresh Manwani,P. S. Sastry
India Joint International Conference on Data Science & Management of Data, COMAD/CODS, 2020
@inproceedings{bib_Robu_2020, AUTHOR = {Himanshu Kumar and Naresh Manwani and P. S. Sastry}, TITLE = {Robust Learning of Multi-Label Classifiers under Label Noise}, BOOKTITLE = {India Joint International Conference on Data Science \& Management of Data}, YEAR = {2020}}
In this paper, we address the problem of robust learning of multi-label classifiers when the training data has label noise. We consider learning algorithms in the risk-minimization framework. We define what we call symmetric label noise in multi-label settings, which is a useful noise model for many random errors in the labeling of data. We prove that risk minimization is robust to symmetric label noise if the loss function satisfies some conditions. We show that the Hamming loss and a couple of surrogates of the Hamming loss satisfy these sufficient conditions and hence are robust. By learning feedforward neural networks on some benchmark multi-label datasets, we provide empirical evidence to illustrate our theoretical results on robust learning of multi-label classifiers under label noise.
Online Algorithms for Multiclass Classification Using Partial Labels
Rajarshi Bhattacharjee,Naresh Manwani
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2020
@inproceedings{bib_Onli_2020, AUTHOR = {Rajarshi Bhattacharjee and Naresh Manwani}, TITLE = {Online Algorithms for Multiclass Classification Using Partial Labels}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}, YEAR = {2020}}
In this paper, we propose online algorithms for multiclass classification using partial labels. We propose two variants of Perceptron, called Avg Perceptron and Max Perceptron, to deal with partially labeled data. We also propose Avg Pegasos and Max Pegasos, which are extensions of the Pegasos algorithm. We also provide mistake bounds for Avg Perceptron and a regret bound for Avg Pegasos. We show the effectiveness of the proposed approaches by experimenting on various datasets and comparing them with the standard Perceptron and Pegasos.
Learning Multiclass Classifier Under Noisy Bandit Feedback
Mudit Agarwal,Naresh Manwani
Technical Report, arXiv, 2020
@inproceedings{bib_Lear_2020, AUTHOR = {Mudit Agarwal and Naresh Manwani}, TITLE = {Learning Multiclass Classifier Under Noisy Bandit Feedback}, BOOKTITLE = {Technical Report}, YEAR = {2020}}
This paper addresses the problem of multiclass classification with corrupted or noisy bandit feedback. In this setting, the learner may not receive true feedback. Instead, it receives feedback that has been flipped with some non-zero probability. We propose a novel approach to deal with noisy bandit feedback, based on the unbiased estimator technique. We further propose an approach that can efficiently estimate the noise rates, thus providing an end-to-end framework. We provide a theoretical mistake bound for the proposed algorithm, which is of the order of O(√T). We also carry out extensive experiments on several benchmark datasets to demonstrate that our proposed approach successfully learns the underlying classifier even when using noisy bandit feedback.
Exact Passive-Aggressive Algorithms for Ordinal Regression Using Interval Labels
Naresh Manwani,Mohit Chandra
Transactions on Neural Networks and Learning Systems, NNLS, 2019
@inproceedings{bib_Exac_2019, AUTHOR = {Naresh Manwani and Mohit Chandra}, TITLE = {Exact Passive-Aggressive Algorithms for Ordinal Regression Using Interval Labels}, BOOKTITLE = {Transactions on Neural Networks and Learning Systems}, YEAR = {2019}}
In this article, we propose exact passive-aggressive (PA) online algorithms for ordinal regression. The proposed algorithms can be used even when we have interval labels instead of actual labels for examples. The proposed algorithms solve a convex optimization problem at every trial. We find an exact solution to those optimization problems to determine the updated parameters. We propose a support class algorithm (SCA) that finds the active constraints using the Karush–Kuhn–Tucker (KKT) conditions of the optimization problems. These active constraints form a support set, which determines the set of thresholds that need to be updated. We derive update rules for PA, PA-I, and PA-II. We show that the proposed algorithms maintain the ordering of the thresholds after every trial. We provide the mistake bounds of the proposed algorithms in both ideal and general settings. We also show experimentally that the proposed algorithms successfully learn accurate classifiers using interval labels as well as exact labels. The proposed algorithms also do well compared to other approaches.
PRIL: Perceptron Ranking Using Interval Labels
Naresh Manwani
India Joint International Conference on Data Science & Management of Data, COMAD/CODS, 2019
@inproceedings{bib_PRIL_2019, AUTHOR = {Naresh Manwani}, TITLE = {PRIL: Perceptron Ranking Using Interval Labels}, BOOKTITLE = {India Joint International Conference on Data Science \& Management of Data}, YEAR = {2019}}
In this paper, we propose an online learning algorithm called PRIL for learning ranking classifiers using interval-labeled data. We show the correctness of PRIL by showing that it preserves the ordering of the thresholds in successive trials. We show that the proposed algorithm converges in a finite number of steps if there exists an ideal classifier. We also give the mistake bound for the general case and provide an O(√T) regret bound for the proposed algorithm. We show the effectiveness of PRIL by comparing its performance with other approaches.
PLUME: Polyhedral Learning Using Mixture of Experts
Shah Kulin Nitinkumar, P. S. Sastry,Naresh Manwani
Technical Report, arXiv, 2019
@inproceedings{bib_PLUM_2019, AUTHOR = {Shah Kulin Nitinkumar and P. S. Sastry and Naresh Manwani}, TITLE = {PLUME: Polyhedral Learning Using Mixture of Experts}, BOOKTITLE = {Technical Report}, YEAR = {2019}}
In this paper, we propose a novel mixture-of-experts architecture for learning polyhedral classifiers. We learn the parameters of the classifier using an expectation maximization algorithm. We derive the generalization bounds of the proposed approach. Through an extensive simulation study, we show that the proposed method performs comparably to other state-of-the-art approaches.
Sparse Reject Option Classifier Using Successive Linear Programming
Shah Kulin Nitinkumar,Naresh Manwani
American Association for Artificial Intelligence, AAAI, 2019
@inproceedings{bib_Spar_2019, AUTHOR = {Shah Kulin Nitinkumar and Naresh Manwani}, TITLE = {Sparse Reject Option Classifier Using Successive Linear Programming}, BOOKTITLE = {American Association for Artificial Intelligence}, YEAR = {2019}}
In this paper, we propose an approach for learning sparse reject option classifiers using the double ramp loss Ldr. We use DC programming to find the risk minimizer. The algorithm solves a sequence of linear programs to learn the reject option classifier. We show that the loss Ldr is Fisher consistent. We also show that the excess risk of the loss Ld is upper bounded by the excess risk of Ldr. We derive the generalization error bounds for the proposed approach. We show the effectiveness of the proposed approach by experimenting with it on several real-world datasets. The proposed approach not only performs comparably to the state of the art but also successfully learns sparse classifiers.
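For readers unfamiliar with the reject option setting, the following minimal Python sketch shows the standard decision rule and the 0-d-1 loss that Ld conventionally denotes in this literature. The abstract does not state these definitions, so the threshold rule, the convention that a score exactly at the boundary is rejected, and the names `predict_with_reject`, `loss_d`, `rho`, and `d` are assumptions; the double ramp loss Ldr itself is not reproduced here.

```python
def predict_with_reject(score, rho):
    """Return +1 or -1 when the score is confidently outside [-rho, rho], else 0 (reject)."""
    if score > rho:
        return 1
    if score < -rho:
        return -1
    return 0  # abstain inside the rejection band

def loss_d(score, rho, y, d):
    """0-d-1 loss: 0 for a correct prediction, d for a rejection, 1 for a mistake.

    Assumption: d is the rejection cost, typically taken in (0, 0.5).
    """
    pred = predict_with_reject(score, rho)
    if pred == 0:
        return d
    return 0.0 if pred == y else 1.0

# Hypothetical scores from some real-valued classifier f(x), with true label y = +1.
losses = [loss_d(s, rho=1.0, y=1, d=0.2) for s in (2.0, 0.5, -2.0)]
```

Because this loss is piecewise constant in the score, it cannot be minimized directly with gradient or LP methods, which is why a continuous surrogate such as the double ramp loss is used for learning.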
Expert2Coder: Capturing Divergent Brain Regions Using Mixture of Regression Experts
OOTA SUBBA REDDY,Naresh Manwani,Bapiraju Surampudi
Technical Report, arXiv, 2019
@inproceedings{bib_Expe_2019, AUTHOR = {OOTA SUBBA REDDY and Naresh Manwani and Bapiraju Surampudi}, TITLE = {Expert2Coder: Capturing Divergent Brain Regions Using Mixture of Regression Experts}, BOOKTITLE = {Technical Report}, YEAR = {2019}}
fMRI semantic category understanding using linguistic encoding models attempts to learn a forward mapping that relates stimuli to the corresponding brain activation. State-of-the-art encoding models use a single global model (linear or non-linear) to predict brain activation (all the voxels) given the stimulus. However, the critical assumption in these methods is that, a priori, different brain regions respond the same way to all the stimuli; that is, no modularity or specialization is assumed for any region. This goes against the modularity theory, supported by many cognitive neuroscience investigations suggesting that there are functionally specialized regions in the brain. In this paper, we address this by clustering similar regions together and learning a different linear regression model for every cluster using a mixture of linear experts model. The key idea here is that each linear expert captures the behaviour of similar brain regions. Given a new stimulus, the utility of the proposed model is twofold: (i) to predict the brain activation as a weighted linear combination of the activations of multiple linear experts and (ii) to learn multiple experts corresponding to different brain regions. We argue that each expert captures activity patterns related to a particular region of interest (ROI) in the human brain. This study helps in understanding the brain regions that are activated together given different kinds of stimuli. Importantly, we suggest that the mixture of regression experts (MoRE) framework successfully combines the two principles of organization of function in the brain, namely specialization and integration. Experiments on fMRI data from paradigm 1 [1], where participants view linguistic stimuli, show that the proposed MoRE model has better prediction accuracy than conventional models. Our model achieves a mean absolute error (MAE) of 3.94, with an R²-score of 0.45 on this data set. This is an improvement over the performance of traditional methods, including ridge regression (5.58 MAE, 0.15 R²-score) and MLP (4.63 MAE, 0.35 R²-score). We also elaborate on the specializations captured by various experts in our mixture model and their implications.
fMRI Semantic Category Decoding using Linguistic Encoding of Word Embeddings
OOTA SUBBA REDDY,Naresh Manwani,Bapiraju Surampudi
International Conference on Neural Information Processing, ICONIP, 2018
@inproceedings{bib_fMRI_2018, AUTHOR = {OOTA SUBBA REDDY and Naresh Manwani and Bapiraju Surampudi}, TITLE = {fMRI Semantic Category Decoding using Linguistic Encoding of Word Embeddings}, BOOKTITLE = {International Conference on Neural Information Processing}, YEAR = {2018}}
How the human brain represents conceptual knowledge has been debated in many scientific fields. Brain imaging studies have shown that the spatial patterns of neural activation in the brain are correlated with thinking about different semantic categories of words (for example, tools, animals, and buildings) or with viewing the related pictures. In this paper, we present a computational model that learns to predict the neural activation captured in functional magnetic resonance imaging (fMRI) data of test words. Unlike the models with hand-crafted features that have been used in the literature, we propose a novel approach wherein decoding models are built with features extracted from the popular linguistic encodings Word2Vec, GloVe, and Meta-Embeddings, in conjunction with the empirical fMRI data associated with viewing several dozen concrete nouns. We compared these models with several other models that use word features extracted from FastText, randomly generated features, and Mitchell’s 25 features [1]. The experimental results show that the fMRI images predicted using Meta-Embeddings match state-of-the-art performance. Although models with features from GloVe and Word2Vec predict fMRI images similar to the state-of-the-art model, the model with features from Meta-Embeddings predicts significantly better. The proposed scheme, which uses popular linguistic encodings, offers a simple and easy approach for semantic decoding from fMRI experiments.
Exact Passive-Aggressive Algorithms for Learning to Rank Using Interval Labels
Naresh Manwani,Mohit Chandra
Transactions on Neural Networks and Learning Systems, NNLS, 2018
@inproceedings{bib_Exac_2018, AUTHOR = {Naresh Manwani and Mohit Chandra}, TITLE = {Exact Passive-Aggressive Algorithms for Learning to Rank Using Interval Labels}, BOOKTITLE = {Transactions on Neural Networks and Learning Systems}, YEAR = {2018}}
In this paper, we propose exact passive-aggressive (PA) online algorithms for learning to rank. The proposed algorithms can be used even when we have interval labels instead of actual labels for examples. The proposed algorithms solve a convex optimization problem at every trial. We find an exact solution to those optimization problems to determine the updated parameters. We propose a support class algorithm (SCA) which finds the active constraints using the KKT conditions of the optimization problems. These active constraints form a support set which determines the set of thresholds that need to be updated. We derive update rules for PA, PA-I, and PA-II. We show that the proposed algorithms maintain the ordering of the thresholds after every trial. We provide the mistake bounds of the proposed algorithms in both ideal and general settings. We also show experimentally that the proposed algorithms successfully learn accurate classifiers using interval labels as well as exact labels. The proposed algorithms also do well compared to other approaches.
Mixture of Regression Experts in fMRI Encoding
OOTA SUBBA REDDY,ADITHYA AVVARU,Naresh Manwani,Bapiraju Surampudi
Neural Information Processing Systems Workshops, NeurIPS-W, 2018
@inproceedings{bib_Mixt_2018, AUTHOR = {OOTA SUBBA REDDY and ADITHYA AVVARU and Naresh Manwani and Bapiraju Surampudi}, TITLE = {Mixture of Regression Experts in fMRI Encoding}, BOOKTITLE = {Neural Information Processing Systems Workshops}, YEAR = {2018}}
fMRI semantic category understanding using linguistic encoding models attempts to learn a forward mapping that relates stimuli to the corresponding brain activation. Classical encoding models use linear multivariate methods to predict the brain activation (all voxels) given the stimulus. However, these methods essentially treat multiple regions as one large uniform region or as several independent regions, ignoring connections among them. In this paper, we present a mixture of experts-based model where a group of experts captures brain activity patterns related to particular regions of interest (ROI), and we also show the discrimination across different experts. The model is trained with word stimuli encoded as 25-dimensional feature vectors as input and the corresponding brain responses as output. Given a new word (25-dimensional feature vector), it predicts the entire brain activation as a linear combination of multiple experts’ brain activations. We argue that each expert learns a certain region of brain activations corresponding to its category of words, which solves the problem of identifying the regions with a simple encoding model. We show that the proposed mixture of experts-based model indeed learns region-based experts to predict the brain activations with high spatial accuracy.
On the Robustness of Decision Tree Learning under Label Noise
Aritra Ghosh,Naresh Manwani,P. S. Sastry
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2016
@inproceedings{bib_On_t_2016, AUTHOR = {Aritra Ghosh and Naresh Manwani and P. S. Sastry}, TITLE = {On the Robustness of Decision Tree Learning under Label Noise}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}, YEAR = {2016}}
In most practical problems of classifier learning, the training data suffers from label noise. Hence, it is important to understand how robust a learning algorithm is to such label noise. This paper presents a theoretical analysis showing that many popular decision tree algorithms are robust to symmetric label noise under large sample sizes. We also present sample complexity results that bound the sample size required for the robustness to hold with high probability. Through extensive simulations, we illustrate this robustness.