Data Science and Artificial Intelligence for Justice Delivery in India: Overview and Research Issues
Parvatam Pavan Kumar, Krishna Reddy Polepalli, Gaurang Patil, K.V.K. Santhy, M. Kumara Swamy
@inproceedings{bib_Data_2025, AUTHOR = {Parvatam Pavan Kumar, Krishna Reddy Polepalli, Gaurang Patil, K.V.K. Santhy, M. Kumara Swamy}, TITLE = {Data Science and Artificial Intelligence for Justice Delivery in India: Overview and Research Issues}, BOOKTITLE = {International Workshop on Juris-informatics}, YEAR = {2025}}
Data science and artificial intelligence (DSAI) based methods can process massive amounts of data to extract useful knowledge and can be employed to build decision support systems in various domains. Like other domains, the legal systems in several countries are being digitized, and there is scope to build DSAI-based frameworks to improve the performance of legal systems. In the literature, research efforts are being made to investigate DSAI-based methods to improve justice delivery. The Indian legal system is currently experiencing a major problem with a substantial backlog of cases. This paper provides an overview of DSAI-based efforts in the legal domain related to India. We also list potential research issues to be explored. We hope these issues will encourage further research to improve justice delivery performance in India and other countries.
@inproceedings{bib_Mini_2025, AUTHOR = {Ishan Choubey, R Uday Kiran, Mittapally Nivesh, Krishna Reddy Polepalli}, TITLE = {Mining of Top-K Subgraphs From Uncertain Graph Data}, BOOKTITLE = {International Conference on Fuzzy Systems}, YEAR = {2025}}
A graph transactional database (GTD) is a collection of graphs. Frequent Top-k Subgraph Pattern Mining (FTSPM) involves finding the complete set of top-k frequently occurring subgraph patterns in a GTD. Most previous studies focused on finding these patterns in certain (deterministic) graphs, disregarding the crucial information of the existential probabilities that may be associated with the edges between any two nodes. With this motivation, this paper proposes a novel model to discover top-k (frequently occurring) subgraphs in an uncertain graph transactional database. We introduce a novel algorithm, top-k uncertain subgraph miner (TUSM), to find all top-k subgraphs in the data. We also introduce an approximate top-k uncertain subgraph miner (ATUSM) algorithm to tackle the computational expense of TUSM. Experimental results on synthetic and protein-protein interaction datasets demonstrate that the proposed model finds valuable information and the algorithms are efficient.
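To illustrate the expected-support semantics commonly used when mining patterns from uncertain data, a minimal sketch follows. This is not the TUSM/ATUSM algorithm (which the abstract does not specify); patterns are plain edge sets under independent edge probabilities, and all names and the toy database are hypothetical.

```python
import heapq
from itertools import combinations

def expected_support(pattern, graphs):
    """Expected support of an edge set: for each uncertain graph, multiply
    the existential probabilities of the pattern's edges (0 if an edge is
    absent), then sum over all graphs."""
    total = 0.0
    for g in graphs:
        p = 1.0
        for edge in pattern:
            if edge not in g:
                p = 0.0
                break
            p *= g[edge]
        total += p
    return total

def top_k_edge_patterns(graphs, k, max_size=2):
    """Brute-force enumeration of small edge sets, keeping the k patterns
    with the highest expected support (connectivity checks omitted)."""
    edges = sorted({e for g in graphs for e in g})
    scored = []
    for size in range(1, max_size + 1):
        for pattern in combinations(edges, size):
            scored.append((expected_support(pattern, graphs), pattern))
    return heapq.nlargest(k, scored)

# Each uncertain graph: {(u, v): existential probability of that edge}.
db = [
    {("A", "B"): 0.9, ("B", "C"): 0.8},
    {("A", "B"): 0.7, ("B", "C"): 0.4},
    {("A", "B"): 0.6},
]
```

A real miner would prune the search space rather than enumerate; the point here is only the probability-weighted support that distinguishes the uncertain setting from certain-graph FTSPM.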
Location and Intent Privacy Preservation for Spatial Range Queries in a Mobile Network
@inproceedings{bib_Loca_2025, AUTHOR = {M A Shadaab Siddiqie, Krishna Reddy Polepalli, Srinivas Reddy Annappalli}, TITLE = {Location and Intent Privacy Preservation for Spatial Range Queries in a Mobile Network}, BOOKTITLE = {IEEE Access}, YEAR = {2025}}
Location-based services (LBSs) in a mobile network environment that provide personalized and timely information to users entail privacy concerns due to the leakage of user locations to adversaries. In the literature, several cloaking-based privacy preservation approaches have been proposed by considering the P2P environment. However, existing spatial cloaking approaches form a large cluster of mobile users to cloak the user query location. Maintaining such structures in a highly dynamic mobile-P2P network is challenging. In addition to the user’s location, the user’s intent is also an important concern and needs to be protected from an adversary. This paper proposes a location privacy and intent privacy preservation scheme, especially for spatial range queries. The key contributions of our work are three-fold. First, we introduce the concept of ijk-anonymity to achieve improved location privacy. Second, we propose a location-based privacy-preservation approach, which we designate as ijkCloak, for spatial range queries in a mobile network environment. The proposed approach preserves user location and intent information from the LBS provider and nearby peers. Third, we conduct theoretical analysis and experiments on resistance to attacks. We show that ijkCloak effectively facilitates improved user location privacy and intent privacy by employing fewer peers w.r.t. existing approaches.
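For context, a minimal sketch of classic k-anonymity spatial cloaking, the baseline this line of work improves on. This is not ijkCloak (whose construction the abstract does not detail): it simply cloaks the user inside the minimum bounding rectangle of the user and the k-1 nearest peers; all names and coordinates are hypothetical.

```python
import math

def cloak_region(user, peers, k):
    """Classic k-anonymity cloaking: take the k-1 peers nearest to the
    user and return the minimum bounding rectangle (lower-left,
    upper-right) covering the user and those peers."""
    nearest = sorted(peers, key=lambda p: math.dist(user, p))[:k - 1]
    pts = [user] + nearest
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys)), (max(xs), max(ys))

user = (5.0, 5.0)
peers = [(4.0, 6.0), (9.0, 1.0), (5.5, 4.5), (0.0, 0.0)]
lo, hi = cloak_region(user, peers, k=3)
```

The LBS provider sees only the rectangle, so the query location is indistinguishable among k users; the cost of gathering and maintaining many peers in a dynamic mobile-P2P network is exactly the overhead the paper's scheme aims to reduce.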
Chandra Mohan Dasari, Dheeraj Kodati, Mittapally Nivesh, Srinivas Reddy Annappalli, Krishna Reddy Polepalli
@inproceedings{bib_Grap_2025, AUTHOR = {Chandra Mohan Dasari, Dheeraj Kodati, Mittapally Nivesh, Srinivas Reddy Annappalli, Krishna Reddy Polepalli}, TITLE = {Graph Neural Networks Based Explainability of Drug-Target Interactions}, BOOKTITLE = {International Conference on Bioinformatics \& Computational Biology}, YEAR = {2025}}
Over the past decade, efforts have been made to extend data science-based approaches to improve the prediction of drug-target interactions (DTIs) by modeling drugs and targets as graphs. From a drug discovery point of view, a pharmacophore, a subgraph of a drug candidate, is an essential structural and chemical feature necessary for interacting with a specific biological target. Effective identification of potential pharmacophores from a given candidate drug is a research issue. In the literature, efforts are being made to investigate machine learning and Graph Neural Network (GNN) based frameworks to identify the potential drug candidates for a given target. However, the problem of extracting the knowledge of the potential pharmacophore of the given candidate drug has not been explored. There is an opportunity to extract pharmacophores from drug candidates by exploiting the knowledge of potential subgraphs extracted through a trained GNN model. In this paper, we propose a two-phase GNN-based framework for identifying drug candidates and extracting potential pharmacophores from these candidates. The framework consists of two phases. First, we employ a GNN-based model to compute the affinity score for the given drug compound. Second, by employing the Monte Carlo Tree Search (MCTS) algorithm, the trained GNN is leveraged to extract potential subgraphs representing pharmacophores. The experimental results on the Davis, Kiba and Allergy datasets demonstrate the feasibility of the proposed approach to extract pharmacophores of candidate drugs with high performance. The predicted binding affinities of molecular subgraphs extracted from drug candidates are very similar to those of the corresponding drug candidates. The proposed approach helps explore the potential pharmacophore of the given candidate drug, which enhances the explainability of DTI to obtain deeper insights into crucial molecular interactions.
Data Cube for Exploring Anomalies in Justice Delivery: An Experiment on Indian Judgements
Bondugula Sriharshitha, Krishna Reddy Polepalli, Narendra Babu Unnam, K.V.K. Santhy
@inproceedings{bib_Data_2025a, AUTHOR = {Bondugula Sriharshitha, Krishna Reddy Polepalli, Narendra Babu Unnam, K.V.K. Santhy}, TITLE = {Data Cube for Exploring Anomalies in Justice Delivery: An Experiment on Indian Judgements}, BOOKTITLE = {International Conference on the AI Revolution: Research, Ethics, and Society}, YEAR = {2025}}
In decision-making settings such as medical diagnosis, investment choices, or sentencing in a court of law, an individual’s background and experience influence their decisions. In the legal domain, multiple factors like personal bias/beliefs, recent events, and the contemporary state of mind could affect decision-making, leading to inconsistencies in judgments between and within jurisdictions. It is widely reported in the literature that anomalies and disparities exist in judicial decisions, such as bail grants and sentence impositions, stemming from implicit bias or other contextual factors. Notably, in domains like sales and marketing, data cube-based systems are being used to extract interesting trends and anomalies in subspaces from large multidimensional databases. In this paper, we extend the data cube framework to explore possible anomalies and disparities in court sentences by conducting experiments on a sample Indian judgment dataset of criminal cases. The results show that the data cube-based framework could identify anomalies in the legal domain. We hope this work will encourage researchers to investigate a comprehensive data cube-based framework to reduce disparities and improve justice delivery worldwide.
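The general mechanism, scanning every cuboid (every subset of dimensions) of a fact table and flagging groups whose aggregate deviates from the global mean, can be sketched in a few lines. This is only an illustration of the data-cube idea, not the paper's framework; the toy records, dimensions, and z-score threshold below are all hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical toy records: (offense, court, sentence length in months).
records = [
    ("theft", "court_A", 6), ("theft", "court_A", 7),
    ("theft", "court_B", 24), ("theft", "court_B", 30),
    ("fraud", "court_A", 12), ("fraud", "court_B", 13),
]

def cube_anomalies(rows, dims, measure, threshold=1.0):
    """For every subset of dimensions (i.e., every cuboid of the cube),
    group the rows and flag groups whose mean measure deviates from the
    global mean by more than `threshold` standard deviations."""
    values = [r[measure] for r in rows]
    mu, sigma = mean(values), pstdev(values)
    flags = []
    for r in range(1, len(dims) + 1):
        for subset in combinations(dims, r):
            groups = {}
            for row in rows:
                groups.setdefault(tuple(row[d] for d in subset), []).append(row[measure])
            for key, vals in groups.items():
                z = (mean(vals) - mu) / sigma
                if abs(z) > threshold:
                    flags.append((subset, key, round(z, 2)))
    return flags
```

On the toy data, only the (offense, court) cuboid surfaces anomalies: theft sentences in `court_B` run unusually high and in `court_A` unusually low, the kind of subspace disparity a full data-cube framework would surface at scale.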
Text Representation Models based on the Spatial Distributional Properties of Word Embeddings
@inproceedings{bib_Text_2024, AUTHOR = {Narendra Babu Unnam, Krishna Reddy Polepalli, Amit Pandey, Naresh Manwani}, TITLE = {Text Representation Models based on the Spatial Distributional Properties of Word Embeddings}, BOOKTITLE = {India Joint International Conference on Data Science \& Management of Data}, YEAR = {2024}}
In the current digital era, about 80% of the digital data being generated is unstructured and unlabeled natural language text. In the development cycle of information retrieval and text mining applications, text representation is the most fundamental and critical step, as its effectiveness directly impacts the application’s performance. The existing traditional text representation frameworks are mostly frequency distribution-based. In this work, we explored the spatial distribution of word embeddings and proposed two text representation models. The experiments demonstrated that the proposed models perform consistently better at text mining tasks compared to baseline methods.
A Model for Retrieving High-Utility Itemsets with Complementary and Substitute Goods
@inproceedings{bib_A_Mo_2024, AUTHOR = {Raghav Mittal, Anirban Mondal, Krishna Reddy Polepalli}, TITLE = {A Model for Retrieving High-Utility Itemsets with Complementary and Substitute Goods}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}, YEAR = {2024}}
Given a retail transactional database, the objective of high-utility pattern mining is to discover high-utility itemsets (HUIs), i.e., itemsets that satisfy a user-specified utility threshold. In retail applications, when purchasing a set of items (i.e., itemsets), consumers seek to replace or substitute items with each other to suit their individual preferences (e.g., Coke with Pepsi, tea with coffee). In practice, retailers, too, require substitutes to address operational issues like stockouts, expiration, and other supply chain constraints. The implication is that items that are interchangeably purchased, i.e., substitute goods, are critical to ensuring both user satisfaction and sustained retailer profits. In this regard, this work presents (i) an efficient model to identify HUIs containing substitute goods in place of items that require substitution, (ii) the SubstiTution-based Itemset indeX (STIX) to retrieve HUIs containing substitutes, and (iii) an experimental study to depict the benefits of the proposed approach w.r.t. a baseline method.
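The core computation, itemset utility against a transactional database, plus the substitution rewrite described above, can be sketched as follows. This is a toy illustration, not the paper's STIX index; the item names, prices, and threshold are hypothetical.

```python
def itemset_utility(itemset, transactions, prices):
    """Utility of an itemset: over transactions containing all its items,
    sum price * purchased quantity of each item."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(prices[i] * t[i] for i in itemset)
    return total

def substitute_huis(itemset, unavailable, substitutes, transactions, prices, minutil):
    """Swap each unavailable item for its substitute, then keep the
    rewritten itemset only if it still clears the utility threshold."""
    swapped = frozenset(substitutes.get(i, i) if i in unavailable else i
                        for i in itemset)
    u = itemset_utility(swapped, transactions, prices)
    return (swapped, u) if u >= minutil else None

# Toy data: each transaction maps item -> quantity; hypothetical unit prices.
transactions = [{"coke": 2, "chips": 1}, {"pepsi": 1, "chips": 2}, {"coke": 1}]
prices = {"coke": 40, "pepsi": 38, "chips": 20}
```

When coke is out of stock, the high-utility itemset {coke, chips} can be rewritten with its substitute pepsi and re-validated against the threshold, which is the retrieval question an index like STIX is built to answer efficiently.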
@inproceedings{bib_Intr_2024, AUTHOR = {Krishna Reddy Polepalli}, TITLE = {Introduction to the special issue on PAKDD’2021}, BOOKTITLE = {International Journal of Data Science and Analytics}, YEAR = {2024}}
Citation Anchor Text for Improving Precedent Retrieval: An Experimental Study on Indian Legal Documents
Gaurang Patil, Bhoomeendra Singh Sisodiya, Krishna Reddy Polepalli, K.V.K. Santhy
@inproceedings{bib_Cita_2024, AUTHOR = {Gaurang Patil, Bhoomeendra Singh Sisodiya, Krishna Reddy Polepalli, K.V.K. Santhy}, TITLE = {Citation Anchor Text for Improving Precedent Retrieval: An Experimental Study on Indian Legal Documents}, BOOKTITLE = {International Conference on Legal Knowledge and Information Systems}, YEAR = {2024}}
In the legal domain, research efforts are being made to enhance precedent retrieval by exploiting features based on meta-data, catchphrases, citations, sentences, paragraphs, etc. It is reported in the legal domain that the text surrounding a citation provides information to identify the referenced judgment and supplies additional information about the referenced judgment and its connection to the formulated argument. In this paper, we have exploited the resourcefulness of the text surrounding the citation to improve the document representation of the referenced judgment. Experiments conducted on Indian court judgments show that the proposed Preceding citation Anchor Text (PAT)-based approach captures certain nuances that are not captured by the text present in the referenced judgment, indicating that there is a scope to exploit PAT to improve the performance of precedent retrieval systems.
Bhoomeendra Singh Sisodiya, Narendra Babu Unnam, Krishna Reddy Polepalli, Apala Das, K.V.K. Santhy, V Balakrishna Reddy
@inproceedings{bib_Anal_2023, AUTHOR = {Bhoomeendra Singh Sisodiya, Narendra Babu Unnam, Krishna Reddy Polepalli, Apala Das, K.V.K. Santhy, V Balakrishna Reddy}, TITLE = {Analysing the Resourcefulness of the Paragraph for Precedence Retrieval}, BOOKTITLE = {International Conference on Artificial Intelligence and Law}, YEAR = {2023}}
Developing methods for extracting relevant legal information to aid legal practitioners is an active research area. In this regard, research efforts are being made by leveraging different kinds of information, such as meta-data, citations, keywords, sentences, paragraphs, etc. Similar to any text document, legal documents are composed of paragraphs. In this paper, we have analyzed the resourcefulness of paragraph-level information in capturing similarity among judgments for improving the performance of precedence retrieval. We found that the paragraph-level methods could capture the similarity among the judgments with only a few paragraph interactions and exhibit more discriminating power than the baseline document-level method. Moreover, the comparison results on two benchmark datasets for the precedence retrieval task on Indian Supreme Court judgments show that the paragraph-level methods exhibit performance comparable to the state-of-the-art methods.
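One simple way to realize paragraph-level matching, scoring two judgments by their best-matching paragraph pair rather than by whole-document bags of words, is sketched below. This is an illustrative baseline under that assumption, not the paper's exact method; the two toy "judgments" are hypothetical.

```python
import math
import re
from collections import Counter

def _vec(text):
    """Bag-of-words vector for one paragraph."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def paragraph_similarity(doc_a, doc_b):
    """Judgment similarity as the maximum cosine over all paragraph
    pairs, so one strongly matching paragraph suffices."""
    return max(cosine(_vec(p), _vec(q))
               for p in doc_a.split("\n\n") for q in doc_b.split("\n\n"))

doc_a = "the accused was granted bail\n\nthe court examined section three"
doc_b = "bail was granted to the accused\n\nan unrelated civil appeal was dismissed"
```

A single on-topic paragraph pair drives the score even though the rest of each document diverges, which is the discriminating behaviour the paragraph-level analysis reports.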
A Review of Approaches on Facets for Building IT-Based Career Guidance Systems
Chandra Shekar N, Krishna Reddy Polepalli, Kunkulagunta Anupama, Amarender Reddy
International Conference on Big Data Analytics, BDA, 2023
@inproceedings{bib_A_Re_2023, AUTHOR = {Chandra Shekar N, Krishna Reddy Polepalli, Kunkulagunta Anupama, Amarender Reddy}, TITLE = {A Review of Approaches on Facets for Building IT-Based Career Guidance Systems}, BOOKTITLE = {International Conference on Big Data Analytics}, YEAR = {2023}}
A Novel Explainable Link Forecasting Framework for Temporal Knowledge Graphs Using Time-Relaxed Cyclic and Acyclic Rules
Rage Uday Kiran, Abinash Maharana, Krishna Reddy Polepalli
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2023
@inproceedings{bib_A_No_2023, AUTHOR = {Rage Uday Kiran, Abinash Maharana, Krishna Reddy Polepalli}, TITLE = {A Novel Explainable Link Forecasting Framework for Temporal Knowledge Graphs Using Time-Relaxed Cyclic and Acyclic Rules}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}, YEAR = {2023}}
Link forecasting in a temporal Knowledge Graph (tKG) involves predicting a future event from a given set of past events. Most previous studies suffered from reduced performance as they disregarded acyclic rules and enforced a tight constraint that all past events must exist in a strict temporal order. This paper proposes a novel explainable rule-based link forecasting framework by introducing two new concepts, namely ‘relaxed temporal cyclic and acyclic random walks’ and ‘link-star rules’. The former concept involves generating rules by performing cyclic and acyclic random walks on a tKG by taking into account the real-world phenomenon that the order of any two events may be ignored if their occurrence time gap is within a threshold value. Link-star rules are a special class of acyclic rules generated based on the natural phenomenon that history repeats itself after a particular time. Link-star rules eliminate the problem of combinatorial rule explosion, thereby making our framework practicable. Experimental results demonstrate that our framework outperforms the state-of-the-art by a substantial margin. The evaluation measures hits@1 and mean reciprocal rank were improved by 45% and 23%, respectively.
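The relaxation described above, two events may appear out of order if their time gap is within a threshold, reduces to a one-line check on a walk's timestamps. A minimal sketch (event labels and the threshold are hypothetical; rule generation itself is not shown):

```python
def is_relaxed_walk(events, delta):
    """True if each step's timestamp is no earlier than the previous one,
    except that two steps may swap order when their time gap is within
    delta (the 'relaxed temporal' condition)."""
    return all(t2 >= t1 - delta for (_, t1), (_, t2) in zip(events, events[1:]))

# A walk whose last two events are out of strict order by a gap of 1.
walk = [("e1", 1), ("e2", 3), ("e3", 2)]
```

Under strict ordering (`delta=0`) this walk is rejected; with `delta=1` it is accepted, which is exactly the extra rule-generation freedom the relaxed random walks exploit.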
Location and Intent Privacy Preservation for Spatial Range Queries in a Mobile Network
M A Shadaab Siddiqie, Srinivas Reddy Annappalli, Krishna Reddy Polepalli
Technical Report, arXiv, 2023
@inproceedings{bib_Loca_2023, AUTHOR = {M A Shadaab Siddiqie, Srinivas Reddy Annappalli, Krishna Reddy Polepalli}, TITLE = {Location and Intent Privacy Preservation for Spatial Range Queries in a Mobile Network}, BOOKTITLE = {Technical Report}, YEAR = {2023}}
Location-based services (LBSs) in a mobile network environment that provide personalized and timely information to users entail privacy concerns due to the leakage of user locations to adversaries. In the literature, several cloaking-based privacy preservation approaches have been proposed by considering the P2P environment. However, existing spatial cloaking approaches form a large cluster of mobile users to cloak the user query location. Maintaining such structures in a highly dynamic mobile-P2P network is challenging. In addition to the user’s location, the user’s intent is also an important concern and needs to be protected from an adversary. This paper proposes a location privacy and intent privacy preservation scheme, especially for spatial range queries. The key contributions of our work are three-fold. First, we introduce the concept of ijk-anonymity to achieve improved location privacy. Second, we propose a location-based privacy-preservation approach, which we designate as ijkCloak, for spatial range queries in a mobile network environment. The proposed approach preserves user location and intent information from the LBS provider and nearby peers. Third, we conduct theoretical analysis and experiments on resistance to attacks. We show that ijkCloak effectively facilitates improved user location privacy and intent privacy by employing fewer peers w.r.t. existing approaches.
A Consumer-Good-Type Aware Itemset Placement Framework for Retail Businesses
Raghav Mittal, Anirban Mondal, Krishna Reddy Polepalli
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2023
@inproceedings{bib_A_Co_2023, AUTHOR = {Raghav Mittal, Anirban Mondal, Krishna Reddy Polepalli}, TITLE = {A Consumer-Good-Type Aware Itemset Placement Framework for Retail Businesses}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}, YEAR = {2023}}
It is a well-established fact that strategic placement of items on the shelves of a retail store significantly impacts the revenue of the retailer. Consumer goods sold in retail stores can be classified into essential and typically low-priced convenience items, and non-essential and high-priced shopping items. Notably, the lower-priced convenience items are critical to ensuring consumer foot-traffic, thereby also driving the sales of shopping items. Moreover, users typically buy multiple items together (i.e., itemsets) to facilitate one-stop shopping. Hence, it becomes a necessity to strategically index and place itemsets that contain both convenience items and shopping items. In this regard, we propose a consumer-good-type aware and revenue-conscious itemset indexing scheme for efficiently retrieving high-revenue itemsets containing both convenience and shopping items. Moreover, we propose an itemset placement …
An inventory-aware and revenue-based itemset placement framework for retail stores
Anirban Mondal, Raghav Mittal, Samant Saurabh, Parul Chaudhary, Krishna Reddy Polepalli
Expert Systems with Applications, ESWA, 2023
@inproceedings{bib_An_i_2023, AUTHOR = {Anirban Mondal, Raghav Mittal, Samant Saurabh, Parul Chaudhary, Krishna Reddy Polepalli}, TITLE = {An inventory-aware and revenue-based itemset placement framework for retail stores}, BOOKTITLE = {Expert Systems with Applications}, YEAR = {2023}}
Retailer revenue is significantly impacted by item placement. Given the prevalence and popularity of medium-to-large-sized retail stores, several research efforts have been made towards facilitating item/itemset placement for improving retailer revenue. However, they fail to consider the issue of inventory of the items w.r.t. itemset placement. Notably, the inventory of a given item refers to the number of instances of that item that are available to the retailer for sales purposes. Moreover, efficient retrieval and placement of top-revenue itemsets in the retail store slots cannot be performed by existing approaches. Our key contributions are summarized as follows. First, we introduce the notion of inventory in retail itemset placement. Second, we propose an inventory-aware indexing scheme, designated as IRIS, for efficiently retrieving high-revenue itemsets. Moreover, we propose the IRPS inventory-aware itemset placement scheme, which exploits the IRIS indexing scheme, for facilitating improved retailer revenue. Third, we conduct a performance study with two real datasets to demonstrate the effectiveness of our proposed itemset indexing and placement schemes in improving retailer revenue.
An improved dummy generation approach for infeasible regions
M A Shadaab Siddiqie, Anirban Mondal, Krishna Reddy Polepalli
Applied Intelligence, APIN, 2023
@inproceedings{bib_An_i_2023a, AUTHOR = {M A Shadaab Siddiqie, Anirban Mondal, Krishna Reddy Polepalli}, TITLE = {An improved dummy generation approach for infeasible regions}, BOOKTITLE = {Applied Intelligence}, YEAR = {2023}}
Location-based services (LBS), which provide personalized and timely information, entail privacy concerns such as unwanted leaks of current user locations to potential stalkers. In this regard, existing works have proposed dummy generation techniques by creating a cloaking region (CR) such that the user’s location is at a fixed distance from the centre of the CR. Hence, if the adversary somehow knows the location of the centre of the CR, the user’s location would be vulnerable to attacks. Moreover, in the case of existing approaches, infeasible regions are assumed to have no relationship with time. However, this assumption is typically not valid in real-world scenarios. For example, a supermarket can be considered an infeasible region from 9 pm to 9 am since it would be closed at that time. Thus, if a dummy is placed at this location at that particular time, the attacker would know that it is a dummy, thereby reducing the user’s location privacy. In this regard, our key contributions are three-fold. First, we propose an improved dummy generation approach, which we designate as Annulus-based Gaussian Dummy Generation (AGDG), for facilitating improved location privacy for mobile users. Second, we introduce the notion of time-dependent infeasible regions to further improve the dummy generation approach by considering infeasible regions that change with time. Third, we conducted experiments to demonstrate that AGDG effectively provides improved location privacy, including in regions with time-dependent infeasibility, w.r.t. existing approaches.
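A plausible sketch of the two ideas described above, dummies sampled in an annulus with a Gaussian-distributed radius (so their distance to the user varies, unlike fixed-distance schemes), filtered by time-dependent infeasible regions. The parameters, region shape, and sampling details are all hypothetical, not the paper's exact AGDG specification.

```python
import math
import random

def generate_dummies(user, n, r_min, r_max, hour, infeasible):
    """Sample n dummy locations around the user: radius drawn from a
    Gaussian clamped to the annulus [r_min, r_max], uniform angle, and
    rejection of any point inside a region infeasible at this hour."""
    mu, sigma = (r_min + r_max) / 2, (r_max - r_min) / 6
    dummies = []
    while len(dummies) < n:
        r = min(max(random.gauss(mu, sigma), r_min), r_max)
        theta = random.uniform(0, 2 * math.pi)
        p = (user[0] + r * math.cos(theta), user[1] + r * math.sin(theta))
        if not any(region(p, hour) for region in infeasible):
            dummies.append(p)
    return dummies

# Hypothetical time-dependent infeasible region: a supermarket square
# that is closed (and hence implausible as a user location) 21:00-09:00.
def closed_supermarket(p, hour):
    inside = 10 <= p[0] <= 12 and 10 <= p[1] <= 12
    return inside and (hour >= 21 or hour < 9)
```

Because the radius is random rather than fixed, knowing the centre of the cloaking region no longer pins down the user; the hour-aware rejection keeps night-time dummies out of closed premises, where an attacker could dismiss them.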
Analysis of Weather Condition Based Reuse Among Agromet Advisory: A Validation Study
A Mamatha, Krishna Reddy Polepalli, Anirban Mondal, S. G. Mahadevappa, Balaji Naik Banoth
International Conference on Big Data Analytics, BDA, 2022
@inproceedings{bib_Anal_2022, AUTHOR = {A Mamatha, Krishna Reddy Polepalli, Anirban Mondal, S. G. Mahadevappa, Balaji Naik Banoth}, TITLE = {Analysis of Weather Condition Based Reuse Among Agromet Advisory: A Validation Study}, BOOKTITLE = {International Conference on Big Data Analytics}, YEAR = {2022}}
The India Meteorological Department (IMD) delivers agromet advisories, i.e., weather-based crop risk management advisories based on the medium-range (five-day) weather forecast across India. Based on the weather prediction, once every five days, an agromet advisory is provided for major crops and livestock by considering the district/block as a unit. In the literature, a framework was proposed to improve the process of advisory preparation by employing the notion of reuse. In that framework, an approach was explored to reuse the advisory prepared for a given weather situation to prepare the advisory for similar weather situations in the future. For this, a notion of category-based weather condition (CWC) was proposed to model a given
Mining subgraph coverage patterns from graph transactions
Srinivas Reddy Annappalli, Krishna Reddy Polepalli, Anirban Mondal, Deva Priyakumar U
International Journal of Data Science and Analytics, IJDSA, 2022
@inproceedings{bib_Mini_2022, AUTHOR = {Srinivas Reddy Annappalli, Krishna Reddy Polepalli, Anirban Mondal, Deva Priyakumar U}, TITLE = {Mining subgraph coverage patterns from graph transactions}, BOOKTITLE = {International Journal of Data Science and Analytics}, YEAR = {2022}}
Pattern mining from graph transactional data (GTD) is an active area of research with applications in the domains of bioinformatics, chemical informatics and social networks. Existing works address the problem of mining frequent subgraphs from GTD. However, the knowledge concerning the coverage aspect of a set of subgraphs is also valuable for improving the performance of several applications. In this regard, we introduce the notion of subgraph coverage patterns (SCPs). Given a GTD, a subgraph coverage pattern is a set of subgraphs subject to relative frequency, coverage and overlap constraints provided by the user. We propose the Subgraph ID-based Flat Transactional (SIFT) framework for the efficient extraction of SCPs from a given GTD. Our performance evaluation using three real datasets demonstrates that our proposed SIFT framework is indeed capable of efficiently extracting SCPs from GTD. Furthermore, we demonstrate the effectiveness of SIFT through a case study in computer-aided drug design.
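The coverage and overlap constraints named above can be made concrete once each candidate subgraph is reduced to the set of transaction ids that contain it. A minimal sketch (the tid-sets and thresholds are hypothetical, and this is only the constraint-checking step, not the SIFT framework):

```python
def coverage(tid_sets, db_size):
    """Coverage of a set of subgraphs: the fraction of graph transactions
    containing at least one of them."""
    covered = set().union(*tid_sets)
    return len(covered) / db_size

def overlap_ok(tid_sets, max_overlap):
    """Each successive subgraph must contribute mostly new transactions:
    its fraction shared with the ones already chosen stays <= max_overlap."""
    seen = set()
    for tids in tid_sets:
        if seen and len(tids & seen) / len(tids) > max_overlap:
            return False
        seen |= tids
    return True

# Hypothetical: ids of the transactions containing each candidate subgraph.
g1 = {1, 2, 3, 4}
g2 = {4, 5, 6}
```

Together, {g1, g2} covers 6 of 8 transactions with only one shared transaction, so it would survive a coverage threshold of 0.7 with a moderate overlap bound; a miner enumerates subgraph sets satisfying all three constraints (relative frequency, coverage, overlap).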
A Novel Null-Invariant Temporal Measure to Discover Partial Periodic Patterns in Non-uniform Temporal Databases
R. Uday Kiran, Vipul Chhabra, Saideep Chennupati, Krishna Reddy Polepalli, Minh-Son Dao, Koji Zettsu
International Conference on Database Systems for Advanced Applications, DASFAA, 2022
@inproceedings{bib_A_No_2022, AUTHOR = {R. Uday Kiran, Vipul Chhabra, Saideep Chennupati, Krishna Reddy Polepalli, Minh-Son Dao, Koji Zettsu}, TITLE = {A Novel Null-Invariant Temporal Measure to Discover Partial Periodic Patterns in Non-uniform Temporal Databases}, BOOKTITLE = {International Conference on Database Systems for Advanced Applications}, YEAR = {2022}}
The “rare item problem” is a fundamental problem in pattern mining. It refers to the inability of a pattern mining model to discover knowledge about both frequent and rare items in a database. In the literature, researchers advocated the usage of null-invariant measures as they disclose genuine correlations without being influenced by object co-absence in the database. Since the existing null-invariant measures consider only an item’s frequency and disregard its temporal occurrence information, they are inadequate to address the rare item problem faced by the partial periodic pattern model. This paper proposes a novel null-invariant measure, called relative periodic-support, to find the patterns containing both frequent and rare items in non-uniform temporal databases. We also introduce an efficient pattern-growth algorithm to find all desired patterns in a database. Experimental results demonstrate that our algorithm is efficient.
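To make the intuition concrete, here is a sketch of periodic-support (the count of consecutive occurrence gaps within a period threshold) and a self-normalized variant. The normalization shown is one plausible reading of a "relative" measure that treats rare and frequent items evenly; the paper's exact formula is not given in this abstract and may differ.

```python
def periodic_support(timestamps, max_period):
    """Number of consecutive occurrence pairs whose gap is <= max_period."""
    ts = sorted(timestamps)
    return sum(1 for a, b in zip(ts, ts[1:]) if b - a <= max_period)

def relative_periodic_support(timestamps, max_period):
    """Periodic-support normalised by the item's own occurrence count, so
    a rare item is judged against itself rather than against frequent
    items (hypothetical reading of the null-invariant measure)."""
    ts = sorted(timestamps)
    return periodic_support(ts, max_period) / (len(ts) - 1) if len(ts) > 1 else 0.0

frequent = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # occurs every timestamp
rare = [2, 5, 8]                              # rare, but perfectly periodic
```

With an absolute periodic-support threshold the rare item (at most 2) can never compete with the frequent one (up to 9); the relative form scores both as perfectly periodic, which is the rare-item effect the measure targets.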
Air Quality Data Collection in Hyderabad Using Low-Cost Sensors: Initial Experiences
Chandra Shekar N, Srinivas Reddy Annappalli, Krishna Reddy Polepalli, Anirban Mondal, Girish Agrawal
International Conference on Database Systems for Advanced Applications, DASFAA, 2022
@inproceedings{bib_Air__2022, AUTHOR = {Chandra Shekar N, Srinivas Reddy Annappalli, Krishna Reddy Polepalli, Anirban Mondal, Girish Agrawal}, TITLE = {Air Quality Data Collection in Hyderabad Using Low-Cost Sensors: Initial Experiences}, BOOKTITLE = {International Conference on Database Systems for Advanced Applications}, YEAR = {2022}}
Exposure to ambient particulate matter (PM) air pollution is a leading risk factor for morbidity and mortality. The most common approach for air quality monitoring is to rely on environmental monitoring stations, which are expensive to acquire as well as to maintain. Moreover, such stations are typically sparsely deployed, thereby resulting in limited spatial resolution for measurements. Recently, low-cost air quality sensors have emerged as an alternative for improving the granularity of monitoring. We are exploring a framework for air quality data collection employing low-cost sensors. In this paper, we report our initial experiences and observations concerning PM data collection over four months, starting from October 2021, in the city of Hyderabad in India.
Visualizing Spatio-temporal Variation of Ambient Air Pollution in Four Small Towns in India
Girish Agrawal, Hifzur Rahman, Anirban Mondal, Krishna Reddy Polepalli
International Conference on Database Systems for Advanced Applications, DASFAA, 2022
@inproceedings{bib_Visu_2022, AUTHOR = {Girish Agrawal, Hifzur Rahman, Anirban Mondal, Krishna Reddy Polepalli}, TITLE = {Visualizing Spatio-temporal Variation of Ambient Air Pollution in Four Small Towns in India}, BOOKTITLE = {International Conference on Database Systems for Advanced Applications}, YEAR = {2022}}
Air pollution is a major threat to human health in India. More than three-quarters of the people in India are exposed to pollution levels higher than the limits recommended by the National Ambient Air Quality Standards in India and significantly higher than those recommended by the World Health Organization. Despite the poor air quality, the monitoring of air pollution levels is limited even in large urban areas in India and virtually absent in small towns and rural areas. The lack of data results in a minimal understanding of spatial and temporal patterns of air pollutants at local and regional levels. This paper is the second in a planned series of papers presenting particulate air pollution trends monitored in small cities and towns in India. The findings presented here are important for framing state and regional level policies for addressing air pollution problems in urban areas, and for achieving the sustainable development goals (SDGs) linked to public health, reduction in the adverse environmental impact of cities, and adaptation to climate change, as indicated by SDGs 3.9, 11.6 and 11.b.
A Market Segmentation Aware Retail Itemset Placement Framework
Raghav Mittal, Anirban Mondal, Krishna Reddy Polepalli
International Conference on Database and Expert Systems Applications, DEXA, 2022
@inproceedings{bib_A_Ma_2022, AUTHOR = {Raghav Mittal, Anirban Mondal, Krishna Reddy Polepalli}, TITLE = {A Market Segmentation Aware Retail Itemset Placement Framework}, BOOKTITLE = {International Conference on Database and Expert Systems Applications}, YEAR = {2022}}
It is a well-established fact in the retail industry that the placement of products on the shelves of the retail store has a significant impact on the revenue of the retailer. Given that customers tend to purchase sets of items together (i.e., itemsets) instead of individual items, it becomes a necessity to strategically place itemsets on the shelves of the retail store for improving retailer revenue. Furthermore, in practice, customers belong to different market segments based on factors such as purchasing power, demographics and customer behaviour. Existing research efforts do not address the issue of market segmentation w.r.t. itemset placement in retail stores. Consequently, they fail to efficiently index, retrieve and place high-utility itemsets in the retail slots in a market segmentation aware manner. In this work, we introduce the problem of market segmentation aware itemset placement for retail stores. Moreover, we propose a market segmentation aware retail itemset placement framework, which takes high-utility itemsets as input. Our performance evaluation with two real datasets demonstrates that our proposed framework is indeed effective in improving retailer revenue w.r.t. existing schemes.
A Spatiotemporal Image Fusion Method for Predicting High-Resolution Satellite Images
Vipul Chhabra,R. Uday Kiran,Juan Xiao,Krishna Reddy Polepalli,Ram Avtar
International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE, 2022
@inproceedings{bib_A_Sp_2022, AUTHOR = {Vipul Chhabra, R. Uday Kiran, Juan Xiao, Krishna Reddy Polepalli, Ram Avtar}, TITLE = {A Spatiotemporal Image Fusion Method for Predicting High-Resolution Satellite Images}, BOOKTITLE = {International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems}. YEAR = {2022}}
Given a coarse satellite image and a fine satellite image of a particular location taken at the same time, the high-resolution spatiotemporal image fusion technique involves understanding the spatial correlation between the pixels of both images and using it to generate a finer image for a given coarse (or test) image taken at a later time. This technique is extensively used for monitoring agricultural land cover, forest cover, etc. The two key issues in this technique are: (i) handling missing pixel data and (ii) improving the prediction accuracy of the fine image generated from the given test coarse image. This paper tackles these two issues by proposing an efficient method consisting of the following three basic steps: (i) imputation of missing pixels using neighborhood information, (ii) cross-scale matching to adjust both the Point Spread Function (PSF) effect and geo-registration errors between the coarse and high-resolution images, and (iii) error-based modulation, which uses pixel-based multiplicative factors and residuals to fix the error caused due to modulation of temporal changes. The experimental results on real-world satellite imagery datasets demonstrate that the proposed model outperforms the state-of-the-art by producing high-resolution satellite images closer to the ground truth.
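As a rough, hypothetical sketch of step (i) above — not the authors' exact imputation method — missing pixels can be filled from neighborhood information, e.g. the mean of the valid pixels in a 3x3 window:

```python
import numpy as np

def impute_missing(img, missing):
    """Fill missing pixels with the mean of valid pixels in a 3x3 window.

    img: 2-D float array; missing: boolean mask (True = missing pixel).
    Illustrative only; the paper's actual imputation step may differ.
    """
    out = img.copy()
    for i, j in zip(*np.where(missing)):
        i0, j0 = max(i - 1, 0), max(j - 1, 0)
        patch = img[i0:i + 2, j0:j + 2]       # 3x3 neighborhood (clipped at edges)
        valid = ~missing[i0:i + 2, j0:j + 2]  # neighbors that are not missing
        if valid.any():
            out[i, j] = patch[valid].mean()
    return out
```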
A Pattern Mining Framework for Improving Billboard Advertising Revenue
PARVATHANENI REVANTH RATHAN,Krishna Reddy Polepalli,Anirban Mondal
Transactions on Large-Scale Data- and Knowledge-Centered Systems, TLDKS, 2022
@inproceedings{bib_A_Pa_2022, AUTHOR = {PARVATHANENI REVANTH RATHAN, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {A Pattern Mining Framework for Improving Billboard Advertising Revenue}, BOOKTITLE = {Transactions on Large-Scale Data- and Knowledge-Centered Systems}. YEAR = {2022}}
Billboard advertisement is one of the dominant modes of traditional outdoor advertisements. A billboard operator manages the ad slots of a set of billboards. Normally, a user traversal is exposed to multiple billboards. Given a set of billboards, there is an opportunity to improve the revenue of the billboard operator by satisfying the advertising demands of an increased number of clients and ensuring that a user gets exposed to different ads on the billboards during the traversal. In this paper, we propose a framework to improve the revenue of the billboard operator by employing transactional modeling in conjunction with pattern mining. Our main contributions are three-fold. First, we introduce the problem of billboard advertisement allocation for improving the billboard operator revenue. Second, we propose an efficient user trajectory-based transactional framework using coverage pattern mining for improving the revenue of the billboard operator. Third, we conduct a performance study with a real dataset to demonstrate the effectiveness of our proposed framework.
Journey to the center of the words: Word weighting scheme based on the geometry of word embeddings
Narendra Babu Unnam,Krishna Reddy Polepalli,Amit Pandey,Naresh Manwani
International Conference on Scientific and Statistical Database Management, SSDBM, 2022
@inproceedings{bib_Jour_2022, AUTHOR = {Narendra Babu Unnam, Krishna Reddy Polepalli, Amit Pandey, Naresh Manwani}, TITLE = {Journey to the center of the words: Word weighting scheme based on the geometry of word embeddings}, BOOKTITLE = {International Conference on Scientific and Statistical Database Management}. YEAR = {2022}}
Improving efficiency of block-level agrometeorological advisory system by exploiting reuse: A study in Telangana
A MAMATHA,Krishna Reddy Polepalli,Balajinaik Banoth,Sreenivas Gade,Anirban Mondal,Seishi Ninomiya
Journal of Agrometeorology, AGROMET, 2022
@inproceedings{bib_Impr_2022, AUTHOR = {A MAMATHA, Krishna Reddy Polepalli, Balajinaik Banoth, Sreenivas Gade, Anirban Mondal, Seishi Ninomiya}, TITLE = {Improving efficiency of block-level agrometeorological advisory system by exploiting reuse: A study in Telangana}, BOOKTITLE = {Journal of Agrometeorology}. YEAR = {2022}}
India Meteorological Department (IMD) started the block-level agromet advisory (AA) service in 2015, and it currently operates in a few blocks of each state across India. In a block-level AA service, AA is prepared for each block every Tuesday and Friday based on the block-level Medium Range weather Forecast (MRF). In this paper, we propose a framework to improve the preparation of block-level AA by modeling a weather situation as a “Category-based Weather Condition (CWC)” and exploiting both “temporal reuse” and “spatial reuse” of AA based on the similarity among CWCs. The weather data analysis for 12 blocks of Telangana, considering the phenophase-specific CWCs of the rice crop, showed that there is scope to improve the efficiency of the block-level AA bulletin preparation process by exploiting reuse.
Efficient Discovery of Partial Periodic Patterns in Large Temporal Databases
Rage Uday Kiran,Pamalla Veena,Penugonda Ravikumar,Saideep Chennupati,Koji Zettsu,Haichuan Shang,Masashi Toyoda,Masaru Kitsuregawa,Krishna Reddy Polepalli
Electronics, Electronics, 2022
@inproceedings{bib_Effi_2022, AUTHOR = {Rage Uday Kiran, Pamalla Veena, Penugonda Ravikumar, Saideep Chennupati, Koji Zettsu, Haichuan Shang, Masashi Toyoda, Masaru Kitsuregawa, Krishna Reddy Polepalli}, TITLE = {Efficient Discovery of Partial Periodic Patterns in Large Temporal Databases}, BOOKTITLE = {Electronics}. YEAR = {2022}}
Periodic pattern mining is an emerging technique for knowledge discovery. Most previous approaches have aimed to find only those patterns that exhibit full (or perfect) periodic behavior in databases. Consequently, the existing approaches miss interesting patterns that exhibit partial periodic behavior in a database. With this motivation, this paper proposes a novel model for finding partial periodic patterns that may exist in temporal databases. An efficient pattern-growth algorithm, called Partial Periodic Pattern-growth (3P-growth), is also presented, which can effectively find all desired patterns within a database. Substantial experiments on both real-world and synthetic databases showed that our algorithm is not only efficient in terms of memory and runtime, but is also highly scalable. Finally, the effectiveness of our patterns is demonstrated using two case studies. In the first case study, our model was employed to identify the highly polluted areas in Japan. In the second case study, our model was employed to identify the road segments on which people regularly face traffic congestion.
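To make the full-versus-partial distinction concrete — as an illustrative sketch only, not the paper's 3P-growth measure — an event's partial periodicity can be scored by the fraction of period-aligned windows in which it occurs at least once:

```python
def partial_periodic_ratio(timestamps, period, span):
    """Fraction of period-length windows in [0, span) containing at least
    one occurrence of the event. 1.0 corresponds to fully periodic
    behavior; values below 1.0 indicate partial periodicity.
    A toy illustration, not the 3P-growth model's exact measure.
    """
    windows_hit = {t // period for t in timestamps if 0 <= t < span}
    n_windows = -(-span // period)  # ceiling division
    return len(windows_hit) / n_windows
```

Under this toy measure, an event seen in 3 of 4 windows scores 0.75 and would be missed by a full-periodicity model that demands a score of 1.0.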
A framework for itemset placement with diversification for retail businesses
Anirban Mondal,Raghav Mittal,Parul Chaudhary,Krishna Reddy Polepalli
Applied Intelligence, APIN, 2022
@inproceedings{bib_A_fr_2022, AUTHOR = {Anirban Mondal, Raghav Mittal, Parul Chaudhary, Krishna Reddy Polepalli}, TITLE = {A framework for itemset placement with diversification for retail businesses}, BOOKTITLE = {Applied Intelligence}. YEAR = {2022}}
Alongside revenue maximization, retailers seek to offer a diverse range of items to facilitate sustainable revenue generation in the long run. Moreover, customers typically buy sets of items, i.e., itemsets, as opposed to individual items. Therefore, strategic placement of diversified and high-revenue itemsets is a priority for the retailer. Research efforts made towards the extraction and placement of high-revenue itemsets in retail stores do not consider the notion of diversification. Further, the candidate itemsets generated using existing utility mining schemes usually explode in number, which can cause memory and retrieval-time issues. This work makes three key contributions. First, we propose an efficient framework for retrieval of high-revenue itemsets with a varying size and a varying degree of diversification. A higher degree of diversification is indicative of fewer repetitive items in the top-revenue itemsets. Second, we propose the kUI (k Utility Itemset) index for quick and efficient retrieval of diverse top-λ high-revenue itemsets. We also propose the HUDIP (High-Utility and Diversified Itemset Placement) scheme, which exploits our proposed kUI index for placement of high-revenue and diversified itemsets. Third, our extensive performance study with both real and synthetic datasets demonstrates the effectiveness of our proposed HUDIP scheme in efficiently determining high-revenue and diversified itemsets.
Improving Billboard Advertising Revenue Using Transactional Modeling and Pattern Mining
PARVATHANENI REVANTH RATHAN,Krishna Reddy Polepalli,Anirban Mondal
International Conference on Database and Expert Systems Applications, DEXA, 2021
@inproceedings{bib_Impr_2021, AUTHOR = {PARVATHANENI REVANTH RATHAN, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {Improving Billboard Advertising Revenue Using Transactional Modeling and Pattern Mining}, BOOKTITLE = {International Conference on Database and Expert Systems Applications}. YEAR = {2021}}
Billboard advertisement is among the dominant modes of outdoor advertisements. The billboard operator has an opportunity to improve its revenue by satisfying the advertising demands of an increased number of clients by means of exploiting the user trajectory data. Hence, we introduce the problem of billboard advertisement allocation for improving the billboard operator revenue, and propose an efficient user trajectory-based transactional framework using coverage pattern mining. Our experiments validate the effectiveness of our framework.
An Urgency-Aware and Revenue-Based Itemset Placement Framework for Retail Stores
Raghav Mittal,Anirban Mondal,Parul Chaudhary,Krishna Reddy Polepalli
International Conference on Database and Expert Systems Applications, DEXA, 2021
@inproceedings{bib_An_U_2021, AUTHOR = {Raghav Mittal, Anirban Mondal, Parul Chaudhary, Krishna Reddy Polepalli}, TITLE = {An Urgency-Aware and Revenue-Based Itemset Placement Framework for Retail Stores}, BOOKTITLE = {International Conference on Database and Expert Systems Applications}. YEAR = {2021}}
Placement of items on the shelf space of retail stores significantly impacts the revenue of the retailer. Given the prevalence and popularity of medium-to-large-size retail stores, several research efforts have been made towards facilitating item/itemset placement in retail stores for improving retailer revenue. However, they do not consider the issue of urgency of sale of individual items. Hence, they cannot efficiently index, retrieve and place high-revenue itemsets in retail store slots in an urgency-aware manner. Our key contributions are two-fold. First, we introduce the notion of urgency for retail itemset placement. Second, we propose the urgency-aware URI index for efficiently retrieving high-revenue and urgent itemsets of different sizes. We discuss the URIP itemset placement scheme, which exploits URI for improving retailer revenue. We also conduct a performance evaluation with two real datasets to demonstrate that URIP is indeed effective in improving retailer revenue w.r.t. existing schemes.
A Retail Itemset Placement Framework Based on Premiumness of Slots and Utility Mining
Anirban Mondal,Samant Saurabh,Parul Chaudhary,Raghav Mittal,Krishna Reddy Polepalli
IEEE Access, ACCESS, 2021
@inproceedings{bib_A_Re_2021, AUTHOR = {Anirban Mondal, Samant Saurabh, Parul Chaudhary, Raghav Mittal, Krishna Reddy Polepalli}, TITLE = {A Retail Itemset Placement Framework Based on Premiumness of Slots and Utility Mining}, BOOKTITLE = {IEEE Access}. YEAR = {2021}}
Retailer revenue is significantly impacted by item placement in retail stores. Notably, placement of items in the premium slots (i.e., slots with increased visibility/accessibility) improves the probability of sale w.r.t. item placement in non-premium slots. Moreover, customers often tend to buy sets of items (i.e., itemsets) instead of individual purchases. In this paper, we address the problem of maximizing retailer revenue by determining the placement of itemsets in different types of slots with varied premiumness. Our key contributions are as follows. First, we introduce the notion of premiumness of retail slots and discuss the issue of itemset placement in slots with varied premiumness. Second, we propose two efficient schemes, namely Premiumness and Revenue-based Itemset Placement (PRIP) and Premiumness and Average Revenue-based Itemset Placement (PARIP), for placing itemsets with varying revenue in slots with varied premiumness. Third, we perform a detailed performance analysis using both real and synthetic datasets to showcase the effectiveness of our proposed schemes. We also perform a comprehensive mathematical complexity analysis of our proposed schemes.
An incremental framework to extract coverage patterns for dynamic databases
Komallapalli Kaushik,Krishna Reddy Polepalli,Anirban Mondal,Ralla Akhil
International Journal of Data Science and Analytics, IJDSA, 2021
@inproceedings{bib_An_i_2021, AUTHOR = {Komallapalli Kaushik, Krishna Reddy Polepalli, Anirban Mondal, Ralla Akhil}, TITLE = {An incremental framework to extract coverage patterns for dynamic databases}, BOOKTITLE = {International Journal of Data Science and Analytics}. YEAR = {2021}}
Pattern mining is an important task of data mining and involves the extraction of interesting associations from large transactional databases. Typically, a given transactional database D gets updated due to the addition and deletion of transactions. Consequently, some of the previously discovered patterns may become invalid, while some new patterns may emerge. This has motivated significant research efforts in the area of incremental mining. The goal of incremental mining is to efficiently mine patterns when D gets updated with additions and/or deletions of transactions as opposed to mining all of the patterns from scratch. Incidentally, active research efforts are being made to develop incremental pattern mining algorithms for extracting frequent patterns, sequential patterns and utility patterns. Another important type of pattern is the coverage pattern (CP), which has significant applications in areas such as banner advertising, search engine advertising and visibility mining. However, none of the existing works address the issue of incremental mining for extracting CPs. In this regard, the main contributions of this work are twofold. First, we introduce the problem of incremental mining of CPs. Second, we propose an approach, designated as Comprehensive Coverage Pattern Mining, for efficiently extracting CPs under the incremental paradigm. We have also performed extensive experiments using two real click-stream datasets and one synthetic dataset to demonstrate the overall effectiveness of our proposed approach.
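The coverage-pattern model the abstract builds on can be illustrated in a few lines. This is a simplified reading of the standard CP measures (coverage support and per-item overlap), hedged as a sketch rather than the authors' incremental algorithm:

```python
def coverage_stats(pattern, tid_sets, n_transactions):
    """Coverage support and per-item overlap ratios of a candidate pattern.

    pattern: items ordered by decreasing frequency;
    tid_sets: item -> set of ids of transactions containing that item.
    A simplified sketch of the coverage-pattern measures, not the
    paper's incremental mining approach.
    """
    covered, overlaps = set(), []
    for item in pattern:
        tids = tid_sets[item]
        if covered:
            # fraction of this item's transactions already covered so far
            overlaps.append(len(tids & covered) / len(tids))
        covered |= tids
    coverage_support = len(covered) / n_transactions
    return coverage_support, overlaps
```

A pattern qualifies when its coverage support is high enough while every overlap ratio stays below a user threshold, i.e., each added item contributes mostly new transactions.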
An Efficient Distributed Coverage Pattern Mining Algorithm
Sathineni Preetham Reddy,Srinivas Reddy Annappalli,Krishna Reddy Polepalli,Anirban Mondal
International Conference on Big Data Analytics, BDA, 2021
@inproceedings{bib_An_E_2021, AUTHOR = {Sathineni Preetham Reddy, Srinivas Reddy Annappalli, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {An Efficient Distributed Coverage Pattern Mining Algorithm}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2021}}
Mining of coverage patterns from transactional databases is an important data mining task. It has applications in banner advertising, search engine advertising and visibility computation. Real-world transactional databases are typically large, and mining coverage patterns from them, such as query log transactions, on a single computer is challenging and time-consuming. In this paper, we propose the Distributed Coverage Pattern Mining (DCPM) approach. In this approach, we employ a notion of the summarized form of the Inverse Transactional Database (ITD) and replicate it at every node. We also employ an efficient clustering-based method to distribute the computational load of extracting coverage patterns among the Worker nodes. We performed extensive experiments using two real-world datasets and one synthetic dataset. The results show that the proposed approach significantly improves the performance over the state-of-the-art approaches in terms of execution time and data shuffled.
A Model of Graph Transactional Coverage Patterns with Applications to Drug Discovery
Srinivas Reddy Annappalli,Krishna Reddy Polepalli,Anirban Mondal,Deva Priyakumar U
International Conference on High Performance Computing, HiPC, 2021
@inproceedings{bib_A_Mo_2021, AUTHOR = {Srinivas Reddy Annappalli, Krishna Reddy Polepalli, Anirban Mondal, Deva Priyakumar U}, TITLE = {A Model of Graph Transactional Coverage Patterns with Applications to Drug Discovery}, BOOKTITLE = {International Conference on High Performance Computing}. YEAR = {2021}}
Facilitating the discovery of drugs by combining diverse compounds is becoming prevalent, especially for treating complex diseases like cancers and HIV. A drug is a chemical compound structure and any sub-structure of a chemical compound is designated as a fragment. A chemical compound or a fragment can be modeled as a graph structure. Given a set of chemical compounds and their corresponding large set of fragments modeled as graph structures, we address the problem of identifying potential combinations of diverse chemical compounds, which cover a certain percentage of the set of fragments. In this regard, the key contributions of this work are three-fold: First, we introduce the notion of Graph Transactional Coverage Patterns (GTCPs) for any given graph transactional dataset. Second, we propose an efficient model and framework for extracting GTCPs from a given graph transactional dataset. Third, we conduct an extensive performance study using three real datasets to demonstrate that it is indeed feasible to efficiently extract GTCPs using our proposed GTCP-extraction framework. We also demonstrate the effectiveness of the GTCP-extraction framework through a case study in computer-aided drug design.
Discovering Relative High Utility Itemsets in Very Large Transactional Databases Using Null-Invariant Measure
R. Uday Kiran,PALLIKILA PRADEEP CHANDRA REDDY,J. M. Luna,Philippe Fournier-Viger,Masashi Toyoda,Krishna Reddy Polepalli
International Conference on Big Data, BD, 2021
@inproceedings{bib_Disc_2021, AUTHOR = {R. Uday Kiran, PALLIKILA PRADEEP CHANDRA REDDY, J. M. Luna, Philippe Fournier-Viger, Masashi Toyoda, Krishna Reddy Polepalli}, TITLE = {Discovering Relative High Utility Itemsets in Very Large Transactional Databases Using Null-Invariant Measure}, BOOKTITLE = {International Conference on Big Data}. YEAR = {2021}}
High utility itemset mining is an important model in data mining. It involves discovering all itemsets in a quantitative transactional database that satisfy a user-specified minimum utility (minUtil) constraint. MinUtil controls the minimum value that an itemset must maintain in a database. Since the model evaluates an itemset’s interestingness using only the minUtil constraint, it implicitly assumes that all items in the database have similar utility values. However, some items have high utility, while others may have relatively low utility in a database. If minUtil is set too high, the user will miss all itemsets containing low utility items. To find itemsets that involve both high and low utility items, minUtil has to be set very low. However, this may cause a combinatorial explosion as the items with high utility may combine with others in all possible ways. This dilemma is called the low utility item problem. This paper proposes a flexible …
Discovering Top-k Spatial High Utility Itemsets in Very Large Quantitative Spatiotemporal databases
PALLIKILA PRADEEP CHANDRA REDDY,P. Veena,R. Uday Kiran,Ram Avatar,Sadanori Ito,Koji Zettsu,Krishna Reddy Polepalli
International Conference on Big Data, BD, 2021
@inproceedings{bib_Disc_2021, AUTHOR = {PALLIKILA PRADEEP CHANDRA REDDY, P. Veena, R. Uday Kiran, Ram Avatar, Sadanori Ito, Koji Zettsu, Krishna Reddy Polepalli}, TITLE = {Discovering Top-k Spatial High Utility Itemsets in Very Large Quantitative Spatiotemporal databases}, BOOKTITLE = {International Conference on Big Data}. YEAR = {2021}}
Spatial High Utility Itemset Mining (SHUIM) is an important knowledge discovery technique with many real-world applications. It involves discovering all itemsets that satisfy the user-specified minimum utility (minUtil) in a quantitative spatiotemporal database. The popular adoption and the successful industrial application of this technique have been hindered by the following two limitations: (i) Since the rationale of SHUIM is to find all itemsets that satisfy the minUtil constraint, it often produces too many patterns, most of which may be redundant or uninteresting to the user. (ii) Specifying a right minUtil value is an open research problem in SHUIM. This paper tackles these two problems by proposing a novel model of top-k spatial high utility itemsets that may exist in a database. A new constraint, called dynamic minimum utility (dMinUtil), was explored to reduce the search space effectively. This constraint is based on …
Mining subgraph coverage patterns from graph transactions
Srinivas Reddy Annappalli,Krishna Reddy Polepalli,Anirban Mondal,Deva Priyakumar U
International Journal of Data Science and Analytics, IJDSA, 2021
@inproceedings{bib_Mini_2021, AUTHOR = {Srinivas Reddy Annappalli, Krishna Reddy Polepalli, Anirban Mondal, Deva Priyakumar U}, TITLE = {Mining subgraph coverage patterns from graph transactions}, BOOKTITLE = {International Journal of Data Science and Analytics}. YEAR = {2021}}
Pattern mining from graph transactional data (GTD) is an active area of research with applications in the domains of bioinformatics, chemical informatics and social networks. Existing works address the problem of mining frequent subgraphs from GTD. However, the knowledge concerning the coverage aspect of a set of subgraphs is also valuable for improving the performance of several applications. In this regard, we introduce the notion of subgraph coverage patterns (SCPs). Given a GTD, a subgraph coverage pattern is a set of subgraphs subject to relative frequency, coverage and overlap constraints provided by the user. We propose the Subgraph ID-based Flat Transactional (SIFT) framework for the efficient extraction of SCPs from a given GTD. Our performance evaluation using three real datasets demonstrates that our proposed SIFT framework is indeed capable of efficiently extracting SCPs from GTD. Furthermore, we demonstrate the effectiveness of SIFT through a case study in computer-aided drug design.
An Improved Dummy Generation Approach for Enhancing User Location Privacy
Anirban Mondal,M A Shadaab Siddiqie,Krishna Reddy Polepalli
International Conference on Database Systems for Advanced Applications, DASFAA, 2021
@inproceedings{bib_An_I_2021, AUTHOR = {Anirban Mondal, M A Shadaab Siddiqie, Krishna Reddy Polepalli}, TITLE = {An Improved Dummy Generation Approach for Enhancing User Location Privacy}, BOOKTITLE = {International Conference on Database Systems for Advanced Applications}. YEAR = {2021}}
Location-based services (LBS), which provide personalized and timely information, entail privacy concerns such as unwanted leak of current user locations to potential stalkers. Existing works have proposed dummy generation techniques by creating a cloaking region (CR) such that the user’s location is at a fixed distance from the center of CR. Hence, if the adversary somehow knows the location of the center of CR, the user’s location would be vulnerable to attack. We propose an improved dummy generation approach for facilitating improved location privacy for mobile users. Our performance study demonstrates that our proposed approach is indeed effective in improving user location privacy.
A Novel Parameter-Free Energy Efficient Fuzzy Nearest Neighbor Classifier for Time Series Data
Penugonda Ravikumar,R. Uday Kiran,NARENDRA BABU UNNAM,Yutaka Watanobe,Kazuo Goda,V. Susheela Devi,Krishna Reddy Polepalli
International Conference on Fuzzy Systems, FUZZ , 2021
@inproceedings{bib_A_No_2021, AUTHOR = {Penugonda Ravikumar, R. Uday Kiran, NARENDRA BABU UNNAM, Yutaka Watanobe, Kazuo Goda, V. Susheela Devi, Krishna Reddy Polepalli}, TITLE = {A Novel Parameter-Free Energy Efficient Fuzzy Nearest Neighbor Classifier for Time Series Data}, BOOKTITLE = {International Conference on Fuzzy Systems}. YEAR = {2021}}
Time series classification is an important model in data mining. It involves assigning a class label to a test instance based on the training data with known class labels. Most previous studies developed time series classifiers by disregarding the fuzzy nature of events (i.e., events with similar values may belong to different classes) within the data. Consequently, these studies suffered from performance issues, including decreased accuracy and increased memory, runtime, and energy requirements. With this motivation, this paper proposes a novel fuzzy nearest neighbor classifier for time series data. The basic idea of our classifier is to transform the very large training data into a relatively small representative training data and use it to label a test instance by employing a new fuzzy distance measure known as Ravi. Experimental results on real world benchmark datasets demonstrate that the proposed classifier …
A framework for discovering popular paths using transactional modeling and pattern mining
PARVATHANENI REVANTH RATHAN,Krishna Reddy Polepalli,Anirban Mondal
Distributed and Parallel Databases, DBP, 2021
@inproceedings{bib_A_fr_2021, AUTHOR = {PARVATHANENI REVANTH RATHAN, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {A framework for discovering popular paths using transactional modeling and pattern mining}, BOOKTITLE = {Distributed and Parallel Databases}. YEAR = {2021}}
While the problems of finding the shortest path and k-shortest paths have been extensively researched, the research community has been shifting its focus towards discovering and identifying paths based on user preferences. Since users naturally follow some of the paths more than other paths, the popularity of a given path often reflects such user preferences. Given a set of user traversals in a road network and a set of paths between a given source and destination pair, we address the problem of performing top-k ranking of the paths in that set based on path popularity. In this paper, we introduce a new model for computing the popularity scores of paths. Our main contributions are threefold. First, we propose a framework for modeling user traversals in a road network as transactions. Second, we present an approach for efficiently computing the popularity score of any path based on the itemsets extracted from the transactions using pattern mining techniques. Third, we conducted an extensive performance evaluation with two real datasets to demonstrate the effectiveness of the proposed scheme.
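The core idea of treating traversals as transactions can be sketched naively: score a path by the share of traversals that actually follow it. This is a hypothetical baseline for illustration; the paper instead derives scores from itemsets mined over the transaction model:

```python
def path_popularity(path, traversals):
    """Fraction of user traversals containing the path's edges as a
    contiguous subsequence. A naive illustrative score, not the
    paper's itemset-based popularity model.

    path, traversals: sequences of edge ids.
    """
    k = len(path)

    def contains(trav):
        # does this traversal follow the path's edges consecutively?
        return any(list(trav[i:i + k]) == list(path)
                   for i in range(len(trav) - k + 1))

    hits = sum(1 for t in traversals if contains(t))
    return hits / len(traversals)
```

The direct count above scans every traversal per query; the appeal of the pattern-mining formulation is that frequent sub-path statistics are extracted once and reused across ranking queries.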
PEAR: A Product Expiry-Aware and Revenue-Conscious Itemset Placement Scheme
Anirban Mondal,Raghav Mittal,Vrinda Khandelwal,Parul Chaudhary,Krishna Reddy Polepalli
International Conference on Data Science and Advanced Analytics, DSAA, 2021
@inproceedings{bib_PEAR_2021, AUTHOR = {Anirban Mondal, Raghav Mittal, Vrinda Khandelwal, Parul Chaudhary, Krishna Reddy Polepalli}, TITLE = {PEAR: A Product Expiry-Aware and Revenue-Conscious Itemset Placement Scheme}, BOOKTITLE = {International Conference on Data Science and Advanced Analytics}. YEAR = {2021}}
Placement of items on the shelf space of retail stores significantly impacts the revenue of the retailer. Since customers typically tend to buy sets of items (i.e., itemsets) together, several research efforts have been undertaken towards facilitating itemset placement in retail stores for improving retailer revenue. However, they fail to consider that the time-period of expiry can vary across items, i.e., some items expire sooner than others. This leads to a loss of opportunity for improving retailer revenue. Hence, we propose PEAR, a Product Expiry-Aware and Revenue-conscious itemset placement scheme for improving retailer revenue. Our key contributions are three-fold. First, we introduce the problem of addressing retail itemset placement when the items can be associated with different time-periods of expiry. Second, we propose the expiry-aware PEAR scheme for efficiently identifying and placing high …
A Retail Itemset Placement Framework Based on Premiumness of Slots and Utility Mining
ANIRBAN MONDAL, SAMANT SAURABH, PARUL CHAUDHARY,RAGHAV MITTAL,Krishna Reddy Polepalli
IEEE Access, ACCESS, 2021
@inproceedings{bib_A_Re_2021, AUTHOR = {ANIRBAN MONDAL, SAMANT SAURABH, PARUL CHAUDHARY, RAGHAV MITTAL, Krishna Reddy Polepalli}, TITLE = {A Retail Itemset Placement Framework Based on Premiumness of Slots and Utility Mining}, BOOKTITLE = {IEEE Access}. YEAR = {2021}}
Retailer revenue is significantly impacted by item placement in retail stores. Notably, placement of items in the premium slots (i.e., slots with increased visibility/accessibility) improves the probability of sale w.r.t. item placement in non-premium slots. Moreover, customers often tend to buy sets of items (i.e., itemsets) rather than individual items. In this paper, we address the problem of maximizing retailer revenue by determining the placement of itemsets in different types of slots with varied premiumness. Our key contributions are as follows. First, we introduce the notion of premiumness of retail slots and discuss the issue of itemset placement in slots with varied premiumness. Second, we propose two efficient schemes, namely Premiumness and Revenue-based Itemset Placement (PRIP) and Premiumness and Average Revenue-based Itemset Placement (PARIP), for placing itemsets with varying revenue in slots with varied premiumness. Third, we perform a detailed performance analysis using both real and synthetic datasets to showcase the effectiveness of our proposed schemes. We also perform a comprehensive complexity analysis of our proposed schemes.
A Revenue-based Product Placement Framework to Improve Diversity in Retail Businesses
Pooja Gaur,Krishna Reddy Polepalli,M. Kumara Swamy,Anirban Mondal
International Conference on Big Data Analytics, BDA, 2020
@inproceedings{bib_A_Re_2020, AUTHOR = {Pooja Gaur, Krishna Reddy Polepalli, M. Kumara Swamy, Anirban Mondal}, TITLE = {A Revenue-based Product Placement Framework to Improve Diversity in Retail Businesses}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2020}}
Product placement in retail stores has a significant impact on the revenue of the retailer. Hence, research efforts are being made to propose approaches for improving item placement in retail stores based on the knowledge of utility patterns extracted from the log of customer purchase transactions. Another strategy to make any retail store interesting from a customer perspective is to cater to the varied requirements and preferences of customers. This can be achieved by placing a wider variety of items in the shelves of the retail store, thereby increasing the diversity of the items that are available for purchase. In this regard, the key contributions of our work are three-fold. First, we introduce the problem of concept hierarchy based diverse itemset placement in retail stores. Second, we present a framework and schemes for facilitating efficient retrieval of the diverse top-revenue itemsets based on a concept hierarchy. Third, we conducted a performance evaluation with a real dataset to demonstrate the overall effectiveness of our proposed schemes.
Parallel Mining of Partial Periodic Itemsets in Big Data
Chennupati Sai Deep,R. Uday Kiran,Koji Zettsu,Cheng-Wei Wu,Krishna Reddy Polepalli,Masashi Toyoda,Masaru Kitsuregawa
International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Sy, IEA/AIE, 2020
@inproceedings{bib_Para_2020, AUTHOR = {Chennupati Sai Deep, R. Uday Kiran, Koji Zettsu, Cheng-Wei Wu, Krishna Reddy Polepalli, Masashi Toyoda, Masaru Kitsuregawa}, TITLE = {Parallel Mining of Partial Periodic Itemsets in Big Data}, BOOKTITLE = {International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Sy}. YEAR = {2020}}
Partial periodic itemsets are an important class of regularities that exist in a temporal database. A partial periodic itemset represents something persistent and predictable that appears in the data. Past studies on partial periodic itemsets have primarily focused on centralized databases and are not scalable for Big Data environments. The advantage of scalability through additional resources cannot be ignored, because we deal with large databases in real-time environments, where using more resources can increase performance. To address this issue, we propose a parallel algorithm that distributes transactional identifiers among the machines and mines the itemsets independently on the different machines. Experiments on Apache Spark's distributed environment show that the proposed approach speeds up as the number of machines increases.
Improving Product Placement in Retail with Generalized High-Utility Itemsets
CHINMAY BAPNA,Krishna Reddy Polepalli,Anirban Mondal
International Conference on Data Science and Advanced Analytics, DSAA, 2020
@inproceedings{bib_Impr_2020, AUTHOR = {CHINMAY BAPNA, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {Improving Product Placement in Retail with Generalized High-Utility Itemsets}, BOOKTITLE = {International Conference on Data Science and Advanced Analytics}. YEAR = {2020}}
Product placement in retail has a significant impact on the sales revenue of retailers. Hence, research efforts are being made to improve retailer revenue using high-utility pattern mining based product placement approaches. However, none of these existing approaches has explored generalized high-utility itemset mining for determining product placement in retail. The knowledge of generalized high-utility itemsets extracted from user purchase transactional database in conjunction with a product taxonomy can provide new insights about customer purchase behaviour. This work proposes the generalized utility itemset (GUI) index for retrieving generalized high-utility (revenue) itemsets. We also present a framework, which leverages the GUI index towards retail product placement to improve revenue. Our performance study using real datasets shows the effectiveness of our proposed scheme w.r.t. two existing …
An improved scheme for determining top-revenue itemsets for placement in retail businesses
Parul Chaudhary,Anirban Mondal,Krishna Reddy Polepalli
International Journal of Data Science and Analytics, IJDSA, 2020
@inproceedings{bib_An_i_2020, AUTHOR = {Parul Chaudhary, Anirban Mondal, Krishna Reddy Polepalli}, TITLE = {An improved scheme for determining top-revenue itemsets for placement in retail businesses}, BOOKTITLE = {International Journal of Data Science and Analytics}. YEAR = {2020}}
Utility mining has been emerging as an important area in data mining. While existing works on utility mining for retail businesses have primarily focused on the problem of finding high-utility itemsets from transactional databases, they implicitly assume that each item occupies only one slot. Here, the slot size of a given item is the number of (integer) slots occupied by that item on the retail store shelves. However, in many real-world scenarios, the number of slots consumed by different items typically varies. Hence, this paper considers that a given item may physically occupy any fixed (integer) number of slots. Thus, we address the problem of efficiently determining the top-utility itemsets when a given number of slots is specified as input. The key contributions of our work are three-fold. First, we present an efficient framework to determine the top-utility itemsets for different user-specified numbers of slots that need to be …
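The slot-budget model above can be sketched as a greedy selection: itemsets are taken in decreasing order of revenue per slot until the user-specified slot budget is exhausted. This is a one-shot approximation for illustration (the paper's framework answers such queries efficiently for any budget); the function name is hypothetical:

```python
def top_utility_fill(itemsets, num_slots):
    """Greedy sketch: pick itemsets by revenue-per-slot ratio under a
    total slot budget.
    itemsets: list of (itemset, revenue, slot_size) triples.
    Returns the chosen itemsets and the number of slots used."""
    ranked = sorted(itemsets, key=lambda x: x[1] / x[2], reverse=True)
    chosen, used = [], 0
    for iset, revenue, size in ranked:
        if used + size <= num_slots:   # itemset fits in remaining slots
            chosen.append(iset)
            used += size
    return chosen, used
```

Like any greedy knapsack heuristic, this can miss the optimal selection; it only conveys why varying slot sizes change the problem relative to one-slot-per-item models.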
Discovering Fuzzy Periodic-Frequent Patterns in Quantitative Temporal Databases
R Uday Kiran,C Saideep,Penugonda Ravikumar,Koji Zettsu, Masashi Toyoda,Masaru Kitsuregawa,Krishna Reddy Polepalli
International Conference on Fuzzy Systems, FUZZ , 2020
@inproceedings{bib_Disc_2020, AUTHOR = {R Uday Kiran, C Saideep, Penugonda Ravikumar, Koji Zettsu, Masashi Toyoda, Masaru Kitsuregawa, Krishna Reddy Polepalli}, TITLE = {Discovering Fuzzy Periodic-Frequent Patterns in Quantitative Temporal Databases}, BOOKTITLE = {International Conference on Fuzzy Systems}. YEAR = {2020}}
Periodic-frequent pattern mining is a challenging problem of great importance in many applications. Most previous works focused on finding these patterns in binary temporal databases and did not take into account the quantities of items within the data. This paper proposes a novel model of fuzzy periodic-frequent pattern (FPFP) that may exist in a quantitative temporal database (QTD). Finding FPFPs in a QTD is a non-trivial and challenging task due to its huge search space. A novel pruning technique, called improved maximum scalar cardinality, has been introduced to effectively reduce the search space and the computational cost of finding the desired itemsets. This technique makes the mining of FPFPs practicable in very large real-world databases. An efficient algorithm has also been presented to find all FPFPs in a QTD. Experimental results demonstrate that the proposed algorithm is efficient. We also …
A document representation framework with interpretable features using pre-trained word embeddings
NARENDRA BABU UNNAM,Krishna Reddy Polepalli
International Journal of Data Science and Analytics, IJDSA, 2020
@inproceedings{bib_A_do_2020, AUTHOR = {NARENDRA BABU UNNAM, Krishna Reddy Polepalli}, TITLE = {A document representation framework with interpretable features using pre-trained word embeddings}, BOOKTITLE = {International Journal of Data Science and Analytics}. YEAR = {2020}}
We propose an improved framework for document representation using word embeddings. The existing models represent the document as a position vector in the same word embedding space. As a result, they are unable to capture the multiple aspects as well as the broad context in the document. Also, due to their low representational power, existing approaches perform poorly at document classification. Furthermore, the document vectors obtained using such methods have uninterpretable features. In this paper, we propose an improved document representation framework which captures multiple aspects of the document with interpretable features. In this framework, a document is represented in a different feature space by representing each dimension with a potential feature word with relatively high discriminating power. A given document is modeled as the distances between the feature words and the …
A model of concept hierarchy-based diverse patterns with applications to recommender system
M. KUMARA SWAMY,Krishna Reddy Polepalli
International Journal of Data Science and Analytics, IJDSA, 2020
@inproceedings{bib_A_mo_2020, AUTHOR = {M. KUMARA SWAMY, Krishna Reddy Polepalli}, TITLE = {A model of concept hierarchy-based diverse patterns with applications to recommender system}, BOOKTITLE = {International Journal of Data Science and Analytics}. YEAR = {2020}}
Frequent pattern mining is one of the popular data mining techniques. Frequent pattern mining approaches extract interesting associations among the items in a given transactional database. The items of the transactional database can be organized as a concept hierarchy. Notably, frequent pattern mining does not distinguish the patterns by analyzing the categories of the items in a given concept hierarchy. In several applications, it is often useful to distinguish among the frequent patterns by analyzing how the items of the pattern are mapped to different categories of the concept hierarchy. In this paper, we propose a new interestingness measure, designated as diversity rank (drank), for capturing the diversity of a given pattern by analyzing the extent to which the items of the pattern are associated with the categories of the corresponding concept hierarchy. Given a transactional database over a set I of items and the corresponding concept hierarchy on I, we propose a methodology to compute the drank of the given pattern. Furthermore, by extending the notion of drank, we propose an approach to improve the diversity and accuracy of association rule-based recommender system. The results of our performance evaluation on the real-world MovieLens dataset demonstrate that the proposed diversity model extracts different kinds of patterns as compared to frequent patterns. Furthermore, our proposed recommender system approach improves the diversity performance w.r.t. the existing association rule-based recommender system without significantly compromising the accuracy. Overall, the proposed concept hierarchy-based diverse pattern model provides a scope to develop new approaches for improving the performance of frequent pattern mining-based applications.
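The intuition behind drank can be conveyed with a toy proxy: a pattern is more diverse when its items fall under more distinct top-level categories of the concept hierarchy. This sketch is a simplified stand-in, not the paper's exact drank formula (which analyzes the full merging structure across hierarchy levels):

```python
def diversity_proxy(pattern, parent):
    """Toy diversity score: fraction of distinct top-level categories
    covered by the pattern's items, given a concept hierarchy encoded
    as a child -> parent mapping."""
    def root(item):
        while item in parent:      # climb to the top of the hierarchy
            item = parent[item]
        return item
    roots = {root(item) for item in pattern}
    return len(roots) / len(pattern)
```

Under this proxy, a pattern whose items all map to one category scores low, while a pattern spanning many categories scores close to 1, mirroring the behaviour drank is designed to capture.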
Sensor-based Framework for Improved Air Conditioning Under Diverse Temperature Layout
M A Shadaab Siddiqie,Ralla Akhil,Krishna Reddy Polepalli,Anirban Mondal
India Joint International Conference on Data Science & Management of Data, COMAD/CODS, 2020
@inproceedings{bib_Sens_2020, AUTHOR = {M A Shadaab Siddiqie, Ralla Akhil, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {Sensor-based Framework for Improved Air Conditioning Under Diverse Temperature Layout}, BOOKTITLE = {India Joint International Conference on Data Science & Management of Data}. YEAR = {2020}}
Due to advances in technology for better output/production, there has been an increased demand for technology with optimized resource utilization in order to conserve resources. Advancements in sensors have improved resource optimization in many fields such as electric power consumption, water management and quality management. In this regard, air conditioner (AC) management is an important aspect of electric power consumption. In this paper, we address the issue of reducing power consumption for air conditioning of a given room. In previous works, the whole room is maintained at a fixed required temperature by inherently assuming that all of the users prefer the same temperature; this is generally not true in practice. In this paper, we propose a sensor-based framework to place and manage ACs for maintaining diverse temperature zones in a given layout to reduce power consumption. Through a simulation study, we demonstrate that the proposed framework indeed has the potential to reduce power consumption significantly as compared to the naive approach by achieving user satisfaction.
Coverage Pattern Mining Based on MapReduce
Ralla Akhil,M A Shadaab Siddiqie,Krishna Reddy Polepalli,Anirban Mondal
India Joint International Conference on Data Science & Management of Data, COMAD/CODS, 2020
@inproceedings{bib_Cove_2020, AUTHOR = {Ralla Akhil, M A Shadaab Siddiqie, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {Coverage Pattern Mining Based on MapReduce}, BOOKTITLE = {India Joint International Conference on Data Science & Management of Data}. YEAR = {2020}}
Pattern mining is an important task of data mining and involves the extraction of interesting associations from large databases. However, developing fast and efficient parallel algorithms for handling large volumes of data is a challenging task. The MapReduce framework enables the distributed processing of huge amounts of data in a large-scale distributed environment with robust fault-tolerance. In this paper, we propose a parallel algorithm for extracting coverage patterns. The results of our performance evaluation with real-world and synthetic datasets demonstrate that it is indeed feasible to extract coverage patterns effectively under the MapReduce framework.
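The MapReduce flavour of the approach can be sketched by its first round: mappers emit (item, transaction-id) pairs over their data splits, and reducers gather per-item transaction-id sets, keeping the items frequent enough to seed coverage patterns. This is only the item-level seed step under assumed names; growing larger coverage patterns with the overlap-ratio constraint is omitted:

```python
from collections import defaultdict

def map_phase(transactions):
    """Map: emit (item, tid) pairs, as a mapper would over its split."""
    for tid, items in enumerate(transactions):
        for item in set(items):
            yield item, tid

def reduce_phase(pairs, n_transactions, min_rf):
    """Reduce: collect tid-sets per item and keep items whose relative
    frequency meets the min_rf threshold -- the seed items from which
    coverage patterns are later grown."""
    tids = defaultdict(set)
    for item, tid in pairs:
        tids[item].add(tid)
    return {item: s for item, s in tids.items()
            if len(s) / n_transactions >= min_rf}
```

On an actual cluster the shuffle between the two phases is handled by the framework; here the generator pipeline simulates it in-process.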
An improved human-in-the-loop model for fine-grained object recognition with batch-based question answering
VYSHNAVI GUTTA,NARENDRA BABU UNNAM,Krishna Reddy Polepalli
India Joint International Conference on Data Science & Management of Data, COMAD/CODS, 2020
@inproceedings{bib_An_i_2020, AUTHOR = {VYSHNAVI GUTTA, NARENDRA BABU UNNAM, Krishna Reddy Polepalli}, TITLE = {An improved human-in-the-loop model for fine-grained object recognition with batch-based question answering}, BOOKTITLE = {India Joint International Conference on Data Science & Management of Data}. YEAR = {2020}}
Fine-grained object recognition refers to a subordinate level of object recognition such as recognition of bird species and car models. It has become crucial for recognition of previously unknown classes. While fine-grained object recognition has seen unprecedented progress with the advent of neural networks, many of the existing works are cost-sensitive as they are acutely picture-dependent and fail without the adequate number of quality pictures. Efforts have been made in the literature for a picture-independent recognition with hybrid human-computer recognition methods via single question answering with a human-in-the-loop. To this end, we propose an improved batch-based local question answering method for making the recognition efficient and picture-independent. When pictures are unavailable, at each time-step, the proposed method mines a batch of binary cluster-centric local questions to pose to a human-in-the-loop and incorporates the responses received to the questions into the model. After a preset number of time-steps, the most probable class of the target object is returned as the final prediction. When pictures are available, our model facilitates the plug-in of computer vision algorithms into the framework for better performance. Experiments on three challenging datasets show significant performance improvement with respect to accuracy and computation time as compared to the existing schemes.
Efficient Discovery of Weighted Frequent Neighborhood Itemsets in Very Large Spatiotemporal Databases
R. UDAY KIRAN,PALLIKILA PRADEEP CHANDRA REDDY,KOJI ZETTSU,MASASHI TOYODA,MASARU KITSUREGAWA,Krishna Reddy Polepalli
IEEE Access, ACCESS, 2020
@inproceedings{bib_Effi_2020, AUTHOR = {R. UDAY KIRAN, PALLIKILA PRADEEP CHANDRA REDDY, KOJI ZETTSU, MASASHI TOYODA, MASARU KITSUREGAWA, Krishna Reddy Polepalli}, TITLE = {Efficient Discovery of Weighted Frequent Neighborhood Itemsets in Very Large Spatiotemporal Databases}, BOOKTITLE = {IEEE Access}. YEAR = {2020}}
Weighted Frequent Itemset (WFI) mining is an important model in data mining. It aims to discover all itemsets whose weighted sum in a transactional database is no less than the user-specified threshold value. Most previous works focused on finding WFIs in a transactional database and did not recognize the spatiotemporal characteristics of an item within the data. This paper proposes a more flexible model of Weighted Frequent Neighborhood Itemsets (WFNI) that may exist in a spatiotemporal database. The recommended patterns may be found very useful in many real-world applications. For instance, a WFNI generated from an air pollution database indicates a geographical region where people have been exposed to high levels of an air pollutant, say PM2.5. The generated WFNIs do not satisfy the anti-monotonic property. Two new measures have been presented to effectively reduce the search space and the computational cost of finding the desired patterns. A pattern-growth algorithm, called Spatial Weighted Frequent Pattern-growth, has also been presented to find all WFNIs in a spatiotemporal database. Experimental results demonstrate that the proposed algorithm is efficient. We also describe a case study in which our model has been used to find useful information in an air pollution database.
Discovering Spatial Weighted Frequent Itemsets in Spatiotemporal Databases
PALLIKILA PRADEEP CHANDRA REDDY,R. Uday Kiran,Koji Zettsu,Masashi Toyoda,Masaru Kitsuregawa,Krishna Reddy Polepalli
International Conference on Data Mining Workshops, ICDM-W, 2019
@inproceedings{bib_Disc_2019, AUTHOR = {PALLIKILA PRADEEP CHANDRA REDDY, R. Uday Kiran, Koji Zettsu, Masashi Toyoda, Masaru Kitsuregawa, Krishna Reddy Polepalli}, TITLE = {Discovering Spatial Weighted Frequent Itemsets in Spatiotemporal Databases}, BOOKTITLE = {International Conference on Data Mining Workshops}. YEAR = {2019}}
Weighted Frequent Itemset (WFI) mining is an important model in data mining. It aims to discover all itemsets whose weighted sum in a transactional database is no less than the user-specified threshold value. Most previous works focused on finding WFIs in a transactional database and did not recognize the spatiotemporal characteristics of an item within the data. This paper proposes a more flexible model of Spatial Weighted Frequent Itemset (SWFI) that may exist in a spatiotemporal database. The recommended patterns may be found very useful in many real-world applications. For instance, an SWFI generated from an air pollution database indicates a geographical region where people have been exposed to high levels of an air pollutant, say PM2.5. The generated SWFIs do not satisfy the anti-monotonic property. Two new measures have been presented to effectively reduce the search space and the computational cost of finding the desired patterns. A pattern-growth algorithm, called Spatial Weighted Frequent Pattern-growth, has also been presented to find all SWFIs in a spatiotemporal database. Experimental results demonstrate that the proposed algorithm is efficient. We also describe a case study in which our model has been used to find useful information in an air pollution database.
Regional scale spatiotemporal trends of precipitation and temperatures over Afghanistan
Rehana Shaik,Krishna Reddy Polepalli,Sai Bhaskar Reddy N,Abdul Raheem Daud,Shoaib Saboory,Shoaib Khaksari,Tomer SK
Climatic Change, CCh, 2019
@inproceedings{bib_Regi_2019, AUTHOR = {Rehana Shaik, Krishna Reddy Polepalli, Sai Bhaskar Reddy N, Abdul Raheem Daud, Shoaib Saboory, Shoaib Khaksari, Tomer SK}, TITLE = {Regional scale spatiotemporal trends of precipitation and temperatures over Afghanistan}, BOOKTITLE = {Climatic Change}. YEAR = {2019}}
Afghanistan is highly vulnerable to hazards related to climate extremes, including droughts and floods, which have had a huge impact on the socio-economic development of the country. The present study analysed the observed precipitation and temperature trends for seven agro-climatic zones of Afghanistan over the period 1951 to 2006 with Asian Precipitation-Highly-Resolved Observational Data Integration towards Evaluation of Water Resources (APHRODITE). The trend analysis was performed on daily data to test for increasing or decreasing rainfall and temperature trends using the Mann-Kendall trend test for each agro-climatic zone of Afghanistan. The annual total precipitation has shown an increasing trend for the South, South-West, East and Central zones, whereas a decreasing trend has been observed for the North, North-East and West zones of Afghanistan. The trend analysis of the precipitation with gridded data sets reveals that rainfall has been decreasing over most parts of Afghanistan, whereas an increasing trend of temperatures was observed for all seven agro-climatic zones.
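The Mann-Kendall trend test used above reduces, at its core, to the S statistic: the sum of the signs of all pairwise differences in the time series. A minimal sketch (significance testing via the variance of S, and handling of ties, are omitted):

```python
def mann_kendall_s(series):
    """Mann-Kendall S statistic: sum of sign(x_j - x_i) over all pairs
    i < j. S > 0 suggests an increasing trend, S < 0 a decreasing one."""
    s = 0
    n = len(series)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)   # sign of the difference
    return s
```

In practice the statistic is normalized by its variance to obtain a Z-score and a p-value per zone; the sign of S alone already distinguishes the increasing and decreasing zones reported in the abstract.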
Finding periodic-frequent patterns in temporal databases using periodic summaries
Rage Uday Kiran ,ALAMPALLY ANIRUDH,Saideep Chennupati,Masashi Toyoda ,Krishna Reddy Polepalli,Masaru Kitsuregawa
Data Science and Pattern Recognition, DSPR, 2019
@inproceedings{bib_Find_2019, AUTHOR = {Rage Uday Kiran , ALAMPALLY ANIRUDH, Saideep Chennupati, Masashi Toyoda , Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Finding periodic-frequent patterns in temporal databases using periodic summaries}, BOOKTITLE = {Data Science and Pattern Recognition}. YEAR = {2019}}
Periodic-frequent pattern mining is an important model in data mining. The popular adoption and successful industrial application of this model has been hindered by the following two limitations: (i) The periodic-frequent pattern model implicitly assumes that all transactions within the data occur at a uniform time interval. This assumption limits the model’s applicability as the transactions in many real-world databases occur at irregular time intervals. (ii) Finding periodic-frequent patterns in very large databases is a memory intensive process because its mining algorithm has to maintain a list structure to record all timestamps at which an itemset has appeared in the whole data. This paper makes an effort to address these two limitations. A flexible model of periodic-frequent pattern in temporal databases has been described to address the former issue. In order to address the latter issue, a novel concept known as period summary has been introduced to effectively capture the temporal occurrence information of an itemset in a database. A new tree structure, called Periodic Summary-tree (PS-tree), has been introduced to record the temporal occurrence information of an itemset in a temporal database. A pattern-growth algorithm has also been described to find all periodic-frequent patterns from the PS-tree. Experimental results demonstrate that the proposed algorithm is efficient.
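The condition being tested for each itemset can be sketched directly from its timestamp list. The paper's contribution is to replace this full list with compact period summaries; this sketch keeps the plain list for clarity, and the function name is illustrative:

```python
def is_periodic_frequent(timestamps, t_first, t_last, max_per, min_sup):
    """Classic periodic-frequent check for one itemset:
    support >= min_sup, and the maximum gap between consecutive
    occurrences (including the database boundaries t_first/t_last)
    must not exceed max_per."""
    if len(timestamps) < min_sup:
        return False
    # Pad with the database boundaries so leading/trailing gaps count.
    points = [t_first] + sorted(timestamps) + [t_last]
    max_gap = max(b - a for a, b in zip(points, points[1:]))
    return max_gap <= max_per
```

Storing every timestamp is exactly the memory cost limitation (ii) above; a period summary retains just enough per-tree-node information to compute `max_gap` without the full list.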
Evaluation of insecticides against white stem borer, Xylotrechus quadripes (Cerambycidae: Coleoptera) infesting coffee
Krishna Reddy Polepalli,A Roobak Kumar,MS Uma,GV Manjunatha Reddy,HG Seetharama,M Dhanam
Journal of Entomology and Zoology Studies, JEZS, 2019
@inproceedings{bib_Eval_2019, AUTHOR = {Krishna Reddy Polepalli, A Roobak Kumar, MS Uma, GV Manjunatha Reddy, HG Seetharama, M Dhanam}, TITLE = {Evaluation of insecticides against white stem borer, Xylotrechus quadripes (Cerambycidae: Coleoptera) infesting coffee}, BOOKTITLE = {Journal of Entomology and Zoology Studies}. YEAR = {2019}}
The coffee white stem borer (CWSB), Xylotrechus quadripes (Coleoptera: Cerambycidae), is a major pest of arabica coffee that causes considerable losses to growers, and its control has been an issue of significance in pest management. Among the integrated management strategies, spraying Chlorpyrifos 20EC at the appropriate time prevents the development of the pest. This study aimed to find an alternative chemical to Chlorpyrifos 20EC, which has been in use for more than a decade. The laboratory experiments indicated that insecticides such as Phenthoate 50EC, Fipronil 5SC, Thiamethoxam 12.6 + Lambda-Cyhalothrin 9.5 ZC, Chlorpyrifos 50EC + Cypermethrin 5EC and Chlorpyrifos 20EC caused 100% mortality of eggs, statistically on par with Ethiprole + Imidacloprid 80WG (98%). In the case of neonate larvae, 100 percent mortality was observed in the treatments Phenthoate 50EC, Imidacloprid 17.8SL, Fipronil 5SC, Thiamethoxam 12.6 + Lambda-Cyhalothrin 9.5% ZC, Ethiprole + Imidacloprid 80WG, Chlorpyrifos 50EC + Cypermethrin 5EC and Chlorpyrifos 20EC. The least mortality among the insecticides tested was observed for Indoxacarb 14.5 SC. The field experiment data revealed that the maximum ovicidal and larvicidal action was observed for Chlorpyrifos 50EC + Cypermethrin 5EC and Chlorpyrifos 20EC, followed by Phenthoate 50EC and Fipronil 5SC. The least mortality of eggs and neonate larvae was recorded for Novaluron 10 EC and Indoxacarb 14.5 SC. Therefore, Chlorpyrifos 50EC + Cypermethrin 5EC can be utilized as a valuable alternative to Chlorpyrifos 20EC in the integrated management of the coffee white stem borer.
Efficiently finding high utility-frequent itemsets using cutoff and suffix utility
R. Uday Kiran,TATIKONDA YASHWANTH REDDY,Philippe Fournier-Viger,Masashi Toyoda,Krishna Reddy Polepalli,Masaru Kitsuregawa
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2019
@inproceedings{bib_Effi_2019, AUTHOR = {R. Uday Kiran, TATIKONDA YASHWANTH REDDY, Philippe Fournier-Viger, Masashi Toyoda, Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Efficiently finding high utility-frequent itemsets using cutoff and suffix utility}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}. YEAR = {2019}}
High utility itemset mining is an important model with many real-world applications. However, the popular adoption and successful industrial application of this model have been hindered by the following two limitations: (i) the computational expensiveness of the model and (ii) infrequent itemsets may be output as high utility itemsets. This paper makes an effort to address these two limitations. A generic high utility-frequent itemset model is introduced to find all itemsets in the data that satisfy user-specified minimum support and minimum utility constraints. Two new pruning measures, named cutoff utility and suffix utility, are introduced to reduce the computational cost of finding the desired itemsets. A single phase fast algorithm, called High Utility Frequent Itemset Miner (HU-FIMi), is introduced to discover the itemsets efficiently. Experimental results demonstrate that the proposed algorithm is efficient.
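The high utility-frequent itemset model itself (limitation (ii): require both thresholds) can be stated as a brute-force reference implementation. HU-FIMi avoids this exhaustive enumeration via the cutoff and suffix utility pruning measures; the sketch below checks every candidate for clarity:

```python
from itertools import combinations

def hufi_mine(db, min_sup, min_util):
    """Reference (exhaustive) miner for itemsets satisfying BOTH a
    minimum support and a minimum utility constraint.
    db: list of transactions, each a dict mapping item -> utility.
    Returns {itemset: (support, total_utility)}."""
    items = sorted({item for t in db for item in t})
    result = {}
    for r in range(1, len(items) + 1):
        for cand in combinations(items, r):
            sup, util = 0, 0
            for t in db:
                if all(item in t for item in cand):
                    sup += 1
                    util += sum(t[item] for item in cand)
            if sup >= min_sup and util >= min_util:
                result[cand] = (sup, util)
    return result
```

Note how the double threshold filters out the pathological cases: an itemset bought once at enormous utility fails `min_sup`, and a frequent but low-value itemset fails `min_util`.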
Discovering spatial high utility itemsets in spatiotemporal databases
R. Uday Kiran,Koji Zettsu,Masashi Toyoda,Philippe Fournier-Viger,Krishna Reddy Polepalli,Masaru Kitsuregawa
International Conference on Scientific and Statistical Database Management, SSDBM, 2019
@inproceedings{bib_Disc_2019, AUTHOR = {R. Uday Kiran, Koji Zettsu, Masashi Toyoda, Philippe Fournier-Viger, Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Discovering spatial high utility itemsets in spatiotemporal databases}, BOOKTITLE = {International Conference on Scientific and Statistical Database Management}. YEAR = {2019}}
In real-world databases, high utility itemset (HUI) is an important class of regularities. Most previous studies have focused on mining HUIs in transactional databases and did not consider the spatiotemporal characteristics of items. In this study, a more flexible model of spatial HUIs (SHUIs) that exist in spatiotemporal databases is proposed. In a spatiotemporal database (STD), an itemset is said to be an SHUI if its utility is not less than a user-specified minimum utility and the distance between any two of its items is not more than a user-specified maximum distance. Identifying SHUIs is very challenging because the generated itemsets do not satisfy the anti-monotonic property. In this study, we present two novel pruning techniques for reducing computational costs. Moreover, a fast single scan algorithm is presented for effectively evaluating all SHUIs in an STD. Furthermore, two case studies are presented, in which the proposed model is used to identify useful information in traffic congestion data and air pollution data.
An Efficient Premiumness and Utility-Based Itemset Placement Scheme for Retail Stores
Parul Chaudhary, Anirban Mondal,Krishna Reddy Polepalli
International Conference on Database and Expert Systems Applications, DEXA, 2019
@inproceedings{bib_An_E_2019, AUTHOR = {Parul Chaudhary, Anirban Mondal, Krishna Reddy Polepalli}, TITLE = {An Efficient Premiumness and Utility-Based Itemset Placement Scheme for Retail Stores}, BOOKTITLE = {International Conference on Database and Expert Systems Applications}. YEAR = {2019}}
In retail stores, the placement of items on the shelf space significantly impacts the sales of items. In particular, the probability of sale of a given item is typically considerably higher when it is placed in a premium (i.e., highly visible/easily accessible) slot as opposed to a non-premium slot. In this paper, we address the problem of maximizing the revenue for the retailer by determining the placement of itemsets in different types of slots with varied premiumness such that each item is placed at least once in any of the slots. We first propose the notion of premiumness of slots in a given retail store. Then we discuss a framework for efficiently identifying itemsets from a transactional database and placing these itemsets by mapping itemsets with different revenue to slots with varied premiumness for maximizing retailer revenue. Our performance evaluation on both synthetic and real datasets demonstrates that the proposed scheme indeed improves the retailer revenue by up to 45% w.r.t. a recent existing scheme.
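The core mapping idea (higher-revenue itemsets to more premium slots) can be sketched as a simple greedy pairing. This is a simplification, not the paper's scheme, and the itemsets, revenues, and premiumness values are invented:

```python
def place_itemsets(itemsets, slots):
    """Greedy sketch of the premiumness idea: pair higher-revenue
    itemsets with more premium slots (illustration only)."""
    by_revenue = sorted(itemsets, key=lambda x: x[1], reverse=True)
    by_premium = sorted(slots, key=lambda s: s[1], reverse=True)
    return [(its[0], slot[0]) for its, slot in zip(by_revenue, by_premium)]

# (itemset, expected revenue) and (slot, premiumness in [0, 1]).
itemsets = [('bread+jam', 40), ('tv+soundbar', 900), ('soap', 15)]
slots = [('aisle-end', 0.9), ('eye-level', 0.7), ('bottom-shelf', 0.2)]
print(place_itemsets(itemsets, slots))
```

A real placement scheme must additionally guarantee that every item appears at least once and handle itemsets spanning multiple slots, which is where the paper's framework goes beyond this greedy sketch.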
Discovering Diverse Popular Paths Using Transactional Modeling and Pattern Mining
PARVATHANENI REVANTH RATHAN,Krishna Reddy Polepalli,Anirban Mondal
International Conference on Database and Expert Systems Applications, DEXA, 2019
@inproceedings{bib_Disc_2019, AUTHOR = {PARVATHANENI REVANTH RATHAN, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {Discovering Diverse Popular Paths Using Transactional Modeling and Pattern Mining}, BOOKTITLE = {International Conference on Database and Expert Systems Applications}. YEAR = {2019}}
While the problems of finding the shortest path and k-shortest paths have been extensively researched, the research community has been shifting its focus towards discovering and identifying paths based on user preferences. Since users naturally follow some of the paths more than other paths, the popularity of a given path often reflects such user preferences. Moreover, users typically prefer diverse paths over similar paths for gaining flexibility in path selection. Given a set of user traversals in a road network and a set of paths between a given source and destination pair, we propose a scheme based on transactional modeling and pattern mining for performing top-k ranking of these paths based on both path popularity and path diversity. Our performance evaluation with a real dataset demonstrates the effectiveness of the proposed scheme.
Discovering Partial Periodic High Utility Itemsets in Temporal Databases
TATIKONDA YASHWANTH REDDY,R. Uday Kiran,Masashi Toyoda,Krishna Reddy Polepalli, Masaru Kitsuregawa
International Conference on Database and Expert Systems Applications, DEXA, 2019
@inproceedings{bib_Disc_2019, AUTHOR = {TATIKONDA YASHWANTH REDDY, R. Uday Kiran, Masashi Toyoda, Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Discovering Partial Periodic High Utility Itemsets in Temporal Databases}, BOOKTITLE = {International Conference on Database and Expert Systems Applications}. YEAR = {2019}}
High Utility Itemset Mining (HUIM) is an important model with many real-world applications. Given a (non-binary) transactional database and an external utility database, the aim of HUIM is to discover all itemsets within the data that satisfy the user-specified minimum utility (minUtil) constraint. The popular adoption and successful industrial application of HUIM has been hindered by the following two limitations: (i) HUIM does not allow external utilities of items to vary over time and (ii) HUIM algorithms are inadequate to find recurring customer purchase behavior. This paper introduces a flexible model of Partial Periodic High Utility Itemset Mining (PPHUIM) to address these two problems. The goal of PPHUIM is to discover only those interesting high utility itemsets that are occurring at regular intervals in a given temporal database. An efficient depth-first search algorithm, called PPHUI-Miner (Partial Periodic High Utility Itemset-Miner), has been proposed to enumerate all partial periodic high-utility itemsets in temporal databases. Experimental results show that the proposed algorithm is efficient.
Coverage pattern based framework to improve search engine advertising
AMAR BUDHIRAJA,Ralla Akhil,Krishna Reddy Polepalli
International Journal of Data Science and Analytics, IJDSA, 2019
@inproceedings{bib_Cove_2019, AUTHOR = {AMAR BUDHIRAJA, Ralla Akhil, Krishna Reddy Polepalli}, TITLE = {Coverage pattern based framework to improve search engine advertising}, BOOKTITLE = {International Journal of Data Science and Analytics}. YEAR = {2019}}
Sponsored search has emerged as one of the most dominant forms for advertising on the Web. In sponsored search, advertisers create ad campaigns and bid on the keywords of potential search queries related to a given product or service. It has been observed that search queries follow a long-tail distribution of a small yet fat head of frequent queries and a long and thin tail of infrequent queries. Normally, the advertisers tend to bid on frequent keywords related to search queries. As a result, the ad space of the tail portion of search queries is harder to exploit. In this paper, we have proposed an improved allocation approach to utilize the ad space of the tail keywords related to search queries based on the knowledge of coverage patterns extracted from the transactions formed from search query logs. The advertisers bid on potential concepts represented by coverage patterns which consist of a combination of head and tail keywords. By facilitating the advertisers to bid on the concepts, the proposed approach improves the ad space utilization of tail queries. Experiments on the real-world dataset of search query logs demonstrate that the proposed approach indeed improves the performance of search engine advertising by improving ad space utilization of tail queries.
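The key quantity behind coverage patterns is how much of the query log a keyword set jointly reaches. A minimal sketch of coverage support is below; the keywords and queries are invented, and the full coverage pattern model also constrains the overlap among the keywords' transaction sets, which this sketch omits:

```python
def coverage_support(pattern, db):
    """Fraction of transactions covered by at least one keyword of the
    pattern (transactions here are sets of query keywords)."""
    covered = sum(1 for txn in db if pattern & txn)
    return covered / len(db)

# Toy query log: one head keyword ('shoes') and several tail keywords.
queries = [{'shoes'}, {'shoes', 'running'}, {'trail', 'running'},
           {'sneakers'}, {'boots'}]
# A mix of a head keyword and tail keywords covers most queries.
print(coverage_support({'shoes', 'trail', 'sneakers'}, queries))
```

This is why bidding on a concept (a coverage pattern mixing head and tail keywords) can reach far more queries than bidding on the head keyword alone, which is the allocation idea the paper builds on.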
Multi-location visibility query processing using portion-based transactional modeling and pattern mining
Lakshmi Gangumalla,Krishna Reddy Polepalli,Anirban Mondal
Data Mining and Knowledge Discovery, DMKD, 2019
@inproceedings{bib_Mult_2019, AUTHOR = {Lakshmi Gangumalla, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {Multi-location visibility query processing using portion-based transactional modeling and pattern mining}, BOOKTITLE = {Data Mining and Knowledge Discovery}. YEAR = {2019}}
Visibility computation is critical in spatial databases for realizing various interesting and diverse applications such as defence-related surveillance, identifying interesting spots in tourist places and online warcraft games. Existing works address the problem of identifying individual locations for maximizing the visibility of a given target object. However, in the case of many applications, a set of locations may be more effective than individual locations in maximizing the visibility of the given target object. In this paper, we introduce the Multi-Location Visibility (MLV) query. An MLV query determines the top-k query locations from which the visibility of a given target object can be maximized. We propose a portion-based transactional framework and a coverage pattern mining based algorithm to process MLV queries. Our performance evaluation with real datasets demonstrates the effectiveness of the proposed scheme in terms of query processing time, pruning efficiency and target object visibility w.r.t. a recent existing scheme.
An Incremental Technique for Mining Coverage Patterns in Large Databases
Ralla Akhil,Krishna Reddy Polepalli,Anirban Mondal
International Conference on Data Science and Advanced Analytics, DSAA, 2019
@inproceedings{bib_An_I_2019, AUTHOR = {Ralla Akhil, Krishna Reddy Polepalli, Anirban Mondal}, TITLE = {An Incremental Technique for Mining Coverage Patterns in Large Databases}, BOOKTITLE = {International Conference on Data Science and Advanced Analytics}. YEAR = {2019}}
Pattern mining is an important task of data mining and involves the extraction of interesting associations from large databases. Typically, pattern mining is carried out from huge databases, which tend to get updated several times. Consequently, as a given database is updated, some of the patterns discovered may become invalid, while some new patterns may emerge. This has motivated significant research efforts in the area of Incremental Mining. The goal of incremental mining is to efficiently and incrementally mine patterns when a database is updated, as opposed to mining all of the patterns from scratch from the complete database. Incidentally, research efforts are being made to develop incremental pattern mining algorithms for extracting different kinds of patterns such as frequent patterns, sequential patterns and utility patterns. However, none of the existing works addresses incremental mining in the context of coverage patterns, which has important applications in areas such as banner advertising, search engine advertising and graph mining. In this regard, the main contributions of this work are three-fold. First, we introduce the problem of incremental mining in the context of coverage patterns. Second, we propose the IncCMine algorithm for efficiently extracting the knowledge of coverage patterns when an incremental database is added to the existing database. Third, we performed extensive experiments using two real-world click stream datasets and one synthetic dataset. The results of our performance evaluation demonstrate that the proposed IncCMine algorithm indeed improves the performance significantly w.r.t. the existing CMine algorithm.
Discovering Periodic Patterns in Irregular Time Series
Saideep Chennupati,R. Uday Kiran,Koji Zettsu,Philippe Fournier-Viger,Masaru Kitsuregawa,Krishna Reddy Polepalli
International Conference on Data Mining Workshops, ICDM-W, 2019
@inproceedings{bib_Disc_2019, AUTHOR = {Saideep Chennupati, R. Uday Kiran, Koji Zettsu, Philippe Fournier-Viger, Masaru Kitsuregawa, Krishna Reddy Polepalli}, TITLE = {Discovering Periodic Patterns in Irregular Time Series}, BOOKTITLE = {International Conference on Data Mining Workshops}. YEAR = {2019}}
Finding (partial) periodic patterns in time series data is a challenging problem of great importance in many applications. Due to computational reasons, most previous studies in this area have focused on the efficient discovery of periodic patterns in regular time series data. Unfortunately, these studies have limited applicability because real-world data naturally exists as an irregular time series. This paper proposes a more flexible model of periodic pattern that may be present in irregular time series. Two measures, period and period-support, were employed to determine the interestingness of a pattern in a series. The former measure captures the inter-arrival times of a pattern in a series, while the latter captures the number of periodic occurrences of a pattern in a series. A novel tree structure, called Periodic Pattern tree (PP-tree), has been introduced to record the irregular occurrences of items within the series. A pattern-growth algorithm has also been presented to find all periodic patterns from PP-tree. Experimental results demonstrate that the proposed model can find useful information, and the algorithm is efficient.
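The two measures above can be sketched for an irregular series of occurrence timestamps. This is a simplified reading of the measures, not the PP-tree algorithm; the timestamps and threshold are invented:

```python
def periods(occurrence_times):
    """Inter-arrival times of a pattern in an irregular time series."""
    ts = sorted(occurrence_times)
    return [b - a for a, b in zip(ts, ts[1:])]

def period_support(occurrence_times, max_period):
    """Number of periodic occurrences: consecutive occurrences whose
    inter-arrival time does not exceed max_period."""
    return sum(1 for g in periods(occurrence_times) if g <= max_period)

# Pattern observed at irregular timestamps (e.g., seconds).
times = [1, 3, 5, 20, 22, 24]
print(periods(times))                      # inter-arrival times
print(period_support(times, max_period=3))
```

The pattern here is periodic in two bursts separated by a long gap; a full-periodicity model would reject it, while the period-support view still credits the four short inter-arrival times.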
Discovering Spatial High Utility Frequent Itemsets in Spatiotemporal Databases
Pradeep,R. Uday Kiran,Koji Zettsu,Masashi Toyoda,Krishna Reddy Polepalli,Masaru Kitsuregawa
International Conference on Big Data Analytics, BDA, 2019
@inproceedings{bib_Disc_2019, AUTHOR = {Pradeep, R. Uday Kiran, Koji Zettsu, Masashi Toyoda, Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Discovering Spatial High Utility Frequent Itemsets in Spatiotemporal Databases}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2019}}
Spatial High Utility Itemset Mining (SHUIM) aims to discover all itemsets in a spatiotemporal database that satisfy the user-specified minimum utility (minUtil) and maximum distance (maxDist) constraints. The popular adoption and successful industrial application of SHUIM suffers from the following two limitations: (i) Since SHUIM determines the interestingness of an itemset without taking into account its support within the data, it is sensitive to random noise and the length of the transactions within the data. Consequently, SHUIM allows sporadic itemsets with high utility to be generated as SHUIs. For instance, items in long transactions can combine with each other and be generated as SHUIs. (ii) SHUIM is a computationally expensive process because the generated itemsets do not satisfy the downward closure property. This paper introduces Spatial High Utility Frequent Itemset Mining (SHUFIM) to address these two issues. An SHUI in a spatiotemporal database is said to be an SHUFI if and only if its support is no less than the user-specified minimum support (minSup) constraint. The usage of minSup not only makes the proposed model tolerant to random noise within the data, but also facilitates additional pruning techniques that reduce the computational cost. A single-scan fast algorithm has also been proposed to discover all SHUFIs in a spatiotemporal database. Experimental results demonstrate that the proposed algorithm is efficient. We also demonstrate the usefulness of the proposed model with two real-world applications.
A Diversification-Aware Itemset Placement Framework for Long-term Sustainability of Retail Businesses
Parul Chaudhary,Anirban Mondal,Krishna Reddy Polepalli
International Conference on Database and Expert Systems Applications, DEXA, 2018
@inproceedings{bib_A_Di_2018, AUTHOR = {Parul Chaudhary, Anirban Mondal, Krishna Reddy Polepalli}, TITLE = {A Diversification-Aware Itemset Placement Framework for Long-term Sustainability of Retail Businesses}, BOOKTITLE = {International Conference on Database and Expert Systems Applications}. YEAR = {2018}}
In addition to maximizing revenue, retailers also aim at diversifying product offerings to facilitate sustainable revenue generation in the long run. Thus, it becomes a necessity for retailers to place appropriate itemsets in a limited number k of premium slots in retail stores for achieving the goals of revenue maximization and itemset diversification. In this regard, research efforts are being made to extract itemsets with high utility for maximizing revenue, but they do not consider itemset diversification, i.e., there could be duplicate (repetitive) items in the selected top-utility itemsets. Furthermore, given utility and support thresholds, the number of candidate itemsets of all sizes generated by existing utility mining approaches typically explodes. This leads to issues of memory and itemset retrieval times. In this paper, we present a framework and schemes for efficiently retrieving the top-utility itemsets of any given itemset size based on both revenue and the degree of diversification. Here, a higher degree of diversification implies fewer duplicate items in the selected top-utility itemsets. The proposed schemes are based on efficiently determining and indexing the top-λ high-utility and diversified itemsets. Experiments with a real dataset show the overall effectiveness and scalability of the proposed schemes in terms of execution time, revenue and degree of diversification w.r.t. a recent existing scheme.
IT-based Framework for Block level Agro-meteorological Advisory System
A MAMATHA,Sreenivas Gade,Krishna Reddy Polepalli,Balaji Naik Banoth
International Conference on. Computing For Sustainable Global Development, INDIACom, 2018
@inproceedings{bib_IT-b_2018, AUTHOR = {A MAMATHA, Sreenivas Gade, Krishna Reddy Polepalli, Balaji Naik Banoth}, TITLE = {IT-based Framework for Block level Agro-meteorological Advisory System}, BOOKTITLE = {International Conference on. Computing For Sustainable Global Development}. YEAR = {2018}}
Many domain-specific decisions of our day-to-day lives depend on weather. Weather and its variables have a significant impact on crop growth stages and crop productivity. The India Meteorological Department (IMD) has been providing a weather forecast service since 1945. To help the farming community, IMD has been providing district-level agro-meteorological advisories based on the district-level Medium Range Forecast (MRF) since 2008 for all districts through 130 agro-meteorological field units (AMFUs). Based on the five-day MRF, each AMFU provides crop-specific agromet advice for the districts it covers. To improve effectiveness, IMD started block-level MRF on an experimental basis in 2015 and is planning to provide block-level agromet advisories based on it. In such a situation, agromet scientists have to prepare agromet advisories for several blocks and crops soon after receiving the block-level MRF, which will then be disseminated to farmers and stakeholders. In this context, research efforts have to be made to develop an efficient system that enables agromet scientists to prepare block-level agromet advice based on the block-level MRF. Normally, the weather and crops of nearby blocks may not vary significantly, so there is an opportunity to reuse the agromet advice prepared for one block for nearby blocks. In this paper, we propose an IT-based framework that enables agromet scientists to prepare block-level agromet advisories by exploiting the fact that the advice prepared for one block can be reused for another block if the weather condition and the crop are the same. The proposed framework provides the scope to exploit such reuse and could improve the efficiency of agromet scientists in preparing block-level agromet advisories based on the MRF.
Analysis of similar weather conditions to improve reuse in weather-based decision support systems
A MAMATHA,Krishna Reddy Polepalli,Sreenivas Gade,Seishi Ninomiya
Computers and Electronics in Agriculture, CEAG, 2018
@inproceedings{bib_Anal_2018, AUTHOR = {A MAMATHA, Krishna Reddy Polepalli, Sreenivas Gade, Seishi Ninomiya}, TITLE = {Analysis of similar weather conditions to improve reuse in weather-based decision support systems}, BOOKTITLE = {Computers and Electronics in Agriculture}. YEAR = {2018}}
Weather-based decision support systems (DSSs) are being built to improve the efficiency of production systems in the domains of healthcare, agriculture, transport, governance and so on. Normally, a weather condition (WC) is represented by the statistical values of weather variables for a given duration (e.g. a day or a week). In a weather-based DSS, given a WC, the domain experts prepare appropriate suggestions to improve the efficiency of the stakeholders. Normally, once the domain experts prepare a suggestion for a given WC that belongs to a certain period (e.g. year or season), there is scope to reuse the same suggestion for similar WCs of other period(s). As a result, the performance of the DSS could be improved. In this paper, to improve reuse, we propose the notion of a category-based WC (CWC), which is formed by using the categories of weather variables in the respective domain. By considering the context of the agromet advisory service operated by the India Meteorological Department (IMD) and the corresponding weather categories provided by IMD, we have analyzed the extent of reuse among CWCs by conducting experiments on 30 years of weather data collected at Rajendranagar, Hyderabad, Telangana state. The experiments are conducted by considering two types of CWCs with durations of one day and five days. By varying the number of weather variables in a CWC from one to five, we have computed the extent of reuse among CWCs of different periods of the following period types: year, season, and phenophases (i.e., growth stages) of the Rice crop. The results show that there is a significant similarity among the CWCs of a given period and the CWCs of the preceding periods of each period type. For any domain including agriculture, the results provide an opportunity to improve the efficiency of weather-based agricultural DSSs by improving the reuse of weather-based suggestions.
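The CWC idea (two periods with the same category tuple can reuse the same advisory) can be sketched as follows. The category breakpoints below are illustrative placeholders, not IMD's actual categories:

```python
def categorize(value, breakpoints):
    """Map a raw weather value to a category index using ordered
    upper-bound breakpoints (illustrative, not IMD's categories)."""
    for i, upper in enumerate(breakpoints):
        if value <= upper:
            return i
    return len(breakpoints)

def cwc(day, breakpoints_by_var):
    """Category-based weather condition: the tuple of categories of all
    weather variables for one duration. Equal tuples => advisory reuse."""
    return tuple(categorize(day[v], breakpoints_by_var[v])
                 for v in sorted(breakpoints_by_var))

# Hypothetical breakpoints for two variables (rainfall in mm, max temp in C).
bins = {'rain_mm': [0, 10, 35], 'tmax_c': [30, 35, 40]}
day1 = {'rain_mm': 5, 'tmax_c': 33}
day2 = {'rain_mm': 8, 'tmax_c': 34}
print(cwc(day1, bins) == cwc(day2, bins))  # same CWC -> reuse the advisory
```

Coarser categories collapse more raw WCs into the same CWC, which is exactly the mechanism the paper quantifies when measuring reuse across years, seasons, and phenophases.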
Discovering periodic-correlated patterns in temporal databases
J N VENKATESH,R. Uday Kiran,Krishna Reddy Polepalli,Masaru Kitsuregawa
Transactions on Large-Scale Data-and Knowledge-Centered Systems, TLDKCS, 2018
@inproceedings{bib_Disc_2018, AUTHOR = {J N VENKATESH, R. Uday Kiran, Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Discovering periodic-correlated patterns in temporal databases}, BOOKTITLE = {Transactions on Large-Scale Data-and Knowledge-Centered Systems}. YEAR = {2018}}
Novel Data Segmentation Techniques for Efficient Discovery of Correlated Patterns Using Parallel Algorithms
KOTNI AMULYA,R. Uday Kiran,Masashi Toyoda,Krishna Reddy Polepalli,Masaru Kitsuregawa
International Conference on Big Data Analysis and Knowledge Discovery, BDAKD, 2018
@inproceedings{bib_Nove_2018, AUTHOR = {KOTNI AMULYA, R. Uday Kiran, Masashi Toyoda, Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Novel Data Segmentation Techniques for Efficient Discovery of Correlated Patterns Using Parallel Algorithms}, BOOKTITLE = {International Conference on Big Data Analysis and Knowledge Discovery}. YEAR = {2018}}
Efficient discovery of interesting patterns using parallel algorithms is an actively studied topic in data mining. A key research issue related to this topic is data segmentation, which influences the overall computational requirements of an algorithm. This paper makes an effort to address this issue in correlated pattern mining. Two novel data segmentation techniques, ‘database segmentation’ and ‘transaction segmentation,’ have been introduced to discover the patterns efficiently. The former technique involves segmenting the database into multiple sub-databases such that each sub-database can be mined independently. The latter technique involves segmenting a transaction into multiple sub-transactions such that each sub-transaction can be processed as an individual transaction. The proposed techniques are algorithm independent, and therefore, can be incorporated into any parallel algorithm to find correlated patterns effectively. In this paper, we introduce map-reduce based pattern-growth algorithm by incorporating the above mentioned techniques. Experimental results demonstrate that the proposed algorithm is memory and runtime efficient and highly scalable as well.
Efficient discovery of weighted frequent itemsets in very large transactional databases: A re-visit
R. Uday Kiran,KOTNI AMULYA,Krishna Reddy Polepalli,Masashi Toyoda,Subhash Bhalla,Masaru Kitsuregawa
International Conference on Big Data, BD, 2018
@inproceedings{bib_Effi_2018, AUTHOR = {R. Uday Kiran, KOTNI AMULYA, Krishna Reddy Polepalli, Masashi Toyoda, Subhash Bhalla, Masaru Kitsuregawa}, TITLE = {Efficient discovery of weighted frequent itemsets in very large transactional databases: A re-visit}, BOOKTITLE = {International Conference on Big Data}. YEAR = {2018}}
Weighted Frequent Itemset (WFI) mining is an important model in data mining. The popular adoption and successful industrial application of this model have been hindered by the following two obstacles: (i) finding WFIs is a computationally expensive process as these itemsets do not satisfy the downward closure property and (ii) the lack of parallel algorithms to find WFIs in very large databases (e.g. astronomical data and Twitter data). This paper makes an effort to address these two obstacles. Two pattern-growth algorithms, Sequential Weighted Frequent Pattern-growth and Parallel Weighted Frequent Pattern-growth, have been introduced to discover WFIs efficiently. Both algorithms employ three novel pruning techniques to reduce the computational cost effectively. The first pruning technique prunes some of the uninteresting items by employing a criterion known as cutoff weight. The second pruning technique, called conditional pattern base elimination, eliminates the construction of conditional pattern bases if a suffix item is an uninteresting item. The third pruning technique, called pattern-growth termination, defines a new terminating condition for the pattern-growth technique. Experimental results demonstrate that the proposed algorithms are memory and runtime efficient, and highly scalable as well.
An Improved Approach for Long Tail Advertising in Sponsored Search
AMAR BUDHIRAJA,Krishna Reddy Polepalli
International Conference on Database Systems for Advanced Applications, DASFAA, 2017
@inproceedings{bib_An_I_2017, AUTHOR = {AMAR BUDHIRAJA, Krishna Reddy Polepalli}, TITLE = {An Improved Approach for Long Tail Advertising in Sponsored Search}, BOOKTITLE = {International Conference on Database Systems for Advanced Applications}. YEAR = {2017}}
Search queries follow a long tail distribution which results in harder management of ad space for sponsored search. During keyword auctions, advertisers also tend to target head query keywords, thereby creating an imbalance in demand for head and tail keywords. This leads to under-utilization of ad space of tail query keywords. In this paper, we have explored a mechanism that allows the advertisers to bid on concepts rather than keywords. The tail query keywords are utilized by allocating a mix of head and tail keywords related to the concept. In the literature, an effort has been made to improve sponsored search by extracting the knowledge of coverage patterns among the keywords of transactional query logs. In this paper, we propose an improved approach to allow advertisers to bid on high level concepts instead of keywords in sponsored search. The proposed approach utilizes the knowledge of levelwise coverage patterns to allocate incoming search queries to advertisers in an efficient manner by utilizing the long tail. Experimental results on AOL search query data set show improvement in ad space utilization and reach of advertisers.
Association Rule Based Approach to Improve Diversity of Query Recommendations
M. KUMARA SWAMY,Krishna Reddy Polepalli,Subhash Bhalla
International Conference on Database and Expert Systems Applications, DEXA, 2017
@inproceedings{bib_Asso_2017, AUTHOR = {M. KUMARA SWAMY, Krishna Reddy Polepalli, Subhash Bhalla}, TITLE = {Association Rule Based Approach to Improve Diversity of Query Recommendations}, BOOKTITLE = {International Conference on Database and Expert Systems Applications}. YEAR = {2017}}
Query recommendation (QR) supports search engines by providing alternative queries as recommendations using similarity-based approaches. In the literature, orthogonal query recommendation (OQR) has been proposed to improve the diversity of QR when the user does not formulate proper queries. OQR uses a dissimilarity measure to recommend completely different queries. In this paper, we propose a QR approach that extends association rules, diverse patterns, and an unbalanced concept hierarchy of search terms. We conceptualize association-rule-based QR and order the rules based on confidence and diversity. Subsequently, the highest-ranked rules based on confidence and diversity are provided as QRs. The experimental results on the real-world AOL click-through dataset show that the diverse QRs improve the performance significantly.
Discovering partial periodic-frequent patterns in a transactional database
R.UdayKiran,J N VENKATESH,Masashi Toyoda,Masaru Kitsuregawa,Krishna Reddy Polepalli
Journal of Systems and Software, JSS, 2017
@inproceedings{bib_Disc_2017, AUTHOR = {R.UdayKiran, J N VENKATESH, Masashi Toyoda, Masaru Kitsuregawa, Krishna Reddy Polepalli}, TITLE = {Discovering partial periodic-frequent patterns in a transactional database}, BOOKTITLE = {Journal of Systems and Software}. YEAR = {2017}}
Time and frequency are two important dimensions to determine the interestingness of a pattern in a database. Periodic-frequent patterns are an important class of regularities that exist in a database with respect to these two dimensions. Current studies on periodic-frequent pattern mining have focused on discovering full periodic-frequent patterns, i.e., finding all frequent patterns that have exhibited complete cyclic repetitions in a database. However, partial periodic-frequent patterns are more common due to the imperfect nature of real-world. This paper proposes a flexible and generic model to find partial periodic-frequent patterns. A new interesting measure, periodic-ratio, has been introduced to determine the periodic interestingness of a frequent pattern by taking into account its proportion of cyclic repetitions in a database. The proposed patterns do not satisfy the anti-monotonic property. A novel pruning technique has been introduced to reduce the search space effectively. A pattern-growth algorithm to find all partial periodic-frequent patterns has also been presented in this paper. Experimental results demonstrate that the proposed model can discover useful information, and the algorithm is efficient.
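The periodic-ratio idea above (the proportion of cyclic repetitions a frequent pattern exhibits) can be sketched over a pattern's occurrence timestamps. This is a simplified sketch of the measure, not the paper's exact formulation or its pruning technique; the timestamps and thresholds are invented:

```python
def periodic_ratio(occurrence_ts, db_end, max_period, db_start=0):
    """Proportion of a pattern's inter-arrival times that are at most
    max_period, including the gaps to the database boundaries
    (simplified illustration of the periodic-ratio idea)."""
    ts = [db_start] + sorted(occurrence_ts) + [db_end]
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    return sum(1 for g in gaps if g <= max_period) / len(gaps)

# A pattern that repeats regularly early on, then goes quiet.
print(periodic_ratio([2, 4, 6, 15], db_end=16, max_period=3))
```

A full periodic-frequent model would require every gap to be within `max_period`; the ratio instead lets the user accept patterns that are periodic in, say, 80% of their gaps, which matches the "partial" relaxation the paper proposes.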
Discovering Periodic Patterns in Non-Uniform Temporal Databases
R. Uday Kiran,J N VENKATESH,Philippe Fournier-Viger,Masashi Toyoda,Krishna Reddy Polepalli,Masaru Kitsuregawa
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2017
@inproceedings{bib_Disc_2017, AUTHOR = {R. Uday Kiran, J N VENKATESH, Philippe Fournier-Viger, Masashi Toyoda, Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Discovering Periodic Patterns in Non-Uniform Temporal Databases}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}. YEAR = {2017}}
A temporal database is a collection of transactions, ordered by their timestamps. Discovering periodic patterns in temporal databases has numerous applications. However, to the best of our knowledge, no work has considered mining periodic patterns in temporal databases where items have dissimilar support and periodicity, despite the fact that this type of data is very common in real life. Discovering periodic patterns in such non-uniform temporal databases is challenging. It requires defining (i) an appropriate measure to assess the periodic interestingness of patterns, and (ii) a method to efficiently find all periodic patterns. While a pattern-growth approach can be employed for the second sub-task, the first sub-task has, to the best of our knowledge, not been addressed. Moreover, how these two tasks are combined has significant implications. In this paper, we address this challenge. We introduce a model to assess the periodic interestingness of patterns in databases having a non-uniform item distribution, which considers that periodic patterns may have different periods and minimum numbers of cyclic repetitions. Moreover, the paper introduces a pattern-growth algorithm to efficiently discover all periodic patterns. Experimental results demonstrate that the proposed algorithm is efficient and that the proposed model may be utilized to find prior knowledge about event keywords and their associations in Twitter data.
Towards Question Improvement on Knowledge Sharing Platforms: A Stack Overflow Case Study
RISHABH GUPTA,Krishna Reddy Polepalli
International Conference on Big Knowledge, ICBK, 2017
@inproceedings{bib_Towa_2017, AUTHOR = {RISHABH GUPTA, Krishna Reddy Polepalli}, TITLE = {Towards Question Improvement on Knowledge Sharing Platforms: A Stack Overflow Case Study}, BOOKTITLE = {International Conference on Big Knowledge}. YEAR = {2017}}
Community-driven Knowledge Sharing (KS) platforms have gained immense popularity in recent years among Internet users to seek, learn and share information and expertise. These platforms encourage rich content by recognizing users' contributions, measured as reputation on the platform. Thus, peers on the platform dissuade other users from posting low-quality content by closing, disliking or not answering their questions. The aggressive attitude of the community has a negative impact on the users' experience (especially for newbies), and such users tend to lose interest in the platform. Since quality management on KS platforms is a necessity, we aim to emphasize the need for a mechanism to help users improve their questions. To study this, we perform experiments on the Stack Overflow (SO) platform; however, the study can be adapted to other KS platforms. The SO community aggressively marks questions as closed, even those with the slightest deformity. The non-trivial reopening process and the rigorous review mechanism can be handled only by proper editing of the closed questions. Thus, we present a first-of-its-kind study to assist SO platform users in the reopening process of their closed questions. We build predictive models to suggest to users whether the edited version of a closed question will lead to a successful reopening or not. This can assist users at large by deterring them from entering the review process with improper edits. To learn these models effectively, we consider the user categories on the platform based on their reputation: established and non-established. In addition to being the major contributors of closed questions, non-established users have lower odds of getting their closed questions reopened than established users. Thus, we leverage the better editing skills of established users to learn question edit models by employing their reopened closed questions.
We present the results of the predictive models and provide insights on useful edits in the reopening process on the SO platform.
An Efficient Map-Reduce Framework to Mine Periodic Frequent Patterns
ALAMPALLY ANIRUDH,R.Uday Kiran,Krishna Reddy Polepalli,M.Toyoda,Masaru Kitsuregawa
International Conference on Big Data Analysis and Knowledge Discovery, BDAKD, 2017
@inproceedings{bib_An_E_2017, AUTHOR = {ALAMPALLY ANIRUDH, R.Uday Kiran, Krishna Reddy Polepalli, M.Toyoda, Masaru Kitsuregawa}, TITLE = {An Efficient Map-Reduce Framework to Mine Periodic Frequent Patterns}, BOOKTITLE = {International Conference on Big Data Analysis and Knowledge Discovery}. YEAR = {2017}}
Periodic-frequent patterns (PFPs) are an important class of regularities that exist in a transactional database. In the literature, pattern growth-based approaches to mine PFPs have been proposed by considering a single machine. In this paper, we propose a Map-Reduce framework to mine PFPs by considering multiple machines. We have proposed a parallel algorithm that includes a step for distributing transactional identifiers among the machines. Further, the notion of a partition summary has been proposed to reduce the amount of data shuffled among the machines. Experiments on Apache Spark's distributed environment show that the proposed approach speeds up with an increase in the number of machines, and that the notion of partition summary significantly reduces the amount of data shuffled among the machines.
A Flexible and Efficient Indexing Scheme for Placement of Top-Utility Itemsets for Different Slot Sizes
Parul Chaudhary,Anirban Mondal,Krishna Reddy Polepalli
International Conference on Big Data Analytics, BDA, 2017
@inproceedings{bib_A_Fl_2017, AUTHOR = {Parul Chaudhary, Anirban Mondal, Krishna Reddy Polepalli}, TITLE = {A Flexible and Efficient Indexing Scheme for Placement of Top-Utility Itemsets for Different Slot Sizes}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2017}}
Utility mining has been emerging as an important area in data mining. While existing works on utility mining have primarily focused on the problem of finding high-utility itemsets from transactional databases, they implicitly assume that each item occupies only one slot. However, in many real-world scenarios, the number of slots consumed by different items typically varies. Hence, this paper considers that a given item may physically occupy any fixed (integer) number of slots. Thus, we address the problem of efficiently determining the top-utility itemsets when a given number of slots is specified as input. The key contributions of our work are three-fold. First, we present an efficient framework to determine the top-utility itemsets for different user-specified numbers of slots that need to be filled. Second, we propose a novel flexible and efficient index, designated as the STUI index, for facilitating quick retrieval of the top-utility itemsets for a given number of slots. Third, we conducted an extensive performance evaluation using real datasets to demonstrate the overall effectiveness of the proposed indexing scheme in terms of execution time and utility (net revenue) as compared to a recent existing scheme.
Efficient discovery of periodic-frequent patterns in very large databases
R. Uday Kiran,Masaru Kitsuregawa,Krishna Reddy Polepalli
Journal of Systems and Software, JSS, 2016
@inproceedings{bib_Effi_2016, AUTHOR = {R. Uday Kiran, Masaru Kitsuregawa, Krishna Reddy Polepalli}, TITLE = {Efficient discovery of periodic-frequent patterns in very large databases}, BOOKTITLE = {Journal of Systems and Software}. YEAR = {2016}}
Periodic-frequent patterns (or itemsets) are an important class of regularities that exist in a transactional database. Finding these patterns involves discovering all frequent patterns that satisfy the user-specified maximum periodicity constraint. This constraint controls the maximum inter-arrival time of a pattern in a database. The time complexity to measure the periodicity of a pattern is O(n), where n represents the number of timestamps at which the corresponding pattern has appeared in a database. As n usually represents a high value in voluminous databases, determining the periodicity of every candidate pattern in the itemset lattice makes periodic-frequent pattern mining a computationally expensive process. This paper introduces a novel approach to address this problem. Our approach determines the periodic interestingness of a pattern by adopting a greedy search. The basic idea of our approach is to discover all periodic-frequent patterns by eliminating aperiodic patterns based on suboptimal solutions. The best and worst case time complexities of our approach to determine the periodic interestingness of a frequent pattern are O(1) and O(n), respectively. We introduce two pruning techniques and propose a pattern-growth algorithm to find these patterns efficiently. Experimental results show that our algorithm is runtime efficient and highly scalable as well.
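The early-terminating check sketched below conveys why the best case is O(1): scanning the timestamps in order, one gap exceeding the maximum periodicity immediately disqualifies the pattern. This is a sketch of the basic maxPer check under assumed parameter names, not the paper's exact algorithm:

```python
def is_periodic_frequent(timestamps, max_per, db_end):
    # Scan inter-arrival gaps in order; bail out on the first gap
    # exceeding max_per (best case O(1)), else check the final gap
    # up to the database end (worst case O(n)).
    last = 0  # the database start acts as the first reference point
    for t in sorted(timestamps):
        if t - last > max_per:
            return False  # early termination: pattern cannot qualify
        last = t
    return db_end - last <= max_per
```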
An Approach to Allocate Advertisement Slots for Banner Advertising
VADDADI NAGASAI KAVYA,Krishna Reddy Polepalli
Conference on Data Science, CODS, 2016
@inproceedings{bib_An_A_2016, AUTHOR = {VADDADI NAGASAI KAVYA, Krishna Reddy Polepalli}, TITLE = {An Approach to Allocate Advertisement Slots for Banner Advertising}, BOOKTITLE = {Conference on Data Science}. YEAR = {2016}}
In the banner advertising scenario, an advertiser aims to reach the maximum number of potential visitors, and a publisher tries to meet the requests of an increased number of advertisers to maximize revenue. In the literature, a model was introduced to extract the knowledge of coverage patterns from a transactional database. In this paper, we propose an ad slot allocation approach by extending the notion of coverage patterns to select distinct sets of ad slots to meet the requests of multiple advertisers. Preliminary experimental results on a real-world dataset show that the proposed approach meets the requests of an increased number of advertisers when compared with the baseline allocation approach.
Learning from Gurus: Analysis and Modeling of Reopened Questions on Stack Overflow
RISHABH GUPTA,Krishna Reddy Polepalli
Conference on Data Science, CODS, 2016
@inproceedings{bib_Lear_2016, AUTHOR = {RISHABH GUPTA, Krishna Reddy Polepalli}, TITLE = {Learning from Gurus: Analysis and Modeling of Reopened Questions on Stack Overflow}, BOOKTITLE = {Conference on Data Science}. YEAR = {2016}}
Community-driven Question Answering (Q&A) platforms are gaining popularity nowadays, and the number of posts on such platforms is increasing tremendously. Thus, the challenge of keeping these platforms noise-free is attracting the interest of the research community. Stack Overflow is one such popular computer-programming-related Q&A platform. Established users on Stack Overflow have learnt the acceptable format and scope of questions in due course. Even if their questions get closed, they are aware of the required edits; therefore, the chances of their questions being reopened increase. On the other hand, non-established users have not adapted to the Stack Overflow system and have difficulty editing their closed questions. In this work, we aim to identify features which help differentiate the editing approaches of established and non-established users, and motivate the need for a recommendation model. Such a recommendation model can assist every user in editing their closed questions by leveraging the edit style of the established users of the platform.
Improving the Performance of Collaborative Filtering with Category-Specific Neighborhood
Dileep Kumar,Krishna Reddy Polepalli,P BALAKRISHNA REDDY,Longbing Cao
Asian Conference on Intelligent Information and Database Systems, ACIIDS, 2016
@inproceedings{bib_Impr_2016, AUTHOR = {Dileep Kumar, Krishna Reddy Polepalli, P BALAKRISHNA REDDY, Longbing Cao}, TITLE = {Improving the Performance of Collaborative Filtering with Category-Specific Neighborhood}, BOOKTITLE = {Asian Conference on Intelligent Information and Database Systems}. YEAR = {2016}}
A recommender system (RS) helps customers select appropriate products from millions of products and has become a key component of e-commerce systems. Collaborative filtering (CF) based approaches are widely employed to build RSs. In CF, the recommendation for the target user is computed after forming the corresponding neighborhood of users. The neighborhood of a target user is extracted based on the similarity between the product rating vector of the target user and the product rating vectors of individual users. In CF, the methodology employed for neighborhood formation influences the performance. In this paper, we have made an effort to improve the performance of CF by proposing a different approach to compute recommendations by considering two kinds of neighborhoods: one obtained by considering the product ratings of the user as a single vector, and the other based on the neighborhoods of the corresponding virtual users. For the target user, the virtual users are formed by dividing the ratings based on the category of products. We have proposed a combined approach to compute better recommendations by considering both kinds of neighborhoods. Experimental results on the real-world MovieLens dataset show that the proposed approach improves the performance over CF.
Analyzing the Extraction of Relevant Legal Judgments using Paragraph-level and Citation Information
K RAGHAV,Krishna Reddy Polepalli,V. Balakista Reddy
Artificial Intelligence for Justice, AIJS, 2016
@inproceedings{bib_Anal_2016, AUTHOR = {K RAGHAV, Krishna Reddy Polepalli, V. Balakista Reddy}, TITLE = {Analyzing the Extraction of Relevant Legal Judgments using Paragraph-level and Citation Information}, BOOKTITLE = {Artificial Intelligence for Justice}. YEAR = {2016}}
Building efficient search systems to extract relevant information from a huge volume of legal judgments is a research issue. In the literature, efforts are being made to build efficient search systems in the legal domain by extending information retrieval approaches. We are making efforts to investigate improved approaches to extract relevant legal judgments for a given input judgment by exploiting the text and citation information of legal judgments. Typically, legal judgments are very large text documents and contain several intricate legal concepts. In this paper, we analyze how the paragraph-level and citation information of judgments could be exploited for retrieving relevant legal judgments for a given judgment. We have proposed an improved ranking approach to find the relevant legal judgments of a given judgment based on the similarity between the paragraphs of the judgments by employing the Okapi retrieval model and citation information. A user evaluation study on a dataset of legal judgments delivered by the Supreme Court of India shows that the proposed approach improves the ranking performance over the baseline approach. Overall, the analysis shows that there is scope to exploit the paragraph-level and citation information of judgments to improve search performance.
Discovering Periodic-Frequent Patterns in Transactional Databases Using All-Confidence and Periodic-All-Confidence
J N VENKATESH,R. Uday Kiran,Krishna Reddy Polepalli,Masaru Kitsuregawa
International Conference on Database and Expert Systems Applications, DEXA, 2016
@inproceedings{bib_Disc_2016, AUTHOR = {J N VENKATESH, R. Uday Kiran, Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Discovering Periodic-Frequent Patterns in Transactional Databases Using All-Confidence and Periodic-All-Confidence}, BOOKTITLE = {International Conference on Database and Expert Systems Applications}. YEAR = {2016}}
Periodic-frequent pattern mining involves finding all frequent patterns that have occurred at regular intervals in a transactional database. The basic model considers a pattern as periodic-frequent if it satisfies the user-specified minimum support (minSup) and maximum periodicity (maxPer) constraints. The usage of a single minSup and maxPer for an entire database leads to the rare-item problem. When confronted with this problem in real-world applications, researchers have tried to address it using item-specific minSup and maxPer constraints. It was observed that this extended model still generates a significant number of uninteresting patterns and, moreover, suffers from the issue of specifying item-specific minSup and maxPer constraints. This paper proposes a novel model to address the rare-item problem in periodic-frequent pattern mining. The proposed model considers a pattern as interesting if its support and periodicity are close to those of its individual items. The all-confidence measure is used as an interestingness measure to filter out uninteresting patterns in the support dimension. In addition, a new interestingness measure, called periodic-all-confidence, is proposed to filter out uninteresting patterns in the periodicity dimension. We have proposed a model combining both measures and a pattern-growth approach to resolve the rare-item problem and extract interesting periodic-frequent patterns. Experimental results show that the proposed model is efficient.
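As a rough sketch of the all-confidence measure used above as the support-dimension filter: all-confidence of a pattern is its support divided by the maximum support of its individual items, so a pattern is interesting only when its support is close to that of its items. This is an illustrative reading, not the paper's implementation:

```python
def support(items, transactions):
    # fraction of transactions containing every item of the pattern
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)

def all_confidence(pattern, transactions):
    # support of the pattern relative to its most frequent item
    return support(pattern, transactions) / max(
        support([i], transactions) for i in pattern)
```

For example, a pattern {a, b} whose support is far below the support of item a alone gets a low all-confidence and is filtered out even if it meets minSup.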
Memory Efficient Mining of Periodic Frequent Patterns in Transactional Databases
ALAMPALLY ANIRUDH,R. Uday Kiran,Krishna Reddy Polepalli,Masaru Kitsuregawa
Symposium Series on Computational Intelligence, SSCI, 2016
@inproceedings{bib_Memo_2016, AUTHOR = {ALAMPALLY ANIRUDH, R. Uday Kiran, Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Memory Efficient Mining of Periodic Frequent Patterns in Transactional Databases}, BOOKTITLE = {Symposium Series on Computational Intelligence}. YEAR = {2016}}
Periodic-frequent patterns are an important class of regularities that exist in a transactional database. A frequent pattern is called periodic-frequent if it appears at regular intervals in a transactional database. In the literature, a model of periodic-frequent patterns was proposed, and pattern growth-like approaches to extract the patterns are being explored. In these approaches, a periodic-frequent pattern tree is built in which a transaction-id list is maintained at each path's tail-node. As the typical size of a transactional database is very huge in the modern e-commerce era, extraction of periodic-frequent patterns by maintaining transaction-ids in the tree requires more memory. In this paper, to reduce the memory requirements, we introduce the notion of a period summary, which captures the periodicity of the patterns in a sequence of transaction-ids. While building the tree, the period summary of the transactions is computed and stored at the tail-node of the tree instead of the transaction-ids. We have also proposed a merging framework for period summaries for mining periodic-frequent patterns. The performance could be improved significantly, as the memory required to store the period summaries is significantly less than the memory required to store the transaction-id list. Experimental results show that the proposed approach reduces memory consumption significantly and also improves runtime efficiency considerably over the existing approaches.
A Framework to Improve Reuse in Weather-Based DSS Based on Coupling Weather Conditions
A MAMATHA,Krishna Reddy Polepalli,M. KUMARA SWAMY,G. Sreenivas,D. Raji Reddy
International Conference on Big Data Analytics, BDA, 2015
@inproceedings{bib_A_Fr_2015, AUTHOR = {A MAMATHA, Krishna Reddy Polepalli, M. KUMARA SWAMY, G. Sreenivas, D. Raji Reddy}, TITLE = {A Framework to Improve Reuse in Weather-Based DSS Based on Coupling Weather Conditions}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2015}}
Weather observation and forecasting systems are being operated to help mankind deal with adverse weather. Weather-based decision support systems (DSSs) are being built to improve the efficiency of production systems in the domains of health, agriculture, livestock, transport, business, planning, governance and so on. A weather-based DSS provides appropriate suggestions based on the weather condition of a given period for the selected domain. In the literature, the notion of reuse is being employed to improve the efficiency of DSSs. In this paper, we have proposed a framework to identify similar weather conditions, which could help improve the performance of weather-based DSSs through better reuse. In the proposed framework, the range of a weather variable is divided into categories based on its influence on the domain. We form a weather condition for a period as the combination of category values of weather variables. By comparing the daily/weekly weather conditions of a given year to the weather conditions of subsequent years, the proposed framework identifies the extent of reuse. We have conducted an experiment by applying the proposed framework on 30 years of weather data of Rajendranagar, Hyderabad, using the categories employed by the India Meteorological Department in the meteorology domain. The results show that there is a significant degree of similarity among daily and weekly weather conditions over the years. The results provide an opportunity to improve the efficiency of weather-based DSSs by improving the degree of reuse of the developed suggestions/knowledge for the corresponding weather conditions.
Relaxed Neighbor Based Approach For Improving Protein Function Prediction
SATHEESH KUMAR DWADASI,Krishna Reddy Polepalli
Conference on Data Science, CODS, 2015
@inproceedings{bib_Rela_2015, AUTHOR = {SATHEESH KUMAR DWADASI, Krishna Reddy Polepalli}, TITLE = {Relaxed Neighbor Based Approach For Improving Protein Function Prediction}, BOOKTITLE = {Conference on Data Science}. YEAR = {2015}}
Protein-protein interaction (PPI) networks are a valuable biological data source containing rich information useful for protein function prediction. The PPI network data obtained from high-throughput experiments is known to be noisy and incomplete. In the literature, common-neighbor, clustering, and classification-based approaches have been proposed to improve the performance of protein function prediction by modeling PPI data as a graph. These approaches exploit the fact that a protein shares functions with other proteins directly interacting with it. In this paper, we have experimented with an alternative approach by exploiting the notion that two proteins share a function if they have a well-defined group of directly or indirectly connected common neighbors. Experiments conducted on a variety of PPI network datasets show that the proposed approach improves protein function prediction accuracy over existing approaches.
An Approach to Cover More Advertisers in Adwords
AMAR BUDHIRAJA,Krishna Reddy Polepalli
International Conference on Data Science and Advanced Analytics, DSAA, 2015
@inproceedings{bib_An_A_2015, AUTHOR = {AMAR BUDHIRAJA, Krishna Reddy Polepalli}, TITLE = {An Approach to Cover More Advertisers in Adwords}, BOOKTITLE = {International Conference on Data Science and Advanced Analytics}. YEAR = {2015}}
The system of advertising on web search engines is popularly known as Adwords. In Adwords, advertisers bid on relevant search keywords so that their advertisement is displayed when a query containing any of those keywords is fired. During keyword auctions, there is very high competition for the frequent keywords, while there is little to no competition for the less frequent ones. In this paper, we have proposed an approach to utilize the eyeballs related to infrequent keywords to meet the demands of more advertisers by employing the notions of coverage and concept taxonomy. We employed the notion of coverage to form multiple distinct groups of infrequent keywords. We also employed a concept taxonomy to ensure that each group of keywords is semantically related. We have conducted experiments on the search query dataset of the AOL search engine. The results show that the proposed approach has the potential to meet the advertising demands of a larger number of advertisers over the existing approach.
Improved Approach for Protein Function Prediction by Exploiting Prominent Proteins
SATHEESH KUMAR DWADASI,Krishna Reddy Polepalli
International Conference on Data Science and Advanced Analytics, DSAA, 2015
@inproceedings{bib_Impr_2015, AUTHOR = {SATHEESH KUMAR DWADASI, Krishna Reddy Polepalli}, TITLE = {Improved Approach for Protein Function Prediction by Exploiting Prominent Proteins}, BOOKTITLE = {International Conference on Data Science and Advanced Analytics}. YEAR = {2015}}
Protein-protein interaction (PPI) networks are a valuable biological data source containing rich information useful for protein function prediction. The PPI network datasets obtained from high-throughput experiments are known to be noisy and incomplete. By modeling PPI data as a graph, research efforts are being made in the literature to improve the performance of protein function prediction by extending common-neighbor, clustering, and classification-based approaches. These approaches exploit the fact that a protein shares functions with other proteins which are connected through common neighbours. As PPI data is modeled as a graph, it contains prominent nodes which establish relatively high connectivity with other nodes. In this paper, we propose an improved approach for protein function prediction by exploiting the connectivity properties of prominent proteins. Experimental results on real-world datasets demonstrate the effectiveness of the proposed approach.
Mining coverage patterns from transactional databases
GOWTHAM SRINIVAS. P,Krishna Reddy Polepalli,ATMAKURI VENKATA TRINATH,S BHARGAV,R.Uday Kiran
Journal of Intelligent Information Systems, JIIS, 2015
@inproceedings{bib_Mini_2015, AUTHOR = {GOWTHAM SRINIVAS. P, Krishna Reddy Polepalli, ATMAKURI VENKATA TRINATH, S BHARGAV, R.Uday Kiran}, TITLE = {Mining coverage patterns from transactional databases}, BOOKTITLE = {Journal of Intelligent Information Systems}. YEAR = {2015}}
We propose a model of coverage patterns (CPs) and approaches for extracting CPs from transactional databases. The model is motivated by the problem of banner advertisement placement on e-commerce web sites. Normally, an advertiser expects that the banner advertisement should be displayed to a certain percentage of web site visitors. On the other hand, to generate more revenue for a given web site, the publisher makes efforts to meet the coverage demands of multiple advertisers. Informally, a CP is a set of non-overlapping items covered by a certain percentage of transactions in a transactional database. CPs do not satisfy the downward closure property. Efforts have been made in the literature to extract CPs using a level-wise pruning approach. In this paper, we propose CP extraction approaches based on pattern growth techniques. Experimental results show that the proposed pattern growth approaches improve the performance over the level-wise pruning approach. The results also show that CPs could be used to meet the demands of multiple advertisers.
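To make the informal definition above concrete, a minimal sketch of the coverage computation (ignoring the overlap constraint the full model also imposes) might look like:

```python
def coverage(pattern, transactions):
    # fraction of transactions covered by (i.e., containing at least
    # one item of) the pattern; the full CP model additionally bounds
    # the overlap among the items' transaction sets
    pattern = set(pattern)
    return sum(bool(pattern & set(t)) for t in transactions) / len(transactions)
```

In the advertising reading, items are web pages, transactions are visitor sessions, and a CP with coverage 0.7 is a set of pages reaching 70% of visitors.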
A Framework to Harvest Page Views of Web for Banner Advertising
Krishna Reddy Polepalli
International Conference on Big Data Analytics, BDA, 2015
@inproceedings{bib_A_Fr_2015, AUTHOR = {Krishna Reddy Polepalli}, TITLE = {A Framework to Harvest Page Views of Web for Banner Advertising}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2015}}
Online advertising provides an opportunity for product sellers and service providers to reach customers and has become a key factor in the growth of the economy. It is a major source of revenue for major search engines and social networking sites. Search engine, context-specific and banner advertising are the major modes of online advertising. The banner advertisement mode has certain advantages over the other modes. Currently, the number of registered websites comes to a billion. Each day, a typical website receives a number of visitors ranging from hundreds to millions. In a few years, the entire population of the globe is going to be connected to the Internet and browse websites. It is possible for a product seller or service provider to reach every potential customer through banner advertising. In this paper, a framework is proposed to harvest the page views of the web by forming clusters of similar websites. Rather than managing a single website, the publisher manages the aggregated advertising space of a collection of websites. As a result, the advertisement space could be expanded significantly, and it will provide the opportunity for an increased number of publishers to market the aggregated advertisement space of millions of websites to advertisers for reaching potential customers. It will also help in balancing the management of the banner advertising market.
Content specific coverage patterns for banner advertisement placement
ATMAKURI VENKATA TRINATH,GOWTHAM SRINIVAS. P,Krishna Reddy Polepalli
International Conference on Data Science and Advanced Analytics, DSAA, 2014
@inproceedings{bib_Cont_2014, AUTHOR = {ATMAKURI VENKATA TRINATH, GOWTHAM SRINIVAS. P, Krishna Reddy Polepalli}, TITLE = {Content specific coverage patterns for banner advertisement placement}, BOOKTITLE = {International Conference on Data Science and Advanced Analytics}. YEAR = {2014}}
In the banner advertisement scenario, an advertiser expects the advertisement to be displayed to a certain percentage of visitors. The advertiser also expects the advertisement to be relevant to the content of the web page. On the other hand, to generate more revenue for a given website, the publisher has to meet the coverage demands of several advertisers by providing appropriate sets of pages. Coverage patterns (CPs) are sets of web pages visited by a certain percentage of visitors. The model of CPs does not reflect the real-world scenario, as it does not capture the relevance of web pages to the advertisement. In this paper, we propose a model of content-specific CPs and a methodology to extract content-specific CPs from click-through data, given keywords describing every web page. A content-specific CP is a set of web pages visited by a certain percentage of users interested in particular content of the web pages. Experimental results show that the proposed model extracts CPs relevant to the topics of interest of the advertiser over the previous model.
Extracting diverse patterns with unbalanced concept hierarchy
M. KUMARA SWAMY,Krishna Reddy Polepalli,SOMYA SRIVASTAVA
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2014
@inproceedings{bib_Extr_2014, AUTHOR = {M. KUMARA SWAMY, Krishna Reddy Polepalli, SOMYA SRIVASTAVA}, TITLE = {Extracting diverse patterns with unbalanced concept hierarchy}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}. YEAR = {2014}}
The process of frequent pattern extraction finds interesting information about the association among the items in a transactional database. The notion of support is employed to extract the frequent patterns. Normally, in a given domain, a set of items can be grouped into a category, and a pattern may contain items which belong to multiple categories. In several applications, it may be useful to distinguish between a pattern having items belonging to multiple categories and a pattern having items belonging to one or a few categories. The notion of diversity captures the extent to which the items in a pattern belong to multiple categories. The items and the categories form a concept hierarchy. In the literature, an approach has been proposed to rank the patterns by considering a balanced concept hierarchy. In real-life scenarios, concept hierarchies are normally unbalanced. In this paper, we propose a general approach to calculate the rank based on diversity, called drank, by considering the unbalanced concept hierarchy. The experimental results show that the patterns ordered based on drank are different from the patterns ordered based on support, and that the proposed approach could assign the drank to different kinds of unbalanced patterns.
Relaxed neighbor based graph transformations for effective preprocessing: A function prediction case study
SATHEESH KUMAR DWADASI,Krishna Reddy Polepalli,Nita Parekh
International Conference on Big Data Analytics, BDA, 2014
@inproceedings{bib_Rela_2014, AUTHOR = {SATHEESH KUMAR DWADASI, Krishna Reddy Polepalli, Nita Parekh}, TITLE = {Relaxed neighbor based graph transformations for effective preprocessing: A function prediction case study}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2014}}
Protein-protein interaction (PPI) networks are a valuable source of biological data containing rich information useful for protein function prediction. PPI networks face data quality challenges: noise in the form of false-positive edges, and incompleteness in the form of missing biologically important edges. These issues can be handled by enhancing data quality through graph transformations for improved protein function prediction. We propose an improved method to extract similar proteins based on the notion of a relaxed neighborhood. The proposed method can be applied to carry out graph transformations of PPI network data sets to improve the performance of the protein function prediction task by adding biologically important protein interactions, removing dissimilar interactions, and increasing the reliability scores of the interactions. Experimental results on both unweighted and weighted PPI network data sets preprocessed with the proposed methodology show that it enhances data quality and improves prediction accuracy over other approaches. The results indicate that the proposed approach can utilize underexploited knowledge, such as distant relationships embedded in the PPI graph.
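The idea of comparing proteins through their (possibly relaxed) neighborhoods can be sketched as follows. The paper's actual relaxed-neighborhood measure differs in detail; here plain Jaccard similarity over k-hop adjacency sets is assumed, with hops > 1 standing in for the "relaxed" part that captures distant relationships:

```python
def neighbors(adj, v, hops=1):
    """Neighborhood of v up to `hops` hops (a relaxed neighborhood
    when hops > 1); v itself is excluded."""
    frontier, seen = {v}, {v}
    for _ in range(hops):
        frontier = {u for w in frontier for u in adj.get(w, ())} - seen
        seen |= frontier
    return seen - {v}

def similarity(adj, a, b, hops=1):
    """Jaccard overlap of the two (relaxed) neighbor sets."""
    na, nb = neighbors(adj, a, hops), neighbors(adj, b, hops)
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0

# toy PPI graph as an adjacency map
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
```

A transformation step could then add an edge when similarity exceeds a threshold and drop edges between dissimilar proteins.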
A framework to improve reuse in weather-based decision support systems
A MAMATHA,Krishna Reddy Polepalli,M. KUMARA SWAMY,G. Sreenivas,D. Raji Reddy
International Conference on Big Data Analytics, BDA, 2014
@inproceedings{bib_A_fr_2014, AUTHOR = {A MAMATHA, Krishna Reddy Polepalli, M. KUMARA SWAMY, G. Sreenivas, D. Raji Reddy}, TITLE = {A framework to improve reuse in weather-based decision support systems}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2014}}
Systems for weather observation and forecasting are operated to deal with adverse weather affecting mankind in general. Weather-based decision support systems (DSSs) are being built to improve the efficiency of production systems in the domains of health, agriculture, livestock, transport, business, planning, governance, and so on. A weather-based DSS provides appropriate suggestions based on the weather condition of a given period for the selected domain. In the literature, the notion of reuse has been employed to improve the efficiency of DSSs. In this paper, we propose a framework to identify similar weather conditions, which could help improve the performance of weather-based DSSs through better reuse. In the proposed framework, the range of each weather variable is divided into categories based on its influence on the domain; a weather condition for a period is then formed as the combination of the category values of the weather variables. By comparing the daily/weekly weather conditions of a given year to the weather conditions of subsequent years, the framework identifies the extent of reuse. We conducted experiments by applying the framework to 30 years of weather data of Rajendranagar, Hyderabad, using the categories employed by the India Meteorological Department in the meteorology domain. The results show a significant degree of similarity among daily and weekly weather conditions over the years, providing an opportunity to improve the efficiency of weather-based DSSs by increasing the degree of reuse of the suggestions/knowledge developed for the corresponding weather conditions.
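The framework's core step, mapping raw weather variables to category tuples and measuring how often conditions repeat across years, can be sketched as follows. The category boundaries below are hypothetical, not the IMD thresholds used in the paper:

```python
def categorize(rain_mm, tmax_c):
    """Map raw daily values to category labels; a weather condition
    is the tuple of category values (boundaries are illustrative)."""
    rain = "dry" if rain_mm == 0 else "light" if rain_mm < 15.6 else "heavy"
    temp = "mild" if tmax_c < 30 else "warm" if tmax_c < 38 else "hot"
    return (rain, temp)

def reuse_fraction(year_a, year_b):
    """Fraction of aligned days whose weather conditions match between
    two years -- the extent to which suggestions could be reused."""
    matches = sum(categorize(*a) == categorize(*b)
                  for a, b in zip(year_a, year_b))
    return matches / len(year_a)
```

Because raw values are collapsed into categories, two days need not have identical readings to share a condition, which is what makes a high degree of reuse plausible.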
Finding similar legal judgements under common law system
SUSHANTA KUMAR,Krishna Reddy Polepalli,V. Balakista Reddy,Malti Suri
International workshop on databases in networked information systems, DNIS, 2013
@inproceedings{bib_Find_2013, AUTHOR = {SUSHANTA KUMAR, Krishna Reddy Polepalli, V. Balakista Reddy, Malti Suri}, TITLE = {Finding similar legal judgements under common law system}, BOOKTITLE = {International workshop on databases in networked information systems}. YEAR = {2013}}
Legal judgements are complex in nature and contain citations to other judgements. Research efforts are ongoing to develop methods for efficient search of relevant legal information by extending the popular approaches used in the information retrieval and web search research areas. In the literature, it was shown that it is possible to find similar judgements by exploiting citations or links. In this paper, an approach has been proposed using the notion of a “paragraph-link” to improve the efficiency of the link-based similarity method. Experiments on a real-world data set and a user evaluation study show encouraging results.
Discovering coverage patterns for banner advertisement placement
GOWTHAM SRINIVAS. P,Krishna Reddy Polepalli,S. BHARGAV,R UDAY KIRAN,SATHEESH KUMAR DWADASI
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2012
@inproceedings{bib_Disc_2012, AUTHOR = {GOWTHAM SRINIVAS. P, Krishna Reddy Polepalli, S. BHARGAV, R UDAY KIRAN, SATHEESH KUMAR DWADASI}, TITLE = {Discovering coverage patterns for banner advertisement placement}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}. YEAR = {2012}}
We propose a model of coverage patterns and a methodology to extract coverage patterns from transactional databases. We discuss how coverage patterns are useful by considering the problem of banner advertisement placement on e-commerce web sites. Normally, an advertiser expects the banner advertisement to be displayed to a certain percentage of web site visitors. On the other hand, to generate more revenue for a given web site, the publisher has to meet the coverage demands of several advertisers by providing appropriate sets of web pages. Given the web pages of a web site, a coverage pattern is a set of pages visited by a certain percentage of visitors. The coverage patterns discovered from click-stream data could help the publisher in meeting the demands of several advertisers. The efficiency and advantages of the proposed approach are shown by conducting experiments on real-world click-stream data sets.
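The coverage of a set of pages, i.e., the fraction of visitors who viewed at least one of them, can be sketched with a naive brute-force miner. This is an illustration of the notion only; the paper's pattern-growth extraction method and its overlap constraints are not reproduced here, and the clickstream data is made up:

```python
from itertools import combinations

def coverage(transactions, pages):
    """Fraction of visitors who visited at least one page in `pages`."""
    pages = set(pages)
    return sum(1 for t in transactions if pages & t) / len(transactions)

def coverage_patterns(transactions, universe, min_cov, max_size=2):
    """Brute force: every page set (up to max_size) meeting min_cov."""
    found = []
    for k in range(1, max_size + 1):
        for combo in combinations(sorted(universe), k):
            if coverage(transactions, combo) >= min_cov:
                found.append(combo)
    return found

# toy clickstream: each set is the pages one visitor viewed
clicks = [{"a", "b"}, {"a"}, {"c"}, {"b", "c"}, {"d"}]
```

Here a publisher needing 80% coverage could not satisfy the demand with any single page, but the pair {a, c} reaches 4 of 5 visitors.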
A Model of Virtual Crop Labs as a Cloud Computing Application for Enhancing Practical Agricultural Education
Krishna Reddy Polepalli,Basi Bhaskar Reddy,D. Rama Rao
International Conference on Big Data Analytics, BDA, 2012
@inproceedings{bib_A_Mo_2012, AUTHOR = {Krishna Reddy Polepalli, Basi Bhaskar Reddy , D. Rama Rao}, TITLE = {A Model of Virtual Crop Labs as a Cloud Computing Application for Enhancing Practical Agricultural Education}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2012}}
A model of crop-specific virtual labs is proposed to improve practical agricultural education by considering the agricultural education system in India. In agricultural education, theoretical concepts are imparted through classroom lectures and laboratory skills are imparted in dedicated laboratories. Further, practical agricultural education is imparted by exposing the students to field problems through the Rural Agricultural Work Experience Program (RAWEP), experiential learning, and internships. In spite of these efforts, there is a feeling that the level of practical skills imparted to the students is not up to the desired level. So we have to devise new ways and means to enhance the practical knowledge and skills of agricultural students, so that they understand real-time crop problems and can provide corrective steps at the field level. Recent developments in ICTs provide an opportunity to improve practical education by developing virtual crop labs. The virtual crop labs contain well-organized, indexed, and summarized digital data (text, photographs, and video) corresponding to farm situations reflecting the life cycles of several farms of different crops cultivated under diverse farming conditions. The practical knowledge of the students could be improved if we systematically expose them to virtual crop labs along with course teaching. A cloud computing platform can be employed to store huge amounts of data and render it to students and other stakeholders in an online manner.
Novel techniques to reduce search space in multiple minimum supports-based frequent pattern mining algorithms
R UDAY KIRAN,Krishna Reddy Polepalli
International Conference on Extending Database Technology, EDBT, 2011
@inproceedings{bib_Nove_2011, AUTHOR = {R UDAY KIRAN, Krishna Reddy Polepalli}, TITLE = {Novel techniques to reduce search space in multiple minimum supports-based frequent pattern mining algorithms}, BOOKTITLE = {International Conference on Extending Database Technology}. YEAR = {2011}}
Frequent patterns are an important class of regularities that exist in a transaction database. Certain frequent patterns with a low minimum support (minsup) value can provide useful information in many real-world applications. However, extraction of these frequent patterns with single-minsup-based frequent pattern mining algorithms such as Apriori and FP-growth leads to the “rare item problem”: at a high minsup value, the frequent patterns with low minsup are missed, and at a low minsup value, the number of frequent patterns explodes. In the literature, the “multiple minsups framework” was proposed to discover frequent patterns, and mining techniques such as the Multiple Support Apriori and Conditional Frequent Pattern-growth (CFP-growth) algorithms have been proposed. As the frequent patterns mined with this framework do not satisfy the downward closure property, the algorithms follow different types of pruning techniques to reduce the search space. In this paper, we propose an efficient CFP-growth algorithm with new pruning techniques. Experimental results show that the proposed pruning techniques are effective.
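In the multiple-minsups framework, each item carries its own minimum item support (MIS), and a pattern's minsup is the lowest MIS among its items. A minimal sketch of this frequency test, with hypothetical MIS values:

```python
def pattern_minsup(pattern, mis):
    """Under multiple minsups, a pattern's minsup is the lowest
    minimum item support (MIS) among its items."""
    return min(mis[i] for i in pattern)

def is_frequent(pattern, support, mis):
    """A pattern is frequent if its support meets its own minsup."""
    return support >= pattern_minsup(pattern, mis)

# hypothetical per-item MIS values: a common item gets a high MIS,
# a rare item a low one
mis = {"bread": 0.20, "caviar": 0.01}
```

This is why downward closure fails: a pattern mixing rare and frequent items can qualify under a low minsup while its frequent-item-only subsets do not, which motivates the special pruning techniques.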
Using Lower-Bound Similarity to Enhance the Performance of Recommender Systems
MOHAK SHARMA,Krishna Reddy Polepalli
Compute Conference, COMPUTE, 2011
@inproceedings{bib_Usin_2011, AUTHOR = {MOHAK SHARMA, Krishna Reddy Polepalli}, TITLE = {Using Lower-Bound Similarity to Enhance the Performance of Recommender Systems}, BOOKTITLE = {Compute Conference}. YEAR = {2011}}
Recommender systems employ the popular K-nearest neighbor collaborative filtering (K-CF) methodology and its variations for recommending products. In the K-CF approach, the recommendation for a given user is computed based on the ratings of the K nearest neighbors. It can be noted that the system identifies K neighbors for each user irrespective of the number of products he/she has rated. As a result, a user who has rated few products may get less-similar neighbors, while a user who has rated more products may miss genuine neighbors. In the literature, the notion of lower-bound similarity has been proposed to improve clustering performance, in which the clusters are extracted by fixing a similarity threshold. In this paper, we extend the notion of lower-bound similarity to recommender systems to improve the performance of the K-CF approach. In the proposed approach, instead of fixing K for finding the neighborhood, a similarity threshold value is fixed to extract the neighbors of each user. As a result, each user gets an appropriate number of neighbors in a dynamic manner, based on the number of products he/she has rated. The experimental results on the MovieLens dataset show that the proposed lower-bound similarity CF approach improves the performance of recommender systems over the K-CF approach.
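The difference between K-nearest-neighbor selection and threshold (lower-bound similarity) selection can be sketched as follows; this is a minimal illustration over a hypothetical similarity map, not the paper's implementation:

```python
def neighbors_topk(sims, k):
    """K-CF: always the K most similar users, however weak the tail."""
    return sorted(sims, key=sims.get, reverse=True)[:k]

def neighbors_threshold(sims, tau):
    """Lower-bound similarity: every user at least tau similar, so the
    neighborhood size varies per user instead of being fixed at K."""
    return {u for u, s in sims.items() if s >= tau}

# hypothetical similarities of one target user to others
sims = {"u1": 0.9, "u2": 0.4, "u3": 0.7}
```

With K = 3 the top-K variant would be forced to include the weak neighbor u2, whereas the threshold variant simply returns fewer neighbors.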
Similarity analysis of legal judgments
SUSHANTA KUMAR,Krishna Reddy Polepalli,V. Balakista Reddy,Aditya Singh
Compute Conference, COMPUTE, 2011
@inproceedings{bib_Simi_2011, AUTHOR = {SUSHANTA KUMAR, Krishna Reddy Polepalli, V. Balakista Reddy, Aditya Singh}, TITLE = {Similarity analysis of legal judgments}, BOOKTITLE = {Compute Conference}. YEAR = {2011}}
In this paper, we have made an effort to propose approaches to find similar legal judgments by extending the popular techniques used in information retrieval and search engines. Legal judgments are complex in nature and refer to other judgments. We have analyzed all-term, legal-term, co-citation, and bibliographic coupling-based similarity methods to find similar judgments. The experimental results show that the legal-term cosine similarity method performs better than the all-term cosine similarity method. Also, the results show that the bibliographic coupling similarity method improves the performance over the co-citation approach.
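The two citation-based measures compared here have standard definitions that can be sketched directly: bibliographic coupling counts the references two judgments share, while co-citation counts how many later documents cite both. The corpus below is made up:

```python
def bibliographic_coupling(refs_a, refs_b):
    """Number of references cited by both documents."""
    return len(set(refs_a) & set(refs_b))

def co_citation(citing_corpus, a, b):
    """Number of documents in the corpus that cite both a and b."""
    return sum(1 for refs in citing_corpus if a in refs and b in refs)

# toy corpus: each set is the reference list of one later judgment
corpus = [{"a", "b"}, {"a"}, {"a", "b", "c"}]
```

Coupling can be computed as soon as a judgment is published, while co-citation needs later citing documents to accumulate, one plausible reason the coupling method fares better on judgment data.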
An efficient approach to mine periodic-frequent patterns in transactional databases
AKSHAT SURANA,R UDAY KIRAN,Krishna Reddy Polepalli
Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2011
@inproceedings{bib_An_e_2011, AUTHOR = {AKSHAT SURANA, R UDAY KIRAN, Krishna Reddy Polepalli}, TITLE = {An efficient approach to mine periodic-frequent patterns in transactional databases}, BOOKTITLE = {Pacific-Asia Conference on Knowledge Discovery and Data Mining}. YEAR = {2011}}
Recently, the temporal occurrences of frequent patterns in a transactional database have been exploited as an interestingness criterion to discover a class of user-interest-based frequent patterns, called periodic-frequent patterns. Informally, a frequent pattern is said to be periodic-frequent if it occurs at regular intervals specified by the user throughout the database. The basic model of periodic-frequent patterns is based on the notion of “single constraints.” The use of this model to mine periodic-frequent patterns containing both frequent and rare items leads to a dilemma called the “rare item problem.” To confront the problem, an alternative model based on the notion of “multiple constraints” has been proposed in the literature. The periodic-frequent patterns discovered with this model do not satisfy the downward closure property; as a result, it is computationally expensive to mine periodic-frequent patterns with the model. Furthermore, it has been observed that this model still generates some uninteresting patterns as periodic-frequent patterns. With this motivation, we propose an efficient model based on the notion of “multiple constraints” under which the discovered periodic-frequent patterns satisfy the downward closure property and can therefore be discovered efficiently. A pattern-growth algorithm is also discussed for the proposed model. Experimental results show that the proposed model is effective.
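The usual single-constraint definition, a pattern that is frequent and whose gaps between consecutive occurrences never exceed a maximum periodicity, can be sketched as follows (a sketch of the basic model, not of the paper's multiple-constraints variant):

```python
def is_periodic_frequent(tids, n_transactions, min_sup, max_per):
    """tids: sorted-able transaction ids where the pattern occurs.
    Frequent (support >= min_sup) and every gap between consecutive
    occurrences, including database start and end, is <= max_per."""
    if len(tids) < min_sup:
        return False
    stamps = [0] + sorted(tids) + [n_transactions]
    return max(b - a for a, b in zip(stamps, stamps[1:])) <= max_per
```

A single user-given (min_sup, max_per) pair is the "single constraint"; the rare item problem arises because no one pair suits both frequent and rare items at once.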
Performance evaluation of speculation-based protocol for read-only transactions
T RAGUNATHAN,Krishna Reddy Polepalli
Compute Conference, COMPUTE, 2010
@inproceedings{bib_Perf_2010, AUTHOR = {T RAGUNATHAN, Krishna Reddy Polepalli}, TITLE = {Performance evaluation of speculation-based protocol for read-only transactions}, BOOKTITLE = {Compute Conference}. YEAR = {2010}}
In the literature, speculation-based protocols have been proposed to improve the performance of read-only transactions (ROTs) over the existing two-phase locking (2PL) and snapshot isolation (SI)-based protocols. In this paper, we have compared the performance of the asynchronous speculation-based protocol with 2PL and SI-based protocols through analytical and simulation methods. The results show that the asynchronous speculation-based protocol improves the performance over 2PL and SI-based protocols.
An approach to extract special skills to improve the performance of resume selection
SUMIT MAHESHWARI,ABHISHEK SAINANI,Krishna Reddy Polepalli
International workshop on databases in networked information systems, DNIS, 2010
@inproceedings{bib_An_a_2010, AUTHOR = {SUMIT MAHESHWARI, ABHISHEK SAINANI, Krishna Reddy Polepalli}, TITLE = {An approach to extract special skills to improve the performance of resume selection}, BOOKTITLE = {International workshop on databases in networked information systems}. YEAR = {2010}}
In the Internet era, enterprises and companies receive thousands of resumes from job seekers. Currently available filtering techniques and search services help recruiters filter thousands of resumes down to a few hundred potential ones. Since these filtered resumes are similar to each other, it is difficult to identify the potential resumes by examining each one. We are investigating issues related to the development of approaches to improve the performance of the resume selection process. We have extended the notion of special features and proposed an approach to identify resumes with special skill information. In the literature, the notion of special features has been applied to improve the process of product selection in the e-commerce environment. However, extending the notion of special features to process resumes is a complex task, as resumes contain unformatted or semi-formatted text. In this paper, we have proposed an approach that considers only the skills-related information of the resumes. The experimental results on a real-world data set of resumes show that the proposed approach has the potential to improve the process of resume selection.
Interface Tailoring by Exploiting Temporality of Attributes for Small Screens
M. KUMARA SWAMY,Krishna Reddy Polepalli,R UDAY KIRAN,M.VENUGOPAL REDDY
International workshop on databases in networked information systems, DNIS, 2010
@inproceedings{bib_Inte_2010, AUTHOR = {M. KUMARA SWAMY, Krishna Reddy Polepalli, R UDAY KIRAN, M.VENUGOPAL REDDY}, TITLE = {Interface Tailoring by Exploiting Temporality of Attributes for Small Screens}, BOOKTITLE = {International workshop on databases in networked information systems}. YEAR = {2010}}
In the pervasive computing era, mobile phones and personal digital assistants are widely used for data collection. The traditional user interfaces employed for data collection in personal computer environments have to be modified appropriately for the mobile environment, because it is difficult to display a full interface on a single mobile screen due to the limited screen size. Interface tailoring methods for designing user interfaces for mobile screens have been investigated in the literature, and a temporality-based approach has been proposed for designing efficient user interfaces for the personal computer environment. In this paper, we extend the notion of attribute temporality to interface tailoring methods and propose an improved user interface design approach for small screens. The analysis on real-world datasets shows that the proposed approach can be used for better user interface design for small screens.
Semantics-based asynchronous speculative locking protocol for improving the performance of read-only transactions
T RAGUNATHAN,Krishna Reddy Polepalli,MOHIT GOYAL
Spring Simulation Multiconference, SPINGSIM, 2010
@inproceedings{bib_Sema_2010, AUTHOR = {T RAGUNATHAN, Krishna Reddy Polepalli, MOHIT GOYAL}, TITLE = {Semantics-based asynchronous speculative locking protocol for improving the performance of read-only transactions}, BOOKTITLE = {Spring Simulation Multiconference}. YEAR = {2010}}
Speculative locking (SL) protocols have been proposed in the literature for improving the performance of read-only transactions (ROTs) without correctness and data currency issues. In these protocols, ROTs carry out speculative executions and update transactions (UTs) follow two-phase locking (2PL); UTs are blocked if they conflict with ROTs. To reduce the blocking of UTs, a semantics-based protocol has been proposed in the literature by exploiting the “compensatability” property of ROTs. In that protocol, compensatable ROTs are processed without blocking and non-compensatable ROTs are processed using the synchronous speculation method. In this paper, we have proposed a semantics-based speculative locking protocol for ROTs in which non-compensatable ROTs are processed with the asynchronous speculation method. The simulation results show that the proposed approach improves the performance of ROTs over other protocols.
Improved approaches to mine rare association rules in transactional databases
R UDAY KIRAN,Krishna Reddy Polepalli
Workshop on Innovative Database Research, IDAR, 2010
@inproceedings{bib_Impr_2010, AUTHOR = {R UDAY KIRAN, Krishna Reddy Polepalli}, TITLE = {Improved approaches to mine rare association rules in transactional databases}, BOOKTITLE = {Workshop on Innovative Database Research}. YEAR = {2010}}
Rare association rules are association rules consisting of rare items. It is difficult to mine rare association rules with single-minimum-support-based approaches such as Apriori and FP-growth, as they suffer from the rare item problem. In the literature, efforts have been made to extract rare association rules with multiple minimum supports. It was observed that the multiple-minimum-supports-based approach still suffers from performance problems. As part of the proposed work, we have analyzed the multiple-minimum-supports-based approach and proposed improved approaches for extracting rare association rules. Experimental results show that the proposed approaches are efficient.
Extending Speculation-Based Protocols for Processing Read-Only Transactions in Distributed Database Systems
MOHIT GOYAL,T RAGUNATHAN,Krishna Reddy Polepalli
High Performance Computing and Communications, HPCC, 2010
@inproceedings{bib_Exte_2010, AUTHOR = {MOHIT GOYAL, T RAGUNATHAN, Krishna Reddy Polepalli}, TITLE = {Extending Speculation-Based Protocols for Processing Read-Only Transactions in Distributed Database Systems}, BOOKTITLE = {High Performance Computing and Communications}. YEAR = {2010}}
The main issues in processing read-only transactions (ROTs) are correctness, data currency, and performance. The popular two-phase locking (2PL) protocol processes transactions correctly according to the serializability criterion, but its performance deteriorates with data contention. To improve the performance, snapshot isolation (SI)-based approaches have been proposed; even though they improve performance, they compromise both correctness and data currency. In the literature, an effort has been made to propose improved approaches to process ROTs based on the notion of speculation, which improve performance without compromising correctness or data currency. In this paper, we have extended the speculation-based protocols for processing ROTs in distributed database systems. It has been identified that an ROT under speculation-based protocols in distributed database systems requires a commit phase; in addition, extra messages are required for making speculative versions available to ROTs during the execution phase of an update transaction. In spite of these overheads, the proposed protocols significantly reduce the waiting time of ROTs by increasing parallelism without violating correctness or data currency. The simulation experiments show that the proposed protocols significantly improve the performance over 2PL and SI-based protocols.
Analysing dynamics of crop problems by applying text analysis methods on farm advisory data of eSaguTM.
R UDAY KIRAN,Krishna Reddy Polepalli,M. KUMARA SWAMY,G. Syamasundar Reddy
International Journal of Computational Science and Engineering, IJCSE, 2010
@inproceedings{bib_Anal_2010, AUTHOR = {R UDAY KIRAN, Krishna Reddy Polepalli, M.KUMARA SWAMY, G. Syamasundar Reddy}, TITLE = {Analysing dynamics of crop problems by applying text analysis methods on farm advisory data of eSaguTM.}, BOOKTITLE = {International Journal of Computational Science and Engineering}. YEAR = {2010}}
By extending information and communication technologies, a personalized agricultural advisory system called eSagu™ has been developed in which farmers receive agricultural expert advice for each of their farms at regular intervals. The expert advice is prepared by agricultural experts based on the crop status information received in the form of both digital photographs and text. During 2004-05, the eSagu™ system was operated for 1051 cotton farms covering three villages in the state of Andhra Pradesh, India, and the expert advice was delivered to every cotton farm once a week. As a result, a data set consisting of about 20,000 such advice texts was generated. In this paper, we have carried out cluster/textual analysis experiments on the data set and report interesting results concerning the dynamics of crop problems. Normally, cotton farms belonging to the same nearby area/region should face similar problems. However, the cluster analysis of the advice delivered on each day shows that a significant number of farms suffer from distinct crop production problems. The results also indicate that a cluster of farms facing the same crop problem during one week faces distinct crop problems during the subsequent weeks. Based on the results, we conclude that it is necessary to build agricultural advisory systems that deliver farm-specific agricultural advice to each farm, to reduce crop failures and improve crop productivity.
An Improved Frequent Pattern-growth Approach to Discover Rare Association Rules.
R UDAY KIRAN,Krishna Reddy Polepalli
International Conference on Knowledge Discovery and Information Retrieval, KDIR, 2009
@inproceedings{bib_An_I_2009, AUTHOR = {R UDAY KIRAN, Krishna Reddy Polepalli}, TITLE = {An Improved Frequent Pattern-growth Approach to Discover Rare Association Rules.}, BOOKTITLE = {International Conference on Knowledge Discovery and Information Retrieval}. YEAR = {2009}}
Rare association rules are association rules involving rare items (less frequent items). To mine rare association rules efficiently, an effort has been made in the CFP-growth approach to mine frequent patterns using multiple minimum support (minsup) values. This approach is an extension of the FP-growth approach to multiple minsup values; it involves constructing an MIS-tree and generating frequent patterns from it. The issue in CFP-growth is constructing a compact MIS-tree, because CFP-growth considers certain items which will generate neither frequent patterns nor rules. In this paper, we propose an efficient approach for constructing the compact MIS-tree by exploring the notions of “least minimum support” and “infrequent child node pruning”. Experimental results on both synthetic and real-world datasets show that the proposed approach improves the performance over the CFP-growth approach.
An Improved Multiple Minimum Support Based Approach to Mine Rare Association Rules
R UDAY KIRAN,Krishna Reddy Polepalli
Symposium on Computational Intelligence and Data Mining, CIDM, 2009
@inproceedings{bib_An_I_2009, AUTHOR = {R UDAY KIRAN, Krishna Reddy Polepalli}, TITLE = {An Improved Multiple Minimum Support Based Approach to Mine Rare Association Rules}, BOOKTITLE = {Symposium on Computational Intelligence and Data Mining}. YEAR = {2009}}
In this paper, we propose an improved approach to extract rare association rules, i.e., association rules containing rare (less frequent) items. For extracting rare itemsets, single minimum support (minsup) based approaches such as Apriori suffer from the “rare item problem” dilemma: at a high minsup value, rare itemsets are missed, and at a low minsup value, the number of frequent itemsets explodes. To extract rare itemsets, an effort has been made in the literature in which the minsup of each item is fixed as a percentage of its support. Even though this approach improves the performance over single-minsup-based approaches, it still suffers from the same dilemma: if the percentage value is set high, rare itemsets are missed as the minsup of the rare items becomes close to their support, and if it is set low, the number of frequent itemsets explodes. In this paper, we propose an improved approach in which the minsup of each item is fixed based on the notion of “support difference”. The proposed approach assigns appropriate minsup values to frequent as well as rare items based on their supports, reducing both the “rule missing” and “rule explosion” problems. Experimental results on both synthetic and real-world datasets show that the proposed approach improves performance over existing approaches by minimizing the explosion of the number of frequent itemsets involving frequent items without missing the frequent itemsets involving rare items.
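A minimal sketch of support-difference-based MIS assignment, assuming the form MIS(i) = max(sup(i) − SD, LS) with SD a fixed support difference and LS a least-support floor; the parameter names and values are illustrative, and the paper's exact formulation may differ:

```python
def assign_mis(supports, sd, ls):
    """Each item's minimum item support tracks its own support,
    lowered by a fixed difference sd and floored at a least support ls."""
    return {item: max(sup - sd, ls) for item, sup in supports.items()}
```

Because the gap between an item's support and its MIS is constant rather than proportional, a rare item's MIS never collapses onto its support (avoiding rule missing) and a frequent item's MIS stays high (avoiding rule explosion).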
BoostWeight: An Approach to Boost the Term Weights in a Document Vector by Exploiting Open Web Directory.
GAURAV RUHELA,Krishna Reddy Polepalli
International Conference on Information & Knowledge Engineering, IKE, 2009
@inproceedings{bib_Boos_2009, AUTHOR = {GAURAV RUHELA, Krishna Reddy Polepalli}, TITLE = {BoostWeight: An Approach to Boost the Term Weights in a Document Vector by Exploiting Open Web Directory.}, BOOKTITLE = {International Conference on Information & Knowledge Engineering}. YEAR = {2009}}
For clustering, the cosine similarity method is a popular approach to compute the similarity between two document vectors, where each document vector consists of weighted terms. It can be noted that the value of the weight assigned to the terms influences the similarity value. Generally, the weight of each term is calculated using the popular TF-IDF method, in which the weight of a term is a function of its frequency in the document and in the overall document collection. However, the TF-IDF weighting method ignores the semantic association between terms. In this paper, we propose an improved term weighting method called BoostWeight that considers the semantic association between terms. For this, we exploit the generalization ability of hierarchical knowledge repositories such as the “Open Web Directory” to semantically associate terms. The proposed BoostWeight method increases the weight of terms if they have a common generalized term. The clustering experiments on a real data set show that BoostWeight improves both entropy and purity values over traditional term weighting approaches.
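The boosting idea can be sketched as follows. This is a toy version only: the actual BoostWeight formula and its use of the Open Web Directory hierarchy are more involved, and the parent map and boost factor here are hypothetical:

```python
from collections import Counter

def boost_weights(tfidf, parent, boost=1.5):
    """Raise the weight of a term when at least one other term in the
    same document maps to the same generalized (parent) concept."""
    counts = Counter(parent[t] for t in tfidf if t in parent)
    return {t: w * boost if counts.get(parent.get(t), 0) >= 2 else w
            for t, w in tfidf.items()}
```

Terms like "apple" and "banana" sharing the parent "fruit" reinforce each other, so two documents about fruit move closer under cosine similarity even when their exact terms differ elsewhere.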
Emerging Challenges in Agriculture-Deficiencies in Extension-Scope for Alternative Technologies
VENKATESHWARLU BAVANDALA,Krishna Reddy Polepalli,A. Sudarshan Reddy
Conference of the Andhra Pradesh Economic Association, APEC, 2009
@inproceedings{bib_Emer_2009, AUTHOR = {VENKATESHWARLU BAVANDALA, Krishna Reddy Polepalli, A. Sudarshan Reddy}, TITLE = {Emerging Challenges in Agriculture-Deficiencies in Extension-Scope for Alternative Technologies}, BOOKTITLE = {Conference of the Andhra Pradesh Economic Association}. YEAR = {2009}}
Despite the growing importance of other sectors, agriculture continues to hold the key to the growth of GDP, food security, employment, and income in India. Over a period of time, agriculture has been undergoing a change towards high-value crops, livestock, fisheries, etc. Along with it, substantial changes have occurred in seed varieties, the use of chemical inputs, and cultivation practices. Such a process has also culminated in distress conditions in the farm sector. Studies undertaken in the states of Andhra Pradesh, Karnataka, Maharashtra, and Punjab have revealed the serious crisis among the farming community. It is pertinent to note that net income is negative in some states (Narayan Moorthy: 2006) and nearly two thirds of the farmers are frustrated with their profession (Desh
Discovering special product features for improving the process of product selection in e-commerce environment
SUMIT MAHESHWARI,Krishna Reddy Polepalli
International Conference Electronic Commerce, ICEC, 2009
@inproceedings{bib_Disc_2009, AUTHOR = {SUMIT MAHESHWARI, Krishna Reddy Polepalli}, TITLE = {Discovering special product features for improving the process of product selection in e-commerce environment}, BOOKTITLE = {International Conference Electronic Commerce}. YEAR = {2009}}
In the E-commerce environment, a customer faces several difficulties in selecting a product. Technologies like recommendation systems are being developed to improve the performance of product selection. In this paper, we have investigated the problem of 'selecting a product from a group of similar products' faced by the customer. For example, when a customer wants to buy a Sony camera through an E-commerce Web site, the customer has to go through the information of several Sony camera models to select the appropriate one. In this paper, we have proposed an improved approach to help customers select the appropriate product. We have exploited the fact that every product possesses some specialness, which is exhibited through a few special features. The proposed approach identifies the special features and organizes the features of the product in an effective manner. We have conducted experiments on three real world data-sets related to Nokia mobile phones, Sony cameras and HP laptops. The results indicate that the proposed approach has the potential to improve the performance of product selection.
Overview of eSagu and Future Plan
Krishna Reddy Polepalli,G.Syamasundar Reddy,B.Bhaskar Reddy
IEEE Conference on Technologies for Humanitarian Challenges, CTHC, 2009
@inproceedings{bib_Over_2009, AUTHOR = {Krishna Reddy Polepalli, G.Syamasundar Reddy, B.Bhaskar Reddy}, TITLE = {Overview of eSagu and Future Plan}, BOOKTITLE = {IEEE Conference on Technologies for Humanitarian Challenges}. YEAR = {2009}}
The eSagu™ system is an IT-based personalized agro-advisory system. In this system, the agricultural experts generate the advice using the latest information about the crop situation, received in the form of both photographs and text. The expert advice is delivered to each farm on a regular basis (typically once every week or two weeks, depending on the type of crop) from the sowing stage to the harvesting stage, without the farmer asking a question. Since 2004, the eSagu system has been developed by operating on several crops and farms in the Andhra Pradesh state of India. It has been found that the agricultural expert can prepare the advice in an efficient manner based on the crop photographs and the related information. The impact results show that the expert advice helped the farmers achieve significant savings in capital investment and improvement in yield. Encouraged by the results of the eSagu experiment, a country-wide integrated agri-service program is planned by Media Lab Asia to provide expert advice and other agri-related services to the Indian farming community. In this paper, we briefly explain the development of the eSagu system, its advantages and the future plan.
An Efficient Context-based User Interface by Exploiting Temporality of Attributes
M.KUMARA SWAMY,Krishna Reddy Polepalli
Asia-Pacific Software Engineering Conference, APSEC, 2009
@inproceedings{bib_An_E_2009, AUTHOR = {M.KUMARA SWAMY, Krishna Reddy Polepalli}, TITLE = {An Efficient Context-based User Interface by Exploiting Temporality of Attributes}, BOOKTITLE = {Asia-Pacific Software Engineering Conference}. YEAR = {2009}}
Information systems receive data through user interface forms. An improperly designed user interface form increases the navigational burden, which in turn increases data-entry cost and data-entry errors. Research efforts are ongoing to investigate improved methods for user interface design. Normally, a context-based approach is followed to design user interfaces. In this approach, the input attributes are divided into several contexts based on functionality, and a user interface form is designed for each context. It can be observed that, in some information systems, some attributes receive values only during a certain time duration. We term this property the temporality of the attribute. In this paper, we have proposed an approach that exploits the temporality of attributes to identify the active contexts for designing an improved user interface. The implementation results on a real system show that the proposed method reduces the navigational burden significantly as compared to the existing context-based approach.
Mining rare periodic-frequent patterns using multiple minimum supports
R UDAY KIRAN,Krishna Reddy Polepalli
International Conference on Management of Data, COMAD, 2009
@inproceedings{bib_Mini_2009, AUTHOR = {R UDAY KIRAN, Krishna Reddy Polepalli}, TITLE = {Mining rare periodic-frequent patterns using multiple minimum supports}, BOOKTITLE = {International Conference on Management of Data}. YEAR = {2009}}
Recently, an approach has been proposed in the literature to extract frequent patterns which occur periodically. In this paper, we have proposed an approach to extract rare periodic-frequent patterns. Normally, the single-minsup-based frequent pattern mining approaches like Apriori and FP-growth suffer from the "rare item problem": at a high minsup, frequent patterns consisting of rare items will be missed, and at a low minsup, the number of frequent patterns explodes. In the literature, efforts have been made to extract rare frequent patterns under the "multiple minimum support framework". It was observed that the periodic-frequent pattern mining approach also suffers from the "rare item problem". In this paper, we have extended the "multiple minimum support framework" to extract rare periodic-frequent patterns and developed a new algorithm for this purpose. Experimental results show that the proposed approach is efficient.
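The "multiple minimum support framework" assigns each item its own threshold so that patterns containing rare items are not missed. A minimal sketch of that idea (a Liu-style MIS function; the names and the specific formula are illustrative, not the paper's algorithm):

```python
def minimum_item_supports(supports, beta=0.5, ls=0.01):
    """MIS(i) = max(beta * sup(i), LS): rare items get a proportionally
    lower minimum support (never below the floor LS), while frequent
    items keep a higher one, so the pattern count does not explode."""
    return {item: max(beta * sup, ls) for item, sup in supports.items()}

def satisfies(pattern_support, pattern_items, mis):
    """A pattern qualifies if its support meets the LOWEST MIS among its items."""
    return pattern_support >= min(mis[i] for i in pattern_items)
```

Under a single minsup of, say, 0.2, a pattern involving a rare item with support 0.015 would be discarded; under per-item thresholds it survives whenever the rare item's own MIS permits it.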
Extending ICTs to impart Applied Agricultural Knowledge
Krishna Reddy Polepalli,G.Shyamasundar Reddy,B.Bhaskar Reddy
National Conference on Agro-Informatics and Precision Farming, AIPA, 2009
@inproceedings{bib_Exte_2009, AUTHOR = {Krishna Reddy Polepalli, G.Shyamasundar Reddy, B.Bhaskar Reddy}, TITLE = {Extending ICTs to impart Applied Agricultural Knowledge}, BOOKTITLE = {National Conference on Agro-Informatics and Precision Farming}. YEAR = {2009}}
To improve the performance of agricultural education, research efforts are being made to exploit ICTs to develop an education methodology that imparts practical skills to agricultural graduates regarding the aforesaid farming-situation problems in an effective and efficient manner. In this paper, we explain the background and an overview of the proposed ICT-based framework to impart applied education to agriculture professionals.
Improving the performance of read-only transactions through asynchronous speculation
T RAGUNATHAN,Krishna Reddy Polepalli
Spring simulation multiconference., SpringSim, 2008
@inproceedings{bib_Impr_2008, AUTHOR = {T RAGUNATHAN, Krishna Reddy Polepalli}, TITLE = {Improving the performance of read-only transactions through asynchronous speculation}, BOOKTITLE = {Spring simulation multiconference.}. YEAR = {2008}}
A read-only transaction (ROT) does not modify any data. The main issues regarding the processing of read-only transactions (ROTs) are correctness, data currency and performance. Even though the popular two-phase locking protocol processes ROTs correctly with no data currency related issues, its performance deteriorates with data contention. To improve the performance of ROTs, snapshot isolation-based approaches have been proposed. Even though snapshot isolation-based approaches improve the performance of ROTs, both the data currency of ROTs and correctness (serializability) are compromised. In this paper, we propose an asynchronous speculative locking protocol (called ASLR) which improves the performance of ROTs by trading extra processing resources. The simulation results show that ASLR improves the performance of ROTs significantly over two-phase locking and snapshot isolation-based approaches with manageable extra processing resources. The ASLR approach processes ROTs without any data currency or correctness issues.
Exploiting Semantics and Speculation for Improving the Performance of Read-only Transactions.
T RAGUNATHAN,Krishna Reddy Polepalli
International Conference on Management of Data, COMAD, 2008
@inproceedings{bib_Expl_2008, AUTHOR = {T RAGUNATHAN, Krishna Reddy Polepalli}, TITLE = {Exploiting Semantics and Speculation for Improving the Performance of Read-only Transactions.}, BOOKTITLE = {International Conference on Management of Data}. YEAR = {2008}}
A read-only transaction (ROT) does not modify any data. Efforts are being made in the literature to improve the performance of ROTs without correctness and data currency issues. The widely used two-phase locking protocol (2PL) processes transactions without any correctness or data currency issues. However, the performance of 2PL deteriorates with data contention. Snapshot isolation (SI)-based protocols proposed in the literature improve the performance of ROTs, but they compromise on correctness and data currency. Speculative locking (SL) protocols have been proposed in the literature to improve the performance of ROTs by carrying out speculative executions only for ROTs and following 2PL for update transactions. In SL-based protocols, update transactions are blocked if they conflict with ROTs. In this paper, we have proposed an improved approach to increase parallelism among update transactions and ROTs by exploiting a new notion called "compensatability". In this protocol, an ROT which is "compensatable" can complete its execution and carry out a compensating operation to incorporate the effect of conflicting update transactions. As a result, parallelism is improved over SL protocols, as update transactions which conflict with "compensatable" ROTs need not block. In this paper, we have proposed a protocol that exploits both the "compensatability" property of ROTs and speculation. The simulation results show that the proposed protocol improves the performance by carrying out a smaller number of speculative executions. Further, the proposed protocol does not violate the serializability criteria.
Performance enhancement of Read-only transactions using speculative locking protocol
T RAGUNATHAN,Krishna Reddy Polepalli
Inter Research Institute Student Seminar in Computer Science, IRISS, 2007
@inproceedings{bib_Perf_2007, AUTHOR = {T RAGUNATHAN, Krishna Reddy Polepalli}, TITLE = {Performance enhancement of Read-only transactions using speculative locking protocol}, BOOKTITLE = {Inter Research Institute Student Seminar in Computer Science}. YEAR = {2007}}
A read-only transaction (ROT) does not modify any data. The main issues regarding processing ROTs are correctness, data currency and performance. The two-phase locking (2PL) protocol is widely used for concurrency control with serializability as the correctness criterion. Even though 2PL processes ROTs correctly with no data currency related issues, the performance deteriorates as data contention increases. To improve the performance, an approach has been proposed in the literature by considering "Snapshot Isolation (SI)" as the correctness criterion to process ROTs. At the SI level, ROTs are processed by reading from a snapshot of the committed data and ignoring the modifications produced by concurrent active transactions. Even though SI-based algorithms improve the performance of ROTs, the data currency of ROTs and correctness (serializability) are compromised. In this paper, we propose an approach that extends the notion of speculation to improve the performance of ROTs without compromising data currency or correctness. The speculation-based approach improves the performance of ROTs by trading extra computing resources without violating serializability as the correctness criterion. The simulation results show that with the proposed protocol the throughput performance is improved significantly over 2PL, and the data currency of ROTs is improved significantly over SI-based approaches with manageable extra resources.
eSagu™: a data warehouse enabled personalized agricultural advisory system
Krishna Reddy Polepalli,GV Ramaraju,G.S. Reddy
International Conference on Management of Data, COMAD, 2007
@inproceedings{bib_eSag_2007, AUTHOR = {Krishna Reddy Polepalli, GV Ramaraju, G.S. Reddy }, TITLE = {eSagu™: a data warehouse enabled personalized agricultural advisory system}, BOOKTITLE = {International Conference on Management of Data}. YEAR = {2007}}
In this paper, we explain a personalized agricultural advisory system called eSagu, which has been developed to improve the performance and utilization of agricultural technology and help Indian farmers. In eSagu, rather than visiting the crop in person, the agricultural expert delivers advice at regular intervals (once every one or two weeks) to each farm by obtaining the crop status in the form of digital photographs and other information. During 2004-06, through eSagu, agricultural expert advice was delivered to about 6000 farms covering six crops. The results show that the expert advice helped the farmers achieve savings in capital investment and improved crop yield. In particular, a data warehouse of farm histories has been developed, which provides crop-related information to the agricultural expert in an integrated manner for generating quality expert advice. In this paper, after explaining eSagu and its advantages, we discuss how the data warehouse of farm histories enables the agricultural expert to deliver quality advice. We also discuss some research issues to improve the performance of eSagu.
Improving the performance of read-only transactions through speculation
T RAGUNATHAN,Krishna Reddy Polepalli
International workshop on databases in networked information systems, DNIS, 2007
@inproceedings{bib_Impr_2007, AUTHOR = {T RAGUNATHAN, Krishna Reddy Polepalli}, TITLE = {Improving the performance of read-only transactions through speculation}, BOOKTITLE = {International workshop on databases in networked information systems}. YEAR = {2007}}
A read-only transaction (ROT) does not modify any data. The main issues regarding processing ROTs are correctness, data currency and performance. The two-phase locking (2PL) protocol is widely used for concurrency control with serializability as the correctness criterion. Even though 2PL processes ROTs correctly with no data currency related issues, the performance deteriorates as data contention increases. To improve the performance over 2PL, snapshot isolation (SI)-based protocols have been proposed. SI-based protocols process ROTs by reading from a snapshot of the committed data and ignoring the modifications produced by concurrent active transactions. Even though SI-based algorithms improve the performance of ROTs, both the data currency of ROTs and correctness (serializability) are compromised. In this paper, we propose an approach to improve the performance of ROTs using speculation without compromising data currency or correctness. The proposed approach improves the performance of ROTs by trading extra computing resources without violating serializability as the correctness criterion. The simulation results show that with the proposed protocol the throughput performance is improved significantly over 2PL and SI-based approaches with manageable extra resources.
eSagu: An IT based personalized agricultural extension system prototype-analysis of 51 farmers' case studies
B. V. Ratnam,Krishna Reddy Polepalli,G. S. Reddy
International Journal of Education and Development using ICT, IJEDICT, 2006
@inproceedings{bib_eSag_2006, AUTHOR = {B. V. Ratnam, Krishna Reddy Polepalli, G. S. Reddy}, TITLE = {eSagu: An IT based personalized agricultural extension system prototype-analysis of 51 farmers' case studies}, BOOKTITLE = {International Journal of Education and Development using ICT}. YEAR = {2006}}
To bridge the information gap between the agricultural expert and the farmer, the International Institute of Information Technology (IIIT), Hyderabad has built the eSagu ("Sagu" means cultivation in the Telugu language) system, an IT-based personalized agricultural extension system to improve agricultural productivity by disseminating fresh expert agricultural advice to farmers in a timely and personalized manner. In eSagu, the agricultural experts generate the advice based on information about the crop situation received in the form of both text and digital photographs. In Kharif 2004, a prototype was developed and implemented with 1051 farms. In the prototype, a team of agricultural experts stationed at IIIT, Hyderabad (India) delivered 20,000 pieces of agricultural expert advice to 1051 cotton farms in three villages (Oorugonda, Gudeppad and Oglapur) in Atmakur Mandal of Warangal district in Andhra Pradesh state, India, by looking at digital photographs and other farm information supplied by some educated and experienced farmers (coordinators) in these villages. The pilot project was implemented successfully. In this article, an analysis of 51 registered farmers regarding compliance with the advice and the corresponding effect is reported. Compliance with expert advice on pest and disease management and IPM practices was analyzed by assigning appropriate scores based on their effect on yield and input cost. The analysis showed that about fifty percent of the farmers followed the practices which increase yield and reduce input cost. A high positive correlation (r=0.46**) was observed between the compliance rate and the yields. Though some farmers obtained the same yield as the farmers who followed the advice, their input costs were significantly higher.
Efficient Implementation of Agri-Insurance Schemes by Piggy backing eSagu System
G. Syamasundar Reddy,Krishna Reddy Polepalli
Information Communications and Technology, ICT, 2006
@inproceedings{bib_Effi_2006, AUTHOR = {G. Syamasundar Reddy, Krishna Reddy Polepalli}, TITLE = {Efficient Implementation of Agri-Insurance Schemes by Piggy backing eSagu System}, BOOKTITLE = {Information Communications and Technology}. YEAR = {2006}}
Timely agro-advisory services and efficient risk mitigation mechanisms can provide stable income to farmers. To provide a timely agro-advisory service, the e-Sagu system has been developed by IIIT-Hyderabad and Media Lab Asia. It is an IT-based personalized agro-advisory system being developed to provide high-quality personalized (farm-specific) agricultural expert advice to each and every farm in a timely manner at the farmer's door-step, without the farmer asking a question. The results show that significant benefits have flowed to the farmers in terms of reduced inputs and enhanced yield. Regarding risk mitigation, agri-insurance schemes around the world, whether in an advanced nation or a developing one like India, are suffering due to catastrophic losses, covariate risks, asymmetric information and lack of quality data. These problems may pose a serious threat to the survival of the implementing agencies unless there is an effective monitoring system in place, with enough manpower and robust infrastructure. In this paper we explain how agri-insurance schemes can be effectively implemented on top of the eSagu system. The agri-insurance scheme can be very cost-effective as it can use the resources and processes of the eSagu system. Also, as eSagu provides quality and reliable data at the farm level, the problem of asymmetric information can be effectively hedged. Most important, eSagu also improves the performance of formal risk mitigation mechanisms through farm-specific agro-advisory and monitoring.
Leader-page Resources in World Wide Web
D RAVI SHANKAR,PRADEEP KUMAR BEERLA,Krishna Reddy Polepalli
India Joint International Conference on Data Science & Management of Data, COMAD/CODS, 2005
@inproceedings{bib_Lead_2005, AUTHOR = {D RAVI SHANKAR, PRADEEP KUMAR BEERLA, Krishna Reddy Polepalli}, TITLE = {Leader-page Resources in World Wide Web}, BOOKTITLE = {India Joint International Conference on Data Science & Management of Data}. YEAR = {2005}}
Ranking the search results is an important research problem in the WWW. HITS, PageRank and variations of these algorithms are widely used approaches for ranking. In this paper, we propose a new ranking algorithm for search results by introducing the concept of a "leader-page". The notion of "leader-page" is defined by extending the concept of "leader" from leadership theory. In leadership theory, a leader is defined as a person who has more contacts with the other members of the group, both initiates and receives communication, and whose characteristics are the most similar to the group's own characteristics. Similarly, in the WWW, the proposed approach identifies leader-pages and assigns a "leadership score" to them based on several kinds of cyclic and similarity relationships established with other web pages. Given a set of keywords (a search query), the proposed approach ranks the related pages based on the corresponding "leadership score". The experimental results show that the proposed approach gives high leadership scores to resourceful pages as compared to the results of the HITS algorithm and the Google search engine.
An improved approach to extract document summaries based on popularity
P. ARUN KUMAR,K. PRAVEEN KUMAR,T.Someswara Rao,Krishna Reddy Polepalli
International workshop on databases in networked information systems, DNIS, 2005
@inproceedings{bib_An_i_2005, AUTHOR = {P. ARUN KUMAR, K. PRAVEEN KUMAR, T.Someswara Rao, Krishna Reddy Polepalli}, TITLE = {An improved approach to extract document summaries based on popularity}, BOOKTITLE = {International workshop on databases in networked information systems}. YEAR = {2005}}
With the rapid growth of the Internet, most textual data in the form of newspapers, magazines and journals tends to be available on-line. Summarizing these texts can help users access the information content at a faster pace. However, doing this task manually is expensive and time-consuming. Automatic text summarization is a solution to this problem. For a given text, a text summarization algorithm selects a few salient sentences based on certain features. In the literature, weight-based, foci-based and machine learning approaches have been proposed. In this paper, we propose a popularity-based approach for text summarization. The popularity of a sentence is determined based on the number of other sentences similar to it. Using the notion of popularity, it is possible to extract potential sentences for summarization that could not be extracted by the existing approaches. The experimental results show that by applying both popularity and weight-based criteria it is possible to extract effective summaries.
A framework of information technology-based agriculture information dissemination system to improve crop productivity
Krishna Reddy Polepalli,R.Ankaiah
Current Science, CURR SCI, 2005
@inproceedings{bib_A_fr_2005, AUTHOR = {Krishna Reddy Polepalli, R.Ankaiah}, TITLE = {A framework of information technology-based agriculture information dissemination system to improve crop productivity}, BOOKTITLE = {Current Science}. YEAR = {2005}}
eSagu: An IT-based Personalized Agricultural Extension System-A Prototype Experience
G.Syamasundar Reddy,Krishna Reddy Polepalli, A.Sudarshan Reddy,B.Venkateshwar Rao
Science, Technology, and Development, STD, 2005
@inproceedings{bib_eSag_2005, AUTHOR = {G.Syamasundar Reddy, Krishna Reddy Polepalli, A.Sudarshan Reddy, B.Venkateshwar Rao}, TITLE = {eSagu: An IT-based Personalized Agricultural Extension System-A Prototype Experience}, BOOKTITLE = {Science, Technology, and Development}. YEAR = {2005}}
The growing trend towards commercial crops with ever-changing technology packages necessitates a transformation of the existing agricultural extension system using emerging technologies like IT. In this context, IIIT, Hyderabad developed and implemented eSagu, a Web-based agricultural expert advice dissemination system, in Atmakur mandal, Warangal district during 2004-05, covering 1,051 cotton farms. The eSagu system of extension succeeded in acquiring the infrastructure and human resources as envisaged and proved its organizational ability in coordinating different branches for smooth functioning on the proposed lines. Holistic information in the form of clear images of the crop (through digital photographs and zooming) and other information, i.e. soil, weather and crop history, helped scientists to provide effective advice. The system was able to offer collective expert advice from one place within a 24-hour response time to farmers at the other end. Further, it is only the information that moved, while the farmers and scientists remained at their respective working places. The system proved its technical efficiency in terms of pest identification and prediction, with appropriate advice based on IPM practices, to all sections of farmers. In the process of information dissemination, the system was able to provide 20,000 advisories and accumulated 1,11,000 crop photographs in a period of one year. In the field of research, it was able to identify new pests like stem borer, detect Gray Mildew disease early, and share the information with other research agencies. As an experiment, the project tested the feasibility and acceptability of IT, tapping its potential as an alternative to the existing extension system. It also offers scope for further reduction in the cost of delivery of advice, provided a cluster-based approach is adopted in the future.
The Application of ICT in Indian Agriculture–The case of eSagu Model of Web-based Agricultural Expert Advice Dissemination System
Krishna Reddy Polepalli,B.Venkateshwar Rao,A. Sudarshan Reddy,M.KUMARA SWAMY
International Conference on Technology, Knowledge and Society, ICTKS, 2005
@inproceedings{bib_The__2005, AUTHOR = {Krishna Reddy Polepalli, B.Venkateshwar Rao, A. Sudarshan Reddy, M.KUMARA SWAMY}, TITLE = {The Application of ICT in Indian Agriculture–The case of eSagu Model of Web-based Agricultural Expert Advice Dissemination System}, BOOKTITLE = {International Conference on Technology, Knowledge and Society}. YEAR = {2005}}
In view of the technology/extension gaps in Indian agriculture, and to exploit the ICT revolution, the International Institute of Information Technology, Hyderabad, A.P., India developed the eSagu model of extension system and implemented it for the cotton crop in the three villages of Oorugonda, Gudeppad and Oglapur, covering 749 farmers and 1041 farms during the 2004-05 crop season. The main objective is to build a cost-effective and scalable agricultural expert advice dissemination system for all farmers. The three-tier system consists of farmers as end users and coordinators as intermediaries who obtain the crop status through digital photographs and text and communicate the advice to the farmers. The scientists, with a knowledge system, prepare farm advisories. The evaluation study clearly brings out the feasibility and acceptability of the ICT-based eSagu model. The crop status, the zooming facility of digital photographs, weather data, crop history and soil data have all enabled the scientists in advice making and prediction. The advice was provided to the farmers in 24-36 hours. The eSagu operation improved access, knowledge and the technology adoption rate as compared to counterparts in the non-project area. It further established that the acquired technology has been quite useful, reflected in production increased by 1.5 quintals per acre, fertilizers saved by 0.76 bags, and pesticides by 2.3 sprays per acre. This resulted in a net gain of Rs. 3820 per acre. The farmers' response to the utility of the project is quite positive and they want the project to continue in the future. In conclusion, the evaluation study reveals that the accomplishments of the project are impressive, the system is generally accepted, and it holds promise for wider application in the future.
Speculative locking protocols to improve performance for distributed database systems
Krishna Reddy Polepalli,Masaru Kitsuregawa
IEEE Transactions on Knowledge and Data Engineering, TKDE, 2004
@inproceedings{bib_Spec_2004, AUTHOR = {Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Speculative locking protocols to improve performance for distributed database systems}, BOOKTITLE = {IEEE Transactions on Knowledge and Data Engineering}. YEAR = {2004}}
In this paper, we have proposed speculative locking (SL) protocols to improve the performance of distributed database systems (DDBSs) by trading extra processing resources. In SL, a transaction releases the lock on a data object whenever it produces the corresponding after-image during its execution. By accessing both before- and after-images, the waiting transaction carries out speculative executions and retains one execution based on the termination (commit or abort) mode of the preceding transactions. By carrying out multiple executions for a transaction, SL increases parallelism without violating the serializability criteria. Under the naive version of SL, the number of speculative executions of a transaction explodes with data contention. By exploiting the fact that a submitted transaction is more likely to commit than abort, we propose SL variants that process transactions efficiently by significantly reducing the number of speculative executions. The simulation results indicate that, even with manageable extra resources, these variants significantly improve the performance over two-phase locking in DDBS environments where transactions spend a longer time in processing and transaction aborts occur frequently.
Reducing the blocking in two-phase commit with backup sites
Krishna Reddy Polepalli,Masaru Kitsuregawa
Information Processing Letters, IPL, 2003
@inproceedings{bib_Redu_2003, AUTHOR = {Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {Reducing the blocking in two-phase commit with backup sites}, BOOKTITLE = {Information Processing Letters}. YEAR = {2003}}
The blocking phenomenon in two-phase commit (2PC) reduces the availability of the system, as the blocked transactions keep all their resources until the recovery of the coordinator. The three-phase commit (3PC) protocol involves an extra round of message transmission to resolve the blocking problem. In this paper, we propose a backup commit (BC) protocol to reduce the blocking problem by attaching multiple backup sites to the coordinator site. In BC, after receiving responses from the participants, the coordinator quickly communicates the final decision to the backup sites before it sends the final decision to the participants. When blocking occurs, the participant sites can terminate the transaction by consulting a backup site of the coordinator. The BC protocol resolves the blocking in most coordinator site failures without involving an expensive communication cycle as in 3PC. The simulation experiments indicate that the throughput performance of BC is close to 2PC.
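The ordering trick at the heart of BC, recording the decision at the backup sites before telling the participants, can be sketched as follows (a toy in-memory model with invented class names, not the paper's simulation):

```python
class Backup:
    """A backup site attached to the coordinator; blocked participants consult it."""
    def __init__(self):
        self.decision = None
    def record(self, decision):
        self.decision = decision
    def query(self):
        return self.decision

class Coordinator:
    """2PC coordinator with the BC extension."""
    def __init__(self, backups):
        self.backups = backups

    def decide(self, votes):
        # Phase 1 outcome: commit only if every participant voted yes.
        decision = "commit" if all(votes) else "abort"
        # BC step: inform the backup sites FIRST, so the decision survives
        # a coordinator crash that happens before participants hear it.
        for b in self.backups:
            b.record(decision)
        # Phase 2: only now does the decision go out to the participants.
        return decision
```

If the coordinator fails after the backups are informed, a blocked participant can call `query()` on any backup instead of waiting for coordinator recovery, which is why BC avoids 3PC's extra message round in most failure cases.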
Asynchronous operations in distributed concurrency control
Krishna Reddy Polepalli,Subhash Bhalla
IEEE Transactions on Knowledge and Data Engineering, TKDE, 2003
@inproceedings{bib_Asyn_2003, AUTHOR = {Krishna Reddy Polepalli, Subhash Bhalla}, TITLE = {Asynchronous operations in distributed concurrency control}, BOOKTITLE = {IEEE Transactions on Knowledge and Data Engineering}. YEAR = {2003}}
Distributed locking is commonly adopted for performing concurrency control in distributed systems. It incorporates additional steps for handling deadlocks; this activity is carried out by methods based on wait-for graphs or probes. The present study examines the detection of conflicts based on enhanced local processing for distributed concurrency control. In the proposed "edge detection" approach, a graph-based resolution of access conflicts has been adopted. The technique generates a uniform wait-for precedence order at distributed sites for transactions to execute. Earlier methods based on serialization graph testing are difficult to implement in a distributed environment. The edge detection approach is a fully distributed approach. It presents a unified technique for locking and deadlock detection exercises, and eliminates many deadlocks without incurring message overheads.
Understanding Helicoverpa armigera pest population dynamics related to chickpea crop using neural networks
RAJAT GUPTA,B V L Narayana,Krishna Reddy Polepalli,G.V.Ranga Rao,CLL Gowda,YVR Reddy ,G.Rama Murthy
International Conference on Data Mining, ICDM, 2003
@inproceedings{bib_Unde_2003, AUTHOR = {RAJAT GUPTA, B V L Narayana, Krishna Reddy Polepalli, G.V.Ranga Rao, CLL Gowda, YVR Reddy , G.Rama Murthy}, TITLE = {Understanding Helicoverpa armigera pest population dynamics related to chickpea crop using neural networks}, BOOKTITLE = {International Conference on Data Mining}. YEAR = {2003}}
Insect pests are a major cause of crop loss globally. Pest management will be effective and efficient if we can predict the occurrence of peak activities of a given pest. Research efforts are ongoing to understand pest dynamics by applying analytical and other techniques to pest surveillance data sets. In this study, we make an effort to understand pest population dynamics using neural networks by analyzing a pest surveillance data set of Helicoverpa armigera (pod borer) on the chickpea (Cicer arietinum L.) crop. The results show that the neural network method successfully predicts pest attack incidences one week in advance.
An Approach to Build a Cyber-Community Hierarchy
Krishna Reddy Polepalli,Masaru Kitsuregawa
International Conference on Data Mining, ICDM, 2002
@inproceedings{bib_An_A_2002, AUTHOR = {Krishna Reddy Polepalli, Masaru Kitsuregawa}, TITLE = {An Approach to Build a Cyber-Community Hierarchy}, BOOKTITLE = {International Conference on Data Mining}. YEAR = {2002}}
In this paper, we propose an approach to extract community structures in the Web by considering a community structure as a group of content creators that manifests itself as a set of interlinked pages. We abstract the community structure as a dense bipartite graph (DBG) over a group of Web pages and propose an algorithm to extract the DBGs from the given data set. Also, a high-level community is abstracted as a DBG over a set of low-level communities. Using the proposed approach, a community hierarchy can be constructed for the given data set that generalizes a large number of low-level communities into a few high-level communities. With this approach, we have extracted a three-level community hierarchy from a 10 GB TREC (Text REtrieval Conference) data set. We believe that the extracted community hierarchy facilitates easy analysis of the low-level communities, helps in reorganizing Web sites, and provides a way to understand the sociology of the Web.
A Framework of Information Technology Based Agriculture Information Dissemination System to Improve Crop Productivity
Krishna Reddy Polepalli
International Conference on Big Data Analytics, BDA, 2002
@inproceedings{bib_A_Fr_2002, AUTHOR = {Krishna Reddy Polepalli}, TITLE = {A Framework of Information Technology Based Agriculture Information Dissemination System to Improve Crop Productivity}, BOOKTITLE = {International Conference on Big Data Analytics}. YEAR = {2002}}
The Indian farming community faces a multitude of problems in maximizing crop productivity. In spite of successful research on new agricultural practices concerning crop cultivation, the majority of farmers are not getting upper-bound yields for several reasons. One of the reasons is that expert/scientific advice regarding crop cultivation is not reaching the farming community in a timely manner. It is true that India possesses valuable agricultural knowledge and expertise; however, a wide information gap exists between the research level and practice. Indian farmers need timely expert advice to make them more productive and competitive. In this paper, we present a solution to bridge the information gap by exploiting advances in Information Technology (IT). We propose a framework of a cost-effective agricultural information dissemination system (AgrIDS) to disseminate expert agricultural knowledge to the farming community to improve crop productivity. Some of the crucial benefits of AgrIDS are as follows. It is a scalable system that can be incrementally developed and extended to cover all the farmers (crops) of India in a cost-effective manner. It enables the farmer to cultivate a crop with expertise, as that of an agricultural expert, by disseminating both crop-specific and location-specific expert advice in a personalized and timely manner. With AgrIDS, the lag period between research effort and practice can be reduced significantly. Finally, the proposed system assumes great importance given the trend of globalization, as it aims to provide expert advice that is crucial for the Indian farmer to harvest different kinds of crop varieties based on demand in the world market.
A graph based approach to extract a neighborhood customer community for collaborative filtering
Krishna Reddy Polepalli,Masaru Kitsuregawa,SREEKANTH PALLURU,S SRINIVASA RAO
International workshop on databases in networked information systems, DNIS, 2002
@inproceedings{bib_A_gr_2002, AUTHOR = {Krishna Reddy Polepalli, Masaru Kitsuregawa, SREEKANTH PALLURU, S SRINIVASA RAO}, TITLE = {A graph based approach to extract a neighborhood customer community for collaborative filtering}, BOOKTITLE = {International workshop on databases in networked information systems}. YEAR = {2002}}
In e-commerce sites, recommendation systems are used to recommend products to customers. Collaborative filtering (CF) is a widely employed approach to recommend products. In the literature, researchers are making efforts to improve the scalability and online performance of CF. In this paper, we propose a graph-based approach to improve the performance of CF. We abstract the neighborhood community of a given customer through a dense bipartite graph (DBG). Given a data set of customer preferences, a group of neighborhood customers for a given customer is extracted by extracting the corresponding DBG. Experimental results on the MovieLens data set show that the recommendations made with the proposed approach match closely with those of CF. The proposed approach has the potential to adapt to frequent changes in the product preference data set.