Volume 5 Number 4 March 2016

1

Enhanced MSER algorithm for Text Extraction
P. Jaswanth, S. Anusuya, M. Anil Kumar, T. Dhikhi

Abstract: Text extraction, the task of identifying and isolating the desired text in an image, is challenging for natural scene images because of variations in color, font size, text alignment, illumination, and so on. In this work, we propose a novel method, referred to as Enhanced Maximally Stable Extremal Region (EMSER), to extract the text present in images. Existing approaches to this problem lack accuracy. The proposed EMSER algorithm uses morphological operators such as dilation and reconstruction, with structuring elements of different sizes, to identify the shape of the text objects and to count the connected components accurately. The proposed method is compared with the Sobel and Canny edge detection methods, and the experimental results show its superiority.
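The role of dilation and connected-component counting can be illustrated with a minimal sketch. It is not the EMSER algorithm itself: it assumes an already binarized image, a square structuring element, and 4-connectivity, and the function names `dilate` and `count_components` are hypothetical.

```python
from collections import deque

def dilate(img, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                # Switch on every in-bounds pixel covered by the element.
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            out[ny][nx] = 1
    return out

def count_components(img):
    """Count 4-connected foreground components using breadth-first search."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

# Toy image (an assumption for illustration): two strokes one pixel apart, plus a distant blob.
image = [
    [1, 0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
]

print(count_components(image))          # 3
print(count_components(dilate(image)))  # 2
```

Dilation merges the two nearby strokes into one component while the distant blob stays separate, which is why the size of the structuring element affects the component count.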

2

Predictive Regression Forecasting for Share Trading based on Total Part Summation
K. Jayanthi, P. Suresh

Abstract: In today's changing economic scenario, there is ample opportunity to draw on varied sources of time series data. These data are now easily available and accessible to the intelligent decision maker. Many research works have applied the interdisciplinary notion of data mining to share trading forecasting, treating it as a time series problem. Although diversified ensembles of individual predictors have been built, individual time series forecasting has not been performed efficiently, and the presence of irrelevant attributes poses significant problems. To address individual time series forecasting and reduce the forecasting error, this paper designs a framework called Predictive Regression Forecast based on Total Part Summation (PRF-TPS). The PRF-TPS framework is developed on a share trading dataset to forecast the time series of each component in the presence of irrelevant attributes and thereby reduce the forecasting error. PRF operates on large collections of events and summarizes them with individual time series forecasting. Total Part Summation (TPS) separately forecasts the Dividend Stock Price Ratio, Income Growth Ratio, and Stock Price Income Growth of share trading. TPS effectively learns the irrelevant attributes at different time series properties of the components. It produces statistically and economically significant gains for investors and performs better out of sample in predictive regressions. An intensive comparative study shows the efficiency of these enhancements, with better performance in terms of run-time, forecast efficiency, test error rate, and predictive accuracy on shares. Experimental analysis shows that the PRF-TPS framework reduces the run-time for share rate forecasting by 36.86% and the test error rate by 50.79% compared to state-of-the-art works.

3

An Efficient Technique for Sequential Pattern Searching
S. Deepa

Abstract: Sequential pattern mining aims to extract sequences from large databases, which can in turn be interpreted as domain knowledge for several purposes. It is used in domains such as studying customer behavior, mining web logs distributed across multiple servers, protein and gene sequence analysis, and, in computational biology, analyzing amino acid mutation patterns. Searching in sequence databases plays an important role in many application domains, mainly information retrieval and data mining. Hence there is considerable interest among researchers in developing new concepts related to sequence search. This research work proposes two new search techniques, namely Sequence Search by Partitioning (SSP) and Sequence Search by Indexing (SSI), for performing sequence search efficiently. The performance of the proposed techniques is compared on key measures such as total execution time, search time, and memory utilization.

4

Facial Expression Recognition for Color Images Using Genetic Algorithm
M. Mahadevii, C.P. Sumathi

Abstract: Expressions are the voices that spell the mood of a person. To identify people's emotions, underlying feelings such as happiness, sadness, surprise, and disgust need to be classified. The aim of this work is to recognize people's emotions and classify them under the basic expression categories. The research was carried out on color images, for which segmentation based on skin color helps to localize the face. The face is divided into three facial parts using horizontal and vertical projections. A dataset is created by extracting statistical features from each of the facial parts, and feature selection is implemented using a genetic algorithm based on the Root Mean Square Error (RMSE) observed while training a Naïve Bayes classifier on the data. The algorithm has an accuracy of 92.31%.
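The RMSE criterion and the idea of a feature-selection chromosome can be sketched as follows. This is only an illustration under assumptions: the abstract does not give the encoding, so the bit-mask chromosome, the 0/1 label encoding, and the names `rmse` and `apply_mask` are hypothetical, and no classifier or genetic search is implemented here.

```python
import math

def rmse(predicted, actual):
    """Root Mean Square Error between two equal-length numeric sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def apply_mask(feature_row, chromosome):
    """Keep only the features whose chromosome bit is 1 (hypothetical encoding)."""
    return [value for value, bit in zip(feature_row, chromosome) if bit]

# Toy values, not taken from the paper.
row = [0.42, 0.10, 0.77, 0.31]
chromosome = [1, 0, 1, 0]
print(apply_mask(row, chromosome))         # [0.42, 0.77]
print(rmse([1, 0, 1, 1], [1, 1, 1, 0]))    # about 0.7071
```

In a genetic algorithm of this kind, a chromosome with a lower RMSE on the training data would be considered fitter.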

5

Interpretability of Fuzzy Clusters by Fuzzy Association Rules Using Cluster based Fuzzy Partitioning
Swati Ramdasi, Shailaja Shirwaikar

Abstract: Data mining is a widely accepted and widely used tool for extracting interesting information from data. Association rule mining and clustering are descriptive techniques, and the fuzzy approach has enhanced the power of both. Clustering is used in data processing for discretization and data reduction; however, it suffers from an interpretability problem. This paper presents a multi-step combination of the two techniques that gives better insight into the dataset and also identifies irrelevant attributes. It extends the fuzzy association rule mining algorithm by using a user-defined support-confidence framework. Several clustering-based methods are proposed and compared for fuzzy partitioning of individual attributes. The proposed algorithm addresses the interpretability of clusters by using the expressive power of fuzzy rules, and it helps improve cluster quality by finding the prime attributes that contribute to cluster formation. The paper presents expected and interesting results obtained when the algorithm is applied to some known datasets.

6

Empirical Evaluation of Shrinkage Thresholding Technique in Contourlet Domain
S. Shajun Nisha, M. Pitchammal

Abstract: Medical imaging has emerged as one of the most prominent sub-fields in the world of science and technology. Image denoising is one of the primary steps in digital image processing; its main aim is to eliminate noise from the input image. Here, medical images are used as input. Medical images are corrupted by a variety of noises, depending on the device, during acquisition, transmission, and storage. In this work, Magnetic Resonance Images (MRI) affected by Gaussian noise, speckle noise, and salt-and-pepper noise are decomposed in the contourlet domain. The contourlet transform is used to preserve the edges and contours of the regions. After decomposition, the following thresholding methods are applied: Heursure shrink, Min-Max shrink, Neighsure shrink, Bishrink, Visu shrink, Sure shrink, Neigh shrink, Bayes shrink, Normal shrink, and Block shrink. The thresholding function identifies and filters the noisy coefficients, and the inverse transform is then taken to recover the original image. The thresholding techniques are applied and their performance is examined to find the best result. MRI images are used as datasets for quantitative validation. The Peak Signal-to-Noise Ratio (PSNR), Weighted Signal-to-Noise Ratio (WSNR), and Visual Signal-to-Noise Ratio (VSNR) are employed to quantify the denoising performance.
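The thresholding and evaluation steps can be sketched in a few lines. This is an illustration under assumptions, not the paper's pipeline: the contourlet transform is not implemented, so the coefficients are taken as a given list; the threshold shown is the universal threshold commonly associated with VisuShrink; the PSNR assumes 8-bit images. All function names are hypothetical.

```python
import math

def universal_threshold(sigma, n):
    """Universal threshold sigma * sqrt(2 * ln(n)) for n coefficients with noise level sigma."""
    return sigma * math.sqrt(2.0 * math.log(n))

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; coefficients with magnitude <= t become 0."""
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - t
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk

def psnr(original, denoised, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel sequences."""
    mse = sum((o - d) ** 2 for o, d in zip(original, denoised)) / len(original)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy values, not taken from the paper.
print(soft_threshold([5.0, -3.0, 0.5, -0.2], 1.0))  # [4.0, -2.0, 0.0, 0.0]
print(psnr([10, 20], [11, 19]))                     # about 48.13
```

A higher PSNR indicates that the denoised image is closer to the reference image.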

7

Performance Analysis of Bat K-Means Clustering Algorithm Using Gene Expression Data Set
P. Neelavathi, K. Thangavel, E.N. Sathish Kumar

Abstract: Clustering is a popular data mining method that aims to represent a large dataset by a collection of clusters. Clustering gene expression data sets is difficult because of seed selection. Hence, to group gene expression data, we propose a hybrid Bat K-Means algorithm (BKM). The proposed system combines the concepts of the Bat optimization technique and the K-Means algorithm. In this approach, the initial cluster centres are computed using the Bat optimization technique, which gives nearly the exact centre of each cluster based on bat echolocation behaviour, described by location and velocity. The method is tested on the “cell-free RNA across pregnancy time course from pregnant women” gene expression data set. The performance of BKM is evaluated and compared using confusion matrix validation. Further, experimental studies show that BKM-based clustering gives the better performance.

8

Performance Analysis of Firefly K-Means Clustering Algorithm Using Gene Expression Data
M. Lakshmi, K. Thangavel, P.S. Raja

Abstract: Clustering is a common technique for grouping similar objects and is used in many fields, including data mining, pattern recognition, and image analysis. K-Means is a benchmark algorithm for data clustering, but it has limitations: the initial centroids are selected at random, and the result can be a local optimum. The initial centroids play a vital role in the K-Means clustering process. This paper adopts the firefly algorithm to initialize the cluster centroids rather than selecting them randomly. The firefly algorithm is an optimization algorithm used here to find the initial centroids; K-Means clustering then refines the centroids and clusters. Through firefly optimization, an optimal solution is obtained for the initial cluster centroids of the K-Means algorithm, which is applied to asthma gene expression data. The performance of the Firefly K-Means algorithm is evaluated on the asthma gene expression data and compared with the K-Means algorithm. The proposed Firefly K-Means algorithm outperforms K-Means.
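The division of labour between the initializer and K-Means can be sketched as follows. This is a minimal illustration, assuming the initial centroids are supplied by some external optimizer; the firefly search itself is not implemented, and the function names and toy data are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, initial_centroids, iterations=100):
    """Standard K-Means refinement starting from externally supplied centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[k])  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, labels

# Toy data; the initial centroids stand in for the output of an optimizer.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because K-Means only refines the centroids it is given, a poor starting point can leave it in a local optimum, which is the motivation for replacing random initialization.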

9

A Review on Code Smell Techniques Using Nesting Structure Tree
Chaitanya Kulkarni, S.D. Joshi

Abstract: Duplicated code is harmful to the maintenance and evolution of source code in software systems. Refactorability analysis supports the removal of software clones by checking whether duplicated code can be unified when it is merged. An approach is needed that automatically pairs clones that can be refactored without changing the behaviour of the program. Removing clones from source code through refactoring is problematic because the clones often contain modifications that refactoring may not support. The proposed approach examines the differences between clones that can be safely parameterized without causing any side effects. The computational cost of the proposed approach is negligible (less than a second) in the vast majority of the examined cases. Clones are detected by tools, which are discussed by clone type. The input for clone detection is source code, which consists of a set of statements and preconditions. Program Dependence Graph (PDG) mapping and Abstract Syntax Tree (AST) matching techniques can detect clones in source code with the help of refactoring techniques. The performance of the present work will be studied using a variety of clone detection tools on open source projects, and an assessment will be made of whether refactorable clones are more common in production code than in test code.
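The idea behind AST matching can be illustrated with Python's standard `ast` module. This is a simplified sketch and not the approach reviewed in the paper: it ignores PDG mapping and preconditions, it maps every identifier to the same placeholder (so it does not check that renaming is consistent), and the helper names are hypothetical.

```python
import ast

class _Normalizer(ast.NodeTransformer):
    """Replace function, argument, and variable names with a placeholder."""

    def visit_FunctionDef(self, node):
        node.name = "_"
        self.generic_visit(node)  # continue into arguments and body
        return node

    def visit_arg(self, node):
        node.arg = "_"
        return node

    def visit_Name(self, node):
        node.id = "_"
        return node

def structure(source):
    """Return a textual dump of the source's syntax tree with identifiers erased."""
    tree = _Normalizer().visit(ast.parse(source))
    return ast.dump(tree)

def is_renamed_clone(source_a, source_b):
    """True when the two fragments differ only in identifier names."""
    return structure(source_a) == structure(source_b)

a = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
b = "def add_all(items):\n    acc = 0\n    for item in items:\n        acc += item\n    return acc\n"
c = "def product(xs):\n    p = 1\n    for x in xs:\n        p *= x\n    return p\n"

print(is_renamed_clone(a, b))  # True
print(is_renamed_clone(a, c))  # False
```

Fragments `a` and `b` differ only in naming, so they are candidates for merging into one function; `c` differs in a literal and an operator, which is the kind of difference a refactoring tool would have to parameterize.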

10

Comparative Analysis of Edge Detection Algorithms based on Content Based Image Retrieval with Heterogeneous Images
T. Dharani, I. Laurence Aroquiaraj, V. Mageshwari

Abstract: A heterogeneous image database has various types of visual features. Nowadays, Content Based Image Retrieval (CBIR) systems mostly handle unlabelled images as well as heterogeneous images. Feature extraction is still an important issue for CBIR systems because it is the initial and most essential step. Traditional CBIR systems work with low-level features, namely colour and texture. The next levels of visual features, middle and high, are shape and semantics. In this paper, we explain the CBIR system and compare first-order, second-order, and Gaussian edge detection methods with fuzzy logic. The proposed work gives a brief overview of edge detection algorithms and states their merits and demerits for better understanding. The fuzzy inference pixel system based edge detection algorithm performs well when compared to the other methods.
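A first-order edge detector can be sketched with the Sobel operator, a well-known gradient-based method. The abstract does not list which first-order operators were compared, so the choice of Sobel, the function name `sobel_magnitude`, and the toy image are assumptions for illustration; the fuzzy inference method is not shown.

```python
import math

# 3x3 Sobel kernels for horizontal (GX) and vertical (GY) intensity change.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Gradient magnitude at each interior pixel; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * img[y + j - 1][x + i - 1] for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * img[y + j - 1][x + i - 1] for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out

# Toy grayscale image with a vertical step edge between columns 1 and 2.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
edges = sobel_magnitude(image)
print(edges[1][1], edges[1][2])  # 40.0 40.0
```

Thresholding the magnitude map yields a binary edge map; second-order methods instead look for zero crossings of the second derivative.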