International Journal on Soft Computing ( IJSC ): MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER DATASET

Monday, 3 December 2018

MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER DATASET


Soumen Kumar Pati¹ and Asit Kumar Das²

¹ Department of Computer Science/Information Technology, St. Thomas' College of Engineering and Technology, 4, D.H. Road, Kolkata-23, soumen_pati@rediffmail.com
² Department of Computer Science and Technology, Bengal Engineering and Science University, Shibpur, Howrah-03, asitdas72@rediffmail.com

ABSTRACT

Microarray technology measures the expression levels of thousands of genes simultaneously. One of the challenges in classifying cancer from high-dimensional gene expression data is selecting a minimal set of relevant genes that maximizes classification accuracy. Because specific cancerous gene expression profiles have distinct characteristics, developing flexible and robust gene identification methods is essential. Many gene selection methods and corresponding classifiers have been proposed. In the proposed method, a single gene with high class-discrimination capability is selected, and classification rules for cancer are generated from gene expression profiles. The method first computes an importance factor for each gene in the experimental cancer dataset by counting the linguistic terms (defined over different discrete quantities) that have high class-discrimination capability according to their degree of dependency on the classes. The genes with high importance factors are then selected to form the initial reduct. Next, the traditional k-means clustering algorithm is applied to each gene in the initial reduct, and the misclassification error of each gene is computed. The final reduct is formed by selecting the genes with the lowest misclassification errors. A classifier is then constructed from decision rules induced by the selected (single) important genes on the training dataset and is used to classify cancerous and non-cancerous samples in the test dataset. The proposed method is tested on four publicly available cancer gene expression datasets. In most cases, accurate classification is obtained using only single important genes, and the genes identified are highly correlated with the pathogenesis of cancer. To demonstrate the robustness of the proposed method, its outcomes (correctly classified instances) are compared with those of some well-known existing classifiers.
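The per-gene clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the k-means and misclassification-error stage, it assumes two classes (so k = 2), and it omits the importance-factor computation because the abstract does not define it. The function names `two_means_1d`, `misclassification_error`, and `rank_genes`, as well as the toy expression values, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def two_means_1d(values: Sequence[float], iters: int = 100) -> List[int]:
    """Cluster 1-D values into two groups with plain k-means (k = 2).

    Centroids start at the minimum and maximum value, so the result is
    deterministic. Returns a cluster index (0 = low, 1 = high) per value.
    """
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to the nearer centroid.
        new_labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        low = [v for v, c in zip(values, new_labels) if c == 0]
        high = [v for v, c in zip(values, new_labels) if c == 1]
        # Update step: recompute centroids (keep the old one if a cluster is empty).
        new_lo = sum(low) / len(low) if low else lo
        new_hi = sum(high) / len(high) if high else hi
        if new_labels == labels and new_lo == lo and new_hi == hi:
            break
        labels, lo, hi = new_labels, new_lo, new_hi
    return labels


def misclassification_error(values: Sequence[float], classes: Sequence[int]) -> float:
    """Fraction of samples whose cluster disagrees with the class label (0 or 1)."""
    clusters = two_means_1d(values)
    mismatches = sum(1 for c, y in zip(clusters, classes) if c != y)
    # Cluster indices are arbitrary, so take the better of the two
    # possible cluster-to-class mappings.
    return min(mismatches, len(values) - mismatches) / len(values)


def rank_genes(expression: Dict[str, Sequence[float]],
               classes: Sequence[int]) -> List[Tuple[str, float]]:
    """Sort genes by ascending misclassification error."""
    errors = {g: misclassification_error(v, classes) for g, v in expression.items()}
    return sorted(errors.items(), key=lambda kv: kv[1])


# Toy data (invented): six samples, three per class.
classes = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [1.0, 1.2, 0.9, 5.0, 5.2, 4.8],  # separates the classes cleanly
    "gene_b": [1.0, 5.0, 1.1, 4.9, 1.2, 5.1],  # mixes the classes
}
ranking = rank_genes(expression, classes)
print(ranking)
```

In this sketch the gene at the head of `ranking` would be the candidate for the final reduct. How many genes to keep, and how the decision rules are then induced from them, are left open because the abstract does not specify them.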

KEYWORDS

Microarray cancer data, K-means algorithm, Gene selection, Classification Rule, Cancer sample identification, Gene reducts.

ORIGINAL SOURCE URL : http://airccse.org/journal/ijsc/papers/3312ijsc06.pdf

http://airccse.org/journal/ijsc/current2012.html
