ORIGINAL ARTICLE
Year : 2022  |  Volume : 12  |  Issue : 2  |  Page : 122-126

Using classification and K-means methods to predict breast cancer recurrence in gene expression data


1 Medical Image and Signal Processing Research Center, Department of Bioinformatics, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
2 Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences; Pediatric Inherited Diseases Research Center, Research Institute for Primordial Prevention of Non Communicable Disease, Isfahan University of Medical Sciences, Isfahan, Iran
3 Department of Hematology-Oncology, Isfahan University of Medical Sciences, Isfahan, Iran
4 Health Information Technology Research Center, Isfahan University of Medical Sciences, Isfahan, Iran

Correspondence Address:
Mohammad Sattari
Health Information Technology Research Center, Isfahan University of Medical Sciences, Isfahan
Iran

Source of Support: None, Conflict of Interest: None


DOI: 10.4103/jmss.jmss_117_21


Background: Breast cancer is a type of cancer that starts in the breast tissue and affects about 10% of women at different stages of their lives. In this study, we applied a new method to predict recurrence in biological networks built from gene expression data. Method: The method comprises data collection, clustering, determination of differentiating genes, and classification. Eight techniques (random forest, support vector machine, neural network, hidden Markov model, joint mutual information, random forest + k-means, neural network + k-means, and support vector machine + k-means) were implemented on 12,172 genes and 200 samples. Results: Thirty genes were identified as differentiating genes and used for classification. Random forest + k-means performed better than the other techniques. Two techniques, neural network + k-means and random forest + k-means, performed better than the others in identifying high-risk cases. Conclusion: Thirty of the 12,172 genes were used for classification, and the use of clustering improved the performance of the classification techniques.
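The abstract does not specify how clustering and classification are combined, so the following is only a minimal sketch of one plausible reading of the "random forest + k-means" pipeline, not the authors' implementation. It assumes scikit-learn, synthetic data in place of the real expression matrix, an ANOVA F-test as the criterion for picking 30 differentiating genes, and the k-means cluster label appended as an extra feature for the random forest. The function names (`make_synthetic_expression`, `fit_kmeans_random_forest`, `predict_recurrence`) and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split


def make_synthetic_expression(n_samples=200, n_genes=500, n_informative=30, seed=0):
    """Synthetic stand-in for an expression matrix (samples x genes).

    Assumption: the first `n_informative` genes are shifted in the
    recurrence class; the real data set is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)        # 1 = recurrence (hypothetical labels)
    X = rng.normal(size=(n_samples, n_genes))
    X[:, :n_informative] += 1.5 * y[:, None]
    return X, y


def fit_kmeans_random_forest(X_train, y_train, n_genes_kept=30, n_clusters=2, seed=0):
    """Cluster samples, select differentiating genes, then fit a random forest."""
    # Clustering step: k-means over the samples, using all genes.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X_train)
    # Differentiating genes (assumed criterion): top genes by ANOVA F-score.
    selector = SelectKBest(f_classif, k=n_genes_kept).fit(X_train, y_train)
    # Classification step: selected genes plus the cluster id as one extra feature.
    X_aug = np.column_stack([selector.transform(X_train), kmeans.labels_])
    forest = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X_aug, y_train)
    return selector, kmeans, forest


def predict_recurrence(models, X):
    """Apply the fitted cluster model, gene selector, and forest to new samples."""
    selector, kmeans, forest = models
    X_aug = np.column_stack([selector.transform(X), kmeans.predict(X)])
    return forest.predict(X_aug)


X, y = make_synthetic_expression()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
models = fit_kmeans_random_forest(X_train, y_train)
accuracy = float(np.mean(predict_recurrence(models, X_test) == y_test))
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The gene selector and the cluster model are fitted on the training split only, so the held-out samples do not influence which genes are kept. Accuracy on this synthetic data says nothing about performance on the study's data.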

