TY  - JOUR
AU  - Oliveira, Marcos de Souza
AU  - Queiroz, Sergio
PY  - 2020/04/27
Y2  - 2024/03/29
TI  - Unsupervised Feature Selection Methodology for Clustering in High Dimensionality Datasets
JF  - Revista de Informática Teórica e Aplicada
JA  - RITA
VL  - 27
IS  - 2
SE  - Regular Papers
DO  - 10.22456/2175-2745.96081
UR  - https://seer.ufrgs.br/index.php/rita/article/view/RITA_VOL27_NR2_30
SP  - 30-41
AB  - Feature selection is an important research area that seeks to eliminate unwanted features from datasets. Many feature selection methods have been proposed in the literature, but the best set of features is usually evaluated with supervised metrics, which require labels. In this work we propose a methodology that helps data specialists answer simple but important questions, such as: (1) do current feature selection methods give similar results? (2) is there a consistently better method? (3) how should the m best features be selected? (4) since the methods are not parameter-free, how should the best parameters be chosen in the unsupervised scenario? and (5) given different selection options, could we get better results by fusing the results of the methods, and if so, how should they be combined? We analyze these issues and propose a methodology that, building on several unsupervised methods, performs feature selection using strategies that make the process fully automatic and unsupervised on high-dimensional datasets. We then evaluate the results and find that they are better than those obtained using the selection methods in their standard configurations. Finally, we list further improvements that can be made in future work.
ER  - 