Variable selection in multivariate calibration considering non-decomposability assumption and building blocks hypothesis

Date

2018-12-06

Publisher

Universidade Federal de Goiás

Abstract

Variable selection is the procedure of choosing a subset of suitable features from a given dataset; it is important when the dataset contains a large number of variables, many of them redundant. Multivariate calibration combines variable selection with statistical techniques to build mathematical models that relate the data to a property of interest, so that the property can be predicted from the informative variables. In this context, variable selection techniques have been widely applied to the solution of several optimization problems. For instance, Genetic Algorithms (GAs) are easy to implement and consist of a population-based model that uses selection and recombination operators to generate new solutions. However, datasets in multivariate calibration usually exhibit a considerable degree of correlation among variables, which suggests that the problem is not properly decomposable. Moreover, some studies in the literature have claimed that the genetic operators used by GAs can disrupt the building blocks (BBs) of viable solutions. Therefore, this work argues that selecting variables in multivariate calibration is a not completely decomposable problem (hypothesis 1) and that recombination operators affect the non-decomposability assumption (hypothesis 2). Additionally, we propose two heuristics, one local search-based operator, and two versions of an Epistasis-based Feature Selection Algorithm (EbFSA) to improve model prediction performance and avoid BB disruption. Based on this investigation and the experimental results, we endorse the viability of our hypotheses and demonstrate that EbFSA can outperform some traditional algorithms.

Citation

PAULA, Lauro Cássio Martins de. Variable selection in multivariate calibration considering non-decomposability assumption and building blocks hypothesis. 2018. 116 f. Thesis (Doctorate in Ciência da Computação em Rede) - Universidade Federal de Goiás, Goiânia, 2018.