Gradient descent-based training of deep neural networks (DNNs) learns only the weights and biases. The remaining parameters, the hyperparameters, have a strong influence on model quality, but finding optimal values for them is not a trivial task. The hyperparameter space grows exponentially with the number of hyperparameters, which also exhibit non-linear effects and interactions. A network with twelve hyperparameters, each taking five potential values, yields roughly 244 million unique hyperparameter combinations. If training on each configuration takes 6 minutes, exhaustively training on all potential combinations to find the optimal values would take almost 3,000 years. Expert knowledge and random selection are alternative options, but they are neither scalable nor consistently reliable. Metaheuristics such as evolutionary algorithms are well suited to combinatorial optimization problems like hyperparameter optimization. While other researchers have applied evolutionary algorithms such as the standard genetic algorithm (GA), we introduce additional nature-inspired enhancements to the GA for better exploration of the hyperparameter solution space when optimizing the DNN architecture. The training is complemented with a Monte Carlo-based variance reduction method, importance sampling. We demonstrate that these refinements improve network accuracy on the MNIST and CIFAR-10 datasets, outperforming the standard use of the genetic algorithm.
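As a rough check of the combinatorial argument above (assuming, as stated, twelve hyperparameters with five candidate values each and a 6-minute training run per configuration), the cost of an exhaustive search works out as:

\[
5^{12} = 244{,}140{,}625 \ \text{configurations}, \qquad
\frac{244{,}140{,}625 \times 6\ \text{min}}{60 \times 24 \times 365} \approx 2{,}787\ \text{years}.
\]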