Spectroscopy data are useful for modelling biological systems. However, measuring a wide range of wavelengths is impractical in a production setting. Variable selection methods offer an efficient way to obtain an optimal model from fewer wavelengths and are the focus of this work. Near-infrared spectral data in the range 800–2500 nm were used to classify bruise damage in apples. Six machine learning classification algorithms were evaluated, and two variable selection methods were used to determine the wavelengths most relevant for distinguishing between bruised and non-bruised apples. The selected wavelengths clustered around 900 nm, 1200 nm and 1900 nm. The best results were achieved using linear regression and a support vector machine based on up to 40 wavelengths.
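
To make the idea of wavelength selection concrete, the following is a minimal sketch and not the procedure used in this work. It assumes a simple filter-style criterion (a per-wavelength Fisher score) in place of the two selection methods studied here, and it runs on synthetic spectra. The function names, the 100 nm sampling grid, and the choice of informative bands are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def fisher_scores(spectra, labels):
    """Score each wavelength: (difference of class means)^2 / (sum of class variances)."""
    scores = []
    for j in range(len(spectra[0])):
        bruised = [s[j] for s, y in zip(spectra, labels) if y == 1]
        sound = [s[j] for s, y in zip(spectra, labels) if y == 0]
        gap = statistics.mean(bruised) - statistics.mean(sound)
        spread = statistics.variance(bruised) + statistics.variance(sound)
        scores.append(gap * gap / spread if spread > 0 else 0.0)
    return scores


def select_wavelengths(wavelengths, spectra, labels, k):
    """Return the k highest-scoring wavelengths, sorted in ascending order."""
    scores = fisher_scores(spectra, labels)
    ranked = sorted(range(len(wavelengths)), key=lambda j: scores[j], reverse=True)
    return sorted(wavelengths[j] for j in ranked[:k])


# Synthetic stand-in for NIR spectra (assumption: 800-2500 nm on a 100 nm grid).
rng = random.Random(0)
wavelengths = list(range(800, 2501, 100))
informative = {900, 1200, 1900}  # hypothetical bands where the two classes differ


def fake_spectrum(label):
    """Simulate one spectrum; class 1 (bruised) is shifted only at informative bands."""
    return [
        rng.gauss(1.0 + (0.5 * label if w in informative else 0.0), 0.05)
        for w in wavelengths
    ]


labels = [i % 2 for i in range(200)]  # 1 = bruised, 0 = non-bruised
spectra = [fake_spectrum(y) for y in labels]

selected = select_wavelengths(wavelengths, spectra, labels, k=3)
print(selected)
```

In practice the reduced set of wavelengths would then be passed to a classifier such as a support vector machine, and the number of retained wavelengths `k` would be tuned against classification performance.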