Machine learning (ML) methods, while extremely popular, have been justifiably criticized for their opaque 'black box' nature. In energy transition research, data is becoming available at ever larger scales. Making real use of this data requires data science methods that take advantage of ML advances while respecting the underlying physical nature of the systems being analyzed. This talk will consider this challenge and present practical examples and methods related to energy in buildings.