In recent years, ancillaries have become a key driver of revenue growth in the travel industry. Traditionally, pricing and offer generation for ancillary items have been managed through static business rules. In such settings, where historical prices show little or no variation, standard methods of estimating purchase probabilities and then optimizing the price are not applicable. In this study, we develop practical approaches to dynamic pricing of ancillaries based on reinforcement learning ideas. We propose a contextual bandit model for dynamic pricing of ancillaries that incorporates trip and customer features. The pricing setting poses significant challenges for the multi-armed bandit framework because the arms, which correspond to candidate prices, are highly correlated. To capture this correlation across arms, we adopt a Bayesian logistic bandit framework and use variational Bayes methods to construct fast, scalable algorithms for this setting. Through simulations, we demonstrate that these methods efficiently discover optimal prices and can provide a significant revenue lift.
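To make the setup concrete, the following is a minimal sketch, not the authors' implementation, of a contextual Bayesian logistic bandit for ancillary pricing. It assumes Thompson sampling as the exploration rule (a common choice for Bayesian bandits), a hypothetical feature map and price grid, and a diagonal-Gaussian posterior updated with a simple Laplace-style step as a stand-in for the paper's variational Bayes procedure. Arm correlation is captured by letting all candidate prices share one weight vector, with the price entering the feature map.

```python
# Sketch: Thompson sampling for ancillary pricing with a shared Bayesian
# logistic purchase model and a diagonal-Gaussian posterior approximation.
# Feature map, price grid, prior, and update rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

CANDIDATE_PRICES = np.array([10.0, 15.0, 20.0, 25.0, 30.0])  # hypothetical price grid


def phi(context, price):
    """Joint feature map over trip/customer context and the offered price."""
    return np.concatenate([context, [price / CANDIDATE_PRICES.max(), 1.0]])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class BayesianLogisticPricer:
    def __init__(self, dim, prior_var=1.0):
        self.mean = np.zeros(dim)                   # approximate posterior mean
        self.prec = np.full(dim, 1.0 / prior_var)   # diagonal posterior precision

    def choose_price(self, context):
        # Thompson sampling: draw one weight vector from the approximate
        # posterior and offer the price maximizing expected revenue.
        w = rng.normal(self.mean, 1.0 / np.sqrt(self.prec))
        expected_revenue = [p * sigmoid(w @ phi(context, p)) for p in CANDIDATE_PRICES]
        return float(CANDIDATE_PRICES[int(np.argmax(expected_revenue))])

    def update(self, context, price, purchased):
        # One diagonal Newton (Laplace-style) step toward the new posterior,
        # treating the current approximation as the prior for this observation.
        x = phi(context, price)
        pred = sigmoid(self.mean @ x)
        curvature = pred * (1.0 - pred) * x ** 2
        self.mean += (purchased - pred) * x / (self.prec + curvature)
        self.prec += curvature


# Usage: simulate a stream of customers with a hypothetical "true" purchase model.
dim = 3 + 2  # 3 context features + price feature + bias
pricer = BayesianLogisticPricer(dim)
true_w = np.array([0.8, -0.5, 0.3, -2.0, 1.0])
for _ in range(1000):
    ctx = rng.normal(size=3)
    price = pricer.choose_price(ctx)
    buy = rng.random() < sigmoid(true_w @ phi(ctx, price))
    pricer.update(ctx, price, float(buy))
```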