Deep learning snowy images

Last week I started to play with the Caffe deep learning framework. I initially planned on using the SegNet branch of the Caffe framework to classify snow in PhenoCam images. However, given that this is a rather binary classification, I don’t need to segment the picture (I do not care where the snow in the image is, only whether it is present). As such, a more semantic approach, one that labels the scene as a whole, could be used.

Luckily, people at MIT had already trained a classifier, the Places-CNN, which deals with exactly this problem: characterizing an image scene. So, instead of training my own classifier, I gave theirs a try. Depending on the image type, and mostly the view angle, the results are very encouraging (even with their stock model).

For example, the image below was classified as: mountain snowy, ski slope, snowfield, valley, ski_resort. This all seems very reasonable. Classifying a year's worth of images at this site yielded an accuracy of 89% (compared to human observations).

 

[Image: 24721196381_042bb62b7e_z]

However, when the vantage point changes, so does the accuracy of the classification. I presume this is mainly due to the lack of images of this sort in the original training data set. The image below was classified as: rainforest, tree farm, snowy mountain, mountain, cultivated field. As expected, the classification accuracy dropped to a mere 13%. There is still room for improvement using PhenoCam-based training data, but building upon the work by the group at MIT should make these improvements easier.
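
To make the accuracy figures above a bit more concrete, here is a minimal sketch of how a list of scene labels could be turned into a snow / no snow flag and compared against human observations. It is not necessarily the procedure used for the numbers in this post: the keyword list, the `top_k` cut-off and the function names `is_snowy` and `accuracy` are all assumptions made for illustration.

```python
# Hypothetical keywords; the mapping from scene labels to snow / no snow
# is an assumption for this sketch.
SNOW_KEYWORDS = ("snow", "ski")


def is_snowy(top_labels, top_k=5, keywords=SNOW_KEYWORDS):
    """Return True if any of the first top_k labels contains a snow keyword."""
    return any(
        keyword in label.lower()
        for label in top_labels[:top_k]
        for keyword in keywords
    )


def accuracy(predictions, observations):
    """Fraction of images where the predicted flag matches the human one."""
    if len(predictions) != len(observations):
        raise ValueError("predictions and observations differ in length")
    if not predictions:
        raise ValueError("no images to compare")
    matches = sum(p == o for p, o in zip(predictions, observations))
    return matches / len(predictions)


# Labels returned for the two example images in this post.
first = ["mountain snowy", "ski slope", "snowfield", "valley", "ski_resort"]
second = ["rainforest", "tree farm", "snowy mountain", "mountain",
          "cultivated field"]
```

With these assumptions both example images are flagged as snowy when all five labels are considered, while `is_snowy(second, top_k=2)` is not, since "snowy mountain" only shows up in third place. How strict this cut-off should be is a design choice that would have to be tuned against the human observations.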