Reflection on Mar.24 Slice of Data Science

Victoria Yuanyuan Chang

Today’s Slice of Data Science meeting discussed convolutional neural networks, transfer learning and stacked generalization. Regular CNN achieves feature extraction by convolving filters over an image to detect characteristics of the image, thus labeling the image accordingly. However, training such model may take a long time and requires a large set of training data. In this case, transfer learning can be an accurate and efficient method to use. Transfer learning refers to transferring models that have been previously trained on other datasets on to the data we are using. For example, as Clare introduced in her presentation, her team used previously trained CNNs to predict the quality of road with satellite imagery. They first removed the original classifiers since they were relevant only to previous datasets on which the model was trained. Then they built their custom classifiers while keeping some of the original layers of the model, so that the pre-trained weight remained. This way they didn’t have to spend time training the model’s weight using their data, saving time while tasing accuracy of the model. In the case of the road quality detection, Clare’s team froze some layers of the model re-trained the rest because the model they used had already been used to extract features of road images. On the other hand, if the model is previously trained on a very different dataset from the data to be predicted, it might be better not to freeze the layers and train the entire model. If the data to be predicted is relatively small in size, it’s a good idea to freeze all the layers and use the pre-trained weight in order to prevent the problem of overfitting.

Stacked generalization is another method to improve the accuracy of neural networks. Such structures use multiple models to predict the same images. Stacked together, these models vote for their predictions. Because the models employ different algorithms and detect different features, their aggregate decision can be more accurate than the prediction made by one single model.

On top of stacked generalization, model inception can be employed to further increase the accuracy of the predictions. In model inception, a model takes in the stacked votes of different models in stacked generalization as input and is trained to predict which of the votes are the most accurate.