Project 4
Victoria Yuanyuan Chang
1. Problem Statement
The current project aims to identify dog breeds from images. The problem I wish to solve is the identification and recording of stray dogs. If someone finds a stray dog in need of rescue or adoption, they can take a picture of the dog and the model can identify its breed. Such a model, with more refinement, also has the potential to identify lost pets caught on security cameras. Moreover, dogs are just cute and they deserve the attention of data science people. The complexity of this project lies in the variety of dog breeds. To fully incorporate all dog breeds, the model needs to be able to classify images into hundreds of categories. What's more, the input images can have varied backgrounds that interfere with identification, and different dog breeds can differ only subtly in appearance. In conclusion, the central research question is how to build a model that identifies a dog's breed from an image of the dog.
2. Description of Data
The training and testing data is a subset of a dataset from Kaggle (https://www.kaggle.com/c/dog-breed-identification/data). The original data includes one .csv file that maps each image to its breed name and 10,222 .jpg images of various dogs. The .csv file consists of one column that records the image filename and another that records its breed. The images are everyday photos of all kinds of dogs against different backgrounds (some include other objects/animals/humans) rather than staged stock images. These images fit the scenario in which the model could potentially be applied better than stock images would. Sample images are shown below. The sizes of the images vary, but they are mostly under 50KB. In consideration of this preliminary model's limited capacity, I segmented the original data by selecting only the 10 most common breeds (out of the 120 breeds it contains). This subset contains 1141 images and 10 categories.
Sample Images:
Sample .csv file (label names):
| id | breed |
| --- | --- |
| 000bec180eb18c7604dcecc8fe0dba07 | boston_bull |
| 001513dfcb2ffafc82cccf4d8bbaba97 | dingo |
| 001cdf01b096e06d78e9e5112d419397 | pekinese |
| 00214f311d5d2247d5dfe4fe24b2303d | bluetick |
| 0021f9ceb3235effd7fcde7f7538ed62 | golden_retriever |
| 002211c81b498ef88e1b40b9abf84e1d | bedlington_terrier |
| 00290d3e1fdd27226ba27a8ce248ce85 | bedlington_terrier |
| 002a283a315af96eaea0e28e7163b21b | borzoi |
| 003df8b8a8b05244b1d920bb6cf451f9 | basenji |
The top 10 most common breeds are shown below. These are the labels used for the current model.
Index(['afghan_hound', 'basenji', 'bernese_mountain_dog', 'entlebucher',
       'great_pyrenees', 'maltese_dog', 'pomeranian', 'samoyed',
       'scottish_deerhound', 'shih-tzu'])
The 10 selected breeds, as shown above, are commonly seen pet breeds; therefore, I think the model would have some practical value if it can predict these categories accurately.
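For reference, a minimal sketch of how this subset could be produced, assuming the Kaggle labels file has been read into a pandas DataFrame (the file name labels.csv and the variable names here are illustrative):

import pandas as pd

labels = pd.read_csv('labels.csv')                     # columns: id, breed
top_breeds = labels['breed'].value_counts().head(10).index
subset = labels[labels['breed'].isin(top_breeds)]      # 1141 rows across 10 breeds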
3. Specification for the applied machine learning method
Since the aim of this project is to label images, I believe the method that shows the most promise in providing a solution is a CNN (convolutional neural network). I first encoded the labels into indicator variables using pd.get_dummies (the resulting dataframe is shown below).
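A minimal sketch of this encoding step, assuming the filtered labels are in a DataFrame named subset as in the sketch above:

# One indicator (dummy) column per breed; each row is a one-hot label vector
one_hot = pd.get_dummies(subset['breed'])
y = one_hot.values                                     # shape (1141, 10)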
I then converted the images into pixel arrays and performed data augmentation using ImageDataGenerator from tensorflow.keras:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation for training images: random rotations, shifts, shears, zooms,
# and horizontal flips, with pixel values rescaled to [0, 1]
train_datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# Test images are only rescaled, not augmented
test_datagen = ImageDataGenerator(rescale=1./255)
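The conversion of the images into pixel arrays is not reproduced in the code above; a minimal sketch of that step, assuming the images sit in a train/ folder and are resized to 299x299 to match the model's input shape (the folder name and variable names are illustrative):

import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Load each selected image and stack the pixel values into a single array
X = np.array([
    img_to_array(load_img(f"train/{img_id}.jpg", target_size=(299, 299)))
    for img_id in subset['id']
])                                                     # shape (1141, 299, 299, 3)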
Finally, after splitting the data into training and testing sets with train_test_split (tts), I built the CNN model using Keras.
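The split and the construction of the training_set and testing_set generators used when fitting the model are not shown explicitly; a minimal sketch, assuming the X array and one-hot y from the sketches above (the split fraction and batch size are illustrative):

from sklearn.model_selection import train_test_split

# Hold out a portion of the images for validation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Wrap the arrays in the generators defined above so augmentation is applied on the fly
training_set = train_datagen.flow(X_train, y_train, batch_size=64)
testing_set = test_datagen.flow(X_test, y_test, batch_size=64)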
Model summary:
Model architecture, layers, functional arguments, and the specifications for compiling and fitting are shown in the code below:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import ZeroPadding2D, Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()
# Two padded 3x3 convolution layers followed by max pooling
model.add(ZeroPadding2D((1, 1), input_shape=(299, 299, 3)))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(ZeroPadding2D(padding=(1, 1)))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
# Fully connected head with dropout and a 10-way softmax output
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

epochs = 30
# fit_generator is deprecated in newer TensorFlow versions; model.fit accepts generators directly
history = model.fit_generator(training_set,
                              steps_per_epoch=16,
                              validation_data=testing_set,
                              validation_steps=4,
                              epochs=epochs,
                              verbose=1)
4. Assessment of Model Performance
The performance and accuracy of the model are shown in the graph below (30 epochs). Although CNNs and image classification are a well-studied field, little literature is available on the particular narrow topic I am interested in. However, as the original data comes from a Kaggle programming competition, I found that the highest-rated submission achieved an accuracy of 0.98 using pretrained Keras models (which are too complicated for me to understand or draw lessons from). Compared to this top-rated model, my model still needs much improvement.
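The accuracy curves referenced above are not reproduced here; a minimal sketch of how they could be plotted from the history object returned by the fitting call (the key names may be 'acc'/'val_acc' in older Keras versions):

import matplotlib.pyplot as plt

# Training vs. validation accuracy over the 30 epochs
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()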