Investigating Transfer Learning and Scratch Training for Image Recognition

Overview

This project explores image classification with convolutional neural networks (CNNs). The goal is to classify images into three categories: apparel, chairs, and footwear. The project covers:

  • Data collection and labeling
  • Splitting data into train, validation, and test sets
  • Building input pipelines with data augmentation
  • Fine-tuning a pretrained CNN model (ResNet50)
  • Training a CNN from scratch

Data Collection and Labeling

  • Collected over 100 images per class (apparel, chairs, footwear) using a phone camera
  • Assigned each image an integer category label: 0 for apparel, 1 for chairs, 2 for footwear

Data Splitting

  • Split data into 60% train, 20% validation, 20% test per class
  • Copied split data into separate folders for train, validation, and test
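
The per-class split could be sketched roughly as follows, using only the Python standard library. The function names, the folder names (`train`, `validation`, `test`), and the fixed random seed are illustrative assumptions, not details taken from the original project.

```python
import random
import shutil
from pathlib import Path


def split_class(files, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle one class's files and split them 60/20/20 (hypothetical helper)."""
    files = sorted(files)                  # fixed order so the seed is reproducible
    random.Random(seed).shuffle(files)
    n_train = round(len(files) * train_frac)
    n_val = round(len(files) * val_frac)
    return {
        "train": files[:n_train],
        "validation": files[n_train:n_train + n_val],
        "test": files[n_train + n_val:],   # whatever remains (about 20%)
    }


def copy_split(source_dir, target_dir):
    """Copy <source>/<class>/* into <target>/<split>/<class>/ (assumed layout)."""
    class_dirs = sorted(p for p in Path(source_dir).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        images = [p for p in class_dir.iterdir() if p.is_file()]
        for split_name, paths in split_class(images).items():
            dest = Path(target_dir) / split_name / class_dir.name
            dest.mkdir(parents=True, exist_ok=True)
            for path in paths:
                shutil.copy2(path, dest / path.name)
```

Splitting each class separately keeps the three categories equally represented in every split, which matters for a dataset of only a few hundred images.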

Input Pipeline and Data Augmentation

  • Used Keras image_dataset_from_directory to load images as datasets
  • Added random flips and rotations for data augmentation
  • Defined a pipeline that applies these augmentations to each element of the training dataset in parallel
  • Used prefetch so that data loading overlaps with training and I/O does not block
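
A minimal sketch of such a pipeline is shown below. The image size, batch size, flip mode, and rotation factor are assumed values, and `load_split` and `prepare` are hypothetical helper names; the write-up does not state the settings that were actually used.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)  # assumed; not stated in the write-up
BATCH_SIZE = 32        # assumed

# Random flips and rotations; the mode and factor are illustrative choices.
augmenter = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])


def load_split(directory, shuffle):
    """Load one split folder; labels are inferred from the subfolder names
    in alphanumeric order (apparel=0, chairs=1, footwear=2)."""
    return tf.keras.utils.image_dataset_from_directory(
        directory,
        image_size=IMG_SIZE,
        batch_size=BATCH_SIZE,
        shuffle=shuffle,
    )


def prepare(dataset, training):
    """Augment training batches in parallel, then prefetch."""
    if training:
        dataset = dataset.map(
            lambda images, labels: (augmenter(images, training=True), labels),
            num_parallel_calls=tf.data.AUTOTUNE,
        )
    return dataset.prefetch(tf.data.AUTOTUNE)
```

With this layout, something like `train_ds = prepare(load_split("data/train", shuffle=True), training=True)` would yield augmented batches, while the validation and test sets would pass through `prepare(..., training=False)` unchanged. The `data/train` path is a placeholder.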

Fine-tuning ResNet50

  • Used transfer learning with a pretrained ResNet50 model
  • Froze the base layers and added GlobalAveragePooling and Dense layers on top
  • Achieved 98.46% test accuracy after 5 epochs of fine-tuning
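
The setup described above might look roughly like the following sketch. It assumes ImageNet weights, 224x224 inputs, a single softmax Dense layer, and the Adam optimizer with default settings; none of these details are confirmed by the write-up, and `build_finetune_model` and `preprocess_batch` are hypothetical names.

```python
import tensorflow as tf

NUM_CLASSES = 3        # apparel, chairs, footwear
IMG_SIZE = (224, 224)  # assumed input size


def preprocess_batch(images, labels):
    """Apply the ResNet50 input preprocessing to a batch (assumed step)."""
    return tf.keras.applications.resnet50.preprocess_input(images), labels


def build_finetune_model(weights="imagenet"):
    """Frozen ResNet50 base with a small classification head."""
    base = tf.keras.applications.ResNet50(
        include_top=False, weights=weights, input_shape=IMG_SIZE + (3,)
    )
    base.trainable = False  # freeze every layer of the pretrained base

    inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
    x = base(inputs, training=False)  # keep batch norm in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",  # assumed optimizer for this model
        loss="sparse_categorical_crossentropy",  # integer labels 0, 1, 2
        metrics=["accuracy"],
    )
    return model
```

Under these assumptions, training would be a call such as `model.fit(train_ds.map(preprocess_batch), validation_data=val_ds.map(preprocess_batch), epochs=5)`, where `train_ds` and `val_ds` stand for the training and validation datasets. Because the base is frozen, only the Dense layer's weights are updated.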

Training CNN from Scratch

  • Built a CNN model with convolutional, pooling, batch normalization, and dropout layers
  • Trained for 10 epochs using Adam optimizer
  • Achieved 70% test accuracy
  • With such a small dataset, a model trained from scratch struggles to learn robust features, whereas a pretrained model starts with them already learned. More data would be needed to improve accuracy.
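
For illustration, a small CNN built from the layer types listed above could be sketched like this. The number of blocks, filter counts, dropout rate, and input size are all assumptions, since the write-up does not give the actual architecture, and `build_scratch_cnn` is a hypothetical name.

```python
import tensorflow as tf

NUM_CLASSES = 3  # apparel, chairs, footwear


def build_scratch_cnn(input_shape=(224, 224, 3)):
    """Small CNN trained from random initialization (illustrative layout)."""
    layers = tf.keras.layers
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Rescaling(1.0 / 255),  # scale pixel values to [0, 1]
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.5),  # assumed rate
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",  # Adam, as stated above
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Training would then be `model.fit(train_ds, validation_data=val_ds, epochs=10)`, where `train_ds` and `val_ds` stand for the training and validation datasets. Every weight here starts from random values, so the network has to learn edge and texture detectors from a few hundred images, which is consistent with the lower accuracy reported.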

Conclusion

This project demonstrated image classification with CNNs. Fine-tuning a pretrained ResNet50 reached 98.46% test accuracy despite the small dataset, while the CNN trained from scratch reached a lower 70%. Both models learned to separate apparel, chair, and footwear images, but transfer learning was clearly the better fit for a dataset of this size.

References

  • Dataset collected using a phone camera. Location: Indiana University Bloomington Campus
  • Models built with TensorFlow/Keras
Gauri Chaudhari
