Investigating Transfer Learning and Scratch Training for Image Recognition
![](/project/investigating-transfer-learning-and-scratch-training-for-image-recognition/featured_hu6890f83a518f4014d67ea7415281d263_64492_720x0_resize_lanczos_3.png)
Overview
This project explores image classification using convolutional neural networks (CNNs). The goal is to classify images into three categories - apparel, chairs, and footwear. The project covers:
- Data collection and labeling
- Splitting data into train, validation, and test sets
- Building input pipelines with data augmentation
- Fine-tuning a pretrained CNN model (ResNet50)
- Training a CNN from scratch
Data Collection and Labeling
- Collected over 100 images per class (apparel, chairs, footwear) using a phone camera
- Labeled images and assigned category labels - 0 for apparel, 1 for chairs, 2 for footwear
Data Splitting
- Split data into 60% train, 20% validation, 20% test per class
- Copied split data into separate folders for train, validation, and test
Input Pipeline and Data Augmentation
- Used Keras image_dataset_from_directory to load images as datasets
- Added random flips and rotations for data augmentation
- Defined a pipeline that applies these data augmentation to each element of the training data set in parallel.
- I am also using
_prefetch_
to avoid I/O blocking.
Fine-tuning ResNet50
- Used transfer learning with pretrained ResNet50 model
- Froze base layers, added GlobalAveragePooling and Dense layers
- Achieved 98.46% test accuracy after 5 epochs of fine-tuning
Training CNN from Scratch
- Built a CNN model with convolutional, pooling, batch norm, dropout layers
- Trained for 10 epochs using Adam optimizer
- Achieved 70% test accuracy
- Training a model from scratch on a small dataset makes it difficult to learn robust features compared to fine-tuning a pretrained model. More data would be needed to improve accuracy.
Conclusion
This project demonstrated classifying images using CNNs. Fine-tuning a pretrained CNN model like ResNet50 achieved high accuracy given the small dataset size. Training a CNN from scratch produced decent but lower accuracy. Overall, the models were able to effectively classify the apparel, chair, and footwear images.
References
- Dataset collected using a phone camera. Location: Indiana University Bloomington Campus
- Models built with TensorFlow/Keras