Operation Blossom

NAME OF PROJECT
Operation Blossom

PROJECT DESCRIPTION
Vision Transformers (ViT) are an up-and-coming technology, and there are claims that they will replace Convolutional Neural Networks (CNN). We want to test this claim ourselves by experimenting with a flower image dataset. The task is to identify the flower in an image. We will build a Streamlit app that runs both a CNN and a ViT and compares the accuracy and duration of the two models. The model predictions will be displayed along with the training and testing results of both the CNN and the ViT.

GRANT DELIVERABLES

  1. Run a CNN and tune the model
  2. Run and tune a ViT model
  3. Build a Streamlit app allowing users to upload images
  4. Build a pipeline that takes users' images and makes predictions
  5. Display the training process of both the CNN and the ViT
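
To make the comparison in the deliverables concrete, here is a minimal sketch of how accuracy and prediction time could be measured for both models on the same labelled images. It is only an illustration under assumptions: the names `evaluate` and `compare` are hypothetical, and the two lambdas are placeholders standing in for the trained CNN and ViT, which are not part of this proposal text.

```python
import time
from typing import Any, Callable, Dict, Sequence


def evaluate(predict: Callable[[Any], str],
             images: Sequence[Any],
             labels: Sequence[str]) -> Dict[str, float]:
    """Return accuracy and wall-clock prediction time for one model.

    `predict` is any callable mapping a single image to a label
    (hypothetical interface, assumed for this sketch).
    """
    if len(images) != len(labels) or len(labels) == 0:
        raise ValueError("images and labels must be non-empty and the same length")

    start = time.perf_counter()
    predictions = [predict(image) for image in images]
    elapsed = time.perf_counter() - start

    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return {"accuracy": correct / len(labels), "seconds": elapsed}


def compare(models: Dict[str, Callable[[Any], str]],
            images: Sequence[Any],
            labels: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Evaluate every model on the same data so the numbers are comparable."""
    return {name: evaluate(predict, images, labels)
            for name, predict in models.items()}


if __name__ == "__main__":
    # Placeholder data and models: real code would load flower images
    # and call the trained CNN and ViT here.
    images = ["img_0", "img_1", "img_2", "img_3"]
    labels = ["rose", "tulip", "rose", "daisy"]
    models = {
        "cnn": lambda image: "rose",   # stand-in, always predicts "rose"
        "vit": lambda image: "tulip",  # stand-in, always predicts "tulip"
    }
    for name, result in compare(models, images, labels).items():
        print(f"{name}: accuracy={result['accuracy']:.2f}, "
              f"seconds={result['seconds']:.6f}")
```

The resulting dictionary could then be handed to the Streamlit app for display. Running both models through the same function on the same images keeps the accuracy and timing figures directly comparable.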

SQUAD
Ren W

Tariq R

  • Twitter: @taraqur
  • Discord: the_proton_crusher#8317
  • ETH: 0x32aE0C5b4e34340e4e1550a92d0B6206c68663A3

Hi,
It appears that this is a topic that has been investigated to some degree of detail. Have you taken a look at, for instance: https://towardsdatascience.com/vision-transformers-or-convolutional-neural-networks-both-de1a2c3c62e4
or https://towardsdatascience.com/are-transformers-better-than-cnns-at-image-recognition-ced60ccc7c8 and the links in them?

Hi,

Thanks for the info. We were aware that it is not a novel idea, but we want to implement it ourselves. We are planning to display the results interactively in Streamlit.