Operation Blossom

the_proton_crusher · November 11, 2022, 3:16am

NAME OF PROJECT
Operation Blossom

PROJECT DESCRIPTION
Vision Transformers (ViT) are an up-and-coming technology, and there are claims that they will replace Convolutional Neural Networks (CNN). We want to test this claim ourselves by experimenting with a flower image dataset. The task is to identify the flower in an image. We will build a Streamlit app that runs both a CNN and a ViT and compares their accuracy and duration. The model predictions will be displayed along with the training and testing results of both the CNN and the ViT.

GRANT DELIVERABLES

Run a CNN and tune the model
Run and tune a ViT model
Build a Streamlit app allowing users to upload images
Build a pipeline that takes users' images and makes predictions
Display the training process of both the CNN and the ViT
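
As a rough illustration of the accuracy and duration comparison, here is a minimal Python sketch. It is not the project's implementation: the function names `evaluate` and `compare`, the stub predictors, and the sample file names are all hypothetical, and the stubs stand in for the trained CNN and ViT. The numbers they produce say nothing about how the real models perform.

```python
import time
from typing import Any, Callable, Dict, Sequence


def evaluate(predict: Callable[[Any], str],
             images: Sequence[Any],
             labels: Sequence[str]) -> Dict[str, float]:
    """Return accuracy and total prediction time for one model."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    start = time.perf_counter()
    predictions = [predict(image) for image in images]
    elapsed = time.perf_counter() - start
    correct = sum(p == y for p, y in zip(predictions, labels))
    accuracy = correct / len(labels) if labels else 0.0
    return {"accuracy": accuracy, "seconds": elapsed}


def compare(models: Dict[str, Callable[[Any], str]],
            images: Sequence[Any],
            labels: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Evaluate every model on the same images and labels."""
    return {name: evaluate(fn, images, labels) for name, fn in models.items()}


# Hypothetical stand-ins for the trained models (assumptions, not real models).
def stub_a(image: str) -> str:
    return "rose"  # always predicts the same class


def stub_b(image: str) -> str:
    return image.split("_")[0]  # reads the class from the made-up file name


# Made-up sample data, used only to exercise the functions.
sample_images = ["rose_01", "tulip_01", "daisy_01", "rose_02"]
sample_labels = ["rose", "tulip", "daisy", "rose"]

if __name__ == "__main__":
    results = compare({"stub_a": stub_a, "stub_b": stub_b},
                      sample_images, sample_labels)
    for name, metrics in results.items():
        print(f"{name}: accuracy={metrics['accuracy']:.2f}, "
              f"seconds={metrics['seconds']:.6f}")
```

In the app, the two stubs would be replaced by the CNN and ViT prediction functions, and the returned dictionary could be shown side by side in Streamlit.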

SQUAD
Ren W

Twitter: https://twitter.com/renweview
Discord: tea first#5268

Tariq R

Twitter: @taraqur
Discord: the_proton_crusher#8317
ETH: 0x32aE0C5b4e34340e4e1550a92d0B6206c68663A3

antaraxia · November 11, 2022, 3:49pm

Hi,
It appears that this topic has already been investigated in some detail. Have you taken a look at, for instance: https://towardsdatascience.com/vision-transformers-or-convolutional-neural-networks-both-de1a2c3c62e4
or https://towardsdatascience.com/are-transformers-better-than-cnns-at-image-recognition-ced60ccc7c8 and the links in them?

the_proton_crusher · November 11, 2022, 7:01pm

Hi,

Thanks for the info. We were aware that it is not a novel idea, but we are trying to implement it ourselves. We are planning to display the results in an interactive way in Streamlit.