Solega Co. Done For Your E-Commerce solutions.

Explanation of Vision Transformer with implementation | by Hiroaki Kubo | Nov, 2024

Solega Team by Solega Team
November 4, 2024
in Artificial Intelligence


First, we reshape the image into a sequence of flattened 2D patches. Here image_size is the width and height of the image. The code is as follows.

import torch
import torchvision.transforms as T
from PIL import Image

image_size = 224
channel_size = 3
image = Image.open('sample.png').resize((image_size, image_size))
X = T.PILToTensor()(image)  # Shape is [channel_size, image_size, image_size]
patch_size = 16
patches = (
    X.unfold(0, channel_size, channel_size)
    .unfold(1, patch_size, patch_size)
    .unfold(2, patch_size, patch_size)
)  # Shape is [1, image_size/patch_size, image_size/patch_size, channel_size, patch_size, patch_size]
patches = (
    patches.contiguous()
    .view(patches.size(0), -1, channel_size * patch_size * patch_size)
    .float()
)  # Shape is [1, Number of patches, channel_size*patch_size*patch_size]

The result is a matrix in which each patch is a flattened vector of channel_size*patch_size*patch_size values.
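The shapes above can be checked with a quick sketch; a random tensor stands in for the loaded image so no file is needed (the sizes are the same ones used above).

```python
import torch

image_size, channel_size, patch_size = 224, 3, 16

# Random stand-in for a [channel, height, width] image tensor.
X = torch.randn(channel_size, image_size, image_size)

patches = (
    X.unfold(0, channel_size, channel_size)   # -> [1, 224, 224, 3]
    .unfold(1, patch_size, patch_size)        # -> [1, 14, 224, 3, 16]
    .unfold(2, patch_size, patch_size)        # -> [1, 14, 14, 3, 16, 16]
)
patches = (
    patches.contiguous()
    .view(patches.size(0), -1, channel_size * patch_size * patch_size)
    .float()
)
print(patches.shape)  # torch.Size([1, 196, 768]): 196 = 14*14 patches, 768 = 3*16*16
```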

Next, because the Transformer uses a constant latent vector size D through all of its layers, we map the patches to D dimensions with a trainable linear projection (Eq. 1).

We initialize E as follows.

self.E = nn.Parameter(
    torch.randn(patch_size * patch_size * channel_size, embedding_dim)
)

We project the patches by computing the matrix product.

patch_embeddings = torch.matmul(patches, self.E)
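As a rough end-to-end check of the projection step, the sketch below multiplies random patches by E. The value embedding_dim = 768 is an assumption (it matches ViT-Base); the article does not fix a value.

```python
import torch
import torch.nn as nn

channel_size, patch_size, embedding_dim = 3, 16, 768
num_patches = (224 // patch_size) ** 2  # 14 * 14 = 196

# Random stand-in for the flattened patches from the previous step.
patches = torch.randn(1, num_patches, channel_size * patch_size * patch_size)

# Trainable projection matrix E, as in Eq. 1.
E = nn.Parameter(
    torch.randn(patch_size * patch_size * channel_size, embedding_dim)
)

patch_embeddings = torch.matmul(patches, E)
print(patch_embeddings.shape)  # torch.Size([1, 196, 768])
```

Each of the 196 patch vectors is now a D-dimensional embedding, ready for the class token and position embeddings to be added before the Transformer encoder.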




© 2024 Solega, LLC. All Rights Reserved | Solega.co
