Solega Co. Done For Your E-Commerce solutions.
  • Home
  • E-commerce
  • Start Ups
  • Project Management
  • Artificial Intelligence
  • Investment
  • More
    • Cryptocurrency
    • Finance
    • Real Estate
    • Travel

Explanation of Vision Transformer with implementation | by Hiroaki Kubo | Nov, 2024

By Solega Team
November 4, 2024
in Artificial Intelligence
Reading Time: 3 mins read


First, we reshape the image into a sequence of flattened 2D patches. Here, image_size is the width and height of the image. The code is as follows.

from PIL import Image
import torch
import torchvision.transforms as T

image_size = 224
channel_size = 3
image = Image.open('sample.png').resize((image_size, image_size))
X = T.PILToTensor()(image)  # Shape is [channel_size, image_size, image_size]

patch_size = 16
patches = (
    X.unfold(0, channel_size, channel_size)
    .unfold(1, patch_size, patch_size)
    .unfold(2, patch_size, patch_size)
)  # Shape is [1, image_size/patch_size, image_size/patch_size, channel_size, patch_size, patch_size]
patches = (
    patches.contiguous()
    .view(patches.size(0), -1, channel_size * patch_size * patch_size)
    .float()
)  # Shape is [1, number of patches, channel_size*patch_size*patch_size]
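Since the snippet above depends on a local image file, here is a minimal, self-contained check of the same unfold-based patch extraction, using a random tensor in place of the loaded image (with image_size = 224 and patch_size = 16 as above):

```python
import torch

# Stand-in for the PIL image: a random [channel, height, width] tensor.
image_size, channel_size, patch_size = 224, 3, 16
X = torch.randn(channel_size, image_size, image_size)

# Cut the image into non-overlapping patch_size x patch_size patches.
patches = (
    X.unfold(0, channel_size, channel_size)
    .unfold(1, patch_size, patch_size)
    .unfold(2, patch_size, patch_size)
)
print(patches.shape)  # torch.Size([1, 14, 14, 3, 16, 16])

# Flatten each patch into a single vector of length 3*16*16 = 768.
patches = (
    patches.contiguous()
    .view(patches.size(0), -1, channel_size * patch_size * patch_size)
    .float()
)
print(patches.shape)  # torch.Size([1, 196, 768])
```

With 224/16 = 14 patches per side, we get 14 * 14 = 196 patches in total, each flattened to 768 values.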

Finally, we obtain a matrix in which each patch carries channel_size*patch_size*patch_size values.

Next, the Transformer uses a constant latent vector size D through all of its layers, so we map the patches to D dimensions with a trainable linear projection (Eq. 1).

We initialize E as follows.

self.E = nn.Parameter(
    torch.randn(patch_size * patch_size * channel_size, embedding_dim)
)

We project the patches by calculating the matrix product.

patch_embeddings = torch.matmul(patches, self.E)
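The parameter E and the matrix product above can be gathered into one small module. This is a sketch of one plausible arrangement, not the author's full implementation; the class name PatchEmbedding and the default embedding_dim=768 are assumptions:

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    # Hypothetical wrapper combining the trainable projection E with the
    # matmul from the text; the name and defaults here are assumptions.
    def __init__(self, patch_size=16, channel_size=3, embedding_dim=768):
        super().__init__()
        self.E = nn.Parameter(
            torch.randn(patch_size * patch_size * channel_size, embedding_dim)
        )

    def forward(self, patches):
        # patches: [batch, num_patches, patch_size*patch_size*channel_size]
        # returns: [batch, num_patches, embedding_dim]
        return torch.matmul(patches, self.E)

embed = PatchEmbedding()
patches = torch.randn(1, 196, 16 * 16 * 3)
out = embed(patches)
print(out.shape)  # torch.Size([1, 196, 768])
```

Note that nn.Linear(patch_size * patch_size * channel_size, embedding_dim) would be the more idiomatic PyTorch equivalent (and adds a bias term), but the explicit parameter matrix matches Eq. 1 directly.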






© 2024 Solega, LLC. All Rights Reserved | Solega.co
