Solega Co. Done For Your E-Commerce solutions.

Explanation of Vision Transformer with implementation | by Hiroaki Kubo | Nov, 2024

By Solega Team
November 4, 2024
in Artificial Intelligence


First, we reshape the image into a sequence of flattened 2D patches. The code is as follows; image_size is the width and height of the image.

from PIL import Image
import torchvision.transforms as T

image_size = 224
channel_size = 3
# convert('RGB') guarantees channel_size channels even for grayscale or RGBA files
image = Image.open('pattern.png').convert('RGB').resize((image_size, image_size))
X = T.PILToTensor()(image)  # Shape is [channel_size, image_size, image_size]
patch_size = 16
# Cut the tensor into non-overlapping windows along channel, height, and width
patches = (
    X.unfold(0, channel_size, channel_size)
    .unfold(1, patch_size, patch_size)
    .unfold(2, patch_size, patch_size)
)  # Shape is [1, image_size/patch_size, image_size/patch_size, channel_size, patch_size, patch_size]
# Flatten each patch into a single row
patches = (
    patches.contiguous()
    .view(patches.size(0), -1, channel_size * patch_size * patch_size)
    .float()
)  # Shape is [1, number of patches, channel_size*patch_size*patch_size]

The result is a matrix in which each patch is a single row of channel_size*patch_size*patch_size values.
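As a quick check of the shapes in the code comments, the arithmetic can be sketched in plain Python. The helper name patch_layout is hypothetical and not part of the original code, and the divisibility check is an added assumption for clarity.

```python
def patch_layout(image_size, patch_size, channel_size):
    """Hypothetical helper: return (number of patches, values per flattened patch)."""
    # Assumption: the image is square and splits evenly into patches.
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side * patches_per_side
    patch_dim = channel_size * patch_size * patch_size
    return num_patches, patch_dim


num_patches, patch_dim = patch_layout(image_size=224, patch_size=16, channel_size=3)
print(num_patches, patch_dim)  # 196 768
```

With the values used here, the flattened patches tensor therefore has shape [1, 196, 768].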

Next, the Transformer uses a constant latent vector size D through all of its layers, so we map the patches to D dimensions with a trainable linear projection (Eq. 1).
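In symbols, the projection can be sketched as follows. The notation is an assumption borrowed from the ViT paper and is not defined in this text: $H$ and $W$ are the image height and width, $P$ the patch size, $C$ the number of channels, $N$ the number of patches, $\mathbf{x}_p$ the matrix of flattened patches, and $\mathbf{E}$ the trainable projection matrix.

```latex
\mathbf{x}_p \in \mathbb{R}^{N \times (P^2 C)}, \qquad
\mathbf{E} \in \mathbb{R}^{(P^2 C) \times D}, \qquad
\mathbf{x}_p \mathbf{E} \in \mathbb{R}^{N \times D}, \qquad
N = \frac{H W}{P^2}
```

For image_size = 224, patch_size = 16, and channel_size = 3, this gives $N = 196$ and $P^2 C = 768$.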

We initialize E as follows.

self.E = nn.Parameter(
    torch.randn(patch_size * patch_size * channel_size, embedding_dim)
)

We project the patches to the embedding dimension by computing the matrix product.

patch_embeddings = torch.matmul(patches, self.E)
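The two snippets above refer to self, so they presumably live inside a module class that is not shown. A minimal sketch of how they might fit together follows. The class name PatchProjection, the default embedding_dim of 768, and the random input tensor are assumptions for illustration, not the author's code.

```python
import torch
from torch import nn


class PatchProjection(nn.Module):
    """Hypothetical wrapper that holds the trainable projection matrix E."""

    def __init__(self, patch_size=16, channel_size=3, embedding_dim=768):
        super().__init__()
        # E has shape [patch_size*patch_size*channel_size, embedding_dim]
        self.E = nn.Parameter(
            torch.randn(patch_size * patch_size * channel_size, embedding_dim)
        )

    def forward(self, patches):
        # patches: [batch, num_patches, patch_size*patch_size*channel_size]
        # returns: [batch, num_patches, embedding_dim]
        return torch.matmul(patches, self.E)


# Random stand-in for the flattened patches of one 224x224 RGB image.
patches = torch.randn(1, 196, 16 * 16 * 3)
projection = PatchProjection()
patch_embeddings = projection(patches)
print(patch_embeddings.shape)  # torch.Size([1, 196, 768])
```

Wrapping E in nn.Parameter registers it with the module, so an optimizer built from projection.parameters() would update it during training.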




© 2024 Solega, LLC. All Rights Reserved | Solega.co
