Solega Co. Done For Your E-Commerce solutions.

Explanation of Vision Transformer with implementation | by Hiroaki Kubo | Nov, 2024

Solega Team by Solega Team
November 4, 2024
in Artificial Intelligence


First, we reshape the image into a sequence of flattened 2D patches. Here image_size is the width and height of the image. The code is as follows.

from PIL import Image
import torchvision.transforms as T

image_size = 224
channel_size = 3
# Assumes 'pattern.png' is a 3-channel (RGB) image.
image = Image.open('pattern.png').resize((image_size, image_size))
X = T.PILToTensor()(image)  # Shape is [channel_size, image_size, image_size]
patch_size = 16
# Slide non-overlapping windows over the channel, height and width dimensions.
patches = (
    X.unfold(0, channel_size, channel_size)
    .unfold(1, patch_size, patch_size)
    .unfold(2, patch_size, patch_size)
)  # Shape is [1, image_size/patch_size, image_size/patch_size, channel_size, patch_size, patch_size]
# Flatten the patch grid into a sequence and each patch into a vector.
patches = (
    patches.contiguous()
    .view(patches.size(0), -1, channel_size * patch_size * patch_size)
    .float()
)  # Shape is [1, number of patches, channel_size*patch_size*patch_size]

The result is a matrix in which each row is a single patch holding channel_size*patch_size*patch_size values.
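As a quick sanity check of the patch extraction, the following minimal sketch runs the same unfold steps on a synthetic tensor instead of a real image (the torch.arange stand-in and the names grid_shape, first_patch and second_patch are assumptions for illustration). It confirms that row 0 of the result is the top-left 16x16 block and row 1 is the block to its right.

```python
import torch

image_size, channel_size, patch_size = 224, 3, 16

# Synthetic stand-in for an image tensor: every element has a unique value,
# so any mix-up in patch ordering would be detected.
X = torch.arange(channel_size * image_size * image_size).reshape(
    channel_size, image_size, image_size
)

patches = (
    X.unfold(0, channel_size, channel_size)
    .unfold(1, patch_size, patch_size)
    .unfold(2, patch_size, patch_size)
)
grid_shape = tuple(patches.shape)  # (1, 14, 14, 3, 16, 16)

patches = (
    patches.contiguous()
    .view(patches.size(0), -1, channel_size * patch_size * patch_size)
    .float()
)

# Patches are ordered row by row over the grid; inside a patch the values
# are ordered channel first, then row, then column.
first_patch = X[:, :patch_size, :patch_size].reshape(-1).float()
second_patch = X[:, :patch_size, patch_size:2 * patch_size].reshape(-1).float()

print(grid_shape)
print(tuple(patches.shape))
print(torch.equal(patches[0, 0], first_patch))
print(torch.equal(patches[0, 1], second_patch))
```

With image_size = 224 and patch_size = 16 there are (224/16)^2 = 196 patches, each of length 3*16*16 = 768.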

Next, the Transformer uses a constant latent vector size D through all of its layers, so we map the patches to D dimensions with a trainable linear projection (Eq. 1).
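For reference, assuming the text means Eq. 1 of the original Vision Transformer paper (Dosovitskiy et al.), the projection can be written as below. Here x_p^i is the i-th flattened patch, N the number of patches, P the patch size, C the number of channels, x_class a learnable class token and E_pos the position embedding; only the multiplication by E is covered in this part of the walkthrough.

```latex
\mathbf{z}_0 = \left[\, \mathbf{x}_{\text{class}};\; \mathbf{x}_p^{1}\mathbf{E};\; \mathbf{x}_p^{2}\mathbf{E};\; \cdots;\; \mathbf{x}_p^{N}\mathbf{E} \,\right] + \mathbf{E}_{pos},
\qquad \mathbf{E} \in \mathbb{R}^{(P^{2} \cdot C) \times D},
\quad \mathbf{E}_{pos} \in \mathbb{R}^{(N+1) \times D}
```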

We initialize E as follows.

self.E = nn.Parameter(
    torch.randn(patch_size * patch_size * channel_size, embedding_dim)
)

We project the patches to D dimensions by calculating the matrix product.

patch_embeddings = torch.matmul(patches, self.E)
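The two snippets above can be put together in a small module. This is a minimal sketch, not the author's full implementation: the class name PatchEmbedding, the default embedding_dim of 768 and the random input standing in for real patches are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class PatchEmbedding(nn.Module):
    """Hypothetical wrapper around the trainable projection matrix E."""

    def __init__(self, patch_size=16, channel_size=3, embedding_dim=768):
        super().__init__()
        # E maps a flattened patch (P*P*C values) to a D-dimensional vector.
        self.E = nn.Parameter(
            torch.randn(patch_size * patch_size * channel_size, embedding_dim)
        )

    def forward(self, patches):
        # patches: [batch, number of patches, P*P*C]
        # returns: [batch, number of patches, embedding_dim]
        return torch.matmul(patches, self.E)


torch.manual_seed(0)
# Random stand-in for the flattened patches of one 224x224 RGB image.
patches = torch.randn(1, 196, 16 * 16 * 3)
embed = PatchEmbedding(embedding_dim=512)
patch_embeddings = embed(patches)
print(patch_embeddings.shape)  # torch.Size([1, 196, 512])
```

Because torch.matmul broadcasts over the leading dimensions, the same E is applied to every patch of every image in the batch.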



© 2024 Solega, LLC. All Rights Reserved | Solega.co
