From Keywords to Meaning: Understanding Semantic Search with Real-World Examples

By ALSAFAK KAMAL · June 17, 2025 · Artificial Intelligence
Imagine Googling “how to fix a leaky faucet” and getting results like “how to repair a dripping tap” instead of just articles containing the exact words “leaky faucet.” That’s the power of semantic search.

In this post, we’ll explore what semantic search is, how it differs from traditional keyword search, and how modern AI models like BERT, Siamese networks, and sentence transformers are making search systems smarter. We’ll also walk through a practical example using Python to implement semantic search using embeddings.

Semantic search refers to the process of retrieving information based on its meaning rather than merely matching keywords.

Traditional search engines rely on keyword frequency and placement. In contrast, semantic search understands the intent and context behind your query.

For example:

Query: “What’s the capital of India?”

Keyword Search Result: Pages with “capital” and “India” appearing together.
Semantic Search Result: “New Delhi” — even if the phrase “capital of India” isn’t used directly.

Traditional search engines often fail when:

  • Synonyms are used (e.g., “car” vs. “automobile”)
  • Questions are asked in a conversational tone
  • Context is important to disambiguate meaning (e.g., “python” the snake vs. “Python” the language)

Semantic search addresses these challenges by leveraging Natural Language Understanding (NLU).

Under the hood, semantic search typically involves three steps:

  1. Convert Queries and Documents to Vectors using language models.
  2. Store document embeddings in a vector database or index.
  3. Find the Most Similar Embeddings to the query using similarity metrics (like cosine similarity).
[Figure: basic workflow of semantic search]

1. From Text to Vectors: Embeddings

Using models like BERT, RoBERTa, or sentence-transformers, sentences are converted into high-dimensional vectors.

Example:

  • “How to fix a leaking tap?” → [0.23, -0.47, ..., 0.19] (768-dim vector)

These embeddings capture semantic properties of the text. Semantically similar texts lie closer in this vector space.

2. Storing the Vectors in Vector Databases

Vector databases such as Pinecone, Weaviate, and Qdrant store document embeddings and let you retrieve them efficiently whenever needed.
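For small collections, the same idea can be illustrated without a managed database using a plain in-memory NumPy matrix. This is a toy sketch only, not a substitute for a real vector database; the class and its `add`/`search` methods are made up for this example:

```python
import numpy as np

class ToyVectorStore:
    """In-memory stand-in for a vector database (illustration only)."""

    def __init__(self, dim):
        self.vectors = np.empty((0, dim))
        self.payloads = []

    def add(self, vector, payload):
        # Store the embedding alongside its original text
        self.vectors = np.vstack([self.vectors, vector])
        self.payloads.append(payload)

    def search(self, query, k=1):
        # Rank stored vectors by cosine similarity to the query
        norms = np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(query)
        sims = self.vectors @ query / norms
        top = np.argsort(-sims)[:k]
        return [(self.payloads[i], sims[i]) for i in top]

store = ToyVectorStore(dim=3)
store.add(np.array([1.0, 0.0, 0.0]), "doc about taps")
store.add(np.array([0.0, 1.0, 0.0]), "doc about Python")
results = store.search(np.array([0.9, 0.1, 0.0]))
print(results[0][0])  # the tap document is closest in direction to the query
```

Real vector databases add persistence, metadata filtering, and approximate nearest-neighbor indexes so that search stays fast at millions of vectors, where this brute-force scan would not.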

3. Retrieving Similar Embeddings

To compare how similar two texts are, we calculate the distance or angle between their vectors. Common similarity/distance metrics include:

a. Cosine Similarity (Most Common for NLP)

What it measures:
The angle between two vectors (i.e., how similar their directions are).

Formula:
Cosine Similarity = (A • B) / (||A|| * ||B||)

where:

  • A • B is the dot product of vectors A and B
  • ||A|| is the magnitude (length) of vector A
  • ||B|| is the magnitude of vector B

Range: -1 to 1:

  • 1 → Vectors point in the same direction (very similar)
  • 0 → Vectors are orthogonal (unrelated)
  • -1 → Vectors point in opposite directions (rare in practice with text embeddings)

Why it’s useful:
Cosine similarity ignores magnitude and focuses on direction, which makes it perfect for comparing sentence or word embeddings.
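The cosine formula above can be verified directly with NumPy. A minimal sketch; the 3-dimensional vectors are made up for illustration (real text embeddings have hundreds of dimensions):

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0])
B = np.array([2.0, 4.0, 6.0])  # same direction as A, twice the magnitude

# Cosine Similarity = (A . B) / (||A|| * ||B||)
cos_sim = np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B))
print(cos_sim)  # 1.0: direction matches exactly, magnitude is ignored
```

Because B is just A scaled by 2, the similarity is exactly 1, demonstrating that magnitude plays no role.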

b. Euclidean Distance

What it measures:
The straight-line distance between two vectors in space.

Formula:
Euclidean Distance = square root of the sum of squared differences across all dimensions
= sqrt( (A1 - B1)² + (A2 - B2)² + … + (An - Bn)² )

Interpretation:

  • Lower distance = more similar
  • Higher distance = more different

Why it’s used less in NLP:
It’s sensitive to vector magnitude and not ideal when you’re only interested in direction or semantic closeness.
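Euclidean distance is a one-liner with NumPy, since the norm of the difference vector is exactly the formula above (illustrative vectors again):

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 6.0, 3.0])

# sqrt((1-4)^2 + (2-6)^2 + (3-3)^2) = sqrt(9 + 16 + 0) = 5
euclidean = np.linalg.norm(A - B)
print(euclidean)  # 5.0
```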

c. Manhattan Distance (Also called L1 distance)

What it measures:
The sum of absolute differences across all dimensions.

Formula:
Manhattan Distance = |A1 - B1| + |A2 - B2| + … + |An - Bn|

Use case:
Useful in high-dimensional or sparse data scenarios. Not very common in dense text embeddings.
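Manhattan distance follows the same pattern: sum the absolute differences per dimension (illustrative vectors):

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 0.0, 3.0])

# |1-4| + |2-0| + |3-3| = 3 + 2 + 0 = 5
manhattan = np.sum(np.abs(A - B))
print(manhattan)  # 5.0
```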

Popular tools and libraries for semantic search include:

  • BERT & Sentence-BERT: Pretrained models that generate contextual embeddings.
  • FAISS: Facebook’s library for efficient similarity search.
  • Pinecone, Weaviate, Qdrant: Vector databases.
  • Hugging Face Transformers: For generating embeddings from models.
# Install dependencies
pip install sentence-transformers

# Sample documents
docs = [
    "How to learn Python programming?",
    "Best ways to stay healthy during winter",
    "Tips for fixing a leaky faucet",
    "Introduction to machine learning",
    "How to repair a dripping tap",
]

# Create embeddings
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('all-MiniLM-L6-v2')
doc_embeddings = model.encode(docs, convert_to_tensor=True)

# Query & search
query = "How can I fix a leaking tap?"
query_embedding = model.encode(query, convert_to_tensor=True)

# Compute cosine similarity between the query and every document
scores = util.pytorch_cos_sim(query_embedding, doc_embeddings)

# Pick the highest-scoring document
best_match_index = scores.argmax()
print(f"Best match: {docs[best_match_index]}")

# Expected output:
# Best match: How to repair a dripping tap

[Figure: visual representation of the code snippet above]

Real-world applications of semantic search:
  • E-commerce: “Affordable laptop for students” → Results with low-cost notebooks
  • Customer Support: Match tickets to relevant knowledge base articles
  • Recruitment Platforms: Match resumes to job descriptions
  • Chatbots: Retrieve context-aware answers from documentation

Semantic search isn’t just a buzzword; it’s transforming the way we find information. Whether you’re building a smart search feature into your app or optimizing your content for voice search, understanding semantics is now essential.

As language models grow more powerful, the line between “search” and “understanding” continues to blur. And that’s exactly what makes this space so exciting.



© 2024 Solega, LLC. All Rights Reserved | Solega.co
