A Gentle Introduction to Spring AI's Embedding Model Abstraction
In this article, we'll present a simple example of using embedding models to perform semantic search via pgvector. Learn how Spring AI's Embedding Model abstraction makes it easy to implement powerful semantic search capabilities in your Spring Boot applications.
Introduction to Embedding Models
Embedding models transform text into numerical vector representations, capturing semantic meaning in a way that machines can understand. These vectors enable powerful semantic search capabilities, allowing us to find content based on meaning rather than just keyword matching.
Spring AI provides a clean abstraction for working with embedding models through its EmbeddingModel interface, making it straightforward to integrate these capabilities into your Spring Boot applications.
The Support Assistant Demo Project
To demonstrate the power of embedding models, we've created a simple support assistant application that uses AI to provide support responses based on previous support tickets. The application:
- Converts customer messages into vector embeddings
- Finds semantically similar previous support tickets using vector search
- Uses these similar tickets as context for generating a response
Let's explore the key components of this system.
The Embedding Model API
At the heart of our application is Spring AI's EmbeddingModel interface. Here's how we use it in our service:
import org.springframework.ai.embedding.EmbeddingModel
import org.springframework.stereotype.Service

@Service
class ResponseSuggestionService(
    private val embeddingModel: EmbeddingModel,
    private val supportTicketRepository: SupportTicketRepository,
    // ...
) {
    fun suggestResponse(customerMessage: String, limit: Int = 5): String {
        // Generate embedding for the customer message
        val embedding = embeddingModel.embed(customerMessage)
        // Find similar tickets (uses the repository's default distance threshold)
        val similarTickets = supportTicketRepository.findSimilarTickets(embedding, limit)
        // ...
    }
}
The embeddingModel.embed() method converts text into a vector representation, which we can then use to find semantically similar content.
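To make that contract concrete, here is a minimal sketch that runs without Spring or Ollama. TextEmbedder and ToyEmbedder are hypothetical stand-ins invented for this illustration, not Spring AI types. The toy hashing scheme only mimics the shape of the result (a fixed-length FloatArray) and captures no semantic meaning.

```kotlin
// Hypothetical stand-in for the single call we use from EmbeddingModel:
// text in, fixed-length float vector out.
fun interface TextEmbedder {
    fun embed(text: String): FloatArray
}

// Toy implementation (an assumption for illustration only): counts words
// into hash buckets. A real embedding model learns its vectors instead.
class ToyEmbedder(private val dimensions: Int = 8) : TextEmbedder {
    override fun embed(text: String): FloatArray {
        val vector = FloatArray(dimensions)
        text.lowercase()
            .split(Regex("\\W+"))
            .filter { it.isNotEmpty() }
            .forEach { word -> vector[Math.floorMod(word.hashCode(), dimensions)] += 1f }
        return vector
    }
}
```

A stand-in like this can be useful for unit-testing the code around the embedding call, since the same text always yields the same vector and no model needs to be running.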
Vector Search with pgvector
Once we have our text converted to vectors, we need a way to efficiently search for similar vectors. This is where PostgreSQL's pgvector extension comes in.
The Repository Query
The heart of our semantic search is the findSimilarTickets method in our repository:
@Query(
    value = """
        SELECT * FROM support_tickets
        WHERE embedding <=> (:embedding)::vector < :threshold
        ORDER BY embedding <=> (:embedding)::vector
        LIMIT :limit
    """,
    nativeQuery = true
)
fun findSimilarTickets(
    @Param("embedding") embedding: FloatArray,
    @Param("limit") limit: Int,
    // Cosine distance: 0 = perfect match, 1 = different concept, 2 = opposite
    @Param("threshold") threshold: Float = 0.5f,
): List<SupportTicket>
Let's break down this query:
- embedding <=> (:embedding)::vector - This calculates the cosine distance between the stored embedding and our query embedding
- WHERE ... < :threshold - We filter results to only include those with a distance less than our threshold
- ORDER BY embedding <=> (:embedding)::vector - We order results by distance, with closest matches first
- LIMIT :limit - We limit the number of results returned
The <=> operator is pgvector's cosine distance operator, which measures the dissimilarity between vectors.
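The same filter, sort, and limit steps can be sketched in plain Kotlin over an in-memory list. This is only an illustration of the query's semantics, not how pgvector executes it. The Ticket class and the findSimilar function are hypothetical names introduced for this sketch.

```kotlin
import kotlin.math.sqrt

// Hypothetical in-memory ticket, used only for this sketch.
class Ticket(val title: String, val embedding: FloatArray)

// Cosine distance = 1 - cosine similarity.
fun cosineDistance(a: FloatArray, b: FloatArray): Double {
    require(a.size == b.size) { "Vectors must have the same dimension" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return 1.0 - dot / (sqrt(normA) * sqrt(normB))
}

// Mirrors WHERE (filter), ORDER BY (sortedBy) and LIMIT (take) from the query.
fun findSimilar(
    tickets: List<Ticket>,
    query: FloatArray,
    limit: Int,
    threshold: Double = 0.5,
): List<Ticket> =
    tickets
        .map { it to cosineDistance(it.embedding, query) }
        .filter { (_, distance) -> distance < threshold }
        .sortedBy { (_, distance) -> distance }
        .take(limit)
        .map { (ticket, _) -> ticket }
```

Scanning every row like this is linear in the number of tickets, which is why the database version relies on an index once the table grows.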
Creating and Storing Embeddings
An important part of our system is how we create and store support tickets with their embeddings. Here's how it works:
// Generate embedding for the ticket content
val embedding = embeddingModel.embed(
    "$title $questionVariation $responseVariation"
)
val ticket = SupportTicket(
    ...,
    embedding = embedding
)
supportTicketRepository.save(ticket)
The embedding is generated by concatenating the ticket's title, customer message, and agent response, then passing this text to the embedding model. The resulting vector is stored in the embedding field of the SupportTicket entity.
In our model, the embedding field is defined with Hibernate annotations:
@JdbcTypeCode(SqlTypes.VECTOR)
@Column(name = "embedding")
@Array(length = 1024) // must match dimensions of the embedding model
val embedding: FloatArray? = null
The @JdbcTypeCode(SqlTypes.VECTOR) annotation tells Hibernate to use the PostgreSQL vector type, while @Array(length = 1024) specifies the vector dimension, which must match the output of our embedding model.
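Since a dimension mismatch would otherwise only surface when the row is written, it can help to check the vector length before saving. The helper below is a hypothetical sketch, not part of the demo project. It assumes the 1024 dimensions used throughout this article.

```kotlin
// Assumed to match @Array(length = 1024) on the entity and the column definition.
val EMBEDDING_DIMENSIONS = 1024

// Hypothetical helper: fail fast with a readable message instead of
// waiting for an error when the entity is persisted.
fun requireMatchingDimensions(
    embedding: FloatArray,
    expected: Int = EMBEDDING_DIMENSIONS,
): FloatArray {
    require(embedding.size == expected) {
        "Embedding has ${embedding.size} dimensions but the column expects $expected"
    }
    return embedding
}
```

One possible use is to wrap the model call, for example requireMatchingDimensions(embeddingModel.embed(text)), so that switching to an embedding model with a different output size is caught immediately.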
Cosine Distance vs. Cosine Similarity
It's important to understand the difference between cosine distance and cosine similarity:
- Cosine Similarity: Ranges from -1 to 1
  - 1: Vectors point in the same direction (identical meaning)
  - 0: Vectors are orthogonal (unrelated)
  - -1: Vectors point in opposite directions
- Cosine Distance: Defined as 1 - cosine similarity, ranges from 0 to 2
  - 0: Same direction
  - 1: Unrelated vectors
  - 2: Opposite vectors
Because the <=> operator in our SQL query returns cosine distance rather than similarity, smaller values mean closer matches. The threshold value of 0.5 therefore keeps only tickets that are semantically close to the query, filtering out unrelated or opposite concepts.
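One way to build intuition for a threshold is to translate it into an angle. Cosine similarity is the cosine of the angle between two vectors, so a distance threshold d corresponds to a maximum angle of acos(1 - d). The function name below is a hypothetical one chosen for this sketch.

```kotlin
import kotlin.math.acos

// Cosine distance d = 1 - cos(angle), so the largest angle that still
// passes a distance threshold is acos(1 - threshold).
fun maxAngleDegrees(distanceThreshold: Double): Double {
    require(distanceThreshold in 0.0..2.0) { "Cosine distance ranges from 0 to 2" }
    return Math.toDegrees(acos(1.0 - distanceThreshold))
}
```

With this reading, a threshold of 0.5 accepts vectors within roughly 60 degrees of the query, a threshold of 1 would accept anything up to orthogonal (90 degrees), and 2 would accept everything.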
Using Ollama for Embeddings
We run two models through Ollama, one for chat and one for embeddings:
- mistral - The default model for generating responses
- mxbai-embed-large - A specialized model for generating embeddings
Ollama provides a lightweight way to run these models locally, making it perfect for development and testing.
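A configuration along the following lines would select the two models. This is a sketch, not the demo project's actual file: the property names are the ones used by Spring AI's Ollama starter as far as we know, and the base URL assumes Ollama is listening on its default local port. Check both against the Spring AI version you use.

```properties
# Assumption: Ollama running locally on its default port
spring.ai.ollama.base-url=http://localhost:11434
# Model used for generating responses
spring.ai.ollama.chat.options.model=mistral
# Model used for generating embeddings
spring.ai.ollama.embedding.options.model=mxbai-embed-large
```

Both models need to be available locally first, for example via ollama pull mistral and ollama pull mxbai-embed-large.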
Setting Up Vector Storage
To use vector embeddings with PostgreSQL and Hibernate, we need a few key components:
SQL Migration
First, we need to set up our database with the pgvector extension:
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE support_tickets
(
    -- other columns...
    embedding vector(1024)
);

CREATE INDEX ON support_tickets USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
The key points here are:
- We create a vector column with 1024 dimensions
- We create an IVFFlat index with the vector_cosine_ops operator class, so it is used for cosine distance (<=>) queries
Hibernate Configuration
On the application side, we need the Hibernate Vector module:
dependencies {
    implementation("org.hibernate.orm:hibernate-vector:6.6.11.Final")
    implementation("org.springframework.ai:spring-ai-pgvector-store-spring-boot-starter")
    // ...
}
This module allows Hibernate to work with PostgreSQL's vector type, mapping between Kotlin's FloatArray (a float[] on the JVM) and PostgreSQL's vector.
Conclusion
Spring AI's Embedding Model abstraction provides a clean, simple way to work with vector embeddings in your Spring Boot applications. Combined with PostgreSQL's pgvector extension, it enables powerful semantic search capabilities with minimal code.
In our next blog post, we'll explore how to use Spring AI's VectorStore abstraction to simplify vector storage and retrieval even further, eliminating the need for custom repository methods.
The full source code for this example is available on GitHub.