Semantic Search with pgvector and Supabase Edge Functions


Semantic search interprets the meaning behind user queries rather than matching exact keywords. It uses machine learning to capture the intent and context of a query, handling language nuances like synonyms, phrasing variations, and word relationships.

Since Supabase Edge Runtime v1.36.0, you can run the gte-small model natively within Supabase Edge Functions. This lets you generate text embeddings without adding external dependencies or calling external APIs.

In this tutorial you'll implement three parts:

  1. A generate-embedding database webhook edge function, which generates an embedding whenever a content row is added (or updated) in the public.embeddings table.
  2. A query_embeddings Postgres function, which lets you perform similarity search from an edge function via Remote Procedure Call (RPC).
  3. A search edge function, which generates the embedding for the search term, performs the similarity search via the RPC function call, and returns the result.

You can find the complete example code on GitHub.

Create the database table and webhook

Start with the following table definition, which enables the pgvector extension, creates the embeddings table, and adds an HNSW index for inner product search:

create extension if not exists vector with schema extensions;

create table embeddings (
  id bigint primary key generated always as identity,
  content text not null,
  embedding vector(384)
);
alter table embeddings enable row level security;

create index on embeddings using hnsw (embedding vector_ip_ops);
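
The embedding column only accepts vectors with exactly 384 dimensions, and pgvector reads vectors from text in a bracketed, comma-separated form such as [0.1,0.2,0.3]. As a minimal sketch of how a value could be checked and serialized before it is written to that column, the helper below validates the length and returns the text form. The name toVectorLiteral is hypothetical and is not part of supabase-js or pgvector.

```typescript
// Dimension of the `embedding vector(384)` column in the table definition.
const EMBEDDING_DIMENSIONS = 384

// Hypothetical helper (not part of supabase-js or pgvector): checks the
// length and finiteness of an embedding, then serializes it in pgvector's
// text input format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(
  values: number[],
  dimensions: number = EMBEDDING_DIMENSIONS
): string {
  if (values.length !== dimensions) {
    throw new Error(`expected ${dimensions} dimensions, got ${values.length}`)
  }
  if (!values.every((v) => Number.isFinite(v))) {
    throw new Error('embedding contains a non-finite value')
  }
  // JSON.stringify on a number array yields the same bracketed,
  // comma-separated form that pgvector accepts.
  return JSON.stringify(values)
}

// Toy 3-dimensional example; real gte-small embeddings have 384 values.
const exampleLiteral = toVectorLiteral([0.25, -0.5, 1], 3)
console.log(exampleLiteral) // [0.25,-0.5,1]
```

Failing early on a wrong length gives a clearer error than the one Postgres would return when the insert or update is rejected.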

You can deploy the following edge function as a database webhook to generate the embeddings for any text content inserted into the table:

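import { createClient } from 'npm:@supabase/supabase-js@2'

// Fields of the database webhook payload that this function reads.
interface WebhookPayload {
  record: { id: number; content: string }
  old_record: { id: number; content: string } | null
}

// The service role key bypasses row level security, which is enabled on the table.
const supabase = createClient(
  Deno.env.get('SUPABASE_URL')!,
  Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!
)

const model = new Supabase.ai.Session('gte-small')

Deno.serve(async (req) => {
  const payload: WebhookPayload = await req.json()
  const { content, id } = payload.record

  // Skip unchanged content so that storing the embedding (itself an update)
  // does not trigger another round of embedding generation.
  if (content === payload.old_record?.content) return new Response('ok - no change')

  // Generate embedding.
  const embedding = await model.run(content, {
    mean_pool: true,
    normalize: true,
  })

  // Store in database.
  const { error } = await supabase
    .from('embeddings')
    .update({ embedding: JSON.stringify(embedding) })
    .eq('id', id)
  if (error) console.warn(error.message)

  return new Response('ok')
})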
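
The mean_pool and normalize options ask the model session to return one unit-length vector per input. As a rough illustration of what these two operations conventionally mean, the sketch below averages per-token vectors into a single vector and then scales it to length 1. It shows the general technique under that assumption and is not the Edge Runtime's internal implementation. The function names and the toy 2-dimensional vectors are made up for the example.

```typescript
// Mean pooling: average the per-token vectors into one sentence vector.
function meanPool(tokenVectors: number[][]): number[] {
  if (tokenVectors.length === 0) throw new Error('no token vectors')
  const dims = tokenVectors[0].length
  const result = new Array<number>(dims).fill(0)
  for (const vec of tokenVectors) {
    if (vec.length !== dims) throw new Error('dimension mismatch')
    for (let i = 0; i < dims; i++) result[i] += vec[i] / tokenVectors.length
  }
  return result
}

// L2 normalization: scale a vector so that its length becomes 1.
function normalize(vec: number[]): number[] {
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0))
  if (norm === 0) throw new Error('cannot normalize a zero vector')
  return vec.map((v) => v / norm)
}

// Toy 2-dimensional "token" vectors; gte-small produces 384 dimensions.
const pooledVector = meanPool([[3, 0], [3, 8]])
const unitVector = normalize(pooledVector)
console.log(pooledVector) // [3, 4]
console.log(unitVector) // [0.6, 0.8]
```

Storing unit-length vectors is what later allows the inner product to stand in for cosine similarity in the query function.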

Create a Database Function and RPC

With the embeddings now stored in your Postgres database table, you can query them from Supabase Edge Functions using Remote Procedure Calls (RPC).

Create the following Postgres function:

-- Matches document sections using vector similarity search on embeddings
--
-- Returns a setof embeddings so that we can use PostgREST resource embeddings (joins with other tables)
-- Additional filtering like limits can be chained to this function call
create or replace function query_embeddings(embedding vector(384), match_threshold float)
returns setof embeddings
language plpgsql
as $$
begin
  return query
  select *
  from embeddings

  -- The <#> operator returns the negative inner product, so we negate match_threshold.
  -- The parameter is qualified with the function name because it shares its
  -- name with the embeddings.embedding column.
  where embeddings.embedding <#> query_embeddings.embedding < -match_threshold

  -- Our embeddings are normalized to length 1, so cosine similarity
  -- and inner product will produce the same query results.
  -- Using inner product which can be computed faster.
  --
  -- For the different distance functions, see https://github.com/pgvector/pgvector
  order by embeddings.embedding <#> query_embeddings.embedding;
end;
$$;
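
To make the threshold logic concrete, here is a small sketch that mirrors the where clause outside the database. It assumes unit-length vectors, as produced with normalize: true, and checks that the inner product then equals the cosine similarity. The helper names and the toy 2-dimensional vectors are illustrative only.

```typescript
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  return a.reduce((sum, v, i) => sum + v * b[i], 0)
}

function cosineSimilarity(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
}

// Mirrors pgvector's `<#>` operator, which returns the negative inner product.
function negativeInnerProduct(a: number[], b: number[]): number {
  return -dot(a, b)
}

// Same predicate as the SQL where clause: `a <#> b < -match_threshold`.
function isMatch(a: number[], b: number[], matchThreshold: number): boolean {
  return negativeInnerProduct(a, b) < -matchThreshold
}

// Toy unit-length vectors.
const queryVec = [0.6, 0.8]
const similarVec = [0.8, 0.6] // inner product with queryVec is about 0.96
const unrelatedVec = [-0.8, 0.6] // inner product with queryVec is about 0

// For unit-length vectors the two measures agree up to rounding error.
const gap = Math.abs(dot(queryVec, similarVec) - cosineSimilarity(queryVec, similarVec))
console.log(gap < 1e-9) // true

console.log(isMatch(queryVec, similarVec, 0.8)) // true
console.log(isMatch(queryVec, unrelatedVec, 0.8)) // false
```

Because more similar rows have a more negative `<#>` value, ordering by the operator in ascending order returns the best matches first.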

Query vectors in Supabase Edge Functions

You can use supabase-js to first generate the embedding for the search term and then invoke the Postgres function to find the relevant results from your stored embeddings, right from your Supabase Edge Function:

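import { createClient } from 'npm:@supabase/supabase-js@2'

// The service role key bypasses row level security, which is enabled on the table.
const supabase = createClient(
  Deno.env.get('SUPABASE_URL')!,
  Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!
)

const model = new Supabase.ai.Session('gte-small')

Deno.serve(async (req) => {
  const { search } = await req.json()
  if (!search) return new Response('Please provide a search param!', { status: 400 })

  // Generate embedding for search term.
  const embedding = await model.run(search, {
    mean_pool: true,
    normalize: true,
  })

  // Query embeddings.
  const { data: result, error } = await supabase
    .rpc('query_embeddings', {
      embedding,
      match_threshold: 0.8,
    })
    .select('content')
    .limit(3)
  if (error) {
    return Response.json(error, { status: 500 })
  }

  return Response.json({ search, result })
})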
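
To try the function once it is deployed, a client needs to send a POST request with a JSON body containing the search field. The sketch below builds such a request, assuming the function is deployed under the name search and is reachable at the usual /functions/v1/ path of your project. The helper name buildSearchRequest, the project URL, and the key are placeholders you would replace with your own values.

```typescript
interface SearchRequest {
  url: string
  init: { method: 'POST'; headers: Record<string, string>; body: string }
}

// Hypothetical helper: assembles the URL and fetch options for the search function.
function buildSearchRequest(
  projectUrl: string,
  apiKey: string,
  search: string
): SearchRequest {
  // Drop trailing slashes so the path is joined cleanly.
  const base = projectUrl.replace(/\/+$/, '')
  return {
    url: `${base}/functions/v1/search`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      // The edge function reads the `search` field from the JSON body.
      body: JSON.stringify({ search }),
    },
  }
}

// Placeholder values; substitute your own project URL and API key.
const searchRequest = buildSearchRequest(
  'https://your-project-ref.supabase.co',
  'YOUR_ANON_KEY',
  'warm winter clothing'
)
console.log(searchRequest.url) // https://your-project-ref.supabase.co/functions/v1/search

// Send it with:
//   const res = await fetch(searchRequest.url, searchRequest.init)
//   const { result } = await res.json()
```

The response body has the shape { search, result }, where result holds up to three rows with their content field.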

That's it: you now have AI-powered semantic search set up without any external dependencies, using only pgvector and Supabase Edge Functions.