How to Implement Vector Search with OpenAI Embeddings and Supabase

Published on

Dec 13, 2024

Navdeep

5 mins

In AI-powered applications, users expect search that returns intelligent, relevant results. Traditional keyword-based search often fails to capture contextual nuance, leading to less meaningful results. Vector search addresses this: it compares machine-learned embeddings of the query and the data, so results are ranked by meaning rather than by exact wording.

This blog outlines how to implement vector search with a dataset, integrating OpenAI’s Embeddings API and Supabase’s vector capabilities. Let’s dive in!

1. Dataset Preparation

The foundation of any search system is its dataset. For this guide:

• Start with a dataset, such as a JSON file containing celebrity data (e.g., name, profession, biography, etc.).
• Implement a basic user interface (UI) that displays the dataset and includes search functionality.

Key Tasks:

• Parse and display the dataset in the UI.
• Ensure a responsive design to provide a seamless user experience.

<div className="py-16 w-full flex justify-center items-center bg-gray-100 flex-col overflow-y-scroll">
     <form
       onSubmit={handleSubmit}
       className="m-16 bg-white rounded-lg shadow-md flex w-full md:w-1/2 p-4"
     >
       <input
         type="text"
         placeholder="Search..."
         className="flex-grow text-lg font-light focus:outline-none"
         value={searchTerm}
         onChange={handleChange}
       />
       <button
         type="submit"
         className="px-4 py-2 bg-green-500 text-white rounded-r-lg"
       >
         Search
       </button>
     </form>
     <div className="flex flex-wrap justify-center overflow-y-auto">
       {searchResults.map((profile, index) => (
         <CelebrityProfileCard key={index} profile={profile} />
       ))}
     </div>
   </div>

const CelebrityProfileCard = ({ profile }: { profile: any }) => (
   <div className="max-w-sm rounded overflow-hidden shadow-md m-4 h-1/2 w-1/2">
     <img
       src={profile.image}
       alt={profile.first_name}
       className="w-full h-64 object-cover"
     />
     <div className="px-6 py-4">
       <div className="font-bold text-xl mb-2">{`${profile.first_name} ${profile.last_name}`}</div>
       <p>{`Age: ${profile.age}`}</p>
     </div>
   </div>
 );
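
The card component above types its prop as any. As a minimal sketch of the "parse the dataset" task, the snippet below defines a record type and a runtime check. The Celebrity interface, isCelebrity, and parseCelebrities are hypothetical names; only first_name, last_name, age, and image are taken from the UI code, and the optional fields are assumed from the dataset description.

```typescript
// Hypothetical record shape. first_name, last_name, age, and image appear in
// the UI code; profession and biography are assumed to be optional fields.
interface Celebrity {
  first_name: string;
  last_name: string;
  age: number;
  image: string;
  profession?: string;
  biography?: string;
}

// Runtime check so malformed records are rejected before rendering.
function isCelebrity(value: unknown): value is Celebrity {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.first_name === "string" &&
    typeof record.last_name === "string" &&
    typeof record.age === "number" &&
    typeof record.image === "string"
  );
}

// Parses a JSON string and keeps only the well-formed records.
function parseCelebrities(json: string): Celebrity[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected the dataset to be a JSON array");
  }
  return parsed.filter(isCelebrity);
}
```

With a type like this in place, the card's prop could be declared as { profile: Celebrity } instead of any, so a misspelled field name is caught at compile time.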

2. Implementing Regular Search

Before diving into vector search, start with a basic text-matching mechanism:

• Filter the dataset locally based on user-provided search terms.
• Use JavaScript array methods like .filter() to perform the text matching.

While effective for simple searches, this method lacks contextual understanding and struggles with complex queries.

const handleSubmit = (event: any) => {
   event.preventDefault();

   // Case-insensitive substring match against each profile's full name
   const term = searchTerm.toLowerCase();
   const matches = celebrities.filter((profile) => {
     const fullName = `${profile.first_name} ${profile.last_name}`;
     return fullName.toLowerCase().includes(term);
   });
   setSearchResults(matches);
 };
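
To make the limitation concrete, here is a small sketch that runs the same full-name match as a pure function. The sample records and the searchByName helper are hypothetical and exist only for illustration; the real dataset is not shown in this post.

```typescript
// Hypothetical sample records; the article's real dataset is not shown.
interface SampleProfile {
  first_name: string;
  last_name: string;
  profession: string;
}

const sampleProfiles: SampleProfile[] = [
  { first_name: "Jane", last_name: "Doe", profession: "Actor" },
  { first_name: "John", last_name: "Smith", profession: "Chef" },
];

// Mirrors the handler's case-insensitive full-name match as a pure function.
function searchByName(profiles: SampleProfile[], term: string): SampleProfile[] {
  const needle = term.toLowerCase();
  return profiles.filter((profile) =>
    `${profile.first_name} ${profile.last_name}`.toLowerCase().includes(needle)
  );
}

console.log(searchByName(sampleProfiles, "doe").length);   // 1: the name contains "doe"
console.log(searchByName(sampleProfiles, "actor").length); // 0: "actor" appears in no name
```

A query by profession finds nothing because only the name is compared. That gap is what the embedding-based search in the following sections is meant to close.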

3. Vectorizing Data

Here’s where the magic begins! Transform your textual data into vector embeddings to enable contextual searches.

Steps:

1. Generate Embeddings: Use OpenAI’s text-embedding-ada-002 model to convert text into numerical vectors.
2. Store Embeddings: Save these embeddings in a Supabase database for efficient retrieval during searches.

import { createClient } from '@supabase/supabase-js';


export const supabaseClient = createClient(
 process.env.NEXT_PUBLIC_SUPABASE_URL!,
 process.env.NEXT_PUBLIC_SUPABASE_KEY!
);

import { NextRequest, NextResponse } from 'next/server';
import OpenAI from 'openai';
import { supabaseClient } from '../../../../utils/supabaseClient';

export async function POST(req: NextRequest) {
 const body = await req.json();
 const { celebrities } = body;

 // This route runs on the server, so the key is read from a server-only
 // variable. A NEXT_PUBLIC_ variable can end up in browser bundles.
 const openai = new OpenAI({
   apiKey: process.env.OPENAI_API_KEY,
 });


 // Function to generate OpenAI embeddings for a given text
 async function generateOpenAIEmbeddings(profile: any) {
   const textToEmbed = Object.values(profile).join(' ');
   const response = await openai.embeddings.create({
     model: 'text-embedding-ada-002',
     input: textToEmbed,
   });
   return response.data[0].embedding;
 }
 try {
   // Embed and upsert every profile in parallel. Each callback returns a
   // plain result object so failures can be detected afterwards.
   const processedDataArray = await Promise.all(
     celebrities.map(async (item: any) => {
       // Generate OpenAI embeddings for the entire profile object
       const embeddings = await generateOpenAIEmbeddings(item);
       // Add an 'embeddings' property to the item
       const modifiedItem = { ...item, embeddings };

       // Upsert the modified item into the 'celebrities' table in Supabase
       const { data, error } = await supabaseClient
         .from('celebrities')
         .upsert([modifiedItem]);

       if (error) {
         console.error('Error inserting data into Supabase:', error.message);
         return { success: false, result: error };
       }
       return { success: true, result: data };
     })
   );

   // Check if any insertions failed
   const hasError = processedDataArray.some((result) => !result.success);
   if (hasError) {
     return NextResponse.json(
       { success: false, error: 'One or more insertions failed', results: processedDataArray },
       { status: 500 }
     );
   }

   // Data successfully inserted for all items
   return NextResponse.json({ success: true, results: processedDataArray });
 } catch (error: any) {
   console.error('Unexpected error:', error.message);
   return NextResponse.json(
     { success: false, message: 'Internal Server Error' },
     { status: 500 }
   );
 }
}
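
The route above embeds Object.values(profile).join(' '), which also feeds values such as image URLs and ages into the model. One possible refinement, sketched below, is to build the text from a short list of descriptive fields. The field list and the buildEmbeddingText helper are assumptions for illustration, not part of the original implementation.

```typescript
// Assumed list of fields that carry searchable meaning; adjust to your dataset.
const EMBEDDING_FIELDS = ["first_name", "last_name", "profession", "biography"] as const;

// Builds the text to embed from the selected fields only, skipping anything
// that is missing, empty, or not a string (for example an age or image URL).
function buildEmbeddingText(profile: Record<string, unknown>): string {
  const parts: string[] = [];
  for (const field of EMBEDDING_FIELDS) {
    const value = profile[field];
    if (typeof value === "string" && value.trim() !== "") {
      parts.push(`${field}: ${value.trim()}`);
    }
  }
  return parts.join("\n");
}

console.log(
  buildEmbeddingText({
    first_name: "Jane",
    last_name: "Doe",
    age: 40,
    image: "https://example.com/jane.png",
    profession: "Chef",
  })
);
```

Labeling each value with its field name keeps some structure in the embedded text. Whether that improves results for a given dataset is something to verify by experiment.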

4. Implementing Vector Search

Once your data is vectorized, you can perform similarity-based searches.

Key Steps:

• Query Embedding: Generate an embedding for the user’s search query.
• Vector Search: Use a stored procedure in Supabase to find the most similar embeddings in your dataset.

import { NextRequest, NextResponse } from "next/server";


import OpenAI from "openai";
import { supabaseClient } from "../../../../utils/supabaseClient";


// Initialize the OpenAI client with a server-only API key
// (a NEXT_PUBLIC_ variable can end up in browser bundles)
const openai = new OpenAI({
 apiKey: process.env.OPENAI_API_KEY,
});


export async function POST(request: NextRequest) {
 const body = await request.json();
 const query = body.searchTerm;

 if (!query) {
   return NextResponse.json({ error: "Empty query" }, { status: 400 });
 }

 // Create an embedding for the search query
 const openAiEmbeddings = await openai.embeddings.create({
   model: "text-embedding-ada-002",
   input: query,
 });
 const [{ embedding }] = openAiEmbeddings.data;

 // Search Supabase via the "vector_search" stored procedure
 const { data, error } = await supabaseClient.rpc("vector_search", {
   query_embedding: embedding,
   similarity_threshold: 0.8,
   match_count: 2,
 });

 if (error) {
   console.error(error);
   return NextResponse.json({ error }, { status: 500 });
 }

 // Possible next step (not implemented here): query ChatGPT via Langchain,
 // passing the query and database results as context
 return NextResponse.json({ data });
}
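
The vector_search stored procedure itself is not shown in this post. To illustrate what a similarity search with a threshold and a match count does conceptually, here is an in-memory sketch. It assumes cosine similarity as the metric and a strict greater-than comparison against the threshold; the real procedure may differ on both points, and the StoredRow and Match types are hypothetical.

```typescript
// Hypothetical row shape: an id plus its stored embedding.
interface StoredRow {
  id: number;
  embedding: number[];
}

interface Match {
  id: number;
  similarity: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every row, keeps those above the threshold, and returns the
// best matchCount rows, most similar first.
function vectorSearchInMemory(
  rows: StoredRow[],
  queryEmbedding: number[],
  similarityThreshold: number,
  matchCount: number
): Match[] {
  return rows
    .map((row) => ({
      id: row.id,
      similarity: cosineSimilarity(row.embedding, queryEmbedding),
    }))
    .filter((match) => match.similarity > similarityThreshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}
```

A database-side search does the same ranking without loading every row into application memory, which is why the route delegates it to Supabase. Note that a threshold of 0.8 with a match count of 2 can return fewer than two rows, or none, if nothing is similar enough.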

Display Results:

Dynamically update the UI with fetched results to provide an interactive user experience.

const handleSubmit = async (event: any) => {
   event.preventDefault();

   // VECTOR SEARCH
   if (searchTerm.trim() === "") {
     // If the search term is empty, fetch the original list from Supabase
     await fetchCelebrities();
   } else {
     const semanticSearch = await fetch("/api/search", {
       method: "POST",
       headers: {
         "Content-Type": "application/json",
       },
       body: JSON.stringify({
         searchTerm: searchTerm,
       }),
     });

     const semanticSearchResponse = await semanticSearch.json();
     // Fall back to an empty list if the API returned an error instead of data
     setSearchResults(semanticSearchResponse.data ?? []);
   }
 };

Why This Approach is Effective

1. Enhanced Search Experience

Traditional keyword searches fall short in understanding context. With vector search:

• Users can search for related attributes like hobbies or professions.
• The search engine provides more relevant, nuanced results.

2. Scalability

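• Embeddings are fixed-length numeric vectors (1,536 dimensions for text-embedding-ada-002), so storage per record is predictable and the vectors can be indexed for efficient retrieval on large datasets.
• Supabase stores vectors in Postgres through the pgvector extension, which supports indexes for approximate nearest-neighbor search as search volume grows.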
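
As a rough worked example of the storage side, the sketch below estimates the raw size of the stored vectors. It assumes each dimension is stored as a 4-byte float, as pgvector's vector type does, and ignores per-row and index overhead; the helper name is hypothetical.

```typescript
// Assumption: each dimension is stored as a 4-byte float; per-row and
// index overhead are ignored, so real usage will be somewhat higher.
const BYTES_PER_DIMENSION = 4;

function estimateEmbeddingBytes(dimensions: number, rowCount: number): number {
  return dimensions * BYTES_PER_DIMENSION * rowCount;
}

// text-embedding-ada-002 returns 1,536-dimensional vectors.
const bytesPerRow = estimateEmbeddingBytes(1536, 1); // 6144 bytes, i.e. 6 KiB
const bytesPerMillionRows = estimateEmbeddingBytes(1536, 1_000_000);

console.log(`${bytesPerRow} bytes per row`);
console.log(`${(bytesPerMillionRows / 1024 ** 3).toFixed(2)} GiB per million rows`);
```

At about 6 KiB per record, a small dataset like the celebrity list is negligible, while datasets in the millions of rows are where index choice and storage planning start to matter.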

3. Future-Ready

• This implementation connects the frontend to AI-powered search through a small API layer, aligning with modern tech trends.
• It’s a robust foundation for building smarter applications, such as recommendation systems or personalized experiences.

Conclusion

Implementing vector search enhances your application’s functionality, offering intelligent and context-aware search capabilities. By integrating OpenAI’s Embeddings API with Supabase’s powerful vector support, you can build scalable, future-ready solutions.

Start exploring vector search today to revolutionize how users interact with your application!