Build a Smart FAQ System in Laravel Using Vector Search
A user types "how do I get my money back" into your FAQ search box. Zero results. Your FAQ has a perfectly clear "Refund Policy" entry — the concepts are identical, the words aren't.
That's a keyword search problem, and it's embarrassing in a way that's hard to explain to a product manager. The fix is semantic search: match by meaning instead of exact terms. Laravel's official AI package gives you an embeddings API and a whereVectorSimilarTo query builder method that handles this in about 40 lines of real code. This article shows you how to wire it together, from migration to search endpoint.
Why Keyword Search Breaks Here
User language and documentation language diverge constantly. A support team writes clearly. Users type how they think:
- "can I return this?" → your entry says "Refund Policy"
- "login doesn't work" → your entry says "Authentication Troubleshooting"
- "how long until my stuff arrives?" → your entry says "Delivery Timeframes"
Full-text search handles synonyms to a point, but it's brittle at the semantic level. Vector search converts text to numerical representations — embeddings — and measures cosine similarity between them. A question about returns and a policy about refunds land close together in vector space because the model was trained on language that treats those concepts as related. Your user doesn't need to know the right vocabulary.
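To make "cosine similarity" concrete, here is a minimal sketch in plain PHP. The three-value vectors are made-up toy numbers, not real embeddings, and the function name `cosineSimilarity` is purely illustrative; in the actual build the database does this math for you.

```php
<?php

/**
 * Cosine similarity between two equal-length vectors.
 * Returns a value in [-1, 1]; higher means the vectors point in a more similar direction.
 */
function cosineSimilarity(array $a, array $b): float
{
    if (count($a) !== count($b) || count($a) === 0) {
        throw new InvalidArgumentException('Vectors must be non-empty and the same length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0; // a zero vector has no direction to compare
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Toy 3-dimensional "embeddings" (invented values; real ones have 1536 entries).
$refundPolicy = [0.9, 0.1, 0.2];
$moneyBack    = [0.8, 0.2, 0.3];
$shippingTime = [0.1, 0.9, 0.1];

printf("refund vs money back: %.3f\n", cosineSimilarity($refundPolicy, $moneyBack));
printf("refund vs shipping:   %.3f\n", cosineSimilarity($refundPolicy, $shippingTime));
```

With these toy numbers the refund and money-back vectors score much higher against each other than the refund and shipping vectors do, which is the behavior semantic search relies on.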
Getting Started
composer require laravel/ai
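The package supports several providers, including OpenAI, Anthropic, and Gemini; check that the one you pick actually offers an embeddings endpoint (Anthropic, for instance, does not have its own). For this build, we'll use OpenAI's text-embedding-3-small: fast, cheap, and 1536 values per vector by default. Add your key to .env: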
OPENAI_API_KEY=sk-...
Publish the config if you need to swap providers later:
php artisan vendor:publish --provider="Laravel\Ai\AiServiceProvider"
The Database: Adding a Vector Column
php artisan make:model FaqEntry -m
The migration needs a vector column alongside the question and answer fields:
Schema::create('faq_entries', function (Blueprint $table) {
    $table->id();
    $table->string('question');
    $table->text('answer');
    $table->string('category')->nullable();
    $table->vector('embedding', 1536)->nullable();
    $table->timestamps();
});
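The 1536 matches the default dimensionality of text-embedding-3-small. Switch to 3072 if you upgrade to text-embedding-3-large, and re-embed every existing row when you do: vectors produced by different models aren't comparable with each other.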
On PostgreSQL, pgvector has to be enabled before the migration runs:
CREATE EXTENSION IF NOT EXISTS vector;
SQLite and MySQL don't support native vector types — Laravel AI falls back to JSON columns, which works up to a few thousand entries before query times become painful. PostgreSQL is the right choice here. If you're on a managed database and don't control extensions, check your provider's docs; most now support pgvector by default.
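If you do control extensions, one option is to enable pgvector from a migration instead of running the SQL by hand. The sketch below assumes a PostgreSQL connection, a database role that is allowed to create extensions, and a migration file whose timestamp sorts before the faq_entries migration; none of that comes from the package itself.

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Support\Facades\DB;

return new class extends Migration
{
    public function up(): void
    {
        // Same statement as above, executed through Laravel's DB facade.
        DB::statement('CREATE EXTENSION IF NOT EXISTS vector');
    }

    public function down(): void
    {
        // Intentionally left empty: other tables may still rely on the extension.
    }
};
```

Keeping this in version control means a fresh environment gets the extension as part of `php artisan migrate` rather than as a manual setup step.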
Generating Embeddings on Ingest
The FaqEntry model gets a static method that generates the embedding at creation time:
namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Laravel\Ai\Embeddings;

class FaqEntry extends Model
{
    protected $fillable = ['question', 'answer', 'category', 'embedding'];

    protected $casts = [
        'embedding' => 'array',
    ];

    public static function createWithEmbedding(
        string $question,
        string $answer,
        ?string $category = null
    ): static {
        // One input string in, one vector out.
        $response = Embeddings::for([$question])
            ->dimensions(1536)
            ->cache()
            ->generate();

        return static::create([
            'question' => $question,
            'answer' => $answer,
            'category' => $category,
            // The response exposes one vector per input, in input order.
            'embedding' => $response->embeddings[0],
        ]);
    }
}
The .cache() chain stores embedding responses keyed by input hash. In production this doesn't matter much — embeddings are fast. During development, when you're dropping and reseeding the database repeatedly, it stops you from re-embedding identical strings and eating your monthly quota. I've run db:seed six times before lunch on a 200-entry FAQ and burned through a noticeable chunk of the free tier before adding that line.
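The idea of keying a cache by input hash can be sketched in plain PHP. This is not how the package implements .cache() internally; `cachedEmbedder` and the fake API closure are hypothetical names used only to show why repeated seeding stops costing API calls.

```php
<?php

/**
 * Wraps any embedding function with an in-memory cache keyed by a hash
 * of the input text. $embed stands in for a real API call.
 */
function cachedEmbedder(callable $embed): callable
{
    $cache = [];

    return function (string $text) use ($embed, &$cache): array {
        $key = hash('sha256', $text);

        if (!array_key_exists($key, $cache)) {
            $cache[$key] = $embed($text); // only reached on a cache miss
        }

        return $cache[$key];
    };
}

// Hypothetical stand-in for the embeddings API: counts calls, returns a dummy vector.
$apiCalls = 0;
$fakeApi = function (string $text) use (&$apiCalls): array {
    $apiCalls++;

    return [strlen($text) / 100, 0.5, 0.25];
};

$embed = cachedEmbedder($fakeApi);

// Simulate seeding the same two questions three times over.
for ($run = 0; $run < 3; $run++) {
    $embed('How do I reset my password?');
    $embed('What is your refund policy?');
}

echo "API calls made: {$apiCalls}\n";
```

Six lookups result in two API calls, one per distinct string. A real cache would live in a persistent store rather than a local array, so it survives between seeder runs.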
A seed class to populate the initial FAQ bank:
namespace Database\Seeders;

use App\Models\FaqEntry;
use Illuminate\Database\Seeder;

class FaqSeeder extends Seeder
{
    public function run(): void
    {
        $entries = [
            [
                'question' => 'How do I reset my password?',
                'answer' => 'Go to the login page and click "Forgot Password."',
                'category' => 'account',
            ],
            [
                'question' => 'What is your refund policy?',
                'answer' => 'We offer full refunds within 30 days of purchase.',
                'category' => 'billing',
            ],
            [
                'question' => 'How long does shipping take?',
                'answer' => 'Standard shipping takes 5–7 business days.',
                'category' => 'orders',
            ],
            [
                'question' => 'Can I change my email address?',
                'answer' => 'Yes, go to Account Settings and update your email.',
                'category' => 'account',
            ],
            [
                'question' => 'Where can I track my order?',
                'answer' => 'Open "Orders" in your dashboard for a live tracking link.',
                'category' => 'orders',
            ],
        ];

        // Each call makes one embedding request, then inserts the row.
        foreach ($entries as $entry) {
            FaqEntry::createWithEmbedding(
                $entry['question'],
                $entry['answer'],
                $entry['category']
            );
        }
    }
}
php artisan db:seed --class=FaqSeeder
Searching by Semantic Similarity
whereVectorSimilarTo is the query builder method Laravel AI adds to Eloquent. Pass it the column name, the user's raw query string, and a minimum similarity threshold. The method handles embedding generation internally — you're not writing any embedding logic on the search side:
$results = FaqEntry::query()
    ->whereVectorSimilarTo('embedding', $userQuery, minSimilarity: 0.6)
    ->limit(3)
    ->get(['question', 'answer', 'category']);
The threshold is the part that needs tuning for your use case. At 0.6, you're accepting matches where the model is reasonably confident the concepts are related. Push it to 0.7 for narrow technical documentation where phrasing matters more. Drop below 0.55 and near-misses start creeping in — you get answers that are conceptually adjacent but actually wrong, which is a worse outcome than returning nothing.
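The effect of moving the threshold is easier to see with numbers. The sketch below applies a minimum similarity and a limit to a hand-written list of scores; the scores are invented for illustration and `filterBySimilarity` is a hypothetical helper, not part of Laravel.

```php
<?php

/**
 * Keeps candidates at or above $minSimilarity, best match first, capped at $limit.
 * Each candidate is ['question' => string, 'similarity' => float].
 */
function filterBySimilarity(array $candidates, float $minSimilarity, int $limit = 3): array
{
    $kept = array_filter(
        $candidates,
        fn (array $c): bool => $c['similarity'] >= $minSimilarity
    );

    // Highest similarity first; usort also reindexes the array.
    usort($kept, fn (array $a, array $b): int => $b['similarity'] <=> $a['similarity']);

    return array_slice($kept, 0, $limit);
}

// Made-up scores for the query "how do I get my money back".
$candidates = [
    ['question' => 'Where can I track my order?', 'similarity' => 0.58],
    ['question' => 'What is your refund policy?', 'similarity' => 0.74],
    ['question' => 'How do I reset my password?', 'similarity' => 0.31],
];

foreach ([0.7, 0.6, 0.55] as $threshold) {
    $questions = array_column(filterBySimilarity($candidates, $threshold), 'question');
    printf("%.2f => %s\n", $threshold, implode(' | ', $questions));
}
```

In this toy data the order-tracking entry only appears once the threshold drops to 0.55, which is exactly the kind of adjacent-but-wrong answer the paragraph above warns about.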
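Adding a category filter alongside the similarity condition is useful when your FAQ is large and users can scope their search:
$results = FaqEntry::query()
    ->where('category', $validated['category'])
    ->whereVectorSimilarTo('embedding', $userQuery, minSimilarity: 0.6)
    ->limit(3)
    ->get(['question', 'answer', 'category']);
Both conditions end up in the same SQL query, so the order you chain them in doesn't change the result. The category filter shrinks the set of rows that can match; how the database applies it relative to the vector comparison is up to the query planner.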
The API Endpoint
namespace App\Http\Controllers;

use App\Models\FaqEntry;
use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;

class FaqController extends Controller
{
    public function search(Request $request): JsonResponse
    {
        $validated = $request->validate([
            'query' => ['required', 'string', 'max:500'],
            'category' => ['nullable', 'string'],
        ]);

        $query = FaqEntry::query()
            ->whereVectorSimilarTo('embedding', $validated['query'], minSimilarity: 0.6)
            ->limit(3);

        // Optional scope: only applied when the client sends a category.
        if (!empty($validated['category'])) {
            $query->where('category', $validated['category']);
        }

        $results = $query->get(['question', 'answer', 'category']);

        return response()->json([
            'results' => $results,
            'count' => $results->count(),
        ]);
    }
}
Route:
// routes/api.php
use App\Http\Controllers\FaqController;
use Illuminate\Support\Facades\Route;

Route::post('/faq/search', [FaqController::class, 'search'])
    ->middleware('throttle:30,1');
Thirty requests per minute per client: the throttle middleware keys on the authenticated user, or on the IP address for guests. That's not conservative; it's arithmetic. Each search call hits the embedding API. Without rate limiting, a client-side search-as-you-type implementation will chew through your monthly quota in an afternoon.
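If you would rather make the limiter's key explicit than rely on the middleware defaults, a named rate limiter is one way to do it. This is a sketch: the limiter name `faq-search` is a hypothetical choice, and the definition is shown in AppServiceProvider on the assumption that this is where your app registers such things.

```php
<?php

namespace App\Providers;

use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot(): void
    {
        // 'faq-search' is an illustrative name; pick whatever fits your app.
        RateLimiter::for('faq-search', function (Request $request) {
            // 30 searches per minute, keyed by user ID when logged in, else by IP.
            return Limit::perMinute(30)->by($request->user()?->id ?: $request->ip());
        });
    }
}

// In routes/api.php the route would then reference the limiter by name:
// Route::post('/faq/search', [FaqController::class, 'search'])
//     ->middleware('throttle:faq-search');
```

The behavior matches `throttle:30,1` for most setups; the gain is that the number and the key live in one named place you can adjust later.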
What Happens When Nothing Matches?
When whereVectorSimilarTo returns zero results, your frontend needs something to show. An empty box with no explanation frustrates users more than a clear "no results" message.
One pattern: return a fallback field in the JSON response so the client doesn't need separate branching logic:
if ($results->isEmpty()) {
    return response()->json([
        'results' => [],
        'count' => 0,
        'fallback' => "We couldn't find a match. Try rephrasing, or contact support.",
    ]);
}
I'm not enthusiastic about this design — the controller ends up owning UX copy — but it keeps things simple when you don't have a frontend team handling empty-state behavior separately. The alternative is returning a 200 with an empty array and letting the client handle it, which is cleaner but requires agreement on where that logic lives.
Testing Without Hitting the API
Laravel AI ships with a fake for embeddings:
use Laravel\Ai\Embeddings;

Embeddings::fake();

// Run your code that calls Embeddings::for()->generate()

Embeddings::assertGenerated(
    fn ($prompt) => $prompt->contains('reset my password')
);
Embeddings::fake() intercepts API calls and returns synthetic vectors. Your test suite stays fast and free. The synthetic vectors aren't semantically meaningful, so similarity scores computed from them say nothing about search quality. For integration tests of whereVectorSimilarTo, you'll need a seeded test database with actual vectors. But for unit testing that your model method calls Embeddings::for with the right input, the fake is all you need.
The Edge Case Worth Knowing
Every embedding in your database is a snapshot of the question text at ingest time. Update a FAQ entry's question and the stored vector doesn't change. Searches keep using the old embedding. The match rate quietly degrades.
Here's what that looks like in practice: you edit "How do I reset my password?" to "How do I recover my account access?" — reasonable copy update. The embedding column still holds the vector for the original phrasing. A user who types "how do I recover my account" now scores lower against that entry than they should, possibly below your 0.6 threshold, and gets no result for a question you definitely have an answer to.
A common fix is an Eloquent observer that regenerates the embedding whenever question changes:
namespace App\Observers;

use App\Models\FaqEntry;
use Laravel\Ai\Embeddings;

class FaqEntryObserver
{
    public function updating(FaqEntry $entry): void
    {
        // Only re-embed when the text the vector was built from has changed.
        if ($entry->isDirty('question')) {
            $response = Embeddings::for([$entry->question])
                ->dimensions(1536)
                ->generate();

            // The response exposes one vector per input, in input order.
            $entry->embedding = $response->embeddings[0];
        }
    }
}
Register it in the boot method of AppServiceProvider:
FaqEntry::observe(FaqEntryObserver::class);
This works cleanly for low-edit workflows. If your content team is doing a large copy audit and touching 50 entries in an afternoon, each edit hits the API synchronously: every save waits on a network round trip, and a provider timeout or rate limit surfaces as a failed save. A background job that batches re-embedding is probably the better architecture for high-edit environments, but it introduces a window where stored vectors are stale.
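A rough sketch of the batched background-job variant follows. The class name `ReembedFaqEntries` is hypothetical, and the code assumes the embeddings response exposes its vectors as an `embeddings` array in the same order as the inputs; verify that against the package version you have installed before relying on it.

```php
<?php

namespace App\Jobs;

use App\Models\FaqEntry;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Queue\Queueable;
use Laravel\Ai\Embeddings;

class ReembedFaqEntries implements ShouldQueue
{
    use Queueable;

    /** @param array<int, int> $entryIds IDs of entries whose question text changed */
    public function __construct(public array $entryIds) {}

    public function handle(): void
    {
        // values() guarantees 0-based keys so they line up with the response order.
        $entries = FaqEntry::whereIn('id', $this->entryIds)->get()->values();

        if ($entries->isEmpty()) {
            return;
        }

        // One API request for the whole batch instead of one per entry.
        $response = Embeddings::for($entries->pluck('question')->all())
            ->dimensions(1536)
            ->generate();

        foreach ($entries as $index => $entry) {
            // Assumption: vectors come back in input order.
            $entry->embedding = $response->embeddings[$index];

            // saveQuietly() skips model events, so an observer won't re-embed again.
            $entry->saveQuietly();
        }
    }
}
```

Dispatching would look like `ReembedFaqEntries::dispatch($changedIds);` from wherever edits are collected. Until the job runs, searches still use the old vectors, which is the staleness window described above.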
Neither approach is clean for every case. Which one fits yours depends on how often your FAQ changes and whether you have the infrastructure for async jobs. That's a question only you can answer with actual usage numbers in front of you.