# Thoughts v2 - Plan 2: Full-Text Search (postgres-search)

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Upgrade search from a full-table-scan ILIKE to indexed trigram search (pg_trgm), returning both thoughts and users from a single `/search` endpoint.

**Architecture:** A new `SearchPort` trait in domain defines cross-entity search (thoughts + users). `crates/adapters/postgres-search` implements it using `pg_trgm` similarity with GIN indexes. The existing `FeedRepository::search` in `postgres/feed.rs` is also upgraded to use the `%` trigram operator so it benefits from the new index. Presentation adds `search: Arc<dyn SearchPort>` to `AppState`.

**Tech Stack:** Rust, sqlx 0.8, PostgreSQL `pg_trgm` extension, GIN indexes, axum

---

## File Map

```
Modified: crates/domain/src/ports.rs ← add SearchPort trait
Modified: crates/domain/src/testing.rs ← add TestStore impl for SearchPort
Modified: crates/adapters/postgres-search/Cargo.toml ← add deps
Modified: crates/adapters/postgres-search/src/lib.rs ← PgSearchRepository (was empty stub)
Create:   crates/adapters/postgres/migrations/004_search_indexes.sql
Modified: crates/adapters/postgres/src/feed.rs ← upgrade ILIKE → trigram operator
Modified: crates/presentation/src/state.rs ← add search field
Modified: crates/presentation/src/lib.rs ← wire PgSearchRepository in build_state
Modified: crates/presentation/src/handlers/feed.rs ← search_handler returns thoughts + users
```

---

### Task 1: Migration - pg_trgm extension and GIN indexes

**Files:**
- Create: `crates/adapters/postgres/migrations/004_search_indexes.sql`

- [ ] **Write `004_search_indexes.sql`:**

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_thoughts_content_trgm
    ON thoughts USING GIN(content gin_trgm_ops);

CREATE INDEX CONCURRENTLY IF NOT EXISTS
idx_users_username_trgm ON users USING GIN(username gin_trgm_ops);

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_display_name_trgm
    ON users USING GIN(display_name gin_trgm_ops)
    WHERE display_name IS NOT NULL;
```

- [ ] **Apply migration to test DB:**

```bash
DATABASE_URL=postgres://postgres:postgres@localhost:5434/postgres \
  cargo sqlx migrate run --source crates/adapters/postgres/migrations
```

Expected: a line like `Applied 4/migrate search indexes` (sqlx reports the migration's version number, which is 4 for `004_search_indexes.sql`).

- [ ] **Verify pg_trgm works:**

```bash
psql postgres://postgres:postgres@localhost:5434/postgres \
  -c "SELECT similarity('hello world', 'hello');"
```

Expected: a float value (`0.5` for this pair), not an error.

- [ ] **Commit:**

```bash
git add crates/adapters/postgres/migrations/004_search_indexes.sql
git commit -m "feat(postgres): pg_trgm extension and GIN search indexes"
```

---

### Task 2: Domain - SearchPort trait and TestStore implementation

**Files:**
- Modify: `crates/domain/src/ports.rs`
- Modify: `crates/domain/src/testing.rs`

- [ ] **Write failing test:** add to the bottom of `crates/domain/src/testing.rs` (inside `#[cfg(any(test, feature = "test-helpers"))]`):

```rust
#[cfg(test)]
mod search_tests {
    use super::*;
    use crate::models::feed::PageParams;

    #[tokio::test]
    async fn test_store_search_thoughts_returns_empty() {
        let store = TestStore::default();
        let result = store
            .search_thoughts("hello", &PageParams { page: 1, per_page: 20 }, None)
            .await
            .unwrap();
        assert_eq!(result.total, 0);
    }

    #[tokio::test]
    async fn test_store_search_users_returns_empty() {
        let store = TestStore::default();
        let result = store
            .search_users("alice", &PageParams { page: 1, per_page: 20 })
            .await
            .unwrap();
        assert_eq!(result.total, 0);
    }
}
```

- [ ] **Run:** `cargo test -p domain`. Expected: FAIL (SearchPort not defined yet).

- [ ] **Add `SearchPort` to `crates/domain/src/ports.rs`:** append after the `FeedRepository` trait:

```rust
#[async_trait]
pub trait SearchPort: Send + Sync {
    /// Full-text search over public thoughts, ranked by trigram similarity.
    async fn search_thoughts(
        &self,
        query: &str,
        page: &PageParams,
        viewer_id: Option<&UserId>,
    ) -> Result<Paginated<FeedEntry>, DomainError>;

    /// Search users by username or display_name, ranked by trigram similarity.
    async fn search_users(
        &self,
        query: &str,
        page: &PageParams,
    ) -> Result<Paginated<User>, DomainError>;
}
```

- [ ] **Add `TestStore impl SearchPort`** in `crates/domain/src/testing.rs`: append after the `impl FeedRepository for TestStore` block:

```rust
#[async_trait]
impl SearchPort for TestStore {
    // The in-memory store has no search index; both methods return an empty page.
    async fn search_thoughts(&self, _q: &str, _p: &PageParams, _v: Option<&UserId>) -> Result<Paginated<FeedEntry>, DomainError> {
        Ok(Paginated { items: vec![], total: 0, page: 1, per_page: 20 })
    }

    async fn search_users(&self, _q: &str, _p: &PageParams) -> Result<Paginated<User>, DomainError> {
        Ok(Paginated { items: vec![], total: 0, page: 1, per_page: 20 })
    }
}
```

- [ ] **Run:** `cargo test -p domain`. Expected: all tests PASS.

- [ ] **Commit:**

```bash
git add crates/domain/src/ports.rs crates/domain/src/testing.rs
git commit -m "feat(domain): SearchPort trait with thought and user search"
```

---

### Task 3: postgres-search - PgSearchRepository

**Files:**
- Modify: `crates/adapters/postgres-search/Cargo.toml`
- Modify: `crates/adapters/postgres-search/src/lib.rs`

- [ ] **Write failing tests** at the bottom of `crates/adapters/postgres-search/src/lib.rs`:

```rust
#[cfg(test)]
mod tests {
    use super::*;
    use domain::{
        models::{thought::{Thought, Visibility}, user::User},
        ports::{SearchPort, ThoughtRepository, UserRepository},
        value_objects::*,
    };

    // Insert one user and one public thought so the search has something to find.
    async fn seed_thought(pool: &sqlx::PgPool, username: &str, content: &str) -> (User, Thought) {
        use postgres::{thought::PgThoughtRepository, user::PgUserRepository};
        let urepo = PgUserRepository::new(pool.clone());
        let trepo = PgThoughtRepository::new(pool.clone());
        let u = User::new_local(
            UserId::new(),
            Username::new(username).unwrap(),
            Email::new(format!("{username}@ex.com")).unwrap(),
            PasswordHash("h".into()),
        );
        urepo.save(&u).await.unwrap();
        let t = Thought::new_local(
            ThoughtId::new(),
u.id.clone(), Content::new_local(content).unwrap(), None, Visibility::Public, None, false, ); trepo.save(&t).await.unwrap(); (u, t) } #[sqlx::test(migrations = "../postgres/migrations")] async fn search_thoughts_finds_by_keyword(pool: sqlx::PgPool) { seed_thought(&pool, "alice", "hello world").await; seed_thought(&pool, "bob", "goodbye universe").await; let repo = PgSearchRepository::new(pool); let result = repo.search_thoughts("hello", &domain::models::feed::PageParams { page: 1, per_page: 20 }, None).await.unwrap(); assert_eq!(result.total, 1); assert_eq!(result.items[0].thought.content.as_str(), "hello world"); } #[sqlx::test(migrations = "../postgres/migrations")] async fn search_users_finds_by_username(pool: sqlx::PgPool) { use postgres::user::PgUserRepository; let urepo = PgUserRepository::new(pool.clone()); let alice = User::new_local(UserId::new(), Username::new("alice_search").unwrap(), Email::new("alice@ex.com").unwrap(), PasswordHash("h".into())); urepo.save(&alice).await.unwrap(); let repo = PgSearchRepository::new(pool); let result = repo.search_users("alice", &domain::models::feed::PageParams { page: 1, per_page: 20 }).await.unwrap(); assert!(!result.items.is_empty()); assert!(result.items.iter().any(|u| u.username.as_str() == "alice_search")); } #[sqlx::test(migrations = "../postgres/migrations")] async fn search_thoughts_returns_empty_for_no_match(pool: sqlx::PgPool) { seed_thought(&pool, "alice", "hello world").await; let repo = PgSearchRepository::new(pool); let result = repo.search_thoughts("zzzzzzzzz", &domain::models::feed::PageParams { page: 1, per_page: 20 }, None).await.unwrap(); assert_eq!(result.total, 0); } } ``` - [ ] **Run:** `cargo test -p postgres-search` — Expected: FAIL (PgSearchRepository not defined). 
- [ ] **Update `crates/adapters/postgres-search/Cargo.toml`:** ```toml [package] name = "postgres-search" version = "0.1.0" edition = "2021" [dependencies] domain = { workspace = true } sqlx = { workspace = true } uuid = { workspace = true } chrono = { workspace = true } async-trait = { workspace = true } [dev-dependencies] tokio = { workspace = true, features = ["full"] } sqlx = { workspace = true, features = ["migrate"] } postgres = { workspace = true } ``` Note: `postgres` in dev-dependencies is the internal crate at `crates/adapters/postgres` (already in workspace.dependencies). Add it to workspace.dependencies in root `Cargo.toml` if not already there: ```toml # In root Cargo.toml [workspace.dependencies] — verify this line exists: postgres = { path = "crates/adapters/postgres" } ``` - [ ] **Write `crates/adapters/postgres-search/src/lib.rs`:** ```rust use async_trait::async_trait; use chrono::{DateTime, Utc}; use sqlx::PgPool; use domain::{ errors::DomainError, models::{ feed::{FeedEntry, PageParams, Paginated}, thought::Thought, user::User, }, ports::SearchPort, value_objects::{Content, Email, PasswordHash, ThoughtId, UserId, Username}, }; use domain::models::thought::Visibility; pub struct PgSearchRepository { pool: PgPool } impl PgSearchRepository { pub fn new(pool: PgPool) -> Self { Self { pool } } } // ── Feed row ───────────────────────────────────────────────────────────────── #[derive(sqlx::FromRow)] struct FeedRow { thought_id: uuid::Uuid, t_user_id: uuid::Uuid, content: String, in_reply_to_id: Option, in_reply_to_url: Option, t_ap_id: Option, visibility: String, content_warning: Option, sensitive: bool, t_local: bool, thought_created_at: DateTime, updated_at: Option>, author_id: uuid::Uuid, username: String, email: String, password_hash: String, display_name: Option, bio: Option, avatar_url: Option, header_url: Option, custom_css: Option, author_local: bool, u_ap_id: Option, inbox_url: Option, public_key: Option, private_key: Option, 
author_created_at: DateTime, author_updated_at: DateTime, like_count: i64, boost_count: i64, reply_count: i64, } const FEED_SELECT: &str = " SELECT t.id AS thought_id, t.user_id AS t_user_id, t.content, t.in_reply_to_id, t.in_reply_to_url, t.ap_id AS t_ap_id, t.visibility, t.content_warning, t.sensitive, t.local AS t_local, t.created_at AS thought_created_at, t.updated_at, u.id AS author_id, u.username, u.email, u.password_hash, u.display_name, u.bio, u.avatar_url, u.header_url, u.custom_css, u.local AS author_local, u.ap_id AS u_ap_id, u.inbox_url, u.public_key, u.private_key, u.created_at AS author_created_at, u.updated_at AS author_updated_at, (SELECT COUNT(*) FROM likes l WHERE l.thought_id=t.id) AS like_count, (SELECT COUNT(*) FROM boosts b WHERE b.thought_id=t.id) AS boost_count, (SELECT COUNT(*) FROM thoughts r WHERE r.in_reply_to_id=t.id) AS reply_count FROM thoughts t JOIN users u ON u.id=t.user_id"; fn row_to_entry(r: FeedRow) -> FeedEntry { let thought = Thought { id: ThoughtId::from_uuid(r.thought_id), user_id: UserId::from_uuid(r.t_user_id), content: Content::new_remote(r.content), in_reply_to_id: r.in_reply_to_id.map(ThoughtId::from_uuid), in_reply_to_url: r.in_reply_to_url, ap_id: r.t_ap_id, visibility: Visibility::from_str(&r.visibility), content_warning: r.content_warning, sensitive: r.sensitive, local: r.t_local, created_at: r.thought_created_at, updated_at: r.updated_at, }; let author = User { id: UserId::from_uuid(r.author_id), username: Username::from_trusted(r.username), email: Email::from_trusted(r.email), password_hash: PasswordHash(r.password_hash), display_name: r.display_name, bio: r.bio, avatar_url: r.avatar_url, header_url: r.header_url, custom_css: r.custom_css, local: r.author_local, ap_id: r.u_ap_id, inbox_url: r.inbox_url, public_key: r.public_key, private_key: r.private_key, created_at: r.author_created_at, updated_at: r.author_updated_at, }; FeedEntry { thought, author, like_count: r.like_count, boost_count: r.boost_count, 
reply_count: r.reply_count, liked_by_viewer: false, boosted_by_viewer: false } } // ── User row ────────────────────────────────────────────────────────────────── #[derive(sqlx::FromRow)] struct UserRow { id: uuid::Uuid, username: String, email: String, password_hash: String, display_name: Option, bio: Option, avatar_url: Option, header_url: Option, custom_css: Option, local: bool, ap_id: Option, inbox_url: Option, public_key: Option, private_key: Option, created_at: DateTime, updated_at: DateTime, } impl From for User { fn from(r: UserRow) -> Self { User { id: UserId::from_uuid(r.id), username: Username::from_trusted(r.username), email: Email::from_trusted(r.email), password_hash: PasswordHash(r.password_hash), display_name: r.display_name, bio: r.bio, avatar_url: r.avatar_url, header_url: r.header_url, custom_css: r.custom_css, local: r.local, ap_id: r.ap_id, inbox_url: r.inbox_url, public_key: r.public_key, private_key: r.private_key, created_at: r.created_at, updated_at: r.updated_at, } } } const USER_SELECT: &str = "SELECT id,username,email,password_hash,display_name,bio,avatar_url,header_url,\ custom_css,local,ap_id,inbox_url,public_key,private_key,created_at,updated_at FROM users"; // ── SearchPort implementation ───────────────────────────────────────────────── #[async_trait] impl SearchPort for PgSearchRepository { async fn search_thoughts( &self, query: &str, page: &PageParams, _viewer_id: Option<&UserId>, ) -> Result, DomainError> { // Use pg_trgm similarity operator — requires the GIN index from migration 004 let total: i64 = sqlx::query_scalar( "SELECT COUNT(*) FROM thoughts t WHERE t.content % $1 AND t.visibility='public'" ) .bind(query) .fetch_one(&self.pool) .await .map_err(|e| DomainError::Internal(e.to_string()))?; let sql = format!( "{FEED_SELECT} WHERE t.content % $1 AND t.visibility='public' ORDER BY similarity(t.content, $1) DESC LIMIT $2 OFFSET $3" ); let rows = sqlx::query_as::<_, FeedRow>(&sql) .bind(query) .bind(page.limit()) 
            .bind(page.offset())
            .fetch_all(&self.pool)
            .await
            .map_err(|e| DomainError::Internal(e.to_string()))?;

        Ok(Paginated {
            items: rows.into_iter().map(row_to_entry).collect(),
            total,
            page: page.page,
            per_page: page.per_page,
        })
    }

    async fn search_users(
        &self,
        query: &str,
        page: &PageParams,
    ) -> Result<Paginated<User>, DomainError> {
        // Only local users are searched; a match on either username or display_name counts.
        let total: i64 = sqlx::query_scalar(
            "SELECT COUNT(*) FROM users u WHERE u.local=true AND (u.username % $1 OR u.display_name % $1)"
        )
        .bind(query)
        .fetch_one(&self.pool)
        .await
        .map_err(|e| DomainError::Internal(e.to_string()))?;

        let sql = format!(
            "{USER_SELECT} WHERE local=true AND (username % $1 OR display_name % $1) ORDER BY similarity(username || ' ' || COALESCE(display_name,''), $1) DESC LIMIT $2 OFFSET $3"
        );
        let rows = sqlx::query_as::<_, UserRow>(&sql)
            .bind(query)
            .bind(page.limit())
            .bind(page.offset())
            .fetch_all(&self.pool)
            .await
            .map_err(|e| DomainError::Internal(e.to_string()))?;

        Ok(Paginated {
            items: rows.into_iter().map(User::from).collect(),
            total,
            page: page.page,
            per_page: page.per_page,
        })
    }
}
```

- [ ] **Run:** `DATABASE_URL=postgres://postgres:postgres@localhost:5434/postgres cargo test -p postgres-search`

Expected: 3 tests pass.

- [ ] **Commit:**

```bash
git add crates/adapters/postgres-search/
git commit -m "feat(postgres-search): PgSearchRepository using pg_trgm"
```

---

### Task 4: Upgrade postgres ILIKE search to trigram operator

**Files:**
- Modify: `crates/adapters/postgres/src/feed.rs`

The current `FeedRepository::search` uses `ILIKE '%pattern%'`, which does a full table scan. Upgrade it to use the `%` trigram similarity operator, which can use the GIN index from migration 004. This also changes what counts as a match: `%` scores the query against the whole `content` value, so a short query against a long thought can fall below the threshold even when the query text appears in it verbatim.
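To make the `%` operator concrete, here is a minimal Rust sketch of how a trigram similarity score is computed. It approximates pg_trgm's documented behaviour (lowercase the text, split on non-alphanumeric characters, pad each word with two leading spaces and one trailing space, then compare the two trigram sets). The names `trigrams` and `similarity` are illustrative and not part of this codebase, and PostgreSQL's own result may differ for non-ASCII text depending on locale.

```rust
use std::collections::HashSet;

/// Set of distinct trigrams in `text`, approximating pg_trgm's rules:
/// lowercase, split on non-alphanumeric characters, and pad every word
/// with two leading spaces and one trailing space.
fn trigrams(text: &str) -> HashSet<String> {
    let lowered = text.to_lowercase();
    let mut set = HashSet::new();
    for word in lowered
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
    {
        let padded: Vec<char> = format!("  {word} ").chars().collect();
        for window in padded.windows(3) {
            set.insert(window.iter().collect::<String>());
        }
    }
    set
}

/// Shared trigrams divided by the size of the union of both sets.
/// Returns 0.0 when neither input contains any trigrams.
fn similarity(a: &str, b: &str) -> f64 {
    let (ta, tb) = (trigrams(a), trigrams(b));
    let shared = ta.intersection(&tb).count();
    let union = ta.len() + tb.len() - shared;
    if union == 0 {
        0.0
    } else {
        shared as f64 / union as f64
    }
}
```

With this sketch, `similarity("hello", "hello world")` evaluates to 0.5, the same value the `psql` check in Task 1 reports. A row satisfies `%` only when its score reaches `pg_trgm.similarity_threshold` (0.3 by default), which is why the length of the stored content matters as much as the query.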
- [ ] **Update the `search` method** in `crates/adapters/postgres/src/feed.rs`: Replace the entire `search` method (lines ~123-136) with:

```rust
async fn search(&self, query: &str, page: &PageParams, _viewer_id: Option<&UserId>) -> Result<Paginated<FeedEntry>, DomainError> {
    // Count matches first so the response can report the total across all pages.
    let total: i64 = sqlx::query_scalar(
        "SELECT COUNT(*) FROM thoughts t WHERE t.content % $1 AND t.visibility='public'"
    )
    .bind(query)
    .fetch_one(&self.pool)
    .await
    .map_err(|e| DomainError::Internal(e.to_string()))?;

    // Best matches first, then paginate.
    let sql = format!("{FEED_SELECT} WHERE t.content % $1 AND t.visibility='public' ORDER BY similarity(t.content, $1) DESC LIMIT $2 OFFSET $3");
    let rows = sqlx::query_as::<_, FeedRow>(&sql)
        .bind(query)
        .bind(page.limit())
        .bind(page.offset())
        .fetch_all(&self.pool)
        .await
        .map_err(|e| DomainError::Internal(e.to_string()))?;

    Ok(Paginated { items: rows.into_iter().map(row_to_entry).collect(), total, page: page.page, per_page: page.per_page })
}
```

Also update the existing search test in `feed.rs`. The ILIKE test looked for `"hello"` inside `"hello world"`. Trigram similarity instead compares the query with the whole stored string and applies a minimum threshold. Update the test:

```rust
#[sqlx::test(migrations = "./migrations")]
async fn search_returns_matching_thoughts(pool: sqlx::PgPool) {
    let (_, _) = seed(&pool, "alice", "hello world").await;
    let (_, _) = seed(&pool, "bob", "goodbye world").await;
    let repo = PgFeedRepository::new(pool);
    // The query equals the stored content, so its similarity is 1.0, well above the threshold.
    let result = repo.search("hello world", &PageParams { page: 1, per_page: 20 }, None).await.unwrap();
    assert!(result.total >= 1);
    assert!(result.items.iter().any(|e| e.thought.content.as_str() == "hello world"));
}
```

Note: the test queries with the full string `"hello world"` so that the match does not depend on the threshold. A short query scored against much longer content can fall below the default similarity threshold (0.3). Alternatively, adjust the threshold via `pg_trgm.similarity_threshold`, but keeping the test realistic is better.
- [ ] **Run:** `DATABASE_URL=postgres://postgres:postgres@localhost:5434/postgres cargo test -p postgres` Expected: all tests pass. - [ ] **Commit:** ```bash git add crates/adapters/postgres/src/feed.rs git commit -m "feat(postgres): upgrade search from ILIKE to pg_trgm similarity" ``` --- ### Task 5: Wire SearchPort into presentation **Files:** - Modify: `crates/presentation/src/state.rs` - Modify: `crates/presentation/src/lib.rs` - Modify: `crates/presentation/src/handlers/feed.rs` - [ ] **Add `search` field to `AppState`** in `crates/presentation/src/state.rs`: ```rust use std::sync::Arc; use domain::ports::*; #[derive(Clone)] pub struct AppState { pub users: Arc, pub thoughts: Arc, pub likes: Arc, pub boosts: Arc, pub follows: Arc, pub blocks: Arc, pub tags: Arc, pub api_keys: Arc, pub top_friends: Arc, pub notifications: Arc, pub remote_actors: Arc, pub feed: Arc, pub search: Arc, // NEW pub auth: Arc, pub hasher: Arc, pub events: Arc, } ``` - [ ] **Wire `PgSearchRepository` in `build_state`** in `crates/presentation/src/lib.rs`: Add `postgres_search` import and the field. The lib.rs `build_state` function currently returns `AppState { ... }` — add one line for `search`: ```rust // At top of file, add: use postgres_search::PgSearchRepository; // In build_state, add to the AppState struct literal: search: Arc::new(PgSearchRepository::new(pool.clone())), ``` Also add `postgres-search` to `crates/presentation/Cargo.toml`: ```toml postgres-search = { workspace = true } ``` - [ ] **Run:** `cargo check -p presentation` — Expected: no errors. 
- [ ] **Update `search_handler`** in `crates/presentation/src/handlers/feed.rs` to use `SearchPort` and return both thoughts and users: Replace the existing `search_handler` function: ```rust pub async fn search_handler( State(s): State, OptionalAuthUser(viewer): OptionalAuthUser, Query(q): Query, ) -> Result, ApiError> { use domain::models::feed::PageParams; let page = PageParams { page: q.page.unwrap_or(1), per_page: q.per_page.unwrap_or(20) }; let query = q.q.trim().to_string(); let (thoughts_result, users_result) = tokio::join!( s.search.search_thoughts(&query, &page, viewer.as_ref()), s.search.search_users(&query, &page), ); let thoughts = thoughts_result?.items.into_iter().map(|e| serde_json::json!({ "id": e.thought.id.as_uuid(), "content": e.thought.content.as_str(), "author": to_user_response(&e.author), "like_count": e.like_count, "boost_count": e.boost_count, "reply_count": e.reply_count, "created_at": e.thought.created_at, })).collect::>(); let users = users_result?.items.into_iter().map(|u| to_user_response(&u)).collect::>(); Ok(Json(serde_json::json!({ "query": query, "thoughts": thoughts, "users": users, }))) } ``` Add `use crate::handlers::auth::to_user_response;` at the top of `feed.rs` if not already imported. - [ ] **Run:** `cargo build -p presentation` — Expected: clean build. - [ ] **Smoke test:** ```bash # Start server DATABASE_URL=postgres://postgres:postgres@localhost:5434/postgres JWT_SECRET=dev cargo run -p presentation & sleep 2 # Register + post a thought + search TOKEN=$(curl -s -X POST http://localhost:3000/auth/register \ -H 'content-type: application/json' \ -d '{"username":"searcher","email":"searcher@test.com","password":"pw"}' | jq -r .token) curl -s -X POST http://localhost:3000/thoughts \ -H 'content-type: application/json' \ -H "Authorization: Bearer $TOKEN" \ -d '{"content":"searching for trigrams"}' curl -s "http://localhost:3000/search?q=trigram" | jq . 
kill %1
```

Expected: JSON with a `thoughts` array and a `users` array. Whether the posted thought appears in `thoughts` depends on the similarity threshold; see the notes for implementers below.

- [ ] **Commit:**

```bash
git add crates/presentation/src/state.rs crates/presentation/src/lib.rs \
  crates/presentation/src/handlers/feed.rs crates/presentation/Cargo.toml
git commit -m "feat(presentation): wire SearchPort, upgrade /search to return thoughts + users"
```

---

## Self-Review

**Spec coverage:**
- ✅ pg_trgm extension + GIN indexes (Task 1)
- ✅ `SearchPort` trait in domain (Task 2)
- ✅ `postgres-search` crate filled in with `PgSearchRepository` (Task 3)
- ✅ Existing ILIKE upgraded to trigram operator (Task 4)
- ✅ Presentation wired: `search: Arc<dyn SearchPort>` in AppState (Task 5)
- ✅ `/search` endpoint returns both thoughts and users (Task 5)

**Placeholder scan:** None. All code blocks are complete.

**Type consistency:**
- `SearchPort::search_thoughts` → returns `Paginated<FeedEntry>`, matches domain model
- `SearchPort::search_users` → returns `Paginated<User>`, matches domain model
- `PgSearchRepository::new(pool: PgPool)`, consistent with all other repo constructors
- `AppState.search: Arc<dyn SearchPort>`, consistent with existing fields

**Notes for implementer:**
- The `%` operator's default threshold is 0.3 (`pg_trgm.similarity_threshold`). Similarity is measured against the whole column value, so what matters is how much of the stored text the query covers, not how long the query is. The smoke-test query `"trigram"` scores just under 0.3 against `"searching for trigrams"`, so the `thoughts` array may come back empty; searching for the full phrase, or lowering the threshold, makes the thought appear.
- `CONCURRENTLY` lets the index build without blocking writes to the table. PostgreSQL rejects `CREATE INDEX CONCURRENTLY` inside a transaction block, and sqlx by default runs each migration in a transaction, so check that the migration applies as written. If it fails with that error, either drop `CONCURRENTLY` (acceptable for small tables and test databases) or run the index statements outside the migration transaction.
- The `postgres-search` dev-dependency on the `postgres` crate is for seeding test data only; there is no runtime coupling.
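As a worked check on the threshold note, the similarity of the smoke-test query `trigram` against the posted content `searching for trigrams` can be computed by hand. This assumes pg_trgm's documented padding rule (two leading spaces and one trailing space per word) and the default threshold of 0.3; here $T(\cdot)$ denotes the set of distinct trigrams.

```latex
\begin{align*}
|T(\text{trigram})| &= 8 \\
|T(\text{searching for trigrams})| &= 10 + 4 + 9 = 23 \\
|T(\text{trigram}) \cap T(\text{searching for trigrams})| &= 7 \\
\text{similarity} &= \frac{7}{8 + 23 - 7} = \frac{7}{24} \approx 0.29 < 0.3
\end{align*}
```

The only trigram of the query that is missing from the content is the one ending the word (`am` plus the trailing space), because the content has the plural `trigrams`. The score still falls short because the other 16 trigrams of the content enlarge the union.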