Implement Balanced Full-Text Search for Diary Entries #5156
This PR adds search to the diary entries page so that users can find entries by title and body, addressing #3289.
Key Changes
Model Updates:
- Added the `pg_search` gem to enable full-text search on diary entries with relevance ranking.
- Added a `searchable` column using a `tsvector` for efficient searching and created an index for performance optimization.
Controller Enhancements:
- Updated the `index` action in `DiaryEntriesController` to support search queries, allowing users to filter results by keywords in titles and bodies, with optional language filtering.
Testing:
Context
The approaches explored earlier each fell short on a dataset as large as the diary entries (600,000+ records). The PostgreSQL LIKE operator was fast but offered no relevance ranking, while pg_trgm search could take over 40 seconds to run, which made it unfeasible. The pg_search gem, using tsearch, offers a balance between relevance and performance, though it requires additional database migrations and optimizations. Despite that cost, I think this approach is a "golden middle": a reasonable trade-off between speed and search result quality.
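To make the search-and-filter behaviour described under Key Changes easier to review, here is a minimal sketch of how the `index` action could apply the optional keyword and language filters. It is not the code in this PR: the `DiaryEntrySearch` module, the `search` scope name, the `:search` and `:language` parameter names, and the `language_code` column are all assumptions made for illustration. The helper only relies on the relation responding to `search` and `where`.

```ruby
# Sketch only. Assumed names: DiaryEntrySearch, the `search` scope,
# the :search / :language params and the `language_code` column.
#
# The `search` scope is assumed to be declared on the model with pg_search,
# pointing tsearch at the precomputed tsvector column, roughly like this:
#
#   include PgSearch::Model
#   pg_search_scope :search,
#                   :against => [:title, :body],
#                   :using => { :tsearch => { :tsvector_column => "searchable" } }
module DiaryEntrySearch
  # Applies the optional keyword and language filters to a relation-like
  # object and returns the narrowed relation. Blank values are ignored,
  # so an empty search box leaves the listing unchanged.
  def self.filter(scope, params)
    query = params[:search].to_s.strip
    language = params[:language].to_s.strip

    scope = scope.search(query) unless query.empty?
    scope = scope.where(:language_code => language) unless language.empty?
    scope
  end
end
```

In a controller this could be called as `DiaryEntrySearch.filter(DiaryEntry.all, params)`. Keeping the filtering in one small method makes the blank-input cases easy to cover in tests without a database.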
Commit Summary
- Added `pg_search` for full-text search capabilities.
- Added a `tsvector` column for storing precomputed search data.
- Indexed the `searchable` column for performance improvements.
Work in Progress
This is a draft PR. I want to confirm if I'm heading in the right direction. If the approach is approved, I'll add more tests for better coverage and handling of edge cases. Any comments and recommendations welcome.