Escape your Full Text Search queries
Full Text Search (FTS) is a powerful feature of SQLite, but its intricacies sometimes make it a footgun.
Footgun (noun) source
Any feature whose addition to a product results in the user shooting themselves in the foot.
Say you have a database of SpaceX launches, and you want to use FTS on it to power the search experience in an application. You create the FTS table, you build a nice UI around it, and it works perfectly. Super easy!
Then you notice whenever you search for anything that starts with a dash/minus-sign (
-), your application crashes with an error like this:
In this case, the search query was '-6’ to search for the Starlink-6 launch.
This happened to me recently. Upon investigation, I realized that I had shot myself in the foot with FTS because of its Set Operations feature.
FTS Set Operations
Set Operations allow users to combine the results of multiple FTS queries in different ways. Essentially, they allow you to use logical
NOT operators in search queries.
Set Operations treat the unary minus symbol
- as the logical
NOT operator. When used correctly, it can be used to retrieve the complement result set (i.e., all results minus the results for the given query). Additionally, any queries prefixed with
- are forbidden, and SQLite throws an error when it encounters one (which explains the earlier crash).
Let’s say you want to search for all Starlink launches, except the ones which contain ‘8’ in their name. You can accomplish this by simply prefixing the unwanted term with a
-. The search query then becomes ‘Starlink -8’.
…But wait! This is also the exact query you would to search for the Starlink-8 launch! But FTS interprets it completely differently!
To fix this problem, you must escape such characters in the search query.
Escaping search queries
The simplest way to escape special characters in the user’s search queries is to enclose them in double quotes. This ensures that SQLite treats the search query literally.
Double quotes ensure that SQLite treats the search query literally.
Unfortunately, it’s not enough. This code breaks if the input query contains double quotes too.
Escaping search queries even more
SQLite’s documentation clearly defines verbatim strings for FTS5, and through experimentation I can confirm that the definition applies to FTS3 and FTS4 too.
According to the docs, any
" in the query should be replaced with
"", and then the whole thing must be wrapped in double quotes 1:
This solution seems to work for most cases. I shall update this post if I discover any queries that break this code.
Search query manipulation can still be performed outside the quotes. For example, to match the query when it’s in the middle of a word, surround it asterisks:
Given the complexity of sanitizing user input, it makes more sense to expose Set Operations related functionality of FTS through controllable UI elements (such as buttons and toggles), rather than an uncontrolled text field.
In conclusion, always sanitize user input and escape your FTS search queries.
If you want learn more about the Full Text Search feature in Room, check out the tutorial I did for raywenderlich.com: Introduction to Full Text Search in Room.
Question, comments or feedback? Feel free to reach out to me on Twitter @haroldadmin