Understanding LIKE Operator in PostgreSQL with Example

LIKE Operator in PostgreSQL

The LIKE operator is PostgreSQL’s basic tool for pattern matching on character strings. It is most often used in the WHERE clause of a SELECT statement to keep only the rows whose column value matches a given pattern. If the string matches the pattern, LIKE returns TRUE; in that same case, NOT LIKE returns FALSE. If the pattern contains no wildcard characters, LIKE behaves like the equals (=) operator.

Two wildcards make pattern building flexible: the underscore (_), which matches any single character, and the percent sign (%), which matches any sequence of zero or more characters. For example, 'D%' matches strings that begin with D, '%D%' matches any string that contains D, and '_D%' matches strings whose second character is D. Keep in mind that a LIKE pattern must match the entire string, so a pattern normally has to start and end with a percent sign to match a sequence anywhere within a string.

Basic Functionality and Wildcards

The LIKE operator returns TRUE when the string matches the specified pattern; in that same case, NOT LIKE returns FALSE. When the pattern contains no wildcard characters, LIKE behaves like the equals (=) operator.

The two main wildcards that LIKE allows for pattern building are:

Underscore (_): The underscore matches any single character. Each underscore in a LIKE pattern stands for exactly one character in the string being compared.

Percent sign (%): The percent sign matches any sequence of zero or more characters, in contrast to the underscore, which matches exactly one character.

Flexible search patterns are made possible by these wildcards. For example:

  • 'D%' matches any string that starts with the letter D.
  • '%D%' matches any string that contains D anywhere.
  • '_D%' matches strings whose second character is D.
  • 's___' (an s followed by three underscores) matches strings that begin with s and have exactly four characters.

Notably, a LIKE pattern always has to match the entire string. Consequently, to match a character sequence anywhere within a string, the pattern usually needs to start and end with a percent sign.
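
The wildcard rules above can be checked directly against literal strings. The following is a minimal sketch; the sample strings and column aliases are made up for illustration and do not refer to any real table.

```sql
-- Each expression compares a literal string with a LIKE pattern.
SELECT 'Delhi'     LIKE 'D%'   AS starts_with_d,     -- true
       'San Diego' LIKE '%D%'  AS contains_d,        -- true
       'ADAM'      LIKE '_D%'  AS d_is_second_char,  -- true
       'snow'      LIKE 's___' AS s_plus_three,      -- true: s followed by exactly three characters
       'snowy'     LIKE 's___' AS too_long,          -- false: five characters, pattern needs four
       'abc'       LIKE 'abc'  AS no_wildcards,      -- true: behaves like =
       'abc'       LIKE 'b'    AS partial_match,     -- false: pattern must cover the whole string
       'abc'       LIKE '%b%'  AS wrapped_in_percent;-- true
```

The last two columns show why a pattern normally needs a percent sign on both sides to find a sequence in the middle of a string.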

Case Sensitivity

By default, the PostgreSQL LIKE operator is case-sensitive: it distinguishes uppercase from lowercase letters, so 'Apple' LIKE 'apple' returns FALSE. The ILIKE operator, a PostgreSQL extension that is not part of the SQL standard, matches case-insensitively, so 'Apple' ILIKE 'apple' returns TRUE. Internally, PostgreSQL translates LIKE to the ~~ operator and ILIKE to the ~~* operator; the negated forms NOT LIKE and NOT ILIKE become !~~ and !~~*. These operator names are what EXPLAIN output typically displays.
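
As a quick illustration of the keyword forms and their operator equivalents, the sketch below compares the same two literals in each form. The column aliases are illustrative only.

```sql
-- Keyword forms and the operators PostgreSQL translates them to.
SELECT 'Apple' LIKE  'apple' AS like_kw,       -- false: case-sensitive
       'Apple' ILIKE 'apple' AS ilike_kw,      -- true: case-insensitive
       'Apple' ~~    'apple' AS like_op,       -- false: same as LIKE
       'Apple' ~~*   'apple' AS ilike_op,      -- true: same as ILIKE
       'Apple' !~~   'apple' AS not_like_op,   -- true: same as NOT LIKE
       'Apple' !~~*  'apple' AS not_ilike_op;  -- false: same as NOT ILIKE
```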

To match case-insensitively without ILIKE, convert both the column value and the search pattern to lowercase with a function such as lower(). To let such queries use an index, create an expression index on lower() of the column, for example CREATE INDEX ON account(lower(first_name));. A query like SELECT * FROM account WHERE lower(first_name) = 'foo' can then scan that index. B-tree indexes can also serve ILIKE and ~*, but only if the pattern begins with non-alphabetic characters, that is, characters that are not affected by case conversion.
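
The lower() approach might look like the sketch below. It assumes an account table with a text column first_name, as in this article's sample; the index name is hypothetical.

```sql
-- Assumed sample table (skipped if it already exists).
CREATE TABLE IF NOT EXISTS account (
    id SERIAL PRIMARY KEY,
    first_name TEXT
);

-- Expression index on the lowercased column (index name is illustrative).
CREATE INDEX idx_account_lower_first_name ON account (lower(first_name));

-- The query must use the same expression as the index.
-- Lowercasing the search value too keeps the comparison case-insensitive.
SELECT * FROM account WHERE lower(first_name) = lower('Foo');
```

The index is only considered when the query repeats the indexed expression, so lower(first_name) has to appear in the WHERE clause exactly as it does in the index definition.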

For wildcard searches such as '%something%', extensions like pg_trgm provide operator classes that let LIKE and ILIKE use an index. In non-C locales, collation rules may prevent LIKE from using standard B-tree indexes. For anchored, case-sensitive LIKE patterns, operator classes such as text_pattern_ops or varchar_pattern_ops make the index compare strings character by character, ignoring locale rules, without changing case sensitivity.
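
A trigram index could be set up roughly as follows. This is a sketch that assumes the pg_trgm extension is available on the server and that an account table with a first_name column exists as in this article's sample; the index name is hypothetical.

```sql
-- pg_trgm ships with PostgreSQL's contrib modules; installing it may need elevated privileges.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Assumed sample table (skipped if it already exists).
CREATE TABLE IF NOT EXISTS account (
    id SERIAL PRIMARY KEY,
    first_name TEXT
);

-- GIN index using the trigram operator class (index name is illustrative).
CREATE INDEX idx_account_first_name_trgm
    ON account USING gin (first_name gin_trgm_ops);

-- A non-anchored, case-insensitive search that this index can support.
EXPLAIN SELECT * FROM account WHERE first_name ILIKE '%oba%';
```

Whether the planner actually picks the index depends on table size and statistics, so the plan on a tiny table may still show a sequential scan.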

Escape Character

In a LIKE pattern, an escape character lets you search for literal occurrences of the wildcard characters themselves, the underscore (_) and the percent sign (%). The default escape character is the backslash (\); a different one can be chosen with the ESCAPE clause. To match a literal underscore or percent sign, precede it with the escape character in the pattern. For example, LIKE '%\_%' matches strings that contain a literal underscore.

To match the escape character itself, write it twice. For example, LIKE '%\\%' matches strings that contain a literal backslash. You can disable the escape mechanism entirely by specifying an empty string in the ESCAPE clause (ESCAPE ''). If the standard_conforming_strings configuration parameter is turned off, any backslashes written in literal string constants, including LIKE patterns, must be doubled.

PostgreSQL’s behaviour here deviates slightly from the SQL standard. According to the standard, omitting the ESCAPE clause means there is no escape character (rather than defaulting to a backslash), and a zero-length ESCAPE value is not allowed.
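
The escape rules can be tried on literal strings. The sketch below assumes standard_conforming_strings is on (the default in current PostgreSQL versions), so each backslash in a string constant is a single literal backslash; the '#' escape character and the column aliases are arbitrary choices for illustration.

```sql
SELECT 'a_b' LIKE '%\_%'            AS literal_underscore,  -- true
       'ab'  LIKE '%\_%'            AS no_underscore,       -- false: no literal _ in the string
       '50%' LIKE '%\%'             AS ends_with_percent,   -- true
       'a\b' LIKE '%\\%'            AS literal_backslash,   -- true: doubled escape matches one backslash
       'a_b' LIKE '%#_%' ESCAPE '#' AS custom_escape,       -- true: # acts as the escape character
       'a\b' LIKE 'a\b'             AS default_escape,      -- false: \b in the pattern means a literal b
       'a\b' LIKE 'a\b'  ESCAPE ''  AS escape_disabled;     -- true: backslash is an ordinary character
```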

Limitations and Comparisons

LIKE is a common and practical operator, but it has a number of drawbacks, particularly when contrasted with regular expressions and full-text search in PostgreSQL:

Linguistic Support: LIKE has no linguistic support, so it cannot automatically recognise derived forms of a word (for example, “satisfies” when searching for “satisfy”). Unless every variant is listed explicitly, its raw string matching can miss results. Full-text search, by contrast, normalises words into lexemes using dictionaries, so one search finds several variants of the same word.

Ranking: LIKE returns only a Boolean TRUE or FALSE. It has no mechanism for ranking results by similarity or relevance, which is a fundamental component of full-text search.

Performance: LIKE usually cannot use an ordinary index for non-anchored patterns (such as '%text_pattern'), which frequently leads to a slower sequential scan of the full table.
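
The first two limitations can be illustrated by comparing LIKE with PostgreSQL's built-in full-text search functions. This is a sketch only: the sample sentences are invented, and it assumes the built-in 'english' text search configuration.

```sql
-- LIKE sees raw characters; full-text search compares normalised lexemes.
SELECT 'It satisfies the rule' LIKE '%satisfy%' AS like_match,   -- false
       to_tsvector('english', 'It satisfies the rule')
           @@ to_tsquery('english', 'satisfy')  AS fts_match;    -- true

-- Full-text search can also order matches by relevance, which LIKE cannot do.
SELECT doc,
       ts_rank(to_tsvector('english', doc),
               to_tsquery('english', 'satisfy')) AS rank
FROM (VALUES ('It satisfies the rule'),
             ('To satisfy one rule is to satisfy them all')) AS t(doc)
ORDER BY rank DESC;
```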

Because of these limitations, LIKE is a simpler pattern-matching operator than SIMILAR TO and POSIX regular expressions (~, ~*). That simplicity also makes it generally safer to use with patterns that come from untrusted sources.

Indexing for Performance

Indexing can greatly affect the efficiency of LIKE operations.

Anchored Patterns: Patterns anchored to the start of the string, such as column LIKE 'foo%' or column ~ '^foo', can be served by a B-tree index.

Non-Anchored Patterns: A regular B-tree index is usually not used for patterns that begin with a wildcard (e.g., column LIKE '%bar'). This results in a full sequential scan of the table, which can be extremely slow for large tables.

Locale Considerations: In databases with non-C locales, locale-specific collation rules may prevent LIKE from using ordinary indexes. One way around this is to build the index with a pattern operator class, such as text_pattern_ops, varchar_pattern_ops, or bpchar_pattern_ops, which compares strings strictly character by character and ignores locale-specific collation rules. An alternative is to create the index with the C collation.
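
The two options described above might be written as follows. This is a sketch that assumes an account table with a text column first_name, as in this article's sample; both index names are hypothetical, and in practice one of the two indexes would normally be enough.

```sql
-- Assumed sample table (skipped if it already exists).
CREATE TABLE IF NOT EXISTS account (
    id SERIAL PRIMARY KEY,
    first_name TEXT
);

-- Option 1: pattern operator class, compares character by character.
-- Use varchar_pattern_ops or bpchar_pattern_ops for varchar or char columns.
CREATE INDEX idx_account_first_name_pattern
    ON account (first_name text_pattern_ops);

-- Option 2: index built with the C collation.
CREATE INDEX idx_account_first_name_c
    ON account (first_name COLLATE "C");

-- Anchored pattern that either index is meant to support.
EXPLAIN SELECT * FROM account WHERE first_name LIKE 'Foo%';
```

A pattern operator class index does not help ordinary <, <=, > or >= comparisons under the database's own collation, so a table that needs both kinds of query may also need an index with the default operator class.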

Case-Insensitive Searches: Case-insensitive comparisons can be indexed through the column’s lower() (or upper()) function, as in lower(column) LIKE lower('pattern'). For example, CREATE INDEX ON account(lower(first_name)); lets queries like SELECT * FROM account WHERE lower(first_name) = 'foo'; use the index.

Code Example:

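-- Sample table
CREATE TABLE account (
    id SERIAL PRIMARY KEY,
    first_name TEXT
);

-- Insert sample data
INSERT INTO account (first_name) VALUES
('FooBar'), ('baz');

-- B-tree index with text_pattern_ops so anchored LIKE patterns can use it in any locale
CREATE INDEX idx_firstname_prefix ON account(first_name text_pattern_ops);

-- Anchored pattern: eligible for the index
EXPLAIN SELECT * FROM account WHERE first_name LIKE 'Foo%';

-- Non-anchored pattern: cannot use the B-tree index
EXPLAIN SELECT * FROM account WHERE first_name LIKE '%bar';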

Output:

CREATE TABLE
INSERT 0 2
CREATE INDEX
                       QUERY PLAN                       
 Seq Scan on account  (cost=0.00..1.02 rows=1 width=36)
   Filter: (first_name ~~ 'Foo%'::text)
(2 rows)
                       QUERY PLAN                       
 Seq Scan on account  (cost=0.00..1.02 rows=1 width=36)
   Filter: (first_name ~~ '%bar'::text)
(2 rows)

Both plans show a sequential scan here because the table holds only two rows, so the planner estimates that reading the whole table is cheaper than using the index. On a sufficiently large table, the planner can choose idx_firstname_prefix for the anchored 'Foo%' query, while the '%bar' query still cannot use it.

Conclusion

The PostgreSQL LIKE operator is a simple yet powerful way to match patterns in text data, but its performance depends on how it is used. Anchored patterns like 'Foo%' can use B-tree indexes, especially with operator classes like text_pattern_ops or with the C collation; non-anchored patterns like '%bar' force sequential scans, which are slower on large tables. Case sensitivity can be handled with ILIKE or with expression indexes on lower(column). Although flexible and simple, LIKE lacks linguistic awareness, ranking, and efficient index use for arbitrary wildcards, making it less suitable for advanced queries than full-text search or regular expressions. Understanding indexing and collation rules helps improve LIKE performance.

Kowsalya
Hi, I'm Kowsalya, a B.Com graduate currently working as an Author at Govindhtech Solutions. I'm deeply passionate about publishing the latest tech news and tutorials that bring insightful updates to readers. I enjoy creating step-by-step guides and making complex topics easier to understand for everyone.