How to filter a ClickHouse table by an array-column?
Introduction
Filtering a ClickHouse table by an array-column is a common task and the product offers a lot of functions to work with array-columns.
In this article, we're going to focus on filtering a table by an array-column, but the video below covers a lot of other array-related functions:
Example
We'll use an example of a table with two columns tags_string and tags_int that contain an array of strings and integers respectively.
- Create a sample database and table.
- Create a sample table
- Insert some sample data into the table.
Filter the table using the has(arr, elem) function to return the rows where the arr array contains the elem element.
Filter the table to return the rows where the tags_string array contains the tag1 element.
Use the hasAll(arr, elems) function to return the rows where all the elements in the elems array are present in the arr array.
Filter the table to return the rows where all the elements in the tags_string array are present in the ['tag1', 'tag2'] array.
Use the hasAny(arr, elems) function to return the rows where at least one element in the elems array is present in the arr array.
Filter the table to return the rows where at least one element in the tags_string array is present in the ['tag1', 'tag2'] array.
We can use a lambda function to filter the table using the arrayExists(lambda, arr) function.
Filter the table to return the rows where at least one element in the tags_int array is greater than 3.
