In this article, we’ll show you how to use REGEX on Looker Studio (previously Google Data Studio) with real step-by-step examples. From one marketer to another: I’ll try to make this easy and quick to implement for you.
Personally, I think that using Regular Expressions is a must if you want to create useful reports in Looker Studio. For instance, I use them to analyze my SEO and Google Analytics traffic reports in terms of content categories. I check these reports every week, so I’m confident you’ll find them useful, too.
You can also learn how to use regex on Looker Studio with our step-by-step YouTube video tutorial:
Now, let’s go.
What is a Regular Expression?
A regular expression (or RegEX) is a language that allows you to match patterns of characters. Regular expressions describe a search pattern defined by special characters.
Is particularly useful when parsing and analyzing text data, allowing for the extraction, matching, filtering, and categorization of information based on defined patterns.
Just like you use formulas to work with numbers (e.g., divisions, sums, etc.), you use regular expressions to work with the text to find patterns.
For example, with RegEX, you may group all the website landing page URLs containing the word “contact.”
A regular expression is just the type of instructions or formulas you already know on Google Sheets or Excel to separate text, concatenate it, or convert it to capital letters.
Still, you’ll use symbols instead of these tools’ default functions this time.
Types of RegEX
There are two types of regular expressions: the basic regular expressions, often deemed simple, and the extended regular expressions. These expressions are pivotal, especially when used within calculated field formulas.
Looker Studio uses four REGEX functions; REGEXP_REPLACE, REGEXP_CONTAINS REGEX_EXTRACT, REGEX_MATCH
Let’s see how they work:
1. REGEXP_MATCH()
REGEXP_MATCH() is a function that returns true if a string matches a regular expression. The function takes two arguments: the string to match and the regular expression. It attempts to match the entire string contained in the field expression, ensuring a comprehensive analysis based on the pattern set in the regular expression.
The function always returns true or false if the string does not match the regular expression. REGEXP_MATCH() is your most common type in your reports.
Its everyday use case is categorizing or grouping specific sequences of your data with custom values. In other words, it lets you create new dimensions or breakdowns.
For instance, using a pattern like “.+@gmail.com” can help identify personal emails by matching the entire string contained in the email field. Similarly, a pattern like “s?” demonstrates how the preceding character ‘s’ is made optional, allowing for the match of both ‘http’ and ‘https’ in URLs.
Some examples of REGEXP_MATCH() with Google Looker Studio are:
Grouping your social media posts by topic
Social media managers can use RegEx functions to group social media posts by topics since they typically create different content topics.
The match RegEX allows social media managers to know which particular topics perform better, maximizing their efforts.
Grouping your ad campaigns by stage of the funnel
The REGEXP match function allows marketers to group ad campaigns by funnel stage. Every campaign set up by marketers always has a goal. The goal could be awareness, lead generation, or conversion.
Assuming you run tons of campaigns for your business or clients and have named your campaigns appropriately, pulling out reports might still be stressful.
The Regex match function allows you to break down your reports based on the campaign name. This allows users to easily compare various ad campaigns and identify the most profitable ones at different stages of the funnel.
Breaking down your marketing data by campaign, product, brand, market, or category
Let’s assume you have an e-commerce store and want a breakdown of how your category performs against each other and how profitable each category is to your business.
The match RegEX helps you break your marketing data down by category, allowing you to see how much revenue each category brings to your business.
Breaking down your organic keyword searches by branded VS non-branded
Every business, especially well-known businesses, is searched online by branded or organic keywords. The RegEX match function allows identifying branded vs. non-branded searches from your search console data.
Identifying personal emails from business emails in your CRM or email marketing software
The RegEX match function allows you to differentiate between personal and business emails in your CRM.
After uploading your list to Looker Studio, you can enter the RegEX match formula to identify email addresses beginning with a domain name (signifying a business email) and email addresses that end with “gmail.com” or “outlook.com” or “hotmail.com”
Breaking down your data for tracking team performance and general marketing analysis
Another use case of the RegEX match is to create lists that allow you to quickly break down your data to track your team’s marketing performance.
Assuming you have a content marketing agency, you assign various topics to your team. A RegEX match allows you to filter your data to know the most efficient and fastest writer in your team that gets work done.
You can also use the RegEX match to find the number of articles each staff wrote within a given period.
Identifying high-performing or low-performing ads
Identifying low and high-performing ads is possible using the RegEX match. Every ad campaign has a goal attached to it. These can be tracked since they are metrics attached to them.
You can quickly enter the RegEX match formula to sort high-performing ads from low-performing ads based on KPIs
Cleaning acquisition channels and grouping by regions
The RegEX match allows you to group your acquisition channels and label them either as social ads, CPC, direct, organic, or affiliates.
You can also group your data by regions where your acquisition channels produce more results.
Here is an example of how to use REGEX and data blending to report your marketing efforts on multiple channels.
2. RegEX Extract
RegEX Extract extracts matching substrings from a text string using regular expressions.
In situations where a set of strings is not consistent in format, the REGEXP_EXTRACT function is beneficial.
The REGEXP_EXTRACT function can be used for the following purposes:
To extract different parts of email addresses.
To extract a domain name from a URL.
To extract numbers from a text string.
To extract links from a text.
3. Regex Replace
Regex Replace replaces a part of a text string with a different text string using regular expressions.
Regex Replace helps to do the following:
The function can be used to delete or replace any numerical value in a string
Removes all URLs from a string or replace them with new ones
It helps remove HTML tags from strings
Cleaning UTMs and acquisition sources
A case use of RegEX Replace is in cleaning UTMs and acquisition sources.
When you want to know the acquisition source of your traffic, you use short text codes called UTM to query parameters.
This allows you to track leads from a specific Facebook ad campaign, a referral campaign, or an influencer campaign.
Let’s assume you want to express your data without those codes, extract just input values from the acquisition sources, and use the RegEX replace function to group and clean your data without errors.
How To use RegEX in Google Looker Studio
We’ll see how to use RegEX in Looker Studio, with a step-by-step example using Google Search Console data. We will add categories to the website content based on the URLs so it’s easier to understand if a category is improving in terms of web impressions and clicks.
For this use case, I’m using two different functions: 1. CASE STATEMENT so we can apply different conditions, and 2. REGEXP_MATCH so we can specify the different content categories.
First, I’ll add a table where I can see the landing page URL and metrics like impressions and clicks.
Now I’ll create a custom field by clicking on ‘+Add dimension’ and then click on ‘+Add a field’
Now you will see a box where you can type your formula, but first, let’s understand the structure of our content.
I have 7 different content categories and I have a URL structure that will help me categorize the content.
Now I’m going to create a new condition for each when the URL matches the exact term
CASE
WHEN REGEXP_MATCH(Landing Page, '.*/report-templates/.*') THEN "Report templates"
WHEN REGEXP_MATCH(Landing Page, '.*/templates/.*') THEN "Looker templates"
WHEN REGEXP_MATCH(Landing Page, '.*/tutorial/.*') THEN "Tutorials"
WHEN REGEXP_MATCH(Landing Page, '.*/connectors/.*') THEN "Connectors pages"
WHEN REGEXP_MATCH(Landing Page, '.*/compare/.*') THEN "Comparisons"
WHEN REGEXP_MATCH(Landing Page, '.*/articles/.*') THEN "Articles"
WHEN REGEXP_MATCH(Landing Page, '.*/solutions/.*') THEN "solutions"
WHEN REGEXP_MATCH(Landing Page, '.*/home/.*') THEN "Brand"
WHEN REGEXP_MATCH(Landing Page, '.*pricing.*') THEN "Brand"
ELSE "other"
END
We must apply (*.) in the formula to indicate to Looker Studio that the regular expression should apply to any character and that it should match the previous value ‘0’ or more times
That’s it. Now, you can change the graph from a table to a line chart and see the trend of each category.
I actually combine line charts to understand if something is trending up or trending down, and charts to analyze which specific page has changed in terms of performance.
RegEx Example and Formulas
RegEX, or regular expressions, are a powerful tool for manipulating text and data.
They can be used to search, replace, and validate data. For marketers, RegEX can clean up data, extract information from web pages and documents, and automate repetitive tasks.
Here is a list of Regex formulas you will need if you are interested in further exploring RegEX on Google Properties.
Here’s a simple RegEX use case for marketers:
1. Social Media content tagging using RegEX
Let’s see how to tag and categorize your social media content by topics. For this example, I’m using Porter Metrics Instagram Public data connector that allows you to analyze any Instagram account (you can try it for free too)
2. Cross-channel reports using RegEX and data blending
I’m going to show you how you can combine different marketing channels such as Google Ads, Facebook Ads, GA4 and Google Sheets and use regular expressions to make those different channels match in a single report using ‘Campaign Name’
3. Clean GA4 acquisition reports with REGEX on Google Looker Studio
Learn to clean your Google Analytics 4 acquisition reports using regular expressions. See how to group and filter data accurately to eliminate redundancies in the sources.
RegEx Symbols & Combos
We’ve learned that RegEX is used for pattern matching or string matching. Now let’s learn how the symbols and various combos are used in regular expressions.
Period (.): A period is used to match any one character.
Caret (៱): It’s used to match characters that occur at the beginning of any query.
The dollar sign ($): This is used to match characters at the end of a string—for example*, good marketing results$* match only good marketing results but not good marketing results and reports.
Question mark (?): indicates that you want to match either one or zero occurrences of this pattern.
Square brackets []: Matches any character within the bracket
[៱abc] matches any character except a, b, c.
[a-z] matches enclosed characters a to z in lowercase.
[A-Z] matches characters A to Z in the upper case
[a-zA-Z] matches characters a to z and A to Z in lower and upper cases.
[0-9] matches characters or numbers 0-9
Parentheses (): Are used to group expressions within a regex pattern, which is particularly useful for handling complex data types, such as IP addresses.
Quantifiers
If you want to specify how many times a single character just has to be present in a query string for a match to be found, we use quantifiers.
Here are some examples: (you can input any character inside the empty square brackets)
[ ]? reports characters that occur between 0 and 1 time.
[ ]+ reports characters that occur one or more times
[ ]* reports characters that are between 0 or more times
[ ]{n} reports characters that occur “n” times
[ ]{n,} reports characters that occur “n” or more times
[ ]{x,y} reports characters that occur at least “x” time but less than “y” time.
RegEX Metacharacters
Metacharacters are special characters used in regular expressions to specify search criteria and to manipulate text. They are integral to the regex function, serving as commands or instructions within a regex pattern.
Here are examples of metacharacters and their functions within the regexp.:
d matches any digit character [0-9]. This is a shorthand character class that makes the regex more human-readable, functioning as a regex function to find digit characters.
w matches any word character [a-z, A-Z_0-9], another example of a regex function that utilizes shorthand character classes for simplifying patterns.
s matches whitespace and tabs, demonstrating how the backslash acts as an escape character in regex functions, allowing for the identification of whitespace characters.
Conclusion
Regular expressions can clean up data, find patterns, and automate tasks for marketers. After cleaning up your data, you’d want to present them in an easy-to-read format for your reports.
Porter Metrics provides beautiful, ready-to-use report templates that allow you to create reports on Looker Studio in less than 90 seconds!
Get started and make a free report today.
To match a string, use REGEX_MATCH function.
Frequently Asked Questions
Do I have to learn programming languages to use RegEX
No, you don’t need to learn any programming language to use RegEX. The basic knowledge outlined in this post is enough to help you create your data studio report using our templates
What does mean in RegEX?
A single slash “” is used to match a character. Let’s say you intend to match a question mark “?”, you use “?” or “?$” to match the question mark at the end of the line. A double slash “” helps you check a text string if it has a backslash “.”
What type of RegEX does Looker Studio use?
Google Data Looker uses four regex functions; REGEXP_REPLACE, REGEXP_CONTAINS REGEX_EXTRACT, REGEX_MATCH
Is Regexp_contains case-sensitive?
All regex functions on Looker Studio are case-sensitive. This allows you to filter your data accurately.