URL Match Pattern
Brief Overview
Match patterns are a way to specify groups of URLs: a match pattern matches a specific set of URLs.
Match Pattern Structure
All match patterns are specified as strings. Match patterns consist of three parts: scheme, host, and path. The scheme and host are separated by ://.
<scheme>://<host><path>
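The three-part structure can be illustrated with a short sketch. The helper name split_match_pattern and its regular expression are assumptions made for this example, not part of any WalkMe or browser API; the sketch only separates the parts and does not check the rules for each part.

```python
import re
from typing import Tuple

# Hypothetical helper for illustration only.
# Group 1: scheme (everything before "://")
# Group 2: host (everything up to the first "/")
# Group 3: path (starts with "/")
_PATTERN_PARTS = re.compile(r"^([^:/]+)://([^/]*)(/.*)$")


def split_match_pattern(pattern: str) -> Tuple[str, str, str]:
    """Split a match pattern into (scheme, host, path)."""
    match = _PATTERN_PARTS.match(pattern)
    if match is None:
        raise ValueError(f"not in <scheme>://<host><path> form: {pattern!r}")
    scheme, host, path = match.groups()
    return scheme, host, path


print(split_match_pattern("https://*.mozilla.org/path/*"))
# ('https', '*.mozilla.org', '/path/*')
```

A pattern with no path, such as https://mozilla.org, cannot be split this way and raises ValueError in this sketch.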
scheme
The scheme component can be one of the following: http or https.
host
The host component may take one of three forms:
Form | Matches |
* | Any host. |
*. followed by part of the hostname | The given host and any of its subdomains. |
A complete hostname without wildcards | Only the given host. |
The host must not include a port number.
The * wildcard may appear only at the start of the host.
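The three host forms can be sketched as a small function. This is a minimal illustration under the rules stated above; the name host_matches is hypothetical, and the function assumes the host pattern has already been checked for validity.

```python
def host_matches(host_pattern: str, hostname: str) -> bool:
    """Check a URL hostname against the host part of a match pattern."""
    if host_pattern == "*":
        # Form 1: a lone "*" matches any host.
        return True
    if host_pattern.startswith("*."):
        # Form 2: "*." plus a hostname matches that host and its subdomains.
        base = host_pattern[2:]
        return hostname == base or hostname.endswith("." + base)
    # Form 3: a complete hostname matches only itself.
    return hostname == host_pattern


print(host_matches("*.mozilla.org", "a.mozilla.org"))   # True
print(host_matches("*.mozilla.org", "mozilla.org"))     # True
print(host_matches("mozilla.org", "a.mozilla.org"))     # False
```

Note that the subdomain check compares against "." plus the base host, so a host such as notmozilla.org does not match *.mozilla.org.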
path
The path component must begin with a /.
After that, it may contain any combination of the * wildcard and any of the characters that are allowed in URL paths or query strings. Unlike the host, the path component may contain the * wildcard in the middle or at the end, and the * wildcard may appear more than once.
The path value is matched against the URL path plus the URL query string, treated as a single string. This includes the ? between the two, if a query string is present in the URL.
For example, if you want to match URLs on any domain where the URL path ends with foo.bar, then you need to use an array of Match Patterns like ['*://*/*foo.bar', '*://*/*foo.bar?*']. The ?* is needed, rather than just bar*, in order to anchor the ending * as applying to the URL query string and not some portion of the URL path.
Neither the URL fragment identifier nor the # that precedes it is considered part of the path.
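The path rules above can be sketched as follows. This is an illustration only: the function name path_matches is hypothetical, and translating each * into the regular expression ".*" is one possible way to implement the wildcard, not necessarily how WalkMe or any browser does it.

```python
import re
from urllib.parse import urlsplit


def path_matches(path_pattern: str, url: str) -> bool:
    """Check a URL against the path part of a match pattern."""
    parts = urlsplit(url)  # the fragment is split off and ignored
    # String to test: the path, plus "?" and the query string if one is present.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # Treat every character literally except "*", which matches anything.
    regex = ".*".join(re.escape(piece) for piece in path_pattern.split("*"))
    return re.fullmatch(regex, target) is not None


# The two path patterns from the foo.bar example above.
patterns = ["/*foo.bar", "/*foo.bar?*"]
for url in [
    "https://mozilla.org/dir/foo.bar",
    "https://mozilla.org/dir/foo.bar?x=1",
    "https://mozilla.org/dir/foo.barn",
]:
    print(url, any(path_matches(p, url) for p in patterns))
# The first two URLs print True, the third prints False.
```

The third URL shows why a pattern ending in bar* would be too broad: it would also accept paths that merely continue after "bar".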
Examples
Pattern | Description | Example matches | Example non-matches |
https://*/path | Matches HTTPS URLs on any host whose path is "path". | https://mozilla.org/path; https://a.mozilla.org/path; https://something.com/path | http://mozilla.org/path (unmatched scheme); https://mozilla.org/path/ (unmatched path); https://mozilla.org/a (unmatched path); https://mozilla.org/ (unmatched path); https://mozilla.org/path?foo=1 (unmatched path due to URL query string) |
https://*/path/ | Matches HTTPS URLs on any host whose path is "path/" and which have no URL query string. | https://mozilla.org/path/; https://a.mozilla.org/path/; https://something.com/path/ | http://mozilla.org/path/ (unmatched scheme); https://mozilla.org/path (unmatched path); https://mozilla.org/a (unmatched path); https://mozilla.org/ (unmatched path); https://mozilla.org/path/?foo=1 (unmatched path due to URL query string) |
https://mozilla.org/* | Matches HTTPS URLs only at "mozilla.org", with any URL path and URL query string. | https://mozilla.org/; https://mozilla.org/path; https://mozilla.org/another; https://mozilla.org/path/to/doc; https://mozilla.org/path/to/doc?foo=1 | http://mozilla.org/path (unmatched scheme); https://mozilla.com/path (unmatched host) |
https://mozilla.org/a/b/c/ | Matches only this URL, or this URL with any URL fragment. | https://mozilla.org/a/b/c/; https://mozilla.org/a/b/c/#section1 | Anything else. |
https://mozilla.org/*/b/*/ | Matches HTTPS URLs hosted on "mozilla.org" whose path contains a component "b" somewhere in the middle. Also matches URLs with query strings, if the string ends in a /. | https://mozilla.org/a/b/c/; https://mozilla.org/d/b/f/; https://mozilla.org/a/b/c/d/; https://mozilla.org/a/b/c/d/#section1; https://mozilla.org/a/b/c/d/?foo=/; https://mozilla.org/a?foo=21314&bar=/b/&extra=c/ | https://mozilla.org/b/*/ (unmatched path); https://mozilla.org/a/b/ (unmatched path); https://mozilla.org/a/b/c/d/?foo=bar (unmatched path due to URL query string) |
Invalid Match Patterns
Invalid pattern | Reason |
resource://path/ | Unsupported scheme. |
https://mozilla.org | No path. |
https://mozilla.*.org/ | "*" in host must be at the start. |
https://*zilla.org/ | "*" in host must be the only character or be followed by ".". |
http*://mozilla.org/ | "*" in scheme must be the only character. |
https://mozilla.org:80/ | Host must not include a port number. |
https://* | Empty path: this should be "https://*/*". |
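The rules behind this table can be checked with a rough validator. This is a sketch only: the name is_valid_match_pattern is hypothetical, and accepting a lone * as the scheme is an assumption inferred from the *://*/*foo.bar example and the http*:// row, since the scheme section itself lists only http and https.

```python
import re

# Assumption: the scheme is "http", "https", or a lone "*".
_VALID_PATTERN = re.compile(
    r"^(\*|https?)://"         # scheme
    r"(\*|(?:\*\.)?[^/*:]+)"   # host: "*", "*.name", or "name"; no port, no other "*"
    r"(/.*)$"                  # path: must be present and start with "/"
)


def is_valid_match_pattern(pattern: str) -> bool:
    """Return True if the pattern follows the structural rules described above."""
    return _VALID_PATTERN.match(pattern) is not None


invalid_examples = [
    "resource://path/",
    "https://mozilla.org",
    "https://mozilla.*.org/",
    "https://*zilla.org/",
    "http*://mozilla.org/",
    "https://mozilla.org:80/",
    "https://*",
]
for pattern in invalid_examples:
    print(pattern, is_valid_match_pattern(pattern))  # each prints False

print(is_valid_match_pattern("https://*/*"))  # True
```

The check is purely structural; it does not verify that the host is a real hostname or that the path contains only characters allowed in URLs.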
URL Match Pattern vs Regex Statement
WalkMe provides two options for extension configuration: URL match pattern and Regex. Each option has its own advantages; below is a brief overview of both.
What is Regex and URL Match Pattern?
A URL match pattern tells a browser extension which web pages it should work on. It uses a dedicated syntax to describe sets of URLs, and these patterns determine which pages an extension interacts with when applying features like content scripts or background scripts. URL match patterns are designed specifically for controlling the behavior of browser extensions based on the URLs they encounter.
A regular expression, often abbreviated as regex or regexp, is a sequence of characters that defines a search pattern. It is used for pattern matching within strings: searching for specific patterns, extracting information, or replacing parts of a text. Regex is a versatile tool widely used in text processing, data validation, and various programming tasks.
Comparison
| URL match pattern | Regex |
Purpose | Created and optimized for web browsers to define which URLs a browser extension or application should apply to. | A general tool for pattern matching in strings. It defines a search pattern using a combination of literal characters and metacharacters. |
Usage | Used by browser extensions to specify which web pages they should operate on, for features like content scripts or background scripts. | Can be used in a wide range of applications, such as text processing, data validation, and search-and-replace operations. |
Syntax | URL match patterns use a dedicated syntax whose only wildcard is '*'. A '?' is a literal character that separates the URL path from the query string. | Regex patterns use a syntax built on metacharacters such as '.' (matches any character), '*' (matches zero or more occurrences), and '+' (matches one or more occurrences). |
Example | The pattern https://example.com/* matches any URL that starts with https://example.com/. | The regex ^(http|https)://example\.com(?:/[^/?#]+)*$ matches http and https URLs on the domain example.com that have no trailing slash, query string, or fragment. |
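To see how the regex from the Example row behaves, it can be run against a few URLs. The sketch below uses Python's re module purely for illustration; the sample URLs are assumptions chosen for this example, and other regex engines may differ in details.

```python
import re

# The regex from the Example row, copied as-is.
URL_REGEX = re.compile(r"^(http|https)://example\.com(?:/[^/?#]+)*$")

# Hypothetical sample URLs for illustration.
results = {
    url: URL_REGEX.match(url) is not None
    for url in [
        "https://example.com",          # bare domain
        "http://example.com/a/b",       # plain path segments
        "https://example.com/",         # trailing slash
        "https://example.com/a?x=1",    # query string
        "https://sub.example.com/a",    # subdomain
    ]
}
for url, matched in results.items():
    print(matched, url)
# True  https://example.com
# True  http://example.com/a/b
# False https://example.com/
# False https://example.com/a?x=1
# False https://sub.example.com/a
```

The match pattern https://example.com/* from the same row is broader on paths: it also accepts trailing slashes and query strings, which this regex rejects. Small differences like these are easy to overlook when writing a regex by hand.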
Pros and Cons
| URL match pattern | Regex |
Pros | Simplicity for URL matching: Specifically designed for matching URLs, making it intuitive and easy to use for this purpose. Easy to understand: URL match patterns tend to be more readable and straightforward than complex regex. Built for browser extensions: URL match patterns are often the recommended and standard way to specify which URLs your extension should interact with. | General purpose: Extremely versatile and usable for a wide range of pattern matching tasks beyond URLs, including text processing and manipulation. Rich syntax: Provides a rich set of metacharacters and expressions that allow highly precise pattern matching and manipulation. Pattern flexibility: Can handle complex patterns that go beyond URL structures, including specific characters, sequences, or sets of characters. |
Cons | Limited to URL matching: They only match URLs and are not suitable for general-purpose pattern matching. Lack of complexity: They lack the rich feature set and flexibility of regular expressions, so they are not suitable for tasks that require intricate pattern matching beyond URL structures. Not suitable for all applications: They are tailored specifically to web browser extensions and may not be relevant outside that scope. | Complexity: Writing complex regular expressions can be challenging and error-prone, and the syntax takes time to master. Performance overhead: Extremely complex regex patterns can severely impact performance, especially with large inputs. Readability: Complex regex patterns can be hard to read and understand, making maintenance and debugging difficult. |
Official WalkMe Statement
WalkMe recommends using URL match pattern for extension configuration whenever the use case allows it. We consider Regex a last resort for edge cases. URL match pattern guarantees seamless extension functionality and minimizes behind-the-scenes rule validation, which in turn provides a better user experience.
URL match pattern is currently the default option for any new system configuration in the extension settings of the Admin Center.