path_patterns
Centralized regex patterns for config path parsing.
This module contains all regex patterns used across sparkwheel for parsing configuration paths, references, and file paths. Patterns are compiled once at module load and documented with examples.
Why regex here?
- Complex patterns (lookahead, Unicode support)
- Performance (C regex engine)
- Correctness (battle-tested patterns)
Patterns are centralized here instead of scattered across multiple files for easier maintenance, testing, and documentation.
PathPatterns
Collection of compiled regex patterns for config path parsing.
All patterns are compiled once at class definition time and reused. Each pattern includes documentation with examples of what it matches.
Source code in src/sparkwheel/path_patterns.py
ABSOLUTE_REFERENCE = re.compile(f'{RESOLVED_REF_KEY}(\w+(?:::\w+)*)')
class-attribute
instance-attribute
Match absolute resolved reference patterns: @id::path::value
Finds @ resolved references in config values and expressions. Handles nested paths with :: separators and list indices (numbers).
Matches
- "@model::lr" -> captures "model::lr"
- "@data::0::value" -> captures "data::0::value"
- "@x" -> captures "x"
Examples in expressions
- "$@model::lr * 2" -> matches "@model::lr"
- "$@x + @y" -> matches "@x" and "@y"
Pattern breakdown
- @ -> literal @ symbol
- (\w+(?:::\w+)*) -> captures word chars followed by optional :: and more word chars
Note: \w includes [a-zA-Z0-9_] plus Unicode word characters, so this handles international characters correctly.
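The breakdown above can be tried out with a small standalone sketch. It rebuilds the pattern locally and assumes RESOLVED_REF_KEY is the literal "@" (the breakdown suggests this, but the constant's definition is not shown on this page). The helper name absolute_ids is hypothetical and is not part of sparkwheel.

```python
import re

# Assumption: RESOLVED_REF_KEY is "@", as the pattern breakdown suggests.
RESOLVED_REF_KEY = "@"
ABSOLUTE_REFERENCE = re.compile(rf"{RESOLVED_REF_KEY}(\w+(?:::\w+)*)")


def absolute_ids(text: str) -> list[str]:
    """Hypothetical helper: return every captured ID, without the @ prefix."""
    # With exactly one capturing group, findall returns that group's text.
    return ABSOLUTE_REFERENCE.findall(text)


print(absolute_ids("$@model::lr * 2"))  # ['model::lr']
print(absolute_ids("@data::0::value"))  # ['data::0::value']
print(absolute_ids("$@x + @y"))         # ['x', 'y']
print(absolute_ids("@modèle::lr"))      # ['modèle::lr']
```

The last call works because \w is Unicode-aware for str patterns in Python 3.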
FILE_AND_ID = re.compile('(.*\\.(yaml|yml))(?=(?:::.*)|$)', re.IGNORECASE)
class-attribute
instance-attribute
Split combined file path and config ID.
The pattern uses lookahead to find the file extension without consuming the :: separator that follows.
Matches
- "config.yaml::model::lr" -> group 1: "config.yaml"
- "path/to/file.yml::key" -> group 1: "path/to/file.yml"
- "/abs/path/cfg.yaml::a::b" -> group 1: "/abs/path/cfg.yaml"
Non-matches
- "model::lr" -> no .yaml/.yml extension
- "data.json::key" -> wrong extension
Edge cases handled
- Case insensitive: "Config.YAML::key" works
- Multiple extensions: "backup.yaml.old" does not match, because .yaml must be followed by :: or the end of the string
- Absolute paths: "/etc/config.yaml::key" works
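To see what "without consuming the :: separator" means in practice, here is a minimal sketch that copies the pattern shown above and reports where the match ends. The helper file_and_remainder is a hypothetical name used only for illustration.

```python
import re
from typing import Optional, Tuple

# Pattern copied from the FILE_AND_ID definition above.
FILE_AND_ID = re.compile(r"(.*\.(yaml|yml))(?=(?:::.*)|$)", re.IGNORECASE)


def file_and_remainder(src: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: return (file part, untouched remainder) or None."""
    m = FILE_AND_ID.search(src)
    if m is None:
        return None
    # The lookahead is zero-width, so the match ends right after the
    # extension and the "::" separator stays in the remainder.
    return m.group(1), src[m.end():]


print(file_and_remainder("config.yaml::model::lr"))  # ('config.yaml', '::model::lr')
print(file_and_remainder("Config.YAML::key"))        # ('Config.YAML', '::key')
print(file_and_remainder("model::lr"))               # None
print(file_and_remainder("data.json::key"))          # None
```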
RELATIVE_REFERENCE = re.compile(f'(?:{RESOLVED_REF_KEY}|{RAW_REF_KEY})(::)+')
class-attribute
instance-attribute
Match relative reference prefixes: @::, @::::, %::, etc.
Used to find relative navigation patterns in config references. The number of :: pairs indicates how many levels to go up.
Matches
- "@::" -> resolved reference one level up (parent)
- "@::::" -> resolved reference two levels up (grandparent)
- "%::" -> raw reference one level up
- "%::::" -> raw reference two levels up
Examples in context
- In "model::optimizer", "@::lr" means "@model::lr"
- In "a::b::c", "@::::x" means "@a::x"
Pattern breakdown
- (?:@|%) -> @ or % symbol (non-capturing group)
- (::)+ -> one or more :: pairs (captured)
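The "levels up" rule can be sketched as a small resolver. This is only an illustration that reproduces the two documented examples; it assumes RESOLVED_REF_KEY is "@" and RAW_REF_KEY is "%", and resolve_relative is a hypothetical helper, not sparkwheel's actual resolution code.

```python
import re

# Assumption: RESOLVED_REF_KEY is "@" and RAW_REF_KEY is "%".
RELATIVE_REFERENCE = re.compile(r"(?:@|%)(::)+")


def resolve_relative(current_id: str, ref: str) -> str:
    """Hypothetical helper: rewrite a relative reference as an absolute one."""
    m = RELATIVE_REFERENCE.match(ref)
    if m is None:
        return ref  # not a relative reference; leave it unchanged
    symbol = ref[0]
    # The prefix is one symbol plus N "::" pairs, each two characters long.
    levels = (m.end() - 1) // 2
    parts = current_id.split("::")
    if levels > len(parts):
        raise ValueError(f"{ref!r} goes above the root of {current_id!r}")
    base = parts[: len(parts) - levels]
    rest = ref[m.end():]
    return symbol + "::".join(base + ([rest] if rest else []))


print(resolve_relative("model::optimizer", "@::lr"))  # @model::lr
print(resolve_relative("a::b::c", "@::::x"))          # @a::x
print(resolve_relative("model::optimizer", "%::lr"))  # %model::lr
```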
find_absolute_references(text)
classmethod
Find all absolute reference patterns in text.
Only searches in expressions ($...) or pure reference values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | String to search | required |
Returns:
| Type | Description |
|---|---|
| list[str] | List of reference IDs found (without @ prefix) |
Examples:
>>> PathPatterns.find_absolute_references("@model::lr")
['model::lr']
>>> PathPatterns.find_absolute_references("$@x + @y")
['x', 'y']
>>> PathPatterns.find_absolute_references("normal text")
[]
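One way the "expressions or pure reference values" restriction could work is sketched below. The gating logic is an assumption made for illustration: it treats a string starting with "$" as an expression and a string that is entirely one reference as a pure reference. The function name and the "@" key are also assumptions; the real method may decide differently.

```python
import re

# Assumption: RESOLVED_REF_KEY is "@".
ABSOLUTE_REFERENCE = re.compile(r"@(\w+(?:::\w+)*)")


def find_absolute_references_sketch(text: str) -> list[str]:
    """Hypothetical stand-in for PathPatterns.find_absolute_references."""
    is_expression = text.startswith("$")  # assumed expression marker
    is_pure_reference = ABSOLUTE_REFERENCE.fullmatch(text) is not None
    if not (is_expression or is_pure_reference):
        return []  # plain strings are not scanned at all
    return ABSOLUTE_REFERENCE.findall(text)


print(find_absolute_references_sketch("@model::lr"))   # ['model::lr']
print(find_absolute_references_sketch("$@x + @y"))     # ['x', 'y']
print(find_absolute_references_sketch("normal text"))  # []
```

Under this sketch a plain string such as "mail @alice" also returns an empty list, since it is neither an expression nor a pure reference.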
find_relative_references(text)
classmethod
Find all relative reference patterns in text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | String to search | required |
Returns:
| Type | Description |
|---|---|
| list[str] | List of relative reference patterns found (e.g., ['@::', '@::::']) |
Examples:
>>> PathPatterns.find_relative_references("value: @::sibling")
['@::']
>>> PathPatterns.find_relative_references("@::::parent and @::sibling")
['@::::', '@::']
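A detail worth noting when reproducing these results: because RELATIVE_REFERENCE contains a capturing group, re.findall would return only the last captured "::" rather than the whole prefix. The sketch below uses finditer and group(0) instead. It assumes the keys are "@" and "%", and find_relative_prefixes is a hypothetical name; the actual method may be implemented differently.

```python
import re

# Assumption: RESOLVED_REF_KEY is "@" and RAW_REF_KEY is "%".
RELATIVE_REFERENCE = re.compile(r"(?:@|%)(::)+")


def find_relative_prefixes(text: str) -> list[str]:
    """Hypothetical helper: return each full relative prefix found in text."""
    # group(0) is the whole match, e.g. "@::::", not just the captured "::".
    return [m.group(0) for m in RELATIVE_REFERENCE.finditer(text)]


print(find_relative_prefixes("value: @::sibling"))           # ['@::']
print(find_relative_prefixes("@::::parent and @::sibling"))  # ['@::::', '@::']
print(RELATIVE_REFERENCE.findall("@::::parent"))             # ['::']
```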
split_file_and_id(src)
classmethod
Split combined file path and config ID using FILE_AND_ID pattern.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| src | str | String like "config.yaml::model::lr" | required |
Returns:
| Type | Description |
|---|---|
| tuple[str, str] | Tuple of (filepath, config_id) |
Examples:
>>> PathPatterns.split_file_and_id("config.yaml::model::lr")
('config.yaml', 'model::lr')
>>> PathPatterns.split_file_and_id("model::lr")
('', 'model::lr')
>>> PathPatterns.split_file_and_id("/path/to/file.yml::key")
('/path/to/file.yml', 'key')
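The splitting behaviour in these examples could be built on the FILE_AND_ID pattern roughly as follows. This is a minimal sketch, not the library's implementation: split_file_and_id_sketch is a hypothetical name, and the handling of a bare file path with no ID is an assumption.

```python
import re

# Pattern copied from the FILE_AND_ID definition on this page.
FILE_AND_ID = re.compile(r"(.*\.(yaml|yml))(?=(?:::.*)|$)", re.IGNORECASE)


def split_file_and_id_sketch(src: str) -> tuple[str, str]:
    """Hypothetical stand-in for PathPatterns.split_file_and_id."""
    m = FILE_AND_ID.search(src)
    if m is None:
        return "", src  # no YAML file part: the whole string is the ID
    filepath = m.group(1)
    rest = src[m.end(1):]
    # Drop the "::" separator that the lookahead left in place.
    config_id = rest[2:] if rest.startswith("::") else rest
    return filepath, config_id


print(split_file_and_id_sketch("config.yaml::model::lr"))  # ('config.yaml', 'model::lr')
print(split_file_and_id_sketch("model::lr"))               # ('', 'model::lr')
print(split_file_and_id_sketch("/path/to/file.yml::key"))  # ('/path/to/file.yml', 'key')
print(split_file_and_id_sketch("config.yaml"))             # ('config.yaml', '')
```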
find_references(text)
Convenience function wrapping PathPatterns.find_absolute_references().
is_yaml_file(filepath)
Check if filepath is a YAML file (.yaml or .yml).
Simple string check - no regex needed for this.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filepath | str | Path to check | required |
Returns:
| Type | Description |
|---|---|
| bool | True if filepath ends with .yaml or .yml (case-insensitive) |
Examples:
>>> is_yaml_file("config.yaml")
True
>>> is_yaml_file("CONFIG.YAML")
True
>>> is_yaml_file("data.json")
False
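A plain string check with this behaviour might look like the sketch below. The name is_yaml_file_sketch is hypothetical, and the use of lower() plus endswith() is an assumption about how the check is written.

```python
def is_yaml_file_sketch(filepath: str) -> bool:
    """Hypothetical stand-in for is_yaml_file: case-insensitive suffix test."""
    # str.endswith accepts a tuple, so both extensions are checked in one call.
    return filepath.lower().endswith((".yaml", ".yml"))


print(is_yaml_file_sketch("config.yaml"))  # True
print(is_yaml_file_sketch("CONFIG.YAML"))  # True
print(is_yaml_file_sketch("data.json"))    # False
```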