[Matching Brackets] draft approaches #3670

Merged
merged 5 commits into from
Aug 1, 2024
30 changes: 30 additions & 0 deletions exercises/practice/matching-brackets/.approaches/config.json
@@ -0,0 +1,30 @@
{
  "introduction": {
    "authors": [
      "colinleach",
      "BethanyG"
    ]
  },
  "approaches": [
    {
      "uuid": "449c828e-ce19-4930-83ab-071eb2821388",
      "slug": "stack-match",
      "title": "Stack Match",
      "blurb": "Maintain context during stream processing by use of a stack.",
      "authors": [
        "colinleach",
        "BethanyG"
      ]
    },
    {
      "uuid": "b4c42162-751b-42c8-9368-eed9c3f4e4c8",
      "slug": "repeated-substitution",
      "title": "Repeated Substitution",
      "blurb": "Use substring replacement to iteratively simplify the string.",
      "authors": [
        "colinleach",
        "BethanyG"
      ]
    }
  ]
}
78 changes: 78 additions & 0 deletions exercises/practice/matching-brackets/.approaches/introduction.md
@@ -0,0 +1,78 @@
# Introduction

The aim in this exercise is to determine whether opening and closing brackets are properly paired within the input text.

These brackets may be nested deeply (think Lisp code) and/or dispersed among a lot of other text (think complex LaTeX documents).

Community solutions fall into two main groups:

1. Those which make a single pass or loop through the input string, maintaining necessary context for matching.
2. Those which repeatedly make global substitutions within the text, removing matched bracket pairs until none remain.


## Single-pass approaches

```python
def is_paired(input_string):
    bracket_map = {"]" : "[", "}": "{", ")":"("}
    tracking = []

    for element in input_string:
        if element in bracket_map.values():
            tracking.append(element)
        if element in bracket_map:
            if not tracking or (tracking.pop() != bracket_map[element]):
                return False
    return not tracking
```

The key in this approach is to maintain context by pushing open brackets onto some sort of stack (_in this case appending to a `list`_), then checking if there is a corresponding closing bracket to pair with the top stack item.

See [stack-match][stack-match] approaches for details.


## Repeated-substitution approaches

```python
def is_paired(text):
    text = "".join(item for item in text if item in "()[]{}")
    while "()" in text or "[]" in text or "{}" in text:
        text = text.replace("()","").replace("[]", "").replace("{}","")
    return not text
```

In this approach, we first remove any non-bracket characters, then use a loop to repeatedly remove inner bracket pairs.

See [repeated-substitution][repeated-substitution] approaches for details.


## Other approaches

Languages prizing immutability are likely to use techniques such as `foldl()` or recursive matching, as discussed on the [Scala track][scala].

This is possible in Python, but can read as unidiomatic and will (likely) result in inefficient code if not done carefully.

For anyone wanting to go down the functional-style path, Python has [`functools.reduce()`][reduce] for folds and added [structural pattern matching][pattern-matching] in Python 3.10.
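
As a rough illustration of the fold idea, here is a minimal sketch using `functools.reduce()`. The names `PAIRS`, `OPENERS` and `step` are hypothetical, chosen for this example only, and the tuple-based "stack" is one possible design rather than a recommended one.

```python
from functools import reduce

# Hypothetical helper names, used only for this sketch.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def step(stack, char):
    """Fold step: combine the current state with one character.

    The state is a tuple of open brackets, or None once a mismatch is seen.
    """
    if stack is None:               # a mismatch was already found; stay failed
        return None
    if char in OPENERS:
        return stack + (char,)      # "push" by building a new tuple
    if char in PAIRS:
        if stack and stack[-1] == PAIRS[char]:
            return stack[:-1]       # "pop" by slicing off the last item
        return None                 # unmatched closing bracket
    return stack                    # ignore non-bracket characters


def is_paired(input_string):
    # An empty tuple at the end means every bracket was matched.
    return reduce(step, input_string, ()) == ()
```

Note that `reduce()` cannot stop early on a mismatch, and each "push" or "pop" copies the tuple, which is one way such code can end up less efficient than the list-based loop.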

Recursion is not highly optimised in Python and there is no tail call optimization, but the default recursion limit of 1000 should be more than enough for solving this problem recursively.
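
One possible recursive formulation is sketched below. It assumes a design in which the function recurses once per nesting level rather than once per character, so the recursion depth tracks bracket depth and not input length. The inner helper `consume` is a hypothetical name for this example.

```python
def is_paired(input_string):
    pairs = {"(": ")", "[": "]", "{": "}"}
    closers = set(pairs.values())

    def consume(index, expected):
        """Scan from index until the expected closing bracket is found.

        Returns the index just past that bracket, or -1 on a mismatch.
        expected is None at the top level, where no closer is allowed.
        """
        while index < len(input_string):
            char = input_string[index]
            if char in pairs:
                # Recurse to handle everything inside this bracket pair.
                index = consume(index + 1, pairs[char])
                if index == -1:
                    return -1
            elif char in closers:
                return index + 1 if char == expected else -1
            else:
                index += 1          # skip non-bracket characters
        # Running out of input is only fine at the top level.
        return index if expected is None else -1

    return consume(0, None) == len(input_string)
```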


## Which approach to use

For short, well-defined input strings such as those currently in the test file, repeated-substitution allows a passing solution in very few lines of code.
But as input grows, this method could become less and less performant, due to the multiple passes and changes needed to determine matches.

The single-pass strategy of the stack-match approach allows for stream processing, scales linearly (_`O(n)` time complexity_) with text length, and will remain performant for very large inputs.

Examining the community solutions published for this exercise, it is clear that many programmers prefer the stack-match method which avoids the repeated string copying of the substitution approach.

Thus it is interesting and perhaps humbling to note that repeated-substitution is **_at least_** as fast in benchmarking, even with large (>30 kB) input strings!
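
For readers who want to check this on their own machine, the following is a minimal timing sketch using `timeit`. The sample input is a hypothetical one made up for this example (it is not the benchmark data from the performance article), and the function names `is_paired_stack` and `is_paired_replace` are chosen here only to tell the two approaches apart.

```python
import timeit


def is_paired_stack(input_string):
    bracket_map = {"]": "[", "}": "{", ")": "("}
    stack = []
    for element in input_string:
        if element in bracket_map.values():
            stack.append(element)
        if element in bracket_map:
            if not stack or stack.pop() != bracket_map[element]:
                return False
    return not stack


def is_paired_replace(text):
    text = "".join(item for item in text if item in "()[]{}")
    while "()" in text or "[]" in text or "{}" in text:
        text = text.replace("()", "").replace("[]", "").replace("{}", "")
    return not text


# Hypothetical input: balanced brackets mixed with filler text (33,000 characters).
sample = "{[(some filler text)]}" * 1500

for func in (is_paired_stack, is_paired_replace):
    seconds = timeit.timeit(lambda: func(sample), number=20)
    print(f"{func.__name__}: {seconds / 20 * 1000:.3f} ms per call")
```

Timings will vary with the Python version, the hardware, and the shape of the input: deeply nested brackets need more substitution passes than the shallow nesting used in this sample.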

See the [performance article][article-performance] for more details.

[article-performance]: https://exercism.org/tracks/python/exercises/matching-brackets/articles/performance
[pattern-matching]: https://docs.python.org/3/whatsnew/3.10.html#pep-634-structural-pattern-matching
[reduce]: https://docs.python.org/3/library/functools.html#functools.reduce
[repeated-substitution]: https://exercism.org/tracks/python/exercises/matching-brackets/approaches/repeated-substitution
[scala]: https://exercism.org/tracks/scala/exercises/matching-brackets/dig_deeper
[stack-match]: https://exercism.org/tracks/python/exercises/matching-brackets/approaches/stack-match
@@ -0,0 +1,67 @@
# Repeated Substitution


```python
def is_paired(text):
    text = "".join([element for element in text if element in "()[]{}"])
    while "()" in text or "[]" in text or "{}" in text:
        text = text.replace("()","").replace("[]", "").replace("{}","")
    return not text
```

In this approach, the steps are:

1. Remove all non-bracket characters from the input string (_as done through the filter clause in the list-comprehension above_).
2. Iteratively remove all remaining bracket pairs: this reduces nesting in the string from the inside outwards.
3. Test for a now empty string, meaning all brackets have been paired.
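
To make the inside-outwards reduction visible, here is a small tracing sketch. The helper `reduction_steps` is a hypothetical name for this example; it records the string after the initial filtering and after each pass of the loop.

```python
def reduction_steps(text):
    """Return every intermediate string produced by repeated substitution."""
    text = "".join(char for char in text if char in "()[]{}")
    steps = [text]
    while "()" in text or "[]" in text or "{}" in text:
        text = text.replace("()", "").replace("[]", "").replace("{}", "")
        steps.append(text)
    return steps


print(reduction_steps("(a[b{c}])"))   # ['([{}])', '([])', '()', '']
print(reduction_steps("[({]})"))      # ['[({]})']
```

The first input ends as an empty string, so its brackets are paired. The second contains no adjacent pair at all, so the loop never runs and the non-empty remainder signals a mismatch.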


The code above spells out the approach particularly clearly, but there are (of course) several possible variants.


## Variation 1: Walrus Operator within a Generator Expression


```python
def is_paired(input_string):
    symbols = "".join(char for char in input_string if char in "{}[]()")
    while (pair := next((pair for pair in ("{}", "[]", "()") if pair in symbols), False)):
        symbols = symbols.replace(pair, "")
    return not symbols
```

The second solution above does essentially the same thing as the initial approach, but uses a generator expression assigned with a [walrus operator][walrus] `:=` (_introduced in Python 3.8_) in the `while-loop` test.
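
The loop test relies on the two-argument form of `next()`, which returns a default value when the generator yields nothing. A minimal sketch of that idiom in isolation (the variable names are chosen for this example only):

```python
symbols = "{[()]}"
pairs = ("{}", "[]", "()")

# next() returns the first pair found in symbols,
# or the default (False) if the generator yields nothing.
first_found = next((pair for pair in pairs if pair in symbols), False)
print(first_found)   # ()

# With no adjacent pair present, the default is returned and the loop would stop.
none_left = next((pair for pair in pairs if pair in "{["), False)
print(none_left)     # False
```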


## Variation 2: Regex Substitution in a While Loop

Regex enthusiasts can modify the previous approach, using `re.sub()` instead of `str.replace()` in the `while-loop` test:

```python
import re

def is_paired(text: str) -> bool:
    text = re.sub(r'[^{}\[\]()]', '', text)
    while text != (text := re.sub(r'{\}|\[]|\(\)', '', text)):
        continue
    return not bool(text)
```


## Variation 3: Regex Substitution and Recursion


It is possible to combine `re.sub()` and recursion in the same solution, though not everyone would view this as idiomatic Python:


```python
import re

def is_paired(input_string):
    replaced = re.sub(r"[^\[\(\{\}\)\]]|\{\}|\(\)|\[\]", "", input_string)
    return not input_string if input_string == replaced else is_paired(replaced)
```

Note that solutions using regular expressions ran slightly *slower* than `str.replace()` solutions in benchmarking, so adding this type of complexity brings no benefit to this problem.

[walrus]: https://martinheinz.dev/blog/79/
@@ -0,0 +1,5 @@
def is_paired(text):
    text = "".join(element for element in text if element in "()[]{}")
    while "()" in text or "[]" in text or "{}" in text:
        text = text.replace("()","").replace("[]", "").replace("{}","")
    return not text
@@ -0,0 +1,50 @@
# Stack Match


```python
def is_paired(input_string):
    bracket_map = {"]" : "[", "}": "{", ")":"("}
    stack = []

    for element in input_string:
        if element in bracket_map.values():
            stack.append(element)
        if element in bracket_map:
            if not stack or (stack.pop() != bracket_map[element]):
                return False
    return not stack
```

The point of this approach is to maintain a context of which bracket sets are currently "open":

- If a left bracket is found, push it onto the stack (_append it to the `list`_).
- If a right bracket is found, **and** it pairs with the last item placed on the stack, pop the bracket off the stack and continue.
- If there is a mismatch, for example `'['` with `'}'`, or there is no left bracket on the stack, the code can immediately terminate and return `False`.
- When all the input text is processed, determine if the stack is empty, meaning all left brackets were matched.
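
The steps above can be followed by recording the stack after each bracket. The sketch below is a tracing variant written for illustration; `trace_is_paired` is a hypothetical name, and it uses `elif`/`else` to skip non-bracket characters in the trace.

```python
def trace_is_paired(input_string):
    """Return (result, history), where history lists the stack after each bracket."""
    bracket_map = {"]": "[", "}": "{", ")": "("}
    stack = []
    history = []
    for element in input_string:
        if element in bracket_map.values():
            stack.append(element)               # left bracket: push
        elif element in bracket_map:
            if not stack or stack.pop() != bracket_map[element]:
                history.append((element, "mismatch"))
                return False, history           # stop at the first mismatch
        else:
            continue                            # ignore other characters
        history.append((element, "".join(stack)))
    return not stack, history


print(trace_is_paired("{[a]}"))
# (True, [('{', '{'), ('[', '{['), (']', '{'), ('}', '')])
print(trace_is_paired("[}"))
# (False, [('[', '['), ('}', 'mismatch')])
```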

In Python, a [`list`][concept:python/lists]() is a good implementation of a stack: it has [`list.append()`][list-append] (_equivalent to a "push"_) and [`list.pop()`][list-pop] methods built in.

Some solutions use [`collections.deque()`][collections-deque] as an alternative implementation, though this has no clear advantage (_since the code only uses appends to the right-hand side_) and near-identical runtime performance.
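
For comparison, a `deque`-based variant might look like the sketch below. It is a direct translation of the list version, written here as an illustration rather than taken from a specific community solution.

```python
from collections import deque


def is_paired(input_string):
    bracket_map = {"]": "[", "}": "{", ")": "("}
    stack = deque()     # used only from the right-hand end, like a list

    for element in input_string:
        if element in bracket_map.values():
            stack.append(element)
        if element in bracket_map:
            if not stack or stack.pop() != bracket_map[element]:
                return False
    return not stack    # an empty deque is falsy, just like an empty list
```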

The default iteration for a dictionary is over the _keys_, so the code above uses a plain `bracket_map` to search for right brackets, while `bracket_map.values()` is used to search for left brackets.

Other solutions created two sets of left and right brackets explicitly, or searched a string representation:

```python
if element in ']})':
```

Such changes made little difference to code length or readability, but ran about 5-fold faster than the dictionary-based solution.
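
A sketch of the string-membership variant is shown below. It keeps the dictionary for looking up the expected partner but tests membership against literal strings; using `elif` here is a choice made for this example.

```python
def is_paired(input_string):
    bracket_map = {"]": "[", "}": "{", ")": "("}
    stack = []

    for element in input_string:
        if element in "[{(":            # left brackets: membership test on a string
            stack.append(element)
        elif element in "]})":          # right brackets
            if not stack or stack.pop() != bracket_map[element]:
                return False
    return not stack
```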

At the end, success is an empty stack, tested above by using the [False-y quality][falsey] of `[]` (_as Python programmers often do_).

To be more explicit, we could alternatively use an equality:

```python
return stack == []
```

[list-append]: https://docs.python.org/3/tutorial/datastructures.html#more-on-lists
[list-pop]: https://docs.python.org/3/tutorial/datastructures.html#more-on-lists
[collections-deque]: https://docs.python.org/3/library/collections.html#collections.deque
[falsey]: https://docs.python.org/3/library/stdtypes.html#truth-value-testing
@@ -0,0 +1,8 @@
bracket_map = {"]" : "[", "}": "{", ")":"("}
stack = []
for element in input_string:
    if element in bracket_map.values(): stack.append(element)
    if element in bracket_map:
        if not stack or (stack.pop() != bracket_map[element]):
            return False
return not stack
14 changes: 14 additions & 0 deletions exercises/practice/matching-brackets/.articles/config.json
@@ -0,0 +1,14 @@
{
  "articles": [
    {
      "uuid": "af7a43b5-c135-4809-9fb8-d84cdd5138d5",
      "slug": "performance",
      "title": "Performance",
      "blurb": "Compare a variety of solutions using benchmarking data.",
      "authors": [
        "colinleach",
        "BethanyG"
      ]
    }
  ]
}