kayomn.net/content/blog/hacking-generators-cpp.md

259 lines
12 KiB
Markdown
Executable File

+++
title = "Hacking Generator Functions in C++"
date = 2022-05-22
description = "Taking a look at all the dark corners regarding lambdas in C++."
[taxonomies]
programming = ["c++"]
+++
# Preface
Since their introduction in the C++11 standards revision, lambdas have been a massive success story for C++ due to their interaction with existing, pre-lambda code - both inside and outside of the standard library implementations. However, features of their specification make them far more powerful than initially obvious, as is explored in this article.
# History
As mentioned well in the preface, C++ lambdas integrate well with existing standard and third-party code, in a manner where picking them up over the previous "functor `struct`s" approach comes with little-to-no friction.
```cpp
struct MyFunctor {
std::string name;
MyFunctor(std::string name) {
this->name = name;
}
void operator()() {
std::cout << "Hello, " << this->name << "!";
}
};
void beforeLambdas() {
MyFunctor myFunctor = MyFunctor("Functors");
myFunctor();
}
```
Achieving "lambda-like" functionality in a pre-C++11 world was feasible; primitive operator overloading and the object model provided by C++ from the get-go allowed for making structs that somewhat visibly behaved like functions at the call-site, although with a lot of associated boilerplate.
Practically, this was no different than writing a highly-specialized and, arguably redundant single-method class. In many cases, this made type-erasing callback interfaces like C's [`qsort`](https://en.cppreference.com/w/c/algorithm/) preferable as a lower-boilerplate solution.
Irrespective of individual developer sentiment, the C++ standard library adopted functors for much of its [algorithm](https://en.cppreference.com/w/cpp/algorithm) and [container](https://en.cppreference.com/w/cpp/container) libraries, where a type-safe callback for customizable logic was necessary.
```cpp
void afterLambdas() {
std::string name = "Lambdas";
auto myLambda = [name]() -> void {
std::cout << "Hello, " << name << "!";
};
}
```
Lambdas bridged the two worlds of ease-of-use and type-safety by introducing new syntax into the language that allowed for expressing a standalone function and data captures as a lambda expression. Typically, these lambdas are compiled down into anonymous structs with their data captures as members and the function as an method overloading the call syntax operator.
Alongside the introduction of general-purpose type erasure in [`std::function`](https://en.cppreference.com/w/cpp/utility/functional/function), this made lambdas either compatible with, or at the very least preferable over, code that was built with functor objects in mind.
# C++ Lambda Characteristics
```cpp
~HashTable() {
auto destroyChain = [this](Bucket * bucket) {
while (bucket) {
Bucket * nextBucket = bucket->next;
bucket->item.key.~KeyType();
bucket->item.value.~ValueType();
this->allocator->Deallocate(bucket);
bucket = nextBucket;
}
};
destroyChain(this->freeBuckets);
uint32_t count = this->count;
for (uint32_t i = 0; count != 0; i += 1) {
Bucket * bucket = this->buckets.At(i);
if (bucket) {
destroyChain(bucket);
count -= 1;
}
}
this->allocator->Deallocate(this->buckets.pointer);
}
```
Through personal practice, I've found them to be incredibly powerful beyond as a substitute for [nested functions](https://dlang.org/spec/function.html#nested) and [local functions](https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/local-functions) in the D and C# programming languages respectively.
The above code demonstrates their use in the aforementioned manner as a simple cleanup function, which removes code duplication between the two units of logic calling `destroyChain`. This same solution could have been achieved through a secondary method in the `HashTable` class, however it then requires the code reader to go elsewhere to find its declaration and further pollutes the class namespace with otherwise unnecessary symbols. With that said, it's important not to lose sight of overhead considerations when using lambdas.
Programmers coming from languages like C# or Java may have a pre-conceived notion that lambdas are inherently more costly compared to methods, however this is not necessarily the case. In part, this is because C++ can reason about the size of a lambda at compile-time because captured values are explicit rather than implicit. In contrast, a Java lambda or C# delegate does not necessarily know the bounds of its stack at compile-time and must dynamically allocate it during execution - introducing the potential for more overhead if it cannot be hoisted onto the stack via [escape analysis](https://en.wikipedia.org/wiki/Escape_analysis).
```
HashTable::~HashTable() [complete object destructor]:
push r13
push r12
push rbp
push rbx
sub rsp, 8
mov rbp, rdi
mov rbx, QWORD PTR [rdi+32]
test rbx, rbx
je .L8
.L9:
mov rsi, rbx
mov rbx, QWORD PTR [rbx+16]
mov rdi, QWORD PTR [rbp+0]
mov rax, QWORD PTR [rdi]
call [QWORD PTR [rax+8]] ; this->allocator->Deallocate(bucket)
test rbx, rbx
jne .L9
.L8:
mov r13d, DWORD PTR [rbp+8]
test r13d, r13d
je .L10
mov r12d, 0
jmp .L14
.L19:
mov ecx, OFFSET FLAT:.LC0
mov edx, 43
mov esi, OFFSET FLAT:.LC1
mov edi, OFFSET FLAT:.LC2
call __assert_fail
.L20:
sub r13d, 1
.L12:
add r12d, 1
test r13d, r13d
je .L10
.L14:
mov eax, r12d
cmp rax, QWORD PTR [rbp+16]
jnb .L19
mov rdx, QWORD PTR [rbp+24]
mov rbx, QWORD PTR [rdx+rax*8]
test rbx, rbx
je .L12
.L13:
mov rsi, rbx
mov rbx, QWORD PTR [rbx+16]
mov rdi, QWORD PTR [rbp+0]
mov rax, QWORD PTR [rdi]
call [QWORD PTR [rax+8]] ; this->allocator->Deallocate(bucket)
test rbx, rbx
jne .L13
jmp .L20
.L10:
mov rdi, QWORD PTR [rbp+0]
mov rsi, QWORD PTR [rbp+24]
mov rax, QWORD PTR [rdi]
call [QWORD PTR [rax+8]] ; this->allocator->Deallocate(bucket)
add rsp, 8
pop rbx
pop rbp
pop r12
pop r13
ret
```
The lack of dynamic allocation overhead with lambdas in C++ can be visibly observed if the disassembly for the previous code sample is viewed, wherein no calls to `malloc`, C++'s `new` implementation, or otherwise. In fact, through closer inspection of the above Clang 11 x86-64 disassembly, the `destroyChain` lambda has been completely elided, instead inlining its operations to each of its two call-sites.
This leaves the intrinsic `__assert_fail` and user-defined `Allocator::Deallocate(void *)` functions as the only remaining `call` instructions in the generated output, the latter of which is a virtual function and is therefore significantly harder to statically inline as opposed to a statically allocated lambda functor.
However, the specification and intrinsics of lambdas in C++ isn't really the concern of this article. Rather, I'm more interested in looking at how lambdas can be broken to implement support for *functionality* they was never intended.
# Generative Functions
For those with no experience with them from other languages or no formal background in computer science, generative functions - also referred to as *"generator"* or *"generating"* functions - are a sub-category of functions that appear in both computing and mathematics. However, much like the differences in terms for functions between computing and mathematics, generators can also be defined differently between both domains.
Because functions in lambda calculus are defined as always producing the same output for any given input, most applications of generator functions in the space of computing would therefore not match that definition. Iterators - one of the most popular applications of generative logic across many computer science domains and languages - are a perfect example of this, as they are designed to move their way through a set of data, producing whatever value is at the current iteration of it.
```js
function* factorial(n) {
let total = 1
while (n > 1) {
total *= n
n -= 1
yield total
}
}
for (let value of factorial(10)) {
console.log(value)
}
```
Even the above example of factorial value generation is flawed by the standards of lambda calculus, as it is producing a new result each proceeding invocation. However, this article is looking at C++, not mathematics, so I digress.
```cpp
int main(int argc, char ** argv) {
auto doThing = [i = (size_t)0]() -> void {
i += 1;
};
}
```
If an attempt at compiling the above source code were made, a compiler error would be raised telling the programmer that `i` cannot be assigned to as it is read-only. Indeed, this is true for any capture, as the underlying anonymous `struct` that is generated by the compiler is always marked `const` with *no exceptions to this*.
```cpp
int main(int argc, char ** argv) {
auto doThing = [i = (size_t)0]() mutable -> void {
i += 1;
}
}
```
Or rather, this is the case until the lambda declaration is annotated with the `mutable` keyword, after which, the source code compiles and the desired behavior is produced.
# Mutable C++ Functors
It's hard to determine what a committee's overall rationale behind making lambdas `const` by default was, however, I can speculate based on my own experience as a programmer in general and specifically with C++.
## Capture Syntax Ambiguity
The difference between capturing a variable by value versus reference is one character: an ampersand. Considering this, the general formatting rules people apply when writing lambdas, and the speed at which your average professional will scan-read source code, it is believable that the committee considered potential typos to be a massive human error source.
```cpp
int32_t i = 0;
auto modify = [i]() {
i += 1;
};
```
Were `i` to be `mutable` by default in the above example, any invocation of `modify` would write to the captured copy of `i` rather than the outer variable that the capture shadows. This would result in `i += 1` having no observable side-effects in the compiled program and a hidden bug that is hard to spot.
## Parallelism Concerns with Shared Functors
The previous concern is complicated further if lambda instances are shared between parallel units of computation, like threads, as this capture could be being modified simultaneously. This would result in unforeseen race conditions, as the lambda could have otherwise been considered [mathematically pure](https://en.wikipedia.org/wiki/Pure_function) in the sense that its inputs were immutable.
## Functional Influences
A catch-all for both of the above without fully understanding the domain of concerns. This hypothesizes that the committee may have settled to make captures `const` by default because "functional languages like `const` a lot". While this is a somewhat naive and very cynical view of the thought-process of hundreds of intelligent individuals, it is common that the sum of a group is a lot dumber than the people that compose them. Furthermore, with with a language that has as many moving parts as C++, oversights have been and continue to be frequent. Therefore, cordoning off an entire area of potential problems would serve as an effective solution until more experimentation with them could be undertaken.
# Concluding
Herb Sutter looked at these three concerns and more in a [whitepaper produced as part of the evolution working group for C++ back in 2012](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3424.pdf), arguing that - while immutable lambda inputs makes sense in a lot of cases - there are some other unintended side-effects of these decisions that couldn't have been foresaw beforehand. The paper is very much worth a read for a more in-depth look at the hidden rough-edges around lambdas.
```cpp
std::function tokenize(std::string const & str) {
return [str, cursor = (size_t)0]() mutable -> Token {
while (cursor < str.size()) {
// ...
}
return Token{};
};
}
```
Whatever the case may be, for the time-being this is a solution that works and continues to work well for a language as close to the metal as C++. I have successfully used it myself in many projects that required things like simple tokenization and more. I find that, for cases where I would reach for a single-use class, it can be preferable as it avoids creating more nominal types.