Steve Love
@IAmSteveLove@mastodon.social
@IAmSteveLove
Work with children or animals
Offer to present on C# performance at a C++ conference
Definitely don’t attempt live coding
Optimization is hard
A little learning…
Drink deep, or taste not the Pierian Spring
There shallow draughts intoxicate the brain, And drinking largely sobers us again.
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.
A few misconceptions
Measuring performance in the wild
What do equality comparisons have to do with pool anyway?
Well, when things got interesting…
C++ is too complicated
Java is too simplistic
GC
Boxing
Copying
Reflection
… is unpredictable
… takes a long time
… interrupts program flow
But
deterministic destruction costs
value types aren’t (individually) collected
CONSIDER defining a struct instead of a class if instances of the type are small and commonly short-lived or are commonly embedded in other objects.
C# More like |
var id = Guid.NewGuid();
//...
Console.WriteLine("Id is {0}", id);
vs.
var id = Guid.NewGuid();
//...
Console.WriteLine("Id is {0}", id.ToString());
string
again…"value-like"
Copied by-reference:
var original = "A string";
var copy = original;
Assert.That(original, Is.SameAs(copy));
Compared by value:
var buf = new StringBuilder("A string");
copy = buf.ToString();
Assert.That(original, Is.Not.SameAs(copy));
Assert.That(original, Is.EqualTo(copy));
Equals
method
More to equality comparisons than Equals
public readonly struct Colour
{
public int Red { get; init; }
public int Green { get; init; }
public int Blue { get; init; }
}
[Test, Category("Performance")]
public void CreateUniqueSet()
{
var rng = new Random(210);
var unique = Enumerable.Range(0, 10_000)
.Select(_ => rng.Next())
.Select(r => new Colour {
Red = (byte)(r >> 16),
Green = (byte)(r >> 8),
Blue = (byte)r })
.ToHashSet();
Assert.That(unique, Is.Not.Empty);
}
dotnet test -v Normal
...
NUnit Adapter 4.5.0.0: Test execution complete
Passed CreateUniqueSet [17 ms]
public readonly struct Colour
{
public byte Red { get; init; }
public byte Green { get; init; }
public byte Blue { get; init; }
}
The problem isn’t really with Equals
…
…or with byte
fields
ValueType.GetHashCode
(in the worst case with byte
fields)
considers only the first non-null
field
means a maximum of 256 unique codes (buckets)
1
2 →
1 comparison
12
3 →
2 more comparions (3 total)
123
4 →
3 more comparions (6 total)
1234
Next addition 4 more comparisons
(10 now)
1 n = 1
23 n = 2
456 n = 3
78910 n = 4
…
(n2 + n) / 2
10,000 Colour
objects
49,968,936 calls to Equals
10,0002 + 10,000 / 2 =
50,005,000
…don’t have to be complicated.
public override int GetHashCode()
{
return (Red << 16) | (Green << 8) | Blue;
}
HashCode.Combine(Red, Green, Blue)
also works, and is efficient.
public override bool Equals(object? obj)
{
return obj is Colour other &&
Red == other.Red && Green == other.Green && Blue == other.Blue;
}
public readonly struct Colour : IEquatable<Colour>
{
public int Red { get; init; }
public int Green { get; init; }
public int Blue { get; init; }
public bool Equals(Colour other)
{
return Red == other.Red && Green == other.Green && Blue == other.Blue;
}
public override bool Equals(object? obj)
{
return obj is Colour other && Equals(other);
}
public override int GetHashCode()
{
return (Red << 16) | (Green << 8) | Blue;
}
}
Since C# v12.0 (.NET 8)
public readonly struct Colour(int r, int g, int b) : IEquatable<Colour>
{
public int Red { get; init; } = r;
public int Green { get; init; } = g;
public int Blue { get; init; } = b;
public bool Equals(Colour other)
{
return Red == other.Red && Green == other.Green && Blue == other.Blue;
}
public override bool Equals(object? obj)
{
return obj is Colour other && Equals(other);
}
public override int GetHashCode()
{
return (Red << 16) | (Green << 8) | Blue;
}
}
Since C# v9.0 (.NET 5)
public sealed record Colour(int Red, int Green, int Blue);
or, since C# v10.0 (.NET 6)
public readonly record struct Colour(int Red, int Green, int Blue);
More accurate timings than profiling
Less fine-grained results
Simple struct with overridden virtuals (Equals
& GetHashCode
)
Struct implementing IEquatable<T>
Struct implementing IEquatable<T>
using Primary Constructor
Class implementing IEquatable<T>
Record
Record struct
Create a HashSet
from 10,000 objects
| Method | Case | Mean | Error | StdDev |
|----------- |-------------------------------------- |---------:|---------:|--------:|
| CreateHash | [Shared.Types.EquatableColourClass] | 232.7 us | 10.67 us | 0.59 us |
| CreateHash | [Shared.Types.EquatableColour] | 135.0 us | 26.52 us | 1.45 us |
| CreateHash | [Shared.Types.EquatablePrimaryColour] | 134.6 us | 55.93 us | 3.07 us |
| CreateHash | [Shared.Types.OverrideColour] | 123.4 us | 28.58 us | 1.57 us |
| CreateHash | [Shared.Types.RecordColour] | 227.0 us | 49.61 us | 2.72 us |
| CreateHash | [Shared.Types.RecordStructColour] | 128.6 us | 83.39 us | 4.57 us |
Predictable vs. Fast
Managed runtime
JIT vs. AOT
GC (again)
using int_type = std::int32_t;
struct colour
{
explicit colour(int_type red, int_type green, int_type blue)
: r { red }, g { green }, b { blue } {}
int_type red() const { return r; }
int_type green() const { return g; }
int_type blue() const { return b; }
bool operator==(const colour & other) const {
return r == other.r && g == other.g && b == other.b;
}
private:
int_type r, g, b;
};
Colour
std::vector<colour> create_seeds(int N)
{
std::default_random_engine gen(210);
std::uniform_int_distribution<int_type> dist(0);
auto seeds = std::views::iota(0, N)
| std::views::transform([&](int _) { return dist(gen); })
| std::views::transform([](int_type r)
{ return colour(r >> 16 & 0xFF, r >> 8 & 0xFF, r & 0xFF); });
return { std::begin(seeds), std::end(seeds) };
}
Not testing ranges - already been done:
Tina Ulbrich - Throwing Tools at C++ Ranges (CPPONSea 2023)
const int N = 10000;
std::vector<colour> colours;
void BM_create_global_state(const benchmark::State& state)
{
colours = create_seeds(N);
}
struct colour_hash
{
std::size_t operator()(const colour & col) const
{
return col.red() << 16 | col.green() << 8 | col.blue();
}
};
void BM_create_hashmap(benchmark::State& state)
{
for (const auto& _ : state)
{
auto result = std::unordered_set<colour, colour_hash>{ colours.cbegin(), colours.cend() };
auto count = result.size();
benchmark::DoNotOptimize(count);
}
}
BENCHMARK(BM_create_hashmap)
->Setup(BM_create_global_state)
->Unit(benchmark::kMicrosecond);
unordered_map<T>
much faster on lookups
Measure
Measure again!
Worst performance culprits - always the programmers
Benchmark.NET: https://benchmarkdotnet.org/
Google Benchmark: https://github.com/google/benchmark