Wednesday 

Room 2 

11:40 - 12:40 

(UTC+02

Talk (60 min)

Are You Smarter Than A Branch Predictor?

A single unpredictable branch can quietly dominate your execution time. In this (very) high-energy, interactive session (part performance talk, part game show), the audience becomes the branch predictor. Participants will see small C++ snippets drawn from real-world production code and vote on which version is faster. Stress balls included for correct answers :)

C++
Technique

Each example, whether illustrative pseudocode or a concrete pattern, focuses on branch prediction and branch elimination. Yes, they're microbenchmarks, but they expose patterns that show up in real systems, especially when profiling points to control flow as the bottleneck.
Examples will use animated visuals, annotated assembly, and performance counter data to explore how modern CPUs handle control flow. We'll look at different architectures, from out-of-order x86 cores to in-order ARM and embedded processors, to see how each behaves under pressure. We will also look at how popular compilers, from GCC to MSVC, transform the same source code in surprisingly different ways.
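As a rough illustration of the kind of snippet involved (a sketch under stated assumptions, not an example taken from the talk), the code below runs one data-dependent branch over the same values in shuffled and in sorted order. The names `make_data`, `sorted_copy`, `sum_above`, `check_same_result`, and `time_ns` are hypothetical helpers invented for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: n pseudo-random values in [0, 255].
// The seed is fixed so that every run sees the same data.
std::vector<int> make_data(std::size_t n, std::uint32_t seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    std::vector<int> v(n);
    for (int& x : v) x = dist(gen);
    return v;
}

// Same values, ascending order: the comparison below then flips
// from false to true at most once over the whole array.
std::vector<int> sorted_copy(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Sums the elements >= threshold. The `if` is the branch under study.
std::int64_t sum_above(const std::vector<int>& v, int threshold) {
    std::int64_t sum = 0;
    for (int x : v) {
        if (x >= threshold) sum += x;
    }
    return sum;
}

// Sanity check: reordering must not change the answer, otherwise
// comparing the two timings would be meaningless.
void check_same_result(const std::vector<int>& a,
                       const std::vector<int>& b, int threshold) {
    assert(sum_above(a, threshold) == sum_above(b, threshold));
}

// Times one call with steady_clock and returns nanoseconds. The sum is
// written to `result` so the caller can observe and verify it.
std::int64_t time_ns(const std::vector<int>& v, int threshold,
                     std::int64_t& result) {
    const auto t0 = std::chrono::steady_clock::now();
    result = sum_above(v, threshold);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}
```

A driver would call `time_ns` many times on both orderings and compare the distributions rather than single readings. Whether any gap shows up depends on the compiler and flags: an optimizer may turn the `if` into a conditional move or vectorize the loop, in which case both orderings can run at the same speed.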


Through these examples, we'll dig into branch prediction and the costs of misprediction, indirect calls and virtual dispatch, conditional moves versus branches (including when and why compilers tend to emit CMOV and when they don't), and more! Each example will be connected to real-world trade-offs: what -O2 and -O3 already do for you, determinism and worst-case latency concerns, code size constraints in certain systems, and why the same branch can behave very differently across architectures.
We'll measure on hardware, inspect branch counters, and discuss how to design trustworthy microbenchmarks without fooling ourselves.
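To make the "conditional moves versus branches" comparison concrete, here is a minimal sketch (an assumption about the flavor of such examples, not code from the talk) of one counting loop written two ways. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Branchy form: written as control flow that depends on the data.
std::size_t count_below_branchy(const std::vector<int>& v, int limit) {
    std::size_t n = 0;
    for (int x : v) {
        if (x < limit) ++n;
    }
    return n;
}

// Branch-free form: the comparison result (false or true) is converted
// to 0 or 1 and added directly, so the source has no `if` in the loop.
std::size_t count_below_branchless(const std::vector<int>& v, int limit) {
    std::size_t n = 0;
    for (int x : v) {
        n += static_cast<std::size_t>(x < limit);
    }
    return n;
}

// Equivalence check: confirm both forms agree before timing either one.
void check_forms_agree(const std::vector<int>& v, int limit) {
    assert(count_below_branchy(v, limit) == count_below_branchless(v, limit));
}
```

The source form is only a hint. A compiler is free to emit a branch, a conditional move, a flag-setting instruction, or vectorized code for either function, so the generated assembly and the branch-miss counters, not the C++ text, are what settle which version actually avoids mispredictions.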


These examples will come from real-world production scenarios that have surprised even experienced engineers. The talk has been refined through multiple internal runs and fits comfortably within the allotted time. While examples are shown in C++, the principles apply to any language targeting modern CPUs.
Attendees will leave with:

  • A concrete mental model of how branches behave on real processors
  • Practical patterns for reducing mispredictions and improving worst-case latency
  • A clearer understanding of when not to micro-optimize
  • The confidence to validate changes using proper profiling tools

Michelle D'Souza

Michelle Fae D'Souza is a software engineer at Bloomberg, where she develops systems for the company's data license solution, delivering trading and financial data to firms worldwide. She has also worked on real-time, low-level production systems. An active member of Bloomberg's C++ Guild, she also serves as a Technical Rep within the firm.
Michelle earned a bachelor's degree in computer science from the University of California, Berkeley, where she was a Davis Scholar and President of the Computer Science Honors Society. She regularly speaks about C++, performance, and systems-level topics.