Making Software 10x Faster with Low-Level CPU Optimizations

Speaker: Sasha Goldshtein

Modern processors are extremely complex. Writing fast code means not only avoiding slow APIs but also taking advantage of every last bit of performance the processor has to offer. In this session we'll review some key performance wins you can get from modern processors by properly using instruction-level parallelism, vectorizing loops, avoiding store-to-load forwarding stalls, making better use of the CPU cache, and employing other low-level optimizations that
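
As a rough illustration of the instruction-level parallelism mentioned in the abstract, the sketch below sums an array twice: once through a single accumulator and once through four independent accumulators. It is not taken from the session; the function names `sum_single` and `sum_multi` and the choice of four accumulators are assumptions made for this example. The idea is that independent dependency chains give an out-of-order core more additions it can execute in the same cycle. Whether this helps in practice depends on the compiler and the CPU, since an optimizing compiler may already unroll or vectorize the simple loop on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline (hypothetical name): each addition depends on the previous one,
   so the loop forms one long dependency chain. */
uint64_t sum_single(const uint32_t *data, size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += data[i];
    }
    return sum;
}

/* Sketch (hypothetical name): four accumulators form four independent
   dependency chains that the processor may be able to overlap. */
uint64_t sum_multi(const uint32_t *data, size_t n) {
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    /* Main loop: consume four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }

    /* Tail loop: handle the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        s0 += data[i];
    }

    /* Combine the partial sums. */
    return (s0 + s1) + (s2 + s3);
}
```

Both functions return the same value for any input, because unsigned integer addition is associative. The same trick applied to floating-point data can change the rounding of the result, so it should be measured and validated before being adopted.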
