Tensorflow - Your CPU supports instructions that this binary was not compiled to use

If you've installed TensorFlow from pip, you've probably come across this message:

tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA

Well, I did too and it got me wondering how much of a difference those instructions end up making. People often say GPUs are required for any non-trivial ML work so I wanted to see if that was really true. I decided to delve in and optimize TensorFlow and make it faster on my machine. I gave a talk about this yesterday at the TensorFlow and Deep Learning Meetup in Singapore, the slides will be attached at the end as well.

more ...