> as they code I notice nested for loops that could have been one matrix multiplication.
This seems like an odd choice for your primary example.
Is the primary concern that a sufficiently smart compiler could take your matrix multiplication and turn it into a vectorized instruction?
Is it only applicable in certain languages then? E.g. do JVM languages typically enable vectorized instruction optimizations?
Is the primary concern that a single matrix multiplication is more maintainable than nested for loops?
Is it only applicable in certain domains then (e.g. machine learning)? Most of my data isn’t modelled as matrices, so would I need some nested for loops anyway to populate a matrix to enable this refactoring?
Is it perhaps worth writing a (short?) top-level post with a worked-out example of the refactoring you have in mind, and why matrix multiplication would be better than nested for loops?
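For concreteness, here is my guess at the kind of refactoring you might mean (a hypothetical example, not necessarily the one you had in mind): computing a Gram matrix of pairwise dot products with three nested loops versus a single matrix multiplication in NumPy.

```python
import numpy as np

def gram_loops(X):
    # Nested-for-loop version: out[i][j] = dot(X[i], X[j])
    n = len(X)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(len(X[0])):
                out[i][j] += X[i][k] * X[j][k]
    return out

def gram_matmul(X):
    # One matrix multiplication replaces all three loops,
    # and NumPy dispatches to an optimized (vectorized) BLAS routine.
    X = np.asarray(X)
    return X @ X.T

X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
assert np.allclose(gram_loops(X), gram_matmul(X))
```

If this is the sort of thing you mean, then the answer to my language question above may be that it's less about the compiler and more about delegating to a tuned library routine.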
Imagine someone named Omega offers to play a game with you. Omega has a bag, and they swear on their life that exactly one of the following statements is true:
1. They put a single piece of paper in the bag, and it has “1” written on it.
2. They put ten trillion pieces of paper in the bag, numbered “1”, “2”, “3”, etc. up to ten trillion.
Omega then has an independent neutral third party reach into the bag and pull out a random piece of paper which they then hand to you. You look at the piece of paper and it says “1” on it. Omega doesn’t get to look at the piece of paper, so they don’t know what number you saw on that paper.
Now the game Omega proposes to you is: if you can guess which of the two statements was the true one, they’ll give you a million dollars. Otherwise, you get nothing.
Which do you guess? Do you guess that the bag had a single piece of paper in it, or do you guess that the bag had 10 trillion pieces of paper in it?
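The intended answer can be made explicit with Bayes' rule. A quick sketch, assuming a 50/50 prior over Omega's two claims (the prior is my assumption; the scenario doesn't specify one):

```python
from fractions import Fraction

# Prior: no reason given to favor either of Omega's claims.
prior_single = Fraction(1, 2)
prior_many = Fraction(1, 2)

N = 10**13  # ten trillion slips in the second scenario

# Likelihood of the neutral party drawing a slip labeled "1":
lik_single = Fraction(1)    # the only slip in the bag says "1"
lik_many = Fraction(1, N)   # one slip out of ten trillion says "1"

# Posterior probability that the bag held a single slip.
posterior_single = (prior_single * lik_single) / (
    prior_single * lik_single + prior_many * lik_many
)
# posterior_single = N / (N + 1), i.e. overwhelmingly close to 1:
# seeing "1" is far better explained by the single-slip bag.
```

So under that prior, guessing the single-slip bag is the million-dollar bet.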