A team of developers from the University of California, Soochow University and LuxiTech has created a new method that makes it possible to run AI language models without matrix multiplication (MatMul), an operation that is usually a major computational bottleneck.
In a preprint posted to arXiv, they describe a method that replaces the usual 16-bit floating-point weights with ternary values {-1, 0, +1} and relies on simpler element-wise functions together with quantization techniques that speed up data processing. The researchers also replaced the self-attention of standard transformer blocks with a MatMul-free linear gated recurrent unit. They found that their system scales as well as current state-of-the-art models while requiring markedly less computing power and electricity.
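To see why ternary weights remove the need for matrix multiplication, here is a minimal NumPy sketch (not the authors' implementation): when every weight is -1, 0 or +1, each output of a dense layer reduces to a signed sum of inputs. The absmean-style scaling and all function names below are illustrative assumptions.

```python
import numpy as np

def ternarize(w, eps=1e-8):
    """Quantize a float weight matrix to {-1, 0, +1} with one scaling factor
    (loosely following the "absmean" scheme used by ternary-weight models)."""
    scale = np.mean(np.abs(w)) + eps
    w_q = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return w_q, scale

def ternary_linear(x, w_q, scale):
    """A linear layer whose weights are only -1, 0 or +1: each output entry is
    a signed sum of input entries, so no MatMul (and no multiplications apart
    from the final rescaling) is required."""
    out = np.zeros((x.shape[0], w_q.shape[1]), dtype=x.dtype)
    for j in range(w_q.shape[1]):
        plus = x[:, w_q[:, j] == 1].sum(axis=1)    # add inputs with weight +1
        minus = x[:, w_q[:, j] == -1].sum(axis=1)  # subtract inputs with weight -1
        out[:, j] = plus - minus
    return out * scale

# Toy check: the addition-only layer matches an ordinary matmul
# against the quantized weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4)).astype(np.float32)
x = rng.normal(size=(2, 8)).astype(np.float32)
w_q, s = ternarize(w)
assert np.allclose(ternary_linear(x, w_q, s),
                   (x @ w_q.astype(np.float32)) * s, atol=1e-5)
```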
To read the blog, visit here.