JumpyTaco

Paper with Code: You can now run LLMs without Matrix Multiplications

Saw this paper: https://arxiv.org/pdf/2406.02528

In essence: MatMul operations can be completely eliminated from LLMs while maintaining strong performance at billion-parameter scales. By using an optimised kernel during inference, the model's memory consumption can be reduced by more than 10× compared to unoptimised models.
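To get a feel for how a MatMul can disappear: the paper's approach constrains weights to ternary values {-1, 0, +1}, so every dot product collapses into additions and subtractions. Here's a minimal NumPy sketch of that idea (the function name `ternary_linear` and the loop structure are illustrative, not from the paper, which uses fused hardware kernels):

```python
import numpy as np

def ternary_linear(x, w_ternary):
    """Linear layer with ternary weights in {-1, 0, +1}.

    The usual x @ W multiply-accumulate reduces to signed
    accumulation: add inputs where w == +1, subtract where
    w == -1, and skip zeros. No multiplications are needed.
    """
    out = np.zeros((x.shape[0], w_ternary.shape[1]))
    for j in range(w_ternary.shape[1]):
        pos = w_ternary[:, j] == 1   # columns contributing +x
        neg = w_ternary[:, j] == -1  # columns contributing -x
        out[:, j] = x[:, pos].sum(axis=1) - x[:, neg].sum(axis=1)
    return out
```

Since each ternary weight fits in ~1.6 bits instead of 16, this is also where the large memory savings come from; the real implementation packs weights and fuses the accumulation into a single GPU kernel rather than looping in Python.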

source: https://x.com/rohanpaul_ai/status/1799122826114330866

6mo ago
2.6K views
