2/3 There are no embedding or logits flops in the LCM & the context length is much shorter (a sentence is on average 30 subwords), so a context length of 3000 subwords is only 100 in the LCM. See section 2.5.1 of the paper arxiv.org/abs/2412.08821 for a comparison of inference flops.
LLMs have revolutionized the field of artificial intelligence and have emerged as the de-facto tool for many tasks. The current established technology of LLMs is to process input and generate output a...