NBFC Platform

Project:

Text Compression for Efficient Language Generation

Created by MG96

External Public cs.CL

Authors

David Gu Peter Belcak Roger Wattenhofer

Project Resources

Name	Type	Source
ArXiv Paper	Paper	arXiv

Abstract

We challenge the prevailing assumption that LLMs must rely fully on sub-word tokens for high-quality text generation. To this end, we propose the "Generative Pretrained Thoughtformer" (GPTHF), a hierarchical transformer language model capable of text generation by compressing text into sentence embeddings and employing a sentence attention mechanism. GPTHF retains GPT's architecture, modifying only token interactions via dynamic sparse attention masks. Our experiments show that GPTHF achieves up to an order-of-magnitude improvement in FLOPs efficiency and a threefold increase in runtime speed compared to equally-sized GPT models in the low-size regime. This is achieved through a unique generation method that caches and reuses sentence embeddings, allowing significant portions of the input to bypass large parts of the network.
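
To make the idea of a sparse attention mask concrete, the following is a minimal sketch and not the GPTHF implementation. It assumes that sentences end at a terminator token and that a token may attend only to earlier tokens in its own sentence; the abstract does not spell out the exact masking rule, so both points are assumptions. The function names `sentence_ids` and `intra_sentence_causal_mask` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence


def sentence_ids(tokens: Sequence[str],
                 terminators: Sequence[str] = (".", "!", "?")) -> List[int]:
    """Assign each token the index of the sentence it belongs to.

    Assumption: a sentence ends at the first terminator token.
    """
    ids: List[int] = []
    current = 0
    for tok in tokens:
        ids.append(current)
        if tok in terminators:
            current += 1  # the next token starts a new sentence
    return ids


def intra_sentence_causal_mask(ids: Sequence[int]) -> List[List[bool]]:
    """Build a boolean mask where mask[i][j] is True if token i may attend to token j.

    Assumption: attention is causal (j <= i) and restricted to the same sentence.
    """
    n = len(ids)
    return [[j <= i and ids[i] == ids[j] for j in range(n)] for i in range(n)]


tokens = ["The", "cat", "sat", ".", "It", "slept", "."]
ids = sentence_ids(tokens)
mask = intra_sentence_causal_mask(ids)

# Count the allowed attention pairs under the sparse mask and under a dense causal mask.
allowed = sum(sum(row) for row in mask)
dense = len(tokens) * (len(tokens) + 1) // 2
```

In this toy input the mask is block-diagonal, so the number of allowed token pairs grows with sentence length rather than with total context length. How GPTHF then lets sentences interact through cached sentence embeddings is described in the paper and is not modeled here.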
