
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
transformers
huggingface/transformers: 是一个基于 Python 的自然语言处理库,它使用了 PostgreSQL 数据库存储数据。适合用于自然语言处理任务的开发和实现,特别是对于需要使用 Python 和 PostgreSQL 数据库的场景。特点是自然语言处理库、Python、PostgreSQL 数据库。
项目地址:https://gitcode.com/gh_mirrors/tra/transformers
·
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
2022年的论文,提出把XW过程里能量化的量化掉,不能量化的原样计算。

从图里也能看出来它怎么搞的。
- 按张量并行的方式彻底拆分X和W。
- 看X每一列是否足够大,够大,它就不能量化掉,因为它对结果影响很大。其余的就是量化的对象。
- X和W被分成两组,分别计算,最后相加,和张量并行一样。
缺点:
算子拆分了,一个矩阵乘法搞出两个,是省显存了,速度下来了。而且你8bit也和4bit差一半呢。
huggingface/transformers: 是一个基于 Python 的自然语言处理库,它使用了 PostgreSQL 数据库存储数据。适合用于自然语言处理任务的开发和实现,特别是对于需要使用 Python 和 PostgreSQL 数据库的场景。特点是自然语言处理库、Python、PostgreSQL 数据库。
最近提交(Master分支:2 个月前 )
33868a05
* [i18n-HI] Translated accelerate page to Hindi
* Update docs/source/hi/accelerate.md
Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
* Update docs/source/hi/accelerate.md
Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
* Update docs/source/hi/accelerate.md
Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
* Update docs/source/hi/accelerate.md
Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com>
---------
Co-authored-by: Kay <kay@Kays-MacBook-Pro.local>
Co-authored-by: K.B.Dharun Krishna <kbdharunkrishna@gmail.com> 11 天前
e2ac16b2
* rework converter
* Update modular_model_converter.py
* Update modular_model_converter.py
* Update modular_model_converter.py
* Update modular_model_converter.py
* cleaning
* cleaning
* finalize imports
* imports
* Update modular_model_converter.py
* Better renaming to avoid visiting same file multiple times
* start converting files
* style
* address most comments
* style
* remove unused stuff in get_needed_imports
* style
* move class dependency functions outside class
* Move main functions outside class
* style
* Update modular_model_converter.py
* rename func
* add augmented dependencies
* Update modular_model_converter.py
* Add types_to_file_type + tweak annotation handling
* Allow assignment dependency mapping + fix regex
* style + update modular examples
* fix modular_roberta example (wrong redefinition of __init__)
* slightly correct order in which dependencies will appear
* style
* review comments
* Performance + better handling of dependencies when they are imported
* style
* Add advanced new classes capabilities
* style
* add forgotten check
* Update modeling_llava_next_video.py
* Add prority list ordering in check_conversion as well
* Update check_modular_conversion.py
* Update configuration_gemma.py 11 天前
更多推荐

所有评论(0)