Overview
Request 1201559 accepted
- update to version 4.4.0
  **Removed**: Flash Attention support in the Python package due to significant package size increase with minimal performance gain.
### New features
* Support Llama3
* Support Gemma2
* Add log probs for all tokens in vocab
* Grouped conv1d
### Fixes and improvements
* Some improvements in flash attention
* Fix crash when using return_alternatives on CUDA
* Quantization AWQ GEMM + GEMV
- Created by adrianSuSE
- In state accepted
Request History
adrianSuSE created request
anag+factory added openSUSE:Factory:Staging:adi:24 as a reviewer
Being evaluated by staging project "openSUSE:Factory:Staging:adi:24"
anag+factory accepted review
Picked "openSUSE:Factory:Staging:adi:24"
factory-auto added opensuse-review-team as a reviewer
Please review sources
factory-auto accepted review
Check script succeeded
darix accepted review
Accepted review by group opensuse-review-team for request 1201559 from user factory-auto
licensedigger accepted review
The legal review is provisionally accepted. The package may require further action later on.
anag+factory accepted review
Staging Project openSUSE:Factory:Staging:adi:24 got accepted.
anag+factory approved review
Staging Project openSUSE:Factory:Staging:adi:24 got accepted.
anag+factory accepted request
Staging Project openSUSE:Factory:Staging:adi:24 got accepted.