[MX] Rowwise `W` cached for backwards #2546

**Describe the bug**

In [transformer_engine.pytorch.Linear](https://github.com/NVIDIA/TransformerEngine/blob/5ba01faadb6c05b2d9aed1e5e96b976afa65af4a/transformer_engine/pytorch/module/linear.py#L278), both the `rowwise` and `columnwise` quantized copies of `W` are saved for the backward pass.

However, with `mx` mixed precision the backward pass only needs the `columnwise` copy of `W`, so the `rowwise` copy is held in memory without being used.

Notably, `rowwise` is discarded and only `columnwise` is saved in the newer [transformer_engine.pytorch.ops.BasicLinear](https://github.com/NVIDIA/TransformerEngine/blob/5ba01faadb6c05b2d9aed1e5e96b976afa65af4a/transformer_engine/pytorch/ops/basic/basic_linear.py#L609).
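
To give a sense of scale, here is a rough sketch of how much memory one `rowwise` copy of `W` might occupy. It is not taken from the Transformer Engine code. It assumes MXFP8 storage with one byte per element and one 1-byte scale per block of 32 elements along each row, and it ignores any padding or alignment of the scale tensor. The function name `mxfp8_rowwise_bytes` is hypothetical.

```python
# Back-of-the-envelope size of one rowwise MXFP8 copy of a weight matrix.
# Assumptions (not from the issue): 1 byte per FP8 element, one 1-byte scale
# per 32-element block along each row, no padding of the scale tensor.
MX_BLOCK_SIZE = 32


def mxfp8_rowwise_bytes(out_features: int, in_features: int) -> int:
    """Approximate bytes for a rowwise-quantized W of shape [out, in]."""
    data_bytes = out_features * in_features
    # Ceil division: a partially filled block still needs its own scale.
    blocks_per_row = -(-in_features // MX_BLOCK_SIZE)
    scale_bytes = out_features * blocks_per_row
    return data_bytes + scale_bytes


if __name__ == "__main__":
    size = mxfp8_rowwise_bytes(4096, 4096)
    print(f"~{size / 2**20:.1f} MiB per 4096x4096 layer")  # ~16.5 MiB
```

Under these assumptions the estimate applies per `Linear` layer, for as long as the saved tensors are kept alive between forward and backward. The real figure depends on how the library lays out and pads its scale tensors.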
   
