Dimension Mismatch Error When Using FASTOutputFile.toDataFrame(): operands could not be broadcast together with shapes (4801,138) (138,1) #41

@Asionm

Description

When loading a FAST binary file using FASTOutputFile from the openfast_toolbox.io module and converting it to a DataFrame, a dimension mismatch error occurs in non-buffered mode. The error originates from openfast_toolbox/io/fast_output_file.py during the data scaling step.

Error Trigger Code:

from openfast_toolbox.io import FASTOutputFile
out_file = FASTOutputFile("test.outb").toDataFrame()  # Fails here

Error Message:

ValueError: operands could not be broadcast together with shapes (NT, NumOutChans) (NumOutChans, 1)

Affected File:
openfast_toolbox/io/fast_output_file.py


Steps to Reproduce

  1. Import FASTOutputFile and load a FAST binary file:
    from openfast_toolbox.io import FASTOutputFile
    out_file = FASTOutputFile("test.outb").toDataFrame()
  2. Ensure the file is in a compressed format (e.g., FileFmtID_WithTime or FileFmtID_WithoutTime).
  3. The error occurs during the data scaling step in non-buffered mode (use_buffer=False).

Root Cause

In fast_output_file.py, the scaling arrays ColOff and ColScl are shaped as column vectors, (NumOutChans, 1), while the data array has shape (NT, NumOutChans). NumPy compares shapes starting from the trailing dimension: NumOutChans against 1 is compatible, but NT against NumOutChans is not (unless the two happen to be equal), so the element-wise operation fails:

# In fast_output_file.py (non-buffered mode):
data = (data - ColOff) / ColScl  # Shapes: (NT,138) vs (138,1)

Buffered mode works because it uses 1D arrays and applies scaling column-by-column.
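
The mismatch described in this section can be reproduced without a FAST file. The following is a minimal sketch using plain NumPy, with small stand-in sizes (NT=5, NumOutChans=3) instead of the reported 4801 and 138. The arrays are synthetic and only mimic the shapes named above; this is not the toolbox code itself.

```python
import numpy as np

# Small stand-ins for the reported sizes (4801 time steps, 138 channels).
NT, NumOutChans = 5, 3

data = np.ones((NT, NumOutChans))
ColOff = np.zeros((NumOutChans, 1))  # column vector, as described in this issue
ColScl = np.ones((NumOutChans, 1))   # column vector, as described in this issue

try:
    data = (data - ColOff) / ColScl  # (NT, NumOutChans) vs (NumOutChans, 1)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message should name the two incompatible shapes, matching the form of the error reported above.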


Proposed Fix

Adjust the dimensions of ColOff and ColScl to align with broadcasting rules.

Option 1: Flatten to 1D Arrays

Modify the code in fast_output_file.py to convert ColOff and ColScl to 1D arrays:

# Before (line ~X in fast_output_file.py):
ColScl = fread(fid, NumOutChans, 'float32')  # Shape: (138, 1)
ColOff = fread(fid, NumOutChans, 'float32')  # Shape: (138, 1)

# After:
ColScl = fread(fid, NumOutChans, 'float32').flatten()  # Shape: (138,)
ColOff = fread(fid, NumOutChans, 'float32').flatten()  # Shape: (138,)

Option 2: Transpose Scaling Arrays

Alternatively, transpose ColOff and ColScl to row vectors:

data = (data - ColOff.T) / ColScl.T  # Shapes: (NT,138) vs (1,138)
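
As a quick check that the two options are interchangeable, here is a sketch on synthetic arrays with the shapes described in this issue (the random values and small sizes are assumptions for illustration only). Both variants should produce the same scaled array of shape (NT, NumOutChans).

```python
import numpy as np

NT, NumOutChans = 5, 3  # small stand-ins for 4801 and 138
rng = np.random.default_rng(0)

data = rng.random((NT, NumOutChans))
ColOff = rng.random((NumOutChans, 1))        # column vector, as in the report
ColScl = rng.random((NumOutChans, 1)) + 1.0  # offset keeps the divisor away from zero

# Option 1: flatten to 1D, so (NT, NumOutChans) broadcasts against (NumOutChans,)
scaled_flat = (data - ColOff.flatten()) / ColScl.flatten()

# Option 2: transpose to row vectors, so (NT, NumOutChans) broadcasts against (1, NumOutChans)
scaled_transposed = (data - ColOff.T) / ColScl.T

assert scaled_flat.shape == (NT, NumOutChans)
assert np.allclose(scaled_flat, scaled_transposed)
```

Option 1 has the side benefit that the arrays end up with the same 1D shape that buffered mode uses.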

Why Buffered Mode Works

In buffered mode (use_buffer=True):
• ColOff and ColScl are 1D arrays (shape (NumOutChans,)).
• Scaling is applied column-wise in a loop, avoiding broadcasting:

for iCol in range(NumOutChans):
    data[:, iCol+1] = (data[:, iCol+1] - ColOff[iCol]) / ColScl[iCol]
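
For comparison, the sketch below runs the same column-wise loop on synthetic data and checks it against a single broadcast expression using 1D scaling arrays. It assumes, based on the iCol+1 indexing above, that column 0 holds time and is left unscaled; the array contents are made up for illustration.

```python
import numpy as np

NT, NumOutChans = 5, 3  # small stand-ins for 4801 and 138
rng = np.random.default_rng(1)

raw = rng.random((NT, NumOutChans + 1))  # column 0 assumed to be time
ColOff = rng.random(NumOutChans)         # 1D, as in buffered mode
ColScl = rng.random(NumOutChans) + 1.0   # offset keeps the divisor away from zero

# Column-by-column scaling, mirroring the loop shown above
looped = raw.copy()
for iCol in range(NumOutChans):
    looped[:, iCol + 1] = (looped[:, iCol + 1] - ColOff[iCol]) / ColScl[iCol]

# Equivalent single expression: (NT, NumOutChans) against (NumOutChans,)
vectorized = raw.copy()
vectorized[:, 1:] = (vectorized[:, 1:] - ColOff) / ColScl

assert np.allclose(looped, vectorized)
assert np.array_equal(looped[:, 0], raw[:, 0])  # time column untouched
```

This suggests that once ColOff and ColScl are 1D, the non-buffered path could use the same arrays without a loop.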

Impact

Affected Users: Anyone using FASTOutputFile.toDataFrame() in non-buffered mode.
