Invalid floating point operation (or memory corruption) when reallocating memory #85

@upland-yfortin

Description

An Invalid floating point operation exception may be raised when reallocating a buffer between 32 and 64 bytes. The reallocation can be explicit, e.g. a call to ReallocMem(), but could also be implicit, e.g. when concatenating strings, making it much harder to spot.

This happens because FastMM copies memory blocks between 32 and 64 bytes using the FPU registers, and it assumes that all the registers it is going to use are free. This assumption is broken if faulty code that left values on the FPU stack was executed prior to the reallocation.

If the FPU invalid-operation exception is unmasked, which is the case by default in Delphi, an exception is raised. (The x87 reports an overflow of its register stack as an invalid operation, hence the error message.)

However, if that exception is masked, the error is ignored and the register contains a magic "invalid" value (a NaN) instead of the data copied to it. When the register content is stored to the destination memory location, a number of the bytes are incorrect, resulting in memory corruption.
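To tell which of the two behaviours a given process will show, the control word can be inspected. The following is a minimal sketch, assuming a 32-bit x86 target: `InvalidOpMasked` and `CW_INVALID_OP_MASK` are hypothetical names introduced here for illustration, while `Get8087CW` is the RTL routine that reads the current control word. Bit 0 of the control word is the invalid-operation mask, which is the one that governs FPU stack overflow.

```pascal
program CheckFpuMask;
{$APPTYPE CONSOLE}

const
  // x87 control word bit 0: invalid-operation mask (IM).
  // An FPU register stack overflow is reported as an invalid operation.
  CW_INVALID_OP_MASK = $0001;

// Hypothetical helper, not part of FastMM or the RTL.
// Returns True when invalid-operation exceptions are masked (suppressed).
function InvalidOpMasked(ControlWord: Word): Boolean;
begin
  Result := (ControlWord and CW_INVALID_OP_MASK) <> 0;
end;

begin
  // $1332 is assumed here to be the usual Delphi Win32 default:
  // bit 0 is clear, so a stack overflow raises an exception.
  Writeln('$1332 masks invalid-op: ', InvalidOpMasked($1332));
  // $133F is the value from the POC: bit 0 is set, so the error is silent.
  Writeln('$133F masks invalid-op: ', InvalidOpMasked($133F));
  // Control word currently in effect for this thread.
  Writeln('Current CW masks invalid-op: ', InvalidOpMasked(Get8087CW));
end.
```

If the last line reports that the exception is masked, a stale FPU register would lead to silent corruption rather than a visible exception.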

In the POC code below, the Delphi Val() function fails and leaves a value on the FPU stack. The subsequent ReallocMem() tries to copy the buffer to its new location. It does so using all 8 FPU registers, one of which is still in use. This raises an exception. If you uncomment the Set8087CW($133F) call, the FPU exceptions are masked, the error is silently ignored, and memory corruption occurs, which can be seen by inspecting the buffer.

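procedure TForm1.Button1Click(Sender: TObject);
const
  BSMALL = 64;
  BBIG = 256;
type
  TSmall = array[0..BSMALL-1] of byte;
var
  f: double;
  i: integer;
  p: pointer;
begin
  // Allocate a small block and fill it with a recognisable pattern
  GetMem(p, BSMALL);
  for i := 0 to BSMALL-1 do
    TSmall(p^)[i] := i;
  // Uncomment to mask all FPU exceptions and get silent corruption instead
//  Set8087CW($133f);
  // Val() fails on this input and leaves a value on the FPU stack
  val('565E394529', f, i);
  // Growing the block makes FastMM copy it through the FPU registers
  ReallocMem(p, BBIG);
end;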

My suggestion would be to ffree as many registers as the copy needs, starting at the bottom of the stack. This guarantees that the required registers are free. The caveat is that any other stale values remain on the stack, but the situation is not made any worse, and the number of ffree instructions is kept to the minimum.

Other solutions were considered.

  • Calling fninit is not a good solution because it resets the FPU control and status words as well.
  • Calling emms is 2-4 times slower than freeing the registers.
  • Calling femms is only supported on AMD.
  • Popping from the stack may cause an underflow exception unless the status word and/or tag word is inspected first to examine the FPU state, which ends up being slower than just freeing the registers.

Here is an example of a fixed Move36 function, with unrelated code removed for simplicity.

procedure Move36(const ASource; var ADest; ACount: Integer);
asm
  // Remove the bottom 4 values from the stack to make room for our 4 values
  ffree st(7)
  ffree st(6)
  ffree st(5)
  ffree st(4)
  fild qword ptr [eax]
  fild qword ptr [eax + 8]
  fild qword ptr [eax + 16]
  fild qword ptr [eax + 24]
  mov ecx, [eax + 32]
  mov [edx + 32], ecx
  fistp qword ptr [edx + 24]
  fistp qword ptr [edx + 16]
  fistp qword ptr [edx + 8]
  fistp qword ptr [edx]
end;
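The same pattern should extend to the larger moves. As a rough sketch only, here is what it might look like for a 64-byte copy that needs all eight registers, which is the case the POC triggers. `Copy64ViaFPU` and `TBlock64` are hypothetical names and this is not FastMM's actual code. It assumes 32-bit Delphi, where the register calling convention passes the addresses of ASource and ADest in eax and edx.

```pascal
program Copy64Sketch;
{$APPTYPE CONSOLE}

type
  TBlock64 = array[0..63] of Byte;

// Hypothetical helper, 32-bit Delphi only: eax = @ASource, edx = @ADest.
procedure Copy64ViaFPU(const ASource; var ADest);
asm
  // Mark all eight registers empty so the eight loads cannot overflow
  ffree st(7)
  ffree st(6)
  ffree st(5)
  ffree st(4)
  ffree st(3)
  ffree st(2)
  ffree st(1)
  ffree st(0)
  // Load 8 x 8 bytes as 64-bit integers
  fild qword ptr [eax]
  fild qword ptr [eax + 8]
  fild qword ptr [eax + 16]
  fild qword ptr [eax + 24]
  fild qword ptr [eax + 32]
  fild qword ptr [eax + 40]
  fild qword ptr [eax + 48]
  fild qword ptr [eax + 56]
  // Store in reverse order: the last value loaded is on top of the stack
  fistp qword ptr [edx + 56]
  fistp qword ptr [edx + 48]
  fistp qword ptr [edx + 40]
  fistp qword ptr [edx + 32]
  fistp qword ptr [edx + 24]
  fistp qword ptr [edx + 16]
  fistp qword ptr [edx + 8]
  fistp qword ptr [edx]
end;

var
  Src, Dst: TBlock64;
  I: Integer;
begin
  // Same recognisable pattern as the POC buffer
  for I := 0 to 63 do
    Src[I] := Byte(I);
  FillChar(Dst, SizeOf(Dst), 0);
  Copy64ViaFPU(Src, Dst);
  // Verify every byte arrived unchanged
  for I := 0 to 63 do
    if Dst[I] <> Src[I] then
    begin
      Writeln('Mismatch at offset ', I);
      Halt(1);
    end;
  Writeln('All 64 bytes copied intact');
end.
```

Because every register is freed in this variant, any value that faulty code left on the FPU stack is discarded rather than preserved. That matches the caveat above: the copy itself is protected, but the underlying bug in the calling code is still there.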
