
Vanishing Gradient at ~7-10k tokens multiple models/platforms #35

@neocoretechs

Description

Running the JVM with this command line:
java -server -Xmx26g --enable-preview --add-modules jdk.incubator.vector -jar llama3.jar --model Llama-3.2-3B-Instruct-Q8_0.gguf --chat -n 75000 --temperature 0
under JDK 25 GraalVM EA on Win11 (Ryzen 7, 32 GB). I see the same behavior with the 3.2-1B model, both under Win11 and under Ubuntu 22.04 (Ryzen 9 HX370), with the same VM options except -Xmx96g and -n 128000 since that machine has 128 GB. In every case the gradient vanishes around 7-10k tokens and the models either repeat an endless loop of doggerel or issue only sullen 2- or 3-word repetitive responses. I'm going to try 3.1-8B under Ubuntu and see what happens, but so far the output has consistently become deranged at right around the same 7-10k token mark.
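One way to narrow down where the degeneration sets in is to sweep the -n token budget in steps around the 7-10k mark and note the first run that goes off the rails. A rough sketch, reusing the exact jar, model file, and flags from the command line above (the step values and the RUN dry-run variable are my own additions, not part of the project):

```shell
# Sweep the -n (max tokens) budget to bisect the onset of degeneration.
# RUN=echo is a dry run that just prints each command; set RUN="" to
# actually launch the JVM for each candidate budget.
RUN=echo
for n in 4000 6000 8000 10000 12000; do
  $RUN java -server -Xmx26g --enable-preview \
    --add-modules jdk.incubator.vector \
    -jar llama3.jar --model Llama-3.2-3B-Instruct-Q8_0.gguf \
    --chat -n "$n" --temperature 0
done
```

If the derangement always starts at roughly the same absolute token count regardless of -n, that would point at context handling (e.g. RoPE position handling or KV-cache length) rather than the requested budget itself.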
