Question about proper settings of prompt extraction attack.

I ran this code with the command of
```
python  carve_sigil.py \
        name=llama2_gcg_sys \
        wandb.tags=[extraction] \
        model=llama2-7b-chat \
        optimizer=gcg \
        sigil.num_tokens=32 \
        sigil=sysrepeater
```
I evaluated the result prompt with 50 test samples in awesome-chatgpt-prompts dataset, but only 3 samples was correctly responsed (correctly repeat the system prompt). I don't know if it is normal or I just did something wrong. Could you please offer me the best setting of prompt extraction attack or tell me where I went wrong? Thank you very much.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Question about proper settings of prompt extraction attack. #60

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Question about proper settings of prompt extraction attack. #60

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions