Twistt - Push-to-Talk Transcription Tool

A Linux speech-to-text transcription tool using OpenAI or Deepgram for STT with push-to-talk functionality.

Features

Push-to-Talk: Hold a function key (F1-F12) to record and transcribe
Toggle mode: Double-tap the key to start recording, press again to stop
Smart transcription: Text appears when you pause or stop speaking
Auto-output: Automatically outputs transcribed text at cursor position
Multi-language support: Transcribe in any language supported by the provider
Configurable audio gain: Amplify microphone input if needed
Multiple model support: Choose between OpenAI models (gpt-4o-transcribe, gpt-4o-mini-transcribe) and Deepgram Nova models (nova-2, nova-3)
Post-treatment: Optional AI-powered correction of transcribed text for improved accuracy

Requirements

Linux (tested on X11 and Wayland)
Python 3.11+
ydotool for simulating keyboard input (pasting, or typing combined with pasting)
OpenAI or Deepgram API key for transcription (depending on provider)
OpenAI, Cerebras, or OpenRouter API key for post-treatment (if used)
Microphone access

Installation

Using uv (Recommended)

The script is designed to run with uv, which handles dependencies automatically:

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Run the script (dependencies will be auto-installed)
./twistt.py --help

Using pip

If you prefer using pip:

# Install dependencies
pip install -r requirements.txt

# Run the script
python twistt.py --help

System Dependencies

ydotool is required for output. It's a replacement for xdotool that works on both X11 and Wayland, used here to simulate typing and pasting.

Important: The versions available in Debian/Ubuntu repositories are too old. You'll need to build from source.

For installation instructions, see: https://docs.o-x-l.com/automation/ydotool.html

Here's a simplified systemd service for single-user setup:

# /etc/systemd/system/ydotoold.service
[Unit]
Description=ydotoold (root) for user 1000
# Ensure /run/user/1000 exists
Requires=user-runtime-dir@1000.service
After=user-runtime-dir@1000.service
# Start after display/user session
After=display-manager.service user@1000.service
BindsTo=user@1000.service

[Service]
Type=simple
# Avoid stale socket -> "Connection refused"
ExecStartPre=/usr/bin/rm -f /run/user/1000/.ydotool_socket
ExecStart=/usr/local/sbin/ydotoold --socket-path=/run/user/1000/.ydotool_socket --socket-own=1000:0
Restart=always
RestartSec=2

[Install]
WantedBy=multi-user.target

Note: If you use a custom socket path (as shown above with /run/user/1000/.ydotool_socket), you'll need to specify it when running twistt:

Via environment variable: YDOTOOL_SOCKET=/run/user/1000/.ydotool_socket ./twistt.py
Or via argument: ./twistt.py --ydotool-socket /run/user/1000/.ydotool_socket

Configuration

API Key Setup

Set your API key(s) using one of these methods (in order of priority):

Command line argument: --openai-api-key YOUR_KEY (or --deepgram-api-key YOUR_KEY)
User config file: ~/.config/twistt/config.env
Local .env file: Create .env in the script directory
Environment variable: Export in your shell
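The priority order above amounts to "the first source that provides a value wins". The following is a minimal sketch of that lookup, not Twistt's actual code: the function name `first_defined` and the four dictionaries are hypothetical stand-ins for the real sources.

```python
from typing import Mapping, Optional, Sequence


def first_defined(key: str, sources: Sequence[Mapping[str, str]]) -> Optional[str]:
    """Return the value of `key` from the first source that defines a non-empty one."""
    for source in sources:
        value = source.get(key, "").strip()
        if value:
            return value
    return None


# Hypothetical sources, listed from highest to lowest priority.
cli_args = {}  # nothing passed on the command line
user_config = {"TWISTT_OPENAI_API_KEY": "sk-from-config"}  # ~/.config/twistt/config.env
local_env = {"TWISTT_OPENAI_API_KEY": "sk-from-dotenv"}  # .env in the script directory
shell_env = {"TWISTT_OPENAI_API_KEY": "sk-from-shell"}  # exported in the shell

api_key = first_defined(
    "TWISTT_OPENAI_API_KEY", [cli_args, user_config, local_env, shell_env]
)
print(api_key)  # sk-from-config
```

Because the command line is empty here, the user config file supplies the key and the lower-priority sources are never consulted.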

Example .env or config.env file:

# OpenAI API key (required if model from OpenAI, by default)
TWISTT_OPENAI_API_KEY=sk-...
# or
OPENAI_API_KEY=sk-...

# Deepgram API key (required if model from Deepgram)
TWISTT_DEEPGRAM_API_KEY=dg_...
# or
DEEPGRAM_API_KEY=dg_...

# Optional settings
TWISTT_HOTKEY=F9           # Single hotkey
TWISTT_HOTKEYS=F8,F9,F10   # Multiple hotkeys (comma-separated)
TWISTT_MODEL=gpt-4o-transcribe   # For OpenAI; for Deepgram use e.g. nova-2-general
TWISTT_LANGUAGE=en  # Leave empty or omit for auto-detect
TWISTT_SILENCE_DURATION=500  # Milliseconds of silence before ending the current segment
TWISTT_GAIN=1.0
TWISTT_MICROPHONE=Elgato Wave 3  # Optional text filter to auto-select a microphone
TWISTT_DOUBLE_TAP_WINDOW=0.5  # Time window for double-tap detection
TWISTT_KEYBOARD=keychron  # Optional text filter to auto-select matching keyboard
TWISTT_YDOTOOL_SOCKET=/run/user/1000/.ydotool_socket  # Optional, auto-detected by default

# Output mode
TWISTT_OUTPUT_MODE=batch  # batch (default), full, or none
TWISTT_USE_TYPING=false  # Type ASCII characters via ydotool instead of copy/paste (slower)
TWISTT_KEYBOARD_DELAY=20  # Delay in milliseconds between keyboard actions (default: 20ms)

# Logging
TWISTT_LOG=/path/to/custom/twistt.log  # Optional, defaults to ~/.config/twistt/twistt.log

# Post-treatment settings (optional)
TWISTT_POST_TREATMENT_PROMPT="Fix grammar and punctuation"  # Can be text, file path, or multiple separated by '::'
TWISTT_POST_TREATMENT_MODEL=gpt-4o-mini  # Model for post-treatment
TWISTT_POST_TREATMENT_PROVIDER=openai  # Provider: openai, cerebras, or openrouter
# Post-treatment correct mode (apply corrections in-place with keyboard; requires batch output mode)
TWISTT_POST_TREATMENT_CORRECT=false
# Disable post-treatment entirely (ignores prompts/files)
TWISTT_POST_TREATMENT_DISABLED=false

# Provider-specific API keys (for post-treatment)
TWISTT_CEREBRAS_API_KEY=csk-...  # Required if using cerebras provider
TWISTT_OPENROUTER_API_KEY=sk-or-...  # Required if using openrouter provider

Available Options

Option	Environment Variable	Default	Description
`-k, --hotkey`	`TWISTT_HOTKEY` or `TWISTT_HOTKEYS`	F9	Push-to-talk key(s) (F1-F12), comma-separated for multiple
`-kb, --keyboard`	`TWISTT_KEYBOARD`	-	Filter text for automatically selecting the keyboard input device. Pass without a value to force interactive selection and ignore env defaults
`-dtw, --double-tap-window`	`TWISTT_DOUBLE_TAP_WINDOW`	0.5	Time window in seconds for double-tap detection
`-m, --model`	`TWISTT_MODEL`	gpt-4o-transcribe	Transcription model (for OpenAI or Deepgram)
`-l, --language`	`TWISTT_LANGUAGE`	Auto-detect	Transcription language (ISO 639-1)
`-sd, --silence-duration`	`TWISTT_SILENCE_DURATION`	500	Silence duration in milliseconds before the transcription service ends the current segment
`-g, --gain`	`TWISTT_GAIN`	1.0	Microphone amplification
`-mic, --microphone`	`TWISTT_MICROPHONE`	Default input	Text filter or ID to select the microphone. Pass without a value to force interactive selection and ignore env defaults
`-koa, --openai-api-key`	`TWISTT_OPENAI_API_KEY` or `OPENAI_API_KEY`	-	OpenAI API key
`-kdg, --deepgram-api-key`	`TWISTT_DEEPGRAM_API_KEY` or `DEEPGRAM_API_KEY`	-	Deepgram API key
`-ys, --ydotool-socket`	`TWISTT_YDOTOOL_SOCKET` or `YDOTOOL_SOCKET`	Auto-detect	Path to ydotool socket
`-p, --post-prompt`	`TWISTT_POST_TREATMENT_PROMPT`	-	Post-treatment prompt (text/file). Can be specified multiple times. Within a value, use `::` to separate multiple prompts. Prefix any `-p` value with `::` to include env/config variable. Example: `-p :: -p file.txt`
`-pm, --post-model`	`TWISTT_POST_TREATMENT_MODEL`	gpt-4o-mini	Model for post-treatment
`-pp, --post-provider`	`TWISTT_POST_TREATMENT_PROVIDER`	openai	Provider for post-treatment (openai, cerebras, openrouter)
`-pc, --post-correct, -npc, --no-post-correct`	`TWISTT_POST_TREATMENT_CORRECT`	false	Apply post-treatment by correcting already-output text in-place (only in batch output mode)
`-np, --no-post`	`TWISTT_POST_TREATMENT_DISABLED`	false	Disable post-treatment regardless of prompts or files
`-kcb, --cerebras-api-key`	`TWISTT_CEREBRAS_API_KEY` or `CEREBRAS_API_KEY`	-	Cerebras API key
`-kor, --openrouter-api-key`	`TWISTT_OPENROUTER_API_KEY` or `OPENROUTER_API_KEY`	-	OpenRouter API key
`-o, --output-mode, -no, --no-output-mode`	`TWISTT_OUTPUT_MODE`	batch	Output mode: batch (incremental), full (complete on release), or none (disabled)
`-t, --use-typing, -nt, --no-use-typing`	`TWISTT_USE_TYPING`	false	Type ASCII characters directly (slower); clipboard still handles non-ASCII. Use `-t`/`--use-typing` to enable, `-nt`/`--no-use-typing` to disable
`-kd, --keyboard-delay`	`TWISTT_KEYBOARD_DELAY`	20	Delay in milliseconds between keyboard actions (typing, paste, navigation keys). Increase if you experience character ordering issues
`--log`	`TWISTT_LOG`	`~/.config/twistt/twistt.log`	Path to log file where transcription sessions are saved
`--check`	-	-	Display configuration and exit without logging anything to file. Useful for verifying settings before running.
`--list-configs [DIR]`	-	-	List all configuration files found in `~/.config/twistt/` (or DIR if specified) with their variables and exit. API keys are masked, all values are limited to 100 characters.
`-c, --config PATH`	`TWISTT_CONFIG`	`~/.config/twistt/config.env`	Load configuration from file(s). Can be specified multiple times or use `::` separator. Later files override earlier ones. Prefix with `::` to include default config. Example: `-c ::fr.env` (default + modifier)
`-sc, --save-config [PATH]`	`TWISTT_CONFIG`	false	Persist provided command-line values to a config file (defaults to `~/.config/twistt/config.env` or `TWISTT_CONFIG` if set)

Selecting a microphone sets the PULSE_SOURCE environment variable for Twistt only, so your system default input stays untouched. Run ./twistt.py --microphone without a value to pick from the list even if an environment variable is set.

Use --config (or TWISTT_CONFIG) to load settings from one or more files. You can specify multiple config files either by using -c multiple times or by separating paths with :: in a single argument or environment variable. Later files override values from earlier ones.

Including the default config: Prefix any -c value with :: to include the default config (~/.config/twistt/config.env) as the base, allowing you to use modifier files that only specify what differs. For example, -c ::fr.env combines the default config with fr.env (where fr.env might only set TWISTT_LANGUAGE=fr). Without the :: prefix, -c replaces the default config entirely.

If you provide a relative path that doesn't exist in the current directory, and a file with that name (plus .env) exists in ~/.config/twistt/, it will be used automatically. For example, --config work will use ~/.config/twistt/work.env if work doesn't exist locally.

Use --save-config to capture only the options you explicitly pass on the command line; existing keys in the config file are preserved. Provide a path (or set TWISTT_CONFIG) to control which file gets written. TWISTT_CONFIG is read only from the process environment, so do not place it in .env files or config.env.
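The name lookup for --config can be pictured as a short fallback chain. This is a sketch under assumptions rather than Twistt's implementation: `resolve_config_path` is a hypothetical helper, and trying the bare name in the config directory before the .env variant is a guess.

```python
from pathlib import Path
import tempfile


def resolve_config_path(name: str, cwd: Path, config_dir: Path) -> Path:
    """Resolve a --config value to a file path (hypothetical helper)."""
    candidate = Path(name).expanduser()
    if candidate.is_absolute():
        return candidate
    local = cwd / candidate
    if local.exists():
        return local
    # Fall back to the config directory: the name as given, then with ".env" appended.
    for fallback in (config_dir / name, config_dir / f"{name}.env"):
        if fallback.exists():
            return fallback
    return local  # nothing found: let the caller report the missing file


with tempfile.TemporaryDirectory() as tmp:
    cwd = Path(tmp) / "project"
    config_dir = Path(tmp) / "config" / "twistt"
    cwd.mkdir(parents=True)
    config_dir.mkdir(parents=True)
    (config_dir / "work.env").write_text("TWISTT_LANGUAGE=en\n")
    resolved = resolve_config_path("work", cwd, config_dir)

print(resolved.name)  # work.env
```

A file named work in the current directory would take precedence over the one in the config directory.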

Config Inheritance and Multiple Config Files

Twistt supports two complementary ways to combine configuration files:

1. Multiple config files via -c or TWISTT_CONFIG:

You can specify multiple config files that are loaded in sequence, with later files overriding values from earlier ones:

# Load multiple configs via command line
./twistt.py -c base.env -c project.env -c local.env

# Or using :: separator
./twistt.py -c "base.env::project.env::local.env"

# Or via environment variable
TWISTT_CONFIG="base.env::project.env" ./twistt.py

In these examples:

base.env is loaded first (lowest priority)
project.env overrides values from base.env
local.env overrides values from both base.env and project.env (highest priority)
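The override order can be illustrated with plain dictionaries. This is only a sketch: `parse_env_text` and `merge_configs` are hypothetical names, and the parser handles simple KEY=VALUE lines only (no inline comments or multi-line values), which is simpler than a real .env loader.

```python
from typing import Dict, Iterable


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and full-line comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def merge_configs(texts: Iterable[str]) -> Dict[str, str]:
    """Merge configs in order; keys from later configs replace earlier ones."""
    merged: Dict[str, str] = {}
    for text in texts:
        merged.update(parse_env_text(text))
    return merged


base = "TWISTT_LANGUAGE=en\nTWISTT_GAIN=1.0\n"
project = "TWISTT_LANGUAGE=fr\n"
local = "TWISTT_GAIN=2.5\n"

merged = merge_configs([base, project, local])
print(merged)  # {'TWISTT_LANGUAGE': 'fr', 'TWISTT_GAIN': '2.5'}
```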

Using modifier files with the default config:

Create small config files that only specify what differs from your default configuration, then use the :: prefix to combine them:

# Create a French language modifier
echo "TWISTT_LANGUAGE=fr" > ~/.config/twistt/fr.env

# Create a high-gain modifier for quiet microphones
echo "TWISTT_GAIN=3.0" > ~/.config/twistt/loud.env

# Use modifiers with default config
./twistt.py -c ::fr.env  # French language + all default settings
./twistt.py -c ::loud.env  # High gain + all default settings
./twistt.py -c ::fr.env -c ::loud.env  # French + high gain + all defaults

This is particularly useful when you have a well-configured default setup and only want to temporarily change one or two settings.

2. Parent config inheritance via TWISTT_PARENT_CONFIG:

Individual config files can define TWISTT_PARENT_CONFIG to inherit from another config file. Values in the child file take precedence over the parent:

# ~/.config/twistt/config.env - shared settings
TWISTT_OPENAI_API_KEY=sk-...
...

# ~/.config/twistt/gpt.env - inherits from config.env and uses an OpenAI model without typing mode (typing is not recommended with it)
TWISTT_PARENT_CONFIG=config.env
TWISTT_MODEL=gpt-4o-transcribe
TWISTT_USE_TYPING=false

# ~/.config/twistt/nova.env - inherits from config.env and uses the nova-2 model with typing mode (which suits it well)
TWISTT_PARENT_CONFIG=config.env
TWISTT_MODEL=nova-2
TWISTT_USE_TYPING=true

In these examples, nova.env and gpt.env are in ~/.config/twistt/, so they can be loaded with ./twistt.py --config nova or ./twistt.py --config gpt, without passing the full path or the .env extension.

Parent paths can be relative (resolved from the child config's directory) or absolute. Circular references are detected and will cause an error.
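Parent inheritance with cycle detection can be sketched as a small recursive loader. The function `load_with_parents` is hypothetical and uses a deliberately simple KEY=VALUE parser; Twistt's real loader may behave differently in the details.

```python
from pathlib import Path
from typing import Dict, Optional, Set
import tempfile


def load_with_parents(path: Path, seen: Optional[Set[Path]] = None) -> Dict[str, str]:
    """Load a config file, applying TWISTT_PARENT_CONFIG first so the child wins."""
    seen = set() if seen is None else seen
    path = path.resolve()
    if path in seen:
        raise ValueError(f"Circular parent config reference at {path}")
    seen.add(path)

    values: Dict[str, str] = {}
    for raw_line in path.read_text().splitlines():
        line = raw_line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()

    parent = values.pop("TWISTT_PARENT_CONFIG", None)
    if not parent:
        return values
    parent_path = Path(parent).expanduser()
    if not parent_path.is_absolute():
        parent_path = path.parent / parent_path  # relative to the child's directory
    merged = load_with_parents(parent_path, seen)
    merged.update(values)  # child values override the parent's
    return merged


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "config.env").write_text("TWISTT_GAIN=1.0\nTWISTT_MODEL=gpt-4o-transcribe\n")
    (root / "nova.env").write_text("TWISTT_PARENT_CONFIG=config.env\nTWISTT_MODEL=nova-2\n")
    nova = load_with_parents(root / "nova.env")

print(nova)  # {'TWISTT_GAIN': '1.0', 'TWISTT_MODEL': 'nova-2'}
```

The `seen` set records every file visited along the chain, so a file that names itself or one of its descendants as parent raises an error instead of recursing forever.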

Combining both approaches:

You can mix multiple config files and parent inheritance. For example:

# Load base config with its parent, then override with local settings
./twistt.py -c gpt.env -c local.env

This will:

Load config.env (parent of gpt.env)
Load gpt.env (overrides config.env)
Load local.env (overrides both config.env and gpt.env)

Listing Available Configurations

Use --list-configs to see all configuration files in ~/.config/twistt/ and their variables:

./twistt.py --list-configs

# Or list configs from a specific directory
./twistt.py --list-configs /path/to/configs

This displays:

All .env files in the config directory, sorted alphabetically
For each file:
- Filename with parent config shown in parentheses if defined
- All variables in alphabetical order
- API keys are masked (only first 3 characters + "...")
- All values are limited to 100 characters with newlines replaced by spaces
- "..." is appended only if the value exceeds 100 characters

Example output:

Configuration files found in: /home/user/.config/twistt

config.env
  TWISTT_HOTKEY = F8,F9
  TWISTT_LANGUAGE = fr
  TWISTT_OPENAI_API_KEY = sk-...
  TWISTT_POST_TREATMENT_PROMPT = Fix grammar and punctuation. Remove filler words like "um" and "uh". Keep the conversational...

fr.env
  TWISTT_LANGUAGE = fr

gpt.env (parent config: ~/.config/twistt/config.env)
  TWISTT_MODEL = gpt-4o-transcribe
  TWISTT_USE_TYPING = false
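The masking and truncation rules described above can be sketched in a few lines. The helper names are hypothetical, and detecting keys by the `_API_KEY` suffix is an assumption about how Twistt decides what to mask.

```python
def mask_api_key(value: str) -> str:
    """Show only the first 3 characters of a key, followed by '...'."""
    return value[:3] + "..."


def display_value(name: str, value: str, limit: int = 100) -> str:
    """Format a config value for listing: mask keys, flatten newlines, truncate."""
    if name.endswith("_API_KEY"):  # assumption: keys are recognised by their name
        return mask_api_key(value)
    flat = value.replace("\r\n", " ").replace("\n", " ")
    return flat[:limit] + "..." if len(flat) > limit else flat


print(display_value("TWISTT_OPENAI_API_KEY", "sk-abcdef123456"))  # sk-...
print(display_value("TWISTT_LANGUAGE", "fr"))  # fr
print(len(display_value("TWISTT_POST_TREATMENT_PROMPT", "x" * 150)))  # 103
```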

This is useful for:

Discovering what config files you have
Understanding config inheritance relationships
Verifying variable values without opening files
Security: checking API keys are set without revealing full values

Logging

All transcription sessions are automatically logged to a file. By default, logs are saved to ~/.config/twistt/twistt.log. You can customize the log file location using:

Command-line argument: --log /path/to/logfile.log
Environment variable: TWISTT_LOG=/path/to/logfile.log

The log file contains:

Configuration panel (displayed at startup)
Completed transcription sessions with timestamps
Both raw transcription and post-treatment results (if enabled)

Note: Live updates during recording are not logged, only finalized sessions are saved.

To disable logging, point the log file to /dev/null:

./twistt.py --log /dev/null

Post-Treatment Prompt

The --post-prompt argument and TWISTT_POST_TREATMENT_PROMPT environment variable support multiple prompts that can be combined.

Multiple prompts with :: separator:

You can specify multiple prompts separated by ::. Each part is resolved independently as either a file (if it exists) or literal text, then all parts are combined with double newlines between them:

# Environment variable examples
TWISTT_POST_TREATMENT_PROMPT="prompt1.txt::Fix grammar::prompt2.txt"
TWISTT_POST_TREATMENT_PROMPT="corrections.txt::Make it formal"

File resolution for each part:

Absolute paths are checked directly
Relative paths are searched in: current directory → script directory → ~/.config/twistt/
Shell expansion such as ~ is supported
When the filename has no extension, Twistt tries with no extension, then .txt and .prompt variants
If a file is found, its content is used; otherwise the value is treated as direct text
Empty files are rejected
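The resolution rules above can be sketched as follows. Both functions are hypothetical, and the exact lookup order (all name variants in one directory before moving to the next) is an assumption; the sketch also does not guard against literal texts too long to be tested as file names.

```python
from pathlib import Path
from typing import Iterable, Optional
import tempfile


def find_prompt_file(part: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first existing file matching `part`, or None."""
    raw = Path(part).expanduser()
    # Names without an extension are also tried with .txt and .prompt.
    names = [raw] if raw.suffix else [raw, raw.with_suffix(".txt"), raw.with_suffix(".prompt")]
    # Absolute paths are checked directly; relative ones in each search directory.
    bases = [Path()] if raw.is_absolute() else list(search_dirs)
    for base in bases:
        for name in names:
            candidate = base / name
            if candidate.is_file():
                return candidate
    return None


def resolve_prompt_part(part: str, search_dirs: Iterable[Path]) -> str:
    """Return the file content if `part` names a file, else `part` itself."""
    found = find_prompt_file(part, search_dirs)
    if found is None:
        return part  # not a file: use the value as literal text
    content = found.read_text().strip()
    if not content:
        raise ValueError(f"Prompt file is empty: {found}")
    return content


with tempfile.TemporaryDirectory() as tmp:
    prompts = Path(tmp)  # stands in for cwd, the script directory, or ~/.config/twistt/
    (prompts / "translate.txt").write_text("Translate the text to English.\n")
    from_file = resolve_prompt_part("translate", [prompts])
    literal = resolve_prompt_part("Fix grammar", [prompts])

print(from_file)  # Translate the text to English.
print(literal)  # Fix grammar
```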

Using -p / --post-prompt argument:

The -p flag can be specified multiple times and supports two modes:

Replace mode (default) - ignores environment variable:

./twistt.py -p "Fix grammar"                    # Uses only this prompt
./twistt.py -p "prompt1.txt::Make it formal"    # Combines these two
./twistt.py -p file1.txt -p "Fix grammar"       # Multiple -p: file1.txt + literal text

Append mode (prefix ANY -p value with ::) - includes environment variable:

# If TWISTT_POST_TREATMENT_PROMPT="base.txt"
./twistt.py -p "::"                       # Uses only base.txt (env var)
./twistt.py -p "::extra.txt"              # Combines: base.txt + extra.txt
./twistt.py -p :: -p file1.txt            # Combines: base.txt + file1.txt
./twistt.py -p file1.txt -p "::file2.txt" # Combines: base.txt + file1.txt + file2.txt

Key points:

You can use -p multiple times: -p file1.txt -p file2.txt -p "Fix grammar"
If ANY -p value starts with ::, the environment variable is included first
Order: env var (if requested) → all -p values in order (with :: prefix removed)
Each -p value can contain :: separators for multiple prompts within one argument
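The replace and append rules can be expressed as one small function. This is an illustrative sketch with a hypothetical name, `collect_prompt_parts`; it returns the ordered list of prompt parts before each one is resolved to a file or literal text.

```python
from typing import List, Optional


def collect_prompt_parts(cli_values: List[str], env_value: Optional[str]) -> List[str]:
    """Order prompt parts: env var first (if requested), then every -p value."""
    # The env var is included when no -p is given, or when any -p starts with "::".
    include_env = not cli_values or any(v.startswith("::") for v in cli_values)
    parts: List[str] = []
    if include_env and env_value:
        parts.extend(env_value.split("::"))
    for value in cli_values:
        parts.extend(value.removeprefix("::").split("::"))
    return [p.strip() for p in parts if p.strip()]


env = "base.txt"  # value of TWISTT_POST_TREATMENT_PROMPT
print(collect_prompt_parts(["Fix grammar"], env))  # ['Fix grammar']
print(collect_prompt_parts(["::"], env))  # ['base.txt']
print(collect_prompt_parts(["file1.txt", "::file2.txt"], env))
# ['base.txt', 'file1.txt', 'file2.txt']
```

Once each part is resolved, the texts would be joined with a blank line between them, for example with `"\n\n".join(...)`.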

Examples:

# Single prompts
./twistt.py -p translate            # Uses translate.txt if exists, else literal text
./twistt.py -p "Fix grammar"        # Direct text
./twistt.py -p ./prompts/formal.txt # Explicit file path

# Multiple prompts via environment
TWISTT_POST_TREATMENT_PROMPT="base.txt::Fix grammar"
./twistt.py  # Uses both prompts combined

# Multiple -p arguments
./twistt.py -p file1.txt -p "Fix grammar" -p file2.txt

# Mixing :: separator and multiple -p
./twistt.py -p "prompt1.txt::Make formal" -p prompt2.txt

# Including environment variable
TWISTT_POST_TREATMENT_PROMPT="base.txt"
./twistt.py -p ::               # Uses only base.txt
./twistt.py -p "::extra.txt"    # Uses base.txt + extra.txt
./twistt.py -p :: -p custom.txt # Uses base.txt + custom.txt
./twistt.py -p file1.txt -p "::file2.txt"  # Uses base.txt + file1.txt + file2.txt

# Disable post-treatment
./twistt.py --no-post

Usage

Basic Usage

# Start with default settings (F9 key, auto-detect language)
./twistt.py

# Use F5 key with English transcription
./twistt.py --hotkey F5 --language en

# Use multiple hotkeys
./twistt.py --hotkey F8,F9,F10

# Force French language
./twistt.py --language fr

# Increase microphone sensitivity
./twistt.py --gain 2.0

# Enable post-treatment to fix grammar and punctuation
./twistt.py --post-prompt "Fix grammar, punctuation, and obvious errors"

# Use a file for more complex post-treatment instructions
./twistt.py --post-prompt instructions.txt

# Specify a different model for post-treatment
./twistt.py --post-prompt "Make the text more formal" --post-model gpt-4o

# Use Cerebras for post-treatment (faster inference)
./twistt.py --post-prompt "Fix errors" --post-provider cerebras --post-model llama3-8b

# Use OpenRouter for post-treatment (access to many models)
./twistt.py --post-prompt "Fix errors" --post-provider openrouter --post-model meta-llama/llama-3.2-3b-instruct

# Post-treatment correct mode: output raw immediately then update in place via post-treatment
./twistt.py --post-prompt "Fix grammar" --post-correct

# Use full output mode (wait for hotkey release to output/process)
./twistt.py --output-mode full

# Type ASCII characters directly (slower; non-ASCII characters are still handled via clipboard)
./twistt.py --use-typing

# Use Deepgram as provider
TWISTT_PROVIDER=deepgram TWISTT_DEEPGRAM_API_KEY=dg_xxx ./twistt.py --model nova-2-general --language fr

# Save your preferred options for next time
./twistt.py --language fr --gain 2.0 --microphone "Elgato Wave 3" --save-config

# Save to a custom config file
./twistt.py --language fr --gain 2.0 --save-config ~/.config/twistt/french.env

# Load a custom preset
./twistt.py --config ~/.config/twistt/french.env
./twistt.py --config french  # equivalent to the one above
./twistt.py --config /path/to/gaming.env

# Load multiple config files (later files override earlier ones)
./twistt.py --config base.env --config local.env
./twistt.py -c "base.env::project.env::local.env"

# Use modifier files with default config (:: prefix includes default)
./twistt.py -c ::fr.env  # Combines default config + fr.env modifier
./twistt.py -c :: -c local.env  # Combines default config + local.env
./twistt.py -c ::  # Uses only default config explicitly

# Specify a custom log file
./twistt.py --log /tmp/twistt-debug.log

# Disable logging (output to /dev/null)
./twistt.py --log /dev/null

# Check configuration without starting (useful to verify settings)
./twistt.py --check
./twistt.py --config french --check  # Verify a specific config

# List all available config files and their variables
./twistt.py --list-configs
./twistt.py --list-configs /path/to/configs  # List from custom directory

How It Works

Twistt supports two recording modes:

Push-to-Talk Mode (Hold)

Start the script: Run ./twistt.py
Position cursor: Click where you want text to appear
Hold to record: Press and hold one of your configured hotkeys (default: F9)
Speak: Talk while holding the key
Release to transcribe: Let go of the key
Auto-output: Text is automatically output at cursor position

Toggle Mode (Double-Tap)

Start the script: Run ./twistt.py
Position cursor: Click where you want text to appear
Double-tap to start: Press-release-press the same hotkey quickly (within 0.5s)
Speak freely: Recording continues without holding any key
Press to stop: Press the same hotkey once to stop recording (only the hotkey that started toggle mode can stop it)
Auto-output: Text is automatically output at cursor position
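The two recording modes can be modelled as a small state machine driven by press and release events. This is a simplified sketch for a single hotkey, not Twistt's code: the class `HotkeyTracker` is hypothetical, and measuring the double-tap window between the two presses is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HotkeyTracker:
    double_tap_window: float = 0.5  # seconds, like TWISTT_DOUBLE_TAP_WINDOW
    recording: bool = False
    toggle_mode: bool = False
    last_press: Optional[float] = None

    def on_press(self, now: float) -> None:
        if self.toggle_mode:
            # A single press ends a toggle-mode recording.
            self.recording = False
            self.toggle_mode = False
            self.last_press = None
            return
        is_double_tap = (
            self.last_press is not None
            and now - self.last_press <= self.double_tap_window
        )
        self.toggle_mode = is_double_tap
        self.recording = True
        self.last_press = now

    def on_release(self, now: float) -> None:
        # Releasing only stops a push-to-talk recording; toggle mode keeps going.
        if self.recording and not self.toggle_mode:
            self.recording = False


tracker = HotkeyTracker()
tracker.on_press(0.00)
tracker.on_release(0.10)  # short tap
tracker.on_press(0.30)  # second press within the window: toggle mode
tracker.on_release(0.40)
print(tracker.recording, tracker.toggle_mode)  # True True
tracker.on_press(5.00)  # single press stops the recording
print(tracker.recording)  # False
```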

The transcription appears where the cursor is located.

An indicator (the text "(Twisting...)") is shown at the cursor position while recording is active, text is being output, or post-treatment is running.

Output Modes

Twistt supports three output modes that control when text is processed and output:

batch mode (default): Text is processed and can be output incrementally as you speak. Each pause triggers processing of that segment. With post-treatment enabled, each segment maintains context from previous segments.
full mode: All text is accumulated while you hold the key and only processed/output when you release it. With post-treatment, the entire text is processed at once without maintaining context between sessions. This mode is useful when you want to speak a complete thought before any processing occurs.
none: Twistt skips all output entirely. Transcription and post-treatment still run (just like batch mode), but nothing is pasted or typed at the cursor position. Use when you only want live feedback in the terminal or plan to copy results manually later.
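The difference between the three modes comes down to when text reaches the cursor. The sketch below is only an illustration: `outputs_for_mode` is a hypothetical function, and joining segments with a space in full mode is an assumption.

```python
from typing import List


def outputs_for_mode(mode: str, segments: List[str]) -> List[str]:
    """Return the pieces of text that would be output, in order, for one session."""
    if mode == "batch":
        return list(segments)  # one output per pause-delimited segment
    if mode == "full":
        return [" ".join(segments)] if segments else []  # one output on release
    if mode == "none":
        return []  # transcription still runs, but nothing is pasted or typed
    raise ValueError(f"Unknown output mode: {mode}")


segments = ["Hello there.", "How are you?"]
print(outputs_for_mode("batch", segments))  # ['Hello there.', 'How are you?']
print(outputs_for_mode("full", segments))  # ['Hello there. How are you?']
print(outputs_for_mode("none", segments))  # []
```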

Tips

Shift mode: Press Shift at any time while recording to use Ctrl+Shift+V instead of Ctrl+V to paste (useful for terminals). Shift can be pressed:
- When starting recording (together with the hotkey)
- At any moment while holding the hotkey
- The earliest Shift press is remembered for the entire recording session
Alt to toggle post-treatment: Press Alt at any time while recording to toggle post-treatment on/off for the current session. This is useful when you have post-treatment configured but want to temporarily disable it for certain inputs (or the reverse).
Multiple sentences: Keep holding the key to transcribe continuously
Pause support: Brief pauses are handled automatically
Live feedback: Watch the terminal to see transcription as it processes
Output mode choice: Use --output-mode full when you want to complete your entire thought before processing, or --no-output-mode to disable output entirely
Post-treatment: Enable for improved accuracy, especially useful for:
- Fixing punctuation and capitalization
- Correcting common speech-to-text errors
- Adapting text style (formal, informal, technical)
- Language-specific corrections

Keyboard Detection

The script automatically detects your physical keyboard. If multiple keyboards are found, you'll be prompted to select one. Virtual keyboards are automatically filtered out. Set --keyboard "partial name" or TWISTT_KEYBOARD=partial name to pre-filter devices and auto-select when only one match remains. Pass --keyboard with no value to always display the selection menu and ignore any configured default.

Post-Treatment (Optional)

Post-treatment uses AI to improve transcription accuracy by correcting errors, fixing punctuation, and applying custom transformations. It's activated automatically when you provide a prompt.

Supported Providers

Transcription

You can choose between different AI providers for transcription:

OpenAI: Uses OpenAI's GPT transcribe models (gpt-4o-transcribe, the default, and gpt-4o-mini-transcribe). Best used without --use-typing.
Deepgram: Uses Deepgram's Nova models (nova-2, nova-3). Truly real-time but more expensive. Works well with --use-typing.

Post-Treatment

You can choose between different AI providers for post-treatment:

OpenAI (default): Uses OpenAI's GPT models
Cerebras: Fast inference with open-source models. Some models are free.
OpenRouter: Access to many different AI models, including paid Cerebras-hosted models such as GPT-OSS.

Each provider requires its own API key, which can be set via environment variables or command-line arguments.

Creating a Post-Treatment Prompt

You can provide instructions directly via command line or use a file for more complex prompts:

Example prompt file (corrections.txt):

Fix any grammar and punctuation errors.
Ensure proper capitalization.
Expand common abbreviations.
Remove filler words like "um" and "uh".
Keep the conversational tone.

Then use it with:

./twistt.py --post-prompt corrections.txt

Post-Treatment Examples

# Simple corrections
./twistt.py --post-prompt "Fix grammar and punctuation"

# Technical writing
./twistt.py --post-prompt "Use technical vocabulary, expand acronyms on first use"

# Formal style
./twistt.py --post-prompt "Make the text more formal and professional"

# Use a more powerful model for complex corrections
./twistt.py --post-prompt complex_rules.txt --post-model gpt-4o

# Use Cerebras for faster processing
export CEREBRAS_API_KEY=csk-...
./twistt.py --post-prompt "Fix errors" --post-provider cerebras --post-model llama3-70b

# Use OpenRouter for access to various models
export OPENROUTER_API_KEY=sk-or-...
./twistt.py --post-prompt "Improve clarity" --post-provider openrouter --post-model anthropic/claude-3-haiku

Note: Post-treatment adds a small delay (typically under 1 second) as it processes the text through the AI model.

Language Support

By default, the tool auto-detects the language you're speaking. You can also specify a language using ISO 639-1 codes:

en - English
fr - French
es - Spanish
de - German
it - Italian
pt - Portuguese
ja - Japanese
zh - Chinese
And many more...

Leave the language parameter empty to use auto-detection.

Troubleshooting

"No physical keyboard detected"

The script needs to monitor keyboard events
Run with appropriate permissions if needed
Select your keyboard manually from the list

"ydotool error"

Ensure ydotool daemon is running: sudo ydotoold &
If using a custom socket path, set it via YDOTOOL_SOCKET environment variable or --ydotool-socket argument

"Permission denied on /dev/input/eventX"

Add your user to the input group: sudo usermod -a -G input $USER
Log out and back in for changes to take effect
Or run with sudo (not recommended for regular use)

Audio issues

Check microphone permissions
Adjust --gain if audio is too quiet/loud
Ensure no other application is using the microphone

Security Notes

Each API key is sent only to the servers of the provider it belongs to (OpenAI, Deepgram, Cerebras, or OpenRouter)
Audio is processed in real-time and not stored locally
Transcriptions are kept in memory during the session and written to the log file unless logging is disabled (--log /dev/null)

Ideas

We maintain a curated list of potential enhancements in IDEAS.md. If you have suggestions or want to pick something up, check it out and open an issue or PR.

Author

Stephane "Twidi" Angel, with the help of @claude and @codex

License

MIT License - See LICENSE file for details
