A Linux speech-to-text transcription tool using OpenAI or Deepgram for STT with push-to-talk functionality.
- Push-to-Talk: Hold a function key (F1-F12) to record and transcribe
- Toggle mode: Double-tap the key to start recording, press again to stop
- Smart transcription: Text appears when you pause or stop speaking
- Auto-output: Automatically outputs transcribed text at cursor position
- Multi-language support: Transcribe in any language supported by the provider
- Configurable audio gain: Amplify microphone input if needed
- Multiple model support: Choose between gpt-4o-transcribe and gpt-4o-mini-transcribe
- Post-treatment: Optional AI-powered correction of transcribed text for improved accuracy
- Linux (tested on X11 and Wayland)
- Python 3.11+
- ydotool for simulating keyboard input (by pasting or typing + pasting)
- OpenAI or Deepgram API key for transcription (depending on provider)
- OpenAI, Cerebras, or OpenRouter API key for post-treatment (if used)
- Microphone access
The script is designed to run with uv, which handles dependencies automatically:
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Run the script (dependencies will be auto-installed)
./twistt.py --help
If you prefer using pip:
# Install dependencies
pip install -r requirements.txt
# Run the script
python twistt.py --help
ydotool is required for output. It's a replacement for xdotool that works on both X11 and Wayland, used here to simulate typing and pasting.
Important: The versions available in Debian/Ubuntu repositories are too old. You'll need to build from source.
For installation instructions, see: https://docs.o-x-l.com/automation/ydotool.html
Here's a simplified systemd service for single-user setup:
# /etc/systemd/system/ydotoold.service
[Unit]
Description=ydotoold (root) for user 1000
# Ensure /run/user/1000 exists
Requires=user-runtime-dir@1000.service
After=user-runtime-dir@1000.service
# Start after display/user session
After=display-manager.service user@1000.service
BindsTo=user@1000.service
[Service]
Type=simple
# Avoid stale socket -> "Connection refused"
ExecStartPre=/usr/bin/rm -f /run/user/1000/.ydotool_socket
ExecStart=/usr/local/sbin/ydotoold --socket-path=/run/user/1000/.ydotool_socket --socket-own=1000:0
Restart=always
RestartSec=2
[Install]
WantedBy=multi-user.target
Note: If you use a custom socket path (as shown above with /run/user/1000/.ydotool_socket), you'll need to specify it when running twistt:
- Via environment variable: YDOTOOL_SOCKET=/run/user/1000/.ydotool_socket ./twistt.py
- Or via argument: ./twistt.py --ydotool-socket /run/user/1000/.ydotool_socket
Set your OpenAI API key(s) using one of these methods (in order of priority):
- Command line argument: --api-key YOUR_KEY
- User config file: ~/.config/twistt/config.env
- Local .env file: Create .env in the script directory
- Environment variable: Export in your shell
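The priority order above amounts to a first-match lookup across sources. The sketch below is only an illustration of that idea: `resolve_api_key` and the four example dicts are hypothetical names, not Twistt's actual code, and the rule that the TWISTT_-prefixed name wins inside a single source is an assumption.

```python
from typing import Mapping, Optional, Sequence


def resolve_api_key(
    sources: Sequence[Mapping[str, str]],
    names: Sequence[str] = ("TWISTT_OPENAI_API_KEY", "OPENAI_API_KEY"),
) -> Optional[str]:
    """Return the first non-empty key, scanning sources in priority order."""
    for source in sources:      # highest-priority source first
        for name in names:      # assumption: the TWISTT_ name wins within a source
            value = source.get(name, "").strip()
            if value:
                return value
    return None


# Hypothetical, already-parsed sources, from highest to lowest priority.
cli_args = {}                                        # no key passed on the command line
user_config = {"TWISTT_OPENAI_API_KEY": "sk-user"}   # ~/.config/twistt/config.env
local_env = {"OPENAI_API_KEY": "sk-local"}           # .env in the script directory
shell_env = {"OPENAI_API_KEY": "sk-shell"}           # exported in the shell

print(resolve_api_key([cli_args, user_config, local_env, shell_env]))  # sk-user
```

With this shape, passing a key on the command line simply means the first dict is no longer empty, so it shadows everything below it.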
Example .env or config.env file:
# OpenAI API key (required if model from OpenAI, by default)
TWISTT_OPENAI_API_KEY=sk-...
# or
OPENAI_API_KEY=sk-...
# Deepgram API key (required if model from Deepgram)
TWISTT_DEEPGRAM_API_KEY=dg_...
# or
DEEPGRAM_API_KEY=dg_...
# Optional settings
TWISTT_HOTKEY=F9 # Single hotkey
TWISTT_HOTKEYS=F8,F9,F10 # Multiple hotkeys (comma-separated)
TWISTT_MODEL=gpt-4o-transcribe # For OpenAI; for Deepgram use e.g. nova-2-general
TWISTT_LANGUAGE=en # Leave empty or omit for auto-detect
TWISTT_SILENCE_DURATION=500 # Milliseconds of silence before ending the current segment
TWISTT_GAIN=1.0
TWISTT_MICROPHONE=Elgato Wave 3 # Optional text filter to auto-select a microphone
TWISTT_DOUBLE_TAP_WINDOW=0.5 # Time window for double-tap detection
TWISTT_KEYBOARD=keychron # Optional text filter to auto-select matching keyboard
TWISTT_YDOTOOL_SOCKET=/run/user/1000/.ydotool_socket # Optional, auto-detected by default
# Output mode
TWISTT_OUTPUT_MODE=batch # batch (default), full, or none
TWISTT_USE_TYPING=false # Type ASCII characters via ydotool instead of copy/paste (slower)
TWISTT_KEYBOARD_DELAY=20 # Delay in milliseconds between keyboard actions (default: 20ms)
# Logging
TWISTT_LOG=/path/to/custom/twistt.log # Optional, defaults to ~/.config/twistt/twistt.log
# Post-treatment settings (optional)
TWISTT_POST_TREATMENT_PROMPT="Fix grammar and punctuation" # Can be text, file path, or multiple separated by '::'
TWISTT_POST_TREATMENT_MODEL=gpt-4o-mini # Model for post-treatment
TWISTT_POST_TREATMENT_PROVIDER=openai # Provider: openai, cerebras, or openrouter
# Post-treatment correct mode (apply corrections in-place with keyboard; requires batch output mode)
TWISTT_POST_TREATMENT_CORRECT=false
# Disable post-treatment entirely (ignores prompts/files)
TWISTT_POST_TREATMENT_DISABLED=false
# Provider-specific API keys (for post-treatment)
TWISTT_CEREBRAS_API_KEY=csk-... # Required if using cerebras provider
TWISTT_OPENROUTER_API_KEY=sk-or-... # Required if using openrouter provider
| Option | Environment Variable | Default | Description |
|---|---|---|---|
| -k, --hotkey | TWISTT_HOTKEY or TWISTT_HOTKEYS | F9 | Push-to-talk key(s) (F1-F12), comma-separated for multiple |
| -kb, --keyboard | TWISTT_KEYBOARD | - | Filter text for automatically selecting the keyboard input device. Pass without a value to force interactive selection and ignore env defaults |
| -dtw, --double-tap-window | TWISTT_DOUBLE_TAP_WINDOW | 0.5 | Time window in seconds for double-tap detection |
| -m, --model | TWISTT_MODEL | gpt-4o-transcribe | Transcription model (for OpenAI or Deepgram) |
| -l, --language | TWISTT_LANGUAGE | Auto-detect | Transcription language (ISO 639-1) |
| -sd, --silence-duration | TWISTT_SILENCE_DURATION | 500 | Silence duration in milliseconds before the transcription service ends the current segment |
| -g, --gain | TWISTT_GAIN | 1.0 | Microphone amplification |
| -mic, --microphone | TWISTT_MICROPHONE | Default input | Text filter or ID to select the microphone. Pass without a value to force interactive selection and ignore env defaults |
| -koa, --openai-api-key | TWISTT_OPENAI_API_KEY or OPENAI_API_KEY | - | OpenAI API key |
| -kdg, --deepgram-api-key | TWISTT_DEEPGRAM_API_KEY or DEEPGRAM_API_KEY | - | Deepgram API key |
| -ys, --ydotool-socket | TWISTT_YDOTOOL_SOCKET or YDOTOOL_SOCKET | Auto-detect | Path to ydotool socket |
| -p, --post-prompt | TWISTT_POST_TREATMENT_PROMPT | - | Post-treatment prompt (text/file). Can be specified multiple times. Within a value, use :: to separate multiple prompts. Prefix any -p value with :: to include env/config variable. Example: -p :: -p file.txt |
| -pm, --post-model | TWISTT_POST_TREATMENT_MODEL | gpt-4o-mini | Model for post-treatment |
| -pp, --post-provider | TWISTT_POST_TREATMENT_PROVIDER | openai | Provider for post-treatment (openai, cerebras, openrouter) |
| -pc, --post-correct, -npc, --no-post-correct | TWISTT_POST_TREATMENT_CORRECT | false | Apply post-treatment by correcting already-output text in-place (only in batch output mode) |
| -np, --no-post | TWISTT_POST_TREATMENT_DISABLED | false | Disable post-treatment regardless of prompts or files |
| -kcb, --cerebras-api-key | TWISTT_CEREBRAS_API_KEY or CEREBRAS_API_KEY | - | Cerebras API key |
| -kor, --openrouter-api-key | TWISTT_OPENROUTER_API_KEY or OPENROUTER_API_KEY | - | OpenRouter API key |
| -o, --output-mode, -no, --no-output-mode | TWISTT_OUTPUT_MODE | batch | Output mode: batch (incremental), full (complete on release), or none (disabled) |
| -t, --use-typing, -nt, --no-use-typing | TWISTT_USE_TYPING | false | Type ASCII characters directly (slower); clipboard still handles non-ASCII. Use -t/--use-typing to enable, -nt/--no-use-typing to disable |
| -kd, --keyboard-delay | TWISTT_KEYBOARD_DELAY | 20 | Delay in milliseconds between keyboard actions (typing, paste, navigation keys). Increase if you experience character ordering issues |
| --log | TWISTT_LOG | ~/.config/twistt/twistt.log | Path to log file where transcription sessions are saved |
| --check | - | - | Display configuration and exit without logging anything to file. Useful for verifying settings before running. |
| --list-configs [DIR] | - | - | List all configuration files found in ~/.config/twistt/ (or DIR if specified) with their variables and exit. API keys are masked, all values are limited to 100 characters. |
| -c, --config PATH | TWISTT_CONFIG | ~/.config/twistt/config.env | Load configuration from file(s). Can be specified multiple times or use :: separator. Later files override earlier ones. Prefix with :: to include default config. Example: -c ::fr.env (default + modifier) |
| -sc, --save-config [PATH] | TWISTT_CONFIG | false | Persist provided command-line values to a config file (defaults to ~/.config/twistt/config.env or TWISTT_CONFIG if set) |
Selecting a microphone sets the PULSE_SOURCE environment variable for Twistt only, so your system default input stays untouched. Run ./twistt.py --microphone without a value to pick from the list even if an environment variable is set.
Use --config (or TWISTT_CONFIG) to load settings from one or more files. You can specify multiple config files either by using -c multiple times or by separating paths with :: in a single argument or environment variable. Later files override values from earlier ones.
Including the default config: Prefix any -c value with :: to include the default config (~/.config/twistt/config.env) as the base, allowing you to use modifier files that only specify what differs. For example, -c ::fr.env combines the default config with fr.env (where fr.env might only set TWISTT_LANGUAGE=fr). Without the :: prefix, -c replaces the default config entirely.
If you provide a relative path that doesn't exist in the current directory, and a file with that name (plus .env) exists in ~/.config/twistt/, it will be used automatically. For example, --config work will use ~/.config/twistt/work.env if work doesn't exist locally.
Use --save-config to capture only the options you explicitly pass on the command line; existing keys in the config file are preserved. Provide a path (or set TWISTT_CONFIG) to control which file gets written.
TWISTT_CONFIG is read only from the process environment; do not place it in .env files or config.env.
Twistt supports two complementary ways to combine configuration files:
1. Multiple config files via -c or TWISTT_CONFIG:
You can specify multiple config files that are loaded in sequence, with later files overriding values from earlier ones:
# Load multiple configs via command line
./twistt.py -c base.env -c project.env -c local.env
# Or using :: separator
./twistt.py -c "base.env::project.env::local.env"
# Or via environment variable
TWISTT_CONFIG="base.env::project.env" ./twistt.py
In these examples:
- base.env is loaded first (lowest priority)
- project.env overrides values from base.env
- local.env overrides values from both base.env and project.env (highest priority)
Using modifier files with the default config:
Create small config files that only specify what differs from your default configuration, then use the :: prefix to combine them:
# Create a French language modifier
echo "TWISTT_LANGUAGE=fr" > ~/.config/twistt/fr.env
# Create a high-gain modifier for quiet microphones
echo "TWISTT_GAIN=3.0" > ~/.config/twistt/loud.env
# Use modifiers with default config
./twistt.py -c ::fr.env # French language + all default settings
./twistt.py -c ::loud.env # High gain + all default settings
./twistt.py -c ::fr.env -c ::loud.env # French + high gain + all defaults
This is particularly useful when you have a well-configured default setup and only want to temporarily change one or two settings.
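The way -c values expand into an ordered list of files, and how later files win, can be sketched as follows. This is a simplified illustration, not Twistt's real loader: `expand_config_args` and `merge_configs` are hypothetical helpers, and it assumes the default config always goes first whenever any value carries the :: prefix.

```python
from typing import Dict, List

DEFAULT_CONFIG = "~/.config/twistt/config.env"


def expand_config_args(values: List[str]) -> List[str]:
    """Turn raw -c values into the ordered list of files to load."""
    if not values:
        return [DEFAULT_CONFIG]          # no -c at all: only the default config
    include_default = False
    files: List[str] = []
    for value in values:
        if value.startswith("::"):       # leading :: requests the default as base
            include_default = True
            value = value[2:]
        files.extend(part for part in value.split("::") if part)
    return ([DEFAULT_CONFIG] if include_default else []) + files


def merge_configs(configs: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge parsed configs in load order; later values override earlier ones."""
    merged: Dict[str, str] = {}
    for config in configs:
        merged.update(config)
    return merged


print(expand_config_args(["::fr.env", "::loud.env"]))
# ['~/.config/twistt/config.env', 'fr.env', 'loud.env']
print(merge_configs([{"TWISTT_LANGUAGE": "en", "TWISTT_GAIN": "1.0"},
                     {"TWISTT_LANGUAGE": "fr"}]))
# {'TWISTT_LANGUAGE': 'fr', 'TWISTT_GAIN': '1.0'}
```

The second call mirrors the fr.env modifier: only TWISTT_LANGUAGE changes, everything else comes from the earlier file.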
2. Parent config inheritance via TWISTT_PARENT_CONFIG:
Individual config files can define TWISTT_PARENT_CONFIG to inherit from another config file. Values in the child file take precedence over the parent:
# ~/.config/twistt/config.env - shared settings
TWISTT_OPENAI_API_KEY=sk-...
...
# ~/.config/twistt/gpt.env - inherits the base config and uses an OpenAI model without typing mode (typing is not recommended with it)
TWISTT_PARENT_CONFIG=config.env
TWISTT_MODEL=gpt-4o-transcribe
TWISTT_USE_TYPING=false
# ~/.config/twistt/nova.env - inherits the base config and uses the nova-2 model with typing mode (which suits it well)
TWISTT_PARENT_CONFIG=config.env
TWISTT_MODEL=nova-2
TWISTT_USE_TYPING=true
In these examples, nova.env and gpt.env live in ~/.config/twistt/, so they can be loaded by name: ./twistt.py --config nova or ./twistt.py --config gpt (no need to pass the full path or the .env extension).
Parent paths can be relative (resolved from the child config's directory) or absolute. Circular references are detected and will cause an error.
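Parent inheritance with cycle detection can be pictured as a small recursive merge. The sketch below works on already-parsed configs held in a dict, purely for illustration; `load_with_parents` is a hypothetical helper, and path resolution relative to the child's directory is left out.

```python
from typing import Dict, List, Optional

PARENT_KEY = "TWISTT_PARENT_CONFIG"


def load_with_parents(
    name: str,
    files: Dict[str, Dict[str, str]],
    seen: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Merge a config with its chain of parents; child values take precedence."""
    seen = (seen or []) + [name]
    own = dict(files[name])              # copy so the input is not modified
    parent = own.pop(PARENT_KEY, None)
    if parent is None:
        return own
    if parent in seen:                   # the parent is already in the chain
        chain = " -> ".join(seen + [parent])
        raise ValueError(f"Circular parent config: {chain}")
    merged = load_with_parents(parent, files, seen)
    merged.update(own)                   # child overrides parent
    return merged


# Hypothetical parsed files, mirroring the nova.env example above.
files = {
    "config.env": {"TWISTT_OPENAI_API_KEY": "sk-...", "TWISTT_USE_TYPING": "false"},
    "nova.env": {PARENT_KEY: "config.env", "TWISTT_MODEL": "nova-2",
                 "TWISTT_USE_TYPING": "true"},
}
print(load_with_parents("nova.env", files))
# {'TWISTT_OPENAI_API_KEY': 'sk-...', 'TWISTT_USE_TYPING': 'true', 'TWISTT_MODEL': 'nova-2'}
```

Tracking the chain of visited names rather than a single flag makes the error message show the full loop, which helps when several files inherit from one another.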
Combining both approaches:
You can mix multiple config files and parent inheritance. For example:
# Load base config with its parent, then override with local settings
./twistt.py -c gpt.env -c local.env
This will:
- Load config.env (parent of gpt.env)
- Load gpt.env (overrides config.env)
- Load local.env (overrides both config.env and gpt.env)
Use --list-configs to see all configuration files in ~/.config/twistt/ and their variables:
./twistt.py --list-configs
# Or list configs from a specific directory
./twistt.py --list-configs /path/to/configs
This displays:
- All .env files in the config directory, sorted alphabetically
- For each file:
  - Filename with parent config shown in parentheses if defined
  - All variables in alphabetical order
  - API keys are masked (only first 3 characters + "...")
  - All values are limited to 100 characters with newlines replaced by spaces
  - "..." is appended only if the value exceeds 100 characters
Example output:
Configuration files found in: /home/user/.config/twistt
config.env
TWISTT_HOTKEY = F8,F9
TWISTT_LANGUAGE = fr
TWISTT_OPENAI_API_KEY = sk-...
TWISTT_POST_TREATMENT_PROMPT = Fix grammar and punctuation. Remove filler words like "um" and "uh". Keep the conversational...
fr.env
TWISTT_LANGUAGE = fr
gpt.env (parent config: ~/.config/twistt/config.env)
TWISTT_MODEL = gpt-4o-transcribe
TWISTT_USE_TYPING = false
This is useful for:
- Discovering what config files you have
- Understanding config inheritance relationships
- Verifying variable values without opening files
- Security: checking API keys are set without revealing full values
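The masking and truncation rules used by --list-configs are simple enough to sketch. The helper below is hypothetical (`display_value` is not a Twistt function), and detecting API keys by the _API_KEY suffix is an assumption made for the example.

```python
def display_value(name: str, value: str, limit: int = 100) -> str:
    """Format a config value for listing: mask keys, flatten and truncate the rest."""
    if name.endswith("_API_KEY"):            # assumption: keys are detected by suffix
        return value[:3] + "..."             # keep only the first 3 characters
    flat = value.replace("\r\n", " ").replace("\n", " ")
    if len(flat) > limit:
        return flat[:limit] + "..."          # "..." only when something was cut
    return flat


print(display_value("TWISTT_OPENAI_API_KEY", "sk-abcdef123456"))  # sk-...
print(display_value("TWISTT_LANGUAGE", "fr"))                     # fr
print(len(display_value("TWISTT_POST_TREATMENT_PROMPT", "x" * 250)))  # 103
```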
All transcription sessions are automatically logged to a file. By default, logs are saved to ~/.config/twistt/twistt.log. You can customize the log file location using:
- Command-line argument:
--log /path/to/logfile.log - Environment variable:
TWISTT_LOG=/path/to/logfile.log
The log file contains:
- Configuration panel (displayed at startup)
- Completed transcription sessions with timestamps
- Both raw transcription and post-treatment results (if enabled)
Note: Live updates during recording are not logged, only finalized sessions are saved.
To disable logging, point the log file to /dev/null:
./twistt.py --log /dev/null
The --post-prompt argument and TWISTT_POST_TREATMENT_PROMPT environment variable support multiple prompts that can be combined.
Multiple prompts with :: separator:
You can specify multiple prompts separated by ::. Each part is resolved independently as either a file (if it exists) or literal text, then all parts are combined with double newlines between them:
# Environment variable examples
TWISTT_POST_TREATMENT_PROMPT="prompt1.txt::Fix grammar::prompt2.txt"
TWISTT_POST_TREATMENT_PROMPT="corrections.txt::Make it formal"
File resolution for each part:
- Absolute paths are checked directly
- Relative paths are searched in: current directory → script directory → ~/.config/twistt/
- Shell expansion such as ~ is supported
- When the filename has no extension, Twistt tries with no extension, then .txt and .prompt variants
- If a file is found, its content is used; otherwise the value is treated as direct text
- Empty files are rejected
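The resolution rules above can be sketched with pathlib. This is a minimal illustration under stated assumptions, not the actual implementation: `find_prompt_file` and `resolve_prompt_part` are hypothetical names, the search directories are passed in by the caller, and trying every extension in one directory before moving to the next is a guess about ordering.

```python
from pathlib import Path
from typing import List, Optional


def find_prompt_file(part: str, search_dirs: List[Path]) -> Optional[Path]:
    """Return the first existing file matching `part`, or None."""
    raw = Path(part).expanduser()                     # supports ~ expansion
    bases = [raw] if raw.is_absolute() else [d / raw for d in search_dirs]
    suffixes = [""] if raw.suffix else ["", ".txt", ".prompt"]
    for base in bases:
        for suffix in suffixes:
            try:
                candidate = base.with_name(base.name + suffix)
                if candidate.is_file():
                    return candidate
            except (OSError, ValueError):
                # Long literal prompts are not valid file names; skip them.
                continue
    return None


def resolve_prompt_part(part: str, search_dirs: List[Path]) -> str:
    """Return file content if `part` names a file, else `part` as literal text."""
    if not part.strip():
        return part
    path = find_prompt_file(part, search_dirs)
    if path is None:
        return part                                   # treated as direct text
    text = path.read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError(f"Prompt file is empty: {path}")
    return text


# Search order from the list above (the script directory is a placeholder here).
dirs = [Path.cwd(), Path(__file__).parent if "__file__" in globals() else Path.cwd(),
        Path("~/.config/twistt").expanduser()]
parts = [resolve_prompt_part(p, dirs) for p in "Fix grammar::Make it formal".split("::")]
combined = "\n\n".join(parts)                         # parts joined by a blank line
```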
Using -p / --post-prompt argument:
The -p flag can be specified multiple times and supports two modes:
- Replace mode (default) - ignores environment variable:
  ./twistt.py -p "Fix grammar" # Uses only this prompt
  ./twistt.py -p "prompt1.txt::Make it formal" # Combines these two
  ./twistt.py -p file1.txt -p "Fix grammar" # Multiple -p: file1.txt + literal text
- Append mode (prefix ANY -p value with ::) - includes environment variable:
  # If TWISTT_POST_TREATMENT_PROMPT="base.txt"
  ./twistt.py -p "::" # Uses only base.txt (env var)
  ./twistt.py -p "::extra.txt" # Combines: base.txt + extra.txt
  ./twistt.py -p :: -p file1.txt # Combines: base.txt + file1.txt
  ./twistt.py -p file1.txt -p "::file2.txt" # Combines: base.txt + file1.txt + file2.txt
Key points:
- You can use -p multiple times: -p file1.txt -p file2.txt -p "Fix grammar"
- If ANY -p value starts with ::, the environment variable is included first
- Order: env var (if requested) → all -p values in order (with :: prefix removed)
- Each -p value can contain :: separators for multiple prompts within one argument
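The ordering rules above can be checked against a small model. The function below is a hypothetical sketch (`combine_prompts` is not part of Twistt) that only computes the ordered list of prompt parts; resolving each part as a file or as literal text would happen afterwards.

```python
from typing import List, Optional


def combine_prompts(p_values: List[str], env_prompt: Optional[str]) -> List[str]:
    """Return prompt parts in the order they would be combined."""
    include_env = any(value.startswith("::") for value in p_values)
    parts: List[str] = []
    # The env var is used when no -p is given, or when any -p value starts with ::
    if env_prompt and (include_env or not p_values):
        parts.extend(p for p in env_prompt.split("::") if p)
    for value in p_values:
        stripped = value[2:] if value.startswith("::") else value
        parts.extend(p for p in stripped.split("::") if p)
    return parts


env = "base.txt"
print(combine_prompts(["::"], env))                        # ['base.txt']
print(combine_prompts(["::extra.txt"], env))               # ['base.txt', 'extra.txt']
print(combine_prompts(["file1.txt", "::file2.txt"], env))  # ['base.txt', 'file1.txt', 'file2.txt']
print(combine_prompts(["Fix grammar"], env))               # ['Fix grammar']
```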
Examples:
# Single prompts
./twistt.py -p translate # Uses translate.txt if exists, else literal text
./twistt.py -p "Fix grammar" # Direct text
./twistt.py -p ./prompts/formal.txt # Explicit file path
# Multiple prompts via environment
TWISTT_POST_TREATMENT_PROMPT="base.txt::Fix grammar"
./twistt.py # Uses both prompts combined
# Multiple -p arguments
./twistt.py -p file1.txt -p "Fix grammar" -p file2.txt
# Mixing :: separator and multiple -p
./twistt.py -p "prompt1.txt::Make formal" -p prompt2.txt
# Including environment variable
TWISTT_POST_TREATMENT_PROMPT="base.txt"
./twistt.py -p :: # Uses only base.txt
./twistt.py -p "::extra.txt" # Uses base.txt + extra.txt
./twistt.py -p :: -p custom.txt # Uses base.txt + custom.txt
./twistt.py -p file1.txt -p "::file2.txt" # Uses base.txt + file1.txt + file2.txt
# Disable post-treatment
./twistt.py --no-post
# Start with default settings (F9 key, auto-detect language)
./twistt.py
# Use F5 key with English transcription
./twistt.py --hotkey F5 --language en
# Use multiple hotkeys
./twistt.py --hotkey F8,F9,F10
# Force French language
./twistt.py --language fr
# Increase microphone sensitivity
./twistt.py --gain 2.0
# Enable post-treatment to fix grammar and punctuation
./twistt.py --post-prompt "Fix grammar, punctuation, and obvious errors"
# Use a file for more complex post-treatment instructions
./twistt.py --post-prompt instructions.txt
# Specify a different model for post-treatment
./twistt.py --post-prompt "Make the text more formal" --post-model gpt-4o
# Use Cerebras for post-treatment (faster inference)
./twistt.py --post-prompt "Fix errors" --post-provider cerebras --post-model llama3-8b
# Use OpenRouter for post-treatment (access to many models)
./twistt.py --post-prompt "Fix errors" --post-provider openrouter --post-model meta-llama/llama-3.2-3b-instruct
# Post-treatment correct mode: output raw immediately then update in place via post-treatment
./twistt.py --post-prompt "Fix grammar" --post-correct
# Use full output mode (wait for hotkey release to output/process)
./twistt.py --output-mode full
# Type ASCII characters directly (slower; non-ASCII characters are still handled via clipboard)
./twistt.py --use-typing
# Use Deepgram as provider
TWISTT_PROVIDER=deepgram TWISTT_DEEPGRAM_API_KEY=dg_xxx ./twistt.py --model nova-2-general --language fr
# Save your preferred options for next time
./twistt.py --language fr --gain 2.0 --microphone "Elgato Wave 3" --save-config
# Save to a custom config file
./twistt.py --language fr --gain 2.0 --save-config ~/.config/twistt/presets/french.env
# Load a custom preset
./twistt.py --config ~/.config/twistt/french.env
./twistt.py --config french # equivalent to the one above
./twistt.py --config /path/to/gaming.env
# Load multiple config files (later files override earlier ones)
./twistt.py --config base.env --config local.env
./twistt.py -c "base.env::project.env::local.env"
# Use modifier files with default config (:: prefix includes default)
./twistt.py -c ::fr.env # Combines default config + fr.env modifier
./twistt.py -c :: -c local.env # Combines default config + local.env
./twistt.py -c :: # Uses only default config explicitly
# Specify a custom log file
./twistt.py --log /tmp/twistt-debug.log
# Disable logging (output to /dev/null)
./twistt.py --log /dev/null
# Check configuration without starting (useful to verify settings)
./twistt.py --check
./twistt.py --config french --check # Verify a specific config
# List all available config files and their variables
./twistt.py --list-configs
./twistt.py --list-configs /path/to/configs # List from custom directory
Twistt supports two recording modes:
Push-to-talk mode:
- Start the script: Run ./twistt.py
- Position cursor: Click where you want text to appear
- Hold to record: Press and hold one of your configured hotkeys (default: F9)
- Speak: Talk while holding the key
- Release to transcribe: Let go of the key
- Auto-output: Text is automatically output at cursor position
Toggle mode:
- Start the script: Run ./twistt.py
- Position cursor: Click where you want text to appear
- Double-tap to start: Press-release-press the same hotkey quickly (within 0.5s by default)
- Speak freely: Recording continues without holding any key
- Press to stop: Press the same hotkey once to stop recording (only the hotkey that started toggle mode can stop it)
- Auto-output: Text is automatically output at cursor position
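Double-tap detection comes down to comparing the time between two presses of the same key with the configured window. The class below is a rough sketch, not Twistt's code: `DoubleTapDetector` is a hypothetical name, timestamps are passed in explicitly, and measuring the window from press to press (rather than from release) is an assumption.

```python
from typing import Optional


class DoubleTapDetector:
    """Report whether a key press completes a double tap of the same key."""

    def __init__(self, window: float = 0.5) -> None:
        self.window = window                      # seconds, like --double-tap-window
        self._last_key: Optional[str] = None
        self._last_press: Optional[float] = None

    def on_press(self, key: str, now: float) -> bool:
        is_double = (
            self._last_key == key
            and self._last_press is not None
            and now - self._last_press <= self.window
        )
        # After a double tap, clear the history so a third press starts fresh.
        self._last_key = None if is_double else key
        self._last_press = None if is_double else now
        return is_double


detector = DoubleTapDetector(window=0.5)
print(detector.on_press("F9", 10.0))   # False: first press
print(detector.on_press("F9", 10.3))   # True: same key within 0.5 s
print(detector.on_press("F9", 12.0))   # False: history was cleared
```

In a real event loop the caller would pass a monotonic clock value (for example from time.monotonic()) and ignore key auto-repeat events.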
The transcription appears where the cursor is located.
An indicator (the text "(Twisting...)") is shown at the cursor position while recording is active, text is being output, or post-treatment is running.
Twistt supports three output modes that control when text is processed and output:
- batch mode (default): Text is processed and can be output incrementally as you speak. Each pause triggers processing of that segment. With post-treatment enabled, each segment maintains context from previous segments.
- full mode: All text is accumulated while you hold the key and only processed/output when you release it. With post-treatment, the entire text is processed at once without maintaining context between sessions. This mode is useful when you want to speak a complete thought before any processing occurs.
- none: Twistt skips all output entirely. Transcription and post-treatment still run (just like batch mode), but nothing is pasted or typed at the cursor position. Use when you only want live feedback in the terminal or plan to copy results manually later.
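The difference between the three modes can be summarized by what gets sent to the cursor for a given list of transcribed segments. The function below is a toy model only: `outputs_for` is a hypothetical name, and joining segments with a single space in full mode is an assumption.

```python
from typing import List


def outputs_for(mode: str, segments: List[str]) -> List[str]:
    """Return the chunks written at the cursor for one recording session."""
    if mode == "batch":
        return list(segments)                       # one output per pause/segment
    if mode == "full":
        return [" ".join(segments)] if segments else []   # single output on release
    if mode == "none":
        return []                                   # transcribed, but never output
    raise ValueError(f"Unknown output mode: {mode}")


segments = ["Hello there.", "How are you?"]
print(outputs_for("batch", segments))  # ['Hello there.', 'How are you?']
print(outputs_for("full", segments))   # ['Hello there. How are you?']
print(outputs_for("none", segments))   # []
```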
- Shift mode: Press Shift at any time while recording to use Ctrl+Shift+V instead of Ctrl+V to paste (useful for terminals). Shift can be pressed:
  - When starting recording (together with the hotkey)
  - At any moment while holding the hotkey
  - The earliest Shift press is remembered for the entire recording session
- Alt to toggle post-treatment: Press Alt at any time while recording to toggle post-treatment on/off for the current session. This is useful when you have post-treatment configured but want to temporarily disable it for certain inputs (or the reverse).
- Multiple sentences: Keep holding the key to transcribe continuously
- Pause support: Brief pauses are handled automatically
- Live feedback: Watch the terminal to see transcription as it processes
- Output mode choice: Use --output-mode full when you want to complete your entire thought before processing, or --no-output-mode to disable output entirely
- Post-treatment: Enable for improved accuracy, especially useful for:
  - Fixing punctuation and capitalization
  - Correcting common speech-to-text errors
  - Adapting text style (formal, informal, technical)
  - Language-specific corrections
The script automatically detects your physical keyboard. If multiple keyboards are found, you'll be prompted to select one. Virtual keyboards are automatically filtered out. Set --keyboard "partial name" or TWISTT_KEYBOARD=partial name to pre-filter devices and auto-select when only one match remains. Pass --keyboard with no value to always display the selection menu and ignore any configured default.
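The pre-filter behaviour can be sketched as a substring match that only auto-selects when exactly one device remains. `auto_select` is a hypothetical helper and the device names are invented; case-insensitive matching is an assumption suggested by the lowercase TWISTT_KEYBOARD=keychron example.

```python
from typing import List, Optional


def auto_select(device_names: List[str], text_filter: Optional[str]) -> Optional[str]:
    """Return the single matching device, or None if the user must choose."""
    if text_filter:
        needle = text_filter.casefold()           # assumption: case-insensitive match
        matches = [n for n in device_names if needle in n.casefold()]
    else:
        matches = list(device_names)
    return matches[0] if len(matches) == 1 else None


# Invented device names for the example.
keyboards = ["Keychron K2 Keyboard", "Logitech MX Keys", "Keychron Q1 Keyboard"]
print(auto_select(keyboards, "logitech"))   # Logitech MX Keys
print(auto_select(keyboards, "keychron"))   # None: two matches, show the menu
print(auto_select(keyboards, None))         # None: no filter, several devices
```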
Post-treatment uses AI to improve transcription accuracy by correcting errors, fixing punctuation, and applying custom transformations. It's activated automatically when you provide a prompt.
You can choose between different AI providers for transcription:
- OpenAI: Uses OpenAI's GPT transcribe models (gpt-4o-transcribe (default), gpt-4o-mini-transcribe). Best used without --use-typing.
- Deepgram: Uses Deepgram's Nova models (nova-2, nova-3). Truly real-time but more expensive. Works well with --use-typing.
You can choose between different AI providers for post-treatment:
- OpenAI (default): Uses OpenAI's GPT models
- Cerebras: Fast inference with open-source models. Some models can be used for free.
- OpenRouter: Access to many different AI models, including paid Cerebras-hosted models such as GPT-OSS.
Each provider requires its own API key, which can be set via environment variables or command-line arguments.
You can provide instructions directly via command line or use a file for more complex prompts:
Example prompt file (corrections.txt):
Fix any grammar and punctuation errors.
Ensure proper capitalization.
Expand common abbreviations.
Remove filler words like "um" and "uh".
Keep the conversational tone.
Then use it with:
./twistt.py --post-prompt corrections.txt
# Simple corrections
./twistt.py --post-prompt "Fix grammar and punctuation"
# Technical writing
./twistt.py --post-prompt "Use technical vocabulary, expand acronyms on first use"
# Formal style
./twistt.py --post-prompt "Make the text more formal and professional"
# Use a more powerful model for complex corrections
./twistt.py --post-prompt complex_rules.txt --post-model gpt-4o
# Use Cerebras for faster processing
export CEREBRAS_API_KEY=csk-...
./twistt.py --post-prompt "Fix errors" --post-provider cerebras --post-model llama3-70b
# Use OpenRouter for access to various models
export OPENROUTER_API_KEY=sk-or-...
./twistt.py --post-prompt "Improve clarity" --post-provider openrouter --post-model anthropic/claude-3-haiku
Note: Post-treatment adds a small delay (typically under 1 second) as it processes the text through the AI model.
By default, the tool auto-detects the language you're speaking. You can also specify a language using ISO 639-1 codes:
- en - English
- fr - French
- es - Spanish
- de - German
- it - Italian
- pt - Portuguese
- ja - Japanese
- zh - Chinese
- And many more...
Leave the language parameter empty to use auto-detection.
- The script needs to monitor keyboard events
- Run with appropriate permissions if needed
- Select your keyboard manually from the list
- Ensure ydotool daemon is running: sudo ydotoold &
- If using a custom socket path, set it via YDOTOOL_SOCKET environment variable or --ydotool-socket argument
- Add your user to the input group: sudo usermod -a -G input $USER
- Log out and back in for changes to take effect
- Or run with sudo (not recommended for regular use)
- Check microphone permissions
- Adjust --gain if audio is too quiet/loud
- Ensure no other application is using the microphone
- Each API key is sent only to the servers of the provider it belongs to (OpenAI by default)
- Audio is processed in real-time and not stored locally
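- Transcriptions are kept in memory during the session; finalized sessions are also written to the log file unless logging is disabled (for example with --log /dev/null)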
We maintain a curated list of potential enhancements in IDEAS.md. If you have suggestions or want to pick something up, check it out and open an issue or PR.
Stephane "Twidi" Angel, with the help of @claude and @codex
MIT License - See LICENSE file for details