A large language model for text generation and reasoning, optimized to run on hardware ranging from embedded boards to desktop GPUs and CPUs.

| Parameter | Description | Default | Range |
|---|---|---|---|
| max_tokens | Maximum number of tokens to generate | 256 | 1-2048 |
| temperature | Controls randomness of sampling (higher = more random) | 0.8 | 0.0-2.0 |
| top_p | Nucleus sampling: sample only from the smallest set of tokens whose cumulative probability reaches this value | 0.95 | 0.0-1.0 |
| top_k | Restricts sampling to the k most probable tokens | 40 | 1-100 |
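
To make the interaction between temperature, top_k, and top_p concrete, here is a minimal sketch of how these three settings are commonly combined when picking the next token. It is an illustration of the general technique, not this model's actual sampler: the function name `sample_token`, the order in which the filters are applied, and the treatment of temperature 0 as greedy decoding are all assumptions.

```python
import math
import random


def sample_token(logits, temperature=0.8, top_k=40, top_p=0.95, rng=random):
    """Pick one token index from raw logits (hypothetical helper, not the model's API)."""
    if temperature <= 0.0:
        # Assumption of this sketch: temperature 0 means greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature: divide logits before the softmax; higher values flatten the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # top_k: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the surviving tokens; random.choices renormalizes the weights.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


if __name__ == "__main__":
    toy_logits = [1.0, 5.0, 2.0, 0.5]  # made-up values for a four-token vocabulary
    print(sample_token(toy_logits))
```

With the defaults from the table, lowering temperature or top_p narrows the choice toward the most probable token, while raising them lets less likely tokens through.
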

| Hardware | Tokens/Second | Memory Usage |
|---|---|---|
| Jetson AGX Orin | ~15 | ~8 GB |
| RTX 4090 | ~50 | ~12 GB |
| Intel i7 CPU | ~5 | ~6 GB |
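
As a rough guide to what these throughput figures mean in practice, the sketch below divides a token budget by the approximate tokens-per-second value for each platform. The helper name `estimated_seconds` is hypothetical, and the estimate assumes a steady generation rate, ignoring model load time and prompt processing.

```python
# Approximate throughput figures copied from the table (tokens per second).
TOKENS_PER_SECOND = {
    "Jetson AGX Orin": 15,
    "RTX 4090": 50,
    "Intel i7 CPU": 5,
}


def estimated_seconds(hardware, max_tokens=256):
    """Rough generation time: token budget divided by approximate throughput."""
    return max_tokens / TOKENS_PER_SECOND[hardware]


if __name__ == "__main__":
    for name in TOKENS_PER_SECOND:
        print(f"{name}: ~{estimated_seconds(name):.0f} s for 256 tokens")
```

At the default max_tokens of 256, this works out to roughly 17 s on the Jetson AGX Orin, 5 s on the RTX 4090, and 51 s on the Intel i7 CPU.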