# DeepSeek R1 Model
DeepSeek R1 is a powerful large language model (LLM) designed for text generation, reasoning, and code-generation tasks. With InferX, you can run DeepSeek R1 on any device using the same API, from edge devices to powerful servers.

## Features
- Advanced Reasoning: State-of-the-art reasoning and problem-solving capabilities
- Code Generation: Expert-level code generation and explanation
- Cross-Platform: Same code works on Jetson, GPU, or CPU
- Hardware-Optimized: Automatically detects and optimizes for your hardware
- Real-time Processing: Optimized for fast inference across all platforms
## Installation
DeepSeek R1 is included with InferX, so no separate installation is required.

## Basic Usage
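The original usage example did not survive in this copy of the page. Below is a minimal sketch of what basic usage might look like; the `inferx` module name, the `InferX` class, and the `generate` method are assumptions for illustration, not the documented InferX API.

```python
# Hypothetical API sketch -- module, class, and method names are assumptions.
from inferx import InferX

# Load DeepSeek R1; InferX selects the best backend for the current hardware.
model = InferX("deepseek-r1")

# Generate a completion with default sampling settings.
output = model.generate("Explain quantum entanglement in one paragraph.")
print(output)
```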
## Advanced Usage
### Text Generation
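No code survived in this section; the sketch below assumes the same hypothetical `InferX` interface as above (none of these names are confirmed against the real API), with sampling parameters passed explicitly.

```python
# Hypothetical API sketch -- names and keywords are assumptions.
from inferx import InferX

model = InferX("deepseek-r1")

# Longer-form generation with explicit sampling parameters
# (parameter names follow the table in the Parameters section).
story = model.generate(
    "Write a short story about a robot learning to paint.",
    max_tokens=512,
    temperature=0.8,
    top_p=0.95,
)
print(story)
```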
### Code Generation
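The example for this section is also missing. A hedged sketch against the same hypothetical interface: for code generation, a lower temperature generally produces more deterministic, syntactically safe output.

```python
# Hypothetical API sketch -- names are assumptions, not the documented interface.
from inferx import InferX

model = InferX("deepseek-r1")

# Low temperature biases toward the most likely (usually correct) tokens.
code = model.generate(
    "Write a Python function that returns the n-th Fibonacci number.",
    max_tokens=256,
    temperature=0.2,
)
print(code)
```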
### System Prompts
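A system prompt steers the model's persona and answer style across a whole session. The sketch below assumes a `system_prompt` keyword on the same hypothetical `generate` method; the real InferX parameter name may differ.

```python
# Hypothetical API sketch -- the `system_prompt` keyword is an assumption.
from inferx import InferX

model = InferX("deepseek-r1")

answer = model.generate(
    "How do I reverse a linked list?",
    system_prompt="You are a concise senior engineer. Answer with code first.",
)
print(answer)
```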
## Parameters
| Parameter | Description | Default | Range |
|---|---|---|---|
| `max_tokens` | Maximum number of tokens to generate | 256 | 1-2048 |
| `temperature` | Sampling randomness (higher = more random) | 0.8 | 0.0-2.0 |
| `top_p` | Nucleus-sampling probability mass | 0.95 | 0.0-1.0 |
| `top_k` | Restricts sampling to the k most likely tokens | 40 | 1-100 |
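To make these parameters concrete, here is a small, self-contained sketch of how temperature scaling, top-k, and top-p (nucleus) filtering combine during token sampling. This illustrates the standard technique only; it is not InferX's internal implementation.

```python
import math
import random

def sample_token(logits: dict[str, float], temperature: float = 0.8,
                 top_k: int = 40, top_p: float = 0.95) -> str:
    """Pick one token from raw logits using temperature, top-k, then top-p."""
    # 1. Temperature scaling: higher temperature flattens the distribution.
    scaled = {t: l / temperature for t, l in logits.items()}
    # 2. Softmax to probabilities (max-subtraction for numerical stability).
    m = max(scaled.values())
    exps = {t: math.exp(l - m) for t, l in scaled.items()}
    z = sum(exps.values())
    probs = {t: e / z for t, e in exps.items()}
    # 3. top-k: keep only the k most likely tokens.
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # 4. top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    nucleus, mass = [], 0.0
    for t, p in kept:
        nucleus.append((t, p))
        mass += p
        if mass >= top_p:
            break
    # 5. Renormalize the surviving tokens and sample one.
    total = sum(p for _, p in nucleus)
    return random.choices([t for t, _ in nucleus],
                          weights=[p / total for _, p in nucleus])[0]

print(sample_token({"the": 2.0, "a": 1.5, "cat": 0.5, "xyzzy": -3.0}))
```

With `top_k=1` this always returns the single most likely token; raising `temperature` spreads probability onto less likely tokens before the top-k/top-p cuts.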
## Performance
InferX optimizes DeepSeek R1 for your hardware. Approximate throughput and memory usage:

| Hardware | Tokens/Second | Memory Usage |
|---|---|---|
| Jetson AGX Orin | ~15 | ~8 GB |
| RTX 4090 | ~50 | ~12 GB |
| Intel i7 CPU | ~5 | ~6 GB |
## Example Applications
### Chatbot Development
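A chatbot keeps the running conversation in the prompt so the model sees prior turns. The self-contained sketch below shows only this history-management pattern; `model_reply` is a placeholder standing in for a real DeepSeek R1 call through InferX.

```python
def model_reply(prompt: str) -> str:
    # Placeholder: replace with an actual DeepSeek R1 call via InferX.
    return f"(model reply to {prompt.splitlines()[-1]!r})"

def chat(history: list[tuple[str, str]], user_msg: str) -> str:
    """Append the user turn, build a prompt from the full history, and reply."""
    history.append(("user", user_msg))
    prompt = "\n".join(f"{role}: {text}" for role, text in history) + "\nassistant:"
    reply = model_reply(prompt)
    history.append(("assistant", reply))
    return reply

history: list[tuple[str, str]] = []
print(chat(history, "Hello!"))
print(chat(history, "What can you do?"))
```

For long sessions you would also truncate or summarize old turns so the prompt stays within the model's context window.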
### Code Assistant
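A code assistant typically wraps the user's request in a task template before sending it to the model. A minimal, self-contained sketch of that prompt-building step (the template wording is illustrative, not part of InferX):

```python
def build_code_prompt(task: str, language: str = "python") -> str:
    """Wrap a user task in a code-assistant prompt template."""
    return (
        f"You are an expert {language} developer.\n"
        f"Task: {task}\n"
        f"Respond with a single {language} code block and a brief explanation."
    )

prompt = build_code_prompt("Parse a CSV file and sum the second column.")
print(prompt)
# Send `prompt` to DeepSeek R1 via InferX, ideally with a low temperature.
```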
### Hardware Detection
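InferX performs this detection automatically. The standard-library sketch below only illustrates the kind of check involved; it is not InferX's actual detection code.

```python
import platform
import shutil

def detect_backend() -> str:
    """Rough illustration of hardware detection.
    If an NVIDIA driver is present (`nvidia-smi` on PATH), assume a GPU;
    on an ARM machine with an NVIDIA driver, assume a Jetson-class device."""
    if shutil.which("nvidia-smi"):
        return "jetson" if platform.machine() == "aarch64" else "gpu"
    return "cpu"

print(detect_backend())
```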
## Next Steps
- Try the CLIP model for multimodal understanding
- Explore practical examples
- Learn about custom model optimization