# DeepSeek R1 Model

DeepSeek R1 is a powerful large language model (LLM) designed for text generation, reasoning, and code generation tasks. With InferX, you can run DeepSeek R1 on any device using the same API, from edge devices to powerful servers.
## Features

- **Advanced Reasoning**: State-of-the-art reasoning and problem-solving capabilities
- **Code Generation**: Expert-level code generation and explanation
- **Cross-Platform**: Same code works on Jetson, GPU, or CPU
- **Hardware-Optimized**: Automatically detects and optimizes for your hardware
- **Real-time Processing**: Optimized for fast inference across all platforms
## Installation

DeepSeek R1 is included with InferX:

```bash
pip install git+https://github.com/exla-ai/InferX.git
```
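To confirm the installation, a quick import check (using the same module path as the examples below):

```python
# Sanity check: the model class should be importable after install.
from inferx.models.deepseek_r1 import deepseek_r1

print("InferX import OK")
```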
## Basic Usage

```python
from inferx.models.deepseek_r1 import deepseek_r1

# Initialize the model (automatically detects your hardware)
model = deepseek_r1()

# Run the interactive interface
model.run()
```
## Advanced Usage

### Text Generation

```python
# Generate text with custom parameters
response = model.generate(
    prompt="Write a short poem about artificial intelligence.",
    max_tokens=100,
    temperature=0.7,
    top_p=0.9,
    top_k=40
)
print(response)
```
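The same call works in a loop for simple batch generation. A minimal sketch, reusing the `model` instance and the parameters documented above:

```python
# Generate responses for several prompts with shared sampling settings.
prompts = [
    "Summarize the theory of relativity in one sentence.",
    "Write a haiku about the ocean.",
]

for prompt in prompts:
    response = model.generate(
        prompt=prompt,
        max_tokens=100,
        temperature=0.7,
    )
    print(f"Prompt: {prompt}\nResponse: {response}\n")
```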
### Code Generation

```python
# Generate and explain code
code_response = model.generate(
    prompt="Write a Python function to calculate the Fibonacci sequence",
    max_tokens=200,
    temperature=0.3  # Lower temperature for more deterministic code
)
print(code_response)
```
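Responses often wrap code in markdown fences. A hypothetical helper (not part of InferX) to pull out just the code:

```python
import re

FENCE = "`" * 3  # literal triple backtick, built this way to keep the snippet readable

def extract_code(response: str) -> str:
    """Return the first fenced code block in a response,
    or the raw text if no fence is found."""
    match = re.search(FENCE + r"(?:\w+)?\n(.*?)" + FENCE, response, re.DOTALL)
    return match.group(1).strip() if match else response.strip()

print(extract_code(code_response))
```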
### System Prompts

```python
# Use system prompts to guide behavior
response = model.generate(
    prompt="What is machine learning?",
    system_prompt="You are a helpful AI assistant that explains complex topics in simple terms.",
    max_tokens=150
)
print(response)
```
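Because `system_prompt` is passed per call, you can compare personas on the same question. A small sketch using only the parameters shown above:

```python
# Compare how different system prompts shape the same answer.
personas = {
    "teacher": "You explain concepts simply, for a ten-year-old.",
    "engineer": "You answer with precise, technical detail.",
}

for name, system_prompt in personas.items():
    answer = model.generate(
        prompt="What is machine learning?",
        system_prompt=system_prompt,
        max_tokens=150,
    )
    print(f"[{name}]\n{answer}\n")
```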
## Parameters

| Parameter | Description | Default | Range |
|---|---|---|---|
| `max_tokens` | Maximum number of tokens to generate | 256 | 1-2048 |
| `temperature` | Controls randomness (higher = more random) | 0.8 | 0.0-2.0 |
| `top_p` | Nucleus sampling parameter | 0.95 | 0.0-1.0 |
| `top_k` | Limits vocabulary to top k tokens | 40 | 1-100 |
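To see the effect of `temperature` directly, a quick sweep across its documented range, leaving the other parameters at their defaults:

```python
# Sample the same prompt at increasing temperatures; expect the
# output to grow more varied as the value rises.
for temperature in (0.2, 0.8, 1.5):
    response = model.generate(
        prompt="Invent a name for a coffee shop on Mars.",
        temperature=temperature,
        max_tokens=50,
    )
    print(f"temperature={temperature}: {response}")
```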
## Performance

InferX optimizes DeepSeek R1 for your hardware:

| Hardware | Tokens/Second | Memory Usage |
|---|---|---|
| Jetson AGX Orin | ~15 | ~8GB |
| RTX 4090 | ~50 | ~12GB |
| Intel i7 CPU | ~5 | ~6GB |
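These figures depend on prompt length and sampling settings. A rough way to estimate throughput on your own hardware, approximating token count by whitespace splitting since the tokenizer isn't exposed in this API:

```python
import time

# Time a single generation and estimate tokens/second.
# Word count undercounts real subword tokens, so treat this as a floor.
start = time.perf_counter()
response = model.generate(
    prompt="Explain how transformers process sequences.",
    max_tokens=256,
)
elapsed = time.perf_counter() - start

approx_tokens = len(response.split())
print(f"~{approx_tokens / elapsed:.1f} tokens/second over {elapsed:.1f}s")
```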
## Example Applications

### Chatbot Development

```python
from inferx.models.deepseek_r1 import deepseek_r1

def create_chatbot():
    model = deepseek_r1()
    print("DeepSeek R1 Chatbot (type 'exit' to quit)")

    while True:
        user_input = input("You: ")
        if user_input.lower() == 'exit':
            break

        response = model.generate(
            prompt=user_input,
            system_prompt="You are a helpful assistant.",
            max_tokens=200
        )
        print(f"Assistant: {response}")

# Run the chatbot
create_chatbot()
```
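Note that this loop treats every turn independently, so the model never sees earlier messages. One way to add memory, assuming `generate` accepts an arbitrary prompt string, is to fold the running transcript back into each prompt:

```python
from inferx.models.deepseek_r1 import deepseek_r1

def create_chatbot_with_memory():
    model = deepseek_r1()
    history = []  # (speaker, text) pairs replayed in every prompt

    while True:
        user_input = input("You: ")
        if user_input.lower() == 'exit':
            break

        history.append(("User", user_input))
        transcript = "\n".join(f"{who}: {text}" for who, text in history)

        response = model.generate(
            prompt=transcript + "\nAssistant:",
            system_prompt="You are a helpful assistant.",
            max_tokens=200,
        )
        history.append(("Assistant", response))
        print(f"Assistant: {response}")
```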
### Code Assistant

```python
from inferx.models.deepseek_r1 import deepseek_r1

def code_assistant():
    model = deepseek_r1()
    system_prompt = """You are an expert programmer. Provide clear,
    well-commented code solutions and explanations."""

    print("Code Assistant (type 'exit' to quit)")
    while True:
        question = input("Code question: ")
        if question.lower() == 'exit':
            break

        response = model.generate(
            prompt=question,
            system_prompt=system_prompt,
            temperature=0.3,  # More deterministic for code
            max_tokens=300
        )
        print(f"Solution:\n{response}\n")

code_assistant()
```
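Paired with the hypothetical `extract_code` helper sketched earlier, a solution could be written straight to disk for testing:

```python
# Hypothetical workflow: save only the code portion of a solution.
model = deepseek_r1()
solution = model.generate(
    prompt="Write a Python function that reverses a linked list",
    system_prompt="You are an expert programmer.",
    temperature=0.3,
    max_tokens=300,
)

with open("solution.py", "w") as f:
    f.write(extract_code(solution))
```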
## Hardware Detection

When the model loads, InferX reports the detected device and timing:

```text
✨ InferX - DeepSeek R1 Model ✨
🔍 Device Detected: AGX_ORIN
⠏ [2.5s] Loading DeepSeek R1 model
✓ [3.0s] Ready for text generation
```
## Next Steps