DeepSeek R1 Model

DeepSeek R1 is a powerful large language model (LLM) designed for text generation, reasoning, and code generation tasks. With InferX, you can run DeepSeek R1 on any device, from edge hardware to powerful servers, using the same API.

Features

  • Advanced Reasoning: State-of-the-art reasoning and problem-solving capabilities
  • Code Generation: Expert-level code generation and explanation
  • Cross-Platform: Same code works on Jetson, GPU, or CPU
  • Hardware-Optimized: Automatically detects and optimizes for your hardware
  • Real-time Processing: Optimized for fast inference across all platforms

Installation

DeepSeek R1 is included with InferX:

pip install git+https://github.com/exla-ai/InferX.git

Basic Usage

from inferx.models.deepseek_r1 import deepseek_r1

# Initialize the model (automatically detects your hardware)
model = deepseek_r1()

# Run the interactive interface
model.run()

Advanced Usage

Text Generation

# Generate text with custom parameters
response = model.generate(
    prompt="Write a short poem about artificial intelligence.",
    max_tokens=100,
    temperature=0.7,
    top_p=0.9,
    top_k=40
)

print(response)

Code Generation

# Generate and explain code
code_response = model.generate(
    prompt="Write a Python function to calculate the Fibonacci sequence",
    max_tokens=200,
    temperature=0.3  # Lower temperature for more deterministic code
)

print(code_response)

System Prompts

# Use system prompts to guide behavior
response = model.generate(
    prompt="What is machine learning?",
    system_prompt="You are a helpful AI assistant that explains complex topics in simple terms.",
    max_tokens=150
)

print(response)

Parameters

Parameter   | Description                                 | Default | Range
max_tokens  | Maximum number of tokens to generate        | 256     | 1-2048
temperature | Controls randomness (higher = more random)  | 0.8     | 0.0-2.0
top_p       | Nucleus sampling parameter                  | 0.95    | 0.0-1.0
top_k       | Limits vocabulary to top k tokens           | 40      | 1-100
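To make the sampling parameters concrete, here is a minimal sketch of the standard temperature plus top-k sampling technique these parameters refer to. This is an illustration using only the Python standard library, not InferX's internal sampler.

```python
import math
import random

def sample_token(logits, temperature=0.8, top_k=40):
    """Illustrative temperature + top-k sampling over raw logits."""
    # Keep only the top_k highest-scoring token indices.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature rescales logits: < 1.0 sharpens, > 1.0 flattens the distribution.
    scaled = [logits[i] / temperature for i in ranked]
    # Numerically stable softmax over the surviving candidates.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Draw one token index according to the resulting probabilities.
    return random.choices(ranked, weights=probs, k=1)[0]
```

With top_k=1 this degenerates to greedy decoding (always the highest-logit token), which is why low temperature and small top_k are suggested for code generation above.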

Performance

InferX optimizes DeepSeek R1 for your hardware:

Hardware        | Tokens/Second | Memory Usage
Jetson AGX Orin | ~15           | ~8GB
RTX 4090        | ~50           | ~12GB
Intel i7 CPU    | ~5            | ~6GB
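If you want to check throughput on your own hardware, a rough probe like the following can help. The `generate` argument stands in for `model.generate`; the token count here is a whitespace-split approximation, not the model's true tokenizer, so treat the result as an estimate.

```python
import time

def measure_tokens_per_second(generate, prompt, max_tokens=100):
    """Time one generation call and estimate tokens per second."""
    start = time.perf_counter()
    text = generate(prompt=prompt, max_tokens=max_tokens)
    elapsed = time.perf_counter() - start
    # Approximate token count by whitespace splitting.
    n_tokens = len(text.split())
    return n_tokens / elapsed if elapsed > 0 else float("inf")
```

For a stable number, average over several runs and discard the first call, which includes model warm-up.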

Example Applications

Chatbot Development

def create_chatbot():
    model = deepseek_r1()
    
    print("DeepSeek R1 Chatbot (type 'exit' to quit)")
    
    while True:
        user_input = input("You: ")
        if user_input.lower() == 'exit':
            break
            
        response = model.generate(
            prompt=user_input,
            system_prompt="You are a helpful assistant.",
            max_tokens=200
        )
        
        print(f"Assistant: {response}")

# Run the chatbot
create_chatbot()
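Note that the chatbot above sends each message in isolation, so the model has no memory of earlier turns. One common way to add context is to fold prior exchanges back into the prompt. The sketch below assumes only the `(prompt, system_prompt, max_tokens)` signature shown above; `generate` could be `model.generate`.

```python
def chat_with_history(generate, system_prompt, turns):
    """Run a list of user turns, feeding earlier exchanges back as context."""
    history = []
    replies = []
    for user_input in turns:
        # Concatenate earlier exchanges so each call carries the conversation.
        context = "\n".join(f"User: {u}\nAssistant: {a}" for u, a in history)
        prompt = (context + "\n" if context else "") + f"User: {user_input}\nAssistant:"
        reply = generate(prompt=prompt, system_prompt=system_prompt, max_tokens=200)
        history.append((user_input, reply))
        replies.append(reply)
    return replies
```

Keep in mind that the prompt grows with every turn, so long conversations eventually need truncation or summarization to stay within the model's context window.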

Code Assistant

def code_assistant():
    model = deepseek_r1()
    
    system_prompt = """You are an expert programmer. Provide clear, 
    well-commented code solutions and explanations."""
    
    while True:
        question = input("Code question: ")
        if question.lower() == 'exit':
            break
            
        response = model.generate(
            prompt=question,
            system_prompt=system_prompt,
            temperature=0.3,  # More deterministic for code
            max_tokens=300
        )
        
        print(f"Solution:\n{response}\n")

# Run the code assistant
code_assistant()

Hardware Detection

When the model is initialized, InferX reports the detected device and load progress:

✨ InferX - DeepSeek R1 Model ✨
🔍 Device Detected: AGX_ORIN
⠏ [2.5s] Loading DeepSeek R1 model
✓ [3.0s] Ready for text generation

Next Steps
