Setup Guide

This guide will walk you through setting up your environment for using the Exla SDK. We’ll cover installing the necessary tools, creating a virtual environment, and installing the SDK.

Prerequisites

Before you begin, make sure you have:

  • Python 3.10 or later installed
  • Git installed
  • Access to the Exla SDK repository (you’ll need an access token)

Step 1: Configure Docker Permissions

Add your user to the docker group so that Docker commands can run without sudo:

sudo usermod -aG docker $USER

Log out and log back in (or restart your terminal) for the group change to take effect.
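
To confirm that Docker now works without sudo, you can run the standard hello-world image (this assumes Docker itself is already installed):

docker run hello-world

If the container prints its welcome message without a permission error, the group change has taken effect.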

Step 2: Install uv

Install uv, a fast Python package installer and resolver that we recommend for managing dependencies:

curl -LsSf https://astral.sh/uv/install.sh | sh

This will install uv on your system. After installation, you may need to restart your terminal or source your shell configuration file to use uv.
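
To check that uv is available on your PATH, print its version:

uv --version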

Step 3: Create a Project Directory

Create a new directory for your project and navigate into it:

mkdir exla-project && cd exla-project

Step 4: Create a Virtual Environment

Create a Python virtual environment using uv. We recommend using Python 3.10 for optimal compatibility:

uv venv --python 3.10

This creates a virtual environment in the .venv directory. Activate the virtual environment:

# On Linux/macOS
source .venv/bin/activate

# On Windows
.venv\Scripts\activate
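
With the environment activated, you can confirm that the interpreter inside .venv is the one being used:

python --version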

Step 5: Set the GitHub Token

Set the GitHub token as an environment variable:

export EXLA_TOKEN=xxxxxx # Replace with token provided by Exla team
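
To verify that the variable is set in your current shell without printing the token itself, you can run:

test -n "$EXLA_TOKEN" && echo "EXLA_TOKEN is set"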

Step 6: Install the Exla SDK

Install the Exla SDK directly from the GitHub repository using your access token:

uv pip install git+https://${EXLA_TOKEN}@github.com/exla-ai/exla-sdk.git

If everything is set up correctly, you should see the Exla SDK version and a success message!
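
As an additional check, you can try importing the package from inside your virtual environment. This one-liner assumes the SDK's top-level module is named exla, matching the imports used in the examples later in this guide:

python -c "import exla; print('Exla SDK imported successfully')"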

Next Steps

Now that you have set up your environment and installed the Exla SDK, you can:

  • Run your first model (see Getting Started with Your First Model below)
  • Explore the example code repository (see Exploring Example Code below)

Troubleshooting

If you encounter any issues during setup:

  • Make sure you’re using Python 3.10 or later
  • Verify that your access token has the necessary permissions
  • Check that all dependencies are properly installed
  • If the issue persists, reach out to us by email at contact@exla.ai

Getting Started with Your First Model

Now that you have the Exla SDK installed, let’s run your first model! We’ll use the CLIP model, which is a powerful multimodal model that connects text and images.

Using CLIP for Image-Text Matching

CLIP (Contrastive Language-Image Pretraining) allows you to find the best matching images for a given text description or vice versa. Here’s how to use it:

from exla.models.clip import clip
import json

# Initialize the model (automatically detects your hardware)
model = clip()

# Run inference with sample images and text queries
results = model.inference(
    image_paths=["path/to/image1.jpg", "path/to/image2.jpg"],
    text_queries=["a photo of a dog", "a photo of a cat", "a photo of a bird"]
)

# Print results
print(json.dumps(results, indent=2))

What’s Happening Behind the Scenes

When you run this code:

  1. The Exla SDK automatically detects your hardware (Jetson, GPU, or CPU)
  2. It loads the appropriate optimized implementation of CLIP
  3. The model processes your images and text queries
  4. It returns similarity scores between each image and text query

Sample Output

The output will look something like this:

[
  {
    "a photo of a dog": [
      {
        "image_path": "data/dog.png",
        "score": "23.1011"
      },
      {
        "image_path": "data/cat.png",
        "score": "17.1396"
      }
    ]
  },
  {
    "a photo of a cat": [
      {
        "image_path": "data/cat.png",
        "score": "25.3045"
      },
      {
        "image_path": "data/dog.png",
        "score": "18.7532"
      }
    ]
  }
]
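
As a quick illustration of how you might work with this output, the sketch below picks the highest-scoring image for each text query. It assumes the structure shown above: a list of dictionaries, each mapping a query string to a list of entries with image_path and score fields, where score is stored as a string:

# results has the shape shown in the sample output above
for entry in results:
    for query, matches in entry.items():
        # Scores are strings in the sample output, so convert them before comparing
        best = max(matches, key=lambda m: float(m["score"]))
        print(f"{query} -> {best['image_path']} (score: {best['score']})")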

Next Steps with Models

Now that you’ve run your first model, you can explore other models in the Exla SDK:

  • DeepSeek: For large language model capabilities
  • RoboPoint: For keypoint affordance prediction in robotics
  • SAM2: For advanced image segmentation
  • MobileNet: For efficient image classification
  • ResNet34: For high-accuracy image classification

Check out the Models section for detailed documentation on each model.

Exploring Example Code

To help you get started quickly, we provide a repository of example code for all our models and features. These examples demonstrate real-world usage and best practices.

Setting Up the Examples Repository

  1. Clone the examples repository:
git clone https://github.com/exla-ai/exla-sdk-examples.git
  2. Navigate to the examples directory:
cd exla-sdk-examples
  3. Explore the available examples:
ls

You’ll see directories for each model and feature, including:

  • clip/ - Examples for the CLIP model
  • deepseek_r1/ - Examples for the DeepSeek language model
  • robopoint/ - Examples for the RoboPoint model
  • custom_model/ - Examples for optimizing your own models
  • And more!

Running an Example

Let’s run a simple example using the CLIP model:

  1. Navigate to the CLIP examples directory:
cd clip
  2. Run the example:
python example_clip.py

This will demonstrate how to use CLIP for image-text matching with sample images.

Running the RoboPoint Example

For a more advanced example, try the RoboPoint model:

  1. Navigate to the RoboPoint examples directory:
cd ../robopoint
  2. Run the example:
python example_robopoint.py

This will demonstrate how to use RoboPoint for robotic perception tasks.

Optimizing Your Own Models

To see how to optimize your own custom models:

  1. Navigate to the custom model examples directory:
cd ../custom_model
  2. Run the example:
python example_optimize_custom_model.py

This example shows how to optimize a pre-trained EfficientNet model for faster inference.

Next Steps

After exploring the examples, you can:

  • Check out the Models section for detailed documentation on each model
  • Adapt the custom_model example to optimize your own models
  • Reach out to contact@exla.ai if you have questions