# CLIP Model
The CLIP (Contrastive Language-Image Pre-training) model is a powerful multimodal model that connects text and images. With InferX, you can run CLIP on any device using the same API - whether it's a Jetson, a GPU server, or a CPU-only system.

## Features
- Universal API: Same code works on Jetson, GPU, or CPU
- Hardware-Optimized: Automatically detects your hardware and uses the appropriate implementation
- Real-time Processing: Optimized for fast inference across all platforms
- Zero Configuration: No setup required - just import and run
## Installation
CLIP is included with InferX. No separate installation is required.

## Basic Usage
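The original code example for this section is not shown here, and the exact InferX entry point is not documented on this page (a call such as `inferx.clip(...)` is an assumption). As a minimal, runnable sketch, the snippet below simulates the zero-shot classification step CLIP performs - comparing an image embedding against candidate label embeddings and turning the similarities into probabilities:

```python
import math

# Hypothetical shape of the real call (names are assumptions, not the
# documented InferX API):
#   result = inferx.clip("photo.jpg", labels=["a dog", "a cat", "a car"])
# Below, the encoders are replaced with fixed dummy embeddings so the
# scoring logic runs anywhere without the library installed.

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def softmax(scores):
    """Convert raw similarity scores into probabilities summing to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Dummy embeddings standing in for CLIP's image and text encoders.
image_emb = [0.9, 0.1, 0.2]
label_embs = {
    "a dog": [0.8, 0.2, 0.1],
    "a cat": [0.1, 0.9, 0.3],
    "a car": [0.2, 0.1, 0.9],
}

sims = {label: cosine(image_emb, emb) for label, emb in label_embs.items()}
probs = dict(zip(sims, softmax(list(sims.values()))))
best = max(probs, key=probs.get)
print(best)  # the label whose embedding best matches the image
```

With the real library, the embedding and scoring steps would happen inside a single model call; the ranking-by-similarity idea is the same.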
## Advanced Usage
### Processing Multiple Images
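The example for this subsection is missing from the page. A common multi-image pattern with CLIP is retrieval: score every image against one text query and rank them. The sketch below is hedged the same way as above - embeddings are dummy stand-ins, since the actual InferX call is not documented here:

```python
import math

# With the real library this would presumably loop over per-image model
# calls (API names assumed); here the per-image scores are simulated so
# the ranking logic itself is runnable.

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

query_emb = [1.0, 0.0]   # stand-in for the encoded text query
image_embs = {           # stand-ins for encoded images
    "beach.jpg": [0.9, 0.4],
    "office.jpg": [0.1, 0.99],
    "sunset.jpg": [0.7, 0.7],
}

# Score every image against the query and sort best-first.
ranked = sorted(image_embs,
                key=lambda p: cosine(query_emb, image_embs[p]),
                reverse=True)
print(ranked)
```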
### Batch Processing
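Whether InferX exposes a native batched call is not stated on this page, so the sketch below shows only the generic batching pattern: chunk the inputs and process one fixed-size batch at a time, which bounds memory use and amortizes per-call overhead. All names here are illustrative:

```python
# Generic batching pattern (no InferX-specific API is assumed).

def batched(items, batch_size):
    """Yield successive fixed-size chunks from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

paths = [f"img_{i}.jpg" for i in range(10)]

results = []
for batch in batched(paths, batch_size=4):
    # In real code, each batch would go through one model call,
    # spreading per-call overhead across the whole batch.
    results.extend({"path": p} for p in batch)

print(len(results))  # 10
```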
## Performance
InferX automatically optimizes CLIP for your hardware:

| Hardware | Typical Inference Time | Memory Usage |
|---|---|---|
| Jetson AGX Orin | ~50ms | ~2GB |
| RTX 4090 | ~20ms | ~3GB |
| Intel i7 CPU | ~200ms | ~1GB |
## Response Format
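The actual response schema is not shown on this page. The dict below is a hypothetical illustration of the kind of fields a CLIP classification result typically carries - per-label scores plus timing metadata - and of how a caller might pick the top label. Field names are assumptions, not the documented format:

```python
# Hypothetical response shape (field names are illustrative only).
response = {
    "scores": {"a dog": 0.91, "a cat": 0.06, "a car": 0.03},
    "inference_time_ms": 48.2,
}

# Pick the highest-scoring label.
top_label = max(response["scores"], key=response["scores"].get)
print(top_label, response["scores"][top_label])
```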
## Hardware Detection
InferX automatically detects and optimizes for your hardware:

## Error Handling
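The specific exception types InferX raises are not documented on this page, so the sketch below wraps a simulated inference call in a try/except and degrades gracefully rather than crashing a pipeline. The `run_clip` helper is a stand-in, not a real InferX function:

```python
def run_clip(image_path):
    """Stand-in for a real model call; rejects unsupported input."""
    if not image_path.endswith(".jpg"):
        raise ValueError(f"unsupported input: {image_path}")
    return {"scores": {"a dog": 0.9}}

def safe_classify(image_path):
    """Return a result dict, or an empty one if inference fails."""
    try:
        return run_clip(image_path)
    except ValueError as exc:
        # Log and fall back instead of propagating the error.
        print(f"inference failed for {image_path}: {exc}")
        return {"scores": {}}

print(safe_classify("photo.png")["scores"])  # {} - bad input handled
```

In real code you would catch the library's documented exception types rather than a broad `ValueError`.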
## Next Steps
- Explore other InferX models
- Check out practical examples
- Learn about custom model optimization