InternVL2.5 multimodal vision-language model optimized for Jetson and other devices
internvl2_5()
model.vqa(image_path=None, image_paths=None, question=None, timeout=300, debug=False)
image_path (str): Path to a single image
image_paths (list): List of image paths for batch processing
question (str): Question to ask about the image(s)
timeout (int): Maximum time in seconds to wait for inference (default: 300)
debug (bool): Whether to print detailed debug information (default: False)

model.caption(image_path=None, image_paths=None, timeout=300, debug=False)
image_path (str): Path to a single image
image_paths (list): List of image paths for batch processing
timeout (int): Maximum time in seconds to wait for inference (default: 300)
debug (bool): Whether to print detailed debug information (default: False)

model.reason(image_path, prompt, timeout=300, debug=False)
image_path (str): Path to an image
prompt (str): Reasoning prompt or instruction
timeout (int): Maximum time in seconds to wait for inference (default: 300)
debug (bool): Whether to print detailed debug information (default: False)

model.install_nvidia_pytorch()
bool: True if installation was successful, False otherwise
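
The call signatures above can be exercised roughly as in the following sketch. The import path for internvl2_5 is not given here, so the sketch accepts an already-constructed model object instead of importing anything. The names image_kwargs, caption_then_ask, and StubModel are hypothetical helpers introduced only for illustration. Two further assumptions are made: that exactly one of image_path and image_paths should be passed per call, and that the methods return a value that can be handed back to the caller. Neither is documented above.

```python
from typing import Any, Dict, Optional, Sequence, Tuple


def image_kwargs(image_path: Optional[str] = None,
                 image_paths: Optional[Sequence[str]] = None) -> Dict[str, Any]:
    """Build the image arguments shared by vqa() and caption().

    Assumption (not stated in the reference): pass exactly one of the two.
    """
    if (image_path is None) == (image_paths is None):
        raise ValueError("pass exactly one of image_path or image_paths")
    if image_path is not None:
        return {"image_path": image_path}
    return {"image_paths": list(image_paths)}


def caption_then_ask(model: Any, question: str, *,
                     image_path: Optional[str] = None,
                     image_paths: Optional[Sequence[str]] = None,
                     timeout: int = 300,
                     debug: bool = False) -> Tuple[Any, Any]:
    """Caption the image(s), then ask a question about the same image(s)."""
    images = image_kwargs(image_path, image_paths)
    caption = model.caption(timeout=timeout, debug=debug, **images)
    answer = model.vqa(question=question, timeout=timeout, debug=debug, **images)
    return caption, answer


class StubModel:
    """Hypothetical stand-in that mirrors the documented signatures.

    Replace it with a real internvl2_5() instance on the target device.
    The string return values are placeholders, not the real output format.
    """

    def __init__(self) -> None:
        self.calls = []  # (method name, keyword arguments) per call

    def vqa(self, image_path=None, image_paths=None, question=None,
            timeout=300, debug=False):
        self.calls.append(("vqa", {"image_path": image_path,
                                   "image_paths": image_paths,
                                   "question": question,
                                   "timeout": timeout, "debug": debug}))
        return "stub answer"

    def caption(self, image_path=None, image_paths=None,
                timeout=300, debug=False):
        self.calls.append(("caption", {"image_path": image_path,
                                       "image_paths": image_paths,
                                       "timeout": timeout, "debug": debug}))
        return "stub caption"

    def reason(self, image_path, prompt, timeout=300, debug=False):
        self.calls.append(("reason", {"image_path": image_path,
                                      "prompt": prompt,
                                      "timeout": timeout, "debug": debug}))
        return "stub reasoning"

    def install_nvidia_pytorch(self):
        return True  # documented return type: bool


if __name__ == "__main__":
    model = StubModel()
    # The documented bool return value lets callers stop early on failure.
    if not model.install_nvidia_pytorch():
        raise SystemExit("PyTorch installation failed")
    print(caption_then_ask(model, "What is in the picture?",
                           image_path="photo.jpg", timeout=60))
    print(model.reason("photo.jpg", "Explain what is happening step by step."))
```

Passing the model in as an argument keeps the helper testable on a machine without the accelerator: the stub records each call, so the forwarded timeout and debug values can be checked before running on real hardware.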