
Enhancing Classification Capabilities of Object Recognition Programs Using Limited Amounts of Data


Microsoft and City, University of London have unveiled ORBIT, a dataset designed to train and benchmark AI systems that learn to recognise objects from only a few examples. The dataset, described in a paper published on arXiv, is particularly valuable for developing AI tools that can aid the blind and low-vision community.

The ORBIT dataset is a collection of 3,822 annotated video clips covering 486 unique objects. The videos were shot on smartphones in a variety of real-world environments, providing a broad spectrum of conditions for training future AI systems.

Each object appears in two kinds of clips: "clean" videos that show the object largely on its own, and "clutter" videos that show it in the realistic, cluttered scenes where it is normally found. A recognition system is expected to learn an object from a handful of clean clips and then pick it out in the cluttered footage.
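The Python sketch below illustrates how clips organised this way might be indexed and turned into a simple "teach from clean, test on clutter" episode. The directory layout, the .mp4 extension, and the orbit_benchmark path are assumptions made for illustration, not the official loading code shipped with ORBIT.

```python
import random
from collections import defaultdict
from pathlib import Path

# Hypothetical layout (an assumption, not the official release structure):
#   <root>/<collector>/<object>/<clean|clutter>/<clip>.mp4

def index_clips(root: str) -> dict:
    """Map each collector/object pair to its 'clean' and 'clutter' clip paths."""
    index = defaultdict(lambda: {"clean": [], "clutter": []})
    for clip in Path(root).glob("*/*/*/*.mp4"):
        collector, obj, video_type = clip.parts[-4], clip.parts[-3], clip.parts[-2]
        if video_type in ("clean", "clutter"):
            index[f"{collector}/{obj}"][video_type].append(clip)
    return dict(index)

def sample_episode(index: dict, n_support: int = 5, n_query: int = 2, seed: int = 0):
    """Pick one object, a few 'clean' clips to teach from, and 'clutter' clips to test on."""
    rng = random.Random(seed)
    obj = rng.choice(sorted(index))
    clips = index[obj]
    support = rng.sample(clips["clean"], min(n_support, len(clips["clean"])))
    query = rng.sample(clips["clutter"], min(n_query, len(clips["clutter"])))
    return obj, support, query

if __name__ == "__main__":
    index = index_clips("orbit_benchmark")  # assumed local path
    if index:
        obj, support, query = sample_episode(index)
        print(f"object: {obj}")
        print("support (clean):", [p.name for p in support])
        print("query (clutter):", [p.name for p in query])
    else:
        print("No clips found; adjust the path and layout to match your copy.")
```

Sampling a small support set per object mirrors the few-shot premise of the dataset: a user should only have to film an object a handful of times for a system to learn it.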

The creation of ORBIT followed a careful pipeline: collectors who are blind or low vision chose objects that matter to them and filmed each one on their phones following a collection protocol, and the resulting clips were labelled and put through multiple rounds of quality checks before release.

To access the ORBIT dataset, you can follow these steps (a short sketch for sanity-checking a downloaded copy follows the list):

  1. Visit the arXiv page for the ORBIT dataset paper.
  2. Look for supplementary materials or dataset availability statements within the paper.
  3. Follow any GitHub or project repository links if provided.
  4. If no link is provided, email the authors to request access or further instructions.
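Once a copy is downloaded and extracted, a quick count of clips and objects is a useful sanity check against the figures quoted above (3,822 clips of 486 objects). The sketch below assumes the same hypothetical collector/object/clean-or-clutter layout as the earlier example; adjust the path and pattern to match the actual release.

```python
from collections import Counter
from pathlib import Path

ROOT = Path("orbit_benchmark")  # assumed path to an extracted local copy

def summarise(root: Path) -> None:
    """Count clips and distinct objects as a rough check against the
    published figures (3,822 clips of 486 objects)."""
    clips = list(root.glob("*/*/*/*.mp4"))
    objects = {(clip.parts[-4], clip.parts[-3]) for clip in clips}
    by_type = Counter(clip.parts[-2] for clip in clips)
    print(f"{len(clips)} clips covering {len(objects)} objects")
    for video_type, count in sorted(by_type.items()):
        print(f"  {video_type}: {count} clips")

if __name__ == "__main__":
    if ROOT.exists():
        summarise(ROOT)
    else:
        print(f"{ROOT} not found; complete the download steps above first.")
```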

If a direct download link or hosting platform is not immediately obvious, the paper's supplementary materials and the authors themselves remain the most reliable route to the data.

Crucially, the videos in ORBIT were recorded by people who are blind or have low vision, making the dataset an especially valuable resource for training AI applications tailored to the needs of this community.


In conclusion, the ORBIT dataset represents a significant step forward in the development of AI systems that can learn to recognise a user's personal objects from only a few examples, with the potential to greatly enhance accessibility solutions for the blind and low-vision community.

