Computer vision applications owe their rapid development to the increasing availability of data. However, as image and video datasets grow over time, managing and locating specific data is becoming just as important as collecting it. So how do teams working on computer vision systems quickly locate data for a project? And how do you organize video data when you have more than 10,000 hours of it?
To learn more, we talked with Mark Pfeiffer, the CTO of SiaSearch. SiaSearch is a platform for efficiently exploring vision data based on metadata. It simplifies data identification and search so that teams can work more efficiently. In this interview, we look at the current state of data pipelines and at how SiaSearch helps teams manage and find data for their computer vision systems.
For people who don’t know, can you tell me what work SiaSearch does in the field of AI?
SiaSearch is an intelligent API / GUI that makes it significantly easier and faster for ML developers to explore, understand, use and share raw data. Behind the scenes is a custom-built metadata architecture and database that specifically addresses the performance and usability shortcomings of existing technologies. We have done a lot of work in the mobile robotics industry (with one big use case being autonomous driving) and are currently expanding to many other applications of computer vision. We envision SiaSearch as the standard gateway for any company building real-world computer vision applications.
Where did you make your start in AI? What drew you to this field of research?
I have always been fascinated by autonomous and intelligent systems. Having machines controlled by computers operating in the same environment as us humans and conducting useful tasks is one of the biggest technical challenges of our time. Excited by this challenge, I decided to pursue a PhD in robotics, focusing on the navigation of autonomous systems in dynamic environments shared with humans. My special interest was the data-driven development of navigation software, i.e. robots that are able to learn to leverage human observations in order to improve their own motion patterns over time.
Can you give us a simple introduction to SiaSearch and how it supports computer vision applications?
SiaSearch is an efficient and scalable data catalog for image data. Because image data is unstructured and its content is hard to access, a large amount of manual work currently goes into data selection and data operation processes. SiaSearch significantly reduces this manual work, so that our users can focus on building software instead of handling data.
Under the hood, SiaSearch consists of a scalable metadata catalog and a query engine. Together they allow users to easily add metadata for images or sequences, obtain an overview of the underlying data at large scale, and assemble subsets of data through efficient queries.
Users typically interact with SiaSearch through our Python SDK and the GUI. It can be easily hosted in any cloud environment, or on-premises, and helps to streamline the process of data selection for annotation, training, testing and validation purposes.
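To make the idea of a metadata catalog with a query step more concrete, here is a minimal in-memory sketch in Python. It is an illustration only and does not show the SiaSearch SDK: the names FrameRecord, MetadataCatalog, add_frame, add_metadata, summary and query, the example URIs and the metadata keys are all hypothetical, and a production system would sit on top of a database rather than a dictionary.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class FrameRecord:
    """One image or video frame plus its metadata (hypothetical schema)."""
    frame_id: str
    uri: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class MetadataCatalog:
    """Toy in-memory catalog; a real system would be backed by a database."""

    def __init__(self) -> None:
        self._records: Dict[str, FrameRecord] = {}

    def add_frame(self, frame_id: str, uri: str) -> None:
        """Register a frame by ID and storage location."""
        self._records[frame_id] = FrameRecord(frame_id, uri)

    def add_metadata(self, frame_id: str, **attributes: Any) -> None:
        """Attach or overwrite metadata attributes on an existing frame."""
        self._records[frame_id].metadata.update(attributes)

    def summary(self, key: str) -> Dict[Any, int]:
        """Count frames per value of one metadata key (None if missing)."""
        counts: Dict[Any, int] = {}
        for record in self._records.values():
            value = record.metadata.get(key)
            counts[value] = counts.get(value, 0) + 1
        return counts

    def query(self, predicate: Callable[[Dict[str, Any]], bool]) -> List[str]:
        """Return sorted IDs of frames whose metadata satisfies the predicate."""
        return sorted(
            record.frame_id
            for record in self._records.values()
            if predicate(record.metadata)
        )


# Hypothetical example data.
catalog = MetadataCatalog()
catalog.add_frame("f001", "s3://example-bucket/drive_01/f001.jpg")
catalog.add_frame("f002", "s3://example-bucket/drive_01/f002.jpg")
catalog.add_frame("f003", "s3://example-bucket/drive_02/f003.jpg")

catalog.add_metadata("f001", weather="sunny", num_pedestrians=0)
catalog.add_metadata("f002", weather="rain", num_pedestrians=3)
catalog.add_metadata("f003", weather="rain", num_pedestrians=0)

# Overview of the data, then a subset assembled by a query.
weather_counts = catalog.summary("weather")
rainy_with_pedestrians = catalog.query(
    lambda m: m.get("weather") == "rain" and m.get("num_pedestrians", 0) > 0
)
```

In this sketch the query is an arbitrary Python predicate over the metadata dictionary, which keeps the example short. A scalable engine would more likely translate structured filters into indexed database queries.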
How would someone use SiaSearch to enhance their data pipeline?
A typical use case of SiaSearch is data selection for annotation and model training. In order to get the most out of the image annotation process, users carefully select images to share with annotation providers, e.g. Lionbridge. Without the right software tools, this is an extremely time-consuming and manual task, and it slows down the whole development pipeline.
As data selection is typically an iterative process, SiaSearch allows users to easily add new metadata to the underlying data catalog, which in turn allows them to easily browse and slice the data. Examples of such metadata include weather information, image content, inference results, evaluation metrics and more.
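The iterative loop described here, where new metadata is added and the data is sliced again, can be sketched in a few lines of plain Python. This is a hedged illustration, not SiaSearch code: the helpers enrich and slice_frames, the detector_confidence key and all values are made up for the example.

```python
from typing import Any, Callable, Dict, List

Frame = Dict[str, Any]

# Hypothetical starting catalog: only weather metadata is available.
frames: List[Frame] = [
    {"frame_id": "f001", "weather": "sunny"},
    {"frame_id": "f002", "weather": "rain"},
    {"frame_id": "f003", "weather": "rain"},
    {"frame_id": "f004", "weather": "fog"},
]


def enrich(frames: List[Frame], key: str, values: Dict[str, Any]) -> List[Frame]:
    """Return copies of the frames with one new metadata key added.

    `values` maps frame_id to the new value; frames without an entry get None.
    """
    return [{**frame, key: values.get(frame["frame_id"])} for frame in frames]


def slice_frames(
    frames: List[Frame], **conditions: Callable[[Any], bool]
) -> List[Frame]:
    """Keep frames where every named key is present and passes its predicate."""
    return [
        frame
        for frame in frames
        if all(
            frame.get(key) is not None and predicate(frame[key])
            for key, predicate in conditions.items()
        )
    ]


# Iteration 1: slice on the metadata that already exists.
rainy = slice_frames(frames, weather=lambda w: w == "rain")

# Iteration 2: add inference results as new metadata (made-up scores),
# then slice again to find bad-weather frames where the model is unsure.
confidence = {"f001": 0.95, "f002": 0.41, "f003": 0.88, "f004": 0.37}
frames = enrich(frames, "detector_confidence", confidence)

to_annotate = slice_frames(
    frames,
    weather=lambda w: w in {"rain", "fog"},
    detector_confidence=lambda c: c < 0.5,
)
annotation_ids = [frame["frame_id"] for frame in to_annotate]
```

The resulting list of frame IDs is the kind of subset that could then be handed to an annotation provider, instead of sending every frame.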
As the CTO of SiaSearch, what are your thoughts on the current state of computer vision algorithms?
In the past decade we’ve seen a lot of progress in the area of computer vision applications and research. Especially with the rise of deep learning, boundaries in research keep getting pushed. However, we still see a lot of problems when very promising approaches transition from research labs to real-world setups.
Typically, three parts need to work together to end up with reliable computer vision models: (1) choosing the right model, (2) selecting the right training strategies and (3) selecting the right data for training and validation. For real-world applications, we see data selection as a major bottleneck, both to further improving model performance and to operating robustly across a large set of real-world applications. Real-world data is typically biased and contains a huge number of edge cases, all of which need to be treated in the right way.
We therefore see a huge need for better tooling that improves data operations and helps promising computer vision technology make the transition into real-world applications. This applies to both data selection and annotation, as most algorithms are currently trained in a supervised manner.
One large computer vision use case is automated vehicles. What challenges do you see on the way to higher levels of autonomy?
I think the hardest problem to solve is complete autonomy in any situation. As humans we are pretty good at working with uncertainty. The randomness of a typical driving experience — and all the variations of certain situations — is extremely hard to cover completely in an autonomous car. There will always be a long tail of events with unseen situations arising, which will require a huge amount of semantic understanding. It might also involve deciding which risks are okay to take, something that is typically hard for algorithms to do. These will be hard problems to solve, but more efficient data operations can significantly accelerate the development process to ultimately get there.
To learn more about SiaSearch and their computer vision management platform, check out the SiaSearch homepage.