Computer vision is approaching a milestone

Computer vision, a well-known and highly useful branch of artificial intelligence, is nearing a small revolution thanks to advances in automated labeling systems.

For us humans, recognizing things is child's play. It is a skill we actively develop from a very young age, and the human brain is fantastically well equipped to perform this kind of task with tremendous accuracy.

It is so good at it, in fact, that machines regularly ask for our opinion on the matter; this is what happens every time we complete certain types of captchas. It is not just for fun that Google and its friends keep asking you which squares contain a sailboat, a pedestrian crossing, or one of those pesky traffic lights.

Internet users' answers are then used as reference data to improve the reliability of AI-based autonomous systems. By gathering samples in this way, the big names in AI hope to make their creations more dependable.

But today it seems clear that this approach is not sufficient on its own. If asking the general public whether an image shows a bus or a truck were enough to train a so-called "strong" AI, our civilization would have been completely transformed long ago.

Labeling data is a real headache that eats up a huge amount of time for researchers working on image-based artificial intelligence systems. © Tim Gouw – Unsplash

Labeling, the bane of AI researchers

When it comes to developing highly specialized, reliable systems, very few approaches are viable – and trusting annoyed Internet users is certainly not one of them. Instead, researchers have to spend a great deal of time ensuring the reliability of the information the neural network will ingest.

If that information is only approximate, the AI will end up relying on bad data and its results will not necessarily mean much. In short: "garbage in, garbage out", as specialists in the discipline like to say.

In practice, this requires researchers to go through the images one by one and "paint" colored masks onto them, each mask identifying an element the AI is supposed to recognize, as in the sketch below. This is called data labeling, or annotation.
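To make the idea concrete, here is a minimal sketch of what such an annotation amounts to: alongside each photo, a mask array of the same height and width stores one class identifier per pixel. The class names and pixel ranges below are purely illustrative and not taken from any particular dataset.

```python
import numpy as np

# Illustrative class palette (hypothetical, not from any specific dataset)
CLASSES = {0: "background", 1: "road", 2: "pedestrian", 3: "traffic light"}

height, width = 720, 1280
# The mask has the same dimensions as the image: one class id per pixel
mask = np.zeros((height, width), dtype=np.uint8)  # everything starts as "background"

# An annotator "paints" a region by assigning a class id to the pixels it covers,
# e.g. a rough rectangle around a pedestrian:
mask[400:650, 300:380] = 2  # "pedestrian"

# The (image, mask) pair is what the neural network is then trained on.
```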

It is a very time-consuming process, regularly measured in hundreds of hours: the databases in question gather thousands, even hundreds of thousands, of images, and there is no question of rushing the work. The quality of the final product depends directly on the amount of data available.

This approach therefore looks rather paradoxical, not to say archaic, in a cutting-edge field like AI research. And the problem is becoming more and more pressing, because all that time could otherwise be devoted to more substantive work that matters far more for the development of the technology.

Researchers are therefore trying to develop systems capable of performing this thankless task for them. So far, the results have been mixed in terms of quality. Moreover, this approach involves working pixel by pixel; you don't need to be a great computer scientist to see that this quickly runs into problems of computing power, since it means processing hundreds of thousands of images that must all be labeled consistently from start to finish.

An algorithm that does the legwork

The human brain remains the great specialist in this discipline, but the latest work by MIT researchers, spotted by Engadget, may significantly close the gap. With help from Cornell University and Microsoft, MIT has developed an algorithm called STEGO. Its goal: to label images autonomously, in record time and with pixel-level accuracy.

"The idea is that this kind of algorithm can find coherent groupings pretty much automatically, so we don't have to do it ourselves," explains Mark Hamilton, lead author of the study.

To achieve this, the algorithm scans the entire dataset looking for similar objects that appear again and again across the images. "It then links them together to build a consistent picture across all of the images it learns from," the team explains in a press release.
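As a rough illustration of this general principle (and not of STEGO's actual implementation, which distills correspondences from a pretrained self-supervised backbone), one can imagine extracting a feature vector per pixel and clustering those features across the whole dataset, so that pixels showing the same kind of object receive the same label in every image. The `extract_features` function below is a hypothetical placeholder for such a backbone.

```python
import numpy as np
from sklearn.cluster import KMeans

def extract_features(image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a pretrained self-supervised backbone.
    Returns one feature vector per pixel, shape (H * W, D)."""
    h, w, _ = image.shape
    rng = np.random.default_rng(0)
    return rng.normal(size=(h * w, 64))  # dummy features, for illustration only

def pseudo_label(images: list, num_classes: int = 5) -> list:
    """Cluster features from the whole dataset at once so labels stay
    consistent across images, then fold the cluster ids back into masks."""
    feats = [extract_features(img) for img in images]
    kmeans = KMeans(n_clusters=num_classes, n_init=10).fit(np.concatenate(feats))
    masks, start = [], 0
    for img in images:
        h, w, _ = img.shape
        masks.append(kmeans.labels_[start:start + h * w].reshape(h, w))
        start += h * w
    return masks
```

Clustering the whole dataset in one pass, rather than image by image, is what keeps the pseudo-labels coherent from one picture to the next, which is the property the researchers describe above.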

The researchers then compared STEGO's results to those of other autonomous labeling systems, and the outcome was striking: STEGO proved at least twice as effective as comparable systems. It is the first time an algorithm of this type has aligned almost perfectly with control images labeled by humans.

This is major progress; it could allow many researchers to dramatically increase the speed at which they annotate large datasets. But it would be simplistic to reduce the impact of autonomous systems like STEGO to a simple matter of throughput.

The labeled images (bottom) were generated autonomously by STEGO. © Hamilton et al.

Transcending human limits once and for all

The main appeal of this method is its ability to identify complex patterns that humans cannot classify accurately. "If you're looking at tumor scans, images of a planet's surface, or high-resolution microbiological images, it's hard to know where to look without being a true expert," the researchers explain.

"In some areas, even human experts don't know what the relevant objects are supposed to look like," Hamilton adds. "In these kinds of situations, where we are operating at the frontiers of science, we cannot rely on humans to figure things out before the machines do."

A self-supervised system of this kind could therefore work wonders in certain areas; just think of cancer diagnosis, or of how self-driving vehicles learn to read their environment. And that is only the tip of a huge iceberg of possible applications.

There is still work to do before we get there. As it stands, STEGO suffers from some limitations: it can, for example, be thrown off completely by an eccentric image, such as a banana resting on the handset of a landline phone. The good old "garbage in, garbage out" still applies. But it is probably just a matter of time before STEGO and its successors are mature enough to bring about a real revolution in this crucial area of artificial intelligence.
