Microsoft's new image-captioning AI is pretty accurate

Microsoft’s new image-captioning AI system is more accurate than humans

The American tech giant Microsoft has reached another milestone. The company has developed a new image-captioning AI system that describes the contents of photos with high accuracy.

In fact, in many cases it is more accurate than humans. Above all, this will improve accessibility for visually impaired users, who rely heavily on screen readers and “alt text” when viewing images online.

Seeing AI: Microsoft’s new image-captioning system

Microsoft has developed an image-captioning system that is more accurate than humans. In a blog post, Microsoft said that the system “can generate captions for images that are, in many cases, more accurate than the descriptions people write. The breakthrough in a benchmark challenge is a milestone in Microsoft’s push to make its products and services inclusive and accessible to all users.”

The company has used the AI system to update Seeing AI, its assistant app aimed chiefly at people with visual impairments. The app is popular among users who rely on their smartphone camera to read text or to describe objects and their surroundings.

Soon, the system will be incorporated into other Microsoft services such as Word, PowerPoint, and Outlook. There it will be used for tasks like creating alt text for images, a function that is important for making content more accessible.

“Ideally, everyone would include alt text for all images in documents, on the web, in social media — as this enables people who are blind to access the content and participate in the conversation,” said Saqib Shaikh, a software engineering manager with Microsoft’s AI team. “But, alas, people don’t. So, there are several apps that use image captioning as a way to fill in alt text when it’s missing,” he added.
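The gap-filling idea Shaikh describes can be illustrated with a minimal sketch. It is not Microsoft's implementation: `placeholder_caption` is a hypothetical stand-in for a real captioning model, and the function and class names are assumptions made for this example. The sketch only finds images in a page that lack alt text and pairs each one with a suggested caption.

```python
from html.parser import HTMLParser


def placeholder_caption(src: str) -> str:
    """Hypothetical stand-in for a captioning model; it never looks at pixels."""
    return f"Image: {src}"


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> whose alt attribute is absent or empty."""

    def __init__(self) -> None:
        super().__init__()
        self.missing: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Simplification: an empty alt is treated as missing, although it can
        # be a deliberate choice for purely decorative images.
        if not alt or not alt.strip():
            self.missing.append(attributes.get("src") or "")


def suggest_alt_text(html: str, captioner=placeholder_caption) -> dict[str, str]:
    """Map each image that lacks alt text to a caption from `captioner`."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return {src: captioner(src) for src in finder.missing}


if __name__ == "__main__":
    page = '<p><img src="a.png"><img src="b.png" alt="A dog"></p>'
    print(suggest_alt_text(page))
```

In a real application, the `captioner` argument would be replaced with a call to an actual image-captioning service, and the suggestions would ideally be reviewed by a person before being published.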

Seeing AI uses computer vision to describe the world as seen through the user’s smartphone camera. It can identify a wide range of objects, including household items. It scans and reads text, describes the surroundings, and can even identify friends.

You can also use Seeing AI to describe images in other apps, such as email clients, social media apps, and messaging apps like WhatsApp.

Microsoft further lightens the load for the visually impaired

Seeing AI is already one of the leading apps for people who are blind or have low vision. Microsoft’s new image-captioning algorithm is expected to improve its performance significantly.

The algorithm not only identifies objects but also describes the relationships between them. For example, it looks at a picture and names the people and objects in it, e.g., a person, a basketball, a court. It also describes how they interact, e.g., a person is playing basketball on a court.

Microsoft says the new algorithm is twice as good as the previous image-captioning system, which it has used since 2015.
