Accessible technology: an application for visually impaired people that recognizes objects and their location
Is your smartphone only for communication? It's common knowledge that this gadget is no longer just for phone calls and messages. Nowadays, a smartphone helps us handle daily tasks like banking, making payments, tracking health and physical activity, checking bills and documents, and plenty more, including managing the user's entire social life. There are possibilities we can barely imagine today that will become the reality of tomorrow. Even now, however, a smartphone is capable of becoming the eyes of its user.
One of the first applications for visually impaired people was developed in 2011, and the capabilities of such apps have grown along with the technology. Recently, our Tech Lead Andrii Narinian started working on a product for blind and visually impaired people. He kindly agreed to tell us about it, so read on to learn more about this new iOS app.
This year, Apple released a new iPhone model, the iPhone 12 Pro, equipped with a LiDAR sensor. LiDAR is a technology that determines distance by targeting an object with a laser and measuring the time it takes the reflected light to return to the receiver. The sensor works on a principle similar to sonar; the difference is that sonar uses sound waves, while LiDAR uses light. LiDAR itself is not new: it was created in the middle of the 20th century, but its arrival in brand-new gadgets is very important for us.
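As a rough illustration of the time-of-flight principle described above (not the sensor's actual API), the math behind a single measurement is simply half of the round trip of a light pulse:

```swift
// Rough illustration of the time-of-flight principle behind LiDAR:
// the laser pulse travels to the object and back, so the distance
// is half of (speed of light * round-trip time).
let speedOfLight = 299_792_458.0 // meters per second

func distance(forRoundTripTime seconds: Double) -> Double {
    return speedOfLight * seconds / 2.0
}

// A pulse that returns after about 33 nanoseconds corresponds to an object
// roughly 5 meters away.
print(distance(forRoundTripTime: 33e-9)) // ≈ 4.95 m
```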
By the way, the first Apple device equipped with a LiDAR sensor was the iPad Pro (4th generation), introduced in 2020. Thanks to this iPad, we could start working on the project even before the latest iPhone was released.
As a proof of concept, we built a 3D reconstruction of the space: a mesh that overlays objects in real time at distances of up to 5 meters. When we point the device at an object, the light reflects from it and we hear a signal in response. The closer the object, the louder the sound.
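A minimal sketch of that proof-of-concept idea could look like the code below: read the LiDAR depth at the center of the frame via ARKit and make a feedback sound louder as the object gets closer. Names like `feedbackPlayer` and the 5-meter cutoff are assumptions used for illustration, not the app's actual implementation.

```swift
import ARKit
import AVFoundation

final class ProximitySound: NSObject, ARSessionDelegate {
    let session = ARSession()
    var feedbackPlayer: AVAudioPlayer? // assumed to be preloaded with a short beep

    func start() {
        let config = ARWorldTrackingConfiguration()
        if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
            config.frameSemantics.insert(.sceneDepth) // requires a LiDAR-equipped device
        }
        session.delegate = self
        session.run(config)
    }

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        guard let depthMap = frame.sceneDepth?.depthMap else { return }

        // Read the depth value (in meters) at the center pixel of the depth map.
        CVPixelBufferLockBaseAddress(depthMap, .readOnly)
        defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }
        let width = CVPixelBufferGetWidth(depthMap)
        let height = CVPixelBufferGetHeight(depthMap)
        let rowBytes = CVPixelBufferGetBytesPerRow(depthMap)
        guard let base = CVPixelBufferGetBaseAddress(depthMap) else { return }
        let rowPointer = base.advanced(by: (height / 2) * rowBytes)
        let depth = rowPointer.assumingMemoryBound(to: Float32.self)[width / 2]

        // Map 0 to 5 meters onto volume: the closer the object, the louder the signal.
        let clamped = min(max(depth, 0), 5)
        feedbackPlayer?.volume = 1.0 - clamped / 5.0
    }
}
```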
Today there are many apps for visually impaired people, built with a variety of approaches. The Google Research team, for example, has developed an application that uses AI to let blind people run without any other assistance: the app uses the smartphone camera to recognize a line painted on the ground and plays sounds through headphones when the runner drifts away from it.
Our app, in contrast, classifies objects and reports the distance at which they are located. To recognize and track moving objects we use Google ML Kit: once an object has been recognized, the system keeps tracking it instead of treating it as new, even while it moves. In other words, we're trying to combine two worlds in one app: Apple's LiDAR technology and Google's image recognition engine.
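For readers curious about the recognition side, here is a small sketch of ML Kit's on-device object detection in stream mode, which assigns tracking IDs so an object that was recognized once is not reported as new while it keeps moving. This is an illustration of the library's API, not our app's exact pipeline; error handling and camera setup are omitted.

```swift
import MLKitObjectDetection
import MLKitVision
import AVFoundation
import UIKit

let options = ObjectDetectorOptions()
options.detectorMode = .stream             // track objects across camera frames
options.shouldEnableClassification = true  // coarse on-device labels
options.shouldEnableMultipleObjects = false

let detector = ObjectDetector.objectDetector(options: options)

func detect(in sampleBuffer: CMSampleBuffer, orientation: UIImage.Orientation) {
    let image = VisionImage(buffer: sampleBuffer)
    image.orientation = orientation

    detector.process(image) { objects, error in
        guard error == nil, let objects = objects else { return }
        for object in objects {
            // In stream mode the tracking ID stays the same across frames,
            // so a moving object is not treated as a new detection.
            print(object.trackingID?.intValue ?? -1,
                  object.frame,
                  object.labels.map(\.text))
        }
    }
}
```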
Interestingly, there are ways to estimate an object's position even without a LiDAR sensor, for instance by using complex algorithms to work out isometry and projection. However, this method is extremely complicated and the probability of error is very high. The technology in Apple's gadgets gives us reliable, accurate information: we know exactly whether the object really is near or we only think it's near.
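To show how the two worlds can meet, here is a rough sketch of looking up the LiDAR depth at the center of a bounding box that the recognition engine reported for the camera image. It assumes the depth map covers the same field of view as the captured image, so coordinates can be mapped proportionally; this is an illustration under that assumption, not the app's actual code.

```swift
import ARKit

func distanceToObject(boundingBox: CGRect, in frame: ARFrame) -> Float? {
    guard let depthMap = frame.sceneDepth?.depthMap else { return nil }

    // Normalize the box center against the captured image size...
    let imageWidth = CGFloat(CVPixelBufferGetWidth(frame.capturedImage))
    let imageHeight = CGFloat(CVPixelBufferGetHeight(frame.capturedImage))
    let normalizedX = boundingBox.midX / imageWidth
    let normalizedY = boundingBox.midY / imageHeight

    // ...and scale into the lower-resolution depth map.
    let depthWidth = CVPixelBufferGetWidth(depthMap)
    let depthHeight = CVPixelBufferGetHeight(depthMap)
    let x = min(Int(normalizedX * CGFloat(depthWidth)), depthWidth - 1)
    let y = min(Int(normalizedY * CGFloat(depthHeight)), depthHeight - 1)

    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }
    guard let base = CVPixelBufferGetBaseAddress(depthMap) else { return nil }
    let rowBytes = CVPixelBufferGetBytesPerRow(depthMap)
    let row = base.advanced(by: y * rowBytes).assumingMemoryBound(to: Float32.self)
    return row[x] // distance in meters
}
```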
We are concentrating mostly on developing and improving the UX of our app. Our chief task is to create a genuinely user-friendly product with an accurate recognition system. Recognizing whether something is a dog or a cat is not the hard part; the hard part is helping a visually impaired person with daily tasks, like going to the bathroom, picking up a toothbrush, and brushing their teeth.
Consequently, we are working on letting blind users create their own presets for specific tasks. For instance, the set of items in a preset for a straight walk would differ from one for identifying flowers.
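A task preset could be as simple as the sketch below. The names, fields, and values are assumptions for illustration rather than the app's actual data format; the point is a small, human-readable structure (for example, JSON via Codable) listing which object classes matter for a given task.

```swift
import Foundation

struct TaskPreset: Codable {
    let name: String
    let relevantLabels: [String]   // object classes the user wants announced
    let maxDistanceMeters: Double  // ignore anything farther than this
}

let straightWalk = TaskPreset(
    name: "Straight walk",
    relevantLabels: ["person", "door", "stairs", "car"],
    maxDistanceMeters: 5.0
)

let flowerIdentification = TaskPreset(
    name: "Flower identification",
    relevantLabels: ["flower", "plant", "vase"],
    maxDistanceMeters: 1.5
)

// Presets serialize to readable JSON, which a helper could inspect and edit by hand.
let presets = [straightWalk, flowerIdentification]
if let data = try? JSONEncoder().encode(presets),
   let text = String(data: data, encoding: .utf8) {
    print(text)
}
```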
To use our product, a person carries the smartphone, with the camera turned on, in a special pocket on the chest. As the person walks, the application focuses on what lies straight ahead. In navigation mode, the app leaves elements on the periphery out of focus, because what matters most is what is ahead and whether it is safe or dangerous.
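One way to picture this "focus on what is ahead" behavior is the small sketch below: keep only detections whose bounding-box center falls inside a central window of the camera frame and drop everything on the periphery. The 40% window size is an arbitrary value chosen for illustration, not the app's real parameter.

```swift
import CoreGraphics

func centralDetections(_ boxes: [CGRect], imageSize: CGSize) -> [CGRect] {
    // Central window covering 40% of the frame in each dimension.
    let window = CGRect(
        x: imageSize.width * 0.3,
        y: imageSize.height * 0.3,
        width: imageSize.width * 0.4,
        height: imageSize.height * 0.4
    )
    return boxes.filter { window.contains(CGPoint(x: $0.midX, y: $0.midY)) }
}
```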
We've created systems for classification, categorization, and building various use cases. My apartment contains hundreds of identifiable objects: a table, a sofa, an armchair, and so on. But the question is: what exactly do I need to know on my journey from the armchair to the bathroom?
The main complexity lies outside laboratory conditions. Everything can work faultlessly during experiments, but in real use a lot can go wrong; everything depends on the environment. We strive to build a system that lets the user choose the most relevant classifications. At first, creating individual presets will require assistance, but eventually everything will be managed via VoiceOver.
The accessibility features of the product are designed to help people with all types of visual impairment. Our application is easy to configure, so it's simple to adjust it to the needs of a particular user. We have a special data format that is easy to read and edit, and our flexible system can adapt to different conditions.
As for languages, we currently support English, and we plan to add others in the future.
I find it appealing to work on a product that is unlike today's popular apps, a product that does not belong to the mainstream. It seems to me that technology should be used to help people and make their lives easier, rather than to take the very best photo for Instagram.
Thanks for reading. We hope you enjoyed our article.
Stay tuned for what’s new and follow us!