Artificial Intelligence Basics: What is Computer Vision?

It’s #futurefridays and today we’ll be talking about the basics of Artificial Intelligence.

If this catches on, I might even turn this into a series!

For this first installment, I’ll be talking about Computer Vision.

Wikipedia defines Computer Vision as:

Computer vision is an interdisciplinary scientific field that deals with how computers can be made to gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to automate tasks that the human visual system can do.

Let me show you some examples of how Computer Vision is being used in real-world scenarios.

But first, I recommend watching this video from Accenture, which gives a really simple explanation of what Computer Vision is in just 1 minute.

AI 101: What is Computer Vision?

Video © Accenture

Now that you have a basic understanding of what Computer Vision is, allow me to give some examples of Computer Vision in action today.

OCR (Optical Character Recognition)

This technology isn’t new.

OCR is simply the ability to have a machine convert text from a scanned document or image into a machine-readable format.
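To make that a bit more concrete, here is a toy sketch in Python. It assumes each character has already been cut out of the scan as a tiny 3x5 black-and-white grid, and it simply picks the known template that differs from it the least. The templates and function names are made up for illustration; real OCR engines are far more sophisticated than this.

```python
# Toy OCR sketch: match tiny black-and-white glyphs against known templates.
# The 3x5 templates and the function names are illustrative assumptions,
# not how any production OCR engine works.

TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}


def pixel_distance(glyph, template):
    """Count the pixels where the glyph and the template disagree."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize_glyph(glyph):
    """Return the character whose template differs from the glyph the least."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(glyph, TEMPLATES[ch]))


def recognize_text(glyphs):
    """Convert a sequence of glyph bitmaps into a machine-readable string."""
    return "".join(recognize_glyph(g) for g in glyphs)


# A slightly noisy "7" (one pixel flipped), followed by a clean "1" and "0".
noisy_seven = ["###", "..#", ".#.", "##.", ".#."]
scanned = [noisy_seven, TEMPLATES["1"], TEMPLATES["0"]]
print(recognize_text(scanned))  # prints 710
```

Notice that the noisy "7" is still recognized, because it is closer to the "7" template than to any other.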

Once it’s in a format that a computer can recognize, the possibilities are vast.

For example, the Google Translate app allows you to walk around in a foreign country and understand the signs, even when they’re written in a language you don’t understand.

Simply open the Google Translate app, point your camera at the sign, and it will translate the sign into a language you can understand.

I talked about OCR in detail in a previous article – How can OCR (Optical Character Recognition) become AI (Artificial Intelligence)? Here’s how.

I suggest you give that a read.

Facial Recognition

Have you ever wondered how Facebook and Google Photos are able to instantly know who the people are in the photos that you upload?

That’s Facial Recognition in action.

Since there’s already a database of photos of people from their social media profiles, the Googles and Facebooks of the world can pattern-match who’s who whenever new photos are uploaded.
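Here is a minimal sketch of what that pattern-matching might look like. It assumes each face has already been turned into a short list of numbers (often called an embedding) by some vision model. The names, numbers, and threshold below are all hypothetical, and this is not a description of how Facebook or Google actually do it.

```python
import math

# Hypothetical face "embeddings": the numbers a vision model might produce
# for each known face. All names and values here are made up.
KNOWN_FACES = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
}


def cosine_similarity(a, b):
    """Return how closely two vectors point in the same direction (max 1.0)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(embedding, known_faces, threshold=0.95):
    """Return the best-matching name, or None if nobody is similar enough."""
    best_name, best_score = None, -1.0
    for name, known in known_faces.items():
        score = cosine_similarity(embedding, known)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


new_photo = [0.88, 0.12, 0.33]  # very close to alice's stored embedding
stranger = [0.1, 0.1, 0.9]      # not close to anyone in the database

print(identify(new_photo, KNOWN_FACES))  # prints alice
print(identify(stranger, KNOWN_FACES))   # prints None
```

The threshold is the design choice that matters here: set it too low and strangers get tagged as your friends, set it too high and real matches get missed.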

This is the same technology being used in some of the spy movies that you get to watch, where they can pause a CCTV recording to get a snapshot of someone’s face, and run it through Facial Recognition software to identify who that person is, even when they’re wearing a disguise.

Facial Recognition has become more advanced lately, to the point where computers can now infer sentiments or emotions from your facial expression, e.g. looking happy or surprised.

Image Recognition

If you haven’t tried using Google Lens then you might have been living under a rock for some time.

Google Glass may have made you look dorky but Google Lens allows you to use your mobile phone’s camera to do wonders.

For example, when my wife was getting into succulents as a hobby, it was difficult for us to know the various species of succulents whenever we saw some.

What shortened the learning curve was Google Lens.

Simply take a photo of the succulent, and Google Lens will search the internet for information relevant to that particular succulent, including its scientific name, rarity, origin, the type of weather or environment it grows in, how to take care of it, and so forth.

You can use Google Lens to get information about things you see and want to know more about.

Saw a watch or any other item you like and want to know more about it and where to get it? Use Google Lens.

Combining Image Recognition (or other forms of Computer Vision) with other sensors can yield wonders as well.

For example, there are replicas of the Eiffel Tower in many places around the globe.

If you take a picture of the Eiffel Tower, geo-location can tell you whether it’s the real one: it is if your coordinates indicate that you’re in Paris, standing in front of it.
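As a rough sketch of that idea, the snippet below combines a label from an image recognizer with the phone’s GPS coordinates. The label string, the 1 km cutoff, and the function names are my own assumptions for illustration. The distance itself is computed with the standard haversine formula.

```python
import math

# Approximate coordinates (latitude, longitude) of the Eiffel Tower in Paris.
EIFFEL_TOWER = (48.8584, 2.2945)
EARTH_RADIUS_KM = 6371.0


def haversine_km(point_a, point_b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, point_a)
    lat2, lon2 = map(math.radians, point_b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def is_probably_real_eiffel(image_label, photo_location, max_km=1.0):
    """Combine the image label with GPS: real only if taken near the tower."""
    looks_right = image_label == "Eiffel Tower"
    close_enough = haversine_km(photo_location, EIFFEL_TOWER) <= max_km
    return looks_right and close_enough


champ_de_mars = (48.8556, 2.2986)  # the park right next to the tower
las_vegas = (36.1126, -115.1727)   # roughly where a well-known replica stands

print(is_probably_real_eiffel("Eiffel Tower", champ_de_mars))  # prints True
print(is_probably_real_eiffel("Eiffel Tower", las_vegas))      # prints False
```

Neither signal is enough on its own. The image says "this looks like the Eiffel Tower," and the coordinates say "you are actually in Paris."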

Biometric Authentication

Biometric authentication covers Fingerprint, Iris, and Facial Recognition, which are sometimes combined with Voice Recognition and even DNA matching.

These types of Computer Vision are commonly used in the security industry.

These things have also become a basic feature for unlocking our mobile phones.

Yes, that’s Computer Vision in action.

Driver Assist and Self-Driving Cars

Ah yes, driver assist and self-driving cars.

From lane and obstacle detection, to self-parking cars, to fully-automated self-driving cars.

This field is probably one of the most comprehensive uses of Computer Vision.

The reason is that the previous examples of Computer Vision I mentioned are typically based on a flat, 2-dimensional image.

For self-driving cars, you need 3D. You need to give Computer Vision a sense of depth, especially because you are moving forward really fast.

One way to do that is by using LIDAR, which stands for Light Detection and Ranging.

I won’t be explaining it in detail here, though you can Google it if you’d like, but in a nutshell, LIDAR uses lasers to give the computer a sense of range and depth.
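For a rough feel of the "ranging" part, here is a small sketch that assumes the common time-of-flight approach: send out a laser pulse, time how long it takes to bounce back, and turn that time into a distance. The function name and the sample timing are just illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # meters per second, in a vacuum


def lidar_distance_m(round_trip_seconds):
    """Distance to an object, given how long a laser pulse took to return.

    The pulse travels to the object and back, so we halve the total path.
    """
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2


# A pulse that comes back after 200 nanoseconds hit something about 30 m away.
print(round(lidar_distance_m(200e-9), 2))  # prints 29.98
```

Repeat that measurement in many directions, many times per second, and you get a 3D map of distances around the car.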

Elon Musk’s SpaceX uses this technology as well to dock its spacecraft.

Tesla, on the other hand, uses 8 cameras, 12 ultrasonic sensors (for near-field “vision”), and a forward RADAR (because that’s the only direction where you’re moving really fast) instead of LIDAR.

The RADAR helps the cameras get a better sense of depth and range, even in conditions where visible light struggles, such as fog, snow, smoke, and so forth.

Ending Note

I hope you found this to be of value, and that it gave you a sense of what Computer Vision is and where it is being used in real-world scenarios.

If you have any questions, reply with a comment. I’d love to hear from you.

And if you think this is helpful and you’d want to get updates on the next article, subscribe for updates, and get a free copy of my book – The Business Optimization Blueprint.

I help transform businesses and take them to the next level with my expertise in Agile, Lean Six Sigma, Operational Excellence, and Intelligent Automation. Author of The Business Optimization Blueprint.