We’re finally at the point where digital zoom can be genuinely enhanced. It’s no longer just a matter of cropping the frame and blending nearest-neighbor pixels to produce a larger but blurrier picture. Using clever capture techniques and AI, Google can now create zoomed photos that are almost as good as those taken with a zoom lens. Here’s how.
Camera Shake Improves Photos?
The more you zoom in on a photo, the more you notice the slight vibrations of your hand. Your hand follows a sort of elliptical path as you try to hold the phone steady. By capturing multiple images during that movement, the phone can extract depth information. This is how I initially thought Google added portrait mode to the rear camera, but portrait mode still worked on a tripod. It turns out that, like “super res zoom,” portrait mode used a variety of techniques, including split-pixel imaging.
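The reason a tiny shift yields depth is ordinary parallax: nearby objects move farther across the frames than distant ones. Here’s a textbook-style illustration of that relation; the function and numbers below are made up for the example, not taken from Google’s pipeline.

```python
def depth_from_parallax(focal_length_px, baseline_mm, disparity_px):
    """Classic parallax relation: closer objects shift more between two frames.

    focal_length_px: lens focal length expressed in pixels
    baseline_mm:     how far the camera moved between the two frames
    disparity_px:    how far a feature moved between the two frames, in pixels
    """
    return focal_length_px * baseline_mm / disparity_px

# A feature that barely moves between frames (small disparity) is far away;
# one that moves a lot is close.
print(depth_from_parallax(2800, 5.0, 4.0))   # farther subject
print(depth_from_parallax(2800, 5.0, 40.0))  # closer subject
```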
Besides providing depth information, a slight wiggle also helps your phone capture more detail of a subject. As your hand moves, the phone captures multiple images, each offset slightly from the last. When those images are combined, real detail emerges and random noise cancels out: your natural hand movement fills in information on the sensor that any single frame misses.
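The core idea is easier to see in code. Here’s a minimal sketch of the general approach, not Google’s actual pipeline: align a handful of slightly shifted burst frames to the first one and average them, so random noise shrinks while detail shared by every frame is reinforced. It assumes grayscale frames as NumPy arrays.

```python
import numpy as np
from scipy.ndimage import shift as subpixel_shift
from skimage.registration import phase_cross_correlation

def merge_burst(frames):
    """Align slightly shifted burst frames to the first frame and average them.

    Sensor noise differs from frame to frame, so averaging suppresses it,
    while detail present in every frame survives the merge.
    """
    reference = frames[0].astype(np.float64)
    accumulator = reference.copy()

    for frame in frames[1:]:
        frame = frame.astype(np.float64)
        # Estimate the sub-pixel offset caused by hand shake.
        offset, _, _ = phase_cross_correlation(reference, frame, upsample_factor=10)
        # Shift the frame back onto the reference grid and accumulate it.
        accumulator += subpixel_shift(frame, offset)

    return accumulator / len(frames)
```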
But what about tripods? How can super res zoom, along with portrait mode, still work on tripods? By wiggling the sensor! Since modern phones use optical image stabilization, which suspends the camera lens system and sensor with springs to reduce shake, a small tap or magnetic impulse can cause the image sensor to wiggle. So, if you’re not wiggling your phone, your phone will wiggle the sensor for you! Isn’t that clever?
Machine Learning
Most zoom enhancement comes from machine learning. Companies like Nvidia have shown that you can take an image, crop it, and then genuinely enhance and enlarge that cropped region. Yes, like CSI or that scene in Super Troopers. These techniques fill in detail based on what the model has learned from other images. For example, since we know what text looks like, a machine learning algorithm can make zoomed-in text easier to read, and it can sharpen details in faces as well. However, these methods produce images that look acceptable to the human eye rather than a faithful reproduction of what was actually in front of the lens. They’re more useful for pulling details out of a photo after the fact than for producing fabulous photos.
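To make the idea concrete, here is a rough sketch of what a learned upscaler looks like. It is a generic SRCNN-style model, not the network Nvidia or Google actually ship: a few convolution layers trained to map a bicubically enlarged crop to a sharper version of itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySuperRes(nn.Module):
    """A minimal SRCNN-style network: learn to sharpen an enlarged crop.

    Trained on pairs of (bicubically enlarged crop, original full-resolution
    image), it learns to add back plausible detail such as text edges.
    """

    def __init__(self):
        super().__init__()
        self.extract = nn.Conv2d(3, 64, kernel_size=9, padding=4)      # feature extraction
        self.map = nn.Conv2d(64, 32, kernel_size=5, padding=2)         # non-linear mapping
        self.reconstruct = nn.Conv2d(32, 3, kernel_size=5, padding=2)  # rebuild RGB image

    def forward(self, x, scale=2):
        # Crop-and-enlarge first (plain digital zoom), then let the network
        # restore detail that interpolation alone cannot recover.
        x = F.interpolate(x, scale_factor=scale, mode="bicubic", align_corners=False)
        x = torch.relu(self.extract(x))
        x = torch.relu(self.map(x))
        return self.reconstruct(x)
```

The output only needs to look right, which is exactly why it isn’t a faithful record of the scene.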
Google does use machine learning, but only to assist with the burst of images captured in the first step of super res zoom. Machine learning helps line the images up correctly and reduce noise. Google also used ActiveQA, a system that uses reinforcement learning to reward whichever approach produces the best result; a combination of AI and human feedback improves the algorithm your device uses to combine images and identify patterns. This helps Google decide which settings to use when merging frames.
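As a toy illustration of the kind of merge parameter such tuning could set (this is my own sketch, not Google’s published algorithm), a burst merge can down-weight pixels that disagree too much with the reference frame, so moving subjects or misaligned frames don’t smear into the result. The `threshold` parameter here stands in for a setting a learned system might choose per scene.

```python
import numpy as np

def weighted_merge(reference, aligned_frames, threshold=12.0):
    """Merge already-aligned frames, down-weighting pixels that disagree.

    Pixels that differ strongly from the reference (moving subjects,
    alignment failures) would cause ghosting if averaged in blindly,
    so they contribute less to the final image.
    """
    reference = reference.astype(np.float64)
    total = reference.copy()
    weights = np.ones_like(reference)

    for frame in aligned_frames:
        frame = frame.astype(np.float64)
        difference = np.abs(frame - reference)
        # Weight falls smoothly from 1 (identical) toward 0 (very different).
        weight = np.exp(-(difference / threshold) ** 2)
        total += weight * frame
        weights += weight

    return total / weights
```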
The Future of Photography?
This is a revolutionary feature that could change the photography industry. It’s perfect for smartphones, which have small sensors, extremely limited lens options, and often struggle to capture fine detail. However, what about larger cameras? Could this work for your favorite interchangeable lens mirrorless camera? Yes and no.
Most people with interchangeable lens cameras use manual modes. Sure, automatic modes are great for unexpected photos or situations, but for the most part we know what we’re getting into before we take the photo. If you know you’re going to be zooming in, you use a zoom lens; easy! But what if you want to zoom in further after the shot is taken, or further than the lens is capable of? Since shutter speed, ISO, and f-stop are generally set by hand, the camera can’t quietly alter the primary exposure. But what if you could enable a feature that takes short bursts of photos before and after your shot and stores them in the same image file? Software could then pull details from those frames, combine them with your primary image, and recover extra detail. You could zoom in on something after you took the shot.
Up until recently, hardware defined the quality of your photos, outside of skill. If you knew exactly what you were doing and could take two identical shots, the one with the better lens and sensor would look slightly better. It could be more detailed, have a wider color gamut, or appear more realistic. However, now, thanks to clever software and shutter techniques, we’re looking at a new camera quality race, one that will inevitably find its way into professional cameras. Software will give us better images. Soon, technology like this will exist in every camera you shoot with.
Sources:
- Scott Scrivens, Android Police
- Bartlomiej Wronski, Google AI Blog