Mapping the planes

Samsung has a patent and a plan for using two lenses with triangulation (image offset) depth detection between two images in what is roughly a stereo pair. Here’s a link:

Pentax also have a system on the new Q range which takes more than one exposure, changes the focus point between them, and uses this to evaluate the focus map and create bokeh-like effects. Or so the pre-launch claims for this system indicate, though the process is not described. It’s almost certain to be a rapid multishot method, and it could equally well involve blending a sharp image with a defocused one.

In theory, the sweep panorama function of Sony and some other cameras can be used to do exactly the same thing – instead of creating a 3D 16:9 shot it could create a depth mapped focus effect in a single shot. 3D is possible with sweep pans by simply taking two frames from the multi-shot pan separated by a certain amount, so the lens positions for the frames are separated enough to be stereographic. 3D ‘moving’ pans (scrolling on the TV screen) can be compared to delaying the playback of the left eye view and shifting the position of subject detail to match the right. But like 16:9 pans, they are just two JPEGs.

All these methods including the Samsung concept can do something else which is not yet common – they can alter any other parameter, not just focus blur. They could for example change the colour balance or saturation so that the focused subject stands out against a monochrome scene, or so the background to a shot is made darker or lighter than the focused plane, or warmer in tone or cooler – etc. Blur is just a filter, in digital image terms. Think of all the filters available from watercolour or scraperboard effects to noise reduction, sharpening, blurring, tone mapping, masking – digital camera makers have already shown that the processors in their tiny cameras can handle such things pretty well.
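
To show how little is involved once the camera has a depth value for each pixel, here is a minimal sketch of a distance-selective filter. It is not Samsung's or anyone else's method. The function name, the list-of-tuples image layout and the `tolerance` parameter are my own assumptions for illustration, and the grey conversion uses the common luma weights.

```python
def desaturate_background(pixels, depth, focus_depth, tolerance):
    """Return a copy of `pixels` with everything outside the focused plane turned grey.

    pixels: rows of (r, g, b) tuples, values 0-255
    depth:  rows of distances, same shape as pixels, in any consistent unit
    """
    out = []
    for pixel_row, depth_row in zip(pixels, depth):
        new_row = []
        for (r, g, b), d in zip(pixel_row, depth_row):
            if abs(d - focus_depth) <= tolerance:
                # Pixel lies in the focused plane: keep its colour.
                new_row.append((r, g, b))
            else:
                # Pixel is nearer or further away: replace with its grey value.
                grey = round(0.299 * r + 0.587 * g + 0.114 * b)
                new_row.append((grey, grey, grey))
        out.append(new_row)
    return out


# Hypothetical two-pixel image: a red subject at 2 m, green foliage at 10 m.
pixels = [[(200, 50, 50), (10, 200, 10)]]
depth = [[2.0, 10.0]]
result = desaturate_background(pixels, depth, focus_depth=2.0, tolerance=0.5)
```

Swap the grey conversion for a blur, a tint or a brightness change and the same loop gives any of the other effects described above.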

Once a depth map exists there’s almost no limit to the manipulation possible. Samsung only scratches the surface by proposing this is used for the esoteric and popular bokeh enhancement (a peculiarly Japanese obsession which ended up going viral and infecting the entire world of images). I can easily imagine a distance-mapped filter turning your background scene into a Monet or a van Gogh, while applying a portrait skin smoothing process to your subjects.

Any camera with two lenses in stereo configuration should also, in theory, be able to focus using a completely different method to existing off-sensor AF – using the two lenses exactly like a rangefinder with two windows. So far this has not been implemented.
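
The rangefinder principle comes down to one line of arithmetic. As a sketch only, assuming two parallel lenses and an ideal pinhole model (the function and parameter names are mine, not from any camera firmware), the subject distance is the focal length times the lens separation, divided by the image offset:

```python
def distance_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Estimate subject distance in metres from the offset between two views.

    focal_length_px: lens focal length expressed in pixel units
    baseline_m:      separation between the two lenses, in metres
    disparity_px:    horizontal offset of the same detail between the images
    """
    if disparity_px <= 0:
        # No measurable offset: the subject is effectively at infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical figures: lenses 5 cm apart, detail offset by 25 pixels.
subject_distance = distance_from_disparity(1000, 0.05, 25)
```

With those made-up figures the subject comes out at about 2 metres. Halve the offset and the distance doubles, which is why a wider lens separation gives finer discrimination at long range.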

Way back – 40 years ago – I devised a rangefinder optical design under which you could see nothing at all at the focus point unless the lens was correctly focused. It works well enough for a single spot, the image detail being the usual double coincident effect when widely out of focus, but blacking out when nearly in focus and suddenly becoming visible only when focus is perfect. I had the idea of making a chequerboard pattern covering an entire image, so that the viewfinder would reveal the focused subject and blank out the rest of the scene, but a little work with a pencil and paper quickly shows why it wouldn’t work like that. The subject plane would have integrity, but other planes would not all black out; they’d create an interestingly chaotic mess with phase-related black holes.

Samsung’s concept, in contrast, could isolate the subject entirely – almost as effectively as green screen techniques. It would be able to map the outline of a foreground subject like a newsreader by distance, instead of relying on the colour matte effect of green or blue screen technology. This would free film makers and TV studios from the restraints of chroma-keyed matting (not that you really want the newsreader wearing a green tie).

The sensitivity of the masking could be controlled by detecting the degree of matched image detail offset and its direction (the basic principle of stereographic 3D) – or perhaps more easily by detecting exactly coincident detail, in the focused plane. Photoshop’s snap-to for layers works by detecting a match and so do the stitching functions used for sweep and multi shot in-camera panorama assembly. Snap-to alignment of image data is a very mature function.
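
Detecting the offset of matched detail can be sketched as a simple patch comparison along one row of pixels. This is a generic sum-of-absolute-differences search, offered as an illustration rather than as what Photoshop or any camera actually does. The function name and the sample rows are hypothetical.

```python
def best_offset(left_row, right_row, start, width, max_shift):
    """Find the leftward shift at which a patch of left_row best matches right_row.

    Returns (shift, cost). A cost of 0 means the detail matches exactly;
    a shift of 0 with a cost of 0 means exactly coincident detail.
    """
    patch = left_row[start:start + width]
    best_shift, best_cost = 0, None
    for shift in range(0, max_shift + 1):
        pos = start - shift
        if pos < 0:
            break  # the shifted patch would fall off the edge of the row
        candidate = right_row[pos:pos + width]
        # Sum of absolute differences between the two patches.
        cost = sum(abs(a - b) for a, b in zip(patch, candidate))
        if best_cost is None or cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift, best_cost


# The same three-pixel detail appears three pixels further left in the right view.
left = [0, 0, 0, 0, 9, 7, 9, 0, 0, 0]
right = [0, 9, 7, 9, 0, 0, 0, 0, 0, 0]
shift, cost = best_offset(left, right, start=4, width=3, max_shift=4)
```

Here the search settles on a shift of 3 with a perfect match. A mask could accept only detail whose shift falls inside a chosen range, and narrowing or widening that range is the sensitivity control.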

Just when you think digital photography has rung all the bells and blown all the whistles, the tones of an approaching calliope can be heard rolling down the river…

– David Kilpatrick


A dream of the future – and past

Sometime earlier this year, early Spring I think, I had a vivid and detailed dream during a slow waking-up hour. It was the kind of dream which feels rational not random. I knew what I was doing in it – in control!
This time, I described the dream to my wife and son; he knows a lot about this stuff, and thought it was an accurate dream. It was possible. Now Sony is about to release the camera I was using in the dream.
Here is the dream.
I am walking across a kind of pier or boardwalk construction at the edge of water. It’s not in Britain. It’s warm and sunny, and it could be in the USA. The boards are raised above what would be the shore, and there are wooden buildings left and right of me. Ahead, I can see the lake water, and boat moorings with a jetty. To the left of me is the largest building, which is a shop or museum; something to visit. There are ornamental shrubs placed on planters or pots, and there are some notices or signs on the building. To the right, the wooden building is functional; it could be a boat house, a yacht club, or something like that. There are pine woods beyond.
My job is to move to the four corners of this scene, and other positions, taking care to make a complete set of images from a range of camera placements and angles. I’m using a wide-angle lens, and my camera is equipped with GPS which records the exact position and orientation of the camera for every shot.
I do not worry about people in the pictures because the software will ignore them, nor about the light, but it is a beautiful day anyway. I am taking the pictures for a project and this is paid work. This is actually what I do for a living (in the dream). I am visiting hundreds of the most frequently-photographed places in the world, and producing a set of pictures of each one.
But it’s not what I am doing which is the interesting bit. It’s what I know about it. In the dream, I have all the knowledge about what I am doing that I would have if it was real.

Time and space
Here’s how my pictures are being used. Each set of images with its GPS coordinates is fed into a system which constructs a 3D model of the environment. It is capable of recognising identical elements seen from different angles, and uses the GPS data to help identify them. With two 2D views of a building from different positions, it can use the focus distance and lens angle information to compensate for small inaccuracies in the GPS data, and wireframe the exact design and scale of the structure.
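The geometry behind fixing a point from two camera placements is ordinary triangulation. As a flat-ground sketch only, assuming each shot gives a position in local east/north metres and a compass bearing to the same detail (all names here are my own, not part of any real system), the point is where the two sight lines cross:

```python
import math


def triangulate(p1, bearing1_deg, p2, bearing2_deg):
    """Locate a point seen from two camera positions.

    p1, p2:   (east, north) camera positions in metres
    bearings: compass directions to the same detail, degrees clockwise from north
    Returns (east, north), or None if the sight lines never cross in front
    of both cameras.
    """
    d1 = (math.sin(math.radians(bearing1_deg)), math.cos(math.radians(bearing1_deg)))
    d2 = (math.sin(math.radians(bearing2_deg)), math.cos(math.radians(bearing2_deg)))
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        return None  # parallel sight lines: no intersection
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (dx * d2[1] - dy * d2[0]) / denom  # distance along the first sight line
    t2 = (dx * d1[1] - dy * d1[0]) / denom  # distance along the second sight line
    if t1 < 0 or t2 < 0:
        return None  # the crossing point is behind one of the cameras
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


# Two shots 10 m apart, both aimed at the same corner of a building.
corner = triangulate((0.0, 0.0), 45.0, (10.0, 0.0), 315.0)
```

In this made-up case the corner lands 5 m east and 5 m north of the first camera. A real system would have to do this in three dimensions, for thousands of matched points, and allow for the errors in each position and bearing.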
It identifies textures and objects like foliage, text on signs, clouds, and people. Once my entire set of images from this place has been processed (I am aware they are being transmitted as I take the pictures) new photographs which never existed can be created. A virtual camera can be positioned anywhere within the area I photographed, and my few dozen still images from fixed positions enable a new view to be constructed with complete accuracy.
I’ve used the result (in my dream) and it has incredibly high resolution because of the correlated image information. It’s a bit like Sony’s multi-shot or HDR or panorama technology, but instead of aligning two very similar images, it maps the coincident key points of entirely different views of the same scene. Where a walk-through VR allows viewing all angles from one position, this allows viewing any angle from any position.
And it goes beyond that to add a timeline.
The system I’m working for gathers millions of photographs from people all over the world. I’m photographing these key locations because they are the most photographed in the world. Camera phone images now record GPS data, and also record the date. So (at this future time) do most digital cameras and video cameras.
The system can find images matching every location by trawling the web; from Flickr, Facebook or whatever is out there. It can analyse the images to see whether they actually match the location they appear to be from. For every location, the system gathers in as many more pictures as it can find.
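The first pass of that trawl could be nothing more than a distance check on the GPS tags. Here is a rough sketch using the standard haversine great-circle formula. The dictionary layout, the function names and the coordinates are invented for illustration, and checking that the picture content really matches the place would be a separate and much harder step.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def images_near(images, lat, lon, radius_m):
    """Keep only the images whose GPS tag lies within radius_m of the location."""
    return [img for img in images
            if haversine_m(img["lat"], img["lon"], lat, lon) <= radius_m]


# Hypothetical found images: one taken on the boardwalk, one a degree further north.
found = [
    {"name": "a", "lat": 44.0001, "lon": -72.0},
    {"name": "b", "lat": 45.0, "lon": -72.0},
]
nearby = images_near(found, 44.0, -72.0, radius_m=100)
```

Only the first image survives the 100 metre cut. The second is more than 100 km away, whatever its caption claims.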
The first result of this is more detail. The second is that the viewer can change the season or weather conditions in which the location is seen. It can be viewed at night, in snow, in rain, at sunset; whatever. My image-set provides the framework, but seasonal changes can be created from the ‘found’ images of the place.
The third result is the timeline. Old photographs of these places have been fed into the system. For some popular spots, it’s possible to track the environment backwards for over 100 years. Trees change size, buildings appear and disappear. By turning on ‘people’ (which the software can remove) the crowds, groups or individuals who were in the scene at any time can be shown. And the 3D environment is still enabled because all the old photographs are co-ordinate mapped to the new information.
I do not have to work all this out in my dream, because I already know it. I am working with this awareness. The entire thing is known to me, without having to think about it. I also know that future pictures captured from the internet will continue to add to the timeline and the ‘people’ function, so in five years’ time the seasons and the visitors to this place can be viewed almost by the minute.
The dark side
Because this is a dream, I do not have to think or rationalise to get this understanding; it was included with the dream. As I wake up, I realise what I have been dreaming and then make an effort to ‘save to memory’. That also kicks in the thinking process.
I start to wonder who was hiring me to do this survey-type photography, because in the dream that is one thing I don’t know. I realise how exciting it is to be able to use this Google-Earth or Google-Street type application to view not only any part and any angle of these tourist locations, but any season or time of day, and many past times in their history.
When I describe it to him, Richard suggests it’s probably Microsoft. He likes the collation of web-sourced images covering seasons, and maybe decades of past time. He thinks it is all possible and the core technology exists right now. I should patent it and give it a name!
But there is one thing which I understood just as I was waking up; the system can recognise people. Not just as people to be ‘removed’ from a scene or turned back on; it can recognise faces. The movements of one individual can be reconstructed within that location, and it can use a ‘cloud’ of gathered pictures taken at the same time to do so. This is not just virtual tourism and virtual history. In other locations – not beautiful waterside boardwalk quays – it is surveillance brought to a new level.
Sony A55 and A580
Sony’s new models with built-in GPS are the first cameras which will record the data my dream required. The GPS is not the typical latitude-longitude only. It also records height above sea level (elevation) and the direction the camera is pointing (orientation). The camera-data information records the focus distance and point of focus, and the angle of view of the lens (focal length), the time, and the measured light level and apparent colour temperature. Maybe in the A55 the spirit level function also records horizon tilt and position.
OK, the camera I was using in the dream was more like a 5 x 4 on a tripod. But that could be just a dream – like the giant fish which leapt on to the boards and brought the jetty crashing down into the water a second before I woke up…
– David Kilpatrick