An anonymous reader writes "This article on Photo.Net describes a new type of imaging technique that finds depth discontinuities in real-world scenes with multiple flashes added to ordinary digital cameras. As depth discontinuities correspond to real 3D object boundaries, the resulting images look like line drawings. The same technique was used at this year's SIGGRAPH to create a live A-ha 'Take On Me' demo."
It's interesting to see that people finally wanted to try to obtain from their hardware what they'd usually expect Photoshop filters to do. I am, for example, very happy with my Motorola v550 cell phone camera, which takes the trashiest but also most colorful unrealistic photos.
I think this is a quantum level above the Photoshop filters on an ordinary photo.
In a standard photo, where it is light and where it is dark is only an approximation of 3D properties from a specific angle.
The use of multiple flashes gives a much more complete picture of depth.
The real question is what is the cost of this process, and how does it compare with laser modeling techniques?
If the cost is not very low and the ease of use not very high, I would say most of the uses of this technology would be better served by the capability of laser scanners to produce a high resolution digital 3D model of an object, rather than a 2D representation of a 3D object.
I know which one I would rather my surgeon was using, I know that much!
If you want a 3D model, then this isn't going to be a big help to you. But oftentimes you don't need a full model, you just need a really good image from one or two POVs.
In my previous life in manufacturing, this would have been a godsend for creating as-built drawings of custom work and for making assembly drawings for the customer.
The hard part is the image processing software that turns the differences in shadow between the images into the outline image.
That's where Photoshop comes in. It seems like most of the math tools required are built in as layer modes...
From the article:
The shadows of an image are detected by first computing a shadow-free image, which is approximated with the MAX composite image. The MAX composite image is assembled by choosing from each pixel the maximum intensity value from the image set.
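For anyone who wants to try the quoted step, here is a minimal sketch of the MAX composite in plain Python. Images are assumed to be equally sized grayscale grids stored as lists of rows; the toy pixel values are made up. The `ratio_image` helper is my own assumption about what one would do next (divide each flash image by the composite so shadowed pixels stand out); the quoted text does not spell that out.

```python
def max_composite(images):
    """Per-pixel maximum over a set of equally sized grayscale images."""
    rows, cols = len(images[0]), len(images[0][0])
    return [[max(img[r][c] for img in images) for c in range(cols)]
            for r in range(rows)]


def ratio_image(image, composite, eps=1e-6):
    """Assumed follow-up step: image / composite, close to 1 where lit,
    close to 0 where this particular flash left a shadow."""
    return [[pix / (m + eps) for pix, m in zip(irow, mrow)]
            for irow, mrow in zip(image, composite)]


# Toy 1x4 scanlines (hypothetical values): each flash shadows a different pixel.
left = [[200, 180, 20, 190]]    # shadow at index 2
right = [[200, 25, 170, 190]]   # shadow at index 1

composite = max_composite([left, right])
print(composite)                      # [[200, 180, 170, 190]]
print(ratio_image(left, composite))   # index 2 is far below 1
```

Because every pixel is lit by at least one of the flashes in this toy case, the composite comes out shadow-free, which is the approximation the article describes.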
I wonder if this technology could be extended to allow one to quickly take a picture of a real world object and turn it into 3D models (for use in 3D Studio etc.). Obviously one would have to take multiple pictures (six?) to get a proper all round representation of the object.
3D cameras do exist... though the one that I saw was a fairly substantial beast. About the size of a phone booth, you stand in the middle and well-calibrated cameras all around you take pictures, generating a 3D model of whatever's in there.
It was strange seeing a surprisingly high resolution 3D model of me on screen seconds after I'd stepped out of the thing.
The cameras I have seen, low end, that are used for 3-D in jewelry CAM (cameos, brooches, rings, busts, etc.) project a grid on the object and then photograph multiple views. I've done a little engineering work for a company that sells these as a side line to their table top CNC milling machines. If you're interested in Jewelry and small model making, SHAMELESS PLUG WARNING, have a look at the modelmaster [nyud.net] web site (I don't run this web site so don't bitch at me:-) Talk to Mike. Tell him Bob sent you (for
This technology is a long way from 3-D. First, this camera can only estimate relative depth not absolute depth. Thus, it might determine that the foreground object is half the distance to the camera as the background object, but have no estimate of the numerical distance of either object - the foreground could be 3 feet from the camera and the background would be 6 feet or the foreground could be 5 feet from the camera and the background would be at 10 feet.
Furthermore, this technology only sees edge discontinuities where a foreground object sits in front of a background object.
Thus it cannot tell the difference between a circular disk in the foreground or a sphere in the foreground. Actually it is worse than that because the rounded edge of the sphere will cause errors in the estimation of the relative depth of the sphere vs. the background.
Even with these limitations, the technology could be quite useful in robotics. Combining multiple edge images using optical flow and knowledge of the robot's motion would yield a more accurate 3-D depth map at least for the purposes of navigation.
As for extending the technology, a second camera would do wonders for pinning down the distances to each observed edge. The system would still need separate software magic for mapping the front surfaces of objects (e.g. discerning the difference between a 3-D sphere and a 2-D disk).
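To put a number on the second-camera suggestion above: with two parallel cameras, the textbook pinhole stereo relation gives absolute depth as Z = f * B / d. This is a rough sketch only; the focal length, baseline and disparity values are hypothetical, and nothing here comes from the multi-flash article itself.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo relation Z = f * B / d.

    focal_px     -- focal length in pixels (hypothetical value below)
    baseline_m   -- distance between the two cameras in metres
    disparity_px -- horizontal shift of the same edge between the two views
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# The same edge seen with a 40 px and a 20 px shift between the two views:
near = depth_from_disparity(800, 0.1, 40)   # about 2 m
far = depth_from_disparity(800, 0.1, 20)    # about 4 m
print(near, far)
```

The flash shadows tell you where the edges are; the disparity of each edge between the two views is what would turn "half as far" into an actual distance.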
How about having a camcorder with several differently coloured light sources? By analyzing the correspondingly differently coloured shadows one could create depth information in real time.
Add this to moving around a room while filming it. It should be possible to create an accurate 3D-representation even with today's technology.
If the colours of the light sources were properly matched, any discoloration could probably be eliminated as well.
What should work too, is using two camcorders taped together. Stereo photography has been around for ages.
Yes, but the whole point of this system is to use one camera which is much cheaper and simpler to implement than two.
Additionally, if you RTA, it is mentioned that this system copes better with surfaces which would look uniform (white object on white background) than a stereo-based system, as different directions of light are more likely to expose object borders than light from a single direction.
It bothers me a lot that stereo photography has been around so long yet isn't ubiquitous yet. Modern digi-cams don't do this. You said it's been around for ages, I hope most people know you mean more than decades. A quick google search tells me 1839 at the latest. What is stopping it?
Putting 2 sensors on a digi cam (photo or video) is not a difficult trick. You store the images in a format that supports 2 channels (left/right) and you can view them on any monitor with a simple pair of USB controlled glasses that flicks back and forth blacking out each eye. Also there are already 3d monitors out there that work without glasses.
Print out one channel for a 2d image or use photoshop filters to create red/blue 3d prints. Or even send images to a printer and get back those wheels used in those orange stereoscope toys.
If I had this ALL my pictures would be 3d. For that matter all movies should be 3D. IMAX has a workable solution but I think every movie should be shown this way. People would even buy their own personal polarized glasses that are more comfortable than the pairs handed out at the show.
I've been eyeing a digital-SLR for quite some time, for the cost of one of those I'd gladly turn my attention to a 3D capable camera with lower quality. And if the grandparent post is right something similar should be possible for SLR cameras without using 2 huge lenses. Although I'd submit that you can't always control the lighting.
Every now and then a red/blue 3D image comes up on APOD [nasa.gov] or elsewhere and I kick myself for not having a cheap pair of red/blue glasses.
Print out one channel for a 2d image or use photoshop filters to create red/blue 3d prints. Or even send images to a printer and get back those wheels used in those orange stereoscope toys.
FYI, they're called "View-Master [fisher-price.com]," and apparently they're no longer available in the vertical-wheel red/orange style I had as a kid.
All you need to make a red-blue 3D movie is two cameras a certain distance apart. Apply a red filter to one and a blue filter to the other, and voila. This multiple-flash technique uses a single camera, as would the parent's suggestion.
Actually, you don't want to apply the filters to the camera, you want to apply them after the image has been captured, when you combine the two images onto one piece of film.
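Done digitally, "apply the filters afterwards" could look something like the sketch below. It assumes two grayscale frames of the same size (lists of rows) from the left and right cameras, and builds RGB pixels with the left view in the red channel and the right view in the blue channel. The channel assignment and the empty green channel are my assumptions for a simple red/blue anaglyph.

```python
def red_blue_anaglyph(left_gray, right_gray):
    """Combine two grayscale views into one RGB image:
    red channel = left view, blue channel = right view, green left empty."""
    return [[(l, 0, r) for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left_gray, right_gray)]


# Tiny 1x2 frames with made-up intensities.
left_view = [[10, 20]]
right_view = [[30, 40]]
print(red_blue_anaglyph(left_view, right_view))
# [[(10, 0, 30), (20, 0, 40)]]
```

Keeping the two captures unfiltered until this step means each view is still available as a normal image on its own.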
This could be very useful when you need to postprocess an image - like apply a segmentation algorithm.
Several segmentation algorithms exist. Usually, they look at the color/brightness of an area and use that to do the segmentation. Adding knowledge of spatial position to an image will help segmentation immensely. I'm not sure that 3 small flashes is enough. The examples provided are not exceptional - the same results could be obtained without that special camera. Nevertheless, the idea is good.
I know it ain't really a 3d mapper but is it a quick way to grab info that could be later given a more in-depth scan? Could this technology be modified to produce a good 3d mapper? What's its claim to fame? Shadow-comparison, right? Length-of-shadow=height-of-object, yes?
The depth edge maps bear a superficial resemblance to phase congruency maps. It's the best edge detection method I've come across, and works on ordinary 2D images. Check out some examples on Peter Kovesi's pages [uwa.edu.au], there's also some code for download.
That's some pretty impressive edge detection, thanks for pointing it out. A related problem is to identify areas in an image that are the same thing. The best algorithm I know of for segmenting images is Leo Grady's brand new iso parametric graph partitioning method. His work is at http://cns.bu.edu/~lgrady/ [bu.edu]. His PhD thesis is probably the best place to start.
This technique sounds like it could be useful for 3d reconstruction problems. The main issue in, for example, shape-from-stereo algorithms is accurately finding depth discontinuities, and that can be nigh on impossible with a textureless, evenly lit surface.
Having said that, I'm not sure whether it would be better than existing solutions for that sort of thing, for example structured light.
Could this tech be used to help robots, or any computer really, better understand its environment visually? As I understand it, one of the problems facing robot optics is the lack of depth perception and identifying object boundaries. If they used optics in the non-visible spectrum and basically walked around with their flashes strobing happily along, would that help these problems? The only problem I see with that is multiple robots' flashes interfering with each other, so maybe it'd only be used sparingly when absolutely needed? Or is this technology completely inappropriate for this application?
One possible implementation would be to use 4 single-wavelength searchlights in different places on the robot. If these were outside the visible spectrum, then they would not be distracting to humans (as multiple flashes would be), and could be used to build an object-overlay. By using the flashes intermittently, the robot could subtract the ambient image from the flash image to remove the effects of other robots' flashes.
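The ambient-subtraction part of that idea is simple enough to sketch. This assumes two grayscale frames of the same scene taken back to back, one with the robot's own flash on and one with it off; the frame values are invented, and clamping at zero is my own choice for handling noise.

```python
def subtract_ambient(flash_frame, ambient_frame):
    """Keep only the light contributed by our own flash.

    Anything present in both frames (room light, another robot's steady
    lamp) cancels out; negative differences are clamped to 0.
    """
    return [[max(0, f - a) for f, a in zip(frow, arow)]
            for frow, arow in zip(flash_frame, ambient_frame)]


flash_on = [[120, 90, 50]]    # hypothetical frame with our flash firing
flash_off = [[100, 95, 50]]   # same scene a moment later, flash off
print(subtract_ambient(flash_on, flash_off))   # [[20, 0, 0]]
```

This only cancels light that is the same in both frames, so another robot's flash going off during exactly one of the two exposures would still leak through.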
Pfft. Red eye? That's two flashes. With four flashes, you need to run the forked tail and horn remover, too.
Well, technically, red eye is avoided with two flashes. One flash surprises the eye and reflects light off it before the pupil has a chance to shrink. Red-eye removal basically takes a "pre-flash" to prepare onlookers for the real picture.
Joking aside, this 4 flash thing does make me think that it's not useable on any targets that are moving at all.
I wonder if such a technology could be used for biometric facial recognition. Since the light sources are internal, it would be relatively simple to get consistent reference points from it.
Also, it would not be *AS* processor intensive, so you could take more photos from more angles.
Using autofocus and a short depth of focus, you can isolate figures even in crowds. Isolate the target from multiple photos, so you have more than one angle for a biometric.
If we can track the target in motion, we can assume that FRONT is approximately the direction they are traveling. Use an IR flash so that people don't get all paranoid (not saying they don't have a reason).
Even with glasses and a beard change it would be tough to fool the system.
I read a book "Phantoms in the Brain" by Neuroscientist V. S. Ramachandran awhile back and he described a case, where a person with brain injury saw in part of her vision everything as "cartoons". He went on to speculate, if I remember correctly, that we all have this cartoon vision under our "real vision".
This came to mind when I read your post after looking at these pictures. Perhaps our brain needs some sort of caricature or simplified image of faces for us to recognize them, but this layer of vision i
by Anonymous Coward
on Wednesday December 01 2004, @07:54AM (#10962140)
For those that are uneducated in graphics, the engine photos show two comparative methods:
The TOP row shows how the camera output is good enough to be used as a technical drawing- it requires very little modification or touch-up.
The BOTTOM row shows how Photoshop filters butcher the image and the result is completely useless. No amount of touch up could help that image.
Furthermore... NO THIS CAMERA CANNOT BE USED ON MONOCHROME IMAGES. It can't be used on any kind of images, and it isn't a post-filter. There isn't any edge detection involved.
The 4 flashes cause shadows to be cast in 4 different directions, and the camera creates a composite from the differences. If the subject DOESN'T cast a shadow, then the camera won't work.
I assume this camera cannot be used to photograph the outdoor scenes, simply because the flashes will not render shadows at that great distance.
This is a brilliant method though, and the results are excellent (look at how the details in the spine pop out).
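To make the shadow-difference point in this comment concrete, here is a one-scanline sketch. It assumes you already have, for one flash, a row of "lit versus shadowed" values between 0 and 1 (for instance that flash's image divided by a shadow-free composite), ordered so that index 0 is nearest the flash. The threshold and the sample values are made up, and this is a guess at the general idea, not the camera's actual processing.

```python
def depth_edges_along_ray(ratio_samples, min_drop=0.5):
    """Return indices where the value falls sharply when walking away
    from the flash, i.e. the first shadowed pixel next to a lit one.

    A cast shadow sits on the far side of an object edge, so a lit-to-dark
    step in this direction marks a depth edge; the dark-to-lit step at the
    other end of the shadow is deliberately ignored.
    """
    return [i for i in range(1, len(ratio_samples))
            if ratio_samples[i - 1] - ratio_samples[i] > min_drop]


# Lit, lit, shadow, shadow, lit, lit (hypothetical values).
row = [1.0, 0.95, 0.1, 0.15, 1.0, 1.0]
print(depth_edges_along_ray(row))   # [2]
```

For a flash on the opposite side, the same function can be run on the reversed row. A subject that casts no shadow yields an empty list, which matches the point above that without shadows the camera has nothing to work with.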
Had to mention this for those who didn't catch it in 2001. Some students in Wisconsin created a Quake II mod that converts the Open GL rendering engine output to non-photorealistic sketches. Looks like the A-ha video in realtime. I'd really like to see someone bring this to more modern 1st-person-shooters like Doom 3 or Quake 3.
Great link -- it hadn't occurred to me, but 3D modelling with simple polygons like those earlier FPS games is probably the easiest application to apply a sketch filter to. Nifty.
Also, there's good news for you -- the page you linked connects to this one [wisc.edu], which is a rough replacement OpenGL driver to postprocess any application's OpenGL calls with any sort of filter... *very* cool stuff, though the page isn't dated, and there's no source, so it's hard to tell if it's still alive. Does have a screencap from
With this, why have only one object in focus? Here's what I mean;
If autofocus (or any other method) from different angles allows for this enhancement, this technique can be used to 'cut' the image into different focus layers.
Piece the layers together, and you get a photo that has depth of field and is much sharper at each level.
The layer information could be stored separately for later processing or combined with only a little fudging to give a weighted blur to the non-primary layer(s). Keeping the layers separate and doing a comparison would also allow editing tricks such as cutting out objects at a specific depth or performing color enhancements on each level.
You are mistaken. The final image is the 'equivalent' photoshop filter, and it's showing what a poor job it does in contrast to the third image, a good image, from the multiflash.
Disclaimer: I am not an expert on graphics technology.
But look at the second image in the final set, it's clearly able to detect the edges of things. I'm not even sure what the filter in the last image is for.
And I'm not sure what you mean by "reproducing what can already be produced". There are other multiple-image processing engines that can do line drawings and even 3d from multiple sources, but the thing is, they all require multiple cameras and calculating the slight offset in objects from different sources.
What's interesting about this new technique is that it uses the shadows from the flashes to determine edges and depth. Doing it entirely with lighting without multiple cameras is a really neat hack, imho.
You'll notice a lot of pro photographers have devices to move the flash further from the lens: either tall stalks with the flash at the end, handheld flash units on wires (to be held arm-outstretched in the non-camera hand), or even RC flash units on tripods several metres from the camera.
Sure, and they usually also have some sort of diffuser or umbrella with their flash. Or they'll bounce their flash off the ceiling for the same effect. Or multiple flashes are set up so that each one fills in the shadows
Creative uses (Score:5, Funny)
I am, for example, very happy with my Motorola v550 cell phone camera, which takes the trashiest but also most colorful unrealistic photos.
Re:Creative uses (Score:5, Funny)
Don't encourage him (Score:5, Funny)
Jolyon
Re:Creative uses (Score:5, Interesting)
Quanta of image processing capability (Score:5, Funny)
So, you mean, this is the tiniest possible improvement over Photoshop filters?
Re:Quanta of image processing capability (Score:5, Funny)
Re:Quanta of image processing capability (Score:5, Funny)
--
Evan
Re:Creative uses (Score:3, Informative)
For its designed purpose, this is brilliant.
Re:Creative uses (Score:3, Informative)
OK, this is stac
Technology runs wild! (Score:5, Funny)
People always ask how we'll know when technology will go too far, and I think we've just found out.
Re:Technology runs wild! (Score:5, Funny)
Or maybe ASCII Starwars & Matrix movies... Oh wait...
Demo Video (Score:5, Informative)
Since the site (Score:3, Informative)
3D Modelling Applications? (Score:2, Interesting)
Just a thought.
Re:3D Modelling Applications? (Score:4, Informative)
Re:3D Modelling Applications? (Score:3, Informative)
And yes, it's pretty damn cool to see. They have lots of computer graphics/computer vision stuff going on here, some of it's pretty funky.
Re:3D Modelling Applications? (Score:3, Interesting)
No, this cannot estimate depth & sees only edg (Score:5, Interesting)
3D applications (Score:5, Interesting)
Food for thought.
Re:3D applications (Score:3, Informative)
Where is -MY- 3D camera? (Score:5, Interesting)
ViewMaster (Score:3, Informative)
Re:3D applications (Score:4, Interesting)
Segmentation (Score:2)
Does this have machine-vision applications? (Score:2, Insightful)
The old 'edge detection' bites again (Score:2, Insightful)
Real world discontinuities, what they mean is, you run an edge detection algorithm on the distance signal.
This will not find edges in newspaper print.
No edge detection system is perfect - even this which uses spatial edges.
There is no real new technology, the multiple flash cameras are amazing and beat any faked edge detection hands down.
I do think they have awesome capabilities to allow computers to do what our eyes do, which is segment and label areas of our vision, a
Phase congruency (Score:5, Informative)
Image Partitioning (Score:3, Informative)
Could be very useful (Score:3, Insightful)
Re:Could be very useful (Score:5, Interesting)
But I don't think it will be useful for 3d reconstruction, since the algorithm doesn't have information about the depth of the shadows/borders.
robot vision (Score:4, Interesting)
Re:robot vision (Score:4, Interesting)
Re:robot vision (Score:5, Funny)
Four flashes? (Score:3, Funny)
Re:Four flashes? (Score:5, Funny)
Re:Four flashes? (Score:3, Informative)
manuals (Score:3, Insightful)
Biometrics (Score:3, Interesting)
Re:Biometrics (Score:3, Interesting)
Finnally! (Score:5, Funny)
geek speculation is way off here. (Score:3, Informative)
NPR Quake (Score:3, Interesting)
Dan East
Nonphotorealistic Quake Mod (Score:3, Interesting)
NPR Quake [wisc.edu].
Re:Nonphotorealistic Quake Mod (Score:3, Interesting)
Win 10000 $ (Score:3, Interesting)
One of them is Canesta [canesta.com], which makes photo sensors that can take pictures that include depth maps.
To my surprise I see that they are running a contest where you can win 10000 $.
But I don't have time to participate myself, because I am working on my master's. So enjoy the contest [canesta.com].
Focus across the whole field of view (Score:3, Insightful)
Re:Very interesting, but stupid (Score:5, Informative)
Re:Very interesting, but stupid (Score:4, Insightful)
Re:Very interesting, but stupid (Score:4, Insightful)
Re:A-ha (Score:3)