
Recovering Data From Noise 206

Posted by kdawson
from the sparse-world-after-all dept.
An anonymous reader tips an account up at Wired of a hot new field of mathematics and applied algorithm research called "compressed sensing" that takes advantage of the mathematical concept of sparsity to recreate images or other datasets from noisy, incomplete inputs. "[The inventor of CS, Emmanuel] Candès can envision a long list of applications based on what he and his colleagues have accomplished. He sees, for example, a future in which the technique is used in more than MRI machines. Digital cameras, he explains, gather huge amounts of information and then compress the images. But compression, at least if CS is available, is a gigantic waste. If your camera is going to record a vast amount of data only to throw away 90 percent of it when you compress, why not just save battery power and memory and record 90 percent less data in the first place? ... The ability to gather meaningful data from tiny samples of information is also enticing to the military."
  • by rcb1974 (654474) <richardballantyne@@@gmail...com> on Tuesday March 02, 2010 @09:27AM (#31328842) Homepage
    The military probably wants the ability to send/receive without revealing the data or the location of its source to the enemy. For example, its nuclear subs need to surface in order to communicate, and they don't want the enemy to be able to use triangulation to pinpoint the location of the subs. So, they make the data they're transmitting appear as noise. That way if the enemy happens to be listening on that frequency, they don't detect anything.
  • Re:Why not... (Score:5, Interesting)

    by eldavojohn (898314) * <eldavojohn.gmail@com> on Tuesday March 02, 2010 @09:31AM (#31328888) Journal

    If your camera is going to record a vast amount of data only to throw away 90 percent of it when you compress, why not just save battery power and memory and record 90 percent less data in the first place? ...

    Because it's hard to know what is needed and what isn't to produce a photograph that still looks good to a human, and pushing that computation down to the camera's sensors, where power is far more limited than on a computer, is unlikely to save either time or power.

    If you read the article, the rest of that quote makes a lot more sense. Here it is in context:

    If your camera is going to record a vast amount of data only to throw away 90 percent of it when you compress, why not just save battery power and memory and record 90 percent less data in the first place? For digital snapshots of your kids, battery waste may not matter much; you just plug in and recharge. “But when the battery is orbiting Jupiter,” Candès says, “it’s a different story.” Ditto if you want your camera to snap a photo with a trillion pixels instead of a few million.

    So, while this strategy might not be implemented in my Canon PowerShot anytime soon, it sounds like a really great idea for exploration, or just limited resources in general. I was thinking more along the lines of making very cheap, low-power, low-resolution cameras, and distributing them with software that takes the images on your computer and processes them into highly defined images.

  • Re:Why not... (Score:5, Interesting)

    by Idbar (1034346) on Tuesday March 02, 2010 @09:42AM (#31328992)
    In fact, it's expected to be used to increase the aperture of cameras. The advantage is that, using random patterns, you can determine the kernel of the blur convolved into the picture, and therefore re-focus the image after it was taken. In regular photography that kernel is normally Gaussian and very hard to de-blur, but by using certain patterns when taking the picture (probably implemented as micro-mirrors), you could easily do this in post-processing.
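
A toy 1-D sketch of why the kernel's shape matters (the sizes, kernels, and Wiener deconvolution below are illustrative choices, not anything from the article): a broadband "coded" kernel keeps energy at all frequencies and inverts cleanly, while a Gaussian kernel crushes high frequencies that no post-processing can recover.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 128
x = np.zeros(n)
x[40:60] = 1.0                              # a simple 1-D "scene"

def blur_then_wiener(x, h, snr=1e4):
    """Circularly blur x with kernel h, then Wiener-deconvolve."""
    H = np.fft.fft(h)
    y = np.real(np.fft.ifft(np.fft.fft(x) * H))       # blurred signal
    W = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)     # Wiener filter
    return np.real(np.fft.ifft(np.fft.fft(y) * W))

# Broadband "coded" kernel: random taps keep energy at all frequencies.
h_coded = np.zeros(n)
h_coded[:8] = rng.uniform(0.5, 1.5, 8)
h_coded /= h_coded.sum()

# Gaussian kernel (wrap-around distance keeps the blur centered at 0).
d = np.minimum(np.arange(n), n - np.arange(n))
h_gauss = np.exp(-0.5 * (d / 4.0) ** 2)
h_gauss /= h_gauss.sum()

err = lambda h: np.linalg.norm(blur_then_wiener(x, h) - x) / np.linalg.norm(x)
print(f"coded kernel error:    {err(h_coded):.4f}")
print(f"gaussian kernel error: {err(h_gauss):.4f}")
```

The coded kernel comes back with much lower error because its spectrum has no deep nulls to regularize away.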
  • by Futurepower(R) (558542) <MJennings.USA@NOT_any_of_THISgmail.com> on Tuesday March 02, 2010 @10:00AM (#31329166) Homepage
    MOD PARENT UP for this: "This algorithm doesn't create absent data nor does it infer it, it just makes the uncertainties it has "nicer" than the usual smoothing."

    Fraud alert: The title, "Fill in the Blanks: Using Math to Turn Lo-Res Datasets Into Hi-Res Samples" should have been "A better smoothing algorithm".
  • Re:Why not... (Score:3, Interesting)

    by Bakkster (1529253) <Bakkster.man@gm a i l . com> on Tuesday March 02, 2010 @10:21AM (#31329412)

    Kind-of.

    This technique takes noisy or incomplete data and infers details that were already captured, but only on a few pixels. So if there's a line or square in the image and you only catch a few pixels of it, the technique can infer the shape from those few pixels. It will enhance detail on forms you can almost see, but it won't create detail from scratch.

    Rather than 'enhancing' the image, a better term would be 'upsampling'. The example used in the article was of a musical performance. This technique could take a 44.1kHz sample of a musical instrument at 8-bit resolution and upsample it to 96kHz and 32-bit resolution. Since instruments create predictable frequencies (aside from percussion, the same frequency is usually present for many times the wavelength) the algorithm can determine which frequencies are present, at which times, and at which amplitude and phase. That information can then be used to 'fill in the gaps' more accurately than normal upsampling (usually done with a Sinc filter [wikipedia.org]). However, it can't recreate information that wasn't recorded in the first place, so if the audio was recorded at 20kHz you would only get output of audio below 10kHz (the Nyquist frequency [wikipedia.org] in this case), although it's conceivable that even more advanced algorithms could infer these frequencies as most instruments have a predictable distribution of harmonics.

    It also seems that most compression algorithms (JPEG, for example) would destroy these bits of detail that the algorithm would use, so raw data is likely to be needed in most cases. I'm just going off my knowledge of DSP and don't know any particulars of this technique beyond this article, but it looks legitimate and very useful as long as you aren't expecting CSI-level miracles.
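
The frequency-inference idea above can be sketched with a toy sparse-recovery solver. This is purely illustrative (the sizes, the cosine-only dictionary, and the orthogonal-matching-pursuit solver are my choices, not anything from the article): keep ~23% of the samples of a 3-tone signal and recover exactly which frequency bins are active.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 256, 3, 60          # signal length, sparsity, random samples kept

bins = np.arange(1, n // 2)   # candidate frequency bins (cosine atoms only,
                              # for simplicity; a real setup adds sine atoms)
true = np.sort(rng.choice(bins, size=k, replace=False))
t = np.arange(n)
x = sum(np.cos(2 * np.pi * f * t / n) for f in true)   # sparse in frequency

idx = np.sort(rng.choice(n, size=m, replace=False))    # random sample times
y = x[idx]                                             # the only data kept

A = np.cos(2 * np.pi * np.outer(idx, bins) / n)        # sensing matrix
A_n = A / np.linalg.norm(A, axis=0)                    # unit-norm columns

def omp(A, A_n, y, n_atoms):
    """Orthogonal matching pursuit: greedily pick the atom most correlated
    with the residual, refitting the kept atoms by least squares each round."""
    residual, support = y.copy(), []
    for _ in range(n_atoms):
        support.append(int(np.argmax(np.abs(A_n.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    return sorted(support)

support = omp(A, A_n, y, k)
print("true bins:     ", [int(f) for f in true])
print("recovered bins:", [int(bins[s]) for s in support])
```

Because the signal is sparse in frequency, 60 randomly placed time samples pin down the 3 active tones; a dense (noise-like) spectrum would not be recoverable this way.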

  • Re:Why not... (Score:3, Interesting)

    by gravis777 (123605) on Tuesday March 02, 2010 @10:30AM (#31329526)

    Truthfully, I was thinking along the lines of taking a high-resolution camera and making it better, rather than taking a low-resolution camera and making it high. My aging Nikon is 7.1 megapixels, with only a 3x optical zoom. There have been times I wanted to take a picture of something quickly, so I didn't necessarily have time to zoom or move closer to the object. After cropping, I may end up with a 1-2 megapixel image (sometimes much lower). For the longest time, I thought I just needed more megapixels and a faster, higher-powered optical zoom. However, looking at the pictures I have, I think: if someone could just come up with something to make this look better... There is usually plenty of detail there for my eye, if something would come in and soften jaggy edges, sharpen the overall picture, and understand textures (such as clothing)...

    Truthfully, with what I just talked about, I am looking for them to implement this in Photoshop so I can clean up some existing crappy photography of mine.

  • Re:Why not... (Score:3, Interesting)

    by Matje (183300) on Tuesday March 02, 2010 @10:38AM (#31329628)

    RTFA; that's the point of the algorithm: the camera sensors don't need to calculate what is interesting about the picture, they just need to sample a randomly distributed set of pixels. The algorithm calculates the high-res image from that sample.

    The idea behind the algorithm is really very elegant. To paraphrase their approach: imagine a 1000x1000 pixel image with 24-bit color. There are (2^24)^1,000,000 unique pixel configurations that could fill that image. The vast majority of those configurations will look like noise. In real life you generally take pictures of non-noise things, like portraits etc. You might define a non-noise image as one where knowing the actual value of a given pixel lets you predict the value of a neighboring pixel with better-than-chance probability. A noisy image is one where knowing a given pixel value gives you no information about neighboring pixels at all.

    The algorithm provides a way to distinguish between image configurations that depict random noise and those that depict something non-random. Since, apparently, the number of non-random image configurations is so small compared to the noisy ones, you need only a couple of hints to figure out which non-random configuration you're looking at. What the algorithm does is take a random sample of a non-random image (10% of the original pixels) and calculate a non-random image configuration that fits the given sample. Even though in theory you might end up with Madonna from a picture of E.T., in practice you don't (and I believe they claim they can prove that the chance of accidentally ending up with Madonna is extremely small).

    It's all about entropy really.
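
That structure-versus-noise distinction can be made concrete by counting how few transform coefficients a structured image needs. A small sketch (the DCT basis, the toy images, and the 99%-energy threshold are illustrative assumptions, not from the article):

```python
import numpy as np
from scipy.fft import dctn

rng = np.random.default_rng(1)
n = 64

# A "non-noise" image: smooth gradients plus a bright square.
yy, xx = np.mgrid[0:n, 0:n]
smooth = np.sin(xx / 10.0) + np.cos(yy / 14.0)
smooth[20:40, 20:40] += 2.0

noise = rng.standard_normal((n, n))   # a pure-noise image

def frac_coeffs_for_energy(img, frac=0.99):
    """Fraction of DCT coefficients needed to capture `frac` of the energy."""
    c = np.sort(np.abs(dctn(img, norm="ortho")).ravel())[::-1] ** 2
    cum = np.cumsum(c) / c.sum()
    return (np.searchsorted(cum, frac) + 1) / c.size

print(f"smooth image needs: {frac_coeffs_for_energy(smooth):.3f} of coeffs")
print(f"noise image needs:  {frac_coeffs_for_energy(noise):.3f} of coeffs")
```

The structured image concentrates almost all its energy in a tiny fraction of coefficients, while the noise image spreads it nearly everywhere; that gap is exactly the sparsity the recovery algorithm exploits.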

  • Re:CSI (Score:3, Interesting)

    by ceoyoyo (59147) on Tuesday March 02, 2010 @10:40AM (#31329650)

    Your AC wish is my command [robbtech.com].

  • Re:Why not... (Score:4, Interesting)

    by girlintraining (1395911) on Tuesday March 02, 2010 @10:45AM (#31329714)

    In fact, it's expected to be used to increase the aperture of cameras. The advantage of this, is that using random patterns you could be able to determine the kernel of the convolving pattern in the picture, therefore, you would be able to re-focus the image after it was taken. In regular photography that kernel is normally Gaussian and very hard to de-blur. But using certain patterns when taking the picture (probably implemented as micro-mirrors), you could, easily do this in post processing.

    You people think in such limited terms. The military uses rapid frequency shifting and spread spectrum communications to avoid jamming. Such technology could be used to more rapidly identify the keys and encoding of such transmissions, as well as to decrease the amount of energy required to create an effective jamming signal by several orders of magnitude across the spectrum used, if any pattern could be identified. Currently, massive antenna arrays are required to provide the resolution necessary to conduct such an attack. This would make the jamming equipment more mobile and more effective at the same time. A successful attack on that vector could effectively kill most low-power communications capabilities of a mobile force, or at least increase the error rate (hello Shannon's Law) to the point where the signal becomes unusable. The Air Force is particularly dependent on realtime communications that rely on low-power signal sources.

    If nothing else, getting a signal lock would at least tell you what's in the air. Stealth be damned -- you get a signal lock on the comms, which are on most of the time these days, and you don't need radar. Just shoot in the general direction of Signal X and *bang*. Anything that reduces the noise floor generates a greater exposure area for these classes of sigint attacks. Cryptologists need not apply.

  • Re:Why not... (Score:3, Interesting)

    by wfolta (603698) on Tuesday March 02, 2010 @10:52AM (#31329802)

    Actually, you don't process and throw away information. You are not Sensing and then Compressing, you are Compressed Sensing, so you take in less data in the first place.

    A canonical example is a 1-pixel camera that uses a grid of micro-mirrors, each of which can be set to reflect onto the pixel or not. By setting the grid randomly, you are essentially doing a Random Projection of the data before it's recorded, so you are Compressed Sensing. With a sufficient number of these 1-pixel images, each with a different random mirror setup, you can reproduce the original image to some level of accuracy, using fewer bits than a JPEG etc. of similar quality. Unlike JPEG, you are not taking in a full set of data and then compressing, so it takes LESS processing power, not more.

    So you save in image transmission bandwidth if the sensor is, say, orbiting Jupiter. And you save energy expended in compressing the image. And you could perhaps afford to make a VERY expensive single pixel imager that has an incredibly wide frequency range, which might be prohibitively expensive, or even impossible to fabricate in a larger array.

    Personally, I think there's a lot of hype to CS, but it's definitely not the same as JPEG/Wavelet/etc compression after taking a full-resolution image.
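
The one-pixel-camera idea can be sketched end to end on a toy 1-D "scene" (the sizes, the ±1 masks, and the iterative soft-thresholding solver below are my illustrative choices, not the actual hardware pipeline):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 200, 60, 5          # scene pixels, one-pixel "shots", bright spots

# Sparse "scene": a few bright points on a dark background.
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.uniform(1.0, 2.0, k)

# Each shot sums the scene through a random mirror mask. Using +/-1 here
# (a physical 0/1 mask works the same once the mean measurement is removed).
Phi = rng.choice([-1.0, 1.0], size=(m, n))
y = Phi @ x                   # 60 numbers stand in for 200 pixels

# ISTA: a basic iterative soft-thresholding solver for the l1 problem.
L = np.linalg.norm(Phi, 2) ** 2        # Lipschitz constant of the gradient
lam = 0.1                              # sparsity weight
z = np.zeros(n)
for _ in range(5000):
    z = z - Phi.T @ (Phi @ z - y) / L                   # data-fit step
    z = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0)  # shrink toward 0

rel_err = np.linalg.norm(z - x) / np.linalg.norm(x)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Note there is no compression step anywhere: the 60 random sums are all that is ever "recorded", and the sparsity prior does the rest at decode time.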

  • by timeOday (582209) on Tuesday March 02, 2010 @11:02AM (#31329936)
    No, not just "nicer." It fills in the data with what was most likely to have been there in the first place, given the prior probabilities on the data. The axiom that you can't regain information that was lost or never captured is, as commonly applied, mostly wrong. The fact is, almost all of our data collection is on samples we already know a LOT about. Does this let you recapture a license plate from a 4-pixel image? No. But given a photo of Barack Obama's face with half of it blacked out, you can estimate with great accuracy what was in the other half.
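
That "prior probabilities" point fits in a few lines (the correlated-Gaussian prior and the numbers here are purely illustrative): if two neighboring pixels are known a priori to be highly correlated, observing one lets you estimate the missing one far better than guessing blindly.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy prior: two neighboring pixels are strongly correlated (rho = 0.95).
rho = 0.95
cov = np.array([[1.0, rho], [rho, 1.0]])
samples = rng.multivariate_normal([0.0, 0.0], cov, size=5000)

# Observe only the first "pixel"; infer the second from the prior.
observed, hidden = samples[:, 0], samples[:, 1]
estimate = rho * observed                 # E[hidden | observed] under this prior

err_prior = np.mean((estimate - hidden) ** 2)   # using the prior
err_naive = np.mean((0.0 - hidden) ** 2)        # guessing the mean blindly
print(f"MSE with prior: {err_prior:.3f}   MSE without: {err_naive:.3f}")
```

With the prior, the expected squared error drops to roughly 1 - rho^2 of the blind guess; the "recovered" half of the photo is exactly this kind of conditional estimate, scaled up.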
  • Re:Why not... (Score:3, Interesting)

    by shabtai87 (1715592) on Tuesday March 02, 2010 @11:47AM (#31330514)

    Amusingly enough, the idea behind compressed sensing (to rephrase for clarity: that a minimal sampling suffices for high-dimensional data that can be described in a much smaller subspace at any given time) has been used to describe neural processes in the visual cortex (V1). [See the Redwood Center for Theoretical Neuroscience, https://redwood.berkeley.edu/ [berkeley.edu]] The lingo used is a bit different from the CS community's, but the math is essentially the same. The point being that compressed sensing could lead to answers a lot more natural for human perception than simply canceling out high frequencies.

    Also, the point is that CS leads to [near-]perfect reconstruction for signals of a certain nature, rather than the fuzziness that comes from other algorithms that do not take the inherent sparsity of the signal into account.
