New Sampling Techniques Make Up For Lost Data 162

An unnamed reader writes: "Professors at Vanderbilt and the University of Connecticut have published a non-uniform sampling theory that could yield better-quality digital signals than the standard uniform sampling techniques pioneered by Shannon at Bell Labs. The Vanderbilt press release and link to the published paper can be found here."
This discussion has been archived. No new comments can be posted.

  • So.... (Score:1, Insightful)

    by PeeOnYou2 ( 539746 )
    Even better sound than what we now know as almost perfect? Great! Makes you wonder how much better it will get in the future, even when we have perfect sound....
    • Re:So.... (Score:5, Informative)

      by pmc ( 40532 ) on Sunday January 20, 2002 @02:45PM (#2872743) Homepage

      As the abstract says

      "The new theory, however, handles situations where the sampling is non-uniform and the signal is not band-limited."

      So it isn't applicable to digital music (as this is band-limited by our hearing, and we can pick the sampling interval) but other signals that cannot be sampled well by regular sampling (either in time or in space). Examples given are seismic surveys and MRI scans. But you knew this as you'd have taken the time to read the linked article first, wouldn't you?
      • Please read again, the classical sampling techniques required the signal to be band-limited, this new one claims to be able to also handle non-limited signals.
      • The reason the usual techniques are band limited is the problem of aliasing (as we all can remember from watching the wagon wheels go backwards in old Westerns). The limitation of our ears makes uniform sampling techniques feasible for digital audio, but that doesn't mean that the new theory isn't applicable to digital audio.
      • It seems to me that you could treat music as not-band-limited. CDs, for instance, start with the "static" assumption that the musical signal will be between 0 Hz and 22050 Hz (44.1 KHz sampling rate, right?). But most music isn't going to hit frequencies near 22KHz, so a lot of that sampling is going to waste. (Most people can't hear below 20-25Hz, but it's not worth the effort to avoid sampling that.)

        As an alternative, I could say that the music will usually be between 0 Hz and 10KHz. Now my sampling rate is cut in half, and if I'm wrong, I can iteratively adjust upward to recover the higher frequency information (if I'm understanding the basics of the paper correctly). This seems attractive to me, at least. Obviously it's too late to change the CD or DVD standards, but maybe some new music format for 3G cell phones, for instance?

        Or am I misunderstanding something here?


        • Actually, the bandwidth used by music is not limited. What humans can hear is limited. What audiophiles think they can hear is not so limited.
          A low-frequency note is shaped by high-frequency components. If a difference in the shape of the lower-frequency note can somehow be detected, then inaudible frequencies still make a difference.
          Normal telephone IIRC cuts off about 3.5kHz.

  • Okay ... I don't mind them using sampling (i.e. guessing) for my CDs and movies ... but please try and be a bit more accurate with my brainscans!
    • Re:Brain scans? (Score:5, Insightful)

      by Hal-9001 ( 43188 ) on Sunday January 20, 2002 @04:07PM (#2873115) Homepage Journal
      Any medical imaging technique can only be so accurate, due to either machine or physical limitations. This defines a maximal meaningful sampling rate or resolution for that imaging modality. For example, positron emission tomography (PET) has a physical resolution limit of 10mm because positrons can propagate up to 10mm from where they are generated before they decay into gamma radiation that can be detected by the machine. With this technique, a doctor can get an image with better than 10mm resolution, something that the machine by itself could never do.

      BTW, sampling doesn't mean that you're guessing. The sampled data points are the actual measured values of the signal at specified points in time or space. You have to sample because there is no way that you could collect all values for the signal for all points in time or space, and there is usually a sampling rate at which point you're collecting more data than you need to accurately represent the signal.
    • Re:Brain scans? (Score:2, Informative)

      by perky ( 106880 )
      Sampling is not guessing. The Nyquist sampling theorem shows that a signal that is sampled at more than twice the frequency of the highest frequency component in the signal can be reconstructed perfectly. With music this doesn't matter though, because humans have bandlimited hearing, so all we have to do is sample at twice the maximum frequency we can hear.
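A minimal sketch of the reconstruction the sampling theorem promises, assuming Whittaker-Shannon (sinc) interpolation on a finite record. The 3 Hz tone, the 20 Hz rate, and the names `sinc` and `reconstruct` are illustrative choices, not anything from the article.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, fs, t):
    """Whittaker-Shannon interpolation: estimate the signal at time t (seconds)
    from uniform samples, where samples[n] was taken at t = n / fs."""
    return sum(s * sinc(fs * t - n) for n, s in enumerate(samples))

# Hypothetical test signal: a 3 Hz sine sampled at 20 Hz (well above 2 * 3 Hz).
fs = 20.0
f0 = 3.0
samples = [math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]

# Evaluate halfway between two samples near the middle of the record.
t = 200.5 / fs
estimate = reconstruct(samples, fs, t)
exact = math.sin(2 * math.pi * f0 * t)
print(estimate, exact)
```

The match is only approximate here because the record is finite; the theorem itself assumes infinitely many samples.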

  • You fill in the missing data
  • by Anonymous Coward
    The paper does not seem that new. It basically is using more modern methods of wavelets and an adaptive filter to deconstruct digital samples. This does not differ too much from current JPEG encoding or MP3 encoding. Such techniques have been used in control systems for a while. For that matter, non-uniform sampling has been in use for a while, for example the telephone system (which the article implied used uniform sampling). The telephone system samples using a uLaw algorithm, though it does occur at regular sample intervals.
    • You're confused. uLaw uses uniform sampling but non-uniform quantising, which makes sense given the human perceptual system. It's similar to using Mel coefficients as the basis of speech recognition or compression.
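To make the distinction concrete, here is a rough sketch of mu-law companding using the continuous textbook formula. Real telephony codecs (G.711) use a piecewise-linear approximation, so treat this as an illustration only; the function names and the 8-bit uniform quantizer are assumptions.

```python
import math

MU = 255  # the value commonly quoted for telephone mu-law

def mulaw_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1] with logarithmic compression."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mulaw_expand(y, mu=MU):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def quantize(y, bits=8):
    """Uniform quantizer on [-1, 1] (illustrative only)."""
    levels = 2 ** bits - 1
    return round((y + 1) / 2 * levels) / levels * 2 - 1

def encode_decode(x, bits=8):
    """Compress, quantize uniformly, expand: the overall steps are non-uniform."""
    return mulaw_expand(quantize(mulaw_compress(x), bits))

# Size of one quantizer step after expansion, near silence and near full scale.
step = 2 / 255
step_near_zero = mulaw_expand(step) - mulaw_expand(0.0)
step_near_full = mulaw_expand(1.0) - mulaw_expand(1.0 - step)
print(step_near_zero, step_near_full)
```

The samples are still taken at regular instants; only the amplitude steps vary, being fine for quiet signals and coarse for loud ones.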
  • I tried a similar technique on a Statistics paper I had to write, and got an F for plagiarism!
  • Could this lead to new data compression schemes for non-detail critical images and other files? A sort of JPG with half detail and half math? It wouldn't be high quality but it would be a fraction of the size, perhaps yet another low-bandwidth video codec?

    Well.. one thing's for sure, if I ever have a doctor reading my brain MRI I sure as hell don't want half of it removed (neither my brain, nor the scan).
    • Could this lead to new data compression schemes for non-detail critical images and other files? A sort of JPG with half detail and half math? It wouldn't be high quality but it would be a fraction of the size, perhaps yet another low-bandwidth video codec?

      Are you even familiar with how JPEG works? It's already half-detail, half-math... JPEG images involve a significant amount of math and statistical trickery (in throwing away data). However, all the math used in standard signal processing (image and audio compression fall in this category) makes the basic assumption that the signal is sampled at a uniform rate... a lot of the current techniques wouldn't apply without tweaking...
    • by Leeji ( 521631 ) <slashdot@le[ ] ['eho' in gap]> on Sunday January 20, 2002 @03:56PM (#2873068) Homepage

      Along your point, there's actually a technique that uses the self-similarity of images to help compress them. For example, you might have seen the "Sierpinski Triangle." You can generate this image with a few very simple recursive move/resize/draw operations.

      Fractal compression uses this technique on abstract images. It aims to find a set of operations (sometimes very large) to generate any given input picture. It's very cool, and you can get more information (including example pictures) at this page. []

      The "state of the art" of fractal compression beats JPEG compression at some compression ratios, but loses at others. It's also interesting that a fractally-compressed image has no implicit size (i.e., 640x460), so it enlarges MUCH better than simple image enlargement.
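The "few very simple operations" idea can be sketched with the so-called chaos game: three shrink-by-half maps, applied at random, trace out the Sierpinski triangle. This is only the decoding half of fractal compression (finding the maps for an arbitrary photo is the hard part), and the function names and ASCII renderer are made up for illustration.

```python
import random

# The three maps of the Sierpinski triangle: each one shrinks the plane by
# half toward one corner of the triangle.
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]

def sierpinski_points(n, seed=0):
    """Generate n points on the attractor with the 'chaos game'."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0  # start on a corner, which already lies on the attractor
    points = []
    for _ in range(n):
        cx, cy = rng.choice(CORNERS)
        x, y = (x + cx) / 2, (y + cy) / 2  # move halfway toward the corner
        points.append((x, y))
    return points

def render(points, width=64, height=32):
    """Rasterize the points to ASCII art at any requested resolution."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = min(int(x * width), width - 1)
        row = min(int((1 - y) * height), height - 1)
        grid[row][col] = "*"
    return "\n".join("".join(r) for r in grid)

points = sierpinski_points(20000)
print(render(points))
```

Because the picture is stored as maps rather than pixels, `render` can be called with any `width` and `height`, which is the "no implicit size" property mentioned above.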

      • Fractal compression is a cool idea, and it can achieve incredible compression rates. Unfortunately it hasn't quite panned out in the real world.

        One main problem has been that no one has found an efficient way to create the PIFS functions for an arbitrary image. So fractal compression can take a long time and is non-deterministic (i.e. you can't tell ahead of time how long it will take).
        Another problem is that Barnsley et al. hold patents on many of the techniques used. Until its performance makes it a clear winner, why pay royalties?

        It's been a couple of years since I paid close attention to fractal compression, but I haven't heard of anything that changes the above problems.
  • by michaelmalak ( 91262 ) <> on Sunday January 20, 2002 @03:19PM (#2872897) Homepage
    Think about computer displays. Would you ever want to have to deal with non-square pixels? Sometimes, yes, like in the CGA days where the goal was to display 80 columns while keeping memory and bandwidth costs down. In general, it's a PITA. Now multiply that pain by not only having non-square pixels but where the pixels also come in various sizes.

    What's the practicality of this? Well, spiral MRIs [], for example, where for mechanical reasons you don't want to have to stop-and-start the very heavy "scanner", wasting time and jarring sensitive equipment. As I said, niche applications.

    As for compressing audio, there are already plenty of other psychoacoustic compression schemes -- whether non-uniform sampling is better or worse will likely depend on the application.

    • Actually the width value of 80 for CGA display goes back to the punchcard days, not, as you state, to trying to keep memory and bandwidth costs down.

      And, I'm still trying to figure out what you mean by non-square pixels. Are you trying to say the physical size on the screen, or how they are stored in memory on the graphics adaptor?

      If these guys have the ability to return useful data from non-reporting areas I can see a whole range of non-niche applications - and real-world applications where data recovery would be useful.
      • And, I'm still trying to figure out what you mean by non-square pixels. Are you trying to say the physical size on the screen, or how they are stored in memory on the graphics adaptor?
        The pixels that make up a CGA image aren't square...they were drawn on a 640x200 grid. The pixels on a VGA display at most resolutions are square (1280x1024 is the most common exception)...for instance, (1024/4)/(768/3)=1. With CGA, (640/4)/(200/3)=2.4, which means it's stretched vertically.
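The arithmetic above can be checked with a small helper, assuming the image fills a 4:3 screen. The function name and the use of `Fraction` are illustrative choices.

```python
from fractions import Fraction

def pixel_aspect(h_pixels, v_pixels, display_w=4, display_h=3):
    """Return (pixel height) / (pixel width) for a mode that fills a
    display_w:display_h screen. A result of 1 means square pixels."""
    pixel_width = Fraction(display_w, h_pixels)
    pixel_height = Fraction(display_h, v_pixels)
    return pixel_height / pixel_width

for mode in [(1024, 768), (640, 200), (1280, 1024)]:
    print(mode, float(pixel_aspect(*mode)))
```

A value above 1 means each pixel is taller than it is wide, which is the CGA case.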
    • MRI (magnetic resonance imaging) involves no moving parts. I think you meant "Spiral CT".
    • Actually, digital video uses non-square pixels.

      PAL DV has pixels slightly taller than square, and NTSC DV has pixels slightly shorter than square. It makes editing on a square-pixel PC monitor a bit weird, because the images look stretched or squashed.

      I had to recode an NTSC DV tape for PAL once, which was a total PITA. Different frame rate, different resolution, horrible smeary colour... Never again.
    • It's essentially a POINT - it has no dimensions. When you see those little squares you actually see a poor (and fast) representation of pixels - pixels themselves are not square or non-square. Pixels won't come in various sizes, they'll still be regular 0-sized points.

      Here's a good paper on why it's important to keep in mind the true nature of pixels (by Alvy Ray Smith):

      A Pixel Is Not A Little Square, A Pixel Is Not A Little Square, A Pixel Is Not A Little Square! (And a Voxel is Not a Little Cube) []
  • Nyquist, not Shannon (Score:5, Informative)

    by s20451 ( 410424 ) on Sunday January 20, 2002 @03:21PM (#2872903) Journal
    It was at Bell Labs ... but the guy who developed the Uniform Sampling Theorem [] was Nyquist, not Shannon.
  • As a Computer Engineering Major, this tosses a significant portion of my degree out the window, but, I suppose, it's a good thing. Aliasing (Java) [] and Folding (no link) were always a pain.
  • by KjetilK ( 186133 ) <> on Sunday January 20, 2002 @03:23PM (#2872915) Homepage Journal
    The article was really short on details, I think, so I found it very hard to understand what was new about this. Some time ago, Prof Jaan Pelt (who is also going to be the referee of my thesis), gave a really mind-blowing lecture about non-uniform sampling. Shortly thereafter, I posted a message to the Vorbis-dev mailing list [] about this stuff.

    In fact, you're not limited by the Nyquist frequency when you are sampling non-uniformly, so it has some strengths in that respect. However, there has to be more to it than this for it to be news. Can anybody who understands this better than I provide any insights?

    • The article was really short on details, I think, so I found it very hard to understand what was new about this. Some time ago, Prof Jaan Pelt (who is also going to be the referee of my thesis), gave a really mind-blowing lecture about non-uniform sampling. Shortly thereafter, I posted a message to the Vorbis-dev mailing list [] about this stuff.

      So this is a classical expansion on the variable bit rate sampling as is done on MP3 files now. The only difference is that this is done on bitmap files in place of sound files.
      • by rhh ( 525195 )
        The variable bit-rate in MP3 compression does not alter the amount of time between each sample. In terms of sampling frequency, MP3, even VBR, is still uniform in time. VBR changes how many bits are in a sample, not the time between samples.
      • No, no, no. It has nothing to do with it.

        Variable bit-rate, if I have understood it correctly, works like this: say you have a period in the sound file that is very quiet; then you don't need many bits to represent it well. Therefore, you don't use many bits per sample, and you save some space.

        You still sample regularly. For example, if you sample with a 44.1 kHz sampling frequency, then you take a sample every 1/44100 of a second, about 0.023 milliseconds, at exactly regular intervals.

        This stuff is different. Instead of taking a sample with exactly the same interval, you sample at random, or you sample every now and then. The number of bits you have for each sample is a completely different matter, that may or may not be variable.

        The funny thing is that you can actually use this to reconstruct the signal much better in many cases, which is pretty counterintuitive when you think about it! (until you've thought much about it, because then it makes a lot of sense... :-) )

    • Can anybody who understands this better than I provide any insights?

      We have a lot of powerful tools (such as Fourier transforms) for analyzing precisely sampled data. For example, sound is often sampled at a precise frequency - about 44 kHz.

      The problem is that sometimes the data available isn't spaced regularly. This makes most analysis techniques throw fits. He's come up with tools to use here that do a good job of taking irregular data and returning a very good estimation of the values everywhere.

      If you're familiar with Fourier transforms, this is a more generalized version.

  • by bokmann ( 323771 ) on Sunday January 20, 2002 @03:25PM (#2872927) Homepage
    About 7 years ago, I was involved in a research project, trying to use video teleconferencing and doctors for remote diagnosis of patients.

    We found that jpeg compression of images made medical diagnosis unreliable. Hairline fractures in x-rays are exactly the kind of small details that tend to get washed away in 'lossy' compression, and the banding caused can lead to false assumptions as well.

    The article suggests that this is still a lossy compression with small amounts of data loss. I know Doctors that would take that admission as a condemnation of the technique.
    • The article suggests that this is still a lossy compression with small amounts of data loss. I know Doctors that would take that admission as a condemnation of the technique.

      From what I read, the paper does not represent a compression technique, but a better way to fill in the missing data between samples, especially when the samples are nonuniform, or samples are missing. This would allow you to remove data for storage/transport and recover a similar image later, or as it would probably be used with medical imaging, to recover data lost during the imaging process due to sampling and quantizing error. In the second case, the fracture shouldn't be lost if done correctly.

    • The article suggests that this is still a lossy compression with small amounts of data loss.

      Nope. Re-read the article. It's not a compression scheme. If anything, it's the reverse, an expansion scheme. It takes all the available data and does a good job of filling in the gaps. It even works when the available data isn't arranged nice and neatly.

      Used in the right context it would make things like hairline fractures MORE visible. You wouldn't usually use it in video teleconferencing though.

  • by Anonymous Coward on Sunday January 20, 2002 @03:26PM (#2872929)

    Hereby I donate the following algorithm to the public. It's called GNU-squat.

    Step 1:
    Non-uniformly sample your favorite music using just 1 bit. This will of course take up at least 8 bits on your hard disk but let's not nitpick. The good part is you don't even need special hardware to sample the music, just enter if the music is loud (1) or soft (0).

    Step 2:
    Use the Vanderbilt mathematical routines to extrapolate the rest of the data, and presto: the complete song re-appears from just one bit of data.
  • by markj02 ( 544487 ) on Sunday January 20, 2002 @03:26PM (#2872933)
    Doctor to patient, after looking at the reconstructed images: "Ah there is the problem. The cause of your headaches is that you have a bunch of inch-long bony spikes sticking out of your neck, plus a bunch of holes in your skull."
    • Some of the "filling in the dots" indeed produced long spikes that are *not* in the original. In the sample chosen, that was not a problem because we know that is not how the (normal) skull goes.

      However, in some other image settings this might not be the case. For example, where there is a lot of linear-dimensioned information that tends to go by the same grain as the pixels.

      They might be pulling some wool over our eyes by picking samples that minimize the downsides of their algorithms.

      Perhaps they should focus on esthetics improvement, such as music and clipart and not on domains where you can get your ess sued off if somebody dies from a misleading image.

      (troll mode on)

      This kind of reminds me of the OOP books which tend to show change patterns that OOP seems to benefit, but completely ignores change patterns which tend to get messy under OOP.

      (troll mode off)
  • by fatboy-fitz ( 182286 ) on Sunday January 20, 2002 @03:35PM (#2872981) Homepage
    example. It was not provided to show a compression mechanism in which the original image could be compressed. It was intended to show that if you sample randomly, then their algorithm can come up with a highly accurate representation of the original. The implication here is that given current capability to sample, if you apply the new technique, you can get a better image/audio recording using their technique, than you can using the current fixed sampling interval technique, making the image more vivid, or the musical recording more lifelike than current sampling provides.
    • If their point is that they can better reconstruct the original image from non-uniform samples, then it would have been more interesting to see a comparison of reconstructions: in particular, take one MRI image with random pixels missing, and a second MRI image with the same number of missing pixels but arranged in a regular pattern, such as a grid.

      However, I suspect their point is that they can reconstruct the original at all with non-uniform sampling. This is useful in cases when it is not feasible to obtain fixed samples.
    • The implication here is that given current capability to sample, if you apply the new technique, you can get a better image/audio recording using their technique, than you can using the current fixed sampling interval technique, making the image more vivid, or the musical recording more lifelike than current sampling provides.

      48kHz 24 bit is all you need to generate a perfect reproduction of audio as far as the human ear is concerned. These days, audio in the pre-amplified stage is about as good as it's going to get, because it's already as good as the human ear.

      Non-uniform sampling, if it really improves matters (which I doubt in the case of audio), can't improve on what's already perfect.

      Just to emphasise: by perfect I mean the theory says that none of the distortions generated are even close to what a human ear can hear. This is also true in practice.

  • near-perfect zooming (Score:1, Interesting)

    by ShadeARG ( 306487 )
    If it is possible to use these mathematics techniques to replace all of the unknown parts of an image, then why not resize the image a few times larger than the original and save the random parts of it? This would allow the algorithms to fill in even more detail to each image relative pixel. Upon resizing the image to 2x the original size, you would find much better clarity and precision than just resizing the image without.

    On a side note, you could apply random color-relative noise on to the entire zoomed image before you save the random parts, then it might pick up the slack of the algorithm placing the same bordering colors over the unknown pixels.

    If they consider digital music captured with this set of algorithms near-perfect, then near-perfect zooming is just around the corner.
    • Formats supported by Eastman Kodak Company-

      PhotoCD works with a differential 'error' image that was created by comparing the resampled to the original, and then that was compressed. Effect? Take a small image, blow it up by a factor of 2x, apply this itty bitty 'error' transform, and you have a nearly perfect 'fixed' image for the cost of some small change on disk space

      Then there is the 'much better clarity' etc statement- there's 'inverse point transform' for getting out defects.. they used that on the Hubble Telescope. Looked pretty good for being wildly out of focus.

      Everything you've mentioned is already available... the technique looks interesting but it's all data dependent ... given enough training data you can make a GA to give 'guesses' into any dataset.
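The "blow it up, then add a small error image" idea can be sketched in one dimension. This is only loosely modelled on the description above and is not the actual PhotoCD format; the function names and the sample data are hypothetical.

```python
def downsample(signal):
    """Keep every second sample."""
    return signal[::2]

def upsample(small, length):
    """Blow up by 2x, filling the gaps by averaging the neighbouring kept samples."""
    out = []
    for i in range(length):
        left = small[i // 2]
        right = small[min(i // 2 + 1, len(small) - 1)]
        out.append(left if i % 2 == 0 else (left + right) // 2)
    return out

def encode(signal):
    """Store the half-size signal plus the error of the blown-up guess."""
    small = downsample(signal)
    guess = upsample(small, len(signal))
    residual = [s - g for s, g in zip(signal, guess)]
    return small, residual

def decode(small, residual):
    """Blow up the half-size signal and apply the stored error."""
    guess = upsample(small, len(residual))
    return [g + r for g, r in zip(guess, residual)]

original = [10, 12, 14, 17, 16, 40, 64, 66, 68, 69]
small, residual = encode(original)
restored = decode(small, residual)
print(small, residual, restored)
```

The residual is mostly zeros when the interpolation guesses well, which is why it compresses to very little.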
  • When you reconstruct a function from sampled data, there are an infinite number of possible reconstructions. That issue is resolved by making certain assumptions about the functions you are reconstructing. An assumption of band-limited data is useful because it approximates what happens in many communications systems and (perhaps more importantly) because it leads to simple and efficient algorithms (some comment about only having hammers and everything looking like nails is in order).

    There are already many other methods for reconstructing functions from sparse, non-uniformly sampled data, so this paper doesn't solve an unsolved problem. Rather, it provides one more solution under a set of assumptions that are mathematically a bit more like those of the original sampling theorem.

    Will it be useful? That's hard to tell at this point. I think it will take a lot more work to figure out whether this method is any better than existing methods on real-world problems, whether its application can be justified in real problems, and whether it leads to algorithms that are practical. It may also turn out that the method is closely related to methods already in use in other fields; for example, the kinds of function spaces they study have received some attention in neural networks, but the authors cite no papers from that work and may not be aware of it.

  • Time vs. Frequency (Score:5, Informative)

    by PingXao ( 153057 ) on Sunday January 20, 2002 @03:59PM (#2873084)
    Classical techniques also require that the original signal be "band limited" - a technical term meaning that the signal must stay within certain, defined limits.

    This is not quite accurate. The original signal is not "required" to be band-limited. Rather, it is accepted that frequencies outside of your design bandwidth will not be captured. The signal can stray outside of the "defined limits", but should it do so that information will be lost. Furthermore, Fourier's math tells us that a signal that is limited in time is unlimited in frequency, and a signal that is limited in frequency is unlimited in time. This has important ramifications. The biggest - and most obvious - is that all man-made signals are limited in time and therefore unlimited in frequency. Ergo there will ALWAYS be information lost no matter what bandwidth you design for.

    Now to read the rest of the article - it sounds intriguing...
    • The original signal is not "required" to be band-limited. Rather, it is accepted that frequencies outside of your design bandwidth will not be captured.

      Well, that's not entirely accurate either. The presence of frequencies outside of the design bandwidth will lead to aliasing. The reconstructed signal will have additional low-frequency energy that should not be there.

      In practice, we often use analog "anti-aliasing" lowpass filters to band-limit the signal before sampling.

    • by madsatod ( 535808 )
      You're right about the Fourier-stuff.
      But I think you misunderstood the "band limited" thing.
      When you sample you have to filter out frequencies above the Nyquist-freq., if you want to avoid aliasing-problems.
      Aliasing comes from the mirroring of the spectrum around n*Fsample. So if you don't want your original signal to get distorted when sampling, you have to use an anti-aliasing filter, that "band-limits" the signal to below Fsample/2.
      Does this new technique mean, you can skip anti-aliasing filters?
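The mirroring around n*Fsample can be written as a two-line folding rule. A minimal sketch; the function name is an illustrative choice.

```python
def alias_frequency(f, fs):
    """Apparent frequency (Hz) of a tone at f Hz after uniform sampling at fs Hz."""
    f = f % fs             # the sampled spectrum repeats every fs
    return min(f, fs - f)  # and mirrors around fs / 2

for tone in (1_000, 12_000, 25_000):
    print(tone, "->", alias_frequency(tone, 44_100))
```

Anything the function maps to a different value than its input is what the anti-aliasing filter has to remove before sampling.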
      • The article being extremely light on technical details, I think what was meant is that by non-uniform sampling intervals (deliberately jittering your sample time, but _knowing_ the actual time each sample was taken), you can dodge the aliasing problem. That is, although your average sample rate is (say) 20K/sec, so a 12 KHz sine wave would alias to 8 KHz, you have samples taken at other intervals that reveal that an 8 KHz wave won't fit.

        I'm not sure if this is new at all. Some digital scopes will attempt to hit higher effective bandwidths by shifting the time of starting sampling at each sweep, so as to fill in between the dots of the first sweep. This only works if the signal you are measuring is absolutely regular, and the triggering (detection of the start point in the signal) is perfect...
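A rough sketch of the jitter argument, assuming a 20 kHz average rate and a 12 kHz tone, which folds to 20 - 12 = 8 kHz under uniform sampling. The jitter range, the seed, and the names are arbitrary illustrative choices, and this is not the paper's reconstruction method.

```python
import math
import random

FS = 20_000.0          # average sample rate, Hz
F_TRUE = 12_000.0      # tone above the 10 kHz Nyquist limit
F_ALIAS = FS - F_TRUE  # 8 kHz: where it folds to under uniform sampling

def tone(freq, times):
    """Cosine of the given frequency evaluated at each sample time."""
    return [math.cos(2 * math.pi * freq * t) for t in times]

def max_difference(times):
    """Largest gap between the true tone and its alias at the given sample times."""
    a, b = tone(F_TRUE, times), tone(F_ALIAS, times)
    return max(abs(x - y) for x, y in zip(a, b))

rng = random.Random(1)
uniform_times = [n / FS for n in range(200)]
# Jitter each sample by up to half a period, but record the actual time.
jittered_times = [(n + rng.uniform(-0.5, 0.5)) / FS for n in range(200)]

uniform_gap = max_difference(uniform_times)    # tiny: tones are indistinguishable
jittered_gap = max_difference(jittered_times)  # large: known times tell them apart
print(uniform_gap, jittered_gap)
```

On the uniform grid the two tones produce the same numbers, so no later processing can separate them; with known jittered times they disagree, which is the information a non-uniform method can exploit.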
    • by pslam ( 97660 )
      This is not quite accurate. The original signal is not "required" to be band-limited. Rather, it is accepted that frequencies outside of your design bandwidth will not be captured.

      Two of the other replies point out that this isn't quite right - the frequencies outside of Nyquist just alias. However, this can actually be used to your advantage if you know that a signal lies within a narrow band of frequencies centered around a high frequency.

      For example, you can perfectly sample a signal confined to 1.0-1.1MHz using a sampling rate of just 200kHz, instead of 2.2MHz. What's even more interesting is that you can play this 200kHz sample back and get the same signal in the 1.0-1.1MHz band you had originally, but along with aliases all over the rest of the spectrum. In this case, you need bandpass anti-aliasing filters and not lowpass bandlimiting ones.
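The 200 kHz figure can be checked against the standard bandpass (under)sampling condition 2*f_high/n <= fs <= 2*f_low/(n-1) for integer n. A minimal sketch; the function name is hypothetical.

```python
def bandpass_sampling_ranges(f_low, f_high):
    """Return the (min_rate, max_rate) intervals, in Hz, of uniform sampling
    rates that keep a signal confined to [f_low, f_high] free of self-aliasing."""
    bandwidth = f_high - f_low
    ranges = []
    n_max = int(f_high // bandwidth)  # largest usable fold count
    for n in range(1, n_max + 1):
        lo = 2 * f_high / n
        hi = 2 * f_low / (n - 1) if n > 1 else float("inf")
        if lo <= hi:
            ranges.append((lo, hi))
    return ranges

ranges = bandpass_sampling_ranges(1_000_000, 1_100_000)
for lo, hi in ranges:
    print(lo, hi)
```

For this band the lowest allowed window collapses to the single rate of exactly 200 kHz, so a real design would leave some margin and sample slightly faster.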

  • Nothing new here. (Score:2, Insightful)

    by gewalker ( 57809 )
    I was making up missing data for lab reports twenty years ago. It filled in the gaps well enough to fool the teachers :)

    News article was, as usual, totally lacking in technical details. But they did link to technical articles at the bottom of the story.


    I skimmed the technical article (heavy math alert), and the results seem to be along the lines that: given an irregular (and possibly noisy) sample of data, reconstruct a function that gives smoothed (continuous, not discrete) approximation for entire data set.

    There is some nice mathematics that make it suitable for such purposes. The algorithms are selected to limit number of terms and guarantee convergence, and are computationally efficient. If you think of it as fancy interpolation, you are not far off the mark from what I saw.

    This is not to disparage the efforts here (it looks to be quite useful in several domains), but it is a technique for generating a smooth, continuous function to represent a set of non-uniform samples. It cannot magically find missing results that were not evident in the limited sample data.
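For a feel of what "fancy interpolation" of non-uniform samples means, here is a deliberately plain stand-in: piecewise-linear interpolation over irregular sample times. It is far simpler than the paper's method and is not meant to represent it; the names and data are made up.

```python
from bisect import bisect_right

def interpolate(sample_times, sample_values, t):
    """Piecewise-linear estimate at time t from samples at irregular, sorted times."""
    if t <= sample_times[0]:
        return sample_values[0]
    if t >= sample_times[-1]:
        return sample_values[-1]
    i = bisect_right(sample_times, t)  # index of the first sample strictly after t
    t0, t1 = sample_times[i - 1], sample_times[i]
    v0, v1 = sample_values[i - 1], sample_values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

times = [0.0, 0.7, 1.1, 2.9, 3.0, 4.6]
values = [0.0, 1.4, 2.2, 5.8, 6.0, 9.2]  # samples of the line v = 2 t
print(interpolate(times, values, 2.0))
```

Note the wide gap between 1.1 and 2.9: any narrow feature that lived there is simply absent from the samples, and no interpolant can bring it back.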

    The author
  • I was hoping someone would finally come up with a good method to use in lunzip (see this [] for more details. In short, it's a superior compression utility, at least for certain jobs, like prepping your computer for the FBI).
  • Some Clarifications (Score:3, Interesting)

    by dh003i ( 203189 ) <> on Sunday January 20, 2002 @04:20PM (#2873167) Homepage Journal
    From what I've read, some people seem to be thinking this is some kind of "magic bullet". For example, one comment, which emanated stupidity, was titled something like, "Infinite Zooming" and the implication of the post was that it might be possible with this method to "zoom in" on an image and accurately reconstruct the image. In other words, the idea is you could zoom in on a tiny head on a photograph and accurately reconstruct all of the details.

    This, my friends, is complete nonsense. You cannot zoom in on an image and accurately reconstruct further details. To imply that this is possible is to imply that you can add accurately representative data where there was none before.

    As for "zooming technology" it is possible to better reconstruct a zoomed-in image, though not any more accurately. For example, when I go into MS Paint and zoom in, it simply blows up all the pixels as larger blocks. This clearly is not good. You could create some kind of algorithm to determine the "shapes" of sharp edges, as well as where gradients were, and scale those up when zooming in...for example, a small circle can be composed of four pixels -- such a technology would scale this up, not as four very large blocks, but as a circle.

    But this involves assumptions about what the original pattern was representative of. Was it representative of a circle, or of four large blocks seen from a distance? So you're not really adding data, but just attempting to "zoom in" on an image "better" based on a set of good assumptions which generally work.

    Such a thing could be accomplished. Indeed, it already has been accomplished -- in us. When we look at a small photograph and want to draw a poster from it, we don't draw a large, blocky, pixelated image. We are able to tell what things -- such as freckles -- are details to be scaled up in our drawing; what things are gradients -- such as a dark to light gradient going from the near to the far side of a forehead -- to be scaled up and gradiated; and what are sharp borders, to be kept sharp -- such as the sides of one's face.

    However, even this amazing system we have of reconstructing larger images from smaller ones cannot add detail where there is none. If a woman is freckled with tiny freckles, they won't be visible from 10 feet away; a picture taken from that distance won't show them, and if we wanted to make a portrait of her head based on that picture, we wouldn't know to add freckles.
  • Look at the sample image. Most of the details are reconstructed correctly. But some errors like the spikes are obvious.
    If you use a classic technique like interpolation through splines, diff the images and remove the gross errors created by this new method, the result might be quite convincing.
  • For starters, if human threshold of hearing tapers off at around 20khz (it's actually closer to 18khz where at 20khz most audio is fully attenuated... but anyways)...

    How will a "new and improved" method of sampling help me hear audio I can't hear anyways?

    Nyquist proved that with uniform sampling at 2/T you will lose no spectral information between DC and 1/T.

    Somehow I think this is more "Magic Ph.D" material than real science.

    • You make a good point, but there's actually a considerable amount of debate in the recording world as to the acceptability of 44.1 KHz / 16-bit audio (aka CD-quality). My own hardware records up to 48 KHz / 24-bit, and there's gear out there that will go up to like 96 KHz, for making DVD-audio or some freakin' thing.

      Now, 44.1 KHz / 16-bit is just fine for me, but I can at least consider the idea that there are things happening in the frequencies above 22.05 KHz (the top frequency 44.1 can record) that have some effect on us even if we can't consciously hear them. Well, fine, but I'm not going to record everything at 96 KHz and increase all my audio file sizes to 218% of their current size just so that SuperAudioFileMan can hear the dog whistle in the background. But if I can get a variable sampling scheme that will grab some extra frequencies when the source material's spectral content warrants it, and maybe even sample below 44.1 when the tympani solo comes along, that works for me and is at least an improvement for the hypertreble freaks.
      • I was reading somewhere (can't remember where) that although we can't hear above 20Khz, sounds that are above that range will lower in frequency when they bounce around the room and fall into some people's hearing range.

        CDs sampled at 44khz miss some of these sounds and that is what audiophiles complain about when they say digital audio sounds flat.
    • Don't forget that there's more to digital audio than the faithful representation of a mono signal. Differences between the signals of two or more channels contribute to the overall spatial image of the recording. Sometimes this phenomenon takes the form of a phase difference between two channels in the range of a few microseconds, which is easier to reproduce at higher sampling rates. More importantly, problems with A-D and D-A conversion are more easily solved at higher sampling rates.
    • Nyquist was talking about aliasing of the input signal. If you sample a 220 Hz sinusoidal wave at 440 Hz, then output it through a linearly interpolating DAC, you will hear a triangular wave. In other words, there is aliasing of the output signal.

      If you are sampling audio at 44100 Hz, then an 8000 Hz tone will only be sampled at about 5 spots in its cycle. Although the frequency information of that 8000 Hz tone is retained, the actual waveform is lost. Exactly what the reconstructed waveform will look like is up to the DAC.

      Whether the human ear can hear the difference at higher sampling rates is another question, however.
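
      The "about 5 spots" figure is easy to check, and the same arithmetic shows where a too-high tone lands once it aliases. This is a rough sketch only; both function names are invented here, and the folding rule assumes ideal uniform sampling with no anti-aliasing filter in front.

```python
def samples_per_cycle(tone_hz, rate_hz):
    """How many samples land inside one period of the tone."""
    return rate_hz / tone_hz


def alias_frequency(tone_hz, rate_hz):
    """Frequency a pure tone appears at after uniform sampling (folds around rate/2)."""
    folded = tone_hz % rate_hz
    return min(folded, rate_hz - folded)


spots = samples_per_cycle(8000, 44100)       # the 8000 Hz example from the post
in_band = alias_frequency(8000, 44100)       # below rate/2, so it is unchanged
out_of_band = alias_frequency(30000, 44100)  # above rate/2, so it folds back down
```

      With these numbers the 8000 Hz tone gets a bit over five samples per cycle, and a 30000 Hz tone would show up at 14100 Hz if nothing filtered it out first.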

  • by dstone ( 191334 ) on Sunday January 20, 2002 @05:25PM (#2873427) Homepage
    I have a question/theory about nonuniform sampling rates. Okay, sticking with a 44kHz sample rate, will you hear the differences between 8, 16, and 24 bit samples? Yes, of course. It's common in digital audio to use 16 bit samples to save space, not because it's the ultimate sample size. (While it's arguable the 44kHz rate side of the equation is pretty darn good.) It's subjective and some ears don't need any "more" audio information to be happy, but I see the choice of sample size as more of a variable than the "provable" sufficient rate for 20kHz audio cutoff being 44kHz. All I'm saying is that there is potentially audible information below 20kHz that isn't getting encoded and recreated not because of sample rate, but because of sample size. For example, if my source material didn't "need" 44kHz throughout a song, could the sample rate be trimmed back in places while the sample size was increased? In the end, it's all just a stream of x samples per second, y bits deep. So if a new sampling technique allows us to reproportion (optimize) those two dimensions in the same amount of overall space, it's possible that better audio will result. Thoughts?
    • Very keen and in fact techniques like that are used in Vorbis sound compression. It has lower frequencies separated from higher, and the higher are recreated using a number to set the sensitivity and a number to set the sample. At least that is how I took it from the people in #vorbis.
    • For example, if my source material didn't "need" 44kHz throughout a song, could the sample rate be trimmed back in places while the sample size was increased?

      Interesting idea. The reason that we use 44kHz as the standard sampling rate is that most people's hearing frequency cutoff is at about 20kHz, and hence the Nyquist sampling theorem shows that we need to sample at 40kHz or more. Add a little bit to account for the fact that anti-aliasing filters aren't infinitely steep and we get 44kHz. So the real question is, in music are there blocks in which the highest frequency is below 20kHz? Then ask whether the reduced quantization noise brought by using larger sample sizes is audible.

      Any audiophiles care to comment?

      • Actually, "half the sampling rate" is a bit too optimistic. You could, theoretically, record a 22050Hz wave at 44100Hz, if the peaks and troughs of the wave were exactly in sync with your sampling. Realistically, you start getting diminishing returns at around (IIRC) 1/4 the sampling rate.

        While decreasing the sample rate would give you some savings, if you tried to get a smaller file size than an MP3, your maximum frequency response would probably be less than 3000Hz.

        • Realistically, you start getting diminishing returns at around (IIRC) 1/4 the sampling rate.

          That's not true. The whole point is that if a signal is sampled at frequency f, then it can be reconstructed perfectly if its bandwidth is less than f/2. Go learn the maths instead of making vague statements that you think must be right intuitively, but which you actually don't know about.
          • If you know a method for turning the data points 0, 0.7, -1, 0.7, 0, -0.7, 1, -0.7, 0, 0.7, -1... into a perfect analog reconstruction of a 16537Hz sine wave, I'd love to hear about it.

            And my point about resampling vs. MP3 still stands, regardless.
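
            For reference, the textbook answer to the question above is Whittaker-Shannon (sinc) interpolation: every sample contributes a sin(pi x)/(pi x) pulse, and the pulses sum back to the original band-limited wave. Below is a minimal sketch, assuming a 44100 Hz rate and a tone at exactly 3/8 of it (16537.5 Hz, which is what produces the 0, 0.7, -1, 0.7 pattern). The `half_width` cutoff is an arbitrary choice made here; the exact formula needs infinitely many samples, so a truncated sum is only approximate, and a real DAC uses a finite filter rather than anything like this loop.

```python
import math

SAMPLE_RATE = 44100.0
TONE_HZ = 16537.5  # assumed: exactly 3/8 of the sample rate


def sample(n):
    """Value of the tone at sample index n."""
    return math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)


def sinc(x):
    """Normalised sinc, the impulse response of an ideal low-pass filter."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(t, half_width=2000):
    """Estimate the signal at time t (in sample periods) from nearby samples."""
    center = round(t)
    return sum(
        sample(n) * sinc(t - n)
        for n in range(center - half_width, center + half_width + 1)
    )


first_samples = [round(sample(n), 1) for n in range(8)]
halfway = reconstruct(0.5)  # a point midway between two samples
true_value = math.sin(2 * math.pi * TONE_HZ * 0.5 / SAMPLE_RATE)
```

            With a few thousand neighbours on each side, the midpoint estimate lands close to the true sine value even though the raw sample points look nothing like a smooth wave.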

          • That's actually not true. The problem is phase. Imagine a sine wave f of frequency 1, so that f(0)=0, f(0.25)=1, f(0.5)=0, f(0.75)=-1, f(1)=0.

            Now if you sample it at frequency 2, you will get a great reconstruction if you sample at time 0.25 and 0.75. However, you will get a much worse reconstruction if you sample at time 0 and 0.5. The phase interaction between the samples and the signal becomes more noticeable the closer the signal is to the Nyquist frequency.

            Now I think you owe the previous poster an apology. A little humility wouldn't be out of place.
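
            The two cases above are quick to check numerically. A small sketch (the function name is invented here); note that this samples at exactly twice the signal frequency, which is the boundary case the sampling theorem excludes, since it requires the bandwidth to be strictly less than half the rate.

```python
import math


def sample_sine(signal_hz, sample_times):
    """Read a unit sine of the given frequency at the listed instants."""
    return [math.sin(2 * math.pi * signal_hz * t) for t in sample_times]


lucky = sample_sine(1.0, [0.25, 0.75])   # lands on the peak and the trough
unlucky = sample_sine(1.0, [0.0, 0.5])   # lands on both zero crossings
```

            The lucky phase captures the full swing from +1 to -1, while the unlucky one records what looks like silence, up to floating-point noise.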
          • It can be reconstructed perfectly -- if it continues repeating itself exactly forever at a rate less than 1/2 the sample rate. Because if the repetition rate is f/sec and the sample rate is (2f+1)/sec, then eventually the samples will cover every part of the waveform. But real music doesn't work this way (except maybe for Yoko Ono and bad church choirs), it's continually changing, and anything too close to 22KHz may not be adequately rendered in the number of samples taken before the sound ends or changes.

            For an extreme example, consider a 21.9KHz tone that only goes for 1 cycle. The samples received may be two points at the top and bottom, in which case reconstruction will be pretty close. Or it may be two points at (almost) the zero crossing, so it appears that there is almost no sound.
    • Nope - the sampling accuracy and quantization is only going to affect the accuracy to which you can reconstruct the component frequencies. Whether or not your sampling is capturing given frequency components is a matter of the sampling rate (or more generally - as is applicable here in the case of non-uniform sampling - the minimum inter-sample delays). Higher sampling rate will only gain you higher frequency components; the lower frequency components are already going to be there unless you deliberately chose to lose them via a high pass filter.

      Regarding 16 bit vs 24 bit "samples", note that there's a difference between sampling accuracy and the number of bits to store your quantized samples. The two are only the same if you're using linear quantization and thus, for example, storing your 24-bit accuracy sample "itself" (i.e. linearly quantized into 2**24 discrete steps). Linear quantization is rather wasteful as the human hearing system does not have equal discrimination at all volume levels, so you might want to quantize more roughly at higher volume levels something like this:

      (0) (1) (2) .. (10 11) (12 13) ... (20 21 22) (23 24 25) etc

      So you could sample at 24 bits to capture additional detail at low volume and yet non-linearly quantize to store your samples in 16 bits without losing that detail.
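
      One well-known way to do this kind of "coarser steps at higher levels" quantization is mu-law companding, used here purely as a stand-in; it is not necessarily the grouping sketched above. The snippet compares plain linear 16-bit rounding with mu-law followed by 16-bit rounding for a very quiet input. MU = 255 and the 32767-step code range are assumptions picked for illustration.

```python
import math

MU = 255.0      # assumed companding constant
LEVELS = 32767  # largest magnitude of a signed 16-bit code


def linear_roundtrip(x):
    """Quantize x in [-1, 1] to 16 bits with equal-sized steps, then decode."""
    return round(x * LEVELS) / LEVELS


def mulaw_roundtrip(x):
    """Compress with mu-law, quantize to 16 bits, then expand again."""
    sign = -1.0 if x < 0 else 1.0
    compressed = math.log1p(MU * abs(x)) / math.log1p(MU)
    code = round(compressed * LEVELS)          # the stored 16-bit value
    y = code / LEVELS
    return sign * math.expm1(y * math.log1p(MU)) / MU


quiet = 1e-5  # far below one linear 16-bit step
linear_error = abs(linear_roundtrip(quiet) - quiet)
mulaw_error = abs(mulaw_roundtrip(quiet) - quiet)
```

      For the quiet value the linear quantizer rounds to zero and loses the signal entirely, while the companded version keeps it with a much smaller error. The price is larger steps near full scale.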
    • by Anonymous Coward
      Interesting post. If you push this line of reasoning, you come up with high-level compression methods like MP3. Think about it-- all MP3 says is, instead of representing the next one second of audio as 44100 16-bit numbers, represent it as some function of time f(t), which (if you're lucky) takes much fewer bytes to store. There is no reason that you couldn't encode very high frequency sounds this way, say even higher than 96kHz. However, I think the real problem is the lack of such quality on input.
    • First of all, sampling rate implies sampling size. A "sample" is meant to represent the value of a signal over a period of time, not at an instant in time. Consider the following situation. A 44100-th of a second segment of waveform enters an ADC chip. Imagine that the signal has a very high value over this entire duration, except for a brief instant in the middle. It is at this point that the ADC takes a sample. What results is a sample which is not a very good representative of that portion of wave.

      This is why ADCs do not just sample the incoming voltage -- they integrate over a period of time, to "boil down" the voltage over that time period to an average value, that best represents what the signal was doing during that sampling period.

      Now, moving on to your point, which is to vary the sampling rate according to the characteristics of the source; this is somewhat a wasted effort, since in order to determine the source characteristics, you must perform some type of frequency analysis, or autoregression. This is intensive computation, and you would be better off spending that time doing some real compression, such as spectral quantization, or perceptual coding.

      Varying the sampling rate from sample-to-sample would be the ultimate, if it were possible to gain anything from it. Unfortunately, if you vary the sampling rate at each sample, then in order to transmit the sampled stream you must transmit not only the samples, but the duration between samples as well. In the worst case you have doubled your data rate, not compressed it.

      However, as you say, this could work wonders for the fidelity of the sampled signal. Instead of sampling at regular time intervals, we could build a predictive ADC that samples only when the predicted signal value becomes different from the actual by some predetermined amount. Then, send two values: the sample itself, and the duration since the last sample. This works because the DAC which converts the signal also does interpolation. It would be possible to keep the error arbitrarily small, no matter what the characteristics of the signal, up to the limits of the ADC chip itself.
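
      The predictive scheme in the last paragraph can be sketched in software. This is only an illustration under stated assumptions: the input is an already densely sampled signal rather than a real analog voltage, the predictor is a straight line through the last two kept samples, and all names are hypothetical.

```python
import math


def predictive_sample(times, values, threshold):
    """Keep a sample only when linear extrapolation misses it by more than threshold."""
    kept = [(times[0], values[0])]
    for t, v in zip(times[1:], values[1:]):
        if len(kept) == 1:
            predicted = kept[0][1]  # no slope known yet, so hold the last value
        else:
            (t0, v0), (t1, v1) = kept[-2], kept[-1]
            predicted = v1 + (v1 - v0) / (t1 - t0) * (t - t1)
        if abs(v - predicted) > threshold:
            kept.append((t, v))
    return kept


def as_value_gap_pairs(kept):
    """Turn kept (time, value) points into (value, time since previous sample)."""
    pairs = [(kept[0][1], 0.0)]
    for (t_prev, _), (t, v) in zip(kept, kept[1:]):
        pairs.append((v, t - t_prev))
    return pairs


times = [i / 1000 for i in range(1001)]  # one second at 1 kHz
ramp = predictive_sample(times, times, threshold=0.01)
sine = predictive_sample(times, [math.sin(2 * math.pi * t) for t in times], threshold=0.01)
ramp_pairs = as_value_gap_pairs(ramp)
```

      A linear ramp collapses to two kept samples, while a sine needs more but still far fewer than the dense input. The cost mentioned above is visible in `as_value_gap_pairs`: every stored value now carries a time gap alongside it.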

      • A "sample" is meant to represent the value of a signal over a period of time, not at an instant in time

        No, no, no. A sample is an instantaneous value, not an integration. The reason why sampling is not sensitive to very short events (short compared to the sampling period) is that there is normally an (anti-aliasing) low-pass filter before the sampling operation.
        As for transmitting the value and "time it stays the same", I'd suggest you first get more familiar with the sampling theory before innovating...
        • I think we are just using different terminology, not talking about different concepts. I just meant that the signal is smoothed out over the sampling period, and the sample value is representative of the entire period, not just a momentary voltage. I certainly did not mean "integrate" in the sense of an integrating filter.

          I wasn't talking about "the time the signal stays the same," I was talking about the time period over which the prediction error reaches some threshold value. For example if I am using a first-order linear predictor for digital-to-analog conversion, and the signal changes linearly from 0 to 1 over a period of 1 second, then I only have to send two samples during that one second period in order to completely describe the behavior of the signal.

          • First, the anti-aliasing filter doesn't just "smooth the signal over the sampling period", it has a much wider effect. Theoretically, it would be a perfect low-pass filter: "sin(t)/t", which has an _infinite_ response (Of course, practically we approximate infinity with something smaller ;-) ).

            As for using a "first-order linear predictor for digital-to-analog conversion", the cost/complexity of building a good analog linear predictor would far exceed any gain you'd otherwise have...
            • Thanks for enlightening me. I'm fairly familiar with the mathematical side of DSP but I don't know much about what's actually happening in the guts of an ADC/DAC. So are you saying that the input signal is filtered by a perfect low-pass filter over the entire time domain, instead of on a per-sample basis? I suppose this makes sense, doesn't it :)
              • The exact LP filter is implementation-dependent, but it should cut everything that's above half the sampling frequency. Then, "instantaneous" sampling is made.

                A sampled signal is represented by equally spaced impulses (delta) of various amplitudes, which is the same as multiplying the low-passed signal by an impulse train.
    • All current high compression ratio audio compressors work in frequency space, and to some extent, perception "space." In frequency space if some frequencies are nonexistent, they are not encoded. It doesn't really matter what your original sampling frequency was.

      Also 16 bits is quite enough, although not very well used. Nowadays most CDs are published with high average volume to have sort of an upper hand in broadcasts (check classical music titles for comparison); a better approach is keeping only the peaks of the music near the highest representable number, because the high-average volume approach severely limits dynamic range. 65536 different volume steps is quite enough for the human ear, you are not supposed to hear any difference beyond that for processed music (for raw recordings, it is better to have higher resolution.)

    • Good idea. I don't claim to be an expert in this, but I believe in order for this to work you would have to sample the signal multiple times simultaneously with the different bandwidths (or have some sort of master signal which you could resample). For example, imagine 3 A/D converters at 0-44kHz, 0-22kHz and 22KHz-44KHz. If both the higher and lower frequencies contain data, then use the 0-44kHz data. If the 22KHz-44kHz is empty, then a flag would be inserted into the "data" to use the 0-22kHz data for this time period. BUT, in order to get 16 bits out of the lower 0-22kHz data you would need to have 16 bit accuracy in the range, or put another way, 32 bit converters throughout, which begs the question of simply storing the data as 32 bit in the first place, and then using lossless compression for the parts without the higher frequencies. So, in effect, I believe your idea to be a good one, but implementing it for realistic applications might better be served by oversampling, and then dithering or compressing to the desired storage size. IMHO. As far as whether or not we or audiophiles could hear the difference, this is an on-going debate. I would say that there are very few people in the world who, played music recorded at 16 bit and the exact same music at 32 bit (or 1000 bit for that matter), could distinguish them. Many people claim to be able to hear the difference, but put them in a room with high quality audio components and play one or the other at random 50 times, I'd bet that most of them couldn't get it right much more than half the time. 16 bit 44.1kHz audio is pretty damn good.
  • On a totally unrelated note, I took differential equations from Akram Aldroubi. On the first day of class when we all sat down he said, "Welcome to French 101." Scared the hell out of a class full of engineers that haven't seen anything to do with humanities/arts courses in a while. If you ever have to take math at VU this guy is really good at explaining it.
  • by Anonymous Coward
    There are quite some examples in math how non equidistant sampling methods can vastly improve the order of accuracy, let's think about quadratures (numerical Integration):

    Integrating a function f(x) from a to b means measuring the area below the graph. So the first estimation would be to split the interval from a to b into equidistant parts and sum up the area of the rectangles below or over the graph (that would be about f(x_n)*h, where h is the width). This method is called a Riemann sum; the closely related iterated trapezoidal rule uses trapezoids instead of rectangles.

    But you could also try to plot piece-wise polynomials through these equidistant points and calculate the areas below. This would yield better (order) results; these methods are then called the iterated Simpson's or Milne rules. But if you go higher than polynomials of 4th degree, you will get to methods that could compute negative integrals of positive functions, which does not make sense. The reason is that high order polynomials tend to "oscillate" or "run out of bounds" at the end of the intervals. Thus these "Newton-Cotes" methods of equidistant sampling points are of limited capabilities...

    But if you drop the assumption that you need to take equidistant (uniform) sampling points, you will get to far better methods: With Gaussian Quadratures the sampling points are far more dense at start and end of the intervals and thus the interpolating polynomials yield far better order results!

    Thus if you know what you are going to use your data for, then you can always find better sampling methods to optimize for your needs. IMO it really doesn't make sense to simply sample the voltage of the signal at equidistant time frames when trying to digitally represent sound! Whereas "lossy compressions" like ogg or mp3 drop information that is less interesting, this equidistant 44kHz sampling just drops anything that does not fit into this sampling; it's kind of a "brute-force" method. And if you then compress to ogg or mp3 it's the same problem as why you should never convert mp3s to ogg... It can (and will) only get worse.

    If you are interested in that quadrature methods then read "Numerical Analysis" by "Kendall E. Atkinson" Chapter 5.
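
    The quadrature comparison above fits in a few lines. A minimal sketch, using the integral of exp(x) over [0, 1] as an arbitrary test case chosen here: both rules evaluate the function three times, but Simpson's rule uses equally spaced points while three-point Gauss-Legendre puts its outer points at the standard non-uniform positions.

```python
import math


def simpson(f, a, b):
    """Simpson's rule: three equally spaced points (both ends and the middle)."""
    mid = (a + b) / 2
    return (b - a) / 6 * (f(a) + 4 * f(mid) + f(b))


def gauss_legendre_3(f, a, b):
    """Three-point Gauss-Legendre rule: nodes at the centre and +/- sqrt(3/5) of the half-width."""
    half = (b - a) / 2
    center = (a + b) / 2
    offset = half * math.sqrt(3 / 5)
    return half * (
        5 / 9 * f(center - offset) + 8 / 9 * f(center) + 5 / 9 * f(center + offset)
    )


exact = math.e - 1  # integral of exp(x) from 0 to 1
simpson_error = abs(simpson(math.exp, 0.0, 1.0) - exact)
gauss_error = abs(gauss_legendre_3(math.exp, 0.0, 1.0) - exact)
```

    Same number of function evaluations, but the non-uniform rule comes out orders of magnitude closer on this example, which is the effect described above.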
    • equidistant 44kHz sampling just drops anything that does not fit into this sampling

      But unless you first oversample and then selectively reduce the sample rate, you do not know which are the "detailed" parts that warrant a higher sampling frequency and which aren't. Secondly, since you refer to music, human hearing is bandlimited anyhow, so there is no point reconstructing frequencies outside of our perceptual range.

  • Take a look at the triad of MRI images in this article. If you look at the image on the left, it appears to have been scaled up about 2-3x from the original size. If you zoom in on it, you can see that the smallest represented detail in the picture is about 3 pixels across. It looks like they just imported the MRI into Photoshop and did a Bicubic scale to 300%!

    They then remove 50% of the data in the second picture, and proceed to mathematically reconstruct it in the third. In my mind, this would be a great feat, except for two things:

    - More than 50% of the data was unnecessary to present the data in the first place. The original is quite obviously scaled up from its native size.

    - The mathematical reconstruction introduces artifacts that were not even present in the random image, such as huge horizontal pixel smears.

    Can someone point to a better demo of this set of algorithms?

  • I looked through the paper quickly, and it is a survey of existing techniques. The benefits of non-uniform sampling have long been known. Current low-end graphics hardware uses non-uniform sub-sampling grids to give better anti-aliasing results.

    It was shown in the 70's or early 80's by A. Ahumada that the human eye uses a non-uniform distribution of rods and cones (outside the fovea) because it can give better frequency response than a uniform grid (given the same number of cones over a given area).

    In short, while this paper makes good reading, don't think that it represents a breakthrough in the field.
    • by Anonymous Coward
      Thanks for your post. I was beginning to think that
      I dreamt some college lectures (=sad). I would have sworn that nonlinear sampling was part of some courses at college.

      Luckily, I can now rest in peace knowing that I was just a regular student who sometimes was not entirely sober enough to remember all details...
  • ...where the object is to be excited about the new theorem/method/sampling technique without mentioning any details about how it works... I always get a laugh when I see things like "Our theory - which is based on a lot of beautiful new mathematics". I can just imagine the reporter: "tell us about your new mathematical theorem without mentioning anything at all about how it works!"
  • Impressive, but... (Score:2, Insightful)

    by mrjb ( 547783 )
    Looking at the 'restored' pic I see only 'horizontal' distortion; imagine how well the picture would have been restored if they had applied their maths in *two* dimensions...
  • I finally found a better explanation of the new sampling theory. It has to take repeated passes at the same analogue data. First pass is sampled at regular intervals, as usual. This data is analyzed, then on the second pass areas where the data changed fast are sampled at a higher rate. Repeat if needed...

    This will usually give results similar to scanning at the maximum sample rate, then "compressing" by throwing out data points where the values are not changing much -- you need less RAM, but the maximum digitizer speed is the same, and you have to replay the analog data somehow. For instance, in an MRI, the multiple scans might mean holding the patient in the machine longer. That's not good, and enough RAM to hold everything isn't going to add much to the cost of the machine. Also, there is one condition where the results could be different -- if a detail such as a hairline fracture is so fine that it might be entirely missed between the points on the first coarse scan. If you scan at maximum resolution first, you won't miss that.
