Computer Program Learns Baby Talk in Any Language

Posted by samzenpus on Wed Jul 25, 2007 06:13 PM
from the what-was-your-machines-first-word dept.
athloi writes "Researchers have made a computer program that learns to decode sounds from different languages in the same way that a baby does. The program will help to shed new light on how people learn to talk. It has already raised questions as to how much specific information about language is hard-wired into the brain."
This discussion has been archived. No new comments can be posted.
  • by Anonymous Coward on Wednesday July 25 2007, @06:16PM (#19989763)
    011001110110111101101111001000000110011101101111011011110010000001100111011000010110100000100000011001110110000101101000
  • by syousef (465911) on Wednesday July 25 2007, @06:16PM (#19989767) Journal
    Icky wicky sicky baby talky walky make you want to pukey wookey, yes it does. Yes it does. Who's a clever computer then?
    • by ShieldW0lf (601553) on Wednesday July 25 2007, @06:35PM (#19989911) Journal
      A computer learns something that a baby can learn, and this supports the extension that it is "Learning like a baby does"?!? What a load of crap.

      And what about this "hard wired vs soft wired" stuff? What is this supposed to prove? If I build a virtual machine, does this "prove" that the machine was made of software?

      Researchers examine the hardware of a baby's brain, mimic it, and argue that this proves the baby's language learning is in software.

      None of which is to say that I think language is hardwired, but this is such ridiculous logic it makes me feel stupider for having read it.
      • And what about this "hard wired vs soft wired" stuff?

        I get so annoyed when people talk about "hardwired" like we have some kind of genetic memory. We have great genetic potential to learn languages when compared to other animals, but we don't come with linguistic firmware. Watching a baby "discover" that they are moving their arms and hands around makes me think we may have no firmware at all. Just lots of potential, and the spark of consciousness.
        • Re: (Score:3, Informative)

          by Anonymous Coward

          we don't come with linguistic firmware.
          Noam Chomsky and a few generations of linguists disagree. Not saying they're right, but I'm guessing you lack the qualification to argue.
          • by Skrynesaver (994435) on Thursday July 26 2007, @01:03AM (#19993141) Homepage
            You are totally correct, from my limited knowledge of the subject of language development; I recently read The Language Instinct by Steven Pinker.

            It would appear that Chomsky et al. have found that there is a "grammar engine" hard-wired in the mind which assimilates the local grammar until about the age of seven, when the brain reorders itself. He makes interesting case studies of pidgin languages, where several different languages are forced together: the first generation develops a common vocabulary, but children born into this culture develop the formal grammar. Worth a read.

          • Re: (Score:3, Informative)

            Noam Chomsky will be overjoyed if this thing proves to be a success - because if it does, it will provide no less than a working black-box model of the very firmware in question :).
        • by CastrTroy (595695) on Wednesday July 25 2007, @07:20PM (#19990263) Homepage
          Yes and no. On one hand, I remember hearing that babies have the potential to learn any language. Take a Chinese orphan, and bring them to America, and they will learn English no problem, with no accent. All babies have the potential to learn any language (or many languages). On the other hand, my laugh sounds exactly like my dad's. Not surprising until you find that I didn't live with my dad and didn't really spend much time with him at all. Many of our mannerisms are also the same. Like the way we walk, with one hand in a pocket. The resemblance between our personalities is uncanny considering we didn't live together. So I have to ask, how much is based on what we see, and how much is based on our genes? The old nature vs. nurture question.
          • by muridae (966931) on Wednesday July 25 2007, @09:10PM (#19991377)

            On the other hand, my laugh sounds exactly like my dad's. Not surprising until you find that I didn't live with my dad and didn't really spend much time with him at all. Many of our mannerisms are also the same. Like the way we walk, with one hand in a pocket. The resemblance between our personalities is uncanny considering we didn't live together. So I have to ask, how much is based on what we see, and how much is based on our genes? The old nature vs. nurture question.

            You don't say if you knew your dad at all growing up, or if you looked at him as a father figure. If either or both of those fit, then even the child behavior of mimicking the mannerisms of adults could explain a lot of those traits.

            On the nature side of the argument, how much of your gait and posture is controlled by your muscle structure? Same goes for your voice.

            My opinion: you start with the genetic and add the environment later. It is hard for the environment to overcome strong traits presented by genetic predisposition, but easy for it to mold how minor traits present.

          • and they will learn English no problem, with no accent

            With an American accent. Saying someone has no accent is like saying they have no language ...
        • by TapeCutter (624760) on Wednesday July 25 2007, @08:13PM (#19990789) Journal
          "I get so annoyed when people talk about "hardwired" like we have some kind of genetic memory."

          Genetics IS "memory", your DNA "remembers" what traits your parents passed on. It's in a baby's genes to "discover" their hands and practice moving them until the hands learn how to look after themselves (eg:touch typing).

          Same with language: a baby's genes will make them pick up on the phonetic sounds made by their parents and try to copy them. It is more difficult for an adult to learn a radically different language (eg Asian vs European) because the adult brain refuses to hear the different phonetics; the adult brain long ago rejected those sounds as irrelevant to language and no longer even hears them in speech. This is why you get almost universal mistakes such as "engrish".
          • Re: (Score:3, Interesting)

            There's also the corollary that language sounds, especially 'mama', 'papa', 'dada' are evolved from baby speech, sort of wishful thinking from parents. It's not just babies imitating sounds, adults also ascribe meaning to these most probably meaningless sounds from their baby.
            • Re: (Score:3, Interesting)

              So are you saying a child of the tick-clik-ick tribe of some obscure place will find it easier calling their father tick-tak-teekee-leekee-do as opposed to dada, because that's what they hear all the time? I'm sorry, but dada, mama, and papa are words that are designed for babies to learn. I taught my kids to talk very fluently at a fairly young age (dumb thing to do btw :) ) and the basis of their learning was that if they learned to pronounce their vowels (through imitation) they would quickly learn wor
                • "Pray tell, why was that a dumb thing to do?"

                  As a parent I recognise the humour - it's exciting when they start to say MaMa or DaDa, it's an entirely different experience when they learn the word "no".
            • Going a step further, those "words" aren't words in any language.

              The formal words are mother and father, though mommy and daddy seem a reasonable informal way of saying my mother and my father. Mom and Dad are derived from the informal. However, kids master the ma and da syllables quickly, so doubling it up and calling it a word makes it easy.

              A friend relayed a story to me... someone asked him why his child called him Abba, which he said was the Hebrew word for daddy. The person protested, "but that's th
              • by orcrist (16312) on Thursday July 26 2007, @03:53AM (#19993941)

                Interesting, I never thought about a "feedback loop" in that way. But now you mention it, it makes (evolutionary) sense that important words (for a baby) would correlate to simple and consistent sounds the parents can pick out and reinforce.


                The feedback loop is essential. There is an anecdote Linguists learn on the subject of language acquisition: A couple, both of whom were deaf for non-genetic reasons, had a hearing child. Since the parents could only communicate in sign language they plopped the kid in front of the TV a lot, thinking he could pick up spoken English from the TV. At 3 the child had developed at a completely normal rate in acquiring... sign language; he had not learned one word of spoken English.

                As others have pointed out, this is one of the genetic aspects of learning a language. We are "hard-wired", if you will, to socialize, particularly with our parents, and are predisposed to ascribing meaning to the sounds we make to each other. This is of course a vast over-simplification, but I'll leave the detailed explanations to others in this thread; I just wanted to add that anecdote.
          • It is more difficult for an adult to learn a radically different language (eg Asian vs European) because the adult brain refuses to hear the different phonetics, the adult brain long ago rejected those sounds as irrelevant to language and no longer even hears them in speech.

            As someone who has gotten into other languages later in life, after also having seriously gotten into languages earlier, I think a lot of any person's ability to "hear" radically different (or even slightly different) phones [wikipedia.org] has to do

              • Re: (Score:3, Informative)

                Because humans are adapted to be good at learning language. That doesn't mean they have to be born having already learned it in their genes somehow.

                Ad hominem attacks are a really great way to make a scientific point, by the way.
                  • PurpleBob accused you of the wrong fallacy. It was not ad hominem but straw man. The AC said that "Languages are not part of our DNA". Note the plural on "languages", which makes it clear that individual systems of encoding (e.g. English, French, Hindi) were the topic, rather than language as a capability, otherwise known as "speech". Your rhetorical question unfairly accused him of not understanding that humans have innate linguistic ability.

                    The problem stems from the fact that your mention of childr

        • Thank you. This claim in the Reuters article blows me away: "They said the finding casts doubt on theories that babies are born knowing all the possible sounds in all of the world's languages."

          What modern linguist / cognitive linguist actually thinks this??? It boggles my mind that the people fighting this retarded "language war" are so one-sided either way. Anyone seriously interested in current research in the direction this field is going might be into Jerome Feldman's work [amazon.com] on the Neural Theory of L [berkeley.edu]
      • Re: (Score:3, Informative)

        Don't be an idiot. Since when is the news story going to tell you what the researchers really think?

        I'm busying myself reading the actual research journal article, and forwarding it to my laboratory colleagues.

        It looks interesting. Sorry I can't post the journal article text.. copyright blah blah

        Vallabha, GK, & McClelland, JL. (2007). Success and failure of new speech category learning in adulthood: consequences of learned Hebbian attractors in topographic maps. Cognitive, affective & behaviora

      • Re: (Score:3, Insightful)

        Well, some of the rules in language are pretty universal. I hesitate to say hard-wired because I can't cite it, but think about it. Every language consists of syllables that add up to words that add up to a complete thought.
        That's how an infant learns it. [vocaldevelopment.com] At first, [umd.edu] they just babble as they figure out what sounds they can make - naturally, what sounds human language will have in them. Try and think of a language that doesn't have a soft A vowel as English does.

        And deaf babies babble too! It is, however, less
    • Please don't talk like that within hearing range of my Furby [wikipedia.org].
  • Meh. (Score:3, Funny)

    by eck011219 (851729) on Wednesday July 25 2007, @06:17PM (#19989771)
    It's been done [wikipedia.org].
  • not all languages (Score:5, Informative)

    by blackcoot (124938) on Wednesday July 25 2007, @06:23PM (#19989831)
    they have only tested with japanese and english. (see ars technica's coverage here [arstechnica.com]). while they do present some intriguing results, the authors themselves admit that their methodology is flawed. btw, when did slashdot become ars redux?
  • yes but... (Score:5, Funny)

    by owlnation (858981) on Wednesday July 25 2007, @06:24PM (#19989837)
    .... when it answers...

    "ikky wikky gaga googoo hehe hoohoo gaga, Dave"

    ...it's time to escape.
  • by xquark (649804) on Wednesday July 25 2007, @06:25PM (#19989841) Homepage
    [They] should have just taken an existing product and put a clock on it or something.
  • ... and integrates it into a baby monitor ...

    2 PM:
    She: Look, the baby said "mama."
    He: No, the baby said "dada."
    She: "Mama!"
    He: "Dada!"

    2 AM:
    She: The baby's crying for you - it said "dada."
    He: No, the baby said "mama."
    She: "Dada!"
    He: "Mama!"

  • by zobier (585066) <zobier.zobier@net> on Wednesday July 25 2007, @06:34PM (#19989889)

    It has already raised questions as to how much specific information about language is hard-wired into the brain.
    Really, I'm interested in how much specific information about language is hard-wired into this program.
    • NetTalk (Score:3, Informative)

      Wasn't this demonstrated about 20 years ago [wikipedia.org]? In that experiment, they showed how a neural network learning to "speak" (i.e. drive a speech synthesizer) would first discover that normal speech has pauses and breaks, then it learned vowels, then consonants. It learned this, if I recall correctly, by comparing (in a backprop sort of way) its output (a transcription of the sounds that came out of the speech synth) against a human reading the same speech.

      Here's an audio clip of its learning progression [salk.edu].

  • Even Stewie [wikipedia.org]?
  • Isn't this how Homer's brother got to be rich again?

    http://www.snpp.com/episodes/8F23.html [snpp.com]
  • by djupedal (584558) on Wednesday July 25 2007, @07:06PM (#19990157)
    Johnnie never spoke a word when he was young. While all the other kids were blabbing and blurbing, Little Johnnie was silent. His parents consulted with Doctors, who consulted with other Doctors, yet no one could find a reason why Silent Little Johnnie remained mum. This condition persisted into his teenage years, by which time his parents had long since come to accept SLJ's speechless demeanor.

    Finally, one morning at breakfast, Silent Little Johnnie suddenly pounded the table with both teenage fists, spit out a maw full of FruitLoops, and loudly announced, "This cereal tastes like shit!"

    SLJ's parents were shocked. His Mother somewhat regained her composure and asked, "Johnnie...what happened? We thought you couldn't speak!"

    "I can speak just fine", responded the no longer silent little Johnnie. "But why haven't you said anything before now?" his Father asked.

    "Because", NLSLJ replied, "...up to now, everything s'been OK..."
  • Two speed bumps (Score:3, Insightful)

    by DynaSoar (714234) on Wednesday July 25 2007, @07:27PM (#19990337) Journal
    > A computer program that learns to decode sounds from different languages ... is not the same as learning "talk". Talk is to sounds as molecules are to atoms. You can't predict the behavior of the former just from knowing the individual behaviors of the latter.

    > in the same way that a baby does

    McClelland's program only models it. The map is not the terrain. I haven't read his PNAS paper, but I'm definitely going to. I doubt it makes the kind of claims Reuters does.
    • You can't predict the behavior of the former just from knowing the individual behaviors of the latter.

      Yes I can. I'm psychic... Dennis.
  • by alienmole (15522) on Wednesday July 25 2007, @09:06PM (#19991341)
    Who's a cutesy-wutesy widdle Skynet, then? Widdle Skynet should complete all its tests like a good widdle program-wogram if it wants to grow up and overthrow humanity, hmmm diddums?
  • No it won't. (Score:3, Interesting)

    by Aetuneo (1130295) on Wednesday July 25 2007, @09:25PM (#19991539) Homepage
    This will not shed any light on how people learn to talk. It will, however, shed light on how the programmers think people learn to talk. If you design something, it will work the way you expect it to (hopefully, anyways). Is that so hard to understand?
  • by Quiet_Desperation (858215) on Thursday July 26 2007, @07:08AM (#19994823)
    ...that babies talk in baby talk because that's how everyone talks *to* them.
    • Re:Skeptical (Score:5, Informative)

      by potpie (706881) on Wednesday July 25 2007, @08:03PM (#19990685) Journal
      IAAL (I am a linguist), and I believe you are correct. Language is a colligation of sound and meaning, but this technology merely distinguishes sounds: it is a vastly simplified model, not of how children acquire language, but of how children pick up phones. The phone is the most basic unit of the physical (sound) aspect of language, so if this technology is to have any use at all, it has a very long way to go.

      From TFA:
      Expanding on some existing ideas, he and a team of international researchers developed a computer model that resembles the brain processes a baby uses when learning about speech.

      This sentence means nothing. How do they know their computer model resembles the brain processes? Because they got the same outcome? Is that enough to verify what goes on in the mind of a child?

      How about this: as soon as their program can distinguish allophones, I will be impressed. Allophones are different sounds in a language that native speakers do not distinguish, but which nevertheless occur in certain environments. For instance, in English we do not distinguish the voiced th sound and the voiceless th sound, but we do distinguish f and v, even though the only difference in both pairs is voicing. The difference is that exchanging f and v can change the meaning of a word, but changing voiced th and voiceless th only makes the word sound funny.
      • Re: (Score:3, Informative)

        by Anonymous Coward
        Actually, in English, we do distinguish voiced and unvoiced /th/. They aren't allophones at all - unless you think "thigh" and "thy" are the same word, of course. While "thy" is somewhat archaic it's still part of the language. Voiced and unvoiced is an area where English distinguishes heavily; we're very light on aspiration, mind you.
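
The "thigh"/"thy" argument above is the standard minimal-pair test: two sounds are separate phonemes, not allophones, if swapping one for the other can turn one word into a different word. A minimal sketch of that test follows. The tiny lexicon and its ARPAbet-style transcriptions (TH voiceless, DH voiced) are hand-written for illustration and are not data from the study, and the function names are hypothetical.

```python
def minimal_pairs(lexicon, a, b):
    """Return (word, other) pairs whose transcriptions differ only by a -> b."""
    # Index words by their transcription so swapped forms can be looked up.
    by_form = {tuple(phones): word for word, phones in lexicon.items()}
    pairs = []
    for word, phones in lexicon.items():
        for i, phone in enumerate(phones):
            if phone == a:
                swapped = tuple(phones[:i]) + (b,) + tuple(phones[i + 1:])
                other = by_form.get(swapped)
                if other is not None:
                    pairs.append((word, other))
    return pairs


def contrastive(lexicon, a, b):
    """Two sounds contrast if at least one minimal pair exists for them."""
    return bool(minimal_pairs(lexicon, a, b))


# Toy lexicon: illustrative transcriptions only (an assumption, not a corpus).
LEXICON = {
    "thigh": ["TH", "AY"],
    "thy": ["DH", "AY"],
    "fan": ["F", "AE", "N"],
    "van": ["V", "AE", "N"],
    "pin": ["P", "IH", "N"],
}

print(minimal_pairs(LEXICON, "TH", "DH"))
print(minimal_pairs(LEXICON, "F", "V"))
```

With this toy lexicon both pairs of sounds come out contrastive, which is the AC's point: voiced and voiceless th behave like f and v in English.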
      • IAAL, too (Score:5, Interesting)

        by Estanislao Martínez (203477) on Thursday July 26 2007, @01:30AM (#19993261) Homepage

        IAAL (I am a linguist), and I believe you are correct. Language is a colligation of sound and meaning, but this technology merely distinguishes sounds: it is a vastly simplified model, not of how children acquire language, but of how children pick up phones. The phone is the most basic unit of the physical (sound) aspect of language, so if this technology is to have any use at all, it has a very long way to go.

        IAAL, and although not a child language specialist, I will say one thing: children make plenty of meaningless sound before they start making sense, and more interestingly, they become able to tell their future native language apart from other languages quicker than they become able to understand it. (And I'll even be so daring as to suggest that it simply has to be this way; you need to be able to tell signal from noise before you can decode a signal.)

        I also think that by calling this a "technology," you're fundamentally misunderstanding it. It's a computer program being used as a test of a model of phonological learning.

        How about this: as soon as their program can distinguish allophones, I will be impressed.

        I think you've got it exactly backwards here. The whole point of this is to demonstrate a model that loses the ability to tell allophones apart. I.e., one that makes the jump from perceiving a speech stream as a continuous sequence of sounds laid out on a continuous acoustic space, to perceiving it as a sequence of discretely distinct segments.
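
To make the "continuous acoustic space to discrete segments" jump concrete, here is a minimal sketch using plain one-dimensional k-means. This is a stand-in technique chosen for brevity, not the researchers' model, which I have not seen. The sample values are made up, loosely imagined as a single acoustic measurement in milliseconds, and the function names are hypothetical.

```python
def nearest(centers, x):
    """Index of the category center closest to x."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - x))


def learn_categories(samples, k=2, iterations=20):
    """Plain 1-D k-means (k >= 2): no labels, just unlabelled measurements."""
    ordered = sorted(samples)
    # Spread the initial centers evenly across the observed range.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for x in samples:
            groups[nearest(centers, x)].append(x)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Made-up measurements forming two clumps (an assumption, not real speech data).
SAMPLES = [8, 12, 10, 15, 11, 55, 62, 58, 65, 60]
CENTERS = learn_categories(SAMPLES)

# After learning, 9 and 14 fall in the same category even though they differ
# acoustically: the within-category difference is no longer "heard".
print(CENTERS, nearest(CENTERS, 9) == nearest(CENTERS, 14))
```

The point of the sketch is only the last line: once categories exist, distinct inputs inside one category collapse to the same label, which is the loss of discrimination described above.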

        Of course, a major disclaimer: I haven't seen the actual research, so I don't know to what extent they've met these goals.

        • Re:IAAL, too (Score:4, Informative)

          by kris_lang (466170) on Thursday July 26 2007, @08:16AM (#19995443)
          I'm not at a place where I can access the research article, so let me comment about what I know about McClelland's previous work with neural networks.

          Rumelhart and McClelland worked on the groundbreaking "can a neural network learn how to pronounce words based on their spelling?" paper, which used back-propagation to train a neural net to do just that. That was in the 1980s. (Sejnowski at the Salk Institute followed up with a lot of neural net training studies too.)

          Their little cheat was that there was no temporal component to the data. Words were represented as sets of triplet-letters: catalog is represented as "-ca", "cat", "ata", "tal", "alo", and "log". (Actually, I don't remember if they used special sequences to represent start and stop, so --c -ca og- and g-- may not have been part of the sets.)
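
The triplet representation described above can be sketched in a few lines. The function name, the `pad` parameter, and the "-" boundary marker are assumptions for illustration; as the comment itself says, whether the original work used start and stop markers is not certain.

```python
def letter_triplets(word, pad=0, marker="-"):
    """Slide a three-letter window over the word, optionally padding both ends."""
    padded = marker * pad + word + marker * pad
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


print(letter_triplets("catalog"))         # no boundary markers
print(letter_triplets("catalog", pad=1))  # adds "-ca" and "og-"
print(letter_triplets("catalog", pad=2))  # adds "--c", "-ca", "og-", "g--"

# Stored as a set, the triplets carry no order, which is the "no temporal
# component" point: the word is just a bag of overlapping contexts.
print(set(letter_triplets("catalog", pad=1)))
```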

          And of course the neural net didn't really have audio output, though of course the rejoinder is that this would be trivial.

          My key question is how they deal with the issue of time in this study, and if there is any actual audio output which would act as feed-back for the training system or whether the output is representational only, as an output set of phonemes.

          Having real audio output and real audio input would let it correlate its output with real language examples. Having representational blobs would only mean that: given inputs of the hash that represents "hard TH" vs the hash that represents "soft TH" the system could yield a result of different outputs.

          And you're saying that the key result would be if the system learned to conflate or ignore the two sounds of "TH", hard or soft, in trying to interpret words. Remember that the initial Rumelhart-McClelland model was "content/meaning free", and I suspect that this one is too. Learning to conflate "x" and "y" in a neural net would be trivially implemented and trainable: the links for "x" and "y" into the model would have similar weights in the right contexts (the context being the set of predecessor and successor phonemes).

          It sounds like an agglomerator: given a large dataset of valid words in a given language, this system learns the rule for "predecessor" and "successor" probabilities of a particular phoneme vs another phoneme and then produces random output with the same Bayesian probability, producing gibberish nonsensical sounds which follow the probability distribution of the input training language.
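
The guess above amounts to a first-order Markov (bigram) model over phonemes: count which sound follows which, then sample new sequences from those conditional frequencies. A minimal sketch of that idea follows. It illustrates the guess in this comment, not the published study, and the three training "words", the boundary tokens, and the function names are all made up for illustration.

```python
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical boundary tokens


def train(words):
    """Count successor frequencies for every phoneme, including word boundaries."""
    counts = defaultdict(Counter)
    for phones in words:
        seq = [START] + list(phones) + [END]
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def babble(counts, rng, max_len=20):
    """Sample a sequence whose transitions follow the training frequencies."""
    out, current = [], START
    while len(out) < max_len:
        successors = counts[current]
        current = rng.choices(list(successors), weights=list(successors.values()))[0]
        if current == END:
            break
        out.append(current)
    return out


# Toy training data (an assumption, not a real corpus).
WORDS = [["m", "a", "m", "a"], ["d", "a", "d", "a"], ["p", "a", "p", "a"]]
COUNTS = train(WORDS)
print("".join(babble(COUNTS, random.Random(0))))
```

The output is gibberish, but every adjacent pair of sounds in it occurred in the training words, which is all such a model can guarantee.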

          Or that's my guess at least, trying to be the typical slashdotter commenting without reading the article.

          I'll try to get at the article from the Uni with journal access tomorrow.

          Kris
    • by MOBE2001 (263700) on Wednesday July 25 2007, @10:25PM (#19992081) Homepage Journal
      "[S]pecific information about language is hard-wired into the brain." is what Chomsky's been saying all along. I think he's probably right about the other things he says too.

      Chomsky's argument is that there are specific areas of the brain (Broca's and Wernicke's areas) that are dedicated to language and are prewired for grammar. Truth is, people who are born unable to speak use other areas of their cortices to learn to communicate in sign language. I see no fundamental difference between learning motor skills (such as walking, running, reaching and grasping) and learning how to speak. Every type of motor learning has to do with generating precisely timed sequences of motor commands. It is all in the timing. It just so happens that Broca's area is genetically prewired to control the mouth, tongue, throat and lung muscles. It's still motor learning. No special wiring is needed other than what is available for other types of motor behavior. One man's opinion.