New Algorithm for Learning Languages
An anonymous reader writes "U.S. and Israeli researchers have developed a method for enabling a computer program to scan text in any of a number of languages, including English and Chinese, and autonomously and without previous information infer the underlying rules of grammar. The rules can then be used to generate new and meaningful sentences. The method also works for such data as sheet music or protein sequences."
just thought.. (Score:3, Interesting)
would probably help with the problem of either downloading a small, incomplete dictionary, a dictionary with errors, or a massive dictionary file.
Re:just thought.. (Score:5, Insightful)
This algorithm works with sample data. Where is the sample data going to come from? If you have to download it, then that negates the whole point of using it. If you use what you see online, well that's just ridiculous, for obvious reasons :).
Re:just thought.. (Score:2, Insightful)
It's going to come from large bodies of text that exist in multiple languages. Things like the Bible, the constitution, etcetera. The whole point of this technology is that by drawing conclusions from those texts, the program infers the underlying rules of the language.
Re:just thought.. (Score:2)
The parent to my comment was suggesting that this algorithm be used in lieu of a large dictionary download. I was pointing out that you'd have to download said "large bodies of text" to make it work, and so the whole exercise would be pointless.
O(n^n^n...)????? (Score:4, Interesting)
From TFA: The algorithm discovers the patterns by repeatedly aligning sentences and looking for overlapping parts.
If you take just a single string [of length n] and rotate it against itself in a search for matches, then you've got to do n^2 byte comparisons just to find all singleton matches, and then gosh only knows how many comparions thereafter to find all contiguous stretches of matches.
But if you were to take some set of embedded strings, and rotate them against a second set of global strings [where, in a worst case scenario, the set of embedded strings would consist of the set of all substrings of the set of global strings], then you would need to perform a staggeringly large [for all intents and purposes, infinite] number of byte comparisons.
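To make the single-string case concrete, here's the brute-force version in Python (a toy sketch of my own, purely to illustrate the roughly n^2 cost; it is not what the researchers do, and the names are just mine):

def singleton_matches(s):
    # Slide the string against itself: for every relative shift, compare
    # each overlapping position. That's on the order of n^2 comparisons.
    n = len(s)
    matches = []
    for shift in range(1, n):
        for i in range(n - shift):
            if s[i] == s[i + shift]:
                matches.append((i, i + shift))
    return matches

print(len(singleton_matches("the cat sat on the mat")))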
What did they do to shorten the total number of comparisons? [I've got some ideas of my own in that regard, but I'm curious as to their approach.]
PS: Many languages are read backwards, and I assume they re-oriented those languages before feeding them to the algorithm [it would be damned impressive if the algorithm could learn the forwards grammar by reading backwards].
Re:O(n^n^n...)????? (Score:3, Insightful)
If you take just a single string [of length n] and rotate it against itself in a search for matches, then you've got to do n^2 byte comparisons just to find all singleton matches,...
No you don't :-)
If you want to find all singleton matches, it's enough to sort the string into ascending order (order n.log(n)), and then scan through for adjacent matches (order n). For example, sorting "the cat sat on the mat" gives "cat mat on sat the the"—where the two "the"s are now adjacent and so easily discovered.
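At the word level the same idea is a couple of lines of Python (just a sketch with my own naming; a suffix array gives you the full substring version):

def repeated_tokens(sentence):
    # Sort the tokens (O(n log n)); then one linear scan finds every
    # repeated token, because duplicates end up adjacent after sorting.
    tokens = sorted(sentence.split())
    repeats = set()
    for a, b in zip(tokens, tokens[1:]):
        if a == b:
            repeats.add(a)
    return repeats

print(repeated_tokens("the cat sat on the mat"))   # {'the'}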
Re:just thought.. (Score:5, Interesting)
Re:just thought.. (Score:5, Informative)
A dictionary is just words. This algorithm can't assign meaning to the building blocks; it can only decide how and in what order the building blocks go together.
Re:New algorithm understands slashdot comments! (Score:3, Funny)
Can you teach it to take a breath?
Sucks to be a support tech in India (Score:5, Funny)
Re:Sucks to be a support tech in India (Score:2, Funny)
They is?
Didn't Google already do this? (Score:5, Interesting)
Re:Didn't Google already do this? (Score:2, Informative)
No the didn't (Score:5, Interesting)
I will believe this new program when I see it.
Translation, especially from extremely different languages, is absurdly difficult. For example, I was out with a Japanese woman the other night, and she said "aitakatta". Literally translated, this means "wanted to meet". Translated into native English, it means "I really wanted to see you tonight". It is going to take one hell of a computer program to figure that out from statistical BS. I barely could with my enormous meat-computer and a whole lot of knowledge of the language.
Re:No the didn't (Score:3, Interesting)
"I'm leaving you."
"Who is she?"
However, in written text, where the author can assume that the reader brings no shared assumptions, nor can the author rely on any feedback, 'speakers' usually do a good job of including all necessary information in one way or another -- especially in texts meant to convince or promote a particular viewpoint. I'll bet thes
two-way street (Score:3, Funny)
"I'm leaving you."
What?
"I'm leaving you, Alice."
I don't understand what you're trying to do.
"I've met someone."
What do you mean 'met'?
"Look...just read the pamphlet."
I don't have the pamphlet.
"I have to go."
Which way do you want to go?
"Uh...west."
You would need a machete to head further west.
I can't tell you how many of my break-ups have ended with needing a machete.
Re:No the didn't (Score:5, Informative)
I know it is fairly accurate because I once fooled my Spanish-speaking friends in an IM conversation. I told them I learned Spanish via hypnosis and basically just copy/pasted everything into IM in Spanish. The conversation went on for like 15 minutes in full Spanish before I told them I was using the website. They were pissing their pants.
Random test ... (Score:5, Funny)
Re:Random test ... (Score:3, Funny)
english to german
Ich weiß, dass es ziemlich genau ist, weil ich meine spanischen sprechenden Freunde sobald in einem IM Gespräch zum Narren gehalten habe. Ich sagte ihnen, dass ich Spanisch über Hypnose lernte und grundsätzlich gerade alles Spanisch in IM kopieren/aufkleben. Das Gespräch ging seit ähnlichen 15 Minuten volles Spanisch weiter, bevor ich ihnen sagte, dass ich die Website
Re:Random test ... (Score:3, Interesting)
But as things stand, I'd spend more time knocking the bad translation into shape than if I translated the whole thing from scratch.
Translators are often asked to copy edit other translators' work (customers tend to call this "proof reading", presumably to devalue it and get it done on the cheap, but it
Re:No the didn't (Score:2, Interesting)
Being more serious, how do you think humans learn the rudiments of language? It's pattern analysis, i.e. precisely the technique this algorithm tries to replicate. It is true that the algorithm won't then progress onto the next stage, which is using that rudimentary grasp of the language to be taught its finer points, but if you genuinely doubt the capacity of this method to produce an understanding of
Re:No the didn't (Score:3, Interesting)
there's one flaw in your analysis
Re:No the didn't (Score:2)
You haven't seen the Google translator he's talking about. It isn't public yet, I don't believe.
Here [outer-court.com] was the original article on it.
Old: "Alpine white new presence tape registered for coffee confirms Laden"
New: "The White House Confirmed the Existence of a new Bin Laden
Re:No the didn't (Score:4, Interesting)
The idea being that you take any input language, Japanese for instance, and get a working Japanese-Esperanto translator. Seeing as Esperanto is so consistent and reliable in how it is designed, it should be easier to do than a straight Japanese-English translator.
To finish, you write an Esperanto-English translator. By leveraging the consistent language of Esperanto, researchers thought they could write a true universal translator of sorts.
Don't know what ever came of it, but it was an interesting idea.
It's actually a new language study (Score:4, Insightful)
For example, a classical Pragmatics scenario:
John is interested in a co-worker, Anna, but is shy and doesn't want to ask her out if she's taken. He asks his friend Dave if he knows whether Anna is available, to which Dave replies "Anna has two kids."
Now, taken literally, Dave did not answer John's question. What he literally said is that Anna has at least two children, and presumably exactly two children. That says nothing of her availability for dating. However, there's nobody who reads that scenario who doesn't get what Dave actually meant to communicate: that Anna is married, with children.
So that's a major problem computers hit when trying to really understand natural language. You can write a set of rules that completely describes all the syntax and grammar. However, that doesn't do it; that doesn't get you to meaning, because meaning occurs at a higher level than that. Even when we are speaking literally and directly, there's still a whole lot of context that comes into play. Since we are quite often at least speaking partially indirectly, it gets to be a real mess.
Your example is a great one of just how bad it gets between languages. The literal meaning in Japanese was not the same as the intended meaning. So first you need to decode that; however, even if you know that, a literal translation of the intended meaning may not come out right in another language. To really translate well you need to be able to decode the intended meaning of a literal phrase, translate that into an appropriate meaning in the other language, and then encode that in a phrase that conveys the intended meaning accurately, and in the appropriate way.
It's a bitch, and not something computers are even near capable of.
Re:It's actually a new language study (Score:3, Insightful)
So how does that sort of thing work? Well, in mathematics you can have something like y=f(x) and
Re:No the didn't (Score:3, Insightful)
Re:Didn't Google already do this? (Score:5, Interesting)
IIRC, Google's translator works from a source of documents from the UN. By cross-referencing the same set of documents in all kinds of different languages, it is able to do a pretty solid translation built on the work of goodness knows how many professional translators.
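In spirit, that cross-referencing is just co-occurrence counting over sentence-aligned text. A crude Python sketch of the intuition (toy data and my own naming; nothing like Google's actual system, which uses proper statistical alignment models):

from collections import Counter
from itertools import product

# Toy sentence-aligned "parallel corpus".
parallel = [
    ("the house is red", "das haus ist rot"),
    ("the house is big", "das haus ist gross"),
    ("the car is red",   "das auto ist rot"),
]

# Count how often each (English word, foreign word) pair occurs in the
# same aligned sentence; frequently paired words are candidate translations.
cooc = Counter()
for en, de in parallel:
    for e, d in product(en.split(), de.split()):
        cooc[(e, d)] += 1

for (e, d), c in cooc.most_common(6):
    print(e, "~", d, c)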
What is a little more confusing to me is how machine translation can deal with finer points in language, like different words in a target language where the source language has only one. English for example has the word "to know" but many languages use different words depending on whether it is a thing or a person that is known. Or words that relate to the same physical object but carry very different cultural connotations -- the word for female dog is not derogatory in every language, for example, but some other animals can be extremely profane depending on who you talk to.
Or situations where two entirely different real-world concepts mean similar things in their respective language -- in English, for example, you're up shit creek, but in Slavic languages you're in the pussy.
I've done translation work before (Slovak -> English), and there's much more going on than differences in words and grammar. There are whole conceptual frameworks in languages that just don't translate, and this is frustrating for anyone learning a language, let alone trying to translate. English is very precise (when used as directed) in matters of time and sequence -- we have more than 20 verb tenses where most languages get away with three.
Consider this:
I was having breakfast when my sister, whom I hadn't seen in five years, called and asked if I was going to the county fair this weekend. I told her I wasn't because I'm having the painters come on Saturday. They'll have finished by 5:00, I told her, so we can get together afterwards.
These three sentences use six different tenses: past continuous, past perfect, past simple, present continuous, future perfect, and present simple, and are further complicated by the fact that you have past tenses referring to the future, present tenses referring to the future, and the wonderful future perfect tense that refers to something that will be in the past from an arbitrary future perspective, but which hasn't actually happened yet. Still following?
On the other hand, English is much less precise in things like prepositions and objects, and utterly inexplicable when it comes to things like articles, phrasal verbs, and required word order -- try explaining why:
I'll pick you up after work
I'll pick the kids up after work
I'll pick up the kids after work
are all OK, but
I'll pick up you after work
is not.
Machine translation will be a wonderful thing for a lot of reasons, but because of these kinds of differences in languages, it will be limited to certain types of writing. You may be able to get a computer to translate the words of Shakespeare, but a rose, by whatever name, is not equally sweet in every language.
Re:Didn't Google already do this? (Score:3, Insightful)
I'll pick up you after work
is not.
It can be, depending on context or emphasis. "I'll pick up the kids after lunch. I'll pick up you after work."
English only has two tenses. (Score:5, Informative)
Yes! I'd have thrown a mod point at you just for this paragraph if I could.
English is very precise (when used as directed) in matters of time and sequence -- we have more than 20 verb tenses where most languages get away with three.
Not really. Firstly, English only has two or three tenses. (Depending upon which linguist you ask, English either has a past/non-past distinction or past/present/future distinctions. See [1], [2]. The general consensus seems to be in favor of the former, although I humbly disagree with the general consensus.) It maintains a variety of aspect [wikipedia.org] distinctions (perfective vs imperfective, habitual vs continuous, nonprogressive vs progressive). See [3]. Its verbs also interact with modality [wikipedia.org], albeit slightly less strongly.
It's a very common mistake to count the combinations of tense, aspect, and modality in a language and arrive at some astronomical number of "tenses". It's an even more common mistake (for native English speakers, anyway) to think that English is special or different or strange compared to other languages. In most cases, it's not -- especially when compared with other Indo-European languages.
Secondly, and more interestingly IMHO, most languages do not have three distinct tenses. The most common cases are either to have a future/non-future distinction or a past/non-past distinction. In any case, the future tense, if it exists, is normally derived from modal or aspectual markers and is diachronically weak (which is linguist-babble meaning "future tense forms don't stick around for very long"). See [3].
English is a perfect example: will, of course, used to refer to the agent's desire (his or her will) to do something. Only recently has it shifted to have a more temporal sense, and it still maintains some of its modal flavor. In fact, the least marked way of making the future (in the US, at least) is to use either gonna or a present progressive form: I'm having dinner with my boss tonight. I'm gonna ask him for a raise. See Comrie [1] again.
So as not to be anglo-centric, I'll give another example. Spanish has three widespread means of forming the future tense. Two of these are periphrastic and are exemplified by he de cantar 'I've gotta sing' and voy a cantar 'I'm gonna sing'. The last is the synthetic form, cantaré 'I'll sing'.
Most high school or college Spanish teachers would tell you that the "pure" future is cantaré. Actually, it's historically derived from the phrase cantar he 'I have to sing' (from Latin cantáre habeo), and is being displaced by the other two forms all across the Spanish-speaking world. I'm told, for example, that cantaré has been largely lost in Argentina and southern Chile (see [4]).
In any case, the parent's main point still holds. It's a bitch to deal with cross-linguistic differences in major semantic systems computationally. But good lord, it's fun to try. :)
References:
SCIgen (Score:5, Interesting)
From TFL... (your link) (Score:2)
I know you were aiming for funny, but there is a big difference between following a hand-written grammar and deducing it from the text...
Paul B.
Markov Chains anyone? (Score:5, Informative)
Used this (easy to compile) C program:
http://www.eblong.com/zarf/markov/ [eblong.com]
to create these:
http://www.mintruth.com/mirror/texts/ [mintruth.com]
Mod points to whoever can tell us what texts they use. (No mod points can actually be given)
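If you don't feel like compiling C, the same trick is only a few lines of Python (a first-order, word-level chain of my own; the linked program does more than this):

import random
from collections import defaultdict

def train(text):
    # Record, for every word, the list of words that followed it.
    chain = defaultdict(list)
    words = text.split()
    for cur, nxt in zip(words, words[1:]):
        chain[cur].append(nxt)
    return chain

def generate(chain, start, length=20):
    # Walk the chain, picking a random recorded successor each step.
    out = [start]
    for _ in range(length):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)

chain = train("the cat sat on the mat and the cat ate the rat on the mat")
print(generate(chain, "the"))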
Re:Markov Chains anyone? (Score:2)
PDF of paper (Score:5, Informative)
Full article for non-PNAS subscribers (Score:5, Informative)
Zach Solan, David Horn, Eytan Ruppin and Shimon Edelman
School of Physics and Astronomy and School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel; and Department of Psychology, Cornell University, Ithaca, NY 14853
We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The ADIOS (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics.
Many types of sequential symbolic data possess structure that is (i) hierarchical and (ii) context-sensitive. Natural-language text and transcribed speech are prime examples of such data: a corpus of language consists of sentences defined over a finite lexicon of symbols such as words. Linguists traditionally analyze the sentences into recursively structured phrasal constituents (1); at the same time, a distributional analysis of partially aligned sentential contexts (2) reveals in the lexicon clusters that are said to correspond to various syntactic categories (such as nouns or verbs). Such structure, however, is not limited to the natural languages; recurring motifs are found, on a level of description that is common to all life on earth, in the base sequences of DNA that constitute the genome. We introduce an unsupervised algorithm that discovers hierarchical structure in any sequence data, on the basis of the minimal assumption that the corpus at hand contains partially overlapping strings at multiple levels of organization. In the linguistic domain, our algorithm has been successfully tested both on artificial-grammar output and on natural-language corpora such as ATIS (3), CHILDES (4), and the Bible (5). In bioinformatics, the algorithm has been shown to extract from protein sequences syntactic structures that are highly correlated with the functional properties of these proteins.
The ADIOS Algorithm for Grammar-Like Rule Induction
In a machine learning paradigm for grammar induction, a teacher produces a sequence of strings generated by a grammar G0, and a learner uses the resulting corpus to construct a grammar G, aiming to approximate G0 in some sense (6). Recent evidence suggests that natural language acquisition involves both statistical computation (e.g., in speech segmentation) and rule-like algebraic processes (e.g., in structured generalization) (7-11). Modern computational approaches to grammar induction integrate statistical and rule-based methods (12, 13). Statistical information that can be learned along with the rules may be Markov (14) or variable-order Markov (15) structure for finite state (16) grammars, in which case the EM algorithm can be used to maximize the likelihood of the observed data. Likewise, stochastic annotation for context-free grammars (CFGs) can be learned by using methods such as the Inside-Outside algorithm (14, 17).
We have developed a method that, like some of those just mentioned, combines statistics and rules: our algorithm, ADIOS (for automatic distillation of structure) uses statistical information present in raw sequential data to identify significant segments and to distill rule-like regularities that support structured generalization. Unlike
Re:PDF of paper (Score:5, Funny)
HEH! funniest meant-to-be-serious acronym ever.
Re:PDF of paper (Score:3, Informative)
Better link for PDF (Score:2, Informative)
PNAS wants you to subscribe to download the PDF.
Or you could just go to the authors' page and download it for free: http://www.cs.tau.ac.il/~ruppin/pnas_adios.pdf [tau.ac.il]
Woah (Score:4, Funny)
Re:Woah (Score:2, Funny)
Noam Chomsky (Score:2, Informative)
Re:Noam Chomsky (Score:2)
Many English speakers (and writers) appear to actively avoid using the actual *official* rules of English grammar (and I'm sure this is true of other languages and their native speakers too).
I've always assumed that natural language comprehension (as it happens in the human brain) is mostly massively parallel guesswork based on context since (hav
Re:Noam Chomsky (Score:2)
There must be some rules in natural language, otherwise how would anyone be able to understand what anyone else was saying? The rules used may not be the "official" rules of the language, and they may not even be clearly/consciously understood by the speakers/listeners themselves, but that doesn't mean they aren't rules.
Re:Noam Chomsky (Score:5, Insightful)
If there were no rules, I could write a post using random letters for random sounds in a random order, or just using a bunch of non-letters. That wouldn't convey anything. Saying "I'm writing on slashdot" is more effective than writing "(*&$@(&^$)(#*$&"
Re:Noam Chomsky (Score:5, Insightful)
Instead of a language module with specialized abilities tuned to learn rule-based grammar, we have an unsupervised learning system that has inferred the grammar of the language merely from the patterns inherent in the data it is given. That a system can do this is evidence against the notion that an innate grammar module in the brain is necessary for language.
Re:Noam Chomsky (Score:2)
Re:Noam Chomsky (Score:3, Interesting)
Even then, Gold showed a long, long time a
Re:Noam Chomsky (Score:3, Interesting)
Re:Noam Chomsky (Score:5, Insightful)
Sorry about the rant, but like I said, my prof did *not* like the Chomskyan view of linguistics.
Oh, and as far as the notion of the "language module" goes, it might be premature to call it a module, but there *is* neurophysiological evidence to suggest that humans are physically predisposed towards learning language from birth, so that much at the very least is tenable.
Speaking as someone working on NLP (Score:5, Interesting)
It's not going to be right. The algorithm is described as statistically based, which, while similar to the way children learn languages, is not exactly it. Children learn by hearing correct native language from their parents, teachers, friends, etc. The statistics come in when children produce utterances that either do not conform to the speech they hear or get corrected by other people. However, statistics does not come in at all with what they hear.
With respect to the algorithm learning the underlying grammar of a language, I am dubious enough to call it a grand, untrue claim. Basically all modern views of syntax are unscientific, and we're not going to get anywhere until Chomsky dies. Think about the word "do" in English. No view of syntax explains where that comes from; rather, languages are shoehorned into our constructs.
So, either they're using a flawed view of syntax or they have a new view of syntax and for some reason aren't releasing it in any linguistics journal as far as I know.
Re:Speaking as someone working on NLP (Score:2, Insightful)
However, statistics does not come in at all with what they hear.
Utterance in pattern A is heard more often than utterance in pattern B; utterances in patterns C and D are not heard at all. How is that not statistics?
Re:Speaking as someone working on NLP (Score:3, Interesting)
Native speakers by definition speak correctly, and that is all the child is hearing.
Re:Speaking as someone working on NLP (Score:2)
Consider this: who is in charge of a language, an institute or the speakers? Natives cannot be wrong about their own language; they can be wrong about the standard, but A) that standard is always changing and B) given A, who then is correct?
Re:Speaking as someone working on NLP (Score:2)
However, have you read the paper? Looked at the data? It seems a bit early for you to dismiss their conclusions based on a blog entry.
Re:Speaking as someone working on NLP (Score:2)
Re:Speaking as someone working on NLP (Score:3, Informative)
I really don't understand that. How are modern views of syntax unscientific? Also, if Chomsky is such an influence on linguistics, then maybe he's right about it. Aren't you essentially saying that we have no way of arguing with him so let's wait til he dies so he can't argue back? I would think the correct view should win out regardless of the speaker.
Other than what I've studied in cogniti
Re:Speaking as someone working on NLP (Score:5, Interesting)
Chomsky is to linguistics as Freud is to psych. He had great ideas for the time (many still stand), and the science would be nowhere close to where it is without him. However, A) he's backed off a lot from supporting his own theories and B) he's published papers contradicting his original ideas, so there is some question about their veracity. Since so many linguistics undergrads hold him up as the pinnacle of syntax, none are really deviating drastically from him.
WRT the unscientificness: to make his view fit English, there has to be "do-support", which basically means that when forming an interrogative, "do" just comes in to make things work, without any explanation. In other words, it is in our grammar, but our view of syntax does not account for it.
Re:Do-support, in brief (Score:3)
We say 'earlier you taught me' instead. What is your point?
In terms of language evolution, the word 'taught' has the same relationship to 'teach' as 'wrought' has to 'wreak', and similar relationships to 'thought'-'think', 'brought'-'bring' and (less so) 'bought'-'buy'. The preterite form of each of these verbs is actually formed by a very similar linguistic rule to the one that forms 'educated' from 'educate' - the basic rule in
Re:Speaking as someone working on NLP (Score:2)
And while their paper is not being published in a linguistic journal, it is being published in the Proceedings of the National Academy of Sciences (PNAS, Vol. 102, No. 33), which is a well respected cross-discipline journal.
Although I, along with you, am skeptical of this, it sounds like
Re:Speaking as someone working on NLP (Score:2)
My argument is based solely upon this blog entry, and what it says doesn't quite seem to add up to me.
Re:Speaking as someone working on NLP (Score:2)
I'm not going to chime in and start a flame-war, but since your view is rather iconoclastic, I think it only fair to point this out to the Slashdot audience, who are probably not as informed on the topic as you or I.
Re:Speaking as someone working on NLP (Score:3, Informative)
According to the theory, children come with this universal grammar built in to their mind (for some reason, Chomsky seems against genetic arguments, but good luck understanding his reasoning).
Re:Speaking as someone working on NLP (Score:5, Interesting)
And I agree that this algorithm doesn't seem like it would be entirely successful in learning grammar. But this is not because it's statistical. I don't understand how you can look at something as complicated as the human brain and say "statistics does not come in at all".
If this algorithm worked, then it could be statistical, symbolic, Chomskyan, or magic voodoo and I wouldn't care. There's no reason that computers have to do things the same way the brain does, and I doubt they'll have enough computational power to do so for a long time anyway.
No, the flaws in this algorithm are that it is greedy (so a grammar rule it discovers can never be falsified by new evidence), and it seems not to discover recursive rules, which are a critical part of grammar. Perhaps it's learning a better approximation to a grammar than we've seen before, but it's not really doing the amazing, adaptive, recursive thing we call language.
Re:Speaking as someone working on NLP (Score:2)
Presumably, the distribution of do-support: question inversion, not, and VP ellipsis. I don't really think it's a great mystery, but it's a pain to characterize in Chomsky's formulation of chains.
Wow! (Score:5, Funny)
Input: "For example, the sentences I would like to book a first-class flight to Chicago, I want to book a first-class flight to Boston and Book a first-class flight for me, please may give rise to the pattern book a first-class flight -- if this candidate pattern passes the novel statistical significance test that is the core of the algorithm."
How does it feel to "book a first-class flight"?
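Joking aside, the alignment step TFA describes is easy to caricature: pull out the word n-grams and keep the ones shared by several sentences. A toy Python sketch (my own code and naming; this is not the ADIOS significance test, just the general flavor):

from collections import Counter

sentences = [
    "I would like to book a first-class flight to Chicago",
    "I want to book a first-class flight to Boston",
    "Book a first-class flight for me, please",
]

def ngrams(words, n):
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

# Count, for each n-gram, how many of the sentences it occurs in.
counts = Counter()
for s in sentences:
    words = s.lower().split()
    for n in range(2, 6):
        counts.update(set(ngrams(words, n)))

# Candidate patterns: long n-grams shared by every sentence.
for gram, c in counts.items():
    if c == len(sentences) and len(gram) >= 4:
        print(" ".join(gram))        # -> book a first-class flight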
Grammar depends on the input (Score:3, Interesting)
Re:Grammar depends on the input (Score:3, Interesting)
Re:Grammar depends on the input (Score:2)
Actually it would be quite remarkable if this was possible, given that the reasons some dialects become privileged and others don't have nothing to do with the formal properties of those dialects.
Re:Grammar depends on the input (Score:2, Informative)
If you take young children and expose them to rubbish for four or five years while they're learning to speak, they'll speak rubbish too. That's the problem with young children, they can't sort the good from the bad.
But if you expose them to well-structured language, they'll learn to speak it, without being EXPLICITLY TAUGHT THE RULES. Which is exactly what this paper is about. Unsupervised natural language learning. That's what makes the system good. It's able to build equivalency classes of verbs, nouns, and so on.
Finally some progress (Score:2, Funny)
Re:Finally some progress (Score:2, Insightful)
How about Klingon? (Score:2)
How about Dolphinese? (Score:3, Insightful)
How about Dolphinese? Research shows that they seem to be able to scout and transfer information from one individual to his/her pod. If there's some grammar there, it would be a pretty good nut to crack.
Hieroglyphics? (Score:3, Interesting)
Let's see what it thinks of this (Score:2, Funny)
Re:Let's see what it thinks of this (Score:2, Insightful)
Markov models are perhaps the easiest language acquisition model to implement, but also one of the worst at coming up with valid speech or text.
Interestingly, they do much, much better as recommender systems.
What is actually new about this (Score:2)
Dupe (Score:3, Informative)
Programming Language (Score:2, Interesting)
Finaly (Score:2, Interesting)
Re:Finaly (Score:3, Insightful)
Ov brug termat akti mak lejna trovterna.
And tell you that "termat" and "lejna" are nouns, "akti mak" is a 'composite' verb, "brug" and "trovterna" are adjectives... it still doesn't say anything about the actual meaning.
Universal Translator? (Score:2, Interesting)
Electronic babelfish anyone?
Run it on the bible and get... (Score:2, Funny)
How "intricate"? (Score:2, Insightful)
I hardly would consider
Not really new. (Score:2)
This is not new for protein sequence functionality (Score:3, Informative)
NCBI BlastP [nih.gov] already does this for proteins. Similarities and rules for things can be found, but if the meaning of the sequence is not known, then what good is it? In the end you need to do experiments involving biology/biochemistry/structural biology to determine the function of a protein or nucleotide sequence. Furthermore, in language as well as in biology/chemistry, things which have a similar vocabulary (chemical formula) may in the end be structurally very different (enantiomers), which leads to vastly different functionality.
Dolphins? (Score:2, Interesting)
I'll be impressed when it can (Score:4, Funny)
- figure out it is a dupe and kill it before it even appears
- RTFA for me and just give me a good summary (by the rate of articles posted here, there's probably not much to summarize either)
- translate "IANAL" into something else that does not make me think of ANAL thing
- figure that articles on Google and Apple are just speculations by some dude living in his (can't be her, for sure) parent's basement, and not really news worth posting
- translate my suggestions into something acceptable to the (kernel) hackers that good hygiene is a good thing
- understand that I'm just ranting, and it should not take it personal.
Give it a real challenge (Score:4, Interesting)
Pug
Finally! (Score:2, Funny)
Spam filter? (Score:3, Interesting)
grammar isn't enough (Score:5, Informative)
Re:grammar isn't enough (Score:3, Interesting)
* The clown threw a ball.
(Probably a tennis ball or a basketball)
* The clown threw a ball,....for charity.
(Okay, sorry, a ball as in a party.)
* The clown threw a ball,....for charity...., and hit the target.
(Okay, sorry again, the tennis ball hit the dunking target and someone fell in the water. Got it. We're in a carnival.)
* The clown threw a ball,....for charity...., and hit the target....of 1 million dollars.
(Scratch th
Re:Finally! (Score:2, Funny)
Re:Isn't This the Universal Translator Idea (Score:2, Interesting)
But for this, I have one word: Dolphins.
Re:Isn't This the Universal Translator Idea (Score:2)
Re:Protein sequences? (Score:2)
Re:Protein sequences? (Score:2)
I hate it when people talk like DNA is this big all-encompassing thing. There's nothing in my DNA that tells me to reproduce, etc. So you can't just translate DNA into English. All of your cells, and the handful of brain cells, work together to create the walking chemical reaction that you are; it's a whole big picture, and DNA is just one of the tiny factors in it.
MOD PARENT UP (Score:2)