AI May Have Finally Decoded the Mysterious 'Voynich Manuscript' (gizmodo.com)
An anonymous reader quotes a report from Gizmodo: Since its discovery over a hundred years ago, the 240-page Voynich manuscript, filled with seemingly coded language and inscrutable illustrations, of has confounded linguists and cryptographers. Using artificial intelligence, Canadian researchers have taken a huge step forward in unraveling the document's hidden meaning. Named after Wilfrid Voynich, the Polish book dealer who procured the manuscript in 1912, the document is written in an unknown script that encodes an unknown language -- a double-whammy of unknowns that has, until this point, been impossible to interpret. The Voynich manuscript contains hundreds of fragile pages, some missing, with hand-written text going from left to right. Most pages are adorned with illustrations of diagrams, including plants, nude figures, and astronomical symbols. But as for the meaning of the text -- nothing. No clue. For Greg Kondrak, an expert in natural language processing at the University of Alberta, this seemed a perfect task for artificial intelligence. With the help of his grad student Bradley Hauer, the computer scientists have taken a big step in cracking the code, discovering that the text is written in what appears to be the Hebrew language, and with letters arranged in a fixed pattern. To be fair, the researchers still don't know the meaning of the Voynich manuscript, but the stage is now set for other experts to join the investigation. The researchers used an AI to study "the text of the 'Universal Declaration of Human Rights' as it was written in 380 different languages, looking for patterns," reports Gizmodo. "Following this training, the AI analyzed the Voynich gibberish, concluding with a high rate of certainty that the text was written in encoded Hebrew."
The researchers then entertained a hypothesis that the script was created with alphagrams, words in which text has been replaced by an alphabetically ordered anagram. "Armed with the knowledge that text was originally coded from Hebrew, the researchers devised an algorithm that could take these anagrams and create real Hebrew words." Finally, "the researchers deciphered the opening phrase of the manuscript" and ran it through Google Translate to convert it into passable English: "She made recommendations to the priest, man of the house and me and people." The study appears in Transactions of the Association for Computational Linguistics.
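The alphagram idea described in the summary can be sketched in a few lines. This is only an illustration and not the researchers' algorithm: the function names (`alphagram`, `build_index`, `candidates`) and the toy English word list are assumptions made up for this example, standing in for a real Hebrew lexicon.

```python
from collections import defaultdict


def alphagram(word: str) -> str:
    """Return the letters of `word` sorted alphabetically."""
    return "".join(sorted(word))


def build_index(lexicon):
    """Map each alphagram to the lexicon words that produce it."""
    index = defaultdict(list)
    for word in lexicon:
        index[alphagram(word)].append(word)
    return index


# Hypothetical toy lexicon (an English stand-in for a Hebrew word list).
LEXICON = ["priest", "stripe", "sprite", "house", "people", "man"]
INDEX = build_index(LEXICON)


def candidates(encoded: str):
    """Return every lexicon word whose alphagram matches `encoded`."""
    return INDEX.get(alphagram(encoded), [])


print(alphagram("priest"))   # letters of "priest" in alphabetical order
print(candidates("eiprst"))  # all lexicon words sharing that alphagram
```

Sorting the letters throws away their order, so one alphagram can map back to several real words. Any decoder built this way needs some extra signal, such as context or word frequency, to choose among the candidates.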
Re: (Score:1)
It's ambiguous, but acceptable use IMO (as an American).
Often North America refers to the whole Continent, but usually just US, Mexico, Canada, (Greenland?).
I believe the Mayans were into Mexico though, which is unambiguously North America.
Re: (Score:1)
That's ... not NORTH America you shitflooding trolling idiot.
Re: Indian ... not hebrew (Score:2)
Well it's not South America, so which of the two American continents do you suppose it is?
Re: (Score:2)
Citations are easily found via Google, or other search engines.
Re: (Score:2)
Lol, from angel'o'sphere?
If they said the sky was blue I'd have to go outside and check.
Re: (Score:1)
Link to the results?
Re: (Score:1)
The book has been dated to around the middle of the 15th century. A Native American language is highly unlikely.
maybe the old traditions aren't entirely dead... (Score:2)
Re: (Score:1)
Links needed
Re: (Score:2)
Slashdot history ... just search it. ...
Or google
Re: (Score:1)
SLASHDOT history? Well at least that gives me more of a clue. But that's a reference nearly as bad as the Weekly World News.
Re: (Score:1)
I think you were thinking of this: https://www.newscientist.com/article/dn24987-mexican-plants-could-break-code-on-gibberish-manuscript/ [newscientist.com], which is about the drawings in the manuscript, NOT the words. I suggest you ask your doctor about age related dementia.
"Finally Decoded" (Score:5, Insightful)
STOP using this phrase in each bi-weekly story about this book only to say at the bottom of each article it "isn't really decoded".
It's "decoded" when the text is readable.
Lorem Ipsum (Score:5, Interesting)
What if they let loose the same AI on the Lorem Ipsum text that we know to be meaningless? Would it come to a similar conclusion? We humans want to see patterns where there are none.
Re: (Score:1)
patterns where there are none
does this mean that AI is some form of paranoia?
Re:Lorem Ipsum (Score:4, Informative)
The lorem ipsum text actually means something though... (some words were removed)
Re: (Score:2, Informative)
Um, Lorem Ipsum isn't meaningless, it's Latin text copied from Cicero. We already know what it means. There goes your entire post.
Re: (Score:2)
Lorem Ipsum is garbled text from Cicero; it was munged to produce the desired letter frequencies. It's pretty much gobbledygook.
Re: (Score:1)
https://lipsum.com/ [lipsum.com]
and I quote:
"Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC."
Re: (Score:1)
Lorem Ipsum isn't meaningless, it's Latin.
Re: (Score:2)
That's exactly what an AI would say. You're an AI, aren't you?
Re: (Score:2)
AI can only know what humans know. If humans consider something impossible then so does the AI.
All AI is doing is getting to answers faster than humans can.
Lololololol (Score:5, Insightful)
Failing to find any Hebrew scholars who could help validate their findings, the researchers eventually resorted to using Google Translate,
(Source [sciencealert.com])
This "research" is a joke.
Re: Lololololol (Score:4, Funny)
To be fair, pasting something into Google counts as research for millennials.
Re: (Score:2)
From the paper:
"According to a native speaker of the language,
this is not quite a coherent sentence. However,
after making a couple of spelling corrections,
Google Translate is able to convert it into passable
English: “She made recommendations to the priest,
man of the house and me and people.”"
So it is manually "corrected" input that produces that result.
To show that this is a valid approach to decoding the document, they have to be able to decode larger parts of the text into something that makes sense.
Re: (Score:1)
From the paper:
"According to a native speaker of the language,
Stop right there. There is no native speaker of classical Hebrew, and modern Hebrew is a very distinct reinvention quite certainly after creation of the Voynich Manuscript.
Again, you need a Hebrew scholar here.
Re: (Score:2)
There is a distinct difference between modern Hebrew as spoken in daily life in Israel and Hebrew from 500 years ago. Every single language changes over time; Hebrew is no different. Just because one can read from the Torah does not mean they can speak classical Hebrew, but they can read it with some help and by learning some words that are no longer in use, and phrases you will never hear on the streets of Tel Aviv.
Modern Hebrew is also a relatively recent re-invention, it was kept alive before then as
Re: (Score:2)
So it is manually "corrected" input that produces that result.
Yes... that's the best. After all, with carefully "corrected" input you're able to craft world class conspiracy theories: https://www.youtube.com/watch?... [youtube.com]
To show that this is a valid approach to decoding the document, they have to be able to decode larger parts of the text into something that makes sense.
That of course doesn't mean that it isn't a valid approach, there may have been deliberate misspellings by the writer before encryption and similar things.
Doesn't Hebrew have that Tetragrammaton thing where they leave out vowels?
But unless they can produce longer readable texts IMO they haven't proved anything.
Re: (Score:2)
To show that this is a valid approach to decoding the document, they have to be able to decode larger parts of the text into something that makes sense.
That of course doesn't mean that it isn't a valid approach, there may have been deliberate misspellings by the writer before encryption and similar things.
Doesn't Hebrew have that Tetragrammaton thing where they leave out vowels?
The Tetragrammaton means literally "the four letters"; that's how the name of God is written in the scriptures. And yes, as is the *usual practice* in Hebrew, vowels are left out. From those four letters the name "Jehovah" is derived, although it could be read in several different ways.
But, again, in Hebrew we do not write (most) vowels except when writing for children, or in several cases (such as bibles, prayer books and such) where the exact pronunciation is deemed required. Vowels can be identified (by a we
Re: (Score:1)
FTFY.
I did have to leave leading vowels and words made entirely of vowels in there, just so it didn't look too weird.
The point, of course, being that Hebrew is written just like txtspk, nd mst ppl thse dys ndrstnd txtspk n/p. Nly old ppl weep 4 lang chngz.
Re: (Score:1)
Given that languages change over time, it is quite possible that spelling (especially in a language that omits vowels) has also changed over time. And since the encoding method is alphagram (an anagram arranged alphabetically) you need to rearrange the letters to begin with to get something remotely coherent.
Re: (Score:1)
Failing to find any Hebrew scholars who could help validate their findings, the researchers eventually resorted to using Google Translate,
(Source [sciencealert.com])
This "research" is a joke.
Let's assume all Hebrew scholars died out. They still left us with grammars and dictionaries. Any scientist (and contrary to what STEM people believe, there is Language Science) would not be stopped by that. I mean, they figured out Egyptian hieroglyphs (with help from the Rosetta Stone), and they figured out Linear B (a syllabary not actually well suited to Greek, used for Mycenaean palace accounting on thousands of clay tablets preserved by accidental fires) without such help.
What kind of joke researcher has to stri
Re: (Score:1)
I suspect that an AI may perhaps have been able to decode that hieroglyphs were the same as the Coptic cursive, and maybe even relate them to the existing Coptic language.
Maybe even in less time than the decades it took people to figure it out.
Re: (Score:1)
It actually would be a good test for their AI though, to see where it goes with solved languages.
Re: (Score:2)
that would certainly boost (or destroy) any credibility in their current results.
Re: (Score:1)
And a shit one at that.
Re: (Score:3)
I disagree. How do you recruit a classical Hebrew scholar to validate your hypothesis and assist with additional work? Not in the Yellow Pages. You publish your intermediate results and hope that it tickles a suitable person's interest such that they join in the effort.
You may as well declare Linus' work a joke. It's not as if Linux 0.12 was useful for much. It took a boatload of domain experts to bring it up to the capabilities that made people find it useful.
Re: (Score:2)
You hit up people you know to see if they know any, or anyone who might know any. You ask around the faculty at the university you're associated with. You reach out to other researchers in the same field to see if they know someone or someone who might know someone. You hit Google and find scholars and reach out to them via email. Etc... etc...
All of these are professional method
Re:Lololololol (Score:5, Insightful)
Failing to find any Hebrew scholars who could help validate their findings, the researchers eventually resorted to using Google Translate,
(Source [sciencealert.com])
This "research" is a joke.
Why? Because the Hebrew scholars didn't want to participate?
Google Translate botches modern languages. The fact that running their results through Google Translate gave them meaningful output suggests they have real data.
Re: (Score:2)
Google Translate botches modern languages. The fact that running their results through Google Translate gave them meaningful output suggests they have real data.
That Google Translate produces errors when exposed to relatively comprehensible data does not mean that getting meaningful output from Google Translate implies that they have real data. You can't cite Translate's fallibility as an example of its utility.
Re:Lololololol (Score:4, Insightful)
But they didn't get meaningful output. They got "She made recommendations to the priest, man of the house and me and people". This makes little sense as the first line of a book on herbology. This is AFTER "making a couple of spelling corrections" (how many is a couple?) and AFTER "de-anagramming" every single word (i.e. arbitrarily picking one of the thousands of permutations of letters in the word). Not to mention that Hebrew is written without vowels, so any string of several characters is as likely as not to be a word.
When I was in high school I used a script to find dictionary anagrams of my name and my friends' names. A few of the anagrams looked pretty cool. Did they have any deeper meaning? Of course not. This is basically the same methodology.
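To put a number on "thousands of permutations", the count of distinct letter arrangements of a word can be computed directly. This is a small sketch under stated assumptions: `distinct_arrangements` is a hypothetical helper name, and the sample words are English stand-ins, not data from the paper.

```python
from collections import Counter
from math import factorial


def distinct_arrangements(word: str) -> int:
    """Count distinct orderings of the letters in `word`.

    Uses n! divided by the factorial of each letter's repeat count,
    so repeated letters are not counted twice.
    """
    total = factorial(len(word))
    for repeats in Counter(word).values():
        total //= factorial(repeats)
    return total


for sample in ["man", "house", "priest", "people", "abcdefg"]:
    print(sample, distinct_arrangements(sample))
```

The count reaches the thousands only at around seven distinct letters; shorter words have far fewer arrangements, and only a handful of those will be dictionary words. The freedom is still real, since every word in a sentence contributes its own set of choices.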
Re: (Score:2)
It makes lots of sense as the opening sentence for a herbology book. The person in question (she) has tried to (or wants to) give this information to the church, to authorities (my take on 'man of the house'), to the author and everyone else.
Basically: This is a Public Domain license.
Re: (Score:1)
Maybe instead of herbology, it's a cookbook.....with the wife making recommendations for dinner to the priest, man of the house, and me and the people.
Re: (Score:2)
... This is AFTER "making a couple of spelling corrections" (how many is a couple?) and AFTER "de-anagramming" every single word (i.e. arbitrarily picking one of the thousands of permutations of letters in the word).
...
When I was in high school I used a script to find dictionary anagrams of my name and my friends' names.
This is fun. Now I can make up codes everywhere:
Knew I saw in high school suede prints...
Thanks for introducing me to their methodology. And you should bring those suede prints back. They'll be big.
Re: (Score:2)
But they didn't get meaningful output. They got "She made recommendations to the priest, man of the house and me and people". This makes little sense as the first line of a book on herbology.
In English, it makes little sense. Hebrew, especially ancient/Biblical Hebrew, uses different sentence structure, both in terms of word order and (lack of) punctuation. A better English translation could be something like "She has made many recommendations, first to the priest, then to her husband, then to me, and finally to everyone in town."
no it does not (Score:2)
Re: (Score:2)
Google Translate can also produce seemingly-sensible results when given senseless inputs [upenn.edu]. Getting some meaningful output is only a weak suggestion that they have meaningful inputs. They should not have published without finding at least one Hebrew scholar who would take a look at their work - and the fact that they couldn't convince anyone to do so is itself suggestive.
Re: (Score:2)
I like to see machine learning fail and how it fails. Based on the assumption of an all or nothing training set, neural networks will be 100% confident in their choice and also wrong.
This .gif shows three different hand positions that all communicate the number three:
https://imgur.com/a/KFR2M [imgur.com]
Re: (Score:2)
that's a 'W' a '3' and a schoolyard 'asshole'
Re: (Score:2)
think of it as a smoke-test to see if the overall approach makes sense. they'll likely take that as a sign they're on to something, finish the decoding - THEN hand the entire thing to a proper hebrew scholar, to do the final translation.
you're focusing on the wrong part of this. =/
Re: (Score:2)
Last year the theory was that it was a gynecological text, based upon the pictures, though the "encryption" was speculative. Ultimately, however, the manuscript is just a manuscript. It's interesting as a puzzle, but beyond that there will be no deep meanings uncovered or conspiracies unmasked.
Summary of Text (Score:2)
Re: (Score:2)
So ... it was written by the medieval equivalent of some conspiracy theorist?
Re: (Score:2)
actually a *very* possible hypothesis.
Re: (Score:2)
A lot of scientific knowledge, especially medical knowledge, was kept secret at one time. Knowledge was protected, guilds were formed to protect the secrets, and so forth. So texts would be written to be obscure, intentionally.
Voynich Manuscript is obviously an elaborate prank (Score:5, Insightful)
you would think over time people would become less gullible, not more.
and sure, if you train an AI long and hard enough, it will probably be able to tickle out something that looks like meaning from that nonsense. just like if you train an AI to see dogs, it can identify weird dogs in literally any image.
https://www.washingtonpost.com... [washingtonpost.com]
Re: (Score:2)
you would think over time people would become less gullible, not more.
One would think so, but Creationism is on the rise again.
Re: (Score:2)
And flat earthers. Very strange, they were almost extinct. Similarly, conspiracy theorists seemed also to be on the decline but they're very common these days too.
Re: (Score:2)
fuck off, misogynist garbage.
this last one got debunked (Score:4, Informative)
https://arstechnica.com/scienc... [arstechnica.com]
it's the puzzle that keeps on giving!
One Line (Score:4, Insightful)
XKCD uncovered its meaning long ago (Score:5, Insightful)
https://xkcd.com/593/ [xkcd.com]
It is obvious when you think about it...
Re: (Score:2)
Bah! I was just about to make this joke. I didn't know that xkcd already beat me to it!
Even talking about it is mysterious (Score:2)
Since its discovery over a hundred years ago, the 240-page Voynich manuscript, filled with seemingly coded language and inscrutable illustrations, of has confounded linguists and cryptographers.
"of has confounded" - ?
Ah, I get it. It's not terrible editing, it's more mysterious encryption!
What if... (Score:2)
Re: (Score:2)
That's one of the theories. However, there have been attempts at statistical analysis that suggest total gibberish is unlikely. Moreover, that's a TON of work for a practical joke.
Important? (Score:2)
The manuscript is intriguing, but we can't say it's important without knowing the message. It could be entirely meaningless.
is this research or "research" (Score:1)
Overlooking the Obvious (Score:4, Funny)
Last sentence translated! (Score:2)
"for dark is the suede that mows like a harvest"
/sarcasm.
Wow, some pretty serious research here...
What if author made a mistake? (Score:2)
It has meaning? (Score:2)
I look forward to seeing the fully decoded text. Until now all indications were that it was a "spooky" coffee table book full of nonsense text.
it is widely known (Score:2)
Not knowing Hebrew may actually validate. (Score:2)
If they can get coherent results using only machine translation, not understanding the base language themselves, this gives an even stronger claim in some ways that they have really cracked the code. We will know they aren't hand-tweaking the results to get what they want, because they don't actually know what they want. They only know what comes out the other end of the process.
INB4 (Score:2)
The Protocols of the Elders of Zion
Indus Valley Language and Easter Island glyphs? (Score:2)
It will be exciting to see this process applied to the untranslated Indus Valley Language and Easter Island glyphs.
http://content.time.com/time/w... [time.com]
Wow! (not) (Score:2)
Yeah, right (Score:2)
As soon as you see "anagram" mentioned as part of the process to decode a cipher, you can stop reading, it's not a solution. If you allow for an arbitrary arrangement of letters or symbols as part of the solution, you can arrive at pretty much *any* text as the result, with no real connection to the cipher you started with.
Re: (Score:2)
You can't just use "diff" and call it AI.
You have to use machine learning to train an algorithm at great expense (with clouds!) to compare two texts, until it does nearly as good a job as diff. Only then is it AI. AI isn't something any run-of-the-mill dev can do, after all, that's why it costs so much.
Re: (Score:2)
I think you need to blow off a little more space dust there.
Re: (Score:2)
I don't necessarily think it was "debunked". It was incomplete, had some mistakes, but was it debunked in its entirety? It used an approach used by others in the past. I know the true believers hated it because it would mean the answer was very mundane (like 42).