Science

The Science of Word Recognition

Posted by michael
from the paris-in-the-the-spring dept.
neile writes "I stumbled across a fascinating paper over at the Microsoft Typography site today that provides a really nice overview of the different theories on how humans read. If you thought we read by recognizing word shapes, think again! With the assistance of fancy eye-tracking cameras researchers have been able to devise several clever experiments to give us new insight into how reading works." We've linked to some of Larson's work previously.
This discussion has been archived. No new comments can be posted.

  • by rock_climbing_guy (630276) on Thursday September 02, 2004 @05:07AM (#10136689) Journal
    Would one of those stupid comments about the colour scheme on /. be on-topic now?
  • Honest!!! (Score:5, Funny)

    by TheWingThing (686802) on Thursday September 02, 2004 @05:11AM (#10136708)
    I was reading what was written on her T-shirt!
  • Oh no! (Score:3, Funny)

    by barcodez (580516) on Thursday September 02, 2004 @05:12AM (#10136713)
    So are Microsoft going to patent the way we read and then sue?

    "If you are reading this then you owe Microsoft royalties"

  • by Anonymous Coward
    His "word shape" matrix with "than" "tban" "tnan", etc; could be more easily explained by saying that people pay more attention to tall letters than short ones. That would explain why 'tban' gets caught more than 'tnan' just as well as word-shape arguments.

    To make it more obvious, stick a tall letter in a word that only has short letters and you'll come away thinking word shape does matter.

    (or did he explain it... there were way too many words and way too few glossy pictures in that article for me to co
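The "than" / "tban" / "tnan" comparison above can be made concrete with a rough sketch. It assumes a typical lowercase typeface where b, d, f, h, k, l, t are ascenders and g, j, p, q, y are descenders; those letter classes and the function names are illustrative assumptions, not something taken from the paper.

```python
# Illustrative only: these letter classes are an assumption about a
# typical lowercase Latin typeface, not a definition from the article.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def word_shape(word: str) -> str:
    """Return a crude outline: 'A' ascender, 'D' descender, 'x' otherwise."""
    shape = []
    for ch in word.lower():
        if ch in ASCENDERS:
            shape.append("A")
        elif ch in DESCENDERS:
            shape.append("D")
        else:
            shape.append("x")
    return "".join(shape)


def same_shape(a: str, b: str) -> bool:
    """True when two strings share the same crude outline."""
    return word_shape(a) == word_shape(b)


if __name__ == "__main__":
    for candidate in ("than", "tban", "tnan"):
        print(candidate, word_shape(candidate), same_shape("than", candidate))
```

Under these assumptions "tban" keeps the outline of "than" while "tnan" does not, which is exactly why a tall-letter explanation and a word-shape explanation are hard to tell apart with this one example.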

    • I once saw a little article in a free-on-the-train paper demonstrating that we mostly read the top half of a line of text. Try it some time, cover up the bottom half of a line of text and read it, then cover up the top half of the next line of text and read that. Which is easier?

      Cheers & God bless
      Sam "SammyTheSnake" Penny
    • by ideonode (163753) on Thursday September 02, 2004 @07:52AM (#10137211)
      I cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg - the phaonmneal pweor of the hmuan mnid. Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer inwaht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.
      • by maxwell demon (590494) on Thursday September 02, 2004 @08:36AM (#10137423) Journal
        Actuallythesplittingintowordsisnotnecessarytounder standwhatiswritteniftheorderoflettersiscorrect.Thi s"proves"thatyouarereadingbytheletter,notbytheword .(relyingonslashcodetoinsertameaninglessspaceevery nowandthen:-))
        • by Orne (144925) on Thursday September 02, 2004 @11:05AM (#10138996) Homepage
          I'm no linguist (elec eng w/ neural net studies), but I would argue that the ability to perceive concatenated sentences like that is a function of the ability of the brain/eye to focus on a particular range and filter out "distractions" (letters to the left and right). Padding our words with spaces helps the brain define the focus boundaries more quickly, after which we can process the text range for meaning...

          I imagine the brain's focus as little perception boxes, scanning up and down the concatenated sentence until enough symbols are aligned to fire a recognition signal... As I read your post above, I find my eyes darting about a little more, actually darting to the center of the "word" once recognition is made.

          runonsentencewithlowercase -- here's your letter by letter scan "mode"

          runonsentencewithcoloring -- slightly easier to define word boundaries by color

          runonSENTENCEwithuppercase -- it's easier to locate the word SENTENCE because we perceive a boundary between small letters and upper letters.

          runo nsente ncewit hbads pacing -- pain in the ass, but we still comprehend

          run on sentence with lowercase -- whitespace speeds comprehension.
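For comparison, here is roughly how a program would recover word boundaries from the run-on example above. This is a minimal sketch using dynamic programming over a tiny hand-picked vocabulary; the vocabulary and the function name `segment` are assumptions made up for the illustration, and nothing here claims to model what the eye or brain actually does.

```python
from typing import List, Optional, Set


def segment(text: str, vocabulary: Set[str]) -> Optional[List[str]]:
    """Split text into vocabulary words, or return None if impossible."""
    # best[i] holds one segmentation of text[:i], or None if there is none.
    best: List[Optional[List[str]]] = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(end):
            prefix = best[start]
            if prefix is not None and text[start:end] in vocabulary:
                best[end] = prefix + [text[start:end]]
                break  # keep the first split found for this prefix length
    return best[len(text)]


if __name__ == "__main__":
    # Hypothetical toy vocabulary, just enough for the example sentence.
    vocab = {"run", "on", "sentence", "with", "lower", "case"}
    print(segment("runonsentencewithlowercase", vocab))
```

Without spaces the program has to try many candidate boundaries before it finds words that fit, which is at least in the spirit of the "scanning up and down" description above.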
      • by lazyl (619939) on Thursday September 02, 2004 @09:13AM (#10137741)
        It makes a big difference whether or not your messed-up words use common letter patterns (what the article called 'pseudowords').

        Example:

        'uesdnatnrd' wasn't too hard to recognize because 'uesd' and 'tnrd' aren't letter patterns that exist in real words. So the mind works quicker to rearrange the letters to find a real word.

        'aulaclty' was much harder because it's almost pronounceable. 'lac' and 'lty' are common patterns from real words, and 'aul' might not be common but it's pronounceable.

        Just an observation.
  • by Zorilla (791636) on Thursday September 02, 2004 @05:19AM (#10136741)
    New technology will soon be revealed that will instruct Slashdot users on the proper spelling of "lose".

    The USSGN (Union of Slashdot Spelling and Grammar Nazis) is expected to stage protests against the new product in the interest of keeping their jobs.
    • When Slashcode starts spell-checking we may be able to retire, but until then the rate at which people are instructed in the difference between "lose" and "loose" is probably less than the rate at which people join /. and greater than the rate at which people improve their spelling.
    • You just need to be a bit more lose about it all. There's no need to loose the rag!

      (Actually, I don't know HOW anyone can be content with the misspellings. But then, I don't see how Americanese holds water either.)
  • Eye movements? (Score:5, Interesting)

    by ImaLamer (260199) <<moc.liamg> <ta> <ramal.nhoj>> on Thursday September 02, 2004 @05:19AM (#10136743) Homepage Journal
    With the assistance of fancy eye-tracking cameras researchers have been able to devise several clever experiments to give us new insight into how reading works.

    Oh they must have been using EyeQ [infmind.com]....

    I can read at 44692 words per minute! Thanks for posting that long article for me to read, I needed the exercise.

    And thank you EyeQ! You're the greatest!

    Really though, they say that more letters/words mean faster reading times [microsoft.com]. It's true. Think about a book or article you've read. When the words are together on the page it's easier to read because your eyes can jump around letting your brain fill in the blanks.

    Ever read something that made sense but you couldn't quote it word for word? It's likely because you read in this same way.

  • Quotation (Score:5, Funny)

    by Anonymous Coward on Thursday September 02, 2004 @05:20AM (#10136747)
    "Evidence from the last 20 years of work in cognitive psychology indicates that we use the letters within a word to recognize a word."

    Man, I'm so glad they finally figured this out...
    • Re:Quotation (Score:2, Insightful)

      by Shisha (145964)
      It's not bloody funny! The parent, in a true Slashdot style, didn't even get what the subject of the paper was!

      The question pondered is whether an _experienced_ reader reads by, in the first place, recognising the word shape, or by recognising the letters.

      P.S. yes I know that psychologists are great for stating the obvious, but not here...
      P.P.S. to parent: read the article properly, I'm sure you'll find a nice funny case of stating the obvious.
    • Re:Quotation (Score:3, Interesting)

      There was a Slashdot story a while back which basically stated exactly that quote. Basically you could easily read an entire book where the words were made up of the correct starting character and the correct ending character, but in the middle of the word it didn't matter what order the characters came in.

      For example: "sadhoslt nwes for nrdes. Sfutf taht mrttaes". (I think I've got that correct, someone will obviously correct me if not :))

      Your brain didn't need the middle of the word to understand the word
      • Re:Quotation (Score:3, Interesting)

        by Lars Clausen (1208)
        It also turned out to be mostly urban legend. There was some related research, but none that stated that claim. Bdeeiss, if taht was true, we cloud imoprve ceioomprssn aghilmorts by sinortg the mddile leertts aaabcehilllpty, scine tehir piinoosts are iaaeimmrtl.

        -Lars
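The "sort the middle letters" idea above is easy to try out. A minimal sketch follows; the function names and the small word list are made up for the illustration. It shows the transformation itself and why it cannot be undone without extra context: distinct words can collapse to the same sorted form.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def sort_middle(word: str) -> str:
    """Keep the first and last letters, sort everything in between."""
    if len(word) <= 3:
        return word
    return word[0] + "".join(sorted(word[1:-1])) + word[-1]


def collisions(words: Iterable[str]) -> Dict[str, List[str]]:
    """Group words that become identical once their middles are sorted."""
    groups = defaultdict(set)
    for word in words:
        groups[sort_middle(word)].add(word)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


if __name__ == "__main__":
    print(sort_middle("alphabetically"))
    print(collisions(["form", "from", "salt", "slat", "word"]))
```

Since "form" and "from" end up as the same string, a decoder would need a dictionary plus context to pick the right word back out, so whatever the sorting saves has to be paid for somewhere else.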
        • Re:Quotation (Score:3, Informative)

          by danila (69889)
          But teh gsit of teh sotry was ture. Terhe is a lot of rdeandncuy in the lagnuage nad if th rerhesecars are rghit, yuor bairn reelis mroe on crroect ltteres awanyy.
    • Re:Quotation (Score:3, Insightful)

      by Epistax (544591)
      "Evidence from the last 20 years of work in cognitive psychology indicates that we use the letters within a word to recognize a word."

      Very strange because if y_u r____r we d__'t n__d a_l those l_____s.
  • I love how (Score:5, Insightful)

    by FS1 (636716) on Thursday September 02, 2004 @05:20AM (#10136748)
    Does anyone else think that merely analyzing how English is read is very closed-minded? I'm pretty sure only a very small percentage of the world speaks and reads English.

    I would love to see a study comparing how English is read to how Chinese is read by native speakers. Very interesting, I would gather.
    • Re:I love how (Score:3, Interesting)

      by defMan (175410)
      I would personally be very interested in seeing English compared to Dutch or German. In those languages (I'm a native Dutch speaker) the word order is much more flexible and the determining verb often comes very late in the sentence. In German this is more prominent than in Dutch.

      I just searched around on Google and these documents came up
      Word Order in German [about.com]
      Kathol's analysis of German Word Order [let.rug.nl]

    • by alanxyzzy (666696) on Thursday September 02, 2004 @05:42AM (#10136812)
      I would love to see a study comparing how English is read to how Chinese is read by native speakers.
      There is an interesting article at the Harvard Gazette [harvard.edu] about research which seems to show that thought comes before language. The Korean language distinguishes between two meanings of "in" - fitting loosely or tightly.

      Research shows that

      Infants of English-speaking parents easily grasp the Korean distinction between a cylinder fitting loosely or tightly into a container. In other words, children come into the world with the ability to describe what's on their young minds in English, Korean, or any other language. But differences in niceties of thought not reflected in a language go unspoken when they get older.
      • by achurch (201270) on Thursday September 02, 2004 @06:55AM (#10137026) Homepage

        Infants of English-speaking parents easily grasp the Korean distinction between a cylinder fitting loosely or tightly into a container. In other words, children come into the world with the ability to describe what's on their young minds in English, Korean, or any other language. But differences in niceties of thought not reflected in a language go unspoken when they get older.

        Absolutely. And adults can "relearn" those distinctions, too; I found that as my Japanese studies progressed (started at 19, pretty close to native now) the range of things I was able to think about expanded considerably--so much so that now I sometimes have trouble speaking to people in English because English doesn't have a word for the concept I'm thinking about.

      • The relationship is probably a lot more complicated than "thought comes before language". I suspect they are both highly dependent on each other.

        For instance, it is clear that many non-verbal animals are able to think, in at least some limited fashion. Larger rodents, for instance, are able to build models of their world and solve simple problems (not limited to learning by trial and error). It is exactly this kind of modelling that concepts like one object being inside another stem from -- spatial reas
    • Re:I love how (Score:5, Interesting)

      by ImaLamer (260199) <<moc.liamg> <ta> <ramal.nhoj>> on Thursday September 02, 2004 @05:44AM (#10136817) Homepage Journal
      You're right. It would seem that comparing Hebrew or Chinese to English would make for a better analysis.

      Maybe we can learn even more about our way of reading, like: Is it the most efficient?

      Is right to left, or left to right the best way to go.

      Interesting side note (don't know why I'm bringing this up...): President #20, James A. Garfield, could write in both Latin and Greek at the same time.

      • Is right to left, or left to right the best way to go.

        Isn't that more a consequence of the fact that most people write with their right hand?

      • For the record, Hebrew and Chinese have been studied for years alongside English in reading experiments. I don't feel like looking up the citations right now, but if you're interested, check on PsychInfo.
      • How do you write two languages at the same time? Greek with your left hand, Latin with your right?

        Puzzling.... anyway, it's good to know that at least some presidents have some skills ;)
        • He also "proved" the Pythagorean Theorem!

          Read up on this man... very cool.

          I learned the first fact, about his writing, from "Incredible But True" a great old book.
      • Re:I love how (Score:4, Interesting)

        by julesh (229690) on Thursday September 02, 2004 @08:39AM (#10137442)
        Is right to left, or left to right the best way to go.

        I remember reading about an interesting study into this. Apparently, there are a small number of people who have a particular form of brain damage which effectively reverses their perception. These people, if they were originally educated to read/write left to right, would afterwards naturally read/write right to left, or vice versa.

        Apparently, once they get used to using their right hand with a style similar to that a left-hander would use (or vice-versa) they can read & write in the opposite direction at roughly the same rate a normal person can in the usual direction. The conclusion: the difference is not noticeable; neither left to right nor right to left is substantially more efficient (or any difference is also negated by the brain damage these people have suffered).

        No, I can't cite references. I just came across it about 10 years ago, I don't even remember what I was studying at the time.
      • Re:I love how (Score:3, Informative)

        by Sunnan (466558)
        Is right to left, or left to right the best way to go.

        If you're right-handed, you'll smudge the text with your hand if you write right-to-left.
    • Re:I love how (Score:5, Insightful)

      by dave420 (699308) on Thursday September 02, 2004 @06:22AM (#10136942)
      There are roughly 400 million people with English as their first language, true, but there are even more with English as a second language. If you're looking to select a language to base a study on, and you want it to be accessible, then you choose English. It really is that simple.

      Also, Chinese is character-based, not letter-based, so the research would be completely different. Kind of like asking someone who's studying jet aircraft to study cars as more people have them.

      • Also, Chinese is character-based, not letter-based, so the research would be completely different.

        Yes, but could there be a similarity in that reading Chinese involves recognition of strokes the same way that reading in English involves letters? FTA...

        Fixations never occur between words, and usually occur just to the left of the middle of a word. Not all words are fixated; short words and particularly function words are frequently skipped.

        Perhaps reading Chinese involves focusing on the strokes

        • Re:I love how (Score:5, Interesting)

          by dave420 (699308) on Thursday September 02, 2004 @07:27AM (#10137121)
          No, there's lots of study on the matter, and it's shown that Chinese people interpret their written language in a completely different part of the brain than English-reading people. That fact alone means a completely different method is at work... :)
    • There is an interesting study of reading of Chinese versus English, in the context of understanding dyslexia:

      "The researchers, led by Dr Li-Hai Tan believe that this region is implicated because reading Chinese is a different mental task compared with reading an alphabetic language.

      With an alphabetic language, reading is done sequentially - the letters are recognised and broken up into blocks of sound which are then matched to a known meaning.

      But with Chinese, the reading is more like parallel processi

      • Re:I love how (Score:3, Interesting)

        by antifoidulus (807088)
        I wish they had expounded on what they meant. Because the "sound" of a word in Chinese is pretty much the same as its meaning. Yes, the characters do have meanings, but in Chinese (for the most part, there are some exceptions) each character only has 1 sound. The sound is exactly how you would pronounce the word if you were speaking, so I'm not sure what they mean by saying that children process the sound and the meaning separately. Or maybe it's the difference between how a person understands a
    • by Simon (815)
      Does anyone else think that merely analyzing how English is read is very closed-minded?

      No, not really. It seems very reasonable considering that English is most likely the native language of the researchers. Research is hard enough without introducing extra complexity through using a foreign language and then having to find subjects that are fluent in that language.

      You can't study everything at the same time. Quit complaining for the sake of complaining... sheeesh.

      --
      Simon

  • by DrFrasierCrane (609981) on Thursday September 02, 2004 @05:22AM (#10136756) Homepage
    While reading the article, I suddenly became hyper-aware of how I was reading the article. :-)

    Don't let the Microsoft name scare you off - the article makes for a fascinating look (pun intended) into how we read. I wonder, though, if these findings are duplicated with written Oriental languages.
  • by mocm (141920) on Thursday September 02, 2004 @05:22AM (#10136757) Homepage
    Since most people in the world don't use the Latin alphabet, it would be interesting to find out how word recognition works for them. And how they read words in our alphabet.
    • This sounds like bullshit to me. Care to quote some facts? I suspect the sum of all English + Spanish + French readers pretty much has the market cornered and they all use the Latin alphabet.
      • Chinese (Mandarin, Cantonese, ...), Japanese, Korean, Hindi, Russian, Hebrew, Bengali and Arabic all use different writing systems. And if you add up all the others then they will certainly outnumber those who use the Latin alphabet. Of course, the Latin alphabet is probably the most widely used, but it is not necessarily used by the majority.
    • by ajs318 (655362) <sd_resp2NO@SPAMearthshod.co.uk> on Thursday September 02, 2004 @07:23AM (#10137106)
      They probably have already written papers on it ..... in their own languages.

      Want my theory? I think the brain uses multiple techniques in parallel, then releases resources from the ones found to be going nowhere. So at any one time you may be trying to read a word letter-by-letter, recognising the word from the Bouma shape, and picking likely words from context. The different techniques will have different successes depending on various factors (clean type vs. messy handwriting, familiar vs unfamiliar words, &c). So my theory is that the brain is trying various methods at the same time, each narrowing down the possibilities, and just goes with whatever produces a result first. As soon as that happens, any half-finished tests in progress are scrapped and their resources deallocated. The eye movements may well have something to do with this ..... different reading techniques require different resolutions, the eye is great at recognising outlines but needs to zero-in on details, once a clue is established from the word envelope. There is evidence that fonts such as Times are more readable than Helvetica, so maybe serifs add recognisability in their own way? And if this is what is happening, then it would explain some of the test results in the article too, since they were looking for a single technique in use at any one time.

      If all this sounds inefficient, you have to remember that human beings are optimised for non-optimum conditions ..... for instance, we have kidneys that pack up if you drink nothing but de-mineralised water, and an immune system that goes berserk and tries to poison you with histamine if it doesn't get enough germs to fight off.
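The "several techniques at once, first result wins, scrap the rest" theory above can be sketched in a few lines. This is a toy under loud assumptions: the lexicon, the letter classes, the two strategies and all the names are invented for the illustration, and the strategies are interleaved step by step rather than truly run in parallel. It makes no claim about how the brain is actually wired.

```python
from typing import Callable, Dict, Generator, Optional, Tuple

# Toy vocabulary and letter classes: both are assumptions for illustration.
LEXICON = ["than", "then", "that", "thin", "tree"]
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")

Steps = Generator[Optional[str], None, None]
Strategy = Callable[[str], Steps]


def outline(word: str) -> str:
    """Crude word shape: 'A' ascender, 'D' descender, 'x' anything else."""
    return "".join(
        "A" if ch in ASCENDERS else "D" if ch in DESCENDERS else "x"
        for ch in word
    )


def by_letters(seen: str) -> Steps:
    """Check one letter position per step; answer once one candidate remains."""
    candidates = [w for w in LEXICON if len(w) == len(seen)]
    for i, ch in enumerate(seen):
        candidates = [w for w in candidates if w[i] == ch]
        yield candidates[0] if len(candidates) == 1 else None


def by_outline(seen: str) -> Steps:
    """Fast but coarse: compare the whole outline in a single step."""
    matches = [w for w in LEXICON if outline(w) == outline(seen)]
    yield matches[0] if len(matches) == 1 else None


def race(seen: str, strategies: Dict[str, Strategy]) -> Optional[Tuple[str, str]]:
    """Step every strategy in turn; return (name, word) for the first answer."""
    running = {name: make(seen) for name, make in strategies.items()}
    while running:
        for name in list(running):
            try:
                result = next(running[name])
            except StopIteration:
                del running[name]  # this technique went nowhere, release it
                continue
            if result is not None:
                for generator in running.values():
                    generator.close()  # scrap any half-finished work
                return name, result
    return None


STRATEGIES = {"outline": by_outline, "letters": by_letters}

if __name__ == "__main__":
    for seen in ("tree", "than", "tban"):
        print(seen, race(seen, STRATEGIES))
```

With this toy lexicon the outline alone settles "tree", "than" needs the letter-by-letter pass because three words share its outline, and "tban" gets no answer from either technique.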
  • Reduced Redundancy (Score:4, Informative)

    by plasticmillion (649623) <matthew@allpeers.com> on Thursday September 02, 2004 @05:26AM (#10136766) Homepage
    This got slashdotted!? The idea of recognizing words by "word shape" seems so silly to me that I almost feel as if the author is attacking a straw man rather than a widely accepted linguistic theory.

    The final conclusions are similar to what I learned in my college linguistics classes 15 years ago. Language contains a lot of redundancy. The reason is that we often encounter situations of so-called "reduced redundancy". For example, someone might have sloppy handwriting so you can't make out all of the letters. Or you might be talking to someone while they brush their teeth. If language were highly optimized, we wouldn't understand a thing in these situations, but because of redundancy we can usually communicate very effectively.

    The same applies to reading. The conclusions of the paper seem trivial to me. Of course, reading exploits "visual" and "contextual" information. How else would be understand a sentence like "The boy ate a ham___er" (with a few letters obscured)?

    The fact that the brain's neural net adds up the weighted lexicographic, syntactic, semantic (and even pragmatic) information available to it in order to interpret language should be familiar to anyone who's read Goedel, Escher, Bach. And that was published in 1979...

    • "The boy ate a ham___er"

      No automatic recognition here.

      Hamster?

      Hammer?
    • How else would be understand a sentence like "The boy ate a ham___er" (with a few letters obscured)?

      How else would be understand?

      Case in point.
    • Ok, ok! Sticklers are we? So this is why geeks are unpopular... ;-)

      My point was probably clear but perhaps a better example would have been "hamb___er". Didn't know there were so many hammer/hamster eaters out there...

    • Re:Reduced Redundancy (Score:5, Interesting)

      by Placido (209939) on Thursday September 02, 2004 @06:07AM (#10136897)
      >> How else would be understand a sentence like "The boy ate a ham___er" (with a few letters obscured)?

      What a way to prove your point. I kept thinking "hamster", "hammer" and then eventually realised that I didn't spot your misspelling of 'we' and that I read right over it and filled in the blank.
      • Re:Reduced Redundancy (Score:3, Interesting)

        by nine-times (778537)
        >> How else would be understand a sentence like "The boy ate a ham___er" (with a few letters obscured)?

        What a way to prove your point. I kept thinking "hamster", "hammer" and then eventually realised that I didn't spot your misspelling of 'we' and that I read right over it and filled in the blank.

        Wow. Not only did I do what you did, but not-even-reading your post, I picked out "ham___er", "hamster", "hammer", and "we", and tried to figure out if you were suggesting that "we" fit in the missing sp

    • I filled in the blank with hamster [Making it: "The boy ate a hamster"], but maybe I'm just an oddity.
      • It seems everyone (myself included) thought hamster. I guess he meant hamburger, which took me a long time to figure out. The short space is to blame, perhaps.
    • This got slashdotted!? The idea of recognizing words by "word shape" seems so silly to me that I almost feel as if the author is attacking a straw man rather than a widely accepted linguistic theory.

      The author is aiming the article at typographers, not linguists and psychologists. It seems that while everyone who does scientific research into the way that we read has known for a long time that the word shape theory is full of crap, the theory persists as a kind of urban myth among typographers. So the

      • Actually, looking inward at the way I read text, the word shape theory is not full of crap, but over-extended. I find my eyes performing the same jumping as the article suggests (having known this for a long time) and also find that the smaller words like old and the (especially those that are most frequent) are recognized in the "peripheral" of my targets by shape.

        That would be something the computer tracking would not be able to figure out and was somewhat hinted at in the article.

        Extremely inter
  • How we read... (Score:2, Interesting)

    by stupid_is (716292)
    A while ago I was emailed something that stuck out from the usual chain/joke/... flood. Basically it had a very long and badly spelled sentence, where the only rules followed were that the first and last letter in the word were in the correct position. You could read it easily. Go figure!

    Hree is an epamxle of jsut taht, it's qitue esay to raed, ins't it? Agulohth it can get plluartraicy hrad wtih the lgnoer wdros.

    • Re:How we read... (Score:5, Informative)

      by Johan Veenstra (61679) on Thursday September 02, 2004 @06:02AM (#10136867)
      The example:

      Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.

      But soon enough there was a counter example:

      Anidroccg to crad cniyrrag lcitsiugnis planoissefors at an uemannd, utisreviny in Bsitirh Cibmuloa, and crartnoy to the duoibus cmials of the ueticnd rcraeseh, a slpmie, macinahcel ioisrevnn of ianretnl cretcarahs araepps sneiciffut to csufnoe the eadyrevy oekoolnr.

      In the counter example, the letters are not randomly scrambled; the letters are in reverse order, except the first and last letters.
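Both transformations are simple to reproduce. The sketch below is only an illustration, with made-up function names: one function shuffles the interior letters at random, the other reverses them, and both leave the first and last letters alone. It treats any run of ASCII letters as a word, so an apostrophe splits a word in two, which is an assumption of this sketch rather than a rule from either example.

```python
import random
import re
from typing import Callable

WORD = re.compile(r"[A-Za-z]+")


def shuffle_interior(word: str, rng: random.Random) -> str:
    """Randomly reorder the letters between the first and the last."""
    if len(word) <= 3:
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def reverse_interior(word: str) -> str:
    """Reverse the letters between the first and the last."""
    if len(word) <= 3:
        return word
    return word[0] + word[1:-1][::-1] + word[-1]


def apply_to_text(transform: Callable[[str], str], text: str) -> str:
    """Apply a per-word transform to every run of letters in the text."""
    return WORD.sub(lambda match: transform(match.group()), text)


if __name__ == "__main__":
    rng = random.Random(2004)  # fixed seed so repeated runs give the same shuffle
    sentence = "According to a researcher at Cambridge University"
    print(apply_to_text(lambda w: shuffle_interior(w, rng), sentence))
    print(apply_to_text(reverse_interior, sentence))
```

The reversed version moves every interior letter as far from home as it can go, while a random shuffle of a short word often leaves letters near their original positions, which may be part of why the first example reads so much more easily.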
      • Anidroccg to crad cniyrrag lcitsiugnis planoissefors at an uemannd, utisreviny in Bsitirh Cibmuloa, and crartnoy to the duoibus cmials of the ueticnd rcraeseh, a slpmie, macinahcel ioisrevnn of ianretnl cretcarahs araepps sneiciffut to csufnoe the eadyrevy oekoolnr.

        This would be a lot easier to read without that misplaced comma.
        • Re:How we read... (Score:3, Interesting)

          by mikael (484)
          Interesting. Maybe word recognition uses a small cache to perform error correction if characters are swapped around by 2-3 spaces. In the case of the reversed characters, this won't work.

  • So ... (Score:5, Insightful)

    by Pegasus (13291) on Thursday September 02, 2004 @05:40AM (#10136808) Homepage
    when are they going to repeat these experiments in, let's say, China or Japan? I'm *very* interested in what the conclusions would be there.
    From what I know about Japanese, they don't use spaces between 'words'. A single kanji represents the whole word and their outline is always more or less square. So the whole bouma theory fails here, as he finds out.
    I'm sure they could learn more interesting things in other writing systems ...
    • From what I know about Japanese, they don't use spaces between 'words'. A single kanji represents the whole word and their outline is always more or less square.

      That would probably be Chinese. Written Japanese seems to be a mix-and-match job involving two native phonetic alphabets (one all spiky and angular, and one with a lot of letters that look like pretzels), one imported phonetic alphabet, and lots of Chinese pictograms for good measure...

    • Kanji = picture-based
      English = character-based

      It's like comparing apples and oranges - two completely different ways a written language is interpreted.

      • Re:So ... (Score:4, Informative)

        by macshit (157376) * <(gro.ung) (ta) (selim)> on Thursday September 02, 2004 @07:57AM (#10137226) Homepage
        Kanji = picture-based
        English = character-based

        It's like comparing apples and oranges - two completely different ways a written language is interpreted.


        I think they're not quite as different as many people seem to think though.

        Most kanji are composed of more primitive components. From observing myself reading Japanese, I've noticed that I make many of the same mistakes in recognition, and use similar tricks in recognizing unknown kanji, as I do when reading English. For instance, I frequently confuse two kanji because they have mostly the same primitive components, but differ in one (often the radical -- even though it's arguably the most important part of a kanji, I find I tend to ignore it when reading!).

        In my opinion it's not unreasonable to think of the parts of a kanji as being like letters and the whole thing as being like a word.
    • I am somewhat fluent in Chinese. Though syllables in Chinese (and Korean) approximately fit into squares, they share two characteristics with alphabetic word shapes:

      First, Chinese characters are often composed of several smaller characters, 500 or so, instead of the 70'ish letters and numerals (including capitals) in English. We say such a character may have a "moon" sub-character on the left, a "white" on the right and so on. The sub-characters can be partial clues to meaning and pronunciation (e.g. a
  • by PotatoHead (12771) * <doug@openge[ ]org ['ek.' in gap]> on Thursday September 02, 2004 @05:42AM (#10136814) Homepage Journal
    I found myself becoming aware of how I read while I read. Fun! I agree with the author regarding letter recognition. The parallel aspect of word recognition is very interesting as well because it begins to explain why we are albe ot raed srcambled txet os eaisly!

    Also, more work needs to be done to consider the visual cues outside the focus of attention. It is here that, I believe, shape and form cue the reader, more than letter shapes do, as to the potential content of the text to come. (Exactly how is for the geniuses.)

  • by kahei (466208) on Thursday September 02, 2004 @05:46AM (#10136824) Homepage

    While some of the results here are interesting (but old), the fact that the entire study focuses on exactly 1 script and 1 language basically renders the conclusions worthless (as conclusions about cognition in general... I suppose they still have value as conclusions about English and the Latin script).

    What has happened here is:

    1 -- Observe people reading a given language/script

    2 -- See how they make use of features of that particular language/script, such as tall letters, case, and the occurrence of 'skippable' words such as articles

    3 -- Describe the way they use these local features, and call that a theory of reading in general.

    I don't really understand how to apply a theory of reading based on word and letter shapes when there are so many people reading text in which:

    --There are no letter boundaries, and/or
    --There are no word boundaries, and/or
    --Letters all have the same form factor

    The experiments described would probably generalize very well to Arabic and Greek scripts, pretty well to Cyrillic (no tall/short letters to speak of), badly to Devanagari-type scripts, very badly to Chinese and Japanese, and not at all to hieroglyphics (though I agree that there may never have been a reader of hieroglyphics who was fluent by modern standards).

    To pretend that these experiments apply to humanity in general rather than the author's own language/script choice is silly. It's an interesting article and I'm glad the research was done but unfortunately a certain failure to 'get' the multilingual nature of humanity, which I don't really expect to find in MS work, is in evidence here.

    • by hazem (472289) on Thursday September 02, 2004 @06:05AM (#10136889) Journal
      Everybody seems to be giving this guy a hard time because he did his research for reading only English. My guess is that the guy reads/speaks English and has ready access to people who do the same. This research is a good start and seems to have valuable results.

      Now someone else can work on a PhD Thesis by taking his work and seeing if it applies in other languages.

      Isn't this how science works? You do research, try to make some conclusions, and publish the results. If you wait to publish until you've found the Grand Unified Theory of Everything, then nobody publishes anything and science doesn't advance at all.

      I'm not sure that he missed anything. He has started with what he knows and has resources to study.
    • by olau (314197) on Thursday September 02, 2004 @07:43AM (#10137183) Homepage
      To pretend that these experiments apply to humanity in general rather than the author's own language/script choice is silly.

      You know what is also silly? To pretend that this was the conclusion, although clearly the paper nowhere stated that it had found the grand unified theory of how people read. Here's a hint: when the paper talks about reading, it is obviously talking about reading English.

      Yes, the paper would be even more interesting if it included studies of other scripts, and the failure to acknowledge the existence of other scripts should be criticised. But the rest of your criticism is unfounded.
  • Please (Score:2, Informative)

    by tgv (254536)
    Although it is nice to see my trade mentioned on /., this paper has about the status of a student's essay. It doesn't even mention literature after 1998!
  • or maybe it's both? (Score:5, Interesting)

    by Illserve (56215) on Thursday September 02, 2004 @06:14AM (#10136919)
    If there's one real take-home lesson of brain-design from cognitive science, it's that the brain tends to do everything several different ways in parallel, and then use the results from all of them.

    Obviously it can't all be shape; there are plenty of words with identical shapes, and yet these are distinguishable.

    But it could certainly be true that we use shape and parallel letter recognition at the same time. Shape narrows the field of possibilities from millions to a small handful, and then parallel recognition chooses one of the options.

    Whatever happens, you can be sure it's terribly complicated, extremely robust and very efficient.
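
    A minimal sketch of that two-stage idea, assuming a toy word list and a crude three-class shape code (ascender, descender, x-height). The names `word_shape`, `candidates`, `recognize` and the letter classes are made up for illustration; nothing here comes from the paper:

    ```python
    # Hypothetical letter classes: a rough stand-in for "word shape".
    ASCENDERS = set("bdfhklt")
    DESCENDERS = set("gjpqy")

    def word_shape(word):
        """Encode each letter as A (ascender), D (descender) or x (x-height)."""
        return "".join(
            "A" if c in ASCENDERS else "D" if c in DESCENDERS else "x"
            for c in word.lower()
        )

    def candidates(shape, lexicon):
        """Stage 1: keep only the words whose outline matches the shape."""
        return [w for w in lexicon if word_shape(w) == shape]

    def recognize(letters, lexicon):
        """Stage 2: pick the exact letter match among shape-compatible words."""
        shortlist = candidates(word_shape(letters), lexicon)
        return letters if letters in shortlist else None

    # Toy lexicon (assumed); a real one would hold tens of thousands of words.
    LEXICON = ["than", "then", "thin", "that", "cat", "dog"]
    ```

    Here "than", "then" and "thin" all share one outline, so shape alone only narrows the field; the letter comparison still has to make the final choice.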

  • Don't shout! (Score:5, Interesting)

    by meckardt (113120) on Thursday September 02, 2004 @06:34AM (#10136977) Homepage
    From the article: ...lowercase text is read faster than uppercase text. This could also explain why nobody likes to read email where the other person uses all caps.
    • Re:Don't shout! (Score:5, Informative)

      by Seahawk (70898) <tts AT image DOT dk> on Thursday September 02, 2004 @07:31AM (#10137129)
      And if you had read the rest of the article, you would know that this is just because 99% of all we read is lowercase.

      People can easily be trained to read text in caps as fast as lowercase text - or mirrored text.

      What I fail to understand is how randomizing the middle letters of a word doesn't affect reading much. I had hoped he would use that as an example.

      Tihs is a emxpale of the efecft.
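
      A rough sketch of how such text can be generated, assuming words are split on whitespace and any attached punctuation is treated as part of the word. `scramble_word` and `scramble_text` are illustrative names, not anything from the article:

      ```python
      import random

      def scramble_word(word, rng):
          """Shuffle the interior letters, keeping the first and last in place."""
          if len(word) <= 3:
              return word  # nothing to shuffle in very short words
          middle = list(word[1:-1])
          rng.shuffle(middle)
          return word[0] + "".join(middle) + word[-1]

      def scramble_text(text, seed=0):
          """Scramble every whitespace-separated word; the seed makes runs repeatable."""
          rng = random.Random(seed)
          return " ".join(scramble_word(w, rng) for w in text.split())
      ```

      Calling `scramble_text("This is an example of the effect")` gives a sentence whose words keep their length, first letter and last letter; only the interior order changes.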
  • FTA... (Score:3, Interesting)

    by Anonymous Writer (746272) on Thursday September 02, 2004 @06:45AM (#10137008)
    Why I wrote this paper

    I am a psychologist who has been working for Microsoft in different capacities since 1996. In 2000 I completed my PhD in cognitive psychology from the University of Texas at Austin studying word recognition and reading acquisition. I joined the ClearType team in 2002 to help get a better scientific understanding of the benefits of ClearType and other reading technologies with the goal of achieving a great on-screen reading experience.

    I'm surprised this guy is actually working with ClearType. That is just a simple way of making characters appear better by using sub-pixels to increase character resolution. I would think this type of work would be better applied in optical character recognition, maybe even with cursive handwriting.

  • by Numen (244707) on Thursday September 02, 2004 @06:56AM (#10137028)
    If you have shied away from Microsoft because, well, they're Microsoft, you might not be aware of http://research.microsoft.com, which, regardless of which side of various fences you sit on, has some very interesting material and is generally worth tracking over time.

    Apologies for the tangent; the back of this article seemed an apt place to point to the MS Research site for those who might not have been aware of it.
  • by jsebrech (525647) on Thursday September 02, 2004 @07:41AM (#10137178)
    The internal representations for these models convert the letter information to phonemic information, which is seen as a mandatory step for word recognition. It is well known that words that have a consistent spelling to sound correspondence such as mint, tint, and hint are recognized faster than words that have an inconsistent spelling to sound correspondence such as pint

    I cannot believe this is in a serious paper. Mandatory? Please. What about people born deaf? Are they all unable to read?
  • From TFA:

    Eye movement studies that I will discuss shortly indicate that there are three zones of visual identification. Readers collect information from all three zones during the span of a fixation. Closest to the fixation point is where word recognition takes place. This zone is usually large enough to capture the word being fixated, and often includes smaller function words directly to the right of the fixated word. The next zone extends a few letters past the word recognition zone, and readers gather
  • amusing test... (Score:5, Interesting)

    by zozzi (576178) on Thursday September 02, 2004 @08:43AM (#10137463)
    I enjoy giving people this test: Write a long sentence and make sure that the last word of the sentence is a filler word. Then write that filler word again at the start of the next sentence and write some more. Eg:
    Yesterday I went to the beach and saw the
    the boat I always dreamt about.
    ~ 7 out of 10 people fail to spot it, even if told beforehand there's an obvious error. Somehow music people are more prone to spot the error straight away.
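
    A machine has no such blind spot. A small sketch, assuming words are runs of letters and apostrophes and that case does not matter; `repeated_words` is an illustrative name:

    ```python
    import re

    def repeated_words(text):
        """Return (word, index) pairs where a word is immediately repeated,
        even when the repeat is split across a line break."""
        words = re.findall(r"[A-Za-z']+", text.lower())
        return [
            (w, i)
            for i, (w, nxt) in enumerate(zip(words, words[1:]))
            if w == nxt
        ]

    SAMPLE = "Yesterday I went to the beach and saw the\nthe boat I always dreamt about."
    ```

    Because the text is flattened into a word list first, the line break between the two copies of "the" makes no difference to the check.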
  • Source code? (Score:3, Interesting)

    by maxwell demon (590494) on Thursday September 02, 2004 @09:00AM (#10137581) Journal
    While the study is certainly about reading English texts, could one draw some conclusions about the readability of source code? I guess at least the finding that whitespace governs the jumps of our eyes might have some relevance here.
  • by peter303 (12292) on Thursday September 02, 2004 @09:21AM (#10137822)
    I notice MS Research doing lots of basic research that has never been productized. It's rare to see corporations being so liberal with their resources. Even Google's very imaginative projects seem to be directed towards a commercial goal.

    This suggests an interesting contradiction in MS product strategy. MS has a long history of "clone and conquer", e.g. Excel copied VisiCalc and Lotus 1-2-3. Just this week MS cloned Apple iTunes. Yet MS Research is conducting some very interesting basic research. Go figure!
  • we disagree (Score:3, Insightful)

    by Doc Ruby (173196) on Thursday September 02, 2004 @11:09AM (#10139050) Homepage Journal
    One problem with deciding "word shape" vs. "letters" as the method a reader uses to recognize a word comes from treatment of the reader as "atomic". I am a proficient reader. When I read a word, whether written by another, or by myself (as I type), I have multiple subcurrents of consciousness. A typo in a word might leave me with recognition of the word, and a sense that "something's wrong", simultaneously - it sometimes takes me several seconds to detect the typo, especially if it's one I often make myself. Likewise, some spelling mistakes derive from the difference my spoken accent makes with the written conjugated spelling, most often in the case of syllables separated by an "e" that is pronounced as a "schwa", easily confused with some pronunciations of "i" or "a", and sometimes "y".

    Reading words silently, I sometimes notice an inner chorus pronouncing the words, with one or two discordant notes, even from poorly organized structure or unparseable punctuation. Deciding how people recognize words must also account for how people's minds are organized. The myth of the "undivided self" gets in the way of understanding not only how complex we are "under the hood", where media is digested, but denies credit to our grand integrator, which juggles these partial selves into one face with which to confront the world. As machine intelligence benefits from multiple simultaneous processing, why should they have all the fun? As we mimic our own minds in computer simulations, why should we have all the fun?
  • by cmpalmer (234347) on Thursday September 02, 2004 @12:19PM (#10140031) Homepage
    I read really fast. I also read quite a bit of fantasy and science fiction. I have noticed the effect that weird alien and fantasy names (N'kalogh or Xyztle) are like driving over speedbumps. The higher the density of unfamiliar and nearly unpronounceable names, the more likely I am not to finish the book (or even pick it up).

    "N'kalogh leapt onto his mighty huyloch and rode across the plains of V'looth'u". Next please.

    This paper gives a convincing psychological model of why this occurs, and it is pretty much what I had surmised on my own.

    So, from now on, please name all of your aliens Bob, Larry, Bubba, or Charles.
  • by pandrijeczko (588093) on Thursday September 02, 2004 @12:19PM (#10140035)
    ...revealed that whenever I read the word "Microsoft", my pupils dilate and when I read the word "Longhorn", I fall into a deep sl...

    ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

