
'Culturomics' Spreads From Google Books To Scientific Preprints

Posted by Soulskill
from the no-making-up-words dept.
ananyo writes "The Cultural Observatory at Harvard University in Cambridge, Massachusetts, is to index the whole of the ArXiv pre-print database of papers from the physical sciences, breaking down the full text of the articles into component phrases to see how often a particular word or phrase appears relative to others, a measure of how 'meme-like' a term is. The team has already applied a similar approach to 5 million books in the Google Books database to produce their n-gram viewer. But the Google Books database carries with it a major limitation: because many of the works are under copyright, users cannot be pointed to the actual source material. Applying the tool to ArXiv means it could be used to chart trends in high-energy physics: a quickening pulse of papers citing the Higgs boson, for example, or a peak in papers about supersymmetry, a theory which may soon be waning."
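
For readers unfamiliar with the idea, here is a rough sketch of what "how often a phrase appears relative to others" could mean in practice. It is not the Cultural Observatory's actual pipeline: the whitespace tokenizer, the function names `ngrams` and `relative_frequency`, and the sample sentence are all assumptions made for illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return zip(*(tokens[i:] for i in range(n)))

def relative_frequency(text, phrase):
    """Share of all same-length n-grams in `text` that equal `phrase`."""
    tokens = text.split()            # naive whitespace tokenizer (assumption)
    target = tuple(phrase.split())
    counts = Counter(ngrams(tokens, len(target)))
    total = sum(counts.values())
    # Guard against empty input so we never divide by zero.
    return counts[target] / total if total else 0.0

# Hypothetical sample text, not taken from any real paper.
sample = "the Higgs boson and the Higgs field"
print(relative_frequency(sample, "Higgs"))      # 1-gram share
print(relative_frequency(sample, "the Higgs"))  # 2-gram share
```

Computing the same ratio separately for each publication year would give the kind of trend line an n-gram viewer plots.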
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward

    Did I miss something? All that's been cast into doubt is minimal supersymmetry. One might as well say that unified theories died with Kaluza-Klein. Supersymmetry is still the best solution to the hierarchy problem, so if it is incompatible with nature then we are royally fucked and have little to no clue what the physical principles on this scale are.

  • Since there are only EIGHT comments and I just lost my mod points, here goes without reading The A.

    Of course in Scientific Circles there are Memes, but they're NOT the same ones that go Viral among Biz. Masters.

    Science has to break new ground, so it's Anti-Meme.

    The Memes circulate a level down, somewhere in the Consultant range.

  • by tibit (1762298) on Friday February 24, 2012 @09:05PM (#39154963)

    The link to the n-gram viewer in the submission is wrong. The Ngram Viewer is case-sensitive. The link goes to the uncapitalized search using the terms "spock, skywalker". If you correctly capitalize the terms [google.com], you get results higher by 2 orders of magnitude.

  • 1. However
    2. Moreover
    3. Furthermore
    4. Indeed
    5. Subsequently
    6. Utilized
    7. Methodology
    8. Data not shown
    9. Further research is warranted
    10. Vajazzle
