
Describing The Web With Physics

Fungii writes: "There is a fascinating article over on physicsweb.com about 'The physics of the Web.' It gets a little technical, but it is a really interesting subject, and is well worth a read." And if you missed it a few months ago, the IBM study describing "the bow tie theory" (and a surprisingly disconnected Web) makes a good companion piece. One odd note is the researchers' claim that the Web contains "nearly a billion documents," when one search engine alone claims to index more than a third beyond that, but I guess new and duplicate documents will always make such figures suspect.


  • by soboroff ( 91667 ) on Monday August 06, 2001 @09:57AM (#2110502)
    According to a recent study by Steve Lawrence of the NEC Research Institute in New Jersey and Lee Giles of Pennsylvania State University, the Web contains nearly a billion documents. The documents represent the nodes of this complex network and they are connected by locators, known as URLs, that allow us to navigate from one Web page to another.
    The Lawrence and Giles study was published in 1999, so stop picking on the 1 billion number... it's quite out of date. Web researchers know this already.

    The important thing from that paper is its data on the growth of the web; and from Kumar's bowtie-theory paper, we also think that most of the web is growing in places we can't see.

  • 1,000,000,000 urls (Score:4, Insightful)

    by grammar nazi ( 197303 ) on Sunday August 05, 2001 @10:50PM (#2163212) Journal
    The story mentions "nearly 10^9 urls", so duplicate documents would be counted multiple times.

    Most of their research seems to be on 'static pages'. They state that the entire internet is connected via 16 links (similar to the way that any two people are connected through a chain of five or six acquaintances). I believe that as the ratio of dynamic to static content on the internet increases, it will increase the total number of clicks it takes to get from one site to the next. For example, I could create a website that dynamically generates pages, where the first 19 pages generated all link within my site and only the 20th contains a link to Google.

    The metric functions that they use are good for randomly connected maps, but they don't apply to the internet, where nodes are not randomly connected. Nodes cluster into groups depending on topic or category. For example, one Michael Jackson site links to other Michael Jackson websites.
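    Figures like the "16 links" quoted above come from averaging shortest click-paths between sampled pages. Here is a minimal sketch of that kind of measurement on a toy random web; the graph generator, page count, and links-per-page are illustrative assumptions, not the crawl data the article is based on:

    ```python
    import random
    from collections import deque

    def random_web(n_pages, links_per_page):
        """Toy directed web: every page links out to a handful of random pages."""
        return {p: random.sample(range(n_pages), links_per_page) for p in range(n_pages)}

    def average_clicks(graph, start):
        """Breadth-first search from one page; mean click-distance to every
        page reachable from it."""
        dist = {start: 0}
        queue = deque([start])
        while queue:
            page = queue.popleft()
            for nxt in graph[page]:
                if nxt not in dist:
                    dist[nxt] = dist[page] + 1
                    queue.append(nxt)
        reachable = len(dist) - 1
        return sum(dist.values()) / reachable if reachable else 0.0

    if __name__ == "__main__":
        random.seed(2)
        n = 50_000
        web = random_web(n, 7)
        starts = random.sample(range(n), 20)
        avg = sum(average_clicks(web, s) for s in starts) / len(starts)
        print(f"average separation in the toy web: {avg:.1f} clicks")
    ```

    On a purely random web like this the separation grows only logarithmically with the number of pages, which is why the measured number stays small even for a billion documents; the commenter's point is that topical clustering and dynamic pages could pull the real figure away from this idealized case.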

  • by swordboy ( 472941 ) on Sunday August 05, 2001 @10:53PM (#2163220) Journal
    The Code Red thing was interesting in that, if it had worked, it would have revealed just how *evil* homogeneity is. In nature, homogeneity leads to plagues and similar disasters.

    It turns out that computing may prove similar.

    Different is good!
  • LAIN (Score:1, Insightful)

    by Schezar ( 249629 ) on Sunday August 05, 2001 @10:58PM (#2163243) Homepage Journal
    "The number of nodes in the wired is rapidly approaching the number of cells in the human brain." Or something like that.

    What will happen as the net becomes more and more like a brain? Can it have a soul?

    Or worse, can it comprehend the garbage we use it for? ;^) "Sorry Dave, but I cannot allow you to download that pr0n..."

  • by Anonymous Coward on Sunday August 05, 2001 @11:42PM (#2163371)
    You are confusing the two. The WWW is the documents, etc. The Internet (the DARPA protocol suite, together with the underlying routing protocols and physical connectivity) is simply a transport mechanism. It can carry anything; it just so happens that the web is the most popular use (along with email).
  • by norton_I ( 64015 ) <hobbes@utrek.dhs.org> on Monday August 06, 2001 @12:17AM (#2163462)
    The virus infection threshold is based on something like this model:

    1) Some set of nodes are infected
    2) Each of those nodes has a probability X of infecting its nearest neighbors.
    3) Repeat.
    I just made that up, and there are many opportunities for variations (add the ability for nodes to be cleaned and/or vaccinated), but under models like this:

    random networks have a critical threshold for X: above it, the infection spreads to the whole network; below it, it dies out.

    scale-free networks will have a macroscopic fraction of the network infected for any value of X.

    First of all, there are additional features not captured in this model, which could be important for "viruses" like Bliss that have an extremely low probability of infection.

    Second, the internet is not exactly a scale-free network. As mentioned in the article, while the dominant behavior is a power law, if you go high enough you find exponential cutoffs. This could cause some viruses to die out (I am certain Bliss isn't the only one that never made it).
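    A minimal sketch of the kind of spreading model described above: one run on a random graph and one on a crude preferential-attachment graph, for a few values of X. The graph generators, the recovery probability, and every parameter value are my own assumptions for illustration, not the model from the paper:

    ```python
    import random

    def random_graph(n, p):
        """Erdos-Renyi style graph: each pair of nodes is linked with probability p."""
        adj = {i: set() for i in range(n)}
        for i in range(n):
            for j in range(i + 1, n):
                if random.random() < p:
                    adj[i].add(j)
                    adj[j].add(i)
        return adj

    def preferential_graph(n, m):
        """Crude preferential-attachment graph: each new node links to m existing
        nodes chosen (roughly) in proportion to their current degree."""
        adj = {i: set() for i in range(n)}
        endpoints = []                    # each node appears here once per link it has
        for new in range(m, n):
            if len(endpoints) >= m:
                targets = set(random.sample(endpoints, m))
            else:
                targets = set(range(m))   # bootstrap: link the first newcomer to the seed nodes
            for t in targets:
                adj[new].add(t)
                adj[t].add(new)
                endpoints.extend((new, t))
        return adj

    def spread(adj, x, rounds=50, seeds=5, recover=0.2):
        """SIS-like dynamics: each round, infected nodes hit each neighbor with
        probability x and are cleaned with probability `recover`.  Returns the
        fraction of nodes that were infected at any point."""
        nodes = list(adj)
        infected = set(random.sample(nodes, seeds))
        ever = set(infected)
        for _ in range(rounds):
            newly = {nb for node in infected for nb in adj[node] if random.random() < x}
            infected = {node for node in infected if random.random() > recover} | newly
            ever |= newly
        return len(ever) / len(nodes)

    if __name__ == "__main__":
        random.seed(1)
        n = 2000
        er = random_graph(n, 3 / n)       # mean degree about 3
        pa = preferential_graph(n, 3)
        for x in (0.02, 0.05, 0.10):
            print(f"X={x:.2f}  random: {spread(er, x):.2f}   preferential: {spread(pa, x):.2f}")
    ```

    The qualitative contrast to look for is the one stated above: below the random graph's threshold the outbreak stays small, while the heavy-tailed graph can still sustain a sizeable infected fraction even for small X.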
  • by Anonymous Coward on Monday August 06, 2001 @12:48AM (#2163504)
    This simple model illustrates how growth and preferential attachment jointly lead to the appearance of a hierarchy. A node rich in links increases its connectivity faster than the rest of the nodes because incoming nodes link to it with higher probability; this "rich-gets-richer" phenomenon is present in many competitive systems.
    And there's your explanation for how VHS beat out Beta, QWERTY beat out other arrangements, and Microsoft won out in the OS and apps biz. A small initial advantage gets magnified over time. The wingbeats of a butterfly become a hurricane.
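    A minimal rich-gets-richer sketch along the lines of the quoted passage: each new node attaches a single link to an existing node chosen in proportion to its current degree, so the early nodes' head start compounds. The node count and seed here are arbitrary illustrations, not values from the article:

    ```python
    import random
    from collections import Counter

    def preferential_attachment(n_nodes):
        """Each new node adds one link to an existing node picked in proportion
        to that node's current degree (the 'rich-gets-richer' rule)."""
        degree = Counter({0: 1, 1: 1})   # seed: two nodes joined by one link
        endpoints = [0, 1]               # every link endpoint; picking uniformly from
                                         # this list is picking in proportion to degree
        for new in range(2, n_nodes):
            target = random.choice(endpoints)
            degree[new] += 1
            degree[target] += 1
            endpoints.extend((new, target))
        return degree

    if __name__ == "__main__":
        random.seed(0)
        deg = preferential_attachment(10_000)
        top = deg.most_common(10)
        share = sum(d for _, d in top) / sum(deg.values())
        print(f"the 10 best-connected nodes hold {share:.0%} of all link endpoints")
        print("most-connected node and its degree:", top[0])
    ```

    Rerunning with different seeds gives the same shape: a handful of early, well-linked nodes keep pulling further ahead, which is the dynamic the VHS/QWERTY analogy is pointing at.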
