Describing The Web With Physics
Fungii writes: "There is a fascinating article over on physicsweb.com about 'The physics of the Web.' It gets a little technical, but it is a really interesting subject, and is well worth a read." And if you missed it a few months ago, the IBM study describing "the bow tie theory" (and a surprisingly disconnected Web) makes a good companion piece. One odd note is the researchers' claim that the Web contains "nearly a billion documents," when one search engine alone claims to index more than a third beyond that, but I guess new and duplicate documents will always make such figures suspect.
The Living Net, or the selfish net? (Score:1)
The Lawrence and Giles study was published in 1999... (Score:2, Insightful)
The important thing from that paper is its data on the growth of the web; and from Kumar's bow-tie paper, we also think that most of the web is growing in places we can't see.
got root? (Score:2)
I have to apologize for that one. I was VPN'ed in the other day and I opened MS Explorer on "\\internet". I accidentally selected "www" and hit CTRL-C, CTRL-V, CTRL-V. I guess the "Copy of www" and "Copy (2) of www" tripled the document count on some search engines. My bad.
some thoughts on complex networks... (Score:1)
The article mentions a 0.05% sample... is that statistically significant? Not to mention the fact that 'web page' is a vaguely defined term (e.g. static versus dynamic pages) -- this makes me doubt that this report contains any 'real' conclusions.
However, I suspect this type of research must be really juicy for the big search engine companies (e.g. Google, etc.). I especially like the idea of giving the user a feeling of spatial orientation when browsing the internet (but what would that mean??)... in the end, I'm afraid that the internet/web/whatever is simply changing too fast -- by the time we analyze it enough to determine its topology and organization, something new will be replacing it. Note that the data in this article is already 2 years old... the web has probably at least doubled in size by now.
To really understand the internet, statistical mechanics is not going to cut it -- we need better tools: adaptive ones that learn the new rules without being reprogrammed...
Re:some thoughts on complex networks... (Score:1)
In this case, don't think of it as a
In any case, it is entirely possible for their results to be statistically valid.
Slashdot in Space (or terrain) (Score:1)
I especially like the idea of giving the user a feeling of spatial orientation when browsing the internet (but what would that mean??)...
I'm reluctant to post this without having had more time to revise, but one way of spatializing the data is to turn it into more familiar terrain. Again, this is still early stages, but as an example, how about the terrain described by the hyperlinks surrounding Slashdot in a typical week [washington.edu]?
Re:Interesting... (Score:1)
Physics or math??? (Score:2, Interesting)
Re:Physics or math??? (Score:2)
Dynamic properties (Score:1)
Does anyone know of any studies on this subject?
Lying with statistics for fun and profit (Score:1)
Tread softly here, Grasshopper: the very fact that you can only easily see 16% of the Web means you must expect your sample to be strongly biased, and hence not representative of the Web in its entirety. Just as statistical resampling of the census would require much more care than a political entity can usually bring to bear, so would attempting to extrapolate the Web's characteristics from a random sample.
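You can watch this bias happen in a toy crawl. A minimal sketch (Python with networkx; nx.scale_free_graph is just a stand-in for the real Web, and all the numbers are invented): a crawler that only follows links outward from a seed tends to over-sample heavily linked-to pages, so its degree statistics don't match the full graph.

    import random
    import networkx as nx

    # Toy "web": a directed scale-free graph standing in for the real thing.
    G = nx.scale_free_graph(10000, seed=42)

    # A crawler only sees pages reachable by following links from its seed.
    seed_page = random.choice(list(G.nodes))
    crawled = nx.descendants(G, seed_page) | {seed_page}

    pop_mean = sum(d for _, d in G.in_degree()) / G.number_of_nodes()
    smp_mean = sum(G.in_degree(n) for n in crawled) / len(crawled)

    print(f"crawled {len(crawled)} of {G.number_of_nodes()} pages")
    print(f"mean in-degree: whole graph {pop_mean:.2f}, crawl {smp_mean:.2f}")

The crawl's mean in-degree comes out higher than the whole graph's, because pages nobody links to are never found -- exactly the bias described above.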
hmmm (Score:3, Funny)
*grin*
What about the rest of the "Web" (Score:1)
Vulnerability to Carefully Coordinated Attack? (Score:2, Interesting)
Consider this example, though it isn't meant to be analogous to the internet in any way. What if the President of the USA, the Vice President, the entire Cabinet, the entire Senate, the entire House of Representatives, etc. etc. were simultaneously assassinated? Can you even imagine the ensuing chaos? You can even throw in all the state Governors, whatever, but that still wouldn't come out to more than the top 0.0004% of the country's population, in terms of "political importance" or some other metric. Is this scenario plausible or worth worrying about? You decide.
- The One God of Smilies =)
Re:Vulnerability to Carefully Coordinated Attack? (Score:2)
Re:Vulnerability to Carefully Coordinated Attack? (Score:1)
In networking terms, I don't believe the loss of the top officials would make a bit of difference to the operations of the country. The strength and stability of the government lies in the well-established and huge bureaucracy. That complex system really needs no supervision (though it could use a kick in the ___).
It's interesting that the correlation with the internet breaks down because the President et al are not the hubs of communication. They aren't part of the information network of government; they sit atop it, but separate from it. (They're more like ICANN than Google?)
Unfortunately, humans are not routers, and I wouldn't even try to predict what the emotional effects of a mass assassination would do to the functioning of the government or the nation.
Re:Vulnerability to Carefully Coordinated Attack? (Score:2)
What if the President of the USA, the Vice President, the entire Cabinet, the entire Senate, the entire House of Representatives, etc. etc. were simultaneously assassinated? Can you even imagine ensuing chaos?
It wouldn't be that bad, actually. One of the major strengths of the US government is a fairly clear line of succession; it's always obvious who is in charge in a given situation. And really, it isn't even important who specifically is in charge, just so long as someone is. I doubt we'd really be too worried about it anyway; we'd be far more concerned about what killed them.
The point, though, is that this small minority is also under the best protection. You estimate the number to be 0.0004% of our population (a little over a thousand people); invert that to say they have 250,000 times better security than the rest of us. As this applies to the Internet, we just need to make sure that our main routers have the same level of protection. That rule of thumb makes sense to me: if there are 10,000 machines behind a connection, then it should be 10,000 times harder to take down that connection than a single machine. I know it doesn't sound like a good metric, but it's an interesting thought experiment.
Goatse... (Score:1)
For example, almost every slashdot page links to Goatse.cx more than 20 times...
Wow, that's kind of deep. (Score:1)
It's this sort of technical link that keeps me coming back to slashdot, even though it's not as good as it used to be, and it no longer seems to attract the 31337 intelligent posters of the good old days.
Oh well, nothing lasts forever.
Mob Psychology describes the web better than... (Score:2)
It's interesting, though, that every academic out here has tried to comment on completely unrelated fields using the language of his own area of expertise. I've seen studies by mathematicians who claim to be able to model the web, and even industrial design students who claim that the design-to-maturity process of a network of websites (a small subset of the web) is identical to the processes championed by the industrial designers who led the way in Japan in the late 1970s.
This seems to suggest (to me, anyway) that those who engage in this cross-discipline analysis are somehow unsatisfied with their chosen field and are trying to latch onto an area of study that is popular at a particular moment in time.
--CTH
Re:Wow, that's kind of deep. (Score:1)
Reactions like yours make my day :-)
Re:Wow, that's kind of deep. (Score:1)
Fractals (Score:1)
-Tim
I've got a million of them... (Score:3, Interesting)
PhysicsWeb (Score:2)
Re:PhysicsWeb (Score:1)
1,000,000,000 urls (Score:4, Insightful)
Most of their research seems to be on 'static pages'. They state that the entire internet is connected via 16 links (similar to the way people are connected through 5-6 acquaintances). I believe that as the ratio of dynamic to static content on the internet increases, this will increase the total number of clicks it takes to get from one site to the next. For example, I could create a website that dynamically generates pages, where the first 19 pages generated all link within my site and only the 20th contains a link to Google.
The metric functions they use are good for randomly connected graphs, but they don't apply to the internet, where nodes are not randomly connected. Nodes cluster into groups by topic or category; for example, one Michael Jackson site links to other Michael Jackson websites.
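To put a number on that clustering, here's a minimal sketch (Python with networkx; all parameters invented). It compares a randomly connected graph against one built from dense topic-like communities of the same rough size and degree:

    import networkx as nx

    n, k = 3000, 6                                   # pages, average links per page
    random_g = nx.erdos_renyi_graph(n, k / (n - 1), seed=1)
    # Topic clusters: 300 dense communities of 10 pages, lightly rewired.
    clustered_g = nx.relaxed_caveman_graph(300, 10, 0.05, seed=1)

    print("random graph clustering:   ", nx.average_clustering(random_g))
    print("clustered graph clustering:", nx.average_clustering(clustered_g))

On the random graph the odds that two of a page's neighbors also link to each other are tiny (about k/n); in topic clusters they're enormous, which is why metrics derived for random graphs mislead here.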
99 bottles of beer (Score:3, Funny)
Take one down, pass worms around,
99 million URLs on the net...
Xix.
Re:99 bottles of beer (Score:2)
99 million, 999 thousand, 999 URLs on the net, surely?
Re:1,000,000,000 urls (Score:1)
Especially if Google is caching someone else's content pages.
Ian.
Re:1,000,000,000 urls (Score:1, Offtopic)
www.m-w.com defines "Internet" as a noun.
Re:1,000,000,000 urls (Score:1)
Actually, the article describes the finding that the connectivity of nodes on the web and the Internet follows a power-law distribution instead of the Poisson distribution one would expect with a randomly connected graph. Maybe we should read beyond the introduction of the article before we post?
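If you'd rather see the difference than read about it, here's a toy comparison (Python with networkx; node counts and the degree cutoff of 50 are arbitrary). A Poisson degree distribution has essentially no hubs, while a power law keeps a fat tail of them:

    import networkx as nx

    n = 20000
    random_g = nx.erdos_renyi_graph(n, 6 / n, seed=7)      # degrees ~ Poisson(6)
    scale_free = nx.barabasi_albert_graph(n, 3, seed=7)    # degrees ~ power law

    def hubs(G, cutoff=50):
        return sum(1 for _, d in G.degree() if d >= cutoff)

    print("nodes with degree >= 50:")
    print("  random (Poisson):", hubs(random_g))    # essentially zero
    print("  scale-free      :", hubs(scale_free))  # dozens of genuine hubs

Both graphs have the same average degree; only the tails differ, and the tails are the whole story.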
Re:1,000,000,000 urls (Score:1)
<pedant>Actually, it might be more accurate to say that if they're talking about "documents", then they are talking about 10^9 URIs, not 10^9 URLs. A URL is just one kind of URI: a URI identifies a resource, while a URL additionally tells you where to locate it. There's a difference [w3.org].</pedant>
Or, then again, who cares?
Surprising claim... (Score:1)
Internet is not web (Score:1, Insightful)
Describing the web with biology.... (Score:2, Insightful)
It turns out that computing may prove similar.
Different is good!
Re:Describing the web with biology.... (Score:1)
This is certainly an idea that has been around for a while. Consider a 1989 interview with Clifford Stoll ("The Cuckoo's Egg") in The Boston Globe, in which he suggested that keeping systems diverse and idiosyncratic is what stops a single attack from taking them all down.
Of course, the solution is to make every system idiosyncratic. And (also) of course, this is not anything like a reasonable solution to the problem of security. Rather, we should view networked computing as a whole system--a system that could not even exist but for standardization--and attack the real problems: exploitable weaknesses in widely used software.
Re:Describing the web with biology.... (Score:2)
These figures are normally HTML-only (Score:1)
Also remember that most search engines index only HTML pages, and are probably only counting those pages in their "pages indexed" figures. The web CAN contain other media that may be considered documents; the obvious one is PDF.
Advantage of Scale-Free Topology (Score:2, Interesting)
The problem with getting rid of the current Internet is that we would probably lose the advantage of having a scale-free topology ... something the PhysicsWeb article discusses at length. Scale-free topology is one of the key factors in keeping the current Internet stable and relatively fault-tolerant even as the number of users has grown exponentially. I doubt that those who want to replace an open Internet would create a replacement that would incorporate this type of scale-free topology.
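A rough way to see that fault tolerance (Python with networkx; the 5% knockout fraction and sizes are my own arbitrary choices, not from the article): compare the surviving giant component after random failures versus a coordinated attack on the hubs.

    import random
    import networkx as nx

    n = 10000
    G = nx.barabasi_albert_graph(n, 3, seed=0)      # scale-free stand-in

    def giant_fraction(H):
        return max(len(c) for c in nx.connected_components(H)) / n

    # Random failures: knock out 5% of nodes at random.
    Gr = G.copy()
    Gr.remove_nodes_from(random.sample(list(Gr.nodes), n // 20))

    # Coordinated attack: knock out the 5% best-connected hubs instead.
    Ga = G.copy()
    top = sorted(Ga.degree, key=lambda nd: nd[1], reverse=True)[: n // 20]
    Ga.remove_nodes_from(node for node, _ in top)

    print("giant component after random failures:", giant_fraction(Gr))
    print("giant component after hub attack:     ", giant_fraction(Ga))

Random failures barely dent connectivity; the targeted attack takes a much bigger bite, which is the double-edged property the article describes.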
LAIN (Score:1, Insightful)
What will happen as the net becomes more and more like a brain? Can it have a soul?
Or worse, can it comprehend the garbage we use it for?
Re:LAIN (Score:2)
Would a single cell know whether the whole thing has a soul or not?
Re:LAIN (Score:2)
First of all, the formation of a scale-free network was caused by measurable "evolutionary" pressures for fault tolerance. In the absence of some similar evolutionary advantage to developing a global consciousness, it doesn't seem likely that it would happen spontaneously.
On the other hand, if some (possibly unintentional) goal was aligned with that, I wouldn't be totally surprised if, through maintenance and updates, some form of consciousness arose.
Except: characteristic time scales on the internet are very long compared to connections within the brain. Any large-scale behavior, including consciousness, would be expected to be slower than a human brain by orders of magnitude.
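For what it's worth, forming the scale-free part doesn't require anything exotic: plain "rich get richer" growth is enough. A from-scratch sketch (Python; sizes invented) of degree-proportional attachment:

    import random

    # "Rich get richer": each new node links to an existing node chosen
    # with probability proportional to its degree. Nodes appear in
    # `endpoints` once per link, so a uniform pick is degree-biased.
    endpoints = [0, 1]              # start with a single link: 0 <-> 1
    degree = {0: 1, 1: 1}
    for new in range(2, 10000):
        old = random.choice(endpoints)
        degree[new] = 1
        degree[old] += 1
        endpoints += [new, old]

    print("five biggest hubs:", sorted(degree.values(), reverse=True)[:5])

Run it and a handful of early nodes end up with hundreds of links while almost everyone else has one or two -- hubs for free, no global consciousness required.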
Re:LAIN (Score:1)
Re:LAIN (Score:4, Interesting)
Please don't take this the wrong way, but that's honestly the sort of question I'd expect from someone who doesn't understand computers.
While I believe in the possibility of machine intelligence (along with the moral, ethical, and most importantly philosophical questions that raises), the net is more of a data transfer mechanism than a processing mechanism. Short of very deliberate projects, such as SETI@Home, you just don't have your average machine on the net doing random computation. In that sense, the net really hasn't changed much since its inception. Further, if you did have a distributed consciousness, what would the consequences of lag, network outages, and outright crashes be? In that sense, it would be interesting to see if random/semi-random/genetic algorithms are capable of generating an intelligence capable of coping with such noise. However, I think such issues would rapidly kill off something before it became "evolved" enough to cope. If we do get an intelligence, I think it'll be something that happens on purpose. It may be distributed (maybe as a redundant, non-real-time simulation of a brain), but I doubt it'll be a spontaneous Skynet-like entity.
Re:LAIN (Score:1)
Re:LAIN (Score:1)
Well, in our heads brain cells die left and right, starting from the days when we're still inside the womb, without really noticeable effects... no, wait! Could this explain the presidency of Bush?!
-
Re:LAIN (Score:1)
I think this discussion has been somewhat flawed, in that it has been considering the internet and its human operators as separate entities. However, the Internet is so driven by human action and interaction that it is impossible to view it as the technology alone. Yes, it is just a communications network, but with humans at its nodes, the internet may be able to act as a brain, albeit a primitive one for the moment. Ideas of group consciousness are not new; they can be traced back through Freud and Rousseau in the Western tradition, and possibly just as far or further in the East. That humans organize their social groupings the same way biology organizes brains would seem to lend a bit more credence to this idea. Of course, one could argue that a group consciousness is not as intelligent, self-aware, or responsive as individual consciousness, but in the past, group consciousness has been limited in scope to towns, villages, and tribal groups. Cities and nations push the slowness and dumbness of group consciousness to the point that it is probably meaningless (I doubt that they are nonscaling systems). The Internet, however, allows us to form a non-scaled social network of unprecedented speed and size.
What does this mean? The possibility for consciousness is there, and perhaps the reality is, as well. But a conscious Internet will not go running amok, create a body for itself, or any of that other sci-fi stuff, because it is us. Unless, of course, the whole net condenses on aol or
Oh, and btw, Lain was an AI created on the net, not out of the net, at least as far as I understand Lain, which isn't very far.
Re:LAIN (Score:1)
If the internet became 'alive', we would all see a _lot_ of packets going around that we didn't understand. We would all see hits to, say, our webservers from totally random IPs containing code to take over our machines and change them, as any brain changes itself. Such a system could never arise unnoticed; if it did, the news would be full of stories about the internet slowing down and the 'internet' taking over machines.
Besides I'd get loads of messages in my apache logs..
oh wait
wtf
HELP!
Re:LAIN (Score:1)
Re:LAIN (Score:1)
An Internet-wide AI.... hmm... lag would probably be analogous to senility or Alzheimer's, network outages would be memory loss or brain damage, and crashes would be brain damage as well. However, given that a network will eventually come back up, and no crash lasts forever (although I'm certain MS is working on a five-nines crash), it wouldn't be permanent brain damage. And theoretically, such an AI could become 'accustomed' to lag and work around it.
But it's all still speculation...
Kierthos
when describing (Score:1, Funny)
Re:when describing (Score:1)
The equations are just our human attempts to understand the physics.
You are conflating the subject as taught in school with the subject matter itself.
Read the fscking article... (Score:5, Interesting)
Look deeper, grasshopper:
Hey, Timothy, next time try reading the article instead of skimming it.
But the real question is... (Score:1)
So, how many clicks does it take to get to the home page of Kevin Bacon?
-Joe
complexity and deregulation (Score:3, Interesting)
That's not to say that understanding the architecture and dynamics of the various layers of complexity won't provide an answer ... and not because I think such disciplines suck, but because we have and will continue to have commercial influences on how networks are established.
Certainly some, in fact many, businesses will hire well and follow good practice. The problem comes about when some large companies don't. Or worse, when mergers and buyouts occur; Verizon, CIHost, and a few others come to mind.
Not to sound anti-business, because business has footed much of the bill for Internet expansion ... but rather to voice concern that sometimes there is a big disparity between technical solutions and the shareholders' bottom line.
Re:complexity and deregulation (Score:2)
What problem are you talking about? Their research found that the current structure of the internet is extremely resilient to random attacks. Yes, co-ordinated attacks against key routers could work, but every network has some vulnerability, and the best solution is probably just to make sure the few key routers are well-protected and hidden. As Mark Twain might have put it, there's no problem here that needs solving, so I don't know where you're going with this "Not to sound anti-business" rant. The current chaotic approach to building network infrastructure works great, just like many natural systems.
critical threshold for virus spreading (Score:1)
The mere existence of that term IMHO shows that the threshold is greater than 0.
Re:critical threshold for virus spreading (Score:3, Insightful)
1) Some set of nodes is infected.
2) Each of those nodes has a probability X of infecting its nearest neighbors.
3) Repeat.
I just made that up, and there are many opportunities for variations (add the ability for nodes to be cleaned and/or vaccinated), but under models like this:
random networks have a critical threshold for X: above it, the infection spreads to the whole network; below it, it dies out.
scale-free networks have a macroscopic fraction of the network infected for any value of X.
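Here's a hedged sketch of that kind of model, with the "cleaning" variation added (Python with networkx; the infection rate, recovery rate, and sizes are all invented, and single runs are noisy). With these numbers the outbreak typically dies out on the random graph but smolders on indefinitely on the scale-free one:

    import random
    import networkx as nx

    def sis(G, x, recover=0.5, steps=200, seeds=100):
        # Toy SIS model: infected nodes infect each neighbor with
        # probability x, and recover with probability `recover`, per step.
        infected = set(random.sample(list(G.nodes), seeds))
        for _ in range(steps):
            new = {nbr for n in infected for nbr in G[n] if random.random() < x}
            infected = {n for n in infected if random.random() > recover} | new
            if not infected:
                break
        return len(infected) / G.number_of_nodes()

    n, x = 10000, 0.05   # x/recover = 0.1, below the random graph's threshold
    random_g = nx.erdos_renyi_graph(n, 6 / n, seed=2)
    scale_free = nx.barabasi_albert_graph(n, 3, seed=2)

    print("still infected, random graph:    ", sis(random_g, x))
    print("still infected, scale-free graph:", sis(scale_free, x))

The hubs are what keep it alive: infect one highly connected node and it reseeds the epidemic faster than recovery can clean it up.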
First of all, there are additional features not captured in this model, which could be important for "viruses" like Bliss, which have an extremely low probability of infection.
Second, the internet is not exactly a scale-free network. As mentioned in the article, while the dominant behavior is a power law, if you go high enough you find exponential cutoffs. This could cause some viruses to die out (I am certain Bliss isn't the only one that never made it).
i had a feeling... (Score:2)
IBM "bow tie" paper (Score:4, Interesting)
In "Graph structure in the web [ibm.com]," Kumar et al. divide 200 million web pages into four categories of roughly equal size:
So is your home page an innie or an outie?
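Easy enough to check: the bow-tie decomposition falls out of standard graph algorithms. A sketch (Python with networkx; nx.scale_free_graph is a toy stand-in, not IBM's 200-million-page crawl):

    import networkx as nx

    # Toy directed "web" to decompose.
    G = nx.DiGraph(nx.scale_free_graph(2000, seed=3))

    # The knot of the bow tie: the largest strongly connected component.
    core = max(nx.strongly_connected_components(G), key=len)
    rep = next(iter(core))

    out_set = nx.descendants(G, rep) - core   # reachable from the core, no way back
    in_set = nx.ancestors(G, rep) - core      # can reach the core, unreachable from it
    rest = set(G) - core - out_set - in_set   # tendrils and islands

    for name, pages in [("core", core), ("IN (innies)", in_set),
                        ("OUT (outies)", out_set), ("tendrils/other", rest)]:
        print(f"{name}: {len(pages)} pages")

Feed it a real link graph instead of the toy generator and it tells you exactly which quadrant of the bow tie your home page sits in.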
I am infinitely grateful... (Score:1)
I sucked at that! (As I imagine many
:)
Microsoft, Betamax, Qwerty, oh my (Score:1, Insightful)
Relation of Net Connections to Neural Nets (Score:1)
Re:Interesting, but flawed. (Score:2)
Why do systems as different as the Internet, which is a physical network, and the Web, which is virtual, develop similar scale-free networks?
They go on to describe some properties of scale free networks and mention some interesting examples from physics.
So, in summary, you have completely misunderstood the article.
Re:Interesting, but flawed. (Score:3, Informative)
The interesting thing is that both the web and the physical network follow this power-law structure (or scale-free, as the "Physics Boys" call it).
Oh, you don't think it's possible to study the physical structure of the internet? I'd like to introduce you to a new and powerful tool called traceroute [yes, that was sarcasm]. BTW, you can buy maps of the internet [thinkgeek.com] from ThinkGeek [www.thinkgeek], in case traceroute is too much for you.
How the hell did that guy get modded up, anyway?
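And if you'd rather script it than squint at a poster, a minimal sketch (Python shelling out to the system traceroute; assumes a Unix-ish box with traceroute installed, and note some versions print a header line to stderr rather than stdout):

    import subprocess

    # One probe of the Internet's physical structure: list the router
    # hops between this machine and a host.
    host = "slashdot.org"
    result = subprocess.run(["traceroute", host], capture_output=True, text=True)

    hop_lines = result.stdout.strip().splitlines()
    print(f"{len(hop_lines)} router hops to {host}:")
    for line in hop_lines:
        print(line)

Run it from a few different vantage points and the hub-heavy physical structure the article talks about starts showing up in your own data.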
yeah, right... (Score:1)