National Virtual Observatory
scubacuda writes "According to this Technology Review article, U.S. astronomers (compliments of a $10M grant from the National Science Foundation) are building a National Virtual Observatory to make terabytes of astronomical data accessible to a web browser. One interesting challenge is how the scientists are going to query so many *different* distributed databases (which they're leaving in their respective places to avoid clogging network bandwidth)."
"Open Source" Knowledge (Score:3, Interesting)
The problem of data interfaces and the layman (Score:5, Interesting)
It seems that there is simply going to be a huge amount of data, cross-referenced and collated. From the second page of the article, it seems to include pictorial data. I also hear talk of XML being thrown around, which is a good start, but there's a lot that goes into that transition. Are they looking to set the layman bar at "your novice astronomer", "the third grade science report", or "grad student"? Where is this information really being targeted at the sub-obscure level?
While I don't want to trivialize their massive IT effort, it seems that a lot of this is going to come down to the end user of the data. Their sample study [caltech.edu] using this information isn't trivial stuff, and does seem to set the aforementioned bar somewhere at the undergrad-to-graduate level. Perhaps that is the nature of the data (I'm not that familiar with it). There's an XML schema, some request examples, and other framework stuff already in place for potential client writers to view.
I'm glad to see XML being done the right way (by collaboration with its end users), and those pictures
Anyone closer to the project know of any simplification efforts?
--jaybonci
Seems like $10 million might not be enough (Score:4, Interesting)
All in all, though, it seems like a good use for those tax dollars. The "Google" of astronomy research is an attractive idea, and I know we'll get some great new acronyms in the deal.
Virtual astronomy (Score:4, Interesting)
Re:How much will this data get re-analyzed? (Score:2, Interesting)
As another example, people still use the plate archives at Harvard. Many of these plates are over 100 years old. Astronomical data gets reused.
Making it accessible to lay people is important (Score:3, Interesting)
While the main benefits of the virtual observatory will be to researchers, the $10 million is only the start. More money will be needed, and the way to get more money is to make the project popular with voters.
There are two examples of indexing large databases for the masses that come to mind. One is Google, and the other is Amazon.
Google ranks items by how popular they are, based in large part on how many links there are to the web page. Amazon gives you a list of books other customers bought when they bought the book you found in your search.
For astronomical data and images, something like those approaches could be quite entertaining. I could go to a popularity list to see which images and data everyone else was looking at (a million flies can't be wrong...). But then, like the Internet Movie Database, it would be fun to see other images and data that were most often found in the same papers or web pages as this item. Somewhat like the Science Citation Index (or the Kevin Bacon game).
Users could also rate the images and data. Then we could have lists such as "people who liked this nebula also liked these HST photos". Images could be grouped by popular use -- "Images most often used as wallpaper", "Images most often used by science magazines", "Data most often used by newspapers", etc.
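To make the idea concrete, here is a rough sketch of the "found in the same papers" ranking and the popularity list. Everything in it is made up for illustration: the item names, the PAPERS sample, and the function names are assumptions, not anything from the NVO project.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample: each inner list is the set of items one paper uses.
PAPERS = [
    ["orion", "crab"],
    ["orion", "crab", "andromeda"],
    ["orion", "andromeda"],
    ["orion", "crab"],
]

def popularity(papers):
    """Count how many papers mention each item (the 'popularity list')."""
    return Counter(item for items in papers for item in set(items))

def co_occurrence(papers):
    """Count how often each unordered pair of items shares a paper."""
    pairs = Counter()
    for items in papers:
        for a, b in combinations(sorted(set(items)), 2):
            pairs[(a, b)] += 1
    return pairs

def also_seen_with(item, papers, top=3):
    """Items most often found in the same papers as `item`, best first."""
    counts = Counter()
    for (a, b), n in co_occurrence(papers).items():
        if a == item:
            counts[b] += n
        elif b == item:
            counts[a] += n
    return [name for name, _ in counts.most_common(top)]
```

With this sample, also_seen_with("orion", PAPERS) puts "crab" ahead of "andromeda". User ratings could be handled the same way by counting pairs of items that the same user rated highly instead of pairs that share a paper.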
Re:How much will this data get re-analyzed? (Score:4, Interesting)
To elaborate on that, at my (old) institute [astro.uio.no] people are discouraged from embarking on a thesis that requires them to obtain original data; it is too risky.
To get observation time, you would have to write a really good proposal; most major observatories have at least three times as many applications as they have time for. If you're lucky enough to get time, it is maybe half a year into the future, and you're getting three nights to complete everything.
You spend that time preparing everything, just to come down to the observatory, and you're in the fog for three nights! Tough luck, you've spent all that time preparing, and you're now one year behind schedule...
I did three observation runs during my thesis work, two as Observing Astronomer (who is kind of the guy deciding what to look at, when, and for how long at the telescope; the PI is the guy who decides what the project is about). My own thesis was purely theoretical, and I was happy about that, because on one run we had a total of ten nights (it is rare to get so many nights; it was a world-wide collaboration), and we got one full night + 3 hours on two other nights worth of observation. It's extremely frustrating to sit there getting nothing because of humidity, I can tell you, and if that had been a part of my thesis, I'd be in deep trouble.
Re:P2P as an alternative (Score:2, Interesting)
Why web browser? (Score:1, Interesting)
Just because the web exists doesn't mean that it should be used for everything, even if it can, especially since this project isn't going to be accessible to the general public. A small custom cross-platform client application would make much more sense depending on the data being accessed. By not being a completely dumb client, it would probably allow for more efficient automation of searching and repetitive tasks as well.
I hope they considered what tasks the end-users will actually be doing with the data and are going to allow them the flexibility to be creative in their manipulation and searches.