National Virtual Observatory

Posted by michael on Sunday December 01, 2002 @08:22AM from the number-crunching dept.

scubacuda writes "According to this Technology Review article, U.S. astronomers (courtesy of a $10M grant from the National Science Foundation) are building a National Virtual Observatory to make terabytes of astronomical data accessible from a web browser. One interesting challenge is how the scientists are going to query so many *different* distributed databases (which they're leaving in their respective places to avoid clogging network bandwidth)."

This is reminiscent of (Score:3, Informative)

by Frederique Coq-Bloqu ( 628621 ) writes: on Sunday December 01, 2002 @09:14AM (#4787220) Journal

the SkyView Virtual Observatory [nasa.gov] run by NASA, though I suspect this National one will be far more sophisticated. Cheers.

Microsoft involvement? (Score:3, Informative)

by bunyip ( 17018 ) writes: on Sunday December 01, 2002 @09:23AM (#4787237)

Jim Gray, at Microsoft Research, has coauthored papers on this topic with at least one of the researchers mentioned in the article. There is some really good reading at:

http://research.microsoft.com/~Gray/JimGrayHomePageSummary.htm

Alan.

Solution (Score:2, Informative)

by SpitFU ( 617828 ) writes: on Sunday December 01, 2002 @09:26AM (#4787239)

I don't know how much data they are actually talking about, but I can offer up a solution.

Some of you might disagree. I've run into a scalable piece of software which will interrogate all their information sources regardless of their storage format, index them, and still leave them all in their respective locations.

Autonomy Inc. [autonomy.com] has a product called DRE AXE which is also XML compliant. They have a pretty simple API to work with, and I have even seen it work with Java, PHP, and Perl. The query engine is extremely fast and supports layman's terms: it handles both Boolean and natural-language queries. Check them out; I've been administering their products for about 2 to 3 years now.

Ok, Ok, I'm giving them a plug, but hey their product works well.

Re:The problem of data interfaces and the layman (Score:3, Informative)

by LewisBruck ( 630506 ) writes: on Sunday December 01, 2002 @09:27AM (#4787240)

Take a look at SkyServer [sdss.org] for an "inverse TerraServer". It was co-developed by Jim Gray of Microsoft Research, one of the developers of TerraServer. In fact, that is how he describes the new project :-)

Re:The problem of data interfaces and the layman (Score:5, Informative)

by KjetilK ( 186133 ) writes: <kjetil AT kjernsmo DOT net> on Sunday December 01, 2002 @09:37AM (#4787253) Homepage Journal

IAAABDTTALA (I Am An Astronomer, But Don't Take This As Legal Advice), and I doubt that they are actually aiming this at the layman. What they are doing is opening it up to everyone, and everyone is free to use it and learn how to use it, but really, you expect mainly professional astronomers to use it.
There are lots of databases that follow this philosophy already: the NASA Astrophysics Data System [harvard.edu], the Digitized Sky Survey [stsci.edu], not to speak of the larger arxiv.org [arxiv.org]. You can all grab whatever you like from there.
That being said, there are a number of amateur astronomers who are extremely dedicated and are willing to obtain the skill needed to use such a system, even if there is a tough learning curve. These can be considered "laymen", but they are actually very good at what they do. That's the kind of "laymen" you would expect to use it. Not Joe Sixpack, but the people who are dedicated enough to learn how to use it.

Cool, but some links... (Score:4, Informative)

by mraymer ( 516227 ) writes: <mraymer&centurytel,net> on Sunday December 01, 2002 @09:43AM (#4787262) Homepage Journal

While this sounds like a cool idea (terabytes?!), there is already a lot of astronomical data out there in the APOD archives [nasa.gov], which is the largest collection of annotated astronomy pics on the Web.
Also, I have to mention Celestia [shatters.net], a great Space Simulator, similar to OpenUniverse.
In closing, let me say that I think people should take more of an interest in astronomy, as the understanding and exploration of space is one of the most important goals humans should have if they wish to survive longer than 500 million years or so.

Re:The problem of data interfaces and the layman (Score:4, Informative)

by ghostlibrary ( 450718 ) writes: on Sunday December 01, 2002 @09:57AM (#4787281) Homepage Journal

Most/all astronomical data is in FITS format. That which isn't, often gets FITSized when put into archives.

All you really need to know about FITS is: it is well specified, there are lots of tools for it, and it has an ASCII (human-readable) header describing the data, followed by specifically formatted binary data.
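
To make the "ASCII header, then binary data" layout concrete, here is a minimal sketch using only the Python standard library. It relies on the layout in the FITS standard (80-character header "cards" packed into 2880-byte blocks, ending with an END card). The sample header values and the helper names make_header and parse_header are made up for illustration, and the value parsing is deliberately naive (it would mangle string values containing a slash).

```python
# Minimal sketch: build and parse a FITS-style primary header in memory.
# Assumptions: fixed-format cards only; sample keywords/values are invented.

CARD = 80      # every header card is 80 ASCII characters
BLOCK = 2880   # headers are padded to a multiple of 2880 bytes


def make_header(cards):
    """Build header bytes from (keyword, value) pairs (illustrative helper)."""
    lines = [f"{key:<8}= {value:>20}".ljust(CARD) for key, value in cards]
    lines.append("END".ljust(CARD))
    raw = "".join(lines).encode("ascii")
    # Pad with spaces to a whole number of 2880-byte blocks.
    return raw + b" " * ((-len(raw)) % BLOCK)


def parse_header(raw):
    """Return {keyword: value string} for all cards before the END card."""
    header = {}
    for offset in range(0, len(raw), CARD):
        card = raw[offset:offset + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] == "= ":
            # Naive: drop anything after a "/" (the comment) and trim spaces.
            header[keyword] = card[10:].split("/", 1)[0].strip()
    return header


sample = make_header([("SIMPLE", "T"), ("BITPIX", "16"), ("NAXIS", "2"),
                      ("NAXIS1", "512"), ("NAXIS2", "512")])
header = parse_header(sample)
print(header)
```

In a real file the binary data would start right after the padded header block, with its shape and element size given by the NAXIS and BITPIX cards.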

Also, since most data archives are large, single location repositories (e.g. CHANDRA data), and many data archives are already combined with other sets (e.g. HEASARC.gsfc.nasa.gov), there's a relatively small number of sites providing data (relative to, say, the number of sourceforge projects).

The astronomy community has been providing its data via the web for years now, usually localized by wavelength (e.g. radio archive in 1 place, X-ray data in another). The Virtual Observatory is just a layer on top to simplify access.

And for NASA data, it always goes public 1 year after the observation, so this isn't a new concept, just a better way to get at the data.

Re:How much will this data get re-analyzed? (Score:5, Informative)

by ghostlibrary ( 450718 ) writes: on Sunday December 01, 2002 @10:06AM (#4787293) Homepage Journal

A lot of astronomy data is looked at by its principal investigator (PI) for something specific. Really, data has 5 'lives'.

1) The original proposal by the PI, e.g. 'looking for coronal emissions from DI Peg, an Algol-type system'. Sort of the pass/fail of the research world.

2) Survey. Someone decides to do a survey study among existing data, e.g. "Light curves from all Algol-type systems".

3) Unexpected. Someone finds a new thing to look for, sometimes due to better theoretical understanding. "Coronal sources should be iron-enhanced, so let's reanalyze DI Peg, specifically looking for iron lines."

4) Data-mining. Searching an archive for a given property. "Looking for all sources with X-ray emission above a given threshold... hey, DI Peg matched!"

5) Grad students. Doing their thesis on a topic, use archival data to support. "Dissertation on coronal systems, using data from DI Peg and others".

So data is often used beyond its initial acquisition!
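
As a toy version of life 4 (data-mining), the sketch below scans a list of archive records for sources above a threshold. The record layout, the names Observation and mine, and every count rate are invented for illustration; a real archive query would run against the archive's own database.

```python
# Sketch of "life" 4: search an archive for a given property.
# All records and count rates below are made-up illustration values.
from dataclasses import dataclass


@dataclass
class Observation:
    target: str
    band: str
    count_rate: float  # counts per second (hypothetical numbers)


ARCHIVE = [
    Observation("DI Peg", "X-ray", 0.42),
    Observation("Algol", "X-ray", 1.90),
    Observation("Example Star A", "X-ray", 0.03),
    Observation("DI Peg", "radio", 0.10),
]


def mine(archive, band, threshold):
    """Return the sorted, de-duplicated targets in `band` above `threshold`."""
    return sorted({obs.target for obs in archive
                   if obs.band == band and obs.count_rate > threshold})


matches = mine(ARCHIVE, "X-ray", 0.25)
print(matches)  # DI Peg shows up even though nobody proposed to look for it
```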

Re:Web Browser (Score:5, Informative)

by ghostlibrary ( 450718 ) writes: on Sunday December 01, 2002 @10:09AM (#4787301) Homepage Journal

Is it possible to look at the universe with, say, lynx?

I know this was a joke, but that's actually a topic debated by webmasters at GSFC. In theory, all NASA web pages should be accessible, e.g. all browsers, readers for the blind, etc.

For images, this means descriptive image 'alt' tags. For links, it means including a link description. But what to do for data?

It's kinda subtle. The best answer is 'give data informative tags that can be domain-specific.' "Image 5b" is useless. Saying "DI Peg data, X-ray wavelengths, reduced, FITS format" is good but tedious for whoever makes the page. Giving a spec like 'ASCA dataset1, DI Peg, FITS, reduced' is something that could likely be automatically generated, and it fits the bill.
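
A rough sketch of what "automatically generated" could look like: build the label from metadata the archive already holds and put it into the link. The field names, the file path, and the functions dataset_label and link_html are hypothetical, not an actual NASA or GSFC convention.

```python
# Sketch: derive a short, domain-specific label from dataset metadata so that
# text browsers, screen readers and scripts can tell what a link points to.
# Field names and the example URL are assumptions made for illustration.
from html import escape


def dataset_label(meta):
    """Join the non-empty metadata fields into one comma-separated label."""
    order = ("mission", "target", "format", "processing")
    return ", ".join(str(meta[key]) for key in order if meta.get(key))


def link_html(url, meta):
    """Render an anchor whose title and link text both carry the label."""
    label = escape(dataset_label(meta), quote=True)
    return f'<a href="{escape(url, quote=True)}" title="{label}">{label}</a>'


meta = {"mission": "ASCA dataset1", "target": "DI Peg",
        "format": "FITS", "processing": "reduced"}
label = dataset_label(meta)
html_link = link_html("data/dipeg.fits", meta)
print(label)
print(html_link)
```

Because the label comes from structured fields, the same dictionary could also feed a machine-readable listing for data-hunting scripts.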

But the issue of folks using non-visual browsers is pretty real. Besides lynx and browsers for the blind, there's also data hunting scripts and programs that need to figure out what is on a page, and so it's a problem worth solving.

some details (Score:4, Informative)

by niall2 ( 192734 ) writes: on Sunday December 01, 2002 @12:24PM (#4787677) Homepage

I am involved somewhat in the development of the Virtual Observatory. There are some details that often get overlooked in articles about the VO. First off, it's more than putting data on the web. That we do already (the Hubble Space Telescope archive is a 7+ terabyte archive that is on the web). The real challenge is to make an infrastructure to allow these archives and terabyte databases to interact with grid computing services. We have been working on this for several months now and are preparing some demos of the technology for the January American Astronomical Society meeting in Seattle.

An example of such a VO project is the Galaxy Morphology demo. We take catalogs of a cluster of galaxies from one source, identify those sources with emission from a separate catalog, fetch images of all of those galaxies, and send the images and brightness information to a grid computing service that calculates the morphology of the galaxies, sending this result to the user to visualize in a VO-compliant piece of software. The user did nothing but pick the cluster and then look at the results. That is much more than simply putting data on the web. And once this service is developed, it can simply be put into a web page for others to use and learn from.
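
The step of identifying catalog sources in a second catalog is essentially a positional cross-match. The sketch below shows one simple way to do it (nearest neighbour within a search radius, using the haversine formula for angular separation). It is not the demo's actual code; the catalog entries, coordinates, and the 5 arcsecond radius are invented for illustration.

```python
# Sketch: match galaxies from one catalog to sources in another by sky position.
# Catalog rows are (name, ra_deg, dec_deg); all values here are made up.
import math


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def cross_match(optical, emitters, radius_deg):
    """Pair each optical galaxy with its nearest emitter within radius_deg."""
    pairs = []
    for name, ra, dec in optical:
        best = min(emitters,
                   key=lambda e: angular_sep_deg(ra, dec, e[1], e[2]),
                   default=None)
        if best and angular_sep_deg(ra, dec, best[1], best[2]) <= radius_deg:
            pairs.append((name, best[0]))
    return pairs


optical = [("gal-1", 10.000, 20.000),
           ("gal-2", 10.500, 20.100),
           ("gal-3", 11.000, 19.900)]
emitters = [("src-A", 10.0005, 20.0003),
            ("src-B", 11.0002, 19.9001)]

matched = cross_match(optical, emitters, radius_deg=5 / 3600)  # 5 arcsec
print(matched)
```

This brute-force loop compares every pair, which is fine for one cluster but would need a spatial index for whole-sky catalogs.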

Most of this involves creating simple-to-use yet potentially powerful interfaces to services. While we are not using true RPCs like SOAP yet, the idea is to create standard interfaces to things like image servers, catalog servers, and the like. With those services in place, we will extend beyond them to data and service discovery. Standard data and metadata formats are also being developed, as are common data models, all with the intent that these will make data and service exchange simpler. This all leads to service registries, where many applications will go to discover data and services that could be used for a particular project.
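
To illustrate the idea of a standard interface plus a registry, here is a small in-memory sketch. Everything in it is hypothetical (the CatalogService interface, the search method, the two archive classes, and the records); it is not the VO's actual interface, only a picture of how two archives with different native layouts can answer the same query while their data stays where it is.

```python
# Sketch: a shared catalog interface, two differently structured backends,
# and a registry that lets a client discover them. All names are assumptions.
from typing import Protocol


class CatalogService(Protocol):
    def search(self, ra: float, dec: float, radius: float) -> list: ...


class RowArchive:
    """Backend that stores plain tuples (name, ra, dec)."""

    def __init__(self, rows):
        self.rows = rows

    def search(self, ra, dec, radius):
        # Crude square box around the position, not a true cone search.
        return [{"name": n, "ra": r, "dec": d} for n, r, d in self.rows
                if abs(r - ra) <= radius and abs(d - dec) <= radius]


class DictArchive:
    """Backend with a different native layout, adapted to the same interface."""

    def __init__(self, records):
        self.records = records

    def search(self, ra, dec, radius):
        return [{"name": rec["id"], "ra": rec["pos"][0], "dec": rec["pos"][1]}
                for rec in self.records
                if abs(rec["pos"][0] - ra) <= radius
                and abs(rec["pos"][1] - dec) <= radius]


class Registry:
    """Maps a service kind (e.g. 'catalog') to the services offering it."""

    def __init__(self):
        self._by_kind = {}

    def register(self, kind, service):
        self._by_kind.setdefault(kind, []).append(service)

    def discover(self, kind):
        return list(self._by_kind.get(kind, []))


registry = Registry()
registry.register("catalog", RowArchive([("opt-1", 10.0, 20.0),
                                         ("opt-2", 50.0, -5.0)]))
registry.register("catalog", DictArchive([{"id": "xray-7", "pos": (10.1, 19.9)}]))

# The client discovers services, then sends the same query to each one.
results = [hit for svc in registry.discover("catalog")
           for hit in svc.search(10.0, 20.0, 0.5)]
names = [hit["name"] for hit in results]
print(names)
```

The point of the design is that the client code never needs to know how each archive stores its rows, only that it answers the agreed query.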

Jim Gray is involved with the project. He led the TerraServer project at Microsoft Research. He found that, as he put it, images of the earth are worth money; those of the stars are not. Because of this, the research he was doing on distributed data with the TerraServer project was running into snags where making money hindered access to the data. This turns out not to be true for astronomical data. Hence he is now looking up rather than down. A version of TerraServer for different parts of the VO is in the works.

There will be usage points for people all the way from my mother who loves astronomical wallpaper to the hard core researcher and all points in between. Public outreach is being built in at the ground level, so this is not just for astronomers. Many of these will be web-based interfaces to the VO, but others may be simple toolkits to make your own services. Some could be simple enough to use for basic science projects in school, some may be for science fair level projects, and some for people to develop educational web-based lesson plans.

Yes, 10 million dollars seems small. But it's a start. And we are not the only ones working on VO technologies. The Europeans have their own VO, as do Canada, Russia, India... The divisions are mostly political (each funding agency has its own VO title). The IVO has been established to act as a steering body to help us share efforts and make things interoperable from the start.

Re:some questions (Score:2, Informative)

by niall2 ( 192734 ) writes: on Sunday December 01, 2002 @09:36PM (#4790311) Homepage

The Sloan survey is a major part of the VO, as will be the 2MASS all-sky near-infrared survey [caltech.edu] and many other surveys to come. This is not something that will be limited to a particular mission or archive, but infrastructure to allow interaction between these data and service sources.
