NASA Requests Help With Von Braun's Notes
DynaSoar writes "NASA is soliciting ideas from the public on how best to catalog and digitize the collected notes of Wernher von Braun. 'We're looking for creative ways to get it out to the public,' said project manager Jason Crusan. 'We don't always do the best with putting out large sets of data like this.' The PDF notes are those of rocket scientist Wernher von Braun, the first director of NASA's Marshall Spaceflight Center in Huntsville, Alabama and are typed with copious handwritten notes in the margin. According to the official request for information, NASA needs ideas on what format to use (PDF), how to index the notes, and how to create a useful database. The unique nature and historical value of the data, literally discovered in boxes six months ago, is what motivated NASA to ask the public for ideas."
NASA (Score:5, Insightful)
Re:NASA (Score:5, Funny)
Re:NASA (Score:4, Funny)
Wow...I didn't know they had that position?!?!
I'm not sure I'd WANT to be fist director....sounds like more of a strange pr0n thing than a NASA office.
Re: (Score:1)
Don't say that he's hypocritical
Rather say that he's apolitical
"Vunce ze rockets are up, who cares vere zey come down
"Zats not mein department!" [suite101.com] says Werner von Braun
Re: (Score:2)
I'm not sure I'd WANT to be fist director....
What, you don't want total freedom to punch whoever you feel appropriate? Sometimes, that's the only way to get a bureaucracy moving.
Re: (Score:2)
Next week: What to do with this big golden box thing? We tried opening it and some guy's face melted.
Guy 1: It's the Ark of the Covenant!
Guy 2: No, it's a spare reactor core. Same effect.
Re: (Score:3, Funny)
NASA: We already have top men on that.
Slashdot: But wh--
NASA: Top. Men.
(My favorite line. Uttered by the actor who played Porkins, IIRC.)
Re: (Score:2)
"We tried opening it and some guy's face melted."
WONTFIX. This behaviour is by design. RTFM.
Re: (Score:2)
Good book.
Re: (Score:2)
I assure you that they have top men working on it right now.
Re: (Score:1, Redundant)
TOP men.
Re: (Score:3, Interesting)
Not sure if I can really blame them.
This past weekend I had a garage sale and, as I was clearing stuff, realized how much junk paperwork I had stashed in the garage. There were books, manuals, class notes, lecture notes (from those I attended and those I gave), meeting notebooks, documentation on long obsolete processes (Token Ring MAU reset procedures, Novell Netware rebuild procedures). I had notebooks of stories, embarrassing journal entries from college ("DH has the most beautiful eyes!!"), and all sort
Re: (Score:2)
NASA generates in less than one hour what it's taken me a lifetime to accrete.
The only real difference is that research and exploration is actually (ostensibly anyway) NASA's goal and reason for existence. When you do research or exploration, it goes without saying that you need to catalog the fruit of your exercises. Unfortunately though, in reality NASA's main goal is and always has been to play a very expensive game of keeping-ahead-of-the-Joneses. First against the Russians, and now maybe the Chinese and the EU. Data is secondary to getting lovely expensive shiny machinery to far
Outsource it to China? (Score:1)
Just use one of those companies that is always spamming me to do piecemeal typesetting... though I'm betting there's someone in North Korea who could do it for even cheaper.
Re: (Score:2)
Sounds like a job for... Google!
Though I'd be happier if they released it in at least two major formats.
Re: (Score:3, Interesting)
What about Project Gutenberg [gutenberg.org]?
Format Suggestion (Score:3, Funny)
Re: (Score:2)
Re: (Score:3, Funny)
Might as well get MediaSentry and the RIAA in on the act ...
Re: (Score:2)
obviously, bittorrent to distribute the resulting set far and wide.
Well they're off to a good start as they're already running a torrent tracker [nasa.gov] for their Blue Marble [nasa.gov] image collections...
Off topic, but this quote from their FAQ is refreshing. They should share it with media companies and ISPs
I thought P2P and Filesharing were illegal!
This is a common misconception. BitTorrent, and peer-to-peer (P2P) are protocols, like HTTP and EMail. It is true that they can be used to share files illegally, but the same is true of HTTP. Our use here is legitimate, however, so you should have no need to be concerned.
Contact MIT and their archival department (Score:5, Informative)
They got that million dollar touchless scanner that can digitize the papers with ease, then put them into either Open Source or PDF formats.
Re: (Score:2)
Isn't the PDF format open source?
Re:Contact MIT and their archival department (Score:5, Insightful)
Yes, it is. But many whiners here will argue against it.
The thing is, don't half-ass the PDF by simply encapsulating images. They need to do a real OCR on it and separate out, as images, the things that are not typewritten.
Then donate the boxes to the Smithsonian.
The MOST IMPORTANT aspect of the documents is that it is easily searched, which means all text must be text and not images. Yes, that includes his handwriting.
Re: (Score:2)
I agree, but the second most important aspect is that the images of the original get preserved too. The ideal way to do it is to have the image be displayed, but with the OCR'd text linked t
Re: (Score:1)
Re: (Score:2)
Well, let ReCaptcha do it. If it is German, this should pose no problem to German users.
Re:Contact MIT and their archival department (Score:5, Informative)
Let me fix that for you:
the SECOND MOST IMPORTANT aspect of the documents is that it is easily searched.
The FIRST is of course making a high fidelity digital copy of the original pages, that will serve as the authority on all questions of possible ambiguity in the handwriting, or whether a figure in the margin is a thumbnail sketch or a mere doodle.
A 600 or 1200 dpi .png image of each page in full color would do as the master digital archive. The .png format is an excellent choice since it is open, well understood, and going to be around for a long, long time. Its accuracy is more than adequate for this work. That it supports lossless compression is a bonus: images of pages usually compress very well. Copies of the master digital library should be kept at various institutions and made available on request to anyone.
Then for public and research use, convert each page to HTML 4.01 strict (since it is universally available, will be around for a long, long time, and Google, etc, can do the indexing for us). UTF of course, especially since Wernher used some German and Greek glyphs in his handwriting.
Suggest using OCR to handle conversion of the typed notes, and volunteers or cheap student labor to transcribe the handwritten material (use consensus of several transcribers to assure accuracy). These can be incorporated into the main pages as divs and spans inserted into the correct place in the flow (use classes like "leftmargin" and "rightmargin"). CSS can use absolute positioning to make them marginal accordions (expand from the margin on mouseover), etc.
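The "consensus of several transcribers" step could look something like the Python sketch below. It is only an illustration of the idea, not anything NASA or the poster specified: the function name `consensus`, the `min_agreement` threshold, and the whitespace normalization are all assumptions.

```python
from collections import Counter


def consensus(transcriptions, min_agreement=2):
    """Return the reading most transcribers agree on, or None if there is no agreement.

    `transcriptions` is a list of strings, one per volunteer, for the same
    handwritten snippet. `min_agreement` is a hypothetical threshold.
    """
    # Collapse runs of whitespace so trivial spacing differences don't split the vote.
    normalized = [" ".join(t.split()) for t in transcriptions]
    if not normalized:
        return None
    text, votes = Counter(normalized).most_common(1)[0]
    # Accept the top reading only if enough transcribers produced it.
    return text if votes >= min_agreement else None
```

Snippets that come back as `None` would go to more transcribers or to an expert reviewer instead of being guessed at.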
Treat sketches like the handwriting: put an img of the sketch into a div or span at the right place in the flow, then also add a searchable text description of the sketch in that div.
A simple script can process the final HTML fragment of each page and insert id="unique" attributes on each paragraph, etc, and <a name="unique"> targets where these would be useful.
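A minimal sketch of such a script, in Python, assuming the per-page fragments are consistently formatted. The function name `add_paragraph_ids` and the `p<page>-<n>` id scheme are made up for illustration. A regular expression is enough for uniform fragments like these, but a real HTML parser would be the safer choice for messier markup.

```python
import re
from itertools import count


def add_paragraph_ids(html_fragment, page_number):
    """Give every <p> tag that lacks an id a unique one, e.g. id="p12-3"."""
    counter = count(1)

    def tag_with_id(match):
        # group(1) holds any existing attributes (or None for a bare <p>).
        attrs = match.group(1) or ""
        return f'<p id="p{page_number}-{next(counter)}"{attrs}>'

    # Match <p> or <p ...> but skip tags that already carry an id attribute.
    # Tags such as <pre> are not matched because "<p" must be followed by
    # whitespace or ">".
    pattern = r'<p(?![^>]*\bid=)(\s[^>]*)?>'
    return re.sub(pattern, tag_with_id, html_fragment, flags=re.IGNORECASE)
```

Because the ids are derived from the page number and paragraph order, rerunning the script on the same fragment yields the same link targets.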
The finished NASA product should be a simple online database using server side scripting to compose and serve out pages on request. It should be built with cooperation from Google and other search platforms so that spiders will have good access to the body of the work without causing excessive bandwidth problems. It should be possible for any researcher to develop his own custom search engine. Ideally, it will support not just the notes, but also concordances, wiki discussions, etc.
I once did a lot of this kind of work in moving sermons and such that were circulated by mimeograph in the 1960s and 1970s to web pages. I digitized the pages with a Minolta Z1 camera on a reverse tripod using indirect lighting, and converted to OCR with OmniScan (IIRC). The OCR came out in Word 97 format, and I used Perl scripts to transcribe to HTML. If the technical quality of the originals is good, this can go pretty fast and is highly accurate, even as a basement project. If the original notes use consistent formatting, which I would expect of Wernher, then scripting with good use of regular expressions can do the bulk of the HTML markup.
Re: (Score:3, Interesting)
For the right persons, transcribing the handwritten notes and sketches would be very rewarding. Wernher von Braun was a pivotal technologist whose work for the Nazis either posed one of the greatest threats to England during WWII or, through high level monkeywrenching, managed to keep that threat from becoming a reality. He was definitely a very complex character who succeeded in doing a helluva good balancing act on dangerously high political high wires.
So access to his notes in exchange for doing the drudg
Re: (Score:2)
Try reading the documentation [w3schools.com] (one of many possible sources) before speaking up on a subject you know nothing about. And remember: a closed mouth gathers no foot.
The anchor tag, <a>, has to have a name attribute to be useful in its eponymous function. Or to put it in blunter terms, if you ain't got a <a name="there"></a>, you can't do a <a href="#there">Goto There </a>. And despite how harmful gotos might be in other environments, on these Intartubes, it is these gotos that
Re: (Score:2)
My earlier reply was both more snarky and less informative than was necessary, and I apologize. So although it wasn't incorrect, it was still wrong, and I can do better.
A simple script can process the final HTML fragment of each page and insert id="unique" attributes on each paragraph, etc, and <a name="unique"> targets where these would be useful.
That use of the "name" attribute has been deprecated for years. I don't know of any browser that doesn't support id targets.
The name attribute has been deprecated in XHTML, but it is not deprecated in HTML. These are two very different kinds of markups, despite their similarities. One of the major differences is that HTML v4.01 has a clear future path to HTML 5, but the future of XHTML is not at all clear. XHTML may end up being one of those constructions based o
Re: (Score:2)
Replying to my own post here, and quoting myself, which are things I don't generally do.
HTML v4.01 has a clear future path to HTML 5, but the future of XHTML is not at all clear.
Less than 48 hours after I posted this, the W3C announced that further development of the XHTML standard is ending. Development effort will be focused on bringing HTML 5 along at a faster pace. HTML 5 will effectively converge the HTML and XHTML evolutionary paths back into a single path.
I guess I'm prescient wrt web technologies. Or something.
Re: (Score:2)
Re:Contact MIT and their archival department (Score:4, Informative)
PDF, since its creation, has been an open standard according to definition 2. Some people don't like it because it doesn't meet definition 3 (Adobe are the only ones who can create new versions of the PDF spec).
Re: (Score:2, Informative)
I just assumed that by PDF they mean PDF/A. Isn't that controlled by ISO?
Yep
"On January 29, 2007, Adobe announced its intent to release the full Portable Document Format (PDF) 1.7 specification to AIIM, the Enterprise Content Management Association, for the purpose of publication by the International Organization for Standardization (ISO). During 2007 and into early 2008 that intent was turned into a reality. ISO published the approved ISO 32000-1 standard based upon PDF 1.7 in July 2008. ISO will also produce future versions of the PDF Specification."
http://www.adobe.com/devnet [adobe.com]
Re: (Score:1)
NASA's responsibility is not formatting the data so much as making it available. It should be available at least as images so that others have access to the raw data. Beyond that, OCR'ed to simple text to facilitate search by others. Whatever OCR fails to reliably interpret should be fed to reCAPTCHA.
Obligatory Tom Lehrer.. (Score:5, Funny)
Gather round while I sing you of Wernher von Braun
A man whose allegiance is ruled by expedience
Call him a Nazi, he won't even frown
"Ha, Nazi schmazi," says Wernher von Braun
Don't say that he's hypocritical
Say rather that he's apolitical
"Once the rockets are up, who cares where they come down
That's not my department," says Wernher von Braun
Some have harsh words for this man of renown
But some think our attitude should be one of gratitude
Like the widows and cripples in old London town
Who owe their large pensions to Wernher von Braun
You too may be a big hero
Once you've learned to count backwards to zero
"In German oder English I know how to count down
Und I'm learning Chinese," says Wernher von Braun
Re:Obligatory Tom Lehrer.. (Score:5, Informative)
Re:Obligatory Tom Lehrer.. (Score:5, Funny)
Looks recorded.
Re: (Score:3, Funny)
Or, as David Grinspoon put it... (Score:2)
Re: (Score:2)
Re: (Score:2, Funny)
Is that like Fist Post?
Re: (Score:2)
A suggestion (Score:5, Funny)
On the next thing that goes up to space (or even just a suborbital flight), crank down the window at about 20km up and throw the stuff out (or have some automated thingy with an explosive bolt that distributes it into the atmosphere). Now THAT would be a "creative way to get it out to the public".
Then again, maybe that would be TOO creative.
Distributed Proofreaders (Score:2)
Scan it at high resolution, OCR what you can, and load it into Distributed Proofreaders [pgdp.net]. Or if the material is too technical for the layperson, ask for a copy of the web-based software and set up your own private site. Let bored grad students work on it in exchange for some kind of minor credit on the final digitized work. (I believe that the bored grad students phenomenon produces half of the highly-technical articles on Wikipedia.)
Re: (Score:2)
Captchas.
There are projects that use captchas to digitize old texts, NASA could put those parts which don't lend themselves to OCR as captchas on their webpage.
Re: (Score:1, Insightful)
Seriously?
"Please enter proper LaTeX syntax for the following equation..."
Re: (Score:2)
There are far more individual numbers/letters/etc. in those notes than equations.
Re: (Score:2, Insightful)
Unfortunately, the notes are full of non-words, like (RTG), SNAP-10A, B70, n.mi
At least, that's what I'm assuming they say, because some of them are rather unreadable. Now, slashdotters may recognise some, but many people won't see the "words"
Re: (Score:2)
That's a very valid criticism in the case of reCAPTCHA, unfortunately...
However, I seem to remember something similar to reCAPTCHA that operates not on whole words, but on individual symbols. Might work. Even if it doesn't exist (can't find it...) it shouldn't be too hard to implement.
Re: (Score:2)
Competition? (Score:1)
Just scan everything and allow private companies, individuals, and non-profits to come up with their own scheme, then combine the best non-proprietary techniques and make your own.
Oh, forgot one thing (Score:1)
Only do this for notes that are in the public domain or which the copyright-holder is willing to license very liberally.
For encumbered notes you'll want some other idea.
Re: (Score:2)
Fist director? (Score:2)
Boy do I not want to work for that particular department.
TIFF FTW (Score:5, Interesting)
Let's go with a format almost anyone can read. As soon as they're all scanned in as high-res TIFFs, THEN you can begin to OCR them and create hybrid PDFs which CAN be indexed. From there we have a good start with high quality originals and searchable derivatives. Then people can start rolling whatever custom solutions they want to.
Yes, I know that OCR is going to be very crude, especially for anything handwritten. But what it will do is get us a very good starting point. I'd like to see a wiki set up with the OCR'd text as the beginning text and a link to the document; then the public can begin to go in and correct the OCR mistakes, and fill in what just flat out couldn't be OCR'd.
Recaptcha! (Score:2)
Sounds like a job for this project. [recaptcha.net]
Best part is, hand written is going to be more difficult to solve for computers...
Re: (Score:2)
I'd never seen that before, great idea.
Use a Wiki to Process Images to Open Format (Score:5, Insightful)
But, if I were doing this: Assuming these are all in images, put the images in whatever format you want and make a generic wiki page for each of them. Then let users log in (NASA fans should pour in) and translate the pages to annotated wiki pages with the footnotes (normally references) being all the side notes that were penciled in. They can categorize them by related missions and maybe even tag them
Once that's done, ideally you'd put it in some XML standards based format (ODF or OOXML, yeah, that's another argument to be had) that you will always be able to read even if you have to build your own viewer/converter. Keep these sources indexed and provide for people the rendered PDF/PS/PNG/whocares and then you could probably build scripts to rebuild all from sources if you want. New technology comes out or people want to view them in HTML 5--no problem, just build a neat little XSLT for them.
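To illustrate the "keep structured sources, rebuild the rendered output by script" idea, here is a rough Python sketch. Note the substitutions: it uses a plain Python transform rather than XSLT (the standard library has no XSLT processor), and the element names `page`, `typed`, and `margin` are a hypothetical schema invented for this example, not ODF or OOXML. The sample text is placeholder content, not taken from the notes.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source record for one page; the schema is an assumption.
SOURCE = """<page number="12">
  <typed>Example typed paragraph.</typed>
  <margin side="left">Example margin note.</margin>
</page>"""


def render_html(page_xml):
    """Rebuild an HTML fragment for one page from its XML source."""
    page = ET.fromstring(page_xml)
    number = escape(page.get("number", ""), quote=True)
    parts = [f'<div class="page" id="page-{number}">']
    for child in page:
        text = escape(child.text or "")
        if child.tag == "typed":
            parts.append(f"<p>{text}</p>")
        elif child.tag == "margin":
            # Margin notes become spans the stylesheet can position.
            side = escape(child.get("side", "left"), quote=True)
            parts.append(f'<span class="{side}margin">{text}</span>')
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_html(SOURCE))
```

Supporting a new output format then means writing another small renderer over the same sources, which is the role the XSLT would play.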
As for indexing them, I can tell you one way not to do it. Don't do the thing that curators of classical music did [stason.org]. Man, that's like speaking another language to me. Arrange the notes by mission or date if you can and any natural titles that arise for the favorites, add to it as an alias.
Re: (Score:2)
Re: (Score:2)
With any decent metadata format, that kind of system (or even more complex) is perfectly fine. Every one of those is meaningful to someone, and maybe they want to search using it. For example, lots of cataloged materials have barcodes which would be a colossal pain to type in by hand (and no one would remember them anyway) -- but they're great for scanning in if you happen to have the thing i
PDF with annotations (Score:2)
Brilliant! We'll make society do the work! (Score:3, Interesting)
they are allowing the marketplace to decide (Score:2, Insightful)
Instead of forcing people to pay taxes on some project of dubious desirability, they are trying to see if the public has any support for their idea before they thrust headlong into it.
Government workers should ask the opinion of the taxpayers more often; we are, after all, their bosses. I have a lot of respect for the government employees that remember this, and nothing but contempt for those who want to 'play social engineer and tax waster' without regard for what the public thinks.
Re: (Score:3, Insightful)
Even if NASA did do it itself, "society" would be paying for it anyway...
Actually, this should be better in two important ways: not only could crowd-sourcing accomplish the task much more efficiently than $50-grand-space-pen-NASA could to begin with, but also the cost would be distributed across the entire Internet, rather than being shouldered only by American taxpayers! It's a win-win-win* situation, I'd say.
(* for NASA, and for space geeks, and for taxpayers)
Anonymous Coward (Score:2, Interesting)
You guys clearly do not read enough electronic media. PDF and DjVu are the most widespread and relatively ubiquitous modern electronic book formats. DjVu tends to be vastly superior to PDF in terms of file size though.
Read all about it here:
http://en.wikipedia.org/wiki/Djvu
Discuss.
Re: (Score:1, Troll)
Any post that ends with the command, "Discuss", should be taken out back and shot.
It's pretentious, annoying, and detracts from whatever valid points (if any) are contained in the post.
If the topic of the post is worth discussing, it'll be discussed. If not, it will be ignored.
And just to note, djvu is better for file size... at the cost of lossy compression. In my experience, the lossiness isn't really that bad, but we are dealing w
Zoom! (Score:3, Informative)
We're looking for creative ways to get it out to the public
By rocket mail!
http://en.wikipedia.org/wiki/Rocket_mail [wikipedia.org]
Re: (Score:2)
Re: (Score:2)
I have an account at rocketmail.com, but I never get my email by rockets. I'm disappointed now.
Of course you don't, sending rockets to individual users would be cost prohibitive, not to mention really bad for your lawn. No, rocketmail actually only uses rockets to deliver mail between them and your ISP.
Keyword searchable is a must (Score:1, Insightful)
Call me selfish, but I'd love to search Von Braun's notes for one particular name: my late grandfather worked for him at MSFC for over 30 years.
Recaptcha might be able to help (Score:2)
http://recaptcha.net/ [recaptcha.net]
Wonderful (but really awful) irony (Score:2, Interesting)
Vonce ze rakets go up . . . (Score:2)
Who cares where they come down.
That's not my department, says Wernher von Braun.
hard copy (Score:2)
Personally, I'd love a printed hard copy on my book shelf. Right there with my Goddard books.
Tobacco Documents Online (Score:4, Interesting)
What format? (Score:1)
NASA needs ideas on what format to use (PDF)
Why do I have this subconscious urge to suggest.... PDF?
Turn the project over to the Smithsonian (Score:1)
Why is NASA handling this themselves? (Score:2)
I'm all for saving historical documents and everything. But with the economy the way it is right now, is this really the best thing for our _space_ agency to focus on? Don't we have some government departments just for handling historical records? Can't we just turn this over to them and let NASA focus on its basic mission?
Re: (Score:2)
...is this really the best thing for our _space_ agency to focus on? Don't we have some government departments just for handling historical records? Can't we just turn this over to them and let NASA focus on its basic mission?
You've never worked for a government agency, have you? Giving up budget dollars is unthinkable.
Re: (Score:1)
Correct. It sounds like a job for the National Archives (http://www.archives.gov/) to me. Why is NASA doing it themselves? Because NASA invented Not Invented Here.
Re: (Score:2)
Because Von Braun's notes probably remain relevant today. Von Braun is one of the most important and influential rocket scientists of the modern era, if not the most.
Derivatives of the pulse-jet engines on the German V1 rockets are now being seriously examined for re-use in modern aircraft, as they use fewer moving parts and offer greater fuel efficiency than conventional engines today, despite having fallen from favor after WWII.
Just because the science and technology is old doesn't necessarily make it irrelevant. Ol
Twitter! (Score:1)
Post them via twitter. Get Ashton Kutcher involved.
Huntsville needs a dedicated exhibit to Von Braun (Score:1)
If you get the rights right, then all else follows (Score:1)
Wikipedia? (Score:2)
Re: (Score:2)
And that's exactly what Wikimedia Commons [wikimedia.org] is for, isn't it?
Text (Score:2)
Let other people format them to their hearts' desire.
NSA (not NASA) could help (Score:2)
But it would probably be easier to just convert it into HTML and let Google's spider index it all.
Can't forget the song... (Score:2)
Wernher von Braun
by Tom Lehrer
Gather round while I sing you of Wernher von Braun
A man whose allegiance is ruled by expedience
Call him a Nazi, he won't even frown
"Ha, Nazi schmazi," says Wernher von Braun
Don't say that he's hypocritical
Say rather that he's apolitical
"Once the rockets are up, who cares where they come down
That's not my department," says Wernher von Braun
Some have harsh words for this man of renown
But some think our attitude should be one o
maybe SVG? :) (Score:1)
how about bone dust on tanned leather (Score:2)
to commemorate all the people gassed, and killed, partly with the help of Dr. von Braun
Oh, and let's not forget something to commemorate the hypocrisy of the US - maybe make all viewers wear rose-tinted glasses
This is not dead history, there are still living people with tattoos on their arms with the jew number
Hey I volunteer to help NASA with Tesla's notes! (Score:1)
Let people help (Score:2)
Go see the galaxyzoo [galaxyzoo.org]
website where people like you and me categorize galaxies.
It's human-powered picture classification.
Perhaps looking at cool space images is quite the draw
that Von Braun's Notes can't live up to.
Fucking Editors Suck (Score:2)
"According to the official request for information, LINK[NASA needs ideas on what format to use]LINK (PDF)"
Should be
"According to LINK[the official request for information]LINK (PDF), NASA needs ideas on what format to use"
.
Otherwise it looks like someone's implying that PDF is a proposed/preferred format. Also, links should be attached to the text of what they are, not what they say!
Project Gutenberg (Score:2)
This is the sort of thing that Project Gutenberg does all the time. Why not see if they are interested?
Cue Tom Lehrer (Score:2)
"Who cares how they're writ down?
That's not my department!"
Says Wernher von Braun