Science

Datamining Medline for Gene Interactions - Pubgene 57

An Anonymous Coward wrote: "According to an article in the 5 May 2001 issue of New Scientist, biologists in Norway have developed a computer program to datamine Medline to predict interactions between genes. Some of the relationships hadn't been predicted before and were found to be real. The scientists' PubGene database and tools are available for experimentation." Wow.
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    There was a session [stanford.edu] called "NATURAL LANGUAGE PROCESSING FOR BIOLOGY" at the Pacific Symposium on Biocomputing.

    Also, a paper [nih.gov] was published recently in Bioinformatics that tries to extract protein interactions. The authors used a dictionary of words related to interactions, then looked for proteins mentioned in the same sentence as one of those dictionary words, with part-of-speech analysis to improve accuracy.

    Something like that.
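
The same-sentence dictionary approach described above can be sketched roughly as follows. This is only an illustration, not the method from the cited paper: the word lists and the `candidate_interactions` helper are made up for the example, the sentence splitter is naive, and the part-of-speech step is left out entirely.

```python
import re
from itertools import combinations

# Hypothetical word lists for illustration only; a real system would use
# curated dictionaries of protein names and interaction terms.
INTERACTION_WORDS = {"binds", "activates", "inhibits", "phosphorylates", "interacts"}
PROTEIN_NAMES = {"RAF1", "MEK1", "ERK2", "TP53", "MDM2"}

def candidate_interactions(text):
    """Return (protein_a, protein_b, sentence) triples for sentences that
    mention at least two known proteins and one interaction word."""
    hits = []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        proteins = sorted(tokens & PROTEIN_NAMES)
        has_interaction_word = any(t.lower() in INTERACTION_WORDS for t in tokens)
        if has_interaction_word and len(proteins) >= 2:
            for a, b in combinations(proteins, 2):
                hits.append((a, b, sentence))
    return hits

abstract = (
    "RAF1 phosphorylates MEK1 in stimulated cells. "
    "TP53 levels were measured. "
    "MDM2 binds TP53 and promotes its degradation."
)
pairs = [(a, b) for a, b, _ in candidate_interactions(abstract)]
print(pairs)
```

Only the first and third sentences qualify here, because the second names a single protein and no interaction word.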

  • Would it be of any assistance to set up something similar to distributed.net or SETI, and just have people designate their idle cycles to processing all sorts of genetic interactions? For example, combining human DNA with DNA of other creatures to provide cures for various diseases. It'd be interesting to cross some human DNA with that of a lizard and get some regeneration action goin' on (a la Spider-Man -- the doctor dude who turns into an alligator when it's least convenient. ;)
  • I'm sure they'll find a correlation between being l33t and the desire to frist post.

  • we could get onto Enkephalonetics,

    Which is......?

  • How soon until I get this sort of thing to write my English papers for me?

    But it's already been done! Quick, go to megahal.sourceforge.net [sourceforge.net] and grab the latest source code. Build it, and then run some sample English papers through it. Feed it some other miscellaneous stuff for good measure (how about the script to the movie Terminator and a few of the better Slashdot trolls), and let 'er rip! Sure, it may not make *logical* sense, but since this is for an English course, you'll get graded higher for originality and "thinking outside the box". Just don't mention my name when they come to take you away to your padded cell.

    -1: Sleep Deprived
  • I believe that this is similar to that which Granny Weatherwax refers to as "headology".

    "Enkephalos" is ancient Greek for "Head" or "cranium". Therefore, enkephalonetics = using your head. =)

  • One of the final-year comp sci projects here reminds me of this, although AFAIK it is far simpler. One idea that was brought up during the student's presentation, though, is that this might work very well in a distributed computing situation.

    Perhaps this will be the next SETI@home ?

  • Thanks, you're right. "Slashdot - Fastest way to find flaws in your arguments"

  • So I guess you haven't heard of psychohistory?
  • Or worse, get good enough that they will replace researchers as well as be a tool for them. When one of these programs, while its operators aren't looking, deduces the existence of the scientific journals and starts trying to publish papers on its own, they'll have to worry.

    I'm wondering if it's much different from what Google is doing with web page data, on a larger scale?
  • Why is this moderated to Troll?
  • Though doctors should ultimately bear the weight, pharmacists are the ones who are supposed to screen for the interactions. Anyone can pass out meds; a pharmacist is schooled for a reason.
  • "The holy grail is for a computer to be able to read an article like you or I would read it and extract the concepts and relate them all to each other," says Masys.

    Hmm, shades of HAL? Let's hope that when he's eventually developed, the developers (or creators?) incorporate Asimov's 3 Laws of Robotics into this thing.

    Wouldn't want these smart AI programs to "suggest" something that would be potentially harmful.

    -Cyc

  • Hmm. I think what you describe already exists. Unfortunately, nobody seems to know. Well, I left the company one year ago, but please have a look at www.kelman.de [kelman.de]
  • Nice point re: the implausibility of homebrew gene therapy (for the time being). Your assertions regarding openness and disclosure were echoed in many thousands of responses FDA received to a recent call for commentary on proposed disclosure rules for gene therapy research (the e-mail address was FDADOCKETS@OC.FDA.GOV, but I'm pretty sure the comment period has expired. May still be worth a try, though, if you're interested and want your voice to be heard).

    As many of you may know, the drug/biologics industry has fought bitterly to protect ANY and ALL information relating to gene therapy trials as trade secrets. In fact, they protected such trade secrets about a certain adenovirus vector so well that none of the right people knew it had killed monkeys and seriously injured several humans before it killed Jesse Gelsinger last year (with a little help from a U. Penn research egotist). The upshot is that when these greedy moron subhuman bastards seek to protect prior evidence of toxicity as "trade secrets," people die for lack of the suppressed knowledge. It's shameful. We'll have to hope FDA and HHS have the balls to get these gene cowboys in line before they kill again.

  • Hmmm. I am a little bit surprised that this got onto Slashdot, or even New Scientist. People using information extraction techniques to mine biological data: it might be important, but it's hardly novel.

    I think that the main thing it demonstrates is how poorly scientists choose to represent their data in the first place. Not only do we choose to put all this vital stuff into something which is totally unamenable to computation, but we sign the copyright over to various commercial interests.

    Phil

  • This is why the Internet is so much more than a bubble, why there really IS a new economy with new laws, and why old farts in their fifties should get a clue.

    If I hear another lame-ass comment about how the Internet is just like the tulip bubble in the 17th century, I am going to send them this link.

    And, oh yeah. Perl is not just for script-kiddies either. So there.

  • something like this [compare-stuff.com], perhaps?
  • check out compare-stuff.com/pubmed [compare-stuff.com] to analyse relative co-occurrence in PubMed articles.
    you can compare more than just gene names too: disease/condition names, reagents, techniques, author's addresses, whatever...
    reload the entry page to see different examples of what you can compare

    or ask similar questions on the web at large with the vanilla version: compare-stuff.com [compare-stuff.com]

  • quick plug for compare-stuff.com/pubmed [compare-stuff.com]. see my other /. posts [slashdot.org] for more info.
  • So they have the program look for the mention of human genes, then look at what other genes are mentioned in the article.

    Maybe I'm missing something here, but doesn't the fact that the other genes are mentioned in the same article as the first gene already imply a relationship between the two? Why else would the authors mention them in the same article?

    Another thing to consider is that scientists don't just go around randomly picking a gene and studying it. There are generally reasons why the gene is interesting, and those genes are studied more than others. There is a whole field of the sociology of science that deals with how the way scientists go about doing science influences the results that they find. It annoys the heck out of most scientists.

    It is good that there were some relationships that the program found that had not been previously found, but essentially the program is an automated review article generator with a meta analysis component to organize and sort the data.

    The idea posted above, that drug interaction would be a good thing to do as well, I can heartily agree with. Have the program go through not only the literature but also the PDR (Physician's Desk Reference), categorize pharmacological responses (e.g. what drugs cause blood pressure to rise by what mechanisms) and not only could we possibly avoid some nasty drug interactions, but perhaps we could find where some drugs act synergistically with each other to generate greater or new results that were not previously thought of.

  • Now, what we need is for some of the Slashdot karma geeks to get off their arses and write an open source dataminer for Slashdot articles and posts. Of course it must have a NL front end and be able to answer questions like "how many dumb stories has Commander Taco posted?" or "how many /. users are communists or libertarians, or into goat sex?" Just a thought.
  • How soon until I get this sort of thing to write my English papers for me?

    I'm betting, 25 years, too late for me.

    I'd like to make something that compiles an essay from paragraphs and phrases in other works, that could be made in the next 2 years I think.
  • In regards to your first point... I've done a pretty large amount of work with Medline. Certainly the technique does to some extent rely on a standard nomenclature. This is probably not such a hurdle, though. Each citation indexed in Medline is tagged with particular MeSH headings [nih.gov]. MeSH is a controlled vocabulary of medical terms, with quite extensive supplements that include genetic and chemical information. The most relevant part here is that each heading is associated with a number of synonyms. So in addition to each article being indexed against a controlled vocabulary (by a trained human indexer), that vocabulary itself provides relationship information between various terms, both internal and external to the actual vocab. Also, there's the whole Unified Medical Language System [nih.gov], but I'm not really up to speed on that. It's pretty much independent from MeSH, and it's not used directly in Medline, AFAIK.
  • One question: are the results of this project public domain (in some way), or are they going to be snapped up and made proprietary? SETI is one thing (can't really make that a business), but I'm leery of donating my time to make someone else's fortune.

  • This technique, moreover, appears only to collate published interactions -- helpful, perhaps, in guiding the conduct of basic research, and the avoidance of duplicate studies -- but less useful when the goal of a researcher is determining the function of unknown genes, or putative protein products. In those cases, protein fold databases or motif databases are much more useful.

    Sure structures and motifs are good to have, but there are a lot of structures out there that we don't know much more about. And the issue here is about interactions, beyond simple statements like "this is a catalytic protein" or whatever.

    Can one give PubGene a PDB or FASTA file -- and find papers on homologous genes or structurally similar proteins -- or must one use BLAST, or a fold recognition algorithm, prior to searching PubGene?

    No, you are supposed to have a set of gene names that you are working with. Homology can be assessed elsewhere. What you can ask this system about is known and inferred interactions out there.


    Lars
    __

  • As a lot of people here have noticed, the basic technique used in PubGene is quite simple. The novelty of their work is mostly in how to evaluate co-citation of genes, and perhaps also in the quite comprehensive setup. Several other systems have been suggested and set up for discovering protein-protein interactions and gene interaction networks, and also for automatic discovery of keywords to be associated with genetic conditions.

    More elaborate techniques have also been suggested for learning about the interactions. By simple text analysis, you can deduce with fair (but not perfect) certainty if a gene is up- or down-regulating another gene. Other systems try to find support for hypotheses on interaction networks by doing PubGene-like analysis. If your experiments support many tentative networks, you can let the vast amounts of knowledge in the published literature dismiss the bad suggestions.

    The need for systems like this is huge. More articles than ever are being published, and there is no way a researcher can keep up with the information flow. New technology also allows large-scale genome-wide experiments that generate enormous amounts of data. Such data needs to be analysed automatically, and if we can tie in the published knowledge, the value of the data increases.

    If you are interested in systems like these, look up the works of Andrade, Valencia, Bork, Ouzounis, and their collaborators!

    Lars
    __
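
The basic co-citation idea mentioned above (count how often two genes show up in the same article) might look something like the sketch below. It is not PubGene's actual evaluation method, which this thread does not describe; the `cooccurrence_counts` helper and the toy article lists are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(articles):
    """Count, for every unordered gene pair, the number of articles
    that mention both genes."""
    counts = Counter()
    for genes in articles:
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(genes)), 2):
            counts[pair] += 1
    return counts

# Made-up toy data: each inner list is the set of genes named in one abstract.
articles = [
    ["BRCA1", "TP53"],
    ["TP53", "MDM2", "BRCA1"],
    ["MDM2", "TP53"],
    ["INS"],
]
counts = cooccurrence_counts(articles)
for pair, n in counts.most_common():
    print(pair, n)
```

Raw counts like these favour heavily studied genes, which is presumably why how to evaluate co-citation is the interesting part.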
  • Uh, yeah, but why not go directly to PubMed? OK, you don't get relative co-occurrence and those nice little charts, but on the other hand, I cannot come up with a research question where you would want the relative co-occurrence.
    Lars
    __
  • Maybe I'm missing something here, but doesn't the fact that the other genes are mentioned in the same article as the first gene already imply a relationship between the two? Why else would the authors mention them in the same article?

    That is the basis for their technique, yes. The thing is that they are using this transitively. If gene A is mentioned together with B in one paper, and then B is shown to work together with C in another, then that is evidence for A and C having some sort of relationship. There are problems with this approach, and I think the authors are aware of them (I have not had access to the article yet). For example, a paper about a certain new technology could bring out examples from various systems in the cell, and thus mention genes that are quite unrelated.

    Another thing to consider is that scientists don't just go around randomly picking a gene and studying it. There are generally reasons why the gene is interesting, and those genes are studied more than others.

    This may be true historically, but we are entering a whole new era in genomics. Industrial science is here. The fashionable experiments today are genomewide, studying for example the workings of a large set (thousands) of genes at the same time. Genes are no longer selected for sociological reasons, but based on predictions about various aspects of the gene. "Are there reasons to believe that this gene encodes a protein sitting in the cell membrane? Let's include it in our experiment because it is then probably doing interesting signalling."

    With industrial genomics and a higher publication rate than ever, new tools are needed to sift through the data. These Norwegians provide one attempt at addressing this.


    Lars
    __
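
The transitive step described above (A with B in one paper, B with C in another, so A and C become a candidate pair) can be sketched as a small graph exercise. This is a guess at the general shape, not the published algorithm; the `indirect_pairs` name and the single-letter gene labels are placeholders.

```python
from collections import defaultdict
from itertools import combinations

def indirect_pairs(direct_pairs):
    """Map each gene pair that is never co-mentioned directly to the set
    of intermediate genes that are co-mentioned with both."""
    neighbors = defaultdict(set)
    for a, b in direct_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)
    neighbors = dict(neighbors)  # freeze: no accidental key creation below

    inferred = {}
    for hub, linked in neighbors.items():
        for a, c in combinations(sorted(linked), 2):
            if c not in neighbors[a]:  # skip pairs already linked directly
                inferred.setdefault((a, c), set()).add(hub)
    return inferred

# Placeholder co-mention pairs, one per hypothetical paper.
direct = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
result = indirect_pairs(direct)
print(result)
```

A pair supported by several independent intermediates is arguably a stronger candidate than one supported by a single paper, which is one way to blunt the "methods paper mentions unrelated genes" problem raised above.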

  • Sure, a drug interactions database would be useful. No question. Relevance to the topic? Hmmm, a tangent at best. OK. Whatever.

    But what got my goat was the claim that "more than 100,000 deaths per year are caused by adverse drug reactions" and yet "By contrast, deaths due to traditional herbal remedies are so rare they're hard to find."

    This is such blindingly bad use of statistics that I have to howl. It isn't so much like comparing apples with oranges, as like comparing apples with trilobites. Consider the populations: why are people taking traditional herbal medicines? For colds, indigestion, general malaise. Not for heart disease, strokes, cancer or anything life threatening. People at risk of death are a lot more likely to risk dangerous combinations of drugs. Well, derrr.

  • If this is actually READING articles and then making insights about their content, this could be a revolutionary search tool for any field! These guys should contact Lexis-Nexis [lexis-nexis.com] or some other fact finding service.
    Finding information is a hell of a skill - I know that a lot of my time as a grad student has been spent on literature reviews.
  • This is all very well and good, but a far more immediately useful kind of interaction to be looking at would be interaction between medicines. This could save lives straight away. According to a paper published in the Journal of the American Medical Association, more than 100,000 deaths per year are caused by adverse drug reactions - making it the fourth biggest killer in the US, after heart disease, cancer and stroke. See http://dmoz.org/Society/Issues/Health_and_Safety/Iatrogenesis/ [dmoz.org] for more info.

    By contrast, deaths due to traditional herbal remedies are so rare they're hard to find. I'm not dismissing modern medicine entirely - far from it - I'm just pointing out some disturbing facts.

    So why are gene interactions so hot, yet medicine interactions so neglected in research? And why, for that matter, do so few people know that they could substantially reduce their risk of heart disease and cancer by going vegetarian or vegan? Surely the governments of the world should be funding research and education on these two topics on a massive scale - it could save thousands upon thousands of lives - and even from a callous economic point of view, the savings in terms of Medicaid and lost economic productivity due to ill-health would be huge! In fact, official guidelines still endorse a meat-based diet despite the well-known health risks, and there is NO serious attempt to co-ordinate drug safety information between regulatory bodies internationally. That's right, none - regulatory bodies in the UK often ignore bans in the US, and vice-versa. What's more, the support for even collating data on side effects of medicines at a government level is poor - particularly in the UK.

    The reason is the same in both cases, and it's very simple. Profit. Profit for the drugs companies, to be precise. Pharmaceutical corps profit from ill-health, and they don't exactly relish the idea of their drugs getting banned or contraindicated for safety reasons, either. Campaign funds, and the revolving door between the FDA and the drugs/biotech industries, help keep the government in line. For more info see http://www.drrath.com/

  • Bill Gates suggested something similar in this very lame book (after The Road Ahead, I really expected better). Of course, he figured people would be using Excel pivot tables.
  • Yeah, a nice way to improve it would be to do something like a genscan search, where you take the raw protein sequence (or gene sequence) and look for conserved portions that are known to interact with other conserved portions (kinda like the zinc finger motif and such). Then go through the literature like this project has done, or a new kind of database of genechip-type data, showing which genes are expressed together, and correlate the data together. Then go from there with things like Chromatin Immunoprecipitation (ChIP) to find out what's really interacting. The future's looking good for us biologists! :-)

    "I may not have morals, but I have standards."
  • Medicine-interaction-related deaths are caused by incompetent doctors, patients not disclosing medications fully, or foolish patients not reading warning labels. It's basically idiocy and carelessness that causes this. When taking medication, remember that you are putting a chemical into your body, and we have a pretty good idea as to how they interact with other ones clinically, if not molecularly (experimenting with that's probably unethical anyhow). Medical interaction research isn't neglected, but is a standard and critical part of getting a drug approved for use. If you want to know more, go to www.fda.gov [fda.gov] and look at their massive database on drugs. They've got info on interactions aplenty, particularly their medwatch [fda.gov] database.

    As for why gene interaction is so hot: it's the real key to a lot of problems. You thought that the human genome was it? No no no... that was only the beginning... it was the map for gene interactions. The genes are worthless if we can't figure out what they do and how they interact. I mean, we can't even tell you how an E. coli works even though we've got the genome. There will be a lot of profit out of finding protein interactions, sure, but it'll be to find cures. I work in a lab that's trying to figure out gene therapy in prostate cancer. We need to know the genetic mechanisms for therapy to be effective. Or don't you want cancer cured?

    "I may not have morals, but I have standards."
  • Hehe, not to diminish what you did (very cool project that I thought about writing myself last year), but we're heading into major league waters here, and it's pretty exciting. Punnett squares are an important part of genetics because of inheritance, but the stuff now is all gene expression and interaction. It's pretty terrifying, because that's where the real work is all going to be, but it's also incredibly exciting, because bio is going to be the science of this century.

    If you're interested in slightly higher level concepts, I just found this website [ucla.edu] at my college's webserver (it's a class I had to take, intro to Molecular Bio) and it looks like it's got some good info through the flash animations. If you want the hardcore stuff, go to the NCBI site [nih.gov] where you can browse the genome, search for proteins and genes, and do all the stuff real biologists do :-) If you're at a University that's paying online fees, you can read journal articles that they link to from University IP's as well.

    "I may not have morals, but I have standards."
  • Your post put me very much in my place. I was aware that we don't have multi-drug interactions down in the least (obviously due in large part to testing issues); I just wanted to inform the root poster that there was some info on interactions available.

    The real problem is the one you stated, that most seniors are on multiple meds at a time for about as many conditions. That is, quite simply, pumping the body with way too many chemicals to be safe, especially in people whose bodies are breaking down to begin with. These drug interactions are probably incredibly difficult to study, but I agree that it needs to be done. I also agree that we need some kind of mechanical checking via computer to eliminate a lot of the stupid errors. However, I think the real problem lies in the fact that we're pumping drugs into people in ways that they just aren't capable of dealing with. We need better forms of treatment, and I think genetic therapies, as well as that critical yet neglected factor of prevention, are going to be key in the future. Healthy diet and exercise alone can help with a large number of ailments, thereby reducing the number, or at least dosage, of meds needed later on in life. And to prevent heart disease, rather than take the pill, introduce the healthy gene (I know it's not ready, but it is the future) and that's one less drug-drug interaction to worry about.

    The other problem is that, in large part, we don't really know what causes diseases. Alzheimer's is getting closer, and the ulcer bacteria was just an absurd discovery in some ways. We need to understand the disease before we can treat it, and all these things have to play together. So while I fully agree with you that we need to really study multi-drug interactions, and apologize for oversimplifying, I don't think that simply saying "well, we can treat you for heart disease or AIDS, but not both" (totally hypothetical example) is the answer. Prevention is key. Understanding the disease better is key. And, hopefully, gene therapy will reduce the number of meds as well. We need to make use of all the things Molecular Biology has achieved and will achieve in the coming decades, rather than rely solely on the older method.

    p.s. Thanks for teaching me something!

    "I may not have morals, but I have standards."
  • No offense, but this is really, really different from an 8x8 Punnett square (which isn't really that bad, I've done dozens of 'em by hand). This is hardcore datamining the scientific literature, involving lots and lots of parsing keywords and finding gene interactions. What I'd like to see is a tool to do this with raw gene and protein sequences (thinking... possible cool project for me!) Then a tool to combine those would be sweet! Mmmm... genes....

    "I may not have morals, but I have standards."
  • The article says that the algorithm assumes that two genes interact when both are mentioned in the same paper. Imagine that the paper actually shows "gene 1 and gene 2 do NOT interact". Nevertheless this new algorithm perpetuates and extends the mistaken idea that they do.

    Some leap forward: "Information in, Error out"!
  • If you're at a University that's paying online fees, you can read journal articles that they link to from University IP's as well.

    Naw, I got my GED after my junior year of high school. Now I'm just the average working stiff. :-P

    I thought about college, but after high school, and with what I've heard about how colleges treat undergrads (required to live in the dorms with crappy Internet access, kicked out if you post Bad Things, no privacy, disinterested professors and dumb students), I have no desire to pay ridiculous amounts of money for college when I'd rather be learning.

    It annoys me that it seems to be impossible to do anything between having an extremely casual interest in something and making it your whole career. You can't just go take classes that interest you, because they have prerequisites, and general education requirements, and all sorts of hassle. If I wanted to actually do anything related to genetics, for example, I'd have to spend at least 4 years in school studying it, and then get a low-level job at some place, and then decide that I'm not that interested in it after all, and what then?

    (As an aside, why is it that the simplest things are always overlooked by beginner's resources? Why, for example, don't they introduce all the basic terminology and notation for a topic as soon as the topic appears? I hate having to refer to a portion of the thing I'm working on as "that thingy over there", especially if I'm asking for help. I've seen this in computer science, physics, chemistry, and biology books. They don't even have a "notation" section in the back, or if they do, it's next to useless. (And this problem may be more limited to high school, but when I would ask the teachers, they would actually tell me "don't worry about that". Or they wouldn't know.))

    If you know of any entry-level resources for learning various sciences, I'd be most interested. I'll be sure to check out those sites if I'm ever at a computer with Flash, and I may play around with making a Punnetizer Deluxe or something :)

    --

  • Oh, I'm fully aware of my irrelevance to the scientific community. :P

    And no, the Punnett square really isn't hard as such, but I was pretty sure the teacher's motive was to catch as many students in fatigue or misalignment errors as possible. Also keep in mind the fact that >60% of the students were still confused by phenotypes.

    Regardless, writing code to do it for me transformed the assignment from painful drudgery into a fascinating exercise. I was especially proud of realizing — on the way to gym class, no less — that it could all be represented as bitmasks. (I think this was when I first truly grokked the power of C.)

    Genetics was really the only thing that captured my interest in biology; sadly, the class didn't linger long on that topic. I'm still interested in it, all from an amateur perspective, of course. If I get time I think I'd like to make some new software that does multiple generations and traits requiring more than one gene.

    --
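
For the curious, here is one way a Punnett square could be represented with bitmasks. This is not The Punnetizer's code (which this thread does not show); it is a from-scratch sketch in Python that assumes complete dominance and independent assortment, and the function names are made up.

```python
from collections import Counter

def gametes(copy1, copy2, n_traits):
    """Enumerate all 2**n_traits gametes of a parent whose two allele
    copies are given as bitmasks (bit i set = dominant allele for trait i)."""
    full = (1 << n_traits) - 1
    out = []
    for sel in range(1 << n_traits):
        # Bit i of sel chooses which copy supplies the allele for trait i.
        out.append((copy1 & sel) | (copy2 & ~sel & full))
    return out

def phenotype_counts(parent_a, parent_b, n_traits):
    """Tally offspring phenotypes over the whole square. With complete
    dominance, a trait shows as dominant if either gamete carries it,
    so the phenotype mask is just the bitwise OR of the two gametes."""
    counts = Counter()
    for ga in gametes(*parent_a, n_traits):
        for gb in gametes(*parent_b, n_traits):
            counts[ga | gb] += 1
    return counts

# Trihybrid cross AaBbCc x AaBbCc: 8 gametes per parent, 64 cells.
hetero = (0b111, 0b000)
counts = phenotype_counts(hetero, hetero, 3)
for mask, n in sorted(counts.items(), reverse=True):
    print(format(mask, "03b"), n)
```

The tallies reproduce the textbook 27:9:9:9:3:3:3:1 phenotype ratio for a three-trait cross.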

  • Well, I know it's nowhere near what these guys are doing, but this brought back a neat memory (one of the few) from high school. My biology teacher was one of the sadistic ones, and one week he decided to assign an 8x8 Punnett square [anl.gov]. My friend and I looked at each other and said, "Why do it manually?"

    Thus was The Punnetizer [quadium.net] born. Once I had the basic functionality working, I went hog-wild with output formats. So you can have your Punnett squares in ASCII text, HTML, LaTeX, and CSV. What was really fun was running it on a StarFire with 2GB of RAM with the maximum number of traits. The output HTML was something like 347MB. :P

    Anyway, that was one of the few times we impressed Cowell. He actually volunteered to give us extra credit. Of course, he graded our next assignment extra tough, but oh well. :P

    --

  • Authors are not going to mention two things in the same article unless they have a reason to. An article has a topic, which tends to be very narrowly defined. If the authors are discussing a particular gene, and mention another gene in the same article, there is likely already a relationship between the two, otherwise they wouldn't be discussing them in the same article.

    If you were writing an article about a gene that regulates insulin production, you probably wouldn't be mentioning a gene that produces monoamine oxidase. In fact, the program relies on the fact that there will be some relationship between the genes. Otherwise, it's all random.

    I'd say you lose, but as you posted anonymously, that's a given.
  • Listen, I'm the last one to justify the quality of medicine, as commonly practiced, but your statement reveals a glaring lack of familiarity with the art and science of either diagnostics or pharmacology -- or for that matter, the underlying epistemological studies of medical informatics.

    I'm not saying that thousands of significant avoidable mistakes are not made each day -- they are, and for many of these, the description you gave is entirely accurate. It is inexcusable -- if only because, in my humble opinion, computerized prescription crosschecking should be mandatory, and we should have far better mechanisms for automatic, secure sharing of patient records, with adequate safeguards for privacy.

    HOWEVER: Studies indicate that the average patient over the age of 67 is taking eight medications (be aware that, for purposes of drug interactions, many substances other than prescription meds can be highly significant). Also, studies have shown that somewhere between seven and eight meds, the chance of an unintended drug interaction reaches 50%.

    Further: Look at the literature. Though thousands of drug-drug (or class-class) interactions are known, many more are not verified (or quantified to a degree where they can be adequately weighed in clinical decision making). Worse, only a handful of 3-drug interactions are known, and almost no 4-drug and higher interactions have been documented. Finally, we have barely begun to scratch the surface of stereochemical racemic mixes and drug-gene interactions.

    How could we fully understand unintended drug-gene interactions (i.e. interaction between a drug and some gene or gene product aside from its intended target) when we have barely mapped the outline of the human genome, and never sequenced a single individual, much less the range of variation in the species? (Mapping is like drawing the outlines of the states and the major cities; sequencing is like having a complete roadmap - interpretation of the sequence is a couple of orders of magnitude beyond merely possessing the sequence, a fact we molecular biologists have successfully obscured, in our rush to get others as excited about our work as we are.) Even assuming a rate of growth akin to Moore's Law, we are still many decades from the kind of knowledge you seem to assume we have.

    FURTHER, treatment often involves knowingly balancing the risks of a treatment plan against the risks of other candidate plans, and the the risks of the initial condition. It is not always easy to see these risks and effects without deep analysis and careful double blind studies.

    For example, it wasn't long ago that many ICUs did not allow the administration of ACE inhibitors to certain types of heart failure patients, because the patients would visibly decline, and often die. There were reasons to suspect ACE drugs might help - but any young physician who tried them soon learned the same cruel lesson. However, long-term analyses (which were difficult to get authorized, given the obvious mortality) showed that 1-year survival was actually significantly increased -- those patients who immediately got sicker with ACE inhibitors would likely die in the next 6 months anyway, but after the initial shakeout many would actually do better after several days of ACEI, and overall, more patients were alive (and healthier) at 1, 3, or 5 years with the ACE drug than without.

    As a physician and molecular biologist, the problems you cite frustrate me immensely, but it took me years of medical training to fully appreciate how incomplete our knowledge is, and how potentially NP-unsolvable the problem of diagnostics and therapeutics is.

    That's not to say that I don't think it can be done much better. It can (and even doctors I admire could often stand some improvement). I've spent a good chunk of my life training for and working on these kinds of solutions in computing and molecular biology, as well as medicine.

    I'M SORRY, but your blanket indictment, though superficially similar to remarks I have made (backed by appropriate data and studies) in peer-reviewed journals, leads to precisely the wrong conclusion when the time comes to make medical policy, and I felt that I should make some effort to correct it. The factors you cite should be kept in mind, true, but they are not even remotely the entire picture. Worse, it is impossible to assess exactly *how much* of the picture they are, but I think we can safely say they are less than 50%.

    Every citizen's opinion counts, which makes it important that they hear the "other side". Not only do citizens help mold public policy, but more important, they are my patients, and I believe that the better background they have in their daily lives, the better equipped they will be to make the medical decisions that, in the end, are theirs, not mine, to make.

    Sorry this was so long. As Descartes said: "If I had had more time, I'd have written a shorter reply."
  • Doug Lenat has been working on such a system for 20 or so years. http://www.cyc.com/ It's a pretty "smart" system even if it is a behemoth.

    --

  • >>>Enkephalonetics
    >>Which is?
    >using your head

    Or, more generally, using anyone's head.

    Computers are cute toys, but we've already seen wetware being used to mediate the control of mechanisms. If we can use it to mediate information processing, computers will be relegated to the status of diagnostic tool, low-end user interface, and arithmetic calculator.

    Cybernetics is machines that think.

    Encephalonetics will be brains used as machines.

    The spelling with the k's (kibernetics, enkephalonetics) is just how you say it if you're actually ancient Greek.

    --Blair
  • What we need now is a way to re-animate Norbert Wiener.

    We can tell him, "you know back when you said that machines could do the thinking? Called it 'kibernetics'? Well, it turns out we couldn't do that, so we've adapted humans to do the thinking and we feed it into machines so they can digest it seven times better than fishing around the Science Citation Index. It's only half as good as experimentation but at a micro-fraction of the cost..."

    I think at that point he'd understand the human mind and we could get onto Enkephalonetics, which is where this little electromechanical distraction is really leading us.

    --Blair
  • It's like Google for genes!
  • don't be dumb... you have to print today's paper yesterday and next week's magazine last week, otherwise people realize that today's paper covers yesterday and this week's magazine is all about the trends going on last month just by looking at the date. With the current system they only catch on to the information lag time if they are otherwise sentient and informed....

    er... no offense... 8-)

    --
    Rob White,
    Cv - Cv = 0 Therefore there is an absolute frame of reference.
  • by Jeremy Erwin ( 2054 ) on Thursday May 03, 2001 @07:06PM (#247053) Journal
    Interesting technique, but it depends in large part on the use of a standard nomenclature. If a protein is known as "p89" in one article, and as "acetylcholinesterase II" in another article, a link cannot be established so easily.
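
    A rough sketch of the problem in Python: co-mention matching only works once every alias has been mapped to a single canonical name. The synonym table here is made up for illustration (it reuses the hypothetical "p89" example above) and is not PubGene's actual dictionary or method.

```python
import re

# Hypothetical synonym table: alias (lowercase) -> canonical name.
ALIASES = {
    "p89": "acetylcholinesterase II",
    "acetylcholinesterase ii": "acetylcholinesterase II",
}

def canonical_mentions(text, aliases=ALIASES):
    """Return the set of canonical names whose aliases appear in text."""
    found = set()
    lowered = text.lower()
    for alias, canonical in aliases.items():
        # Match the alias as a whole token, not as part of a longer word.
        if re.search(r"(?<!\w)" + re.escape(alias) + r"(?!\w)", lowered):
            found.add(canonical)
    return found

a = canonical_mentions("We cloned p89 from rat brain.")
b = canonical_mentions("Acetylcholinesterase II activity was reduced.")
print(a == b)  # both articles resolve to the same canonical name
```

    Without a table like this, the two articles share no name at all, and no link between them can be drawn.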

    This technique, moreover, appears only to collate published interactions-- helpful, perhaps, in guiding the conduct of basic research, and the avoidance of duplicate studies-- but less useful when the goal of a researcher is determining the function of unknown genes, or putative protein products. In those cases, protein fold databases or motif databases are much more useful.

    Can one give Pubgene a pdb or fasta file-- and find papers on homologous genes or structurally similar proteins-- or must one use BLAST, or a fold recognition algorithm prior to searching Pubgene?
  • by acomj ( 20611 ) on Thursday May 03, 2001 @05:23PM (#247054) Homepage
    This is interesting. I used to work at a place trying to do "meaning based search" in the medical field. They were working on, among other things, ontology-based search and a search for protein-gene relationships.

    There was a paper in the office by some professor who used a Brill learning algorithm with existing genes and then had it try to guess what random genes did. It did very well in the test despite the "primitive" AI.

    3rdmill and spotfire /labbook and a host of others are working on this stuff to sell to pharma companies to do better search and allow quicker, more accurate drug creation.

    There is a lot of computing power in the life sciences field, and a lot of data created with gene-clips and assay data. People can't sort it all out anymore; some computer analysis makes everything faster. Look at the human genome.
  • College isn't so bad, it all depends on where you go. No one makes you live in the dorms at a public university (at least not a UC). I understand your problem with the casual interest vs. career, but you do have opportunities to learn outside. I'm taking a class in Roman History because it's required for me to graduate as a general requirement. Ancient history is something I'm fond of, but not something I'd major in. You do have the opportunities to look around though. I know tons of people who spend their first two years just looking around. Changing majors is a normal and constant thing. The goal of college isn't generally to trap you into what you want to do, but to let you feel around a bit (a lot more than in high school!) and finally pick a course. Even then there are options. I'm majoring in molecular biology, with a specialization in computing, and an English minor :-)

    "I may not have morals, but I have standards."
  • by Proud Geek ( 260376 ) on Thursday May 03, 2001 @02:27PM (#247056) Homepage Journal
    It's good that this is done in the open. I would be uncomfortable with genetic engineering being an open source discipline (i.e. downloading a gene set and coding your own modifications to it, then compiling it into a living creature), but I'd much rather have the knowledge out in the open than locked up for who knows what to happen to it.
  • by DarenN ( 411219 ) on Thursday May 03, 2001 @02:42PM (#247057) Homepage
    Ha! Excuse me, but I can't really see your "genetic code compiler", where you download, ./configure, and make. Remember that gene therapy is highly experimental, and genetic modification a bit of a hot potato at the moment ethically and morally. (I, by the way, believe that genetics is the way forward medically, but that's just MHO. See this story [newscientist.com] about restoration of sight to blind dogs).

    We're a long way from self-modification (I 4m 3l337 with the biggest cock ever kinda thing!) if we ever get there.

    But I do agree with you, information should be available. And that's what this article is about. It was an ingenious method of searching vast quantities of data to link relevant papers.

    One of the best things about this is that the methodology could possibly be applied to disciplines outside genetics, speeding up research in other areas.
  • by 6EQUJ5 ( 446008 ) on Thursday May 03, 2001 @04:11PM (#247058) Homepage

    Genome@home [stanford.edu]
  • by mizhi ( 186984 ) on Thursday May 03, 2001 @02:26PM (#247059)
    First off, yeah... that reaction pretty much sums it up. "Wow."

    But I find it interesting that their method was so simple. It didn't involve any really complicated methods... basically a glorified text scanner. Yet it was able to predict some new interactions that hadn't been predicted before. Still, it was only 7 times better than random guessing... I wonder if that could be improved any?
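
    For the curious, the core idea fits in a few lines. A minimal sketch in Python, assuming the "text scanner" amounts to counting how often two gene names turn up in the same abstract; the gene names and abstracts are placeholder toy data, and this is not PubGene's actual code:

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(abstracts, gene_names):
    """Count, for each gene pair, the abstracts that mention both genes."""
    counts = Counter()
    for text in abstracts:
        # Crude tokenizer: uppercase, drop commas and periods, split on spaces.
        words = set(text.upper().replace(",", " ").replace(".", " ").split())
        present = sorted(g for g in gene_names if g.upper() in words)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

# Placeholder abstracts with made-up gene names.
abstracts = [
    "GENEA regulates GENEB expression.",
    "GENEB binds GENEA, and GENEC is unaffected.",
    "GENEC mutations were screened.",
]
genes = ["GENEA", "GENEB", "GENEC"]
for pair, n in cooccurrence_counts(abstracts, genes).most_common():
    print(pair, n)
```

    Pairs with high counts are the candidate links; name disambiguation and any comparison against a random baseline are left out here.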

