Science

First Complete Gap-Free Human Genome Sequence Published (theguardian.com)

An anonymous reader quotes a report from the Guardian: More than two decades after the draft human genome was celebrated as a scientific milestone, scientists have finally finished the job. The first complete, gap-free sequence of a human genome has been published in an advance expected to pave the way for new insights into health and what makes our species unique. Until now, about 8% of the human genome was missing, including large stretches of highly repetitive sequences, sometimes described as "junk DNA." In reality though, these repeated sections were omitted due to technical difficulties in sequencing them, rather than pure lack of interest.

Sequencing a genome is something like slicing up a book into snippets of text then trying to reconstruct the book by piecing them together again. Stretches of text that contain a lot of common or repeated words and phrases would be harder to put in their correct place than more unique pieces of text. New "long-read" sequencing techniques that decode big chunks of DNA at once -- enough to capture many repeats -- helped overcome this hurdle. Scientists were able to simplify the puzzle further by using an unusual cell type that only contains DNA inherited from the father (most cells in the body contain two genomes -- one from each parent). Together these two advances allowed them to decode the more than 3 billion letters that comprise the human genome.
The science behind the sequencing effort and some initial analysis of the new genome regions are outlined in six papers published in the journal Science.
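The book-snippet analogy above can be made concrete with a toy example. This is only an illustrative sketch, not the method used by the sequencing project: the `toy` string and the function name `ambiguous_read_fraction` are made up here, and "ambiguous" simply means that a snippet of the given length appears at more than one position in the text.

```python
from collections import Counter


def ambiguous_read_fraction(genome: str, read_len: int) -> float:
    """Fraction of reads of length read_len that occur at more than one position."""
    # Every possible read: one per start position.
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    # A read is ambiguous if the same text also appears somewhere else.
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Hypothetical toy "genome": unique flanks around a 5x tandem repeat.
toy = "ACGTTGCA" + "GATTACA" * 5 + "TTGACCGT"

short_reads = ambiguous_read_fraction(toy, 4)   # shorter than one repeat unit
long_reads = ambiguous_read_fraction(toy, 40)   # longer than the whole repeat block

print(f"read length  4: {short_reads:.0%} of reads cannot be placed uniquely")
print(f"read length 40: {long_reads:.0%} of reads cannot be placed uniquely")
```

In this toy case most 4-letter reads fall inside the repeat and match several positions, while every 40-letter read spans the whole repeat block plus some unique flanking text, so each one has exactly one possible placement.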
This discussion has been archived. No new comments can be posted.

  • by az-saguaro ( 1231754 ) on Friday April 01, 2022 @02:23AM (#62407314)

    Looking at the comments so far, I thought I might add a clarification of some of the biology.
    The sequencing was done on a cell line established from a hydatidiform mole ("HM", aka molar pregnancy).

    An HM happens when sperm fertilize an ovum, but the ovum has already lost its nucleus or DNA. One sperm with 23 chromosomes can enter the egg and duplicate to a full complement of 46 chromosomes. Or occasionally, 2 sperm enter the egg. The resulting cell has either XX, or sometimes XY (YY cannot survive or function, so never seen). A fetus cannot develop. Instead, the mole grows to become a grotesque blob of intrauterine stuff that must be removed (and in some instances can later become cancer or have other complications).

    The mole is a freaky thing rather than a normal fetus or person, but its origin DNA is normal, from a normal father. That is why a mole is relevant to the DNA sequencing - the DNA being studied comes from just one person. So, if you can complete the sequence, you know it is a complete picture of one normal person. In this case, the study DNA is from an HM cell line immortalized in culture and available for study as a standard reference.

    Some links that provide direct info:
    T2T :
    https://sites.google.com/ucsc.... [google.com]
    Science :
    https://www.science.org/conten... [science.org]
    Science :
    https://www.science.org/doi/10... [science.org]

  • 3 billion letters, four possibilities per letter. That's only 1.5 gigabit of data, roughly 200 megabytes.
    Then the entropy seems to be limited because of repeating sequences.
    Looks like our "design files" are not as complicated as I thought.
    How many humans would fit in a compressed archive on a 32gig thumbstick?
    Guess this falls under the category "a fool can ask more questions in an hour than a wise man can answer in seven years"
    • How many humans would fit in a compressed archive on a 32gig thumbstick?

      Zero. The DNA doesn't define the position of every cell in the body. The environment inside the mother defines a lot of that, so if you put the same DNA into two different mothers you'd get two different-though-extremely-similar people, even if the two mothers were exact copies of one person, because no matter how hard you try they're going to not be actually identical. Their internal biomes will differ, for example.

      • The same goes for every piece of hardware, e.g. a CPU. One design, but the silicon lottery changes the max frequency. Still one fundamental design.
      • Two good examples to support this: Caesarian babies don't get their initial bacterial load from the doctors and nurses who handle the baby. Normally babies get their initial bacterial load by exiting through the birth canal. Second is twins separated at birth and thus their diets. By ~25 our gut biome is firmly in place. Their different bacterial profiles would change how the gut produces serotonin, processes glucose/sucrose/fat, and a host of other changes. Things like exposure to parasites and disease (th
        • Before someone thinks this is troll bait, I'm completely aware that epigenetic changes don't change actual DNA sequences. So much for 'unique personalities' and 'free will', and people 'always having a choice'. Oh, and I also forgot economic situations, which greatly change how the brain works. Environmental trauma changes the expression of genes.
    • There's some interesting work on the under-determination of development, especially in the brain. Instead of each neuron's connections being determined by DNA, our genome's brain-building instructions seem to be more along the lines of "send a bunch of axons in that basic direction and have them die if they don't find anything to connect to" for each group of neurons.

      The limits of brain determinacy [royalsocie...ishing.org]

      Exuberance in the development of cortical networks [mit.edu]

    • by Elros ( 735454 )

      3 billion letters, four possibilities per letter. That's only 1.5 gigabit of data, roughly 200 megabytes.
      Then the entropy seems to be limited because of repeating sequences.
      Looks like our "design files" are not as complicated as I thought.
      How many humans would fit in a compressed archive on a 32gig thumbstick?
      Guess this falls under the category "a fool can ask more questions in an hour than a wise man can answer in seven years"

      Am I missing something in your math?
      4 possibilities per letter is 2 bits per letter, thus 3 billion letters is 6 billion bits, i.e. roughly 6 gigabit. That's around 750 MB.
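The arithmetic in this sub-thread can be checked with a few lines of Python. This is a back-of-the-envelope sketch under stated assumptions: exactly 3 billion letters, only the four letters A/C/G/T, a fixed-width encoding with no compression, and a "32gig" stick taken as 32 × 10^9 bytes with no filesystem overhead. The function name `genome_storage_bytes` is made up for illustration.

```python
import math


def genome_storage_bytes(n_bases: int, alphabet_size: int = 4) -> int:
    """Bytes needed to store n_bases letters with a fixed-width encoding."""
    # Four possible letters need log2(4) = 2 bits each.
    bits_per_base = math.ceil(math.log2(alphabet_size))
    total_bits = n_bases * bits_per_base
    # Round up to whole bytes.
    return (total_bits + 7) // 8


N_BASES = 3_000_000_000          # "3 billion letters" from the comment above
STICK_BYTES = 32 * 10**9         # assumption: decimal gigabytes, no overhead

size = genome_storage_bytes(N_BASES)
print(f"one genome: {size / 10**6:.0f} MB uncompressed")
print(f"genomes per stick: {STICK_BYTES // size}")
```

Under these assumptions one genome takes 750 MB and 42 uncompressed copies fit on the stick. Compression that exploits repeats, or storing only the differences from a reference, would raise that count, but this sketch does not attempt either.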

  • anyone got a torrent link?
