Office of the Press
For Immediate Release June
NEAL LANE, ASSISTANT TO THE PRESIDENT
FOR SCIENCE AND
DR. FRANCIS COLLINS, DIRECTOR OF THE NATIONAL
HUMAN GENOME RESEARCH
DR. CRAIG VENTER, PRESIDENT AND CHIEF SCIENTIFIC
OFFICER, CELERA GENOMICS
DR. ARI PATRINOS, ASSOCIATE DIRECTOR FOR BIOLOGICAL
AND ENVIRONMENTAL RESEARCH,
DEPARTMENT OF ENERGY,
ON THE COMPLETION OF THE FIRST
THE ENTIRE HUMAN GENOME
The James S. Brady
MR. LOCKHART: I think
we've established why we here in the White House Press Secretary were not
entrusted with the Human Genome Project. (Laughter.)
We're very honored today to
have a very distinguished group to come to brief you. I think you've all
seen the event. But Dr. Neal Lane will open this with a brief statement,
the President's Science Advisor; followed by Dr. Francis Collins, the Director
of the NIH, and Dr. Craig Venter, the CEO of Celera.
DR. LANE: Thank you. Thank you, Joe. Good morning
everyone. You have just heard President Clinton, Prime Minister Blair
congratulating all the members of the scientific teams of the Human Genome
partnership, the public effort involving the United States and the United
Kingdom and several other countries, having reached an important milestone in
sequencing the human genome; as well as Craig Venter, President of Celera,
and his team who have completed their first assembly of the human genome.
It is an extremely exciting day. It is a forward-looking time because of
the enormous opportunities for the use of this scientific information to
benefit all peoples of the world.
I would now like to ask Francis Collins to make a brief statement and then Dr.
Craig Venter and then we'll take your questions.
DR. COLLINS: Well thank you, Neal. This is a happy day for science,
and I think for the public, both here and around the world. I have the
honor of serving as the project manager, I guess is the right word, of the
International Human Sequencing Consortium, which has been laboring to try to
develop methodologies and then apply them for sequencing the 3 billion letters
of the human DNA code. We can now say it's more like 3.15 billion
letters, because we have a better handle on it.
That involves investigators, not only in the United States, but also in the
United Kingdom, in France, in Germany, in China and in Japan. And that
has been a particularly gratifying aspect of this. Because this is, after
all, our shared inheritance and it's nice that we're working on it together
around the world.
What we are announcing today is that we have reached a milestone that we
promised to get to just about now; that is, covering the genome in what we call
a working draft of the human sequence. That is not to say that we have it
all finished and zipped up and every last letter precisely identified.
That will take a number of additional steps and probably the better part of the
next couple of years to achieve.
But if you're sitting somewhere in the genome right now there is a very good
chance you're in our database. And if you look to one side or the other
of any particular letter in the DNA code you will find that you're sitting on
an uninterrupted stretch of sequence that runs about 200,000 letters in length,
and most of the sequence is there. So for the scientist who's working,
trying to unravel a mystery of some sort -- and many of these are mysteries
about disease -- this database is now in a form that makes it possible to
answer many of those questions very quickly.
Back in the 1980s, I had the experience of trying to track down the cystic
fibrosis gene. It took us about 10 years of very hard work to finally
succeed at that endeavor. And there were probably 100 investigators
involved and millions of dollars were spent on this enterprise. I can
tell you that with the database that's now available as of today, an average
post-doc working in a lab would be able to accomplish that probably in a matter
of a couple of weeks. So it is profoundly gratifying to see this come
along in this fashion.
Finally, I just would like to say how nice it is to share the podium today with
Dr. Venter. I want to recognize his wonderful willingness to come forward
in the way that has led to today, with the plans here for a simultaneous
announcement of these milestones. The work that his company has done is
really quite remarkable. I think it's a wonderful example of the way in
which the academic community and the biotech community and the pharmaceutical
industry in this country and around the world are really laboring together here
to try to achieve what we all hope for, which is an alleviation of suffering
and a cure for disease.
I'd also like to thank the other person standing up here, Ari Patrinos, of the
Department of Energy, for the important role he's played in leading the Genome
Project, and the catalytic efforts that he has played in getting today to
thank you very much. I'll turn it to Craig.
DR. VENTER: Thank you, Francis, for your very nice comments. In a
few hours -- or at 12:30 p.m., Francis and I will be making detailed
announcements at a press conference across town, where we will be describing
much more detailed information about the scientific accomplishments of the two
Celera, 18 miles from here in Rockville, Maryland, started sequencing the
genome in September, just nine months ago. We announced a while ago that
we had finished the sequencing phase, and today we're announcing that we've
actually now assembled all that data into the linear sequence of the human
This is an exciting stage. It's far from the end-stage, as Francis
said. In fact, annotating this, characterizing the genes, characterizing
the information, while that's, in reality, going to take most of this century,
we plan to make a very significant start on that between now and later this
year, when Francis and I agreed to have the two teams try to simultaneously
publish the results of the different efforts. At that stage, they'll be
really able to be compared in detail. The scientists will be able to
really go through the information in dramatic fashion.
Like Francis, I spent a decade looking for one gene. That gene cost
hundreds of millions of dollars to actually find and sequence, and it was a
combined effort of NIH funding and work funded by Merck. That same
discovery today would take 15 seconds by scientists using the Celera
database. And pharmaceutical companies, biotech companies and university
researchers are making those discoveries probably as we speak -- unless they're
still watching television.
I'm pleased that Francis worked with me, certainly with the help of Ari
Patrinos, to have this event be the focus, and shifting the focus to the
importance of this work to all of us and to humanity. And if we're going
to be the custodians of the genetic information and be trusted to analyze it
and interpret it appropriately, we felt it was important for us to rise above
the squabbles that you've read about, to act more at the level appropriate with
this situation. And I thank Francis for his effort in that
DR. LANE: I should have introduced Ari Patrinos, who runs the Department
of Energy's human sequencing research efforts. Department of Energy has
been very important from the outset in the concept of the Human Genome
you want to say a word? Good, then we're ready for your
Q Dr. Venter, can you tell us what the thought processes were
that made you -- and the timing of when you decided to make this a joint
DR. VENTER: Well, it's been something that's been under works for a very
long period of time, but really became much more actively involved when Dr.
Patrinos arranged a secret meeting between himself, Francis Collins and myself
that turned into a long series of meetings. And I think it's something
that we all had hoped would happen. It took the individual efforts of all
of us to really make it happen.
Q When was that?
DR. PATRINOS: It started on May 7th, and was followed by three
other meetings. The last one was just last week.
DR. COLLINS: And all of those were in Ari's house, and he served beer and
pizza, which was an important part of the good outcome here. (Laughter.)
Ari, I think, deserves a great deal of credit for being a catalyst. When
I called him up in late April and said, can we try this, he was quick to say,
yes, let's give it a shot, and put together that first discussion. And
things went very well. And thank you, Ari.
Q Where and when are you guys going to publish, and what are
you guys going to do with the accompanying data?
DR. VENTER: It hasn't been absolutely decided either where or when. We
expect it to be later this year. We're still working on, by assessing the
state of data interpretation and writing of manuscripts in both camps, and then
we'll try to collectively decide on a time for a submission.
There are several scientific journals that have been wooing us, let's say --
(laughter) -- and I don't think we've made an absolute final decision where
that will be.
DR. COLLINS: There's a prodigious amount of work involved in doing the
analysis of these 3.1 billion letters of the DNA code, and that is very
vigorously underway right now in a public project by a team of investigators
that have been meeting by conference call and a variety of other
mechanisms. And we aim to try to write really good papers here, not just
say, oh, we did it. But also, what did we find here? In the first
pass through the human genome what can you learn about what genes are
there? And maybe what's not there, as well.
The intention is to be sure that these are papers that will stand the test of
time. And we look forward to the opportunity to do this simultaneously
with what Celera is doing.
Q What about the data that will accompany it, where is that
going to be deposited?
DR. COLLINS: Craig should speak for Celera. In the public project,
as you know, all of the sequence data is deposited onto the Internet every 24
hours. And the analysis of that sequence data will also be appearing very
quickly on the Internet, even in advance of publication. But the papers,
of course, themselves, will stand on their own because of the additional
higher-level analysis that they will include.
DR. VENTER: Celera's data is available right now to the academic and
pharmaceutical and biotech worlds, but it's through subscription at the
moment. In the fall, when we actually publish our scientific analysis of
the genome, that data will be available to academic scientists via our Internet
Q As this progresses, how is Celera going to make money on
DR. VENTER: The question is how is Celera going to make money on the
public data. Hopefully it won't. Celera has independently sequenced the
genome. We decided as a corporation that it was such a significant event
that when we were finished with sequencing the genetic code of our species we
would make that data freely available to scientists around the
We've indicated that the effort
to make a financial return for our investors will be from understanding that
information. We're right now helping some of the biggest and best
pharmaceutical and biotech companies and academic institutions interpret the
human genetic code. A key part of this is we will have the mouse genome
sequenced by the end of this year, and that will be very key for a layering on
top of the human genetic code to in fact interpret it.
But our work previously has shown with the close to 24 genomes that we've done
both at TIGR and at Celera, is that having one genetic code is important, but
it's not all that useful. And it's only through comparative genomics --
having both human and mouse, dog, chimpanzee, rat, other species to layer on
top of the human -- will we only then be able to truly begin to interpret the
DR. COLLINS: I want to completely agree with the conclusion that the
human sequence, without comparisons to draw to it, is going to be very
difficult to understand. And, in fact, the public project is also engaged
in beginning the process of sequencing other complex genomes, including the
rat, and a fish called the zebra fish, and also the mouse, but a different
strain than what Celera is doing. And I think Craig and I would agree
that that's a good thing, that these are complementary efforts, and that you
learn a lot from whatever sequencing of this sort you do.
I would also strongly want to point out that even with those sequencing efforts
coming into fruition, we will need a lot of other tools to understand how the
genome works. Methods of studying not just one gene at a time, but the
whole genome, in terms of its function. And that's a major goal of the
Genome Project in the coming years.
Q I'd like to ask if you all are -- you mentioned doing a
joint conference to annotate. Is this going to be like the Drosophila
Conference that went on -- the jamboree?
DR. VENTER: No, in fact, I think what the President said is we would --
after publication of our different versions of the genome, we would have a
joint scientific conference to compare the results; but, more importantly, I
think, to analyze the methods -- which are very different for the two different
genome projects -- to understand the best methods for people to go
DR. COLLINS: Yes, I think that's going to be really interesting. In
fact, for the mouse, we're actually beginning to do that sequencing by a
combination of the whole genome shotgun effort that Celera has pioneered, and
the map-based effort, which the public project has been using for humans.
Maybe for the mouse, we'll try a combination. But being able to sit down
together, and really look at the ins and outs and the details of what kind of
sequencing came out of these approaches after the time of publication, is going
to be incredibly interesting, and I would think, a lot of
Q Dr. Venter, can you talk about how many patents you've
applied for so far on the information that you've derived, and how many you
think you'll have applied for before you finally make the data public in
DR. VENTER: As of recently, I think, we're up to about two dozen unique
gene patents that Celera has filed for. The number is changing constantly
as discoveries are made with Celera and its pharmaceutical partners:
Pfizer, Novartis, Pharmacia, Amgen, Takeda in Japan. We're only filing
patents on genes that our pharmaceutical partners tell us are essential for
their programs to develop new therapeutics, develop new
Celera is not following the route of some of these other biotech companies that
are just randomly patenting sequences that they download nightly from the public
effort on a speculative basis. We think it's only important to patent
things -- and I compliment the patent commissioner. A recent report was
issued doing what, I think, both Francis and I would agree that we are pleased
to see is raising the bar, requiring much more information on gene patents than
just simply downloading data off the Internet and doing a quick computer
search. So I think we're definitely working in the right
Q If I could follow up on that a little bit. I think
one of the big sticking points between the two approaches is that the
consortium was looking at every single thing in the genome, even nothing --
spaces -- it's interesting -- whereas, Celera was looking for things that were
patentable, proprietary. To what extent have you been able to reconcile
your different world view there?
DR. COLLINS: I think that's actually an incorrect view. Both
methods were aimed to try to look at the entire genome, because we imagine that
all of it is interesting, and we would be kind of foolish to pretend that we
were smart enough to know what wasn't interesting at this point, so let's just
look at all of it.
And I think, actually, Craig's view and mine on the appropriateness of patents
are much closer together than most reports would have suggested. And I, too,
want to compliment the Patent Commissioner for looking at this issue very
carefully over the course of the last few months and setting some new utility
guidelines that are, I think, quite reassuring in terms of making sure we end
up with an outcome where the patent system is used to provide an incentive for
research and not a disincentive.
Q Dr. Collins, could you explain the difference between what
you've done and what Celera has done?
DR. COLLINS: Well, how deeply do you want to get into science here,
because the answer is going to be of that sort. We will talk about it at
12:30 p.m. But very quickly, the way in which the public project has
sequenced the human genome is to first break it down into pieces that are
roughly 150,000 letters in length.
Those are relatively straightforward to generate, but fairly challenging to
figure out where they go. We have spent a lot of our effort, particularly
over the last year, assembling those pieces into large, contiguous fragments of
DNA across chromosomes. And we have 97 percent of the genome now covered
with those mapped pieces of DNA. We use those pieces, those 150,000
letters long, as our substrate for doing the sequencing. So we know how
to put that back together.
That is not nearly as challenging a computer problem as the method that Celera
has been using, which is quite innovative and it requires, obviously, a lot of
we take those pieces, sequence them one by one, and then reassemble the whole
thing back on to the chromosomes, which has been going on particularly
vigorously in the last month; and deduce what the original sequence must have
been by that method.
Celera takes an approach where they skip over the step of having these
150,000-letter-long pieces and go straight to the sequencing process and then use a
computer to reassemble the whole thing at the end of the effort; which
obviously has some advantages. Because you don't have to spend all the
time and effort on the mapping phase, although, the assembly process -- I
imagine Craig might want to comment on this -- is pretty challenging. And there
are some uncertainties to the degree with which repeated sequence in the genome
may give you headaches of various strength. But maybe you should add to
DR. VENTER: There is a couple of other important differences in terms of
with the Celera approach for the whole genome shotgun -- we take all the DNA
out of the cells of individuals. So we actually have genomes that
actually represent individuals' entire genetic repertoire. Whereas, some
of the BAC libraries have come from -- I don't know what the total number is
-- but a variety of different individuals. And I think this workshop that
we were talking about earlier could be actually very instructive in terms of
seeing if the two different approaches give the same view of the human genetic
code. And I think that's going to be very instructive for all of
The calculation that we've done on assembling the genome is certainly, I think,
calculations that are larger usually come out of the Department of Energy with
some of the supercomputer processing there. I think we did 5 million,
trillion calculations to assemble the human genome. It took 20,000 CPU
hours on one of the largest supercomputers in history. But it does
reassemble the entire genetic code of individuals. And we did this with a
fruit fly. Scientists now studying it have reported that there is less
than one error in one million base pairs with it, so the method is clearly
accurate. But it could give a different answer than these different
cloning methods. And I think it's going to be a very instructive --
probably not for the rest of the world, but certainly for the scientists
involved to compare the details of those.
Q Dr. Collins, when you called Dr. Patrinos and talked about
meeting with Dr. Venter, what led you to believe he would be willing to talk
with you about a joint -- at that particular time.
DR. COLLINS: Well, I have to say, although it may not be a popular
statement in this room, that the focus on the race and the personality issues
that have been so prominently featured in press stories about the genome have
in many ways done a disservice to the situation. I don't think the level
of animosity or hostility was anything approaching the way it was described in
some of the pieces that Craig and I have had to read.
This is, after all, a noble enterprise. Sequencing our genomes should not
be something that is tarnished in some way by what appears to be a cat fight
amongst people who are involved in the enterprise. I think both of us
have felt disheartened by the way in which that so dominated the public image
of what was going on with the genome project.
With both of these projects, obviously proceeding extremely well, there is a
great opportunity here for sharing information after publication; because of
the complementary nature of the scientific strategies it seemed absolutely the
right moment to sit down together and try to figure out a model for cooperation
and coordination so that the public would be the greatest beneficiary and we
could put behind us this chapter, which I hope history will not be very
interested in, which has gone on for too many months and has really done a
disservice to the hard labors of thousands of people around the world who have
been trying to make this happen for the benefit of mankind.
MR. SIEWERT: We have another event so we'll take one more
Q Dr. Venter, you have mentioned that Celera is working with
Pfizer, Amgen and other companies. Could you just, for a lay person,
explain what kind of work you are doing that comes out of the human genome
research? And also to kind of both of you, the President, in the event,
said that genetic science will realize the treatment and prevention of almost
all human diseases. When do you think that those big kind of
breakthroughs might start coming?
DR. VENTER: Let me try and answer the second question first. What
this information will do is cause a catalytic change in how researchers do
their work. Instead of funding the kind of programs that I spent 10 years
doing, Dr. Collins spent 10 years doing, as you heard, those can be reduced to
between 15 seconds and two weeks. So the challenge now is for the
scientific community to reassess how we fund science, and what we're funding to
make sure that this information now gets used as the beginning of this new
There will be discoveries made across the board. But it's impossible to
predict which diseases, at this point, will see the breakthroughs first. What
we know, from scientists studying the intricacies of the genetic code and the
genes that all of us are discovering will be the new starting point for going
much faster. And those discoveries will build on each other.
We certainly hope to begin to see things just in the next few years. But
some diseases, and probably the ones we care about the most, could take
longer. They could be much more refractory because we have to understand
how the 50,000 or so genes work together to actually form life. And that's
never been possible to even contemplate before without having the genetic
DR. COLLINS: Can I add to that? The reason I got interested in
genomics to begin with was, as a physician, this enormous frustration in not
understanding diseases well enough to be able to offer very
And when you look at what we currently know about things like diabetes and
heart disease and Alzheimer's disease, it's not nearly sufficient to enable us
to be able to design the strategies that we all hope for that will really cure
these illnesses. I would be willing to make a prediction that within 10
years, we will have the potential of offering any of you the opportunity to
find out what particular genetic conditions you may be at increased risk for,
based upon the discovery of genes involved in common illnesses like diabetes,
hypertension, heart disease, and so on.
In many instances, that kind of predictive information could be quite useful to
you, provided we put in the appropriate protections so that people don't use it
against you. Because it would allow you to practice individualized,
preventive medicine, focusing on the things that are most important for your
Over the longer term, perhaps in another 15 or 20 years, you will see a
complete transformation in therapeutic medicine, because every pharmaceutical
company is investing, and every biotech company is also contributing to the
development of new targets for drug therapy, based upon the genome. And
the therapies that we use 15 or 20 years from now will be directed much more
precisely towards the molecular problem in things like cancer, or mental
illness, than anything that we currently have available.
count on this happening. We've got to be patient -- well, maybe we
shouldn't be patient. We should be impatient, but I do think we have to
expect this is going to take a lot of hard work, a lot of good research, a lot
of funding for both the public and private sectors, a lot of partnerships in
ways that we have to be very creative about. But the vision is a very
DR. VENTER: Let me just briefly answer the first part of your question,
which is how do our pharmaceutical partners and subscribers use this
information. They're all linked in through virtual private networks,
through very high-speed lines that are in basically every time zone around the
world. They do searches on the data based on a daily, sometimes
minute-by-minute basis. They've already made some tremendous discoveries
in each of their own disease areas. And they're using some of the genes
right now to move forward; drug design and drug targeting.
Dr. Les Hudson from Pharmacia, one of the top pharmaceutical companies in the
world, he's the head of research there, he is one of the earliest subscribers
to the Celera database. He will be at the 12:30 p.m. press
briefing. And he said he would be available for answering questions about
how the pharmaceutical industry is now using this data.
But it's impossible to gauge every one of the possible ways that they use
it. That's why we just make the information available; some very dramatic
research tools where people can interpret the data and make some very key
discoveries. And every one of them have made some pretty exciting
discoveries that they will be announcing on their own time.
11:38 A.M. EDT
Office of Science
and Technology Policy
1600 Pennsylvania Ave, N.W
Washington, DC 20502