This claim shall be judged with a scaled value based on the gene number assessment "X" for the "Genesweep" Gene Sweepstakes, which is scheduled to be completed at the 2003 CSHL Genome meeting, as follows:
result = (X - 25000) / 1000;
Notwithstanding the assigned judgment date, it is intended that this claim be judged as soon as the Genesweep Gene Sweepstakes assessment of the gene number is completed.
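As a rough illustration of the scaling above, the following Python sketch computes the claim value from an assessed gene count. The function name `scaled_result` is hypothetical, and the sketch applies no upper or lower bound, since the claim text states none.

```python
def scaled_result(gene_count: float) -> float:
    """Apply the claim's scaling: result = (X - 25000) / 1000."""
    return (gene_count - 25000) / 1000


# An assessed count of 35,000 genes scales to 10.0;
# a count of exactly 25,000 scales to 0.0.
low_estimate = scaled_result(35000)
baseline = scaled_result(25000)
```

Under this formula, every additional 1,000 genes in the assessed count moves the result by one point.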
Gene Sweepstake ("Genesweep")
This year's Cold Spring Harbor Genome conference saw the completion of 85% of the human genome in "working draft" form - with the expectation that 90% will be done by June 15th. One of the hotly debated topics was the number of human genes. This has been estimated at anything from 35,000 to 150,000.
Considering the spread of opinion, the only way to resolve it was to get people to bet on it - hence the gene sweepstake. This led to an interesting debate on the definition of a gene (see the text below along with footnotes) and how to assess that number. The basic rule is that the number will be decided in 2003, with a vote in 2002 on the method used to decide it.
Genesweep was organised by Ewan Birney, from the EBI, one of the technical leaders of the Ensembl project, which aims to provide a current and consistent annotation of the human genome, free to everyone. The betting book stays at Cold Spring Harbor under the care of David Stewart. Anyone can make a bet as long as they physically sign the book.
The Rules
The Gene Sweepstake will run between 2000 and 2003. The rules are:
It costs $1 to make a bet in 2000, $5 in 2001 and $20 in 2002.
Bets are for one number. The closest number wins, and in case of ties, the pot is split.
A gene is a set of connected transcripts. A transcript is a set of exons produced by transcription followed (optionally) by pre-mRNA splicing. Two transcripts are connected if they share at least part of one exon in the genomic coordinates. At least one transcript must be expressed outside of the nucleus and one transcript must encode a protein (see footnotes).
Assessment of the method used to determine the gene number will occur by voting at the Cold Spring Harbor Genome Meeting 2002. Researchers will be invited to submit their methods to the community at this time.
Assessment of the gene number will occur at the 2003 CSHL Genome meeting.
People betting should write their name, email and number in the Gene Sweepstake book, held at Cold Spring Harbor (contact: David Stewart).
One bet per person, per year. A year is defined as a calendar year.
No pencil bets (ie, you can't change your number).
Footnotes
We are restricting ourselves to protein coding genes to allow an effective assessment. RNA genes were considered too difficult to assess by 2003.
The key point in the gene definition is that alternatively spliced transcripts all belong to the same gene, even if the proteins that are produced are different.
The hope is that by 2003 we should have at least a hard floor to the gene numbers. The voting should be able to determine the best method.
The cost of betting goes up over the years because people will have more information.
The scope of the genome is the autosomal chromosomes and X and Y. Neither epigenetic nor mitochondrial genes are counted.
Encoding a protein assumes that the translation machinery does translate the sequence at some time.
The scope of the expression of genes is across all cell types and all developmental stages (obviously!).
The genome is defined as the reference sequence (hence a mosaic of haplotypes) as defined by Greg Schuler, NCBI.
Somatic recombinant loci are counted after recombination: ie, Ig and TCR loci will form one gene per locus.
Transcripts from repetitive regions are not counted even if expressed. A repetitive region is an element which is both repeated in the genome and has good evidence that its method of replication is based on a selfish replication strategy.
If trans-splicing is found in humans (it has not been found so far and is unlikely to occur, but just in case), the definition of the transcript applies after the trans-splicing event. This will split trans-spliced, polycistronic transcripts into multiple genes by this definition.
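The gene definition in the rules (a gene is a set of connected transcripts, where two transcripts are connected if they share at least part of one exon in genomic coordinates) can be read as a connected-components problem. The Python sketch below is one possible interpretation, not the assessment method chosen by the Genesweep vote. The `Transcript` class, its `coding` and `exported` flags, the half-open coordinate convention, and the decision to ignore strand are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Transcript:
    name: str
    exons: list             # (chromosome, start, end) tuples, half-open
    coding: bool = False    # assumed flag: encodes a protein
    exported: bool = False  # assumed flag: expressed outside the nucleus


def exons_overlap(a, b):
    """True if two exons share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def connected(t1, t2):
    """True if the transcripts share at least part of one exon."""
    return any(exons_overlap(a, b) for a in t1.exons for b in t2.exons)


def count_genes(transcripts):
    """Count groups of connected transcripts that qualify as genes."""
    parent = list(range(len(transcripts)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of connected transcripts (quadratic, fine for a sketch).
    for i, j in combinations(range(len(transcripts)), 2):
        if connected(transcripts[i], transcripts[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, t in enumerate(transcripts):
        groups.setdefault(find(i), []).append(t)

    # A group counts only if some transcript encodes a protein and some
    # transcript is expressed outside the nucleus.
    return sum(
        1
        for members in groups.values()
        if any(t.coding for t in members) and any(t.exported for t in members)
    )


example = [
    Transcript("t1", [("chr1", 100, 200), ("chr1", 300, 400)], True, True),
    Transcript("t2", [("chr1", 350, 450)], exported=True),  # overlaps t1
    Transcript("t3", [("chr1", 1000, 1100)], True, True),
    Transcript("t4", [("chr2", 100, 200)]),  # non-coding, so not counted
]
gene_count = count_genes(example)  # t1 and t2 merge into one gene; t3 is a second
```

Because connection is transitive under this reading, alternatively spliced transcripts that overlap only through an intermediate transcript still end up in the same gene, which matches the footnote that all alternatively spliced transcripts belong to one gene. The sketch does not model the exclusions for repetitive regions, mitochondrial genes, or somatic recombinant loci.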
I will judge this claim based on its intent, guided by the details of the wording. The intent of this claim is to predict the number of genes in the human genome as it will be agreed to by the attendees at the Cold Spring Harbor Genome Conference according to the rules drawn up at their meeting in June of 2000. This could very well be different from a broader scientific consensus estimate of that value.