On Thursday, March 28, two groups, the Beijing Genomics Institute and Syngenta/Myriad, announced that they would publish draft DNA sequences of the two major subspecies of rice in the April 5th issue of Science magazine. It is appropriate to consider this announcement against other efforts to achieve the same goal.
What will the rice sequence be used for? Rice is the most important cereal crop for half of the world's population. Increasing population pressures, coupled with losses in arable land, water, energy-dependent fertilizer, and other resources for sustaining agriculture, make steps to maximize rice productivity especially important. Plant breeders are able to identify and to map traits for yield, disease resistance, and tolerance to environmental stress. DNA sequence of rice that is tied to the genetic map facilitates the identification of genes governing those traits. Pinpointing the crucial genes will expedite transfer of beneficial traits into locally adapted elite lines and will permit plant breeders to search for useful allelic variants. In fact, publicly available sequence information has been used to discover genes that are responsible for controlling flowering time. This permits plant breeders to grow higher yielding rice strains in areas with different day lengths.
Rice has the smallest genome of all of the cereal grasses, making it a prime candidate for genomic sequencing. Moreover, the arrangement of genes on the chromosomes is similar in all cereals, including corn, wheat, barley, rye, sorghum, oats, and millet. As a consequence, information about the rice genome can be applied to the other cereals, which have much more DNA. The genome of corn, for example, is at least five times as large as that of rice, and that of wheat is 40 times as large.
An international effort was established in 1997 to obtain a high quality, publicly available, map-based sequence of the rice genome. The International Rice Genome Sequencing Project (IRGSP) currently comprises ten members: Japan, the United States, China, Taiwan, Korea, India, Thailand, France, Brazil, and the United Kingdom. The IRGSP subscribes to the policy of immediate sequence release and has already published 68% of the rice genome in public databases in daily increments. The sequence of the entire genome with at least 10X coverage will be available by the end of this year. The IRGSP has adopted a "clone-by-clone" approach, which means that every clone sequenced can be associated with a specific position on the genetic map. The participants are committed to an accuracy standard of less than one error in 10,000 bases. The rice genome is composed of about 400 million bases.
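To put the figures above in perspective, the following short Python sketch works through the arithmetic. It uses only numbers quoted in this statement (about 400 million bases, 68% released, 10X coverage, fewer than one error in 10,000 bases, and the corn and wheat size ratios). The function names are illustrative, and the results are rough estimates, not official IRGSP statistics.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the text.
# All names here are illustrative; the results are rough estimates.

RICE_GENOME_BASES = 400_000_000   # "about 400 million bases"
PERCENT_RELEASED = 68             # "68% of the rice genome"
COVERAGE = 10                     # "at least 10X coverage"
BASES_PER_ERROR = 10_000          # "less than one error in 10,000 bases"


def released_bases(genome_bases: int, percent: int) -> int:
    """Approximate number of bases already in public databases."""
    return genome_bases * percent // 100


def raw_bases_for_coverage(genome_bases: int, coverage: int) -> int:
    """Minimum raw sequence needed so each base is read `coverage` times on average."""
    return genome_bases * coverage


def error_ceiling(genome_bases: int, bases_per_error: int) -> int:
    """Upper bound on errors in the finished sequence under the accuracy standard."""
    return genome_bases // bases_per_error


def scaled_genome(genome_bases: int, factor: int) -> int:
    """Genome size of a cereal that is `factor` times as large as rice."""
    return genome_bases * factor


if __name__ == "__main__":
    print("Released so far:   ", released_bases(RICE_GENOME_BASES, PERCENT_RELEASED))
    print("Raw bases at 10X:  ", raw_bases_for_coverage(RICE_GENOME_BASES, COVERAGE))
    print("Error ceiling:     ", error_ceiling(RICE_GENOME_BASES, BASES_PER_ERROR))
    print("Corn (at least 5x):", scaled_genome(RICE_GENOME_BASES, 5))
    print("Wheat (40x):       ", scaled_genome(RICE_GENOME_BASES, 40))
```

Under these assumptions, roughly 272 million bases are already public, 10X coverage implies at least 4 billion bases of raw sequence, and the accuracy standard allows fewer than 40,000 errors across the whole genome.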
The public effort to sequence the rice genome has been aided by contributions from private corporations. Two years ago, the Monsanto Company announced that they had produced a draft of the rice genome. They made the sequences available to the IRGSP and to public researchers. Monsanto clones and sequences underlie 30% of the sequence in public databases. Novartis, now part of Syngenta, supported the physical mapping of the rice genome, that is, the ordering of the DNA fragments that are used for the sequencing.
The two groups that participated in the March 28 announcement took the approach used by Celera in sequencing the human genome. Instead of sequencing large ordered fragments of DNA, they sequenced many short, unmapped clones simultaneously. They then relied on computers to assemble the pieces into longer coherent units of sequence. Most higher organisms have repeated sequences in their DNA, which thwart a genome-wide assembly. Nevertheless, this is a useful strategy for learning the sequence of the major portions of most genes relatively inexpensively. The drawbacks are that the relationships of the genes to each other and to the genetic map are not clear and, because there are fewer sequence "reads" covering any one area, the sequence accuracy is lower. In fact, the draft sequence from Beijing has about 130,000 pieces. This contrasts with about 2,000 ordered pieces, sequenced with greater accuracy and all tied to the genetic map, for the two-thirds of the genome currently available in public databases.
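The computational step described above, in which short unmapped reads are joined wherever their ends overlap, can be illustrated with a toy Python sketch. This is a simple greedy overlap merger written for this explanation. It is not the software used by either group, the function names and the tiny example reads are invented, and real assemblers must also cope with sequencing errors, both DNA strands, and the repeats mentioned above.

```python
# Toy illustration of shotgun assembly: merge reads by their longest overlaps.
# Hypothetical sketch only; not the method used by any sequencing group.

def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of pieces with the longest overlap.

    Stops when no remaining pair overlaps by `min_len` bases, so the result
    may be several unconnected pieces rather than one sequence.
    """
    pieces = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return pieces
        merged = pieces[best_i] + pieces[best_j][best_len:]
        pieces = [p for n, p in enumerate(pieces) if n not in (best_i, best_j)]
        pieces.append(merged)


if __name__ == "__main__":
    # Three overlapping reads taken from one short made-up sequence.
    print(greedy_assemble(["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]))
    # Reads with no overlap cannot be joined and remain separate pieces.
    print(greedy_assemble(["AAAA", "CCCC"]))
```

The second call shows in miniature why a draft of this kind ends up as many separate pieces: wherever no read spans a junction, the assembly simply stops. Repeats cause a different problem that this sketch does not model, namely overlaps that look valid but join regions that are not actually adjacent.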
Public availability is also a question. The group from China posted their sequence on the web a couple of months ago for anyone who wanted to download or search it, and they have promised to make this sequence available in public databases. Public access to the Syngenta sequence is less clear. They have ruled out putting it in public databases, but may make it available to international sequencing groups and make limited portions available to researchers who sign a transfer agreement.
The IRGSP is committed to making use of all available resources to finish an accurate map-based sequence of the rice genome. We are currently involved in amicable negotiations with Syngenta and are hopeful that an agreement, very similar to the agreement with Monsanto, will shortly be reached that will allow IRGSP members to use the Syngenta data to complete our work. This would be a welcome boost to the IRGSP, especially for filling gaps in the current physical map and for choosing clones to sequence. It is also very helpful that both Monsanto and Syngenta chose to sequence the same variety of rice selected by the international group.
Posted April 2, 2002 by Takuji Sasaki and Ben Burr