Leading research to understand, treat, and prevent infectious, immunologic, and allergic diseases

Release of a Preliminary Assembly of ~5-Fold Sequence Data of Aedes aegypti

The National Institute of Allergy and Infectious Diseases, National Institutes of Health, has funded the sequencing of the Aedes aegypti genome through its Microbial Sequencing Centers at the J. Craig Venter Institute (JCVI) and the Broad Institute, Massachusetts Institute of Technology. Random sequence data representing ~5-fold coverage of the 813-Mb genome have been generated and assembled using Celera Assembler. The primary objective of this exercise was to create a preliminary picture of the genome that could be released to the scientific community. Although very rough, this assembly provides a first picture of genome size and complexity. Work to optimize the assembly parameters is still in progress. This release should be considered preliminary and is intended to facilitate early research within the scientific community. A final assembly incorporating data from additional sequencing to 8-fold coverage is expected in July 2005, with annotation of the final assembly to be completed by November 2005.

Assembly Statistics

  • Total number of sequence reads: 6.1 million
  • Total number of sequence reads that passed QC and were assembled: 5.59 million
  • Estimated fold coverage of genome (based on 813-Mb size estimate): 4.8-fold
  • Average contig coverage: 2.7-fold
  • Number of contigs: 311,599
  • Number of bp of DNA in contigs: 1.05 billion*
  • Number of scaffolds: 205,495
  • Number of bp of DNA in scaffolds: 827 million*
  • Number of singletons: 1.1 million

*These data suggest that the A. aegypti genome may be larger than expected and/or may contain significant levels of genetic polymorphism.
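The coverage figure and the footnote above rest on simple arithmetic, which the following minimal sketch makes explicit. It assumes the standard relation coverage = (number of reads × average read length) / genome size. The average read length is not stated in this release, so the value computed here is only what the quoted numbers imply, and the function and constant names are illustrative.

```python
# Figures quoted in the assembly statistics above.
ASSEMBLED_READS = 5.59e6   # reads that passed QC and were assembled
GENOME_SIZE_BP = 813e6     # prior genome size estimate (813 Mb)
FOLD_COVERAGE = 4.8        # estimated fold coverage
CONTIG_BP = 1.05e9         # bp of DNA in contigs
SCAFFOLD_BP = 827e6        # bp of DNA in scaffolds


def implied_read_length(reads, genome_bp, coverage):
    """Average read length implied by coverage = reads * length / genome."""
    return coverage * genome_bp / reads


def fold_coverage(reads, read_len, genome_bp):
    """Fold coverage for a given read count, read length, and genome size."""
    return reads * read_len / genome_bp


# Assumption: all assembled reads contribute equally to the coverage estimate.
read_len = implied_read_length(ASSEMBLED_READS, GENOME_SIZE_BP, FOLD_COVERAGE)

# Ratio of assembled sequence to the 813-Mb estimate; a value above 1 is
# what prompts the footnote about genome size and polymorphism.
contig_ratio = CONTIG_BP / GENOME_SIZE_BP
scaffold_ratio = SCAFFOLD_BP / GENOME_SIZE_BP

print(f"Implied average read length: {read_len:.0f} bp")
print(f"Contig bp / genome estimate: {contig_ratio:.2f}")
print(f"Scaffold bp / genome estimate: {scaffold_ratio:.2f}")
```

Under these assumptions the contigs hold roughly 1.3 times the expected genome size, which is consistent with the footnote's two explanations: either the genome is larger than 813 Mb, or polymorphic regions have been assembled into separate contigs.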

Data Release

The contigs and scaffolds generated as part of this initial assembly have been deposited at DDBJ/EMBL/GenBank under the project accession AAGE00000000. The version described herein is the first version, AAGE01000000. Aedes aegypti genome data previously released to GenBank as part of the genome project include more than 111,000 BAC end sequences (CC841856-CC875159 and CC065891-CC144307), as well as shotgun assemblies of 24 BACs chosen from across the Aedes genome, produced by the Broad Institute (AC150506, AC150254-AC150266) and TIGR (AC149791-AC149800). In addition, more than 165,000 EST sequences have been generated from four separate normalized libraries (dengue-infected, Plasmodium gallinaceum-infected, Brugia malayi-infected, and fat body-derived). These sequences and their associated annotations are available for search and download as part of the Aedes aegypti gene index at the J. Craig Venter Institute.
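As a rough consistency check on the accession ranges quoted above, the sketch below counts and expands them. It assumes that each range is contiguous and that every accession inside a range belongs to this project, which the release does not state explicitly. The helper names are hypothetical and are not part of any GenBank tooling.

```python
import re


def parse_accession(acc):
    """Split an accession such as 'CC065891' into (prefix, number, digit width)."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", acc)
    if match is None:
        raise ValueError(f"unrecognized accession: {acc!r}")
    prefix, digits = match.groups()
    return prefix, int(digits), len(digits)


def range_size(start, end):
    """Number of accessions in an inclusive range with a shared prefix."""
    p1, n1, _ = parse_accession(start)
    p2, n2, _ = parse_accession(end)
    if p1 != p2 or n2 < n1:
        raise ValueError(f"invalid range: {start}-{end}")
    return n2 - n1 + 1


def expand_range(start, end):
    """Yield every accession in the inclusive range, keeping zero padding."""
    prefix, n1, width = parse_accession(start)
    for n in range(n1, n1 + range_size(start, end)):
        yield f"{prefix}{n:0{width}d}"


# Ranges as quoted in the data release text (assumed contiguous).
BAC_END_RANGES = [("CC841856", "CC875159"), ("CC065891", "CC144307")]
BAC_ASSEMBLY_RANGES = [
    ("AC150506", "AC150506"),  # Broad Institute, single accession
    ("AC150254", "AC150266"),  # Broad Institute
    ("AC149791", "AC149800"),  # TIGR
]

bac_end_total = sum(range_size(s, e) for s, e in BAC_END_RANGES)
bac_assembly_total = sum(range_size(s, e) for s, e in BAC_ASSEMBLY_RANGES)

print(f"BAC end sequences in quoted ranges: {bac_end_total}")
print(f"BAC assemblies in quoted ranges: {bac_assembly_total}")
```

Under the contiguity assumption, the totals agree with the text: the two BAC end ranges cover more than 111,000 accessions, and the three BAC assembly ranges cover 24.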


Last Updated January 04, 2006