Many sequencing library preparation kits include an option to generate so-called “paired-end reads”. In “short-read” sequencing, intact genomic DNA is sheared into several million short DNA fragments, and the stretch of sequence read from a fragment is called a “read”. Sequencing a fragment from both of its ends yields two reads that are paired together as a paired-end read, which offers some benefits for downstream bioinformatics data analysis algorithms. The structure of a paired-end read is described here.
“Read 1”, often called the “forward read”, extends from the “Read 1 Adapter” in the 5′ – 3′ direction towards “Read 2” along the forward DNA strand.
“Read 2”, often called the “reverse read”, extends from the “Read 2 Adapter” in the 5′ – 3′ direction towards “Read 1” along the reverse DNA strand.
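The orientation of the two reads can be illustrated with a short Python sketch. It assumes an idealized case in which the stretch of DNA between the two adapters is known as a string on the forward strand and both reads have the same length; the names `template`, `read_len` and `paired_end_reads` are hypothetical and chosen only for this example.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse strand of `seq`, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def paired_end_reads(template: str, read_len: int) -> tuple[str, str]:
    """Derive idealized Read 1 and Read 2 from a forward-strand template.

    Read 1 starts at the 5' end of the forward strand.
    Read 2 starts at the 5' end of the reverse strand, so it reads
    back towards Read 1.
    """
    read1 = template[:read_len]
    read2 = reverse_complement(template)[:read_len]
    return read1, read2


# Toy sequence, far shorter than a real library fragment.
read1, read2 = paired_end_reads("ACGTTGCAAGGCTT", read_len=4)
print(read1)  # ACGT
print(read2)  # AAGC
```

Read 2 is reported as the reverse complement of the far end of the template, which is why aligners expect the two reads of a pair to map to opposite strands and point towards each other.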
There is an arbitrary DNA sequence between “Read 1” and “Read 2”, which we’ll call the “Inner sequence”. The length of this sequence is the “Inner distance”. By definition, the “Insert” is the concatenation of “Read 1”, the “Inner sequence” and “Read 2”, and the length of the “Insert” is the “Insert size”. A single “Fragment” includes the “Read 1 Adapter”, “Read 1”, “Inner sequence”, “Read 2” and “Read 2 Adapter”, and the length of this “Fragment” is the “Fragment length”.
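The length definitions above reduce to simple arithmetic, sketched here in Python. The class and field names are hypothetical, and the read and adapter lengths in the example are illustrative values rather than figures for any particular kit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairedEndFragment:
    """Lengths (in bp) of the parts of one paired-end fragment."""
    read1_len: int
    read2_len: int
    inner_distance: int   # length of the inner sequence
    adapter1_len: int     # Read 1 Adapter
    adapter2_len: int     # Read 2 Adapter

    @property
    def insert_size(self) -> int:
        # Insert = Read 1 + inner sequence + Read 2
        return self.read1_len + self.inner_distance + self.read2_len

    @property
    def fragment_length(self) -> int:
        # Fragment = Read 1 Adapter + Insert + Read 2 Adapter
        return self.adapter1_len + self.insert_size + self.adapter2_len


def inner_distance_from_insert(insert_size: int, read1_len: int, read2_len: int) -> int:
    """Invert the insert size definition to recover the inner distance."""
    return insert_size - read1_len - read2_len


# Illustrative values only (assumed for this example).
frag = PairedEndFragment(read1_len=150, read2_len=150, inner_distance=50,
                         adapter1_len=60, adapter2_len=60)
print(frag.insert_size)      # 350
print(frag.fragment_length)  # 470
print(inner_distance_from_insert(300, 150, 150))  # 0
print(inner_distance_from_insert(250, 150, 150))  # -50
```

Under these definitions the computed inner distance becomes negative when the insert is shorter than the two reads combined, which corresponds to Read 1 and Read 2 overlapping.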
Fig. 2 shows a typical insert size distribution for the Illumina Nextera XT DNA Library Preparation Kit. This is a probability distribution, so it will vary somewhat for each DNA sample that is prepared with the XT kit. The distribution peaks at an insert size of around 300 bp. It is somewhat leptokurtic and positively skewed, with a minimum insert size around 40 bp and a maximum insert size around 850 bp.
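As a rough sketch of how the shape terms used here can be quantified, the Python below computes the moment-based skewness and excess kurtosis of a list of insert sizes. A positive skewness indicates a longer right-hand tail, and a positive excess kurtosis indicates a leptokurtic distribution. The sample values are made up for illustration and are not taken from Fig. 2; in practice the insert sizes would come from aligned read pairs.

```python
from statistics import mean, pstdev


def skewness(values: list[float]) -> float:
    """Population skewness: third central moment divided by sigma cubed."""
    m = mean(values)
    s = pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)


def excess_kurtosis(values: list[float]) -> float:
    """Population excess kurtosis: fourth central moment / sigma^4, minus 3."""
    m = mean(values)
    s = pstdev(values)
    return sum((v - m) ** 4 for v in values) / (len(values) * s ** 4) - 3.0


# Hypothetical insert sizes in bp (illustrative only, not measured data).
insert_sizes = [40, 200, 250, 300, 300, 300, 320, 350, 400, 500, 850]

print(f"skewness: {skewness(insert_sizes):.2f}")
print(f"excess kurtosis: {excess_kurtosis(insert_sizes):.2f}")
```

Both functions use population moments and require at least two distinct values, since a constant list has zero standard deviation.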
Note that because the distribution is positively skewed, a significant number of paired-end reads have a fairly long total length (compared to just the individual reads themselves). This increase in total length is beneficial for sequence alignment, de novo assembly, spanning repetitive sequences, and detecting insertions, deletions and inversions.
The Sequencing Center is a USA-owned and operated next-generation genome sequencing company offering affordable genome sequencing and bioinformatics for research, pharmaceutical, and clinical organizations. Our services are designed for organizations performing bacterial, viral, and human-oriented research. Among these services, our facility offers targeted sequencing methods for research fields including inherited diseases, oncology, neurodegenerative diseases, antibiotic resistance gene analysis, and biomarker discovery for drug development optimization.