Hello everybody,
I have a strand-specific paired-end library on which I'd like to perform some standard DGE analysis. However, I'm quite unclear about how to go about counting reads. Usually, when I have a paired-end (unstranded) library, I first clip the reads for adapters and trim them for quality. During this preprocessing, if one read of a pair passes the filtering and the other doesn't, I retain the surviving read as a single-end read. Then I use TopHat to first map the PE reads (where both reads of the pair are retained after filtering), and pass the junctions obtained from that mapping step to a second TopHat run on the single-end reads, so as not to lose those otherwise "good" reads.
For the strand-specific library, my plan is to follow the same procedure to obtain the PE reads where both mates pass filtering. Reads where only one mate of the pair passed filtering will then be stored as SE reads in two separate files, depending on which mate (first or second) was retained, because for a strand-specific library the mate number determines which strand the read reports.
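The routing described above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the function names, the bucket labels and the toy fragment IDs are made up, and the pass/fail flags stand in for whatever the clipping/trimming step actually reports for each mate.

```python
from collections import defaultdict


def route_pair(mate1_ok: bool, mate2_ok: bool) -> str:
    """Decide which output a fragment goes to after filtering.

    The bucket names are hypothetical labels, not tool output.
    """
    if mate1_ok and mate2_ok:
        return "paired"        # both mates survive: keep as a PE pair
    if mate1_ok:
        return "single_mate1"  # only mate 1 survives: SE file for mate 1
    if mate2_ok:
        return "single_mate2"  # only mate 2 survives: SE file for mate 2
    return "discarded"         # neither mate survives


def partition(pairs):
    """Group fragment IDs by bucket.

    `pairs` is an iterable of (fragment_id, mate1_ok, mate2_ok) tuples.
    """
    buckets = defaultdict(list)
    for fragment_id, mate1_ok, mate2_ok in pairs:
        buckets[route_pair(mate1_ok, mate2_ok)].append(fragment_id)
    return dict(buckets)


# Toy input: one fragment for each possible filtering outcome.
example = [
    ("frag1", True, True),
    ("frag2", True, False),
    ("frag3", False, True),
    ("frag4", False, False),
]

for bucket, ids in partition(example).items():
    print(bucket, ids)
```

Keeping the two single-end buckets apart is the point of the sketch: the mate number is what tells a downstream tool how to interpret the strand of an orphaned read.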
Even if this approach is sound, I am still unsure how to count the reads once they are mapped.
So my questions are: How do people normally go about mapping a PE library? Do they only keep "properly mapped pairs"? If so, do they count each pair as 1 read (as both mates indeed come from one fragment)? When you also have SE reads in the same BAM file, how can you then count the number of reads? And if there are properly paired SS reads and SS SE reads, ...