This function uses each read's mapping start position and its UMI to decide whether the read is a PCR duplicate. PCR duplicates are then removed, so that one entry is kept per unique read. For paired-end reads (MAPCap/RAMPAGE), only one end (R1) is kept after filtering unless `keepPairs = TRUE`.
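The idea of combining mapping position and UMI can be illustrated with a minimal base-R sketch. This is not the package's implementation: the `reads` data frame, its column names, and the use of chromosome and strand as part of the position key are assumptions made only for illustration.

```r
# Toy alignments: one row per mapped read (all values are made up).
reads <- data.frame(
  chrom  = c("chr1", "chr1", "chr1", "chr1", "chr2"),
  start  = c(100L, 100L, 100L, 250L, 100L),
  strand = c("+", "+", "+", "+", "-"),
  umi    = c("ACGT", "ACGT", "TTGA", "ACGT", "ACGT"),
  stringsAsFactors = FALSE
)

# Assumption: a read counts as a duplicate when an earlier read has the
# same mapping start (chromosome, start, strand) and the same UMI.
key <- c("chrom", "start", "strand", "umi")
is_dup <- duplicated(reads[, key])

# Keep the first read of each group and drop the later copies.
dedup <- reads[!is_dup, ]
```

In this toy input the first and third reads share a start position but carry different UMIs, so both are kept; only a read that matches an earlier one on both position and UMI is dropped.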

filterDuplicates(CSobject, outdir, ncores = 1, keepPairs = FALSE)

# S4 method for CapSet
filterDuplicates(CSobject, outdir, ncores = 1,
  keepPairs = FALSE)



CSobject: an object of class CapSet

outdir: character. Output directory for filtered BAM files

ncores: integer. Number of cores to use

keepPairs: logical. Whether to keep pairs in paired-end data (note: the pairs are treated as independent reads during duplicate removal)


modified CapSet object with filtering information. Filtered BAM files are saved in `outdir`.


# before running this
# 1. Create a CapSet object
# 2. de-multiplex the fastqs
# 3. map them

# load a previously saved CapSet object
cs <- exampleCSobject()
#> Checking de-multiplexed R1 reads
#> Checking de-multiplexed R2 reads
#> Checking mapped file
#> Checking de-duplicated file
# filter duplicate reads from mapped BAM files
dir.create("filtered_bam")
#> Warning: 'filtered_bam' already exists
cs <- filterDuplicates(cs, outdir = "filtered_bam")
#> Removing PCR duplicates : /data/akhtar/bhardwaj/Gitrepo/icetea/inst/extdata/bam/embryo1.bam
#> Removing PCR duplicates : /data/akhtar/bhardwaj/Gitrepo/icetea/inst/extdata/bam/embryo2.bam
#> Removing PCR duplicates : /data/akhtar/bhardwaj/Gitrepo/icetea/inst/extdata/bam/embryo3.bam
#> Removing PCR duplicates : /data/akhtar/bhardwaj/Gitrepo/icetea/inst/extdata/bam/embryo4.bam