Identify features (e.g., transcripts) with high-quality data
Source: R/cBprocessing.R, reliableFeatures.Rd
This function identifies all features (e.g., transcripts or exons) whose mutation rate in the control (-s4U) sample is below a set threshold and which have more reads than a set threshold in all samples. If there is no -s4U sample, only the read count cutoff is applied. The remaining filters are relevant only when working with short-read RNA-seq data: they remove features with extremely low empirical U-content (i.e., the average number of Us in sequencing reads from that feature) and features with very few reads containing at least 3 Us.
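The two core filters described above can be sketched in base R on a toy summary table. This is only an illustration of the logic, not the package's implementation: the data frame, its column names (`is_ctl`, `reads`, `mut_rate`), and the threshold values are assumptions made up for the example.

```r
# Toy per-feature, per-sample summaries (hypothetical columns and values)
counts <- data.frame(
  feature  = rep(c("geneA", "geneB", "geneC"), each = 3),
  sample   = rep(c("ctl", "s4U_1", "s4U_2"), times = 3),
  is_ctl   = rep(c(TRUE, FALSE, FALSE), times = 3),   # TRUE for the -s4U control
  reads    = c(120, 200, 180,   90, 30, 150,   300, 260, 240),
  mut_rate = c(0.002, 0.030, 0.028,   0.001, 0.025, 0.027,   0.350, 0.360, 0.340),
  stringsAsFactors = FALSE
)

high_p <- 0.2   # illustrative cutoff, not necessarily the package default
totcut <- 50    # illustrative cutoff, not necessarily the package default

# Read count filter: the lowest count across all samples must reach the cutoff
min_reads  <- tapply(counts$reads, counts$feature, min)
features   <- names(min_reads)
pass_reads <- as.vector(min_reads[features] >= totcut)

# Mutation rate filter: only applied when a control sample exists
pass_mut <- if (any(counts$is_ctl)) {
  ctl <- counts[counts$is_ctl, ]
  max_ctl_mut <- tapply(ctl$mut_rate, ctl$feature, max)
  as.vector(max_ctl_mut[features] < high_p)
} else {
  rep(TRUE, length(features))
}

reliable <- features[pass_reads & pass_mut]
print(reliable)
```

In this toy data, geneB fails the read count filter (one sample has only 30 reads) and geneC fails the control mutation rate filter, so only geneA is retained.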
Arguments
- obj
Object of class bakRData
- high_p
Numeric; highest mutation rate accepted in control (-s4U) samples
- totcut
Numeric; any transcripts with fewer than this number of sequencing reads in any replicate of all experimental conditions are filtered out
- totcut_all
Numeric; any transcripts with fewer than this number of sequencing reads in any sample are filtered out
- Ucut
Numeric; the fraction of reads with 2 or fewer Us must be less than this cutoff in all samples
- AvgU
Numeric; the average number of Us in reads from the feature must be greater than this value
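To make the two U-content cutoffs concrete, the following base R sketch computes the average number of Us per read and the fraction of reads with 2 or fewer Us from a toy table of read counts. The table layout (`nU` for the number of Us in a read, `n` for how many reads share that value), the cutoff values, and the choice to apply the average-U check per sample are all assumptions for illustration; the function itself may organize this differently.

```r
# Toy read counts grouped by U-content (hypothetical columns and values)
reads <- data.frame(
  feature = c(rep("geneA", 4), rep("geneB", 4)),
  sample  = rep(c("s1", "s1", "s2", "s2"), times = 2),
  nU      = c(2, 8, 1, 10,   1, 3, 2, 4),      # Us per read
  n       = c(10, 90, 20, 80,   60, 40, 70, 30), # reads with that many Us
  stringsAsFactors = FALSE
)

Ucut <- 0.25   # illustrative cutoff
AvgU <- 4      # illustrative cutoff

# One row of U-content statistics per feature and sample
groups <- split(reads, list(reads$feature, reads$sample), drop = TRUE)
stats <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    feature   = g$feature[1],
    sample    = g$sample[1],
    avg_U     = sum(g$nU * g$n) / sum(g$n),        # read-weighted mean U count
    frac_lowU = sum(g$n[g$nU <= 2]) / sum(g$n),    # share of reads with <= 2 Us
    stringsAsFactors = FALSE
  )
}))

# A feature passes only if every sample satisfies both cutoffs
stats$pass <- stats$frac_lowU < Ucut & stats$avg_U > AvgU
keep    <- tapply(stats$pass, stats$feature, all)
passing <- names(keep)[as.vector(keep)]
print(passing)
```

Here geneA passes in both samples, while geneB is dropped because most of its reads contain 2 or fewer Us and its average U-content falls below the cutoff.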