Package picard.sam.markduplicates
Class MarkDuplicatesForFlowHelper
java.lang.Object
picard.sam.markduplicates.MarkDuplicatesForFlowHelper
- All Implemented Interfaces:
MarkDuplicatesHelper
MarkDuplicates calculation helper class for flow-based mode.
The class extends the behavior of MarkDuplicates, which contains the complete
code for the non-flow-based mode. In flow mode, additional parameters
may control how read ends (start/end) are established for the purpose of
determining duplication status. Additionally, the logic used to gather reads into
(duplicate) buckets (chunks) is enhanced with an optional read-end
uncertainty threshold. When active, reads are considered to belong to the same chunk if,
for each read in the chunk, there exists at least one other read in the chunk whose
end lies within the uncertainty distance.
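The chunking rule described above can be illustrated with a minimal sketch. It is not the Picard implementation: the class name `EndUncertaintyChunks`, the method `chunkByEnd`, and the reduction of each read to a single end coordinate are assumptions made for illustration. After sorting the end coordinates, a new chunk starts whenever the gap to the previous end exceeds the uncertainty distance, so every read in a multi-read chunk has a neighbour within that distance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Picard.
final class EndUncertaintyChunks {

    // Groups read-end coordinates into chunks. Two consecutive (sorted) ends
    // stay in the same chunk when they differ by at most `uncertainty`.
    static List<List<Integer>> chunkByEnd(int[] readEnds, int uncertainty) {
        int[] sorted = readEnds.clone();
        Arrays.sort(sorted);

        List<List<Integer>> chunks = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int i = 0; i < sorted.length; i++) {
            // A gap larger than the uncertainty distance closes the current chunk.
            if (i > 0 && sorted[i] - sorted[i - 1] > uncertainty) {
                chunks.add(current);
                current = new ArrayList<>();
            }
            current.add(sorted[i]);
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }
}
```

With an uncertainty of 2, the ends 100, 101, 103, 110, 111 fall into two chunks: the first three ends chain together through gaps of 1 and 2, while the gap of 7 before 110 starts a new chunk.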
-
Field Summary
Fields
Modifier and Type, Field
static final char[] CLIPPING_TAG_CONTAINS_A
static final char[] CLIPPING_TAG_CONTAINS_AQ
static final char[] CLIPPING_TAG_CONTAINS_QZ
static final String CLIPPING_TAG_NAME
static final int DIST_FROM_END
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
protected boolean
areComparableForDuplicates(ReadEndsForMarkDuplicates lhs, ReadEndsForMarkDuplicates rhs, boolean compareRead2, boolean useBarcodes)
ReadEndsForMarkDuplicates
buildReadEnds(htsjdk.samtools.SAMFileHeader header, long index, htsjdk.samtools.SAMRecord rec, boolean useBarcodes)
Builds a read ends object that represents a single flow-based read.
void
generateDuplicateIndexes(boolean useBarcodes, boolean indexOpticalDuplicates)
This method is identical in function to the non-flow generateDuplicateIndexes, except that it accommodates the possible significance of the end side of the reads (with or without uncertainty).
protected static int
getFlowSumOfBaseQualities(htsjdk.samtools.SAMRecord rec, int threshold)
A quality-summing scoring strategy used for flow-based reads.
protected static int
getFlowSumOfBaseQualitiesNearEnds(htsjdk.samtools.SAMRecord rec, int dist)
A quality selection strategy used for flow-based reads.
short
getReadDuplicateScore(htsjdk.samtools.SAMRecord rec, ReadEndsForMarkDuplicates pairedEnds)
Updates the score for pairedEnds.
protected static int
getReadEndCoordinate(htsjdk.samtools.SAMRecord rec, boolean startEnd, boolean certain, MarkDuplicatesForFlowArgumentCollection flowBasedArguments)
static boolean
isAdapterClipped(htsjdk.samtools.SAMRecord rec)
static boolean
isAdapterClippedWithQ(htsjdk.samtools.SAMRecord rec)
static boolean
isQualityClipped(htsjdk.samtools.SAMRecord rec)
-
Field Details
-
CLIPPING_TAG_NAME
public static final String CLIPPING_TAG_NAME
-
CLIPPING_TAG_CONTAINS_A
public static final char[] CLIPPING_TAG_CONTAINS_A
-
CLIPPING_TAG_CONTAINS_AQ
public static final char[] CLIPPING_TAG_CONTAINS_AQ
-
CLIPPING_TAG_CONTAINS_QZ
public static final char[] CLIPPING_TAG_CONTAINS_QZ
-
DIST_FROM_END
public static final int DIST_FROM_END
-
Constructor Details
-
MarkDuplicatesForFlowHelper
-
-
Method Details
-
generateDuplicateIndexes
public void generateDuplicateIndexes(boolean useBarcodes, boolean indexOpticalDuplicates)
This method is identical in function to the non-flow generateDuplicateIndexes, except that it accommodates the possible significance of the end side of the reads (with or without uncertainty). It is applicable only to flow mode invocation.
- Specified by:
generateDuplicateIndexes in interface MarkDuplicatesHelper
-
buildReadEnds
public ReadEndsForMarkDuplicates buildReadEnds(htsjdk.samtools.SAMFileHeader header, long index, htsjdk.samtools.SAMRecord rec, boolean useBarcodes)
Builds a read ends object that represents a single flow-based read.
- Specified by:
buildReadEnds in interface MarkDuplicatesHelper
-
getReadDuplicateScore
public short getReadDuplicateScore(htsjdk.samtools.SAMRecord rec, ReadEndsForMarkDuplicates pairedEnds)
Updates the score for pairedEnds.
- Specified by:
getReadDuplicateScore in interface MarkDuplicatesHelper
-
areComparableForDuplicates
protected boolean areComparableForDuplicates(ReadEndsForMarkDuplicates lhs, ReadEndsForMarkDuplicates rhs, boolean compareRead2, boolean useBarcodes)
-
getFlowSumOfBaseQualities
protected static int getFlowSumOfBaseQualities(htsjdk.samtools.SAMRecord rec, int threshold)
A quality-summing scoring strategy used for flow-based reads. The method walks over the bases of the read in the synthesis direction. For each base, the effective quality value is defined as the value on the first base of the hmer (homopolymer run) to which the base belongs. The score is the sum of all effective values above a given threshold.
- Parameters:
rec - SAMRecord to get a score for
threshold - threshold above which effective quality is included
- Returns:
calculated score (see method description)
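The scoring rule above can be sketched on plain arrays. This is an illustration under stated assumptions, not the Picard source: the class `FlowQualitySketch`, the method `sumOfEffectiveQualities`, and the `reverse` flag (standing in for the synthesis direction of a reverse-strand read) are hypothetical, and treating a quality equal to the threshold as included is also an assumption.

```java
// Hypothetical illustration only; not part of Picard or htsjdk.
final class FlowQualitySketch {

    // Sums per-base effective qualities, where the effective quality of every
    // base in a homopolymer run is the quality of the first base of that run
    // as encountered in the walking direction.
    static int sumOfEffectiveQualities(byte[] bases, byte[] quals, boolean reverse, int threshold) {
        int n = bases.length;
        int score = 0;
        byte lastBase = 0;
        int effectiveQual = 0;
        for (int k = 0; k < n; k++) {
            // Walk left-to-right, or right-to-left when reverse is set.
            int i = reverse ? n - 1 - k : k;
            if (k == 0 || bases[i] != lastBase) {
                // First base of a new run: its quality applies to the whole run.
                effectiveQual = quals[i];
            }
            // Assumption: values equal to the threshold are counted.
            if (effectiveQual >= threshold) {
                score += effectiveQual;
            }
            lastBase = bases[i];
        }
        return score;
    }
}
```

For bases AAACCG with qualities 30, 10, 10, 20, 5, 40 and a threshold of 15, a forward walk gives 3 × 30 + 2 × 20 + 40 = 170. Walking in reverse gives 40, because the C and A runs then take their effective quality from their last bases (5 and 10), which fall below the threshold.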
-
getFlowSumOfBaseQualitiesNearEnds
protected static int getFlowSumOfBaseQualitiesNearEnds(htsjdk.samtools.SAMRecord rec, int dist)
A quality selection strategy used for flow-based reads. The method looks at the bases of the read that are close to the ends of the fragment and calculates the minimal quality of the homopolymers.
- Parameters:
rec - SAMRecord to get a score for
dist - distance from the end
- Returns:
calculated score (see method description)
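The description leaves some details open, so the following is only one possible reading, sketched on plain arrays. It assumes that a homopolymer run counts as "near an end" when any of its bases lies within `dist` positions of either end of the read, and that the quality of a run is the minimum over its bases. The class `NearEndsQualitySketch` and the method `minHmerQualityNearEnds` are hypothetical names and do not come from Picard.

```java
// Hypothetical illustration only; one interpretation of the description above.
final class NearEndsQualitySketch {

    // Returns the smallest homopolymer-run quality among runs that touch the
    // first or last `dist` positions of the read, or 0 if no run qualifies.
    static int minHmerQualityNearEnds(byte[] bases, byte[] quals, int dist) {
        int n = bases.length;
        int min = Integer.MAX_VALUE;
        int runStart = 0;
        while (runStart < n) {
            // Extend the run while the base stays the same, tracking its minimum quality.
            int runEnd = runStart;
            int runMin = quals[runStart];
            while (runEnd + 1 < n && bases[runEnd + 1] == bases[runStart]) {
                runEnd++;
                runMin = Math.min(runMin, quals[runEnd]);
            }
            boolean nearStart = runStart < dist;    // overlaps the first dist bases
            boolean nearEnd = runEnd >= n - dist;   // overlaps the last dist bases
            if (nearStart || nearEnd) {
                min = Math.min(min, runMin);
            }
            runStart = runEnd + 1;
        }
        return min == Integer.MAX_VALUE ? 0 : min;
    }
}
```

For bases AACGGGTT with qualities 30, 25, 5, 40, 35, 38, 20, 22 and a distance of 2, only the AA and TT runs qualify, giving a result of 20. Widening the distance to 3 brings the low-quality C into range and the result drops to 5.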
-
getReadEndCoordinate
protected static int getReadEndCoordinate(htsjdk.samtools.SAMRecord rec, boolean startEnd, boolean certain, MarkDuplicatesForFlowArgumentCollection flowBasedArguments)
-
isAdapterClipped
public static boolean isAdapterClipped(htsjdk.samtools.SAMRecord rec)
-
isAdapterClippedWithQ
public static boolean isAdapterClippedWithQ(htsjdk.samtools.SAMRecord rec)
-
isQualityClipped
public static boolean isQualityClipped(htsjdk.samtools.SAMRecord rec)
-