Class TrainingInfo

java.lang.Object
edu.msu.cme.rdp.classifier.TrainingInfo

public class TrainingInfo extends Object
Holds all the training information and the taxonomy hierarchy information.
  • Constructor Details

    • TrainingInfo

      public TrainingInfo()
      Creates new TrainingInfo.
  • Method Details

    • createTree

      public void createTree(Reader reader) throws IOException, TrainingDataException
      Reads in the tree information from a reader and creates all the HierarchyTrees. Note: the tree information must be read after at least one of the other three training files, because the version information must already be set.
      Throws:
      IOException
      TrainingDataException
    • createLogWordPriorArr

      public void createLogWordPriorArr(Reader reader) throws IOException, TrainingDataException
      Reads in the log values of the word prior probabilities and saves them to the array LogWordPriorArr.
      Throws:
      IOException
      TrainingDataException
    • generateWordPairDiffArr

      public void generateWordPairDiffArr(int[] word, int beginIndex)
      For a given word w1 and its reverse complement word w2, calculates the difference between the log word priors of w1 and w2 and saves it to an array. Repeats for every possible word of size 8.
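      The calculation described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name WordPairDiffSketch, the two-bit base encoding (A=0, C=1, G=2, T=3), and the flat loop over all word indices are assumptions, and the sketch does not mirror the method's (int[] word, int beginIndex) signature.

```java
final class WordPairDiffSketch {
    static final int WORD_SIZE = 8;                      // word length stated in the documentation
    static final int NUM_WORDS = 1 << (2 * WORD_SIZE);   // 4^8 = 65536 possible words

    // Assumed encoding: A=0, C=1, G=2, T=3, two bits per base,
    // so the complement of base b is 3 - b.
    static int reverseComplement(int wordIndex) {
        int rc = 0;
        for (int i = 0; i < WORD_SIZE; i++) {
            int base = wordIndex & 3;       // take the last remaining base
            rc = (rc << 2) | (3 - base);    // append its complement
            wordIndex >>>= 2;               // drop that base
        }
        return rc;
    }

    // diff[w] = logPrior[w] - logPrior[reverseComplement(w)] for every word w.
    static float[] wordPairDiff(float[] logWordPrior) {
        float[] diff = new float[NUM_WORDS];
        for (int w = 0; w < NUM_WORDS; w++) {
            diff[w] = logWordPrior[w] - logWordPrior[reverseComplement(w)];
        }
        return diff;
    }
}
```

      Under this encoding the difference for a word is the negative of the difference for its reverse complement, and a word that is its own reverse complement gets a difference of zero.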
    • createGenusWordProbList

      public void createGenusWordProbList(Reader reader) throws IOException, TrainingDataException
      Reads in the index of each genus tree node and the conditional probability that the genus contains a word. Saves the data to the list genus_wordConditionalProbList.
      Throws:
      IOException
      TrainingDataException
    • createProbIndexArr

      public void createProbIndexArr(Reader reader) throws IOException, TrainingDataException
      Reads in the start index of the conditional probabilities of each genus and saves it to the array wordConditionalProbIndexArr.
      Throws:
      IOException
      TrainingDataException
    • createClassifier

      public Classifier createClassifier()
      Creates a new Classifier if all the training information has been loaded; throws an exception otherwise.
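      The loading rules documented for this class (the tree is read only after at least one other training file has set the version, and a Classifier is created only once everything is loaded) can be modelled with a few flags. The sketch below is a hypothetical stand-in, not the TrainingInfo source: the class TrainingLoadSketch, its method names, and the use of IllegalStateException are all assumptions made for illustration.

```java
final class TrainingLoadSketch {
    private String version;               // set by the first non-tree file that is loaded
    private boolean wordPriorsLoaded;
    private boolean genusWordProbsLoaded;
    private boolean probIndexLoaded;
    private boolean treeLoaded;

    void loadWordPriors(String fileVersion) {
        noteVersion(fileVersion);
        wordPriorsLoaded = true;
    }

    void loadGenusWordProbs(String fileVersion) {
        noteVersion(fileVersion);
        genusWordProbsLoaded = true;
    }

    void loadProbIndex(String fileVersion) {
        noteVersion(fileVersion);
        probIndexLoaded = true;
    }

    // The tree may only be loaded once some other file has set the version.
    void loadTree() {
        if (version == null) {
            throw new IllegalStateException("load at least one other training file before the tree");
        }
        treeLoaded = true;
    }

    boolean isComplete() {
        return wordPriorsLoaded && genusWordProbsLoaded && probIndexLoaded && treeLoaded;
    }

    // Stand-in for the check made before a classifier is handed out.
    void requireComplete() {
        if (!isComplete()) {
            throw new IllegalStateException("training information is incomplete");
        }
    }

    private void noteVersion(String fileVersion) {
        if (version == null) {
            version = fileVersion;
        }
    }
}
```

      In this model, calling loadTree() first fails, while loading any one of the other three files first lets the tree load succeed; requireComplete() passes only after all four loads.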
    • getRootTree

      public HierarchyTree getRootTree()
      Returns the root of the trees.
    • getTrainRank

      public String getTrainRank()
      Returns:
      the rank the classifier was trained on
    • getGenusNodeListSize

      public int getGenusNodeListSize()
      Returns the number of the genus nodes.
    • getGenusNodebyIndex

      public HierarchyTree getGenusNodebyIndex(int i)
      Returns a genus node from the genusNodeList at the specified position.
    • getLogWordPrior

      public float getLogWordPrior(int wordIndex)
      Returns the log value of the prior probability of a word.
    • getWordPairPriorDiff

      public float getWordPairPriorDiff(int wordIndex)
      Returns the difference between the log word prior of the given word and that of its reverse complement word.
    • getLogLeaveCount

      public float getLogLeaveCount(int i)
      Returns the log value of (number of leaves + 1) for a genus.
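      As a small illustration of the quantity described above, the sketch below precomputes log(leaves + 1) for a list of genera. The class name LogLeafCountSketch and the choice of the natural logarithm are assumptions; the documentation does not state the log base.

```java
final class LogLeafCountSketch {
    // Returns log(leafCount + 1) for each genus, using the natural logarithm (assumed).
    static float[] logLeafCounts(int[] leafCounts) {
        float[] out = new float[leafCounts.length];
        for (int i = 0; i < leafCounts.length; i++) {
            out[i] = (float) Math.log(leafCounts[i] + 1.0);
        }
        return out;
    }
}
```

      One visible effect of the + 1 is that a genus with zero leaves maps to log(1) = 0 rather than an undefined value.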
    • getStartIndex

      public int getStartIndex(int wordIndex)
      Returns the start index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
    • getStopIndex

      public int getStopIndex(int wordIndex)
      Returns the stop index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
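      One common way to support a start/stop lookup like the pair above is a single offsets array with one extra trailing entry, so that the entries for a word occupy the half-open range from start to stop. The sketch below shows that layout only as an assumption: the class ProbIndexSketch, the sentinel entry, and the exclusive stop index are illustrative and are not confirmed by this documentation.

```java
final class ProbIndexSketch {
    // offsets[w] is where the entries for word w begin in the probability list;
    // offsets[w + 1] is where they end (exclusive). Length is numWords + 1 (assumed sentinel).
    private final int[] offsets;

    ProbIndexSketch(int[] offsets) {
        this.offsets = offsets.clone();
    }

    int getStartIndex(int wordIndex) {
        return offsets[wordIndex];
    }

    int getStopIndex(int wordIndex) {
        return offsets[wordIndex + 1];
    }

    // Number of probability entries stored for the given word.
    int entryCount(int wordIndex) {
        return getStopIndex(wordIndex) - getStartIndex(wordIndex);
    }
}
```

      With offsets {0, 2, 2, 5}, word 0 owns list positions 0 and 1, word 1 has no entries, and word 2 owns positions 2 through 4.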
    • getWordConditionalProbObject

      public GenusWordConditionalProb getWordConditionalProbObject(int posIndex)
      Returns a GenusIndexWordConditionalProb from the genusIndex_wordConditionalProbList at the specified position in the list.
    • getHierarchyVersion

      public String getHierarchyVersion()
      Returns the version of the taxonomical hierarchy.
    • getHierarchyInfo

      public HierarchyVersion getHierarchyInfo()
      Returns the info of the taxonomy hierarchy from the training file.
    • isSeqReversed

      public boolean isSeqReversed(ClassifierSequence seq) throws IOException
      Returns true if the sequence is in reverse orientation. Sums the differences between all the overlapping words from the query sequence and the reverse complements of those words. If the sum is less than zero, the query sequence is in reverse orientation.
      Throws:
      IOException
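      The orientation test described above reduces to a sum and a sign check. The sketch below assumes the per-word differences are already available in an array indexed by word (the kind of value getWordPairPriorDiff returns); the class OrientationSketch and the explicit wordPairPriorDiff parameter are hypothetical, added so the example is self-contained.

```java
final class OrientationSketch {
    // Sums the word-pair prior differences over the first wordCount words of the query
    // and reports reverse orientation when the sum is below zero.
    static boolean isSeqReversed(float[] wordPairPriorDiff, int[] wordIndexArr, int wordCount) {
        float sum = 0f;
        for (int i = 0; i < wordCount; i++) {
            sum += wordPairPriorDiff[wordIndexArr[i]];
        }
        return sum < 0f;
    }
}
```

      A sum of exactly zero is treated as forward orientation here, following the "less than zero" wording.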
    • isSeqReversed

      public boolean isSeqReversed(int[] wordIndexArr, int wordCount)