int GetRAbundCommand::createRabund(CountTable& ct, ListVector*& list, RAbundVector*& rabund){
    try {

        rabund->setLabel(list->getLabel());
        for(int i = 0; i < list->getNumBins(); i++) {
            if (m->control_pressed) { return 0; }
            vector<string> binNames;
            string bin = list->get(i);
            m->splitAtComma(bin, binNames);
            int total = 0;
            for (int j = 0; j < binNames.size(); j++) { total += ct.getNumSeqs(binNames[j]); }
            rabund->push_back(total);
        }

        return 0;
    }
    catch(exception& e) {
        m->errorOut(e, "GetRAbundCommand", "createRabund");
        exit(1);
    }
}
int DeconvoluteCommand::execute() {
    try {

        if (abort) { if (calledHelp) { return 0; } return 2; }

        //prepare filenames and open files
        map<string, string> variables;
        variables["[filename]"] = outputDir + util.getRootName(util.getSimpleName(fastafile));
        string outNameFile = getOutputFileName("name", variables);
        string outCountFile = getOutputFileName("count", variables);
        variables["[extension]"] = util.getExtension(fastafile);
        string outFastaFile = getOutputFileName("fasta", variables);

        map<string, string> nameMap;
        map<string, string>::iterator itNames;
        if (namefile != "") {
            util.readNames(namefile, nameMap);
            if (namefile == outNameFile){
                //prepare filenames and open files
                map<string, string> mvariables;
                mvariables["[filename]"] = outputDir + util.getRootName(util.getSimpleName(fastafile));
                mvariables["[tag]"] = "unique";
                outNameFile = getOutputFileName("name", mvariables);
            }
        }
        CountTable ct;
        if (countfile != "") {
            ct.readTable(countfile, true, false);
            if (countfile == outCountFile){
                //prepare filenames and open files
                map<string, string> mvariables;
                mvariables["[filename]"] = outputDir + util.getRootName(util.getSimpleName(fastafile));
                mvariables["[tag]"] = "unique";
                outCountFile = getOutputFileName("count", mvariables);
            }
        }

        if (m->getControl_pressed()) { return 0; }

        ifstream in;
        util.openInputFile(fastafile, in);

        ofstream outFasta;
        util.openOutputFile(outFastaFile, outFasta);

        map<string, string> sequenceStrings; //sequenceString -> list of names.  "atgc...." -> seq1,seq2,seq3.
        map<string, string>::iterator itStrings;
        set<string> nameInFastaFile; //for sanity checking
        set<string>::iterator itname;
        vector<string> nameFileOrder;
        CountTable newCt;
        int count = 0;
        while (!in.eof()) {

            if (m->getControl_pressed()) { in.close(); outFasta.close(); util.mothurRemove(outFastaFile); return 0; }

            Sequence seq(in);

            if (seq.getName() != "") {

                //sanity checks
                itname = nameInFastaFile.find(seq.getName());
                if (itname == nameInFastaFile.end()) { nameInFastaFile.insert(seq.getName()); }
                else { m->mothurOut("[ERROR]: You already have a sequence named " + seq.getName() + " in your fasta file, sequence names must be unique, please correct."); m->mothurOutEndLine(); }

                itStrings = sequenceStrings.find(seq.getAligned());

                if (itStrings == sequenceStrings.end()) { //this is a new unique sequence
                    //output to unique fasta file
                    seq.printSequence(outFasta);

                    if (namefile != "") {
                        itNames = nameMap.find(seq.getName());

                        if (itNames == nameMap.end()) { //namefile and fastafile do not match
                            m->mothurOut("[ERROR]: " + seq.getName() + " is in your fasta file, and not in your namefile, please correct."); m->mothurOutEndLine();
                        }else {
                            if (format == "name") {
                                sequenceStrings[seq.getAligned()] = itNames->second;
                                nameFileOrder.push_back(seq.getAligned());
                            }else {
                                newCt.push_back(seq.getName(), util.getNumNames(itNames->second));
                                sequenceStrings[seq.getAligned()] = seq.getName();
                                nameFileOrder.push_back(seq.getAligned());
                            }
                        }
                    }else if (countfile != "") {
                        if (format == "name") {
                            int numSeqs = ct.getNumSeqs(seq.getName());
                            string expandedName = seq.getName()+"_0";
                            for (int i = 1; i < numSeqs; i++) { expandedName += "," + seq.getName() + "_" + toString(i); }
                            sequenceStrings[seq.getAligned()] = expandedName;
                            nameFileOrder.push_back(seq.getAligned());
                        }else {
                            ct.getNumSeqs(seq.getName()); //checks to make sure seq is in table
                            sequenceStrings[seq.getAligned()] = seq.getName();
                            nameFileOrder.push_back(seq.getAligned());
                        }
                    }else {
                        if (format == "name") {
                            sequenceStrings[seq.getAligned()] =
                                seq.getName();
                            nameFileOrder.push_back(seq.getAligned());
                        } else {
                            newCt.push_back(seq.getName());
                            sequenceStrings[seq.getAligned()] = seq.getName();
                            nameFileOrder.push_back(seq.getAligned());
                        }
                    }
                }else { //this is a dup
                    if (namefile != "") {
                        itNames = nameMap.find(seq.getName());

                        if (itNames == nameMap.end()) { //namefile and fastafile do not match
                            m->mothurOut("[ERROR]: " + seq.getName() + " is in your fasta file, and not in your namefile, please correct."); m->mothurOutEndLine();
                        }else {
                            if (format == "name") { sequenceStrings[seq.getAligned()] += "," + itNames->second; }
                            else {
                                int currentReps = newCt.getNumSeqs(itStrings->second);
                                newCt.setNumSeqs(itStrings->second, currentReps+(util.getNumNames(itNames->second)));
                            }
                        }
                    }else if (countfile != "") {
                        if (format == "name") {
                            int numSeqs = ct.getNumSeqs(seq.getName());
                            string expandedName = seq.getName()+"_0";
                            for (int i = 1; i < numSeqs; i++) { expandedName += "," + seq.getName() + "_" + toString(i); }
                            sequenceStrings[seq.getAligned()] += "," + expandedName;
                        }else {
                            int num = ct.getNumSeqs(seq.getName()); //checks to make sure seq is in table
                            if (num != 0) { //its in the table
                                ct.mergeCounts(itStrings->second, seq.getName()); //merges counts and saves in uniques name
                            }
                        }
                    }else {
                        if (format == "name") { sequenceStrings[seq.getAligned()] += "," + seq.getName(); }
                        else {
                            int currentReps = newCt.getNumSeqs(itStrings->second);
                            newCt.setNumSeqs(itStrings->second, currentReps+1);
                        }
                    }
                }
                count++;
            }

            util.gobble(in);

            if(count % 1000 == 0) { m->mothurOutJustToScreen(toString(count) + "\t" + toString(sequenceStrings.size()) + "\n"); }
        }

        if(count % 1000 != 0) { m->mothurOut(toString(count) + "\t" + toString(sequenceStrings.size())); m->mothurOutEndLine(); }

        in.close();
        outFasta.close();

        if (m->getControl_pressed()) { util.mothurRemove(outFastaFile); return 0; }

        //print new names file
        ofstream outNames;
        if (format == "name") {
            util.openOutputFile(outNameFile, outNames);
            outputNames.push_back(outNameFile);
            outputTypes["name"].push_back(outNameFile);
        } else {
            util.openOutputFile(outCountFile, outNames);
            outputTypes["count"].push_back(outCountFile);
            outputNames.push_back(outCountFile);
        }

        if ((countfile != "") && (format == "count")) { ct.printHeaders(outNames); }
        else if ((countfile == "") && (format == "count")) { newCt.printHeaders(outNames); }

        for (int i = 0; i < nameFileOrder.size(); i++) {
            if (m->getControl_pressed()) {
                outputTypes.clear(); util.mothurRemove(outFastaFile); outNames.close();
                for (int j = 0; j < outputNames.size(); j++) { util.mothurRemove(outputNames[j]); }
                return 0;
            }

            itStrings = sequenceStrings.find(nameFileOrder[i]);

            if (itStrings != sequenceStrings.end()) {
                if (format == "name") {
                    //get rep name; size_t so the comparison with string::npos is well defined
                    size_t pos = (itStrings->second).find_first_of(',');

                    if (pos == string::npos) { // only reps itself
                        outNames << itStrings->second << '\t' << itStrings->second << endl;
                    }else {
                        outNames << (itStrings->second).substr(0, pos) << '\t' << itStrings->second << endl;
                    }
                }else {
                    if (countfile != "") { ct.printSeq(outNames, itStrings->second); }
                    else if (format == "count") { newCt.printSeq(outNames, itStrings->second); }
                }
            }else{ m->mothurOut("[ERROR]: mismatch in namefile print."); m->mothurOutEndLine(); m->setControl_pressed(true); }
        }
        outNames.close();

        if (m->getControl_pressed()) {
            outputTypes.clear(); util.mothurRemove(outFastaFile);
            for (int j = 0; j < outputNames.size(); j++) { util.mothurRemove(outputNames[j]); }
            return 0;
        }

        m->mothurOut("\nOutput File Names: \n");
        outputNames.push_back(outFastaFile); outputTypes["fasta"].push_back(outFastaFile);
        for (int i = 0; i < outputNames.size(); i++) { m->mothurOut(outputNames[i] +"\n"); }
        m->mothurOutEndLine();

        //set fasta file as new current fastafile
        string currentName = "";
        itTypes = outputTypes.find("fasta");
        if (itTypes != outputTypes.end()) {
            if ((itTypes->second).size() != 0) { currentName = (itTypes->second)[0]; current->setFastaFile(currentName); }
        }

        itTypes = outputTypes.find("name");
        if (itTypes !=
            outputTypes.end()) {
            if ((itTypes->second).size() != 0) { currentName = (itTypes->second)[0]; current->setNameFile(currentName); }
        }

        itTypes = outputTypes.find("count");
        if (itTypes != outputTypes.end()) {
            if ((itTypes->second).size() != 0) { currentName = (itTypes->second)[0]; current->setCountFile(currentName); }
        }

        return 0;
    }
    catch(exception& e) {
        m->errorOut(e, "DeconvoluteCommand", "execute");
        exit(1);
    }
}
int SeqSummaryCommand::execute(){
    try{

        if (abort == true) { if (calledHelp) { return 0; } return 2; }

        int start = time(NULL);

        //set current fasta to fastafile
        m->setFastaFile(fastafile);

        map<string, string> variables;
        variables["[filename]"] = outputDir + m->getRootName(m->getSimpleName(fastafile));
        string summaryFile = getOutputFileName("summary",variables);

        long long numSeqs = 0;
        long long size = 0;
        long long numUniques = 0;
        map<int, long long> startPosition;
        map<int, long long> endPosition;
        map<int, long long> seqLength;
        map<int, long long> ambigBases;
        map<int, long long> longHomoPolymer;

        if (namefile != "") {
            nameMap = m->readNames(namefile);
            numUniques = nameMap.size();
        }
        else if (countfile != "") {
            CountTable ct;
            ct.readTable(countfile, false, false);
            nameMap = ct.getNameMap();
            size = ct.getNumSeqs();
            numUniques = ct.getNumUniqueSeqs();
        }

        if (m->control_pressed) { return 0; }

        vector<unsigned long long> positions;
#if defined (__APPLE__) || (__MACH__) || (linux) || (__linux) || (__linux__) || (__unix__) || (__unix)
        positions = m->divideFile(fastafile, processors);
        for (int i = 0; i < (positions.size()-1); i++) { lines.push_back(new linePair(positions[i], positions[(i+1)])); }
#else
        positions = m->setFilePosFasta(fastafile, numSeqs);
        if (numSeqs < processors) { processors = numSeqs; }

        //figure out how many sequences you have to process
        int numSeqsPerProcessor = numSeqs / processors;
        for (int i = 0; i < processors; i++) {
            int startIndex = i * numSeqsPerProcessor;
            if(i == (processors - 1)){ numSeqsPerProcessor = numSeqs - i * numSeqsPerProcessor; }
            lines.push_back(new linePair(positions[startIndex], numSeqsPerProcessor));
        }
#endif

        if(processors == 1){
            numSeqs = driverCreateSummary(startPosition, endPosition, seqLength, ambigBases, longHomoPolymer, fastafile, summaryFile, lines[0]);
        }else{
            numSeqs = createProcessesCreateSummary(startPosition, endPosition, seqLength, ambigBases, longHomoPolymer, fastafile, summaryFile);
        }

        if (m->control_pressed) { return 0; }

        //set size
        if (countfile != "") {}//already set
        else if (namefile == "") { size = numSeqs; }
        else { for (map<int, long long>::iterator it = startPosition.begin(); it != startPosition.end(); it++) { size += it->second; } }

        if ((namefile != "") || (countfile != "")) {
            string type = "count";
            if (namefile != "") { type = "name"; }
            if (numSeqs != numUniques) { // do fasta and name/count files match
                m->mothurOut("[ERROR]: Your " + type + " file contains " + toString(numUniques) + " unique sequences, but your fasta file contains " + toString(numSeqs) + ". File mismatch detected, quitting command.\n"); m->control_pressed = true;
            }
        }

        if (m->control_pressed) { m->mothurRemove(summaryFile); return 0; }

        long long ptile0_25 = 1+(long long)(size * 0.025); //number of sequences at 2.5%
        long long ptile25 = 1+(long long)(size * 0.250); //number of sequences at 25%
        long long ptile50 = 1+(long long)(size * 0.500);
        long long ptile75 = 1+(long long)(size * 0.750);
        long long ptile97_5 = 1+(long long)(size * 0.975);
        long long ptile100 = (long long)(size);
        vector<int> starts; starts.resize(7,0);
        vector<int> ends; ends.resize(7,0);
        vector<int> ambigs; ambigs.resize(7,0);
        vector<int> lengths; lengths.resize(7,0);
        vector<int> homops; homops.resize(7,0);

        //find means
        long long meanStartPosition, meanEndPosition, meanSeqLength, meanAmbigBases, meanLongHomoPolymer;
        meanStartPosition = 0; meanEndPosition = 0; meanSeqLength = 0; meanAmbigBases = 0; meanLongHomoPolymer = 0;

        //minimum
        if ((startPosition.begin())->first == -1) { starts[0] = 0; }
        else {starts[0] = (startPosition.begin())->first; }
        long long totalSoFar = 0;
        //set all values to min
        starts[1] = starts[0]; starts[2] = starts[0]; starts[3] = starts[0]; starts[4] = starts[0]; starts[5] = starts[0];
        long long lastValue = 0; //holds the previous cumulative count, so it must be as wide as totalSoFar
        for (map<int, long long>::iterator it = startPosition.begin(); it != startPosition.end(); it++) {
            int value = it->first; if (value == -1) { value = 0; }
            meanStartPosition += (value*it->second);
            totalSoFar += it->second;
            if
                (((totalSoFar <= ptile0_25) && (totalSoFar > 1)) || ((lastValue < ptile0_25) && (totalSoFar > ptile0_25))){ starts[1] = value; } //save value
            if (((totalSoFar <= ptile25) && (totalSoFar > ptile0_25)) || ((lastValue < ptile25) && (totalSoFar > ptile25))) { starts[2] = value; } //save value
            if (((totalSoFar <= ptile50) && (totalSoFar > ptile25)) || ((lastValue < ptile50) && (totalSoFar > ptile50))) { starts[3] = value; } //save value
            if (((totalSoFar <= ptile75) && (totalSoFar > ptile50)) || ((lastValue < ptile75) && (totalSoFar > ptile75))) { starts[4] = value; } //save value
            if (((totalSoFar <= ptile97_5) && (totalSoFar > ptile75)) || ((lastValue < ptile97_5) && (totalSoFar > ptile97_5))) { starts[5] = value; } //save value
            if ((totalSoFar <= ptile100) && (totalSoFar > ptile97_5)) { starts[6] = value; } //save value
            lastValue = totalSoFar;
        }
        starts[6] = (startPosition.rbegin())->first;

        if ((endPosition.begin())->first == -1) { ends[0] = 0; }
        else {ends[0] = (endPosition.begin())->first; }
        totalSoFar = 0;
        //set all values to min
        ends[1] = ends[0]; ends[2] = ends[0]; ends[3] = ends[0]; ends[4] = ends[0]; ends[5] = ends[0];
        lastValue = 0;
        for (map<int, long long>::iterator it = endPosition.begin(); it != endPosition.end(); it++) {
            int value = it->first; if (value == -1) { value = 0; }
            meanEndPosition += (value*it->second);
            totalSoFar += it->second;
            if (((totalSoFar <= ptile0_25) && (totalSoFar > 1)) || ((lastValue < ptile0_25) && (totalSoFar > ptile0_25))){ ends[1] = value; } //save value
            if (((totalSoFar <= ptile25) && (totalSoFar > ptile0_25)) || ((lastValue < ptile25) && (totalSoFar > ptile25))) { ends[2] = value; } //save value
            if (((totalSoFar <= ptile50) && (totalSoFar > ptile25)) || ((lastValue < ptile50) && (totalSoFar > ptile50))) { ends[3] = value; } //save value
            if (((totalSoFar <= ptile75) && (totalSoFar > ptile50)) || ((lastValue < ptile75) && (totalSoFar > ptile75))) { ends[4] = value; } //save value
            if (((totalSoFar <= ptile97_5) && (totalSoFar >
                ptile75)) || ((lastValue < ptile97_5) && (totalSoFar > ptile97_5))) { ends[5] = value; } //save value
            if ((totalSoFar <= ptile100) && (totalSoFar > ptile97_5)) { ends[6] = value; } //save value
            lastValue = totalSoFar;
        }
        ends[6] = (endPosition.rbegin())->first;

        if ((seqLength.begin())->first == -1) { lengths[0] = 0; }
        else {lengths[0] = (seqLength.begin())->first; }
        //set all values to min
        lengths[1] = lengths[0]; lengths[2] = lengths[0]; lengths[3] = lengths[0]; lengths[4] = lengths[0]; lengths[5] = lengths[0];
        totalSoFar = 0;
        lastValue = 0;
        for (map<int, long long>::iterator it = seqLength.begin(); it != seqLength.end(); it++) {
            int value = it->first;
            meanSeqLength += (value*it->second);
            totalSoFar += it->second;
            if (((totalSoFar <= ptile0_25) && (totalSoFar > 1)) || ((lastValue < ptile0_25) && (totalSoFar > ptile0_25))){ lengths[1] = value; } //save value
            if (((totalSoFar <= ptile25) && (totalSoFar > ptile0_25)) || ((lastValue < ptile25) && (totalSoFar > ptile25))) { lengths[2] = value; } //save value
            if (((totalSoFar <= ptile50) && (totalSoFar > ptile25)) || ((lastValue < ptile50) && (totalSoFar > ptile50))) { lengths[3] = value; } //save value
            if (((totalSoFar <= ptile75) && (totalSoFar > ptile50)) || ((lastValue < ptile75) && (totalSoFar > ptile75))) { lengths[4] = value; } //save value
            if (((totalSoFar <= ptile97_5) && (totalSoFar > ptile75)) || ((lastValue < ptile97_5) && (totalSoFar > ptile97_5))) { lengths[5] = value; } //save value
            if ((totalSoFar <= ptile100) && (totalSoFar > ptile97_5)) { lengths[6] = value; } //save value
            lastValue = totalSoFar;
        }
        lengths[6] = (seqLength.rbegin())->first;

        if ((ambigBases.begin())->first == -1) { ambigs[0] = 0; }
        else {ambigs[0] = (ambigBases.begin())->first; }
        //set all values to min
        ambigs[1] = ambigs[0]; ambigs[2] = ambigs[0]; ambigs[3] = ambigs[0]; ambigs[4] = ambigs[0]; ambigs[5] = ambigs[0];
        totalSoFar = 0;
        lastValue = 0;
        for (map<int, long long>::iterator it = ambigBases.begin(); it != ambigBases.end(); it++) {
            int value = it->first;
            meanAmbigBases += (value*it->second);
            totalSoFar += it->second;
            if (((totalSoFar <= ptile0_25) && (totalSoFar > 1)) || ((lastValue < ptile0_25) && (totalSoFar > ptile0_25))){ ambigs[1] = value; } //save value
            if (((totalSoFar <= ptile25) && (totalSoFar > ptile0_25)) || ((lastValue < ptile25) && (totalSoFar > ptile25))) { ambigs[2] = value; } //save value
            if (((totalSoFar <= ptile50) && (totalSoFar > ptile25)) || ((lastValue < ptile50) && (totalSoFar > ptile50))) { ambigs[3] = value; } //save value
            if (((totalSoFar <= ptile75) && (totalSoFar > ptile50)) || ((lastValue < ptile75) && (totalSoFar > ptile75))) { ambigs[4] = value; } //save value
            if (((totalSoFar <= ptile97_5) && (totalSoFar > ptile75)) || ((lastValue < ptile97_5) && (totalSoFar > ptile97_5))) { ambigs[5] = value; } //save value
            if ((totalSoFar <= ptile100) && (totalSoFar > ptile97_5)) { ambigs[6] = value; } //save value
            lastValue = totalSoFar;
        }
        ambigs[6] = (ambigBases.rbegin())->first;

        if ((longHomoPolymer.begin())->first == -1) { homops[0] = 0; }
        else {homops[0] = (longHomoPolymer.begin())->first; }
        //set all values to min
        homops[1] = homops[0]; homops[2] = homops[0]; homops[3] = homops[0]; homops[4] = homops[0]; homops[5] = homops[0];
        totalSoFar = 0;
        lastValue = 0;
        for (map<int, long long>::iterator it = longHomoPolymer.begin(); it != longHomoPolymer.end(); it++) {
            int value = it->first;
            meanLongHomoPolymer += (it->first*it->second);
            totalSoFar += it->second;
            if (((totalSoFar <= ptile0_25) && (totalSoFar > 1)) || ((lastValue < ptile0_25) && (totalSoFar > ptile0_25))){ homops[1] = value; } //save value
            if (((totalSoFar <= ptile25) && (totalSoFar > ptile0_25)) || ((lastValue < ptile25) && (totalSoFar > ptile25))) { homops[2] = value; } //save value
            if (((totalSoFar <= ptile50) && (totalSoFar > ptile25)) || ((lastValue < ptile50) && (totalSoFar > ptile50))) { homops[3] = value; } //save value
            if (((totalSoFar <= ptile75) && (totalSoFar > ptile50)) || ((lastValue < ptile75) &&
                (totalSoFar > ptile75))) { homops[4] = value; } //save value
            if (((totalSoFar <= ptile97_5) && (totalSoFar > ptile75)) || ((lastValue < ptile97_5) && (totalSoFar > ptile97_5))) { homops[5] = value; } //save value
            if ((totalSoFar <= ptile100) && (totalSoFar > ptile97_5)) { homops[6] = value; } //save value
            lastValue = totalSoFar;
        }
        homops[6] = (longHomoPolymer.rbegin())->first;

        double meanstartPosition, meanendPosition, meanseqLength, meanambigBases, meanlongHomoPolymer;

        meanstartPosition = meanStartPosition / (double) size;
        meanendPosition = meanEndPosition /(double) size;
        meanlongHomoPolymer = meanLongHomoPolymer / (double) size;
        meanseqLength = meanSeqLength / (double) size;
        meanambigBases = meanAmbigBases /(double) size;

        if (m->control_pressed) { m->mothurRemove(summaryFile); return 0; }

        m->mothurOutEndLine();
        m->mothurOut("\t\tStart\tEnd\tNBases\tAmbigs\tPolymer\tNumSeqs"); m->mothurOutEndLine();
        m->mothurOut("Minimum:\t" + toString(starts[0]) + "\t" + toString(ends[0]) + "\t" + toString(lengths[0]) + "\t" + toString(ambigs[0]) + "\t" + toString(homops[0]) + "\t" + toString(1)); m->mothurOutEndLine();
        m->mothurOut("2.5%-tile:\t" + toString(starts[1]) + "\t" + toString(ends[1]) + "\t" + toString(lengths[1]) + "\t" + toString(ambigs[1]) + "\t" + toString(homops[1]) + "\t" + toString(ptile0_25)); m->mothurOutEndLine();
        m->mothurOut("25%-tile:\t" + toString(starts[2]) + "\t" + toString(ends[2]) + "\t" + toString(lengths[2]) + "\t" + toString(ambigs[2]) + "\t" + toString(homops[2]) + "\t" + toString(ptile25)); m->mothurOutEndLine();
        m->mothurOut("Median: \t" + toString(starts[3]) + "\t" + toString(ends[3]) + "\t" + toString(lengths[3]) + "\t" + toString(ambigs[3]) + "\t" + toString(homops[3]) + "\t" + toString(ptile50)); m->mothurOutEndLine();
        m->mothurOut("75%-tile:\t" + toString(starts[4]) + "\t" + toString(ends[4]) + "\t" + toString(lengths[4]) + "\t" + toString(ambigs[4]) + "\t" + toString(homops[4]) + "\t" + toString(ptile75)); m->mothurOutEndLine();
        m->mothurOut("97.5%-tile:\t" + toString(starts[5]) + "\t" + toString(ends[5]) + "\t" + toString(lengths[5]) + "\t" + toString(ambigs[5]) + "\t" + toString(homops[5]) + "\t" + toString(ptile97_5)); m->mothurOutEndLine();
        m->mothurOut("Maximum:\t" + toString(starts[6]) + "\t" + toString(ends[6]) + "\t" + toString(lengths[6]) + "\t" + toString(ambigs[6]) + "\t" + toString(homops[6]) + "\t" + toString(ptile100)); m->mothurOutEndLine();
        m->mothurOut("Mean:\t" + toString(meanstartPosition) + "\t" + toString(meanendPosition) + "\t" + toString(meanseqLength) + "\t" + toString(meanambigBases) + "\t" + toString(meanlongHomoPolymer)); m->mothurOutEndLine();
        if ((namefile == "") && (countfile == "")) { m->mothurOut("# of Seqs:\t" + toString(numSeqs)); m->mothurOutEndLine(); }
        else { m->mothurOut("# of unique seqs:\t" + toString(numSeqs)); m->mothurOutEndLine(); m->mothurOut("total # of seqs:\t" + toString(size)); m->mothurOutEndLine(); }

        if (m->control_pressed) { m->mothurRemove(summaryFile); return 0; }

        m->mothurOutEndLine();
        m->mothurOut("Output File Names: "); m->mothurOutEndLine();
        m->mothurOut(summaryFile); m->mothurOutEndLine(); outputNames.push_back(summaryFile); outputTypes["summary"].push_back(summaryFile);
        m->mothurOutEndLine();

        if ((namefile == "") && (countfile == "")) { m->mothurOut("It took " + toString(time(NULL) - start) + " secs to summarize " + toString(numSeqs) + " sequences.\n"); }
        else{ m->mothurOut("It took " + toString(time(NULL) - start) + " secs to summarize " + toString(size) + " sequences.\n"); }

        //set summary file as new current summaryfile
        string current = "";
        itTypes = outputTypes.find("summary");
        if (itTypes != outputTypes.end()) {
            if ((itTypes->second).size() != 0) { current = (itTypes->second)[0]; m->setSummaryFile(current); }
        }

        return 0;
    }
    catch(exception& e) {
        m->errorOut(e, "SeqSummaryCommand", "execute");
        exit(1);
    }
}
//**********************************************************************************************************************
int RemoveRareCommand::processList(){
    try {

        //you must provide a label because the names in the listfile need to be consistent
        string thisLabel = "";
        if (allLines) { m->mothurOut("For the listfile you must select one label, using first label in your listfile."); m->mothurOutEndLine(); }
        else if (labels.size() > 1) { m->mothurOut("For the listfile you must select one label, using " + (*labels.begin()) + "."); m->mothurOutEndLine(); thisLabel = *labels.begin(); }
        else { thisLabel = *labels.begin(); }

        InputData input(listfile, "list");
        ListVector* list = input.getListVector();

        //get first one or the one we want
        if (thisLabel != "") {
            //use smart distancing
            set<string> userLabels; userLabels.insert(thisLabel);
            set<string> processedLabels;
            string lastLabel = list->getLabel();
            while((list != NULL) && (userLabels.size() != 0)) {
                if(userLabels.count(list->getLabel()) == 1){
                    processedLabels.insert(list->getLabel());
                    userLabels.erase(list->getLabel());
                    break;
                }

                if ((m->anyLabelsToProcess(list->getLabel(), userLabels, "") == true) && (processedLabels.count(lastLabel) != 1)) {
                    processedLabels.insert(list->getLabel());
                    userLabels.erase(list->getLabel());
                    delete list;
                    list = input.getListVector(lastLabel);
                    break;
                }
                lastLabel = list->getLabel();
                delete list;
                list = input.getListVector();
            }
            if (userLabels.size() != 0) {
                m->mothurOut("Your file does not include the label " + thisLabel + ". I will use " + lastLabel + "."); m->mothurOutEndLine();
                list = input.getListVector(lastLabel);
            }
        }

        string thisOutputDir = outputDir;
        if (outputDir == "") { thisOutputDir += m->hasPath(listfile); }
        map<string, string> variables;
        variables["[filename]"] = thisOutputDir + m->getRootName(m->getSimpleName(listfile));
        variables["[extension]"] = m->getExtension(listfile);
        variables["[tag]"] = list->getLabel();
        string outputFileName = getOutputFileName("list", variables);
        variables["[filename]"] = thisOutputDir + m->getRootName(m->getSimpleName(groupfile));
        variables["[extension]"] = m->getExtension(groupfile);
        string outputGroupFileName = getOutputFileName("group", variables);
        variables["[filename]"] = thisOutputDir + m->getRootName(m->getSimpleName(countfile));
        variables["[extension]"] = m->getExtension(countfile);
        string outputCountFileName = getOutputFileName("count", variables);

        ofstream out, outGroup;
        m->openOutputFile(outputFileName, out);

        bool wroteSomething = false;

        //if groupfile is given then use it
        GroupMap* groupMap = NULL; //only allocated when a groupfile is given
        CountTable ct;
        if (groupfile != "") {
            groupMap = new GroupMap(groupfile); groupMap->readMap();
            SharedUtil util;
            vector<string> namesGroups = groupMap->getNamesOfGroups();
            util.setGroups(Groups, namesGroups);
            m->openOutputFile(outputGroupFileName, outGroup);
        }else if (countfile != "") {
            ct.readTable(countfile, true, false);
            if (ct.hasGroupInfo()) {
                vector<string> namesGroups = ct.getNamesOfGroups();
                SharedUtil util;
                util.setGroups(Groups, namesGroups);
            }
        }

        if (list != NULL) {

            vector<string> binLabels = list->getLabels();
            vector<string> newLabels;

            //make a new list vector
            ListVector newList;
            newList.setLabel(list->getLabel());

            //for each bin
            for (int i = 0; i < list->getNumBins(); i++) {
                if (m->control_pressed) {
                    if (groupfile != "") { delete groupMap; outGroup.close(); m->mothurRemove(outputGroupFileName); }
                    out.close(); m->mothurRemove(outputFileName); return 0;
                }

                //parse out names that are in accnos file
                string binnames = list->get(i);
                vector<string> names;
                string saveBinNames = binnames;
                m->splitAtComma(binnames, names);
                int binsize = names.size();

                vector<string> newGroupFile;
                if (groupfile != "") {
                    vector<string> newNames;
                    saveBinNames = "";
                    for(int k = 0; k < names.size(); k++) {
                        string group = groupMap->getGroup(names[k]);

                        if (m->inUsersGroups(group, Groups)) {
                            newGroupFile.push_back(names[k] + "\t" + group);

                            newNames.push_back(names[k]);
                            saveBinNames += names[k] + ",";
                        }
                    }
                    names = newNames;
                    binsize = names.size();
                    saveBinNames = saveBinNames.substr(0, saveBinNames.length()-1);
                }else if (countfile != "") {
                    saveBinNames = "";
                    binsize = 0;
                    for(int k = 0; k < names.size(); k++) {
                        if (ct.hasGroupInfo()) {
                            vector<string> thisSeqsGroups = ct.getGroups(names[k]);

                            int thisSeqsCount = 0;
                            for (int n = 0; n < thisSeqsGroups.size(); n++) {
                                if (m->inUsersGroups(thisSeqsGroups[n], Groups)) {
                                    thisSeqsCount += ct.getGroupCount(names[k], thisSeqsGroups[n]);
                                }
                            }
                            binsize += thisSeqsCount;
                            //if you don't have any seqs from the groups the user wants, then remove you.
                            if (thisSeqsCount == 0) { newGroupFile.push_back(names[k]); }
                            else { saveBinNames += names[k] + ","; }
                        }else {
                            binsize += ct.getNumSeqs(names[k]);
                            saveBinNames += names[k] + ",";
                        }
                    }
                    saveBinNames = saveBinNames.substr(0, saveBinNames.length()-1);
                }

                if (binsize > nseqs) { //keep bin
                    newList.push_back(saveBinNames);
                    newLabels.push_back(binLabels[i]);
                    if (groupfile != "") { for(int k = 0; k < newGroupFile.size(); k++) { outGroup << newGroupFile[k] << endl; } }
                    else if (countfile != "") { for(int k = 0; k < newGroupFile.size(); k++) { ct.remove(newGroupFile[k]); } }
                }else { if (countfile != "") { for(int k = 0; k < names.size(); k++) { ct.remove(names[k]); } } }
            }

            //print new listvector
            if (newList.getNumBins() != 0) {
                wroteSomething = true;
                newList.setLabels(newLabels);
                newList.printHeaders(out);
                newList.print(out);
            }
        }

        out.close();
        if (groupfile != "") { outGroup.close(); outputTypes["group"].push_back(outputGroupFileName); outputNames.push_back(outputGroupFileName); }
        if (countfile != "") {
            if (ct.hasGroupInfo()) {
                vector<string> allGroups = ct.getNamesOfGroups();
                for (int i = 0; i < allGroups.size(); i++) {
                    if (!m->inUsersGroups(allGroups[i], Groups)) { ct.removeGroup(allGroups[i]); }
                }
            }
            ct.printTable(outputCountFileName);
            outputTypes["count"].push_back(outputCountFileName); outputNames.push_back(outputCountFileName);
        }

        if (wroteSomething == false) { m->mothurOut("Your file contains only rare sequences."); m->mothurOutEndLine(); }
        outputTypes["list"].push_back(outputFileName); outputNames.push_back(outputFileName);

        return 0;
    }
    catch(exception& e) {
        m->errorOut(e, "RemoveRareCommand", "processList");
        exit(1);
    }
}