Value *ParallelLoopGenerator::createParallelLoop(
    Value *LB, Value *UB, Value *Stride, SetVector<Value *> &UsedValues,
    ValueMapT &Map, BasicBlock::iterator *LoopBody) {
  Function *SubFn;

  AllocaInst *Struct = storeValuesIntoStruct(UsedValues);
  BasicBlock::iterator BeforeLoop = Builder.GetInsertPoint();
  Value *IV = createSubFn(Stride, Struct, UsedValues, Map, &SubFn);
  *LoopBody = Builder.GetInsertPoint();
  Builder.SetInsertPoint(&*BeforeLoop);

  Value *SubFnParam = Builder.CreateBitCast(Struct, Builder.getInt8PtrTy(),
                                            "polly.par.userContext");

  // Add one as the upper bound provided by OpenMP is a < comparison, whereas
  // the codegenForSequential function creates a <= comparison.
  UB = Builder.CreateAdd(UB, ConstantInt::get(LongType, 1));

  // Tell the runtime we start a parallel loop.
  createCallSpawnThreads(SubFn, SubFnParam, LB, UB, Stride);
  Builder.CreateCall(SubFn, SubFnParam);
  createCallJoinThreads();

  // Mark the end of the lifetime for the parameter struct.
  Type *Ty = Struct->getType();
  ConstantInt *SizeOf = Builder.getInt64(DL.getTypeAllocSize(Ty));
  Builder.CreateLifetimeEnd(Struct, SizeOf);

  return IV;
}
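The emitted call sequence can be pictured at the source level. The sketch below is a minimal single-threaded model with hypothetical `Context`, `subFn`, and `runLoop` names (the real code hands `SubFn` to the parallel runtime rather than invoking it inline); it shows why the inclusive upper bound must be incremented before reaching the `<`-based runtime loop.

```cpp
#include <vector>

// Hypothetical stand-in for the alloca'd parameter struct that packs the
// values used inside the loop body.
struct Context {
  std::vector<long> *out;
};

// The "subfunction": the outlined loop body, receiving the context blob.
// Note the '<' comparison, matching what the runtime expects.
static void subFn(void *userContext, long lb, long ubExclusive, long stride) {
  auto *ctx = static_cast<Context *>(userContext);
  for (long iv = lb; iv < ubExclusive; iv += stride)
    ctx->out->push_back(iv);
}

// Mirrors createParallelLoop: the sequential codegen produced an inclusive
// bound, so add one before handing it to the '<'-based loop (the
// UB = Builder.CreateAdd(UB, 1) step above). Spawn/join are elided.
static std::vector<long> runLoop(long lb, long ubInclusive, long stride) {
  std::vector<long> result;
  Context ctx{&result};
  long ub = ubInclusive + 1;
  subFn(&ctx, lb, ub, stride); // the master thread also runs the body
  return result;
}
```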
AllocaInst *Variables::changeLocal(Value *value, ArrayType *newType) {
  AllocaInst *oldTarget = dyn_cast<AllocaInst>(value);
  PointerType *oldPointerType = dyn_cast<PointerType>(oldTarget->getType());
  ArrayType *oldType = dyn_cast<ArrayType>(oldPointerType->getElementType());
  AllocaInst *newTarget = NULL;

  errs() << "Changing the precision of variable \"" << oldTarget->getName()
         << "\" from " << *oldType << " to " << *newType << ".\n";

  if (newType->getElementType()->getTypeID() !=
      oldType->getElementType()->getTypeID()) {
    newTarget = new AllocaInst(newType, getInt32(1), "", oldTarget);

    // We are not calling getAlignment because in this case double requires
    // 16. Investigate further.
    unsigned alignment;
    switch (newType->getElementType()->getTypeID()) {
    case Type::FloatTyID:
      alignment = 4;
      break;
    case Type::DoubleTyID:
      alignment = 16;
      break;
    case Type::X86_FP80TyID:
      alignment = 16;
      break;
    default:
      alignment = 0;
    }
    newTarget->setAlignment(alignment); // depends on type? 8 for float? 16 for double?

    newTarget->takeName(oldTarget);

    // Iterate over the uses of the old AllocaInst.
    vector<Instruction *> erase;
    for (Value::use_iterator it = oldTarget->use_begin();
         it != oldTarget->use_end(); it++) {
      bool is_erased = Transformer::transform(it, newTarget, oldTarget,
                                              newType, oldType, alignment);
      if (!is_erased)
        erase.push_back(dyn_cast<Instruction>(*it));
    }

    // Erase the users that were not transformed in place.
    for (unsigned int i = 0; i < erase.size(); i++) {
      erase[i]->eraseFromParent();
    }

    // Erase the old instruction.
    // oldTarget->eraseFromParent();
  } else {
    errs() << "\tNo changes required.\n";
  }
  return newTarget;
}
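The switch above hard-codes per-type alignments. For comparison, host ABI alignments can be queried with `alignof`; note these are host values and need not match LLVM's target preferred alignments (the 16 for `double` is a target-specific choice the author flags for investigation). The mapping function below is a hypothetical sketch, not part of the pass.

```cpp
#include <cstddef>

// Host ABI alignments, queried rather than guessed. On a typical x86-64
// target: float = 4, double = 8, long double (x86_fp80) = 16.
constexpr size_t floatAlign = alignof(float);
constexpr size_t doubleAlign = alignof(double);
constexpr size_t longDoubleAlign = alignof(long double);

// Hypothetical mapping mirroring the shape of the switch in changeLocal,
// keyed by an arbitrary type id (0 = float, 1 = double, 2 = x86_fp80).
static size_t alignmentFor(int typeId) {
  switch (typeId) {
  case 0:
    return floatAlign;
  case 1:
    return doubleAlign;
  case 2:
    return longDoubleAlign;
  default:
    return 0; // unknown type, matching the pass's fallback
  }
}
```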
void FuncTransform::visitAllocaInst(AllocaInst &MI) {
#if 0
  if (MI.getType() != PoolAllocate::PoolDescPtrTy) {
    Value *PH = getPoolHandle(&MI);
    assert(PH && "Alloca has no pool handle!\n");
  }
#endif

  // FIXME: We should remove SAFECode-specific functionality (and comments).
  // SAFECode will register alloca instructions with the run-time, so do not
  // do that here.
  //
  // FIXME:
  // There is a chance that we may need to update PoolUses to make sure that
  // the pool handle is available in this function.
  //
  return;
}
Instruction *InstCombiner::visitAllocaInst(AllocaInst &AI) {
  if (auto *I = simplifyAllocaArraySize(*this, AI))
    return I;

  if (AI.getAllocatedType()->isSized()) {
    // If the alignment is 0 (unspecified), assign it the preferred alignment.
    if (AI.getAlignment() == 0)
      AI.setAlignment(DL.getPrefTypeAlignment(AI.getAllocatedType()));

    // Move all alloca's of zero byte objects to the entry block and merge them
    // together.  Note that we only do this for alloca's, because malloc should
    // allocate and return a unique pointer, even for a zero byte allocation.
    if (DL.getTypeAllocSize(AI.getAllocatedType()) == 0) {
      // For a zero sized alloca there is no point in doing an array allocation.
      // This is helpful if the array size is a complicated expression not used
      // elsewhere.
      if (AI.isArrayAllocation()) {
        AI.setOperand(0, ConstantInt::get(AI.getArraySize()->getType(), 1));
        return &AI;
      }

      // Get the first instruction in the entry block.
      BasicBlock &EntryBlock = AI.getParent()->getParent()->getEntryBlock();
      Instruction *FirstInst = EntryBlock.getFirstNonPHIOrDbg();
      if (FirstInst != &AI) {
        // If the entry block doesn't start with a zero-size alloca then move
        // this one to the start of the entry block. There is no problem with
        // dominance as the array size was forced to a constant earlier already.
        AllocaInst *EntryAI = dyn_cast<AllocaInst>(FirstInst);
        if (!EntryAI || !EntryAI->getAllocatedType()->isSized() ||
            DL.getTypeAllocSize(EntryAI->getAllocatedType()) != 0) {
          AI.moveBefore(FirstInst);
          return &AI;
        }

        // If the alignment of the entry block alloca is 0 (unspecified),
        // assign it the preferred alignment.
        if (EntryAI->getAlignment() == 0)
          EntryAI->setAlignment(
              DL.getPrefTypeAlignment(EntryAI->getAllocatedType()));

        // Replace this zero-sized alloca with the one at the start of the
        // entry block after ensuring that the address will be aligned enough
        // for both types.
        unsigned MaxAlign = std::max(EntryAI->getAlignment(), AI.getAlignment());
        EntryAI->setAlignment(MaxAlign);
        if (AI.getType() != EntryAI->getType())
          return new BitCastInst(EntryAI, AI.getType());
        return ReplaceInstUsesWith(AI, EntryAI);
      }
    }
  }

  if (AI.getAlignment()) {
    // Check to see if this allocation is only modified by a memcpy/memmove
    // from a constant global whose alignment is equal to or exceeds that of
    // the allocation.  If this is the case, we can change all users to use the
    // constant global instead.  This is commonly produced by the CFE by
    // constructs like "void foo() { int A[] = {1,2,3,4,5,6,7,8,9...}; }" if
    // 'A' is only subsequently read.
    SmallVector<Instruction *, 4> ToDelete;
    if (MemTransferInst *Copy = isOnlyCopiedFromConstantGlobal(&AI, ToDelete)) {
      unsigned SourceAlign = getOrEnforceKnownAlignment(
          Copy->getSource(), AI.getAlignment(), DL, &AI, AC, DT);
      if (AI.getAlignment() <= SourceAlign) {
        DEBUG(dbgs() << "Found alloca equal to global: " << AI << '\n');
        DEBUG(dbgs() << "  memcpy = " << *Copy << '\n');
        for (unsigned i = 0, e = ToDelete.size(); i != e; ++i)
          EraseInstFromFunction(*ToDelete[i]);
        Constant *TheSrc = cast<Constant>(Copy->getSource());
        Constant *Cast =
            ConstantExpr::getPointerBitCastOrAddrSpaceCast(TheSrc, AI.getType());
        Instruction *NewI = ReplaceInstUsesWith(AI, Cast);
        EraseInstFromFunction(*Copy);
        ++NumGlobalCopies;
        return NewI;
      }
    }
  }

  // At last, use the generic allocation site handler to aggressively remove
  // unused allocas.
  return visitAllocSite(AI);
}
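The source pattern behind the memcpy-from-constant-global check can be illustrated directly. The sketch below uses hypothetical names: once the local array is only read after the copy, every access can be redirected to the constant global, and both the alloca and the memcpy become dead.

```cpp
#include <cstring>

// A constant global, as the frontend would emit for the initializer list.
static const int kInit[5] = {1, 2, 3, 4, 5};

// Before: the frontend emits an alloca plus a memcpy from the constant
// global ("int A[] = {1,2,3,4,5};" where A is only read afterwards).
static int sumBefore() {
  int a[5];
  std::memcpy(a, kInit, sizeof(a)); // only ever read after this copy
  int s = 0;
  for (int v : a)
    s += v;
  return s;
}

// After: since the local is never written again, all reads can use the
// global directly; the alloca and the memcpy disappear.
static int sumAfter() {
  int s = 0;
  for (int v : kInit)
    s += v;
  return s;
}
```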
Instruction *InstCombiner::visitAllocaInst(AllocaInst &AI) {
  // Ensure that the alloca array size argument has type intptr_t, so that
  // any casting is exposed early.
  if (DL) {
    Type *IntPtrTy = DL->getIntPtrType(AI.getType());
    if (AI.getArraySize()->getType() != IntPtrTy) {
      Value *V = Builder->CreateIntCast(AI.getArraySize(), IntPtrTy, false);
      AI.setOperand(0, V);
      return &AI;
    }
  }

  // Convert: alloca Ty, C - where C is a constant != 1 into: alloca [C x Ty], 1
  if (AI.isArrayAllocation()) { // Check C != 1
    if (const ConstantInt *C = dyn_cast<ConstantInt>(AI.getArraySize())) {
      Type *NewTy = ArrayType::get(AI.getAllocatedType(), C->getZExtValue());
      AllocaInst *New = Builder->CreateAlloca(NewTy, nullptr, AI.getName());
      New->setAlignment(AI.getAlignment());

      // Scan to the end of the allocation instructions, to skip over a block
      // of allocas if possible...also skip interleaved debug info.
      //
      BasicBlock::iterator It = New;
      while (isa<AllocaInst>(*It) || isa<DbgInfoIntrinsic>(*It))
        ++It;

      // Now that It is pointing to the first non-allocation-inst in the
      // block, insert our getelementptr instruction...
      //
      Type *IdxTy = DL ? DL->getIntPtrType(AI.getType())
                       : Type::getInt64Ty(AI.getContext());
      Value *NullIdx = Constant::getNullValue(IdxTy);
      Value *Idx[2] = {NullIdx, NullIdx};
      Instruction *GEP =
          GetElementPtrInst::CreateInBounds(New, Idx, New->getName() + ".sub");
      InsertNewInstBefore(GEP, *It);

      // Now make everything use the getelementptr instead of the original
      // allocation.
      return ReplaceInstUsesWith(AI, GEP);
    } else if (isa<UndefValue>(AI.getArraySize())) {
      return ReplaceInstUsesWith(AI, Constant::getNullValue(AI.getType()));
    }
  }

  if (DL && AI.getAllocatedType()->isSized()) {
    // If the alignment is 0 (unspecified), assign it the preferred alignment.
    if (AI.getAlignment() == 0)
      AI.setAlignment(DL->getPrefTypeAlignment(AI.getAllocatedType()));

    // Move all alloca's of zero byte objects to the entry block and merge them
    // together.  Note that we only do this for alloca's, because malloc should
    // allocate and return a unique pointer, even for a zero byte allocation.
    if (DL->getTypeAllocSize(AI.getAllocatedType()) == 0) {
      // For a zero sized alloca there is no point in doing an array allocation.
      // This is helpful if the array size is a complicated expression not used
      // elsewhere.
      if (AI.isArrayAllocation()) {
        AI.setOperand(0, ConstantInt::get(AI.getArraySize()->getType(), 1));
        return &AI;
      }

      // Get the first instruction in the entry block.
      BasicBlock &EntryBlock = AI.getParent()->getParent()->getEntryBlock();
      Instruction *FirstInst = EntryBlock.getFirstNonPHIOrDbg();
      if (FirstInst != &AI) {
        // If the entry block doesn't start with a zero-size alloca then move
        // this one to the start of the entry block. There is no problem with
        // dominance as the array size was forced to a constant earlier already.
        AllocaInst *EntryAI = dyn_cast<AllocaInst>(FirstInst);
        if (!EntryAI || !EntryAI->getAllocatedType()->isSized() ||
            DL->getTypeAllocSize(EntryAI->getAllocatedType()) != 0) {
          AI.moveBefore(FirstInst);
          return &AI;
        }

        // If the alignment of the entry block alloca is 0 (unspecified),
        // assign it the preferred alignment.
        if (EntryAI->getAlignment() == 0)
          EntryAI->setAlignment(
              DL->getPrefTypeAlignment(EntryAI->getAllocatedType()));

        // Replace this zero-sized alloca with the one at the start of the
        // entry block after ensuring that the address will be aligned enough
        // for both types.
        unsigned MaxAlign = std::max(EntryAI->getAlignment(), AI.getAlignment());
        EntryAI->setAlignment(MaxAlign);
        if (AI.getType() != EntryAI->getType())
          return new BitCastInst(EntryAI, AI.getType());
        return ReplaceInstUsesWith(AI, EntryAI);
      }
    }
  }

  if (AI.getAlignment()) {
    // Check to see if this allocation is only modified by a memcpy/memmove
    // from a constant global whose alignment is equal to or exceeds that of
    // the allocation.  If this is the case, we can change all users to use the
    // constant global instead.  This is commonly produced by the CFE by
    // constructs like "void foo() { int A[] = {1,2,3,4,5,6,7,8,9...}; }" if
    // 'A' is only subsequently read.
    SmallVector<Instruction *, 4> ToDelete;
    if (MemTransferInst *Copy = isOnlyCopiedFromConstantGlobal(&AI, ToDelete)) {
      unsigned SourceAlign =
          getOrEnforceKnownAlignment(Copy->getSource(), AI.getAlignment(), DL);
      if (AI.getAlignment() <= SourceAlign) {
        DEBUG(dbgs() << "Found alloca equal to global: " << AI << '\n');
        DEBUG(dbgs() << "  memcpy = " << *Copy << '\n');
        for (unsigned i = 0, e = ToDelete.size(); i != e; ++i)
          EraseInstFromFunction(*ToDelete[i]);
        Constant *TheSrc = cast<Constant>(Copy->getSource());
        Constant *Cast =
            ConstantExpr::getPointerBitCastOrAddrSpaceCast(TheSrc, AI.getType());
        Instruction *NewI = ReplaceInstUsesWith(AI, Cast);
        EraseInstFromFunction(*Copy);
        ++NumGlobalCopies;
        return NewI;
      }
    }
  }

  // At last, use the generic allocation site handler to aggressively remove
  // unused allocas.
  return visitAllocSite(AI);
}
static int initEnv(Module *mainModule) {
  /*
    nArgcP = alloc oldArgc->getType()
    nArgvV = alloc oldArgv->getType()
    store oldArgc nArgcP
    store oldArgv nArgvP
    klee_init_environment(nArgcP, nArgvP)
    nArgc = load nArgcP
    nArgv = load nArgvP
    oldArgc->replaceAllUsesWith(nArgc)
    oldArgv->replaceAllUsesWith(nArgv)
  */

  Function *mainFn = mainModule->getFunction(EntryPoint);

  if (mainFn->arg_size() < 2) {
    klee_error("Cannot handle \"--posix-runtime\" when main() has less than "
               "two arguments.\n");
  }

  Instruction *firstInst = mainFn->begin()->begin();

  Value *oldArgc = mainFn->arg_begin();
  Value *oldArgv = ++mainFn->arg_begin();

  AllocaInst *argcPtr =
      new AllocaInst(oldArgc->getType(), "argcPtr", firstInst);
  AllocaInst *argvPtr =
      new AllocaInst(oldArgv->getType(), "argvPtr", firstInst);

  /* Insert void klee_init_env(int* argc, char*** argv) */
  std::vector<const Type *> params;
  params.push_back(Type::getInt32Ty(getGlobalContext()));
  params.push_back(Type::getInt32Ty(getGlobalContext()));
  Function *initEnvFn = cast<Function>(
      mainModule->getOrInsertFunction("klee_init_env",
                                      Type::getVoidTy(getGlobalContext()),
                                      argcPtr->getType(), argvPtr->getType(),
                                      NULL));
  assert(initEnvFn);
  std::vector<Value *> args;
  args.push_back(argcPtr);
  args.push_back(argvPtr);
#if LLVM_VERSION_CODE >= LLVM_VERSION(3, 0)
  Instruction *initEnvCall = CallInst::Create(initEnvFn, args, "", firstInst);
#else
  Instruction *initEnvCall =
      CallInst::Create(initEnvFn, args.begin(), args.end(), "", firstInst);
#endif
  Value *argc = new LoadInst(argcPtr, "newArgc", firstInst);
  Value *argv = new LoadInst(argvPtr, "newArgv", firstInst);

  oldArgc->replaceAllUsesWith(argc);
  oldArgv->replaceAllUsesWith(argv);

  new StoreInst(oldArgc, argcPtr, initEnvCall);
  new StoreInst(oldArgv, argvPtr, initEnvCall);

  return 0;
}
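At the source level, the IR edits performed by initEnv correspond to the rewrite sketched below. The `klee_init_env` body here is a hypothetical stand-in (the real hook lives in KLEE's POSIX runtime and does much more); the point is the spill-call-reload shape: argc/argv are stored to stack slots, the runtime may rewrite them, and the rest of main sees the reloaded values.

```cpp
// Hypothetical stand-in for KLEE's runtime hook, which may rewrite the
// argument vector. For illustration, drop a leading program-name argument.
static void klee_init_env(int *argcPtr, char ***argvPtr) {
  if (*argcPtr > 0) {
    --*argcPtr;
    ++*argvPtr;
  }
}

// Source-level equivalent of initEnv's IR edits: spill argc/argv to stack
// slots (the two AllocaInst + StoreInst pairs), call the runtime, then use
// the reloaded values (the two LoadInst + replaceAllUsesWith steps).
static int instrumentedMain(int argc, char **argv) {
  int newArgc = argc;    // store oldArgc -> argcPtr
  char **newArgv = argv; // store oldArgv -> argvPtr
  klee_init_env(&newArgc, &newArgv);
  (void)newArgv;
  return newArgc; // the body now sees the possibly-rewritten values
}

// Small driver so the effect is observable without touching real argv.
static int demo() {
  char a0[] = "prog";
  char a1[] = "input";
  char *argv[] = {a0, a1};
  return instrumentedMain(2, argv);
}
```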
void StackColoring::remapInstructions(DenseMap<int, int> &SlotRemap) {
  unsigned FixedInstr = 0;
  unsigned FixedMemOp = 0;
  unsigned FixedDbg = 0;
  MachineModuleInfo *MMI = &MF->getMMI();

  // Remap debug information that refers to stack slots.
  for (auto &VI : MMI->getVariableDbgInfo()) {
    if (!VI.Var)
      continue;
    if (SlotRemap.count(VI.Slot)) {
      DEBUG(dbgs() << "Remapping debug info for ["
                   << cast<DILocalVariable>(VI.Var)->getName() << "].\n");
      VI.Slot = SlotRemap[VI.Slot];
      FixedDbg++;
    }
  }

  // Keep a list of *allocas* which need to be remapped.
  DenseMap<const AllocaInst *, const AllocaInst *> Allocas;
  for (const std::pair<int, int> &SI : SlotRemap) {
    const AllocaInst *From = MFI->getObjectAllocation(SI.first);
    const AllocaInst *To = MFI->getObjectAllocation(SI.second);
    assert(To && From && "Invalid allocation object");
    Allocas[From] = To;

    // AA might be used later for instruction scheduling, and we need it to be
    // able to deduce the correct aliasing relationships between pointers
    // derived from the alloca being remapped and the target of that
    // remapping. The only safe way, without directly informing AA about the
    // remapping somehow, is to directly update the IR to reflect the change
    // being made here.
    Instruction *Inst = const_cast<AllocaInst *>(To);
    if (From->getType() != To->getType()) {
      BitCastInst *Cast = new BitCastInst(Inst, From->getType());
      Cast->insertAfter(Inst);
      Inst = Cast;
    }

    // Allow the stack protector to adjust its value map to account for the
    // upcoming replacement.
    SP->adjustForColoring(From, To);

    // The new alloca might not be valid in a llvm.dbg.declare for this
    // variable, so undef out the use to make the verifier happy.
    AllocaInst *FromAI = const_cast<AllocaInst *>(From);
    if (FromAI->isUsedByMetadata())
      ValueAsMetadata::handleRAUW(FromAI, UndefValue::get(FromAI->getType()));
    for (auto &Use : FromAI->uses()) {
      if (BitCastInst *BCI = dyn_cast<BitCastInst>(Use.get()))
        if (BCI->isUsedByMetadata())
          ValueAsMetadata::handleRAUW(BCI, UndefValue::get(BCI->getType()));
    }

    // Note that this will not replace uses in MMOs (which we'll update
    // below), or anywhere else (which is why we won't delete the original
    // instruction).
    FromAI->replaceAllUsesWith(Inst);
  }

  // Remap all instructions to the new stack slots.
  for (MachineBasicBlock &BB : *MF)
    for (MachineInstr &I : BB) {
      // Skip lifetime markers. We'll remove them soon.
      if (I.getOpcode() == TargetOpcode::LIFETIME_START ||
          I.getOpcode() == TargetOpcode::LIFETIME_END)
        continue;

      // Update the MachineMemOperand to use the new alloca.
      for (MachineMemOperand *MMO : I.memoperands()) {
        // FIXME: In order to enable the use of TBAA when using AA in CodeGen,
        // we'll also need to update the TBAA nodes in MMOs with values
        // derived from the merged allocas. When doing this, we'll need to use
        // the same variant of GetUnderlyingObjects that is used by the
        // instruction scheduler (that can look through ptrtoint/inttoptr
        // pairs).

        // We've replaced IR-level uses of the remapped allocas, so we only
        // need to replace direct uses here.
        const AllocaInst *AI = dyn_cast_or_null<AllocaInst>(MMO->getValue());
        if (!AI)
          continue;

        if (!Allocas.count(AI))
          continue;

        MMO->setValue(Allocas[AI]);
        FixedMemOp++;
      }

      // Update all of the machine instruction operands.
      for (MachineOperand &MO : I.operands()) {
        if (!MO.isFI())
          continue;
        int FromSlot = MO.getIndex();

        // Don't touch arguments.
        if (FromSlot < 0)
          continue;

        // Only look at mapped slots.
        if (!SlotRemap.count(FromSlot))
          continue;

        // In a debug build, check that the instruction that we are modifying
        // is inside the expected live range. If the instruction is not inside
        // the calculated range then it means that the alloca usage moved
        // outside of the lifetime markers, or that the user has a bug.
        // NOTE: Alloca address calculations which happen outside the lifetime
        // zone are okay, despite the fact that we don't have a good way for
        // validating all of the usages of the calculation.
#ifndef NDEBUG
        bool TouchesMemory = I.mayLoad() || I.mayStore();
        // If we *don't* protect the user from escaped allocas, don't bother
        // validating the instructions.
        if (!I.isDebugValue() && TouchesMemory && ProtectFromEscapedAllocas) {
          SlotIndex Index = Indexes->getInstructionIndex(I);
          const LiveInterval *Interval = &*Intervals[FromSlot];
          assert(Interval->find(Index) != Interval->end() &&
                 "Found instruction usage outside of live range.");
        }
#endif

        // Fix the machine instructions.
        int ToSlot = SlotRemap[FromSlot];
        MO.setIndex(ToSlot);
        FixedInstr++;
      }
    }

  // Update the location of C++ catch objects for the MSVC personality
  // routine.
  if (WinEHFuncInfo *EHInfo = MF->getWinEHFuncInfo())
    for (WinEHTryBlockMapEntry &TBME : EHInfo->TryBlockMap)
      for (WinEHHandlerType &H : TBME.HandlerArray)
        if (H.CatchObj.FrameIndex != INT_MAX &&
            SlotRemap.count(H.CatchObj.FrameIndex))
          H.CatchObj.FrameIndex = SlotRemap[H.CatchObj.FrameIndex];

  DEBUG(dbgs() << "Fixed " << FixedMemOp << " machine memory operands.\n");
  DEBUG(dbgs() << "Fixed " << FixedDbg << " debug locations.\n");
  DEBUG(dbgs() << "Fixed " << FixedInstr << " machine instructions.\n");
}
void FunctionCodegen(TiXmlElement *procedure, LLVMContext &context,
                     IRBuilder<> *builder) {
  const char *proc_name = procedure->Attribute("Name");

  std::vector<Type *> VarArgs;
  std::vector<std::string> VarNames;
  TiXmlElement *form_pars = procedure->FirstChildElement("FormalParameters");
  for (TiXmlElement *form_par =
           form_pars->FirstChildElement("FormalParameter");
       form_par; form_par = form_par->NextSiblingElement("FormalParameter")) {
    const char *form_par_name = form_par->Attribute("Name");
    const char *form_par_type = form_par->Attribute("Type");
    if ((std::string)form_par_type == "INTEGER")
      VarArgs.push_back(Type::getInt32Ty(getGlobalContext()));
    else if ((std::string)form_par_type == "REAL")
      VarArgs.push_back(Type::getDoubleTy(getGlobalContext()));
    else {
      // TODO: Other types <<<<<
    }
    VarNames.push_back(form_par_name); // remember the name of the variable
  }

  FunctionType *FT_temp =
      FunctionType::get(Type::getVoidTy(getGlobalContext()), VarArgs, false);
  Function *F_temp = Function::Create(FT_temp, Function::ExternalLinkage,
                                      proc_name, TheModule);
  BasicBlock *BB_temp = BasicBlock::Create(
      getGlobalContext(), "entry " + (std::string)proc_name + ":", F_temp);
  Builder.SetInsertPoint(BB_temp);

  ValueSymbolTable &VST = F_temp->getValueSymbolTable();

  // Internal variables
  TiXmlElement *proc_declarations =
      procedure->FirstChildElement("Declarations");
  for (TiXmlElement *proc_var =
           proc_declarations->FirstChildElement("Variable");
       proc_var; proc_var = proc_var->NextSiblingElement("Variable")) {
    const char *proc_var_name = proc_var->Attribute("Name");
    const char *proc_var_type = proc_var->Attribute("Type");
    AllocaInst *Alloca;
    if ((std::string)proc_var_type == "INTEGER") {
      Alloca = builder->CreateAlloca(Type::getInt32Ty(getGlobalContext()), 0,
                                     (std::string)proc_var_name);
      // CurVar = Builder.CreateLoad(Alloca, proc_var_name);
    } else if ((std::string)proc_var_type == "REAL") {
      Alloca = builder->CreateAlloca(Type::getDoubleTy(getGlobalContext()), 0,
                                     (std::string)proc_var_name);
      // CurVar = Builder.CreateLoad(Alloca, proc_var_name);
    } else {
      // TODO: Other types <<<<<
    }
  }

  // Body parsing
  TiXmlElement *proc_body = procedure->FirstChildElement("Body");
  for (TiXmlElement *proc_op = proc_body->FirstChildElement("Operator");
       proc_op; proc_op = proc_op->NextSiblingElement("Operator")) {
    const char *proc_op_typename = proc_op->Attribute("TypeName");
    if ((std::string)proc_op_typename == "Assign") {
      OperatorAssignCodegen(proc_op, context, builder, F_temp);
      F_temp->dump();
    } else {
      // TODO: Other types <<<<<
    }
  }
  // VST.dump();

  if (false) { // Test area
    AllocaInst *ptest = builder->CreateAlloca(
        Type::getInt32Ty(getGlobalContext()), 0, "ptest");
    AllocaInst *jtest = (AllocaInst *)VST.lookup("j");
    Value *ctest = ConstantInt::get(getGlobalContext(), APInt(32, 111, false));
    Value *cctest = ConstantInt::get(getGlobalContext(), APInt(32, 777, true));

    AllocaInst *rrrr = builder->CreateAlloca(
        Type::getDoubleTy(getGlobalContext()), 0, "rrrr");
    Value *cccc = ConstantFP::get(getGlobalContext(), APFloat(123.01));
    builder->CreateStore(cccc, rrrr);

    builder->CreateStore(ctest, ptest);
    builder->CreateStore(ctest, jtest);
    builder->CreateStore(cctest, jtest);
    Value *jrez = VST.lookup("j");
    Value *jrezw = builder->CreateLoad(jrez, "j");
    builder->CreateStore(cctest, jrez);
    // builder->CreateBinOp(Instruction::Add, ctest, cctest, "rezzzz");
    cctest->getType()->isPointerTy();
    jrez->getType()->dump();
    jtest->getType()->dump();
    Value *ffff = builder->CreateAdd(cctest, jrezw, "ffff");
    Value *gggg = builder->CreateAdd(jrezw, ffff, "gggg");
    gggg = builder->CreateAdd(jrezw, cctest, "gggg");
    jrezw = builder->CreateAdd(ffff, gggg, "jrezw");
    VST.lookup("j")->dump();
    jrezw->dump();
    Value *jrezww = builder->CreateLoad(VST.lookup("j"), "j");
    jrezww->dump();
    // rez->dump();
    // IRBuilder<> smallBuild(BB_temp);
    // Value *tmp = smallBuild.CreateBinOp(Instruction::Add,
    //                                     ctest, cctest, "tmp111");
  }

  F_temp->dump();
}
/*
 * Rewrite OpenMP call sites and their associated kernel functions -- the
 * following pattern:
 *   call void @GOMP_parallel_start(void (i8*)* @_Z20initialize_variablesiPfS_.omp_fn.4, i8* %.omp_data_o.5571, i32 0) nounwind
 *   call void @_Z20initialize_variablesiPfS_.omp_fn.4(i8* %.omp_data_o.5571) nounwind
 *   call void @GOMP_parallel_end() nounwind
 */
void HeteroOMPTransform::rewrite_omp_call_sites(Module &M) {
  SmallVector<Instruction *, 16> toDelete;
  DenseMap<Value *, Value *> ValueMap;

  for (Module::iterator I = M.begin(), E = M.end(); I != E; ++I) {
    if (!I->isDeclaration()) {
      for (Function::iterator BBI = I->begin(), BBE = I->end(); BBI != BBE;
           ++BBI) {
        bool match = false;
        for (BasicBlock::iterator INSNI = BBI->begin(), INSNE = BBI->end();
             INSNI != INSNE; ++INSNI) {
          if (isa<CallInst>(INSNI)) {
            CallSite CI(cast<Instruction>(INSNI));
            if (CI.getCalledFunction() != NULL) {
              string called_func_name = CI.getCalledFunction()->getName();
              if (called_func_name == OMP_PARALLEL_START_NAME &&
                  CI.arg_size() == 3) {
                // Change the alloca to malloc_shared:
                //   %5 = call i8* @_Z13malloc_sharedm(i64 20) ; <i8*> [#uses=5]
                //   %6 = bitcast i8* %5 to float*             ; <float*> [#uses=2]
                AllocaInst *AllocCall;
                Value *arg_0 = CI.getArgument(0); // function
                Value *arg_1 = CI.getArgument(1); // context
                Value *loop_ub = NULL;
                Function *function;
                BitCastInst *BCI;
                Function *kernel_function;
                BasicBlock::iterator iI(*INSNI);
                // BasicBlock::iterator iJ = iI+1;
                iI++;
                iI++;
                // BasicBlock::iterator iK = iI;
                CallInst /**next,*/ *next_next;
                if (arg_0 != NULL && arg_1 != NULL
                    /*&& (next = dyn_cast<CallInst>(*iJ))*/
                    && (next_next = dyn_cast<CallInst>(iI)) &&
                    (next_next->getCalledFunction() != NULL) &&
                    (next_next->getCalledFunction()->getName() ==
                     OMP_PARALLEL_END_NAME) &&
                    (BCI = dyn_cast<BitCastInst>(arg_1)) &&
                    (AllocCall = dyn_cast<AllocaInst>(BCI->getOperand(0))) &&
                    (function = dyn_cast<Function>(arg_0)) &&
                    (loop_ub = find_loop_upper_bound(AllocCall)) &&
                    (kernel_function =
                         convert_to_kernel_function(M, function))) {
                  SmallVector<Value *, 16> Args;
                  Args.push_back(AllocCall->getArraySize());
                  Instruction *MallocCall =
                      CallInst::Create(mallocFnTy, Args, "", AllocCall);
                  CastInst *MallocCast =
                      CastInst::Create(Instruction::BitCast, MallocCall,
                                       AllocCall->getType(), "", AllocCall);
                  ValueMap[AllocCall] = MallocCast;
                  // AllocCall->replaceAllUsesWith(MallocCall);

                  // Add the offload function.
                  Args.clear();
                  Args.push_back(loop_ub);
                  Args.push_back(BCI);
                  Args.push_back(kernel_function);
                  if (offloadFnTy == NULL) {
                    init_offload_type(M, kernel_function);
                  }
                  Instruction *call =
                      CallInst::Create(offloadFnTy, Args, "", INSNI);

                  if (find(toDelete.begin(), toDelete.end(), AllocCall) ==
                      toDelete.end()) {
                    toDelete.push_back(AllocCall);
                  }
                  toDelete.push_back(&(*INSNI));
                  match = true;
                }
              } else if (called_func_name == OMP_PARALLEL_END_NAME &&
                         CI.arg_size() == 0 && match) {
                toDelete.push_back(&(*INSNI));
                match = false;
              } else if (match) {
                toDelete.push_back(&(*INSNI));
              }
            }
          }
        }
      }
    }
  }

  /* Replace AllocCalls by MallocCalls. */
  for (DenseMap<Value *, Value *>::iterator I = ValueMap.begin(),
                                            E = ValueMap.end();
       I != E; I++) {
    I->first->replaceAllUsesWith(I->second);
  }

  /* Delete the instructions for get_omp_num_thread and get_omp_thread_num. */
  while (!toDelete.empty()) {
    Instruction *g = toDelete.back();
    toDelete.pop_back();
    g->eraseFromParent();
  }
}
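Conceptually, the rewrite replaces the GOMP start/kernel/end triple with a shared-memory context allocation plus a single offload call. The sketch below uses hypothetical `malloc_shared`/`offload` stand-ins (the real symbols are resolved through `mallocFnTy` and `offloadFnTy` in the pass) and runs the kernel inline instead of on a device.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the heterogeneous runtime's shared allocator
// (the pass emits a call to something like @_Z13malloc_sharedm instead of
// the stack alloca).
static void *malloc_shared(size_t n) { return std::malloc(n); }

// The context struct that was previously alloca'd and bitcast to i8*.
struct OmpData {
  int n;
  int *out;
};

// The kernel function extracted from the OpenMP outlined body.
static void kernel(void *ctx) {
  auto *d = static_cast<OmpData *>(ctx);
  for (int i = 0; i < d->n; ++i)
    d->out[i] = i * i;
}

// After the rewrite: one offload call replaces the start/kernel/end triple.
// A real runtime would partition the iteration space by loopUpperBound.
static void offload(int loopUpperBound, void *ctx, void (*fn)(void *)) {
  (void)loopUpperBound;
  fn(ctx);
}

static int demo() {
  static int out[4];
  auto *d = static_cast<OmpData *>(malloc_shared(sizeof(OmpData)));
  d->n = 4;
  d->out = out;
  offload(d->n, d, kernel);
  int r = d->out[3];
  std::free(d);
  return r;
}
```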
bool AllocaMerging::runOnFunction(Function &F) {
  cheerp::PointerAnalyzer &PA = getAnalysis<cheerp::PointerAnalyzer>();
  cheerp::Registerize &registerize = getAnalysis<cheerp::Registerize>();
  cheerp::TypeSupport types(*F.getParent());

  AllocaInfos allocaInfos;
  // Gather all the allocas
  for (BasicBlock &BB : F)
    analyzeBlock(registerize, BB, allocaInfos);
  if (allocaInfos.size() < 2)
    return false;

  bool Changed = false;
  BasicBlock &entryBlock = F.getEntryBlock();
  // Look if we can merge allocas of the same type
  for (auto targetCandidate = allocaInfos.begin();
       targetCandidate != allocaInfos.end(); ++targetCandidate) {
    AllocaInst *targetAlloca = targetCandidate->first;
    Type *targetType = targetAlloca->getAllocatedType();
    // The range storing the sum of all ranges merged into target
    cheerp::Registerize::LiveRange targetRange(targetCandidate->second);
    // If the range is empty, we have an alloca that we can't analyze
    if (targetRange.empty())
      continue;
    std::vector<AllocaInfos::iterator> mergeSet;
    auto sourceCandidate = targetCandidate;
    ++sourceCandidate;
    for (; sourceCandidate != allocaInfos.end(); ++sourceCandidate) {
      AllocaInst *sourceAlloca = sourceCandidate->first;
      Type *sourceType = sourceAlloca->getAllocatedType();
      // Bail out for non compatible types
      if (!areTypesEquivalent(types, PA, targetType, sourceType))
        continue;
      const cheerp::Registerize::LiveRange &sourceRange =
          sourceCandidate->second;
      // Bail out if this source candidate is not analyzable
      if (sourceRange.empty())
        continue;
      // Bail out if the allocas interfere
      if (targetRange.doesInterfere(sourceRange))
        continue;
      // Add the range to the target range and the source alloca to the
      // mergeSet
      mergeSet.push_back(sourceCandidate);
      PA.invalidate(sourceAlloca);
      targetRange.merge(sourceRange);
    }

    // If the merge set is empty try another target
    if (mergeSet.empty())
      continue;

    PA.invalidate(targetAlloca);

    if (!Changed)
      registerize.invalidateLiveRangeForAllocas(F);
    // Make sure that this alloca is in the entry block
    if (targetAlloca->getParent() != &entryBlock)
      targetAlloca->moveBefore(entryBlock.begin());
    // We can merge the allocas
    for (const AllocaInfos::iterator &it : mergeSet) {
      AllocaInst *allocaToMerge = it->first;
      Instruction *targetVal = targetAlloca;
      if (targetVal->getType() != allocaToMerge->getType()) {
        targetVal = new BitCastInst(targetVal, allocaToMerge->getType());
        targetVal->insertAfter(targetAlloca);
      }
      allocaToMerge->replaceAllUsesWith(targetVal);
      allocaToMerge->eraseFromParent();
      if (targetVal != targetAlloca)
        PA.getPointerKind(targetVal);
      allocaInfos.erase(it);
      NumAllocaMerged++;
    }
    PA.getPointerKind(targetAlloca);
    Changed = true;
  }
  if (Changed)
    registerize.computeLiveRangeForAllocas(F);
  return Changed;
}
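What the pass achieves can be shown at the source level: two same-typed locals whose live ranges do not interfere can share a single stack slot. The sketch below (hypothetical names) performs that merge by hand; the reuse is safe precisely because the second array only becomes live after the first one is dead.

```cpp
// Before merging: two arrays with disjoint live ranges each get a slot.
static int disjointBefore() {
  int sum = 0;
  {
    int a[4] = {1, 2, 3, 4}; // live only in this scope
    for (int v : a)
      sum += v;
  }
  {
    int b[4] = {5, 6, 7, 8}; // live only in this scope
    for (int v : b)
      sum += v;
  }
  return sum;
}

// After merging: one slot serves both uses, mirroring how the pass
// redirects allocaToMerge's users to targetAlloca.
static int disjointAfter() {
  int slot[4]; // the single surviving "alloca"
  int sum = 0;
  int *a = slot;
  a[0] = 1; a[1] = 2; a[2] = 3; a[3] = 4;
  for (int i = 0; i < 4; ++i)
    sum += a[i];
  int *b = slot; // reuse the same storage; 'a' is dead by now
  b[0] = 5; b[1] = 6; b[2] = 7; b[3] = 8;
  for (int i = 0; i < 4; ++i)
    sum += b[i];
  return sum;
}
```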
//
// Method: insertBadAllocationSizes()
//
// Description:
//  This method will look for allocations and change their size to be
//  incorrect.  It does the following:
//    o) Changes the number of array elements allocated by alloca and malloc.
//
// Return value:
//  true  - The module was modified.
//  false - The module was left unmodified.
//
bool FaultInjector::insertBadAllocationSizes(Function &F) {
  // Worklist of allocation sites to rewrite
  std::vector<AllocaInst *> WorkList;

  for (Function::iterator fI = F.begin(), fE = F.end(); fI != fE; ++fI) {
    BasicBlock &BB = *fI;
    for (BasicBlock::iterator I = BB.begin(), bE = BB.end(); I != bE; ++I) {
      if (AllocaInst *AI = dyn_cast<AllocaInst>(I)) {
        if (AI->isArrayAllocation()) {
          // Skip if we should not insert a fault.
          if (!doFault())
            continue;
          WorkList.push_back(AI);
        }
      }
    }
  }

  while (WorkList.size()) {
    AllocaInst *AI = WorkList.back();
    WorkList.pop_back();

    //
    // Print information about where the fault is being inserted.
    //
    printSourceInfo("Bad allocation size", AI);

    Instruction *NewAlloc = 0;
    NewAlloc = new AllocaInst(AI->getAllocatedType(),
                              ConstantInt::get(Int32Type, 0),
                              AI->getAlignment(), AI->getName(), AI);
    AI->replaceAllUsesWith(NewAlloc);
    AI->eraseFromParent();
    ++BadSizes;
  }

  //
  // Try harder to make bad allocation sizes.
  //
  WorkList.clear();
  for (Function::iterator fI = F.begin(), fE = F.end(); fI != fE; ++fI) {
    BasicBlock &BB = *fI;
    for (BasicBlock::iterator I = BB.begin(), bE = BB.end(); I != bE; ++I) {
      if (AllocaInst *AI = dyn_cast<AllocaInst>(I)) {
        //
        // Determine if this is a data type that we can make smaller.
        //
        if (((TD->getTypeAllocSize(AI->getAllocatedType())) > 4) &&
            doFault()) {
          WorkList.push_back(AI);
        }
      }
    }
  }

  //
  // Replace these allocations with an allocation of an integer and cast the
  // result back into the appropriate type.
  //
  while (WorkList.size()) {
    AllocaInst *AI = WorkList.back();
    WorkList.pop_back();

    Instruction *NewAlloc = 0;
    NewAlloc = new AllocaInst(Int32Type, AI->getArraySize(),
                              AI->getAlignment(), AI->getName(), AI);
    NewAlloc = castTo(NewAlloc, AI->getType(), "", AI);
    AI->replaceAllUsesWith(NewAlloc);
    AI->eraseFromParent();
    ++BadSizes;
  }

  return (BadSizes > 0);
}
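The two fault-injection strategies reduce to simple size rewrites: an array allocation is given zero elements, and any object wider than an i32 is replaced by a 4-byte slot. The helpers below are a hypothetical distillation of that sizing logic, not code from the pass.

```cpp
#include <cstddef>

// First strategy: the array element count is clobbered to zero
// (ConstantInt::get(Int32Type, 0) in the pass).
static size_t injectBadArrayCount(size_t /*elements*/) { return 0; }

// Second strategy: any type whose alloc size exceeds 4 bytes is replaced by
// a 4-byte (i32-sized) allocation; smaller types are left alone, matching
// the "> 4" guard in the second scan.
static size_t injectBadObjectSize(size_t typeAllocSize) {
  return typeAllocSize > 4 ? 4 : typeAllocSize;
}
```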