2 writes to cache_k
Microsoft.ML.GenAI.Phi (2)
Module\Phi2Attention.cs (2)
88: this.cache_k = torch.zeros(maxBatch, this._numKeyValueHeads, maxLength, this._headDim, dtype: config.Dtype);
104: this.cache_k = this.cache_k.to(hiddenStates.device, disposeAfter: true).DetachFromDisposeScope();
4 references to cache_k
Microsoft.ML.GenAI.Phi (4)
Module\Phi2Attention.cs (4)
102: if (this.cache_k.device != hiddenStates.device)
104: this.cache_k = this.cache_k.to(hiddenStates.device, disposeAfter: true).DetachFromDisposeScope();
139: this.cache_k[..batchSize, .., pastKeyValueLength..kvSeqLen, ..] = keyStates;
141: keyStates = this.cache_k[..batchSize, .., ..kvSeqLen, ..];