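The Processor's Brain! A Guide to Understanding CPU Instruction Sets
● SSE4.1 improves video processing
SSE4.1 is a set of 47 new multimedia instructions that Intel introduced with the Penryn-core Core 2 Duo and Core 2 Solo processors to strengthen applications such as video editing. AMD also developed its own SSE4a multimedia instruction set, built into K10-architecture processors such as Phenom and Opteron. It targets broadly similar applications but is not compatible with Intel's SSE4 family.
Slow video processing
Video encoding reportedly uses motion estimation and differential coding to remove the correlation between two adjacent frames, which is a very compute-intensive operation. Without the SSE4 instruction set, one step of this requires code like the following: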
for (int moveblock = 0; moveblock < 16; moveblock++)
    for (int line = 0; line < 16; line++) // covers the 16-pixel row width in 4 passes of 4 pixels
    {
        int i = 0;
        int j = moveblock; // assumed starting index; the original listing does not show j being initialized
        sum0 += abs(pBlock1[j]   - pBlock2[i]) + abs(pBlock1[j+1] - pBlock2[i+1]) + abs(pBlock1[j+2] - pBlock2[i+2]) + abs(pBlock1[j+3]  - pBlock2[i+3]); // compare at 0-pixel offset
        sum1 += abs(pBlock1[j+1] - pBlock2[i]) + abs(pBlock1[j+2] - pBlock2[i+1]) + abs(pBlock1[j+3] - pBlock2[i+2]) + abs(pBlock1[j+4]  - pBlock2[i+3]); // compare at 1-pixel offset
        sum2 += abs(pBlock1[j+2] - pBlock2[i]) + abs(pBlock1[j+3] - pBlock2[i+1]) + abs(pBlock1[j+4] - pBlock2[i+2]) + abs(pBlock1[j+5]  - pBlock2[i+3]); // compare at 2-pixel offset
        sum3 += abs(pBlock1[j+3] - pBlock2[i]) + abs(pBlock1[j+4] - pBlock2[i+1]) + abs(pBlock1[j+5] - pBlock2[i+2]) + abs(pBlock1[j+6]  - pBlock2[i+3]); // compare at 3-pixel offset
        sum4 += abs(pBlock1[j+4] - pBlock2[i]) + abs(pBlock1[j+5] - pBlock2[i+1]) + abs(pBlock1[j+6] - pBlock2[i+2]) + abs(pBlock1[j+7]  - pBlock2[i+3]); // compare at 4-pixel offset
        sum5 += abs(pBlock1[j+5] - pBlock2[i]) + abs(pBlock1[j+6] - pBlock2[i+1]) + abs(pBlock1[j+7] - pBlock2[i+2]) + abs(pBlock1[j+8]  - pBlock2[i+3]); // compare at 5-pixel offset
        sum6 += abs(pBlock1[j+6] - pBlock2[i]) + abs(pBlock1[j+7] - pBlock2[i+1]) + abs(pBlock1[j+8] - pBlock2[i+2]) + abs(pBlock1[j+9]  - pBlock2[i+3]); // compare at 6-pixel offset
        sum7 += abs(pBlock1[j+7] - pBlock2[i]) + abs(pBlock1[j+8] - pBlock2[i+1]) + abs(pBlock1[j+9] - pBlock2[i+2]) + abs(pBlock1[j+10] - pBlock2[i+3]); // compare at 7-pixel offset
        i = 4;
        j = moveblock + 4;
        // … (the remaining 4-pixel groups are elided in the original listing)
        // …
    }
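This long run of instructions wastes a great deal of processor resources. On a processor that supports the SSE4 instruction set, a single instruction that computes 4-byte SADs (sums of absolute differences) is enough: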
MPSADBW xmm0,xmm1,0
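It computes all eight offset sums (sum0 through sum7) for one 4-pixel group at once, replacing the lengthy sequence above and greatly speeding up motion estimation and differential coding.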
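To make concrete what MPSADBW computes, here is a minimal scalar sketch in C that models the instruction's arithmetic. It is an illustration only, not Intel's reference code: the function name `mpsadbw_ref` and its array-based signature are hypothetical. The handling of the immediate follows the documented behaviour, where bits 1:0 select a 4-byte group in the second operand and bit 2 selects a starting offset of 0 or 4 bytes in the first operand.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar model of MPSADBW (hypothetical helper, for illustration only).
 * src1 and src2 stand in for the two 16-byte XMM operands; out receives
 * the eight 16-bit results that the instruction writes to the destination. */
static void mpsadbw_ref(const uint8_t src1[16], const uint8_t src2[16],
                        unsigned imm8, uint16_t out[8])
{
    assert(src1 != NULL && src2 != NULL && out != NULL);

    unsigned blk2 = (imm8 & 3u) * 4u;        /* bits 1:0: 4-byte group in src2 */
    unsigned blk1 = ((imm8 >> 2) & 1u) * 4u; /* bit 2: start offset 0 or 4 in src1 */

    for (unsigned k = 0; k < 8; k++) {       /* eight pixel offsets, like sum0..sum7 */
        unsigned sad = 0;
        for (unsigned n = 0; n < 4; n++) {   /* four pixels per comparison */
            sad += (unsigned)abs((int)src1[blk1 + k + n] - (int)src2[blk2 + n]);
        }
        out[k] = (uint16_t)sad;              /* at most 4 * 255, so it fits in 16 bits */
    }
}
```

For example, if both operands hold the bytes 0 through 15 and the immediate is 0, every pixel at offset k differs from its reference pixel by exactly k, so the eight results are 0, 4, 8, and so on up to 28. In real code the instruction is normally reached through the compiler intrinsic `_mm_mpsadbw_epu8` declared in `<smmintrin.h>`, which requires building for an SSE4.1-capable target.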