Errata for The Software Vectorization Handbook
This page contains errata for The Software Vectorization Handbook.
The first person to report a new error is credited with a "courtesy" note.
First Printing
- page 25: replace "a2-b3" with "a2-b2" in destination
(courtesy Naftali Schwartz).
- page 27: replace "stores the 16-bit sums" with
"stores the 32-bit sums".
- page 35: replace the immediate 230 with 245 in the shufps that
is equal to movshdup.
- page 38: change the theoretical peak to 12 GFLOPS for single-
and 6 GFLOPS for double-precision FP computations.
- page 94: add "T" to two occurrences of "pcmpeq" in expansion
table (viz. "pcmpeqT") and "d" to "pcmpeq" in example
at the bottom (viz. "pcmpeqd").
- page 94: replace "constants -1 and -17" with "constants -1 and 17"
(courtesy Naftali Schwartz).
- page 112: replace "pcmpneb,w,d" entries in Table 5.8 with
"emulate with pcmpeqb,w,d"
(no direct support of "ne" instructions).
- page 113: remove pcmpneb,w row completely from table
(no direct support of "ne" instructions).
- page 127: remove ")" from caption of Figure 6.2.
- page 190: remove "is" from "vector implementation is shown above".
- page 209: remove "LOOP" from "BLOCK LOOP WAS VECTORIZED" in Table 9.5.
- page 215: add brackets "(" ")" around switches "-O3 -Qipo -Qprof_use -QxP".
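
The page 35 correction can be checked by hand. The sketch below is a plain Python simulation, not code from the book: it models shufps and movshdup on 4-element lists (index 0 is the lowest lane) and assumes the shufps in question uses the same register as both operands. The helper names shufps and movshdup are chosen here for illustration only.

```python
def shufps(dst, src, imm8):
    """Simulate SSE shufps on 4-element lists (index 0 = lowest lane).

    The two low result lanes are selected from dst and the two high
    result lanes from src, each by a 2-bit field of the immediate.
    """
    return [
        dst[imm8 & 3],          # bits 1:0 select result lane 0 from dst
        dst[(imm8 >> 2) & 3],   # bits 3:2 select result lane 1 from dst
        src[(imm8 >> 4) & 3],   # bits 5:4 select result lane 2 from src
        src[(imm8 >> 6) & 3],   # bits 7:6 select result lane 3 from src
    ]


def movshdup(src):
    """Simulate SSE3 movshdup: duplicate the odd-indexed lanes."""
    return [src[1], src[1], src[3], src[3]]


# Assumption: the same register is used for both shufps operands.
x = [10.0, 11.0, 12.0, 13.0]

corrected = shufps(x, x, 245)   # 245 = 0b11_11_01_01 selects lanes 1, 1, 3, 3
original = shufps(x, x, 230)    # 230 = 0b11_10_01_10 selects lanes 2, 1, 2, 3

print(corrected == movshdup(x))
print(original == movshdup(x))
```

Under these assumptions, the immediate 245 reproduces movshdup, while 230 selects lanes 2, 1, 2, 3 and does not.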
Second Printing
Please note that this page is privately maintained by
Aart Bik.