Index

0-bits, leading zeros. See nlz function.

0-bits, trailing zeros. See also ntz (number of trailing zeros) function.

counting, 107–114.

detecting, 324. See also CRC (cyclic redundancy check).

plots and graphs, 466

0-bytes, finding, 117–121

1-bits, counting. See Counting bits.

3:2 compressor, 90–95

The 16 Boolean binary operations, 53–57

A

Absolute value

computing, 18

multibyte, 40–41

negative of, 23–26

add instruction

condition codes, 36–37

propagating arithmetic bounds, 70–73

Addition

arithmetic tables, 453

combined with logical operations, 16–17

double-length, 38–39

multibyte, 40–41

of negabinary numbers, 301–302

overflow detection, 28–29

plots and graphs, 461

in various number encodings, 304–305

Advanced Encryption Standard, 164

Alternating among values, 48–51

Alverson‘s method, 237–238

and

plots and graphs, 459

in three instructions, 17

and with complement, 131

Answers to exercises, by chapter

1: Introduction, 405–406

2: Basics, 407–415

3: Power-of-2 Boundaries, 415–416

4: Arithmetic Bounds, 416–417

5: Counting Bits, 417–418

6: Searching words, 418–423

7: Rearranging Bits and Bytes, 423–425

8: Multiplication, 425–428

9: Integer Division, 428–430

10: Integer Division by Constants, 431–434

11: Some Elementary Functions, 434–435

12: Unusual Bases for Number Systems, 435–439

13: Gray Code, 439–441

14: Cyclic Redundancy Check, 441–442

15: Error-Correcting Codes, 442–445

16: Hilbert‘s Curve, 446

17: Floating-Point, 446–448

18: Formulas for Primes, 448–452

Arithmetic, computer vs. ordinary, 1

Arithmetic bounds

checking, 67–69

of expressions, 70–71

propagating through, 70–73

range analysis, 70

searching for values in, 122

Arithmetic tables, 4-bit machine, 453–456

Arrays

checking bounds. See Arithmetic bounds.

counting 1-bits, 89–96

indexes, checking. See Arithmetic bounds.

indexing a sparse array, 95

permutation, 161–163

rearrangements, 165–166

of short integers, 40–41

Autodin-II polynomial, 323

Average, computing, 19, 55–56

B

Base –1 + i number system, 306–308

extracting real and imaginary parts, 310

Base –1 – i number system, 308–309

Base –2 number system, 299–306

Gray code, 315

rounding down, 310

Basic RISC instruction set, 5–6

Basic, Wang System 2200B, 55

Big-endian format, converting to little-endian, 129

Binary decomposition, integer exponentiation, 288–290

Binary forward error-correcting block codes (FEC), 331

Binary search

counting leading 0‘s, 99–104

integer logarithm, 291–297

integer square root, 279–287

Bit matrices, multiplying, 98

Bit operations

compress operation, 150–156

computing parity. See Parity.

counting bits. See Counting bits.

finding strings of 1-bits, 123–128

flipping bits, 135

general permutations, 161–165

generalized bit reversal, 135

generalized extract, 150–156

half shuffle, 141

inner perfect shuffle, plots and graphs, 468–469

inner perfect unshuffle, plots and graphs, 468

inner shuffle, 139–141

numbering schemes, 1

outer shuffle, 139–141, 373

perfect shuffle, 139–141

reversing bits. See Reversing bits and bytes.

on rightmost bits. See Rightmost bits.

searching words for bit strings, 107, 123–128

sheep and goats operation, 161–165

shuffling bits, 139–141, 165–166

transposing a bit matrix, 141–150

unshuffling bits, 140–141, 150, 162

Bit reversal function, plots and graphs, 467

Bit vectors, 1

bitgather instruction, 163–165

Bits. See specific topics.

bitsize function, 106–107

Bliss, Robert D., xv

Bonzini, Paolo, 263

BOOL function, 54–55

Boole, George, 54

Boolean binary operations, all 16, 53–57

Boolean decomposition formula, 51–53, 56–57

Boundary crossings, powers of 2, 63–64

Bounds, arithmetic. See Arithmetic bounds.

Bounds checking. See Checking arithmetic bounds.

branch on carry and register result nonzero instruction, 63

Bytes. See also specific topics.

definition, 1

finding first 0-byte, 117–121

C

C language

arithmetic on pointers, 105, 240

GNU extensions, 105

iIterative statements, 4, 10

referring to same location with different types, 104

representation of character strings, 117

summary of elements, 2–4

Caches, 166-167

Carry-save adder (CSA) circuit, 90–95

CCITT (Le Comité Consultatif Internationale...), 321

Ceiling function, identities, 183–184

Chang, Albert, 123

Character strings, 117

Check bits

Hamming code, 332

SEC-DED code, 334–335

Checking arithmetic bounds, 67–69

Chinese ring puzzle, 315

Chipkill technology, 336

Code, definition, 343

Code length, 331, 343

Code rate, 343

Code size, 343

Comparison predicates

from the carry bit, 26–27

definition, 23

number of leading zeros (nlz) function, 23–24, 107

signed comparisons, from unsigned, 25

true/false results, 23

using negative absolute values, 23–26

Comparisons

computer evaluation of, 27

floating-point comparisons using integer operations, 381–382

three-valued compare function, 21–22. See also sign function.

Compress function, plots and graphs, 464–465

compress operation, 119, 150–161

with insert and extract instructions, 155–156

Computability test, right-to-left, 13–14, 55

Computer algebra, 2–4

Computer arithmetic

definition, 1

plots and graphs, 461–463

Condition codes, 36–37

Constants

dividing by. See Division of integers by constants.

multiplying by, 175–178

Counting bits. See also ntz (number of trailing zeros) function; nlz (number of leading zeros) function; population count function.

1-bits in

7- and 8-bit quantities, 87

an array, 89–95

a word, 81–88

bitsize function, 106–107

comparing two words, 88–89

divide and conquer strategy, 81–82

leading 0‘s, with

binary search method, 99–100

floating-point methods, 104–106

population count instruction, 101–102

rotate and sum method, 85–86

search tree method, 109

with table lookup, 86–87

trailing 0‘s, 107–114

by turning off 1-bits, 85

CRC (cyclic redundancy check)

background, 319–320

check bits, generating, 319–320

checksum, computing

generator polynomials, 322–323, 329

with hardware, 324–326

with software, 327–329

with table lookup, 328–329

techniques for, 320

code vector, 319

definition, 319

feedback shift register circuit, 325–326

generator polynomial, choosing, 322–323, 329

parity bits, 319–320

practice

hardware checksums, 324–326

leading zeros, detecting, 324

overview, 323–324

residual/residue, 324

software checksums, 327–329

trailing zeros, detecting, 324

theory, 320–323

CRC codes, generator polynomials, 322, 323

CRC-CITT polynomial, 323

Cryptography

Advanced Encryption Standard, 164

bitgather instruction, 164–165

DES (Data Encryption Standard), 164

Rijndael algorithm, 164

SAG method, 162–165

shuffling bits, 139–141, 165

Triple DES, 164

CSA (carry-save addr) circuit, 90–95

Cube root, approximate, floating-point, 389

Cube root, integer, 287–288

Curves. See also Hilbert‘s curve.

Peano, 371–372

space-filling, 355–372

Cycling among values, 48–51

D

Davio decomposition, 51-53, 56–57

de Bruijn cycles, 111–112

de Kloet, David, 55

De Morgan‘s laws, 12–13

DEC PDP-10 computer, xiii, 84

Decryption. See Cryptography.

DES (Data Encryption Standard), 164

Dietz‘s formula, 19, 55

difference or zero (doz) function, 41–45

Distribution of leading digits, 385–387

Divide and conquer strategy, 81–82

Division

arithmetic tables, 455

doubleword

from long division, 197–202

signed, 201–202

by single word, 192–197

unsigned, 197–201

floor, 181–182, 237

modulus, 181–182, 237

multiword, 184–188

of negabinary numbers, 302–304

nonrestoring algorithm, 192–194

notation, 181

overflow detection, 34–36

plots and graphs, 463–464

restoring algorithm, 192–193

shift-and-subtract algorithms (hardware), 192–194

short, 189–192, 195–197

signed

computer, 181

doubleword, 201–202

long, 189

multiword, 188

short, 190–192

unsigned

computer, 181

doubleword, 197–201

long, 192–197

short from signed, 189–192

Division of integers by constants

by 3, 207–209, 276–277

by 5 and 7, 209–210

exact division

converting to, 274–275

definition, 240

multiplicative inverse, Euclidean algorithm, 242–245

multiplicative inverse, Newton‘s method, 245–247

multiplicative inverse, samples, 247–248

floor division, 237

incorporating into a compiler, signed, 220–223

incorporating into a compiler, unsigned, 232–234

magic numbers

Alverson‘s method, 237–238

calculating, signed, 212–213, 220–223

calculating, unsigned, 231–234

definition, 211

sample numbers, 238–239

table lookup, 237

uniqueness, 224

magicu algorithm, 232–234

magicu2 algorithm, 236

modulus division, 237

remainder by multiplication and shifting right

signed, 273–274

unsigned, 268–272

remainder by summing digits

signed, 266–268

unsigned, 262–266

signed

by divisors ≤ –2, 218–220

by divisors ≥ 2, 210–218

by powers of 2, 205–206

incorporating into a compiler, 220–223

not using mulhs (multiply high signed), 259–262

remainder by multiplication and shifting right, 273–274

remainder by summing digits, 266–268

remainder from powers of 2, 206–207

test for zero remainder, 250–251

uniqueness, 224

timing test, 276

unsigned

best programs for, 234–235

by 3 and 7, 227–229

by divisors ≥ 1, 230–232

by powers of 2, 227

incorporating into a compiler, 232–234

incremental division and remainder technique, 232–234

not using mulhu (multiply high unsigned) instruction, 251–259

remainder by multiplication and shifting right, 268–272

remainder by summing digits, 262–266

remainder from powers of 2, 227

test for zero remainder, 248–250

Double buffering, 46

Double-length addition/subtraction, 38–39

Double-length shifts, 39–40

Doubleword division

by single word, 192–197

from long division, 197–202

signed, 201–202

unsigned, 197–201

Doublewords, definition, 1

doz (difference or zero) function, 41–45

Dubé, Danny, 112

E

ECCs (error-correcting codes)

check bits, 332

code, definition, 343

code length, 331, 343

code rate, 343

code size, 343

coding theory problem, 345–351

efficiency, 343

FEC (binary forward error-correcting block codes), 331

Gilbert-Varshamov bound, 348–350

Hamming bound, 348, 350

Hamming code, 332-342

converting to SEC-DED code, 334–337

extended, 334–337

history of, 335–337

overview, 332–334

SEC-DED on 32 information bits, 337–342

Hamming distance, 95, 343–345

information bits, 332

linear codes, 348–349

overview, 331, 342–343

perfect codes, 333, 349, 352

SEC (single error-correcting) codes, 331

SEC-DED (single error-correcting, double error-detecting) codes

on 32 information bits, 337–342

check bits, minimum required, 335

converting from Hamming code, 334–337

definition, 331

singleton bound, 352

sphere-packing bound, 348, 350

spheres, 347–351

Encryption. See Cryptography.

End-around-carry, 38, 56, 304–305

Error detection, digital data. See CRC (cyclic redundancy check).

Estimating multiplication overflow, 33–34

Euclidean algorithm, 242–245

Euler, Leonhard, 392

Even parity, 96

Exact division

definition, 240

multiplicative inverse, Euclidean algorithm, 242–245

multiplicative inverse, Newton‘s method, 245–247

multiplicative inverse, samples, 247–248

overview, 240–242

Exchanging

conditionally, 47

corresponding register fields, 46

two fields in same register, 47

two registers, 45–46

exclusive or

plots and graphs, 460

propagating arithmetic bounds through, 77–78

scan operation on an array of bits, 97

in three instructions, 17

Execution time model, 9–10

Exercise answers. See Answers to exercises.

Expand operation, 156–157, 159–161

Exponentiation

by binary decomposition, 288–290

in Fortran, 290

Extended Hamming code, 334–342

on 32 information bits, 337-342

Extract, generalized, 150–156

F

Factoring, 178

FEC (binary forward error-correcting block codes), 331

feedback shift register circuit, 325–326

Fermat numbers, 391

FFT (Fast Fourier Transform), 137–139

find leftmost 0-byte, 117–121

find rightmost 0-byte, 118–121

Finding

decimal digits, 122

first 0-byte, 117–121

first uppercase letter, 122

length of character strings, 117

next higher number, same number of 1-bits, 14–15

the nth prime, 391–398, 403

strings of 1-bits

first string of a given length, 123–125

longest string, 125–126

shortest string, 126–128

values within arithmetic bounds, 122

Flipping bits, 135

Floating-point numbers, 375–389

distribution of leading digits, 385–387

formats (single/double), 375–376

gradual underflow, 376

IEEE arithmetic standard, 375

IEEE format, 375–377

NaN (not a number), 375–376

normalized, 375–377

subnormal numbers, 375–377

table of miscellaneous values, 387–389

ulp (unit in the last position), 378

Floating-point operations

approximate cube root, 389

approximate reciprocal square root, 383–385

approximate square root, 389

comparing using integer operations, 381–382

conversion table, 378–381

converting to/from integers, 377–381

counting leading 0‘s with, 104–106

simulating, 107

Floor division, 181–182, 237

Floor function, identities, 183, 202–203

Floyd, R. W., 114

Formula functions, 398–403

Formulas for primes, 391–403

Fortran

IDIM function, 44

integer exponentiation, 290

ISIGN function, 22

MOD function, 182

Fractal triangles, plots and graphs, 460

Full adders, 90

Full RISC instruction set, 7

Fundamental theorem of arithmetic, 404

G

Gardner, Martin, 315

Gaudet, Dean, 110

Gaudet‘s algorithm, 110

generalized extract operation, 150–156

Generalized unshuffle. See SAG (sheep and goats) operation.

Generator polynomials, CRC codes, 321–323

Gilbert-Varshamov bound, 348–350

Golay, M. J. E., 331

Goryavsky, Julius, 103

Gosper, R. W.

iterating through subsets, 14–15

loop-detection, 114–116

Gradual underflow, 376

Graphics-rendering, Hilbert‘s curve, 372–373

Graphs. See Plots and graphs.

Gray, Frank, 315

Gray code

applications, 315–317

balanced, 317

converting integers to, 97, 312–313

cyclic, 312

definition, 311

history of, 315–317

incrementing Gray-coded integers, 313–315

negabinary Gray code, 315

plots and graphs, 466

reflected, 311–312, 315

single track (STGC), 316–317

Greatest common divisor function, plots and graphs, 464

GRP instruction, 165

H

Hacker, definition, xvi

HAKMEM (hacks memo), xiii

Half shuffle, 141

Halfwords, 1

Hamiltonian paths, 315

Hamming, R. W., 331

Hamming bound, 348, 350

Hamming code

on 32 information bits, 337–342

converting to SEC-DED code, 334–337

extended, 334–337

history of, 335–337

overview, 332–334

perfect, 333, 352

Hamming distance, 95, 343–345

triangle inequality, 352

Hardware checksums, 324–326

Harley, Robert, 90, 101

Harley‘s algorithm, 101, 103

Hexadecimal floating-point, 385

High-order half of product, 173–174

Hilbert, David, 355

Hilbert‘s curve. See also Space-filling curves.

applications, 372–373

coordinates from distance

curve generator driver program, 359

description, 358–366

Lam and Shapiro method, 362–364, 368

parallel prefix operation, 365–366

state transition table, 361, 367

description, 355–356

distance from coordinates, 366–368

generating, 356–358

illustrations, 355, 357

incrementing coordinates, 368–371

non-recursive generation, 371

ray tracing, 372

three-dimensional analog, 373

Horner‘s rule, 49

I

IBM

Chipkill technology, 336

Harvest computer, 336

PCs, error checking, 336

PL/I language, 54

Stretch computer, 81, 336

System/360 computer, 385

System/370 computer, 63

IDIM function, 44

IEEE arithmetic standard, 375

IEEE format, floating-point numbers, 375–377

IEEE Standard for Floating-Point Arithmetic, 375

Image processing, Hilbert‘s curve, 372

Incremental division and remainder technique, 232–234

Inequalities, logical and arithmetic expressions, 17–18

Information bits, 332

Inner perfect shuffle function, plots and graphs, 468–469

Inner perfect unshuffle function, plots and graphs, 468

Inner shuffle, 139–141

insert instruction, 155–156

Instruction level parallelism, 9

Instruction set for this book, 5–8

integer cube root function, 287–288, 297

Integer exponentiation, 288–290

integer fourth root function, 297

integer log base 2 function, 106, 291

integer log base 10 function, 292–297

Integer quotient function, plots and graphs, 463

integer remainder function, 463

integer square root function, 279–287

Integers. See also specific operations on integers.

complex, 306–309

converting to/from floating-point, 377–381

converting to/from Gray code, 97, 312–313

reversed, incrementing, 137–139

reversing, 129–137

Inverse Gray code function

formula, 312

plots and graphs, 466

An Investigation of the Laws of Thought, 54

ISIGN (transfer of sign) function, 22

Iterating through subsets, 14–15

ITU-TSS (International Telecommunications Union...), 321

ITU-TSS polynomial, 323

K

Knuth, Donald E., 132

Knuth‘s Algorithm D, 184–188

Knuth‘s Algorithm M, 171–172, 174–175

Knuth‘s mod operator, 181

Kronecker, Leopold, 375

L

Lam and Shapiro method, 362–364, 368

Landry, F., 391

Leading 0‘s, counting, 99–106. See also nlz (number of leading zeros) function.

Leading 0’s, detecting, 324. See also CRC (cyclic redundancy check).

Leading digits, distribution, 385–387

Least common multiple function, plots and graphs, 464

Linear codes, 348–349

Little-endian format, converting to/from big-endian, 129

load word byte-reverse (lwbrx) instruction, 118

Logarithms

binary search method, 292–293

definition, 291

log base 2, 106–107, 291

log base 10, 291–297

table lookup, 292, 294–297

Logical operations

with addition and subtraction, 16–17

and, plots and graphs, 459

binary, table of, 17

exclusive or, plots and graphs, 460

or, plots and graphs, 459

propagating arithmetic bounds through, 74–76, 78

tight bounds, 74–78

Logical operators on integers, plots and graphs, 459–460

Long Division, definition, 189

Loop detection, 114–115

LRU (least recently used) algorithm, 166–169

lwbrx (load word byte-reverse) instruction, 118

M

MacLisp, 55

magic algorithm

incremental division and remainder technique, 232–234

signed division, 220–223

unsigned division, 232–234

Magic numbers

Alverson‘s method, 237–238

calculating, signed, 212–213, 220–223

calculating, unsigned, 232–234

calculating, Python code for

definition, 211

samples, 238–239

table lookup, 237

uniqueness, 224

magicu algorithm, 232–234

in Python, 240

magicu2 algorithm, 236–237

max function, 41–45

Mills, W. H., 403

Mills’s theorem, 403–404

min function, 41–45

MIT PDP-6 Lisp, 55

MOD function (Fortran), 182

modu (unsigned modulus) function, 98

Modulus division, 181–182, 237

Moore, Eliakim Hastings, 371–372

mulhs (multiply high signed) instruction

division with, 207–210, 212, 218, 222, 235

implementing in software, 173–174

not using, 259–262

mulhu (multiply high unsigned) instruction

division with, 228–229, 234–235, 238

implementing in software, 173

not using, 251–259

Multibyte absolute value, 40–41

Multibyte addition/subtraction, 40–41

Multiplication

arithmetic tables, 454

of complex numbers, 178–179

by constants, 175–178

factoring, 178

low-order halves independent of signs, 178

high-order half of 64-bit product, 173–174

high-order product signed from/to unsigned, 174–175

multiword, 171–173

of negabinary numbers, 302

overflow detection, 31–34

plots and graphs, 462

Multiplicative inverse

Euclidean algorithm, 242–245

Newton‘s method, 245–247, 278

samples, 247–248

multiply instruction, condition codes, 36–37

Multiword division, 184–189

Multiword multiplication, 171–173

MUX operation in three instructions, 56

mux (multiplex) instruction, 406

N

NAK (negative acknowledgment), 319

NaN (not a number), 375–376

Negabinary number system, 299–306

Gray code, 315

Negative absolute value, 23–26

Negative overflow, 30

Newton-Raphson calculation, 383

Newton‘s method, 457–458

integer cube root, 287–288

integer square root, 279–283

multiplicative inverse, 245–248

Next higher number, same number of 1-bits, 14–15

Nibbles, 1

nlz (number of leading zeros) function

applications, 79, 107, 128

bitsize function, 106–107

comparison predicates, 23–24, 107

computing, 99–106

for counting trailing 0‘s, 107

finding 0-bytes, 118

finding strings of 1-bits, 123–124

incrementing reversed integers, 138

and integer log base 2 function, 106

rounding to powers of 2, 61

Nonrestoring algorithm, 192–194

Normalized numbers, 376

Notation used in this book, 1–4

nth prime, finding

formula functions, 398–401

Willans‘s formulas, 393–397

Wormell‘s formula, 397–398

ntz (number of trailing zeros) function

applications, 114–116

from counting leading 0‘s, 107

loop detection, 114–115

ruler function, 114

Number systems

base –1 + i, 306–308

base –1 – i, 308–309

base –2, 299–306, 315

most efficient base, 309–310

negabinary, 299–306, 315

O

Odd parity, 96

1-bits, counting. See Counting bits.

plots and graphs, 459

in three instructions, 17

Ordinary arithmetic, 1

Ordinary rational division, 181

Outer perfect shuffle bits function, plots and graphs, 469

Outer perfect shuffle function, plots and graphs, 467

Outer perfect unshuffle function, plots and graphs, 468

Outer shuffle, 139–141, 373

Overflow detection

definition, 28

division, 34–36

estimating multiplication overflow, 33–34

multiplication, 31–34

negative overflow, 30

signed add/subtract, 28–30

unsigned add/subtract, 31

P

Parallel prefix operation

definition, 97

Hilbert‘s curve, 364–366

inverse, 116

parity, 97

Parallel suffix operation

compress operation, 150–155

expand operation, 156–157, 159–161

generalized extract, 150–156

inverse, 116

Parity

adding to 7-bit quantities, 98

applications, 98

computing, 96–98

definition, 96

parallel prefix operation, 97

scan operation, 97

two-dimensional, 352

Parity bits, 319–320

PCs, error checking, 336

Peano, Giuseppe, 355

Peano curves, 371–372. See also Hilbert‘s curve.

Peano-Hilbert curve. See Hilbert‘s curve.

Perfect codes, 333, 349

Perfect shuffle, 139–141, 373

Permutations on bits, 161–165. See also Bit operations.

Planar curves, 355. See also Hilbert‘s curve.

Plots and graphs, 459–469

addition, 461

bit reversal function, 467

compress function, 464–465

division, 463–464

fractal triangles, 460

Gray code function, 466

greatest common divisor function, 464

inner perfect shuffle, 468–469

inner perfect unshuffle, 468

integer quotient function, 463

inverse Gray code function, 466

least common multiple function, 464

logical and function, 459

logical exclusive or function, 460

logical operators on integers, 459–460

logical or function, 459

multiplication, 462

number of trailing zeros, 466

outer perfect shuffle, 467–469

outer perfect unshuffle, 468

population count function, 467

remainder function, 463

rotate left function, 465

ruler function, 466

SAG (sheep and goats) function, 464–465

self-similar triangles, 460

Sierpinski triangle, 460

subtraction, 461

unary functions, 466–469

unsigned product of x and y, 462

Poetry, 278, 287

population count function. See also Counting bits.

applications, 95–96

computing Hamming distance, 95

counting 1-bits, 81

counting leading 0‘s, 101–102

counting trailing 0‘s, 107–114

plots and graphs, 467

Position sensors, 315–317

Powers of 2

boundary crossings, detecting, 63–64

rounding to, 59–62, 64

signed division, 205–206

unsigned division, 227

PPERM instruction, 165

Precision, loss of, 385–386

Prime numbers

Fermat numbers, 391

finding the nth prime

formula functions, 398–403

Willans‘s formulas, 393–397

Wormell‘s formula, 397–398

formulas for, 391–403

from polynomials, 392

Propagating arithmetic bounds

add and subtract instructions, 70–73

logical operations, 73–78

signed numbers, 71–73

through exclusive or, 77–78

PSHUFB (Shuffle Packed Bytes) instruction, 163

PSHUFD (Shuffle Packed Doublewords) instruction, 163

PSHUFW (Shuffle Packed Words) instruction, 163

Q

Quicksort, 81

R

Range analysis, 70

Ray tracing, Hilbert‘s curve, 372

Rearrangements and index transformations, 165–166

Reed-Muller decomposition, 51-53, 56–57

Reference matrix method (LRU), 166–169

Reflected binary Gray code, 311–312, 315

Registers

exchanging, 45–46

exchanging conditionally, 47

exchanging fields of, 46–47

reversing contents of, 129–135

RISC computers, 5

Reiser, John, 113

Reiser‘s algorithm, 113–114

Remainder function, plots and graphs, 463

Remainders

arithmetic tables, 456

of signed division

by multiplication and shifting right, 273–274

by summing digits, 266–268

from non-powers of 2, 207–210

from powers of 2, 206–207

test for zero, 248–251

of unsigned division

by multiplication and shifting right, 268–272

by summing digits, 262–266

and immediate instruction, 227

incremental division and remainder technique, 232–234

test for zero, 248–250

remu function, 119, 135–136

Residual/residue, 324

Restoring algorithm, 192–193

Reversing bits and bytes, 129–137

6-, 7-, 8-, and 9-bit quantities, 135–137

32-bit words, 129–135

big-endian format, converting to little-endian, 129

definition, 129

generalized, 135

load word byte-reverse (lwbrx) instruction, 118

rightmost 16 bits of a word, 130

with rotate shifts, 129–133

small integers, 135–137

table lookup, 134

Riemann hypothesis, 404

Right justify function, 116

Rightmost bits, manipulating, 11–12, 15

De Morgan‘s laws, 12–13

right-to-left computability test, 13–14, 55

Rijndael algorithm, 164

RISC

basic instruction set, 5–6

execution time model, 9–10

extended mnemonics, 6, 8

full instruction set, 7–8

registers, 5–6

Rotate and sum method, 85–86

Rotate left function, plots and graphs, 464–465

Rotate shifts, 37–38, 129–133

Rounding to powers of 2, 59–62, 64

Ruler function, 114, 466

Russian decomposition, 51-53, 56–57

S

SAG (sheep and goats) operation

description, 162–165

plots and graphs, 464–465

Scan operation, 97

Seal, David, 90, 110

Search tree method, 109

Searching. See Finding.

SEC (single error-correcting) codes, 331

SEC-DED (single error-correcting, double error-detecting) codes

on 32 information bits, 337–342

check bits, minimum required, 335

converting from Hamming code, 334–335

definition, 331

Select instruction, 406

Self-reproducing program, xvi

Self-similar triangles, plots and graphs, 460

shift left double operation, 39

shift right double signed operation, 39–40

shift right double unsigned operation, 39

shift right extended immediate (shrxi) instruction, 228–229

shift right signed instruction

alternative to, for sign extension, 19–20

division by power of 2, 205–206

from unsigned, 20

Shift-and-subtract algorithm

hardware, 192–194

integer square root, 285–287

Shifts

double-length, 39–40

rotate, 37–38

Short division, 189–192, 195–196

Shroeppel‘s formula, 305–306

shrxi (shift right extended immediate) instruction, 228–229

Shuffle Packed Bytes (PSHUFB) instruction, 163

Shuffle Packed Doublewords (PSHUFD) instruction, 163

Shuffle Packed Words (PSHUFW) instruction, 163

Shuffling

arrays, 165–166

bits

half shuffle, 141

inner perfect shuffle, plots and graphs, 468–469

inner perfect unshuffle, plots and graphs, 468

inner shuffle, 139–141

outer shuffle, 139–141, 373

perfect shuffle, 139–141

shuffling bits, 139–141, 165–166

unshuffling, 140–141, 150, 162, 165-166

Sierpinski triangle, plots and graphs, 460

Sign extension, 19–20

sign function, 20–21. See also three-valued compare function.

Signed bounds, 78

Signed comparisons, from unsigned, 25

Signed computer division, 181–182

Signed division

arithmetic tables, 455

computer, 181

doubleword, 201–202

long, 189

multiword, 188

short, 190–192

Signed division of integers by constants

best programs for, 225–227

by divisors ≤ –2, 218–220

by divisors ≥ 2, 210–218

by powers of 2, 205–206

incorporating into a compiler, 220–223

remainder from non-powers of 2, 207–210

remainder from powers of 2, 206–207

test for zero remainder, 250–251

uniqueness of magic number, 224

Signed long division, 189

Signed numbers, propagating arithmetic bounds, 71–73

Signed short division, 190–192

signum function, 20–21

Single error-correcting, double error-detecting (SEC-DED) codes. See SEC-DED (single error-correcting, double error-detecting) codes.

Single error-correcting (SEC) codes, 331

snoob function, 14–15

Software checksums, 327–329

Space-filling curves, 371–372. See also Hilbert‘s curve.

Sparse array indexing, 95

Sphere-packing bound, 348–350

Spheres, ECCs (error-correcting codes), 347–350

Square root, integer

binary search, 281–285

hardware algorithm, 285–287

Newton‘s method, 279–283

shift-and-subtract algorithm, 285–287

Square root, approximate, floating-point, 389

Square root, approximate reciprocal, floating-point, 383–385

Stibitz, George, 308

Strachey, Christopher, 130

Stretch computer, 81, 336

Strings. See Bit operations; Character strings.

strlen (string length) C function, 117

Subnormal numbers, 376

Subnorms, 376

subtract instruction

condition codes, 36–37

propagating arithmetic bounds, 70–73

Subtraction

arithmetic tables, 453

difference or zero (doz) function, 41–45

double-length, 38–39

combined with logical operations, 16–17

multibyte, 40–41

of negabinary numbers, 301–302

overflow detection, 29–31

plots and graphs, 461

Swap-and-complement method, 362–365

Swapping pointers, 46

System/360 computer, 385

System/370 computer, 63

T

Table lookup, counting bits, 86–87

three-valued compare function, 21–22. See also sign function.

Tight bounds

add and subtract instructions, 70–73

logical operations, 74–79

Timing test, division of integers by constants, 276

Toggling among values, 48–51

Tower of Hanoi puzzle, 116, 315

Trailing zeros. See also ntz (number of trailing zeros) function.

counting, 107–114

detecting, 324. See also CRC (cyclic redundancy check).

plots and graphs, 466

Transfer of sign (ISIGN) function, 22

Transposing a bit matrix

8 x 8, 141–145

32 x 32, 145–149

Triangles

fractal, 460

plots and graphs, 460

self-similar, 460

Sierpinski, 460

Triple DES, 164

True/false comparison results, 23

Turning off 1-bits, 85

U

Ulp (unit in the last position), 378

Unaligned load, 65

Unary functions, plots and graphs, 466–469

Uniqueness, of magic numbers, 224

Unshuffling

arrays, 162

bits, 140–141, 162, 468

Unsigned division

arithmetic tables, 455

computer, 181

doubleword, 197–201

long, 192–197

short from signed, 189–192

Unsigned division of integers by constants

best programs for, 234–235

by 3 and 7, 227–229

by divisors ≥ 1, 230–232

by powers of 2, 227

incorporating into a compiler, 232–234

incremental division and remainder technique, 232–234

remainders, from powers of 2, 227

test for zero remainder, 248–250

unsigned modulus (modu) function, 84

Unsigned product of x and y, plots and graphs, 462

Uppercase letters, finding, 122

V

Voorhies, Douglas, 373

W

Willans, C. P., 393

Willans‘s formulas, 393–397

Wilson‘s theorem, 393, 403

Word parity. See Parity.

Words

counting bits, 81–87

definition, 1

division

doubleword by single word, 192–197

Knuth‘s Algorithm D, 184–188

multiword, 184–189

signed, multiword, 188

multiplication, multiword, 171–173

reversing, 129–134

searching for

first 0-byte, 117–121

first uppercase letter, 122

strings of 1-bits, 123–128

a value within a range, 122

word parallel operations, 13

Wormell, C. P., 397

Wormell‘s formula, 397–398

Z

zbytel function, 117–121

zbyter function, 117–121

Zero means 2n, 22–23