Contents

ITheoretical Advancements

1Introduction

1.1Data Structure

1.2Design of Data Structure

1.3Analysis of Data Structure

1.4Amortized Complexity

1.5Computational Models

1.5.1RAM model

1.5.2Word RAM model

1.5.3Cell-probe model of computation

1.6Bounds of Fundamental Data Structures

1.7Lazy Delete

1.8Organization of Part I

1.9Exercises

2O(1) Search by Hashing

2.1Basic Hashing

2.1.1Hash function

2.1.2Load factor

2.1.3Collision resolution

2.2Perfect Hashing

2.2.1Construction

2.2.2Remarks

2.3Universal Hashing

2.3.1Important properties

2.3.2Mathematical guarantees

2.4Cuckoo Hashing

2.4.1Operations

2.4.2Bipartite graph of cuckoo hashing

2.5Bloom Filters

2.5.1Construction of bloom filter

2.5.2Probability of false positives

2.5.3Optimal values of parameters

2.6Locality-Sensitive Hashing

2.6.1Use in nearest neighbor search problem

2.7Exercises

3O(log(n)) Ordered Search (Trees and Lists)

3.1Balanced Binary Search Trees (BSTs)

3.1.1Height bound of balanced BST

3.2Randomized BSTs

3.2.1Static randomized BSTs

3.2.2Dynamic randomized BSTs

3.2.3Analysis of randomized BSTs

3.3Splay Tree

3.3.1Splaying

3.3.2Splaying algorithms

3.3.3Performance

3.4Tango Tree

3.4.1Creation of tango tree

3.4.2Tango analysis

3.5Skiplists

3.5.1Skipping

3.5.2Dynamic updates

3.5.3Probabilistic analysis of skiplist

3.6Static and Dynamic Optimality

3.6.1Search optimality in BST

3.6.2Static optimality

3.6.3Dynamic optimality

3.7Exercises

4Findset, Find Min, and Find Word

4.1Disjoint Sets

4.1.1Operations on disjoint-set data structure

4.1.2Representations of disjoint sets

4.1.3Link-list representations of disjoint sets

4.1.4Forest representations of disjoint sets

4.2Binomial Heap

4.2.1Creation and updates of binomial heap

4.2.2Operations of Binomial Heap

4.2.3Complexity

4.3Fibonacci Heaps

4.3.1Properties of a Fibonacci heap

4.3.2Inserting, merging, cutting, and marking

4.3.3Decreasing keys and delete-min operation

4.3.4Algorithm for Fibonacci heaps

4.3.5Amortized analysis for Fibonacci heaps

4.3.6Tree size

4.4Tries

4.4.1Insertion

4.4.2Searching

4.4.3Deletion

4.4.4Complexity

4.4.5Compact trie

4.4.6Patricia

4.4.7Suffix tree

4.5Inverted Index

4.5.1Inverted index creation

4.5.2Index compression

4.5.3Key words search

4.6Exercises

IIEvolving Paradigms

5Evolving Paradigms of Data Structures

5.1Geometric Queries

5.2I/O Complexities

5.3Communication Complexities

5.4Large Data Problem

5.5Exercise

6Spatial Data Structures

6.1Range Search Trees

6.1.1Construction

6.1.2Range query search

6.2KD Trees

6.2.1Creation of KD tree

6.2.2Range search in KD tree

6.2.3Nearest neighbor search in KD tree

6.3Quadtree

6.3.1Inserting data into a quadtree

6.3.2Properties of quadtree

6.3.3Region quadtree

6.3.4Point quadtree

6.4R Tree

6.4.1Indexing structure of R tree

6.4.2Search in R tree

6.4.3Dynamic update of R tree

6.5Exercises

7Temporal Data Structures

7.1Partial Persistence

7.1.1Partial persistence

7.1.2Full persistence

7.1.3Confluent persistence

7.1.4Functional persistence

7.2Retroactivity

7.2.1Decomposable search problem

7.3Exercises

8External Memory Data Structures

8.1Input/Output (I/O) Model

8.2Cache Oblivious Algorithms

8.2.1Cache aware model

8.2.2Cache oblivious model

8.3B, B+ Tree

8.3.1Searching

8.3.2Insertion

8.3.3Removal

8.3.4Amortized analysis of B trees

8.3.5B+ tree

8.4(a,b) Tree

8.4.1Insertion

8.4.2Deletion

8.5Buffer Tree

8.6Exercises

9Distributed Data Structures (DDSs)

9.1Descriptions of Structures

9.1.1Properties of DDS

9.2Distributed Hashing

9.2.1Structure of distributed hashing

9.3Distributed Trees

9.3.1Construction of distributed BST

9.3.2Insertion

9.3.3Deletion

9.3.4Rotation

9.4Skip Graphs

9.4.1Design

9.4.2Search

9.4.3Insertion

9.4.4Deletion

9.4.5Correctness and concurrency

9.5Exercises

10Synopsis Data Structures

10.1Data Synopsis

10.1.1Synopsis methods

10.1.2Application

10.2Sampling

10.2.1Sampling technique

10.2.2Reservoir sampling

10.2.3Sampling with updates

10.2.4Sliding window sampling

10.3Sketching

10.3.1Count-min sketches

10.4Fingerprint

10.4.1Fingerprinting scheme of Rabin

10.5Wavelets

10.5.1Wavelet decomposition

10.6Exercises

IIIRecent Applications

11Introduction to Applications

11.1Various Domain Applications

11.2Project

12Applications to Cryptography

12.1MD5

12.1.1Password hashing

12.2Secure Socket Layers (SSLs)

12.2.1Data structure of open SSL

12.3Block Chains

12.4Digital Signature

12.5Projects

13Application to IR and WWW

13.1Crawl Frontier

13.2Posting List Intersection

13.3Text Retrieval from Inverted Index

13.4Auto Complete Using Tries

13.5Projects

14Applications to Data Science

14.1Heavy Hitters and Count-Min Structures

14.2Approximate Nearest Neighbor Searches

14.2.1Approximate nearest neighbor

14.2.2Locality-sensitive hashing (LSH)

14.3Low Rank Approximation by Sampling

14.3.1Nystrom approximation

14.3.2Random sketching

14.4Near-Duplicate Detection by Min Hashing

14.5Projects

15Application to Network and IOT

15.1Click-Stream Processing Using Bloom Filters

15.1.1GBF Algorithm

15.2Fast IP-Address Lookup Using Tries

15.3Integrity Verification: Cloud and IOT Data

15.4Projects

16Applications to Systems

16.1Queue Spilling

16.2Completely Fair Schedulers in Kernels

16.2.1CFS internals

16.3Distributed Caching

16.4Data Structures for Building File Systems

16.5Projects

17Applications to Databases

17.1Database Problems

17.1.1Searching sorted files

17.1.2Index for first search

17.1.3Insertion deletion in database

17.2B and B+ Trees for Database Creation and Block Search

17.2.1Applications of B trees in databases and file systems

17.3CouchDB

17.4Bloomjoins

17.5Projects

18Applications to Images and Graphics

18.1R Trees for Map Searches

18.1.1R trees for mapping

18.1.2Insertion

18.1.3Deletion

18.1.4Search

18.2Spatial Proximity in GIS

18.2.1GIS objects

18.2.2Data access in GIS

18.2.3Computational requirements

18.2.4Solution using k-d tree

18.3Ray Shooting

18.3.1Rays

18.3.2Camera-ray intersections

18.3.3Shadow rays

18.3.4Reflection rays

18.3.5Transmission rays

18.3.6Recursive ray tracing

18.3.7Ray intersection

18.3.8Bounding volume hierarchies

18.4Data Structures Used in Ray Shooting

18.4.1Octrees

18.4.2KD trees

18.4.3BSP trees

18.4.4Uniform grids

18.4.5Hierarchical grids

18.5Projects

Bibliography

Index