Chapter 2 Computing Infrastructures for Big Data Statistics
Motivation
- Processor clock speeds have saturated at about 3 to 4 GHz.
- Computationally intensive models (e.g. iterative algorithms)
- Large datasets
So, the future is parallel.
Note: The explanation and raw data for this figure can be found on Karl Rupp's webpage.
Multicore computing (shared memory)
- Computer architecture
- Multithreading
- Most programs running on multicore systems are threaded.
- A key point is that the threads share memory, making it easy for them to cooperate.
- Writing threaded code directly is messy, so higher-level systems such as OpenMP and Intel's Threading Building Blocks (TBB) have been developed to hide the low-level details from the programmer.
- Multiprocessing
- Shared memory parallel computing
- OpenMP
- Intel’s Threading Building Blocks (TBB) system
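The point that threads share memory can be illustrated with a minimal sketch. It uses Python's standard `threading` module rather than OpenMP or TBB, and the function names (`partial_sum`, `threaded_sum_of_squares`) and the `totals` dictionary are hypothetical, chosen only for illustration. Every thread reads the same list and adds its partial result to one shared accumulator, guarded by a lock.

```python
import threading


def partial_sum(data, start, stop, totals, lock):
    # Every thread indexes into the same list object; nothing is copied.
    s = sum(data[i] * data[i] for i in range(start, stop))
    # The accumulator is shared too, so updates must be serialized.
    with lock:
        totals["sum_sq"] += s


def threaded_sum_of_squares(data, n_threads=4):
    # Hypothetical helper: split the index range into n_threads chunks.
    totals = {"sum_sq": 0}
    lock = threading.Lock()
    chunk = (len(data) + n_threads - 1) // n_threads  # ceiling division
    threads = []
    for k in range(n_threads):
        start = min(k * chunk, len(data))
        stop = min(start + chunk, len(data))
        t = threading.Thread(
            target=partial_sum, args=(data, start, stop, totals, lock)
        )
        threads.append(t)
        t.start()
    # Wait for all workers before reading the shared result.
    for t in threads:
        t.join()
    return totals["sum_sq"]
```

In standard CPython builds the global interpreter lock keeps pure-Python threads from running CPU-bound code in parallel, so this sketch shows the shared-memory programming model rather than a speedup. The lock is the kind of detail that higher-level systems aim to hide.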
Cluster computing (distributed memory)
- Cluster architecture
- Distributed (memory) computing
- Distributed data storage
- Cloud computing
- Cloud architecture (virtualization)
GPU computing
- Reconfigurable computing with FPGA
- Vector processors