MIT Distributed Systems course 6.824
Lecture 1 Introduction
1.0 why we need distributed systems
- parallelism
- higher performance
- fault tolerance
- etc.
1.1 usage
- infrastructure
- storage
- communication
Lecture 2 RPC and multiple threads
1.0 Go
- Go makes it more convenient than C++ to allocate and reclaim memory (it is garbage collected)
1.1 thread
- a useful tool to control parallelism
- when we have to write shared data, we can protect it with a lock
- be careful with shared data
Lecture 3 GFS
1.0 why we need the Google File System
- big storage
- fault tolerance
- replication
1.1 Master
- holds the metadata tables
- file name -> array of chunk ids
- chunk id -> list of chunkservers
- log on disk
- checkpoint on disk
- when we need to read
- send the file name to the master
- the master gives us the chunk id
- the copy of the data is returned
- when we need to write
- ask the master where the file is: send the file name to the master
- does the chunk already have a primary?
- if we cannot find the 17th chunk (maybe because of a power failure), we return none
- the primary raises a lot of problems