Setting up your own distributed system

2 minute read

Published:

Use virtual machines to setup your own distributed system.

Settings:

  • Host OS: MAC OS
  • Virtual machine OS(guest OS): ubuntu
  • Virtualization tool: Virtualbox, it seems to be the most suitable one for MAC, free and easy to use, if your host is linux or windows, you can have other choices, such as VMware.

Steps:

  • Download and install Virtualbox from virtualbox
  • Create several VMs with the default options. (I have master, worker1, worker2)
  • Download ubuntu iso for VMs, ubuntu iso can be found in ubuntu
  • Start the VMs and install ubuntu on each VM
  • apt install on VMs (when you need to access outside network use NAT network option in settings->network, otherwise host-only adapter is good for internal communication between VMs and VM-host.)
    • sudo apt install golang-go
    • sudo apt install nfs-common
    • sudo apt install ssh
    • sudo apt install nfs-kernel-server
  • Create a new disk on master
    • power off master vm
    • settings->storage->+->create a sata control disk
    • power on master vm
    • config the disk
      • Run
        • sudo fdisk /dev/sdb
        • Then type o and press enter # creates a new table
        • Then type n and press enter # creates a new partition
        • Then type p and press enter # makes a primary partition.
        • Then type 1 and press enter # creates it as the 1st partition
        • (use default setting in this step)
        • Finally type w # this will write any changes to disk.
        • Okay now you have a partition, now you need a filesystem.
      • Run
        • sudo mkfs.ext4 /dev/sdb1
      • add below line into /etc/fstab
        • /dev/sdb1 /home/yourname/GO ext4 defaults 0 1
      • create a floder /home/yourname/GO on your master
        • mkdir /home/yourname/GO
    • restart master
  • share the folder /home/yourname/GO on master
  • add below line in /etc/exports
    • /home/yourname/GO ipaddress of your workers(rw)
  • service nfs-kernel-server restart
  • mount the folder on master to the workers, on workers, execute
    • mkdir /home/yourname/GO
    • mount -t nfs ipaddressOfmaster:/home/yourname/GO /home/yourname/GO
  • set the GOPATH on all vms, you can add the line below to ~/.bashrc
    • export GOPATH=/home/yourname/GO
  • open a new terminal
  • Put your mapreduce code on to one of your vm’s /home/yourname/GO/src as well as the kjv12.txt file, organize them in below structure will make things easier
    • /home/yourname/GO/src
      • kjv12.txt
      • wc.go
    • /home/yourname/GO/src/mapreduce
      • common.go
      • worker.go
      • master.go
      • mapreduce.go
  • Change the code of you mapreduce using tcp rpc instead of unix one.
  • On master
    • go run wc.go master kjv12.txt ipaddressOfmaster:7777
  • On workers
    • go run wc.go worker ipaddressOfmaster:7777 ipaddressOfthisWorker:portyoulike