Our paper "Boosting Data Center Performance via Intelligently Managed Multi-backend Disaggregated Memory" has been accepted at SC'24.
This paper proposes xDM, a multi-backend disaggregated memory system that can manage multiple far memory paths with high performance.
code: Source code
code/drivers: Source code of RDMA and DRAM backend drivers.
code/eval: Source code of different workloads.
code/farmemserver: Source code of RDMA server.
code/kernel: Source code of fastswap kernel.
code/log_process: Source code for log processing.
code/scripts: Scripts used during installation.
document: Documents describing how to configure our system.
1) xDM RDMA server: a server with at least 64 GB of memory (128 GB recommended), Ubuntu 20.04.6 (recommended), an MT27800 Family [ConnectX-5] NIC (recommended), and MLNX_OFED_LINUX-5.8-4.1.5.0 installed (available from Linux InfiniBand Drivers (nvidia.com); the driver should match the Linux distribution and the RDMA NIC version).
2) xDM client: a server with at least 64 GB of memory (128 GB recommended), Ubuntu 20.04.6 (recommended), an MT27800 Family [ConnectX-5] NIC (recommended), and MLNX_OFED_LINUX-5.8-4.1.5.0 installed (available from Linux InfiniBand Drivers (nvidia.com); the driver should match the Linux distribution and the RDMA NIC version). The client also requires qemu-kvm.
Install qemu-kvm. We recommend installing virt-manager to manage VMs.
sudo apt install qemu-system qemu-utils virt-manager libvirt-clients libvirt-daemon-system -y
Install a VM with virt-manager. The next steps are performed inside the VMs.
Here is the recommended configuration of the VMs.
- CPU cores: 16
- CPU model: (host-model, enable AVX)
- RAM: >=64G
- storage: >=300G
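The recommended settings above can be checked from inside a VM. The following is a minimal sketch, not part of the xDM repository: the function name `check_vm_config` and the 5% slack on RAM are assumptions (a guest configured with 64G usually reports slightly less in `MemTotal` because the kernel reserves some memory).

```python
import re


def check_vm_config(cpuinfo: str, meminfo: str,
                    min_cores: int = 16, min_ram_gib: int = 64) -> dict:
    """Compare /proc/cpuinfo and /proc/meminfo text with the recommended VM settings.

    Hypothetical helper for illustration; it is not shipped with xDM.
    """
    # Every logical CPU has one "processor : N" entry in /proc/cpuinfo.
    cores = len(re.findall(r"^processor\s*:", cpuinfo, flags=re.MULTILINE))

    # The "flags" line lists CPU features; "avx" is present when AVX is
    # exposed to the guest (for example with the host-model CPU model).
    flags_match = re.search(r"^flags\s*:\s*(.*)$", cpuinfo, flags=re.MULTILINE)
    flags = flags_match.group(1).split() if flags_match else []

    # MemTotal is reported in kB; convert to GiB.
    mem_match = re.search(r"^MemTotal:\s*(\d+)\s*kB", meminfo, flags=re.MULTILINE)
    ram_gib = int(mem_match.group(1)) / (1024 ** 2) if mem_match else 0.0

    return {
        "cores_ok": cores >= min_cores,
        "avx_ok": "avx" in flags,
        # Assumption: allow 5% slack for memory reserved by the kernel.
        "ram_ok": ram_gib >= min_ram_gib * 0.95,
    }


def check_this_machine() -> dict:
    """Read the two /proc files of the current (Linux) machine and check them."""
    with open("/proc/cpuinfo") as cpu_file, open("/proc/meminfo") as mem_file:
        return check_vm_config(cpu_file.read(), mem_file.read())
```

Calling `check_this_machine()` inside the VM returns a dictionary of three booleans; any `False` entry points at a setting to revisit in virt-manager. Storage size is not checked here.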
2) Compile and install the data swap kernel in each VM on the client node (only the DRAM and RDMA backends need this step)
xDM is built on the modified kernel and the drivers from clusterfarmem/fastswap. We also use some of the workloads from clusterfarmem/cfm.
Clone the repo:
cd ~
git clone https://github.com/linqinluli/Multi-backend-DM.git
First you need a copy of the source for kernel 4.11 with SHA a351e9b9fc24e982ec2f0e76379a49826036da12. We outline the high-level steps here.
cd ~
wget https://github.com/torvalds/linux/archive/a351e9b9fc24e982ec2f0e76379a49826036da12.zip
mv a351e9b9fc24e982ec2f0e76379a49826036da12.zip linux-4.11.zip
unzip linux-4.11.zip
cd linux-4.11
git init .
git add .
git commit -m "first commit"
Now apply the provided patch against your copy of linux-4.11, and use the generic Ubuntu config file for kernel 4.11. You can download the config file from the internet, or use the one we provide.
git apply ~/Multi-backend-DM/code/kernel/kernel.patch
cp ~/fastswap/kernel/config-4.11.0-041100-generic ~/linux-4.11/.config
Make sure you have the necessary prerequisites to compile the kernel, then compile it:
sudo apt-get install git build-essential kernel-package fakeroot libncurses5-dev libssl-dev ccache bison flex
make -j `getconf _NPROCESSORS_ONLN` deb-pkg LOCALVERSION=-fastswap
Once the build is done, the deb packages will be one directory above. You can simply install them all:
cd ..
sudo dpkg -i *.deb
The fastswap kernel is now installed in the VM. To use the RDMA or DRAM backend, boot the system with the modified 4.11 fastswap kernel.
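After rebooting, it helps to confirm that the VM is actually running the modified kernel before loading any driver. The sketch below is a hypothetical helper, not part of the repository. It assumes the release string carries the `-fastswap` suffix passed as `LOCALVERSION` in the build command above; if your build used a different suffix, pass it explicitly.

```python
import platform


def is_fastswap_kernel(release: str, suffix: str = "-fastswap") -> bool:
    """Return True if the release string looks like the modified 4.11 kernel.

    Assumption: the kernel was built with LOCALVERSION=<suffix>, so a typical
    release string would be "4.11.0-fastswap".
    """
    return release.startswith("4.11") and release.endswith(suffix)


def running_fastswap_kernel() -> bool:
    """Check the kernel that the current machine booted with."""
    # platform.release() returns the same string as `uname -r`.
    return is_fastswap_kernel(platform.release())
```

If `running_fastswap_kernel()` returns `False`, select the 4.11 entry in the GRUB menu at boot and check again.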
Refer to the document "configure rdma in kvm VM". In this step, make sure the OFED driver installed in the VM is version 4.3. If the official version 4.3 driver is no longer available, we provide a Google Cloud Drive download link.
DRAM backend:
Use the DRAM backend in the xDM client (in the VM):
cd ~/Multi-backend-DM/code/drivers
make BACKEND=DRAM
RDMA backend:
Use the RDMA backend in the xDM client (in the VM):
cd ~/Multi-backend-DM/code/drivers
make BACKEND=RDMA
In the xDM server:
cd ~
git clone https://github.com/linqinluli/Multi-backend-DM.git
cd ~/Multi-backend-DM/code/farmemserver
make
xDM supports three types of swap backend: SSD (or disk), DRAM, and RDMA. After completing the steps above, you can configure a backend with the scripts we provide. Before configuring a backend, make sure 32 GB of swap space is set up.
free -g | grep -i swap
# Swap: 32 0 32
SSD backend (works with the unmodified Linux kernel):
cd ~/Multi-backend-DM/code/scripts/
sudo chmod +x backendswitch.sh
./backendswitch.sh ssd $path_mount_on_ssd
DRAM backend (requires the modified Linux kernel):
cd ~/Multi-backend-DM/code/scripts/
sudo chmod +x backendswitch.sh
./backendswitch.sh dram
RDMA backend (requires the modified Linux kernel):
To run the far memory server built earlier (on the xDM RDMA server):
./rmserver $port $far_memory_size $cpu_num_in_rdma_client
Configure the RDMA backend in the xDM client:
cd ~/Multi-backend-DM/code/scripts/
sudo chmod +x backendswitch.sh
./backendswitch.sh rdma $rdma_server_ip $rdma_server_port $rdma_client_ip
When using code/scripts/backendswitch.sh, we strongly suggest running the SSD backend on the unmodified kernel, because running it on the modified kernel may crash the system. We will fix this problem in the next version.
Configuring a new backend or switching to another backend can both be done with this script. Just follow the steps in Backend configuration.
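When backend switching is driven from an experiment script, it can be convenient to assemble the `backendswitch.sh` arguments in one place. The sketch below only mirrors the three invocations shown above; the function name `backend_switch_args` and its parameter names are hypothetical, and it does not inspect what the script does internally.

```python
from typing import List, Optional


def backend_switch_args(backend: str,
                        ssd_mount_path: Optional[str] = None,
                        rdma_server_ip: Optional[str] = None,
                        rdma_server_port: Optional[int] = None,
                        rdma_client_ip: Optional[str] = None) -> List[str]:
    """Build the argument list for backendswitch.sh (hypothetical helper)."""
    script = "./backendswitch.sh"
    if backend == "ssd":
        # ./backendswitch.sh ssd $path_mount_on_ssd
        if not ssd_mount_path:
            raise ValueError("the ssd backend needs a path mounted on the SSD")
        return [script, "ssd", ssd_mount_path]
    if backend == "dram":
        # ./backendswitch.sh dram
        return [script, "dram"]
    if backend == "rdma":
        # ./backendswitch.sh rdma $rdma_server_ip $rdma_server_port $rdma_client_ip
        if rdma_server_ip is None or rdma_server_port is None or rdma_client_ip is None:
            raise ValueError("the rdma backend needs server IP, server port and client IP")
        return [script, "rdma", rdma_server_ip, str(rdma_server_port), rdma_client_ip]
    raise ValueError(f"unknown backend: {backend!r}")
```

The returned list can be passed to `subprocess.run(args, cwd=scripts_dir, check=True)`, where `scripts_dir` is the expanded path of `~/Multi-backend-DM/code/scripts`.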
Turn on THP:
sudo sh -c "echo always > /sys/kernel/mm/transparent_hugepage/enabled"
Turn off THP:
sudo sh -c "echo never > /sys/kernel/mm/transparent_hugepage/enabled"
The number of CPUs can only be configured through KVM. You need to shut down the VM and start it again for the change to take effect.
# modify VM configuration
sudo virsh edit CacheExp
# query the number of CPUs
cat /proc/cpuinfo | grep "physical id" | sort | uniq | wc -l
Here is an example of how to evaluate chatglm with a 0.5 local memory ratio.
cd ~/Multi-backend-DM/code/eval
python3 benchmark chatglm 0.5
Here is an example of how to configure NUMA node assignment.
numactl --cpunodebind=0 --membind=0 ./test
numactl -C 0-1 ./test
List running VMs:
sudo virsh list
Shut down the VM CacheExp:
sudo virsh shutdown CacheExp
Force shutdown of the VM CacheExp:
sudo virsh destroy CacheExp
Start the VM CacheExp:
sudo virsh start CacheExp
Edit the configuration of the VM CacheExp:
sudo virsh edit CacheExp
Here are the workloads we currently support.
For the configuration of some workloads, refer to CFM: quicksort, linpack, stream, pagerank, kmeans, inception, resnet
GridGraph: refer to thu-pacman/GridGraph: Out-of-core graph processing on a single machine
Ligra: refer to jshun/ligra: Ligra: A Lightweight Graph Processing Framework for Shared Memory
chatglm:
model: refer to THUDM/chatglm2-6b · Hugging Face
chatglm-int4:
model: refer to THUDM/chatglm2-6b-int4 · Hugging Face
clip:
model: refer to openai/clip-vit-large-patch14 · Hugging Face
data: refer to CIFAR-10 and CIFAR-100 datasets
text-classify:
model: refer to GitHub - gaussic/text-classification-cnn-rnn: CNN-RNN Chinese text classification, based on TensorFlow
data: refer to http://thuctc.thunlp.org/
bet-uncased:
model: refer to https://huggingface.co/bert-base-uncased
The way to disable the cgroup v1 memory controller differs across kernels and LSB versions. We show the steps we used on our system.
- Open /boot/grub/grub.cfg in your editor of choice
- Find the menuentry for the fastswap kernel
- Add cgroup_no_v1=memory to the end of the line beginning in linux /boot/vmlinuz-4.11.0-sswap
- Save and exit the file
- Run: sudo update-grub
- Reboot
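After the reboot, the kernel command line in `/proc/cmdline` shows whether the parameter took effect. The following is a minimal sketch with a hypothetical function name; it accepts the comma-separated controller list that `cgroup_no_v1=` allows, and treats the special value `all` as covering the memory controller.

```python
def cgroup_v1_memory_disabled(cmdline: str) -> bool:
    """Return True if cgroup_no_v1 on the kernel command line covers 'memory'."""
    for token in cmdline.split():
        if token.startswith("cgroup_no_v1="):
            # The value is a comma-separated list of controller names, or "all".
            controllers = token.split("=", 1)[1].split(",")
            if "memory" in controllers or "all" in controllers:
                return True
    return False


def check_this_boot() -> bool:
    """Read the command line of the running kernel and check it."""
    with open("/proc/cmdline") as cmdline_file:
        return cgroup_v1_memory_disabled(cmdline_file.read())
```

If `check_this_boot()` returns `False`, the parameter was probably added to a different menu entry than the one that was booted.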
The framework and scripts rely on the cgroup system to be mounted at /cgroup2. Perform the following actions:
- Run sudo mkdir /cgroup2 to create the root mount point
- Execute code/scripts/init_bench_cgroups.sh
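Because the framework expects the cgroup system at /cgroup2, a quick look at `/proc/mounts` can confirm the setup before starting a benchmark. This sketch uses a hypothetical function name and assumes that a correct setup shows up as a filesystem of type `cgroup2` mounted at that path; what `init_bench_cgroups.sh` does beyond that is not checked.

```python
def cgroup2_mounted_at(mounts: str, mount_point: str = "/cgroup2") -> bool:
    """Return True if the /proc/mounts text lists a cgroup2 filesystem at mount_point."""
    for line in mounts.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) >= 3 and fields[1] == mount_point and fields[2] == "cgroup2":
            return True
    return False


def check_this_machine_mounts() -> bool:
    """Read /proc/mounts on the current (Linux) machine and check it."""
    with open("/proc/mounts") as mounts_file:
        return cgroup2_mounted_at(mounts_file.read())
```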
Here is an example of how to evaluate chatglm with a 0.5 local memory ratio.
cd ~/Multi-backend-DM/code/eval
python3 benchmark chatglm 0.5
Make sure the workloads are installed in the code/eval path. Here is a script to quickly install the workloads we provide in the repo.
chmod +x ~/Multi-backend-DM/code/scripts/install_workloads.sh
sh ~/Multi-backend-DM/code/scripts/install_workloads.sh
Then evaluate multiple workloads:
chmod +x ~/Multi-backend-DM/code/scripts/eval_workloads.sh
sh ~/Multi-backend-DM/code/scripts/eval_workloads.sh $log_file_name
Use the script in code/log_process to process the log file.
python ~/Multi-backend-DM/code/log_process/log_process.py $log_file_path