[BUG] Why text-generation-inference using s3 with fluid is slower than s3 with s3-fuse (first access) #4425

Open
hualongfeng opened this issue Dec 4, 2024 · 1 comment
Labels: bug (Something isn't working)

What is your environment (Kubernetes version, Fluid version, etc.)?

~/ai/opea/chatqna# kubectl get node
NAME               STATUS   ROLES           AGE    VERSION
icelake-server-2   Ready    control-plane   19d    v1.29.9
opea-dev10         Ready    <none>          5d2h   v1.29.9
~/ai/opea/chatqna# kubectl version
Client Version: v1.29.9
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.9
~/ai/opea/chatqna# helm list
NAME 	NAMESPACE	REVISION	UPDATED                                	STATUS  	CHART      	APP VERSION  
fluid	default  	1       	2024-11-15 13:53:02.904415378 +0800 CST	deployed	fluid-1.0.3	1.0.3-ccdf3a9
root@icelake-server-2:~/ai/opea/chatqna# sudo lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.5 LTS
Release:	22.04
Codename:	jammy
root@opea-dev10:~# sudo lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.4 LTS
Release:	22.04
Codename:	jammy

Describe the bug
I built a Ceph cluster for S3 storage on one server, and a K8s node accesses that S3 storage through Fluid and through s3fs-fuse. Timings for the first access:

Phase      s3 w/ Fluid (no dataload), first access   s3 w/ s3-fuse, first access
download   6.105737                                  8.808546
shard      390.42                                    56.09
others     12.60                                     16.15

(All times in seconds.)

First access through Fluid is slower than through s3fs-fuse; the gap is almost entirely in the shard-loading phase (390.42 s vs. 56.09 s).

Note: echo 3 > /proc/sys/vm/drop_caches was run on the K8s node before every test, so each run starts with a cold page cache.
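
For concreteness, a minimal sketch of the per-test procedure implied above (the pod name and YAML file match the reproduction steps below; the phase timings are presumably read from the pod's startup logs):

# on the k8s node, as root:
$ sync && echo 3 > /proc/sys/vm/drop_caches    # start each run with a cold page cache
$ kubectl delete pod text-generation-pod --ignore-not-found
$ kubectl apply -f pod_for_text-generation-inference.yaml
$ kubectl logs -f text-generation-pod          # read the download/shard timings here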

What you expect to happen:

I would expect the two tests to take roughly the same amount of time.

How to reproduce it
Build a Ceph S3 environment (reference: https://github.com/ceph/ceph):

$ git clone https://github.com/ceph/ceph.git
$ cd ceph
$ git submodule update --init --recursive --progress
$ sudo apt install curl
$ sudo ./install-deps.sh
$ sudo apt install python3-routes
$ ./do_cmake.sh
$ cd build
$ ninja -j32
$ cat start_ceph.sh
  MON=1 OSD=4 MDS=0 MGR=1 RGW=1 ../src/vstart.sh -n --bluestore -X \
        -o "osd_pool_default_pg_autoscale_mode=off" \
        -o "osd pool default size = 2" \
        -o "osd_pool_default_min_size = 2" \
        -o "mon_allow_pool_size_one = true" \
        -o "bluestore_block_wal_path = \"\"" \
        -o "bluestore_block_db_path = \"\"" \
        -o "bluestore_bluefs = true" \
        -o "bluestore_block_create = false" \
        -o "bluestore_block_db_create = false" \
        -o "bluestore_block_wal_create = false" \
        -o "bluestore_block_wal_separate = false" \
        -o "osd_op_num_shards = 32" \
        -o "osd_op_num_threads_per_shard = 2" \
        -o "osd_memory_target = 32G" \
        -o "rbd cache = false" \
        -o "ms_async_op_threads = 3" \
        --bluestore-devs /dev/nvme4n1,/dev/nvme5n1,/dev/nvme6n1,/dev/nvme7n1
$ ./start_ceph.sh
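Optionally, confirm the vstart cluster came up before moving on (a sketch; vstart places the client binaries under the build directory):

$ ./bin/ceph -s          # overall cluster health and daemon counts
$ ./bin/ceph osd tree    # all four OSDs should show up/in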
$ sudo apt install s3cmd
$ cat <<EOF | sudo tee ~/.s3cfg
[default]
access_key = 0555b35654ad1656d804
host_base = 192.168.0.62:8000
host_bucket = no.way.in.hell
secret_key = h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q==
use_https = False
EOF
$ s3cmd mb s3://opea-models                            # create the s3 bucket
$ s3cmd sync --follow-symlinks ./* s3://opea-models/   # upload the model data to s3
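A quick sanity check that the model data landed in the bucket (sketch):

$ s3cmd ls s3://opea-models/    # list uploaded objects/prefixes
$ s3cmd du s3://opea-models/    # total size stored in the bucket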
1. Using S3 with Fluid
root@icelake-server-2:~/ai/opea/chatqna# cat fluid_opea_s3_only_read_chat_7b.yaml 
---
apiVersion: v1
kind: Secret
metadata:
  name: s3-secret
  namespace: default
type: Opaque
data:
  AWS_ACCESS_KEY_ID: "MDU1NWIzNTY1NGFkMTY1NmQ4MDQ="  # echo -n '{access_key}' | base64
  AWS_SECRET_ACCESS_KEY: "aDdHaHh1QkxUcmxoVlV5eFNQVUtVVjhyLzJFSTRuZ3FKeEQ3aUJkQllMaHdsdU4zMEphVDNRPT0="   # echo -n '{secret_key}' | base64
---

---
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: opea-models
spec:
  mounts:
    - mountPoint: s3://opea-models/models--Intel--neural-chat-7b-v3-3
      name: models--Intel--neural-chat-7b-v3-3
      options:
        alluxio.underfs.s3.endpoint: http://192.168.0.62:8000
        alluxio.underfs.s3.disable.dns.buckets: "true"
        alluxio.underfs.s3.inherit.acl: "false"
      encryptOptions:
      - name: aws.accessKeyId
        valueFrom:
          secretKeyRef:
            name: s3-secret
            key: AWS_ACCESS_KEY_ID
      - name: aws.secretKey
        valueFrom:
          secretKeyRef:
            name: s3-secret
            key: AWS_SECRET_ACCESS_KEY
  accessModes:
    - ReadOnlyMany
---
---
# runtime
apiVersion: data.fluid.io/v1alpha1
kind: AlluxioRuntime
metadata:
  name: opea-models
spec:
  replicas: 1
  tieredstore:
    levels:
      - mediumtype: MEM
        path: /dev/shm
        quota: 50Gi
        high: "0.95"
        low: "0.7"
---
root@icelake-server-2:~/ai/opea/chatqna# cat pod_for_text-generation-inference.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: text-generation-pod
spec:
  containers:
    - name: text-generation-container
      image: ghcr.io/huggingface/text-generation-inference:2.2.0
      args: ["--model-id", "Intel/neural-chat-7b-v3-3"]
      ports:
        - name: http
          containerPort: 8080
          protocol: TCP
      resources:
        limits:
          nvidia.com/gpu: 4
      volumeMounts:
        - mountPath: /data
          name: model-volume
      securityContext:
        capabilities:
          add: ["SYS_ADMIN"]
  volumes:
    - name: model-volume
      persistentVolumeClaim:
        claimName: opea-models
  restartPolicy: Never

root@icelake-server-2:~/ai/opea/chatqna#  kubectl apply -f fluid_opea_s3_only_read_chat_7b.yaml
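Before starting the pod, it is worth waiting until Fluid reports the dataset as Bound and has created the matching PVC (a sketch; the resource names come from the YAML above):

root@icelake-server-2:~/ai/opea/chatqna# kubectl get dataset opea-models   # PHASE should be Bound
root@icelake-server-2:~/ai/opea/chatqna# kubectl get pvc opea-models       # the claim the pod mounts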
root@icelake-server-2:~/ai/opea/chatqna# kubectl apply -f pod_for_text-generation-inference.yaml 
root@icelake-server-2:~/ai/opea/chatqna# kubectl logs text-generation-pod
2. Using S3 with s3fs-fuse (reference: https://github.com/s3fs-fuse/s3fs-fuse)
root@icelake-server-2:~/ai/opea/chatqna# echo ACCESS_KEY_ID:SECRET_ACCESS_KEY > ${HOME}/.passwd-s3fs
root@icelake-server-2:~/ai/opea/chatqna# chmod 600 ${HOME}/.passwd-s3fs
root@icelake-server-2:~/ai/opea/chatqna# s3fs opea-models-no-blobs ./s3-mount -o passwd_file=~/.passwd-s3fs -o url=http://192.168.0.62:8000 -o no_check_certificate -o nonempty -o use_path_request_style
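Before applying the pod spec, verify the FUSE mount (a sketch; note that the pod below reads the model via hostPath /mnt/s3-mount, so the bucket must be mounted at that path on the node running the pod):

root@icelake-server-2:~/ai/opea/chatqna# mount | grep s3fs    # confirm the s3fs mount is active
root@icelake-server-2:~/ai/opea/chatqna# ls ./s3-mount        # model files should be visible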
root@icelake-server-2:~/ai/opea/chatqna# cat pod_for_text-generation-inference.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: text-generation-pod
spec:
  containers:
    - name: text-generation-container
      image: ghcr.io/huggingface/text-generation-inference:2.2.0
      args: ["--model-id", "Intel/neural-chat-7b-v3-3"]
      ports:
        - name: http
          containerPort: 8080
          protocol: TCP
      resources:
        limits:
          nvidia.com/gpu: 4
      volumeMounts:
        - mountPath: /data
          name: model-volume
      securityContext:
        capabilities:
          add: ["SYS_ADMIN"]
  volumes:
    - name: model-volume
      hostPath:
        path: /mnt/s3-mount
        type: Directory
  restartPolicy: Never
root@icelake-server-2:~/ai/opea/chatqna# kubectl apply -f pod_for_text-generation-inference.yaml 
root@icelake-server-2:~/ai/opea/chatqna# kubectl logs text-generation-pod
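
The per-phase timings in the table above were presumably extracted from these startup logs; a hypothetical filter for the relevant lines:

$ kubectl logs text-generation-pod | grep -iE 'download|shard|ready'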

Additional Information

hualongfeng added the bug label on Dec 4, 2024
cheyang (Collaborator) commented on Dec 17, 2024:

I suspect it has something to do with your hardware configuration. If possible, please drop me an email; my address is [email protected].
