Iflearner-operator is the controller for the Kubernetes IflearnerJob CRD.
As you know, horizontal federated learning has two roles: party and server. Iflearner-operator creates different Kubernetes objects depending on the role, including ingress, service, and pod. The relationship is as follows:
The parties and the server communicate over gRPC secured with SSL. All traffic goes to the server ingress, which routes it to different services based on the subdomain name. The aggregator behind each service then handles the traffic with its business logic.
The parties can also transfer data among themselves by the same means. This is optional: communication between parties can be left out if it is not required.
You need a Kubernetes cluster to run iflearner-operator. It relies on ingress-nginx to implement ingress, so you need to install ingress-nginx before starting.
You can follow the Official Installation Guide to install ingress-nginx.
You can also install ingress-nginx as follows:
kubectl create -f ingress-nginx/deploy.yaml
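The commands in this document call kubectl and openssl, so it can help to confirm that both are on your PATH before continuing. The following is a minimal sketch of such a check; the function name missing_tools is hypothetical and is not part of iflearner-operator.

```shell
#!/bin/sh
# Report which of the given commands are not available.
# Prints one missing command per line; returns 0 only if all are present.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example: check the CLI tools used by the commands in this document.
missing_tools kubectl openssl || echo "install the tools listed above first" >&2
```

An empty output means every tool was found.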
You can install the CRD as follows:
bin/kustomize build config/crd | kubectl apply -f -
You can install the controller as follows:
cd config/manager && ../../bin/kustomize edit set image controller=ghcr.io/iflytek/iflearner-operator:0.2.0
cd ../.. && bin/kustomize build config/default | kubectl apply -f -
We use domain names to connect to the ingress, so you need to configure your Kubernetes DNS. If you are using the CoreDNS component, you can configure it as follows:
First, open the CoreDNS ConfigMap in edit mode.
kubectl -n kube-system edit configmap/coredns
Then add the template configurations. The server uses the domain name server.iflearner.com and the party uses the domain name *.party.iflearner.com.
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        template IN A server.iflearner.com {
            match .*\.server\.iflearner\.com
            answer "{{ .Name }} 60 IN A 172.31.164.52"
            fallthrough
        }
        template IN A a.party.iflearner.com {
            match .*\.a\.party\.iflearner\.com
            answer "{{ .Name }} 60 IN A 172.31.164.53"
            fallthrough
        }
        template IN A b.party.iflearner.com {
            match .*\.b\.party\.iflearner\.com
            answer "{{ .Name }} 60 IN A 172.31.164.54"
            fallthrough
        }
    }
Note: The actual IP addresses depend on your environment.
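The three template stanzas in the Corefile differ only in the domain and the IP address. If you have more parties, a small helper can generate the extra stanzas for you to paste in. This is a minimal sketch: the function name coredns_template, the party name "c", and the IP in the example are hypothetical placeholders.

```shell
#!/bin/sh
# Print a CoreDNS template stanza that answers every name under DOMAIN
# with a fixed A record, in the same shape as the stanzas in the Corefile.
# Usage: coredns_template DOMAIN IP
coredns_template() {
    domain=$1
    ip=$2
    # Escape dots so they match literally in the regular expression.
    pattern=$(printf '%s' "$domain" | sed 's/\./\\./g')
    printf 'template IN A %s {\n' "$domain"
    printf '    match .*\\.%s\n' "$pattern"
    printf '    answer "{{ .Name }} 60 IN A %s"\n' "$ip"
    printf '    fallthrough\n'
    printf '}\n'
}

# Example: a hypothetical third party "c" (the IP is a placeholder).
coredns_template c.party.iflearner.com 172.31.164.55
```

The output still has to be indented to match the Corefile block before it is added to the ConfigMap.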
Finally, restart CoreDNS so that the configuration takes effect.
kubectl -n kube-system rollout restart deployment coredns
After configuring DNS, you need to generate certificates for those domains.
Generate the server certificate:
openssl req -newkey rsa:2048 -nodes -sha256 -keyout server-iflearner-secret.key -x509 -days 3650 -out server-iflearner-secret.crt -subj "/CN=*.server.iflearner.com"
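After generating a certificate, you may want to confirm that it carries the expected subject and expiry date. The sketch below wraps the standard openssl x509 inspection flags; the function name cert_summary is hypothetical, and the exact formatting of the subject line varies between OpenSSL versions.

```shell
#!/bin/sh
# Print the subject and expiry date of a PEM certificate.
# Usage: cert_summary CERT_FILE
cert_summary() {
    openssl x509 -in "$1" -noout -subject -enddate
}

# Example: inspect the server certificate if it exists in this directory.
if [ -f server-iflearner-secret.crt ]; then
    cert_summary server-iflearner-secret.crt
fi
```

The subject should contain the wildcard name passed to -subj when the certificate was created.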
Generate the party certificate:
Before running the commands, edit the cert/party-san.cnf file and add DNS records according to your environment.
openssl req -nodes -newkey rsa:2048 -sha256 -keyout party-iflearner-secret.key -reqexts req_ext -out party-iflearner-secret.csr -subj "/CN=*.party.iflearner.com" -config party-san.cnf
openssl x509 -req -in party-iflearner-secret.csr -extfile party-san.cnf -extensions req_ext -signkey party-iflearner-secret.key -days 3650 -out party-iflearner-secret.crt
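The two commands above read a req_ext section from party-san.cnf. The contents of that file are not shown here, so the following is only a sketch of what a compatible config could look like, assuming the conventional subjectAltName layout. The function name write_party_san and the output file name party-san.example.cnf are hypothetical, and the file shipped in the repository may differ.

```shell
#!/bin/sh
# Write a minimal OpenSSL config whose req_ext section lists one
# subjectAltName DNS entry per domain argument.
# Usage: write_party_san OUTPUT_FILE DOMAIN...
write_party_san() {
    out=$1
    shift
    {
        printf '[req]\n'
        printf 'distinguished_name = req_distinguished_name\n'
        printf 'req_extensions = req_ext\n\n'
        printf '[req_distinguished_name]\n\n'
        printf '[req_ext]\n'
        printf 'subjectAltName = @alt_names\n\n'
        printf '[alt_names]\n'
        i=1
        for domain in "$@"; do
            printf 'DNS.%d = %s\n' "$i" "$domain"
            i=$((i + 1))
        done
    } > "$out"
}

# Example: cover both parties used in this document. The output goes to a
# separate file so the repository's party-san.cnf is not overwritten.
write_party_san party-san.example.cnf '*.a.party.iflearner.com' '*.b.party.iflearner.com'
```

Each party domain that clients connect to needs its own DNS entry, because a wildcard in a certificate covers only one label.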
We expose the ingress with TLS, so it needs a certificate. You can create the secrets as follows:
# on server side
kubectl create secret tls server-iflearner-secret --key server-iflearner-secret.key --cert server-iflearner-secret.crt
# on party side
kubectl create secret tls party-iflearner-secret --key party-iflearner-secret.key --cert party-iflearner-secret.crt
We also need the certificate to connect to the ingress, so we store it in a ConfigMap that is mounted as the certificate file. You can create the ConfigMaps as follows:
kubectl create configmap server-iflearner-crt --from-file=ingress-nginx/server-iflearner-secret.crt
kubectl create configmap party-iflearner-crt --from-file=ingress-nginx/party-iflearner-secret.crt
You can uninstall everything as follows:
# delete configmap
kubectl delete configmap party-iflearner-crt
kubectl delete configmap server-iflearner-crt
# delete secret
kubectl delete secret party-iflearner-secret
kubectl delete secret server-iflearner-secret
# delete controller
bin/kustomize build config/default | kubectl delete --ignore-not-found=true -f -
# delete crd
bin/kustomize build config/crd | kubectl delete --ignore-not-found=true -f -
You can create a server and two parties as follows:
Server
apiVersion: git.iflytek.com/v1
kind: IflearnerJob
metadata:
  name: iflearnerjob-server
spec:
  role: server
  host: job1.server.iflearner.com
  template:
    spec:
      containers:
      - image: ghcr.io/iflytek/iflearner:v0.1.0
        name: iflearnerjob-server
        imagePullPolicy: IfNotPresent
        args:
        - python
        - iflearner/business/homo/aggregate_server.py
        - -n=2
        - --epochs=10
Party A
apiVersion: git.iflytek.com/v1
kind: IflearnerJob
metadata:
  name: iflearnerjob-client1
spec:
  role: client
  host: job1.a.party.iflearner.com
  template:
    spec:
      restartPolicy: Never
      containers:
      - image: ghcr.io/iflytek/iflearner:v0.1.0
        name: iflearnerjob-client
        imagePullPolicy: IfNotPresent
        workingDir: /iflearner/examples/homo/quickstart_pytorch
        args:
        - python
        - -u
        - quickstart_pytorch.py
        - --name=client1
        - --epochs=10
        - --server=job1.server.iflearner.com:30031
        - --cert=/etc/server-iflearner-secret.crt
        - --peers=0.0.0.0:50001;job1.b.party.iflearner.com:32322
        - --peer-cert=/etc/party-iflearner-secret.crt
Party B
apiVersion: git.iflytek.com/v1
kind: IflearnerJob
metadata:
  name: iflearnerjob-client2
spec:
  role: client
  host: job1.b.party.iflearner.com
  template:
    spec:
      restartPolicy: Never
      containers:
      - image: ghcr.io/iflytek/iflearner:v0.1.0
        name: iflearnerjob-client
        imagePullPolicy: IfNotPresent
        workingDir: /iflearner/examples/homo/quickstart_pytorch
        args:
        - python
        - -u
        - quickstart_pytorch.py
        - --name=client2
        - --epochs=10
        - --server=job1.server.iflearner.com:30031
        - --cert=/etc/server-iflearner-secret.crt
        - --peers=0.0.0.0:50001;job1.a.party.iflearner.com:30031
        - --peer-cert=/etc/party-iflearner-secret.crt
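The Party A and Party B manifests differ only in the client name, the host, and the peer address. If you run more parties, generating the manifest from those three values avoids copy-and-paste mistakes. This is a minimal sketch: the function name party_manifest is hypothetical, and the indentation assumes the usual Kubernetes pod template layout.

```shell
#!/bin/sh
# Print an IflearnerJob manifest for one party. Only the values that
# differ between Party A and Party B are parameters.
# Usage: party_manifest NAME HOST PEER
#   NAME  client name, e.g. client1
#   HOST  ingress host of this party, e.g. job1.a.party.iflearner.com
#   PEER  host:port of the other party
party_manifest() {
    name=$1
    host=$2
    peer=$3
    cat <<EOF
apiVersion: git.iflytek.com/v1
kind: IflearnerJob
metadata:
  name: iflearnerjob-$name
spec:
  role: client
  host: $host
  template:
    spec:
      restartPolicy: Never
      containers:
      - image: ghcr.io/iflytek/iflearner:v0.1.0
        name: iflearnerjob-client
        imagePullPolicy: IfNotPresent
        workingDir: /iflearner/examples/homo/quickstart_pytorch
        args:
        - python
        - -u
        - quickstart_pytorch.py
        - --name=$name
        - --epochs=10
        - --server=job1.server.iflearner.com:30031
        - --cert=/etc/server-iflearner-secret.crt
        - --peers=0.0.0.0:50001;$peer
        - --peer-cert=/etc/party-iflearner-secret.crt
EOF
}

# Example: print the Party A manifest. To create the job, pipe the
# output to: kubectl apply -f -
party_manifest client1 job1.a.party.iflearner.com job1.b.party.iflearner.com:32322
```

The server address and certificate paths are kept fixed here because they are the same in both party manifests.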