[ISSUE-48][FEATURE] Add Uniffle Dockerfile by wangao1236 · Pull Request #132 · apache/uniffle · GitHub
Skip to content

[ISSUE-48][FEATURE] Add Uniffle Dockerfile#132

Merged
roryqi merged 1 commit into
apache:masterfrom
wangao1236:build-add-dockerfile
Aug 10, 2022
Merged

[ISSUE-48][FEATURE] Add Uniffle Dockerfile#132
roryqi merged 1 commit into
apache:masterfrom
wangao1236:build-add-dockerfile

Conversation

@wangao1236

@wangao1236 wangao1236 commented Aug 5, 2022

Copy link
Copy Markdown
Collaborator

What changes were proposed in this pull request?

To solve issue #48 ,I add the Dockerfile for Uniffle.

Why are the changes needed?

Support K8S Operator

Does this PR introduce any user-facing change?

Yes, we will add the document later

How was this patch tested?

Manual test

@wangao1236

Copy link
Copy Markdown
Collaborator Author

@wangao1236 wangao1236 changed the title Add dockerfile of rss [ISSUE-48] [Feature] Add dockerfile of rss Aug 5, 2022
@roryqi roryqi changed the title [ISSUE-48] [Feature] Add dockerfile of rss [ISSUE-48][FEATURE] Add docker file Aug 5, 2022
@roryqi roryqi changed the title [ISSUE-48][FEATURE] Add docker file [ISSUE-48][FEATURE] Add Dockerfile Aug 5, 2022
@roryqi roryqi changed the title [ISSUE-48][FEATURE] Add Dockerfile [ISSUE-48][FEATURE] Add Uniffle Dockerfile Aug 5, 2022
Comment thread deploy/kubernetes/docker/Dockerfile Outdated
Comment thread deploy/kubernetes/docker/build.sh Outdated
Comment thread deploy/kubernetes/docker/hadoopconfig/.gitkeep
Comment thread deploy/kubernetes/docker/start.sh
Comment thread deploy/kubernetes/docker/build.sh
Comment thread deploy/kubernetes/docker/build.sh Outdated
@roryqi

roryqi commented Aug 5, 2022

Copy link
Copy Markdown
Contributor

cc @zuston @czy006 Could you help me review this patch?

Comment thread deploy/kubernetes/docker/start.sh Outdated
Comment thread .gitignore Outdated
@roryqi

roryqi commented Aug 5, 2022

Copy link
Copy Markdown
Contributor

Could you remove the file https://github.com/apache/incubator-uniffle/blob/master/deploy/kubernetes/docker/.gitkeep? Because the directory docker has files, don't need a keep file.

frankliee
frankliee previously approved these changes Aug 5, 2022
@@ -0,0 +1,29 @@
FROM openjdk:8-jdk-slim

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to use other general-purpose images, such as alpine or centos.
Many companies may have their own JDK choices, so we should treat JDK as 3rd party lib hadoop.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use dependency_cache to save our build time

for example:
FROM openjdk:8-jdk-slim as dependency_cache

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems 8-jdk-slim image is based on Debian stable, which is fine.
Can we use 8-jre-slim here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems 8-jdk-slim image is based on Debian stable, which is fine. Can we use 8-jre-slim here?

What's your reason?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems 8-jdk-slim image is based on Debian stable, which is fine. Can we use 8-jre-slim here?

What's your reason?

It's smaller.

@roryqi roryqi Aug 8, 2022

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use dependency_cache to save our build time

for example: FROM openjdk:8-jdk-slim as dependency_cache

It's great. Could you raise a follow-up pr which uses dependency_cache?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems 8-jdk-slim image is based on Debian stable, which is fine. Can we use 8-jre-slim here?

What's your reason?

It's smaller.

It may need more tests. If 8-jdk-slim is fine, we can choose it first. When our e2e test is ready, wen can use e2e test to verify the 8-jre-slim.

@wangao1236 wangao1236 force-pushed the build-add-dockerfile branch 2 times, most recently from 1a0a973 to f776ce7 Compare August 5, 2022 15:24
Comment thread .gitignore Outdated
rss-*.tgz

hadoop-*.tar.gz
/deploy/kubernetes/hadoopconfig/*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use absolute path here? if hadoopconfig is useful, why do we add it to .gitignore file?

@zuston zuston left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Left some minor comments

IMAGE=$REGISTRY/rss-server:$RSS_VERSION

echo "building image: $IMAGE"
docker build -t "$IMAGE" \

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using LABEL to add more metadata, like build-time/author/branch/commit-id?


if [ "$SERVICE_NAME" == "coordinator" ];then

bash ${basedir}/bin/start-coordinator.sh &

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Directly exit when executing startup script failed.

bash ${basedir}/bin/start-coordinator.sh || exit 1

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give me your reason?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start.sh will monitor the process existence, if the coordinator startup failed, we should exit directly, instead of waiting the non-existence PID and then exit 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We kill server or coordinator, the process will not exit immediately sometimes. The process will occupy the port, we should wait for the port for a while.

RUN tar -zxvf /data/rssadmin/hadoop-${HADOOP_VERSION}.tar.gz -C /data/rssadmin
RUN mv /data/rssadmin/hadoop-${HADOOP_VERSION} /data/rssadmin/hadoop
RUN rm -rf /data/rssadmin/hadoop-${HADOOP_VERSION}.tar.gz
COPY hadoopconfig/ /data/rssadmin/hadoop/etc/hadoop

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What files in hadoopconfig folder? core-site/hdfs-site.xml?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

ARG HADOOP_VERSION
ARG RSS_VERSION

RUN apt-get update && apt-get install -y zlib1g zlib1g-dev lzop lsof netcat dnsutils less procps iputils-ping \

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to consider kerberos? If yes, we need to setup kerberos client.

Maybe we shouldn't consider this in early version.

@roryqi roryqi Aug 7, 2022

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't consider the kerberos dependency here. It's ok for us if there are no big conflicts, it will be convenient to modify Dockerfile after we merge #53

@wangao1236 wangao1236 force-pushed the build-add-dockerfile branch from f776ce7 to d85a10e Compare August 8, 2022 03:26
@wangao1236 wangao1236 force-pushed the build-add-dockerfile branch from d85a10e to 7da0c3e Compare August 8, 2022 12:32

@roryqi roryqi left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @zuston @frankliee @kaijchen @czy006 Do you have any other suggestion?

@roryqi

roryqi commented Aug 10, 2022

Copy link
Copy Markdown
Contributor

@roryqi roryqi merged commit 18663ed into apache:master Aug 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants