
howie hu

01/16/2021, 2:23 PM
My current build method is based on build_container.sh. This method is a bit cumbersome. Is there an easier way?

Rick Lamers

01/16/2021, 2:29 PM
Out of curiosity, why do you need Python 3.6? Would be interesting to know if more people have valid reasons to depend on lower Python versions. As for building custom images, we extend the Jupyter Docker Stacks and it comes with Python 3.7.6. However, in the environment setup script you should be able to override the Python version through shell commands. Is that an option? I could write an example setup script if you want.

howie hu

01/16/2021, 2:35 PM
1. In real-world development, legacy projects are a hurdle we can't get around, so I need to support some older versions.
2. My current implementation adds
RUN conda install --quiet --yes python=3.6
directly to the Dockerfile.
The current implementation must rely on Orchest code. Is there a more convenient way, such as plug-ins?
If you could add an example and documentation, that would be best.

Rick Lamers

01/16/2021, 2:37 PM
The current implementation must rely on orchest code. Is there a more convenient way, such as plug-ins?
What do you mean by "must rely on Orchest code"?

howie hu

01/16/2021, 2:39 PM
These files should be necessary during the build process, right?
The custom image I built by myself is always unavailable. Can you help me build a kernel-py3.6 image?
# Ubuntu 18.04.1 LTS Bionic
FROM elyra/kernel-py:2.3.0

USER root
# enable sudo for the NB_USER by default
RUN passwd -d $NB_USER && echo "$NB_USER   ALL=(ALL)   NOPASSWD:ALL" | tee /etc/sudoers.d/$NB_USER

WORKDIR /

COPY ./base-kernel-py/*.sh /

RUN conda install --quiet --yes python=3.6

# Run augment script
RUN ./augment-root.sh

# Install our internal libraries
COPY ./lib/python /orchest/lib/python
COPY ./orchest-sdk /orchest/orchest-sdk

COPY ./runnable-shared/runner /orchest/services/base-images/runnable-shared/runner
WORKDIR /orchest/services/base-images/runnable-shared/runner

RUN pip install -r requirements.txt

COPY ./runnable-shared/bootscript.sh /orchest/bootscript.sh

USER $NB_USER
ENV HOME=/home/$NB_USER

ARG ORCHEST_VERSION
ENV ORCHEST_VERSION=${ORCHEST_VERSION}

CMD [ "/orchest/bootscript.sh" ]
This is my Dockerfile.

Rick Lamers

01/16/2021, 3:08 PM
I’m traveling right now, but I’ll help you write a correct custom image that uses Python 3.6 when I get back. Custom images are still a work in progress, which is why the documentation is not ready yet.

howie hu

01/16/2021, 3:10 PM
Ok thank you, have a nice journey.

Rick Lamers

01/16/2021, 10:41 PM
I've spent some time trying to make this work for you. After looking a bit closer, I noticed that one of our dependencies, elyra/kernel-py, which we use (through the Jupyter Enterprise Gateway project) to run pipeline step notebooks interactively as kernels, has Python 3.7+ code in it. Luckily I was able to clean up their code, but as a result the custom image that runs Python 3.6 is a bit more complicated than I'd like it to be. Attached you'll find a .zip of the Dockerfile (+ the patched Python script). If you run
docker build -t my-custom-base .
and enter 'my-custom-base:latest' in the environment, you should get Python 3.6 in both Python scripts and Python notebooks in your pipeline. Let me know if you have any questions. As I said earlier, custom images aren't fully worked out, so the docs on making them aren't done yet. Hope this helps!

howie hu

01/17/2021, 3:15 AM
Thanks for the great work! It works well. I noticed that when setting up an environment, the environment name can be left empty. Is this a bug?
💯 1
I used the my-custom-base image and found that pandas is not installed by default, so I installed pandas==1.1.5 (the latest release that supports Python 3.6). When I load data with pandas, orchest.get_inputs() cannot receive the data passed by orchest.output. The error log is as follows:
I found that this error also occurs when passing just a string, so the problem seems to be in orchest-sdk itself. The actual issue is that fromisoformat is not available in Python versions older than 3.7. Maybe you need to make orchest-sdk support Python 3.6. My current solution is to import a third-party compatibility patch:
from backports.datetime_fromisoformat import MonkeyPatch
MonkeyPatch.patch_fromisoformat()
Then it can run normally, but I’m not sure if there are any other hidden problems.
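One way to keep this workaround from affecting newer interpreters is to apply it only when it is needed. The sketch below is a hedged illustration, not part of orchest-sdk: the helper name ensure_fromisoformat is hypothetical, and it assumes the backports-datetime-fromisoformat package is installed in the Python 3.6 image.

```python
import sys
from datetime import datetime


def ensure_fromisoformat() -> bool:
    """Make datetime.fromisoformat available; return True if a patch was applied.

    Hypothetical helper for illustration only.
    """
    if sys.version_info >= (3, 7):
        # datetime.fromisoformat is built in from Python 3.7 on, nothing to do.
        return False
    # Only imported on Python 3.6, where the backport package must be installed.
    from backports.datetime_fromisoformat import MonkeyPatch

    MonkeyPatch.patch_fromisoformat()
    return True


patched = ensure_fromisoformat()
print(datetime.fromisoformat("2021-01-17T03:15:00"))
```

Calling the helper once at the top of a step, before importing orchest, means the same script keeps working unchanged once the environment moves to Python 3.7 or newer.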

Rick Lamers

01/17/2021, 9:32 AM
The Python 3.6 environment created in this custom image is pretty barebones, so you may indeed have to install pandas yourself. The custom image could be modified to install all the Python/conda packages the Jupyter Docker Stacks image comes with, but I didn’t know whether you’d want the image bloated like that. We will look at making orchest-sdk Python 3.6 compatible: since it’s used in user code, we want it to be as easy to import as possible. We’ll give it a pass and make sure it works with Python 3.6+. Thanks for pointing out the missing validation on the environment name. It should not accept empty input. I’ll let you know when we’ve refactored the orchest-sdk so that you can drop the patch.
👍 1

howie hu

01/18/2021, 3:03 AM
Hello, I am having a problem with cloning a project. The project I need to clone is hosted inside the company, for example https://gitlab.**.cc/orchest_demo.git, so a username and password are required. Orchest reports an error immediately, with the following message:
Import failed: undefined error. Please try again.
orchest-webserver's log:
fatal: could not read Username for 'https://gitlab.***.cc': No such device or address
How can I solve this problem?

Rick Lamers

01/18/2021, 8:23 AM
For credential-authenticated git repositories, the easiest way at the moment is to use a terminal and clone the project into orchest/userdir/projects/. Orchest should automatically detect the new folder when you view projects. Would that be an option?
Or are you restricted to the browser only (e.g. the person doing the importing has no machine terminal access)?
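As a rough sketch of the terminal route, the snippet below builds and runs a plain git clone into the projects directory. The function names are hypothetical and any repository URL passed in is a placeholder; the only assumptions are that git is on the PATH and that the script is started from an interactive terminal, so git can prompt for the username and password itself.

```python
import subprocess
from pathlib import Path


def build_clone_command(repo_url, projects_dir="orchest/userdir/projects"):
    """Return the git command that clones repo_url into the projects directory."""
    # Derive the folder name from the last URL segment, dropping a ".git" suffix.
    name = repo_url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    target = Path(projects_dir) / name
    return ["git", "clone", repo_url, str(target)]


def clone_into_projects(repo_url, projects_dir="orchest/userdir/projects"):
    """Run the clone; git asks for credentials on the terminal when needed."""
    subprocess.run(build_clone_command(repo_url, projects_dir), check=True)
```

Running clone_into_projects with the internal GitLab URL from a shell on the host should leave the project folder where Orchest looks for projects.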

howie hu

01/18/2021, 1:50 PM
My current approach is to import directly into /userdir/projects/.
👍 1
I ran into a problem while using Orchest today. Regarding pipelines, have you considered adding global parameters? If every script in a pipeline needs the same variables, they could be set as global variables to reduce the work of setting parameters on each step.

Rick Lamers

01/18/2021, 3:48 PM
We actually like the idea of expanding pipeline step parameterization to the full pipeline (being able to define parameters for the full pipeline and on a step level). This will require some work in the front- and back-end though.
👍 1
In the meantime, you should be able to get quite far by specifying global configuration in a file in the project directory: a simple JSON file that you read from wherever you need the global parameters.
We're also working on project level environment variables which would be another way to accomplish global parameters. But again, it's in the works and not available currently.
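The JSON-file approach could look roughly like the sketch below. The file name global_params.json and the helper load_global_params are assumptions made for illustration; nothing here is part of the Orchest SDK.

```python
import json
from pathlib import Path


def load_global_params(path="global_params.json"):
    """Read pipeline-wide parameters from a JSON file in the project directory."""
    with Path(path).open(encoding="utf-8") as f:
        params = json.load(f)
    if not isinstance(params, dict):
        # A top-level object keeps lookups by parameter name straightforward.
        raise ValueError(f"{path} must contain a JSON object at the top level")
    return params
```

Each step script or notebook would then call load_global_params() at the top and read the values it needs, for example params["data_dir"], so a shared value only has to be changed in one place.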

Yannick

01/20/2021, 12:34 PM
Maybe you need to make orchest-sdk support python3.6
Python 3.6 is now supported! Happy coding :)
👍 1