
nuhyurduseven

10/19/2022, 12:55 PM
Hi there, I have a problem with the JupyterLab kernel. While reading a dataset, the Jupyter kernel dumped and MLflow stopped working properly. When I run the same code with Python on the command line it works, but in Orchest the same code didn't give the same output and the kernel dumped. I am using DBSCAN and I know the scikit-learn implementation has high complexity. I think the ingress blocked something and some configuration is required.
Sometimes the kernel can run the code without dumping, but while DBSCAN runs the kernel gets dumped.

Rick Lamers

10/19/2022, 1:17 PM
Hi @nuhyurduseven, thanks for reporting this. By "kernel dumped" do you mean the kernel stops responding? Could you detail a bit more which code is causing the failure for you? We can try to reproduce the issue.

nuhyurduseven

10/19/2022, 1:20 PM
Yes, the kernel stopped and its status changed to "unknown". The code is simple and sometimes it runs flawlessly. I just read a CSV via pandas or get_inputs from the Orchest API, but the dataframes can be quite big.

Rick Lamers

10/19/2022, 1:21 PM
I suspect you're pushing your kernel out of memory and the kernel container gets killed by Kubernetes
Because memory usage is sometimes lower or higher depending on what else is going on on the system, it might sometimes work and sometimes not. I see two options: reduce the memory footprint (analyse how you're loading/unloading data) or get a bigger box 😄
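A minimal sketch (not from this thread; the file name and column names are placeholders) of two common ways to shrink the memory footprint of a pandas CSV load, which is usually the first thing to try before resizing the box:

```python
# Sketch only: file and column names are hypothetical examples.
import pandas as pd

# 1) Declare narrow dtypes up front instead of letting pandas default to
#    float64/object for every column, and read only the columns you need.
dtypes = {"feature_a": "float32", "feature_b": "float32", "label": "category"}
df = pd.read_csv("dataset.csv", dtype=dtypes, usecols=list(dtypes))

# 2) Or stream the file in chunks and keep only the rows you need per chunk.
parts = []
for chunk in pd.read_csv("dataset.csv", dtype=dtypes, usecols=list(dtypes), chunksize=100_000):
    parts.append(chunk[chunk["feature_a"] > 0])  # example filter
df = pd.concat(parts, ignore_index=True)

print(df.memory_usage(deep=True).sum() / 1e6, "MB in memory")
```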

nuhyurduseven

10/19/2022, 1:24 PM
You may be right, because memory sometimes overflows when I run it in the terminal too. But minikube and Orchest can use all of the memory, and the memory is enough to store all the data.

Rick Lamers

10/19/2022, 1:26 PM
One trick you may want to use is rebuilding the JupyterLab image in the settings page. It will stop all ongoing sessions (we're working on a dedicated option for this to avoid having to use this workaround). That could free up resources you need to make it go through

nuhyurduseven

10/19/2022, 7:50 PM
I tried. It does seem related to memory: when I reduce the input data, the code runs and finishes properly. I installed Orchest on a single cluster via minikube, with max CPU, max memory and a 50 GB disk (the first installation method in the Orchest docs). However, while the algorithm runs, memory usage stays quite low, and after approximately 1 minute the Jupyter kernel dies and I get a warning about the kernel restarting. I'm thinking it may be a timeout value. Is that possible?
Error is here

Rick Lamers

10/20/2022, 8:51 PM
Are you familiar with k9s? It could be worth checking why the kernel pod crashes/gets stopped. The logs of the kernel pod might be informative here. Does this only happen when running code in the kernel or also when you just leave the kernel idle?
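For the same check without k9s, here is a sketch using the official `kubernetes` Python client. The pod name is a placeholder, and the `orchest` namespace is an assumption based on the rest of the thread, not a confirmed detail:

```python
# Sketch only: inspect why a pod stopped and tail its logs programmatically.
from kubernetes import client, config

config.load_kube_config()          # uses your local kubeconfig (e.g. minikube)
v1 = client.CoreV1Api()

namespace = "orchest"              # assumption
pod_name = "<kernel-pod-name>"     # placeholder; list pods first to find it

pod = v1.read_namespaced_pod(name=pod_name, namespace=namespace)
for cs in pod.status.container_statuses or []:
    term = cs.last_state.terminated
    if term is not None:
        # reason is "OOMKilled" when the container hit its memory limit
        print(cs.name, term.reason, term.exit_code)

# Tail the pod's recent logs as well
print(v1.read_namespaced_pod_log(name=pod_name, namespace=namespace, tail_lines=50))
```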

nuhyurduseven

10/24/2022, 2:01 PM
I installed k9s and checked it out quickly. One container's status is Error. I killed it and it restarted again. This is the name of the crashed container --> "orchest-argo-workflow-argo-workflows-workflow-controller-5nd524"

Rick Lamers

10/24/2022, 2:58 PM
Hi @nuhyurduseven, thanks for taking the time to check with k9s. The Argo system is not used for kernels, so it's likely an unrelated error. Do you still get the Kernel Restarting error only occasionally, or do you now get it every time you run the kernel? If you're able to, you could share the project so that we can try to reproduce the issue on our end.

nuhyurduseven

10/24/2022, 2:59 PM
But the kernel still crashes on every try. Kubernetes is installed on a single VM, but it has a huge amount of memory. The problem is either a memory overflow or a running-time constraint; total memory usage never reached the limit. I don't have enough knowledge about Kubernetes, so the problem can only be solved by someone with a deep understanding of this field or by the builders of Orchest. If I can't use this, no one can, except the hobbyists.

Rick Lamers

10/24/2022, 3:02 PM
Self-hosting Orchest is currently still more complicated than we would like it to be; it requires a decent amount of k8s understanding, especially if you're stretching your system's resources. You might want to try running your project on https://cloud.orchest.io, where we can make changes to the cluster setup more easily to accommodate your use case.

nuhyurduseven

10/25/2022, 7:40 AM
I don't believe the problem is caused by the box size. My box has quite enough resources and is bigger than any box I could get from the Orchest cloud.

Rick Lamers

10/25/2022, 7:59 AM
I'd love to investigate this issue further if you could provide a minimal reproducible example. Maybe you've uncovered an edge case that we're not handling properly.

nuhyurduseven

10/25/2022, 8:23 AM
Yes. I installed Orchest on a box via minikube; the Kubernetes cluster has a 50 GB disk, 75 GB+ memory and 8 vCPU cores. I created a pipeline and uploaded a huge CSV dataset, over 1.1 GB. By the way, I am using the latest release, v22.10.8. When I read a small CSV it works, but when I read the huge CSV I get the "kernel may be dead, restarting" warning. The problem is not caused by disk size, it's memory size. However, when I monitor the box, total memory consumption is only about 8-9 GB, so quite low. I used to read files like this in Orchest, but now I can't. Beyond that there is also DBSCAN, an algorithm famous for high memory consumption, which crashes the kernel, but only with a lot of data. So I think the two are connected and originate in the container's memory. That's the problem that needs to be solved.
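On the DBSCAN side specifically, a commonly suggested way to tame scikit-learn's memory use is to precompute a sparse radius-neighbors graph and run DBSCAN with `metric="precomputed"`, so the full set of eps-neighborhoods is never materialised at once. A minimal sketch; the data, `eps` and `min_samples` below are made up for illustration, not taken from this project:

```python
# Sketch only: DBSCAN on a precomputed sparse neighbor graph.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

X = np.random.rand(100_000, 8).astype(np.float32)  # placeholder data
eps = 0.3

# Sparse matrix of pairwise distances, kept only within radius eps
nn = NearestNeighbors(radius=eps).fit(X)
graph = nn.radius_neighbors_graph(X, mode="distance")

labels = DBSCAN(eps=eps, min_samples=10, metric="precomputed").fit_predict(graph)
print(np.unique(labels, return_counts=True))
```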

Rick Lamers

10/25/2022, 10:22 AM
@Navid H could we in some way be limiting the allowed memory per pod accidentally? Sounds like @nuhyurduseven is not consuming all resources on the system but running into pod memory limits regardless.
👀 1

Navid H

10/25/2022, 11:17 AM
We don't do that as far as I know. Could you run the following command to describe the pod that runs out of memory?
kubectl describe pod ${pod-name} -n orchest
Also, could you check the resources of the minikube node:
kubectl describe node minikube
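The same information can be pulled programmatically with the official `kubernetes` Python client, which makes it easy to compare any per-container memory limit against what the node actually has available. This is only a sketch: the pod name is a placeholder, and the `orchest` namespace and `minikube` node name are taken from the commands above:

```python
# Sketch only: compare container memory requests/limits with node allocatable memory.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pod = v1.read_namespaced_pod(name="<pod-name>", namespace="orchest")
for c in pod.spec.containers:
    # If a memory limit is set here, the container can be OOM-killed even
    # though the node as a whole still has plenty of free memory.
    print(c.name, "requests:", c.resources.requests, "limits:", c.resources.limits)

node = v1.read_node(name="minikube")
print("allocatable:", node.status.allocatable.get("memory"),
      "capacity:", node.status.capacity.get("memory"))
```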

nuhyurduseven

10/25/2022, 11:20 AM
I uninstalled Orchest and minikube to do a fresh install. I'll try after that.
🙏 2