# tech-support

Bruno Oliveira dos Santos

05/02/2023, 1:16 PM
Hello guys. I'm having a problem restarting through the interface. I'll send more information in the thread:
I edited the JSON to enable authentication. When I restart to apply the change, I get this error:
Do you have any idea what this error could be?
For better context: I set up a GKE cluster and deployed the application in two different namespaces, each attached to its own node group. The application in one namespace doesn't have this problem; it only happens in the other one.

Jacopo

05/02/2023, 3:27 PM
Anything interesting in the browser debugging tools? Both console messages and the network request's response code and payload.
Any chance that the namespace for which things are working is the default namespace? Can you check whether the env variables `ORCHEST_NAMESPACE` and `ORCHEST_CLUSTER` are set correctly for the `orchest-api` pod in the namespace for which things are failing?
You should also check the logs of the following pods when making a restart request that fails:
• `orchest-api`
• `orchest-webserver`
• `orchest-controller`
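One way to run the env-variable check, as a minimal sketch: dump the pod with `kubectl get pod <pod-name> -n <namespace> -o json` and read the variables out of the standard `spec.containers[].env` list. The helper name `container_env` is hypothetical, and the sample pod, container name, and values below are made up for illustration; they are not taken from a real Orchest deployment.

```python
import json
from typing import Dict, List, Optional


def container_env(pod: dict, names: List[str]) -> Dict[str, Dict[str, Optional[str]]]:
    """Map each container name to the values of the requested env variables.

    A variable that is missing, or that is set through `valueFrom` instead of
    a literal `value`, is reported as None.
    """
    result = {}
    for container in pod.get("spec", {}).get("containers", []):
        env = {entry["name"]: entry.get("value") for entry in container.get("env", [])}
        result[container["name"]] = {name: env.get(name) for name in names}
    return result


# Made-up sample in the shape of `kubectl get pod ... -o json` output.
pod_json = """
{
  "metadata": {"name": "orchest-api-example", "namespace": "orchest-team-b"},
  "spec": {
    "containers": [
      {
        "name": "orchest-api",
        "env": [{"name": "ORCHEST_NAMESPACE", "value": "orchest-team-b"}]
      }
    ]
  }
}
"""

report = container_env(json.loads(pod_json), ["ORCHEST_NAMESPACE", "ORCHEST_CLUSTER"])
print(report)
```

In this made-up sample `ORCHEST_CLUSTER` comes back as `None`, which is the kind of gap worth comparing against the namespace that works.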

Bruno Oliveira dos Santos

05/03/2023, 10:33 AM
Hello Jacopo. Thank you for the information. Looking at the logs, I found some inconsistencies and did a reset, which solved the problem.
However, I have another problem and would like your advice:
It's the same scenario: one cluster running two Orchest instances in two different namespaces, with two NodePools. Using labels, I attached the main deployments to their corresponding NodePools.
However, when I build an environment, it starts a pod called "image-build-task", and this pod is not scheduled on the correct NodePool, which causes an error.
Is there any way to resolve this? If you need more details, I'll add them in this thread.

Jacopo

05/04/2023, 11:31 AM
> Hello Jacopo. Thank you for the information. Looking at the logs I found some inconsistencies and managed to do a reset and the mentioned problem was solved.
Happy to hear that!
> However, when I go to build some environment I noticed that it uploads a POD called "image-build-task" and this POD is not being attached to the correct NodePool, causing an error.
Take a look at `_get_image_build_manifest` in `services/orchest-api/app/app/core/image_utils.py`. You'll see the `pod_scheduling.modify_image_builder_pod_scheduling_behaviour(manifest)` call at the end of the function. The `services/orchest-api/app/app/core/pod_scheduling.py` module is in charge of changing manifests so that scheduling happens the way we like. In this particular case, it'll look for the `WORKER_PLANE_SELECTOR` env variable of the orchest-api and celery-worker to adjust the behaviour. For multi-tenancy you'll likely need to make some minor changes around these variables.
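To illustrate the general idea of steering a pod to a node pool by editing its manifest, here is a rough sketch. It is not the code in `pod_scheduling.py`: the helper names are hypothetical, and the `key=value,key2=value2` selector format is an assumption made for this example, so check the module for the format `WORKER_PLANE_SELECTOR` really uses. `cloud.google.com/gke-nodepool` is the label GKE puts on nodes in a node pool; `pool-b` is a made-up pool name.

```python
import copy


def parse_selector(raw: str) -> dict:
    """Parse 'key=value,key2=value2' into a dict (format assumed for this sketch)."""
    selector = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed selector entry: {pair!r}")
        selector[key.strip()] = value.strip()
    return selector


def with_node_selector(manifest: dict, selector: dict) -> dict:
    """Return a copy of a pod manifest with the selector merged into spec.nodeSelector."""
    patched = copy.deepcopy(manifest)
    if selector:
        patched.setdefault("spec", {}).setdefault("nodeSelector", {}).update(selector)
    return patched


# Minimal stand-in for a build pod manifest, not the real one Orchest generates.
manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "image-build-task"},
    "spec": {"containers": []},
}

# Hypothetical value; each namespace's orchest-api and celery-worker would carry its own.
raw_selector = "cloud.google.com/gke-nodepool=pool-b"
patched = with_node_selector(manifest, parse_selector(raw_selector))
print(patched["spec"]["nodeSelector"])
```

With two Orchest installs in one cluster, the point is that each install would need its own selector value so its build pods land on its own NodePool.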