# announcements

Diego Vasquez

06/23/2022, 8:54 PM
Hi everyone! I have a problem with a job in this version of Orchest. I configured a job to run every 10 minutes, but some pipeline runs apparently don't start at the right time, which causes the next run to start late as well. I already checked each pipeline for a step that takes longer to run, but everything looks fine inside. Here is a screenshot of the pipeline runs and a screenshot of the run that started at 3:20 PM.

Rick Lamers

06/24/2022, 6:19 AM
Thanks for letting us know about this. Are you running self-hosted or on Orchest Cloud?
Could you post the cron string just to be sure?

Jacopo

06/24/2022, 7:22 AM
Any chance that there are other runs from other jobs being queued? The default parallelism for job runs is 1, and can be changed through the settings

Yannick

06/24/2022, 7:27 AM
I agree with @Jacopo; this is most likely caused by the MAX_JOB_RUNS_PARALLELISM setting (docs).
💯 1

Diego Vasquez

06/24/2022, 2:49 PM
Thanks for your replies.
1. Cron string: */10 * * * *
2. Orchest is running self-hosted on an Azure server.
3. Orchest settings:
"MAX_INTERACTIVE_RUNS_PARALLELISM": 15,
 "MAX_JOB_RUNS_PARALLELISM": 15,
@Jacopo @Yannick
👀 2
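For reference, a minimal Python sketch of how the minute and hour fields of the schedules in this thread expand into concrete values. The helper name `expand_cron_field` is hypothetical, and the parser is deliberately simplified: it handles only `*`, single values, ranges, comma lists, and `/step`, not the full cron grammar.

```python
def expand_cron_field(field, lo, hi):
    """Expand one cron field (e.g. '*/10' or '13-23,0-1') into sorted values.

    Simplified on purpose: supports '*', 'a', 'a-b', comma lists and '/step'.
    lo and hi are the field's valid bounds (0-59 for minutes, 0-23 for hours).
    """
    values = set()
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            first, last = rng.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return sorted(values)


# Minute field of the job that should run every 10 minutes.
print(expand_cron_field("*/10", 0, 59))
# Minute and hour fields of a schedule like "*/30 13-23,0-1 * * *".
print(expand_cron_field("*/30", 0, 59))
print(expand_cron_field("13-23,0-1", 0, 23))
```

With `*/10 * * * *`, a run is expected at minutes 0, 10, 20, 30, 40 and 50 of every hour, so any run of that job starting at another minute was delayed.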

Jacopo

06/24/2022, 3:14 PM
I'll write up a query for the orchest db that might help in debugging, give me a sec
Alright, could you run the following and send the dump in a file here or in a dm?
kubectl exec -n orchest deployment/orchest-database -- psql \
    -U postgres -d orchest_api -c "
    SELECT runs.*, jobs.schedule
    FROM
    (SELECT
    job_uuid, uuid, status, started_time
    FROM pipeline_runs
    WHERE type = 'NonInteractivePipelineRun') runs
    JOIN jobs ON jobs.uuid = runs.job_uuid ORDER BY started_time DESC;
    "
It assumes kubectl is installed. If not, I'll slightly modify it to use minikube, assuming you are running on that.

Diego Vasquez

06/24/2022, 3:20 PM
               job_uuid               |                 uuid                 | status  |        started_time        |       schedule
--------------------------------------+--------------------------------------+---------+----------------------------+----------------------
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 35c6409c-4ddc-44d9-84ea-9b53644e9c9c | SUCCESS | 2022-06-24 15:17:23.163642 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 1bcbf810-5d4a-4a2e-9db5-473fda565682 | SUCCESS | 2022-06-24 15:16:06.052379 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 14997e06-7044-4885-9370-0ee88edc972a | SUCCESS | 2022-06-24 15:00:09.20949  | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 3b5127b3-d34b-4e06-879c-9801985632bd | SUCCESS | 2022-06-24 14:50:09.206651 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 672aea48-7816-40ce-846d-154cda90e1b8 | SUCCESS | 2022-06-24 14:47:23.218848 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | ae99249e-b5cf-4bdc-99a1-678fef48d39f | SUCCESS | 2022-06-24 14:31:30.074175 | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | b67f7253-e700-45af-b637-fe53b99ebf7a | SUCCESS | 2022-06-24 14:30:09.21037  | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 915743ff-c02c-4b29-b0ad-ff145a454cbf | SUCCESS | 2022-06-24 14:20:09.20522  | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | adb44cba-74e1-40d1-9e35-896fb4383c2c | SUCCESS | 2022-06-24 14:17:30.873497 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 13086380-902b-4b95-a705-585cd1508131 | SUCCESS | 2022-06-24 14:01:33.254644 | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 3fae5a13-6084-4429-83fe-7d1cdcb18cd1 | SUCCESS | 2022-06-24 14:00:09.210597 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | abfba3d1-870f-408d-ab70-b489bfabe679 | SUCCESS | 2022-06-24 13:50:09.208465 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 7687422d-a875-411d-80a8-c0510786c042 | SUCCESS | 2022-06-24 13:46:41.66348  | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | e34d333a-ec8b-4500-9b7a-12f83b113d3c | SUCCESS | 2022-06-24 13:45:32.795907 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | f39e27c7-5efe-45d7-8888-8e4e0f178535 | SUCCESS | 2022-06-24 13:30:09.207895 | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | e45b820a-c10f-453b-80ee-be2c1c2066ab | SUCCESS | 2022-06-24 13:20:09.225235 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 9efbfca9-4a4e-45d4-8d61-b3ef60103260 | SUCCESS | 2022-06-24 13:15:55.778974 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | f6941d3e-6be6-47d0-aa68-cc3a7b45ac26 | SUCCESS | 2022-06-24 13:01:47.251738 | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 0ee1c1dd-e6af-4254-9ee8-f6846a8b5001 | SUCCESS | 2022-06-24 13:00:29.376386 | */10 * * * *
 db708f54-fc26-4e5e-8fbc-21ae93846e3f | 28fb5b15-058f-43b0-8c10-6ed093f371bc | SUCCESS | 2022-06-24 13:00:09.202578 | 0 3,13 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 81e97cba-3ed6-4f21-b69b-5e0d3eb1310b | SUCCESS | 2022-06-24 12:50:09.202844 | */10 * * * *
 db708f54-fc26-4e5e-8fbc-21ae93846e3f | 5116c8fb-bc86-4285-9faf-e6d747d50347 | SUCCESS | 2022-06-24 03:00:09.174804 | 0 3,13 * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 63cf38ab-9c3b-4809-9672-543708f61d77 | SUCCESS | 2022-06-24 01:30:09.174721 | */30 13-23,0-1 * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | ea148036-5c50-4774-8edc-70f9f216e29b | SUCCESS | 2022-06-24 01:01:25.996502 | */30 13-23,0-1 * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 5cc4bbbe-e8da-41aa-a5ed-9f8184c712d5 | SUCCESS | 2022-06-24 00:31:21.568974 | */30 13-23,0-1 * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 395a7f9f-83db-4d47-88a7-e2b9814a5577 | SUCCESS | 2022-06-24 00:00:09.172016 | */30 13-23,0-1 * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 862905dc-fa4c-4c06-b1be-a012ee1b6054 | SUCCESS | 2022-06-23 23:31:19.228344 | */30 13-23,0-1 * * *
 db708f54-fc26-4e5e-8fbc-21ae93846e3f | f30d0ab7-25a9-40ea-a41f-c197a5821b65 | SUCCESS | 2022-06-23 13:00:07.64629  | 0 3,13 * * *

Jacopo

06/24/2022, 3:21 PM
Alright, ty, give me a sec to take a look
Which Orchest version are you on? You can check with orchest version or in Settings.
I'd also be curious to see the output of
kubectl describe deployment -n orchest celery-worker  | grep "PARALLELISM"
Mind if I ask you to re-run the previous script with a tiny change? I forgot to include the finished time in the query 😛
kubectl exec -n orchest deployment/orchest-database -- psql \
    -U postgres -d orchest_api -c "
    SELECT runs.*, jobs.schedule
    FROM
    (SELECT
    job_uuid, uuid, status, started_time, finished_time
    FROM pipeline_runs
    WHERE type = 'NonInteractivePipelineRun') runs
    JOIN jobs ON jobs.uuid = runs.job_uuid ORDER BY started_time desc;
    "
Maybe I'm just seeing things, but it looks like the runs that start late are the ones scheduled within 20 minutes of a run from the job with schedule */30 13-23,0-1 * * *.
On some older versions of Orchest, you have to restart Orchest to increase the parallelism level. Any chance that the restart didn't take place? (The GUI alerts you about the need to restart when you change the setting on one of those older versions, but only when the value actually changes, AFAIR.)
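To illustrate the pattern described above, here is a minimal Python sketch of what happens when runs are executed first come, first served with a parallelism of 1. It is an illustrative model only, not Orchest's actual scheduler: `simulate_fifo` is a hypothetical helper, and the durations (16 minutes for the every-30-minutes job, 77 seconds for the every-10-minutes job) are assumptions.

```python
from datetime import datetime, timedelta


def simulate_fifo(runs, workers=1):
    """Simulate first-come-first-served execution on a fixed number of workers.

    runs: list of (label, scheduled, duration) tuples.
    Returns a list of (label, scheduled, started) tuples in scheduling order.
    NOTE: an illustrative model only, not Orchest's actual scheduler.
    """
    free_at = [datetime.min] * workers  # time at which each worker becomes idle
    result = []
    # sorted() is stable, so runs scheduled at the same instant keep input order.
    for label, scheduled, duration in sorted(runs, key=lambda r: r[1]):
        slot = min(range(workers), key=lambda i: free_at[i])  # earliest-free worker
        started = max(scheduled, free_at[slot])
        free_at[slot] = started + duration
        result.append((label, scheduled, started))
    return result


t0 = datetime(2022, 6, 24, 15, 0, 9)
long_run = timedelta(minutes=16)   # assumed duration of the every-30-minutes job
short_run = timedelta(seconds=77)  # assumed duration of the every-10-minutes job
runs = [
    ("every-30-min", t0, long_run),
    ("every-10-min", t0, short_run),
    ("every-10-min", t0 + timedelta(minutes=10), short_run),
    ("every-10-min", t0 + timedelta(minutes=20), short_run),
]

for workers in (1, 15):
    print(f"workers={workers}")
    for label, scheduled, started in simulate_fifo(runs, workers):
        delay = started - scheduled
        print(f"  {label}: scheduled {scheduled:%H:%M:%S}, "
              f"started {started:%H:%M:%S}, delay {delay}")
```

With one worker, the 15:00 and 15:10 runs of the every-10-minutes job only start after the long run finishes, while the 15:20 run is back on time, which resembles the late starts in the dump. With 15 workers, every run in this model starts as scheduled.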

Diego Vasquez

06/24/2022, 4:23 PM
@Jacopo When I changed the settings, I restarted Orchest 3 times
I'm running Orchest 2022.06.6
               job_uuid               |                 uuid                 | status  |        started_time        |       finished_time        |       schedule
--------------------------------------+--------------------------------------+---------+----------------------------+----------------------------+----------------------
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 622343c7-0f98-49de-b5d4-b6899f6b2744 | SUCCESS | 2022-06-24 16:20:09.213425 | 2022-06-24 16:21:26.559623 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | c0149994-2142-4ced-9e1e-a655cd884871 | SUCCESS | 2022-06-24 16:17:10.577657 | 2022-06-24 16:18:26.524951 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 80256e56-4a71-42cf-8465-66c30b2e6e9d | SUCCESS | 2022-06-24 16:01:34.814202 | 2022-06-24 16:17:10.448884 | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 365da78c-7c7d-451b-b2dc-0c7204baed0c | SUCCESS | 2022-06-24 16:00:09.212336 | 2022-06-24 16:01:34.704894 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 9ca1d6f5-c57f-4f06-b6c2-549ca4e2db20 | SUCCESS | 2022-06-24 15:55:19.639697 | 2022-06-24 15:56:33.51986  | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 9aeaafb9-32bc-456d-8b13-1b17d28160ac | SUCCESS | 2022-06-24 15:54:10.898963 | 2022-06-24 15:55:19.530211 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | a250998c-1b18-4759-a8c2-ef514de368ce | SUCCESS | 2022-06-24 15:52:55.632757 | 2022-06-24 15:54:10.796996 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | a7790e50-869c-4ca1-bf52-525882dabd9f | SUCCESS | 2022-06-24 15:30:09.209559 | 2022-06-24 15:52:55.490071 | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 35f7d4e5-99a0-4d12-b0c8-fc1fc333056d | SUCCESS | 2022-06-24 15:20:09.211379 | 2022-06-24 15:21:28.922291 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 35c6409c-4ddc-44d9-84ea-9b53644e9c9c | SUCCESS | 2022-06-24 15:17:23.163642 | 2022-06-24 15:18:49.144228 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 1bcbf810-5d4a-4a2e-9db5-473fda565682 | SUCCESS | 2022-06-24 15:16:06.052379 | 2022-06-24 15:17:23.039378 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 14997e06-7044-4885-9370-0ee88edc972a | SUCCESS | 2022-06-24 15:00:09.20949  | 2022-06-24 15:16:05.92712  | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 3b5127b3-d34b-4e06-879c-9801985632bd | SUCCESS | 2022-06-24 14:50:09.206651 | 2022-06-24 14:51:26.335789 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 672aea48-7816-40ce-846d-154cda90e1b8 | SUCCESS | 2022-06-24 14:47:23.218848 | 2022-06-24 14:48:39.561793 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | ae99249e-b5cf-4bdc-99a1-678fef48d39f | SUCCESS | 2022-06-24 14:31:30.074175 | 2022-06-24 14:47:23.086101 | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | b67f7253-e700-45af-b637-fe53b99ebf7a | SUCCESS | 2022-06-24 14:30:09.21037  | 2022-06-24 14:31:29.966229 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 915743ff-c02c-4b29-b0ad-ff145a454cbf | SUCCESS | 2022-06-24 14:20:09.20522  | 2022-06-24 14:21:30.144355 | */10 * * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | adb44cba-74e1-40d1-9e35-896fb4383c2c | SUCCESS | 2022-06-24 14:17:30.873497 | 2022-06-24 14:18:52.084532 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 13086380-902b-4b95-a705-585cd1508131 | SUCCESS | 2022-06-24 14:01:33.254644 | 2022-06-24 14:17:30.754011 | */30 13-23,0-1 * * *
 749906b3-2d0f-4dcc-b408-2ca5b1ecfc0d | 3fae5a13-6084-4429-83fe-7d1cdcb18cd1 | SUCCESS | 2022-06-24 14:00:09.210597 | 2022-06-24 14:01:33.143125 | */10 * * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | f39e27c7-5efe-45d7-8888-8e4e0f178535 | SUCCESS | 2022-06-24 13:30:09.207895 | 2022-06-24 13:45:32.666713 | */30 13-23,0-1 * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | f6941d3e-6be6-47d0-aa68-cc3a7b45ac26 | SUCCESS | 2022-06-24 13:01:47.251738 | 2022-06-24 13:15:55.665381 | */30 13-23,0-1 * * *
 db708f54-fc26-4e5e-8fbc-21ae93846e3f | 28fb5b15-058f-43b0-8c10-6ed093f371bc | SUCCESS | 2022-06-24 13:00:09.202578 | 2022-06-24 13:00:29.264074 | 0 3,13 * * *
 db708f54-fc26-4e5e-8fbc-21ae93846e3f | 5116c8fb-bc86-4285-9faf-e6d747d50347 | SUCCESS | 2022-06-24 03:00:09.174804 | 2022-06-24 03:00:35.905998 | 0 3,13 * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 63cf38ab-9c3b-4809-9672-543708f61d77 | SUCCESS | 2022-06-24 01:30:09.174721 | 2022-06-24 01:44:36.873517 | */30 13-23,0-1 * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | ea148036-5c50-4774-8edc-70f9f216e29b | SUCCESS | 2022-06-24 01:01:25.996502 | 2022-06-24 01:16:14.170379 | */30 13-23,0-1 * * *
 824e689c-a141-4ac3-8c14-a1f129bb30f9 | 5cc4bbbe-e8da-41aa-a5ed-9f8184c712d5 | SUCCESS | 2022-06-24 00:31:21.568974 | 2022-06-24 00:45:50.302359 | */30 13-23,0-1 * * *
 db708f54-fc26-4e5e-8fbc-21ae93846e3f | f30d0ab7-25a9-40ea-a41f-c197a5821b65 | SUCCESS | 2022-06-23 13:00:07.64629  | 2022-06-23 13:00:34.109994 | 0 3,13 * * *
Here is the result of the last query.
@Jacopo it seems that the Orchest settings are not being saved after restarting Orchest.
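One way to read the dump above: if a job-run parallelism of 15 were in effect, some runs from different jobs would be expected to overlap in time. The following Python sketch, using a handful of started_time and finished_time pairs copied from the dump, computes the peak number of runs executing at once. The helper names `to_intervals` and `max_concurrency` are hypothetical.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # %f accepts 1 to 6 fractional digits

# (started_time, finished_time) pairs copied from the dump above.
ROWS = [
    ("2022-06-24 16:20:09.213425", "2022-06-24 16:21:26.559623"),
    ("2022-06-24 16:17:10.577657", "2022-06-24 16:18:26.524951"),
    ("2022-06-24 16:01:34.814202", "2022-06-24 16:17:10.448884"),
    ("2022-06-24 16:00:09.212336", "2022-06-24 16:01:34.704894"),
    ("2022-06-24 15:55:19.639697", "2022-06-24 15:56:33.51986"),
    ("2022-06-24 15:54:10.898963", "2022-06-24 15:55:19.530211"),
    ("2022-06-24 15:52:55.632757", "2022-06-24 15:54:10.796996"),
    ("2022-06-24 15:30:09.209559", "2022-06-24 15:52:55.490071"),
]


def to_intervals(rows):
    """Parse (start, finish) timestamp strings into datetime pairs."""
    return [(datetime.strptime(s, FMT), datetime.strptime(f, FMT)) for s, f in rows]


def max_concurrency(intervals):
    """Return the peak number of intervals that are open at the same time."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a run starts
        events.append((end, -1))    # a run finishes
    events.sort()  # at equal timestamps, finishes (-1) sort before starts (+1)
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


print(max_concurrency(to_intervals(ROWS)))
```

For these rows the peak is 1: each run starts a fraction of a second after the previous one finishes, which is consistent with runs being executed strictly one at a time.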

Jacopo

06/24/2022, 4:38 PM
Alright, ty for reporting the result. It looks like the settings are stored in the db correctly but are not being applied. We are looking into it!
As a temporary workaround, you can manually patch the celery environment to control parallelism, but mind that the celery worker will be restarted and ongoing pipeline runs will be interrupted:
kubectl set env -n orchest deployment/celery-worker MAX_JOB_RUNS_PARALLELISM=15
The operation can also be done while Orchest is not running, i.e. after an orchest stop.
we are looking into how to fix the issue btw, not sure we will hit a release today tho
keep in mind that after a restart said patch won't last since the deployment is created again, @Navid H correct me if I'm wrong
we'll try to fix this and release on Monday btw

Diego Vasquez

06/24/2022, 9:45 PM
thanks jacopo, i will try with your solution for the celery-worker in the meantime
👍 1

Jacopo

06/27/2022, 11:59 AM
@Diego Vasquez a new release fixing this issue has been made (https://orchest.slack.com/archives/C018A6TGJR3/p1656326580892419), thanks for reporting the problem

Diego Vasquez

06/28/2022, 8:59 PM
thanks @Jacopo!
👍 1