
Lex Avstreikh

10/14/2022, 1:10 PM
I don't suppose the project files can be updated with the repo's latest commits?

Yannick

10/14/2022, 1:13 PM
You mean in an automatic fashion where you push to a remote repo and the project files in Orchest are automatically updated (i.e. following a certain branch)? EDIT: If that is the case, then no (not yet). Still scoping how we would want this to work: https://github.com/orchest/orchest/issues/1122

Lex Avstreikh

10/14/2022, 1:26 PM
Indeed, following a certain branch. Noted then. So I assume the way to do it until then would be to create a new project instead?

Yannick

10/16/2022, 4:39 PM
If I wanted to follow a specific branch, I would probably create a job with a recurring schedule of 1 minute and, in that job, create a sentinel node that checks whether the branch has new commits on the remote. If it does, let the step succeed so the other steps will also run. Do note that jobs run on a snapshot (for reproducibility), so you need to store state (e.g. in /data) recording the last commit the job ran on. Does that make sense?
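A minimal sketch of such a sentinel step, under some assumptions: the state file name under /data, the function names, and the choice to fail the step with a non-zero exit code when nothing changed are all illustrative and not taken from Orchest itself. It relies on the standard `git ls-remote` command to read the branch head without cloning.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical location; /data is used because job snapshots do not persist project files.
STATE_FILE = Path("/data/last_seen_commit.txt")


def remote_head(repo_url: str, branch: str) -> str:
    """Return the commit SHA at the tip of `branch` on the remote ("" if not found)."""
    out = subprocess.run(
        ["git", "ls-remote", repo_url, f"refs/heads/{branch}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    # Each output line looks like "<sha>\trefs/heads/<branch>".
    return out.split()[0] if out.strip() else ""


def has_new_commit(remote_sha: str, state_file: Path) -> bool:
    """Compare the remote SHA with the stored one and record it if it changed."""
    last_seen = state_file.read_text().strip() if state_file.exists() else ""
    if not remote_sha or remote_sha == last_seen:
        return False
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(remote_sha)
    return True


def main(repo_url: str, branch: str) -> None:
    """Succeed only when the branch moved, so downstream steps run only then."""
    if not has_new_commit(remote_head(repo_url, branch), STATE_FILE):
        sys.exit(1)  # assumption: a failing step keeps the following steps from running
```

In the sentinel step one would call something like main("https://github.com/<user>/<repo>.git", "main"), with the URL and branch replaced by your own.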

Lex Avstreikh

10/18/2022, 8:03 AM
I think it does make sense if you want to follow a specific branch, yes. I might add that later on. Currently I am working on a full end-to-end pipeline for machine learning that would run every day on a schedule; my main concern was that the output of the model was being written to the project files. But I think I figured out a workaround 🙂

Yannick

10/18/2022, 8:21 AM
Sounds great! What workaround are you thinking about?

Lex Avstreikh

10/18/2022, 4:25 PM
So: storing the GitHub token in a variable and then using that variable with GitPython to push to the remote.
I am actually about done. I'll do some polish tomorrow and I think I'll share the tutorial later on 🙂

Yannick

10/18/2022, 4:33 PM
Sounds like a great solution indeed! 🤓 Did you use an environment variable to set the GH token? Setting the token directly as a variable (e.g. token = '<secret-token>') might not be the safest approach when pushing that to a remote.
Would love to see that tutorial and include it on https://github.com/orchest/orchest-examples (these examples are loaded in the product 🙌 )

Lex Avstreikh

10/18/2022, 4:54 PM
So you can’t set the token as a variable in a notebook that you would run; GitHub deletes the token afterwards (or more like you can only do it once)
So you have to set it as an env variable indeed
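A rough sketch of that approach, assuming the token lives in an environment variable named GITHUB_TOKEN (a hypothetical name) and that the remote is a GitHub HTTPS URL. The helper names are made up for illustration, and the `x-access-token` user name in the URL is one common convention for token-based HTTPS pushes; this is not necessarily how the pipeline discussed here does it. GitPython is a third-party package and is only imported inside the push helper.

```python
import os
from urllib.parse import urlsplit, urlunsplit


def authenticated_url(repo_url: str, env_var: str = "GITHUB_TOKEN") -> str:
    """Build an HTTPS remote URL that carries the token read from the environment."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"environment variable {env_var} is not set")
    parts = urlsplit(repo_url)
    # The token never appears in the notebook source, only in the process environment.
    netloc = f"x-access-token:{token}@{parts.hostname}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def push_outputs(repo_dir: str, repo_url: str, branch: str,
                 message: str = "Update model outputs") -> None:
    """Commit everything that changed in repo_dir and push it to the given branch."""
    import git  # GitPython (third-party), imported lazily

    repo = git.Repo(repo_dir)
    if not repo.is_dirty(untracked_files=True):
        return  # nothing to commit
    repo.git.add(A=True)        # runs `git add -A`
    repo.index.commit(message)
    repo.git.push(authenticated_url(repo_url), f"HEAD:refs/heads/{branch}")
```

Because the token is read at run time, the notebook that gets pushed to the repository contains no secret.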

Yannick

10/18/2022, 5:11 PM
Interesting, didn't know GH does that. Thanks for sharing :)

Lex Avstreikh

10/18/2022, 7:37 PM
I should be a bit more precise on the why: if that notebook is in your repository, the token gets deleted from your account. In my case the notebook was pushing itself to the repo (after all, if it's an end-to-end pipeline it needs to loop). This means that the moment the commit was pushed, using the token to authenticate, the token got deleted from the account. Theoretically, if the notebook is not pushed to the repo, it would still work indefinitely; it's just not too elegant, I suppose.

Yannick

10/18/2022, 8:02 PM
"the token gets deleted from your account."
Interesting 🤔. From an implementation perspective I am curious how they went about implementing it as I doubt that they store the token in plain text in their DB. I always like hearing these details, thank you!