
Michael Duncan

01/29/2021, 6:40 PM
Hi everyone! I'm trying to get started with this cool-looking project, but the UI doesn't seem to update. I've cloned the repo and run `./orchest install`, `./orchest update`, and `./orchest start`. When I try to create a project, nothing shows up in the UI and I can't select the default `Project`. The projects are being made and put into the `userdir/projects` directory, but I can't interact with them. Any tips would be greatly appreciated.

Yannick

01/29/2021, 7:15 PM
Hi @Michael Duncan! This is indeed not intended behavior ;) I assume the output of `./orchest version --ext` shows `v0.6.1` for every container?

Rick Lamers

01/29/2021, 7:32 PM
`docker logs orchest-webserver` and `docker logs orchest-api` could contain errors that are useful. Are you able to share those?
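For reference, one way to capture both logs in full for sharing is to redirect them to files (a minimal sketch; the file names are arbitrary):
```
# docker logs forwards the container's stderr to stderr, hence the 2>&1
docker logs orchest-webserver > orchest-webserver.log 2>&1
docker logs orchest-api > orchest-api.log 2>&1
```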

Michael Duncan

01/29/2021, 8:09 PM
@Yannick
Getting versions of all containers...
orchest/jupyter-enterprise-gateway:latest : v0.6.1
orchest/jupyter-server:latest : v0.6.1
orchest/memory-server:latest : v0.6.1
orchest/orchest-ctl:latest : v0.6.1
orchest/update-server:latest : v0.6.1
orchest/orchest-api:latest : v0.6.1
orchest/orchest-webserver:latest : v0.6.1
orchest/celery-worker:latest : v0.6.1
orchest/auth-server:latest : v0.6.1
orchest/file-manager:latest : v0.6.1
orchest/nginx-proxy:latest : v0.6.1
orchest/base-kernel-py:latest : v0.6.1
@Rick Lamers
`orchest-api`:
[2021-01-29 20:05:44 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2021-01-29 20:05:44 +0000] [1] [INFO] Listening at: http://0.0.0.0:80 (1)
[2021-01-29 20:05:44 +0000] [1] [INFO] Using worker: eventlet
[2021-01-29 20:05:44 +0000] [8] [INFO] Booting worker with pid: 8
INFO:apscheduler.scheduler:base.py - [29/Jan/2021 20:05:47] - Scheduler started
INFO:apscheduler.scheduler:base.py - [29/Jan/2021 20:05:47] - Added job "Scheduler.check_for_jobs_to_be_scheduled" to job store "default"
INFO:app:main.py - [29/Jan/2021 20:05:47] - Running orchest-api as root
INFO:app:main.py - [29/Jan/2021 20:05:47] - Running orchest-api as root
INFO:apscheduler.executors.default:base.py - [29/Jan/2021 20:05:57] - Running job "Scheduler.check_for_jobs_to_be_scheduled (trigger: interval[0:00:10], next run at: 2021-01-29 20:05:57 UTC)" (scheduled at 2021-01-29 20:05:57.909404+00:00)
INFO:job-scheduler:scheduler.py - [29/Jan/2021 20:05:57] - found 0 jobs to run
INFO:job-scheduler:scheduler.py - [29/Jan/2021 20:05:57] - found 0 jobs to run
INFO:apscheduler.executors.default:base.py - [29/Jan/2021 20:05:57] - Job "Scheduler.check_for_jobs_to_be_scheduled (trigger: interval[0:00:10], next run at: 2021-01-29 20:06:07 UTC)" executed successfully
INFO:apscheduler.executors.default:base.py - [29/Jan/2021 20:06:07] - Running job "Scheduler.check_for_jobs_to_be_scheduled (trigger: interval[0:00:10], next run at: 2021-01-29 20:06:17 UTC)" (scheduled at 2021-01-29 20:06:07.909404+00:00)
INFO:job-scheduler:scheduler.py - [29/Jan/2021 20:06:07] - found 0 jobs to run
INFO:job-scheduler:scheduler.py - [29/Jan/2021 20:06:07] - found 0 jobs to run
INFO:apscheduler.executors.default:base.py - [29/Jan/2021 20:06:07] - Job "Scheduler.check_for_jobs_to_be_scheduled (trigger: interval[0:00:10], next run at: 2021-01-29 20:06:17 UTC)" executed successfully
INFO:apscheduler.executors.default:base.py - [29/Jan/2021 20:06:17] - Running job "Scheduler.check_for_jobs_to_be_scheduled (trigger: interval[0:00:10], next run at: 2021-01-29 20:06:27 UTC)" (scheduled at 2021-01-29 20:06:17.909404+00:00)
INFO:job-scheduler:scheduler.py - [29/Jan/2021 20:06:17] - found 0 jobs to run
INFO:job-scheduler:scheduler.py - [29/Jan/2021 20:06:17] - found 0 jobs to run
INFO:apscheduler.executors.default:base.py - [29/Jan/2021 20:06:17] - Job "Scheduler.check_for_jobs_to_be_scheduled (trigger: interval[0:00:10], next run at: 2021-01-29 20:06:27 UTC)" executed successfully
INFO:apscheduler.executors.default:base.py - [29/Jan/2021 20:06:27] - Running job "Scheduler.check_for_jobs_to_be_scheduled (trigger: interval[0:00:10], next run at: 2021-01-29 20:06:37 UTC)" (scheduled at 2021-01-29 20:06:27.909404+00:00)
INFO:job-scheduler:scheduler.py - [29/Jan/2021 20:06:27] - found 0 jobs to run
INFO:job-scheduler:scheduler.py - [29/Jan/2021 20:06:27] - found 0 jobs to run
`orchest-webserver`:
[2021-01-29 20:05:44 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2021-01-29 20:05:44 +0000] [1] [INFO] Listening at: http://0.0.0.0:80 (1)
[2021-01-29 20:05:44 +0000] [1] [INFO] Using worker: eventlet
[2021-01-29 20:05:44 +0000] [8] [INFO] Booting worker with pid: 8
INFO:apscheduler.scheduler:base.py - [29/Jan/2021 20:05:47] - Scheduler started
INFO:app:__init__.py - [29/Jan/2021 20:05:47] - Flask CONFIG: <Config {'ENV': 'production',...}>
INFO:apscheduler.scheduler:base.py - [29/Jan/2021 20:05:47] - Added job "analytics_ping" to job store "default"
INFO:app:__init__.py - [29/Jan/2021 20:05:48] - Started log_streamer.py
INFO:app:__init__.py - [29/Jan/2021 20:05:48] - Started log_streamer.py
INFO:app:main.py - [29/Jan/2021 20:05:48] - Running orchest-webserver as root
INFO:app:main.py - [29/Jan/2021 20:05:48] - Running orchest-webserver as root
INFO:app:main.py - [29/Jan/2021 20:05:48] - Running from Gunicorn
INFO:app:main.py - [29/Jan/2021 20:05:48] - Running from Gunicorn
INFO:root:log_streamer started
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:05:48] - 127.0.0.1 - - [29/Jan/2021:20:05:48 +0000] "GET /socket.io/?transport=polling&EIO=3&t=1611950748.2921224 HTTP/1.1" 200 119 "-" "python-requests/2.25.0"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:05:48] - 127.0.0.1 - - [29/Jan/2021:20:05:48 +0000] "GET /socket.io/?transport=polling&EIO=3&t=1611950748.2921224 HTTP/1.1" 200 119 "-" "python-requests/2.25.0"
INFO:socketio.client:Engine.IO connection established
INFO:socketio.client:Namespace / is connected
INFO:root:Entered file_reader_loop
INFO:app:socketio_server.py - [29/Jan/2021 20:05:48] - socket.io client connected on /pty
INFO:app:socketio_server.py - [29/Jan/2021 20:05:48] - socket.io client connected on /pty
INFO:socketio.client:Namespace /pty is connected
INFO:root:SocketIO connection established on namespace /pty
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:06:33] - 192.168.176.9 - - [29/Jan/2021:20:06:33 +0000] "GET / HTTP/1.1" 200 4746 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:06:33] - 192.168.176.9 - - [29/Jan/2021:20:06:33 +0000] "GET / HTTP/1.1" 200 4746 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:06:33] - 192.168.176.9 - - [29/Jan/2021:20:06:33 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:06:33] - 192.168.176.9 - - [29/Jan/2021:20:06:33 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:06:34] - 192.168.176.9 - - [29/Jan/2021:20:06:34 +0000] "GET /async/projects HTTP/1.1" 200 3 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:06:34] - 192.168.176.9 - - [29/Jan/2021:20:06:34 +0000] "GET /async/projects HTTP/1.1" 200 3 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
No errors from these logs. The only error I've been able to find is this when trying to update through the UI:

Yannick

01/29/2021, 8:17 PM
I have just pushed a fix for the update through the UI. Updating through the UI hits a different code path than updating through the CLI, so updating through the CLI should still work fine.
Thanks for providing the logs and version information. I am thinking about what could cause the issue, but don't really know... Could you run `scripts/clean_userdir.sh`? This cleans your userdir (assuming you have not yet been able to use Orchest and so don't have any files that you want to keep).

Michael Duncan

01/29/2021, 8:24 PM
That works. The project files got removed, but new projects still don't show up in the UI, and trying to create a Pipeline or Job sends me to the blank Projects tab.
I clicked Projects again and am getting this in the webserver logs (the worker logs the same traceback twice):
ERROR:gunicorn.error:glogging.py - [29/Jan/2021 20:24:46] - Error handling request /async/projects
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 609, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.8/site-packages/eventlet/support/psycopg2_patcher.py", line 46, in eventlet_wait_callback
    state = conn.poll()
psycopg2.OperationalError: server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/gunicorn/workers/base_async.py", line 55, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/usr/local/lib/python3.8/site-packages/gunicorn/workers/base_async.py", line 106, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2464, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python3.8/site-packages/flask_socketio/__init__.py", line 45, in __call__
    return super(_SocketIOMiddleware, self).__call__(environ,
  File "/usr/local/lib/python3.8/site-packages/engineio/middleware.py", line 74, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2450, in wsgi_app
    response = self.handle_exception(e)
  File "/usr/local/lib/python3.8/site-packages/flask_restful/__init__.py", line 272, in error_router
    return original_handler(e)
  File "/usr/local/lib/python3.8/site-packages/flask_restful/__init__.py", line 272, in error_router
    return original_handler(e)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1867, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/site-packages/flask_restful/__init__.py", line 272, in error_router
    return original_handler(e)
  File "/usr/local/lib/python3.8/site-packages/flask_restful/__init__.py", line 272, in error_router
    return original_handler(e)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/orchest/services/orchest-webserver/app/app/views/views.py", line 414, in projects_get
    discoverFSDeletedProjects()
  File "/orchest/services/orchest-webserver/app/app/views/views.py", line 362, in discoverFSDeletedProjects
    fs_removed_projects = Project.query.filter(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3373, in all
    return list(self)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 609, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.8/site-packages/eventlet/support/psycopg2_patcher.py", line 46, in eventlet_wait_callback
    state = conn.poll()
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.

[SQL: SELECT projects.uuid AS projects_uuid, projects.path AS projects_path, projects.status AS projects_status
FROM projects
WHERE 1 = 1 AND projects.status IN (%(status_1)s)]
[parameters: {'status_1': 'READY'}]
(Background on this error at: http://sqlalche.me/e/13/e3q8)
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:24:46] - 192.168.176.9 - - [29/Jan/2021:20:24:46 +0000] "GET /async/projects HTTP/1.1" 500 0 "-" "-"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 20:24:46] - 192.168.176.9 - - [29/Jan/2021:20:24:46 +0000] "GET /async/projects HTTP/1.1" 500 0 "-" "-"

Rick Lamers

01/29/2021, 8:26 PM
Could you share a bit about your host setup? What OS are you running on? What are the permissions of your orchest/ dir?

Michael Duncan

01/29/2021, 8:28 PM
macOS Big Sur version 11.1
`drwxr-xr-x orchest`

Rick Lamers

01/29/2021, 8:29 PM
Going to quickly try to replicate; should have an answer in a few minutes.
Did you stop Orchest before running `scripts/clean_userdir.sh`?
I can verify a clean install is working for me on macOS Big Sur.
(Avoid using the web updater until the fix on master hits the release in 0.6.2.)
The error seems to be caused by an operational error in the Postgres DB server. `docker logs orchest-database` can be informative. `docker stats` could also help to find out if there is potentially a resource (e.g. RAM) issue. But I'm still curious about a `./orchest stop` followed by a `scripts/clean_userdir.sh` and `./orchest start` (see the sketch below).
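For reference, the full reset sequence plus a one-shot resource check would look something like this (a sketch; `--no-stream` is a standard `docker stats` flag that prints a single sample instead of streaming):
```
./orchest stop               # stop all Orchest containers first
scripts/clean_userdir.sh     # wipe the userdir only while Orchest is stopped
./orchest start

docker stats --no-stream     # single snapshot of per-container CPU/RAM usage
```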

Michael Duncan

01/29/2021, 8:55 PM
I think the postgres error was caused by running `scripts/clean_userdir.sh` before stopping, but the UI still doesn't work for me after running a `./orchest stop` followed by a `scripts/clean_userdir.sh` and `./orchest start`.

Rick Lamers

01/29/2021, 9:02 PM
Could you run `docker exec -it --user postgres orchest-database psql --db orchest_webserver` and query `SELECT * FROM projects;`?
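The same query can also be run non-interactively, which makes the result easier to paste (a sketch using psql's standard `-c` flag):
```
docker exec --user postgres orchest-database \
  psql --db orchest_webserver -c 'SELECT * FROM projects;'
```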

Michael Duncan

01/29/2021, 9:03 PM
orchest_webserver=# SELECT * FROM projects;
                 uuid                 |    path     |    status
--------------------------------------+-------------+--------------
 12eb115b-d22e-46c4-a497-5e3a678a1d4d | gasdf       | INITIALIZING
 b0fb2a68-befa-4c95-9116-5f3e2a1d610d | gsdafgsasf  | INITIALIZING
 5b33dfa9-e104-4578-ba45-c7ffd0387eff | new_project | INITIALIZING
 817f865b-c862-462f-b122-817552bdad74 | gsdfsd      | INITIALIZING
(4 rows)
Those are the projects I tried to get to show up.

Rick Lamers

01/29/2021, 9:08 PM
Thanks for bearing with us; it looks like we're not correctly handling an edge case on project initialization. Could you run `./orchest stop`, `scripts/clean_userdir.sh`, and `./orchest start`, create a single new project, and then check `docker logs orchest-webserver`? It should log the exception that occurs on project creation.

Michael Duncan

01/29/2021, 9:21 PM
The problem is that there is no exception. I tried to import the quickstart repo as a project so you can see it in the logs, and there is no exception raised:
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:17:58] - 192.168.176.3 - - [29/Jan/2021:21:17:58 +0000] "POST /async/projects/import-git HTTP/1.1" 200 156 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:17:58] - 192.168.176.3 - - [29/Jan/2021:21:17:58 +0000] "POST /async/projects/import-git HTTP/1.1" 200 156 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:17:59] - 127.0.0.1 - - [29/Jan/2021:21:17:59 +0000] "PUT /async/background-tasks/44fdc9ce-5ea9-4411-8926-0571d02cbe9b HTTP/1.1" 200 156 "-" "python-requests/2.25.0"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:17:59] - 127.0.0.1 - - [29/Jan/2021:21:17:59 +0000] "PUT /async/background-tasks/44fdc9ce-5ea9-4411-8926-0571d02cbe9b HTTP/1.1" 200 156 "-" "python-requests/2.25.0"
Cloning into 'quickstart'...
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:17:59] - 192.168.176.3 - - [29/Jan/2021:21:17:59 +0000] "GET /async/background-tasks/44fdc9ce-5ea9-4411-8926-0571d02cbe9b HTTP/1.1" 200 156 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:17:59] - 192.168.176.3 - - [29/Jan/2021:21:17:59 +0000] "GET /async/background-tasks/44fdc9ce-5ea9-4411-8926-0571d02cbe9b HTTP/1.1" 200 156 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:00] - 127.0.0.1 - - [29/Jan/2021:21:18:00 +0000] "PUT /async/background-tasks/44fdc9ce-5ea9-4411-8926-0571d02cbe9b HTTP/1.1" 200 172 "-" "python-requests/2.25.0"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:00] - 127.0.0.1 - - [29/Jan/2021:21:18:00 +0000] "PUT /async/background-tasks/44fdc9ce-5ea9-4411-8926-0571d02cbe9b HTTP/1.1" 200 172 "-" "python-requests/2.25.0"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:00] - 192.168.176.3 - - [29/Jan/2021:21:18:00 +0000] "GET /async/background-tasks/44fdc9ce-5ea9-4411-8926-0571d02cbe9b HTTP/1.1" 200 172 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:00] - 192.168.176.3 - - [29/Jan/2021:21:18:00 +0000] "GET /async/background-tasks/44fdc9ce-5ea9-4411-8926-0571d02cbe9b HTTP/1.1" 200 172 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:51] - 192.168.176.3 - - [29/Jan/2021:21:18:51 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:51] - 192.168.176.3 - - [29/Jan/2021:21:18:51 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:51] - 192.168.176.3 - - [29/Jan/2021:21:18:51 +0000] "GET /store/datasources?show_internal=false HTTP/1.1" 200 3 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:51] - 192.168.176.3 - - [29/Jan/2021:21:18:51 +0000] "GET /store/datasources?show_internal=false HTTP/1.1" 200 3 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:52] - 192.168.176.3 - - [29/Jan/2021:21:18:52 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:18:52] - 192.168.176.3 - - [29/Jan/2021:21:18:52 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:02] - 192.168.176.3 - - [29/Jan/2021:21:19:02 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:02] - 192.168.176.3 - - [29/Jan/2021:21:19:02 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:02] - 192.168.176.3 - - [29/Jan/2021:21:19:02 +0000] "GET /async/projects HTTP/1.1" 200 3 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:02] - 192.168.176.3 - - [29/Jan/2021:21:19:02 +0000] "GET /async/projects HTTP/1.1" 200 3 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:05] - 192.168.176.3 - - [29/Jan/2021:21:19:05 +0000] "GET / HTTP/1.1" 200 4746 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:05] - 192.168.176.3 - - [29/Jan/2021:21:19:05 +0000] "GET / HTTP/1.1" 200 4746 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:05] - 192.168.176.3 - - [29/Jan/2021:21:19:05 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:05] - 192.168.176.3 - - [29/Jan/2021:21:19:05 +0000] "POST /analytics HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:05] - 192.168.176.3 - - [29/Jan/2021:21:19:05 +0000] "GET /async/projects HTTP/1.1" 200 3 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:05] - 192.168.176.3 - - [29/Jan/2021:21:19:05 +0000] "GET /async/projects HTTP/1.1" 200 3 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:05] - 192.168.176.3 - - [29/Jan/2021:21:19:05 +0000] "GET /public/image/favicon.png HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:19:05] - 192.168.176.3 - - [29/Jan/2021:21:19:05 +0000] "GET /public/image/favicon.png HTTP/1.1" 200 0 "http://localhost:8000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36"
But it's not there...

Rick Lamers

01/29/2021, 9:22 PM
Is `docker exec -it --user postgres orchest-database psql --db orchest_webserver` with the query `SELECT * FROM projects;` probably also giving a STATUS of INITIALIZING for quickstart?

Michael Duncan

01/29/2021, 9:23 PM
The db table from before:
docker exec -it --user postgres orchest-database psql --db orchest_webserver
psql (13.1 (Debian 13.1-1.pgdg100+1))
Type "help" for help.

orchest_webserver=# SELECT * FROM projects;
                 uuid                 |    path    |    status
--------------------------------------+------------+--------------
 fb23951b-e612-408f-9d8c-c9334cae5265 | quickstart | INITIALIZING
(1 row)

Rick Lamers

01/29/2021, 9:24 PM
It could be stuck waiting for callbacks to the `orchest-api`. Can you print those logs? (`docker logs orchest-api`)

Michael Duncan

01/29/2021, 9:28 PM
[2021-01-29 21:17:12 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2021-01-29 21:17:12 +0000] [1] [INFO] Listening at: http://0.0.0.0:80 (1)
[2021-01-29 21:17:12 +0000] [1] [INFO] Using worker: eventlet
[2021-01-29 21:17:12 +0000] [9] [INFO] Booting worker with pid: 9
INFO:apscheduler.scheduler:base.py - [29/Jan/2021 21:17:18] - Scheduler started
INFO:apscheduler.scheduler:base.py - [29/Jan/2021 21:17:18] - Added job "Scheduler.check_for_jobs_to_be_scheduled" to job store "default"
INFO:app:main.py - [29/Jan/2021 21:17:18] - Running orchest-api as root
INFO:app:main.py - [29/Jan/2021 21:17:18] - Running orchest-api as root
INFO:apscheduler.executors.default:base.py - [29/Jan/2021 21:17:28] - Running job "Scheduler.check_for_jobs_to_be_scheduled (trigger: interval[0:00:10], next run at: 2021-01-29 21:17:28 UTC)" (scheduled at 2021-01-29 21:17:28.161965+00:00)
INFO:job-scheduler:scheduler.py - [29/Jan/2021 21:17:28] - found 0 jobs to run
INFO:job-scheduler:scheduler.py - [29/Jan/2021 21:17:28] - found 0 jobs to run
INFO:apscheduler.executors.default:base.py - [29/Jan/2021 21:17:28] - Job "Scheduler.check_for_jobs_to_be_scheduled (trigger: interval[0:00:10], next run at: 2021-01-29 21:17:38 UTC)" executed successfully
This keeps repeating; the same "Running job / found 0 jobs to run / executed successfully" cycle recurs every 10 seconds through 21:26:38 with nothing else in between.

Rick Lamers

01/29/2021, 9:30 PM
Alright, I think we have a direction of where it's going wrong. During project creation the `orchest-api` should be queried. The job scheduler is a bit verbose, but it is not showing anything out of the ordinary. Does `docker logs orchest-api 2>&1 | grep gunicorn` give you any lines?

Michael Duncan

01/29/2021, 9:35 PM
Only this: `[2021-01-29 21:17:12 +0000] [1] [INFO] Starting gunicorn 20.0.4`

Rick Lamers

01/29/2021, 9:36 PM
Looks like no web requests from the `orchest-webserver` are able to reach `orchest-api`. Can you share the output of `docker network inspect orchest`?
P.S. Which version of Docker on macOS are you using?
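If the full JSON turns out to be unwieldy, the attached containers can also be listed with a Go-template filter (a sketch using the standard `--format` flag of `docker network inspect`):
```
# prints one "name ip/prefix" line per container on the network
docker network inspect orchest \
  --format '{{range .Containers}}{{.Name}} {{.IPv4Address}}{{"\n"}}{{end}}'
```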

Michael Duncan

01/29/2021, 9:39 PM
`docker version`:
Client: Docker Engine - Community
 Cloud integration: 1.0.7
 Version:           20.10.2
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        2291f61
 Built:             Mon Dec 28 16:12:42 2020
 OS/Arch:           darwin/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.2
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       8891c58
  Built:            Mon Dec 28 16:15:28 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
`docker network inspect orchest`:
[
    {
        "Name": "orchest",
        "Id": "60385328c83eaf17c08dc155cd23626941c326a6f0010c4bf0b159cb50f6084a",
        "Created": "2021-01-29T15:49:47.9434325Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "192.168.176.0/20",
                    "Gateway": "192.168.176.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "06df22f2c8902be119bb0328bade2fee2e9f9e197eaee8587947c1ddef1c0749": {
                "Name": "orchest-api",
                "EndpointID": "cf7b337386780901d89d2f34a9c23e88fd9ea63d4637cb7c721594b0beea23c7",
                "MacAddress": "02:42:c0:a8:b0:06",
                "IPv4Address": "192.168.176.6/20",
                "IPv6Address": ""
            },
            "2504e44f635650db61da5821b2263e0c1fb9b1d13a50f2ee8a6e3615ca3ad710": {
                "Name": "orchest-database",
                "EndpointID": "12f1c3c9c505bee738e6c8e1bb32feb9947e234f55889803456eb0904a46ef99",
                "MacAddress": "02:42:c0:a8:b0:04",
                "IPv4Address": "192.168.176.4/20",
                "IPv6Address": ""
            },
            "7aadad5bbe6ed788f8c2b535d5c84d1a7692fc8dc315840c49d4a3b56d2897a4": {
                "Name": "orchest-webserver",
                "EndpointID": "1cb1a1f4dd17dc223a296b2c154e0822001a60915a13821c7507ee1e9fae2534",
                "MacAddress": "02:42:c0:a8:b0:08",
                "IPv4Address": "192.168.176.8/20",
                "IPv6Address": ""
            },
            "877be01eab881153faa186a319d869a665e3091a01bb8bb89ec14adafbf15683": {
                "Name": "nginx-proxy",
                "EndpointID": "3cf6cfd8bae999c50317c9890b2180cdf8d148e93d41e650e4f5025ff11e2981",
                "MacAddress": "02:42:c0:a8:b0:03",
                "IPv4Address": "192.168.176.3/20",
                "IPv6Address": ""
            },
            "8cbe0fc33e43965604c37fc8649ff5936bc3b1a8bde20fd954a52dfb2ecae406": {
                "Name": "celery-worker",
                "EndpointID": "07e2e6eb7def37537a71aa7625ae9774c2ed64fb216e3005c9f592dcfd685548",
                "MacAddress": "02:42:c0:a8:b0:05",
                "IPv4Address": "192.168.176.5/20",
                "IPv6Address": ""
            },
            "ac26d7e44ea6a6a98a8410b38382b31b0f06cdb0d2eb75c3c0382eb5ef6e4476": {
                "Name": "auth-server",
                "EndpointID": "0e19e145a925be7bb01dad539a524d5fb4d8e9b9db1c7eed3ca8e5ff9b07508e",
                "MacAddress": "02:42:c0:a8:b0:07",
                "IPv4Address": "192.168.176.7/20",
                "IPv6Address": ""
            },
            "d720300853fc9e5f8236f6aba6b7b9cab4443066ed44866dcfd045120a9931bd": {
                "Name": "file-manager",
                "EndpointID": "6c8807c936cf3eeb199b3fe29f34022c9469cf5258b2b29b66d6a5ccf3270c4f",
                "MacAddress": "02:42:c0:a8:b0:02",
                "IPv4Address": "192.168.176.2/20",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]

Rick Lamers

01/29/2021, 9:41 PM
That's really helpful. Docker Desktop should have its own version too; it's most easily found through the Docker icon (top right) -> About Docker Desktop.
The output of `docker exec orchest-webserver bash -c 'wget http://orchest-api/api/'` would be helpful; this will establish whether they are connected in the Docker network.
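A variant that prints the response to stdout instead of saving an `index.html` inside the container (standard wget flags; `-q` suppresses progress output, `-O-` writes to stdout):
```
docker exec orchest-webserver bash -c 'wget -qO- http://orchest-api/api/ | head'
```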

Michael Duncan

01/29/2021, 9:47 PM
Docker Desktop version = 3.1.0
`docker exec orchest-webserver bash -c 'wget http://orchest-api/api/'`:
--2021-01-29 21:47:40--  http://orchest-api/api/
Resolving orchest-api (orchest-api)... 192.168.176.6
Connecting to orchest-api (orchest-api)|192.168.176.6|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3835 (3.7K) [text/html]
Saving to: ‘index.html’

     0K ...                                                   100%  468M=0s

2021-01-29 21:47:41 (468 MB/s) - ‘index.html’ saved [3835/3835]

Rick Lamers

01/29/2021, 9:54 PM
We're trying to pinpoint where the project creation fails. Can you show the outputs of:
`ls -la orchest/userdir/.orchest/kernels`
`ls -la orchest/userdir/projects/<your_project_name>/.orchest/environments`

Michael Duncan

01/29/2021, 9:56 PM
ls -la userdir/.orchest/kernels
total 8
drwxr-sr-x  4 michaelduncan  staff  128 Jan 29 14:18 .
drwxr-sr-x  7 michaelduncan  staff  224 Jan 29 14:17 ..
-rw-r--r--  1 michaelduncan  staff   13 Jan 29 14:13 .gitignore
drwxr-xr-x  4 michaelduncan  staff  128 Jan 29 14:18 fb23951b-e612-408f-9d8c-c9334cae5265

ls -la userdir/projects/quickstart/.orchest/environments
total 0
drwxrwxr-x@  3 michaelduncan  staff   96 Jan 29 14:18 .
drwxrwxr-x@  4 michaelduncan  staff  128 Jan 29 14:18 ..
drwxrwxr-x@  4 michaelduncan  staff  128 Jan 29 14:18 c56ab762-539c-4cce-9b1e-c4b00300ec6f

Rick Lamers

01/29/2021, 10:01 PM
What does http://127.0.0.1:8000/catch/api-proxy/api/runs/ return?
(Make sure Orchest is running 🤓)

Michael Duncan

01/29/2021, 10:04 PM
Orchest does appear to be running, and the output returns:
{
  "runs": []
}

Rick Lamers

01/29/2021, 10:13 PM
curl 'http://127.0.0.1:8000/catch/api-proxy/api/environment-builds' \
  -H 'Connection: keep-alive' \
  -H 'Pragma: no-cache' \
  -H 'Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0' \
  -H 'sec-ch-ua: "Chromium";v="88", "Google Chrome";v="88", ";Not A Brand";v="99"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36' \
  -H 'Content-Type: application/json;charset=UTF-8' \
  -H 'Accept: */*' \
  -H 'Origin: http://127.0.0.1:8000' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Referer: http://127.0.0.1:8000/' \
  -H 'Accept-Language: en-US,en;q=0.9,nl;q=0.8,und;q=0.7,mt;q=0.6,fy;q=0.5,pt;q=0.4,de;q=0.3,fr;q=0.2,la;q=0.1' \
  -H 'Cookie: email=admin; _ga=GA1.1.1962276334.1575881370; token=JKypplwOubCGHbBTXzLNhpMaaGROhVtm; session_uuid=db98cdc9-456d-4929-9c3b-3cc7a7346da2; csrftoken=uCcgpprIizYljYHzxgg88hT7d3cn3Mh4yKoSXJzyJV6bR2MOy78etBNCjQLquosn' \
  --data-raw '{"environment_build_requests":[{"environment_uuid":"c56ab762-539c-4cce-9b1e-c4b00300ec6f","project_uuid":"fb23951b-e612-408f-9d8c-c9334cae5265"}]}' \
  --compressed
Could you test the above? (I'm assuming you haven't yet enabled authentication on the Orchest instance; otherwise the token would mess things up in the above command.) It will attempt to trigger the build of an environment, which happens on project creation.
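Most of the browser headers in that command are noise for this test; a stripped-down equivalent would be something like the following (a sketch that assumes authentication is disabled, so no cookie or token headers are needed):
```
curl 'http://127.0.0.1:8000/catch/api-proxy/api/environment-builds' \
  -H 'Content-Type: application/json;charset=UTF-8' \
  --data-raw '{"environment_build_requests":[{"environment_uuid":"c56ab762-539c-4cce-9b1e-c4b00300ec6f","project_uuid":"fb23951b-e612-408f-9d8c-c9334cae5265"}]}'
```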
By the way, we could search a bit more on our own if you want to move on for now. Keen to make it work for you, but don’t want to overstep any bounds.

Michael Duncan

01/29/2021, 10:16 PM
I appreciate all the help! I'm running the above command now and I'll send you the response. I will have to get off pretty soon though.

Rick Lamers

01/29/2021, 10:20 PM
Seems like `orchest-api` is not logging web requests. Is `docker logs orchest-api 2>&1 | grep gunicorn` still only showing `Starting gunicorn…` after a call to http://127.0.0.1:8000/catch/api-proxy/api/runs/?

Michael Duncan

01/29/2021, 10:21 PM
[2021-01-29 21:17:12 +0000] [1] [INFO] Starting gunicorn 20.0.4
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:47:41] - 192.168.176.8 - - [29/Jan/2021:21:47:41 +0000] "GET /api/ HTTP/1.1" 200 3835 "-" "Wget/1.20.1 (linux-gnu)"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 21:47:41] - 192.168.176.8 - - [29/Jan/2021:21:47:41 +0000] "GET /api/ HTTP/1.1" 200 3835 "-" "Wget/1.20.1 (linux-gnu)"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 22:03:28] - 192.168.176.8 - - [29/Jan/2021:22:03:28 +0000] "GET /api/runs/ HTTP/1.1" 200 13 "-" "python-requests/2.25.0"
INFO:gunicorn.access:glogging.py - [29/Jan/2021 22:03:28] - 192.168.176.8 - - [29/Jan/2021:22:03:28 +0000] "GET /api/runs/ HTTP/1.1" 200 13 "-" "python-requests/2.25.0"
The long `curl` command is just hanging.

Rick Lamers

01/29/2021, 10:22 PM
Alright, I think we've found the cause.

Michael Duncan

01/29/2021, 10:24 PM
Okay, I'm going to get off. Thanks for all your help with this! It's nice to know there is engaged support behind Orchest. Excited to try it out.

Rick Lamers

01/29/2021, 10:25 PM
Definitely here to make it work. Thank you for helping us find the cause.
We'll def. do some more digging. If we solve it in the meanwhile I'll post to this thread.
Sorry we couldn't make it work for you on the first try, but we'll make sure it's working again soon.
If you have the time, the output of `docker exec celery-worker cat celery_env_builds.log` could help (and: `docker logs rabbitmq-server`).

Michael Duncan

01/29/2021, 10:28 PM
Error response from daemon: Container 8cbe0fc33e43965604c37fc8649ff5936bc3b1a8bde20fd954a52dfb2ecae406 is not running

Rick Lamers

01/29/2021, 10:29 PM
`docker logs celery-worker`?

Michael Duncan

01/29/2021, 10:29 PM
docker logs rabbitmq-server (each line appears twice in the raw output due to logger redirection):
21:16:58.623 [error] Failed to write to cookie file '/var/lib/rabbitmq/.erlang.cookie': enospc
21:16:59.434 [error] Too short cookie string
21:16:59.435 [error] Too short cookie string
(the "Too short cookie string" error keeps repeating until 21:17:06)
Distribution failed: {{:shutdown, {:failed_to_start_child, :auth, {'Too short cookie string', [{:auth, :init_cookie, 0, [file: 'auth.erl', line: 290]}, {:auth, :init, 1, [file: 'auth.erl', line: 144]}, {:gen_server, :init_it, 2, [file: 'gen_server.erl', line: 417]}, {:gen_server, :init_it, 6, [file: 'gen_server.erl', line: 385]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 226]}]}}}, {:child, :undefined, :net_sup_dynamic, {:erl_distribution, :start_link, [[:"rabbitmqcli-246-rabbit@rabbitmq-server", :shortnames, 15000], false, :net_sup_dynamic]}, :permanent, 1000, :supervisor, [:erl_distribution]}}
Configuring logger redirection
21:17:09.433 [error] Supervisor net_sup had child auth started with auth:start_link() at undefined exit with reason "Too short cookie string" in auth:init_cookie/0 line 290 in context start_error
21:17:09.433 [error] CRASH REPORT Process <0.199.0> with 0 neighbours crashed with reason: "Too short cookie string" in auth:init_cookie/0 line 290
BOOT FAILED
===========
Exception during startup:

supervisor:children_map/4 line 1171
supervisor:'-start_children/2-fun-0-'/3 line 355
supervisor:do_start_child/2 line 371
supervisor:do_start_child_i/3 line 385
rabbit_prelaunch:run_prelaunch_first_phase/0 line 27
rabbit_prelaunch:do_run/0 line 111
rabbit_prelaunch_dist:setup/1 line 15
rabbit_prelaunch_dist:duplicate_node_check/1 line 51
error:{badmatch,
       {error,
        {{shutdown,
          {failed_to_start_child,auth,
           {"Too short cookie string",
            [{auth,init_cookie,0,[{file,"auth.erl"},{line,290}]},
             {auth,init,1,[{file,"auth.erl"},{line,144}]},
             {gen_server,init_it,2,[{file,"gen_server.erl"},{line,417}]},
             {gen_server,init_it,6,[{file,"gen_server.erl"},{line,385}]},
             {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}}},
         {child,undefined,net_sup_dynamic,
          {erl_distribution,start_link,
           [[rabbit_prelaunch_471@localhost,shortnames],
            false,net_sup_dynamic]},
          permanent,1000,supervisor,
          [erl_distribution]}}}}

(the same badmatch is then reported by the rabbit_prelaunch_sup supervisor and the application controller)
Kernel pid terminated (application_controller) ({application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,{badmatch,{error,{{shutdown,{failed_to_start_child,auth,{"Too
Crash dump is being written to: erl_crash.dump...
docker logs celery-worker:
/usr/local/lib/python3.8/site-packages/celery/platforms.py:800: RuntimeWarning: You're running the worker with superuser privileges: this is absolutely not recommended!
Please specify a different user using the --uid option.
User information: uid=0 euid=0 gid=0 egid=0
  warnings.warn(RuntimeWarning(ROOT_DISCOURAGED.format(

 -------------- celery@worker-env-builds v4.4.2 (cliffs)
--- ***** -----
-- ******* ---- Linux-4.19.121-linuxkit-x86_64-with-glibc2.2.5 2021-01-29 21:17:19
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         app:0x7f8781062e20
- ** ---------- .> transport:   amqp://guest:**@rabbitmq-server:5672//
- ** ---------- .> results:     postgresql+psycopg2://postgres@orchest-database/celery_result_backend
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> environment_builds exchange=environment_builds(direct) key=environment_builds

[tasks]
  . app.core.tasks.build_environment
  . app.core.tasks.run_pipeline
  . app.core.tasks.start_non_interactive_pipeline_run

Rick Lamers

01/29/2021, 10:30 PM
Alright, at least we know that it's because the celery worker wasn't able to process the environment-build task that is triggered during project creation.
Massively helpful!
While we get to the root cause, an initial idea would be to pull the images afresh; the container error indicates image corruption. If you're on a limited connection, you could try just removing rabbitmq from your Docker images and invoking `./orchest install` (see the sketch below).
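Something like the following should do it (a sketch; take the exact rabbitmq image name and tag from the `docker images` output rather than assuming one):
```
./orchest stop
docker images | grep rabbitmq    # find the rabbitmq image and tag actually in use
docker rmi <image>:<tag>         # remove it so the next install pulls a fresh copy
./orchest install                # re-pulls any missing images
```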
Thanks @Jacopo for helping me troubleshoot.

Jacopo

01/29/2021, 10:56 PM
👍

Michael Duncan

02/01/2021, 3:20 PM
I re-pulled everything and got it working! Thanks for all the debugging help last week. For the root-cause analysis: I use a VPN for work that has a notoriously spotty connection at times.

Yannick

02/01/2021, 3:23 PM
Glad to hear things are working now! :)

Jacopo

02/01/2021, 3:26 PM
happy to hear!