can't connect to gateway cluster created from dask labextension #748

Open
rabernat opened this issue Sep 21, 2020 · 7 comments

@rabernat
Member

On GCP, I created a dask gateway cluster with the dask labextension. But when I injected the code to connect to it, it didn't work:

from dask.distributed import Client
client = Client("gateway://traefik-gcp-uscentral1b-prod-dask-gateway.prod:80/prod.b809bb1978fe4df4bd50476d2a843fce")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-24-e4fd84ddbafe> in <module>
      1 from dask.distributed import Client
      2 
----> 3 client = Client("gateway://traefik-gcp-uscentral1b-prod-dask-gateway.prod:80/prod.b809bb1978fe4df4bd50476d2a843fce")
      4 client

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in __init__(self, address, loop, timeout, set_as_default, scheduler_file, security, asynchronous, name, heartbeat_interval, serializers, deserializers, extensions, direct_to_workers, connection_limit, **kwargs)
    743             ext(self)
    744 
--> 745         self.start(timeout=timeout)
    746         Client._instances.add(self)
    747 

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in start(self, **kwargs)
    948             self._started = asyncio.ensure_future(self._start(**kwargs))
    949         else:
--> 950             sync(self.loop, self._start, **kwargs)
    951 
    952     def __await__(self):

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
    337     if error[0]:
    338         typ, exc, tb = error[0]
--> 339         raise exc.with_traceback(tb)
    340     else:
    341         return result[0]

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/utils.py in f()
    321             if callback_timeout is not None:
    322                 future = asyncio.wait_for(future, callback_timeout)
--> 323             result[0] = yield future
    324         except Exception as exc:
    325             error[0] = sys.exc_info()

/srv/conda/envs/notebook/lib/python3.7/site-packages/tornado/gen.py in run(self)
    733 
    734                     try:
--> 735                         value = future.result()
    736                     except Exception:
    737                         exc_info = sys.exc_info()

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in _start(self, timeout, **kwargs)
   1045 
   1046         try:
-> 1047             await self._ensure_connected(timeout=timeout)
   1048         except (OSError, ImportError):
   1049             await self._close()

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/client.py in _ensure_connected(self, timeout)
   1103         try:
   1104             comm = await connect(
-> 1105                 self.scheduler.address, timeout=timeout, **self.connection_args
   1106             )
   1107             comm.name = "Client->Scheduler"

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/comm/core.py in connect(addr, timeout, deserialize, handshake_overrides, **connection_args)
    258 
    259     scheme, loc = parse_address(addr)
--> 260     backend = registry.get_backend(scheme)
    261     connector = backend.get_connector()
    262     comm = None

/srv/conda/envs/notebook/lib/python3.7/site-packages/distributed/comm/registry.py in get_backend(scheme)
     81             raise ValueError(
     82                 "unknown address scheme %r (known schemes: %s)"
---> 83                 % (scheme, sorted(backends))
     84             )
     85         else:

ValueError: unknown address scheme 'gateway' (known schemes: ['inproc', 'tcp', 'tls', 'ucx'])
@TomAugspurger
Member

I think this is hitting dask/dask-labextension#135.

I can take a shot at fixing that in dask-labextension.

@rabernat
Member Author

Thanks Tom! Not very urgent, but I wanted to flag it.

@shanicetbailey
Contributor

Echoing @rabernat's issue, I'm getting the same error
ValueError: unknown address scheme 'gateway' (known schemes: ['inproc', 'tcp', 'tls', 'ucx']). Not a big issue, but I'd love to have the extension back up and running instead of pasting the same code into every notebook.

@rabernat
Member Author

Here's an idea for a workaround.

You can still create Dask Gateway clusters from the labextension; it's only the injected code for connecting to them from your notebook that fails.

After creating a cluster from the labextension, try this:

from dask_gateway import Gateway
gateway = Gateway()
clusters = gateway.list_clusters()
cluster = gateway.connect(clusters[0].name)
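
To get a client from the connected cluster, something like the following should work (a sketch, assuming dask-gateway's GatewayCluster.get_client() and dashboard_link; not part of the original comment):

# hypothetical continuation of the snippet above
client = cluster.get_client()   # Client bound to the shared cluster
print(cluster.dashboard_link)   # dashboard URL for monitoring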

Not quite as simple as before, but it does allow you to easily share clusters between notebooks.

@jbusecke

That should be on the instruction page as long as the 'regular way' does not work.

@roxyboy
Member

roxyboy commented Nov 30, 2020

I'm getting the following GatewayClusterError when trying to activate my clusters on Binderhub:

---------------------------------------------------------------------------
GatewayClusterError                       Traceback (most recent call last)
<ipython-input-8-b31913ac460b> in <module>
      3 
      4 gateway = Gateway()
----> 5 cluster = gateway.new_cluster()
      6 cluster.scale(30)
      7 

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_gateway/client.py in new_cluster(self, cluster_options, shutdown_on_close, **kwargs)
    641             cluster_options=cluster_options,
    642             shutdown_on_close=shutdown_on_close,
--> 643             **kwargs,
    644         )
    645 

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_gateway/client.py in __init__(self, address, proxy_address, public_address, auth, cluster_options, shutdown_on_close, asynchronous, loop, **kwargs)
    816             shutdown_on_close=shutdown_on_close,
    817             asynchronous=asynchronous,
--> 818             loop=loop,
    819         )
    820 

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_gateway/client.py in _init_internal(self, address, proxy_address, public_address, auth, cluster_options, cluster_kwargs, shutdown_on_close, asynchronous, loop, name)
    912             self.status = "starting"
    913         if not self.asynchronous:
--> 914             self.gateway.sync(self._start_internal)
    915 
    916     @property

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_gateway/client.py in sync(self, func, *args, **kwargs)
    337             )
    338             try:
--> 339                 return future.result()
    340             except BaseException:
    341                 future.cancel()

/srv/conda/envs/notebook/lib/python3.7/concurrent/futures/_base.py in result(self, timeout)
    433                 raise CancelledError()
    434             elif self._state == FINISHED:
--> 435                 return self.__get_result()
    436             else:
    437                 raise TimeoutError()

/srv/conda/envs/notebook/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_gateway/client.py in _start_internal(self)
    926             self._start_task = asyncio.ensure_future(self._start_async())
    927         try:
--> 928             await self._start_task
    929         except BaseException:
    930             # On exception, cleanup

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_gateway/client.py in _start_async(self)
    944         # Connect to cluster
    945         try:
--> 946             report = await self.gateway._wait_for_start(self.name)
    947         except GatewayClusterError:
    948             raise

/srv/conda/envs/notebook/lib/python3.7/site-packages/dask_gateway/client.py in _wait_for_start(self, cluster_name)
    576                     raise GatewayClusterError(
    577                         "Cluster %r failed to start, see logs for "
--> 578                         "more information" % cluster_name
    579                     )
    580                 elif report.status is ClusterStatus.STOPPED:

GatewayClusterError: Cluster 'prod.b09029737e98403a805a51ae40270c4f' failed to start, see logs for more information

I've been trying to start them via:

from dask.distributed import Client
from dask_gateway import Gateway

gateway = Gateway()
cluster = gateway.new_cluster()

which works on the Pangeo JupyterHub but fails on the Binderhub. I tried @rabernat's idea but it didn't work for me.

@scottyhq
Member

@roxyboy - right now we're in an awkward state where binderhubs require dask-gateway>=0.9 but jupyterhubs require dask-gateway<0.9. Hopefully things will get normalized this week.
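
A quick way to check which side of that split a given environment is on (a sketch; it assumes dask_gateway exposes __version__, which recent releases do):

import dask_gateway
print(dask_gateway.__version__)  # binderhubs currently need >=0.9, jupyterhubs <0.9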
