
Unable to initialize spark in Jupyter Notebook #205

Open
mlcohen opened this issue Aug 10, 2023 · 8 comments
Labels: help wanted

Comments


mlcohen commented Aug 10, 2023

Hi -- I've been attempting to get kotlin-spark to work in Jupyter Notebook (v7.0.2). Unfortunately, every time I run the magic line `%use spark` in my notebook (using the Kotlin kernel, kotlin-jupyter-kernel), I get the following error:

received properties: Properties: {spark=3.3.1, scala=2.13, v=1.2.3, displayLimit=20, displayTruncate=30, spark.app.name=Jupyter, spark.master=local[*], spark.sql.codegen.wholeStage=false, fs.hdfs.impl=org.apache.hadoop.hdfs.DistributedFileSystem, fs.file.impl=org.apache.hadoop.fs.LocalFileSystem}, providing Spark with: {spark.app.name=Jupyter, spark.master=local[*], spark.sql.codegen.wholeStage=false, fs.hdfs.impl=org.apache.hadoop.hdfs.DistributedFileSystem, fs.file.impl=org.apache.hadoop.fs.LocalFileSystem}
23/08/10 10:46:58 INFO SparkContext: Running Spark version 3.3.1
23/08/10 10:46:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/08/10 10:46:58 INFO ResourceUtils: ==============================================================
23/08/10 10:46:58 INFO ResourceUtils: No custom resources configured for spark.driver.
23/08/10 10:46:58 INFO ResourceUtils: ==============================================================
23/08/10 10:46:58 INFO SparkContext: Submitted application: Jupyter
... [clipped log output for brevity]
The problem is found in one of the loaded libraries: check library init codes
org.jetbrains.kotlinx.jupyter.exceptions.ReplEvalRuntimeException: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x5d9e0fc3) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x5d9e0fc3
org.jetbrains.kotlinx.jupyter.exceptions.ReplLibraryException: The problem is found in one of the loaded libraries: check library init codes
	at org.jetbrains.kotlinx.jupyter.exceptions.ReplLibraryExceptionKt.rethrowAsLibraryException(ReplLibraryException.kt:32)
	at org.jetbrains.kotlinx.jupyter.repl.impl.CellExecutorImpl$ExecutionContext.doAddLibraries(CellExecutorImpl.kt:151)
... [clipped log output for brevity]
Caused by: org.jetbrains.kotlinx.jupyter.exceptions.ReplEvalRuntimeException: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x5d9e0fc3) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x5d9e0fc3
	at org.jetbrains.kotlinx.jupyter.repl.impl.InternalEvaluatorImpl.eval(InternalEvaluatorImpl.kt:110)
... [clipped log output for brevity]
Caused by: java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x5d9e0fc3) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x5d9e0fc3
	at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala:213)
	at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:114)
	at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:353)
	at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:290)
	at org.apache.spark.SparkEnv$.create(SparkEnv.scala:339)
	at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:194)
	at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:279)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:464)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
	at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
	at scala.Option.getOrElse(Option.scala:201)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
	at Line_5_jupyter.<init>(Line_5.jupyter.kts:11)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
	at kotlin.script.experimental.jvm.BasicJvmScriptEvaluator.evalWithConfigAndOtherScriptsResults(BasicJvmScriptEvaluator.kt:105)
	at kotlin.script.experimental.jvm.BasicJvmScriptEvaluator.invoke$suspendImpl(BasicJvmScriptEvaluator.kt:47)
	at kotlin.script.experimental.jvm.BasicJvmScriptEvaluator.invoke(BasicJvmScriptEvaluator.kt)
	at kotlin.script.experimental.jvm.BasicJvmReplEvaluator.eval(BasicJvmReplEvaluator.kt:49)
	at org.jetbrains.kotlinx.jupyter.repl.impl.InternalEvaluatorImpl$eval$resultWithDiagnostics$1.invokeSuspend(InternalEvaluatorImpl.kt:103)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:284)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at org.jetbrains.kotlinx.jupyter.repl.impl.InternalEvaluatorImpl.eval(InternalEvaluatorImpl.kt:103)
	... 50 more

I'm running spark locally; no remote cluster setup. Any ideas what I might be doing wrong?

Other details:

  • OS: macOS Ventura 13.4.1
  • Java: openjdk version "17.0.7" 2023-04-18 LTS
  • conda: miniconda 23.7.2
  • python: 3.11.4
  • kotlin-jupyter-kernel: 0.11.0.385
@Jolanrensen
Collaborator

This seems specific to your system; I cannot reproduce it. Are you able to run a normal Spark project, without notebooks?


mdsadiqueinam commented Jun 2, 2024

I am also having the same issue.
@Jolanrensen, the above issue occurs with Gradle 8.4 and above.
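For Gradle-based projects hitting this `IllegalAccessError` on Java 17, one common workaround (a sketch; the exact flag set needed can vary with the Spark version) is to pass the JDK module flags to every forked JVM, for example via `JAVA_TOOL_OPTIONS`:

```shell
# JAVA_TOOL_OPTIONS is picked up by every JVM launched in this environment,
# including the test/run JVMs that Gradle forks, so Spark can reach the
# JDK-internal sun.nio.ch package it needs on Java 17.
export JAVA_TOOL_OPTIONS="--add-exports=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED"
./gradlew test
```

Alternatively, the same flags can be set per-task in the build script via `jvmArgs`, which avoids affecting unrelated JVMs.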

@mdsadiqueinam

@mlcohen, did you find a solution?

@Jolanrensen
Collaborator

Could you try a lower Java version? Spark can be difficult with Java 17+, as mentioned here: https://stackoverflow.com/questions/72724816/running-unit-tests-with-spark-3-3-0-on-java-17-fails-with-illegalaccesserror-cl
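For context, the `IllegalAccessError` in the log above is the Java 9+ module system at work: Spark's `StorageUtils` accesses `sun.nio.ch.DirectBuffer`, which `java.base` no longer exports to unnamed modules. If staying on Java 17, one sketch of a workaround is to open that package to the kernel's JVM before starting Jupyter. The `KOTLIN_JUPYTER_JAVA_OPTS` variable is assumed here from the kotlin-jupyter documentation; check that your kernel version honors it, and note that newer Spark versions may need additional flags:

```shell
# Sketch: open the JDK-internal package Spark touches to the Kotlin
# kernel's JVM. KOTLIN_JUPYTER_JAVA_OPTS is assumed to be read by
# kotlin-jupyter-kernel; verify against your kernel version's docs.
export KOTLIN_JUPYTER_JAVA_OPTS="--add-exports=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED"
jupyter notebook
```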

@mdsadiqueinam

I tried with Java 8 as well as 11, but I get the same issue.


mdsadiqueinam commented Jun 4, 2024

@Jolanrensen thanks for the solution, the issue has been fixed. I need a little help though: could you suggest a way to use Spark in a Ktor server? Thank you.

@Jolanrensen
Collaborator

Unfortunately I have no experience with Ktor+Spark, so I won't be able to help. Maybe someone else could though :)

Jolanrensen added the help wanted label on Jun 4, 2024
@mdsadiqueinam

> Unfortunately I have no experience with Ktor+Spark, so I won't be able to help. Maybe someone else could though :)

No problem, I think I've figured out how to use it.
