Commit 8fe72e8

Merge pull request #2 from MKhalusova/readme-review
2 parents d690882 + 044e591 commit 8fe72e8

1 file changed: +18 -19 lines changed

README.md

Lines changed: 18 additions & 19 deletions
@@ -1,30 +1,30 @@
 # Kotlin Spark API
 
 
-Your next API to work with [Spark](https://spark.apache.org/)
+Your next API to work with [Spark](https://spark.apache.org/).
 
-We are looking to have this as a part of https://github.com/apache/spark repository. Consider this as beta-quality software.
+We are looking to have this as a part of https://github.com/apache/spark repository. Consider this beta-quality software.
 
 ## Goal
 
 This project adds a missing layer of compatibility between [Kotlin](https://kotlinlang.org/) and [Spark](https://spark.apache.org/).
 
-Despite Kotlin having first-class compatibility API, Kotlin developers might want to use familiar features like data classes and lambda expressions as simple expressions in curly braces or method references.
+Despite Kotlin having first-class compatibility API, Kotlin developers may want to use familiar features like data classes and lambda expressions as simple expressions in curly braces or method references.
 
 ## Non-goals
 
-There is no goal to replace any currently supported language or provide them with some functionality to support Kotlin language.
+There is no goal to replace any currently supported language or provide other APIs with some functionality to support Kotlin language.
 
 ## Installation
 
-Currently, there are no kotlin-spark-api artifacts in maven central, but you can obtain copy using JitPack here: [![](https://jitpack.io/v/JetBrains/kotlin-spark-api.svg)](https://jitpack.io/#JetBrains/kotlin-spark-api)
+Currently, there are no kotlin-spark-api artifacts in maven central, but you can obtain a copy using JitPack here: [![](https://jitpack.io/v/JetBrains/kotlin-spark-api.svg)](https://jitpack.io/#JetBrains/kotlin-spark-api)
 
 There is support for `Maven`, `Gradle`, `SBT`, and `leinengen` on JitPack.
 
-This project does not force you to use any concrete version of spark, but we've only tested it with spark `3.0.0-preview2`.
-We believe it should also work fine with version `2.4.5`
+This project does not force you to use any specific version of Spark, but it has only been tested it with spark `3.0.0-preview2`.
+We believe it can work with Spark `2.4.5` but we cannot guarantee that.
 
-So if you're using Maven you'll hve to add following into your `pom.xml`:
+So if you're using Maven you'll have to add the following into your `pom.xml`:
 
 ```xml
 <repositories>
@@ -55,7 +55,7 @@ First (and hopefully last) thing you need to do is to add following import to yo
 import org.jetbrains.spark.api.*
 ```
 
-Then you can create SparkSession we all remember and love
+Then you can create a SparkSession:
 
 ```kotlin
 val spark = SparkSession
@@ -65,25 +65,24 @@ val spark = SparkSession
 
 ```
 
-To create Dataset you may call `toDS` method like this
+To create a Dataset you can call `toDS` method:
 
 ```kotlin
 spark.toDS("a" to 1, "b" to 2)
 ```
 
 Indeed, this produces `Dataset<Pair<String, Int>>`. There are a couple more `toDS` methods which accept different arguments.
 
-Also, there are several interesting aliases in API, like `leftJoin`, `rightJoin` etc.
-Interesting fact about them that they're null-safe by design. For example, `leftJoin` is aware of nullability and returns `Dataset<Pair<LEFT, RIGHT?>>`.
-Note that were forcing `RIGHT` to be nullable for you as a developer to be able to handle this situation.
+Also, there are several aliases in API, like `leftJoin`, `rightJoin` etc. These are null-safe by design. For example, `leftJoin` is aware of nullability and returns `Dataset<Pair<LEFT, RIGHT?>>`.
+Note that we are forcing `RIGHT` to be nullable for you as a developer to be able to handle this situation.
 
-We know that `NullPointerException`s are hard to debug in Spark And trying hard to make them happen as rare as possible.
+We know that `NullPointerException`s are hard to debug in Spark, and we are trying hard to make them as rare as possible.
 
 ## Useful helper methods
 
 ### `withSpark`
 
-We provide you with useful function `withSpark`, which accepts everything that may be needed to run spark — properties, name, master location and so on. It also accepts a block of code to execute inside spark context.
+We provide you with useful function `withSpark`, which accepts everything that may be needed to run Spark — properties, name, master location and so on. It also accepts a block of code to execute inside Spark context.
 
 After work block ends, `spark.stop()` is called automatically.
 
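Note on the null-safe joins described in the hunk above: the following is a minimal sketch in plain Kotlin, with no Spark dependency, of why a left join is naturally typed as `Pair<LEFT, RIGHT?>`. The names `User`, `Order`, and `leftJoinByKey` are hypothetical and exist only for this illustration; they are not part of kotlin-spark-api, whose `leftJoin` works on `Dataset`s and may be implemented quite differently.

```kotlin
// Hypothetical stand-ins: plain Kotlin lists instead of Spark Datasets.
data class User(val id: Int, val name: String)
data class Order(val userId: Int, val item: String)

// Left join whose right side is nullable, mirroring Pair<LEFT, RIGHT?>.
// Every left element appears at least once; unmatched ones get null on the right.
fun <L, R, K> leftJoinByKey(
    left: List<L>,
    right: List<R>,
    leftKey: (L) -> K,
    rightKey: (R) -> K
): List<Pair<L, R?>> {
    val rightByKey: Map<K, List<R>> = right.groupBy(rightKey)
    return left.flatMap { l ->
        val matches: List<R> = rightByKey[leftKey(l)].orEmpty()
        if (matches.isEmpty()) {
            listOf<Pair<L, R?>>(Pair(l, null))
        } else {
            matches.map { r -> Pair<L, R?>(l, r) }
        }
    }
}

fun main() {
    val users = listOf(User(1, "Ann"), User(2, "Bob"))
    val orders = listOf(Order(1, "book"))
    val joined = leftJoinByKey(users, orders, { it.id }, { it.userId })
    for ((user, order) in joined) {
        // `order` has type Order?, so the compiler requires a safe call or null check here.
        println("${user.name}: ${order?.item ?: "no orders"}")
    }
}
```

Because the right side is typed as nullable, accessing `order.item` without `?.` would be rejected at compile time rather than surfacing later as a `NullPointerException`.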
@@ -99,9 +98,9 @@ withSpark {
 
 ### `withCached`
 
-It may easily happen that we need to fork our computation to several paths. To compute things only once we should call `cache`
+It can easily happen that we need to fork our computation to several paths. To compute things only once we should call `cache`
 method. But there it is hard to control when we're using cached `Dataset` and when not.
-It is also easy to forget to unpersist cached data, which may make break things unexpectably or take more memory
+It is also easy to forget to unpersist cached data, which can break things unexpectably or take more memory
 than intended.
 
 To solve these problems we introduce `withCached` function
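
Note on the `withCached` motivation in the hunk above: the sketch below shows the general "cache, run a block, always unpersist" pattern in plain Kotlin. `FakeDataset` and `withCachedSketch` are hypothetical names made up for this illustration and do not come from kotlin-spark-api; the library's real `withCached` signature and behaviour may differ.

```kotlin
// Hypothetical stand-in for a cacheable dataset; this is not a Spark class.
class FakeDataset(val rows: List<Int>) {
    var cached = false
        private set

    fun cache(): FakeDataset { cached = true; return this }
    fun unpersist() { cached = false }
}

// Runs `block` on the cached dataset and unpersists afterwards,
// even if the block throws, so the caller cannot forget to clean up.
fun <R> FakeDataset.withCachedSketch(block: FakeDataset.() -> R): R {
    val ds = cache()
    try {
        return ds.block()
    } finally {
        ds.unpersist()
    }
}

fun main() {
    val ds = FakeDataset(listOf(1, 2, 3, 4))
    val (evens, total) = ds.withCachedSketch {
        // Two "forks" of the computation reuse the same cached data.
        Pair(rows.filter { it % 2 == 0 }, rows.sum())
    }
    println("$evens $total ${ds.cached}") // prints "[2, 4] 10 false"
}
```

Scoping the cached data to a block makes it explicit which computations run against the cache and ties the cleanup to the end of that block.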
@@ -124,10 +123,10 @@ Here we're showing cached `Dataset` for debugging purposes then filtering it. Th
 
 ## Examples
 
-You can find more examples in [examples](https://github.com/JetBrains/kotlin-spark-api/tree/master/examples/src/main/kotlin/org/jetbrains/spark/api/examples) module.
+For more, check out [examples](https://github.com/JetBrains/kotlin-spark-api/tree/master/examples/src/main/kotlin/org/jetbrains/spark/api/examples) module.
 
 ## Issues and feedback
 
 Issues and any feedback are very welcome in `Issues` here.
 
-If you find that we missed some important features — please report it, and we'll consider adding them.
+If you find that we missed some important features — let us know!