
Release version 0.3.0
This version refactors the plugin, introducing two new extension functions:
- gptPromptForText: submit a GPT prompt and return the response as text
- gptPromptForData: submit a GPT prompt and return the response as structured
  data matching the specified schema

Signed-off-by: Paolo Di Tommaso <[email protected]>
pditommaso committed Apr 5, 2024
1 parent 6b3584b commit ddd85e7
Showing 23 changed files with 1,217 additions and 173 deletions.
89 changes: 66 additions & 23 deletions README.md
@@ -14,40 +14,82 @@ export OPENAI_API_KEY=<your api key>
2. Add the following snippet at the beginning of your script:

```nextflow
include { gptPromptForText } from 'plugin/nf-gpt'
```

3. Use the `gptPromptForText` function to perform a ChatGPT prompt and get the response:

```nextflow
include { gptPromptForText } from 'plugin/nf-gpt'
println gptPromptForText('Tell me a joke')
```

4. Run using Nextflow as usual:

```
nextflow run <my script>
```

5. See the [examples] folder for more examples.


## Reference

### Function `gptPromptForText`

The `gptPromptForText` function carries out a GPT chat prompt and returns the response message as a string. Example:


```nextflow
println gptPromptForText('Tell me a joke')
```


When the option `numOfChoices` is specified, the response is a list of strings:

```nextflow
def response = gptPromptForText('Tell me a joke', numOfChoices: 3)
for( String it : response )
println it
```

Available options:

| name | description |
|---------------|-------------|
| logitBias | Accepts an object mapping each token (specified by its token ID in the tokenizer) to an associated bias value from -100 to 100 |
| model | The AI model to be used (default: `gpt-3.5-turbo`) |
| maxTokens | The maximum number of tokens that can be generated in the chat completion |
| numOfChoices | How many chat completion choices to generate for each input message (default: 1) |
| temperature | What sampling temperature to use, between 0 and 2 (default: `0.7`) |
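
For instance, several of these options can be combined in a single call (a minimal sketch: the option names come from the table above, while the prompt and option values are only illustrative):

```nextflow
include { gptPromptForText } from 'plugin/nf-gpt'

// ask for two alternative completions with a higher sampling temperature
def jokes = gptPromptForText(
    'Tell me a joke',
    model: 'gpt-3.5-turbo',
    temperature: 0.9,
    maxTokens: 100,
    numOfChoices: 2 )

// with numOfChoices > 1 the response is a list of strings
jokes.each { println it }
```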


### Function `gptPromptForData`

The `gptPromptForData` function carries out a GPT chat prompt and returns the response as a list of
objects matching the specified schema. For example:

```nextflow
def query = '''
Extract information about a person from In 1968, amidst the fading echoes of Independence Day,
a child named John arrived under the calm evening sky. This newborn, bearing the surname Doe,
marked the start of a new journey.
'''
def response = gptPromptForData(query, schema: [firstName: 'string', lastName: 'string', birthDate: 'date (YYYY-MM-DD)'])
println "First name: ${response[0].firstName}"
println "Last name: ${response[0].lastName}"
println "Birth date: ${response[0].birthDate}"
```


The following options are available:

| name | description |
|---------------|-------------|
@@ -56,6 +98,7 @@
| schema | The expected structure of the result object, represented as a map in which each key is an attribute name and the corresponding value is the attribute type |
| temperature | What sampling temperature to use, between 0 and 2 (default: `0.7`) |
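
For instance, the schema can be combined with a sampling option (a minimal sketch based on the options listed above; the prompt is only illustrative):

```nextflow
include { gptPromptForData } from 'plugin/nf-gpt'

// a low temperature makes the structured extraction more deterministic
def records = gptPromptForData(
    'Who won most gold medals in Swimming during the London 2012 Olympic games?',
    schema: [athlete: 'string', numberOfMedals: 'number'],
    temperature: 0.2 )

records.each { println "${it.athlete}: ${it.numberOfMedals}" }
```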


### Configuration file

The following config options can be specified in the `nextflow.config` file:
@@ -70,7 +113,7 @@
| gpt.temperature | What sampling temperature to use, between 0 and 2 (default: `0.7`) |
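
As an illustration, such options might be set in `nextflow.config` as in the following sketch (the `gpt` scope is taken from the table above; where the table is elided, the exact option names are assumptions):

```groovy
// enable the plugin
plugins {
    id 'nf-gpt'
}

// defaults applied to all GPT prompts (option names assumed from the table above)
gpt {
    model       = 'gpt-3.5-turbo'
    temperature = 0.7
}
```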


## Development

To build and test the plugin during development, configure a local Nextflow build with the following steps:

@@ -96,15 +139,15 @@
./launch.sh run nextflow-io/hello -plugins nf-gpt
```

### Testing without Nextflow build

The plugin can be tested without a local Nextflow build using the following steps:

1. Build the plugin: `make buildPlugins`
2. Copy `build/plugins/<your-plugin>` to `$HOME/.nextflow/plugins`
3. Create a pipeline that uses your plugin and run it: `nextflow run ./my-pipeline-script.nf`

### Package, upload, and publish

The project should be hosted in a GitHub repository whose name matches the name of the plugin, that is the name of the directory in the `plugins` folder (e.g. `nf-gpt`).

14 changes: 6 additions & 8 deletions examples/example1.nf
@@ -1,12 +1,10 @@
include { gptPromptForText } from 'plugin/nf-gpt'

/*
 * This example shows how to use the `gptPromptForText` function with the `map` operator
 */

channel
    .of('Tell me a joke')
    .map { gptPromptForText(it) }
    .view()
23 changes: 16 additions & 7 deletions examples/example2.nf
@@ -1,9 +1,18 @@
include { gptPromptForText } from 'plugin/nf-gpt'

/*
 * This example shows how to use the `gptPromptForText` function in a process
 */

process prompt {
    input:
    val query

    output:
    val response

    exec:
    response = gptPromptForText(query)
}

workflow {
    prompt('Tell me a joke') | view
}
21 changes: 14 additions & 7 deletions examples/example3.nf
@@ -1,9 +1,16 @@
include { gptPromptForData } from 'plugin/nf-gpt'

/**
 * This example shows how to perform a GPT prompt and map the response to a structured object
 */

def text = '''
Extract information about a person from In 1968, amidst the fading echoes of Independence Day,
a child named John arrived under the calm evening sky. This newborn, bearing the surname Doe,
marked the start of a new journey.
'''

channel
    .of(text)
    .flatMap { gptPromptForData(it, schema: [firstName: 'string', lastName: 'string', birthDate: 'date (YYYY-MM-DD)']) }
    .view()
16 changes: 16 additions & 0 deletions examples/example4.nf
@@ -0,0 +1,16 @@
include { gptPromptForData } from 'plugin/nf-gpt'

/**
 * This example shows how to perform a GPT prompt and map the response to a structured object, using a custom temperature
 */


def query = '''
Who won most gold medals in swimming and Athletics categories during Barcelona 1992 and London 2012 olympic games?
'''

def RECORD = [athlete: 'string', numberOfMedals: 'number', location:'string', sport:'string']

channel
    .of(query)
    .flatMap { gptPromptForData(it, schema: RECORD, temperature: 2d) }
    .view()
16 changes: 16 additions & 0 deletions examples/example5.nf
@@ -0,0 +1,16 @@
include { gptPromptForData } from 'plugin/nf-gpt'

/**
 * This example shows how to perform multiple GPT prompts using the combine and flatMap operators
 */


channel
    .fromList(['Barcelona, 1992', 'London, 2012'])
    .combine(['Swimming', 'Athletics'])
    .flatMap { edition, sport ->
        gptPromptForData(
            "Who won most gold medals in $sport category during $edition olympic games?",
            schema: [athlete: 'string', numberOfMedals: 'number', location: 'string', sport: 'string'])
    }
    .view()
2 changes: 1 addition & 1 deletion plugins/nf-gpt/build.gradle
@@ -59,7 +59,7 @@ dependencies {
compileOnly 'org.slf4j:slf4j-api:1.7.10'
compileOnly 'org.pf4j:pf4j:3.4.1'
// add here plugin dependencies
api 'dev.langchain4j:langchain4j-open-ai:0.28.0'

// test configuration
testImplementation "org.apache.groovy:groovy:4.0.20"
@@ -0,0 +1,127 @@
/*
* Copyright 2013-2024, Seqera Labs
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/

package nextflow.gpt.client

import groovy.transform.Canonical
import groovy.transform.CompileStatic
import groovy.transform.ToString

/**
 * Models a GPT chat completion request object.
*
* See also
* https://platform.openai.com/docs/api-reference/chat/create
*
* @author Paolo Di Tommaso <[email protected]>
*/
@CompileStatic
@ToString(includePackage = false, includeNames = true)
class GptChatCompletionRequest {

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Message {
        String role
        String content
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class ToolMessage extends Message {
        String name
        String tool_call_id
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Tool {
        String type
        Function function
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Function {
        String name
        String description
        Parameters parameters
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Parameters {
        String type
        Map<String,Param> properties
        List<String> required
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Param {
        String type
        String description
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    @Canonical
    static class ResponseFormat {
        static final ResponseFormat TEXT = new ResponseFormat('text')
        static final ResponseFormat JSON = new ResponseFormat('json_object')
        final String type
    }

    /**
     * ID of the model to use.
     */
    String model

    /**
     * A list of messages comprising the conversation so far
     */
    List<?> messages

    /**
     * A list of tools the model may call
     */
    List<Tool> tools

    String tool_choice

    /**
     * The maximum number of tokens that can be generated in the chat completion
     */
    Integer max_tokens

    /**
     * How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices
     */
    Integer n

    /**
     * What sampling temperature to use, between 0 and 2
     */
    Float temperature

    /**
     * Modify the likelihood of specified tokens appearing in the completion
     */
    Map logit_bias

    /**
     * Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.
     */
    ResponseFormat response_format
}
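
As a rough illustration (not part of this commit), a request of this shape might be populated as in the following sketch; the field names mirror the class above, while the prompt and model values are only examples:

```groovy
import nextflow.gpt.client.GptChatCompletionRequest
import nextflow.gpt.client.GptChatCompletionRequest.Message
import nextflow.gpt.client.GptChatCompletionRequest.ResponseFormat

// build a chat completion request asking for a JSON-formatted reply
def request = new GptChatCompletionRequest(
    model: 'gpt-3.5-turbo',
    messages: [ new Message(role: 'user', content: 'Tell me a joke') ],
    temperature: 0.7f,
    max_tokens: 100,
    response_format: ResponseFormat.JSON )

println request
```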