
Release version 0.3.0
This version refactors the plugin, introducing two new extension functions:
- gptPromptForText: submit a GPT prompt and return the response as text
- gptPromptForData: submit a GPT prompt and return the response as structured
  data matching the specified schema

Signed-off-by: Paolo Di Tommaso <[email protected]>
pditommaso committed Apr 5, 2024
1 parent 6b3584b commit ddd85e7
Showing 23 changed files with 1,217 additions and 173 deletions.
89 changes: 66 additions & 23 deletions README.md
@@ -14,40 +14,82 @@ export OPENAI_API_KEY=<your api key>
2. Add the following snippet at the beginning of your script:

```nextflow
include { gptPromptForText } from 'plugin/nf-gpt'
```

3. Use the `gptPromptForText` function to perform a ChatGPT prompt and get the response:

```nextflow
include { gptPromptForText } from 'plugin/nf-gpt'
println gptPromptForText('Tell me a joke')
```

4. Run using Nextflow as usual:

```
nextflow run <my script>
```

5. See the [examples] folder for more examples.


## Reference

### Function `gptPromptForText`

The `gptPromptForText` function carries out a GPT chat prompt and returns the response message as a string. Example:


```nextflow
println gptPromptForText('Tell me a joke')
```


When the option `numOfChoices` is specified, the response is a list of strings:

```nextflow
def response = gptPromptForText('Tell me a joke', numOfChoices: 3)
for( String it : response )
println it
```

Available options:

| name | description |
|---------------|-------------|
| logitBias | Accepts an object mapping each token (specified by its token ID in the tokenizer) to an associated bias value from -100 to 100 |
| model | The AI model to be used (default: `gpt-3.5-turbo`) |
| maxTokens | The maximum number of tokens that can be generated in the chat completion |
| numOfChoices | How many chat completion choices to generate for each input message (default: 1) |
| temperature | What sampling temperature to use, between 0 and 2 (default: `0.7`) |
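
For instance, several of these options can be combined in a single call (a minimal sketch: the option names come from the table above, while the prompt and option values are only illustrative):

```nextflow
include { gptPromptForText } from 'plugin/nf-gpt'

// ask for two alternative completions with a higher sampling temperature
def jokes = gptPromptForText(
    'Tell me a joke',
    model: 'gpt-3.5-turbo',
    temperature: 0.9,
    maxTokens: 100,
    numOfChoices: 2 )

// with numOfChoices > 1 the response is a list of strings
jokes.each { println it }
```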


### Function `gptPromptForData`

The `gptPromptForData` function carries out a GPT chat prompt and returns the response as a list of
objects matching the specified schema. For example:

```nextflow
def query = '''
Extract information about a person from In 1968, amidst the fading echoes of Independence Day,
a child named John arrived under the calm evening sky. This newborn, bearing the surname Doe,
marked the start of a new journey.
'''
def response = gptPromptForData(query, schema: [firstName: 'string', lastName: 'string', birthDate: 'date (YYYY-MM-DD)'])
println "First name: ${response[0].firstName}"
println "Last name: ${response[0].lastName}"
println "Birth date: ${response[0].birthDate}"
```


The following options are available:

| name | description |
|---------------|-------------|
@@ -56,6 +98,7 @@
| schema | The expected structure of the result object, represented as a map in which each key is an attribute name and the corresponding value is the attribute type |
| temperature | What sampling temperature to use, between 0 and 2 (default: `0.7`) |
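
For instance, the schema can be combined with a sampling option (a minimal sketch based on the options listed above; the prompt is only illustrative):

```nextflow
include { gptPromptForData } from 'plugin/nf-gpt'

// a low temperature makes the structured extraction more deterministic
def records = gptPromptForData(
    'Who won most gold medals in Swimming during the London 2012 Olympic games?',
    schema: [athlete: 'string', numberOfMedals: 'number'],
    temperature: 0.2 )

records.each { println "${it.athlete}: ${it.numberOfMedals}" }
```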


### Configuration file

The following config options can be specified in the `nextflow.config` file:
@@ -70,7 +113,7 @@
| gpt.temperature | What sampling temperature to use, between 0 and 2 (default: `0.7`) |
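
As an illustration, such options might be set in `nextflow.config` as in the following sketch (the `gpt` scope is taken from the table above; where the table is elided, the exact option names are assumptions):

```groovy
// enable the plugin
plugins {
    id 'nf-gpt'
}

// defaults applied to all GPT prompts (option names assumed from the table above)
gpt {
    model       = 'gpt-3.5-turbo'
    temperature = 0.7
}
```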


## Development

To build and test the plugin during development, configure a local Nextflow build with the following steps:

@@ -96,15 +139,15 @@
./launch.sh run nextflow-io/hello -plugins nf-gpt
```

### Testing without Nextflow build

The plugin can be tested without a local Nextflow build using the following steps:

1. Build the plugin: `make buildPlugins`
2. Copy `build/plugins/<your-plugin>` to `$HOME/.nextflow/plugins`
3. Create a pipeline that uses your plugin and run it: `nextflow run ./my-pipeline-script.nf`

### Package, upload, and publish

The project should be hosted in a GitHub repository whose name matches the name of the plugin, that is the name of the directory in the `plugins` folder (e.g. `nf-gpt`).

14 changes: 6 additions & 8 deletions examples/example1.nf
@@ -1,12 +1,10 @@
include { gptPromptForText } from 'plugin/nf-gpt'

/*
 * This example shows how to use the `gptPromptForText` function with the `map` operator
 */

channel
    .of('Tell me a joke')
    .map { gptPromptForText(it) }
    .view()
23 changes: 16 additions & 7 deletions examples/example2.nf
@@ -1,9 +1,18 @@
include { gptPromptForText } from 'plugin/nf-gpt'

/*
 * This example shows how to use the `gptPromptForText` function in a process
 */

process prompt {
    input:
    val query

    output:
    val response

    exec:
    response = gptPromptForText(query)
}

workflow {
    prompt('Tell me a joke') | view
}
21 changes: 14 additions & 7 deletions examples/example3.nf
@@ -1,9 +1,16 @@
include { gptPromptForData } from 'plugin/nf-gpt'

/**
 * This example shows how to perform a GPT prompt and map the response to a structured object
 */

def text = '''
Extract information about a person from In 1968, amidst the fading echoes of Independence Day,
a child named John arrived under the calm evening sky. This newborn, bearing the surname Doe,
marked the start of a new journey.
'''

channel
    .of(text)
    .flatMap { gptPromptForData(it, schema: [firstName: 'string', lastName: 'string', birthDate: 'date (YYYY-MM-DD)']) }
    .view()
16 changes: 16 additions & 0 deletions examples/example4.nf
@@ -0,0 +1,16 @@
include { gptPromptForData } from 'plugin/nf-gpt'

/**
 * This example shows how to perform a GPT prompt and map the response to a structured object, using a custom temperature
 */


def query = '''
Who won most gold medals in swimming and Athletics categories during Barcelona 1992 and London 2012 olympic games?
'''

def RECORD = [athlete: 'string', numberOfMedals: 'number', location:'string', sport:'string']

channel
    .of(query)
    .flatMap { gptPromptForData(it, schema: RECORD, temperature: 2d) }
    .view()
16 changes: 16 additions & 0 deletions examples/example5.nf
@@ -0,0 +1,16 @@
include { gptPromptForData } from 'plugin/nf-gpt'

/**
 * This example shows how to perform multiple GPT prompts using the combine and flatMap operators
 */


channel
    .fromList(['Barcelona, 1992', 'London, 2012'])
    .combine(['Swimming', 'Athletics'])
    .flatMap { edition, sport ->
        gptPromptForData(
            "Who won most gold medals in $sport category during $edition olympic games?",
            schema: [athlete: 'string', numberOfMedals: 'number', location: 'string', sport: 'string'])
    }
    .view()
2 changes: 1 addition & 1 deletion plugins/nf-gpt/build.gradle
@@ -59,7 +59,7 @@ dependencies {
compileOnly 'org.slf4j:slf4j-api:1.7.10'
compileOnly 'org.pf4j:pf4j:3.4.1'
// add here plugin dependencies
api 'dev.langchain4j:langchain4j-open-ai:0.28.0'

// test configuration
testImplementation "org.apache.groovy:groovy:4.0.20"
@@ -0,0 +1,127 @@
/*
* Copyright 2013-2024, Seqera Labs
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/

package nextflow.gpt.client

import groovy.transform.Canonical
import groovy.transform.CompileStatic
import groovy.transform.ToString

/**
 * Models a GPT chat completion request object.
*
* See also
* https://platform.openai.com/docs/api-reference/chat/create
*
* @author Paolo Di Tommaso <[email protected]>
*/
@CompileStatic
@ToString(includePackage = false, includeNames = true)
class GptChatCompletionRequest {

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Message {
        String role
        String content
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class ToolMessage extends Message {
        String name
        String tool_call_id
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Tool {
        String type
        Function function
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Function {
        String name
        String description
        Parameters parameters
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Parameters {
        String type
        Map<String,Param> properties
        List<String> required
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    static class Param {
        String type
        String description
    }

    @ToString(includePackage = false, includeNames = true)
    @CompileStatic
    @Canonical
    static class ResponseFormat {
        static final ResponseFormat TEXT = new ResponseFormat('text')
        static final ResponseFormat JSON = new ResponseFormat('json_object')
        final String type
    }

    /**
     * ID of the model to use.
     */
    String model

    /**
     * A list of messages comprising the conversation so far
     */
    List<?> messages

    /**
     * A list of tools the model may call
     */
    List<Tool> tools

    String tool_choice

    /**
     * The maximum number of tokens that can be generated in the chat completion
     */
    Integer max_tokens

    /**
     * How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices
     */
    Integer n

    /**
     * What sampling temperature to use, between 0 and 2
     */
    Float temperature

    /**
     * Modify the likelihood of specified tokens appearing in the completion
     */
    Map logit_bias

    /**
     * Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.
     */
    ResponseFormat response_format
}
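
As a rough illustration (not part of this commit), a request of this shape might be populated as in the following sketch; the field names mirror the class above, while the prompt and model values are only examples:

```groovy
import nextflow.gpt.client.GptChatCompletionRequest
import nextflow.gpt.client.GptChatCompletionRequest.Message
import nextflow.gpt.client.GptChatCompletionRequest.ResponseFormat

// build a chat completion request asking for a JSON-formatted reply
def request = new GptChatCompletionRequest(
    model: 'gpt-3.5-turbo',
    messages: [ new Message(role: 'user', content: 'Tell me a joke') ],
    temperature: 0.7f,
    max_tokens: 100,
    response_format: ResponseFormat.JSON )

println request
```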