Search And (LLM) Transform

This content originally appeared on DEV Community and was authored by Peli de Halleux

This article shows an evolution of the "search and replace" feature found in text editors,
where the "replace" step is performed by an LLM transformation.
The example uses GenAIScript.

It can be useful to batch apply text transformations that are not easily done with
regular expressions.

For example, when we added the ability to pass a single command string to
the exec command, we needed to convert all calls using argument arrays to this new syntax:

host.exec("cmd", ["arg0", "arg1", "arg2"])

to

host.exec(`cmd arg0 arg1 arg2`)

While it's possible to match this function call with a regular expression

host\.exec\s*\([^,]+,\s*\[[^\]]+\]\s*\)

it's not easy to formulate the replacement string... unless you can describe it in natural language:

Convert the call to a single string command shell in TypeScript
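
Before handing matches to an LLM, it can help to check what the search pattern actually catches. The sketch below runs the regular expression from above against three made-up snippets in plain JavaScript (no GenAIScript involved); the sample strings and the names `execWithArgsArray`, `samples` and `results` are assumptions for illustration only.

```javascript
// The pattern from the article, without the "g" flag so that test() is stateless.
const execWithArgsArray = /host\.exec\s*\([^,]+,\s*\[[^\]]+\]\s*\)/

// Made-up snippets: two calls using an argument array, one already converted.
const samples = {
    singleLine: 'await host.exec("git", ["diff"])',
    multiLine: 'await host.exec("git", [\n    "log",\n    "--author",\n    author,\n])',
    alreadyConverted: "await host.exec(`git diff`)",
}

// Map each sample name to whether the pattern matches it.
const results = Object.fromEntries(
    Object.entries(samples).map(([name, code]) => [name, execWithArgsArray.test(code)])
)
console.log(results)
// → { singleLine: true, multiLine: true, alreadyConverted: false }
```

The multi-line call matches because a negated character class such as `[^\]]` also matches newlines, and the already converted call is skipped because it has no comma followed by an array.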

Here are some examples of transformations where the LLM correctly handled variables.

  • concatenate the arguments of a function call into a single string
- const { stdout } = await host.exec("git", ["diff"])
+ const { stdout } = await host.exec(`git diff`)
  • concatenate the arguments and use the ${} syntax to interpolate variables
- const { stdout: commits } = await host.exec("git", [
-     "log",
-     "--author",
-     author,
-     "--until",
-     until,
-     "--format=oneline",
- ])
+ const { stdout: commits } = 
+   await host.exec(`git log --author ${author} --until ${until} --format=oneline`)

Search

The search step uses workspace.grep, which efficiently searches for a pattern
in files (this is the same search engine
that powers the Visual Studio Code search).

const { pattern, glob } = env.vars
const patternRx = new RegExp(pattern, "g")
const { files } = await workspace.grep(patternRx, glob)
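
The "g" flag on the regular expression matters for the later steps: String.prototype.matchAll only accepts a global regular expression. The plain JavaScript sketch below illustrates this outside of GenAIScript; the `source` string is a made-up sample and the variable names are assumptions for illustration.

```javascript
// The pattern arrives as a string (as it would from a script variable),
// so backslashes are doubled in this string literal.
const pattern = "host\\.exec\\s*\\([^,]+,\\s*\\[[^\\]]+\\]\\s*\\)"
const source = 'host.exec("git", ["diff"]); host.exec("npm", ["ci"])'

// With the "g" flag, matchAll iterates over every occurrence.
const globalRx = new RegExp(pattern, "g")
const matches = [...source.matchAll(globalRx)].map((m) => m[0])
console.log(matches)

// Without the "g" flag, matchAll throws a TypeError.
let errorName = null
try {
    source.matchAll(new RegExp(pattern))
} catch (e) {
    errorName = e.name
}
console.log(errorName)
```

Here `matches` holds both calls and `errorName` is "TypeError".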

Compute Transforms

The second step is to apply the regular expression to the file content
and pre-compute the LLM transformation of each match using an inline prompt.

const { transform } = env.vars
...
const patches = {} // map of match -> transformed
for (const file of files) {
    const { content } = await workspace.readText(file.filename)
    for (const match of content.matchAll(patternRx)) {
        const res = await runPrompt(
            (ctx) => {
                ctx.$`
            ## Task

            Your task is to transform the MATCHED text with the following TRANSFORM.
            Return the transformed text.
            - do NOT add enclosing quotes.

            ## Context
            `
                ctx.def("MATCHED", match[0])
                ctx.def("TRANSFORM", transform)
            },
            { label: match[0], system: [], cache: "search-and-transform" }
        )
        ...

Since the LLM sometimes decides to wrap the answer in a code fence, we unwrap it: we take the content of the first fence if there is one, and the raw text otherwise.

    ...
    // optional chaining on [0] avoids a TypeError when no fence was returned
    const transformed = res.fences?.[0]?.content ?? res.text
    patches[match[0]] = transformed
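
The unwrapping idea can also be sketched in plain JavaScript, for cases where the answer is only available as raw text. The helper `unwrapAnswer` below is hypothetical and not part of GenAIScript; it assumes the whole answer is wrapped either in a single code fence or in one pair of matching quotes, and leaves anything else untouched.

```javascript
// Hypothetical helper, not part of GenAIScript: unwrap an answer that came
// back inside a code fence or a single pair of matching quotes.
const FENCE = "`".repeat(3) // built at runtime to avoid a literal fence here

function unwrapAnswer(text) {
    const trimmed = text.trim()
    // Whole answer wrapped in a fence, with an optional language tag.
    const fenceRx = new RegExp("^" + FENCE + "[^\\n]*\\n([\\s\\S]*?)\\n?" + FENCE + "$")
    const fenced = fenceRx.exec(trimmed)
    if (fenced) return fenced[1]
    // Whole answer wrapped in one pair of quotes that does not occur inside.
    const quoted = /^(["'])((?:(?!\1)[\s\S])*)\1$/.exec(trimmed)
    if (quoted) return quoted[2]
    return trimmed
}

const plain = "host.exec(`git diff`)"
const fromFence = unwrapAnswer(FENCE + "ts\n" + plain + "\n" + FENCE)
const fromQuotes = unwrapAnswer('"' + plain + '"')
const untouched = unwrapAnswer(plain)
const notQuoted = unwrapAnswer('"a" + "b"')
console.log({ fromFence, fromQuotes, untouched, notQuoted })
```

The quote check deliberately refuses text such as `"a" + "b"`, where the first and last quote do not belong together, so that valid code is not damaged.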

Transform

Finally, with the transforms pre-computed, we apply a final regex replace to
patch the old file content with the transformed strings.

    const newContent = content.replace(
        patternRx,
        (match) => patches[match] ?? match
    )
    await workspace.writeText(file.filename, newContent)
}
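
The replace step can be tried in isolation with an in-memory string. The sketch below is plain JavaScript with made-up content and a hand-written `patches` map standing in for the LLM answers (an assumption for illustration); it shows that identical matches share one patch and that matches without a patch are left as they are.

```javascript
const patternRx = /host\.exec\s*\([^,]+,\s*\[[^\]]+\]\s*\)/g

// Made-up file content with a repeated call and a call that has no patch.
const content = [
    'const a = await host.exec("git", ["diff"])',
    'const b = await host.exec("npm", ["ci"])',
    'const c = await host.exec("git", ["diff"])',
].join("\n")

// Pretend the transform step only produced an answer for one distinct match.
const patches = {
    'host.exec("git", ["diff"])': "host.exec(`git diff`)",
}

// The callback looks up each matched text and falls back to the match itself.
const newContent = content.replace(patternRx, (match) => patches[match] ?? match)
console.log(newContent)
```

Using a function as the replacement also means the returned string is inserted literally, so sequences such as `$&` in a transformed answer are not interpreted as replacement patterns.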

Parameters

The script takes three parameters: a file glob, a pattern to search for, and an LLM transformation to apply.
We declare these parameters in the script metadata and extract them from the env.vars object.

script({ ...,
    parameters: {
        glob: {
            type: "string",
            description: "The glob pattern to filter files",
            default: "*",
        },
        pattern: {
            type: "string",
            description: "The text pattern (regular expression) to search for",
        },
        transform: {
            type: "string",
            description: "The LLM transformation to apply to the match",
        },
    },
})
const { pattern, glob, transform } = env.vars
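
Since the pattern arrives as a plain string, a defensive check before searching can catch a missing value or an invalid regular expression early. The helper `parseVars` below is hypothetical and written in plain JavaScript; `vars` stands in for env.vars, and whether GenAIScript already performs similar validation is not covered here.

```javascript
// Hypothetical helper; `vars` stands in for env.vars.
function parseVars(vars) {
    // Mirror the declared default for the glob parameter.
    const { glob = "*", pattern, transform } = vars
    if (!pattern) throw new Error("missing 'pattern' variable")
    if (!transform) throw new Error("missing 'transform' variable")
    let patternRx
    try {
        // The "g" flag is needed for matchAll and for replacing every match.
        patternRx = new RegExp(pattern, "g")
    } catch (e) {
        throw new Error(`invalid 'pattern' regular expression: ${e.message}`)
    }
    return { glob, patternRx, transform }
}

const parsed = parseVars({ pattern: "host\\.exec", transform: "Convert the call" })
console.log(parsed.glob, parsed.patternRx.flags)

// An unbalanced parenthesis is reported as an invalid pattern.
let invalidMessage = null
try {
    parseVars({ pattern: "(", transform: "Convert the call" })
} catch (e) {
    invalidMessage = e.message
}
console.log(invalidMessage)
```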

Running

To run this script, you can use the --vars option to pass the pattern and the transform.

genaiscript run st --vars 'pattern=host\.exec\s*\([^,]+,\s*\[[^\]]+\]\s*\)' 'transform=Convert the call to a single string command shell in TypeScript'

