This content originally appeared on DEV Community and was authored by Will Dady
If you're like me you've probably visited a few recipe websites in your time and had the unfortunate experience of having to scroll through the author's life story before getting to the actual recipe. I recently stumbled across an interesting website which is able to take a URL and extract just the recipe without all the surrounding fluff. This got me thinking... could I build something similar on AWS with serverless technologies and generative AI?
Lately I have been experimenting with various large language models on Amazon Bedrock. I have been particularly interested in the latest offering from Anthropic, its Claude 3 family of models. My idea was to see if I could use Claude to extract a recipe from a web page and return it as a structured JSON object.
Crafting the prompt
I hypothesised that Anthropic's Haiku model would make light work of finding a recipe in a body of text, assuming it is given a well-crafted prompt. Luckily, Anthropic's prompt engineering documentation provides some excellent guidance on how best to craft a prompt to work with their family of LLMs.
I began with some low-fi manual testing directly in the Amazon Bedrock Chat Playground. I figured my eventual solution would scrape HTML from a target website and inject it into the prompt, but in the meantime I'd need to do this manually. This is as simple as navigating to a recipe in my browser, viewing the page source and copying the raw HTML. I then paste the raw HTML into my prompt with some surrounding instructional text.
[Image: The Amazon Bedrock Chat Playground showing a well-formed JSON response from Claude]
After some experimentation in the Bedrock console I was able to get Claude to consistently parse a recipe from a body of text and return it as structured JSON. Pretty cool!
I ended up settling on the following prompt template:
Read the HTML fragment contained inside the following <document></document> XML tags and extract the recipe.
<document>
<%- document %>
</document>
If you are able to successfully find a recipe in the document output your result as a JSON object with the following fields:
title: A string containing the recipe title
description: A string containing a summary of what the recipe makes
ingredients: An array of strings where each is a single ingredient
steps: An array of strings where each describes a step required to complete the recipe
<example>
{
  "title": "Pasta Aglio e Olio",
  "description": "Pasta Aglio e Olio is a classic Italian dish known for its simplicity and flavor. Translating to 'pasta with garlic and oil' it's a traditional recipe originating from the region of Naples. The dish features spaghetti cooked al dente and then tossed with minced garlic, olive oil, red pepper flakes, and sometimes parsley. Despite its minimal ingredients, Pasta Aglio e Olio packs a punch of flavor, with the garlic-infused oil coating each strand of pasta for a deliciously satisfying meal. It's a go-to dish for a quick, easy, and tasty dinner option.",
  "ingredients": [
    "Half a Lemon",
    "Fresh Grated Parmesan",
    "2 - 3 Tbsp Butter",
    "1 Cup Chopped Italian Parsley",
    "1 Pinch Chili Flakes or Hot Pepper Flakes",
    "Fresh Ground Pepper",
    "2 Tbsp Salt (for pasta water)",
    "3 Cloves Thin Sliced Fresh Garlic",
    "1 Cup Pasta Water",
    "1/4 Cup Olive Oil",
    "450g Spaghetti"
  ],
  "steps": [
    "Slice garlic cloves thin (do not chop) and remove stems from parsley. Chop parsley fine and set all this aside. Have all your ingredients handy and ready since you will need to get to them quickly.",
    "Bring water with 2 TBSP salt to a boil and add pasta. Cook for 10-12 minutes till al dente. Save 2 cups of the hot pasta water (for possibly creating more liquid) before draining.",
    "About 5 minutes before your pasta is ready, heat the olive oil in large pan on medium and add the sliced garlic. Toss gently and after about 30 seconds add the hot pepper flakes and salt and pepper. Cook and toss for about 30 seconds.",
    "Add parsley, cooked pasta, pasta water to your liking, butter and squeeze in the half lemon (watch for seeds). Stir gently and bring to gentle boil. Add more fresh pepper and salt to taste. Toss the mixture well to fully coat and turn off heat.",
    "Let mixture set about 30 seconds so liquid thickens. Serve hot and top with fresh grated parmesan."
  ]
}
</example>
If you are not able to find a recipe in the document respond with a JSON object with the following fields:
error: A string stating "No recipe found in document"
cause: A string describing the error
<example>
{
  "error": "No recipe found in document",
  "cause": "The document appears to be about Harley Davidson motorcycles"
}
</example>
<example>
{
  "error": "No recipe found in document",
  "cause": "The document appears to be about cats"
}
</example>
Take note of the following:
- I am using XML tags to delineate parts of the prompt, as recommended in Anthropic's documentation (Use XML tags).
- The <document> tags are where you put the HTML you want Claude to read. The <%- document %> contained within is an ejs placeholder. More on this shortly.
- I am using <example> tags to show how I want the JSON to be structured. I'm also telling Claude what to do if it is unable to find a recipe in the document.
- I am explicitly asking for JSON output.
Building a solution with the AWS CDK
Now that I have a prompt which I am satisfied works, I set out to automate it.
I began by creating a new AWS CDK project using TypeScript.
The bulk of the solution will be driven by one of my favourite serverless AWS services, Step Functions! Using the CDK I create a Step Functions state machine with its type set to Express. We need to use the Express type as we will eventually expose the state machine via an API.
The state machine workflow performs these main tasks:
- Scrape (and sanitise) the HTML from a web page
- Generate the prompt by injecting the HTML into our prompt template
- Invoke Amazon Bedrock with our finalised prompt
- Handle the case where Claude was not able to find a recipe in the document
Scraping the web page
The first step of our state machine scrapes the contents of a web page containing our recipe. To do this I opted to create a simple Node.js based Lambda function using TypeScript which takes a url parameter and uses fetch to... ahem... fetch the target web page.
At this point I could simply return the raw HTML to advance the state machine but that would be extremely wasteful as there is a lot of redundant markup which serves little value, not to mention the longer the prompt the more it costs. We can do better!
I opted to use cheerio, a nifty HTML parsing library, to clean up the HTML before returning it from the function.
Using cheerio I:
- Extract the content of the <body> tag
- Delete elements which have little semantic meaning to Claude such as img, video, svg etc.
- Delete all attributes from elements
- Delete all <!-- --> comments
I also do some final manipulation of the HTML string to remove excessive whitespace before returning the result.
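The whitespace clean-up mentioned above could look something like the following. This is only a minimal sketch, not the code from the project: collapseWhitespace is a hypothetical helper name, and it assumes that squashing every run of whitespace down to a single space is acceptable for the prompt (it would also alter text inside <pre> elements).

```typescript
// Hypothetical helper (not from the original project): shrink an HTML string
// by collapsing whitespace before it is injected into the prompt.
function collapseWhitespace(html: string): string {
  return html
    .replace(/\s+/g, ' ') // any run of spaces, tabs or newlines becomes one space
    .trim(); // drop leading and trailing whitespace
}
```

A single space is kept between neighbouring elements so that words in adjacent inline tags do not run together.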
Astute readers may note this is a far from ideal way of scraping a web page as it fails to consider dynamic content loaded after page load. My assumption is most recipe websites will deliver the complete recipe in the initial HTML sent from the server to aid SEO. That said, a headless Chromium + Puppeteer setup would likely be a better choice in a production environment.
Constructing the prompt
For the second step of our state machine I wrote this short Lambda function. It takes the sanitised HTML output of the previous step and combines it with our prompt template. To do this I use ejs to replace the <%- document %> placeholder in the prompt template with our sanitised HTML.
import * as ejs from 'ejs';

const PROMPT_TEMPLATE = `...`; // Omitted for brevity

interface Payload {
  input: string;
}

export const handler = async ({ input }: Payload) => {
  const output = ejs.render(PROMPT_TEMPLATE, { document: input });
  return { output };
};
Invoking Amazon Bedrock
Now that we have our finalised prompt returned from the previous step, we can invoke the model with it. What's nice is AWS Step Functions has a direct integration with Bedrock, which means we can invoke it directly without first having to write another Lambda function. Here is what the BedrockInvokeModel task looks like as defined in the CDK app.
const model = bedrock.FoundationModel.fromFoundationModelId(
  this,
  'Model',
  new bedrock.FoundationModelIdentifier(
    'anthropic.claude-3-haiku-20240307-v1:0',
  ),
);

const aiTask = new tasks.BedrockInvokeModel(this, 'Invoke Model', {
  model,
  body: sfn.TaskInput.fromObject({
    anthropic_version: 'bedrock-2023-05-31',
    max_tokens: 1024,
    messages: [
      {
        role: 'user',
        content: sfn.JsonPath.stringAt('$.Payload.output'),
      },
      {
        role: 'assistant',
        content: '{',
      },
    ],
  }),
  resultSelector: {
    output: sfn.JsonPath.stringToJson(
      sfn.JsonPath.format(
        '{}{}',
        '{',
        sfn.JsonPath.stringAt('$.Body.content[0].text'),
      ),
    ),
  },
});
This takes the prompt output by the previous step ($.Payload.output) and invokes the Anthropic Claude 3 Haiku model with it as its input.
You might be wondering what that second 'assistant' message is. That is a message pre-fill which gives Claude a starting point on how to respond to the 'user' input. As instructed in our prompt template we always want Claude to respond with JSON. This combined with the prompt entered in the 'user' message helps guide Claude on how to respond. You can read more about how this works on Anthropic's website - Control output format (JSON mode).
An interesting quirk of pre-filling the opening JSON brace is that it is not included in the message response we receive from Bedrock. You'll notice in the resultSelector I am using sfn.JsonPath.format, which is one of Step Functions' intrinsic functions, to prepend the missing opening brace. The resulting string is then converted to JSON with the sfn.JsonPath.stringToJson function.
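To make the pre-fill quirk concrete, here is the same idea in plain TypeScript, outside of Step Functions. It is a minimal sketch: ClaudeResponseBody only models the content[0].text field read by the state machine, and parsePrefilledJson and the sample text are hypothetical names and data used for illustration.

```typescript
// Only the fields this sketch reads from the model's response body.
interface ClaudeResponseBody {
  content: { type: string; text: string }[];
}

// The text we pre-filled in the 'assistant' message.
const PREFILL = '{';

// The response continues from the pre-fill, so the opening brace has to be
// added back before the text is valid JSON.
function parsePrefilledJson(body: ClaudeResponseBody): unknown {
  const continuation = body.content[0]?.text ?? '';
  return JSON.parse(PREFILL + continuation);
}

// Made-up response text, shaped like a continuation of '{'.
const sampleBody: ClaudeResponseBody = {
  content: [{ type: 'text', text: '"title": "Caramel Slice", "steps": []}' }],
};

console.log(parsePrefilledJson(sampleBody));
```

Parsing the continuation on its own would throw, because it starts with a property name rather than an opening brace.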
Exposing via API Gateway
The last piece of the puzzle is to expose our state machine via Amazon API Gateway. Within the same CDK app I instantiate a new RestApi which proxies requests directly to my state machine.
const api = new apigateway.RestApi(this, 'StepFunctionsRestApi');
api.root.addProxy({
  defaultIntegration:
    apigateway.StepFunctionsIntegration.startExecution(stateMachine),
});
Once deployed, the CDK CLI will output the URL of the newly created REST API. Copy the URL so we can test it in our web browser.
Test it!
At this point all that's left to do is test it out on some web pages. To do this, navigate to a website containing a recipe and paste your API URL in front of the URL in your browser's address bar.
For example, my REST API is available at https://c7jdzx7r36.execute-api.ap-southeast-2.amazonaws.com/prod/. If I want to extract the recipe from https://www.recipetineats.com/caramel-slice/ I simply append the latter to the former, e.g.
https://c7jdzx7r36.execute-api.ap-southeast-2.amazonaws.com/prod/https://www.recipetineats.com/caramel-slice/
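If you would rather build that address in code than in the address bar, the concatenation is a one-liner. The sketch below is an assumption-laden illustration: buildExtractorUrl is a hypothetical helper, and it simply mirrors the manual "paste one URL in front of the other" approach without any encoding.

```typescript
// Hypothetical helper: prefix a recipe page URL with the deployed API's base URL.
function buildExtractorUrl(apiBaseUrl: string, recipeUrl: string): string {
  // Make sure exactly one slash separates the stage path from the target URL.
  const base = apiBaseUrl.endsWith('/') ? apiBaseUrl : apiBaseUrl + '/';
  return base + recipeUrl;
}

console.log(
  buildExtractorUrl(
    'https://c7jdzx7r36.execute-api.ap-southeast-2.amazonaws.com/prod',
    'https://www.recipetineats.com/caramel-slice/',
  ),
);
```

One caveat: if the recipe URL carries its own query string, everything after the ? becomes the query string of the API request rather than part of the path, so such URLs may need different handling.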
After a few seconds, voila!... the recipe extracted from the page as JSON!
{
  "output": {
    "title": "Caramel Slice",
    "description": "This is a Caramel Slice that works as promised – the creamy caramel sets perfectly and will never be runny, the chocolate won't crack when cutting it and the caramel won't ooze out. It's an easy recipe with no thermometer required.",
    "ingredients": [
      "1 cup flour, plain/all purpose",
      "1/2 cup brown sugar, loosely packed",
      "1/2 cup desiccated coconut (US: sweetened finely shredded coconut)",
      "125g / 4.5oz unsalted butter, melted",
      "125g / 4.5oz unsalted butter, roughly chopped",
      "1/2 cup (80g) brown sugar, loosely packed",
      "1 tsp vanilla extract (or essence)",
      "395g / 14oz sweetened condensed milk (1 can, 300ml)",
      "200g / 7oz dark or milk melting chocolate (US: semi-sweet chocolate chips)",
      "1 tbsp vegetable oil"
    ],
    "steps": [
      "Preheat oven to 180°C/350°F (fan 160°C)",
      "Grease and line a 28x 18cm (lamington pan) / 7\" x 11\" rectangle pan with baking/parchment paper (Note 2). Have overhang for ease of removal.",
      "Mix together Base ingredients and press into a pan (I use an egg flip)",
      "Bake for 15 minutes until the surface is golden. Cool in fridge if you have time (Note 3).",
      "Lower oven to oven to 160°C/320°F (fan 140°C)",
      "Place butter, sugar and vanilla in a saucepan over medium low heat. When the butter is melted, whisk to combine with sugar, then just leave it until it comes to a simmer.",
      "When bubbles appear, add condensed milk. Whisk constantly for 5 minutes (Note 4), until you start getting some big slow bubbles on the base.",
      "Once bubbles start appearing, whisk for 1 minute, then pour onto Base. Tilt pan to spread evenly.",
      "Bake for 12 minutes. Don't worry if you get splotchy brown bits (this happens with ovens that don't distribute heat evenly).",
      "Cool on counter for 20 minutes then refrigerate 30 minutes - bottom of pan should be warm but surface cool (not cold) to touch. (Note 5)",
      "Place chocolate and oil in a microwave proof bowl. Microwave in 30 second bursts, stirring in between, until chocolate is fully melted (takes me 4 x 30 sec).",
      "Pour over caramel, spread with spatula. Then gently shake pan to make the surface completely flat.",
      "Refrigerate 1 hour or until set. Remove from fridge and leave out for 5 minutes to take chill out of chocolate slightly. Then cut into bars or squares to serve!"
    ]
  }
}
Here is an example of what is output when you provide a URL to something which doesn't contain a recipe.
https://c7jdzx7r36.execute-api.ap-southeast-2.amazonaws.com/prod/https://news.ycombinator.com/
{
  "error": "No recipe found in document",
  "cause": "The document appears to be a Hacker News article listing and does not contain any recipes."
}
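A client consuming the API needs to tell these two response shapes apart. The following is a minimal sketch of how that could be done, not part of the original project: the Recipe and NoRecipeError interfaces are derived from the fields named in the prompt template, classifyResponse and the other helper names are hypothetical, and it assumes a recipe arrives wrapped in an output field while an error may arrive unwrapped, as in the two examples above.

```typescript
interface Recipe {
  title: string;
  description: string;
  ingredients: string[];
  steps: string[];
}

interface NoRecipeError {
  error: string;
  cause: string;
}

type ExtractionResult =
  | { kind: 'recipe'; recipe: Recipe }
  | { kind: 'error'; error: NoRecipeError }
  | { kind: 'unknown' };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string');
}

function isRecipe(value: unknown): value is Recipe {
  return (
    isRecord(value) &&
    typeof value['title'] === 'string' &&
    typeof value['description'] === 'string' &&
    isStringArray(value['ingredients']) &&
    isStringArray(value['steps'])
  );
}

function isNoRecipeError(value: unknown): value is NoRecipeError {
  return (
    isRecord(value) &&
    typeof value['error'] === 'string' &&
    typeof value['cause'] === 'string'
  );
}

// Accepts the parsed JSON body and unwraps "output" when it is present.
function classifyResponse(json: unknown): ExtractionResult {
  const candidate = isRecord(json) && 'output' in json ? json['output'] : json;
  if (isRecipe(candidate)) return { kind: 'recipe', recipe: candidate };
  if (isNoRecipeError(candidate)) return { kind: 'error', error: candidate };
  return { kind: 'unknown' };
}
```

The 'unknown' case is there because the model's output is not guaranteed to match either shape, so the caller should not assume it does.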
Closing thoughts
I had a lot of fun building this, and hopefully this post highlights how serverless technologies and generative AI can be combined for novel outcomes. This is very much a toy/experimental project and is not intended for serious production use. To make it safer for production, a number of additional features would need to be added, such as improved error handling, authentication at the API layer and caching, to name a few.
If you'd like to try it out yourself, you can find the complete CDK app here: https://github.com/willdady/recipe-extractor-cdk
Photo by Syd Wachs on Unsplash
Will Dady | Sciencx (2024-07-20T05:28:25+00:00) Getting to the meat and potatoes of serverless recipe parsing with Amazon Bedrock. Retrieved from https://www.scien.cx/2024/07/20/getting-to-the-meat-and-potatoes-of-serverless-recipe-parsing-with-amazon-bedrock/