This content originally appeared on DEV Community and was authored by Sunil Kumar Dash
TL;DR
My mom has been running a small business for a while, and she gets a ton of invoices from clients, suppliers, and dealers. When I was younger, I always wondered why she seemed so mad at me. It wasn’t long before I realized she was constantly fumbling with these invoices.
So, I thought, why not create an AI bot that automatically retrieves emails, processes them, and organizes the necessary details into a spreadsheet?
Here’s how I built the bot that saved my mom 20 hours of hair-pulling and screaming at everyone.
- Use Composio's Gmail Integration to retrieve invoice emails from the inbox.
- Use an LLM to extract the relevant data points.
- Add the data points to a Google Sheet.
Composio - Open-source platform for AI tools & Integrations
Here’s a quick introduction to Composio.
Composio is an open-source tooling infrastructure for building robust and reliable AI applications. We provide 100+ tools and integrations across industry verticals, from CRM, HRM, and sales to productivity, dev, and social media.
Composio handles user authentication and authorization for all these applications, making connecting API endpoints with various AI models and frameworks simple.
Please help us with a star. 🥹
It would help us to create more articles like this 💖
Star the Composio repository ⭐
How does it work?
This project simplifies retrieving invoice emails from Gmail, downloading the invoice attachments, and extracting the key details into a Google Sheet.
- Sign up on GmailGenius (the app’s name) and link your Gmail account and Google Sheets.
- Enter keywords you want to search for in your Gmail.
- GmailGenius finds emails and attachments from Gmail that match your keyword criteria.
- Helpful information from the attachments is extracted and stored in your linked Google Sheet.
Technical Description
Under the hood, the AI tool divides the task into multiple steps and executes them:
- Retrieves emails from Gmail that match the keyword/phrase criteria.
- Downloads the relevant attachments.
- Extracts valuable attributes from the attachments using Nanonets.
- Stores the extracted data in the linked Google Sheet.
Tech Stack
- Frontend: React, Vite, and TailwindCSS.
- Authentication: Firebase.
- Backend: Python, FastAPI.
- AI Agent: CrewAI, Composio, Gemini.
Quick Description
- Composio: A toolkit for integrating apps with AI agents.
- CrewAI: An open-source framework for building collaborative multi-agent AI systems.
- Firebase: A Google platform providing backend services like databases, authentication, and hosting for web and mobile apps.
- React + Vite: A combination of React for building UIs and Vite for fast development and build tooling.
- FastAPI: A Python framework for quickly building REST APIs.
- Gemini: LLMs from Google.
Let’s Get Started 💥
To start quickly, fork and clone this repository.
The project has two parts: the back end and the front end. The back end consists of the AI tool built using CrewAI, Composio, and Gemini, and the front end has an interactive UI.
To set up the development environment, make setup.sh executable and run it.
cd GmailGenius/backend
chmod +x setup.sh
./setup.sh
For reference, this is the setup.sh file.
#!/bin/bash
# Create a virtual environment
echo "Creating virtual environment..."
python3 -m venv ~/.venvs/gmail_agent
# Activate the virtual environment
echo "Activating virtual environment..."
source ~/.venvs/gmail_agent/bin/activate
# Install libraries from requirements.txt
echo "Installing libraries from requirements.txt..."
pip install -r requirements.txt
# Login to your account
echo "Login to your Composio acount"
composio login
# Add Gmail and Google Sheets tools
echo "Add the Gmail and Google Sheets tools. Finish the auth flow"
composio add gmail
composio add googlesheets
#Enable Gmail trigger
composio triggers enable gmail_new_gmail_message
# Copy env backup to .env file
if [ -f ".env.example" ]; then
echo "Copying .env.example to .env..."
cp .env.example .env
else
echo "No .env.example file found. Creating a new .env file..."
touch .env
fi
# Prompt user to fill the .env file
echo "Please fill in the .env file with the necessary environment variables."
echo "Setup completed successfully!"
This will create a Python virtual environment and install the libraries from requirements.txt. You will also be prompted to log in to Composio. This will redirect you to the Composio login page.
Create an account on Composio and paste the displayed key into the terminal to log in to your Composio account.
You will then be redirected to the Google Authentication page to add the Gmail and Google Sheet integrations.
Once you are done with the integrations, you can visit the Composio dashboard and monitor them.
Building the backend
Now that we are finished with the integrations, let's build the backend.
Prerequisites
You will need API keys for Nanonets and Google’s Gemini to complete the project.
Nanonets
This will help extract the relevant data from the invoice PDFs. So, create an account with Nanonets and get a free API key. Copy the key and add it to the .env file. Also, set the Nanonets URL https://app.nanonets.com/api/v2/OCR/FullText in the .env file (the code reads it as NANO_URL).
Also, go to Google AI Studio and create an API key. Save it to the .env file as well (the code reads it as GEMINI_API_KEY).
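After these steps, your .env file should contain entries along these lines (the variable names match what the agent code below reads; the values shown here are placeholders):
NANO_API_KEY=your-nanonets-api-key
NANO_URL=https://app.nanonets.com/api/v2/OCR/FullText
GEMINI_API_KEY=your-gemini-api-key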
Building the AI Bot
Let’s start by creating the AI bot responsible for retrieving invoices from the Gmail inbox, processing the PDF attachments, and writing the extracted data to the Google Sheet.
Begin by importing the required modules.
import os
import re
import glob
import json
from composio.client.collections import TriggerEventData
from composio_crewai import Action, ComposioToolSet
from crewai import Agent, Crew, Task, Process
from crewai_tools.tools.base_tool import BaseTool
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from typing import Any, Dict
import requests
load_dotenv()
Create an instance for Gemini.
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash",
    verbose=True,
    temperature=0.5,
    google_api_key=os.environ.get("GEMINI_API_KEY"),
)
Define a tool that extracts data from the downloaded invoice attachments using Nanonets.
# Get the attachment that was most recently downloaded
def get_recent_attachment() -> str:
    pdf_files = glob.glob(os.path.join("/Users/abhishekpatil/.composio/output/", "*.pdf"))  # modify path as per need
    if not pdf_files:
        return None
    most_recent_pdf = max(pdf_files, key=os.path.getctime)
    return most_recent_pdf

# Extract useful attributes from the attachment
class extractorTool(BaseTool):
    name: str = "ExtractorTool"
    description: str = "This tool extracts useful attributes from pdf documents/attachments"

    def _run(self) -> Dict[str, Any]:
        attachment = get_recent_attachment()
        url = os.environ.get("NANO_URL")
        # Send the PDF to Nanonets and return the predicted attributes
        with open(attachment, "rb") as pdf_file:
            response = requests.post(
                url,
                auth=requests.auth.HTTPBasicAuth(os.environ.get("NANO_API_KEY"), ""),
                files={"file": pdf_file},
            )
        return json.loads(response.text)["result"][0]["prediction"]
In the above code block,
- We define a CrewAI tool, extractorTool, that parses PDFs and extracts information from them.
- The get_recent_attachment() function retrieves the most recently downloaded PDF.
We also want a tool to keep track of the inserted rows in the Google Sheet.
# Get the current counter value
def read_counter():
    try:
        with open("counter.json", "r") as file:
            data = json.load(file)
            return data["counter"]
    except FileNotFoundError:
        initial_data = {"counter": 0}
        with open("counter.json", "w") as file:
            json.dump(initial_data, file)
        return 0

# Tool to increment the counter value
class incrementCounter(BaseTool):
    name: str = "Increment Counter"
    description: str = "This tool increments the counter value"

    def _run(self):
        current_value = read_counter()
        new_value = current_value + 1
        data = {"counter": new_value}
        with open("counter.json", "w") as file:
            json.dump(data, file)
It tracks the row position in Google Sheets to add data accurately. Since we need to specify the exact cell (e.g., A1) for data entry, the counter helps determine the next available row, especially if some are already filled. The counter updates only when data is successfully added to ensure it reflects the correct position, preventing unnecessary updates when no data is added.
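To make that concrete, here is a minimal sketch of how the counter value maps to the target cell reference; next_target_cell is a hypothetical helper (not part of the project code) that mirrors the cell A{read_counter()} pattern used in the task description later on.
def next_target_cell(column: str = "A") -> str:
    # Hypothetical helper: builds a cell reference such as "A5" from the current counter value.
    return f"{column}{read_counter()}"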
Initialize the Composio tools for Gmail and Google Sheets, along with the tools we just defined.
# Tools
IncrementCounter = incrementCounter()
Extractor_tool = extractorTool()

composio_toolset = ComposioToolSet()
google_tools = composio_toolset.get_actions(
    actions=[
        # Action.GMAIL_FETCH_MESSAGE_BY_THREAD_ID,
        Action.GMAIL_GET_ATTACHMENT,
        Action.GMAIL_FETCH_EMAILS,
        Action.GOOGLESHEETS_BATCH_UPDATE,
    ]
)

tools = google_tools + [Extractor_tool, IncrementCounter]
We defined the extractor and counter tools above; the Google tools bundle three Composio actions:
- Action.GMAIL_GET_ATTACHMENT: Retrieves attachments from Gmail emails.
- Action.GMAIL_FETCH_EMAILS: Fetches emails from Gmail based on specific criteria.
- Action.GOOGLESHEETS_BATCH_UPDATE: Updates data in Google Sheets in bulk.
Next, define a CrewAI agent.
google_assistant = Agent(
role="Gmail Assistant",
goal="""Get five recent emails/messages and check if the thread ID matches, download attachments & extract attributes/information from it & store attributes in Google sheet (Store just the values & not the attribute names)""",
backstory=f"""You're an AI assistant that makes use of google tools/APIs and does work on behalf of user. You can extract attributes/information from attachments & Store them in Google sheet""",
verbose=True,
llm=llm,
tools=tools,
allow_delegation=False,
)
The agent is responsible for carrying out the tasks. Create an instance of Agent with:
- Role, Goal, and Backstory: These give the LLM additional context for completing the job.
- Verbose: Logs execution traces.
- LLM: The LLM instance (Gemini here).
- Tools: All the tools we defined earlier: the extractor tool, the counter tool, and the Google tools.
- allow_delegation: Set to False so the agent does not pass control flow to other agents (if any are available).
Next, define the event listener.
The event listener continuously monitors the Gmail inbox and fires when new emails arrive. You can define an event listener with trigger filters.
Here, we rely on the Gmail trigger we enabled during the Composio setup, which fetches new emails from the Gmail inbox.
Each event listener is paired with a callback function that is invoked when an event arrives via the trigger.
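The callback below assumes a listener object created from the Composio toolset. Here is a minimal sketch, assuming Composio's create_trigger_listener() helper is available in your version (check the Composio docs if it differs):
# Create a trigger listener from the toolset defined earlier.
# Assumption: composio_toolset.create_trigger_listener() exists in your Composio version.
listener = composio_toolset.create_trigger_listener()
With the listener in place, register the callback: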
@listener.callback(filters={"trigger_name": "GMAIL_NEW_GMAIL_MESSAGE"})
def callback_new_message(event: TriggerEventData) -> None:
    print("Received email")
    payload = event.payload
    thread_id = payload.get("threadId")

    keywords = "Invoice, Apple TV"
    attributes = "Invoice date, invoice number, invoice amount, currency"
    sheetId = "1UEzR3FG9Jk6Vl_2RvTgHJuDbuzZXHDMl6yAIX3NDwJU"
    sheetName = "Sheet1"

    find_invoice_from_gmail = Task(
        description=f"""
        Find emails with thread id {thread_id}, check if it contains keywords like {keywords}, if so then extract attachment id & attachment name associated with that email and download the attachment & store the following attributes: {attributes} in google sheet with id {sheetId} & sheet name {sheetName} and cell A{read_counter()}
        """,
        agent=google_assistant,
        expected_output="If email matches criteria ({keywords}) then download attachment & store attributes on google sheet & increment counter, otherwise indicate email isn't related",
    )

    gmail_processing_crew = Crew(
        agents=[google_assistant],
        tasks=[find_invoice_from_gmail],
        verbose=1,
        process=Process.sequential,
    )

    result = gmail_processing_crew.kickoff()
    return result

print("Subscription created!")
listener.listen()
In the callback function callback_new_message:
- We set the keywords to look for in emails to find invoices, the attributes to save to the Google Sheet, and the sheet ID and name.
- We define a CrewAI Task for the google_assistant agent with a clear description and expected output.
- Finally, we define the Crew with the agent and the task.
Building the API backend
Next, we will build an API endpoint to receive information from the front end. As I mentioned before, we will use FastAPI and Pydantic.
Import the required modules and set up logging.
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from agent import run_crew
import logging
# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
Create a FastAPI app and set up CORS using CORSMiddleware.
app = FastAPI()
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
- allow_origins=["*"]: Allows requests from any origin. This is useful for development but should be restricted in production (see the sketch after this list).
- allow_credentials=True: Allows cookies and HTTP authentication to be included in requests.
- allow_methods=["*"]: Allows all HTTP methods (GET, POST, PUT, DELETE, etc.).
- allow_headers=["*"]: Allows all headers in requests.
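For production, a tighter configuration might look like the following minimal sketch; the origin, methods, and headers listed here are placeholders to adjust to wherever your frontend is actually served from:
app.add_middleware(
    CORSMiddleware,
    # Placeholder origin: replace with your frontend's real origin (dev server or deployed domain).
    allow_origins=["https://your-frontend.example.com"],
    allow_credentials=True,
    allow_methods=["POST"],  # only the /fetch endpoint is exposed
    allow_headers=["Content-Type", "Authorization"],  # restrict to the headers you actually send
)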
Now, define a Pydantic class for Message.
class Message(BaseModel):
    emailKeywords: str
    attributes: str
    sheetId: str
Finally, define the POST endpoint.
@app.post("/fetch")
async def fetch(message: Message):
try:
logger.info(f"Received request with email keywords: {message.emailKeywords} and attributes: {message.attributes}")
result, status_code = run_crew(message.emailKeywords, message.attributes, message.sheetId)
logger.info(f"Agent result: {result}")
return {"result": result}
except Exception as e:
logger.error(f"Error occurred: {str(e)}")
raise HTTPException(status_code=500, detail=str(e))
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)
The endpoint receives user inputs from the front end and runs the crew using the input data.
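To sanity-check the endpoint, start the API (for example with python main.py, assuming the FastAPI code above lives in main.py) and send it a request. Here is a minimal sketch using the requests library; the sheet ID is a placeholder.
import requests

payload = {
    "emailKeywords": "Invoice, Apple TV",
    "attributes": "Invoice date, invoice number, invoice amount, currency",
    "sheetId": "your-google-sheet-id",  # placeholder: use your own sheet's ID
}

response = requests.post("http://localhost:8000/fetch", json=payload)
print(response.status_code, response.json())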
Building the Frontend
The application's front end is built with React and Vite.
Go to the src directory and install the dependencies.
npm install
Firebase Authentication
Firebase handles the authentication, so make sure you add the necessary values to the environment variables.
VITE_FIREBASE_API_KEY=your-firebase-api-key
VITE_FIREBASE_AUTH_DOMAIN=your-firebase-auth-domain
VITE_FIREBASE_PROJECT_ID=your-firebase-project-id
VITE_FIREBASE_STORAGE_BUCKET=your-firebase-storage-bucket
VITE_FIREBASE_MESSAGING_SENDER_ID=your-firebase-messaging-sender-id
VITE_FIREBASE_APP_ID=your-firebase-app-id
VITE_FIREBASE_MEASUREMENT_ID=your-firebase-measurement-id
Create a firebase.js file inside the config folder and define the Firebase configuration.
import { initializeApp } from "firebase/app";
import { getAnalytics } from "firebase/analytics";
import { getAuth, signOut } from "firebase/auth";
const firebaseConfig = {
apiKey: import.meta.env.VITE_FIREBASE_API_KEY,
authDomain: import.meta.env.VITE_FIREBASE_AUTH_DOMAIN,
projectId: import.meta.env.VITE_FIREBASE_PROJECT_ID,
storageBucket: import.meta.env.VITE_FIREBASE_STORAGE_BUCKET,
messagingSenderId: import.meta.env.VITE_FIREBASE_MESSAGING_SENDER_ID,
appId: import.meta.env.VITE_FIREBASE_APP_ID,
measurementId: import.meta.env.VITE_FIREBASE_MEASUREMENT_ID
};
const app = initializeApp(firebaseConfig);
const analytics = getAnalytics(app);
export const auth = getAuth(app);
export const logOut = () => {
signOut(auth).then(() => {
console.log("user signed out");
}).catch((error) => {
console.log("error signing user out");
});
}
This file does the following:
- Initializes Firebase: Firebase is set up using the project settings, connecting the app to Firebase.
- Sets up Analytics: Firebase Analytics helps you track user behavior and understand how people use your app.
- Enables Authentication: Authentication manages user login to the app.
- Creates a logout function: When called, this function signs the user out of the app.
Create Pages for the App
For this project, we have three pages.
- Login: The login page.
- Home: The Home page lists FAQs, etc.
- Dashboard: The main app user interface.
Designing the Login page
This is a simple login page with Firebase authentication.
import { signInWithRedirect, GoogleAuthProvider, signInWithPopup } from "firebase/auth";
import { auth } from "../config/firebase";
import { useNavigate } from "react-router-dom";
const Login = () => {
const provider = new GoogleAuthProvider();
const navigate = useNavigate();
const signUpWithGoogle = async () => {
try {
const result = await signInWithPopup(auth, provider);
const credential = GoogleAuthProvider.credentialFromResult(result);
navigate("/dashboard");
} catch (error) {
alert(error);
const errorCode = error.code;
const errorMessage = error.message;
// The email of the user's account used.
const email = error.customData.email;
// The AuthCredential type that was used.
const credential = GoogleAuthProvider.credentialFromError(error);
}
}
return <div className="min-h-screen text-gray-900 flex justify-center">
<div className="flex justify-center flex-1">
<div className="lg:w-1/2 xl:w-5/12 p-24 sm:p-12">
<div className="flex flex-col items-center mt-24">
<button
onClick={signUpWithGoogle}
className="w-full max-w-xs font-bold shadow-sm rounded-lg py-3 bg-indigo-100 text-gray-800 flex items-center justify-center transition-all duration-300 ease-in-out focus:outline-none hover:shadow focus:shadow-sm focus:shadow-outline">
<div className="bg-white p-2 rounded-full">
<svg className="w-4" viewBox="0 0 533.5 544.3">
<path
d="M533.5 278.4c0-18.5-1.5-37.1-4.7-55.3H272.1v104.8h147c-6.1 33.8-25.7 63.7-54.4 82.7v68h87.7c51.5-47.4 81.1-117.4 81.1-200.2z"
fill="#4285f4" />
<path
d="M272.1 544.3c73.4 0 135.3-24.1 180.4-65.7l-87.7-68c-24.4 16.6-55.9 26-92.6 26-71 0-131.2-47.9-152.8-112.3H28.9v70.1c46.2 91.9 140.3 149.9 243.2 149.9z"
fill="#34a853" />
<path
d="M119.3 324.3c-11.4-33.8-11.4-70.4 0-104.2V150H28.9c-38.6 76.9-38.6 167.5 0 244.4l90.4-70.1z"
fill="#fbbc04" />
<path
d="M272.1 107.7c38.8-.6 76.3 14 104.4 40.8l77.7-77.7C405 24.6 339.7-.8 272.1 0 169.2 0 75.1 58 28.9 150l90.4 70.1c21.5-64.5 81.8-112.4 152.8-112.4z"
fill="#ea4335" />
</svg>
</div>
<span className="ml-4">
Login/Sign Up with Google
</span>
</button>
</div>
</div>
</div>
</div>
}
export default Login;
The above code block does the following:
- Import modules: Imports the modules needed for Google authentication with Firebase.
- GoogleAuthProvider instance: Initializes a new GoogleAuthProvider instance to use Google as the authentication provider.
- Navigation: useNavigate creates a navigation function for redirecting users.
- signUpWithGoogle function: Asynchronously handles Google sign-in using a popup window (signInWithPopup). On success, it redirects the user to the "/dashboard" page. On error, it displays an alert and logs details like the error code, message, user's email, and credential type.
- Login button: Finally, it returns a login button styled with Tailwind.
Designing the Home page
Let’s now design the Home page.
The home page will contain a few FAQs and information regarding this project. So, feel free to skip this.
import Hero from "../components/Hero";
import Benefits from "../components/Benefits";
import FAQ from "../components/FAQ";
import Working from "../components/Working";
import ActionButton from "../components/ActionButton";
const Home = () => {
return <section className="bg-white dark:bg-gray-900 mt-12">
<div className="py-8 px-4 mx-auto max-w-screen-xl text-center lg:py-16 lg:px-12">
<Hero />
<Benefits />
<Working />
<FAQ />
<div className="mt-20">
<ActionButton displayName={"Get started"} link={"#"} />
</div>
</div>
</section>
}
export default Home;
This will create a simple Home page for the app.
Designing the Dashboard
The dashboard contains two text inputs: one for the keywords used to search your emails and one for the attributes you want to store in the spreadsheet.
import { useNavigate } from "react-router-dom";
import Fetch from "../components/Fetch";
import { useEffect, useState } from "react";
const Dashboard = ({ user }) => {
const navigate = useNavigate();
const [isLoading, setIsLoading] = useState(true);
useEffect(() => {
if (user === null) {
console.log("user: ", user);
navigate('/login');
} else {
setIsLoading(false);
}
}, [user, navigate]);
if (isLoading) {
return <div>Loading...</div>; // Or any loading indicator
}
return (
<section className="bg-white dark:bg-gray-900 mt-12">
<div className="py-8 px-4 mx-auto max-w-screen-xl text-center lg:py-16 lg:px-12">
<span className="font-semibold text-3xl text-gray-900">Enter Keywords (Crisp & Concise)</span>
<Fetch />
</div>
</section>
);
};
export default Dashboard;
This will create a nice, simple dashboard for accepting user information. The fetch button will trigger the backend to spring into action.
Setting Up User Authentication and Protected Routes in App.jsx
In the App.jsx file, we set up the primary component that manages user authentication and controls access to specific routes. This component ensures that only authenticated users can access particular parts of the application, providing a secure and personalized user experience.
import { BrowserRouter, Routes, Route, Navigate } from "react-router-dom";
import { onAuthStateChanged } from "firebase/auth";
import { auth } from "./config/firebase";
import Navbar from "./components/Navbar";
import Home from "./pages/Home";
import Footer from "./components/Footer";
import Dashboard from "./pages/Dashboard";
import ScrollToTop from "./components/ScrollToTop";
import { useState, useEffect } from "react";
import Login from "./pages/Login";
import Settings from "./pages/Settings";
import Agent from "./pages/Agent";
const ProtectedRoute = ({ user, children }) => {
if (!user) {
return <Navigate to="/login" replace />;
}
return children;
};
const App = () => {
const [user, setUser] = useState(null);
const [loading, setLoading] = useState(true);
useEffect(() => {
const unsubscribe = onAuthStateChanged(auth, (user) => {
setUser(user);
setLoading(false);
});
return () => unsubscribe();
}, []);
if (loading) {
return <div>Loading...</div>; // Or any loading indicator
}
return (
<BrowserRouter>
<Navbar user={user} />
<ScrollToTop />
<Routes>
<Route path="/login" element={<Login />} />
<Route path="/Agent" element={<Agent />} />
<Route path="/Settings" element={<Settings />} />
<Route
path="/dashboard"
element={
<ProtectedRoute user={user}>
<Dashboard user={user} />
</ProtectedRoute>
}
/>
<Route path="/" element={<Home />} />
</Routes>
<Footer />
</BrowserRouter>
);
}
export default App;
This is what is happening in the above component:
- Imports: Brings in React, Firebase, and routing libraries, along with custom components like Navbar and Footer, and the page components (Home, Dashboard, Login, Settings, Agent).
- ProtectedRoute component: Ensures that only authenticated users can access specific routes and redirects unauthenticated users to the login page.
- State management: Uses useState to manage the user object (the authenticated user) and the loading state.
- Authentication listener: useEffect sets up an onAuthStateChanged listener to track the authentication status and update the user state accordingly.
- Loading indicator: Displays a loading message while the authentication status is being determined.
- Router setup: Configures the routes using BrowserRouter and Routes, with Navbar and Footer displayed consistently across all pages. The routes include /login for the login page, /Agent and /Settings for their respective pages, /dashboard for the protected dashboard (only accessible to authenticated users), and / for the home page.
- Finally, it renders the app with navigation, page content, and consistent components like the navbar and footer.
Define the Entrypoint
Finally, define the main.jsx file as the entry point for the application.
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import App from './App.jsx'
import './index.css'
createRoot(document.getElementById('root')).render(
<StrictMode>
<App />
</StrictMode>,
)
Running the App
Finally, run the application using the following npm command.
npm run dev
This will start the front-end dev server at localhost:5345.
You can now visit the app and see it in action.
Thank you for reading.
Next Steps
In this article, you built a complete AI tool that retrieves invoices from Gmail, processes them, and records the extracted data in a Google Sheet.
If you liked the article, explore and star the Composio repository for more AI use cases.
Star the Composio repository ⭐