πŸš€ Setting Up Ollama & Running DeepSeek R1 Locally for a Powerful RAG System

πŸ€– Ollama

Ollama is a framework for running large language models (LLMs) locally on your machine. It lets you download, run, and interact with AI models without needing cloud-based APIs.

πŸ”Ή Example: `ollama run deepseek-r1:1.5b` – runs DeepSeek R1 locally.

πŸ”Ή Why use it? Free, private, fast, and works offline.

πŸ”— LangChain

LangChain is a Python/JS framework for building AI-powered applications by integrating LLMs with data sources, APIs, and memory.

πŸ”Ή Why use it? It helps connect LLMs to real-world applications like chatbots, document processing, and RAG.
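
For instance, here is a minimal sketch of calling a local Ollama model through LangChain (it assumes Ollama is already running and the model has been pulled, as covered in Step 2 below):

```python
# Minimal sketch: prompting a local Ollama model via LangChain.
# Assumes Ollama is running locally and deepseek-r1:1.5b has been pulled (Step 2).
from langchain_community.llms import Ollama

llm = Ollama(model="deepseek-r1:1.5b")
print(llm.invoke("Summarize retrieval-augmented generation in one sentence."))
```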

πŸ“„ RAG (Retrieval-Augmented Generation)

RAG is an AI technique that retrieves relevant external data (e.g., from PDFs or databases) and feeds it to the LLM as context before it generates a response.

πŸ”Ή Why use it? Improves accuracy and reduces hallucinations by referencing actual documents.

πŸ”Ή Example: AI-powered PDF Q&A system that fetches relevant document content before generating answers.
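
Conceptually, the flow looks like this (an illustrative sketch; the full app in Step 4 does the same thing with LangChain components):

```python
# Illustrative RAG flow: retrieve, augment, generate.
def answer(question, retriever, llm):
    docs = retriever.get_relevant_documents(question)      # 1. retrieve
    context = "\n\n".join(d.page_content for d in docs)    # 2. augment
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    return llm.invoke(prompt)                              # 3. generate
```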

⚑ DeepSeek R1

DeepSeek R1 is an open-source AI model optimized for reasoning, problem-solving, and factual retrieval.

πŸ”Ή Why use it? Strong logical capabilities, great for RAG applications, and can be run locally with Ollama.

πŸš€ How Do They Work Together?

  • Ollama runs DeepSeek R1 locally.
  • LangChain connects the AI model to external data.
  • RAG enhances responses by retrieving relevant information.
  • DeepSeek R1 generates high-quality answers.

πŸ’‘ Example Use Case: A Q&A system that allows users to upload a PDF and ask questions about it, powered by DeepSeek R1 + RAG + LangChain on Ollama! πŸš€

🎯 Why Run DeepSeek R1 Locally?

| Benefit | Cloud-Based Models | Local DeepSeek R1 |
| --- | --- | --- |
| Privacy | ❌ Data sent to external servers | βœ… 100% local & secure |
| Speed | ⏳ API latency & network delays | ⚑ Instant inference |
| Cost | πŸ’° Pay per API request | πŸ†“ Free after setup |
| Customization | ❌ Limited fine-tuning | βœ… Full model control |
| Deployment | 🌍 Cloud-dependent | πŸ”₯ Works offline & on-premises |

πŸ›  Step 1: Installing Ollama

πŸ”Ή Download Ollama

Ollama is available for macOS, Linux, and Windows. Follow these steps to install it:

1️⃣ Go to the official Ollama download page

πŸ”— Download Ollama

2️⃣ Select your operating system (macOS, Linux, Windows)

3️⃣ Click on the Download button

4️⃣ Install it following the system-specific instructions

πŸ›  Step 2: Running DeepSeek R1 on Ollama

Once Ollama is installed, you can run DeepSeek R1 models.

πŸ”Ή Pull the DeepSeek R1 Model

To pull the DeepSeek R1 1.5B-parameter model, run:

```bash
ollama pull deepseek-r1:1.5b
```

This will download and set up the DeepSeek R1 model.
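
You can verify the download by listing the models installed locally:

```bash
ollama list
```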

πŸ”Ή Running DeepSeek R1

Once the model is downloaded, you can interact with it by running:

```bash
ollama run deepseek-r1:1.5b
```

This initializes the model and opens an interactive prompt where you can send queries.
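
Ollama also serves a local HTTP API (on port 11434 by default), so you can query the model from scripts as well. For example:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```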

πŸ›  Step 3: Setting Up a RAG System Using Streamlit

Now that you have DeepSeek R1 running, let's integrate it into a retrieval-augmented generation (RAG) system using Streamlit.

πŸ”Ή Prerequisites

Before running the RAG system, make sure you have:

  • Python installed
  • A Conda environment (recommended for package management)
  • The required Python packages, installed as shown below:

```bash
pip install -U langchain langchain-community langchain-experimental
pip install streamlit pdfplumber sentence-transformers faiss-cpu ollama
```

For detailed setup, follow this guide:

πŸ”— Setting Up a Conda Environment for Python Projects
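
If you are using Conda, creating and activating a fresh environment looks like this (the environment name `rag` is just an example):

```bash
conda create -n rag python=3.11
conda activate rag
```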

πŸ›  Step 4: Running the RAG System

πŸ”Ή Clone or Create the Project

1️⃣ Create a new project directory

```bash
mkdir rag-system && cd rag-system
```

2️⃣ Create a Python script (app.py)
Paste the following Streamlit-based script:

```python
import streamlit as st
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_experimental.text_splitter import SemanticChunker
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate
from langchain.chains.llm import LLMChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains import RetrievalQA

# Streamlit UI
st.title("πŸ“„ RAG System with DeepSeek R1 & Ollama")

uploaded_file = st.file_uploader("Upload your PDF file here", type="pdf")

if uploaded_file:
    # Persist the upload to disk so the loader can read it
    with open("temp.pdf", "wb") as f:
        f.write(uploaded_file.getvalue())

    # Extract text from the PDF
    loader = PDFPlumberLoader("temp.pdf")
    docs = loader.load()

    # Split the text into semantically coherent chunks
    text_splitter = SemanticChunker(HuggingFaceEmbeddings())
    documents = text_splitter.split_documents(docs)

    # Embed the chunks and index them in a FAISS vector store
    embedder = HuggingFaceEmbeddings()
    vector = FAISS.from_documents(documents, embedder)
    retriever = vector.as_retriever(search_type="similarity", search_kwargs={"k": 3})

    # Local DeepSeek R1 model served by Ollama
    llm = Ollama(model="deepseek-r1:1.5b")

    prompt = """
    Use the following context to answer the question.
    Context: {context}
    Question: {question}
    Answer:"""

    QA_PROMPT = PromptTemplate.from_template(prompt)

    # Chain that stuffs the retrieved chunks into the prompt's {context} slot
    llm_chain = LLMChain(llm=llm, prompt=QA_PROMPT)
    combine_documents_chain = StuffDocumentsChain(llm_chain=llm_chain, document_variable_name="context")

    # Retrieval + generation wired together in one chain
    qa = RetrievalQA(combine_documents_chain=combine_documents_chain, retriever=retriever)

    user_input = st.text_input("Ask a question about your document:")

    if user_input:
        response = qa(user_input)["result"]
        st.write("**Response:**")
        st.write(response)
```

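One DeepSeek R1 quirk worth handling: the model emits its chain-of-thought inside <think>…</think> tags. If you only want the final answer shown in the UI, a small post-processing sketch like this (dropped in before st.write(response)) removes it:

```python
import re

# DeepSeek R1 wraps its reasoning in <think>...</think>; keep only the answer.
clean_response = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()
```
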
πŸ›  Step 5: Running the App

Once the script is ready, start your Streamlit app:

```bash
streamlit run app.py
```

Streamlit will print a local URL (http://localhost:8501 by default); open it in your browser, upload a PDF, and start asking questions.

πŸ”— Check the GitHub repo for the complete code.
πŸ”— Learn the basics here.

🎯 Final Thoughts

βœ… You have successfully set up Ollama and DeepSeek R1!

βœ… You can now build AI-powered RAG applications with local LLMs!

βœ… Try uploading PDFs and asking questions dynamically.

πŸ’‘ Want to learn more? Follow my Dev.to blog for more development tutorials! πŸš€

