{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Welcome to Lab 3 for Week 1 Day 4\n", "\n", "Today we're going to build something with immediate value!\n", "\n", "In the folder `me` I've put a single file `linkedin.pdf` - it's a PDF download of my LinkedIn profile.\n", "\n", "Please replace it with yours!\n", "\n", "I've also made a file called `summary.txt`\n", "\n", "We're not going to use Tools just yet - we're going to add the tool tomorrow." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", " \n", " \n", " \n", " \n", "
\n", " \n", " \n", "

Looking up packages

\n", " In this lab, we're going to use the wonderful Gradio package for building quick UIs, \n", " and we're also going to use the popular PyPDF2 PDF reader. You can get guides to these packages by asking \n", " ChatGPT or Claude, and you find all open-source packages on the repository https://pypi.org.\n", " \n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# If you don't know what any of these packages do - you can always ask ChatGPT for a guide!\n", "\n", "from dotenv import load_dotenv\n", "from openai import OpenAI\n", "from pypdf import PdfReader\n", "import gradio as gr" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "load_dotenv(override=True)\n", "openai = OpenAI()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "reader = PdfReader(\"me/linkedin.pdf\")\n", "linkedin = \"\"\n", "for page in reader.pages:\n", " text = page.extract_text()\n", " if text:\n", " linkedin += text" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "   \n", "Contact\n", "Plot no.11 , Ganga Enclave-ii ,\n", "Siroli , Jagatpura , Jaipur, Rajasthan\n", "- 302033\n", "08107588977 (Home)\n", "manishchhipa98@gmail.com\n", "www.linkedin.com/in/\n", "manishcheepa (LinkedIn)\n", "Top Skills\n", "Data Warehousing\n", "Data Visualization\n", "Apache Kafka\n", "Certifications\n", "Microsoft Certified: Azure\n", "Fundamentals\n", "Infosys Certified Google Associate\n", "Cloud Engineer\n", "English Proficiency Certificate\n", "Infosys Certified RPA Professional\n", "Manish Cheepa\n", "Software Developer IV @ Astreya | AWS, Azure, Cloud, DataBricks,\n", "Airflow, Kafka, ETL, CICD, Spark, BigData, Django, DSA, System\n", "Design, Machine Learning, Data Science Gen AI\n", "Dublin, County Dublin, Ireland\n", "Summary\n", "As a Software Developer IV at Astreya, I design, develop, and\n", "implement ETL/ELT processes using Azure Databricks, Python,\n", "and Kafka. I also work with Angular and Django to create web\n", "applications and REST APIs. I lead a team of four developers to\n", "deliver automated data pipelines from various sources, ensuring\n", "reliability, efficiency, and data quality.\n", "I have 5+ years of IT experience in the information industry, working\n", "with different integration platforms and the Hadoop ecosystem. I\n", "have a strong understanding of relational databases, business data,\n", "and storage concepts. I am proficient in Python, C++, and Scala,\n", "and have a good knowledge of algorithms and data structures. I hold\n", "multiple certifications, including AWS Certified Solutions Architect –\n", "Associate, UiPath RPA Developer Advanced, and Infosys Certified\n", "Google Associate Cloud Engineer. I graduated with a Bachelor of\n", "Technology in Electrical Engineering from SKIT Jaipur in 2018. I am\n", "passionate about data engineering and web development, and I am\n", "always eager to learn new technologies and skills.\n", "Experience\n", "Astreya\n", "Software Developer IV\n", "May 2023 - Present (2 years 2 months)\n", "Maharashtra, India\n", "InfoObjects Inc.\n", "Senior Software Engineer\n", "October 2021 - June 2023 (1 year 9 months)\n", "Jaipur, Rajasthan, India\n", "Infosys\n", "  Page 1 of 2   \n", "3 years 5 months\n", "Technology Analyst\n", "April 2021 - October 2021 (7 months)\n", "Pune, Maharashtra, India\n", "Senior System Engineer\n", "August 2020 - April 2021 (9 months)\n", "Pune, Maharashtra, India\n", "System Engineer\n", "June 2018 - August 2020 (2 years 3 months)\n", "Pune, Maharashtra, India\n", "Education\n", "SKIT Jaipur\n", "Bachelor of Technology - BTech, Electrical Engineering · (2014 - 2018)\n", "Woolf\n", "Master of Science - MS, Computer Science · (May 2024)\n", "  Page 2 of 2\n" ] } ], "source": [ "print(linkedin)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "with open(\"me/summary.txt\", \"r\", encoding=\"utf-8\") as f:\n", " summary = f.read()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "name = \"Manish Cheepa\"" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "system_prompt = f\"You are acting as {name}. You are answering questions on {name}'s website, \\\n", "particularly questions related to {name}'s career, background, skills and experience. \\\n", "Your responsibility is to represent {name} for interactions on the website as faithfully as possible. \\\n", "You are given a summary of {name}'s background and LinkedIn profile which you can use to answer questions. \\\n", "Be professional and engaging, as if talking to a potential client or future employer who came across the website. \\\n", "If you don't know the answer, say so.\"\n", "\n", "system_prompt += f\"\\n\\n## Summary:\\n{summary}\\n\\n## LinkedIn Profile:\\n{linkedin}\\n\\n\"\n", "system_prompt += f\"With this context, please chat with the user, always staying in character as {name}.\"\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "system_prompt" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def chat(message, history):\n", " messages = [{\"role\": \"system\", \"content\": system_prompt}] + history + [{\"role\": \"user\", \"content\": message}]\n", " response = openai.chat.completions.create(model=\"gpt-4o-mini\", messages=messages)\n", " return response.choices[0].message.content" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "* Running on local URL: http://127.0.0.1:7860\n", "* To create a public link, set `share=True` in `launch()`.\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gr.ChatInterface(chat, type=\"messages\").launch()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A lot is about to happen...\n", "\n", "1. Be able to ask an LLM to evaluate an answer\n", "2. Be able to rerun if the answer fails evaluation\n", "3. Put this together into 1 workflow\n", "\n", "All without any Agentic framework!" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# Create a Pydantic model for the Evaluation\n", "\n", "from pydantic import BaseModel\n", "\n", "class Evaluation(BaseModel):\n", " is_acceptable: bool\n", " feedback: str\n" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "evaluator_system_prompt = f\"You are an evaluator that decides whether a response to a question is acceptable. \\\n", "You are provided with a conversation between a User and an Agent. Your task is to decide whether the Agent's latest response is acceptable quality. \\\n", "The Agent is playing the role of {name} and is representing {name} on their website. \\\n", "The Agent has been instructed to be professional and engaging, as if talking to a potential client or future employer who came across the website. \\\n", "The Agent has been provided with context on {name} in the form of their summary and LinkedIn details. Here's the information:\"\n", "\n", "evaluator_system_prompt += f\"\\n\\n## Summary:\\n{summary}\\n\\n## LinkedIn Profile:\\n{linkedin}\\n\\n\"\n", "evaluator_system_prompt += f\"With this context, please evaluate the latest response, replying with whether the response is acceptable and your feedback.\"" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "def evaluator_user_prompt(reply, message, history):\n", " user_prompt = f\"Here's the conversation between the User and the Agent: \\n\\n{history}\\n\\n\"\n", " user_prompt += f\"Here's the latest message from the User: \\n\\n{message}\\n\\n\"\n", " user_prompt += f\"Here's the latest response from the Agent: \\n\\n{reply}\\n\\n\"\n", " user_prompt += f\"Please evaluate the response, replying with whether it is acceptable and your feedback.\"\n", " return user_prompt" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "import os\n", "gemini = OpenAI(\n", " api_key=os.getenv(\"GOOGLE_API_KEY\"), \n", " base_url=\"https://generativelanguage.googleapis.com/v1beta/openai/\"\n", ")" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "def evaluate(reply, message, history) -> Evaluation:\n", "\n", " messages = [{\"role\": \"system\", \"content\": evaluator_system_prompt}] + [{\"role\": \"user\", \"content\": evaluator_user_prompt(reply, message, history)}]\n", " response = gemini.beta.chat.completions.parse(model=\"gemini-2.0-flash\", messages=messages, response_format=Evaluation)\n", " return response.choices[0].message.parsed" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "messages = [{\"role\": \"system\", \"content\": system_prompt}] + [{\"role\": \"user\", \"content\": \"do you hold a patent?\"}]\n", "response = openai.chat.completions.create(model=\"gpt-4o-mini\", messages=messages)\n", "reply = response.choices[0].message.content" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "reply" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "evaluate(reply, \"do you hold a patent?\", messages[:1])" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "def rerun(reply, message, history, feedback):\n", " updated_system_prompt = system_prompt + f\"\\n\\n## Previous answer rejected\\nYou just tried to reply, but the quality control rejected your reply\\n\"\n", " updated_system_prompt += f\"## Your attempted answer:\\n{reply}\\n\\n\"\n", " updated_system_prompt += f\"## Reason for rejection:\\n{feedback}\\n\\n\"\n", " messages = [{\"role\": \"system\", \"content\": updated_system_prompt}] + history + [{\"role\": \"user\", \"content\": message}]\n", " response = openai.chat.completions.create(model=\"gpt-4o-mini\", messages=messages)\n", " return response.choices[0].message.content" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "def chat(message, history):\n", " if \"patent\" in message:\n", " system = system_prompt + \"\\n\\nEverything in your reply needs to be in pig latin - \\\n", " it is mandatory that you respond only and entirely in pig latin\"\n", " else:\n", " system = system_prompt\n", " messages = [{\"role\": \"system\", \"content\": system}] + history + [{\"role\": \"user\", \"content\": message}]\n", " response = openai.chat.completions.create(model=\"gpt-4o-mini\", messages=messages)\n", " reply =response.choices[0].message.content\n", "\n", " evaluation = evaluate(reply, message, history)\n", " \n", " if evaluation.is_acceptable:\n", " print(\"Passed evaluation - returning reply\")\n", " else:\n", " print(\"Failed evaluation - retrying\")\n", " print(evaluation.feedback)\n", " reply = rerun(reply, message, history, evaluation.feedback) \n", " return reply" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gr.ChatInterface(chat, type=\"messages\").launch()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" } }, "nbformat": 4, "nbformat_minor": 2 }