Repo

  • https://github.com/pjq/ChatGPT-SummaryBot

Most of the code was generated by ChatGPT; all you need to do is provide appropriate prompts and debug the result. Even this document was generated by ChatGPT.

This is a Python script for fetching documents from the Internet and feeding them to the OpenAI ChatGPT model to summarize the contents. The script consists of several parts:

  • Fetching all the links and contents from a website using requests and BeautifulSoup
  • Storing the mapping of links to contents in a text file
  • Splitting the text file into smaller files for easier processing
  • Sending the smaller files to the OpenAI ChatGPT model using the revChatGPT library
  • Receiving and printing the summary of the documents from the OpenAI ChatGPT model
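
The splitting step can be illustrated in isolation. The following is a minimal sketch, not the script's own code: `split_bytes` is a hypothetical helper that works on an in-memory byte string instead of a file, and the sample text is made up. It shows why the pieces are decoded with errors="replace" when they are read back: a byte-based cut can fall in the middle of a multi-byte UTF-8 character.

```python
from typing import List


def split_bytes(data: bytes, chunk_size: int = 4096) -> List[bytes]:
    """Cut data into consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


# Made-up sample line in the "link: content" layout used by the mapping file.
text = "https://example.com/a: 你好, summary bot\n"
chunks = split_bytes(text.encode("utf-8"), chunk_size=8)

# Decoding each piece separately may hit a character that was cut in half,
# so errors="replace" substitutes a placeholder instead of raising.
decoded = [c.decode("utf-8", errors="replace") for c in chunks]
print(len(chunks), decoded)
```

Joining the raw chunks always reproduces the original bytes; only the per-chunk decoding is lossy at the cut points.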

Requirements

  • Python 3.7+
  • requests
  • beautifulsoup4 (imported as bs4)
  • revChatGPT

Usage

  1. Install the required libraries:
pip3 install requests beautifulsoup4 revChatGPT
Or
pip3 install -r requirements.txt

Then follow the revChatGPT instructions to set up the ChatGPT config:

  • https://github.com/acheong08/ChatGPT
  2. Run the script:
python AutoSummaryBot.py

The script first prints a message announcing that it is going to send some documents to the OpenAI ChatGPT model. It then sends the smaller files one by one, waiting 5 seconds (the default sleep_time) after each request. If ChatGPT replies with anything other than an acknowledgement, the script reminds it to answer only “Received and understood”. After all the files are sent, the script prints the message “All documents sent”. Finally, it asks the model for a summary of the documents and prints the response.
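
The conversation flow described above can be tried offline with a stand-in for the chatbot. This is only a sketch under stated assumptions: `run_session`, `fake_ask`, and the canned replies are hypothetical, and the preamble wording is shortened. Unlike the script, the sketch does not expect an acknowledgement for the final summary request.

```python
import time
from typing import Callable, Iterable

END_FLAG = "All documents sent"
ACK_REMINDER = "You should only response: Received and understood"


def run_session(ask: Callable[[str], str], documents: Iterable[str],
                sleep_time: float = 5.0, retries: int = 2) -> str:
    """Send documents one by one, then the end flag, then ask for a summary."""

    def send(text: str, expect_ack: bool) -> str:
        reply = ask(text)
        attempts = 0
        # Re-prompt when the model comments on the text instead of acknowledging it.
        while expect_ack and "understood" not in reply.lower() and attempts < retries:
            reply = ask(ACK_REMINDER)
            attempts += 1
        time.sleep(sleep_time)
        return reply

    send("I am going to send you some documents; reply only: Received and understood. "
         f"I will say '{END_FLAG}' when finished.", expect_ack=True)
    for doc in documents:
        send(doc, expect_ack=True)
    send(END_FLAG, expect_ack=True)
    # The summary request is the one message that should get a real answer.
    return send("Summary of the documents I just sent to you", expect_ack=False)


def fake_ask(prompt: str) -> str:
    """Hypothetical stand-in for the real chatbot call."""
    if prompt.startswith("Summary"):
        return "The documents describe a blog post."
    return "Received and understood."


print(run_session(fake_ask, ["doc one", "doc two"], sleep_time=0))
```

Passing the `ask` callable in as a parameter keeps the flow testable without a ChatGPT account; the real call would be wired in at that one place.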

I:init_chatgpt
I:fetch_link_contents
I:scrape_page: https://pjq.me/?p=1906
I:scrape_page: https://pjq.me/?p=1906#respond
I:scrape_page: https://pjq.me/?p=1906
I:I am going to send you some documents, and you just need say: received and understood, and after all the files sent finished, I will let you know, such as: All documents sent, and later I will ask questions
ChatGPT: Understood! I am ready to receive the documents. Please send them my way and let me know when you have finished sending all of them. I will be sure to acknowledge each document as I receive it, and once all have been received, I will be ready to answer any questions you may have about them.
I:You should only response: Received and understood
ChatGPT: Received and understood.
I:Sleep 5 seconds
ChatGPT: Received and understood.
I:Sleep 5 seconds
I:Sending file: link_to_contents_1.txt
ChatGPT: Received and understood.
I:Sleep 5 seconds
I:Sending file: link_to_contents_2.txt
ChatGPT: It looks like this is a blog post or a website that contains information about various topics, such as technology, software, and personal experiences. The categories listed include English, Tech, Android, Linux, Software, and personal reflections. The archives show a list of past posts, and the recent comments section displays comments on various blog posts. The tags section lists keywords that are related to the content on the website.
I:You should only response: Received and understood
ChatGPT: Received and understood.
I:Sleep 5 seconds
ChatGPT: Received and understood
I:Sleep 5 seconds
I:Sending file: link_to_contents_3.txt
ChatGPT: It looks like you have successfully integrated the ChatGPT model with the Xiao Ai smart home assistant. It appears that you used a project called "Xiaoai-ChatGPT" as a reference for this integration. This project connects the Xiao Ai device to the ChatGPT API through a series of steps, including cloning the ChatGPT repository, installing the "revChatGPT" and "miservice" packages, and setting environment variables for your Xiaomi account. The final step involves running the "xiaogpt.py" script to initiate the connection between the Xiao Ai device and the ChatGPT API. When you ask the Xiao Ai device a question that starts with "帮我回答", the question will be forwarded to the ChatGPT API, and the answer will be played back through the Xiao Ai device.
I:You should only response: Received and understood
ChatGPT: Received and understood.
I:Sleep 5 seconds
ChatGPT: Received and understood.
I:Sleep 5 seconds
I:Sending file: link_to_contents_4.txt
ChatGPT: I'm sorry, but I am not sure what you would like me to do with this information. Could you please provide more context or clarify your request?
I:You should only response: Received and understood
ChatGPT: Received and understood.
I:Sleep 5 seconds
ChatGPT: Received and understood.
I:Sleep 5 seconds
I:Sending file: link_to_contents_5.txt
ChatGPT: Received and understood.
I:Sleep 5 seconds
I:All documents sent
ChatGPT: Received and understood.
I:Sleep 5 seconds
I:Summary of the documents I just sent to you
ChatGPT: I'm sorry, but I didn't receive any documents. Can you please provide more information or resend the documents?
I:You should only response: Received and understood
ChatGPT: Received and understood.
I:Sleep 5 seconds
ChatGPT: I'm sorry, but you didn't send me any documents recently.
I:You should only response: Received and understood
ChatGPT: Received and understood.
I:Sleep 5 seconds
I:Sleep 5 seconds
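
The log above shows the same page being scraped three times: twice under the identical URL and once with a #respond fragment. One possible way to avoid this, sketched here as an assumption and not as part of the script, is to strip fragments and drop repeats before scraping. `unique_pages` is a hypothetical helper built on the standard library's `urllib.parse.urldefrag`.

```python
from typing import Iterable, List
from urllib.parse import urldefrag


def unique_pages(links: Iterable[str]) -> List[str]:
    """Drop #fragments and repeated URLs while keeping first-seen order."""
    seen = set()
    pages = []
    for link in links:
        page = urldefrag(link).url  # removes a trailing "#respond" and similar
        if page and page not in seen:
            seen.add(page)
            pages.append(page)
    return pages


# The three links that appear in the log above.
links = [
    "https://pjq.me/?p=1906",
    "https://pjq.me/?p=1906#respond",
    "https://pjq.me/?p=1906",
]
print(unique_pages(links))  # ['https://pjq.me/?p=1906']
```

Fewer duplicate pages also means fewer chunks to send, and therefore fewer requests and sleeps.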

Customization

You can customize the script by changing the following parameters:

  • base_url: the website to fetch the links and contents from
  • chunk_size: the size of each smaller file in bytes
  • retries: the number of retries when sending a file to the OpenAI ChatGPT model
  • sleep_time: the time to wait between each file in seconds
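
These parameters are exposed as command-line flags in the script's argument parser (--base-url, --chunk-size, --retries, --sleep-time). The sketch below repeats those flag definitions with their defaults and parses a sample argument list; the `build_parser` wrapper and the example.com URL are illustrative assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with the same flags and defaults as the script's entry point."""
    parser = argparse.ArgumentParser(description="Fetch a website and summarize it")
    parser.add_argument("--base-url", type=str, default="https://pjq.me/?p=1906")
    parser.add_argument("--chunk-size", type=int, default=1024 * 4)
    parser.add_argument("--retries", type=int, default=2)
    parser.add_argument("--sleep-time", type=int, default=5)
    return parser


# Equivalent to:
#   python AutoSummaryBot.py --base-url https://example.com --chunk-size 2048
args = build_parser().parse_args(["--base-url", "https://example.com", "--chunk-size", "2048"])
print(args.base_url, args.chunk_size, args.retries, args.sleep_time)
```

Flags that are omitted keep their defaults, so only the values you want to change need to be passed.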

Conclusion

This script provides a simple and automated way to fetch documents from the Internet and summarize their contents using the OpenAI ChatGPT model. You can use it as a starting point for your own projects and experiments.

Scripts

import os
import time

import requests
import argparse
from bs4 import BeautifulSoup
from revChatGPT.V1 import Chatbot, configure

END_FLAG = "All documents sent"
LINK_TO_CONTENT = "link_to_contents.txt"
LINK_TO_CONTENT_DIR = "link_to_content"


def log(msg):
    print(f"I:{msg}")


class AutoChatBot:
    def __init__(self, base_url, chunk_size, retries, sleep_time):
        self.base_url = base_url
        self.chunk_size = chunk_size
        self.retries = retries
        self.sleep_time = sleep_time
        self.init_chatgpt()

    def extract_text(self, content):
        soup = BeautifulSoup(content, "html.parser")
        for script in soup(["script", "style"]):
            script.decompose()
        return " ".join(soup.stripped_strings)

    def scrape_page(self, url):
        log(f"scrape_page: {url}")
        page = requests.get(url)
        return self.extract_text(page.content)

    def fetch_link_contents(self):
        log("fetch_link_contents")
        page_contents = {}
        links = []
        # Fetch the main page
        main_page = requests.get(self.base_url)
        soup = BeautifulSoup(main_page.content, "html.parser")
        # Extract all links from the main page
        for link in soup.find_all("a"):
            href = link.get("href")
            if href and self.base_url in href:
                links.append(href)
        # Fetch the contents of each link
        if not links:
            log("no links found, so there are no contents to fetch")
            raise SystemExit(0)
        for link in links:
            page_contents[link] = self.scrape_page(link)
        # Store the mapping of link to contents
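        # Write as UTF-8 so the file matches the encoding used when the chunks are read back
        with open(LINK_TO_CONTENT, "w", encoding="utf-8") as f: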
            for link, content in page_contents.items():
                f.write(f"{link}: {content}\n")

    def split_file(self, file_path, output_dir, chunk_size=1024 * 4):
        # Create the directory passed in as output_dir (not the global constant)
        os.makedirs(output_dir, exist_ok=True)
        with open(file_path, "rb") as f:
            chunk = f.read(chunk_size)
            chunk_num = 1
            while chunk:
                output_file = os.path.join(output_dir, f"link_to_contents_{chunk_num}.txt")
                with open(output_file, "wb") as output_f:
                    output_f.write(chunk)
                chunk = f.read(chunk_size)
                chunk_num += 1

    def init_chatgpt(self):
        self.chatbot = Chatbot(configure())
        log("init_chatgpt")

    def chatbot_ask(self, query):
        retries = 0
        while retries < self.retries:
            try:
                data = list(self.chatbot.ask(query))[-1]
                message = data.get("message", "")
                if message:
                    message = "ChatGPT: " + message
                    print(message)
                    if query not in END_FLAG:
                        if "understood" not in message:
                            log("You should only response: Received and understood")
                            self.chatbot_ask("You should only response: Received and understood")
                            retries += 1
                            continue
                break
            except Exception as e:
                # log() takes a single argument, so format the error into one string
                log(f"Error in chatbot_ask: {e}")
                retries += 1
        log(f"Sleep {self.sleep_time} seconds")
        time.sleep(self.sleep_time)

    def send_all_files_to_chatgpt(self):
        path = LINK_TO_CONTENT_DIR
        files = os.listdir(path)
        files.sort(key=lambda x: os.path.getctime(os.path.join(path, x)))
        for filename in files:
            with open(os.path.join(path, filename), "r", encoding="utf-8", errors="replace") as f:
                log(f"Sending file: {filename}")
                content = f.read()
                self.chatbot_ask(content)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Fetch the contents from a website and summarize the contents using OpenAI GPT-3")
    parser.add_argument("--base-url", type=str, default="https://pjq.me/?p=1906",
                        help="The website to fetch the links and contents from")
    parser.add_argument("--chunk-size", type=int, default=1024 * 4, help="The size of each smaller file in bytes")
    parser.add_argument("--retries", type=int, default=2,
                        help="The number of retries when sending a file to the OpenAI GPT-3 model")
    parser.add_argument("--sleep-time", type=int, default=5, help="The time to wait between each file in seconds")
    args = parser.parse_args()

    bot = AutoChatBot(base_url=args.base_url, chunk_size=args.chunk_size, retries=args.retries, sleep_time=args.sleep_time)
    bot.fetch_link_contents()
    bot.split_file(LINK_TO_CONTENT, LINK_TO_CONTENT_DIR, args.chunk_size)
    log(f"I am going to send you some documents, and you just need say: received and understood, and after all the files sent finished, I will let you know, such as: {END_FLAG}, and later I will ask questions")
    bot.chatbot_ask(
        f"I am going to send you some text documents, and you just need say: received and understood, and after all the documents sent finished, I will let you know, such as: {END_FLAG}, and later I will ask questions about those documents")
    bot.send_all_files_to_chatgpt()
    log(END_FLAG)
    bot.chatbot_ask(END_FLAG)
    log("Summary of the documents I just sent to you")
    bot.chatbot_ask("Summary of the documents I just sent to you")
ChatGPT-AutoSummaryBot