In this story, we will take a closer look at LangChain's LLMChain class. According to LangChain's documentation, LLMChain allows you to define a prompt template and then send a list of key-value pairs to that prompt template for the large language model to process.
You would typically use LLMChain with a workflow like this:
The steps are as follows:
key_value_list = [
    {'content': "Across the UK, fossil fuel companies' broken promises have left scarred and polluted landscapes, and no one has been held accountable.",
     'title': "As the toxic legacy of opencast mining in Wales"},
    {'content': "The solution is not more fields but better, more compact, cruelty-free and pollution-free factories",
     'title': "We think farming is good and factories are bad"},
]
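To make the mechanism concrete: for each dictionary in the key-value list, LLMChain fills the placeholders of the prompt template and sends the resulting prompt to the model. A minimal sketch of just the template-filling step (no API call, using Python's built-in str.format; the sample records are illustrative):

```python
# Sketch of the template-filling step that LLMChain performs for each
# dictionary in the key-value list (the actual LLM call is omitted here).
prompt_template = ("Please tell me the sentiment of {content} "
                   "with this title: {title}?")

key_value_list = [
    {'content': "Fossil fuel companies' broken promises have left polluted landscapes.",
     'title': "The toxic legacy of opencast mining in Wales"},
]

# LLMChain.apply() effectively formats the template once per input dictionary.
filled_prompts = [prompt_template.format(**kv) for kv in key_value_list]
print(filled_prompts[0])
```

The real chain would then send each filled prompt to the model and collect the responses.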
To give you a better idea of how you can use LLMChain, we created a simple script which applies multiple prompt templates to The Guardian's (a British newspaper) RSS feeds. RSS feeds contain lists of text-based records which can be converted into a key-value list.
The purpose of this script is to obtain the sentiment and keywords of a set of articles and to categorize the articles on the basis of a pre-defined set of categories.
The script then counts the sentiments, categories and keywords found on the columnist’s RSS feed:
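The counting is done with Python's collections.Counter, which the script uses for both sentiments and categories. A small self-contained illustration with made-up sentiment labels:

```python
from collections import Counter

# Hypothetical per-article sentiment labels, standing in for what the
# LLM prompts would return for one columnist's feed.
sentiments = ['negative', 'very negative', 'negative', 'neutral']

sentiment_counter = Counter()
for s in sentiments:
    sentiment_counter[s] += 1

# most_common() sorts the tallies by count, descending.
print(sentiment_counter.most_common())
```

The same pattern works for categories, where `Counter.update()` can ingest a whole list of labels at once.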
The model we used is gpt-3.5-turbo.
The script can be viewed on GitHub:
http://github.com/gilfernandes/llm_chain_out/blob/main/langchain_llm_chain_extract.py
The command line script performs the following operations:
for url in sys.argv[1:]:
    process_url(url)

def process_url(url):
    """
    Extracts the content of each RSS feed. Sends the content of each RSS feed
    to the LLMChain to apply the prompts to the extracted records. Creates a
    data set for each RSS feed which combines the output of the LLM and
    generates an HTML and Excel file out of it.
    :param url: The url of the RSS feed, e.g: http://www.theguardian.com/profile/georgemonbiot/rss
    """
    print(f"Processing {url}")
    zipped_results = []
    llm_responses = []
    input_list = extract_rss(url)
    for prompt_template in prompt_templates:
        llm_responses.append(process_llm(input_list, prompt_template))
    sentiment_counter = Counter()
    categories_counter = Counter()
    for zipped in zip(input_list, *llm_responses):
        sentiment = {'sentiment': zipped[1]['text']}
        categorized_sentiment = categorize_sentiment(zipped[1]['text'])
        sentiment_counter[categorized_sentiment] += 1
        sentiment_category = {'sentiment_category': categorized_sentiment}
        keywords = {'keywords': zipped[2]['text']}
        raw_categories = zipped[3]['text']
        classification = {'classification': raw_categories}
        sanitized_topics = sanitize_categories(raw_categories)
        categories_counter.update(sanitized_topics)
        sanitized_categories = {'topics': ", ".join(sanitized_topics)}
        full_record = {
            **zipped[0], **sentiment, **keywords,
            **sentiment_category, **classification, **sanitized_categories
        }
        zipped_results.append(full_record)
    result_df = pd.DataFrame(zipped_results)
    title = url.replace(":", "_").replace("/", "_")
    serialize_results(url, result_df, title, sentiment_counter, categories_counter)
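The per-article record is assembled with dictionary unpacking: the original RSS fields are merged with every LLM-derived field into one flat dictionary, with later dictionaries winning if a key clashes. A minimal illustration with made-up values:

```python
# How a per-article record can be assembled via dictionary unpacking:
# the RSS fields and the LLM-derived fields merge into one flat dict.
article = {'content': 'Sample content', 'title': 'Sample title'}
sentiment = {'sentiment': 'negative'}
keywords = {'keywords': 'coal, pollution'}

full_record = {**article, **sentiment, **keywords}
print(sorted(full_record.keys()))
```

A list of such records is exactly what `pd.DataFrame(...)` expects, which is why the script can build the output table in one call.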
def extract_rss(url):
    """
    Extracts the content and title from the URL.
    :param url The url of the RSS feed, e.g: http://www.theguardian.com/profile/georgemonbiot/rss
    """
    response = requests.get(url)
    tree = ElementTree.fromstring(response.content)
    contents = []
    for child in tree:
        if child.tag == 'channel':
            for channel_child in child:
                if channel_child.tag == 'item':
                    contents.append({'content': channel_child[2].text,
                                     'title': channel_child[0].text})
    return contents
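Note that this extraction indexes the children of each `<item>` positionally, so it relies on the title being the first child and the content-bearing element the third. A self-contained sketch with a tiny hand-written RSS document (standing in for the Guardian feed) shows the same traversal:

```python
from xml.etree import ElementTree

# A tiny hand-written RSS fragment; the traversal assumes <title> is the
# first child of <item> and the content element is the third, as the
# script does for the real feed.
rss = """<rss><channel>
<item><title>Sample title</title><link>http://example.com</link>
<description>Sample content</description></item>
</channel></rss>"""

tree = ElementTree.fromstring(rss)
contents = []
for child in tree:
    if child.tag == 'channel':
        for channel_child in child:
            if channel_child.tag == 'item':
                contents.append({'content': channel_child[2].text,
                                 'title': channel_child[0].text})
print(contents)
```

Positional indexing is brittle if the feed ever reorders its elements; looking children up by tag name would be the more defensive choice.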
prompt_templates = [
    ("Please tell me the sentiment of {content} with this title: {title}? "
     "Is it very positive, positive, very negative, negative or neutral? "
     "Please answer using these expressions: 'very positive', 'positive', "
     "'very negative', 'negative' or 'neutral'"),
    "Please extract the most relevant keywords from {content} with title: {title}. "
    "Use the prefix 'Keywords:' before the list of keywords.",
    "Please categorize the following content {content} with title {title} "
    "using these categories: " + ",".join(accepted_categories)
]

model = 'gpt-3.5-turbo'

def process_llm(input_list: list, prompt_template):
    """
    Creates an LLMChain object with a specific model.
    :param input_list a list of dictionaries with the content and title of each article
    :param prompt_template A single prompt template with content and title parameters
    """
    llm = ChatOpenAI(temperature=0, model=model)
    # You could also use another model. text-davinci-003 is more expensive than gpt-3.5-turbo.
    # llm = OpenAI(temperature=0, model='text-davinci-003')
    llm_chain = LLMChain(
        llm=llm,
        prompt=PromptTemplate.from_template(prompt_template)
    )
    return llm_chain.apply(input_list)
def sanitize_categories(text):
    text = text.lower()
    sanitized = []
    for cat in accepted_categories:
        if cat in text:
            sanitized.append(cat)
    return sanitized

def sanitize_keywords(text):
    text = text.lower()
    text = text.replace("keywords:", "").strip()
    sanitized = [re.sub(r"\.$", "", s.strip()) for s in text.split(",")]
    return sanitized

def categorize_sentiment(text):
    text = text.lower()
    if 'very negative' in text:
        return 'very negative'
    elif 'negative' in text:
        return 'negative'
    elif 'very positive' in text:
        return 'very positive'
    elif 'positive' in text:
        return 'positive'
    return 'neutral'
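The order of the checks in the sentiment bucketing matters: 'negative' is a substring of 'very negative' (and likewise for the positive pair), so the stronger label must be tested first. A self-contained demonstration of the same logic:

```python
# Self-contained copy of the sentiment bucketing logic, to show why
# 'very negative' must be checked before 'negative': the latter is a
# substring of the former, so reversing the order would misclassify.
def categorize_sentiment(text):
    text = text.lower()
    if 'very negative' in text:
        return 'very negative'
    elif 'negative' in text:
        return 'negative'
    elif 'very positive' in text:
        return 'very positive'
    elif 'positive' in text:
        return 'positive'
    return 'neutral'

print(categorize_sentiment("The sentiment is Very Negative."))
print(categorize_sentiment("Overall a positive piece"))
print(categorize_sentiment("Hard to say"))
```

This substring matching also makes the function tolerant of free-form LLM answers such as "The sentiment is negative."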
def serialize_results(url, result_df, title, sentiment_counter, categories_counter):
    """
    Converts the results into an Excel sheet and an HTML page.
    The HTML page also contains the counter information.
    :param url The RSS feed url
    :param result_df The combined raw data with the LLM output
    :param title The RSS feed URL with some modified characters
    :param sentiment_counter The counter with the sentiment information
    :param categories_counter The counter with the counted categories
    """
    result_df.to_excel(target_folder / f"{title}.xlsx")
    html_file = target_folder / f"{title}.html"
    html_content = result_df.to_html(escape=False)
    # Make sure the file is written with UTF-8
    with open(html_file, "w", encoding="utf-8") as file:
        file.write(html_content)
    sentiment_html = generate_sentiment_table(sentiment_counter, "Sentiment")
    categories_html = generate_sentiment_table(categories_counter, "Category")
    with open(html_file, encoding="utf8") as f:
        content = f"""
{re.sub(r'.+?theguardian.com/profile/', '', url)}
Sentiment Count
{sentiment_html}
Categories Count
{categories_html}
{f.read()}
"""
    content = content.replace('class="dataframe"',
                              'class="table table-striped table-hover dataframe"')
    with open(html_file, "w", encoding="utf8") as f:
        f.write(content)
This is the example script output for the following columnists:
LangChain's LLMChain provides a very convenient way to interact with an LLM when you have list-based input to which you want to apply a pre-defined LLM prompt with parameters.
Gil Fernandes, Onepoint Consulting
Sign up here