
This story is about an AI-driven consultant chatbot I have worked on, based on LangChain and Chainlit. The bot asks potential customers about their problems in the enterprise data space, building a dynamic questionnaire on the go to better understand those problems. After gathering enough information about the user's problem, it gives advice on how to solve it. While asking its questions, it also checks whether the user is confused and has questions of their own. If that is the case, it tries to answer them.

The chatbot is built around a knowledge base about topics related to AI governance, security, data quality, etc. But you could use other topics of your choice.

This knowledge base is stored in a vector database and is used at every step, either to generate questions or to give advice.

This chatbot could however be based on any knowledge base and used in different contexts, so you can take it as a blueprint for other consulting chatbots.

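Interaction flow

The normal flow of a chatbot is simple: the user asks a question, the bot answers, and so on. The bot normally remembers the previous interactions, so there is a history.

The interaction of this bot is however different. It goes like this:

AI-driven consultant chatbot interaction

In this case, the chatbot asks a question, the user answers, and this exchange is repeated a couple of times. If the accumulated knowledge is sufficient to give a response, or the number of questions reaches a certain threshold, a response is given; otherwise another question is asked.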
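
To make this loop concrete, here is a minimal sketch in Python. It is not the code from the project: the function name `run_consultation`, its callback parameters and the default threshold of 8 questions are all assumptions made for illustration. The callbacks stand in for the user interface and the ChatGPT calls.

```python
from typing import Callable, List, Tuple

QA = Tuple[str, str]  # one (question, answer) pair


def run_consultation(
    first_question: str,
    ask_user: Callable[[str], str],
    next_question: Callable[[List[QA]], str],
    knows_enough: Callable[[List[QA]], bool],
    give_advice: Callable[[List[QA]], str],
    max_questions: int = 8,  # hypothetical threshold
) -> str:
    """Ask questions until enough is known or the threshold is reached."""
    history: List[QA] = []
    question = first_question
    while True:
        # The bot asks, the user answers, and the pair is remembered.
        history.append((question, ask_user(question)))
        if knows_enough(history) or len(history) >= max_questions:
            return give_advice(history)
        question = next_question(history)


# Usage with stubbed callbacks instead of a real user and a real model.
canned_answers = iter(["Data quality", "Duplicates in our CRM", "Nobody owns the data"])
print(
    run_consultation(
        first_question="Which area of your data ecosystem are you most concerned about?",
        ask_user=lambda question: next(canned_answers),
        next_question=lambda history: f"Follow-up question {len(history) + 1}",
        knows_enough=lambda history: False,
        give_advice=lambda history: f"Advice based on {len(history)} answers",
        max_questions=3,
    )
)
```

Passing the steps in as callbacks keeps the loop itself free of any model or UI code, which makes the flow easy to test on its own.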

Rough architecture

Here are the participants in this application:

Main participants in the AI driven consultant chatbot

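We have 4 participants:

  • The user

  • The application, which orchestrates the workflow between ChatGPT, the knowledge base and the user.

  • ChatGPT 4 (gpt-4-0613)

  • The knowledge base (stored in a vector database)

We tried ChatGPT 3.5, but the results were not that great and it was hard to generate meaningful questions. ChatGPT 4 (gpt-4-0613) seems to deliver much better questions and advice, and is also more stable.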

We have also experimented with the latest ChatGPT 4 model (gpt-4-1106-preview, GPT 4 Turbo), but we have frequently experienced unexpected results from the OpenAI function calls. So we would often see error logs like this one:

  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 2 validation errors for ResponseTags
extracted_questions
  field required (type=value_error.missing)
questions_related_to_data_analytics
  field required (type=value_error.missing)
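
This error means that the arguments returned by the function call lacked fields that the ResponseTags model requires. One way to get a more readable failure is to check the raw arguments before building the model. The sketch below uses only the standard library. The two field names come from the log above; the helper names `missing_fields` and `parse_tags` and the example values are assumptions, not the project's code.

```python
import json
from typing import Any, Dict, List

# Field names taken from the error log; everything else here is hypothetical.
REQUIRED_FIELDS = ("extracted_questions", "questions_related_to_data_analytics")


def missing_fields(arguments_json: str) -> List[str]:
    """Return the required fields that are absent from a function-call payload."""
    try:
        payload: Any = json.loads(arguments_json)
    except json.JSONDecodeError:
        # Unparseable output counts as "everything is missing".
        return list(REQUIRED_FIELDS)
    if not isinstance(payload, dict):
        return list(REQUIRED_FIELDS)
    return [name for name in REQUIRED_FIELDS if name not in payload]


def parse_tags(arguments_json: str) -> Dict[str, Any]:
    """Parse the payload, failing early with a short, readable message."""
    missing = missing_fields(arguments_json)
    if missing:
        raise ValueError(f"Model response is missing fields: {', '.join(missing)}")
    return json.loads(arguments_json)


# The value types below are guesses made for the sake of the example.
complete = '{"extracted_questions": ["What is a data lake?"], "questions_related_to_data_analytics": true}'
print(parse_tags(complete)["extracted_questions"])
print(missing_fields('{"sentiment": "neutral"}'))
```

A check like this also gives you a natural place to retry the call or fall back to a default when the model omits a field.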

Workflow - how it works

This diagram shows how the tool works internally:

Chatbot workflow

Here are the steps of the workflow:

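  • The tool asks the user a pre-defined question. This is typically: "Which area of your data ecosystem are you most concerned about?"

  • The user answers the initial question.

  • The chatbot checks whether the user reply contains a legitimate question (i.e. a question that is not off-topic)
    - if yes, then a simple query agent is started to clarify the question. This simple agent uses ChatGPT 4 and the DuckDuckGo search engine.

  • Now the chatbot decides whether it should generate more questions or give advice. This decision is influenced by a simple rule: if there are fewer than 4 questions, another question is asked; otherwise we let ChatGPT decide whether to give advice or to continue asking questions.
    - If the decision is to continue asking questions, the vector database with the knowledge base is queried to retrieve the texts most similar to the user's answer. The vector database search results, together with the questions and answers, are sent to ChatGPT 4 to generate more questions.
    - If the decision is to give advice, the knowledge base is queried using all questions and answers. The most similar parts of the knowledge base are extracted and included, together with the whole questionnaire (questions and answers), in the advice generation prompt for ChatGPT. After the advice is given, the flow terminates.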
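
The decision rule and the retrieval step can be sketched as follows. This is an illustration, not the project's implementation. The threshold of 4 comes from the rule above, but the function names are assumptions. The bag-of-words cosine similarity is a toy stand-in for the embedding search that a real vector database performs.

```python
import math
from collections import Counter
from typing import Callable, List

MIN_QUESTIONS = 4  # threshold from the rule described above


def should_ask_another(question_count: int, llm_wants_more: Callable[[], bool]) -> bool:
    """Below the threshold always ask again; otherwise defer to the model."""
    if question_count < MIN_QUESTIONS:
        return True
    return llm_wants_more()


def _vector(text: str) -> Counter:
    # Toy stand-in for an embedding: a bag of lower-cased words.
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents that are closest to the query."""
    query_vector = _vector(query)
    ranked = sorted(documents, key=lambda doc: _cosine(query_vector, _vector(doc)), reverse=True)
    return ranked[:k]


# Usage: two questions asked so far, so the rule forces another question
# and the knowledge base is searched with the user's latest answer.
knowledge_base = [
    "data quality rules catch duplicate customer records",
    "access control policies protect sensitive data",
    "data lineage documents where data comes from",
]
if should_ask_another(question_count=2, llm_wants_more=lambda: False):
    print(most_similar("we have duplicate customer records", knowledge_base, k=1))
```

In the advice branch the same search would be run with the whole questionnaire as the query instead of just the latest answer.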

Implementation

The whole implementation can be found in this repository:

GitHub - onepointconsulting/data-questionnaire-agent: Data Questionnaire Agent Chatbot

The installation instructions for the project can be found in the project's README file:

http://github.com/onepointconsulting/data-questionnaire-agent/blob/main/README.md 

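Application modules

The bot contains a service module, in which you can find all the services that interact with ChatGPT 4 and perform certain operations, like generating the PDF report and sending an email to the user.

Services

This is the folder with the services: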

http://github.com/onepointconsulting/data-questionnaire-agent/tree/main/data_questionnaire_agent/service 

The most important services are:

Data structures

There is a module with the data structures in this application:

http://github.com/onepointconsulting/data-questionnaire-agent/tree/main/data_questionnaire_agent/model 

In this case we have two modules:

User interface

This is the module with the Chainlit-based user interface code:

http://github.com/onepointconsulting/data-questionnaire-agent/tree/main/data_questionnaire_agent/ui 

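The file with the main implementation of the Chainlit user interface is:

The method in this file that contains the workflow implementation is process_questionnaire.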

Notes on the Chainlit UI

The Chainlit version was forked from version 0.7.0 and modified to meet some requirements given to us. The project should however also work with more recent Chainlit versions.

Prompts

We have separated the prompts from the Python code and put them in a TOML file:

http://github.com/onepointconsulting/data-questionnaire-agent/blob/main/prompts.toml 

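The prompts use delimiters to separate the instructions from the knowledge base, the questions and the answers. ChatGPT 4 seems to understand delimiters well, unlike ChatGPT 3.5, which gets confused easily. Here is an example of the prompt used for question generation: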

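[questionnaire]
   [questionnaire.initial]
   question = "Which area of your data ecosystem are you most concerned about?"
   system_message = "You are a data integration and governance expert that can ask questions about data integration and governance to help a customer with data integration and governance problems"
   human_message = """Based on the best practices and knowledge base and on an answer to a question answered by a customer, \
please generate {questions_per_batch} questions that are helpful to this customer to solve data integration and governance issues.
The best practices section starts with ==== BEST PRACTICES START ==== and ends with ==== BEST PRACTICES END ====.
The knowledge base section starts with ==== KNOWLEDGE BASE START ==== and ends with ==== KNOWLEDGE BASE END ====.
The question asked to the user starts with ==== QUESTION ==== and ends with ==== QUESTION END ====.
The user answer provided by the customer starts with ==== ANSWER ==== and ends with ==== ANSWER END ====.

==== KNOWLEDGE BASE START ====
{knowledge_base}
==== KNOWLEDGE BASE END ====

==== QUESTION ====
{question}
==== QUESTION END ====

==== ANSWER ====
{answer}
==== ANSWER END ====
"""

   [questionnaire.secondary]
   system_message = "You are a British data integration and governance expert that can ask questions about data integration and governance to help a customer with data integration and governance problems"
   human_message = """Based on the best practices and knowledge base and answers to multiple questions answered by a customer, \
please generate {questions_per_batch} questions that are helpful to this customer to solve data integration, governance and quality issues.
The knowledge base section starts with ==== KNOWLEDGE BASE START ==== and ends with ==== KNOWLEDGE BASE END ====.
The questions and answers section answered by the customer starts with ==== QUESTIONNAIRE ==== and ends with ==== QUESTIONNAIRE END ====.
The user answers are in the section that starts with ==== ANSWERS ==== and ends with ==== ANSWERS END ====.

==== KNOWLEDGE BASE START ====
{knowledge_base}
==== KNOWLEDGE BASE END ====

==== QUESTIONNAIRE ====
{questions_answers}
==== QUESTIONNAIRE END ====

==== ANSWERS ====
{answers}
==== ANSWERS END ====
"""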

As you can see, we are using delimiter sections, e.g. ==== KNOWLEDGE BASE START ==== or ==== KNOWLEDGE BASE END ====.

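Takeaways

We tried to build meaningful interactions using ChatGPT 3.5, but this model could not understand the prompt delimiters well, whereas ChatGPT 4 (gpt-4-0613) could, which allowed us to have meaningful interactions with users. That is why we chose ChatGPT 4 for this application.

As we mentioned before, we tried to replace gpt-4-0613 with gpt-4-1106-preview, but the results were not good. The function calls failed frequently.

When we started this project we had a limit of 10,000 tokens per minute, which caused some annoying errors. But OpenAI has since increased the limit to 300,000 tokens, which improved the stability of the application:

Increased tokens-per-minute limit

Another important takeaway is that you need to limit the scope of the interaction very carefully, otherwise your bot might be misused for something else, like in this example:

Off-topic question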

But we found a way to prevent this, and the bot can now recognise off-topic questions (see the tagging section of the prompts.toml file).

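The final conclusion is that ChatGPT 4 is up to the challenge of generating a meaningful consultant-style interaction, in which it builds an open-ended questionnaire that ends with a series of meaningful pieces of advice.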
