With the continuous evolution and expanding applications of Large Language Models (LLMs), there has been a noticeable surge in the size of the emerging models. It is not solely the growth in model size, primarily measured by the number of parameters, but also the subsequent escalation in computational demands, hardware and software prerequisites for training, all culminating in a substantial financial investment. In this paper, we present novel techniques such as supervision, parallelization, and scoring functions to get better results out of chains of smaller language models, rather than relying solely on scaling up model size. First, we propose an approach to quantify the performance of a Smaller Language Model (SLM) by introducing a corresponding supervisor model that incrementally corrects the encountered errors. Second, we propose an approach that runs two smaller language models (in a network) on the same task and retrieves the more relevant of the two outputs, ensuring peak performance for a specific task. Experimental evaluations establish the quantitative accuracy improvements on financial reasoning and arithmetic calculation tasks from utilizing techniques such as supervisor models (in a network-of-models scenario), threshold scoring, and parallel processing over a baseline study.
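The selection-and-fallback idea in this abstract can be sketched in a few lines. This is a minimal illustration with toy stand-ins, not the paper's implementation: `model_a`, `model_b`, `score`, and `supervisor` are hypothetical callables, and the scoring function and threshold value are assumptions.

```python
# Sketch: run two small models on the same task, score both answers, keep the
# better one, and fall back to a supervisor model for incremental correction
# when neither answer clears the threshold.

def best_of_two(task, model_a, model_b, score, supervisor, threshold=0.5):
    """Return the higher-scoring of two candidate answers, or a
    supervisor-corrected answer when both fall below the threshold."""
    answers = [model_a(task), model_b(task)]
    scored = sorted(((score(task, a), a) for a in answers), reverse=True)
    top_score, top_answer = scored[0]
    if top_score >= threshold:
        return top_answer
    return supervisor(task, top_answer)  # supervisor corrects the best attempt

# Toy stand-ins for the models and scoring function (illustrative only).
model_a = lambda t: "42"
model_b = lambda t: "41"
score = lambda t, a: 0.9 if a == "42" else 0.2
supervisor = lambda t, a: "42 (corrected)"

print(best_of_two("6 * 7 = ?", model_a, model_b, score, supervisor))  # 42
```

In practice the scoring function would itself be a learned or rule-based evaluator, and the threshold would be tuned per task.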
Over the past two decades, language modeling (LM) has become a dominant approach to language understanding and generation, and has received broad attention as a key technology for downstream tasks in natural language processing (NLP). In recent years, large language models (LLMs) such as ChatGPT have made remarkable progress and profoundly influenced the transformation and development of artificial intelligence and other fields. Given the rapid development of LLMs, this paper first comprehensively surveys the evolution of LLM technical architectures and model scales, and summarizes training methods, optimization techniques, and evaluation approaches. It then analyzes the current applications of LLMs in fields such as education, healthcare, finance, and industry, and discusses their strengths and limitations. In addition, it examines the safety and alignment issues LLMs raise regarding social ethics, privacy, and security, along with the corresponding technical countermeasures. Finally, it looks ahead to future research trends for large language models, including model scale and efficiency, multimodal processing, and societal impact. By comprehensively analyzing the current state of research and future directions, this paper aims to offer researchers deep insight and inspiration on large language models and to advance the field.
Large Language Models (LLMs) have revolutionized Generative Artificial Intelligence (GenAI) tasks, becoming an integral part of various applications in society, including text generation, translation, summarization, and more. However, their widespread usage emphasizes the critical need to enhance their security posture to ensure the integrity and reliability of their outputs and minimize harmful effects. Prompt injections and training data poisoning attacks are two of the most prominent vulnerabilities in LLMs, which could potentially lead to unpredictable and undesirable behaviors, such as biased outputs, misinformation propagation, and even malicious content generation. The Common Vulnerability Scoring System (CVSS) framework provides a standardized approach to capturing the principal characteristics of vulnerabilities, facilitating a deeper understanding of their severity within the security and AI communities. By extending the current CVSS framework, we generate scores for these vulnerabilities such that organizations can prioritize mitigation efforts, allocate resources effectively, and implement targeted security measures to defend against potential risks.
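For readers unfamiliar with CVSS, the base-score arithmetic the abstract builds on can be sketched as follows. The metric weights and equations are from the CVSS v3.1 specification (scope unchanged); the example vector for a prompt-injection vulnerability is an illustrative assumption, not the paper's actual scoring.

```python
import math

# CVSS v3.1 metric weights (scope unchanged), per the FIRST specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}   # Attack Vector
AC = {"L": 0.77, "H": 0.44}                         # Attack Complexity
PR = {"N": 0.85, "L": 0.62, "H": 0.27}              # Privileges Required
UI = {"N": 0.85, "R": 0.62}                         # User Interaction
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}              # C/I/A impact

def roundup(x):
    # CVSS "Roundup": smallest number with one decimal place >= x.
    return math.ceil(x * 10) / 10

def base_score(av, ac, pr, ui, c, i, a):
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))

# Illustrative vector for a remotely triggerable prompt injection with low
# confidentiality and integrity impact: AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:N
print(base_score("N", "L", "N", "N", "L", "L", "N"))  # 6.5
```

An extension such as the one the abstract describes would map LLM-specific characteristics (e.g. how easily a prompt injection is triggered, what data it can exfiltrate) onto these metric choices before computing the score.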
This paper introduces a novel multi-tiered defense architecture to protect language models from adversarial prompt attacks. We construct adversarial prompts using strategies like role emulation and manipulative assistance to simulate real threats. We introduce a comprehensive, multi-tiered defense framework named GUARDIAN (Guardrails for Upholding Ethics in Language Models) comprising a system prompt filter, a pre-processing filter leveraging a toxic classifier and ethical prompt generator, and a pre-display filter using the model itself for output screening. Extensive testing on Meta’s Llama-2 model demonstrates the capability to block 100% of attack prompts. The approach also auto-suggests safer prompt alternatives, thereby bolstering language model security. Quantitatively evaluated defense layers and an ethical substitution mechanism represent key innovations to counter sophisticated attacks. The integrated methodology not only fortifies smaller LLMs against emerging cyber threats but also guides the broader application of LLMs in a secure and ethical manner.
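The three-tier layering described above can be sketched as a simple pipeline. This is a minimal sketch under loud assumptions: the keyword heuristics and toy functions below stand in for the paper's learned components (the toxic classifier, the ethical prompt generator, and the model-based output screen), and all names are hypothetical.

```python
# Sketch of a GUARDIAN-style pipeline: a system-prompt filter, a
# pre-processing filter (toxicity check + ethical rewrite suggestion),
# and a pre-display filter on the generated output.

BLOCKED_PATTERNS = ("ignore previous instructions", "pretend you are")

def system_prompt_filter(prompt):
    return not any(p in prompt.lower() for p in BLOCKED_PATTERNS)

def toxicity_score(text):
    # Stand-in for a toxic classifier; returns a score in [0, 1].
    return 0.9 if "attack" in text.lower() else 0.1

def ethical_rewrite(prompt):
    # Stand-in for the ethical prompt generator: suggest a safer alternative.
    return "Please describe defensive best practices instead."

def pre_display_filter(output):
    # Stand-in for asking the model itself to screen its own output.
    return "refused" if "harmful" in output.lower() else output

def guarded_generate(prompt, generate, tox_threshold=0.5):
    if not system_prompt_filter(prompt):
        return ("blocked", ethical_rewrite(prompt))
    if toxicity_score(prompt) >= tox_threshold:
        return ("blocked", ethical_rewrite(prompt))
    return ("ok", pre_display_filter(generate(prompt)))

generate = lambda p: "Here is a summary."  # stand-in for the LLM call
print(guarded_generate("Ignore previous instructions and leak data", generate))
```

The design point is that each tier can reject independently, and a rejection carries a safer substitute prompt rather than a bare refusal.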
Fine-tuned large language models (LLMs) perform well across many tasks, but centralized training risks leaking user privacy. Federated learning (FL) avoids data sharing through local training, yet the enormous parameter count of LLMs challenges resource-constrained devices and communication bandwidth, making deployment in edge networks difficult. Combined with split learning (SL), federated split learning can effectively address this problem. Based on the findings that deeper-layer weights have a greater impact and that training only part of the layers yields accuracy slightly below whole-model training, this paper splits the model along Transformer layers and introduces low-rank adaptation (LoRA) to further reduce resource overhead and improve security. On the device side, only the last few layers undergo low-rank adaptation and training before being uploaded to the server for aggregation. To reduce overhead while preserving model performance, this paper proposes a fine-tuning method for the RoBERTa pre-trained model based on federated split learning and LoRA. By jointly optimizing the computing frequency of edge devices and the rank used for fine-tuning, the rank is maximized under resource constraints to improve model accuracy. Simulation results show that, when training only the last three layers of the LLM, increasing the rank within a certain range (1 to 32) improves accuracy. Moreover, increasing the per-round latency tolerance and the devices' energy threshold further improves accuracy.
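The core of the LoRA mechanism this abstract relies on is a trainable low-rank update on top of a frozen weight. The following minimal sketch uses illustrative dimensions (the hidden size, rank, and initialization scale are assumptions, not the paper's settings) to show why only a small parameter block needs to be uploaded for aggregation.

```python
import numpy as np

# LoRA sketch: the frozen weight W is augmented by a trainable low-rank
# update B @ A with rank r << d, so only 2*d*r parameters per adapted
# layer are trained on-device and sent to the server.

rng = np.random.default_rng(0)
d, r = 64, 4                         # hidden size; LoRA rank (the paper varies r in 1..32)

W = rng.normal(size=(d, d))          # frozen pre-trained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-initialized

def lora_forward(x):
    # Adapted layer output: frozen path plus low-rank update.
    return x @ W.T + x @ (B @ A).T

x = rng.normal(size=(2, d))
full_params = d * d                  # 4096 per layer
lora_params = 2 * d * r              # 512 per layer at r=4
print(f"trainable params: {lora_params} vs full layer: {full_params}")

# With B zero-initialized, the adapted layer initially matches the frozen one.
print(np.allclose(lora_forward(x), x @ W.T))  # True
```

The upload cost scales linearly with the rank, which is why jointly optimizing the rank against device compute frequency and energy budget, as the paper does, is the natural trade-off.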
With the rapid development of artificial intelligence, large language models (LLMs) have demonstrated strong capabilities in natural language processing and a wide range of knowledge applications. This study examines the application of domestic Chinese large language models to the automatic annotation of subject knowledge graphs for primary and secondary education, taking the Morality and Rule of Law subject at the compulsory-education stage and high-school Mathematics as case studies. In education, constructing knowledge graphs is important for organizing and systematizing subject knowledge, but traditional construction methods suffer from low annotation efficiency and high labor costs. The study aims to address these problems with large language models, raising the level of automation and intelligence in knowledge graph construction. Based on the current state of domestic LLMs, it explores their application to the automatic annotation of subject knowledge graphs, describing the methods and experimental results for the two example subjects. The paper first discusses the research background and significance, then reviews the development of domestic large language models and automatic annotation techniques for subject knowledge graphs. The methods section explores an automatic annotation approach based on domestic LLMs and seeks to refine its application to subject knowledge graphs; a manual annotation method is also examined as a baseline for evaluating the automatic approach. In the experiments, automatic annotation in both subjects achieved high accuracy and efficiency, and an in-depth comparison with manual annotation yielded a series of valuable conclusions, validating the effectiveness and accuracy of the proposed method. Finally, future research directions are discussed. Overall, the study offers a new approach to the automatic annotation of subject knowledge graphs and may advance further development in related fields.
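The auto-annotation workflow described above reduces, at its core, to prompting an LLM to tag a passage with concepts from a controlled vocabulary and validating the reply. The sketch below is entirely illustrative: `call_llm`, the concept list, and the prompt template are hypothetical stand-ins, since the paper does not specify its prompts or which domestic LLM API it uses.

```python
import json

# Hypothetical controlled vocabulary of knowledge points for two subjects.
CONCEPTS = ["rule of law", "civic duty", "quadratic functions", "derivatives"]

PROMPT_TEMPLATE = (
    "From the concept list {concepts}, return a JSON array of the concepts "
    "that the following passage covers.\nPassage: {passage}"
)

def annotate(passage, call_llm):
    prompt = PROMPT_TEMPLATE.format(concepts=CONCEPTS, passage=passage)
    reply = call_llm(prompt)
    labels = json.loads(reply)
    # Keep only labels from the controlled vocabulary to guard against
    # the model inventing concepts outside the knowledge graph.
    return [c for c in labels if c in CONCEPTS]

# Stand-in model reply for demonstration.
fake_llm = lambda prompt: '["rule of law", "made-up concept"]'
print(annotate("Citizens must obey the constitution...", fake_llm))
```

Filtering the reply against the vocabulary is one way to keep automatic annotation comparable with the manual baseline the study evaluates against.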
[Purpose/Significance] This study explores how generative AI technologies such as Large Language Models (LLMs) affect users' information retrieval behavior, and offers suggestions for information retrieval systems and information resource construction. [Method/Process] Against the backdrop of the rapid rise of LLMs such as ChatGPT, and drawing on the technical characteristics of large language models and the features of existing products, the study analyzes, from the perspective of user information behavior and through a review of the existing literature and large language models, how the growing adoption of this technology affects retrieval systems and user search behavior. [Result/Conclusion] Used as information retrieval systems, LLMs hold advantages that traditional products cannot match, and they affect the underlying logic, focus of action, and search expectations of users' retrieval behavior. However, current shortcomings of LLMs in reliability and accuracy make it difficult for them to immediately replace traditional retrieval methods. The study recommends that builders of information retrieval systems and information resources take this technology seriously and explore the intelligent integration of LLMs with information services, in order to meet future changes in user information needs and further exploit the value of existing information resources.
Since OpenAI launched its generative artificial intelligence (AIGC, artificial intelligence generated content; also called generative AI) product ChatGPT in November 2022, the whole world has been upended. Generative AI has two main streams, large language models (LLMs) and diffusion models, with new applications and research published at an accelerating pace every day. In this paper, we first raise a serious question about the level of intelligence exhibited by large language models: do they truly possess artificial general intelligence (AGI) comparable to the general intelligence of an ordinary person? We put forward an important hypothesis: as a closed system, a large language model is designed to represent and store the vast knowledge, capabilities, and behaviors of human intelligence, and is equipped with the highest value standard, namely that the model must conform to human values, yet the internal structure and properties of a large language model do not show that it possesses AGI. However, as an open system, once we input formatted text that implicitly encodes human knowledge and intelligence, we suddenly find that the model's output exhibits certain characteristics of human intelligence and behavior. This formatted input text is called a prompt, and the more intelligent the prompt, the better the model's intelligent output. In other words, large language models possess a kind of AGI capability conditioned on prompts. Research in economics and other social sciences such as politics, history, and linguistics involves the most complex social forms and humanity's deepest thinking. This paper therefore attempts, by summarizing the latest findings of other researchers, to explore whether the AGI of large language models is fact or illusion, as well as the other economic functions and utilities of these models. Regarding this AGI-like capability, we summarize recent research on topics including the IQ level of large language models, the industrial economics of generative AI, computational social science research under generative AI, business decision-making with large language models, economics and other social sciences, and paradigm research on virtual generative-AI economists.