<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>CA77</title>
<link>http://localhost:1313/</link>
<description>Recent content on CA77</description>
<generator>Hugo -- gohugo.io</generator>
<language>zh-cn</language>
<lastBuildDate>Tue, 01 Oct 2024 00:27:14 +0800</lastBuildDate><atom:link href="http://localhost:1313/index.xml" rel="self" type="application/rss+xml" /><item>
<title>机器学习与监督学习通论</title>
<link>http://localhost:1313/p/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E4%B8%8E%E7%9B%91%E7%9D%A3%E5%AD%A6%E4%B9%A0%E9%80%9A%E8%AE%BA/</link>
<pubDate>Tue, 01 Oct 2024 00:27:14 +0800</pubDate>
<guid>http://localhost:1313/p/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E4%B8%8E%E7%9B%91%E7%9D%A3%E5%AD%A6%E4%B9%A0%E9%80%9A%E8%AE%BA/</guid>
<description><h2 id="机器学习">机器学习
</h2><p>我们以 Russell 和 Norvig 的教材上的内容来展开,关于机器学习,其概念可以在原文中找到:</p>
<blockquote>
<p>如果一个智能体通过对世界进行观测来提高它的性能,我们称其为智能体 <em>学习(learning)</em> 。学习可以是简单的,例如记录一个购物清单,也可以是复杂的,例如爱因斯坦推断关于宇宙的新理论。当智能体是一台计算机时,我们称之为 <strong>机器学习(machine learning)</strong> :一台计算机观测到一些数据,基于这些数据构建一个 <em>模型(model)</em> ,并将这个模型作为关于世界的一个 <em>假设(hypothesis)</em> 以及用于求解问题的软件的一部分。</p>
<p>为什么我们希望一台机器进行学习?为什么不通过合适的方式编程然后让它运行呢?这里有两个主要的原因。其一,程序的设计者无法预见未来所有可能发生的情形。举例来说,一个被设计用来导航迷宫的机器人必须掌握每一个它可能遇到的新迷宫的布局;一个用于预测股市价格的程序必须能适应各种股票涨跌的情形。其二,有时候设计者并不知道如何设计一个程序来求解目标问题。大多数人都能辨认自己家人的面孔,但是他们实现这一点利用的是潜意识,所以即使能力再强的程序员也不知道如何编写计算机程序来完成这项任务,除非他使用机器学习算法。</p>
</blockquote>
<p><img src="http://localhost:1313/figures/ml_controltheory.svg"
loading="lazy"
></p>
<p>从很高的角度看来,机器学习是一个负反馈的黑箱 \( M \) ,系统 \( M \) 本体具有可调的参数 \( w \) ,这些参数会受到输出影响。机器学习是离散过程,这里的反馈不像自动控制系统里那样由微分方程决定;尽管如此,我们仍要指出:机器学习的反馈过程,是一个 <strong>负反馈</strong> 过程。</p>
<p>事实上,执行和反馈是分开进行的,前者称为 <strong>预测</strong> ,后者称为 <strong>训练</strong> 。我们需要两组输入数据 \( x_1, x_2 \) 和其中一种输入的已知输出数据 \( y_1 \) 才能以最低限度驱动机器学习系统 \( M \) 。</p>
\[ \left.\begin{align*}
&训练算法:& \\
\hline
&\quad \mathtt{input} & &x_1,y_1,w_0, loss_{\min}& \\
&\quad \mathtt{output}& &w& \\
&\qquad S_1 & &w\leftarrow w_0& \\
&\qquad S_2 & &\tilde{y}_1\leftarrow M(x_1,w)& \\
&\qquad S_3 & &loss\leftarrow \mathrm{Dist}(y_1,\tilde{y}_1)& \\
&\qquad S_4 & &\mathtt{if}(loss>loss_{\min})& \\
&\qquad S_5 & &\qquad w\leftarrow \mathrm{Update}(loss, w)& \\
&\qquad S_6 & &\qquad \mathtt{goto}\ S_2& \\
&\qquad S_7 & &\mathtt{else}& \\
&\qquad S_8 & &\qquad \mathtt{return}\ w& \\
\end{align*}\quad\middle|\quad \begin{align*}
&预测算法:& \\
\hline
&\quad \mathtt{input} & &x_2, w& \\
&\quad \mathtt{output}& &y_2& \\
&\qquad S_1 & & y_2\leftarrow\ M(x_2,w) & \\
&\qquad S_2 & & \mathtt{return}\ y_2& \\
\end{align*}\right.\]<p>可以看到,\( 反馈=训练, 执行=预测 \) 。</p>
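<p>训练/预测两个算法可以用如下 Python 骨架示意。其中一维线性模型 \( M(x,w)=wx \) 、平方损失 \( \mathrm{Dist} \) 和梯度下降式的 \( \mathrm{Update} \) 都是为演示而假设的最简实现,并非正文规定的具体形式:</p>

```python
# 训练/预测的最小 Python 草稿;M、Dist、Update 均为演示用的假设实现

def M(x, w):
    """模型本体:这里假设为最简单的一维线性模型。"""
    return w * x

def Dist(y, y_tilde):
    """损失:这里假设为平方误差。"""
    return (y - y_tilde) ** 2

def Update(w, x, y, lr=0.1):
    """负反馈:沿平方误差对 w 的梯度反方向调整参数。"""
    grad = 2 * (M(x, w) - y) * x
    return w - lr * grad

def train(x1, y1, w0, loss_min=1e-8):
    """训练算法:反复“预测 → 计算损失 → 更新参数”,直到损失足够小。"""
    w = w0
    while True:
        y_tilde = M(x1, w)           # 预测
        loss = Dist(y1, y_tilde)     # 计算损失
        if loss > loss_min:
            w = Update(w, x1, y1)    # 负反馈更新
        else:
            return w

def predict(x2, w):
    """预测算法:用训练得到的参数 w 直接执行模型。"""
    return M(x2, w)

w = train(2.0, 6.0, 0.0)   # 期望学到 w ≈ 3
print(predict(4.0, w))     # 约等于 12
```

<p>注意训练与预测共享同一个黑箱 \( M \) :训练负责调整 \( w \) ,预测只是带着已调好的 \( w \) 执行一次 \( M \) 。</p>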
</description>
</item>
<item>
<title>kNN 算法</title>
<link>http://localhost:1313/p/knn-%E7%AE%97%E6%B3%95/</link>
<pubDate>Mon, 30 Sep 2024 23:06:06 +0800</pubDate>
<guid>http://localhost:1313/p/knn-%E7%AE%97%E6%B3%95/</guid>
<description></description>
</item>
<item>
<title>机器学习</title>
<link>http://localhost:1313/p/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/</link>
<pubDate>Mon, 30 Sep 2024 20:11:51 +0800</pubDate>
<guid>http://localhost:1313/p/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/</guid>
<description><h1 id="教材">教材
</h1><p>Stuart Russell, Peter Novrig. <em>Artificial Intelligence: A Modern Approach</em>, Fourth Edition. 中译本: 人工智能: 现代方法, 张博雅等译. 北京: 人民邮电出版社, 2023.1.</p>
<h1 id="投稿目录">投稿目录
</h1><p><a class="link" href="http://localhost:1313/p/%e6%9c%ba%e5%99%a8%e5%ad%a6%e4%b9%a0%e4%b8%8e%e7%9b%91%e7%9d%a3%e5%ad%a6%e4%b9%a0%e9%80%9a%e8%ae%ba" >机器学习与监督学习通论</a></p>
<p><a class="link" href="http://localhost:1313/p/knn-%e7%ae%97%e6%b3%95/" >\( k \)-近邻算法 与 \( k\mathrm{d} \) 树 及其应用</a></p>
<h1 id="绪论--introduce">绪论 — Introduction
</h1><p><em>内容来自: Wikipedia</em></p>
<h2 id="简介">简介
</h2><blockquote>
<p><strong>Machine learning (ML)</strong> is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions. Quick progress in the field of deep learning, beginning in 2010s, allowed neural networks to surpass many previous approaches in performance.</p>
<p><em>机器学习 (ML) 是人工智能领域的一门学科,专注于开发和研究能够从数据中学习并泛化到未见数据、从而无需明确指令即可执行任务的统计算法。始于 2010 年代的深度学习领域的快速进步,使神经网络在性能上超越了之前的许多方法。</em></p>
<p>ML finds application in many fields, including natural language processing, computer vision, speech recognition, email filtering, agriculture, and medicine. The application of ML to business problems is known as predictive analytics.</p>
<p><em>ML 在许多领域都有应用,包括自然语言处理、计算机视觉、语音识别、电子邮件过滤、农业和医学等。ML在商业问题中的应用被称为预测分析。</em></p>
<p>Statistics and mathematical optimization (mathematical programming) methods comprise the foundations of machine learning. Data mining is a related field of study, focusing on exploratory data analysis (EDA) via unsupervised learning.</p>
<p><em>统计学和数学优化(数学规划)方法构成了机器学习的基础。数据挖掘是与之相关的研究领域,专注于通过无监督学习进行探索性数据分析 (EDA) 。</em></p>
<p>From a theoretical viewpoint, probably approximately correct (PAC) learning provides a framework for describing machine learning.</p>
<p><em>从理论角度来看,“可能近似正确” (PAC) 学习为机器学习提供了一个描述框架。</em></p>
</blockquote>
<h2 id="理论">理论
</h2><blockquote>
<p>A core objective of a learner is to generalize from its experience. Generalization in this context is the ability of a learning machine to perform accurately on new, unseen examples/tasks after having experienced a learning data set. The training examples come from some generally unknown probability distribution (considered representative of the space of occurrences) and the learner has to build a general model about this space that enables it to produce sufficiently accurate predictions in new cases.</p>
<p><em>学习者的核心目标是从其经验中进行归纳。在此背景下,泛化指的是学习机器在经历了学习数据集之后,能够在未见过的新的示例/任务上准确执行的能力。训练示例来自一些通常未知的概率分布(被认为是发生空间的代表性分布),学习者必须构建一个关于这个空间的通用模型,以便在新情况下产生足够准确的预测。</em></p>
<p>The computational analysis of machine learning algorithms and their performance is a branch of theoretical computer science known as computational learning theory via the Probably Approximately Correct Learning (PAC) model. Because training sets are finite and the future is uncertain, learning theory usually does not yield guarantees of the performance of algorithms. Instead, probabilistic bounds on the performance are quite common. The bias–variance decomposition is one way to quantify generalization error.</p>
<p><em>对机器学习算法及其性能的计算分析是理论计算机科学的一个分支,称为计算学习理论,其框架是“可能近似正确” (PAC) 学习模型。由于训练集是有限的而未来是不确定的,学习理论通常不能对算法的性能给出保证,而是代之以常见的关于性能的概率性界。偏差-方差分解是量化泛化误差的一种方法。</em></p>
<p>For the best performance in the context of generalization, the complexity of the hypothesis should match the complexity of the function underlying the data. If the hypothesis is less complex than the function, then the model has under fitted the data. If the complexity of the model is increased in response, then the training error decreases. But if the hypothesis is too complex, then the model is subject to overfitting and generalization will be poorer.
<em>在泛化性能方面要达到最佳表现,假设的复杂度应该与数据背后的函数复杂度相匹配。如果假设的复杂度低于函数,那么模型就会对数据进行欠拟合。如果相应地增加模型的复杂度,那么训练误差就会降低。但如果假设过于复杂,那么模型就会出现过拟合问题,泛化性能就会变差。</em></p>
<p>In addition to performance bounds, learning theorists study the time complexity and feasibility of learning. In computational learning theory, a computation is considered feasible if it can be done in polynomial time. There are two kinds of time complexity results: Positive results show that a certain class of functions can be learned in polynomial time. Negative results show that certain classes cannot be learned in polynomial time.</p>
<p><em>除了性能边界外,学习理论家还研究学习的时间复杂度和可行性。在计算学习理论中,如果一个计算可以在多项式时间内完成,则认为它是可行的。有两种时间复杂度结果:正面结果表明,某些类型的函数可以在多项式时间内学习。负面结果表明,某些类函数无法在多项式时间内学习。</em></p>
</blockquote>
<h2 id="方法">方法
</h2><blockquote>
<p>Machine learning approaches are traditionally divided into three broad categories, which correspond to learning paradigms, depending on the nature of the &ldquo;signal&rdquo; or &ldquo;feedback&rdquo; available to the learning system:</p>
<p><em>机器学习的方法通常被分为三大类,这与学习系统的 “信号” 或 “反馈” 的性质有关,对应于不同的学习范式:</em></p>
<ul>
<li>
<p><strong>Supervised learning</strong>: The computer is presented with example inputs and their desired outputs, given by a &ldquo;teacher&rdquo;, and the goal is to learn a general rule that maps inputs to outputs.</p>
<ul>
<li><em>监督学习:计算机被展示一些示例输入和它们的期望输出(由“老师”给出),目的是学习一个将输入映射到输出的通用规则。</em></li>
</ul>
</li>
<li>
<p><strong>Unsupervised learning</strong>: No labels are given to the learning algorithm, leaving it on its own to find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (feature learning).</p>
<ul>
<li><em>无监督学习:学习算法不被给予任何标签,完全靠自己从输入数据中寻找结构。无监督学习既可以作为一种目标(在数据中发现隐藏的模式),也可以作为实现其他目标的一种手段(特征学习)。</em></li>
</ul>
</li>
<li>
<p><strong>Reinforcement learning</strong>: A computer program interacts with a dynamic environment in which it must perform a certain goal (such as driving a vehicle or playing a game against an opponent). As it navigates its problem space, the program is provided feedback that&rsquo;s analogous to rewards, which it tries to maximize.</p>
<ul>
<li><em>强化学习:计算机程序与动态环境交互,必须完成某个目标(例如驾驶车辆或与对手进行游戏)。在探索问题空间的过程中,程序会收到类似于奖励的反馈,并试图最大化这些奖励。</em></li>
</ul>
</li>
</ul>
</blockquote>
<h3 id="监督学习--supervised-learning">监督学习 — Supervised Learning
</h3><blockquote>
<p>Supervised learning algorithms build a mathematical model of a set of data that contains both the inputs and the desired outputs. The data, known as training data, consists of a set of training examples. Each training example has one or more inputs and the desired output, also known as a supervisory signal. In the mathematical model, each training example is represented by an array or vector, sometimes called a feature vector, and the training data is represented by a matrix. Through iterative optimization of an objective function, supervised learning algorithms learn a function that can be used to predict the output associated with new inputs. An optimal function allows the algorithm to correctly determine the output for inputs that were not a part of the training data. An algorithm that improves the accuracy of its outputs or predictions over time is said to have learned to perform that task.</p>
<p><em>监督学习算法会建立一个包含输入和期望输出的数据集的数学模型。该数据被称为训练数据,由一组训练示例组成。每个训练示例包含一个或多个输入和期望输出(也称为监督信号)。在数学模型中,每个训练示例由一个数组或向量表示,有时称为特征向量,而训练数据则由一个矩阵表示。通过对目标函数的迭代优化,监督学习算法可以学习一个函数,用于预测与新输入相关的输出。一个最优函数使算法能够正确确定未在训练数据中出现的输入的输出。如果一个算法能够在不断学习的过程中提高其输出或预测的准确性,则可以说该算法学会了执行该任务。</em></p>
<p>Types of supervised-learning algorithms include active learning, classification and regression. Classification algorithms are used when the outputs are restricted to a limited set of values, and regression algorithms are used when the outputs may have any numerical value within a range. As an example, for a classification algorithm that filters emails, the input would be an incoming email, and the output would be the name of the folder in which to file the email. Examples of regression would be predicting the height of a person, or the future temperature.</p>
<p><em>监督学习算法的类型包括主动学习、分类和回归。当输出被限制为有限的一组值时,使用分类算法。当输出可能在一定范围内具有任何数值时,使用回归算法。例如,对于用于过滤电子邮件的分类算法,输入将是一封新到达的电子邮件,输出将是该电子邮件应被归档到哪个文件夹的名称。预测身高或预测未来温度是回归算法的一些示例。</em></p>
<p>Similarity learning is an area of supervised machine learning closely related to regression and classification, but the goal is to learn from examples using a similarity function that measures how similar or related two objects are. It has applications in ranking, recommendation systems, visual identity tracking, face verification, and speaker verification.</p>
<p><em>相似性学习是监督式机器学习的一个领域,与回归和分类密切相关,但其目标是使用一个衡量两个对象之间相似度或相关性的相似性函数来从示例中学习。它在排名、推荐系统、视觉身份跟踪、人脸验证和说话人验证等领域都有应用。</em></p>
</blockquote>
<h3 id="无监督学习--unsupervised-learning">无监督学习 — Unsupervised Learning
</h3><blockquote>
<p>Unsupervised learning algorithms find structures in data that has not been labeled, classified or categorized. Instead of responding to feedback, unsupervised learning algorithms identify commonalities in the data and react based on the presence or absence of such commonalities in each new piece of data. Central applications of unsupervised machine learning include clustering, dimensionality reduction, and density estimation. Unsupervised learning algorithms also streamlined the process of identifying large indel based haplotypes of a gene of interest from pan-genome.</p>
<p><em>无监督学习算法可以在未标记、未分类或未归类的数据中发现结构。与响应反馈不同,无监督学习算法会识别数据中的共同点,并根据每条新数据中是否存在这些共同点来做出反应。无监督机器学习的主要应用包括聚类、降维和密度估计。无监督学习算法还简化了从泛基因组中识别目标基因的基于大片段插入/缺失 (indel) 的单倍型 (haplotype) 的过程。</em></p>
<p>Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations within the same cluster are similar according to one or more predesignated criteria, while observations drawn from different clusters are dissimilar. Different clustering techniques make different assumptions on the structure of the data, often defined by some similarity metric and evaluated, for example, by internal compactness, or the similarity between members of the same cluster, and separation, the difference between clusters. Other methods are based on estimated density and graph connectivity.</p>
<p><em>聚类分析是将一组观测数据划分为若干子集(称为簇)的过程,使同一簇内的观测根据一个或多个预先指定的标准彼此相似,而来自不同簇的观测彼此不相似。不同的聚类技术对数据结构有不同的假设,通常由某种相似性度量定义,并通过诸如内部紧致性(同一簇成员之间的相似度)和分离度(不同簇之间的差异)来评估。其他方法则基于估计的密度和图的连通性。</em></p>
<p>A special type of unsupervised learning called, self-supervised learning involves training a model by generating the supervisory signal from the data itself.</p>
<p><em>一种特殊的无监督学习方法称为“自监督学习”,它通过从数据本身生成监督信号来训练模型。</em></p>
</blockquote>
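<p>作为上文聚类分析的一个直观示意,下面是只用标准库的朴素 k-means 草稿。数据与实现均为演示用的假设,并非引文指定的某个算法:</p>

```python
# 朴素 k-means(k=2)的最小草稿,只用标准库;数据为演示用的假设
import random

def kmeans(points, k=2, iters=20, seed=0):
    random.seed(seed)
    centers = random.sample(points, k)
    for _ in range(iters):
        # 分配:每个点归入最近的中心(对应“内部紧致性”)
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[j].append(p)
        # 更新:每个中心移到本簇的均值;空簇则保持原中心
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return sorted(centers)

data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
centers = kmeans(data)
print(centers)  # 两个中心分别接近 1.0 和 5.0
```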
<h3 id="半监督学习--semi-supervised-learning">半监督学习 — Semi-Supervised Learning
</h3><blockquote>
<p>Semi-supervised learning falls between unsupervised learning (without any labeled training data) and supervised learning (with completely labeled training data). Some of the training examples are missing training labels, yet many machine-learning researchers have found that unlabeled data, when used in conjunction with a small amount of labeled data, can produce a considerable improvement in learning accuracy.</p>
<p><em>半监督学习介于无监督学习(没有任何标注的训练数据)和有监督学习(有完全标注的训练数据)之间。一些训练示例缺少训练标签,但许多机器学习研究人员发现,在少量标注数据的辅助下,未标注数据可以显著提高学习准确性。</em></p>
<p>In weakly supervised learning, the training labels are noisy, limited, or imprecise; however, these labels are often cheaper to obtain, resulting in larger effective training sets.</p>
<p><em>在弱监督学习中,训练标签是噪声的、有限的或不准确的;然而,这些标签通常更容易获取,因此可以生成更大的有效训练集。</em></p>
</blockquote>
<h3 id="强化学习--reinforcement-learning">强化学习 — Reinforcement Learning
</h3><blockquote>
<p>Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Due to its generality, the field is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics and genetic algorithms. In reinforcement learning, the environment is typically represented as a Markov decision process (MDP). Many reinforcement learning algorithms use dynamic programming techniques. Reinforcement learning algorithms do not assume knowledge of an exact mathematical model of the MDP and are used when exact models are infeasible. Reinforcement learning algorithms are used in autonomous vehicles or in learning to play a game against a human opponent.</p>
<p><em>强化学习是机器学习的一个领域,关注软件代理在环境中应该如何采取行动,以最大化某种累积奖励的概念。由于其通用性,该领域在许多其他学科中也受到研究,例如博弈论、控制理论、运筹学、信息论、基于模拟的优化、多代理系统、群体智能、统计学和遗传算法。在强化学习中,环境通常被表示为马尔可夫决策过程 (MDP) 。许多强化学习算法使用动态规划技术。强化学习算法不假定对 MDP 有精确的数学模型的了解,并且在精确模型不可行的情况下使用。强化学习算法用于自主车辆或学习与人类对手玩游戏。</em></p>
</blockquote>
<h3 id="降维--dimensionality-reduction">降维 — Dimensionality Reduction
</h3><blockquote>
<p>Dimensionality reduction is a process of reducing the number of random variables under consideration by obtaining a set of principal variables. In other words, it is a process of reducing the dimension of the feature set, also called the &ldquo;number of features&rdquo;. Most of the dimensionality reduction techniques can be considered as either feature elimination or extraction. One of the popular methods of dimensionality reduction is principal component analysis (PCA). PCA involves changing higher-dimensional data (e.g., 3D) to a smaller space (e.g., 2D). The manifold hypothesis proposes that high-dimensional data sets lie along low-dimensional manifolds, and many dimensionality reduction techniques make this assumption, leading to the area of manifold learning and manifold regularization.</p>
<p><em>降维是一种通过获取主变量集来减少考虑的随机变量数量的过程。换句话说,它是一种减少特征集维度(也称为“特征数量”)的过程。大多数降维技术可以被认为是特征消除或提取。降维的一种流行方法是主成分分析 (PCA) 。PCA 涉及将高维数据(例如 3D )转换为较小的空间(例如 2D )。 “流形假设”提出高维数据集位于低维流形上,许多降维技术都基于这一假设,从而产生了流形学习和流形正则化领域。</em></p>
</blockquote>
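<p>作为上文 PCA 的一个示意,下面的草稿假设使用 NumPy,把近似沿一条直线分布的二维数据投影到第一主成分上。数据为演示用的假设:</p>

```python
# PCA 的最小草稿:把二维数据投影到第一主成分;数据为演示用的假设
import numpy as np

def pca_project(X, n_components=1):
    Xc = X - X.mean(axis=0)            # 中心化
    cov = np.cov(Xc, rowvar=False)     # 协方差矩阵
    vals, vecs = np.linalg.eigh(cov)   # 特征分解(特征值升序)
    idx = np.argsort(vals)[::-1][:n_components]
    return Xc @ vecs[:, idx]           # 投影到方差最大的方向

# 近似沿直线 y ≈ x 分布的二维点:绝大部分方差集中在一个方向上
X = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]])
Z = pca_project(X, 1)
print(Z.shape)  # (4, 1):二维数据被压到一维
```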
<h3 id="自学习--self-learning">自学习 — Self-Learning
</h3><blockquote>
<p>Self-learning, as a machine learning paradigm was introduced in 1982 along with a neural network capable of self-learning, named crossbar adaptive array (CAA). It is learning with no external rewards and no external teacher advice. The CAA self-learning algorithm computes, in a crossbar fashion, both decisions about actions and emotions (feelings) about consequence situations. The system is driven by the interaction between cognition and emotion. The self-learning algorithm updates a memory matrix \( W =\|w(a,s)\| \) such that in each iteration executes the following machine learning routine:</p>
<p><em>自学习是一种机器学习范式,于1982年与一种能够自我学习的神经网络一起被引入,该神经网络被称为交叉开关自适应阵列 (CAA) 。这是一种无需外部奖励和外部教师指导的学习方式。 CAA 的自学习算法采用交叉方式计算关于动作和关于后果情况的情感(情绪)决策。系统由认知和情感之间的相互作用驱动。自学习算法更新一个记忆矩阵 \( W =\|w(a,s)\| \) ,使得在每次迭代中执行以下机器学习流程:</em></p>
<ul>
<li>in situation \( s \) perform action \( a \)
<ul>
<li><em>在情况 \( s \) 下采取行动 \( a \)</em></li>
</ul>
</li>
<li>receive a consequence situation \( s' \)
<ul>
<li><em>面临后果 \( s' \)</em></li>
</ul>
</li>
<li>compute emotion of being in the consequence situation \( v(s') \)
<ul>
<li><em>计算处于后果情境中的情绪 \( v(s') \)</em></li>
</ul>
</li>
<li>update crossbar memory \( w'(a,s) = w(a,s) + v(s') \)
<ul>
<li><em>更新交叉开关的记忆 \( w'(a,s) = w(a,s) + v(s') \)</em></li>
</ul>
</li>
</ul>
<p>It is a system with only one input, situation, and only one output, action (or behavior) a. There is neither a separate reinforcement input nor an advice input from the environment. The backpropagated value (secondary reinforcement) is the emotion toward the consequence situation. The CAA exists in two environments, one is the behavioral environment where it behaves, and the other is the genetic environment, wherefrom it initially and only once receives initial emotions about situations to be encountered in the behavioral environment. After receiving the genome (species) vector from the genetic environment, the CAA learns a goal-seeking behavior, in an environment that contains both desirable and undesirable situations.</p>
<p><em>这是一个只有单一输入(情境)和单一输出(动作或行为 \( a \) )的系统。既没有独立的强化输入,也没有来自环境的建议输入。被反向传播的值(次级强化)是对后果情境的情绪。 CAA 存在于两个环境中:一个是它在其中行动的行为环境;另一个是遗传环境,它最初且仅有一次从中接收关于将在行为环境中遇到的情境的初始情绪。从遗传环境接收到基因(物种)向量后, CAA 在一个同时包含合意与不合意情境的环境中学习一种寻求目标的行为。</em></p>
</blockquote>
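<p>上述 CAA 的交叉开关更新规则 \( w'(a,s) = w(a,s) + v(s') \) 可以用如下 Python 草稿示意。这里只有两个情境和两个动作,行为环境的转移规则与情绪函数 \( v \) 都是为演示而假设的,并非原文描述的具体系统:</p>

```python
# CAA 更新规则 w'(a, s) = w(a, s) + v(s') 的最小草稿;
# 两个情境、两个动作,环境转移和情绪函数 v 均为演示用的假设
import random

V = {0: -1.0, 1: +1.0}   # “遗传环境”给出的初始情绪:情境 1 合意,情境 0 不合意

def step(s, a):
    """假设的行为环境:动作 a 直接决定后果情境 s'。"""
    return a

def caa_learn(episodes=50, seed=0):
    random.seed(seed)
    W = [[0.0, 0.0], [0.0, 0.0]]   # 记忆矩阵 w(a, s):2 个动作 × 2 个情境
    s = 0
    for _ in range(episodes):
        # 在情境 s 中选择动作 a(带少量随机探索)
        if random.random() < 0.1:
            a = random.randrange(2)
        else:
            a = 0 if W[0][s] >= W[1][s] else 1
        s_next = step(s, a)        # 接收后果情境 s'
        W[a][s] += V[s_next]       # 交叉开关更新:w'(a, s) = w(a, s) + v(s')
        s = s_next
    return W

W = caa_learn()
print(W[1][0] > W[0][0])  # True:在情境 0 中,通往合意情境的动作 1 记忆值更高
```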
<h2 id="模型">模型
</h2><blockquote>
<p>A machine learning model is a type of mathematical model that, after being &ldquo;trained&rdquo; on a given dataset, can be used to make predictions or classifications on new data. During training, a learning algorithm iteratively adjusts the model&rsquo;s internal parameters to minimize errors in its predictions. By extension, the term &ldquo;model&rdquo; can refer to several levels of specificity, from a general class of models and their associated learning algorithms to a fully trained model with all its internal parameters tuned.</p>
<p><em>机器学习模型是一类数学模型,在给定数据集上经过“训练”后,可以用于对新数据进行预测或分类。在训练过程中,学习算法会迭代地调整模型的内部参数,以最小化其预测误差。推而广之,“模型”一词可以指代不同具体程度的层次:从一般的一类模型及其相关的学习算法,到内部参数已全部调好的完全训练好的模型。</em></p>
<p>Various types of models have been used and researched for machine learning systems, picking the best model for a task is called model selection.</p>
<p><em>机器学习系统中使用并研究了各种类型的模型,为特定任务挑选最佳模型的过程称为模型选择。</em></p>
</blockquote>
<h3 id="人工神经网络--artificial-neural-network-ann">人工神经网络 — Artificial Neural Network (ANN)
</h3><blockquote>
<p>Artificial neural networks (ANNs), or connectionist systems, are computing systems vaguely inspired by the biological neural networks that constitute animal brains. Such systems &ldquo;learn&rdquo; to perform tasks by considering examples, generally without being programmed with any task-specific rules.</p>
<p><em>人工神经网络 (ANN) 或称连接主义系统,是一种受构成动物大脑的生物神经网络粗略启发的计算系统。这类系统通过考虑示例来“学习”执行任务,通常无需用任何特定任务的规则进行编程。</em></p>
<p>An ANN is a model based on a collection of connected units or nodes called &ldquo;artificial neurons&rdquo;, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit information, a &ldquo;signal&rdquo;, from one artificial neuron to another. An artificial neuron that receives a signal can process it and then signal additional artificial neurons connected to it. In common ANN implementations, the signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is computed by some non-linear function of the sum of its inputs. The connections between artificial neurons are called &ldquo;edges&rdquo;. Artificial neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Artificial neurons may have a threshold such that the signal is only sent if the aggregate signal crosses that threshold. Typically, artificial neurons are aggregated into layers. Different layers may perform different kinds of transformations on their inputs. Signals travel from the first layer (the input layer) to the last layer (the output layer), possibly after traversing the layers multiple times.</p>
<p><em>ANN 是一种基于一组相互连接的单元或节点(称为“人工神经元”)的模型,它大致模拟生物大脑中的神经元。每个连接(类似于生物大脑中的突触)都可以把信息(即“信号”)从一个人工神经元传输到另一个人工神经元。接收到信号的人工神经元可以对其进行处理,然后向与其连接的其他人工神经元发出信号。在常见的 ANN 实现中,连接上的信号是一个实数,每个人工神经元的输出由其输入之和的某个非线性函数计算得到。人工神经元之间的连接称为“边”。人工神经元和边通常都带有随学习进行而调整的权重。权重会增强或减弱连接处信号的强度。人工神经元可能有一个阈值,只有当汇总信号超过该阈值时才会发出信号。通常,人工神经元会聚合成层。不同的层可能对其输入执行不同种类的变换。信号从第一层(输入层)传递到最后一层(输出层),途中可能会多次穿过中间各层。</em></p>
<p>The original goal of the ANN approach was to solve problems in the same way that a human brain would. However, over time, attention moved to performing specific tasks, leading to deviations from biology. Artificial neural networks have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social network filtering, playing board and video games and medical diagnosis.</p>
<p><em>ANN 方法的最初目标是像人脑一样解决问题。然而,随着时间的推移,人们的关注点转向执行特定任务,从而偏离了生物学。人工神经网络已被应用于各种任务,包括计算机视觉、语音识别、机器翻译、社交网络过滤、棋盘游戏和电子游戏以及医学诊断。</em></p>
<p>Deep learning consists of multiple hidden layers in an artificial neural network. This approach tries to model the way the human brain processes light and sound into vision and hearing. Some successful applications of deep learning are computer vision and speech recognition.</p>
<p><em>深度学习是一种在人工神经网络中包含多个隐藏层的方法。这种方法试图模拟人类大脑将光和声音转换为视觉和听觉的方式。深度学习的一些成功应用包括计算机视觉和语音识别。</em></p>
</blockquote>
<h3 id="决策树--decision-tree">决策树 — Decision Tree
</h3><blockquote>
<p>Decision tree learning uses a decision tree as a predictive model to go from observations about an item (represented in the branches) to conclusions about the item&rsquo;s target value (represented in the leaves). It is one of the predictive modeling approaches used in statistics, data mining, and machine learning. Tree models where the target variable can take a discrete set of values are called classification trees; in these tree structures, leaves represent class labels, and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees. In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data, but the resulting classification tree can be an input for decision-making.</p>
<p><em>决策树学习使用决策树作为预测模型,从对某个项目的观察(表示在树枝中)推出关于该项目目标值的结论(表示在树叶中)。它是统计学、数据挖掘和机器学习中使用的预测建模方法之一。目标变量取离散值集合的树模型称为分类树;在这些树结构中,叶子代表类标签,树枝代表导向这些类标签的特征合取。目标变量可以取连续值(通常是实数)的决策树称为回归树。在决策分析中,决策树可以用来直观、明确地表示决策和决策过程。在数据挖掘中,决策树描述数据,而生成的分类树可以作为决策的输入。</em></p>
</blockquote>
<h3 id="支持向量机--support-vector-machine-svm">支持向量机 — Support-Vector Machine (SVM)
</h3><blockquote>
<p>Support-vector machines (SVMs), also known as support-vector networks, are a set of related supervised learning methods used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category. An SVM training algorithm is a non-probabilistic, binary, linear classifier, although methods such as Platt scaling exist to use SVM in a probabilistic classification setting. In addition to performing linear classification, SVMs can efficiently perform a non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces.</p>
<p><em>支持向量机 (SVM) 又称为支持向量网络,是一种用于分类和回归的相关监督学习方法的集合。给定一组带有标记的训练示例,每个示例都被标记为属于两个类别中的一个,SVM 训练算法会构建一个模型,用于预测新示例属于哪个类别。SVM训练算法是一种非概率的二元线性分类器,尽管存在诸如 Platt 缩放之类的方法,可以将 SVM 应用于概率分类环境中。除了进行线性分类外, SVM 还可以高效地使用所谓的“核技巧”进行非线性分类,将输入数据隐式映射到高维特征空间。</em></p>
</blockquote>
<h3 id="回归分析--regression-analysis">回归分析 — Regression Analysis
</h3><blockquote>
<p>Regression analysis encompasses a large variety of statistical methods to estimate the relationship between input variables and their associated features. Its most common form is linear regression, where a single line is drawn to best fit the given data according to a mathematical criterion such as ordinary least squares. The latter is often extended by regularization methods to mitigate overfitting and bias, as in ridge regression. When dealing with non-linear problems, go-to models include polynomial regression (for example, used for trendline fitting in Microsoft Excel), logistic regression (often used in statistical classification) or even kernel regression, which introduces non-linearity by taking advantage of the kernel trick to implicitly map input variables to higher-dimensional space.</p>
<p><em>回归分析是一种包含多种统计方法的工具,用于估计输入变量与其相关特征之间的关系。其最常见的形式是线性回归,即根据诸如普通最小二乘法等数学准则,绘制一条直线来最佳拟合给定的数据。后者通常通过正则化方法进行扩展,以减少过拟合和偏差,例如岭回归。当处理非线性问题时,常用的模型包括多项式回归(例如,用于在 Microsoft Excel 中绘制趋势线)、逻辑回归(通常用于统计分类),甚至核回归,通过利用核技巧将输入变量隐式映射到更高维空间来引入非线性。</em></p>
</blockquote>
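<p>作为上文普通最小二乘的一个示意,下面的草稿用闭式解拟合一条直线。数据为演示用的假设:</p>

```python
# 普通最小二乘拟合直线的闭式解草稿;数据为演示用的假设
def ols_fit(xs, ys):
    """返回使平方误差最小的 (斜率, 截距)。"""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.1, 7.9]   # 大致满足 y ≈ 2x
slope, intercept = ols_fit(xs, ys)
print(round(slope, 2), round(intercept, 2))  # 1.96 0.1
```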
<h3 id="bayes-网络--bayesian-network">Bayes 网络 — Bayesian Network
</h3><blockquote>
<p>A Bayesian network, belief network, or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional independence with a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases. Efficient algorithms exist that perform inference and learning. Bayesian networks that model sequences of variables, like speech signals or protein sequences, are called dynamic Bayesian networks. Generalizations of Bayesian networks that can represent and solve decision problems under uncertainty are called influence diagrams.</p>
<p><em>Bayes 网络(也称为信念网络或有向无环图模型)是一种概率图模型,它使用有向无环图 (DAG) 来表示一组随机变量及其条件独立性。例如, Bayes 网络可以表示疾病和症状之间的概率关系。给定症状,该网络可以计算各种疾病存在的概率。存在可以进行推理和学习的高效算法。对变量序列(如语音信号或蛋白质序列)建模的 Bayes 网络称为动态 Bayes 网络。能够表示并求解不确定性下决策问题的 Bayes 网络的推广称为影响图。</em></p>
</blockquote>
<h3 id="gauss-过程--gaussian-process">Gauss 过程 — Gaussian Process
</h3><blockquote>
<p>A Gaussian process is a stochastic process in which every finite collection of the random variables in the process has a multivariate normal distribution, and it relies on a pre-defined covariance function, or kernel, that models how pairs of points relate to each other depending on their locations.</p>
<p><em>Gauss 过程是一种随机过程,其中该过程中任意有限个随机变量的集合都服从多维正态分布。它依赖于一个预先定义的协方差函数(或核函数),该函数根据点的位置对点对之间的关系建模。</em>
Given a set of observed points, or input–output examples, the distribution of the (unobserved) output of a new point as function of its input data can be directly computed by looking at the observed points and the covariances between those points and the new, unobserved point.</p>
<p><em>给定一组观察点(即输入-输出示例),一个新点的(未观察的)输出作为其输入的函数的分布,可以通过考察已观察点以及这些点与新的未观察点之间的协方差直接计算出来。</em></p>
<p>Gaussian processes are popular surrogate models in Bayesian optimization used to do hyperparameter optimization.</p>
<p><em>Gauss 过程是 Bayes 优化中流行的代理模型 (surrogate model),常用于超参数优化。</em></p>
</blockquote>
<h3 id="遗传算法--genetic-algorithm">遗传算法 — Genetic Algorithm
</h3><blockquote>
<p>A genetic algorithm (GA) is a search algorithm and heuristic technique that mimics the process of natural selection, using methods such as mutation and crossover to generate new genotypes in the hope of finding good solutions to a given problem. In machine learning, genetic algorithms were used in the 1980s and 1990s. Conversely, machine learning techniques have been used to improve the performance of genetic and evolutionary algorithms.</p>
<p><em>遗传算法 (GA) 是一种仿照自然选择过程的搜索算法和启发式技术,它使用突变和交叉等方法来生成新的基因型,以期找到给定问题的良好解决方案。在机器学习领域,遗传算法在 20 世纪 80 年代和 90 年代得到了应用。相反,机器学习技术已被用于改进遗传和进化算法的性能。</em></p>
</blockquote>
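The mutation-and-crossover loop described above can be sketched as a toy bitstring GA maximizing a one-dimensional function; the population size, mutation rate, and binary encoding here are arbitrary illustrative choices, not anything prescribed by the text.

```python
import random

def genetic_maximize(fitness, n_bits=12, pop=30, gens=60, seed=0):
    """Toy GA: truncation selection + one-point crossover + bit-flip mutation."""
    rng = random.Random(seed)
    def decode(bits):  # bitstring -> x in [0, 1]
        return int("".join(map(str, bits)), 2) / (2 ** n_bits - 1)
    P = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(P, key=lambda b: fitness(decode(b)), reverse=True)
        parents = scored[: pop // 2]          # truncation selection
        children = []
        while len(children) < pop:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)    # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:            # mutation
                i = rng.randrange(n_bits)
                child[i] ^= 1
            children.append(child)
        P = children
    best = max(P, key=lambda b: fitness(decode(b)))
    return decode(best)

# Maximize -(x - 0.3)^2 on [0, 1]; the optimum is x = 0.3.
x_best = genetic_maximize(lambda x: -(x - 0.3) ** 2)
```

Selection keeps good genotypes, crossover recombines them, and mutation maintains diversity — the three operators the quoted definition names.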
<h3 id="信念函数--belief-function">信念函数 — Belief Function
</h3><blockquote>
<p>The theory of belief functions, also referred to as evidence theory or Dempster–Shafer theory, is a general framework for reasoning with uncertainty, with understood connections to other frameworks such as probability, possibility and imprecise probability theories. These theoretical frameworks can be thought of as a kind of learner and have some analogous properties of how evidence is combined (e.g., Dempster&rsquo;s rule of combination), just as a pmf-based Bayesian approach would combine probabilities. However, there are many caveats to these belief functions when compared to Bayesian approaches in order to incorporate ignorance and uncertainty quantification. Belief function approaches implemented within the machine learning domain typically leverage a fusion of various ensemble methods to better handle the learner&rsquo;s decision boundary, low sample counts, and ambiguous class issues that standard machine learning approaches tend to have difficulty resolving. However, the computational complexity of these algorithms is dependent on the number of propositions (classes), which can lead to much higher computation times than other machine learning approaches.</p>
<p><em>信念函数理论,又称为证据理论或 Dempster–Shafer 理论,是一种处理不确定性的通用框架,与概率、可能性和不精确概率理论等其他框架存在联系。这些理论框架可以被看作是一种学习器,具有将证据组合(例如, Dempster 组合规则)的类似特性,就像在基于 pmf 的 Bayes 方法中会将概率进行组合一样。然而,与 Bayes 方法相比,信念函数方法在纳入无知和不确定性量化时存在许多局限性。在机器学习领域中实现的这些信念函数方法通常采用各种集成方法的融合策略,以更好地处理学习器的决策边界、低样本量和模糊类问题,而这些问题通常是传统机器学习方法难以解决的。然而,这些算法的计算复杂度取决于命题(类)的数量,与其他机器学习方法相比,可能会导致更高的计算时间。</em></p>
</blockquote>
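The evidence-combination step named above (Dempster's rule of combination) can be sketched for two mass functions over a tiny frame of discernment {a, b}; the masses are made-up numbers for illustration.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule: combine two mass functions whose focal elements
    are frozensets; mass on empty intersections is the conflict, which
    is renormalized away."""
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y                  # mass assigned to the empty set
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

A, B = frozenset({"a"}), frozenset({"b"})
theta = A | B                                  # the whole frame (ignorance)
m1 = {A: 0.6, theta: 0.4}                      # evidence source 1
m2 = {B: 0.5, theta: 0.5}                      # evidence source 2
m = dempster_combine(m1, m2)
```

Note how mass on the whole frame θ encodes ignorance — something a single Bayesian pmf cannot express directly — and how the conflict term grows with the number of propositions, reflecting the complexity caveat in the quote.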
<h3 id="训练模型">训练模型
</h3><blockquote>
<p>Typically, machine learning models require a high quantity of reliable data to perform accurate predictions. When training a machine learning model, machine learning engineers need to target and collect a large and representative sample of data. Data from the training set can be as varied as a corpus of text, a collection of images, sensor data, and data collected from individual users of a service. Overfitting is something to watch out for when training a machine learning model. Trained models derived from biased or non-evaluated data can result in skewed or undesired predictions. Biased models may result in detrimental outcomes, thereby furthering the negative impacts on society or objectives. Algorithmic bias is a potential result of data not being fully prepared for training. Machine learning ethics is becoming a field of study and notably, becoming integrated within machine learning engineering teams.</p>
<p><em>通常来说,机器学习模型需要大量的可靠数据来进行准确的预测。在训练机器学习模型时,机器学习工程师需要针对并收集大量具有代表性的数据样本。训练集的数据可以是多种多样的,比如文本语料库、图像集合、传感器数据,以及来自服务中单个用户的数据。在训练机器学习模型时,需要警惕过度拟合的问题。由带有偏见或未经评估的数据训练得到的模型可能会导致预测偏斜或不可接受。带有偏见的模型可能会导致不利后果,从而进一步加剧对社会或目标的负面影响。算法偏见可能是由于数据在训练过程中没有得到充分准备而产生的潜在结果。机器学习伦理学正在成为一门学科,并且越来越受到机器学习工程团队的重视。</em></p>
</blockquote>
<h3 id="联邦学习">联邦学习
</h3><blockquote>
<p>Federated learning is an adapted form of distributed artificial intelligence for training machine learning models that decentralizes the training process, allowing users&rsquo; privacy to be maintained by not requiring them to send their data to a centralized server. This also increases efficiency by decentralizing the training process across many devices. For example, Gboard uses federated machine learning to train search query prediction models on users&rsquo; mobile phones without having to send individual searches back to Google.</p>
<p><em>联邦学习是分布式人工智能在机器学习模型训练上的一种改进形式,它将训练过程去中心化,用户无需将数据发送到集中式服务器,从而保护了用户隐私。此外,将训练过程分散到多个设备上也提高了效率。例如, Gboard 使用联邦机器学习在用户的移动设备上训练搜索查询预测模型,而无需将每次搜索查询发送回谷歌。</em></p>
</blockquote>
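A minimal simulation of the idea described above: several clients each run local gradient steps on their own data, and only the model parameters are averaged centrally. This is a generic FedAvg-style sketch with invented data, not Gboard's actual system.

```python
import numpy as np

def local_step(w, X, y, lr=0.1, epochs=20):
    """One client's local gradient descent on squared loss for y ~ X w.
    Raw data (X, y) never leaves the client."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fed_avg(w, clients, rounds=30):
    """Federated averaging: combine client updates weighted by sample count."""
    for _ in range(rounds):
        updates = [local_step(w.copy(), X, y) for X, y in clients]
        sizes = np.array([len(y) for _, y in clients], dtype=float)
        w = sum(s * u for s, u in zip(sizes, updates)) / sizes.sum()
    return w

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):                 # three simulated devices
    X = rng.normal(size=(40, 2))
    clients.append((X, X @ true_w))
w = fed_avg(np.zeros(2), clients)  # recovers true_w without pooling data
```

Only the weight vectors cross the "network"; the per-client `(X, y)` stay local, which is the privacy point of the quoted paragraph.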
</description>
</item>
<item>
<title>正则系综确定分布的平均值方法</title>
<link>http://localhost:1313/p/%E6%AD%A3%E5%88%99%E7%B3%BB%E7%BB%BC%E7%A1%AE%E5%AE%9A%E5%88%86%E5%B8%83%E7%9A%84%E5%B9%B3%E5%9D%87%E5%80%BC%E6%96%B9%E6%B3%95/</link>
<pubDate>Mon, 30 Sep 2024 19:56:03 +0800</pubDate>
<guid>http://localhost:1313/p/%E6%AD%A3%E5%88%99%E7%B3%BB%E7%BB%BC%E7%A1%AE%E5%AE%9A%E5%88%86%E5%B8%83%E7%9A%84%E5%B9%B3%E5%9D%87%E5%80%BC%E6%96%B9%E6%B3%95/</guid>
<description><img src="http://localhost:1313/covers/darwin_fowler.jpg" alt="Featured image of post 正则系综确定分布的平均值方法" /><p><a class="link" href="http://localhost:1313/p/%e9%ab%98%e7%ad%89%e7%bb%9f%e8%ae%a1%e7%89%a9%e7%90%86/" >返回目录</a></p>
<h1 id="正则系综确定分布的平均值方法">正则系综确定分布的平均值方法
</h1><h2 id="背景">背景
</h2><p>正则系综描述着共享总能量 \( \mathscr{E} \) 的 \( \mathscr{N} \) 个全同系统构成的系综,这些系统的能级为 \( E_r \) , \( n_r \) 表示能量为 \( E_r \) 的系统个数,因此存在约束关系</p>
\[
\sum_r n_r=\mathscr{N},\quad\sum_r n_rE_r=\mathscr{E}=\mathscr{N}U
\]<p>其中 \( U=\mathscr{E}/\mathscr{N} \) 为系综内系统的平均能量。</p>
<p>系统在各能级上的分配方式数为</p>
\[ W\{n_r\}=\dfrac{\mathscr{N}!}{n_0!n_1!n_2!\cdots} \]<p>我们的目标是计算平均值</p>
\[ \braket{n_r}=\dfrac{\sum_{\{n_r\}}n_rW\{n_r\}}{\sum_{\{n_r\}}W\{n_r\}} \]<h2 id="引理">引理
</h2><p>首先我们来研究 Cauchy 积分。考虑在紧子集(有界闭区域) \( K_0 \) 上的全纯函数 \( f(z) \) , \( a \) 为内部 \( \mathop{\rm Int}K_0 \) 内一点,我们考虑足够小的开圆盘 \( B_r(a)\Subset K_0 \),根据 Cauchy 定理</p>
\[\begin{split}
\oint_{\partial K_0}\frac{f(z)}{z-a}dz &= \oint_{B_r(a)}\frac{f(z)}{z-a}dz\xlongequal{z=a+re^{it}}\int_0^{2\pi}\frac{f(a+re^{it})}{re^{it}}d(a+re^{it}) \\
&=\int_0^{2\pi}\frac{f(a+re^{it})}{re^{it}}rie^{it}dt=2\pi i\braket{f(a+re^{it})}\to 2\pi if(a),\quad(r\to 0)
\end{split}\]<p>最后一步为积分中值定理, \( \braket{} \) 里的是平均值,由此我们得到了 Cauchy 积分公式</p>
\[ \displaystyle{f(a)=\frac{1}{2\pi i}\oint\frac{f(z)}{z-a}dz} \]<p><img src="http://localhost:1313/figures/taylor.svg"
loading="lazy"
></p>
<p>现在我们考虑将 \( f(z) \) 在闭圆盘 \( K=\mathop{\rm Close}B_r(a) \) 的内部展开,考虑在边界圆周 \( \partial K \) 上的 Cauchy 积分</p>
\[ \forall z\in\mathop{\rm Int}K:f(z)=\frac{1}{2\pi i}\oint_{\partial K}\frac{f(\xi)}{\xi-z}d\xi \]<p>考虑到几何级数展开</p>
\[ \frac{1}{\xi-z}=\frac{1}{(\xi-a)-(z-a)}=\frac{1}{\xi-a}\frac{1}{1-\dfrac{z-a}{\xi-a}}=\frac{1}{\xi-a}\sum_{n=0}^\infty\left(\frac{z-a}{\xi-a}\right)^n \]<p>我们知道,几何级数是一致收敛的,因此</p>
\[ \begin{split}
f(z) &= \frac{1}{2\pi i}\oint_C\left[\sum_{n=0}^\infty\frac{(z-a)^n}{(\xi-a)^{n+1}}\right]f(\xi)d\xi \\
&= \sum_{n=0}^\infty\left[\frac{1}{2\pi i}\oint_C\frac{f(\xi)}{(\xi-a)^{n+1}}d\xi\right](z-a)^n=\sum_{n=0}^\infty a_n(z-a)^n
\end{split} \]<p>这就得到了 \( f(z) \) 的 Taylor 展开,其系数为 \( \displaystyle{a_n=\frac{1}{2\pi i}\oint_C\frac{f(\xi)}{(\xi-a)^{n+1}}d\xi} \) 。</p>
<h2 id="过程">过程
</h2><p>我们引入参数 \( \omega_r \) ,考虑等效权重因子 \( \widetilde{W}\{n_r\}=\dfrac{\mathscr{N}!\omega_0^{n_0}\omega_1^{n_1}\omega_2^{n_2}\cdots}{n_0!n_1!n_2!\cdots} \) 。在这里我们引入统计物理中惯用的辅助函数 \( \Gamma(\mathscr{N},U)=\sum_{\{n_r\}}\widetilde{W}\{n_r\} \) ,这就使得</p>
\[ \braket{n_r}=\omega_r\frac{\partial}{\partial\omega_r}(\log\Gamma)\bigg|_{\omega\equiv 1} \]<p>函数</p>
\[ \Gamma(\mathscr{N},U)=\mathscr{N}!\sum_{\{n_r\}}\left(\frac{\omega_0^{n_0}}{n_0!}\frac{\omega_1^{n_1}}{n_1!}\frac{\omega_2^{n_2}}{n_2!}\cdots\right) \]<p>我们考虑 <strong>生成函数</strong></p>
\[ G(\mathscr{N},z)=\sum_{U=0}^\infty\Gamma(\mathscr{N},U)z^{\mathscr{N}U} \]<p>考虑到分配约束条件,我们可以改写上式为</p>
\[\begin{split}
G(\mathscr{N},z)&=\sum_{U=0}^\infty\left[\mathscr{N}!\sum_{\{n_r\}}\left(\frac{\omega_0^{n_0}}{n_0!}\frac{\omega_1^{n_1}}{n_1!}\frac{\omega_2^{n_2}}{n_2!}\cdots\right)\right]\times\left(z^{n_0E_0}z^{n_1E_1}z^{n_2E_2}\cdots\right) \\
&=\sum_{U=0}^\infty\sum_{\{n_r\}}\frac{\mathscr{N}!}{n_0!n_1!n_2!\cdots}(\omega_0z^{E_0})^{n_0}(\omega_1z^{E_1})^{n_1}(\omega_2z^{E_2})^{n_2}\cdots
\end{split}\]<p>由于我们对所有可能的内能 \( U \) 都有求和,因此上式等价于去掉对 \( U \) 的求和并将 \( \sum_{\{n_r\}} \) 的能量约束条件去掉,这样上述级数就彻底变成一个 Newton 展开式了,根据多项式定理我们立马得到 \( G(\mathscr{N},z)=[f(z)]^{\mathscr{N}} \) ,其中函数</p>
\[ f(z)=\omega_0z^{E_0}+\omega_1z^{E_1}+\omega_2z^{E_2}+\cdots=\sum_r\omega_rz^{E_r} \]<p>现在为了求出 \( \Gamma \) 函数,我们注意到 \( \Gamma \) 是生成函数的系数,为此我们将生成函数变成 Taylor 级数。令 \( t=z^{\mathscr{N}} \) ,代入生成函数的定义式中得到</p>
\[ K(t)=\sum_{U=0}^\infty\Gamma(\mathscr{N},U)t^U \]<p>其中 \( K(t)=[f(z)]^{\mathscr{N}} \) ,因此我们立马可以得到</p>
\[\begin{split}
\Gamma(\mathscr{N},U)&=\frac{1}{2\pi i}\oint\frac{K(t)}{t^{U+1}}dt \\
&=\frac{1}{2\pi i}\oint\frac{f(z)^{\mathscr{N}}}{z^{\mathscr{N}U+\mathscr{N}}}d(z^{\mathscr{N}}) \\
&=\frac{1}{2\pi i}\oint\frac{f(z)^{\mathscr{N}}}{z^{\mathscr{N}U+\mathscr{N}}}\mathscr{N}z^{\mathscr{N}-1}dz \\
&=\frac{\mathscr{N}}{2\pi i}\oint\frac{f(z)^{\mathscr{N}}}{z^{\mathscr{N}U+1}}dz
\end{split}\]<p>接下来我们的目的就是算出此积分。</p>
<p>不失一般性,我们假设 \( 0=E_0\leq E_1\leq E_2\leq\cdots \) ,且选取适当的能量单位使得 \( E_r \) 均为整数。</p>
<p>我们考察被积函数</p>
\[ \dfrac{f(z)^{\mathscr{N}}}{z^{\mathscr{N}U+1}}=\frac{(\omega_0z^{E_0}+\omega_1z^{E_1}+\omega_2z^{E_2}+\cdots)^{\mathscr{N}}}{z^{\mathscr{N}U+1}} \]<p>考虑沿着实轴正半轴,注意到所有的 \( \omega=1 \) ,且 \( E_r \) 为从 0 开始的弱上升整数序列,所以分子 \( [f(z)]^{\mathscr{N}} \) 从 0 开始以 \( \mathscr{N}\sup E_r \) 的次幂上升,而分母又是以 \( \mathscr{N}U+1 \) 的次幂压制被积函数的上升,因此我们考虑沿着实轴正半轴被积函数应该会出现一处位于 \( x_0 \) 的鞍点,如图所示。</p>
<p><img src="http://localhost:1313/figures/darwin_fowler.svg"
loading="lazy"
></p>
<p>我们考虑被积函数的变换 \( \dfrac{f(z)^{\mathscr{N}}}{z^{\mathscr{N}U+1}}=\exp[\mathscr{N}g(z)] \) ,即</p>
\[ g(z)=\log f(z)-\left(U+\frac{1}{\mathscr{N}}\right)\log z\simeq\log f(z)-U\log z\qquad(\mathscr{N}\gg 1\sim U)\]\[ g'(x_0)=\frac{f'(x_0)}{f(x_0)}-\frac{\mathscr{N}U+1}{\mathscr{N}x_0}\simeq\frac{f'(x_0)}{f(x_0)}-\frac{U}{x_0}=0\Longrightarrow U\simeq x_0\frac{f'(x_0)}{f(x_0)}=\frac{\sum\limits_r\omega_r E_r x_0^{E_r}}{\sum\limits_r \omega_r x_0^{E_r}} \]\[ g''(x_0)=\left(\frac{f''(x_0)}{f(x_0)}-\frac{f'(x_0)^2}{f(x_0)^2}\right)+\frac{\mathscr{N}U+1}{\mathscr{N}x_0^2}\simeq\frac{f''(x_0)}{f(x_0)}-\frac{U^2-U}{x_0^2} \]<p>我们现在考虑在鞍点附近的 Taylor 展开</p>
\[ g(z)\Big|_{z=x_0+i\varepsilon}=g(x_0)+g'(x_0)(i\varepsilon)+\frac{1}{2}g''(x_0)(i\varepsilon)^2+\cdots\simeq g(x_0)-\frac{1}{2}g''(x_0)\varepsilon^2 \]\[ \begin{split}
\ 被积函数\ \frac{f(z)^{\mathscr{N}}}{z^{\mathscr{N}U+1}}&=\exp[\mathscr{N}g(z)]\simeq\exp\left[\mathscr{N}\left(g(x_0)-\frac{1}{2}g''(x_0)\varepsilon^2\right)\right]\\
&=\exp[\mathscr{N}g(x_0)]\exp\left(-\frac{\mathscr{N}}{2}g''(x_0)\varepsilon^2\right)\\
&=\frac{f(x_0)^{\mathscr{N}}}{x_0^{\mathscr{N}U+1}}\exp\left(-\frac{\mathscr{N}}{2}g''(x_0)\varepsilon^2\right)
\end{split} \]<p>我们对被积函数积分就能得到 \( \Gamma \) 系数</p>
\[\begin{split}
\Gamma(\mathscr{N},U) &\simeq\frac{\mathscr{N}}{2\pi i}\oint\left[\frac{f(x_0)^{\mathscr{N}}}{x_0^{\mathscr{N}U+1}}\exp\left(-\frac{\mathscr{N}}{2}g''(x_0)\varepsilon^2\right)\right]dz\quad (z=x_0+i\varepsilon)\\
&=\frac{\mathscr{N}}{2\pi i}\frac{f(x_0)^{\mathscr{N}}}{x_0^{\mathscr{N}U+1}}\int_{-\infty}^{+\infty}\left[\exp\left(-\frac{\mathscr{N}}{2}g''(x_0)\varepsilon^2\right)\right]id\varepsilon\quad(\text{Gauss\ 积分})\\
&=\frac{\mathscr{N}}{2\pi i}\frac{f(x_0)^{\mathscr{N}}}{x_0^{\mathscr{N}U+1}}\cdot i\cdot\sqrt{\frac{2\pi}{\mathscr{N}g''(x_0)}}=\frac{f(x_0)^{\mathscr{N}}}{x_0^{\mathscr{N}U+1}}\sqrt{\frac{\mathscr{N}}{2\pi g''(x_0)}}
\end{split}\]\[ \begin{split}
\frac{1}{\mathscr{N}}&\log\Gamma(\mathscr{N},U)\\
&=\frac{1}{\mathscr{N}}\left\{\mathscr{N}\log f(x_0)-(\mathscr{N}U+1)\log x_0+\frac{1}{2}\Big[\log\mathscr{N}-\log\Big(2\pi g''(x_0)\Big)\Big]\right\}\\
&=\log f(x_0)-U\log x_0-\frac{1}{\mathscr{N}}\log x_0+\frac{1}{2\mathscr{N}}\log\mathscr{N}-\frac{1}{2\mathscr{N}}\log\Big(2\pi g''(x_0)\Big)\\
&\to\log f(x_0)-U\log(x_0)=\log\sum_r\omega_r x_0^{E_r}-U\log x_0\qquad(\mathscr{N}\to\infty)
\end{split} \]<p>令 \( x_0=\exp(-\beta) \) 代入上式得到</p>
\[ \frac{1}{\mathscr{N}}\log\Gamma(\mathscr{N},U)=\log\left(\sum_r\omega_r\exp\left(-\beta E_r\right)\right)+\beta U \]\[ \begin{split}
\frac{\braket{n_r}}{\mathscr{N}}&=\left[\omega_r\frac{\partial}{\partial\omega_r}\left(\frac{1}{\mathscr{N}}\log\Gamma\right)\right]_{\omega\equiv 1}\\
&=\left\{\omega_r\frac{\partial}{\partial\omega_r}\left[\log\left(\sum_r\omega_r\exp\left(-\beta E_r\right)\right)+\beta U\right]\right\}_{\omega\equiv 1}\\
&=\left\{\frac{\omega_r\dfrac{\partial}{\partial\omega_r}\displaystyle{\sum_r}\omega_r\exp\left(-\beta E_r\right)}{\displaystyle{\sum_r}\omega_r\exp\left(-\beta E_r\right)}+\omega_r\frac{\partial\beta}{\partial\omega_r}U\right\}_{\omega\equiv 1}\\
&=\left\{\frac{\omega_r\left[\exp\left(-\beta E_r\right)-\dfrac{\partial\beta}{\partial\omega_r}\displaystyle{\sum_r}\omega_r E_r e^{-\beta E_r}\right]}{\displaystyle{\sum_r}\omega_r\exp\left(-\beta E_r\right)}+\omega_r\frac{\partial\beta}{\partial\omega_r}U\right\}_{\omega\equiv 1}\\
&=\left\{\frac{\omega_r\exp(-\beta E_r)}{\displaystyle{\sum_r}\omega_r\exp(-\beta E_r)}+\omega_r\frac{\partial\beta}{\partial\omega_r}\left[U-\frac{\displaystyle{\sum_r}\omega_rE_r\exp(-\beta E_r)}{\displaystyle{\sum_r}\omega_r\exp(-\beta E_r)}\right]\right\}_{\omega\equiv 1}
\end{split} \]<p>注意到 \( g'(x_0)=0 \) 正等价于 \( U=\frac{\displaystyle{\sum_r}\omega_rE_r\exp(-\beta E_r)}{\displaystyle{\sum_r}\omega_r\exp(-\beta E_r)} \) ,因此上式可以进一步化简为</p>
\[ \frac{\braket{n_r}}{\mathscr{N}}=\left.\frac{\omega_r\exp(-\beta E_r)}{\displaystyle{\sum_r}\omega_r\exp(-\beta E_r)}\right|_{\omega\equiv 1}=\frac{\exp(-\beta E_r)}{\displaystyle{\sum_r}\exp(-\beta E_r)} \]<p>这就完成了计算。</p>
</description>
</item>
<item>
<title>正则系综</title>
<link>http://localhost:1313/p/%E6%AD%A3%E5%88%99%E7%B3%BB%E7%BB%BC/</link>
<pubDate>Sun, 29 Sep 2024 19:56:03 +0800</pubDate>
<guid>http://localhost:1313/p/%E6%AD%A3%E5%88%99%E7%B3%BB%E7%BB%BC/</guid>
<description><img src="http://localhost:1313/covers/adv_sm_chap3.jpg" alt="Featured image of post 正则系综" /><p><a class="link" href="http://localhost:1313/p/%e9%ab%98%e7%ad%89%e7%bb%9f%e8%ae%a1%e7%89%a9%e7%90%86/" >返回目录</a></p>
<h1 id="正则系综--the-canonical-ensemble">正则系综 — The Canonical Ensemble
</h1><p>我们考虑一个系统与一个大热库之间的平衡,这是研究正则系综的基础。事实上我们考虑的这种情况与 <a class="link" href="http://localhost:1313/p/%e7%83%ad%e5%8a%9b%e5%ad%a6%e7%b3%bb%e7%bb%9f/" >第一章</a> 是类似的,其中一个是我们研究的系统,另一个是热库,只不过热库的能量足够大(或许是因为尺度足够大),我们研究的系统像是“浸入”在热库之中。</p>
<p><img src="http://localhost:1313/figures/adv_sm_fig3-1.png"
loading="lazy"
></p>
<p>类似的能量守恒 \( E_r+E_r'=E^{(0)}=\text{const} \) 依然成立,只不过此时 \( E_r\ll E^{(0)} \) 。设 \( T=\dfrac{1}{k_B\beta} \) 为热库的温度,平衡时也为系统的温度。</p>
<p>类似的,由于能量守恒,状态数可以表示为 \( E_r \) 或 \( E_r' \) 的函数而不是同时独立地依赖于两者,我们记作 \( \Omega'(E'_r) \) ,\( \Omega' \) 的 \( \prime \) 表示状态数有可能依赖于热库的物理性质,但我们没有详细表出。</p>
<p>我们考虑 Taylor 展开</p>
\[\begin{split}
\log\Omega'(E'_r) = \log\Omega'(E^{(0)}-E_r) &= \log\Omega'(E^{(0)})+\left(\frac{\partial\log\Omega'}{\partial E_r'}\right)_{E_r'=E^{(0)}}(-E_r)+\cdots \\
&\simeq \text{const}-\beta'E_r\quad\Longrightarrow\quad\Omega'(E'_r)=\text{const}\cdot e^{-\beta E_r}
\end{split}\]<p>这里我们用到了热平衡时系统的温度和热库的温度相等 \( \beta=\beta' \)。</p>
<p>根据等概率原理,取得上述角标 \( r \) 标注的状态的概率应该正比于上述状态数 \( P_r\propto\Omega'(E'_r)\propto e^{-\beta E_r} \) ,归一化后即可得到概率</p>
\[ P_r=\frac{\exp(-\beta E_r)}{\sum\limits_r\exp(-\beta E_r)} \]<p>我们注意到 \( P_r \) 仅依赖于热库的温度,与热库的具体理化性质无关。</p>
<h2 id="正则系综内的一个系统">正则系综内的一个系统
</h2><p>考虑分享总能量 \( \mathscr{E} \) 的 \( \mathscr{N} \) 个全同系统,这些系统的能级为 \( E_r \) ,\( n_r \) 表示能量为 \( E_r \) 的系统数,则存在条件</p>
\[ \sum_r n_r=\mathscr{N},\qquad\sum_r n_rE_r=\mathscr{E}=\mathscr{N}U \]<p>其中 \( U=\dfrac{\mathscr{E}}{\mathscr{N}} \) 为系统的平均能量,我们将其称为 <strong>内能</strong> ,在后面我们看到这个能量正是热力学能。</p>
<p>我们知道,系统在系综中的分配方式为</p>
\[ W\{n_r\}=\frac{\mathscr{N}!}{n_0!n_1!n_2!\cdots} \]<p>热平衡时,类似的,\( W\{n_r\} \) 应该取极值,此时的分布我们设为 \( \{n^*_r\} \) 。考虑平均值</p>
\[ \braket{n_r} = \frac{\sum_{\{n_r\}}n_rW\{n_r\}}{\sum_{\{n_r\}}W\{n_r\}} \]<p>我们将要说明, \( n_r^*=\braket{n_r} \) 。</p>
<h3 id="分配极值">分配极值
</h3><p>我们考虑目标函数的对数,使用 Stirling 公式</p>
\[ \log W=\log(\mathscr{N}!)-\sum_r\log(n_r!)\simeq\mathscr{N}\log\mathscr{N}-\sum_r n_r\log n_r \]<p>其变分为</p>
\[ \delta(\log W)=-\sum_r(\log n_r+1)\delta n_r \]<p>考虑到分配条件,不妨引入 Lagrange 乘子</p>
\[ \delta(\log W)-\alpha\delta\left(\sum_r n_r\right)-\beta\delta\left(\sum_r E_rn_r\right)\bigg|_{n_r=n^*_r}=\sum_r\left[-(\log n^*_r+1)-\alpha-\beta E_r\right]\delta n_r=0 \]<p>因此</p>
\[ n^*_r=C\exp(-\beta E_r),\qquad C=e^{-(\alpha+1)} \]<p>很明显,根据系统数量条件</p>
\[ \frac{n_r^*}{\mathscr{N}}=\frac{\exp(-\beta E_r)}{\sum\limits_r\exp(-\beta E_r)}=P_r \]<p>为了确定乘子 \( \alpha, \beta \) , 考虑系统能量条件</p>
\[ U=\frac{\sum_r E_r\exp(-\beta E_r)}{\sum_r\exp(-\beta E_r)}=-\frac{\partial}{\partial\beta}\log\left[\sum_r\exp(-\beta E_r)\right] \]<p>我们记 \( Q(N,V,T)=\sum_r\exp(-\beta E_r) \) 称为 <strong>配分函数</strong> ,上式即 \( U=-\dfrac{\partial}{\partial\beta}\log Q \) 。</p>
<p>我们考虑 Helmholtz 自由能 \( F=U-TS \) ,存在微分关系</p>
\[ \begin{split}
dF&=dU-TdS-SdT\\
&=(TdS-PdV+\mu dN)-TdS-SdT\\
&=-SdT-PdV+\mu dN
\end{split} \]<p>因此</p>
\[\begin{split}
U=F+TS&=F-T\left(\frac{\partial F}{\partial T}\right)_{N,V}=F+\beta\left(\frac{\partial F}{\partial\beta}\right)_{N,V}\\
&=F\left(\frac{\partial\beta}{\partial\beta}\right)_{N,V}+\left(\frac{\partial F}{\partial\beta}\right)_{N,V}\beta=\left(\frac{\partial}{\partial\beta}(F\cdot\beta)\right)_{N,V}
\end{split}\]<p>这说明上述的 \( \beta \) 正是温度的 \( \beta \) 参数,且 \( \log Q=-F\cdot\beta \) 。</p>
<p>有了配分函数,那么 \( \mathscr{N}=\sum_r n^*_r=\sum_r C\exp(-\beta E_r)=CQ=e^{-(\alpha+1)}Q \) ,由此立马可以得到 \( \alpha=-\log\dfrac{\mathscr{N}}{Q}-1 \) 。</p>
<p>事实上 \( \alpha \) 乘子在这里是不重要的,因为我们总能归一化得到系数,而 \( \alpha \) 乘子仅仅确定系数而已。</p>
<h3 id="均值">均值
</h3><p>我们的目标是证明</p>
\[ \frac{\braket{n_r}}{\mathscr{N}}=\frac{\exp(-\beta E_r)}{\sum_r\exp(-\beta E_r)} \]<p>并且式中的 \( \beta \) 乘子正恰好为温度的 \( \beta \) 参数。</p>
<p>可以采用 <a class="link" href="http://localhost:1313/p/%e6%ad%a3%e5%88%99%e7%b3%bb%e7%bb%bc%e7%a1%ae%e5%ae%9a%e5%88%86%e5%b8%83%e7%9a%84%e5%b9%b3%e5%9d%87%e5%80%bc%e6%96%b9%e6%b3%95/" >Darwin-Fowler 的方法</a> 来确定上式。</p>
<p><a class="link" href="http://localhost:1313/p/%e9%ab%98%e7%ad%89%e7%bb%9f%e8%ae%a1%e7%89%a9%e7%90%86/" >返回目录</a></p>
<p><a class="link" href="http://localhost:1313/p/%e7%b3%bb%e7%bb%bc%e7%90%86%e8%ae%ba/" >上一篇</a></p>
<p><a class="link" href="http://localhost:1313/p/%e9%ab%98%e7%ad%89%e7%bb%9f%e8%ae%a1%e7%89%a9%e7%90%86/" >下一篇</a></p>
</description>
</item>
<item>
<title>高等统计物理</title>
<link>http://localhost:1313/p/%E9%AB%98%E7%AD%89%E7%BB%9F%E8%AE%A1%E7%89%A9%E7%90%86/</link>
<pubDate>Sat, 28 Sep 2024 19:56:07 +0800</pubDate>
<guid>http://localhost:1313/p/%E9%AB%98%E7%AD%89%E7%BB%9F%E8%AE%A1%E7%89%A9%E7%90%86/</guid>
<description><img src="http://localhost:1313/covers/adv_sm.jpg" alt="Featured image of post 高等统计物理" /><h1 id="高等统计物理---advanced-statistical-mechanics-advsm">高等统计物理 - Advanced Statistical Mechanics <code>adv.sm</code>
</h1><p>教材:</p>
<ul>
<li>R K Pathria, Paul D Beale. <em>Statistical Mechanics</em>, 3rd ed. 中译本: 统计力学, 第三版. 方锦, 戴越译. 北京: 高等教育出版社, 2017.9.</li>
</ul>
<h1 id="contents">Contents
</h1><h2 id="chapter-1---single-system--thermodynamicsp热力学系统"><a class="link" href="http://localhost:1313/p/%e7%83%ad%e5%8a%9b%e5%ad%a6%e7%b3%bb%e7%bb%9f/" >Chapter 1 - Single System &amp; Thermodynamics</a>
</h2><p>我们从单个系统出发, 研究一个具体的多粒子系统的行为,将其与热力学唯象理论对比。</p>
<h2 id="chapter-2---ensemble-theory-and-the-microcanonical-ensemblep系综理论"><a class="link" href="http://localhost:1313/p/%e7%b3%bb%e7%bb%bc%e7%90%86%e8%ae%ba" >Chapter 2 - Ensemble Theory and The Microcanonical Ensemble</a>
</h2><p>现在我们转入系综 —— 系统的系统,并研究了一种具体的系综 —— 微正则系综。</p>
</description>
</item>
<item>
<title>个人简历</title>
<link>http://localhost:1313/p/%E4%B8%AA%E4%BA%BA%E7%AE%80%E5%8E%86/</link>
<pubDate>Sat, 28 Sep 2024 19:56:07 +0800</pubDate>
<guid>http://localhost:1313/p/%E4%B8%AA%E4%BA%BA%E7%AE%80%E5%8E%86/</guid>
<description><img src="http://localhost:1313/covers/profile.jpg" alt="Featured image of post 个人简历" /></description>
</item>
<item>
<title>高等量子力学</title>
<link>http://localhost:1313/p/%E9%AB%98%E7%AD%89%E9%87%8F%E5%AD%90%E5%8A%9B%E5%AD%A6/</link>
<pubDate>Sat, 28 Sep 2024 19:56:03 +0800</pubDate>
<guid>http://localhost:1313/p/%E9%AB%98%E7%AD%89%E9%87%8F%E5%AD%90%E5%8A%9B%E5%AD%A6/</guid>
<description><img src="http://localhost:1313/covers/adv_qm.jpg" alt="Featured image of post 高等量子力学" /><h1 id="高等量子力学---advanced-quantum-mechanics-advqm">高等量子力学 - Advanced Quantum Mechanics <code>adv.qm</code>
</h1><p>教材:</p>
<ul>
<li>J J Sakurai, Jim Napolitano. <em>Modern Quantum Mechanics</em>, 2nd ed. Cambridge University Press, 2017.</li>
<li>曾谨言. <em>量子力学</em>, 卷II. 5版, ed. 现代物理学丛书. 北京: 科学出版社, 2014.1.</li>
<li>M E Peskin, D V Schroeder. <em>An Introduction to Quantum Field Theory</em>. CRC Press, 2018.</li>
</ul>
</description>
</item>
<item>
<title>热力学系统</title>
<link>http://localhost:1313/p/%E7%83%AD%E5%8A%9B%E5%AD%A6%E7%B3%BB%E7%BB%9F/</link>
<pubDate>Sat, 28 Sep 2024 19:56:03 +0800</pubDate>
<guid>http://localhost:1313/p/%E7%83%AD%E5%8A%9B%E5%AD%A6%E7%B3%BB%E7%BB%9F/</guid>
<description><img src="http://localhost:1313/covers/adv_sm_chap1.jpg" alt="Featured image of post 热力学系统" /><p><a class="link" href="http://localhost:1313/p/%e9%ab%98%e7%ad%89%e7%bb%9f%e8%ae%a1%e7%89%a9%e7%90%86/" >返回目录</a></p>
<h1 id="热力学系统--thermodynamics-system">热力学系统 — Thermodynamics System
</h1><p>统计力学旨在以研究分子系统的方式来研究物质的性质,为此我们考虑单个多分子系统的动力学,并与热力学的唯象理论对比。</p>
<ul>
<li>微观态 — 系统的 Schrödinger 方程的解,形如 \( \psi=\psi(\bm r_1,\bm r_2,\cdots,\bm r_N) \) 。</li>
<li>宏观态 — 系统的一组可测量的统计学量,例如: 系统的能量 \( E \) ,分子数 \( N \) ,容器的体积 \( V \) 等等</li>
<li>状态数 — \( \Omega(N,V,E) \) : 在以 \( (N,V,E) \) 确定的宏观态中允许的微观态数量</li>
</ul>
<h2 id="统计力学基本假设--等概率原理">统计力学基本假设 — 等概率原理
</h2><p><strong>在以 \( (N,V,E) \) 确定的宏观态中,允许的微观态在热力学极限下以相等的概率出现。</strong></p>
<ul>
<li>热力学极限 — 在保持分子数密度 \( n=\dfrac{N}{V} \) 不变的情况下让 \( N\to\infty,V\to\infty \)</li>
<li>物理无穷小 — 在体积 \( \Delta V \) 内的分子数 \( \Delta N \) 为无穷大,尽管 \( \Delta V \) 为无穷小</li>
</ul>
<h2 id="温度">温度
</h2><p>考虑两个热接触的系统 \( A_{1,2} \) ,两个系统的整体在空间中孤立,因此:</p>
<ul>
<li>物质 — 被困在每个系统中无法交换</li>
<li>能量 — 可以在两个系统之间交换,但无法向环境交换</li>
</ul>
<p><img src="http://localhost:1313/figures/adv_sm_fig1-1.svg"
loading="lazy"
>
<em>细线表示导热隔板,粗线表示绝热隔板</em></p>
<p>用指标 \( i=1,2 \) 来区分两个系统,设两个系统的总能量(明显恒定)为 \( E^{(0)}=E_1+E_2=\text{const} \),则总系统的状态数为
</p>
\[
\Omega^{(0)}(E_1,E_2)=\Omega_1(E_1)\Omega_2(E_2)=\Omega_1(E_1)\Omega_2(E^{(0)}-E_1)
\]<p>
平衡时状态数取极大值,因此对上式微分,我们用 \( \overline{E}_1 \) 来表示取极值时的能量分布,另一边的能量就是 \( \overline{E}_2=E^{(0)}-\overline{E}_1 \) 。 注意到 \( (\partial E_2/\partial E_1)_{E_1=\overline{E}_1}=-1 \),故能得到
</p>
\[
\left(\frac{\partial\log\Omega_1(E_1)}{\partial E_1}\right)_{E_1=\overline{E}_1}=\left(\frac{\partial\log\Omega_2(E_2)}{\partial E_2}\right)_{E_2=\overline{E}_2}
\]<p>
这说明量 \( \beta=\left(\dfrac{\partial\log\Omega(E)}{\partial E}\right)_{E=\overline{E}} \) 在两个热接触系统中是相等的。 注意到 \( \dfrac{1}{T}=\left(\dfrac{\partial S}{\partial E}\right)_{N,V} \) ,两式近似为有限差商后相除得到
</p>
\[
\Delta S/\Delta(\log\Omega)\simeq\text{const}
\]<p>
这就是 Boltzmann 关系。值得注意的是,Boltzmann 关系的解是 \( S=S_0+\lambda\log\dfrac{\Omega}{\Omega_0} \),其中系数 \( \lambda \) 就是差商的渐近值,\( S_0 \) 和 \( \Omega_0 \) 为积分常数。Planck 发展了 Boltzmann 关系,指出
</p>
\[
S=k_B\log\Omega
\]<p>
其中系数 \( k_B \) 正是上述的系数 \( \lambda \) , 被称为 Boltzmann 常数。</p>
<p>为了理解 Planck 的假设,我们注意到这相当于令 \( \Omega_0=1, S_0=0 \) 。回到温度的定义
</p>
\[
\begin{split}
\beta=\frac{1}{k_BT} &= \left(\frac{\partial}{\partial E}\frac{S}{k_B}\right)_{N,V} \\
&= \left(\frac{\partial}{\partial E}\left(\frac{S_0}{k_B}+\log\frac{\Omega}{\Omega_0}\right)\right)_{N,V} \\
&=\left(\frac{\partial}{\partial E}\log \Omega\right)_{N,V}=\frac{1}{\Omega}\left(\frac{\partial\Omega}{\partial E}\right)_{N,V}
\end{split}
\]<p>
可以看到温度并不依赖于积分常数 \( S_0 \) 和 \( \Omega_0 \) ,能够看出 <strong>以 \( \beta \) 参数表示的温度的统计意义是状态数随能量增大的相对增长率</strong> 。现在我们考虑低温极限,此时 \( \beta\to+\infty \) ,即状态数 \( \Omega \) 随能量增大以无穷大的相对速率增长,为了避免发散我们需要让此时状态数为 \( \Omega_0=1 \) ,因为 1 的任意指数都是 1 ,这种状态对应的熵为 \( S=S_0+k_B\log\dfrac{\Omega(=1)}{\Omega_0(=1)}=S_0 \),我们规定为 0 。规定 \( \Omega=1 \) 时的熵 \( S_0=0 \) ,这是因为只有一种状态的系统就是纯粹的力学系统,熵是不应该存在的。</p>
<p>在这里我们可以看到,低温极限 \( \beta\to+\infty \) 时,状态数会自然回落到 \( \Omega\to 1 \) ,此时熵 \( S\to 0 \) ,因此我们说:<strong>零温度系统 \(\Longleftrightarrow\) 纯力学系统</strong> ,换句话说纯力学系统正是 \( T=0{\rm K} \) 的热力学系统。</p>
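「热平衡时两系统的 \( \beta \) 相等」这一结论可以在一个玩具模型中数值验证:取 \( \Omega_i(E)=E^{\nu_i} \) (示意性草稿, \( \nu_i \) 与总能量为任意取值),在网格上最大化 \( \Omega_1(E_1)\Omega_2(E^{(0)}-E_1) \) ;解析解为 \( \overline{E}_1=\nu_1E^{(0)}/(\nu_1+\nu_2) \) ,且平衡处 \( \beta_1=\beta_2 \) :

```python
import math

def equilibrium_energy(nu1, nu2, E_total, steps=100000):
    """在网格上最大化 log[Ω1(E1) Ω2(E_total-E1)],其中 Ω_i(E)=E^{ν_i}。"""
    best_E1, best = None, -float("inf")
    for k in range(1, steps):
        E1 = E_total * k / steps
        s = nu1 * math.log(E1) + nu2 * math.log(E_total - E1)
        if s > best:
            best, best_E1 = s, E1
    return best_E1

E1 = equilibrium_energy(nu1=3.0, nu2=2.0, E_total=10.0)  # 解析解为 6.0
beta1 = 3.0 / E1             # β = ∂logΩ/∂E = ν/E(对 Ω = E^ν)
beta2 = 2.0 / (10.0 - E1)    # 平衡时应有 beta1 == beta2
```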
<h2 id="热力学">热力学
</h2><p>我们类似的可以引入
</p>
\[
\beta=\left(\dfrac{\partial\log\Omega(N,V,E)}{\partial E}\right)_{N,V}
\]\[
\eta=\left(\dfrac{\partial\log\Omega(N,V,E)}{\partial V}\right)_{N,E}
\]\[
\zeta=\left(\dfrac{\partial\log\Omega(N,V,E)}{\partial N}\right)_{V,E}
\]<p>
这意味着存在全微分关系
</p>
\[
d\log\Omega(N,V,E)=\beta dE+\eta dV+\zeta dN
\]<p> 或 </p>
\[
dS=k_B d(\log\Omega)=k_B\beta dE+k_B\eta dV+k_B\zeta dN
\]<p> 或 </p>
\[
dE=\frac{1}{k_B\beta}dS-\frac{\eta}{\beta}dV-\frac{\zeta}{\beta}dN
\]<p>
如果我们认为这里的 \( E \) 为热力学能(内能),则对比热力学第二定律
</p>
\[
dE=TdS-PdV+\mu dN
\]<p>
我们立马可以得到
</p>
\[
\beta=\frac{1}{k_B T},\qquad\eta=\frac{P}{k_B T},\qquad\zeta=-\frac{\mu}{k_B T}
\]<p>
这就从熵出发推出了全部热力学关系。</p>
<h2 id="理想气体">理想气体
</h2><p>在初步的模型中,我们认为理想气体 \( \Omega\propto V^N \) ,由此立刻可以得到 \( \dfrac{P}{T}=k_B\left(\dfrac{\partial\log\Omega}{\partial V}\right)_{N,E}=k_B\dfrac{N}{V} \),即理想气体状态方程
</p>
\[ PV=Nk_BT \]<p>我们引入</p>
<ul>
<li>\( \Omega_N(E^*) \) : 半径 \( \sqrt{E^*} \) 的 \( 3N \) 维球面上的整数格点数</li>
<li>\( \Sigma_N(E^*) \) : 半径 \( \sqrt{E^*} \) 的 \( 3N \) 维球面内(含球面)的整数格点数</li>
</ul>
<p>则
</p>
\[
\Sigma_N(E^*)=\sum_{E'\leq E^*}\Omega_N(E')\simeq\int_0^{E^*}\Omega_N(E')\,dE'
\]<p>考虑理想气体,\( \Omega(N,V,E) \) 事实上为满足 \( \sum_{r=1}^{3N}\varepsilon_r=E \) 的解的数量,其中每一个 \( \varepsilon_r \) 都是
</p>
\[
\varepsilon(n_x,n_y,n_z)=\frac{h^2}{8mL^2}(n_x^2+n_y^2+n_z^2)\Longrightarrow(n_x^2+n_y^2+n_z^2)=\frac{8mV^{2/3}\varepsilon}{h^2}=\varepsilon^*
\]<p>
这就将能量满足的方程化简为了 \( \sum_{r=1}^{3N}n_r^2=\dfrac{8mV^{2/3}E}{h^2}=E^* \) 的解数量,即 \( \Omega_N(E^*) \) 。</p>
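以 \( N=1 \) (三维)为例,可以数值比较正整数格点计数 \( \Sigma \) 与球体积近似 \( V_3(\sqrt{E^*})/2^3 \) 。下面是示意性草稿;有限 \( E^* \) 下存在 \( O(E^*) \) 量级的表面修正,故只作粗略比较:

```python
import math

def sigma_lattice(E_star):
    """统计满足 n_x²+n_y²+n_z² ≤ E* 的正整数格点 (n_x, n_y, n_z) 个数。"""
    nmax = math.isqrt(E_star)
    count = 0
    for nx in range(1, nmax + 1):
        for ny in range(1, nmax + 1):
            rem = E_star - nx * nx - ny * ny
            if rem >= 1:
                count += math.isqrt(rem)   # 给定 (n_x,n_y) 时可取的 n_z 个数
    return count

E_star = 1000
exact = sigma_lattice(E_star)
# V_3(R) = (4/3)πR³,取正卦限即除以 2³ = 8
approx = (4.0 / 3.0) * math.pi * E_star ** 1.5 / 8
```

随 \( E^* \) 增大,相对偏差按 \( O(E^{*-1/2}) \) 收缩,这正是正文中用球体积渐近 \( \Sigma_N \) 的依据。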
<p>一方面,我们注意到 \( V, E \) 是以 \( V^{2/3}E \) 的整体出现在 \( \Omega \) 中的,因此 \( S(N,V,E)=S(N,V^{2/3}E) \) 。 \( V^{2/3}E=\text{const} \) 的过程在分子数不变时定义了等熵过程,即可逆绝热过程,容易得到 \( P=-\left(\dfrac{\partial E}{\partial V}\right)_{N,S}=\dfrac{2}{3}\dfrac{E}{V} \),因此 \( PV^{5/3}=\text{const} \) ,即绝热指数为 \( 5/3 \) 。</p>
<p>由于 \( V, E \) 是以 \( V^{2/3}E \) 的整体出现在 \( \Omega \) 中的,因此我们将能量 \( E \) 改写为 \( E^*=\dfrac{8mV^{2/3}E}{h^2} \)
,等效的值 \( E^* \) 依赖于体积 \( V \),此后我们可以将形如 \( \Omega(N,V,E) \) 的量记作 \( \Omega_N(E^*) \) ,这两种记号混用时应该记得这点。</p>
<p>我们知道
</p>
\[
\Sigma_N(E^*)=\sum_{E'\leq E^*}\Omega_N(E')\simeq\int_0^{E^*}\Omega_N(E')\,dE'
\]<p>
为此定义 \( \displaystyle{\Gamma}(N,V,E;\Delta)=\int_{E-\frac{1}{2}\Delta}^{E+\frac{1}{2}\Delta}\frac{\partial\Sigma_N(E')}{\partial E'}\,dE' \) 为能量范围 \( \left(E-\dfrac{1}{2}\Delta,E+\dfrac{1}{2}\Delta\right) \) 内的状态数,因此在 \( \Delta\ll E \) 时
</p>
\[
\Gamma(N,V,E;\Delta)\simeq\Delta\frac{\partial\Sigma(N,V,E)}{\partial E}
\]<p>
我们用体积来渐进
</p>
\[
\Sigma_N(E^*)\simeq\frac{1}{2^{3N}}V_{3N}(\sqrt{E^*})=\frac{1}{2^{3N}}\left\{\frac{\pi^{3N/2}}{(3N/2)!}{E^*}^{3N/2}\right\}
\]<p>
式中大括号内为 \( 3N \) 维空间中半径 \( \sqrt{E^*} \) 的球的体积,\( 2^{3N} \) 为 \( 3N \) 维空间的卦限数,相当于取坐标轴正轴的卦限。在上式利用 Stirling 公式将阶乘化为对数,代入计算,最终可以得到
</p>
\[
\log\Gamma(N,V,E;\Delta)\simeq N\log\left[\frac{V}{h^3}\left(\frac{4\pi mE}{3N}\right)^{3/2}\right]+\frac{3}{2}N
\]<p>
我们可以看到,很奇怪但又的确是这样的是,宽度 \( \Delta \) 对 \( \Omega \) 没有贡献,这说明仅有 \( E \) 附近的很小的区域对状态数才有贡献。 将上式乘以 Boltzmann 常数,就得到了理想气体的熵
</p>
\[
S(N,V,E)\simeq Nk_B\log\left[\frac{V}{h^3}\left(\frac{4\pi mE}{3N}\right)^{3/2}\right]+\frac{3}{2}Nk_B
\]<h2 id="gibbs-计数修正">Gibbs 计数修正
</h2><p>我们来考虑气体的混合问题。</p>
<p><img src="http://localhost:1313/figures/adv_sm_fig1-2.svg"
loading="lazy"
></p>
<p>图中两种气体被导热隔板隔开,令 \( N=N_1+N_2 \) , \( V=V_1+V_2 \) ,热平衡时,其熵为两者之和,分别为</p>
\[ S_i=N_i k_B\log V_i+\frac{3}{2} N_i k_B\left[1+\log\left(\frac{2\pi m_i k_B T}{h^2}\right)\right]\qquad (i=1,2)\]<p>混合后,其熵变为</p>
\[ S_T=\sum_{i=1,2}\left\{N_i k_B\log V+\frac{3}{2} N_i k_B\left[1+\log\left(\frac{2\pi m_i k_B T}{h^2}\right)\right]\right\}\]<p>两者差值被定义为 <strong>混合熵</strong> ,即</p>
\[
\Delta S=S_T-\sum_{i=1,2}S_i=k_B\left(N_1\log\frac{V}{V_1}+N_2\log\frac{V}{V_2}\right)
\]<p>容易看到 \( \Delta S>0 \) ,这与熵增加原理一致。</p>
<p>我们用星号来表示气体分子数密度相同时的混合熵,用角标 \( 1\equiv 2 \) 表示同种分子的混合熵,因此</p>
\[\begin{split}
(\Delta S) &= k_B\left(N_1\log\frac{V}{V_1}+N_2\log\frac{V}{V_2}\right) > 0\\
(\Delta S)^* &= k_B\left(N_1\log\frac{N}{N_1}+N_2\log\frac{N}{N_2}\right) > 0
\end{split}\]<p>注意到 \( \Delta S \) 不依赖于 \( m_i \) ,因此</p>
\[ (\Delta S)_{1\equiv 2}=(\Delta S)>0,\qquad (\Delta S)^*_{1\equiv 2}=(\Delta S)^*>0 \]<p>但值得注意的是,混合熵 \( (\Delta S)^*_{1\equiv 2} \) 描述的混合过程是两边完全全等的气体的混合,因此应该有 \( (\Delta S)^*_{1\equiv 2}=0 \) 。</p>
<p>注意到,利用 Stirling 公式,反过来有</p>
\[ \begin{split}
(\Delta S)^*_{1\equiv 2}=(\Delta S)^*
&= k_B\left[N_1\log\frac{N}{N_1}+N_2\log\frac{N}{N_2}\right] \\
&= k_B\left[(N_1+N_2)\log N-(N_1\log N_1+N_2\log N_2)\right] \\
&= k_B(N\log N) - \left[ k_B(N_1\log N_1) + k_B(N_2\log N_2) \right] \\
&\simeq k_B\log(N!) - \left[ k_B\log(N_1!) + k_B\log(N_2!) \right]
\end{split} \]<p>由此可见修正上述问题的方法就是在上述计算中令 \( S\leftarrow S-k_B\log(N!) \) 使 \( (\Delta S)^*_{1\equiv 2}=0 \) ,这样得出的熵叫做 <strong>Sackur-Tetrode 方程</strong> ,即
</p>
\[\begin{split}
S(N,V,E) &= Nk_B\log\left[\frac{V}{h^3}\left(\frac{4\pi mE}{3N}\right)^{3/2}\right]+\frac{3}{2}Nk_B - k_B(N\log N-N)\\
&= Nk_B\log\left[\frac{V}{Nh^3}\left(\frac{4\pi mE}{3N}\right)^{3/2}\right]+\frac{5}{2}Nk_B\\
&= Nk_B\log\frac{V}{N}+\frac{3}{2}Nk_B\left[\frac{5}{3}+\log\left(\frac{2\pi mk_B T}{h^2}\right)\right]
\end{split}\]<p>注意到 \( S=k_B\log\Omega \) ,这一修正就等价于 \( \Omega\leftarrow\dfrac{1}{N!}\Omega \) ,我们称为 <strong>Gibbs 计数修正</strong> 。</p>
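Gibbs 修正的效果可以数值验证:修正后的 Sackur-Tetrode 熵是广延量,同种气体在等温等密度下混合的 \( (\Delta S)^*_{1\equiv 2} \) 为零;不加修正时则会出现 \( 2Nk_B\log 2 \) 的虚假混合熵。下面是示意性草稿(数值均为任意示例取值,熵以 \( k_B \) 为单位):

```python
import math

def sackur_tetrode(N, V, T, m, gibbs=True):
    """以 k_B 为单位的理想气体熵;gibbs=True 时按 Stirling 近似减去 log N!。"""
    h, kB = 6.62607015e-34, 1.380649e-23
    s = N * math.log(V / h**3 * (2 * math.pi * m * kB * T) ** 1.5) + 1.5 * N
    if gibbs:
        s -= N * math.log(N) - N        # Gibbs 计数修正 (Stirling 形式的 log N!)
    return s

# 两份同种气体(等温、等密度)混合:示意性取值
N, V, T, m = 1e23, 1e-3, 300.0, 6.6e-27
dS = sackur_tetrode(2 * N, 2 * V, T, m) - 2 * sackur_tetrode(N, V, T, m)
dS_no = sackur_tetrode(2 * N, 2 * V, T, m, gibbs=False) \
    - 2 * sackur_tetrode(N, V, T, m, gibbs=False)
# dS 应为 0(广延性),dS_no 应为 2N log 2 的虚假混合熵
```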
<p>事实上,经典理想气体分子不仅仅是全同的,更是不可分辨的。我们考虑全部可能的排列数
</p>
\[
\frac{N!}{n_1!n_2!\cdots}
\]<p>
这些排列,任意对换都很大概率会导致 \( N,V,E \) 宏观态的改变,因此符合 \( N,V,E \) 宏观态的系统在上述排列中往往占极少数,在最极端的情况下仅占上述排列的一种,因此我们需要除掉上述排列中不符合 \( N,V,E \) 宏观态的排列个数。</p>
<p>Gibbs 的方案是只除掉分子 \( N! \) ,这不依赖于具体的排列细节 \( \{n_i\} \) 。这样的作法相当于对分布作了一个虚假的排列因子
</p>
\[ W\{n_i\}=\frac{1}{n_1!n_2!\cdots} \]<p>
事实上我们应该直接令 \( W\{n_i\}=1 \) 才能得到正确的结果。我们可以看到, Gibbs 的方案导出经典统计的结论,这要求
</p>
\[ \langle n_i\rangle\ll 1 \]<p>
以保证对换尽可能不破坏 \( N,V,E \) 宏观态条件,这个要求往往适用于温度充分高、密度充分低的气体。在极低温或极高密度情况下,我们需要让 \( W\{n_i\}=1 \) ,在后面我们看到这个要求导出的结果正是量子统计的结论。</p>
<p><a class="link" href="http://localhost:1313/p/%e9%ab%98%e7%ad%89%e7%bb%9f%e8%ae%a1%e7%89%a9%e7%90%86/" >返回目录</a></p>
<p><a class="link" href="http://localhost:1313/p/%e9%ab%98%e7%ad%89%e7%bb%9f%e8%ae%a1%e7%89%a9%e7%90%86/" >上一篇</a></p>
<p><a class="link" href="http://localhost:1313/p/%e7%b3%bb%e7%bb%bc%e7%90%86%e8%ae%ba/" >下一篇</a></p>
</description>
</item>
<item>
<title>系综理论</title>
<link>http://localhost:1313/p/%E7%B3%BB%E7%BB%BC%E7%90%86%E8%AE%BA/</link>
<pubDate>Sat, 28 Sep 2024 19:56:03 +0800</pubDate>
<guid>http://localhost:1313/p/%E7%B3%BB%E7%BB%BC%E7%90%86%E8%AE%BA/</guid>
<description><img src="http://localhost:1313/covers/adv_sm_chap2.jpg" alt="Featured image of post 系综理论" /><p><a class="link" href="http://localhost:1313/p/%e9%ab%98%e7%ad%89%e7%bb%9f%e8%ae%a1%e7%89%a9%e7%90%86/" >返回目录</a></p>
<h1 id="系综理论--ensemble-theory">系综理论 — Ensemble Theory
</h1><p>我们在本章考虑系综理论,首先我们来明确系统和系综的含义。</p>
<h2 id="相空间">相空间
</h2><p>任意时刻,一组相坐标 \( (q,p)=(q_1,\cdots,q_{3N},p_1,\cdots,p_{3N}) \) 被视作系统的一个状态,因此这些坐标构成的 \( 6N \) 维空间将被视作系统的“坐标空间”,我们称之为 <strong>相空间</strong> 。相空间的前 \( 3N \) 维子空间描述系统的位形,称为 <strong>位形空间</strong> ,后 \( 3N \) 维描述系统的动量。换言之, \( 6N \) 维相空间正被视为 \( N \) 粒子构成的系统空间,其中每一个点都代表一个 \( N \) 粒子系统。</p>
<p>我们在 <a class="link" href="http://localhost:1313/p/%e7%83%ad%e5%8a%9b%e5%ad%a6%e7%b3%bb%e7%bb%9f/" >第一章</a> 中详细研究了单个系统的行为,但我们引入了 <strong>等概率原理</strong> 这一重要的假设,也就是说我们研究的这个“单系统”应该被视作全部可能系统在某种意义上的代表,这就引出了所谓“系综”。</p>
<h2 id="系综-ensemble">系综 (Ensemble)
</h2><p>正如我们所说,我们不再考虑单个系统。我们在每一时刻都考虑大量系统的 “illusory copies” (假想的副本),系统以相等的概率在这些副本之间选择,这种集合 {illusory copies} 就被称为系综。由于每个系统都为相空间中的一个点,因此系综可以被视作相空间上的流体。</p>
<p>我们知道,系统的运动方程为 Hamilton 正则方程</p>
\[ \dot{q}_i=\frac{\partial H(q, p)}{\partial p_i},\qquad\dot{p}_i=-\frac{\partial H(q, p)}{\partial q_i} \]<p>系统的能量 \( E \) 给定,也就是说系综应该被限制在曲面 \( H(q,p)=E \) 上。根据我们在 <a class="link" href="http://localhost:1313/p/%e7%83%ad%e5%8a%9b%e5%ad%a6%e7%b3%bb%e7%bb%9f/" >第一章</a> 中对理想气体的研究,我们宁愿将曲面 \( H(q,p)=E \) 扩宽成壳层 \( E-\dfrac{1}{2}\Delta < H(q,p) < E+\dfrac{1}{2}\Delta \) ,以 \( \Gamma \) 来代替 \( \Omega \) 进行计数。</p>
<p>我们用 \( \rho(q,p,t) \) 来表示在相空间给定点 \( q,p \) 处时间 \( t \) 时的系统密度,考虑力学量 \( f(q,p) \) 的系综平均值</p>
\[ \braket{f}=f\ 的系综平均=\frac{\displaystyle{\int f(q,p)\rho(q,p,t)d^{3N}qd^{3N}p}}{\displaystyle{\int \rho(q,p,t)d^{3N}qd^{3N}p}}\]<p>这个值被称为 \( f \) 的统计观测值。</p>
<h2 id="liouville-方程">The Liouville Equation
</h2><p>As a fluid of "systems" in phase space, the ensemble naturally obeys the basic principle of fluid dynamics, namely the continuity equation</p>
\[ \frac{\partial}{\partial t}\rho(q,p,t)+\mathop{\rm div}[\rho(q,p,t)\bm v]=0 \]<p>where \( \bm v \) is the phase velocity and div is the divergence in the \( 6N \)-dimensional phase space. Expanding the divergence and substituting Hamilton's canonical equations, one easily obtains</p>
\[ \frac{\partial\rho}{\partial t}+\{\rho, H\}=0 \]<p>where the braces denote the Poisson bracket. This is the <strong>Liouville equation</strong>. It is easy to see that the left-hand side is just \( \dfrac{d\rho}{dt}=\dfrac{\partial\rho}{\partial t}+\{\rho, H\} \), so the equation states \( \dfrac{d\rho}{dt}=0 \).</p>
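<p>The divergence simplification invoked above can be spelled out in one step, using only Hamilton's equations:</p>
\[ \mathop{\rm div}[\rho\bm v]=\sum_{i=1}^{3N}\left[\frac{\partial(\rho\dot q_i)}{\partial q_i}+\frac{\partial(\rho\dot p_i)}{\partial p_i}\right]=\{\rho,H\}+\rho\sum_{i=1}^{3N}\left(\frac{\partial^2 H}{\partial q_i\partial p_i}-\frac{\partial^2 H}{\partial p_i\partial q_i}\right)=\{\rho,H\} \]<p>The second sum vanishes identically, which is why only the Poisson bracket survives in the Liouville equation.</p>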
<p>An ensemble with \( \dfrac{\partial\rho}{\partial t}\equiv 0 \) is called an <strong>equilibrium ensemble</strong>; the systems in such an ensemble are in thermal equilibrium with one another.</p>
<h2 id="微正则系综--the-microcanonical-ensemble">The Microcanonical Ensemble
</h2><p>The microcanonical ensemble describes the collection of systems corresponding to a macrostate specified by \( N,V,E \); as before, however, we prefer the interval \( \left(E-\dfrac{1}{2}\Delta, E+\dfrac{1}{2}\Delta\right) \) to a sharp energy \( E \). Writing \( \omega=(q,p) \) for a point of phase space, with measure \( d\omega=d^{3N}qd^{3N}p \), the volume occupied by the microcanonical ensemble in phase space is</p>
\[ \omega = \int'd\omega=\int' d^{3N}qd^{3N}p \]<p>where the primed integral denotes integration over the region \( E-\dfrac{1}{2}\Delta < H(\omega) < E+\dfrac{1}{2}\Delta \).</p>
<p>By the postulate of equal a priori probabilities, the density of the microcanonical ensemble must be uniform:</p>
\[ \rho(\omega)=\begin{dcases} \text{const}, & E-\dfrac{1}{2}\Delta < H(\omega) < E+\dfrac{1}{2}\Delta \\ 0, & \text{otherwise} \end{dcases} \]<p>This expression contains no time, so the microcanonical ensemble is a stationary ensemble, i.e. its systems are in mutual thermal equilibrium; therefore</p>
\[\begin{split}
\braket{f}&=\text{ensemble average of } f\\
&= \text{time average of the ensemble average of } f\\
&= \text{ensemble average of the time average of } f\\
&= \text{long-time average of } f = \text{expectation value of } f
\end{split}\]<p>To count states we need a fundamental measure \( \omega_0 \), which serves as the unit of state counting: the number of states in the volume \( \omega \) is asymptotically \( \Gamma\simeq\dfrac{\omega}{\omega_0} \), and the entropy is \( S=k_B\log\Gamma \). We state the result in advance: for a microcanonical ensemble of systems with \( f \) degrees of freedom, \( \omega_0=h^f \), the \( f \)-th power of Planck's constant.</p>
<h2 id="实例-1--理想气体">Example 1 — Ideal Gas
</h2><p>The volume of the ensemble is</p>
\[\begin{split}
\omega &= \int' d\omega=\int' d^{3N}q \int' d^{3N}p \\
&= V^N\mathop{\displaystyle{\int\cdots\int}}\limits_{\left(E-\frac{1}{2}\Delta\right) < \sum_{r=1}^{3N}\frac{p_r^2}{2m} < \left(E+\frac{1}{2}\Delta\right)}d^{3N}p\xlongequal{y_r^2=\frac{p_r^2}{2m}} V^N\mathop{\displaystyle{\int\cdots\int}}\limits_{2m\left(E-\frac{1}{2}\Delta\right) < \sum_{r=1}^{3N}y_r^2 < 2m\left(E+\frac{1}{2}\Delta\right)}d^{3N}y
\end{split}\]<p>where \( S_{n-1}(R) \) and \( V_n(R) \) denote the surface area and volume of an \( n \)-dimensional ball; noting that \( dV_n(R)=S_{n-1}(R)dR \), the above equals</p>
\[\omega = V^N\left\{V_{3N}\left[\sqrt{2m\left(E+\frac{1}{2}\Delta\right)}\right] - V_{3N}\left[\sqrt{2m\left(E-\frac{1}{2}\Delta\right)}\right]\right\} \simeq
\Delta\sqrt{\frac{m}{2E}}\,V^N S_{3N-1}\left(\sqrt{2mE}\right) \]<p>which evaluates to \( \omega\simeq\dfrac{\Delta}{E}V^N\dfrac{(2\pi mE)^{3N/2}}{[(3N/2)-1]!} \). Comparing with the \( \Gamma \) obtained for the ideal gas in <a class="link" href="http://localhost:1313/p/%e7%83%ad%e5%8a%9b%e5%ad%a6%e7%b3%bb%e7%bb%9f/" >Chapter 1</a> and dividing the two results gives \( (\omega/\Gamma)_\text{asymptotic}\equiv\omega_0=h^{3N} \).</p>
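<p>As a numerical sanity check (a sketch with toy values \( N=10 \), \( m=E=V=1 \); not part of the derivation), one can compare the exact difference of \( 3N \)-ball volumes with the asymptotic expression for \( \omega \):</p>

```python
import math

# Compare the shell volume between E - Δ/2 and E + Δ/2 with the
# asymptotic formula ω ≃ (Δ/E) V^N (2πmE)^{3N/2} / [(3N/2) - 1]!.
N, m, E, V, delta = 10, 1.0, 1.0, 1.0, 1e-4
n = 3 * N  # dimension of momentum space

def ball_volume(dim, R):
    """Volume of a dim-dimensional ball of radius R."""
    return math.pi ** (dim / 2) * R ** dim / math.gamma(dim / 2 + 1)

shell = V ** N * (
    ball_volume(n, math.sqrt(2 * m * (E + delta / 2)))
    - ball_volume(n, math.sqrt(2 * m * (E - delta / 2)))
)
# Note Γ(3N/2) = [(3N/2) - 1]! for integer 3N/2.
asymptotic = (delta / E) * V ** N * (2 * math.pi * m * E) ** (n / 2) / math.gamma(n / 2)

print(shell / asymptotic)  # ≈ 1
```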
<h2 id="实例-2--1-维谐振子">Example 2 — 1-D Harmonic Oscillator
</h2><p>The Hamiltonian of the one-dimensional harmonic oscillator is \( H(\omega)=\dfrac{p^2}{2m}+\dfrac{1}{2}m\omega^2q^2 \) (here \( \omega \) also denotes the oscillator frequency); the energy condition \( H=E \) fixes the phase trajectory to be the ellipse</p>
\[\frac{q^2}{2E/m\omega^2}+\frac{p^2}{2mE}=1\]<p>whose area is \( \pi\cdot\sqrt{\dfrac{2E}{m\omega^2}}\cdot\sqrt{2mE}=\dfrac{2\pi E}{\omega} \); taking the difference of this expression between the two shell boundaries gives</p>
\[\omega=\int'd\omega=\left.\dfrac{2\pi E}{\omega}\right|_{E-\frac{1}{2}\Delta}^{E+\frac{1}{2}\Delta}=\frac{2\pi\Delta}{\omega}\]<p>We know that the spacing between adjacent energy levels of the oscillator is \( \hbar\omega=\dfrac{h\omega}{2\pi} \), so the number of states should be close to \( \Gamma\simeq\dfrac{\Delta}{\hbar\omega}=\dfrac{2\pi\Delta}{h\omega} \), whence the fundamental volume is \( \omega_0\simeq\dfrac{\omega}{\Gamma}=h \). This agrees nicely with Heisenberg's uncertainty relation, \( \left(\Delta q\Delta p\right)_{\min}\simeq h \) in order of magnitude.</p>
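<p>The shell area \( 2\pi\Delta/\omega \) can also be checked by brute force. The sketch below (toy values; the oscillator frequency is renamed <code>Omega</code> in the code to avoid clashing with the phase-space volume \( \omega \)) estimates the area of the region \( E-\frac{1}{2}\Delta < H < E+\frac{1}{2}\Delta \) by Monte Carlo hit counting:</p>

```python
import math
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo estimate of the phase-space area with
# E - Δ/2 < H(q, p) < E + Δ/2, H = p²/2m + mΩ²q²/2,
# compared with the exact result 2πΔ/Ω.
m, Omega, E, delta = 1.0, 2.0, 1.0, 0.2
q_max = math.sqrt(2 * (E + delta / 2) / (m * Omega**2))  # outer-ellipse semi-axes
p_max = math.sqrt(2 * m * (E + delta / 2))

n_samples = 1_000_000
q = rng.uniform(-q_max, q_max, n_samples)
p = rng.uniform(-p_max, p_max, n_samples)
H = p**2 / (2 * m) + 0.5 * m * Omega**2 * q**2
hits = np.mean((H > E - delta / 2) & (H < E + delta / 2))
area = hits * (2 * q_max) * (2 * p_max)  # hit fraction × bounding-box area

exact = 2 * math.pi * delta / Omega
print(area / exact)  # ≈ 1
```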
<p><a class="link" href="http://localhost:1313/p/%e9%ab%98%e7%ad%89%e7%bb%9f%e8%ae%a1%e7%89%a9%e7%90%86/" >Back to contents</a></p>
<p><a class="link" href="http://localhost:1313/p/%e7%83%ad%e5%8a%9b%e5%ad%a6%e7%b3%bb%e7%bb%9f/" >Previous</a></p>
<p><a class="link" href="http://localhost:1313/p/%e6%ad%a3%e5%88%99%e7%b3%bb%e7%bb%bc/" >Next</a></p>
</description>
</item>
<item>
<title>Archives</title>
<link>http://localhost:1313/archives/</link>
<pubDate>Sun, 06 Mar 2022 00:00:00 +0000</pubDate>
<guid>http://localhost:1313/archives/</guid>
<description></description>
</item>
<item>
<title>Search</title>
<link>http://localhost:1313/search/</link>
<pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
<guid>http://localhost:1313/search/</guid>
<description></description>
</item>
</channel>
</rss>