% -*- coding: utf-8 -*-
% This file is part of TeX by Topic
% Copyright 2007-2014 Victor Eijkhout
% Translated by LiYanrui@bbs.ctex.org
% Translated by zoho@bbs.ctex.org
\documentclass{book}
\input{preamble}
\setcounter{chapter}{1}
\begin{document}
%\chapter{Category Codes and Internal States}\label{mouth}
\chapter{Category Codes and Internal States}\label{mouth}
%When characters are read,
%\TeX\ assigns them
%category codes. The reading mechanism has three internal
%states, and transitions between these states are affected
%by category codes of characters in the input.
%This chapter describes how \TeX\ reads its input and
%how the category codes of characters influence the
%reading behaviour. Spaces and line ends are discussed.
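When characters are read, \TeX\ assigns them category codes.
The input processor has three internal states, and the transitions between
these states are governed by the category codes of the characters in the input.
This chapter describes how \TeX\ reads its input and how category codes
influence the reading behaviour; spaces and line ends are also discussed.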
%\label{cschap:endlinechar}\label{cschap:ignorespaces}\label{cschap:catcode}\label{cschap:char32}\label{cschap:obeylines}\label{cschap:obeyspaces}
%\begin{inventory}
%\item [\cs{endlinechar}]
% The character code of the end-of-line character
% appended to input lines.
% \IniTeX\ default:~13.
%\item [\cs{par}]
% Command to close off a paragraph and go into vertical mode.
% Is generated by empty lines.
\label{cschap:endlinechar}\label{cschap:ignorespaces}\label{cschap:catcode}
\label{cschap:char32}\label{cschap:obeylines}\label{cschap:obeyspaces}
\begin{inventory}
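\item [\cs{endlinechar}]
  The character code of the end-of-line character appended to input lines.
  \IniTeX\ default:~13.
\item [\cs{par}]
  Command to close off a paragraph and go into vertical mode.
  It is generated by empty lines.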
%\item [\cs{ignorespaces}]
% Command that reads and expands until something is
% encountered that is not a \gr{space token}.
\item [\cs{ignorespaces}]
  Command that reads and expands until something is encountered
  that is not a \gr{space token}.
%\item [\cs{catcode}]
% Query or set category codes.
\item [\cs{catcode}]
  Query or set category codes.
%\item [\cs{ifcat}]
% Test whether two characters have the same category code.
\item [\cs{ifcat}]
  Test whether two characters have the same category code.
%\item [\cs{\char32}]
% Control space.
% Insert the same amount of space that a space token would
% when \cs{spacefactor}${}=1000$.
\item [\cs{\textvisiblespace}]
  Control space. Inserts the same amount of space that a space token would
  when \cs{spacefactor}${}=1000$.
%\item [\cs{obeylines}]
% Macro in plain \TeX\ to make line ends significant.
\item [\cs{obeylines}]
  Macro in plain \TeX\ to make line ends significant.
%\item [\cs{obeyspaces}]
% Macro in plain \TeX\ to make (most) spaces significant.
%\end{inventory}
\item [\cs{obeyspaces}]
  Macro in plain \TeX\ to make (most) spaces significant.
\end{inventory}
%\section{Introduction}
\section{Introduction}
%\TeX's input processor scans input lines from a file or terminal, and
%makes tokens out of the characters.
%The input processor can be viewed as
%a simple finite state automaton with three internal states;
%depending on the state its scanning behaviour may differ.
%This automaton will be treated here both from the point of view of the
%internal states and of the category codes governing the
%transitions.
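\TeX's input processor scans input lines from a file or terminal, and
makes tokens out of the characters. The input processor can be viewed as
a simple finite state automaton with three internal states;
its scanning behaviour depends on the state it is in.
This chapter treats the automaton both from the point of view of the
internal states and from that of the category codes governing the transitions.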
%\section{Initial processing}
\section{Initial processing}
%Input from a file (or from the user terminal, but this
%will not be mentioned specifically
%most of the time) is handled one line at a time.
%Here follows a discussion of what exactly is an input line
%for \TeX.
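Input from a file (or from the user terminal, which will not be mentioned
specifically most of the time) is handled one line at a time.
We therefore first discuss what exactly counts as an input line for \TeX.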
%Computer systems differ with respect to
%\index{line! input}\index{line! end}\index{machine independence}
%the exact definition of an input
%\mdqon
%line. The carriage return/""line feed
%\mdqoff
%\message{slash-dash}%
%sequence terminating a line is most common,
%but some systems use just a line feed, and
%some systems with fixed record length (block) storage do not have
%a line terminator at all. Therefore \TeX\ has its
%own way of terminating an input line.
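Computer systems differ with respect to the exact definition of an input line.
\index{line!input}\index{line!end}\index{machine independence}%
The most common line terminator is a carriage return followed by a line feed,
but some systems use just a line feed, and some systems with fixed record
length (block) storage have no line terminator at all.
To treat all these systems alike, \TeX\ has its own way of terminating
an input line, which proceeds roughly as follows: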
%\begin{enumerate}
%\item An input line is read from an input file (minus the
%line terminator, if any).
%\item Trailing spaces are removed (this is for the systems
%with block storage, and it prevents confusion because these
%spaces are hard to see in an editor).
%\item The \csterm endlinechar\par, by default \gram{return}
%(code~13) is appended.
%If the value of \cs{endlinechar} is negative
%\label{append:elc}%
%or more than~255 (this was 127 in versions of \TeX\ older
%than version~3; see page~\pageref{2vs3} for more differences),
%no character is appended.
%The effect then is the same as
%if the line were to end with a comment character.
%\end{enumerate}
\begin{enumerate}
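\item An input line is read from the input file (minus the line terminator, if any).
\item Trailing spaces are removed (this is for systems with block storage,
  and it also prevents confusion, because trailing spaces are hard to see in an editor).
\item The \csterm endlinechar\par, by default \gram{return} (\ascii\ code~13),
  is appended to the line. If the value of \cs{endlinechar} is negative or more than~255
  (more than~127 in versions of \TeX\ older than version~3;
  see page~\pageref{2vs3} for more differences),
\label{append:elc}%
  no character is appended; the effect is then the same as if the line
  ended with a comment character.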
\end{enumerate}
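The effect of the third step can be observed directly.
The following is a minimal sketch, assuming plain \TeX\ with its default
category codes; the macro name \cs{test} is chosen only for illustration.
\begin{verbatim}
\endlinechar=-1 % from the next line on, no end-of-line character is appended
\def\test{ab
cd}
\endlinechar=13 % restore the default <return>
\message{\meaning\test}
\end{verbatim}
With no end-of-line character appended, the two halves of the replacement text
are joined, so the message should read \verb|macro:->abcd|;
with the default setting it would read \verb|macro:->ab cd|.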
%Computers may also differ in the character encoding
%(the most common schemes are \ascii{} and \ebcdic{}), so \TeX\
%converts the characters that are read from the file to its
%own character codes. These codes are then used exclusively,
%so that \TeX\ will perform the same on any system.
%For more on this, see Chapter~\ref{char}.
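Computers may also differ in their character encoding
(the most common schemes are \ascii{} and \ebcdic{}),
so \TeX\ converts the characters that are read from a file to its own character codes.
Only these codes are used from then on, so that \TeX\ performs the same on any system.
For more on this, see Chapter~\ref{char}.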
%\section{Category codes}
\section{Category codes}
%Each of the 256 character codes (0--255) has an
%associated \indexterm{category code}, though not necessarily always the same one.
%There are 16 categories, numbered 0--15.
%When scanning the input, \TeX\
%thus forms character-code--category-code pairs.
%The input processor sees only these pairs; from them are formed
%character tokens, control sequence tokens, and parameter tokens.
%These tokens are then passed to \TeX's expansion and execution
%processes.
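Each of the 256 character codes (0--255) has an associated
\indexterm{category code}, though not necessarily always the same one.
There are 16 categories, numbered 0--15. When scanning the input,
\TeX\ thus forms character-code--category-code pairs.
The input processor sees only these pairs; from them it forms
character tokens, control sequence tokens, and parameter tokens.
These tokens are then passed to \TeX's expansion and execution processes.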
%A~character token is a character-code--category-code
%pair that is passed unchanged.
%A~control sequence token consists of one or more characters
%preceded by an escape character; see below.
%Parameter tokens are also explained below.
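A~character token is a character-code--category-code pair that is passed on unchanged.
A~control sequence token consists of one or more characters preceded by
an escape character; see below.
Parameter tokens are also explained below.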
%This is the list of the categories, together with a brief
%description. More elaborate explanations follow in this and
%later chapters.
%\begin{enumerate} \message{set counter}%\SetCounter:item=-1
%\setcounter{enumi}{-1}
%\item\label{ini:esc}\index{category!0} Escape character; this signals
% the start of a control sequence. \IniTeX\ makes the backslash
% \verb-\- (code~92) an escape character.
%\item\index{category!1} Beginning of group; such a character causes
% \TeX\ to enter a new level of grouping. The plain format makes the
% open brace \verb-{- \mdqon a beginning"-of-group character. \mdqoff
%\item\index{category!2} End of group; \TeX\ closes the current level
% of grouping. Plain \TeX\ has the closing brace \verb-}- as
% end-of-group character.
%\item\index{category!3} Math shift; this is the opening and closing
% delimiter for math formulas. Plain \TeX\ uses the dollar
% sign~\verb-$- for this.
%\item\index{category!4} Alignment tab; the column (row) separator in
% tables made with \cs{halign} (\cs{valign}). In plain \TeX\ this is
% the ampersand~\verb-&-.
%\item\index{category!5}\label{ini:eol} End of line; a character that
% \TeX\ considers to signal the end of an input line.
% \IniTeX\ assigns this code to the \gram{return}, that is, code~13.
% Not coincidentally, 13~is also the value that \IniTeX\ assigns to
% the \cs{endlinechar} parameter; see above.
%\item\index{category!6} Parameter character; this indicates parameters
% for macros. In plain \TeX\ this is the hash sign~\verb-#-.
%\item\index{category!7} Superscript; this precedes superscript
% expressions in math mode. It is also used to denote character codes
% that cannot be entered in an input file; see below. In plain
% \TeX\ this is the circumflex~\verb-^-.
%\item\index{category!8} Subscript; this precedes subscript expressions
% in math mode. In plain \TeX\ the underscore~\verb-_- is used for
% this.
%\item\index{category!9} Ignored; characters of this category are
% removed from the input, and have therefore no influence on further
% \TeX\ processing. In plain \TeX\ this is the \gr{null} character,
% that is, code~0.
%\item\index{category!10}\label{ini:sp} Space; space characters receive
% special treatment. \IniTeX\ assigns this category to the \ascii{}
% \gr{space} character, code~32.
%\item\index{category!11}\label{ini:let} Letter; in \IniTeX\ only the
% characters \n{a..z}, \n{A..Z} are in this category. Often, macro
% packages make some `secret' character (for instance~\n@) into a
% letter.
%\item\index{category!12}\label{ini:other} Other; \IniTeX\ puts
% everything that is not in the other categories into this
% category. Thus it includes, for instance, digits and punctuation.
%\item\index{category!13} Active; active characters function as a
% \TeX\ command, without being preceded by an escape character. In
% plain \TeX\ this is only the tie character~\verb-~-, which is
% defined to produce an unbreakable space; see page~\pageref{tie}.
%\item\index{category!14}\label{ini:comm} Comment character; from a
% comment character onwards, \TeX\ considers the rest of an input line
% to be comment and ignores it. In \IniTeX\ the per cent sign \verb-%-
% is made a comment character.
%\item\index{category!15}\label{ini:invalid} Invalid character; this
% category is for characters that should not appear in the
% input. \IniTeX\ assigns the \ascii\ \gr{delete} character, code~127,
% to this category.
%\end{enumerate}
This is the list of the 16 categories, together with a brief description.
More elaborate explanations follow in this and later chapters.
\begin{enumerate} \message{set counter}%\SetCounter:item=-1
\setcounter{enumi}{-1}
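\item\label{ini:esc}\index{category!0}Escape character:
  this signals the start of a control sequence. \IniTeX\ makes the backslash
  \verb-\- (code~92) an escape character.
\item\index{category!1}Beginning of group:
  such a character causes \TeX\ to enter a new level of grouping.
  The plain format makes the open brace \verb-{- a beginning-of-group character.
\item\index{category!2}End of group:
  \TeX\ closes the current level of grouping. Plain \TeX\ has the closing
  brace \verb-}- as end-of-group character.
\item\index{category!3}Math shift:
  this is the opening and closing delimiter for math formulas.
  Plain \TeX\ uses the dollar sign~\verb-$- for this.
\item\index{category!4}Alignment tab:
  the column (row) separator in tables made with \cs{halign} (\cs{valign}).
  In plain \TeX\ this is the ampersand~\verb-&-.
\item\index{category!5}\label{ini:eol}End of line:
  a character that \TeX\ considers to signal the end of an input line.
  \IniTeX\ assigns this category to the \gram{return}, that is, code~13.
  Not coincidentally, 13~is also the value that \IniTeX\ assigns to
  the \cs{endlinechar} parameter; see above.
\item\index{category!6}Parameter character:
  this indicates parameters for macros. In plain \TeX\ this is the hash sign~\verb-#-.
\item\index{category!7}Superscript:
  this precedes superscript expressions in math mode. It is also used to denote
  character codes that cannot be entered in an input file; see below.
  In plain \TeX\ this is the circumflex~\verb-^-.
\item\index{category!8}Subscript:
  this precedes subscript expressions in math mode.
  In plain \TeX\ the underscore~\verb-_- is used for this.
\item\index{category!9}Ignored:
  characters of this category are removed from the input, and therefore have
  no influence on further \TeX\ processing.
  In plain \TeX\ this is the \gr{null} character, that is, code~0.
\item\index{category!10}\label{ini:sp}Space:
  space characters receive special treatment. \IniTeX\ assigns this category
  to the \ascii{} \gr{space} character, code~32.
\item\index{category!11}\label{ini:let}Letter:
  in \IniTeX\ only the characters \n{a..z}, \n{A..Z} are in this category.
  Macro packages often make some `secret' character (for instance~\n@) into
  a letter in order to avoid name clashes.
\item\index{category!12}\label{ini:other}Other:
  \IniTeX\ puts everything that is not in the other categories into this category.
  Thus it includes, for instance, digits and punctuation.
\item\index{category!13}Active:
  active characters function as a \TeX\ command without being preceded by
  an escape character. In plain \TeX\ this is only the tie character~\verb-~-,
  which is defined to produce an unbreakable space; see page~\pageref{tie}.
\item\index{category!14}\label{ini:comm}Comment character:
  from a comment character onwards, \TeX\ considers the rest of an input line
  to be comment and ignores it.
  In \IniTeX\ the per cent sign~\verb-%- is made a comment character.
\item\index{category!15}\label{ini:invalid}Invalid character:
  this category is for characters that should not appear in the input.
  \IniTeX\ assigns the \ascii\ \gr{delete} character, code~127, to this category.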
\end{enumerate}
%The user can change the mapping
%of character codes to category codes
%with the \csterm catcode\par\ command (see Chapter~\ref{gramm}
%for the explanation of concepts such as~\gr{equals}):
%\begin{disp}\cs{catcode}\gram{number}\gr{equals}\gram{number}.\end{disp}
%In such a statement, the first number is often given in the form
%\begin{disp}\verb>`>\gr{character}\quad or\quad \verb>`\>\gr{character}\end{disp}
%both of which denote the character code of the character
%(see pages \pageref{char:code} and~\pageref{int:denotation}).
The user can change the mapping of character codes to category codes
with the \csterm catcode\par\ command
(see Chapter~\ref{gramm} for the explanation of concepts such as~\gr{equals}):
\begin{disp}\cs{catcode}\gram{number}\gr{equals}\gram{number}.\end{disp}
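In such a statement the first number is the character code whose category
is to be changed; it is often given in the form
\begin{disp}\verb>`>\gr{character}\quad or\quad \verb>`\>\gr{character}\end{disp}
both of which denote the character code of the character
(see pages \pageref{char:code} and~\pageref{int:denotation}).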
%The plain format defines
%\csterm active\par
%\begin{verbatim}
%\chardef\active=13
%\end{verbatim}
%so that one can write statements such as
%\begin{verbatim}
%\catcode`\{=\active
%\end{verbatim}
%The \cs{chardef} command is treated
%on pages \pageref{chardef} and~\pageref{num:chardef}.
The plain format defines \csterm active\par as
\begin{verbatim}
\chardef\active=13
\end{verbatim}
so that one can write statements such as
\begin{verbatim}
\catcode`\{=\active
\end{verbatim}
The \cs{chardef} command is treated on pages \pageref{chardef} and~\pageref{num:chardef}.
%The \LaTeX\ format has the control sequences
%\begin{verbatim}
%\def\makeatletter{\catcode`@=11 }
%\def\makeatother{\catcode`@=12 }
%\end{verbatim}
%in order to switch on and off the `secret' character~\n@
%(see below).
The \LaTeX\ format has the control sequences
\begin{verbatim}
\def\makeatletter{\catcode`@=11 }
\def\makeatother{\catcode`@=12 }
\end{verbatim}
which switch the `secret' character~\n@ on and off (see below).
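A typical use, sketched here under the assumption that the \LaTeX\ format
is loaded, hides an internal helper behind a public macro;
both macro names are hypothetical.
\begin{verbatim}
\makeatletter                  % @ now has category 11 (letter)
\def\demo@secret{hidden}       % a single control word containing @
\def\demopublic{\demo@secret}  % public interface to the helper
\makeatother                   % @ has category 12 (other) again
\end{verbatim}
After \cs{makeatother} the public macro still works, because its replacement
text was converted to tokens while \n@ was a letter.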
%The \cs{catcode} command can also be used to query category
%codes: in
%\begin{verbatim}
%\count255=\catcode`\{
%\end{verbatim}
%it yields a number, which can be assigned.
The \cs{catcode} command can also be used to query category codes, for example:
\begin{verbatim}
\count255=\catcode`\{
\end{verbatim}
Here it yields a number, which is stored in count register~255.
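In the same way \cs{the} can turn the value into characters.
As a small sketch, assuming the plain format with its default category codes,
the following lines write some of the codes from the list above
to the terminal and the log file.
\begin{verbatim}
\message{[\the\catcode`\\]} % [0]:  escape
\message{[\the\catcode`\{]} % [1]:  beginning of group
\message{[\the\catcode`\%]} % [14]: comment
\message{[\the\catcode`a]}  % [11]: letter
\message{[\the\catcode`1]}  % [12]: other
\message{[\the\catcode`\~]} % [13]: active
\end{verbatim}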
%Category codes can be tested by
%\begin{disp}\cs{ifcat}\gr{token$_1$}\gr{token$_2$}\end{disp}
%\TeX\ expands whatever is after \cs{ifcat} until two
%unexpandable tokens are found; these are then compared
%with respect to their category codes. Control sequence
%tokens are considered to have category code~16\index{category!16},
%which makes them all equal to each other, and unequal to
%all character tokens.
%Conditionals are treated further in Chapter~\ref{if}.
Category codes can be tested by
\begin{disp}\cs{ifcat}\gr{token$_1$}\gr{token$_2$}\end{disp}
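\TeX\ expands whatever is after \cs{ifcat} until two unexpandable tokens are found;
these are then compared with respect to their category codes.
Control sequence tokens are considered to have category code~16\index{category!16},
which makes them all equal to each other, and unequal to all character tokens.
Conditionals are treated further in Chapter~\ref{if}.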
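As an illustration (a sketch assuming the default category codes of plain \TeX),
the three tests below compare a letter with a digit, two letters,
and two unexpandable control sequences.
\begin{verbatim}
\message{\ifcat a1 same\else different\fi}        % 11 vs 12: \else branch
\message{\ifcat ab same\else different\fi}        % 11 vs 11: true branch
\message{\ifcat\relax\par same\else different\fi} % 16 vs 16: true branch
\end{verbatim}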
%\section{From characters to tokens}
\section{From characters to tokens}
%The input processor
%of \TeX\ scans input lines from a file or from the
%user terminal, and converts the characters in the input
%to tokens. There are three types of tokens.
%\begin{itemize}\item Character tokens: any character that is
% passed on its own to \TeX's
%further levels of processing with an appropriate
%category code attached.
%\item Control sequence tokens, of which there are two kinds:
% an escape character
%\ldash that is,\message{ldash nobreak?}
%a character of category~0\index{category!0} \rdash followed
%by a string of `letters' is
%lumped together into a {\em control word}, which is a single token.
%An escape character followed by a single character that is not of
%category~11\index{category!11}, letter, is made into a
%\indextermsub{control}{symbol}.
%If the distinction between control word and control symbol is
%irrelevant, both are called
%\indextermsub{control}{sequence}.
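The input processor of \TeX\ scans input lines from a file or from the user
terminal, and converts the characters in the input to tokens.
There are three types of tokens.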
\begin{itemize}
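\item Character tokens: any character that is passed on its own to \TeX's
  further levels of processing with an appropriate category code attached.
\item Control sequence tokens, of which there are two kinds.
  An escape character\index{category!0} (that is, a character of category~0)
  followed by a string of `letters' is lumped together into a {\em control word},
  which is a single token. An escape character followed by a single character
  that is not of category~11\index{category!11}, letter,
  is made into a {\em control symbol}.
  If the distinction between control word and control symbol is irrelevant,
  both are called {\em control sequence}\index{control!sequence}.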
%The control symbol that results from an escape character followed
%\csterm \char32\par
%by a space character is called
%\indextermbus{control}{space}.
  The control symbol that results from an escape character followed by a space
  character \csterm\char32\par is called {\em control space}.
\index{control!space}
%\item Parameter tokens: a parameter character \ldash that is, a
% character of category~6\index{category!6}, by default~\verb=#=
% \rdash followed by a digit \n{1..9} is replaced by a parameter
% token. Parameter tokens are allowed only in the context of macros
% (see Chapter~\ref{macro}).
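\item Parameter tokens: a parameter character (that is, a character of
  category~6, by default~\verb=#= in plain \TeX)
  followed by a digit \n{1..9} is replaced by a parameter token.
  Parameter tokens are allowed only in the context of macros
  (see Chapter~\ref{macro}).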
%A macro parameter character followed by another macro parameter
%character (not necessarily with the same character code)
%is replaced by a single character token.
%This token has category~6 (macro parameter), and the character
%code of the second parameter character.
%The most common instance is of this is
%replacing \n{\#\#} by~\n{\#$_6$}, where the subscript
%denotes the category code.
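  In the replacement text of a macro, a macro parameter character followed by
  another macro parameter character (not necessarily with the same character code)
  is replaced by a single character token.
  This token has category~6 (macro parameter) and the character code
  of the second parameter character.
  The most common instance of this is replacing \n{\#\#} by~\n{\#$_6$},
  where the subscript denotes the category code.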
%\end{itemize}
\end{itemize}
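The doubling of parameter characters matters when one definition contains another.
In the following sketch (plain \TeX\ assumed; the macro names are hypothetical)
\verb>##1> in the outer definition becomes the parameter \verb>#1> of the inner one.
\begin{verbatim}
\def\definepair#1{\def\pair##1{(#1,##1)}}
\definepair{a}     % defines \pair#1 with replacement text (a,#1)
\message{\pair{b}} % (a,b)
\end{verbatim}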
%\section{The input processor as a finite state automaton}
%\label{input:states}
\section{The input processor as a finite state automaton}
\label{input:states}
%\TeX's input processor can be considered to be a finite state
%automaton with three \indextermbus{internal}{states},
%that is, at any moment in time it is in one of three states,
%and after transition to another state there is no memory of the
%previous states.
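\TeX's input processor can be considered to be a finite state automaton
with three \indextermbus{internal}{states}; that is, at any moment in time
it is in one of three states, and after a transition to another state
there is no memory of the previous states.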
%\subsection{State {\italic N}: new line}
\subsection{State {\italic N}: new line}
%State {\italic N} is entered at the beginning of each new input line,
%and that is the only time \TeX\ is in this state. In state~{\italic
% N} all space tokens (that is, characters of
%category~10\index{category!10}) are ignored; an end-of-line character
%is converted into a \cs{par} token. All other tokens bring \TeX\ into
%state~{\italic M}.
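State {\italic N} is entered at the beginning of each new input line,
and that is the only time \TeX\ is in this state.
In state~{\italic N} all space tokens
(that is, characters of category~10\index{category!10}) are ignored;
an end-of-line character is converted into a \cs{par} token.
All other tokens bring \TeX\ into state~{\italic M}.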
%\subsection{State {\italic S}: skipping spaces}
\subsection{State {\italic S}: skipping spaces}
%State {\italic S} is entered in any mode after a control word or
%control space (but after no other control symbol),
%or, when in state~{\italic M}, after a space.
%In this state all subsequent spaces or end-of-line characters
%in this input line are discarded.
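State {\italic S} is entered from any state after a control word or
control space (but after no other control symbol),
or, when in state~{\italic M}, after a space.
In this state all subsequent spaces or end-of-line characters
in this input line are discarded.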
%%\spoint State {\italic M}: middle of line
%\subsection{State {\italic M}: middle of line}
%\spoint State {\italic M}: middle of line
\subsection{State {\italic M}: middle of line}
%By far the most common state is~{\italic M}, `middle of line'.
%It is entered after characters of categories
%1--4, 6--8, and 11--13, and after control symbols
%other than control space.
%An end-of-line character encountered in this state
%results in a space token.
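By far the most common state is~{\italic M}, `middle of line'.
It is entered after characters of categories 1--4, 6--8, and 11--13,
and after control symbols other than control space.
An end-of-line character encountered in this state results in a space token.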
%%% \input figflow \message{left align flow diagram}
%%% \vskip12pt plus 1pt minus 4pt\relax %before spoint skip
%%% \begin{tdisp}%\PopIndentLevel
%%% \leavevmode\relax
%%% %\figmouth
%%% \message{fig mouth missing}
%%% \end{tdisp}
%% \input figflow \message{left align flow diagram}
%% \vskip12pt plus 1pt minus 4pt\relax %before spoint skip
%% \begin{tdisp}%\PopIndentLevel
%% \leavevmode\relax
%% %\figmouth
%% \message{fig mouth missing}
%% \end{tdisp}
%\input figs1
%\begin{quotation}
% \figmouth
%\end{quotation}
\input figs1
\begin{quotation}
\figmouth
\end{quotation}
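The three states can be traced on a small input.
This is a sketch assuming plain \TeX; \cs{test} is a hypothetical macro name.
\begin{verbatim}
\def\test{\relax   a    b
c}
\message{\meaning\test}
\end{verbatim}
The spaces after \cs{relax} are skipped in state~{\italic S},
the run of spaces after \n a yields a single space token,
and the line end after \n b is converted to a space token in state~{\italic M},
so the message should read \verb|macro:->\relax a b c|.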
%%\point[hathat] Accessing the full character set
%\section{Accessing the full character set}
%\label{hathat}
%\point[hathat] Accessing the full character set
\section{Accessing the full character set}
\label{hathat}
%Strictly speaking, \TeX's input processor
%is not a finite state automaton.
%This is because during the scanning of the input line
%all trios consisting of two {\sl equal\/} superscript characters
%\index{\char94\char94\ replacement}
%(category code~7\index{category!7}) and a subsequent character
%(with character code~$<128$)
%are replaced by a single character with a character
%code in the range 0--127,
%differing by 64 from that of the original character.
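Strictly speaking, \TeX's input processor is not a finite state automaton.
This is because during the scanning of the input line
all trios consisting of two {\sl equal\/} superscript characters
(category code~7\index{\char94\char94\ replacement})
and a subsequent character (with character code~$<128$)
are replaced by a single character with a character code in the range 0--127,
differing by 64 from that of the original character.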
%This mechanism can be used, for instance, to access positions in a font
%corresponding to character codes that cannot
%be input, for instance because they are \ascii{} control characters.
%The most obvious examples are the \ascii{} \gr{return}
%and \gr{delete} characters; the corresponding
%positions 13 and 127 in a font are
%accessible as \verb>^^M> and~\verb>^^?>.
%However, since the category of \verb>^^?> is 15\index{category!15}, invalid,
%that has to be changed before character 127 can be accessed.
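This mechanism can be used, for instance, to access positions in a font
corresponding to character codes that cannot be input,
for instance because they are \ascii{} control characters.
The most obvious examples are the \ascii{} \gr{return} and \gr{delete} characters;
the corresponding positions 13 and 127 in a font are accessible
as \verb>^^M> and~\verb>^^?>.
However, since the category of \verb>^^?> is 15\index{category!15}, invalid,
that has to be changed before character 127 can be accessed.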
%In \TeX3 this mechanism has been
%modified and extended to access 256 characters:
%any quadruplet \verb-^^xy- where both \n x and \n y are lowercase
%hexadecimal digits \n0--\n9, \n a--\n f,
%is replaced by a character in the
%range 0--255, namely the character the number of which is
%represented hexadecimally as~\n{xy}.
%This imposes a slight restriction on the applicability
%of the earlier mechanism: if, for instance, \verb>^^a>
%is typed to produce character~33, then a following
%\n0--\n9, \n{a}--\n{f} will be misunderstood.
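In \TeX3 this mechanism has been modified and extended to access 256 characters:
any quadruplet \verb-^^xy-, where both \n x and \n y are lowercase hexadecimal
digits \n0--\n9, \n a--\n f, is replaced by a character in the range 0--255,
namely the character whose number is represented hexadecimally as~\n{xy}.
This imposes a slight restriction on the applicability of the earlier mechanism:
if, for instance, \verb>^^a> is typed to produce character~33 (\verb>!>),
then a following \n0--\n9 or \n{a}--\n{f} will be misunderstood.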
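Both notations can be tried out as follows; this is a sketch that assumes
\TeX3 or later with the circumflex in category~7, as in plain \TeX.
\begin{verbatim}
\message{^^41^^42} % hexadecimal 41 and 42: AB
\message{^^5a}     % hexadecimal 5a: Z
\message{^^a.}     % a. is not a hexadecimal pair: code 97-64=33, giving !.
\end{verbatim}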
%While this process makes \TeX's input processor
%somewhat more powerful
%than a true finite state automaton,
%it does not interfere with the rest of
%the scanning. Therefore it is conceptually simpler to pretend that
%such a replacement of triplets or quadruplets
%of characters, starting with~\verb>^^>, is performed in advance.
%In actual practice this is not possible,
%because an
%input line may assign category code~7\index{category!7} to some
%character other than the circumflex, thereby
%influencing its further processing.
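While this process makes \TeX's input processor somewhat more powerful
than a true finite state automaton, it does not interfere with the rest of the scanning.
Therefore it is conceptually simpler to pretend that such a replacement of
triplets or quadruplets of characters, starting with~\verb>^^>, is performed in advance.
In actual practice this is not possible, because an input line may assign
category code~7\index{category!7} to some character other than the circumflex,
thereby influencing its further processing.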
%%\point Transitions between internal states
%\section{Transitions between internal states}
%\point Transitions between internal states
\section{Transitions between internal states}
%Let us now discuss the effects on the internal state
%of \TeX's input processor when
%certain category codes are encountered in the input.
Let us now discuss the effects on the internal state of \TeX's input processor
when certain category codes are encountered in the input.
%%\spoint 0: escape character
%\subsection{0: escape character}
%\index{escape!character|see{character, escape}}
%\spoint 0: escape character
\subsection{0: escape character}
\index{escape!character|see{character, escape}}
%When an \indextermbus{escape}{character} is encountered,
%\TeX\ starts forming a control sequence token.
%Three different types of control sequence can result,
%depending on the category code of the character that
%follows the escape character.
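When an {\em escape character}\index{character!escape} is encountered,
\TeX\ starts forming a control sequence token.
Three different types of control sequence can result, depending on the
category code of the character that follows the escape character.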
%\begin{itemize}\item
%If the character following the escape is of category~11\index{category!11},
%letter, then \TeX\ combines the escape,
%that character and all following
%characters of category~11, into a control word.
%After that \TeX\
%goes into state~{\italic S}, skipping spaces.
%\item
%With a character of category~10\index{category!10}, space, a control
%symbol called control space results, and \TeX\ goes into
%state~{\italic S}.
%\item
%With a character of any other category code
%a control symbol results, and \TeX\ goes into state~{\italic M},
%middle of line.
%\end{itemize}
\begin{itemize}
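\item If the character following the escape is of category~11\index{category!11},
  letter, then \TeX\ combines the escape, that character and all following
  characters of category~11 into a control word.
  After that \TeX\ goes into state~{\italic S}, skipping spaces.
\item With a character of category~10\index{category!10}, space,
  a control symbol called control space results,
  and \TeX\ goes into state~{\italic S}.
\item With a character of any other category code a control symbol results,
  and \TeX\ goes into state~{\italic M}, middle of line.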
\end{itemize}
%The letters of a control sequence name have to be all on one line;
%a control sequence name is not continued on the next line
%if the current line ends with a comment sign, or if (by letting
%\cs{endlinechar} be outside the range~0--255)
%there is no terminating character.
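The letters of a control sequence name have to be all on one line;
a control sequence name is not continued on the next line,
even if the current line ends with a comment sign or if (by letting
\cs{endlinechar} be outside the range~0--255) there is no terminating character.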
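For example, in the following sketch (plain \TeX\ assumed; \cs{fo} is a
hypothetical macro) the comment character does not join the \n o on the
second line to the control word.
\begin{verbatim}
\def\fo{[fo]}
\message{\fo%
o}                 % [fo]o
\end{verbatim}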
%%\spoint 1--4, 7--8, 11--13: non-blank characters
%\subsection{1--4, 7--8, 11--13: non-blank characters}
%\spoint 1--4, 7--8, 11--13: non-blank characters
\subsection{1--4, 7--8, 11--13: non-blank characters}
%Characters of category codes 1--4, 7--8, and 11--13 are made
%into tokens, and \TeX\ goes into state~{\italic M}.
Characters of category codes 1--4, 7--8, and 11--13 are made into tokens,
and \TeX\ goes into state~{\italic M}.
%%\spoint 5: end of line
%\subsection{5: end of line}
%\spoint 5: end of line
\subsection{5: end of line}
%Upon encountering an end-of-line character,
%\TeX\ discards the rest of the
%line, and starts processing the next line,
%in state~{\italic N}. If the current state was~{\italic N},
%that is, if the
%line so far contained at most spaces, a~\cs{par} token
%is inserted; if the state was~{\italic M}, a~space token is inserted,
%and in state~{\italic S} nothing is inserted.
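Upon encountering an end-of-line character, \TeX\ discards the rest of the line
and starts processing the next line in state~{\italic N}.
If the current state was~{\italic N}, that is, if the line so far contained
at most spaces, a~\cs{par} token is inserted;
if the state was~{\italic M}, a~space token is inserted;
and in state~{\italic S} nothing is inserted.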
%Note that by `end-of-line character' a character with category
%code~5 is meant. This is not necessarily the \cs{endlinechar},
%nor need it appear at the end of the line.
%See below for further remarks on line ends.
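Note that by `end-of-line character' a character with category code~5 is meant.
This is not necessarily the \cs{endlinechar}, nor need it appear at the end of the line.
See below for further remarks on line ends.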
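To see that an end-of-line character need not be the last character of a line,
one can give category code~5 to an ordinary character.
The following sketch assumes plain \TeX\ and uses the exclamation mark
purely as an example.
\begin{verbatim}
\catcode`\!=5
\message{visible! this part of the line is discarded
}
\catcode`\!=12
\end{verbatim}
In state~{\italic M} the exclamation mark is converted to a space token and
the remainder of its line is dropped, so only the word before it reaches the terminal.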
%%\spoint 6: parameter
%\subsection{6: parameter}
%\spoint 6: parameter
\subsection{6: parameter}
%A \indextermbus{parameter}{character} \ldash usually~\verb=#= \rdash can be
%followed by either a digit \n{1..9}
%in the context of macro definitions
%\altt
%or by another parameter character.
%In the first case a `parameter token' results,
%in the second case only a single parameter character
%is passed on as a character token for further processing.
%In either case \TeX\ goes into state~{\italic M}.
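A {\em parameter character}\index{character!parameter}, usually~\verb=#=,
can be followed in the context of macro definitions either by a digit \n{1..9}
or by another parameter character.
In the first case a `parameter token' results; in the second case only a single
parameter character is passed on as a character token for further processing.
In either case \TeX\ goes into state~{\italic M}.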
%A parameter character can also appear on its own in an
%alignment preamble (see Chapter~\ref{align}).
A parameter character can also appear on its own in an
alignment preamble (see Chapter~\ref{align}).
%%\spoint 7: superscript
%\subsection{7: superscript}
%\spoint 7: superscript
\subsection{7: superscript}
%A superscript character is handled like most non-blank
%characters, except in the case where it is followed
%by a superscript character of the same character code.
%The process
%that replaces these two characters plus the following character
%(possibly two characters in \TeX3) by another character
%was described above.
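A superscript character is handled like most non-blank characters,
except when it is followed by a superscript character of the same character code.
The process that replaces these two characters plus the following character
(possibly two characters in \TeX3) by another character was described above.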
%%\spoint 9: ignored character
%\subsection{9: ignored character}
%\spoint 9: ignored character
\subsection{9: ignored character}
%Characters of category 9 are ignored; \TeX\ remains in the same state.
Characters of category~9 are ignored; \TeX\ remains in the same state.
%%\spoint 10: space
%\subsection{10: space}
%\spoint 10: space
\subsection{10: space}
%A token with category code 10 \ldash this is called a \gr{space token},
%irrespective of the character code \rdash
%is ignored in states {\italic N} and~{\italic S}
%(and the state does not change);
%in state~{\italic M} \TeX\ goes into state~{\italic S}, inserting
%a token that has category~10 and character code~32
%(\ascii{} space).
%This implies that the character code of the space token may change
%from the character that was actually input.
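A token with category code~10 is called a \gr{space token}, irrespective of its character code.
In states {\italic N} and~{\italic S} \TeX\ ignores it, and the state does not change.
In state~{\italic M} \TeX\ goes into state~{\italic S} and inserts a token
that has category~10 and character code~32 (\ascii{} space).
This implies that the character code of the space token may differ
from that of the character that was actually input.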
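
The change of character code can be observed with \cs{meaning}.
The sketch below assumes Plain \TeX; the macro name \cs{x} is hypothetical,
and the tilde is merely a convenient character to give category~10.
\begin{verbatim}
{\catcode`\~=10 % the tilde now has category 10
 \def\x{a~b}% read in state M, so ~ becomes a space token (code 32)
 \message{\meaning\x}}% shows: macro:->a b
\end{verbatim}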
%%\spoint 14: comment
%\subsection{14: comment}
%\spoint 14: comment
\subsection{14: comment}
%A comment character causes \TeX\ to discard
%the rest of the line, including the comment character.
%In particular, the end-of-line character is not seen,
%so even if the comment was encountered in state~{\italic M}, no space
%token is inserted.
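A comment character causes \TeX\ to discard the rest of the line,
including the comment character itself.
In particular, the end-of-line character is not seen,
so even if the comment was encountered in state~{\italic M}, no space token is inserted.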
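
A brief sketch of this effect, with the hypothetical macro names \cs{a} and \cs{b}:
\begin{verbatim}
\def\a{x%
y}% the comment hides the line end: no space token
\def\b{x
y}% the line end becomes a space token
\message{[\a] [\b]}% shows: [xy] [x y]
\end{verbatim}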
%%\spoint 15: invalid
%\subsection{15: invalid}
%\spoint 15: invalid
\subsection{15: invalid}
%Invalid characters cause an error message. \TeX\ remains in
%the state it was in.
%However, in the context of a control symbol an invalid character
%is acceptable. Thus \verb>\^^?> does not cause any error messages.
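Invalid characters cause an error message; \TeX\ remains in the state it was in.
In the context of a control symbol, however, an invalid character is acceptable:
\verb>\^^?> does not cause any error message.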
%%\point[cat12] Letters and other characters
%\section{Letters and other characters}
%\label{cat12}
%\point[cat12] Letters and other characters
\section{Letters and other characters}
\label{cat12}
%In most programming languages identifiers can consist
%of both letters and digits (and possibly some other
%character such as the underscore), but control sequences in \TeX\
%are only allowed to be formed out of characters of category~11,
%letter. Ordinarily, the digits and punctuation symbols have
%category~12, other character.
%However, there are contexts where \TeX\ itself
%generates a string of characters, all of which have
%category code~12, even if that is not their usual
%category code.
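In most programming languages identifiers can consist of both letters and digits
(and possibly some other character such as the underscore),
but control words in \TeX\ can only be formed out of characters of category~11, letter.
Ordinarily, the digits and punctuation symbols have category~12, other character.
However, there are contexts where \TeX\ itself generates a string of characters,
all of which have category code~12, even if that is not their usual category code.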
%This happens when the operations
%\cs{string},
%\cs{number},
%\cs{romannumeral},
%\cs{jobname},
%\cs{fontname},
%\cs{meaning},
%and \cs{the}
%are used to generate a stream of character tokens.
%If any of the characters delivered by such a command
%is a space character (that is, character code~32),
%it receives category code~10, space.
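Such strings of category~12 characters arise when the operations
\cs{string},
\cs{number},
\cs{romannumeral},
\cs{jobname},
\cs{fontname},
\cs{meaning},
and \cs{the} are used to generate a stream of character tokens.
If any of the characters delivered by such a command is a space character
(that is, character code~32), it receives category code~10, space.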
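
The category code of such a character can be tested with \cs{ifcat}.
The following is only a sketch; the macro name \cs{temp} is hypothetical.
\begin{verbatim}
\edef\temp{\string a}% \temp expands to the single token a with category 12
\expandafter\ifcat\temp 1% compare it with the digit 1, also category 12
  \message{same category as a digit}%
\else
  \message{different category}%
\fi
\end{verbatim}
With the ordinary category codes the first branch is taken,
whereas a plain \verb>\ifcat a1> would take the second.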
%For the extremely rare case where a hexadecimal digit has been
%hidden in a control sequence, \TeX\ allows \n A$_{12}$--\n F$_{12}$
%to be hexadecimal digits, in addition to the ordinary
%\n A$_{11}$--\n F$_{11}$ (here
%the subscripts denote the category codes).
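For the extremely rare case where a hexadecimal digit has been hidden in a control sequence,
\TeX\ allows \n A$_{12}$--\n F$_{12}$ to be hexadecimal digits,
in addition to the ordinary \n A$_{11}$--\n F$_{11}$ (here the subscripts denote the category codes).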
%For example,
%\begin{disp}\verb>\string\end>\quad gives four character tokens\quad
%\n{\char92$_{12}$e$_{12}$n$_{12}$d$_{12}$} \end{disp}
%Note that the \indextermbus{escape}{character}~\texttt{\char`\\}$_{12}$\label{use:escape}
%is used in the output only because the
%value of \cs{escapechar} is the character code for the
%backslash. Another value of \cs{escapechar} leads to another
%character in the output of \cs{string}.
%The \cs{string} command is treated further in Chapter~\ref{char}.
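For example,
\begin{disp}\verb>\string\end>\quad gives four character tokens\quad
\n{\char92$_{12}$e$_{12}$n$_{12}$d$_{12}$} \end{disp}
Note that the {\em escape character}\index{character!escape} \texttt{\char`\\}$_{12}$\label{use:escape}
appears in the output only because the value of \cs{escapechar}
is the character code of the backslash.
Another value of \cs{escapechar} leads to another character in the output of \cs{string}.
The \cs{string} command is treated further in Chapter~\ref{char}.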
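
The dependence on \cs{escapechar} can be checked with a few lines such as these,
a sketch in which the exclamation mark is an arbitrary choice:
\begin{verbatim}
\message{\string\end}% shows: \end
{\escapechar=`\! \message{\string\end}}% shows: !end
{\escapechar=-1 \message{\string\end}}% shows: end
\end{verbatim}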
%Spaces can wind up in control sequences:
%\begin{disp}\verb>\csname a b\endcsname>\end{disp} gives a control sequence
%token in which one of the three characters is a space.
%Turning this control sequence token into a string of characters
%\begin{disp}\verb>\expandafter\string\csname a b\endcsname>\end{disp}
%gives \n{\char92$_{12}$a$_{12}$\char32$_{10}$b$_{12}$}.
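Spaces can wind up in control sequences:
\begin{disp}\verb>\csname a b\endcsname>\end{disp}
gives a control sequence token in which one of the three characters is a space.
Turning this control sequence token into a string of characters
\begin{disp}\verb>\expandafter\string\csname a b\endcsname>\end{disp}
gives \n{\char92$_{12}$a$_{12}$\char32$_{10}$b$_{12}$}.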
%As a more practical example, suppose there exists a sequence
%of input files \n{file1.tex}, \n{file2.tex}\label{ex:jobnumber},
%and we want to
%write a macro that finds the number of the input file
%that is being processed. One approach would be to write
%\begin{verbatim}
%\newcount\filenumber \def\getfilenumber file#1.{\filenumber=#1 }
%\expandafter\getfilenumber\jobname.
%\end{verbatim}
%where the letters \n{file} in the parameter text of the
%macro (see Section~\ref{param:text}) absorb that part of the
%jobname, leaving the number as the sole parameter.
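As a more practical example, suppose there exists a sequence of input files \n{file1.tex},
\n{file2.tex}\label{ex:jobnumber}, and so on,
and we want to write a macro that finds the number of the input file that is being processed.
One approach would be to write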
\begin{verbatim}
\newcount\filenumber \def\getfilenumber file#1.{\filenumber=#1 }
\expandafter\getfilenumber\jobname.
\end{verbatim}
where the letters \n{file} in the parameter text of the macro (see Section~\ref{param:text})
absorb that part of the job name,
leaving the number as the sole argument.
%However, this is slightly incorrect: the letters \n{file} resulting
%from the \cs{jobname} command have category code~12, instead of
%11 for the ones in the definition of \cs{getfilenumber}.
%This can be repaired as follows:
%\begin{verbatim}
%{\escapechar=-1
% \expandafter\gdef\expandafter\getfilenumber
% \string\file#1.{\filenumber=#1 }
%}
%\end{verbatim}
%Now the sequence \verb>\string\file> gives the four
%letters \n{f$_{12}$i$_{12}$l$_{12}$e$_{12}$};
%the \cs{expandafter} commands let this be executed prior to
%the macro definition;
%the backslash is omitted because we put\handbreak \verb>\escapechar=-1>.
%Confining this value to a group makes it necessary to use~\cs{gdef}.
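However, this is slightly incorrect: the letters \n{file} resulting from the \cs{jobname} command
have category code~12, whereas those in the definition of \cs{getfilenumber} have category code~11.
This can be repaired as follows: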
\begin{verbatim}
{\escapechar=-1
\expandafter\gdef\expandafter\getfilenumber
\string\file#1.{\filenumber=#1 }
}
\end{verbatim}
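Now the sequence \verb>\string\file> gives the four characters \n{f$_{12}$i$_{12}$l$_{12}$e$_{12}$};
the \cs{expandafter} commands let \verb>\string\file> be expanded before the macro definition is carried out;
and the backslash is omitted from the output of \cs{string} because of the setting \verb>\escapechar=-1>.
Confining this setting to a group makes it necessary to use~\cs{gdef}.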
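
Putting the pieces together, a complete sketch might read as follows.
It assumes Plain \TeX\ and a job name consisting of \n{file} followed by digits;
the closing \cs{message} is added here only to display the result.
\begin{verbatim}
\newcount\filenumber
{\escapechar=-1
 \expandafter\gdef\expandafter\getfilenumber
   \string\file#1.{\filenumber=#1 }
}
% \jobname is expanded first, then \getfilenumber sees file<number>.
\expandafter\getfilenumber\jobname.
\message{Processing file number \the\filenumber}
\end{verbatim}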
%\section{The \lowercase{\n{\char92par}} token}
\section{The \protect\cs{par} token}
%\TeX\ inserts a \csterm par\par\ token into the input after
%an \indextermbus{empty}{line}, that is, when
%encountering a character with category code~5,
%end of line, in state~{\italic N}.
%It is good to realize when exactly this happens:
%since \TeX\ leaves state~{\italic N}
%when it encounters any token but a space,
%a~line giving a \cs{par} can only contain characters
%of category~10. In particular, it cannot end with a comment
%character. Quite often this fact is used the other way around:
%if an empty line is wanted for the layout of the input
%one can put a comment sign on that line.
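\TeX\ inserts a \csterm par\par\ token into the input after
an {\em empty line}\index{line!empty}, that is, when it encounters
a character with category code~5, end of line, in state~{\italic N}.
It is good to realize when exactly this happens:
since \TeX\ leaves state~{\italic N} when it encounters any token but a space,
a~line giving a \cs{par} can only contain characters of category~10.
In particular, it cannot end with a comment character.
Quite often this fact is used the other way around:
if an empty line is wanted for the layout of the input,
one can put a comment sign on that line so that no \cs{par} results.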
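
For instance, in the following input sketch the line holding only a comment sign
does not end the paragraph, whereas the truly empty line does:
\begin{verbatim}
First sentence of a paragraph.
%
Second sentence of the same paragraph.

This text starts a new paragraph.
\end{verbatim}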
%Two consecutive empty lines generate two \cs{par} tokens.
%For all practical purposes this is equivalent to one \cs{par},
%because after the first one \TeX\ enters vertical mode, and
%in vertical mode a \cs{par} only
%exercises the page builder,
%and clears the paragraph shape parameters.
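Two consecutive empty lines generate two \cs{par} tokens.
For all practical purposes this is equivalent to one \cs{par},
because after the first one \TeX\ enters vertical mode,
and in vertical mode a \cs{par} only exercises the page builder
and clears the paragraph shape parameters.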
%A \cs{par} is also inserted into the input when \TeX\ sees a
%\gram{vertical command} in unrestricted horizontal mode.
%After the \cs{par} has been read and expanded, the
%vertical command is examined anew (see Chapters~\ref{hvmode}
%and~\ref{par:end}).
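A \cs{par} is also inserted into the input when \TeX\ sees a
\gram{vertical command} in unrestricted horizontal mode.
After the \cs{par} has been read and expanded, the vertical command
is examined anew (see Chapters~\ref{hvmode} and~\ref{par:end}).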
%The \cs{par} token may also be inserted by the \cs{end}
%command that finishes off the run of \TeX; see Chapter~\ref{output}.
The \cs{par} token may also be inserted by the \cs{end} command
that finishes off the run of \TeX; see Chapter~\ref{output}.
%It is important to realize that \TeX\ does what it normally does
%when encountering an empty line
%(which is ending a paragraph)
%only because of the default definition of the \cs{par} token.
%By redefining \cs{par} the behaviour
%caused by empty lines and vertical commands can be changed completely,
%and interesting special effects can be achieved.
%In order to continue to be able to cause the actions normally
%associated with \cs{par}, the synonym \cs{endgraf} is
%available in the plain format. See further Chapter~\ref{par:end}.
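It is important to realize that \TeX\ does what it normally does when encountering an empty line
(which is ending a paragraph) only because of the default definition of the \cs{par} token.
By redefining \cs{par} the behaviour caused by empty lines and vertical commands
can be changed completely, and interesting special effects can be achieved.
In order to continue to be able to cause the actions normally associated with \cs{par},
the synonym \cs{endgraf} is available in the plain format. See further Chapter~\ref{par:end}.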
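
As a hedged illustration, assuming the plain format, the following sketch redefines
\cs{par} locally so that every paragraph end is also reported on the terminal;
the wording of the message is arbitrary.
\begin{verbatim}
\begingroup
% \endgraf still carries the original meaning of \par
\def\par{\endgraf\message{[paragraph ended]}}
First paragraph.

Second paragraph.

\endgroup
\end{verbatim}
Because the \cs{par} tokens that \TeX\ inserts for empty lines take their current meaning,
both empty lines above trigger the message.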
%The \cs{par} token is not allowed to be part of a macro
%argument, unless the macro has been declared to be \cs{long}.
%A \cs{par} in the argument of a non-\cs{long} macro
%prompts \TeX\ to give a `runaway argument' message.
%Control sequences that have been \cs{let} to \cs{par}
%(such as \cs{endgraf}) are allowed, however.