<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Coding guidelines — PyCogent 1.9 documentation</title>
<link rel="stylesheet" href="_static/agogo.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
VERSION: '1.9',
COLLAPSE_INDEX: false,
FILE_SUFFIX: '.html',
HAS_SOURCE: true
};
</script>
<script type="text/javascript" src="_static/jquery.js"></script>
<script type="text/javascript" src="_static/underscore.js"></script>
<script type="text/javascript" src="_static/doctools.js"></script>
<link rel="top" title="PyCogent 1.9 documentation" href="index.html" />
<link rel="next" title="The data files used in the documentation" href="data_file_links.html" />
<link rel="prev" title="The Readme" href="README.html" />
<script type="text/javascript" src="http://www.google.com/jsapi?key=ABQIAAAAbW_pA971hrPgosv-Msv7hRQZ4X-jPDmWcshBrz2j7-fJvuUABRTGWmdiw2G89JpgztGlFGG8hDxRAw"></script>
<script type="text/javascript" src="_static/google_feed.js"></script>
</head>
<body role="document">
<div class="header-wrapper" role="banner">
<div class="header">
<div class="headertitle"><a
href="index.html">PyCogent 1.9 documentation</a></div>
<div class="rel" role="navigation" aria-label="related navigation">
<a href="README.html" title="The Readme"
accesskey="P">previous</a> |
<a href="data_file_links.html" title="The data files used in the documentation"
accesskey="N">next</a> |
<a href="genindex.html" title="General Index"
accesskey="I">index</a>
</div>
</div>
</div>
<div class="content-wrapper">
<div class="content">
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<div class="section" id="coding-guidelines">
<span id="id1"></span><h1>Coding guidelines<a class="headerlink" href="#coding-guidelines" title="Permalink to this headline">¶</a></h1>
<p>As a project grows, consistency becomes more important. Unit testing and a consistent style are critical to producing code that others can trust and integrate. A consistent style also means that guesses about names and interfaces will be correct more often.</p>
<div class="section" id="what-should-i-call-my-variables">
<h2>What should I call my variables?<a class="headerlink" href="#what-should-i-call-my-variables" title="Permalink to this headline">¶</a></h2>
<ul class="simple">
<li><em>Choose the name that people will most likely guess.</em> Make it descriptive, but not too long: <code class="docutils literal"><span class="pre">curr_record</span></code> is better than <code class="docutils literal"><span class="pre">c</span></code>, or <code class="docutils literal"><span class="pre">curr</span></code>, or <code class="docutils literal"><span class="pre">current_genbank_record_from_database</span></code>.</li>
<li><em>Good names are hard to find.</em> Don’t be afraid to change names except when they are part of interfaces that other people are also using. It may take some time working with the code to come up with reasonable names for everything: if you have unit tests, it’s easy to change them, especially with global search and replace.</li>
<li><em>Use singular names for individual things, plural names for collections.</em> For example, you’d expect <code class="docutils literal"><span class="pre">self.Name</span></code> to hold something like a single string, but <code class="docutils literal"><span class="pre">self.Names</span></code> to hold something that you could loop through like a list or dict. Sometimes the decision can be tricky: is <code class="docutils literal"><span class="pre">self.Index</span></code> an int holding a position, or a dict holding records keyed by name for easy lookup? If you find yourself wondering these things, the name should probably be changed to avoid the problem: try <code class="docutils literal"><span class="pre">self.Position</span></code> or <code class="docutils literal"><span class="pre">self.LookUp</span></code>.</li>
<li><em>Don’t make the type part of the name.</em> You might want to change the implementation later. Use <code class="docutils literal"><span class="pre">Records</span></code> rather than <code class="docutils literal"><span class="pre">RecordDict</span></code> or <code class="docutils literal"><span class="pre">RecordList</span></code>, etc. Don’t use Hungarian Notation either (i.e. where you prefix the name with the type).</li>
<li><em>Make the name as precise as possible.</em> If the variable is the name of the input file, call it <code class="docutils literal"><span class="pre">infile_name</span></code>, not <code class="docutils literal"><span class="pre">input</span></code> or <code class="docutils literal"><span class="pre">file</span></code> (which you shouldn’t use anyway, since they shadow Python built-ins), and not <code class="docutils literal"><span class="pre">infile</span></code> (because that looks like it should be a file object, not just its name).</li>
<li><em>Use</em> <code class="docutils literal"><span class="pre">result</span></code> <em>to store the value that will be returned from a method or function.</em> Use <code class="docutils literal"><span class="pre">data</span></code> for input in cases where the function or method acts on arbitrary data (e.g. sequence data, or a list of numbers, etc.) unless a more descriptive name is appropriate.</li>
<li><em>One-letter variable names should only occur in math functions or as loop iterators with limited scope.</em> Limited scope covers things like <code class="docutils literal"><span class="pre">for</span> <span class="pre">k</span> <span class="pre">in</span> <span class="pre">keys:</span> <span class="pre">print</span> <span class="pre">k</span></code>, where <code class="docutils literal"><span class="pre">k</span></code> survives only a line or two. Loop iterators should refer to the variable that they’re looping through: <code class="docutils literal"><span class="pre">for</span> <span class="pre">k</span> <span class="pre">in</span> <span class="pre">keys,</span> <span class="pre">i</span> <span class="pre">in</span> <span class="pre">items</span></code>, or <code class="docutils literal"><span class="pre">for</span> <span class="pre">key</span> <span class="pre">in</span> <span class="pre">keys,</span> <span class="pre">item</span> <span class="pre">in</span> <span class="pre">items</span></code>. If the loop is long or there are several 1-letter variables active in the same scope, rename them.</li>
<li><em>Limit your use of abbreviations.</em> A few well-known abbreviations are OK, but you don’t want to come back to your code in 6 months and have to figure out what <code class="docutils literal"><span class="pre">sptxck2</span></code> is. It’s worth it to spend the extra time typing <code class="docutils literal"><span class="pre">species_taxon_check_2</span></code>, but that’s still a horrible name: what’s check number 1? Far better to go with something like <code class="docutils literal"><span class="pre">taxon_is_species_rank</span></code> that needs no explanation, especially if the variable is only used once or twice.</li>
</ul>
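<p>The rules above can be sketched in a few lines. This is a hypothetical illustration only: <code class="docutils literal"><span class="pre">find_long_seqs</span></code>, <code class="docutils literal"><span class="pre">MIN_SEQ_LENGTH</span></code> and the file name are invented for the example and are not part of PyCogent.</p>

```python
# Hypothetical example: every name below is invented for illustration.
MIN_SEQ_LENGTH = 3  # a named constant rather than an unexplained number


def find_long_seqs(seqs, min_length=MIN_SEQ_LENGTH):
    """Returns the seqs that contain at least min_length characters."""
    result = []  # 'result' holds the value that will be returned
    for seq in seqs:  # singular iterator named after the plural collection
        if len(seq) >= min_length:
            result.append(seq)
    return result


infile_name = 'example.fasta'  # a file name, so not 'infile' (a file object)
long_seqs = find_long_seqs(['AC', 'ACGU', 'GGCAU'])
```

<p>Here <code class="docutils literal"><span class="pre">seqs</span></code> is plural because it is a collection, <code class="docutils literal"><span class="pre">seq</span></code> is singular because it is one item, and no name encodes its type.</p>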
<div class="section" id="acceptable-abbreviations">
<h3>Acceptable abbreviations<a class="headerlink" href="#acceptable-abbreviations" title="Permalink to this headline">¶</a></h3>
<p>The following abbreviations can be considered well-known and used with impunity within mixed name variables, but some should not be used by themselves because they would conflict with common functions or Python built-ins, or raise an exception. Do not use the following by themselves as variable names: <code class="docutils literal"><span class="pre">dir</span></code>, <code class="docutils literal"><span class="pre">exp</span></code> (a common <code class="docutils literal"><span class="pre">math</span></code> module function), <code class="docutils literal"><span class="pre">in</span></code>, <code class="docutils literal"><span class="pre">max</span></code>, and <code class="docutils literal"><span class="pre">min</span></code>. They can, however, be used as part of a name, e.g. <code class="docutils literal"><span class="pre">matrix_exp</span></code>.</p>
<table border="1" class="docutils">
<colgroup>
<col width="59%" />
<col width="41%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">Full</th>
<th class="head">Abbreviated</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>alignment</td>
<td>aln</td>
</tr>
<tr class="row-odd"><td>archaeal</td>
<td>arch</td>
</tr>
<tr class="row-even"><td>auxiliary</td>
<td>aux</td>
</tr>
<tr class="row-odd"><td>bacterial</td>
<td>bact</td>
</tr>
<tr class="row-even"><td>citation</td>
<td>cite</td>
</tr>
<tr class="row-odd"><td>current</td>
<td>curr</td>
</tr>
<tr class="row-even"><td>database</td>
<td>db</td>
</tr>
<tr class="row-odd"><td>dictionary</td>
<td>dict</td>
</tr>
<tr class="row-even"><td>directory</td>
<td>dir</td>
</tr>
<tr class="row-odd"><td>end of file</td>
<td>eof</td>
</tr>
<tr class="row-even"><td>eukaryotic</td>
<td>euk</td>
</tr>
<tr class="row-odd"><td>frequency</td>
<td>freq</td>
</tr>
<tr class="row-even"><td>expected</td>
<td>exp</td>
</tr>
<tr class="row-odd"><td>index</td>
<td>idx</td>
</tr>
<tr class="row-even"><td>input</td>
<td>in</td>
</tr>
<tr class="row-odd"><td>maximum</td>
<td>max</td>
</tr>
<tr class="row-even"><td>minimum</td>
<td>min</td>
</tr>
<tr class="row-odd"><td>mitochondrial</td>
<td>mt</td>
</tr>
<tr class="row-even"><td>number</td>
<td>num</td>
</tr>
<tr class="row-odd"><td>observed</td>
<td>obs</td>
</tr>
<tr class="row-even"><td>original</td>
<td>orig</td>
</tr>
<tr class="row-odd"><td>output</td>
<td>out</td>
</tr>
<tr class="row-even"><td>parameter</td>
<td>param</td>
</tr>
<tr class="row-odd"><td>phylogeny</td>
<td>phylo</td>
</tr>
<tr class="row-even"><td>previous</td>
<td>prev</td>
</tr>
<tr class="row-odd"><td>probability</td>
<td>prob</td>
</tr>
<tr class="row-even"><td>protein</td>
<td>prot</td>
</tr>
<tr class="row-odd"><td>record</td>
<td>rec</td>
</tr>
<tr class="row-even"><td>reference</td>
<td>ref</td>
</tr>
<tr class="row-odd"><td>sequence</td>
<td>seq</td>
</tr>
<tr class="row-even"><td>standard deviation</td>
<td>stdev</td>
</tr>
<tr class="row-odd"><td>statistics</td>
<td>stats</td>
</tr>
<tr class="row-even"><td>string</td>
<td>str</td>
</tr>
<tr class="row-odd"><td>structure</td>
<td>struct</td>
</tr>
<tr class="row-even"><td>temporary</td>
<td>temp</td>
</tr>
<tr class="row-odd"><td>taxonomic</td>
<td>tax</td>
</tr>
<tr class="row-even"><td>variance</td>
<td>var</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="what-are-the-naming-conventions">
<h2>What are the naming conventions?<a class="headerlink" href="#what-are-the-naming-conventions" title="Permalink to this headline">¶</a></h2>
<table border="1" class="docutils">
<colgroup>
<col width="33%" />
<col width="33%" />
<col width="33%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">Type</th>
<th class="head">Convention</th>
<th class="head">Example</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>function</td>
<td><code class="docutils literal"><span class="pre">verb_with_underscores</span></code></td>
<td><code class="docutils literal"><span class="pre">find_all</span></code></td>
</tr>
<tr class="row-odd"><td>variable</td>
<td><code class="docutils literal"><span class="pre">noun_with_underscores</span></code></td>
<td><code class="docutils literal"><span class="pre">curr_index</span></code></td>
</tr>
<tr class="row-even"><td>constant</td>
<td><code class="docutils literal"><span class="pre">NOUN_ALL_CAPS</span></code></td>
<td><code class="docutils literal"><span class="pre">ALLOWED_RNA_PAIRS</span></code></td>
</tr>
<tr class="row-odd"><td>class</td>
<td><code class="docutils literal"><span class="pre">MixedCaseNoun</span></code></td>
<td><code class="docutils literal"><span class="pre">RnaSequence</span></code></td>
</tr>
<tr class="row-even"><td>public property</td>
<td><code class="docutils literal"><span class="pre">MixedCaseNoun</span></code></td>
<td><code class="docutils literal"><span class="pre">IsPaired</span></code></td>
</tr>
<tr class="row-odd"><td>private property</td>
<td><code class="docutils literal"><span class="pre">_noun_with_leading_underscore</span></code></td>
<td><code class="docutils literal"><span class="pre">_is_updated</span></code></td>
</tr>
<tr class="row-even"><td>public method</td>
<td><code class="docutils literal"><span class="pre">mixedCaseExceptFirstWordVerb</span></code></td>
<td><code class="docutils literal"><span class="pre">stripDegenerate</span></code></td>
</tr>
<tr class="row-odd"><td>private method</td>
<td><code class="docutils literal"><span class="pre">_verb_with_leading_underscore</span></code></td>
<td><code class="docutils literal"><span class="pre">_check_if_paired</span></code></td>
</tr>
<tr class="row-even"><td>really private data</td>
<td><code class="docutils literal"><span class="pre">__two_leading_underscores</span></code></td>
<td><code class="docutils literal"><span class="pre">__delegator_object_ref</span></code></td>
</tr>
<tr class="row-odd"><td>parameters that match properties</td>
<td><code class="docutils literal"><span class="pre">SameAsProperty</span></code></td>
<td><code class="docutils literal"><span class="pre">def</span> <span class="pre">__init__(self,</span> <span class="pre">data,</span> <span class="pre">Alphabet=None)</span></code></td>
</tr>
<tr class="row-even"><td>factory function</td>
<td><code class="docutils literal"><span class="pre">MixedCase</span></code></td>
<td><code class="docutils literal"><span class="pre">InverseDict</span></code></td>
</tr>
<tr class="row-odd"><td>module</td>
<td><code class="docutils literal"><span class="pre">lowercase_with_underscores</span></code></td>
<td><code class="docutils literal"><span class="pre">unit_test</span></code></td>
</tr>
<tr class="row-even"><td>global variables</td>
<td><code class="docutils literal"><span class="pre">gMixedCaseWithLeadingG</span></code></td>
<td>no examples - should be rare!</td>
</tr>
</tbody>
</table>
<ul class="simple">
<li><em>It is important to follow the naming conventions because they make it much easier to guess what a name refers to</em>. In particular, it should be easy to guess what scope a name is defined in, what it refers to, whether it’s OK to change its value, and whether its referent is callable. The following rules provide these distinctions.</li>
<li><code class="docutils literal"><span class="pre">lowercase_with_underscores</span></code> <em>for modules and internal variables (including function/method parameters).</em></li>
<li><code class="docutils literal"><span class="pre">MixedCase</span></code> for <em>classes</em> and <em>public properties</em>, and for <em>factory functions</em> that act like additional constructors for a class.</li>
<li><code class="docutils literal"><span class="pre">mixedCaseExceptFirstWord</span></code> for <em>public methods and functions</em>.</li>
<li><code class="docutils literal"><span class="pre">_lowercase_with_leading_underscore</span></code> for <em>private</em> functions, methods, and properties.</li>
<li><code class="docutils literal"><span class="pre">__lowercase_with_two_leading_underscores</span></code> for <em>private</em> properties and functions that <em>must not be overwritten</em> by a subclass.</li>
<li><code class="docutils literal"><span class="pre">CAPS_WITH_UNDERSCORES</span></code> for named <em>constants</em>.</li>
<li><code class="docutils literal"><span class="pre">gMixedCase</span></code> (i.e. mixed case prefixed with ‘g’) for <em>globals</em>. Globals should be used extremely rarely and with caution, even if you sneak them in using the Singleton pattern or some similar system.</li>
<li><em>Underscores can be left out if the words read OK run together.</em> <code class="docutils literal"><span class="pre">infile</span></code> and <code class="docutils literal"><span class="pre">outfile</span></code> rather than <code class="docutils literal"><span class="pre">in_file</span></code> and <code class="docutils literal"><span class="pre">out_file</span></code>; <code class="docutils literal"><span class="pre">infile_name</span></code> and <code class="docutils literal"><span class="pre">outfile_name</span></code> rather than <code class="docutils literal"><span class="pre">in_file_name</span></code> and <code class="docutils literal"><span class="pre">out_file_name</span></code> or <code class="docutils literal"><span class="pre">infilename</span></code> and <code class="docutils literal"><span class="pre">outfilename</span></code> (getting too long to read effortlessly).</li>
</ul>
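<p>As a rough sketch, the conventions in the table might look like this when applied together. The class, its methods and the helper function are hypothetical and written only to illustrate the naming rules; they are not taken from the PyCogent source.</p>

```python
# Hypothetical sketch: the names are invented to illustrate the conventions
# above and are not taken from the PyCogent source.
ALLOWED_RNA_CHARS = 'ACGU'  # constant: CAPS_WITH_UNDERSCORES


class RnaSequence(object):  # class: MixedCaseNoun
    """Holds an RNA sequence and its label."""

    def __init__(self, data, Name=''):  # parameter matches the property
        self.Name = Name  # public property: MixedCaseNoun
        self._data = data.upper()  # private property: leading underscore

    def _check_chars(self):  # private method: _verb_with_leading_underscore
        """Returns True if every character is an allowed RNA character."""
        for char in self._data:
            if char not in ALLOWED_RNA_CHARS:
                return False
        return True

    def isValid(self):  # public method: mixedCaseExceptFirstWordVerb
        """Returns True if the sequence contains only allowed characters."""
        return self._check_chars()


def find_valid(seqs):  # function: verb_with_underscores
    """Returns the items in seqs for which isValid() is True."""
    result = [seq for seq in seqs if seq.isValid()]
    return result


curr_seqs = [RnaSequence('acgu', Name='ok'), RnaSequence('ACGT', Name='dna')]
valid_names = [seq.Name for seq in find_valid(curr_seqs)]
```

<p>A reader who knows the conventions can tell from the names alone that <code class="docutils literal"><span class="pre">Name</span></code> is public data, <code class="docutils literal"><span class="pre">_data</span></code> is private, and <code class="docutils literal"><span class="pre">isValid</span></code> is a public method.</p>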
</div>
<div class="section" id="how-do-i-organize-my-modules-source-files">
<h2>How do I organize my modules (source files)?<a class="headerlink" href="#how-do-i-organize-my-modules-source-files" title="Permalink to this headline">¶</a></h2>
<ul class="simple">
<li><em>Have a docstring with a description of the module’s functions</em>. If the description is long, the first line should be a short summary that makes sense on its own, separated from the rest by a blank line.</li>
<li><em>All code, including import statements, should follow the docstring.</em> Otherwise, the docstring will not be recognized by the interpreter, and you will not have access to it in interactive sessions (i.e. through <code class="docutils literal"><span class="pre">obj.__doc__</span></code>) or when generating documentation with automated tools.</li>
<li><em>Import built-in modules first, followed by third-party modules, followed by any changes to the path and your own modules.</em> Especially, additions to the path and names of your modules are likely to change rapidly: keeping them in one place makes them easier to find.</li>
<li><em>Don’t use</em> <code class="docutils literal"><span class="pre">from</span> <span class="pre">module</span> <span class="pre">import</span> <span class="pre">*</span></code>, <em>instead use</em> <code class="docutils literal"><span class="pre">from</span> <span class="pre">module</span> <span class="pre">import</span> <span class="pre">Name,</span> <span class="pre">Name2,</span> <span class="pre">Name3...</span></code> <em>or possibly</em> <code class="docutils literal"><span class="pre">import</span> <span class="pre">module</span></code>. This makes it <em>much</em> easier to see name collisions and to replace implementations.</li>
</ul>
<div class="section" id="example-of-module-structure">
<h3>Example of module structure<a class="headerlink" href="#example-of-module-structure" title="Permalink to this headline">¶</a></h3>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="ch">#!/usr/bin/env python</span>
<span class="sd">"""Provides NumberList and FrequencyDistribution, classes for statistics.</span>

<span class="sd">NumberList holds a sequence of numbers, and defines several statistical</span>
<span class="sd">operations (mean, stdev, etc.). FrequencyDistribution holds a mapping from</span>
<span class="sd">items (not necessarily numbers) to counts, and defines operations such as</span>
<span class="sd">Shannon entropy and frequency normalization.</span>
<span class="sd">"""</span>
<span class="kn">from</span> <span class="nn">math</span> <span class="k">import</span> <span class="n">sqrt</span><span class="p">,</span> <span class="n">log</span><span class="p">,</span> <span class="n">e</span>
<span class="kn">from</span> <span class="nn">random</span> <span class="k">import</span> <span class="n">choice</span><span class="p">,</span> <span class="n">random</span>
<span class="kn">from</span> <span class="nn">Utils</span> <span class="k">import</span> <span class="n">indices</span>


<span class="k">class</span> <span class="nc">NumberList</span><span class="p">(</span><span class="nb">list</span><span class="p">):</span>
    <span class="k">pass</span> <span class="c1"># much code deleted</span>


<span class="k">class</span> <span class="nc">FrequencyDistribution</span><span class="p">(</span><span class="nb">dict</span><span class="p">):</span>
    <span class="k">pass</span> <span class="c1"># much code deleted</span>


<span class="c1"># use the following when the module can meaningfully be called as a script.</span>
<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s1">'__main__'</span><span class="p">:</span> <span class="c1"># code to execute if called from command-line</span>
    <span class="k">pass</span> <span class="c1"># do nothing - code deleted</span>
</pre></div>
</div>
</div>
</div>
<div class="section" id="how-should-i-write-comments">
<h2>How should I write comments?<a class="headerlink" href="#how-should-i-write-comments" title="Permalink to this headline">¶</a></h2>
<ul>
<li><p class="first"><em>Always update the comments when the code changes.</em> Incorrect comments are far worse than no comments, since they are actively misleading.</p>
</li>
<li><p class="first"><em>Comments should say more than the code itself.</em> Examine your comments carefully: they may indicate that you’d be better off rewriting your code (especially, <em>renaming your variables</em> and getting rid of the comment.) In particular, don’t scatter magic numbers and other constants that have to be explained through your code. It’s far better to use variables whose names are self-documenting, especially if you use the same constant more than once. Also, think about making constants into class or instance data, since it’s all too common for ‘constants’ to need to change or to be needed in several methods.</p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="10%" />
<col width="90%" />
</colgroup>
<tbody valign="top">
<tr class="row-odd"><td>Wrong</td>
<td><code class="docutils literal"><span class="pre">win_size</span> <span class="pre">-=</span> <span class="pre">20</span> <span class="pre">#</span> <span class="pre">decrement</span> <span class="pre">win_size</span> <span class="pre">by</span> <span class="pre">20</span></code></td>
</tr>
<tr class="row-even"><td>OK</td>
<td><code class="docutils literal"><span class="pre">win_size</span> <span class="pre">-=</span> <span class="pre">20</span> <span class="pre">#</span> <span class="pre">leave</span> <span class="pre">space</span> <span class="pre">for</span> <span class="pre">the</span> <span class="pre">scroll</span> <span class="pre">bar</span></code></td>
</tr>
<tr class="row-odd"><td>Right</td>
<td><code class="docutils literal"><span class="pre">self._scroll_bar_size</span> <span class="pre">=</span> <span class="pre">20</span></code></td>
</tr>
<tr class="row-even"><td> </td>
<td><code class="docutils literal"><span class="pre">win_size</span> <span class="pre">-=</span> <span class="pre">self._scroll_bar_size</span></code></td>
</tr>
</tbody>
</table>
</div></blockquote>
</li>
<li><p class="first"><em>Use comments starting with #, not strings, inside blocks of code.</em> Python ignores real comments, but must allocate storage for strings (which can be a performance disaster inside an inner loop).</p>
</li>
<li><p class="first"><em>Start each method, class and function with a docstring using triple double quotes (""").</em> The docstring should start with a 1-line description that makes sense by itself (many automated formatting tools and IDEs use this). This should be followed by a blank line, followed by descriptions of the parameters (if any). Finally, add any more detailed information, such as a longer description, notes about the algorithm, detailed notes about the parameters, etc. If there is a usage example, it should appear at the end. Make sure any descriptions of parameters have the correct spelling, case, etc. For example:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">''</span><span class="p">,</span> <span class="n">alphabet</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
    <span class="sd">"""Returns new Sequence object with specified data, name, alphabet.</span>

<span class="sd">    Arguments:</span>

<span class="sd">        - data: The sequence data. Should be a sequence of characters.</span>
<span class="sd">        - name: Arbitrary label for the sequence. Should be string-like.</span>
<span class="sd">        - alphabet: Set of allowed characters. Should support 'for x in y'</span>
<span class="sd">          syntax. None by default.</span>

<span class="sd">    Note: if alphabet is None, performs no validation.</span>
<span class="sd">    """</span>
</pre></div>
</div>
</li>
<li><p class="first"><em>Always update the docstring when the code changes.</em> Like outdated comments, outdated docstrings can waste a lot of time. “Correct examples are priceless, but incorrect examples are worse than worthless.” <a class="reference external" href="http://www.python.org/pycon/dc2004/papers/4/PyCon2004DocTestUnit.pdf">Jim Fulton</a>.</p>
</li>
</ul>
</div>
<div class="section" id="how-should-i-format-my-code">
<h2>How should I format my code?<a class="headerlink" href="#how-should-i-format-my-code" title="Permalink to this headline">¶</a></h2>
<ul>
<li><p class="first"><em>Use 4 spaces for indentation.</em> Do not use tabs (set your editor to convert tabs to spaces). The behaviour of tabs is not predictable across platforms, and mixing tabs with spaces can cause syntax errors. If we all use the same indentation, collaboration is much easier.</p>
</li>
<li><p class="first"><em>Lines should not be longer than 79 characters.</em> Long lines are inconvenient in some editors. Use \ for line continuation. Note that there cannot be whitespace after the \.</p>
</li>
<li><p class="first"><em>Blank lines should be used to highlight class and method definitions.</em> Separate class definitions by two blank lines. Separate methods by one blank line.</p>
</li>
<li><p class="first"><em>Be consistent with the use of whitespace around operators.</em> Inconsistent whitespace makes it harder to see at a glance what is grouped together.</p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="19%" />
<col width="81%" />
</colgroup>
<tbody valign="top">
<tr class="row-odd"><td>Good</td>
<td><code class="docutils literal"><span class="pre">((a+b)*(c+d))</span></code></td>
</tr>
<tr class="row-even"><td>OK</td>
<td><code class="docutils literal"><span class="pre">((a</span> <span class="pre">+</span> <span class="pre">b)</span> <span class="pre">*</span> <span class="pre">(c</span> <span class="pre">+</span> <span class="pre">d))</span></code></td>
</tr>
<tr class="row-odd"><td>Bad</td>
<td><code class="docutils literal"><span class="pre">(</span> <span class="pre">(a+</span> <span class="pre">b)</span> <span class="pre">*(c</span> <span class="pre">+d</span> <span class="pre">))</span></code></td>
</tr>
</tbody>
</table>
</div></blockquote>
</li>
<li><p class="first"><em>Don’t put whitespace after delimiters or inside slicing delimiters.</em> Whitespace here makes it harder to see what’s associated.</p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="16%" />
<col width="35%" />
<col width="49%" />
</colgroup>
<tbody valign="top">
<tr class="row-odd"><td>Good</td>
<td><code class="docutils literal"><span class="pre">(a+b)</span></code></td>
<td><code class="docutils literal"><span class="pre">d[k]</span></code></td>
</tr>
<tr class="row-even"><td>Bad</td>
<td><code class="docutils literal"><span class="pre">(</span> <span class="pre">a+b</span> <span class="pre">)</span></code></td>
<td><code class="docutils literal"><span class="pre">d</span> <span class="pre">[k],</span> <span class="pre">d[</span> <span class="pre">k]</span></code></td>
</tr>
</tbody>
</table>
</div></blockquote>
</li>
</ul>
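<p>A minimal sketch of these formatting rules, assuming a hypothetical <code class="docutils literal"><span class="pre">WindowSizer</span></code> class and <code class="docutils literal"><span class="pre">sum_sizes</span></code> function invented for the example: 4-space indentation, one blank line between methods, two blank lines between top-level definitions, and a backslash continuation for a statement that would otherwise run past 79 characters.</p>

```python
# Hypothetical example; the class and function names are invented.
class WindowSizer(object):
    """Computes usable window sizes."""

    def __init__(self, scroll_bar_size=20):
        self._scroll_bar_size = scroll_bar_size

    def getUsableSize(self, win_size):
        """Returns win_size minus the space taken by the scroll bar."""
        return win_size - self._scroll_bar_size


def sum_sizes(first_window_size, second_window_size, third_window_size):
    """Returns the sum of three window sizes."""
    result = first_window_size + second_window_size + \
        third_window_size  # no whitespace may follow the backslash
    return result


total_size = sum_sizes(100, 200, 300)
usable_size = WindowSizer().getUsableSize(total_size)
```

<p>The statement in <code class="docutils literal"><span class="pre">sum_sizes</span></code> would fit on one line here; it is split only to show the continuation syntax.</p>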
</div>
<div class="section" id="how-should-i-test-my-code">
<h2>How should I test my code?<a class="headerlink" href="#how-should-i-test-my-code" title="Permalink to this headline">¶</a></h2>
<p>There are two basic approaches for testing code in python: unit testing and doc testing. Their purpose is the same, to check that execution of code given some input produces a specified output. The cases to which the two approaches lend themselves are different.</p>
<p>An excellent discourse on testing code and the pros and cons of these alternatives is provided in a presentation by <a class="reference external" href="http://www.python.org/pycon/dc2004/papers/4/PyCon2004DocTestUnit.pdf">Jim Fulton</a>, which is recommended reading. A significant change since that presentation is that <code class="docutils literal"><span class="pre">doctest</span></code> can now read content that is not contained within docstrings. A another comparison of these two approaches, along with a third (<code class="docutils literal"><span class="pre">py.test</span></code>) is also <a class="reference external" href="http://agiletesting.blogspot.com/2005/11/articles-and-tutorials-page-updated.html">available</a>. To see examples of both styles of testing look in <code class="docutils literal"><span class="pre">PyCogent/tests</span></code>: files ending in .rst are using <code class="docutils literal"><span class="pre">doctest</span></code>, those ending in .py are using <code class="docutils literal"><span class="pre">unittest</span></code>.</p>
<p>In general, it’s easier to start by writing <code class="docutils literal"><span class="pre">doctest</span></code>s, as you don’t need to learn the <code class="docutils literal"><span class="pre">unittest</span></code> API, but the latter gives much greater control.</p>
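<p>As a rough illustration of the difference, the sketch below tests one small function both ways. The function <code class="docutils literal"><span class="pre">gc_content</span></code>, the class <code class="docutils literal"><span class="pre">GcContentTests</span></code> and the helper <code class="docutils literal"><span class="pre">run_both</span></code> are hypothetical names invented for this example (they are not part of PyCogent), and only the standard library <code class="docutils literal"><span class="pre">doctest</span></code> and <code class="docutils literal"><span class="pre">unittest</span></code> modules are assumed.</p>

```python
import doctest
import io
import unittest


def gc_content(seq):
    """Return the fraction of G and C characters in seq (hypothetical example).

    >>> gc_content("GGCC")
    1.0
    >>> gc_content("ATGC")
    0.5
    """
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / float(len(seq))


class GcContentTests(unittest.TestCase):
    """Tests of the gc_content function."""

    def test_gc_content_mixed(self):
        """gc_content should return 0.5 for a half-GC sequence"""
        # floats are compared approximately rather than with assertEqual
        self.assertAlmostEqual(gc_content("ATGC"), 0.5)

    def test_gc_content_empty(self):
        """gc_content should raise ValueError on empty input"""
        self.assertRaises(ValueError, gc_content, "")


def run_both():
    """Run both styles; return (doctest failures, doctest examples, unittest ok)."""
    # doctest style: the examples embedded in the docstring are the tests
    test = doctest.DocTestParser().get_doctest(
        gc_content.__doc__, {"gc_content": gc_content}, "gc_content", None, 0)
    doc_result = doctest.DocTestRunner(verbose=False).run(test)
    # unittest style: the tests live in a separate TestCase class
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GcContentTests)
    unit_result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return doc_result.failed, doc_result.attempted, unit_result.wasSuccessful()
```

<p>The doctest version doubles as usage documentation, while the <code class="docutils literal"><span class="pre">unittest</span></code> version makes it easier to check error handling and to share fixtures between tests.</p>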
<p>Whatever approach is employed, the general principle is that every line of code should be tested. It is critical that your code be fully tested before you draw conclusions from results it produces. For scientific work, bugs don’t just mean unhappy users whom you’ll never actually meet: they may mean retracted publications.</p>
<p>Tests are an opportunity to invent the interface(s) you want. Write the test for a method before you write the method: often, this helps you figure out what you would want to call it and what parameters it should take. It’s OK to write the tests a few methods at a time, and to change them as your ideas about the interface change. However, you shouldn’t change them once you’ve told other people what the interface is.</p>
<p>Never treat prototypes as production code. It’s fine to write prototype code without tests to try things out, but when you’ve figured out the algorithm and interfaces you must rewrite it <em>with tests</em> to consider it finished. Often, this helps you decide what interfaces and functionality you actually need and what you can get rid of.</p>
<p>“Code a little, test a little”. For production code, write a couple of tests, then a couple of methods, then a couple more tests, then a couple more methods, then maybe change some of the names or generalize some of the functionality. If you have a huge amount of code where ‘all you have to do is write the tests’, you’re probably closer to 30% done than 90%. Testing vastly reduces the time spent debugging, since whatever went wrong has to be in the code you wrote since the last test run. And remember to use Python’s interactive interpreter for quick checks of syntax and ideas.</p>
<p>Run the test suite when you change <em>anything</em>. Even if a change seems trivial, it will only take a couple of seconds to run the tests and then you’ll be sure. This can eliminate long and frustrating debugging sessions where the change turned out to have been made long ago, but didn’t seem significant at the time.</p>
<div class="section" id="some-unittest-pointers">
<h3>Some <code class="docutils literal"><span class="pre">unittest</span></code> pointers<a class="headerlink" href="#some-unittest-pointers" title="Permalink to this headline">¶</a></h3>
<ul>
<li><p class="first"><em>Use the</em> <code class="docutils literal"><span class="pre">unittest</span></code> <em>framework with tests in a separate file for each module.</em> Name the test file <code class="docutils literal"><span class="pre">test_module_name.py</span></code>. Keeping the tests separate from the code reduces the temptation to change the tests when the code doesn’t work, and makes it easy to verify that a completely new implementation presents the same interface (behaves the same) as the old.</p>
</li>
<li><p class="first"><em>Use</em> <code class="docutils literal"><span class="pre">cogent.util.unit_test</span></code> <em>if you are doing anything with floating point numbers or permutations</em> (use <code class="docutils literal"><span class="pre">assertFloatEqual</span></code>). Do <em>not</em> try to compare floating point numbers using <code class="docutils literal"><span class="pre">assertEqual</span></code> if you value your sanity. <code class="docutils literal"><span class="pre">assertFloatEqualAbs</span></code> and <code class="docutils literal"><span class="pre">assertFloatEqualRel</span></code> can specifically test for absolute and relative differences if the default behavior is not giving you what you want. Similarly, <code class="docutils literal"><span class="pre">assertEqualItems</span></code>, <code class="docutils literal"><span class="pre">assertSameItems</span></code>, etc. can be useful when testing permutations.</p>
</li>
<li><p class="first"><em>Test the interface of each class in your code by defining at least one</em> <code class="docutils literal"><span class="pre">TestCase</span></code> <em>with the name</em> <code class="docutils literal"><span class="pre">ClassNameTests</span></code>. This should contain tests for everything in the public interface.</p>
</li>
<li><p class="first"><em>If the class is complicated, you may want to define additional tests with names</em> <code class="docutils literal"><span class="pre">ClassNameTests_test_type</span></code>. These might subclass <code class="docutils literal"><span class="pre">ClassNameTests</span></code> in order to share <code class="docutils literal"><span class="pre">setUp</span></code> methods, etc.</p>
</li>
<li><p class="first"><em>Tests of private methods should be in a separate</em> <code class="docutils literal"><span class="pre">TestCase</span></code> <em>called</em> <code class="docutils literal"><span class="pre">ClassNameTests_private</span></code>. Private methods may change if you change the implementation. It is not required that test cases for private methods pass when you change things (that’s why they’re private, after all), though it is often useful to have these tests for debugging.</p>
</li>
<li><p class="first"><em>Test all the methods in your class.</em> You should assume that any method you haven’t tested has bugs. The convention for naming tests is <code class="docutils literal"><span class="pre">test_method_name</span></code>. Any leading and trailing underscores on the method name can be ignored for the purposes of the test; however, <em>all tests must start with the literal substring</em> <code class="docutils literal"><span class="pre">test</span></code> <em>for</em> <code class="docutils literal"><span class="pre">unittest</span></code> <em>to find them.</em> If the method is particularly complex, or has several discretely different cases you need to check, use <code class="docutils literal"><span class="pre">test_method_name_suffix</span></code>, e.g. <code class="docutils literal"><span class="pre">test_init_empty</span></code>, <code class="docutils literal"><span class="pre">test_init_single</span></code>, <code class="docutils literal"><span class="pre">test_init_wrong_type</span></code>, etc. for testing <code class="docutils literal"><span class="pre">__init__</span></code>.</p>
</li>
<li><p class="first"><em>Write good docstrings for all your test methods.</em> When you run the test with the <code class="docutils literal"><span class="pre">-v</span></code> command-line switch for verbose output, the docstring for each test will be printed along with <code class="docutils literal"><span class="pre">...</span> <span class="pre">ok</span></code>, <code class="docutils literal"><span class="pre">...</span> <span class="pre">FAIL</span></code> or <code class="docutils literal"><span class="pre">...</span> <span class="pre">ERROR</span></code> on a single line. It is thus important that your docstring is short and descriptive, and makes sense in this context.</p>
<blockquote>
<div><p><strong>Good docstrings:</strong></p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">NumberList</span><span class="o">.</span><span class="n">var</span> <span class="n">should</span> <span class="k">raise</span> <span class="ne">ValueError</span> <span class="n">on</span> <span class="n">empty</span> <span class="ow">or</span> <span class="mi">1</span><span class="o">-</span><span class="n">item</span> <span class="nb">list</span>
<span class="n">NumberList</span><span class="o">.</span><span class="n">var</span> <span class="n">should</span> <span class="n">match</span> <span class="n">values</span> <span class="kn">from</span> <span class="nn">R</span> <span class="k">if</span> <span class="nb">list</span> <span class="n">has</span> <span class="o">></span><span class="mi">2</span> <span class="n">items</span>
<span class="n">NumberList</span><span class="o">.</span><span class="n">__init__</span> <span class="n">should</span> <span class="k">raise</span> <span class="n">error</span> <span class="n">on</span> <span class="n">values</span> <span class="n">that</span> <span class="n">fail</span> <span class="nb">float</span><span class="p">()</span>
<span class="n">FrequencyDistribution</span><span class="o">.</span><span class="n">var</span> <span class="n">should</span> <span class="n">match</span> <span class="n">corresponding</span> <span class="n">NumberList</span> <span class="n">var</span>
</pre></div>
</div>
<p><strong>Bad docstrings:</strong></p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">var</span> <span class="n">should</span> <span class="n">calculate</span> <span class="n">variance</span> <span class="c1"># lacks class name, not descriptive</span>
<span class="n">Check</span> <span class="n">initialization</span> <span class="n">of</span> <span class="n">a</span> <span class="n">NumberList</span> <span class="c1"># doesn't say what's expected</span>
<span class="n">Tests</span> <span class="n">of</span> <span class="n">the</span> <span class="n">NumberList</span> <span class="n">initialization</span><span class="o">.</span> <span class="c1"># ditto</span>
</pre></div>
</div>
</div></blockquote>
</li>
<li><p class="first"><em>Module-level functions should be tested in their own</em> <code class="docutils literal"><span class="pre">TestCase</span></code><em>, called</em> <code class="docutils literal"><span class="pre">modulenameTests</span></code>. Even if these functions are simple, it’s important to check that they work as advertised.</p>
</li>
<li><p class="first"><em>It is much more important to test several small cases that you can check by hand than a single large case that requires a calculator.</em> Don’t trust spreadsheets for numerical calculations – use R instead!</p>
</li>
<li><p class="first"><em>Make sure you test all the edge cases: what happens when the input is None, or ‘’, or 0, or negative?</em> What happens at values that cause a conditional to go one way or the other? Does incorrect input raise the right exceptions? Can your code accept subclasses or superclasses of the types it expects? What happens with very large input?</p>
</li>
<li><p class="first"><em>To test permutations, check that the original and shuffled version are different, but that the sorted original and sorted shuffled version are the same.</em> Make sure that you get <em>different</em> permutations on repeated runs and when starting from different points.</p>
</li>
<li><p class="first"><em>To test random choices, figure out how many of each choice you expect in a large sample (say, 1000 or a million) using the binomial distribution or its normal approximation.</em> Run the test several times and check that you’re within, say, 3 standard deviations of the mean.</p>
</li>
</ul>
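<p>The pointers above on floating point comparisons, permutations and random choices can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the PyCogent implementation: it swaps in the standard <code class="docutils literal"><span class="pre">assertAlmostEqual</span></code> in place of <code class="docutils literal"><span class="pre">assertFloatEqual</span></code>, and the names <code class="docutils literal"><span class="pre">expected_range</span></code> and <code class="docutils literal"><span class="pre">RandomnessTests</span></code> are hypothetical.</p>

```python
import math
import random
import unittest


def expected_range(n, p, n_sd=3):
    """Return (low, high): n_sd standard deviations around the binomial mean n*p."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean - n_sd * sd, mean + n_sd * sd


class RandomnessTests(unittest.TestCase):
    """Sketches of float, permutation and random-choice tests."""

    def test_float_sum(self):
        """sum of ten 0.1 values should be close to, but not exactly, 1.0"""
        total = sum([0.1] * 10)
        self.assertNotEqual(total, 1.0)     # exact comparison is unreliable
        self.assertAlmostEqual(total, 1.0)  # compares to 7 decimal places

    def test_shuffle_permutes(self):
        """shuffle should reorder items without adding or dropping any"""
        original = list(range(100))
        first, second = original[:], original[:]
        random.shuffle(first)
        random.shuffle(second)
        self.assertNotEqual(first, original)   # order changed
        self.assertNotEqual(first, second)     # repeated runs differ
        self.assertEqual(sorted(first), sorted(original))  # same items

    def test_choice_frequency(self):
        """choice should pick each of 4 bases about n/4 times"""
        n = 10000
        low, high = expected_range(n, 0.25)
        picks = [random.choice("ACGT") for _ in range(n)]
        for base in "ACGT":
            # a rare failure is expected by chance: rerun before debugging
            self.assertTrue(low < picks.count(base) < high)
```

<p>With a 3 standard deviation window, <code class="docutils literal"><span class="pre">test_choice_frequency</span></code> will occasionally fail purely by chance, which is why the advice is to run such a test several times before concluding that something is wrong.</p>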
</div>
<div class="section" id="example-of-a-unittest-test-module-structure">
<h3>Example of a <code class="docutils literal"><span class="pre">unittest</span></code> test module structure<a class="headerlink" href="#example-of-a-unittest-test-module-structure" title="Permalink to this headline">¶</a></h3>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="ch">#!/usr/bin/env python</span>
<span class="sd">"""Tests NumberList and FrequencyDistribution, classes for statistics."""</span>
<span class="kn">from</span> <span class="nn">cogent.util.unit_test</span> <span class="k">import</span> <span class="n">TestCase</span><span class="p">,</span> <span class="n">main</span> <span class="c1"># for floating point test use unittestfp</span>
<span class="kn">from</span> <span class="nn">statistics</span> <span class="k">import</span> <span class="n">NumberList</span><span class="p">,</span> <span class="n">FrequencyDistribution</span>

<span class="k">class</span> <span class="nc">NumberListTests</span><span class="p">(</span><span class="n">TestCase</span><span class="p">):</span> <span class="c1"># remember to subclass TestCase</span>
    <span class="sd">"""Tests of the NumberList class."""</span>

    <span class="k">def</span> <span class="nf">setUp</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="sd">"""Define a few standard NumberLists."""</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">Null</span> <span class="o">=</span> <span class="n">NumberList</span><span class="p">()</span> <span class="c1"># test empty init</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">Empty</span> <span class="o">=</span> <span class="n">NumberList</span><span class="p">([])</span> <span class="c1"># test init with empty sequence</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">Single</span> <span class="o">=</span> <span class="n">NumberList</span><span class="p">([</span><span class="mi">5</span><span class="p">])</span> <span class="c1"># single item</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">Zero</span> <span class="o">=</span> <span class="n">NumberList</span><span class="p">([</span><span class="mi">0</span><span class="p">])</span> <span class="c1"># single, False item</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">Three</span> <span class="o">=</span> <span class="n">NumberList</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">])</span> <span class="c1"># multiple items</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">ZeroMean</span> <span class="o">=</span> <span class="n">NumberList</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span> <span class="c1"># items nonzero, mean zero</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">ZeroVar</span> <span class="o">=</span> <span class="n">NumberList</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">])</span> <span class="c1"># items nonzero, mean nonzero, variance zero</span>
        <span class="c1"># etc. These objects are shared by all tests, and are created anew each time a</span>
        <span class="c1"># method starting with the string 'test' is called (i.e. the same object does</span>
        <span class="c1"># not persist between tests: rather, you get separate copies).</span>

    <span class="k">def</span> <span class="nf">test_mean_empty</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="sd">"""NumberList.mean() should raise ValueError on empty object"""</span>
        <span class="k">for</span> <span class="n">empty</span> <span class="ow">in</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">Null</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">Empty</span><span class="p">):</span>
            <span class="bp">self</span><span class="o">.</span><span class="n">assertRaises</span><span class="p">(</span><span class="ne">ValueError</span><span class="p">,</span> <span class="n">empty</span><span class="o">.</span><span class="n">mean</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">test_mean_single</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="sd">"""NumberList.mean() should return item if only 1 item in list"""</span>
        <span class="k">for</span> <span class="n">single</span> <span class="ow">in</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">Single</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">Zero</span><span class="p">):</span>
            <span class="bp">self</span><span class="o">.</span><span class="n">assertEqual</span><span class="p">(</span><span class="n">single</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">single</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>

    <span class="c1"># other tests of mean</span>

    <span class="k">def</span> <span class="nf">test_var_failures</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="sd">"""NumberList.var() should raise ZeroDivisionError if &lt;2 items"""</span>
        <span class="k">for</span> <span class="n">small</span> <span class="ow">in</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">Null</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">Empty</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">Single</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">Zero</span><span class="p">):</span>
            <span class="bp">self</span><span class="o">.</span><span class="n">assertRaises</span><span class="p">(</span><span class="ne">ZeroDivisionError</span><span class="p">,</span> <span class="n">small</span><span class="o">.</span><span class="n">var</span><span class="p">)</span>

    <span class="c1"># other tests of var</span>
    <span class="c1"># tests of other methods</span>

<span class="k">class</span> <span class="nc">FrequencyDistributionTests</span><span class="p">(</span><span class="n">TestCase</span><span class="p">):</span>
    <span class="k">pass</span> <span class="c1"># much code deleted</span>

<span class="c1"># tests of other classes</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s1">'__main__'</span><span class="p">:</span> <span class="c1"># run tests if called from command-line</span>
    <span class="n">main</span><span class="p">()</span>
</pre></div>
</div>
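<p>To see why short, descriptive test docstrings matter, the sketch below runs a single test at verbosity level 2, which is what the <code class="docutils literal"><span class="pre">-v</span></code> switch selects, and captures the report as a string. It assumes only the standard library <code class="docutils literal"><span class="pre">unittest</span></code> module; <code class="docutils literal"><span class="pre">MeanTests</span></code> and <code class="docutils literal"><span class="pre">verbose_report</span></code> are hypothetical names used for illustration.</p>

```python
import io
import unittest


class MeanTests(unittest.TestCase):
    """Tests of a hypothetical mean calculation."""

    def test_mean_single(self):
        """mean should return the item if only 1 item in list"""
        values = [5]
        self.assertEqual(sum(values) / len(values), 5)


def verbose_report():
    """Run MeanTests at verbosity 2 and return the report text."""
    stream = io.StringIO()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeanTests)
    # verbosity=2 prints one status line per test, including its docstring
    unittest.TextTestRunner(stream=stream, verbosity=2).run(suite)
    return stream.getvalue()


if __name__ == "__main__":
    print(verbose_report())
```

<p>The report contains the first line of the test docstring followed by its status, so a docstring that names the class or function and the expected behavior reads as a complete statement. On Python 3 the test method name is also printed on the line before the docstring.</p>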
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="sidebar">
<div class="news">
<table id="feed"><tr><td><h3><a href="http://pycogent.wordpress.com/">PyCogent News and Announcements</a></h3></td>
</tr></table></div>
<h3>Table Of Contents</h3>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="install.html">Quick installation using pip</a></li>
<li class="toctree-l1"><a class="reference internal" href="README.html">The Readme</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">Coding guidelines</a></li>
<li class="toctree-l1"><a class="reference internal" href="data_file_links.html">The data files used in the documentation</a></li>
<li class="toctree-l1"><a class="reference internal" href="examples/index.html">Cogent Usage Examples</a></li>
<li class="toctree-l1"><a class="reference internal" href="cookbook/index.html">PyCogent Cookbook</a></li>
<li class="toctree-l1"><a class="reference internal" href="developer_notes.html">For Developers</a></li>
<li class="toctree-l1"><a class="reference internal" href="scripting_guidelines.html">Scripting guidelines</a></li>
<li class="toctree-l1"><a class="reference internal" href="licenses.html">Licenses and disclaimer</a></li>
<li class="toctree-l1"><a class="reference internal" href="ChangeLog.html">Changelog</a></li>
</ul>
<div role="search">
<h3 style="margin-top: 1.5em;">Search</h3>
<form class="search" action="search.html" method="get">
<input type="text" name="q" />
<input type="submit" value="Go" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div>
<div class="clearer"></div>
</div>
</div>
<div class="footer-wrapper">
<div class="footer">
<div class="left">
<div role="navigation" aria-label="related navigaton">
<a href="README.html" title="The Readme"
>previous</a> |
<a href="data_file_links.html" title="The data files used in the documentation"
>next</a> |
<a href="genindex.html" title="General Index"
>index</a>
</div>
<div role="note" aria-label="source link">
<br/>
<a href="_sources/coding_guidelines.txt"
rel="nofollow">Show Source</a>
</div>
</div>
<div class="right">
<div class="footer" role="contentinfo">
© Copyright 2016, PyCogent Team.
Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.4.1.
</div>
</div>
<div class="clearer"></div>
</div>
</div>
</body>
</html>