Skip to content

Commit 88643d0

Browse files
pedroerpfacebook-github-bot
authored andcommitted
Add CODING_STYLE.md
Summary: Add CODING_STYLE.md documentation. Reviewed By: kgpai Differential Revision: D30459374 fbshipit-source-id: 35fcaa49f644c48581f7cee107805260aea24389
1 parent 22032c6 commit 88643d0

File tree

1 file changed

+302
-0
lines changed

1 file changed

+302
-0
lines changed

CODING_STYLE.md

+302
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,302 @@
1+
# Velox Coding Style and Best Practices
2+
3+
The goal of this document is to make Velox developers more productive by
4+
promoting consistency within the codebase, and encouraging common practices
5+
which will make the codebase easier to read, edit, maintain and debug in the
6+
future.
7+
8+
## Cpp Style
9+
10+
Many aspects of cpp style will be covered by clang-format, such as spacing,
11+
line width, indentation and ordering (for includes, using directives and etc). 
12+
13+
* Always ensure your code is clang-format compatible.
14+
* If you’re working on a legacy file which is not clang-format compliant yet,
15+
refrain from formatting the entire file in the same PR as it pollutes your
16+
original PR and makes it harder to review.
17+
* Submit a separate diff/PR with the format-only changes.
18+
19+
## Naming Conventions
20+
21+
* Use **PascalCase** for types (classes, structs, enums, type aliases, type
22+
template parameters) and file names.
23+
* Use **camelCase** for functions, member and local variables, and non-type
24+
template parameters.
25+
* **camelCase_** for private and protected members variables.
26+
* Use **snake_case** for namespace names and build targets
27+
* Use **UPPER_SNAKE_CASE** for macros.
28+
* Use **kPascalCase** for static constants.
29+
30+
31+
## Comments
32+
33+
Some good practices about code comments:
34+
35+
* Optimize code for the reader, not the writer. 
36+
* Velox is growing and starting to be used by multiple compute engines and
37+
teams; as a result, more time will be spent reading code than writing it.
38+
* Overall goal: make the code more obvious and remove obscurity.
39+
* Comments should capture information that was on the writer’s mind, but
40+
**couldn’t be represented as code.**
41+
* As such, refrain from adding obvious comments, e.g: simple getter/setter
42+
methods.
43+
* However, “obviousness” is in the reader’s mind - if a reviewer says
44+
something is not obvious, then it is not obvious. 
45+
* Consider that the audience is an experienced Software Engineer with a
46+
moderate knowledge of the codebase.
47+
* What should be commented: 
48+
* Every File
49+
* Every Class
50+
* Every method that's not an obvious getter/setter
51+
* Every member variable
52+
* Do not simply restate the variable name. Either add a comment explaining
53+
the semantic meaning of that variable or do not add a comment at all.
54+
This is an anti-pattern: ``` // A simple counter. size_t count_{0}; ```
55+
* For functions with large bodies, a good practice is to group blocks of
56+
related code, and precede them with a blank line and a high-level comment
57+
on what the block does.
58+
59+
About comment style:
60+
61+
* Header comment for source files:
62+
* All source files (.h and .cpp) should have a standard header in a large
63+
comment block at the top; this should include the standard copyright
64+
notice, the license header, and a brief file description. This comment
65+
block should use `/* */`
66+
* For single line comments, always use `//` (with a space separator before the
67+
comment).
68+
* Comments should be full english sentences, starting with a capital letter and
69+
ending with a period (.).
70+
* Use: ``` // True if this node only sorts a portion of the final result.
71+
```
72+
* Instead of: ``` // true if this node only sorts a portion of the final
73+
result ```
74+
* For multi-line comments:
75+
* Velox will follow the doxygen comment style. 
76+
* For multi-line comments within actual code blocks (the ones which are not
77+
to be exposed as documentation, use `//` (double slashes).
78+
* For comments on headers for classes, functions, methods and member
79+
variables (the ones to be exposed as documentation), use `///` (three
80+
slashes) at the front of each line. 
81+
* Don't use old-style `/* */` comments inside code or on top-level header
82+
comments. It adds two additional lines and makes headers more verbose than
83+
they need to be.
84+
* Special comments:
85+
* Use this format when something needs to be fixed in the future: ``` //
86+
TODO: Description of what needs to be fixed. ```
87+
* Include enough context in the comment itself to make clear what will be
88+
done, without requiring any references from outside the code.
89+
* Do not include the author’s username. If required, this can always be
90+
retrieved from git blame.
91+
92+
## Asserts and CHECKs
93+
94+
* For assertions and other types of validation, use `VELOX_CHECK_*` macros
95+
* `VELOX_CHECK_*` will categorize the error to be an internal runtime error.
96+
* `VELOX_USER_CHECK_*` will categorize the error to be a user error.
97+
* When comparing two values/expressions, prefer to use:
98+
* `VELOX_CHECK_LT(idx, children_.size());`
99+
* Rather than:
100+
* `VELOX_CHECK(idx < children_.size());`
101+
* The former will evaluate and include both expressions values in the
102+
exception.
103+
* All macro also provide the following optional features:
104+
* Specifying error code (error code listed in
105+
`velox/common/base/VeloxException.h`): 
106+
* `VELOX_CHECK_EQ(v1[, v2[, error_code]]);`
107+
* `VELOX_USER_CHECK_EQ(v1[, v2[, error_code]])`
108+
* Appending error message:
109+
* `VELOX_CHECK_EQ(v1, v2, “Some error message”)`
110+
* `VELOX_USER_CHECK_EQ(v1, v2, “Some error message”)`
111+
* Error message formatting via fmt:
112+
* `VELOX_USER_CHECK_EQ( v1, v2, “My complex error message {} and {}”, str,
113+
i)`
114+
* Note that the values of v1 and v2 are already included in the exception
115+
message by default.
116+
117+
## Variables
118+
119+
* Do not declare more than one variable on a line (this helps accomplish the
120+
other guidelines, promotes commenting, and is good for tagging).
121+
* Obvious exception: for-loop initialization
122+
* Initialize most variables at the time of declaration (unless they are class
123+
objects without default constructors).
124+
* Prefer using uniform-initialization to make initialization more consistent
125+
across types, e.g., prefer `size_t size{0};` over `size_t size = 0;`
126+
* Declare your variables in the smallest scope possible.
127+
* Declare your variables as close to the usage point as possible within the
128+
given scope.
129+
* Don't group all your variables at the top of the scope -- this makes the code
130+
much harder to follow.
131+
* If the variable or function parameter is a pointer or reference type, group
132+
the `*` or &` with the type -- pointer-ness or reference-ness is an attribute
133+
of the type, not the name.
134+
* `int* foo;` `const Bar& bar;` NOT `int *foo; `const Bar &bar`;
135+
* Beware that `int* foo, bar;` will be parsed as declaring `foo` as an `int*`
136+
and `bar` as an `int`. Note that multiple declaration is discouraged.
137+
* For member variables:
138+
* Group member variable and methods based on their visibility (public,
139+
protected and private)
140+
* It’s ok to have multiple blocks for a given level. 
141+
* Most member variables should come with a short comment description and an
142+
empty line above that. 
143+
* Refrain from using public member variables whenever possible, in order to
144+
promote encapsulation.
145+
* Leverage getter/setter methods when appropriate.
146+
* Prefer to use value-types, `std::optional`, and `std::unique_ptr` in that
147+
order.
148+
* Value-types are conceptually the simplest and cheapest.
149+
* std::optional allows you to express “may be null” without the additional
150+
complexity of manual storage duration.
151+
* `std::unique_ptr<>` should be used for types that are not cheaply movable
152+
but need to transfer ownership, or which are too large to store on the
153+
stack. Note that most types that perform large allocations already store
154+
their bulk memory on-heap.
155+
156+
## Constants
157+
158+
* Always use `nullptr` if you need a constant that represents a null pointer
159+
(`T*` for some `T`); use `0` otherwise for a zero value.
160+
* For large literal numbers, use ‘ to make it more readable, e.g:  `1’000’000`
161+
instead of `1000000`.
162+
* For floating point literals, never omit the initial 0 before the decimal
163+
point (always `0.5`, not `.5`).
164+
* File level variables and constants should be defined in an anonymous
165+
namespace.
166+
* Always prefer const variables and enum to using preprocessor (#define) to
167+
define constant values. 
168+
* Prefer `enum class` over `enum` for better type safety.
169+
* As a general rule, do not use string literals without declaring a named
170+
constant for them.
171+
* The best way to make a constant string literal is to use constexpr
172+
`std::string_view`/folly::StringPiece`
173+
* **NEVER** use `std::string` - this makes your code more prone to SIOF bugs.
174+
* Avoid `const char* const` and `const char*` - these are less efficient to
175+
convert to `std::string` later on in your program if you ever need to
176+
because `std::string_view`/`folly::StringPiece` knows its size and can use
177+
a more efficient constructor. `std::string_view`/`folly::StringPiece` also
178+
has richer interfaces and often works as a drop-in replacement to
179+
`std::string`.
180+
* Need compile-time string concatenation? You can use `folly::FixedString`
181+
for that.
182+
183+
## Macros
184+
185+
Do not use them unless absolutely necessary. Whenever possible, use normal
186+
inline functions or templates instead. If you absolutely need, remember that
187+
macro names are always upper-snake-case. Also:
188+
189+
* Use parentheses to sanitize complex inputs.
190+
* For example, `#define MUL(x, y) x * y` is incorrect for input like `MUL(2,
191+
1 + 1)`. The author probably meant `#define MUL(x, y) ((x) * (y))` instead.
192+
* When making macros that have multiple statements, use this idiom: 
193+
* `do { stmt1; stmt2; } while (0)`
194+
* This allows this block to act the most like a single statement, usable in
195+
if/for/while even if they don't use braces and forces use of a trailing
196+
semicolon.
197+
198+
## Headers and Includes
199+
200+
* All header files must have a single-inclusion guard using `#pragma once`
201+
* Included files and using declarations should all be at the top of the file,
202+
and ordered properly to make it easier to see what is included. 
203+
* Use clang-format to order your include and using directives.
204+
* Includes should always use the full path (relative to github’s root dir).
205+
* Whenever possible, try to forward-declare as much as possible in the .h and
206+
only `#includ`e things you need the full implementation for. 
207+
* For instance, if you just use a `Class*` or `Class&` in a header file,
208+
forward-declare `Class` instead of including the full header to minimize
209+
header dependencies.
210+
* Put small private classes that will only get used in one place into a .cpp,
211+
inside an anonymous namespace. A common example is a custom comparator class.
212+
* Be aware of what goes into your .h files. If possible, try to separate them
213+
into "Public API" .h files that may be included by external modules. These
214+
public .h files should contain only those functions or classes that external
215+
users need to access. 
216+
* No large blocks of code should be in .h files that are widely included if
217+
these blocks aren't vital to all users' understanding of the interface the .h
218+
represents. Split these large implementation blocks into a separate `-inl.h`
219+
file.
220+
* `-inl.h` files should exist only to aid readability with the expectation
221+
that a user of your library should only need to read the .h to understand
222+
the interface. Only if someone needs to understand your implementation
223+
should opening the `-inl.h` be necessary (or the .cpp file, for that
224+
matter).
225+
226+
## Function Arguments
227+
228+
* Const
229+
* All objects, whenever possible, should be passed to functions as const-refs
230+
(`const T&`). This makes APIs much clearer as to whether it’s an input or
231+
output arg, reduces the need for NULL-checks (and reduces crashing bugs),
232+
and helps enforce good encapsulation.
233+
* An obvious exception is objects that are trivially copy- or
234+
move-constructible, like primitive types or `std::unique_ptr`. 
235+
* Don't pass by const reference when passing by value is both correct and
236+
cheaper.
237+
* All non-static member functions should be marked as const if calling them
238+
doesn't change the behavior/response of any member.
239+
* References as function arguments.
240+
* For input or read-only parameters, pass by `const&`
241+
* For nullable or optional parameters, use higher order primitives like
242+
`std::optional`; raw pointers are also appropriate.
243+
* For output, mutable, or in/out parameters:
244+
* Non-const ref is the recommendation if the argument is not nullable. 
245+
* If it’s nullable, use a raw pointer.
246+
* Always use `std::optional` instead of `folly::Optional` and
247+
`boost::optional`. 
248+
* Prefer `std::string_view` to `const std::string&`
249+
* This avoids unnecessary construction of a std::string when passing other
250+
types. This often happens when string literals are passed into functions.
251+
* If you need a `std::string` inside the function, however, take a
252+
`std::string` to move the copy to the API boundary. This allows the caller
253+
to avoid the copy with a move.
254+
* Almost never take an r-value reference as a function argument; take the
255+
argument by value instead. This usually as efficient as taking an r-value
256+
reference, but simpler and more flexible at the call site. For example: ```
257+
InPredicate::InPredicate(folly::F14HashSet<T>&& rhs) : rhs_(std::move(rhs))
258+
{} ```
259+
* is more complex than the version that takes expr by value: ```
260+
InPredicate::InPredicate(folly::F14HashSet<T> rhs) : rhs_(std::move(rhs))
261+
{} ```
262+
* However, the first requires either a std::move() or an explicit copy at the
263+
call point; the second doesn’t. The performance of the two should be almost
264+
identical.
265+
* Add comments when it’s not clear which argument is intended in a function
266+
call. This is particularly a problem for longer signatures with repeated
267+
types or constant arguments. For example, `phrobinicate(/*elements=*/{1, 2},
268+
/*startOffset=*/0, /*length=*/2)`
269+
* Use the /*argName=*/value format (note the lack of spaces). Clang-tidy
270+
supports checking the given argument name against the declaration when this
271+
format is used.
272+
273+
## Namespaces
274+
275+
* All Velox code should be placed inside the facebook::velox namespace
276+
* Always use nested namespace definition:
277+
* `namespace facebook::velox::core {` Rather than
278+
* `namespace facebook { namespace velox { namespace core {`
279+
* Use sub namespaces (e.g: facebook::velox::core) to logically group large
280+
chunks of related code and prevent identifier clashes.
281+
* Namespaces should make it easier to code, not harder. Refrain from creating
282+
hierarchies that are way too long.
283+
* Namespace don’t necessarily need to reflect the on-disk layout. 
284+
* This isn’t java.
285+
* Guidelines for importing namespaces:
286+
* Don't EVER put `using namespace anything;` OR `using anything::anything;`
287+
in a header file. This pollutes the global namespace and makes it extremely
288+
difficult to refactor code.
289+
* It’s very useful to not have to fully-qualify types as it can make code
290+
more readable. In header files, it's unavoidable. ALWAYS fully qualify
291+
names in a header (e.g. it's `std::string`, not just `string`).
292+
* In cpp files, best practice is to add a using declaration after your list
293+
of `#includes` (e.g. `using tests::VectorMaker;`) and to then just write
294+
`VectorMaker`.
295+
* `using namespace std;` can lead to surprising behavior, especially when
296+
other modules don't follow best practices with their using statements and
297+
what they dump into global scope. The best defensive programming practice
298+
for std it to fully qualify symbols, e.g: `std::string`, std::vector
299+
* Do not use `using namespace std;`
300+
* Use `using namespace anything_else;` somewhat sparingly – the whole point
301+
of namespaces is obliterated when the code starts doing this prolifically.
302+
Frequently, `using foo::bar::BazClass;` is better.

0 commit comments

Comments
 (0)