Authors:
Mustafa Mustafa
Thomas Ullrich
Anselm Vossen

Draft Version 1.0, February 23, 2015

This document is a draft of new C++ coding guidelines compiled for the STAR collaboration by the above mentioned authors. This effort was initiated by the STAR computing coordinator Jerome Lauret on October 31, 2014. The charge can be viewed here. The committee produced two documents, one for the coding guidelines seen here, and one for the naming and formatting guidelines that can be viewed here.

The committee based their work on the existing guidelines, expanded them for clarity, and added new material where it saw fit. The coding guidelines include the new C++11 standard. We have made heavy use of the C++ Google Style guide at http://google-styleguide.googlecode.com using their xml and css style sheets.

The goal of this guide is to manage the complexity of C++ (often in conjunction with ROOT) by describing in detail the dos and don'ts of writing C++ code. These rules exist to keep the STAR code base manageable while still allowing coders to use C++ language features productively. In some cases we constrain, or even ban, the use of certain C++ and ROOT features. We do this to keep code simple and to avoid the various common errors and problems that these features can cause. We also had to take into account that millions of lines of STAR code exist. For a new experiment the guidelines certainly would look different in places but we have to live with the legacy of existing code and the guidelines under which they were written.

Note that this guide is not a C++ tutorial: we assume that the reader is familiar with the language. We marked parts of the guidelines that address specifically new C++11 features.

In general, every .cxx file should have an associated .h file. Each header file should contain only one class declaration, or several closely related ones, for maintainability and for easier retrieval of class definitions.

Correct use of header files can make a huge difference to the readability, size and performance of your code. The following rules will guide you through the various pitfalls of using header files.

All header files should have #define guards to prevent multiple inclusion. The format of the symbol name should be <FILE>_H.

For example, the file myFile.h should have the following guard:

    #ifndef MYFILE_H
    #define MYFILE_H
    ...
    #endif // MYFILE_H
You may forward declare ordinary classes in order to avoid unnecessary #includes. A "forward declaration" is a declaration of a class, function, or template without an associated definition. #include lines can often be replaced with forward declarations of whatever symbols are actually used by the client code.
  • Unnecessary #includes force the compiler to open more files and process more input.
  • They can also force your code to be recompiled more often, due to changes in the header.
  • It can be difficult to determine the correct form of a forward declaration in the presence of features like templates, typedefs, default parameters, and using declarations.
  • Forward declaring multiple symbols from a header can be more verbose than simply including the header.
  • Forward declarations of functions and templates can prevent the header owners from making otherwise-compatible changes to their APIs; for example, widening a parameter type, or adding a template parameter with a default value.
  • Forward declaring symbols from namespace std:: usually yields undefined behavior.
  • When using a function declared in a header file, always #include that header.
  • When using a class template, prefer to #include its header file.
  • When using an ordinary class, relying on a forward declaration is OK, but be wary of situations where a forward declaration may be insufficient or incorrect; when in doubt, just #include the appropriate header.
  • Do not replace data members with pointers just to avoid an #include.
Always #include the file that actually provides the declarations/definitions you need; do not rely on the symbol being brought in transitively via headers not directly included. One exception is that myFile.cxx may rely on #includes and forward declarations from its corresponding header file myFile.h.
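The forward-declaration rule can be sketched in a single file as follows. The class names StVertexSketch and StTrackSketch are hypothetical and chosen only for illustration; the comments mark which part would live in which file. Member functions are defined in the class body only to keep the sketch short.

```cpp
#include <cassert>

// --- would be StVertexSketch.h ---
class StTrackSketch;  // forward declaration: sufficient for pointers and references

class StVertexSketch {
public:
    void setParent(const StTrackSketch* track) { mParent = track; }
    int parentId() const;  // needs the complete type, so it is defined in the .cxx part
private:
    const StTrackSketch* mParent = nullptr;  // pointer member: no #include required here
};

// --- would be StTrackSketch.h ---
class StTrackSketch {
public:
    explicit StTrackSketch(int id) : mId(id) {}
    int id() const { return mId; }
private:
    int mId;
};

// --- would be StVertexSketch.cxx, which includes both headers ---
int StVertexSketch::parentId() const
{
    return mParent ? mParent->id() : -1;  // -1 signals "no parent" in this sketch
}

// Small self-check of the sketch.
void demoForwardDeclaration()
{
    StTrackSketch track(7);
    StVertexSketch vertex;
    assert(vertex.parentId() == -1);
    vertex.setParent(&track);
    assert(vertex.parentId() == 7);
}
```

Only the code that dereferences the pointer needs the full class definition; the header gets by with the forward declaration.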
As a general rule, put function definitions into the .cxx file and let the compiler decide what gets inlined (it can decide anyway, regardless of the inline keyword). Use inline when you require the implementation of a function in multiple translation units (e.g. template classes/functions).

The inline keyword indicates that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. But a compiler is not required to perform this inline substitution at the point of call.

Functions that are defined within a class definition are implicitly inline. Note, however, that the definition of functions in the class definition is strongly discouraged in STAR.

An inline function must be defined in every translation unit from where it is called. It is undefined behavior if the definition of the inline function is not the same for all translation units. Note that this implies that the function is defined in a header file. This can have an impact on compile time and lead to longer (= less efficient) development cycles.

Note that the inline keyword has no effect on the linkage of a function. Linkage can be changed via unnamed namespaces or the static keyword.

If you add a new function, put it into the .cxx file by default. Small functions, like accessors and mutators, may be placed into the .h file instead (inline). Also, most template function implementations need to go into the .h file. If you later determine that a function should be moved from the .cxx file into the .h file, please make sure that it helps the compiler in optimizing the code. Otherwise you're just increasing compile time.
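As a minimal sketch of what typically has to live in a header, consider a function template and a small inline helper. The names clampValue and squared are hypothetical and not part of any STAR library.

```cpp
#include <cassert>

// Function template: its definition must be visible in every translation
// unit that instantiates it, so it goes into the header.
template <typename T>
T clampValue(T value, T low, T high)
{
    return value < low ? low : (value > high ? high : value);
}

// Small non-template helper defined in a header. The inline keyword allows
// the same definition to appear in several translation units.
inline int squared(int value)
{
    return value * value;
}

// Small self-check of the sketch.
void demoHeaderFunctions()
{
    assert(clampValue(5, 0, 3) == 3);
    assert(clampValue(1.5, 0.0, 3.0) == 1.5);
    assert(squared(4) == 16);
}
```

Without the inline keyword, including such a header from two .cxx files would define squared twice and the link step would fail.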

Include headers from external libraries using angle brackets. Include headers from your own project/libraries using double quotes.
Do not rely on implicit includes. Make header files self-sufficient.

There are two types of #include statements: #include <myFile.h> and #include "myFile.h".

  • Include headers from external libraries using angle brackets.
        #include <iostream>
        #include <cmath>
        #include <TH1D.h>
  • Include headers from your own project or any STAR related project using double quotes.
        #include "MyClass.h"
        #include "StEnumeration.h"

The header files of external libraries are obviously not in the same directory as your source files. So you need to use angle brackets.

Headers of your own application have a defined relative location to the source files of your application. Using double quotes, you have to specify the correct relative path to the include file.

Include order

Another important aspect of include management is the include order. Typically, you have a class named Foo, a file Foo.h and a file Foo.cxx. The rule is: in your file Foo.cxx, include Foo.h first, before the system includes.

The rationale behind that is to make your header standalone.

Let's imagine that your Foo.h looks like this:

    class Foo {
    public:
        Bar getBar();
    };

And your Foo.cxx looks like this:

    #include "Bar.h"
    #include "Foo.h"

Your Foo.cxx file will compile, but it will not compile for other people using Foo.h without including Bar.h. Including Foo.h first makes sure that your Foo.h header works for others.

    // Foo.h
    #include "Bar.h"
    class Foo {
    public:
        Bar getBar();
    };

    // Foo.cxx
    #include "Foo.h"

For more details: Getting #includes right.

Namespaces subdivide the global scope into distinct, named scopes, and thus are useful for logically grouping related types and functions and preventing name collisions. In C++ it is in general very good practice to use namespaces, especially in libraries. However, historically STAR software makes little to no use of namespaces but rather uses a specific naming scheme (prefixes) to indicate the scope (e.g. StEmc..., StTpc... etc). While certain tools in STAR can handle namespaces (such as cons) others would be very cumbersome to adapt.

For legacy reasons, namespaces are deprecated in STAR. As with every guideline there might be exceptions, especially in end user code. However, care should be taken to check for possible side effects. Namespaces should be entirely avoided in the context of StEvent.

When using namespaces in end-user parts of your code (e.g. specific analysis code), encapsulate your entire class into the namespace. Nonmember functions that are logically tied to a specific type should be in the same namespace as that type. To make namespaces work with cons and ROOT, use the STAR specific $NMSPC tag as follows:

    namespace StMyStuff //$NMSPC
    {
    class AClass : public TNamed {....};
    class BClass : public TNamed {....};
    } // namespace StMyStuff

This will cause cons to generate a dictionary consistent with ROOT. The tag $NMSPC triggers the namespace inclusion (multiple namespaces can be used but you cannot combine sections with namespace and sections without).

Don't write namespace using declarations or using directives in a header file or before an #include.

    #include "Bar.h"

    // OK in .cxx after include statements
    using namespace Foo;

    // sometimes a using declaration is preferable, to be precise
    // about the symbols that get imported
    using Foo::Type;

    // Forbidden in .h -- This pollutes the namespace.
    using namespace Foo;

The using directive can sometimes be useful in header files to import one namespace into another one. This can effectively hide a namespace from the public interface, similar to what an inline namespace does.
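
Importing one namespace into another could look like the following sketch. The namespaces StMyStuff::Detail and the function answer are hypothetical names used only for illustration.

```cpp
#include <cassert>

namespace StMyStuff {

namespace Detail {  // implementation namespace, not meant to be spelled out by users
inline int answer() { return 42; }
}  // namespace Detail

using namespace Detail;  // import Detail into StMyStuff

}  // namespace StMyStuff

// Small self-check of the sketch.
void demoNamespaceImport()
{
    // Users address the function through the outer namespace only.
    assert(StMyStuff::answer() == 42);
}
```

The using directive sits inside StMyStuff, so nothing leaks into the global namespace of files that include such a header.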
Do not declare anything in namespace std, not even forward declarations of standard library classes.

Declaring entities in namespace std represents undefined behavior, i.e., not portable. To declare entities from the standard library, include the appropriate header file.

Nonmember functions (also known as global functions) should be within a namespace.

Putting nonmember functions in a namespace avoids polluting the global namespace. Static member functions are an alternative as long as it makes sense to include the function within the class.

    namespace MyNamespace {

    void doGlobalFoo(); // Good -- doGlobalFoo is within a namespace.

    class MyClass {
    public:
        ...
        // Good -- doGlobalBar is a static member of class MyClass and
        // has a reason to be part of this class (not shown here).
        static Bar* doGlobalBar();
    };

    } // namespace MyNamespace
Declare variables as locally as possible.

Variables whose lifetimes are longer than necessary have several drawbacks:

  • They make the code harder to understand and maintain.
  • They cannot always be sensibly initialized.

  • It can sometimes be more efficient to declare a variable (usually of an object type) outside a loop.

    If the variable is an object, its constructor is invoked every time it enters scope and is created, and its destructor is invoked every time it goes out of scope.

        // Inefficient implementation:
        for (int i = 0; i < bigNumber; ++i) {
            Foo foo; // My ctor and dtor get called bigNumber times each.
            foo.doSomething(i);
        }

    It may be more efficient to declare such a variable used in a loop outside that loop:

        Foo foo; // My ctor and dtor get called once each.
        for (int i = 0; i < bigNumber; ++i) {
            foo.doSomething(i);
        }
  • This item does not apply to constants, because constants don't add a state.
Always initialize variables.

Do not separate initialization from declaration, e.g.

    int value;
    value = function(); // Bad -- initialization separate from declaration.

    int value = function(); // Good -- declaration has initialization.

Use a default initial value or ternary operator (?:) to reduce mixing data flow with control flow.

    int speedupFactor; // Bad: does not initialize variable
    if (condition) {
        speedupFactor = NoFactor;
    } else {
        speedupFactor = DoubleFactor;
    }

    int speedupFactor = DoubleFactor; // Good: initializes variable
    if (condition) {
        speedupFactor = NoFactor;
    }

    int speedupFactor = condition ? NoFactor : DoubleFactor; // Good: initializes variable

Prefer declaration of loop variables inside a loop, e.g.

    int i; // Bad: does not initialize variable
    for (i = 0; i < number; ++i) {
        doSomething(i);
    }

    for (int i = 0; i < number; ++i) {
        doSomething(i); // Good
    }
Prefer initialization with braces except for single-argument assignment.

In C++11, the brace initialization syntax for builtin arrays and POD structures has been extended for use with all other datatypes.

Example of brace initialization:

std::vector<std::string> myVector{"alpha", "beta", "gamma"};

Example of single-argument assignments:

    int value = 3; // preferred style
    std::string name = "Some Name";

    int value { 3 }; // also possible
    std::string name{ "Some Name" };
    std::string name = { "Some Name" };

User data types can also define constructors that take initializer_list, which is automatically created from braced-init-list:

    #include <initializer_list>

    class MyType {
    public:
        // initializer_list is a reference to the underlying init list,
        // so it can be passed by value.
        MyType(std::initializer_list<int> initList) {
            for (int element : initList) { .. }
        }
    };
    MyType myObject{2, 3, 5, 7};

Finally, brace initialization can also call ordinary constructors of data types that do not have initializer_list constructors.

    // Calls ordinary constructor as long as MyOtherType has no
    // initializer_list constructor.
    class MyOtherType {
    public:
        explicit MyOtherType(std::string name);
        MyOtherType(int value, std::string name);
    };
    MyOtherType object1 = {1, "b"};
    // If the constructor is explicit, you can't use the "= {}" form.
    MyOtherType object2{"b"};

Never assign a braced-init-list to an auto local variable. In the single element case, what this means can be confusing.

    auto value = {1.23};       // value is an initializer_list<double>
    auto value = double{1.23}; // Good -- value is a double, not an initializer_list.

For clarity, the examples above use literal values directly. Following the rule about magic numbers, however, all such numbers should first be defined as named constants or constexpr.

Variables declared in the global scope are not allowed. Other global variables, including static class variables and variables in namespace scope, should be avoided where other means of communication are possible.

A global variable is a variable that can be accessed (theoretically) from everywhere in the program. The adjective "global" should rather be understood as concerning its linkage, not whether it is in the global scope.

    int gBar; // this is obviously global

    class Something {
    private:
        static int sId; // but this one too (details at the end of the rule)
    };

    namespace NotGlobalScope {
    Foo fooObject; // and finally this one as well
    }

Global variables are a simple solution to sharing of data.

Global variables make it harder to reason about the code (for humans and compilers): the smaller the number of variables a given region of code reads and writes, the easier. Global variables can be read and written from anywhere. Therefore, global variables pose a challenge to the optimizer.

We want to reduce the shared data in our software to the unavoidable minimum. Therefore, global variables should be avoided where other means of communication are possible.

Note that a private static class variable sId is global. For example, two threads that each have an instance of the class could access sId via these instances.

In the rare and justified cases where you use global variables, including file-static variables, static member variables and variables in namespace scope, initialize them statically.
  • This rule additionally applies to file-static variables.
  • A global variable is statically initialized if the type has no constructor or a constexpr constructor. This is the case for fundamental types (integers, chars, floats, or pointers) and POD (Plain Old Data: structs/unions/arrays of fundamental types or other POD structs/unions).
Dynamic initialization of globals can be used to run code before main(). Destructors of globals can be used to run code after main(). Dynamic initialization of globals in dynamically loaded objects can be used to run code on plugin load. It is very hard to reason about the order of execution of such functions. Especially if dynamically initialized globals are present in shared libraries that are linked into dynamically loaded objects, few people understand the semantics.

As an example do this:

    struct Pod {
        int indexes[5];
        float width;
    };
    struct LiteralType {
        int value;
        constexpr LiteralType() : value(1) {}
    };
    Pod gData;
    LiteralType gOtherData;

But not this:

    class NotPod {
    public:
        NotPod();
    };
    NotPod gBadData; // dynamic constructor

Global variables must be initialized statically.
We only allow global variables to contain POD data. This rule completely disallows std::vector (use std::array instead), or std::string (use const char []) for global variables.
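
The suggested replacements for std::vector and std::string could look like the sketch below. The variable names and their contents are hypothetical and serve only as illustration.

```cpp
#include <array>
#include <cassert>

// Statically initialized replacements for dynamically initialized globals.
// std::array is an aggregate of a fundamental type, so no constructor runs
// before main().
constexpr std::array<int, 3> gSectorIds{{1, 2, 3}};  // instead of std::vector<int>
const char gDetectorName[] = "TPC";                  // instead of std::string

// Small self-check of the sketch.
void demoStaticGlobals()
{
    assert(gSectorIds.size() == 3);
    assert(gSectorIds[2] == 3);
    assert(sizeof(gDetectorName) == 4);  // three characters plus the terminating '\0'
}
```

Both objects are fully formed before any dynamic initialization takes place, so their use does not depend on link order.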

If there is a need for startup or shutdown code, dynamic constructors may be used. But only if:

  • The dependencies on other data are minimized.
  • It is documented what code depends on this and why there is no issue of incorrect calling order.
  • Side-effects are clearly understood and documented.

Example that exhibits the problem of execution order:

    // Struct.h:
    #include <string>
    struct Struct {
        static std::string sString;
    };

    // Struct.cxx:
    #include "Struct.h"
    std::string Struct::sString = "Hello World";

    // main.cxx:
    #include "Struct.h"
    #include <iostream>
    std::string gAnotherString = Struct::sString;
    int main()
    {
        std::cout << gAnotherString << std::endl;
        return 0;
    }

This program will either output "Hello World" or crash, depending on the initialization order (with GCC on Linux it depends on whether you link with g++ Struct.o main.o or g++ main.o Struct.o).

Static variables in functions (called "function-local static variables" in the C++ terminology) are expensive and need care on destruction. Prefer to use static class variables where possible.

Function-local static variables are initialized on first use and destructed in the reverse order of construction. Since C++11, function-local static variables are guaranteed to be initialized exactly once, even in a multi-threaded environment.

The variable is lazily initialized. Therefore the order of construction is better under control. Also the cost of initialization is only incurred if it is really needed.

Because of the thread-safe lazy initialization, function-local static variables have an extra cost compared to other function variables.

Use static class variables rather than function-local static variables. Therefore, do this:

    class Something {
    public:
        int generateId() { return Something::sId++; }
    private:
        static int sId; // initialized in .cxx
    };

instead of:

    class Something {
    public:
        int generateId()
        {
            static int sId = 0;
            return sId++;
        }
    };

Function-local static variables can be used to build factory functions for lazily initialized global objects. This makes it possible to safely use dynamically initialized types in the global scope.
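
A minimal sketch of such a factory function is shown here. The function name defaultGreeting is hypothetical; it stands in for a dynamically initialized global like the std::string in the execution-order example of this rule.

```cpp
#include <cassert>
#include <string>

// Accessor that replaces a dynamically initialized global std::string.
// The static object is constructed the first time the function is called,
// so callers never observe it before it has been constructed.
const std::string& defaultGreeting()
{
    static const std::string greeting = "Hello World";
    return greeting;
}

// Small self-check of the sketch.
void demoLazyGlobal()
{
    assert(defaultGreeting() == "Hello World");
    // Every call returns a reference to the same object.
    assert(&defaultGreeting() == &defaultGreeting());
}
```

Because construction happens on first use, the result no longer depends on the order in which object files are linked.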

In case function-local static variables are nevertheless used, it is best to avoid storing non-owning references in them, because their destruction happens after the return from main(), as shown in the following code snippets:

    std::string &globalMessage()
    {
        static std::string message = "Hello world.";
        return message;
    }

    int main()
    {
        std::cout << globalMessage() << '\n';
        globalMessage() = "Goodbye cruel world.";
        std::cout << globalMessage() << '\n';
        return 0;
    }
    // prints:
    // Hello world.
    // Goodbye cruel world.

    class BreakTheCode {
    public:
        void setReference(int *value) { mValuePointer = value; }
        ~BreakTheCode() { *mValuePointer += 2; }
    private:
        int *mValuePointer = nullptr;
    };

    BreakTheCode &globalBreaker()
    {
        static BreakTheCode breaker;
        return breaker;
    }

    int main()
    {
        int someValue = 1;
        globalBreaker().setReference(&someValue);
        return 0;
    }
    // globalBreaker::breaker accesses main::someValue after it went out of scope.
    // This example should be harmless, but if a more complex type instead of int
    // is involved, and possibly the destructor of another function-local
    // static overwrites the stack data from main, this pattern can lead to a crash.

Classes are the fundamental unit of code in C++. Naturally, we use them extensively. This section lists the main dos and don'ts you should follow when writing a class.

Every class should have at least one constructor. All uninitialized variables should be initialized in the constructor.

Every class should have at least one constructor (and a destructor), even if it does nothing or is defined to do nothing; alternatively, compiler defaults should be explicitly requested using the default specifier. This is good practice: it indicates to the reader that the coder thought about this and did not simply forget about it.

If a class does not have a constructor there is no guarantee of how objects of that class will be initialized; data members should therefore be explicitly initialized in the constructor. When the constructor dynamically allocates memory, a destructor must be added to return the memory to the free pool when an object gets cleaned up.

In the constructor all data members should be initialized, either in the constructor function body, in the constructor initializer list, or in-class. See the next item for more.
Declare and initialize member variables in the same order. Prefer initialization (in the constructor initializer list or in-class) to assignment (in the constructor function body).

C++98 allows in-class member initialization for const static members only. C++11 allows in-class member initialization for any variable.

    class MyClass {
    public:
        int x = 1;
    };

This is basically equivalent to using initialization lists in constructors. The advantage of in-class initialization is that it allows consistent default initialization when there are multiple constructors and saves a lot of typing, resulting in cleaner code. Constructor initialization overrides in-class initialization.

Class member variables are initialized in the order in which they are declared in the class definition. The order in the constructor initializer list has no influence on initialization order and therefore may be misleading if it does not match the order of the declaration. Compilers often issue a warning if this rule is broken, but not always.
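
The effect of declaration order can be sketched with a small class in which one member is computed from another. The class Range is hypothetical and used only for illustration; member functions are defined in the class body only to keep the sketch short.

```cpp
#include <cassert>

class Range {
public:
    // The initializer list follows the declaration order: mLow, then mWidth.
    // mWidth may use mLow because mLow is declared, and therefore
    // initialized, first.
    Range(int low, int high) : mLow(low), mWidth(high - mLow) {}

    int low() const { return mLow; }
    int width() const { return mWidth; }

private:
    int mLow;    // declared first, initialized first
    int mWidth;  // declared second, initialized second
};

// Small self-check of the sketch.
void demoInitializationOrder()
{
    Range range(2, 10);
    assert(range.low() == 2);
    assert(range.width() == 8);
}
```

If the two declarations were swapped, mWidth would be computed from an uninitialized mLow, even with an unchanged initializer list.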

If a member variable is not explicitly initialized in the constructor initializer list and it is a non-POD, the default constructor of that variable is called. Therefore, if a member variable is assigned to in the constructor function body, the member variable may get initialized unnecessarily with the default constructor. If the variable is a POD and the class has a user-provided constructor that does not initialize it, the variable is left uninitialized, regardless of whether the instance is created on the stack or with the new operator. Zero-initialization of such a member happens only when the class has no user-provided default constructor and the object is value-initialized (e.g. new MyClass()); do not rely on it. See the examples below.

If you do not declare any constructors yourself then the compiler will generate a default constructor for you, which may leave some fields uninitialized or initialized to inappropriate values.

Examples:

Initialization list:

    // MyClass.h
    class MyClass : public MyBase {
        // ...
    private:
        int mValue;
        std::vector<int> mVector; // element type chosen for illustration
    };

    // MyClass.cxx
    MyClass::MyClass() : MyBase(), mValue(0), mVector() {}

See an example of initialization via std::initializer_list in Brace Initialization.

Non explicit initialization:

    #include <iostream>

    struct MyStruct {
        int x;
        MyStruct() : x(10) {}
    };

    class MyClass {
    public:
        int mX;
        int mY;
        MyStruct mS;
        // mS and mY are not explicitly initialized in the initializer list.
        // mS: its default constructor will always be called.
        // mY: a POD, so this user-provided constructor leaves it uninitialized,
        //     no matter how the instance of MyClass is created.
        // mY is deliberately not printed: reading it would be undefined behavior.
        MyClass() : mX(5) { std::cout << mX << " " << mS.x << std::endl; }
    };

    int main()
    {
        MyClass c0;                  // MyClass::mY will not be initialized.
        MyClass* c1 = new MyClass(); // MyClass::mY will not be initialized either.
        MyClass* c2 = new MyClass;   // MyClass::mY will not be initialized either.
        delete c1;
        delete c2;
    }

Member variables should be declared and initialized in the same order.

Use in-class member initialization for simple initializations, especially when a member variable must be initialized the same way in more than one constructor.
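
As a sketch of this recommendation, the hypothetical class StHitCounterSketch below sets its defaults in-class and overrides them only where a constructor has something specific to say. Member functions are defined in the class body only to keep the sketch short.

```cpp
#include <cassert>
#include <string>

class StHitCounterSketch {
public:
    StHitCounterSketch() = default;
    explicit StHitCounterSketch(const std::string& name) : mName(name) {}
    StHitCounterSketch(const std::string& name, int hits) : mName(name), mHits(hits) {}

    const std::string& name() const { return mName; }
    int hits() const { return mHits; }
    double weight() const { return mWeight; }

private:
    // In-class initializers: used by every constructor that does not
    // initialize the member itself.
    std::string mName = "unnamed";
    int mHits = 0;
    double mWeight = 1.0;
};

// Small self-check of the sketch.
void demoInClassInitialization()
{
    StHitCounterSketch plain;
    assert(plain.name() == "unnamed" && plain.hits() == 0 && plain.weight() == 1.0);

    StHitCounterSketch filled("tpc", 12);
    assert(filled.name() == "tpc" && filled.hits() == 12 && filled.weight() == 1.0);
}
```

Adding a fourth constructor later does not risk forgetting a member, because the defaults are stated once at the declaration.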

If your class defines member variables that aren't initialized in-class, and if it has no other constructors, you must define a default constructor (one that takes no arguments). It should preferably initialize the object in such a way that its internal state is consistent and valid.

If your class inherits from an existing class but you add no new member variables, you are not required to have a default constructor, also see Delegating and Inheriting Constructors.

Do not call virtual functions in constructors and destructors.

Inside constructors and destructors virtual functions do not behave "virtually". If a constructor or destructor calls virtual functions, these calls will not get dispatched to the subclass implementations. Calls to an unimplemented pure virtual function result in undefined behavior.

Calling a virtual function non-virtually is fine:

    class MyClass {
    public:
        MyClass() { doSomething(); } // Bad
        virtual void doSomething();
    };

    class MyClass {
    public:
        MyClass() { MyClass::doSomething(); } // Good
        virtual void doSomething();
    };

Constructors should never call virtual functions.
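
The following sketch shows the behavior this rule guards against. It deliberately breaks the rule in the Base constructor to make the effect visible; the class names are hypothetical.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    // Deliberate violation for demonstration: during construction of Base
    // the object is not yet a Derived, so this resolves to Base::name().
    Base() { mCreatedAs = name(); }
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
    const std::string& createdAs() const { return mCreatedAs; }
private:
    std::string mCreatedAs;
};

class Derived : public Base {
public:
    std::string name() const override { return "Derived"; }
};

// Small self-check of the sketch.
void demoVirtualCallInConstructor()
{
    Derived object;
    assert(object.createdAs() == "Base");  // the constructor call was not dispatched to Derived
    assert(object.name() == "Derived");    // after construction, dispatch works as usual
}
```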
Each class should have an assignment operator and a copy constructor implemented or explicitly deleted. The exception is when the class doesn't allocate subsidiary data structures on the heap; in this case compiler defaults can be explicitly requested. Be aware of data slicing for polymorphic classes.

Since C++11, the programmer can instruct the compiler not to create certain defaults by using the specifier = delete. This is particularly useful in two cases:

1) Making objects non-copyable:

    class NonCopyable {
        NonCopyable(const NonCopyable&) = delete;
        NonCopyable& operator=(const NonCopyable&) = delete;
    };

2) Preventing implicit conversion of function arguments:

    class MyClass {
        void f(double i);
        void f(int) = delete;
    };

The specifier = default can be used to state the programmer's wish for defaults to be created.

    class MyClass {
        MyClass() = default; // default constructor is explicitly requested.
        MyClass(...);
    };

Even where this verbosity is redundant, it is useful as a declaration of intention. For classes, the default generated functions are always public. The programmer can control the visibility of the defaults by using = default.

The copy constructor and copy assignment operator are used to create copies of objects.

The assignment operator ClassName& operator=(const ClassName&) is called when one instance of a class is assigned to another. The copy constructor ClassName(const ClassName&) defines the behavior of the class when it is passed by value as an argument, returned by value from a function, or used to initialize one instance with the value of another class instance. Defining this constructor ensures that objects of the class are copied properly.

It is possible to implement one using the other. One can safely invoke the copy assignment operator from the constructor as long as the operator is not declared virtual.

The rule of three is a rule of thumb in C++ that claims that if a class defines one of the following it should probably explicitly define all three:

  • destructor
  • copy constructor
  • copy assignment operator
The Rule of Three claims that if one of these had to be defined by the programmer, it means that the compiler-generated version does not fit the needs of the class in one case and it will probably not fit in the other cases either.

Most classes that allocate subsidiary data structures on the heap or consume any other kind of shared resources should have a copy constructor and assignment operator.
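
A minimal Rule of Three sketch for a class that owns heap memory might look as follows. The class Buffer is hypothetical; member functions are defined in the class body only to keep the sketch short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

class Buffer {
public:
    explicit Buffer(std::size_t size) : mSize(size), mData(new int[size]()) {}

    // Destructor: releases the owned array.
    ~Buffer() { delete[] mData; }

    // Copy constructor: deep copy, so both objects own separate arrays.
    Buffer(const Buffer& other) : mSize(other.mSize), mData(new int[other.mSize])
    {
        std::copy(other.mData, other.mData + other.mSize, mData);
    }

    // Copy assignment: allocate and copy first, then release the old array.
    Buffer& operator=(const Buffer& other)
    {
        if (this != &other) {
            int* fresh = new int[other.mSize];
            std::copy(other.mData, other.mData + other.mSize, fresh);
            delete[] mData;
            mData = fresh;
            mSize = other.mSize;
        }
        return *this;
    }

    int& at(std::size_t index) { return mData[index]; }
    std::size_t size() const { return mSize; }

private:
    std::size_t mSize;
    int* mData;
};

// Small self-check of the sketch.
void demoRuleOfThree()
{
    Buffer first(3);
    first.at(0) = 5;
    Buffer second(first);  // copy constructor
    second.at(0) = 9;
    assert(first.at(0) == 5);  // the copy did not alias the original

    Buffer third(1);
    third = first;  // copy assignment
    assert(third.size() == 3 && third.at(0) == 5);
}
```

With the compiler-generated copy operations, both objects would share one array and the second destructor call would free it twice.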

Exception 1: do not implement your own copy/assignment when member-wise copy is desired and request the compiler to generate the defaults instead. For example, use:

    ClassName(const ClassName &other) = default;
    ClassName& operator=(const ClassName&) = default;

instead of

    ClassName(const ClassName &other) : data(other.data) {}
    ClassName& operator=(const ClassName &other) { data = other.data; return *this; }

The former can be optimized much better by the compiler.

Exception 2: if the design of the class specifically demands that no copy is created (e.g. for singletons), the copy constructor and assignment operators should be disabled by using the delete keyword (C++11) or by making them private (C++98 or older).

Polymorphic class design implies pointer semantics. Since copy constructors/assignment operators, etc. cannot be made virtual, making a base class copyable could result in object slicing.

If your polymorphic class needs to be copyable, consider using a virtual clone() method. This way copying can be implemented without slicing and be used more naturally for pointers:

    void someFunction(SomeInterface *object)
    {
        SomeInterface *objectCopy = object->clone();
        ...
    }
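
A possible implementation of the clone() idea is sketched below. Shape and Triangle are hypothetical classes, and the choice to return an owning raw pointer is an assumption of this sketch, not a STAR convention.

```cpp
#include <cassert>
#include <memory>

class Shape {
public:
    virtual ~Shape() = default;
    virtual Shape* clone() const = 0;  // the caller owns the returned object
    virtual int corners() const = 0;
};

class Triangle : public Shape {
public:
    // Covariant return type: copies the complete Triangle, no slicing.
    Triangle* clone() const override { return new Triangle(*this); }
    int corners() const override { return 3; }
};

// Small self-check of the sketch.
void demoClone()
{
    Triangle original;
    const Shape& asBase = original;

    // Copy through the base interface; unique_ptr takes over ownership.
    std::unique_ptr<Shape> copy(asBase.clone());
    assert(copy->corners() == 3);
    assert(copy.get() != &original);
}
```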

Only implement move constructors/assignment operators if your class needs optimizations for move semantics. Control of defaults can be used for copy/move assignment operators or constructors and destructor. However, one should pay attention to the Rule of Five. Stated roughly by Stroustrup:

1) If any move, copy, or destructor is explicitly specified (declared, defined, =default, or =delete) by the user, no move is generated by default.

2) If any move, copy, or destructor is explicitly specified (declared, defined, =default, or =delete) by the user, any undeclared copy operations are generated by default, but this is deprecated, so don't rely on that.

The move constructor and move assignment operator are used to move (semantically) objects. Move semantics were introduced with C++11. In essence, a move is just a copy that can be optimized from the knowledge that the source object is at the end of its life (such objects bind to rvalue references). Thus, a move of a std::vector does not need to copy all data, but only the pointer to the data. Additionally, the source object must be told that it does not own the data anymore, to inhibit the free from the destructor. In most cases the move constructor/assignment operator therefore modifies the source object (e.g. setting the data pointer to nullptr). See rvalue reference and move semantics for more details on move semantics implementation.
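
The pointer hand-over described above can be sketched with a hypothetical class Samples that owns a heap array. Copying is disabled here only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

class Samples {
public:
    explicit Samples(std::size_t size) : mSize(size), mData(new double[size]()) {}
    ~Samples() { delete[] mData; }  // delete[] on nullptr is a no-op

    // Move constructor: take over the pointer and detach it from the source.
    Samples(Samples&& other) noexcept : mSize(other.mSize), mData(other.mData)
    {
        other.mSize = 0;
        other.mData = nullptr;
    }

    // Move assignment: release our own data, then take over the source's.
    Samples& operator=(Samples&& other) noexcept
    {
        if (this != &other) {
            delete[] mData;
            mSize = other.mSize;
            mData = other.mData;
            other.mSize = 0;
            other.mData = nullptr;
        }
        return *this;
    }

    Samples(const Samples&) = delete;
    Samples& operator=(const Samples&) = delete;

    std::size_t size() const { return mSize; }
    const double* data() const { return mData; }

private:
    std::size_t mSize;
    double* mData;
};

// Small self-check of the sketch.
void demoMove()
{
    Samples source(4);
    const double* address = source.data();

    Samples target(std::move(source));
    assert(target.data() == address);    // no element was copied
    assert(source.data() == nullptr);    // the source no longer owns the array
    assert(source.size() == 0);
}
```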

Only implement move constructors/assignment operators if your class needs optimizations for move semantics.

Use delegating and inheriting constructors when they reduce code duplication. Be aware of self delegation.

Delegating and inheriting constructors are two different features, both introduced in C++11, for reducing code duplication in constructors. Delegating constructors allow the constructor to forward work to another constructor of the same class, using a special variant of the initialization list syntax.

Many classes have multiple constructors, especially in STAR. Often, an empty constructor sets variables to either 0 or any default parameters, while the non-empty constructors are used to set the data members to the values provided. Other examples are constructors that take different arguments, depending on the context in which the class is used. Often only parts of the data defined in the class are known at construction time.

In many cases this requires either writing constructors with semi-identical code, which makes code maintenance difficult, or providing a private init() function that is called internally by the various constructors.

Here’s an example of how it was done in C++98:

    class A {
    public:
        A(): num1(0), num2(0) {average=(num1+num2)/2.;}
        A(int i): num1(i), num2(0) {average=(num1+num2)/2.;}
        A(int i, int j): num1(i), num2(j) {average=(num1+num2)/2.;}
    private:
        int num1;
        int num2;
        double average;
    };

To at least keep the repetition to a minimum, this is often done:

    class A {
    public:
        A(): num1(0), num2(0) {init();}
        A(int i): num1(i), num2(0) {init();}
        A(int i, int j): num1(i), num2(j) {init();}
    private:
        int num1;
        int num2;
        double average;
        void init() { average=(num1+num2)/2.; }
    };

This revision eliminates code duplication but it brings the following new problems:
  • Other member functions might accidentally call init(), which causes unexpected results.
  • After we enter a class member function, all the class members have already been constructed. It's too late to call member functions to do the construction work of class members. In other words, init() merely reassigns new values to data members. It doesn’t really initialize them.

Verbosity hinders readability and repetition is error-prone. Both get in the way of maintainability. So, in C++11, we can define one constructor in terms of another:

    class A {
    public:
        A(): A(0){}
        A(int i): A(i, 0){}
        A(int i, int j) {
            num1=i;
            num2=j;
            average=(num1+num2)/2.;
        }
    private:
        int num1;
        int num2;
        double average;
    };

Delegating constructors make the program clear and simple. Delegating and target constructors do not need any special marking or treatment to act as delegating or target constructors. They have the same interfaces as other constructors. A delegating constructor can be the target constructor of another delegating constructor, forming a delegating chain. Target constructors are chosen by overload resolution or template argument deduction. In the delegating process, delegating constructors get control back and can do individual operations after their target constructors exit.

If not careful, one can write a constructor that delegates to itself (here indirectly, through the cycle C(char) -> C(double) -> C(char)):

    class C {
    public:
        C(int) { }
        C() : C(42) { }
        C(char) : C(42.0) { }
        C(double) : C('a') { }
    };

    int main() {
        C c('b');
        return 0;
    }

Different compilers handle this case differently. For example, clang reports an error, while gcc compiles the code, which then crashes with a stack overflow at run time. Note that the above code is placed inside the class definition only for demonstration purposes. In STAR, the use of code in class declarations is strongly discouraged since it reduces readability.

A derived class, by default, inherits all functions of the base class. This is not the case for constructors. Since C++11 it is possible to explicitly inherit the constructors of a base class. This can be a significant simplification for derived classes that don't need custom constructor logic.

    class Base {
    public:
        Base();
        Base(int number);
        Base(const std::string& name);
        ...
    };

    class Derived : public Base {
    public:
        using Base::Base; // Base's constructors are redeclared here.
    };

This is especially useful when Derived's constructors don't have to do anything more than calling Base's constructors.

Use delegating and inheriting constructors when they reduce code duplication.
Be cautious about inheriting constructors when your derived class has new member variables and use in-class member initialization for the derived class's member variables. When coding delegating constructors be aware of self delegation.
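
The caution about new member variables can be illustrated with a small sketch. The class and member names below are hypothetical. The inherited constructor knows nothing about the new member mScale, so an in-class member initializer is what gives it a well-defined value.

```cpp
class Base {
public:
    Base() : mNumber(0) {}
    explicit Base(int number) : mNumber(number) {}
    int number() const { return mNumber; }
private:
    int mNumber;
};

class Derived : public Base {
public:
    using Base::Base;  // inherit Base(int)

    double scale() const { return mScale; }
private:
    // In-class member initializer: applied no matter which
    // constructor (inherited or implicit default) is used.
    double mScale = 1.0;
};
```

Without the `= 1.0` initializer, mScale would be left uninitialized after `Derived d(5);`.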

Every class must free resources (objects, IO handlers, etc.) it allocated during its lifetime. The base class destructors must be declared virtual if they are public.

In a polymorphic design, special care is needed in implementing base class destructors. If deletion through a pointer to a base class Base should be allowed, then the Base destructor must be public and virtual. Otherwise, it should be protected and can be non-virtual.

Always write a destructor for a base class, because the implicitly generated one is public and non-virtual.

In some class designs the destructors (of all classes in the inheritance tree) do nothing (implying that the classes and their members never allocate any resources). Typically, such designs do not have any virtual functions at all, and the virtual destructor would be the only reason for the existence of a vtable. Then a virtual destructor may be unnecessary and may be omitted.
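
The following sketch shows why the public destructor of a polymorphic base class needs to be virtual. All names are hypothetical; the counter reference exists only to make the destructor call observable.

```cpp
// Base class meant to be deleted through a base pointer:
// its destructor is public and virtual.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Square : public Shape {
public:
    Square(double side, int& destroyedCounter)
        : mSide(side), mDestroyed(destroyedCounter) {}
    ~Square() override { ++mDestroyed; }  // records that it ran
    double area() const override { return mSide * mSide; }
private:
    double mSide;
    int& mDestroyed;
};

// Deletes the object through a pointer to the base class. Because
// ~Shape() is virtual, ~Square() is called as well.
double areaAndDelete(Shape* shape) {
    const double result = shape->area();
    delete shape;
    return result;
}
```

If ~Shape() were not virtual, the delete in areaAndDelete() would have undefined behavior.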
Use a struct only for passive objects that carry data; everything else is a class.

The struct and class keywords behave almost identically in C++. We add our own semantic meanings to each keyword, so you should use the appropriate keyword for the data-type you're defining.

structs should be used for passive objects that carry data, and may have associated constants, but lack any functionality other than access/setting the data members. The accessing/setting of fields is done by directly accessing the fields rather than through method invocations. Methods should not provide behavior but should only be used to set up the data members, e.g., constructor, destructor, initialize(), reset(), validate().

If more functionality is required, a class is more appropriate.

You can use struct instead of class for functors and traits.

Note that member variables in structs and classes have different naming rules.

When using inheritance, make it public and declare overridden methods as override or final. However, composition is often more appropriate than inheritance especially if a class is not designed to be a base class.

Since C++11 it is possible to mark virtual methods as overriding a virtual method from the base class using the keyword override. This is useful to state the intent and to get a compile error (instead of an otherwise silent error) if this intent is not fulfilled for some reason (e.g. typo in the method name, mismatching method signature, virtual keyword forgotten in the base class). For example:

    class Base {
        virtual void a(int);
        virtual void f();
        virtual void g() const;
        void k();                 // not virtual
        virtual void h(char);
    };

    class Derived : public Base {
        void a(float) override;   // doesn't override Base::a(int) (wrong signature)
        void f() override;        // overrides Base::f()
        void g() override;        // doesn't override Base::g() (missing const)
        void k() override;        // doesn't override Base::k() (Base::k() is not virtual)
        void h(char);             // overrides Base::h()
    };

The error given by the gcc compiler when there is a problem with an override attempt is of the form: error: ‘void foo::foo()’ marked override, but does not override.

The final keyword tells the compiler that derived classes may not override the virtual method anymore. This is useful to limit abuse of your classes by users, but it closes the possibility of better implementations of methods in derived classes.

    class Base {
        virtual void f() final;
    };

    class Derived : Base {
        void f(); // ill-formed because the virtual method Base::f has been marked final
    };

It can also be useful to prevent inheritance from classes (if a programmer tries to inherit from a class that is declared final by the author, it is an indication that composition is more appropriate than inheritance in that case).

    class Base final { };

    class Derived : Base { }; // ill-formed because the class Base has been marked final

When a class inherits from a base class, it includes the definitions of all the data and operations that the base class defines. In practice, inheritance is used in two major ways in C++: implementation inheritance, in which actual code is inherited by the derived class, and interface inheritance, in which only method names are inherited. Implementation inheritance reduces code size by re-using the base class code as it specializes an existing type. Because inheritance is a compile-time declaration, you and the compiler can understand the operation and detect errors. Interface inheritance can be used to programmatically enforce that a class expose a particular API. Again, the compiler can detect errors, in this case, when a class does not define a necessary method of the API. For implementation inheritance, because the code implementing a derived class is spread between the base and the derived class, it can be more difficult to understand an implementation. The derived class cannot override functions that are not virtual, so the derived class cannot change implementation. The base class may also define some data members, so that specifies physical layout of the base class.

All inheritance should be public. If you want to do private inheritance, you should be including an instance of the base class as a member instead.

Do not overuse implementation inheritance. Composition is often more appropriate.

State your intent when you want to override a virtual method by using the keyword override.

Use multiple inheritance only when at most one of the base classes has an implementation; all other base classes must be pure interface classes. Multiple inheritance allows a sub-class to have more than one base class. However, this functionality can lead to the so-called diamond problem unless the base classes are pure interfaces. Multiple inheritance is allowed only when all superclasses, with the possible exception of the first one, are pure interfaces. If a class was designed as a pure interface, keep it as a pure interface.

A class is a pure interface if it meets the following requirements:

  • It has only public pure virtual ("= 0") methods and static methods (see Destructors).
  • It does not have data members.
  • It does not have any constructors defined. If a constructor is provided, it must take no arguments and it must be protected.
  • If it is a subclass, it may only be derived from classes that satisfy these conditions.
When writing a pure interface, apply the corresponding naming rule and make sure there is no implementation in it. Make sure not to add implementation to an existing pure interface.
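
A minimal sketch of the allowed form of multiple inheritance: one base class with an implementation, followed by pure interfaces. All class names are hypothetical, they do not follow any particular STAR naming rule, and the method bodies are inlined only for brevity. The empty virtual destructors in the interfaces are the destructor exception referred to in the list above.

```cpp
#include <string>

// Pure interface: only public pure virtual methods, no data members.
class Printable {
public:
    virtual ~Printable() {}
    virtual std::string description() const = 0;
};

// A second pure interface.
class Resettable {
public:
    virtual ~Resettable() {}
    virtual void reset() = 0;
};

// The only base class that carries an implementation.
class Counter {
public:
    Counter() : mCount(0) {}
    virtual ~Counter() {}
    void increment() { ++mCount; }
    int count() const { return mCount; }
protected:
    void setCount(int count) { mCount = count; }
private:
    int mCount;
};

// Implementation base first, pure interfaces afterwards.
class HitCounter : public Counter, public Printable, public Resettable {
public:
    std::string description() const override {
        return "hits: " + std::to_string(count());
    }
    void reset() override { setCount(0); }
};
```

Client code can then work with a `Resettable&` or `Printable&` without knowing about Counter at all.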
When overloading operators keep the same semantics. Operator overloading is a specific case of function overloading in which some or all operators like +, = or == have different behaviors depending on the types of their arguments. It can easily be emulated using function calls.

For example: a << 1; shifts the bits of the variable left by one bit if a is an integer, but if a is an output stream instead this will write "1" to it.

The semantics of the operator overloading should be kept the same. Because operator overloading allows the programmer to change the usual semantics of an operator, it should be used with care.
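
As a sketch of what keeping the same semantics can look like, the hypothetical two-component vector below overloads + and == so that they behave the way they do for built-in arithmetic types: + returns a new value without modifying its operands and is commutative, and != is the exact negation of ==.

```cpp
// Hypothetical passive data type.
struct Vec2 {
    double x;
    double y;
};

// Component-wise sum; operands are left untouched, as with built-in +.
inline Vec2 operator+(const Vec2& a, const Vec2& b) {
    return Vec2{a.x + b.x, a.y + b.y};
}

// Equality compares all components, as a reader would expect.
inline bool operator==(const Vec2& a, const Vec2& b) {
    return a.x == b.x && a.y == b.y;
}

// != is defined in terms of == so the two can never disagree.
inline bool operator!=(const Vec2& a, const Vec2& b) {
    return !(a == b);
}
```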

The public, protected and private keywords must be used explicitly in the class declaration in order to make code more readable. It is recommended to list the public data members and methods first since they define the global interface and are most important for the user/reader.

Don't do this:

    class Momentum {
        double mX; // w/o access control keyword implicitly private
        double mY;
        double mZ;
    protected:
        bool containsInvalidNumbers() const;
    public: // should be listed first
        double px() const;
        double py() const;
        double pz() const;
        double pT() const;
    };

Do this:

    class Momentum {
    public:
        double px() const;
        double py() const;
        double pz() const;
        double pT() const;
    protected:
        bool containsInvalidNumbers() const;
    private:
        double mX;
        double mY;
        double mZ;
    };
Hide internals. Avoid returning handles to internal data managed by your class.

Information hiding protects the state of your object from uncontrolled modification by clients, and it also helps to minimize dependencies between calling and called code.

A class consisting mostly of gets/sets is probably poorly designed. Consider providing an abstraction or changing it into a struct.

Make data members private, except in structs. If there is no better way to hide the class internals, provide access through protected or public accessor and, if really needed, modifier functions.

See also Inheritance, Structs vs. Classes and Function Names.

The use of friend declarations should be avoided where possible. Friends are a mechanism to override data hiding. Friends of a class have access to its private data. Friend is a `limited export' mechanism. Friends have three problems:
  • They can change the internal state of objects from outside the definition of the class.
  • They introduce extra coupling between components, and therefore should be used sparingly.
  • They have access to everything, rather than being restricted to the members of interest to them.
To preserve the encapsulation of a base class, it is always preferable to use protected data rather than friends or public data, and then use inheritance.
Prefer the use of fundamental types built into C++ over ROOT types, except where absolutely required.

ROOT defines a large set of portable and unportable types such as Int_t, Float_t, Double_t, and many more. There is a priori no reason to use any of those when not needed. Use of builtin C++ types makes the code in fact more portable, readable, and often faster (see discussion under "Extended integer types"). If fixed size is absolutely required the introduction of extended integer types of fixed size and guaranteed size types in C++11 makes ROOT types redundant.

The only exceptions are data members in “persistent” classes (e.g. StEvent) under schema evolution. Here ROOT types need to be used to define the data members.

Avoid the use of ROOT definitions of built-in data types and use the built-in C++ types. Use int instead of Int_t, float instead of Float_t, double instead of Double_t, etc.
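
A short sketch of both cases mentioned above: plain built-in types for ordinary code, and the C++11 fixed-width types from <cstdint> where the size really matters. The function names and the bit layout are hypothetical.

```cpp
#include <cstdint>

// Ordinary code: plain int and double, no Int_t or Double_t needed.
int countAbove(const double* values, int size, double threshold) {
    int count = 0;
    for (int i = 0; i < size; ++i) {
        if (values[i] > threshold) {
            ++count;
        }
    }
    return count;
}

// Fixed size required (e.g. a packed identifier): use <cstdint> types.
// Hypothetical layout: upper 16 bits sector, lower 16 bits pad.
std::uint32_t packChannel(std::uint16_t sector, std::uint16_t pad) {
    return (static_cast<std::uint32_t>(sector) << 16) | pad;
}
```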
Prefer the use of mathematical functions available in the C++ standard (<cmath>) over those provided by ROOT.

ROOT provides a rich set of special mathematical functions often adapted from the old CERNLIB or, more recently, wrapped GSL functions. They are heavily used in STAR code. However, ROOT also provides a set of basic mathematical functions that are already defined in <cmath>. Examples are TMath::Sqrt(), TMath::Log(), TMath::Sin(), and many more.

There is no rational reason to use these ROOT functions when the same functionality is available in the standard and defined in <cmath>. In most cases they are implemented by calling the built-in functions anyway and their use reduces readability and portability.

The use of ROOT mathematical functions already available in <cmath> is strongly discouraged. Use sqrt() instead of TMath::Sqrt(), use log() instead of TMath::Log(), use sin() instead of TMath::Sin(), etc.
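
For instance, two typical kinematic quantities can be written with <cmath> only. The function names are hypothetical and serve only as an illustration.

```cpp
#include <cmath>

// Transverse momentum from the x and y components,
// using std::sqrt instead of TMath::Sqrt.
double transverseMomentum(double px, double py) {
    return std::sqrt(px * px + py * py);
}

// Azimuthal angle in radians, using std::atan2 instead of TMath::ATan2.
double azimuth(double px, double py) {
    return std::atan2(py, px);
}
```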
"Attributes" is a new standard syntax aimed at providing some order in the mess of facilities for adding optional and/or vendor specific information (GNU, IBM, …) into source code. The use of attributes is discouraged in STAR.

Vendors use a multitude of methods to add specific information into source code, mostly through preprocessor/macro statements such as __attribute__, __declspec, and #pragma. Attributes were added to C++ in order to unify and streamline this procedure. As such their use for the common (STAR) programmer is limited. An attribute can be used almost everywhere in the C++ program, and can be applied to almost everything: to types, to variables, to functions, to names, to code blocks, and to entire translation units, although each particular attribute is only valid where it is permitted by the implementation.

An attribute is placed within double square brackets. Only a few attributes are defined so far: [[noreturn]], [[carries_dependency]], [[deprecated]] (C++14), and [[deprecated("reason")]] (C++14). Future attributes are under discussion to support MP.

Example:

    void boom [[ noreturn ]] () // boom() will never return
    {
        throw "error";
    }

The use of attributes is discouraged in STAR. There is a reasonable fear that attributes will be misused. The recommendation is to use attributes to only control things that do not affect the meaning of a program but might help detect errors (e.g. [[noreturn]]) or help optimizers (e.g. [[carries_dependency]]).
Use C++ exceptions instead of return codes for error handling. Do not use exceptions to return values.

Exceptions should be used for error handling.
Exception classes typically derive from std::exception (or one of its subclasses like std::runtime_error).
Exceptions should be scoped inside the class that throws them.
By default, catch exceptions by reference.
Do this:

    int computePedestals() {
        ...
        if (somethingWrong) {
            throw BadComputation();
        }
        ...
    }
    ...
    try {
        computePedestals();
    } catch (BadComputation& e) { // catch exception by reference
        // code that handles error
        ...
    }

Don't do this:

    int computePedestals() {
        ...
        if (somethingWrong) {
            return -1;
        }
        ...
    }
    ...
    if (computePedestals() == -1) {
        // code that handles error
        ...
    }
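
The remaining points of this rule (deriving from a standard exception class, scoping the exception inside the class that throws it, and catching by reference) can be sketched as follows. PedestalCalculator and pedestalOrDefault are hypothetical names, and the bodies are inlined only for brevity.

```cpp
#include <stdexcept>
#include <string>

class PedestalCalculator {
public:
    // Exception type scoped inside the class that throws it,
    // derived from std::runtime_error.
    class BadComputation : public std::runtime_error {
    public:
        explicit BadComputation(const std::string& message)
            : std::runtime_error(message) {}
    };

    // Returns the mean; throws instead of returning an error code.
    double compute(double sum, int nEntries) const {
        if (nEntries <= 0) {
            throw BadComputation("no entries to compute pedestal from");
        }
        return sum / nEntries;
    }
};

// Caller: catches the exception by (const) reference and falls back
// to a default value supplied by the caller.
double pedestalOrDefault(const PedestalCalculator& calculator,
                         double sum, int nEntries, double fallback) {
    try {
        return calculator.compute(sum, nEntries);
    } catch (const PedestalCalculator::BadComputation&) {
        return fallback;
    }
}
```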

Declare objects that are logically constant as const. Design const-correct interfaces. Consider constexpr for some uses of const.

The Standard Library […] in simple words says that it expects operations on const objects to be thread-safe. This means that the Standard Library won't introduce a data race as long as operations on const objects of your own types either

  • Consist entirely of reads –that is, there are no writes–; or
  • Internally synchronizes writes.
[Source: Stack Overflow]

This is a great example of how C++11 is a simpler language: we can stop the Cold War-era waffling about subtleties about what 20th-century C++ const means, and proudly declare modern C++ const has the simple and natural and “obvious” meaning that most people expected all along anyway.

[…] Bjarne Stroustrup writes: “I do point out that const means immutable and absence of race conditions in the last Tour chapter. […]”

[Source: isocpp.org]
Variables and parameters can be declared as const to indicate that the variables are logically immutable. (Because of const_cast and mutable member variables, and global variables, const is no hard guarantee for immutability.) Member functions can be declared const to allow calls with const this pointer. Note that overloading member functions on const is possible.

const variables, data members, methods and arguments add a level of compile-time type checking; it is better to detect errors as soon as possible. Therefore we strongly recommend that you use const whenever it makes sense to do so.

Use const:

  • For an argument, if the function does not modify it when passed by reference or by pointer.
  • For accessors.
  • For methods, if they:
    • do not modify any non-local data;
    • can be safely (no data race) called from multiple threads;
    • do not call any non-const methods;
    • do not return a non-const pointer or non-const reference to a data member.
  • For data members, whenever they do not need to be modified after construction.

mutable can be used to make objects that are already threadsafe (such as std::mutex) mutable in const methods. Thus, it is possible to make const methods thread-safe, through internal synchronization.
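
A minimal sketch of internal synchronization with a mutable mutex. The class name EventTally is hypothetical and the bodies are inlined only for brevity. The const method count() has to lock the mutex, which is only possible because the mutex is declared mutable.

```cpp
#include <mutex>

class EventTally {
public:
    EventTally() : mCount(0) {}

    void add(int n) {
        std::lock_guard<std::mutex> lock(mMutex);
        mCount += n;
    }

    // const method: locking modifies the mutex, hence 'mutable' below.
    int count() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mCount;
    }

private:
    mutable std::mutex mMutex;  // already thread-safe by itself
    int mCount;
};
```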

In C++11, use constexpr to define true constants or to ensure constant initialization. One of the improvements in C++11, generalized constant expressions, allows programs to take advantage of compile-time computation. It is a feature that, if used correctly, can speed up programs. The basic idea of constant expressions is to allow certain computations to take place at compile time (literally while your code compiles) rather than when the program itself is run.

    constexpr int multiply(int x, int y) { return x*y; }
    constexpr int factorial(int n) { return n <= 1 ? 1 : (n * factorial(n-1)); }

Then the compiler will evaluate the following statements at compile time instead of at run time:

    const int val = multiply(10,10);
    const int n5 = factorial(5);

Another benefit of constexpr, beyond the performance of compile-time computation, is that it allows functions to be used in all sorts of situations that previously would have called for macros. For example, let's say you want to have a function that computes the size of an array based on some multiplier. If you had wanted to do this in C++ without constexpr, you'd have needed to create a macro since you can't use the result of a function call to declare an array. But with constexpr, you can now use a call to a constexpr function inside an array declaration.

Example:

    constexpr int defaultArraySize(int multiplier) { return 10*multiplier; }

and in the program it is now possible to use:

    int array[defaultArraySize(3)];

Note that a constexpr specifier used in an object declaration implies const. A constexpr specifier used in a function declaration implies inline. In C++11, if you declare a non-static class member function to be constexpr, that marks the function as const as well (this implicit const was removed in C++14, so write const explicitly). If you declare a variable as constexpr, that in turn marks the variable as const. However, it doesn't work the other way: a const function is not constexpr, nor is a const variable constexpr.

Objects of class type can be made constexpr as well. In this case the constructor must be declared constexpr, as well as any method to be used in constant expressions.

In C++11, constexpr functions are also subject to some limitations:
  • The body must consist of a single return statement (with a few exceptions).
  • It can call only other constexpr functions.
  • It can reference only constexpr global variables.
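
A minimal sketch of a constexpr object of class type, written within the C++11 limitations listed above: a constexpr constructor plus single-return constexpr methods. The class Length and the constant PadWidth are hypothetical, and the bodies are inlined because constexpr functions must be defined before they are used in a constant expression.

```cpp
class Length {
public:
    // constexpr constructor: allows constexpr objects of this type.
    constexpr explicit Length(double millimeters) : mMillimeters(millimeters) {}

    // Single return statement each, as required by C++11.
    constexpr double millimeters() const { return mMillimeters; }
    constexpr double centimeters() const { return mMillimeters / 10.0; }

private:
    double mMillimeters;
};

// A true constant of a user-defined type.
constexpr Length PadWidth(25.0);

// Checked by the compiler, not at run time.
static_assert(PadWidth.centimeters() == 2.5, "PadWidth should be 2.5 cm");
```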

Some variables can be declared constexpr to indicate the variables are true constants, i.e. fixed at compilation/link time. Some functions and constructors can be declared constexpr which enables them to be used in defining a constexpr variable.

Use of constexpr enables definition of constants with floating-point expressions rather than just literals; definition of constants of user-defined types; and definition of constants with function calls. Prematurely marking something as constexpr may cause migration problems if later on it has to be downgraded. Current restrictions on what is allowed in constexpr functions and constructors may invite obscure workarounds in these definitions.

constexpr definitions enable a more robust specification of the constant parts of an interface. Use constexpr to specify true constants and the functions that support their definitions. Avoid complicating function definitions to enable their use with constexpr. Do not use constexpr to force inlining.

While constexpr variables are constant expressions, they can still have an address. Thus, using a constexpr variable as argument for a const-ref function parameter requires the constexpr variable to have a symbol. Consider the following header file:

    constexpr int GlobalScopeValue = 0;

    namespace Namespace {
        constexpr int ScopeValue = 1;
    }

    struct Struct {
        static constexpr int ScopeValue = 1;
    };

    template<typename T> struct TemplateStruct {
        static constexpr int ScopeValue = 1;
    };

    void function(const int &value);

And the following test code:

    function(GlobalScopeValue);                // fine
    function(Namespace::ScopeValue);           // fine
    function(Struct::ScopeValue);              // link error
    function(TemplateStruct<int>::ScopeValue); // link error

To provide the missing symbols you have to add

    template<typename T> constexpr int TemplateStruct<T>::ScopeValue;

to the header file and

    constexpr int Struct::ScopeValue;

to one .cxx file.
C++11 new suffix return value syntax (or extended function declaration syntax) represents another use for auto. It is useful mostly in templates and in methods where the return type is the class itself. The new return syntax, however, is not as easy to read as the standard method and should only be used where necessary. It should not be regarded as an alternative way of defining a simple function.

In all prior versions of C and C++, the return type of a function absolutely had to go before the function name:

    int multiply (int x, int y);

In C++11, you can now put the return type at the end of the function declaration, substituting auto for the name of the return type:

    auto multiply (int x, int y) -> int;

In the above example the use of the new syntax does not provide any advantage; in fact it makes the declaration less readable. However, there are several cases where the new syntax is in fact the only way to make things work.

Consider:

    template<class T, class U>
    ??? add(T x, U y)
    {
        return x+y;
    }

What can we write as the return type? It's the type of "x+y", of course, but how can we say that? First idea, use decltype:

    template<class T, class U>
    decltype(x+y) add(T x, U y) // scope problem!
    {
        return x+y;
    }

That won't work because x and y are not in scope. The solution is to put the return type where it belongs, after the arguments:

    template<class T, class U>
    auto add(T x, U y) -> decltype(x+y)
    {
        return x+y;
    }

We use the notation auto to mean "return type to be deduced or specified later." The suffix syntax is not primarily about templates and type deduction, it is really about scope. The use of the new return type syntax is very useful in templates and in methods where the return type is the class itself. See also decltype. The new return syntax, however, is not as easy to read as the standard method and should only be used where necessary and to simplify the code. It should not be regarded as an alternative way of defining a simple function. Use the suffix syntax only if absolutely required.
It is a modern C++ idiom to get rid of naked pointers whenever possible. However, it is currently difficult to devise an error free scheme where smart pointers can live in harmony with ROOT object ownership and management rules. Avoid using smart pointers in STAR code. This decision could be revisited in the future if conflict with ROOT is resolved. Smart pointers have existed long before C++11. But since C++11 the standard library contains the classes unique_ptr<T>, shared_ptr<T>, and weak_ptr<T>. Also, the standard library provides make_shared<T> and starting with C++14 also make_unique<T>. Smart pointers are objects that act like pointers, but automate ownership. There are two main semantics for ownership: unique and shared ownership.

Unique ownership ensures that there can be only one smart pointer to the object. If that smart pointer goes out of scope it will free the object.

Shared ownership allows multiple pointers to an object without deciding who is the exclusive owner. Thus the owners can be destroyed in any order and the object will stay valid until the last owner is destroyed, at which point the object is freed as well. Note that the reference counting of shared_ptr<T> is thread-safe and thus enables sharing ownership across multiple threads; access to the pointed-to object itself still needs to be synchronized.

Example:

    {
        std::shared_ptr<int> first;
        {
            std::unique_ptr<int> second(new int);
            auto third = std::make_shared<int>();
            first = third;
        } // only second is freed automatically here
    } // first and third are automatically freed here

When exiting the inner scope, only second is freed automatically, because the last reference to it went out of scope. But even though third went out of scope here, no free occurred because first still holds a reference. Only when first goes out of scope, being the last reference, is the object created through third freed automatically.

Smart pointers are extremely useful for preventing memory leaks, and are essential for writing exception-safe code. They also formalize and document the ownership of dynamically allocated memory. Smart pointers are not easy to work with in an environment where ROOT is used (another possible conflict is with the STAR "WhiteBoard" of StMakers).

    int main() {
        TFile f("out.root", "recreate");
        f.cd();
        std::unique_ptr<TH1F> h {new TH1F("h", "h", 100, -5, 5)};
        h->FillRandom("gaus", 10000);
        h->Write();
        f.Close();
        return 0;
    }

The histogram, which is handled by a unique pointer, was owned by the current gDirectory. Since the file was politely closed before the program exits, the histogram was destroyed by ROOT's memory management. Now at the end of main() the pointer to the histogram goes out of scope and its resource needs to be freed, but it has already been freed! Avoid using smart pointers in STAR code.

Avoid magic numbers (hard coded numbers).

Avoid spelling literal constants like 42 or 3.141592 in code. Hard-coded numbers within the code reduce portability and make maintenance harder. The use of symbolic names and expressions is a valid solution. Make use of const and constexpr. Names add information and introduce a single point of maintenance.

Example of constants at namespace level:

    static constexpr double Millimeter = 1.;
    static constexpr double Centimeter = 10.*Millimeter;

Example of class-specific constants:

    // File Widget.h
    class Widget {
    private:
        static const int sDefaultWidth;            // value provided in definition
        static constexpr int DefaultHeight = 600;  // value provided in declaration
    };

    // File Widget.cxx
    const int Widget::sDefaultWidth = 800;         // value provided in definition
    constexpr int Widget::DefaultHeight;           // definition required only if reference/pointer to
                                                   // DefaultHeight is needed

For values which are likely to change with time, a database approach should be considered. Refer to the STAR database Web page area for more information.

Avoid macros. Use inline functions, constexpr functions, enums, constexpr variables, or templates instead if they can solve the problem.

Macros mean that the code you see is not the same as the code the compiler sees. This can introduce unexpected behavior, especially since macros have global scope.

The following usage pattern will avoid many problems with macros; if you use macros, follow it whenever possible:

  • Don't define macros in a .h file.
  • Define macros (via #define) right before you use them, and undefine them (via #undef) right after.
  • Do not just undefine an existing macro (via #undef) before replacing it with your own; instead, pick a name that's likely to be unique.
  • Try not to use macros that expand to unbalanced C++ constructs, or at least document that behavior well.
  • Prefer not using ## to generate function/class/variable names.
  • Follow the naming convention as described here.
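
To illustrate the alternatives named in this rule, here is a sketch that replaces two typical macros with a constexpr variable and a constexpr function template. All names are hypothetical. Unlike the macro version, square() evaluates its argument exactly once and obeys normal scoping rules.

```cpp
// Instead of: #define MAX_TRACKS 1000
constexpr int MaxTracks = 1000;

// Instead of: #define SQUARE(x) ((x) * (x))
// The macro would evaluate its argument twice; this function does not.
template <typename T>
constexpr T square(T x) {
    return x * x;
}

// Helper with a side effect, used to show the difference to the macro.
inline int nextId(int& counter) {
    return ++counter;
}
```

With `int counter = 2;`, the call `square(nextId(counter))` increments counter once and yields 9, whereas `SQUARE(nextId(counter))` would call nextId() twice.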
Prefer small and focused functions.

Long functions are hard to debug and hard to read. Short functions allow code reuse. If a function exceeds about 40 lines, think about whether it can be broken up without harming the structure of the program.
Giving the function a name that describes what it does might help in splitting it into smaller pieces. Functions should represent logical groupings, therefore it should be easy to assign them meaningful names.
Please note that nesting is not the same as splitting long functions into short ones. In addition, it does not improve readability or ease of debugging.

Use RTTI with caution. If you find yourself overusing dynamic_cast consider reviewing the design of your code and classes. RTTI allows a programmer to query the C++ class of an object at run time. This is done by use of typeid or dynamic_cast.

Querying the type of an object at run-time frequently means a design problem and is often an indication that the design of your class hierarchy is flawed.

Undisciplined use of RTTI makes code hard to maintain. It can lead to type-based decision trees or switch statements scattered throughout the code, all of which must be examined when making further changes.

Decision trees based on type are a strong indication that your code is on the wrong track.

    if (typeid(*data) == typeid(Data1)) {
        ...
    } else if (typeid(*data) == typeid(Data2)) {
        ...
    } else if (typeid(*data) == typeid(Data3)) {
        ...

Code such as this usually breaks when additional subclasses are added to the class hierarchy. Moreover, when properties of a subclass change, it is difficult to find and modify all the affected code segments.

The standard alternatives to RTTI (described below) require modification or redesign of the class hierarchy in question. Sometimes such modifications are infeasible or undesirable, particularly in widely-used or mature code.

RTTI can be useful in some unit tests. For example, it is useful in tests of factory classes where the test has to verify that a newly created object has the expected dynamic type. It is also useful in managing the relationship between objects and their mocks.

    // Example of a unit test
    Geo::Factory geoFactory;
    Geo::Object* circle = geoFactory.CreateCircle();
    if ( ! dynamic_cast<Geo::Circle*>(circle) ) {
        std::cerr << "Unit test failed." << std::endl;
    }

RTTI has legitimate uses but is prone to abuse, so you must be careful when using it. You may use it freely in unit tests, but avoid it when possible in other code. In particular, think twice before using RTTI in new code. If you find yourself needing to write code that behaves differently based on the class of an object, consider one of the following alternatives to querying the type:

  • Virtual methods are the preferred way of executing different code paths depending on a specific subclass type. This puts the work within the object itself.
  • If the work belongs outside the object and instead in some processing code, consider a double-dispatch solution, such as the Visitor design pattern. This allows a facility outside the object itself to determine the type of class using the built-in type system.
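As a minimal sketch of the first alternative, a type-based decision tree can be replaced by a virtual method. The class names Data, Data1 and Data2 and the functions describe() and process() are hypothetical and only serve as illustration.

```cpp
#include <string>

// Hypothetical hierarchy, for illustration only.
class Data {
public:
  virtual ~Data() = default;
  // Each subclass supplies its own behaviour; callers never ask for the type.
  virtual std::string describe() const = 0;
};

class Data1 : public Data {
public:
  std::string describe() const override { return "Data1"; }
};

class Data2 : public Data {
public:
  std::string describe() const override { return "Data2"; }
};

// Processing code works for any current or future subclass of Data.
std::string process(const Data& data) {
  return data.describe();
}
```

Adding a new subclass then only requires implementing describe() in that class; process() does not need to change.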

When the logic of a program guarantees that a given instance of a base class is in fact an instance of a particular derived class, the use of dynamic_cast or static_cast may also be justified.

An example of code based on dynamic_cast:

    void foo(Bar* bar) {
      // ... some code where x, y, z are defined ...
      // ...
      if (Data1* data1 = dynamic_cast<Data1*>(bar)) {
        doSomething(data1, x, y);
      } else if (Data2* data2 = dynamic_cast<Data2*>(bar)) {
        doSomething(data2, z);
      }
    }

which can be rewritten using the Visitor pattern:

    void foo(Bar* bar) {
      // ... some code where x, y, z are defined ...
      // ...
      DoSomethingVisitor visitor(x, y, z);
      bar->accept(visitor);
    }
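The Visitor snippet above does not show how the visitor and accept() are declared. One possible arrangement is sketched below. The interface name BarVisitor, the class layout and the recorded strings are assumptions made for this illustration, and foo takes the visitor as a parameter here only so that the result can be inspected.

```cpp
#include <string>

class Data1;
class Data2;

// Hypothetical visitor interface: one overload per concrete class.
class BarVisitor {
public:
  virtual ~BarVisitor() = default;
  virtual void visit(Data1& data) = 0;
  virtual void visit(Data2& data) = 0;
};

class Bar {
public:
  virtual ~Bar() = default;
  virtual void accept(BarVisitor& visitor) = 0;
};

class Data1 : public Bar {
public:
  // *this has static type Data1 here, so visit(Data1&) is selected.
  void accept(BarVisitor& visitor) override { visitor.visit(*this); }
};

class Data2 : public Bar {
public:
  void accept(BarVisitor& visitor) override { visitor.visit(*this); }
};

// Stands in for DoSomethingVisitor; it only records which overload ran.
class DoSomethingVisitor : public BarVisitor {
public:
  void visit(Data1&) override { mLast = "Data1"; }
  void visit(Data2&) override { mLast = "Data2"; }
  const std::string& last() const { return mLast; }

private:
  std::string mLast;
};

void foo(Bar* bar, DoSomethingVisitor& visitor) {
  bar->accept(visitor);  // first dispatch on bar, second on the visitor
}
```

The type-specific work lives in the visit() overloads, so no cast or typeid query is needed in foo.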
In general, avoid designs that require casting. You may use static_cast when necessary, but avoid const_cast and reinterpret_cast. C-casts are forbidden.
  • Try to avoid casts. The need for casts may be a hint that too much type information was lost somewhere.
  • Use static_cast to explicitly convert values between different types. static_casts are useful for up-casting pointers in an inheritance hierarchy.
  • Avoid const_cast. (Possibly use mutable member variables instead.) const_cast may be used to adapt to const-incorrect interfaces that you cannot fix or get fixed.
  • reinterpret_casts are powerful but dangerous; try to avoid them. Code that requires a reinterpret_cast should document the aliasing implications. (reinterpret_cast on cppreference.com)
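To illustrate the remark about mutable members, here is a small sketch of a const method that caches a result without const_cast. The class Track and its members are hypothetical names chosen for this example.

```cpp
#include <cmath>

// Hypothetical class: caches an expensive result inside a const method.
class Track {
public:
  Track(double px, double py) : mPx(px), mPy(py), mPtValid(false), mPt(0.0) {}

  double pt() const {
    if (!mPtValid) {
      // Allowed without const_cast because the cache members are mutable.
      mPt = std::sqrt(mPx * mPx + mPy * mPy);
      mPtValid = true;
    }
    return mPt;
  }

private:
  double mPx;
  double mPy;
  mutable bool mPtValid;
  mutable double mPt;
};
```

Note that such a cache is not thread-safe as written; that aspect is outside the scope of this sketch.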

See the RTTI section for guidance on the use of dynamic_cast.

For the dangers of reinterpret_cast consider:

    std::uint32_t fun() {
      std::uint32_t binary = 0;
      reinterpret_cast<float &>(binary) = 1.f;
      return binary; // the return value is undefined, according to the C++ standard
    }

The following is better, but still undefined behavior according to the type aliasing rules:

    std::uint32_t fun() {
      float value = 1.f;
      return reinterpret_cast<std::uint32_t &>(value); // this returns 0x3f800000 on x86
    }

In case of doubt, prefer not to use reinterpret_cast in order to avoid mistakes.
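If the bit pattern of a float is really needed, one well-defined alternative to the reinterpret_cast versions above is to copy the bytes with std::memcpy. The following is a sketch under the assumption that float is 32 bits wide; the function name floatBits is made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Copies the object representation of a float into a 32-bit integer.
// std::memcpy avoids the aliasing violation of the reinterpret_cast versions.
std::uint32_t floatBits(float value) {
  static_assert(sizeof(std::uint32_t) == sizeof(float),
                "float is assumed to be 32 bits wide");
  std::uint32_t bits = 0;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}
```

On platforms using the IEEE 754 single-precision format, floatBits(1.f) yields 0x3f800000, the same value as the x86 comment above.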
Don't use variable-length arrays or alloca().

Stack-allocated objects avoid the overhead of heap allocation. Variable-length arrays and alloca allow variably-sized stack allocations, whereas all other stack allocations in C++ only allow fixed-size objects on the stack.

Variable-length arrays are part of C but not Standard C++. alloca is a widely available system extension, but it is not part of Standard C++ either.

Use STL containers instead. If you really need to improve the performance, consider using a custom allocator for the containers.

Prefer the prefix form of the increment (++i) and decrement (--i) operators because it has simpler semantics.

If not conditional on an enumerated value, switch statements should always have a default case. Empty loop bodies should use {} or continue.

If not conditional on an enumerated value, switch statements should always have a default case (in the case of an enumerated value, the compiler will warn you if any values are not handled). If the default case should never execute, simply assert:

    switch (value) {
      case 0: {  // 2 space indent
        ...      // 4 space indent
        break;
      }
      case 1: {
        ...
        break;
      }
      default: {
        assert(false);
      }
    }
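For the enumerated case mentioned above, a sketch could look as follows. The enumeration Detector and the function detectorName() are hypothetical; the warning behaviour described in the comment assumes GCC or Clang with -Wall.

```cpp
#include <string>

// Hypothetical enumeration, for illustration only.
enum class Detector { Tpc, Tof, Emc };

std::string detectorName(Detector detector) {
  // No default case: with warnings enabled (e.g. -Wall for GCC and Clang)
  // the compiler can report any enumerator that is not handled here.
  switch (detector) {
    case Detector::Tpc: return "TPC";
    case Detector::Tof: return "TOF";
    case Detector::Emc: return "EMC";
  }
  return "unknown";  // Only reached for values outside the enumeration.
}
```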

Empty loop bodies should use {} or continue, but not a single semicolon.

    while (condition) {
      // Repeat test until it returns false.
    }
    for (int i = 0; i < someNumber; ++i) {}  // Good: empty body.
    while (condition) continue;  // Good: continue indicates no logic.
    while (condition);  // Bad: looks like part of do/while loop.
Range-for loops are useful and should be used. Use references to elements (or const references) in range-for statements, especially when dealing with large objects. Prefer ordinary loops when you need the index information.

C++ syntax is extended to support easier iteration over a list of elements. For example:

    int main() {
      vector<int> v {1, 2, 3, 4, 5};
      for (auto& i : v) // reference to element
      {
        cout << i << endl;
        i += 1; // modifies element value
      }
      for (auto i : v) // copy of element
      {
        cout << i << endl;
      }
      return 0;
    }

Range-for statements work for any type where begin() and end() are defined to return iterators. The use of range-for loops increases code readability.

Programmers often need both the elements of an iterable collection and their indices. This is not directly supported in C++11.

Unless optimized away by the compiler, using a copy of an element could come at a performance cost if the element type is large.

Range-for loops are useful and should be used.

Ordinary loops should be preferred when the programmer needs the element index. Avoid maintaining a separate counter alongside a range-for loop.

Using references to elements is encouraged when dealing with large objects. Use a const reference when you don't need to modify the object state.
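The two recommendations can be sketched side by side. The function names totalLength() and numbered() are invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Reads elements through a const reference: no copies, no modification.
std::size_t totalLength(const std::vector<std::string>& names) {
  std::size_t total = 0;
  for (const auto& name : names) {
    total += name.size();
  }
  return total;
}

// Needs the index, so an ordinary loop is used instead of a separate counter.
std::vector<std::string> numbered(const std::vector<std::string>& names) {
  std::vector<std::string> result;
  for (std::size_t i = 0; i < names.size(); ++i) {
    result.push_back(std::to_string(i) + ":" + names[i]);
  }
  return result;
}
```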

By default, use int if you need an integer type. If you need to guarantee a specific size use the new extended integer types defined in <cstdint>.

There are five standard signed integer types in C++:

    signed char
    short int
    int
    long int
    long long int

and five unsigned integer types:

    unsigned char
    unsigned short int
    unsigned int
    unsigned long int
    unsigned long long int

The C++ standard only loosely specifies the sizes of its built-in integer types and only guarantees that their sizes are:

    signed char = unsigned char ≤ short int = unsigned short int ≤ int = unsigned int ≤ long int = unsigned long int ≤ long long int = unsigned long long int

Although many compilers already implemented them, the C++11 standard now makes the following extended integer types official.

Signed integer types with a width of exactly 8, 16, 32 and 64 bits respectively, with no padding bits and using 2's complement for negative values (provided only if the implementation directly supports the type):

    int8_t  int16_t  int32_t  int64_t

Fastest signed integer types with a width of at least 8, 16, 32 and 64 bits respectively:

    int_fast8_t  int_fast16_t  int_fast32_t  int_fast64_t

Smallest signed integer types with a width of at least 8, 16, 32 and 64 bits respectively:

    int_least8_t  int_least16_t  int_least32_t  int_least64_t

Maximum width integer type:

    intmax_t

Integer type capable of holding a pointer:

    intptr_t

All types also exist in an unsigned version, in which case their names are preceded by a u.

While the standard integer types are sufficient for most tasks, there are times when the precise size has to be defined. In this case consider one of the integer types in <cstdint>. Most compilers implemented these types before C++11, but C++11 makes them official. Three categories of types are defined in the header <cstdint>: Exact-Width Types, Minimum-Width Types, and Fastest Minimum-Width Types.

The exact-width types have precise sizes. The general form is intN_t for signed types and uintN_t for unsigned types, with N indicating the number of bits. Note, however, that not all systems can support all the types. For example, there could be a system for which the smallest usable memory size is 16 bits; such a system would not support the int8_t and uint8_t types.

The minimum-width types guarantee a type that is at least a certain number of bits in size. These types always exist. For example, a system that does not support 8-bit units could define int_least8_t as a 16-bit type.

For a particular system, some integer representations can be faster than others. For example, int_least16_t might be implemented as short, but the system might do arithmetic faster using type int. So <cstdint> also defines the fastest type for representing at least a certain number of bits. These types always exist. In some cases, there might be no clear-cut choice for fastest; in that case, the system simply specifies one of the choices.

In general the standard integer types should be used since they are typically also the “fastest” types on the respective architecture. In cases where the exact size matters, e.g. unpacking of raw data, the extended integer types can be used. The use of ROOT types is discouraged unless it is absolutely necessary (see discussion under "ROOT Related Issues").
Take extra care with code portability. Bear in mind problems of printing, comparisons, and structure alignment related to 32-bit and 64-bit data representations.
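As an example of a case where the exact size matters, the following sketch unpacks a 32-bit raw data word with exact-width types. The word layout, the struct RawHit and the function unpack() are invented for this illustration and do not describe any actual STAR data format.

```cpp
#include <cstdint>

// Hypothetical raw data word layout, invented for this example:
//   bits 31..24 : channel id
//   bits 23..12 : time bin
//   bits 11..0  : ADC value
struct RawHit {
  std::uint8_t  channel;
  std::uint16_t timeBin;
  std::uint16_t adc;
};

RawHit unpack(std::uint32_t word) {
  RawHit hit;
  hit.channel = static_cast<std::uint8_t>((word >> 24) & 0xFFu);
  hit.timeBin = static_cast<std::uint16_t>((word >> 12) & 0xFFFu);
  hit.adc     = static_cast<std::uint16_t>(word & 0xFFFu);
  return hit;
}
```

Using std::uint32_t for the input makes the shifts and masks independent of the size of int or long on the platform.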

Below we give a list (incomplete) of possible portability issues:

  • printf() specifiers for some types are not cleanly portable between 32-bit and 64-bit systems.
  • Remember that sizeof(void *) is not necessarily equal to sizeof(int). Use intptr_t if you need a pointer-sized integer.
  • You may need to be careful with structure alignments, particularly for structures being stored on disk.
  • The memory representation of data is machine specific and not defined by C++. The terms endian and endianness refer to how the bytes of a data word are ordered in memory. Big-endian systems store bytes from the most significant to the least significant, little-endian systems from the least significant to the most significant.
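For the printf() issue in the list above, one portable option is the set of format macros in <cinttypes>. The sketch below formats a 64-bit value; the function name formatEventId() is an assumption made for this example.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit value with the PRId64 macro from <cinttypes>, which expands
// to the correct printf length modifier on both 32-bit and 64-bit systems.
std::string formatEventId(std::int64_t eventId) {
  char buffer[32];
  std::snprintf(buffer, sizeof(buffer), "%" PRId64, eventId);
  return std::string(buffer);
}
```

Writing "%ld" or "%lld" directly would be correct on some platforms only, depending on which built-in type std::int64_t maps to.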
Use 0 for integers, nullptr for pointers, and '\0' for chars.

nullptr is a pointer literal of type std::nullptr_t. On the other hand, NULL is a macro equivalent to the integer 0. Using NULL can lead to unexpected problems. For example, imagine you have the following two function declarations and the call below them:

    void function(int number);
    void function(char *name);

    function( NULL );

Because NULL is 0, and 0 is an integer, the first version of function will be called instead of the pointer version. In C++11, nullptr is a new keyword that can (and should!) be used to represent NULL pointers.
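The difference can be made visible with a small sketch. The overload set whichOverload() is a hypothetical stand-in for the two declarations discussed above; it returns the name of the overload that was selected.

```cpp
#include <string>

// Hypothetical overload set mirroring the declarations discussed above.
std::string whichOverload(int) { return "int"; }
std::string whichOverload(char*) { return "pointer"; }
```

With these definitions, whichOverload(0) selects the int version, while whichOverload(nullptr) selects the pointer version, because std::nullptr_t converts to pointer types but not to int.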

Prefer sizeof(varname) to sizeof(type).

Use sizeof(varname) when you take the size of a particular variable. sizeof(varname) will update appropriately if someone changes the variable type either now or later. You may use sizeof(type) for code unrelated to any particular variable, such as code that manages an external or internal data format where a variable of an appropriate C++ type is not convenient.

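    Struct data;
    memset(&data, 0, sizeof(data));    // preferred: follows the type of data
    memset(&data, 0, sizeof(Struct));  // discouraged: goes stale if the type of data changes
    if (rawSize < sizeof(int)) {       // acceptable: not tied to a particular variable
      logMessage << "compressed record not big enough for count: " << rawSize;
    }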

If the compiler is able to deduce the type of a variable from its initialization, you don't need to provide the type. This is achieved by using the auto keyword.

Use auto to avoid type names that are just clutter. Continue to use manifest type declarations when it helps readability, and never use auto for anything but local variables.

Use auto against verbosity, not consistency. In cases where the rhs expression is an integer or floating point literal the use of auto is strongly discouraged.

In C++11, a variable whose type is given as auto will be given a type that matches that of the expression used to initialize it. You can use auto either to initialize a variable by copying, or to bind a reference.

    vector<string> names;
    ...
    auto name1 = names[0];         // Makes a copy of names[0].
    const auto& name2 = names[0];  // name2 is a reference to names[0].

C++ type names can sometimes be long and cumbersome, especially when they involve templates or namespaces. In a statement like

    map<string, string>::iterator itr = address_book.begin();

the type name is hard to read, and obscures the primary purpose of the statement. Changing it to

    auto itr = address_book.begin();

makes it more readable.

Without auto we are sometimes forced to write a type name twice in the same expression, adding no value for the reader, as in

    diagnostics::ErrorStatus* status = new diagnostics::ErrorStatus("xyz");

Using auto makes it easier to use intermediate variables when appropriate, by reducing the burden of writing their types explicitly.

    auto status = new diagnostics::ErrorStatus("xyz");

Sometimes code is clearer when types are manifest, especially when the initialization of a variable depends on functions/variables that were declared far away. In an expression like

    auto i = xValue.Lookup(key);

it may not be obvious what i's type is, if xValue was declared hundreds of lines earlier.

Programmers have to understand the difference between auto and const auto& or they'll get copies when they didn't mean to.
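The following sketch shows the difference in effect. The function name autoCopyVersusReference() and the strings are made up for this example; a non-const reference is used so that the write through the reference is observable.

```cpp
#include <string>
#include <vector>

// Returns both elements after the assignments below, showing that
// only the reference wrote into the container.
std::string autoCopyVersusReference() {
  std::vector<std::string> names{"alpha", "beta"};

  auto copy = names[0];   // deduced as std::string: an independent copy
  copy = "changed copy";  // names[0] is untouched

  auto& ref = names[1];   // deduced as std::string&: aliases names[1]
  ref = "changed";        // writes through to the container

  return names[0] + "," + names[1];
}
```

The call returns "alpha,changed": the copy left names[0] alone, while the reference modified names[1].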

The interaction between auto and C++11 brace-initialization can be confusing. The declarations

    auto xValue(3);  // Note: parentheses.
    auto yValue{3};  // Note: curly braces.

mean different things: xValue is an int, while yValue is an initializer_list. The same applies to other normally-invisible proxy types.

If an auto variable is used as part of an interface, e.g. as a constant in a header, then a programmer might change its type while only intending to change its value, leading to a more radical API change than intended.

There is a certain danger in using auto with numerical literals.

    auto i = 3;     // i is an int
    auto x = 10;    // x is an int
    auto y = 10.2;  // y is a double
    auto l = 2;     // l is an int

This might not be what was intended. To make the intent clear to the compiler one would need to be more explicit:

    auto i = 3U;     // i is an unsigned int
    auto x = 10.0F;  // x is a float (the F suffix requires a floating-point literal)
    auto y = 10.2L;  // y is a long double
    auto l = 2LL;    // l is a long long

Nothing is gained using the suffix notation. If the type has to be explicitly defined then it is much clearer to write:

    unsigned int i = 3;
    float x = 10;
    long double y = 10.2;
    long long l = 2;

auto is permitted for local variables only. Do not use auto for file-scope or namespace-scope variables, or for class members. Do not use auto for numeric literals. Never assign a braced initializer list to an auto-typed variable.

The non-member begin() and end() functions are a new addition to the standard library, promoting uniformity, consistency and enabling more generic programming. They work with all STL containers, but more than that they are overloadable, so they can be extended to work with any type. Overloads for C-like arrays are also provided.

The use of non-member begin() and end() is encouraged.

In the STL all containers have non-static member functions begin() and end() that return an iterator to the beginning and the end of the container. Therefore iterating over a container could look like this:

    std::vector<int> v;
    for (auto it = v.begin(); it != v.end(); ++it)
      std::cout << *it << std::endl;

The problem here is that not all user-defined containers have begin() and end(), which makes them impossible to use with the STL algorithms or any other user-defined template function that requires iterators. That is even more problematic when using C arrays. The non-member begin() and end() functions are extensible, in the sense that they can be overloaded for any type (including C arrays).

    std::vector<int> v;
    for (auto it = begin(v); it != end(v); ++it)
      std::cout << *it << std::endl;

To adapt a custom container, all one must do is provide an iterator that supports *, prefix increment (++itr) and !=, and overload begin() and end() to return it. The use of the non-member version of begin() and end() allows one to write very generic functions and is hence encouraged.

static_assert() performs an assertion check at compile-time.

Type traits and static_assert are mostly of interest to developers of template classes. Since the use of templates in STAR is minimal, these new C++11 features will be rarely used, if at all. There’s no argument against using this feature if needed.

static_assert() performs an assertion check at compile-time. If the assertion is true, nothing happens. If the assertion is false, the compiler displays the specified error message.

    template <typename T, size_t Size>
    class MyVector {
      static_assert(Size > 3, "Size is too small.");
      T points[Size];
    };

    int main() {
      MyVector<int, 16> a1;
      MyVector<double, 2> a2;  // will produce compile error
      return 0;
    }

The form of the output depends on the platform. On the Mac using LLVM it prints:

    sassert.cpp:12:5: error: static_assert failed "Size is too small."
    static_assert(Size > 3, "Size is too small.");

Note that since static_assert is evaluated at compile time, it cannot be used to check assumptions that depend on run-time values.

static_assert() becomes more useful when used together with type traits. These are a series of classes that provide information about types at compile time. They are available in the <type_traits> header. There are several categories of classes in this header: helper classes, for creating compile-time constants, type traits classes, to get type information at compile time, and type transformation classes, for getting new types by applying transformation on existing types.

    #include <type_traits>

    template <typename T1, typename T2>
    auto add(T1 t1, T2 t2) -> decltype(t1 + t2) {
      static_assert(std::is_integral<T1>::value, "Type T1 must be integral");
      static_assert(std::is_integral<T2>::value, "Type T2 must be integral");
      return t1 + t2;
    }

There’s no argument against using this feature if needed. Note that static_assert() does not violate STAR’s messaging scheme since the assert error messages are printed at compile time, not at run time.
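Outside of templates, static_assert can also document assumptions about data layout, which ties in with the portability concerns mentioned earlier. The sketch below uses a hypothetical record header; the struct name and its fields are invented for this example.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical on-disk record header, invented for this example.
struct RecordHeader {
  std::uint32_t runNumber;
  std::uint32_t eventNumber;
};

// Checked once at compile time; there is no run-time cost.
static_assert(std::is_standard_layout<RecordHeader>::value,
              "RecordHeader must have standard layout");
static_assert(sizeof(RecordHeader) == 8,
              "RecordHeader is expected to occupy exactly 8 bytes");
```

If a later change adds a member or alters a type, the build fails with the given message instead of silently writing records of a different size.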
Rvalue references are used to achieve move semantics and perfect forwarding. See Move Constructor and Assignment Operator for the guideline on the move semantics.

One can find multiple rules of thumb for recognizing the lvalueness or rvalueness of an object. For our purposes, the following are sufficient:

If you can take its address using the built-in address-of operator (&) then it is an lvalue, otherwise, it is an rvalue.

Another rule of thumb that is useful but not strictly correct is the if-it-has-a-name rule: if it has a name then it is an lvalue, otherwise, it is an rvalue.

An rvalue reference is designated with && as opposed to & for an lvalue reference. Here is an example of function overloading to handle lvalue and rvalue arguments separately:

    #include <iostream>
    using namespace std;

    int foo() { return 5; }

    // __PRETTY_FUNCTION__ is a GCC/Clang extension that prints the function signature.
    void print(int const& x) {
      cout << __PRETTY_FUNCTION__ << endl;
      cout << x << endl;
    }

    void print(int&& x) {
      cout << __PRETTY_FUNCTION__ << endl;
      cout << x << endl;
    }

    int main() {
      std::cout << "Hello, World!\n";
      int x = 6;
      print(x);      // call print on an lvalue
      print(foo());  // call print on an rvalue
    }

Now this looks cool. However, the real power of the ability to distinguish between rvalues and lvalues in C++11 is to enable two things: 1) move semantics 2) perfect forwarding.

Move semantics make it possible to get rid of expensive copies from temporary (rvalue) objects when a move is intended. Now that we can detect temporary objects using rvalue references, we can overload the copy constructor and assignment operator to do the less expensive move from the temporary object by simply pointing the current object's pointers to the temporary object's resources and nullifying the latter's pointers. To add to the multitude of examples of move semantics implementations, here is one:

    #include <cstdlib>
    #include <iostream>
    using namespace std;

    class dataHandler
    {
    private:
      int mNumberOfElements;
      int* mData;

    public:
      dataHandler(int n = 10) : mNumberOfElements(n), mData(new int[mNumberOfElements])
      { cout << __PRETTY_FUNCTION__ << endl; }

      // copy constructor
      dataHandler(const dataHandler& x)
        : mNumberOfElements(x.mNumberOfElements), mData(new int[x.mNumberOfElements])
      {
        cout << __PRETTY_FUNCTION__ << endl;
        for (int i = 0; i < this->mNumberOfElements; i++) {
          this->mData[i] = x.mData[i];
        }
      }

      // copy assignment operator
      dataHandler& operator=(const dataHandler& rhs)
      {
        cout << __PRETTY_FUNCTION__ << endl;
        if (this == &rhs) return *this;
        if (this->mNumberOfElements != rhs.mNumberOfElements) {
          // this should never happen and should throw an exception.
          // we will just terminate the program instead of throwing an exception in this demo example.
          exit(1);
        }
        for (int i = 0; i < this->mNumberOfElements; i++) {
          this->mData[i] = rhs.mData[i];
        }
        return *this;
      }

      // move constructor
      dataHandler(dataHandler&& x)
      {
        cout << __PRETTY_FUNCTION__ << endl;
        this->mData = x.mData;
        this->mNumberOfElements = x.mNumberOfElements;
        x.mData = nullptr;
      }

      // move assignment operator
      dataHandler& operator=(dataHandler&& rhs)
      {
        cout << __PRETTY_FUNCTION__ << endl;
        if (this == &rhs) return *this;
        delete[] this->mData;  // release our own array before taking over rhs's
        this->mData = rhs.mData;
        this->mNumberOfElements = rhs.mNumberOfElements;
        rhs.mData = nullptr;
        return *this;
      }

      // mData was allocated with new[], so it must be released with delete[]
      ~dataHandler() { cout << __PRETTY_FUNCTION__ << endl; delete[] mData; }

      inline int data() const { return *mData; }
    };

    dataHandler get_a_dataHandler()
    {
      dataHandler t(6);
      return t;
    }

    int main()
    {
      cout << "Testing move semantics..." << endl;
      dataHandler t0;
      t0 = get_a_dataHandler();             // should call move assignment operator.
      dataHandler t1(get_a_dataHandler());  // should call move constructor (unless optimized away by the compiler (lookup RVO and copy elision)).
      return 0;
    }

An rvalue reference itself is not necessarily an rvalue. For example, inside the move constructor the variable x is an rvalue reference, but it is an lvalue (you can take its address, it has a name).
This case is important when one wants to construct base classes in a move function of the derived class. The base class move function should be invoked, and this can be achieved by statically casting the variable to an rvalue reference, i.e. hiding its name, which is what std::move does.

    Derived(Derived&& rhs) : Base(rhs)  // wrong: rhs is an lvalue
    {
      // Derived-specific stuff
    }

    Derived(Derived&& rhs) : Base(std::move(rhs))  // good, calls Base(Base&& rhs)
    {
      // Derived-specific stuff
    }

std::move hides the name of its argument (statically casting it to an rvalue reference).

There is one subtlety with rvalues and deduced types. The rvalueness/lvalueness of a deduced type follows that of the initializer. For example, in a function template:

    template<typename T>
    void print(T&& x) { cout << x << endl; }

So calling print on an lvalue makes x an lvalue reference, and calling it on an rvalue makes x an rvalue reference. Now when does this matter? It doesn't matter inside print itself since x is an lvalue there anyway. It matters when you want to pass x to another function: do you pass it as an lvalue (just x) or hide its name using std::move? The answer obviously depends on the nature of x, and you want to preserve that. This can be achieved using std::forward. std::forward passes rvalue references as rvalues and lvalue references as lvalues.
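The effect of std::forward can be observed with a small sketch. The functions category(), withoutForward() and withForward() are hypothetical names introduced for this illustration.

```cpp
#include <string>
#include <utility>

// Two overloads that report which value category they received.
std::string category(const std::string&) { return "lvalue"; }
std::string category(std::string&&) { return "rvalue"; }

// Always passes x as an lvalue, because a named variable is an lvalue.
template <typename T>
std::string withoutForward(T&& x) { return category(x); }

// Preserves the value category of the original argument.
template <typename T>
std::string withForward(T&& x) { return category(std::forward<T>(x)); }
```

Called with a temporary std::string, withoutForward() reports "lvalue" while withForward() reports "rvalue"; called with a named variable, both report "lvalue".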

Strive to define your move operations so that they cannot throw exceptions, and declare them so using the noexcept specifier.

Use std::move to pass arguments to base classes in the move constructor and move assignment operator.

Use std::forward to forward arguments to class constructors in templated functions or classes.

Remember that an rvalue reference is not necessarily an rvalue itself.

Take advantage of the compiler's Return Value Optimization (RVO) and copy elision; don't be afraid to return by value.
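The first two points can be combined in one sketch: a noexcept move constructor in a derived class that passes its argument to the base class with std::move. The classes Base and Derived below are hypothetical and kept deliberately small.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

class Base {
public:
  explicit Base(std::string name) : mName(std::move(name)) {}
  Base(Base&& rhs) noexcept : mName(std::move(rhs.mName)) {}
  const std::string& name() const { return mName; }

private:
  std::string mName;
};

class Derived : public Base {
public:
  Derived(std::string name, std::vector<int> values)
    : Base(std::move(name)), mValues(std::move(values)) {}

  // std::move(rhs) is required: rhs has a name and is therefore an lvalue.
  // The Base move constructor only touches the Base part of rhs, so
  // rhs.mValues is still intact when it is moved afterwards.
  Derived(Derived&& rhs) noexcept
    : Base(std::move(rhs)), mValues(std::move(rhs.mValues)) {}

  std::size_t size() const { return mValues.size(); }

private:
  std::vector<int> mValues;
};

// Verifies at compile time that the noexcept promise is visible to the library.
static_assert(std::is_nothrow_move_constructible<Derived>::value,
              "Derived is expected to be nothrow move constructible");
```

Declaring the move constructor noexcept matters in practice because standard containers such as std::vector can then move elements instead of copying them when they reallocate.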

The coding conventions described above have to be followed. However, like all good rules, these sometimes have exceptions.

It is permissible to deviate from the rules when dealing with code that does not conform to these guidelines.

To modify code that was written to specifications other than those presented by this guide, it may be necessary to deviate from these rules in order to stay consistent with the local conventions in that code. In case of doubt the original author or the person currently responsible for the code should be consulted. Remember that consistency also includes local consistency.

Use common sense and BE CONSISTENT.

The point about having style guidelines is to have a common vocabulary of coding so people can concentrate on what the programmer is saying, rather than on how he/she is saying it.

OK, enough writing about writing code; the code itself is much more interesting. Have fun!

[1] Herb Sutter on software, hardware, and concurrency blog [http://herbsutter.com/2013/05/09/gotw-1-solution]