SoftBench SDK: CodeAdvisor and Static Programmer's Guide HP Part No.
Notices
The information contained in this document is subject to change without notice. Hewlett-Packard makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Hewlett-Packard shall not be liable for errors contained herein or direct, indirect, special, incidental or consequential damages in connection with the furnishing, performance, or use of this material. Warranty.
Copyright © 1980, 1984, 1986 Novell, Inc. Copyright © 1979, 1980, 1983, 1985-1993 The Regents of the University of California. This software and documentation is based in part on the Fourth Berkeley Software Distribution under license from the Regents of the University of California. Copyright © 1994 X/Open Company Limited. UNIX® is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Limited. Copyright © 1990 Motorola, Inc. All Rights Reserved.
Printing History New editions of this manual incorporate all material updated since the previous edition. The manual printing date and part number indicate its current edition. The printing date changes when a new edition is printed. (Minor corrections and updates incorporated at reprint do not cause this date to change.) The manual part number changes when extensive technical changes are incorporated.
Preface
This manual describes how to write new rules for the SoftBench CodeAdvisor product. It also documents the Static Database Application Programmer's Interface (API) for programmers who need to access the API for other purposes.
Typeface Conventions
Convention        Description
italic font       Information you supply, either in syntax examples or in text descriptions. For example, if told to type: filename, you supply an actual file name like sample. Italics are also used for emphasis, and for Titles of Books.
typewriter font   Computer commands or other information that must be typed exactly as shown. For example, if told to type: sample, you type exactly the word in typewriter font, sample.
Contents
1. User Defined CodeAdvisor Rules
2. Modifying Table-Driven Rules
   Modification Process
   Table Formats
   Specifying Scope of Changes
   The NameConventions Rule Family
   Rule Format
   Examples of Use
   Extending NameConventions
   The ProhibIdent Rule Family
   Rule Format
   Examples of Use
   Extending ProhibIdent
   The ProhibDefines Rule Family
   Rule Format
   Examples of Use
4. Understanding the Static Database
   Database Objects
   Capabilities of the Database
   Learning the Database API
   Database Objects
   Incomplete Objects
   Database Types
   Type Qualifiers
   Accessing the Database
   Opening and Closing the Database
   Delimiting Transactions
   Iterators
   Attribute Iterators
   Object Interfaces
   Block Object
   The TypedSymbol Base Class
   Variable Object
   Using the Database API
   The Example Rule
   Understanding the Example Rule
   The shadow Function
   kindMask and langMask
   The check Function
   Final Definitions
   Example Files
   The UserRulesLocalHides Rule
A. Detailed Database Type Descriptions
   Object Kind
   Attributes
   Scalar Types
   Language Types
   References
   Error Codes
B. Iterators
   Standard Iterators
   Attribute Iterators
Index
Figures
4-1. Object Hierarchy
4-2. RefList Organization
1 User Defined CodeAdvisor Rules
SoftBench CodeAdvisor offers you a powerful tool for improving the reliability and maintainability of your C and C++ code. Many predefined rules come with the SoftBench CodeAdvisor product, allowing you to benefit from the product "right out of the box." You can also extend the SoftBench CodeAdvisor functionality to meet your local needs. The simplest way to customize the SoftBench CodeAdvisor product is to modify the ASCII files that are read by existing table-driven rules.
2 Modifying Table-Driven Rules
Several rules shipped with the SoftBench CodeAdvisor product read their definitions from ASCII files. By modifying the files, you can modify the rules' behavior. You can add new rule cases, delete current rules, or change a rule's definition. You need not do any programming; you simply edit a text file.
Modification Process
To modify table-driven rules, you edit or replace the table in the ASCII file to meet your needs.
Specifying Scope of Changes
Your changes and additions can affect different scopes, depending on where you make the change or addition. SoftBench CodeAdvisor checks several locations for rule table information:
/opt/softbench/config/ruletables/$LANG/rule-family
   Standard pre-configured rule tables, as defined by Hewlett-Packard. Ordinarily you should not change these files, but you may copy them to create your own rule file.
/etc/opt/softbench/config/ruletables/$LANG/rule-family
   Local changes and customizations.
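The two locations named above can be turned into concrete paths for a given rule family. The following is a minimal sketch, not part of the CodeAdvisor API: the helper names ruleTableCandidates and currentLang are hypothetical, the fallback to "C" when $LANG is unset is an assumption, and the order of the returned paths is not meant to imply which location takes precedence.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: value of $LANG, or "C" if it is unset or empty
// (the fallback is an assumption, not documented behavior).
std::string currentLang()
{
    const char *lang = std::getenv("LANG");
    return (lang && *lang) ? std::string(lang) : std::string("C");
}

// Hypothetical helper: builds the two rule-table paths described in the
// text for one rule family. The order does not imply search precedence.
std::vector<std::string> ruleTableCandidates(const std::string &family,
                                             const std::string &lang)
{
    std::vector<std::string> paths;
    paths.push_back("/opt/softbench/config/ruletables/" + lang + "/" + family);
    paths.push_back("/etc/opt/softbench/config/ruletables/" + lang + "/" + family);
    return paths;
}
```

For example, ruleTableCandidates("ProhibIdent", currentLang()) yields the standard and the local table path for the ProhibIdent family in the current language.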
The NameConventions Rule Family
NameConventions allows you to specify almost any kind of required or prohibited condition in an identifier name. For example, you can create a rule that requires all class names to be capitalized, or that flags the use of certain prohibited characters.
Rule Format
Each line contains the following space-delimited fields:
Rule ID               Name of the rule. The same Rule ID cannot appear on multiple lines.
Help Volume           Name of the help volume that contains online help for the rule.
Required Attributes   Specifies all attributes that must be set on the identifier. (See "Attributes" in Appendix A for a list of all attributes understood by SoftBench CodeAdvisor.
Examples of Use
The rules shipped with SoftBench CodeAdvisor use NameConventions to detect problems such as illegal identifiers (for example, global IDs beginning with an underscore) and violations of stylistic conventions (for example, non-capitalized class names). You can create additional rules like these to support your local conventions.
Extending NameConventions
The source for the NameConventions rule family can be found in /opt/softbench/examples/CodeAdvisor/Rules/ruleNameConventions.C.
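The two kinds of name check mentioned above can be illustrated with the POSIX regular expression functions. This is a minimal sketch that is independent of the rule table format and of the shipped NameConventions source; the function names nameMatches, isCapitalized, and startsWithUnderscore, and the two patterns, are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <regex.h>
#include <string>

// Returns true if 'name' matches the POSIX extended regular expression
// 'pattern'. Returns false on no match or if the pattern does not compile.
bool nameMatches(const std::string &name, const char *pattern)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    bool matched = (regexec(&re, name.c_str(), 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}

// Stylistic convention: the name begins with an upper-case letter.
bool isCapitalized(const std::string &name)
{
    return nameMatches(name, "^[A-Z]");
}

// Prohibited form: the name begins with an underscore.
bool startsWithUnderscore(const std::string &name)
{
    return nameMatches(name, "^_");
}
```

A rule built on checks like these would report a violation when isCapitalized() is false for a class name, or when startsWithUnderscore() is true for a global identifier.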
The ProhibIdent Rule Family
ProhibIdent checks for prohibited identifier names. This includes calls to unsafe functions, uses of obsolete functions and variables, and other similar situations.
Rule Format
Each line contains the following space-delimited fields:
Rule ID       Name of the rule. Usually the name of the prohibited identifier is also used as the name of the rule.
Help Volume   Name of the help volume that contains online help for the rule.
Examples of Use
The rules shipped with SoftBench CodeAdvisor use ProhibIdent to detect the use of unsafe, obsolete, and non-portable identifiers. You may add your own rules to prohibit the use of other identifiers.
Extending ProhibIdent
The source for the ProhibIdent rule family can be found in /opt/softbench/examples/CodeAdvisor/Rules/ruleProhibIdent.C. You can use this source to extend ProhibIdent for your local needs. See Chapter 3 and later chapters for information on writing rules.
The ProhibDefines Rule Family
ProhibDefines checks for prohibited identifier names, but it uses a more specialized algorithm than the ProhibIdent rule. ProhibDefines looks for identifiers that are not allowed in #define macros or in -D definitions on the compiler command line.
Rule Format
Each line contains the following space-delimited fields:
Rule ID   Name of the rule. Usually the name of the prohibited identifier is also used as the name of the rule.
The DtorMatchCtor Rule Family
DtorMatchCtor verifies that resources allocated in a class's constructor are deallocated in the destructor. Furthermore, the deallocator must match the allocator. For example, you cannot allocate memory using new and deallocate it using free.
Rule Format
Each line contains the following fields. Fields in this rule table are separated by vertical bars (|), since some of the field values (such as "operator new") have embedded spaces.
Rule ID   Name of the rule.
For example, the following line can be found in the DtorMatchCtor rule file:
    DtorMCtorXDeviceList|CommonCxx_DtorMatchCtor|
    XHPFreeDeviceList,XFreeDeviceMotionEvents|
    XHPListInputDevices,XGetDeviceMotionEvents|
    %d call(s) to (one of) %s in %s (file %s, line %d) not deallocated
(This example has been broken into several lines for readability. It must appear on one line in the rule table.
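To make the vertical-bar format concrete, the following sketch splits one table line into its fields. It is not the rule engine's own parser; splitFields is a hypothetical helper written for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits 'text' at every occurrence of 'sep'.
// Empty fields are kept, so "a||b" yields three fields.
std::vector<std::string> splitFields(const std::string &text, char sep)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type end = text.find(sep, start);
        if (end == std::string::npos) {
            fields.push_back(text.substr(start));   // last (or only) field
            break;
        }
        fields.push_back(text.substr(start, end - start));
        start = end + 1;
    }
    return fields;
}
```

Applied to the example line with '|' as the separator, this yields five fields. The third and fourth fields of that example hold comma-separated function names and can be split again with ','. The final message field should not be split on commas, because the message text itself contains one.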
3 Understanding the Programming Model
The SoftBench CodeAdvisor architecture implements rules in shared libraries. When the rule engine initializes itself, it reads in all the rule libraries it can find and invokes these rules as appropriate. You can add your own rules by creating libraries for the rule engine to read. Your libraries will contain C++ code that defines classes to implement the rules.
The Rule Engine
SoftBench CodeAdvisor loads in all the rule libraries it finds in /opt/softbench/lib/rulelibs, /etc/opt/softbench/lib/rulelibs, and any directories specified by the -l option to softcheck. For each Rule or RuleWithTable in a rule library, exactly one instance of the rule must be created. The C++ code that defines the rule instance should be of the form:
    static NewRuleClass instance;
All global data members are initialized when the shared library is loaded.
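One way to see why a single static instance is enough is that its constructor runs as part of static initialization, so the object can announce itself without any further call. The sketch below illustrates that general C++ idiom with stand-in classes. DemoRule and registry() are hypothetical names, and how the real rule engine keeps track of rule instances is an assumption, not something this manual documents.

```cpp
#include <string>
#include <vector>

class DemoRule;

// Hypothetical stand-in for the engine's list of known rules. The
// function-local static is constructed on first use, so it is ready
// before any file-scope rule instance registers itself.
std::vector<DemoRule *> &registry()
{
    static std::vector<DemoRule *> rules;
    return rules;
}

// Hypothetical stand-in for a rule class.
class DemoRule {
public:
    explicit DemoRule(const std::string &name) : name_(name)
    {
        registry().push_back(this);   // constructor announces the instance
    }
    const std::string &name() const { return name_; }
private:
    std::string name_;
};

// Exactly one file-scope instance, mirroring "static NewRuleClass instance;".
static DemoRule instance("UserRulesDemo");
```

Creating a second instance of the same rule would register it twice in this sketch, which suggests one reason the manual asks for exactly one instance per rule.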
The Rule Base Class
Non-table-driven rules are written as a class derived from the Rule base class. Rule defines the interface functions required of all rules. A Rule can define a single rule, or it can define a "multi-rule" that can issue violations on any of several closely related rules. This can provide significant performance benefits, since you can iterate through interesting objects (such as the base classes of a class) only once and check for several conditions.
                   const char *help_volume, const char *rule_name = NULL);

    // When defining a multi-rule, call report() to determine if
    // a particular sub-rule should be checked.
    DBboolean report(const char *name) const;
};
You should not access the other public members, the data members, or the friend functions of the Rule class. They are used by the rule engine. You must provide your own versions of all the pure virtual functions: kindMask(), langMask(), check(), and errorMess().
KIND_TAG to receive all objects of these types. See the next section ("Example Rule") for an example. As a special case, a value of 0 indicates the rule should be called only once for all symbols. You are then responsible for handling any iteration required by your rule. See /opt/softbench/examples/CodeAdvisor/Rules/ruleMixedIO.C for an example.
langMask()   Returns a bitmask that tells which languages the rule applies to.
Both forms have three additional parameters: an err parameter, which is a string describing the specific violation; help_volume, the name of the help volume containing the on-line help for this rule; and an optional rule_name parameter. help_volume can specify a help node using the format helpvolume_helpnode. rule_name is required only when issuing a violation from a multi-rule, and indicates which rule has fired. You do not define your own violation(), but merely call it from check().
Example Rule
The following code defines a very simple rule that enforces a common coding convention: every class name should be capitalized. (You could use the Class Editor in Static Analyzer to find and fix every occurrence of noncapitalized classes with one simple operation, by selecting the class and choosing "Edit: Modify...".) This rule uses several data structures and functions from the Static API, which you don't need to understand yet.
    // only on Symbol objects. Class is not a Symbol; Tag is.
    int UserRulesCapClass::kindMask() const
    {
        return 1 << KIND_TAG;
    }

    // This rule applies only to C++ code.
    Language UserRulesCapClass::langMask() const
    {
        return LANGUAGE_CPP;
    }

    // Find all non-capitalized class names
    void UserRulesCapClass::check(SymbolTable *, const Symbol &sym)
    {
        Tag tag;
        Class cl;

        // Don't want to check instances; only the class name must be capitalized.
        // This code is a common idiom to reject instances.
        // The !tag.
        if (regexec(&capitalized_compiled_reg, tag.Name(), 0, NULL, 0) != 0) {
            // doesn't match regular capitalized expression
            char buf[1024];
            sprintf(buf, "Class or class template name '%s' not capitalized",
                    tag.Name());
            violation(tag, buf, "UserRules");
        }
    }

    // Generic one-line description of the rule
    const char *UserRulesCapClass::errorMess() const
    {
        return("Class name not capitalized.
The RuleWithTable Base Class
Table-driven rules are written as a class derived from the RuleWithTable base class. RuleWithTable inherits most of its interface from Rule, and adds components to work with rule tables. Each RuleWithTable rule defines the family of rules included in its rule table, so every RuleWithTable rule is effectively a multi-rule. See the name() and names() function descriptions for specific information on how RuleWithTable uses them.
    // User-Defined table-based rules should define check_table_entry
    // instead of check().
    virtual void check_table_entry(const SymbolTable &symtab,
                                   const Symbol &sym,
                                   RuleTableRecord &entry) = 0;
    virtual void check(SymbolTable *symtab, const Symbol &sym);
};
You should not access the private members of the RuleWithTable class. They are used by the rule engine. The RuleWithTable interface is a combination of Rule and RuleWithTable.
Your rule constructor initializer list should invoke the RuleWithTable() constructor, passing in the appropriate arguments for your table format.
name(), names()   The rule IDs in a table-driven rule are defined in the rule table, not in the RuleWithTable definition. Define a name() function that returns the name of your rule family. This must match the basename of your rule table. The rule engine searches the locations described in "Specifying Scope of Changes" in Chapter 2 to find a rule table with that name.
NameConventions family are handled by the NameConventions rule code.
Example Table-Driven Rule
The following code defines UserRulesProhibDefines, a simple rule that is identical (except for its name) to the ProhibDefines rule that is shipped with SoftBench CodeAdvisor. See /opt/softbench/config/ruletables/$LANG/ProhibDefines for the table format used by this rule. (Since this rule is named UserRulesProhibDefines, it would normally search for a rule table with that same name. However, for demonstration purposes, this rule uses the ProhibDefines table.
    File file;
    UserRulesProhibDefinesClientDataRecord *cache;

    if (entry.client_data)
        cache = (UserRulesProhibDefinesClientDataRecord *) entry.client_data;
    else {
        cache = new UserRulesProhibDefinesClientDataRecord;
        entry.client_data = cache;
        const char *regexp = "([[:space:]]|^)-D";
        char *pattern = new char[strlen(regexp) + strlen(entry.data[0]) + 2];
        if (pattern) {
            sprintf(pattern, "%s%s", regexp, entry.
                violation(defn.file, defn.position.line, buf,
                          entry.help_location, entry.name);
            }
        }
    }

    const char *UserRulesProhibDefines::errorMess() const
    {
        return("Identifier prohibited for specified reason.");
    }

    // Rule name -- also used as name of rule table. By default, this rule
    // uses the ProhibDefines rule shipped with CodeAdvisor.
4 Understanding the Static Database
Rules use the Static database as their view on the program being checked. The Static database is represented as a set of persistent objects. That is, the objects are stored in the file system of your computer so they are remembered from one session to another. Each time you build your program and regenerate the Static database, a new set of objects is created in the database for future use.
Capabilities of the Database
Since the Static database contains attributes and associations for each object, it is best matched to certain kinds of rule algorithms. For example, the database is an ideal match for a rule that examines the member functions defined in a class. The class object lists its member functions on its association list, and each member function object gives full details on its type and declaration information.
Learning the Database API
You access the Static database through an Application Programmer Interface (API). The API gives you an object-oriented view onto the contents of the database, through which you can access information on your program files.
Database Objects
The database is implemented as a collection of objects. The interface to the database consists of functions to open the database and examine those objects.
parameters, and functions. All typed objects inherit from TypedSymbol.
Block                             Represents blocks within functions.
Class                             Represents C++ classes, and structs and unions in C++ code.
ClassTemplate, FunctionTemplate   Represent class templates and function templates (both global template functions and member function templates).
Enum, EnumMember                  Represent enumerations and enumeration constants.
RefList                           Contains all references to a specific named object in the database.
TemplateArgument   Represents class template and function template arguments.
Typedef            Represents named user-defined types.
Variable           Represents program variables.
See Figure 4-1 for a graphical representation of the database objects. Notice that Class, Enum, and Struct do not inherit from Symbol. The Tag object inherits from Symbol, holds the name information, and refers to the aggregate type.
Figure 4-1.
Incomplete Objects
Some object types can be "complete" or "incomplete." An incomplete object is one for which complete information is not available; in particular, no definition is available for the symbol. This is most often encountered with externally defined objects. For example, a program might include the declaration "class Myclass;", but no definition of the class. The database knows Myclass is a class, but knows no more about it. Myclass will be incomplete in the database.
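The same situation exists at the C++ language level, where a declared but undefined class is called an incomplete type. The short sketch below is an analogy only and does not use the database API; the function isNull is a hypothetical name. It shows what can and cannot be done when only the declaration is visible.

```cpp
// Declaration only: the compiler knows Myclass is a class, and nothing more.
class Myclass;

// Pointers and references to an incomplete type are allowed, because
// they do not depend on the size or the members of the class.
bool isNull(const Myclass *p)
{
    return p == 0;
}

// By contrast, sizeof(Myclass), creating a Myclass object, or accessing a
// member would not compile here, because no definition is visible.
```

In the same way, an incomplete object in the database can be identified by name and kind, while queries that need the definition have nothing to report.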
Language         The language (such as LANGUAGE_C or LANGUAGE_CPP) associated with a file or symbol.
Usage            The type of reference to a symbol, such as REF_DEFINITION, REF_MODIFICATION, or REF_CALL.
Reference        A reference to a symbol, including the Usage type and the line and column where the reference occurs.
SourcePosition   A Reference within a specific file.
These types are defined in the header file DB_Common.h. See Appendix A for a complete listing.
Accessing the Database
The basic interface to the database is quite simple. You open the database, specifying what language(s) you are interested in, and the open call returns the database's global symbol table. You then bracket each request to the database in a "transaction," so that no other process can change the database while you are reading it. Remember to close the database when you are finished. Note: rule writers do not need to open or close the database or manage transactions.
It is possible to open and manipulate multiple databases at once. This is useful if there are multiple databases representing your program. For example, if you compiled a library separately from the main program, in another directory, the library would have its own Static database. CloseDatabase simply closes the database and clears the globalsymboltable pointer.
Iterators Since an object may have an arbitrary number of items associated with it (for example, a variable may be accessed in arbitrarily many locations), the database provides a mechanism to successively select and operate on each item in a list. The Iterators mechanism manages the iteration through a collection of items. Using Iterators, it is easy to iterate through all objects in a list, without needing to understand the underlying iteration mechanism.
Attribute Iterators
The Static API also defines a subset of iterators, called Attribute Iterators, that define a set of attributes along with each object in the iteration list. Attributes, as defined in the Static database, specify characteristics of a symbol such as Global, Static, Public, Private, and Virtual. See "Database Types". Attribute Iterators are identical to normal iterators, with the addition of two member functions (GetIteratorAttribute() and SetIteratorAttribute()) to access the attributes.
Object Interfaces
The class interfaces for the database objects define the bulk of the Static API. Each object defines the methods (functions) that are used to access the object. In addition, many objects also inherit from other, more generic objects (usually Symbol), which in turn define additional function interfaces. The following sections describe the class definition interface to each object type.
Block Object Block represents the entire code block within a function. Block inherits from PerBase, and has no type or name properties.
Class Object Class objects represent C++ classes. Structs and unions in C++ code are also represented by Class, since C++ makes little distinction between classes and the other aggregate types. (You can determine if the Class was declared as a struct or union by testing the Attrib() value using the attribute-testing functions WAS_STRUCT() and WAS_UNION().) Structs parsed only by a C compiler are represented as Structs. Each Class has a corresponding Tag.
        ATTRIBUTE_ITERATOR(Tag) BaseClasses() const;
        ATTRIBUTE_ITERATOR(Tag) DerivedClasses() const;
        ITERATOR(Tag) NestedClasses() const;
        ITERATOR(Tag) NestedEnums() const;
        ITERATOR(Typedef) NestedTypedefs() const;
        ITERATOR(Symbol) Friends() const;

        // Class Template this class is an instance of.
        DBboolean ExpandedFrom(Tag &tag) const;

        friend class Tag;
    };
Method Definitions
ClassTag()   Returns the Tag object associated with the Class.
Attrib()     Returns the attributes (such as ATTR_GLOBAL) of the class.
other classes that inherit directly from X. See below for an example. Note that BaseClasses() is guaranteed to return all base classes of a class, but DerivedClasses() cannot be guaranteed to return all derived classes. It is possible that code not included in the database derives from this class. GetIteratorAttribute() returns the attributes of the inheritance relationship: virtual, public, private, or protected.
Example
This function prints all function members in the class referred to by a specified Class, including all inherited function members.
    void function_members(Class cls)
    {
        Tag tag;
        cls.ClassTag(tag);
        printf("Function Members defined in class %s:\n", tag.Name());
        ITERATOR(FunctionMember) fmi = cls.FunctionMembers();
        ITERATE_BEGIN(fmi) {
            printf(" %s:\n", fmi.
ClassTemplate Object
ClassTemplate objects represent C++ parametric classes. Each ClassTemplate has a corresponding Tag. Like Class objects, ClassTemplate objects contain the data members and member functions defined by that class template. For an incomplete object, only ClassTag() and Attrib() return meaningful results. All other methods return FALSE or null values. ClassTemplate inherits from PerBase, and has no type or name properties. The corresponding Tag contains the name information.
DataMember Object
DataMember objects represent the data members of structures, classes, and class templates. DataMember inherits type and name information from TypedSymbol.
Enum Object
Enum objects represent enumerated types. Each Enum has a corresponding Tag. Enum objects contain EnumMember objects representing each value defined by the enum. For an incomplete enum, only EnumTag() and Attributes() return meaningful results. All other methods return FALSE or null values. Enum inherits from PerBase, and has no type or name properties. The corresponding Tag contains the name information.
EnumMember Object
EnumMember objects represent the constant values of an Enum. EnumMember inherits name information from Symbol.
    class EnumMember : public Symbol {
    public:
        EnumMember();
        ~EnumMember();

        Enum MemberOf() const;
        int Value() const;
    };
Method Definitions
MemberOf()   Returns the enum of which this object is a member.
Value()      Returns the ordinal (numeric) value of this member.
File Object
File objects contain all the Symbols and RefLists defined within a file. File inherits name information from Symbol.
Includes(), IncludedBy()
   Return iterators over all files that this file includes, and all files that include this file.
Modules(), Macros(), Variables(), Functions(), Tags(), Typedefs(), FunctionTemplates()
   Return iterators for all types of symbols defined within the file.
EnclosingFunction()
   Returns the function that encloses the given line in the file. Notice that EnclosingFunction returns a Symbol, not a Function. The enclosing function may be a FunctionMember or a FunctionTemplate.
Function Object
Function represents complete and incomplete functions. An "incomplete" function is a function that is known only by its signature. It may be defined by an extern reference, or by a forward reference that is never completed. Many incomplete function references are created by #include files, since they declare a function without defining it. For incomplete functions, only the base Symbol methods are valid. All other methods return FALSE and/or null results.
ParameterTypeInfo() works on incomplete functions that have full function signatures. Note that K&R C code has no signatures, and thus ParameterTypeInfo() does not work on this code.
DefinitionSite()   This function shadows the DefinitionSite() method in Symbol. It is specialized to handle multiple functions of the same name, such as if your database includes multiple main() functions.
FunctionBlock()    Returns the block containing the function's code.
FunctionMember Object
FunctionMember objects represent function members of C++ classes. FunctionMember inherits from the Function class. FunctionMember inherits type and name information from TypedSymbol.
    class FunctionMember : public Function {
    public:
        FunctionMember();
        ~FunctionMember();

        Class MemberOf() const;
    };
Method Definitions
MemberOf()   Returns the class of which this function is a member.
FunctionTemplate Object FunctionTemplate objects represent C++ parametric functions. FunctionTemplate inherits from the Function class.
Label Object
Label represents the target of switch or goto commands. The RefLists() defined for a Label refer to the statements that branch to the Label. Label inherits name information from Symbol.
    class Label : public Symbol {
    public:
        Label();
        ~Label();
    };
    // Label container; Block, Module or File.
Macro Object
The Macro object represents C preprocessor macros (#define). It is not used for C++ inline functions. Macro inherits name information from Symbol.
    class Macro : public Symbol {
    public:
        Macro();
        ~Macro();
    };
Macro defines no interface methods of its own. All Symbol methods are available; in particular, EnclosingFile() and EnclosingBlock() can be used to find the definition scope for global and local macros, respectively.
Parameter Object
Parameter represents function parameters. Parameter inherits type and name information from TypedSymbol.
    class Parameter : public TypedSymbol {
    public:
        Parameter();
        ~Parameter();
    };
Method Definitions
Parameter defines no interface methods of its own. All TypedSymbol methods are available. Parameter objects are empty for incomplete functions. Use ParameterTypeInfo() for information on incomplete functions.
The PerBase Base Class
PerBase is the foundation class that defines the concepts of persistent database objects. In particular, PerBase defines object "handles" and conversion to higher-level objects. All database objects inherit directly or indirectly from PerBase. All PerBase methods are available to all database objects. You will not encounter PerBase objects in the database. It is used only as a parent class for constructing objects.
RefList Object
RefList represents an array of references. Each RefList lists all references to a symbol within one file. The Symbol object contains an iterator of RefLists, one for each file containing a reference to the Symbol. RefList inherits from PerBase, and has no type or name properties.
Notice that there are RefLists() iterators defined on Symbol and File objects. The two-dimensional organization of RefLists (below) allows you to access references by symbol (stepping through the accesses in each file) or by file (stepping through accesses to all the symbols defined in that file).
Figure 4-2. RefList Organization
In this illustration, the boxes containing "References" are RefLists. In this example, Symbol1 is a local symbol referenced only in File1.
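The two access directions can be pictured with a small stand-alone data structure. The following sketch does not use the Static API and says nothing about how the database stores RefLists internally. DemoRef, DemoRefIndex, and sampleIndex are hypothetical names, and apart from Symbol1 being referenced only in File1, the sample data is invented for illustration.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

struct DemoRef { int line; int column; };
// All references to one symbol within one file (the role a RefList plays).
typedef std::vector<DemoRef> DemoRefList;

class DemoRefIndex {
    typedef std::pair<std::string, std::string> Key;   // (symbol, file)
    typedef std::map<Key, DemoRefList> Map;
    Map lists_;
public:
    void add(const std::string &symbol, const std::string &file,
             int line, int column)
    {
        DemoRef ref = { line, column };
        lists_[Key(symbol, file)].push_back(ref);
    }
    // Access by symbol: every file holding a reference list for 'symbol'.
    std::vector<std::string> filesReferencing(const std::string &symbol) const
    {
        std::vector<std::string> files;
        for (Map::const_iterator it = lists_.begin(); it != lists_.end(); ++it)
            if (it->first.first == symbol)
                files.push_back(it->first.second);
        return files;
    }
    // Access by file: every symbol holding a reference list in 'file'.
    std::vector<std::string> symbolsReferencedIn(const std::string &file) const
    {
        std::vector<std::string> symbols;
        for (Map::const_iterator it = lists_.begin(); it != lists_.end(); ++it)
            if (it->first.second == file)
                symbols.push_back(it->first.first);
        return symbols;
    }
};

// Invented sample data: Symbol1 appears only in File1, Symbol2 in two files.
DemoRefIndex sampleIndex()
{
    DemoRefIndex index;
    index.add("Symbol1", "File1", 10, 5);
    index.add("Symbol2", "File1", 12, 9);
    index.add("Symbol2", "File2", 3, 1);
    return index;
}
```

Each (symbol, file) cell corresponds to one box of references in the figure: filesReferencing() walks along a symbol's row, and symbolsReferencedIn() walks down a file's column.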
Example
These code fragments illustrate the use of RefLists. Notice the use of the overloaded [] operator. This code is equivalent to choosing a "Symbol" in Figure 4-2 and following the arrows to the right:
    // Print location of all references for the variable "var".
    ITERATOR(RefList) rli = var.RefLists();
    ITERATE_BEGIN(rli) {
        printf("References in file %s:\n", rli.FileIn().Name());
        int i;
        for (i=0; i
Scalar Object
Scalar objects represent built-in intrinsic types, such as int or char. Notice that Scalar does not inherit from TypedSymbol, since a type has no TypeQualifier information. Instead, Scalar inherits name information from Symbol, and provides a ScalarType function to describe the type of the scalar.
    class Scalar : public Symbol {
    public:
        Scalar();
        ~Scalar();

        ScalarType Type() const;
    };
Method Definitions
Type()   Returns the type of the Scalar. ScalarType is defined in DB_Common.h.
Struct Object
Struct objects represent structures and unions in C code. Structures and unions are represented as Class objects in C++ code, since C++ makes no real distinction between structs, unions, and classes. Note: if a header file is included by both C and C++ files, any structs defined in the header file are promoted to Class objects even when they are used in C code. Each Struct has a corresponding Tag. Struct objects contain DataMember objects to represent the data fields in the struct.
FindDataMember()   Returns the DataMember in this Struct with the specified name.
DataMembers()      Returns an iterator over all data members in the struct.
The Symbol Base Class
Symbol is the base class from which all named objects (Macro, Variable, Parameter, Function, File, Scalar, Tag, Typedef, EnumMember, DataMember, and TemplateArgument) are derived, either directly or indirectly through TypedSymbol. You will not encounter Symbol objects in the database; the class is used only as a parent class for other objects.
        DBboolean SymbolToTemplateArgument(TemplateArgument &templatearg) const;
        DBboolean SymbolToFunctionTemplate(FunctionTemplate &functiontempl) const;
        DBboolean SymbolToModule(Module &module) const;
        DBboolean SymbolToFile(File &file) const;
    };
Method Definitions
Name()     Returns the name of the object.
Attrib()   Lists attributes of the symbol, such as ATTR_GLOBAL or ATTR_STATIC. See "Database Types".
EnclosingFile(), EnclosingBlock(), EnclosingClass()
could use SymbolToVariable to create a Variable object. For example, "sym.SymbolToVariable(var)" converts the Symbol sym into the Variable var. If the Symbol is not actually of (or derived from) the requested type, the function returns FALSE.
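The calling pattern for these conversion functions (pass an output object, then test the boolean result before using it) can be shown with toy classes. The sketch below is self-contained and does not use the real Static API: DemoSymbol, DemoVariable, DemoKind, and convertsToVariable are hypothetical stand-ins, and only the shape of the call mirrors SymbolToVariable.

```cpp
// Toy stand-ins; the real classes come from the Static API headers.
enum DemoKind { DEMO_VARIABLE, DEMO_FUNCTION };

struct DemoVariable {
    const char *name;
    DemoVariable() : name(0) {}
};

class DemoSymbol {
public:
    DemoSymbol(DemoKind kind, const char *name) : kind_(kind), name_(name) {}

    // Fills 'var' and returns true only if this symbol really is a
    // variable; otherwise leaves 'var' untouched and returns false.
    bool SymbolToVariable(DemoVariable &var) const
    {
        if (kind_ != DEMO_VARIABLE)
            return false;
        var.name = name_;
        return true;
    }
private:
    DemoKind kind_;
    const char *name_;
};

// The idiom: attempt the conversion and act only when it succeeds.
bool convertsToVariable(const DemoSymbol &sym)
{
    DemoVariable var;
    return sym.SymbolToVariable(var);
}
```

Testing the return value first keeps a rule from reading an output object that was never filled in.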
The SymbolTable Class
The SymbolTable class defines the global symbol table for a database. A database contains exactly one SymbolTable, which acts as the "root" of the database just as "/" acts as the "root" of a filesystem. The SymbolTable contains all Files and all globally-scoped objects in the database.
class SymbolTable {
public:
    SymbolTable();
    ~SymbolTable();
    PerHandle Handle() const;
    // Time stamp of database and transaction management.
ITERATOR(Symbol) GlobalSymbols() const; ATTRIBUTE_ITERATOR(Symbol) GlobalSymbols(const char *name, PerKind kind) const; DBboolean Find(const char *name, FunctionMember &funmember) const; DBboolean Find(const char *name, DataMember &datamember) const; ATTRIBUTE_ITERATOR(Symbol) SymbolsAtLocation( const char *name, const char *filename, long line, long column, DBboolean ignorecase, DBboolean useregexp, SymbolsAtLocationScoping& scoping, DBboolean allowFuzzyMatch = true) const; DBboolean EnclosingFunction(Symb
Transaction management is handled by the rule engine, so rule writers need not be concerned about it.
Contains() Tests whether a Symbol is found in the database. This can be useful if you have multiple databases open.
Macros(), GlobalVariables(), GlobalFunctions(), GlobalTags(), LocalTags(), GlobalTypedefs(), GlobalModules(), Files(), FunctionTemplates() Return iterators to scan through all objects of the specified type.
GlobalSymbols() Find() ActivateFiles()
specified name in any File, or line or column in a File. ignorecase specifies a case-insensitive search, and useregexp specifies that name is a regexp(5)-style regular expression. If useregexp is true, name can contain any normal non-extended regular expression. The RE can also use + (preceding RE must appear 1 or more times) and ? (preceding RE must appear 0 or 1 times). scoping specifies the type of "scoping" to use when searching for the symbols.
Tag Object
Tag objects represent all aggregate types, such as classes and enums. The two-part representation of aggregates (the Tag and the Enum, Struct, Class, or ClassTemplate) allows the database to handle self-referential objects. Each tag can be mapped onto its corresponding aggregate, and vice versa. The Tag inherits from Symbol, and therefore contains all information about the aggregate's name.
TemplateArgument Object TemplateArgument objects represent C++ parametric type arguments. They are used for class template and template function arguments. TemplateArgument inherits type and name information from TypedSymbol.
Typedef Object
Typedef objects represent named types. Typedef inherits type and name information from TypedSymbol.
class Typedef : public TypedSymbol {
public:
    Typedef();
    ~Typedef();
};
Method Definitions
Typedef defines no methods of its own, but inherits all typing and symbol information from TypedSymbol.
The TypedSymbol Base Class TypedSymbol is the base class through which all typed objects (Variable, Parameter, Function, Typedef, DataMember, and TemplateArgument) inherit their type and name information. TypedSymbol inherits its name information from Symbol. As with Symbol, you will not encounter TypedSymbol objects in the database. The class is used only as a parent class for other objects. The attributes that describe an object's type (Type and TypeQualifiers) are inherited from TypedSymbol.
Variable Object
Variable represents complete and incomplete program variables. For incomplete variables, only the base Symbol methods are valid. All other methods return FALSE and/or null results. Variable inherits type and name information from TypedSymbol.
class Variable : public TypedSymbol {
public:
    Variable();
    ~Variable();
    DBboolean Scope(Block &block) const;
};
Method Definitions
Scope() Returns the enclosing block within which the variable is defined, or FALSE if the variable is global.
Using the Database API
The following example is one of the actual rules delivered with the SoftBench CodeAdvisor product. This real-life example will help you to understand how the database API is used in rules.
The Example Rule
This rule, UserRulesLocalHides, detects local identifiers with the same name as a local or inherited data member or member function. You can read a description of the rule in the SoftBench CodeAdvisor online help for the UserRulesLocalHides rule.
After getting the name of the symbol, shadow() begins by iterating through all local functions in cl. (AllFunctions() returns all member functions in a class, and all function templates in a template.) Next it iterates through all local data members. The test is the same for both types of symbols: if the symbol is visible (if it is in this class, or is a non-private member of a base class), and has the same name as sym, return the hidden_sym.
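The visibility test described above can be modeled in a few lines. The following is a simplified sketch using plain structs rather than the Static API: SketchMember, SketchClass, and findHidden() are hypothetical names, the model merges member functions and data members into one list, and the recursive walk over base classes is an assumption about how inherited members are reached.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a class member: just a name and an access flag.
struct SketchMember {
    std::string name;
    bool is_private;
};

// Hypothetical model of a class: its own members plus its base classes.
struct SketchClass {
    std::vector<SketchMember> members;
    std::vector<const SketchClass *> bases;
};

// Returns the member that `name` collides with, or nullptr if there is none.
// A member is visible if it belongs to the class being checked, or if it is
// a non-private member of one of its base classes.
const SketchMember *findHidden(const std::string &name,
                               const SketchClass &cl,
                               bool is_base_class = false) {
    for (const SketchMember &m : cl.members) {
        bool visible = !is_base_class || !m.is_private;
        if (visible && m.name == name) return &m;
    }
    // Assumption: inherited members are found by recursing into each base.
    for (const SketchClass *base : cl.bases) {
        const SketchMember *hit = findHidden(name, *base, true);
        if (hit) return hit;
    }
    return nullptr;
}
```

In this model a private member of a base class never counts as hidden, because a local identifier in the derived class could not have referred to it in the first place.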
check() then iterates through all functions in the class. Remember that AllFunctions() returns all member functions of a class, as well as all function templates in a template, so the same code can handle both cases. The loop first rejects "synthetic" compiler-generated functions and "incomplete" functions. (Incomplete functions have a declaration but no definition, and therefore no FunctionBlock.
Example Files
Source files for the example rules are available on-line in /opt/softbench/examples/CodeAdvisor/Rules. The files in this directory include:
Makefile A make control file to build all the example files.
ruleCapClass.C, ruleLocalHides.C, ruleNameConventions.C, ruleProhibDefines.C, ruleProhibIdent. Sources for the example rules. Notice that the ProhibDefines and ProhibIdent rules are identical to the corresponding rules that are shipped with CodeAdvisor, except they are named UserRulesProhibrule
SoftBench CodeAdvisor, or specify the library location using the -l flag to softcheck. "make install" installs the rule library and help volume in the standard locations. Note that you must run the install as "root", since the required directories under /opt/softbench are not typically writeable by ordinary users.
The UserRulesLocalHides Rule
#include
#include
#include
#include
// Note, only sprintf is used; no stdio/iostream mix

class UserRulesLocalHides : public Rule {
public:
    virtual int kindMask() const;
    virtual Language langMask() const;
    void check(SymbolTable *, const Symbol &);
    virtual const char *errorMess() const;
    virtual const char *name() const;
};

// Return a pointer to the simple name of member or namespace qualified obj.
// Test to see if a symbol hides (or has the same name but doesn't hide)
// some member of a class, or some inherited member.
static DBboolean shadow(const Symbol &sym,    // symbol that may be shadowed
                        const Class &cl,      // class to check members of
                        Symbol &hidden_sym,   // symbol that sym collides with
                        DBboolean baseclassp = false  // is this a baseclass
                                                      // of one where sym defined?
                        )
{
    char name[1024];
    simpleName(sym.Name(), name);
    // test sym name against local member functions
    ITERATOR(Function) fmi=cl.
} assert(tagi.
if (shadow(parami, cl, hidden_sym)) {
    sprintf(buf, "Parameter '%s' of '%s' hiding member '%s' with same name",
            parami.Name(), fmi.Name(), hidden_sym.Name());
    violation(parami, buf, "UserRules");
}
}
ITERATE_END(parami)
// check variables defined in any block within function
ITERATOR(Variable) vari=fblock.BlockVariables();
ITERATE_BEGIN(vari) {
    if (shadow(vari, cl, hidden_sym)) {
        sprintf(buf, "Local variable '%s' in '%s' hiding member '%s' with same name",
                vari.Name(), fmi.Name(), hidden_sym.
5 Implementing Your Rule
Now that you understand the building blocks you can work with, you can decide how to implement your rule. You must decide what approach will work best within the SoftBench CodeAdvisor framework.
Design Guidelines
The following are suggested guidelines for your rule designs. Do not generate excessive violations. It's usually better to miss flagging a few errors than to flag incorrect violations.
Write your code to work for both classes and templates. Most class rules apply equally well to classes and templates, so it makes sense to check both. Convert Symbol objects to Tags using Symbol::SymbolToTag(), then verify the Tag refers to a Class using Tag::ClassType(). This test succeeds for both classes and class templates. Be aware that the test also succeeds for structs and unions in C++ code, since C++ treats them almost identically.
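The two-step check described above can be sketched with stand-in types. Everything here is hypothetical and written only for illustration: SketchSymbol, SketchTag, and SketchClass are not the DB_Read.h classes, the signature used for ClassType() (output argument plus boolean result) is an assumption modeled on the SymbolTotype() convention, and the declared-as-struct and declared-as-union flags only imitate what the corresponding attributes would report.

```cpp
#include <cassert>

// Hypothetical kinds of aggregate a symbol might name.
enum SketchAggregate { AGG_NONE, AGG_CLASS, AGG_STRUCT, AGG_UNION, AGG_ENUM };

struct SketchClass {
    bool declared_as_struct;
    bool declared_as_union;
};

struct SketchTag {
    SketchAggregate aggregate;

    // Assumed signature. Succeeds for classes and, as in C++ code,
    // for structs and unions as well; fails for enums.
    bool ClassType(SketchClass &cl) const {
        if (aggregate != AGG_CLASS && aggregate != AGG_STRUCT &&
            aggregate != AGG_UNION) {
            return false;
        }
        cl.declared_as_struct = (aggregate == AGG_STRUCT);
        cl.declared_as_union = (aggregate == AGG_UNION);
        return true;
    }
};

struct SketchSymbol {
    SketchAggregate aggregate;  // AGG_NONE: the symbol is not a tag at all

    bool SymbolToTag(SketchTag &tag) const {
        if (aggregate == AGG_NONE) return false;
        tag.aggregate = aggregate;
        return true;
    }
};

// Returns true when the rule should examine the symbol as a class.
bool shouldCheck(const SketchSymbol &sym, bool skip_structs_and_unions) {
    SketchTag tag;
    SketchClass cl;
    if (!sym.SymbolToTag(tag)) return false;   // not an aggregate
    if (!tag.ClassType(cl)) return false;      // an aggregate, but not a class
    if (skip_structs_and_unions &&
        (cl.declared_as_struct || cl.declared_as_union)) {
        return false;
    }
    return true;
}
```

Whether to skip structs and unions depends on the rule; a naming rule for classes might skip them, while a rule about member hiding probably should not.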
Implementing the Rule
Once you understand the rule model, the Static API, and the design guidelines, you can begin implementing your rules. The example files provided with the system can be very helpful when learning the rule programming environment. If you have not studied the examples described in "Example Files" in Chapter 4, please do so before proceeding. The following sections outline a recommended procedure for developing rules.
Keep the Static API capabilities in mind when assigning difficulty scores. An apparently simple rule may be difficult to implement if it requires program knowledge that the database does not provide.
each of the base classes. The exact procedure you use will depend on your rule's semantics.
Compiling the Rule
The Makefile in /opt/softbench/examples/CodeAdvisor/Rules correctly compiles and links rule libraries, and can be invoked from the SoftBench Project builder. If you create your own Makefile or compile from the command line, you must remember the following points: Rule code must be compiled with the aCC compiler, using the following options: aCC -I/opt/softbench/include +z -c rulefile.
You may find softcheck very useful in testing your rules. You can easily invoke SoftBench CodeAdvisor on your rule with the command softcheck -l YourRuleLibDir -r RuleToCheck See softcheck(1) for more information on softcheck. See "Debugging Your Rule" for a full explanation of running and debugging rules.
Adding Your Rule to a Rule Group
The SoftBench CodeAdvisor product includes over 1000 rules. Many of them flag potential problems, so they can generate violations in cases where there is currently no error. Since it would be difficult to use the output of all rules at once, rules are organized into rule groups. Each rule group contains rules that are related in some way. You can select any set of rule groups, and run the analysis using only those rules.
$HOME/.softbench/rulegroups Personal changes. Visible within all projects for that user.
$PROJECTROOT/Projects/project-name/rulegroups Personal changes. Visible only within the specified project.
The locations are checked in the order above. Later information overrides previous information; for example, personal customizations under $HOME/.softbench are merged in with the system-wide customizations in /etc/opt/softbench/config, and override them on a group-by-group basis.
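The override behavior described above (later locations win, one whole group at a time) can be sketched as a simple map merge. This is only a model of the merge order: the RuleGroups type, the mergeRuleGroups() function, and the group and rule names in the usage note are hypothetical, and the sketch does not attempt to parse the actual rulegroups file format.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory form of one rulegroups location:
// group name -> names of the rules in that group.
typedef std::map<std::string, std::vector<std::string> > RuleGroups;

// Merges the locations in lookup order. A group defined in a later
// location replaces the same group from an earlier one as a whole;
// groups defined in only one location are kept unchanged.
RuleGroups mergeRuleGroups(const std::vector<RuleGroups> &layers) {
    RuleGroups merged;
    for (const RuleGroups &layer : layers) {
        for (const auto &entry : layer) {
            merged[entry.first] = entry.second;
        }
    }
    return merged;
}
```

For example, if the system-wide location defines groups "Style" and "Safety" and the personal location redefines only "Style", the merged result takes "Style" from the personal location and "Safety" from the system-wide one.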
The name of your group can contain only alphanumeric characters. The first character must be alphabetic. After adding your new group to the rulegroup file, start SoftBench and display the CodeAdvisor page. You should see your group in the "Rule Groups" area.
Updating the Group Index
Under the rule group selection area on the CodeAdvisor page, the Rule Group Help... button displays an index of all rules sorted by rule group.
Debugging Your Rule
After you have implemented your rule, you can test it by running it under SoftBench CodeAdvisor or by using the softcheck command. SoftBench CodeAdvisor provides the complete user interface that your users will see, and also allows you to test the linkage to your on-line help. Install the new library in /opt/softbench/lib/rulelibs. softcheck provides a very simple and "light-weight" interface to the rule engine.
5. 6. 7. Static database. Enter "-l library-dir", where library-dir is the directory containing your debuggable rule library. (This is not necessary if your library is in the standard location, /opt/softbench/lib/rulelibs.) If you want to run only a few rules in your test library, specify them by entering "-r rule-name" or "-r rule-group" in the "Program Arguments" Input Box. Multiple rules or rule groups can be specified by separating them by colons.
Setting Breakpoints In Your Rule
Since rules are stored in dynamically-loaded shared libraries, you must know how to debug these libraries within SoftBench Debugger. You cannot set breakpoints in your rule library immediately after running your program, since the library has not been loaded. You must load your rule libraries first. 1. Enter "libsLoaded" in the SoftBench Debugger "( ):" Input Box and choose "Break: Set At ( )". (Or, since libsLoaded() is directly after main() in debugPoints.
Tracing Rule Execution You can cause softcheck to generate some extra output that may be useful in your debugging. The environment variable RULE_DEBUG accepts several values: RULE_DEBUG=1 Displays a message just before calling each rule. The message indicates the object on which the rule is being invoked. This can be very useful if, for example, you encounter a core dump in your rules. By turning on this message, you can immediately see what rule caused the core dump, and what object triggered the problem.
Documenting Your Rule
In addition to the normal documentation that is recommended for any program, you should provide on-line help for your rule. When your rule detects and reports a violation, the user has the option of displaying an on-line summary and explanation of the rule. In SoftBench Program Builder, this is done by selecting the Help button after selecting the violation display. When the user selects Help, a message is sent to the SoftBench On-Line Help server to display the help text.
in all help volumes. If you are writing help for languages other than English, refer to the CDE Help System Author's Guide for additional instructions. Referring to Other Help Volumes The basic HelpTag tools allow you to refer to other nodes within your help volume.
A Detailed Database Type Descriptions
The Static database interface provides two header files to declare the constants, types, and functions used to access the database. These files are:
DB_Common.h Common types and constants used by the database. The contents of this file are described in this Appendix.
DB_Read.h The "read" interface to the database. The contents of this file are described in "Object Interfaces" in Chapter 4.
The database header files are found under install dir /include/DB_Access.
Object Kind
Database objects are represented by a "handle" of type PerHandle. They are typed by the enum PerKind. The first four PerKind values (KIND_BADSYMBOL, KIND_SYMBOLENTRY, KIND_FILEENTRY, KIND_RELATION) are only used internally. You will not encounter them.
Attributes
Each object has an attribute field that describes the attributes pertinent to that object. The Attribute type is defined as a bit vector: typedef unsigned long Attribute; Attributes are combined as necessary for a given object. The interface also defines inline functions in DB_Common.h to test the associated Attribute values. These predicate functions generally start with IS_, WAS_, or HAS_, such as IS_GLOBAL(), WAS_STRUCT(), and HAS_DEFAULT().
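The bit-vector scheme described above can be sketched as follows. Only the Attribute typedef is taken from the text; the SKETCH_ constants use placeholder bit values chosen for this example (the real ATTR_ values are defined in DB_Common.h and are not reproduced here), and the sketchIs...() predicates are hypothetical analogues of the IS_ functions.

```cpp
#include <cassert>

typedef unsigned long Attribute;  // as declared in the text above

// Placeholder bit values for illustration only; not the values
// that DB_Common.h assigns to the real ATTR_ constants.
const Attribute SKETCH_ATTR_GLOBAL  = 1UL << 0;
const Attribute SKETCH_ATTR_STATIC  = 1UL << 1;
const Attribute SKETCH_ATTR_VIRTUAL = 1UL << 2;

// Hypothetical predicates in the style of IS_GLOBAL() and friends:
// each one tests a single bit of the attribute vector.
inline bool sketchIsGlobal(Attribute a)  { return (a & SKETCH_ATTR_GLOBAL) != 0; }
inline bool sketchIsStatic(Attribute a)  { return (a & SKETCH_ATTR_STATIC) != 0; }
inline bool sketchIsVirtual(Attribute a) { return (a & SKETCH_ATTR_VIRTUAL) != 0; }

// Attributes are combined with bitwise OR.
inline Attribute sketchCombine(Attribute a, Attribute b) { return a | b; }
```

Using the predicate functions rather than testing bits directly keeps rule code independent of the actual bit assignments.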
ATTR_VIRTUAL Applies to class members and inheritance relationships. (IS_VIRTUAL())
ATTR_PURE Applies only to virtual class member functions. (IS_PURE())
ATTR_ABSTRACT Applies to classes that contain a pure virtual function. (IS_ABSTRACT())
ATTR_DECLARED_STRUCT Applies to C++ classes that were declared as a C struct.
ATTR_DECLARED_UNION ATTR_DEFAULT ATTR_SPECIALIZATION ATTR_INLINED ATTR_COMPILE_ERRORS ATTR_INSTANTIATED ATTR_SYNTHETIC
Scalar Types
Scalar types are described by members of the ScalarType enum. Legal ScalarType values are:
SCALAR_CHAR Signed character type.
SCALAR_UNSIGNED_CHAR Unsigned character type.
SCALAR_WIDE_CHAR The NLS wide character type.
SCALAR_SHORT SCALAR_UNSIGNED_SHORT SCALAR_INT SCALAR_UNSIGNED_INT SCALAR_FLOAT SCALAR_DOUBLE SCALAR_LONGDOUBLE SCALAR_TEMPLARG SCALAR_FUNCTYPE SCALAR_LOGICAL SCALAR_STRING SCALAR_TEXT SCALAR_LABEL SCALAR_POINTER SCALAR_VOID SCALAR_LONG SCALAR_UNSIGNED_LONG
Language Types
The Language type is used to determine the programming language contained in a File. Language is a bit vector defined as: typedef unsigned long Language; The legal Language values are:
LANGUAGE_C C source file.
LANGUAGE_F77 FORTRAN 77 source file.
LANGUAGE_PASCAL HP Pascal source file.
LANGUAGE_COBOL HP COBOL source file.
LANGUAGE_BASIC BASIC source file.
LANGUAGE_ADA Ada source file.
LANGUAGE_CPP C++ source file.
LANGUAGE_UNKNOWN Any source file kind.
References
A reference is a tuple of line, column, length, and usage information. The line, column and length describe the token position in the file; the Usage describes the context in which the reference occurs. Reference is defined as follows:
typedef struct {
    unsigned long length : 8;
    unsigned long line : 24;
    unsigned short column;
    Usage use;
} Reference;
Note that the line field causes a type mismatch if you attempt to print it using cout. Cast it to an integer (cout << (int) ref.
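The cast described in the note above can be sketched as follows. The Reference layout is copied from the text; the Usage typedef is a placeholder introduced only so that the example compiles on its own (the real type is declared in DB_Common.h), and formatReference() is a hypothetical helper.

```cpp
#include <cassert>
#include <sstream>
#include <string>

typedef unsigned short Usage;  // placeholder; the real Usage is in DB_Common.h

typedef struct {
    unsigned long length : 8;
    unsigned long line : 24;
    unsigned short column;
    Usage use;
} Reference;

// Formats a reference for a diagnostic message. The bit-field members
// are cast to int before being written to the stream, as the note
// above recommends.
std::string formatReference(const Reference &ref) {
    std::ostringstream out;
    out << "line " << (int) ref.line
        << ", column " << ref.column
        << ", length " << (int) ref.length;
    return out.str();
}
```

The 24-bit line field and 8-bit length field both fit comfortably in an int, so the cast loses no information.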
Error Codes
The database interface routines define a global variable DBError to allow the application to diagnose any problems. This variable is primarily set during database open/close operations and during write operations; therefore, it is not generally used in rules. DBError has two fields: one to record any system error (errno) and the other to record any error condition detected by the database.
DBERR_DBFILE_READ There was a system failure in an attempt to read from the database file.
DBERR_LOCKFILE_OPEN There was a system failure in an attempt to open the lock file.
DBERR_LOCK There was a system failure in an attempt to lock the lock file.
DBERR_BAD_NAME A bad (non string) name value was passed to a routine in the write interface.
DBERR_BAD_ATTRIBUTES A bad attribute value was passed to a routine in the write interface.
DBERR_BAD_SCALAR DBERR_BAD_HANDLE
B Iterators
Iterators are the mechanism used to loop through an arbitrary number of objects in the Static database. Because of some limitations in the C++ template mechanism, it's not possible to define general iterators using templates. Instead, the Static database interface simulates the template functionality using #defines. "Iterators" in Chapter 4 gives a simple explanation of the use of iterators. That explanation is sufficient for most users. This section explains the mechanism behind iterators.
Standard Iterators
Iterators are defined as follows:
class Iterator {
public:
    Iterator(long count, PerHandle *handles);
    Iterator();
    Iterator(const Iterator &iterator);
    ~Iterator();
    Iterator &operator=(Iterator &iterator);
    void add(long count, PerHandle *handles) const;
    DBboolean Open(PerHandle &handle) const;
    DBboolean Next(PerHandle &handle) const;
    DBboolean Done() const;
protected:
    PerHandle IteratorHandle;
};
#define ITERATOR(Base) Base##Iterator
#define ITERATOR_IMPLEMENT(Base, Handle) class ITERATOR(Ba
Static database code can then declare functions of type ITERATOR(object). These functions return an iterator on objects of type object. Since the iterator class inherits both from Iterator and from object, the new iterator can be used to access both Iterator operations (to step through objects in the iteration list) and object operations and data (to manipulate objects in the list). The methods Open() and Next() allow navigation through the array of iterators.
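The macro technique described above (token pasting to build a per-type iterator class that inherits from both the iterator and the object) can be sketched in a self-contained form. All SKETCH_ and Sketch names are hypothetical and written for this example; the macro body below is a simplification and is not the ITERATOR_IMPLEMENT definition from the Static database headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef long SketchHandle;  // stand-in for PerHandle

// Simplified iterator base: steps through a list of handles.
class SketchIterator {
public:
    SketchIterator() : pos_(0) {}
    void add(SketchHandle h) { handles_.push_back(h); }
    bool Open(SketchHandle &h) { pos_ = 0; return current(h); }
    bool Next(SketchHandle &h) { ++pos_; return current(h); }

private:
    // Copies the handle at the current position into h, if there is one.
    bool current(SketchHandle &h) const {
        if (pos_ >= handles_.size()) return false;
        h = handles_[pos_];
        return true;
    }
    std::vector<SketchHandle> handles_;
    std::size_t pos_;
};

// Simplified database object: identified by the handle it holds.
struct SketchFunction {
    SketchHandle handle;
    SketchFunction() : handle(0) {}
    long Id() const { return handle; }
};

// Token pasting builds the iterator class name from the object name.
#define SKETCH_ITERATOR(Base) Base##Iterator

// The generated class inherits from both the iterator and the object,
// so one variable supports stepping and object access.
#define SKETCH_ITERATOR_IMPLEMENT(Base) \
    class SKETCH_ITERATOR(Base) : public SketchIterator, public Base { \
    public: \
        bool First() { return Open(handle); } \
        bool Advance() { return Next(handle); } \
    };

SKETCH_ITERATOR_IMPLEMENT(SketchFunction)

// Walks the iteration list, using the iterator itself as the object.
long sumIds(SKETCH_ITERATOR(SketchFunction) &it) {
    long total = 0;
    for (bool ok = it.First(); ok; ok = it.Advance()) total += it.Id();
    return total;
}
```

Each stepping call loads the next handle into the object half of the iterator, which is why the loop body can call object methods such as Id() directly on the iterator variable.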
Attribute Iterators
A few objects use a specialized form of Iterator called AttributeIterator. AttributeIterators are identical to Iterators in every way, except that each object in the iteration list includes an Attribute field. Attribute, as defined in DB_Common.h, specifies what kind of symbol is defined by the current object. As an example, a symbol may be ATTR_PUBLIC or ATTR_PRIVATE. Attribute iterators are defined in only two situations: in the Global Symbol Table and in Class objects.
Attribute iterators are defined as follows: class AttributeIterator : public Iterator { public: AttributeIterator(long count, PerHandle *handles, Attributes *attr); AttributeIterator(); ~AttributeIterator(); void add(long count, PerHandle *handles, Attributes *attr) const; DBboolean SetIteratorAttribute(Attributes) const; DBboolean GetIteratorAttribute(Attributes &attr) const; }; #define ATTRIBUTE_ITERATOR(Base) Base##AttributeIterator #define ATTRIBUTE_ITERATOR_IMPLEMENT(Base, Handle) class ATTRIBUTE_ITERATOR(Base)