
Adding criteria to document architectural and security info


Andy Murren
 

tl;dr  Too much information about how the software is architected and security relevant information is difficult to find or non-existent. This is an informal discussion about the possibility of adding criteria to document some architectural and security information.

 

Before formally adding anything as a change request I want to discuss the applicability and feasibility of adding some documentation items to the criteria. For some context I am a security engineer who, as is standard practice, gets pulled in after software has been deployed into production and tasked with figuring out how to meet industry and government requirements. This includes trying to figure out end user and privileged user accesses, separation of duties, audit reduction, security event alerting, monitoring critical application files for unauthorized modifications, etc., etc., etc.

 

Where I see a need is in documenting the software more than in implementing it.  The criteria for more in depth documentation of the software architecture could be criteria for more advanced badges, but some could be helpful for projects of any level.  Much of this the primary developers already know; it is just not in a format that is easy to find or use.

 

I spend a fair amount of time reading documentation, parsing logs, and reviewing source code when it is available. Things I think that would help security people like me and I suspect some others include:

 

 1. If a database is used, document the schema

    - Table names along with keys and indexes

    - Column names with data type and length

 

Example:

Table: employee_tbl

    emp_no, INT(11), key, index

    first_name, varchar(20)

    middle_name, varchar(20)

    last_name, varchar(20)

    suffix, INT(3), links to suffix_tbl:suffix_id

    gender, ENUM('M','F')

    role, INT(3), links to role_tbl:role_id

   

Table: suffix_tbl

    suffix_id, INT(3), key

    suffix_abbv, CHAR(5)

    suffix_long, VARCHAR(25)

 

Table: role_tbl

    role_id, INT(3), key

    role_abbv, CHAR(5)

    role_title, VARCHAR(25)

    has_priv_IT_access, BOOL

    has_manager_access, BOOL

 

Reasons:

   Developers: Allows developers to understand the database schema so they can most effectively use it

   Security:

        a) Can request developers to create specific reports as part of continuous monitoring / auditing

        b) Can tailor auditing to look for changes to important fields (like adding privileged access to IT systems)
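
As a rough illustration of reason (b), here is a minimal Python sketch that compares two snapshots of role_tbl and reports changes to the privilege columns. The snapshot dictionaries, the function name, and the "DEV" role are hypothetical; in practice the rows would be read from the database at two points in time.

```python
# Columns from role_tbl whose changes a security team would want to see.
SENSITIVE_COLUMNS = ("has_priv_IT_access", "has_manager_access")


def privilege_changes(before, after):
    """Return (role_id, column, old, new) for each sensitive value that differs.

    Both arguments map role_id -> row dict. Roles that are new in `after`
    are reported with old values of None; deleted roles are not reported.
    """
    changes = []
    for role_id, new_row in sorted(after.items()):
        old_row = before.get(role_id, {})
        for column in SENSITIVE_COLUMNS:
            old, new = old_row.get(column), new_row.get(column)
            if old != new:
                changes.append((role_id, column, old, new))
    return changes


# Hypothetical snapshots: role 1 gains privileged IT access.
before = {1: {"role_abbv": "DEV", "has_priv_IT_access": False, "has_manager_access": False}}
after = {1: {"role_abbv": "DEV", "has_priv_IT_access": True, "has_manager_access": False}}

for change in privilege_changes(before, after):
    print("ALERT: role %s column %s changed from %r to %r" % change)
```

A check like this is only possible once the schema documents which columns carry privilege information.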

 

  2. Listing of standard error messages

    - Error number

    - Error title

    - Severity

    - Standard text

    - Usage

 

Example:

22; Unknown User; Fatal; “unknown uid %u: who are you?”; User ID is not a known or valid UID

23; User not part of requested project; Warn; “user "%s" is not a member of project "%s"”; The user requested access to resources for a project they are not a member of

 

Reasons:

    General: Not all projects have an errno.h file where it is easy to find and use standard error messages

    Developers:

        a) Increases the possibility of developers using error handling

        b) Reduces the possibility of developers inventing their own error messages

    Security: Makes it easier to establish and maintain auditing, alerting, and audit reduction
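
One possible way to keep such a listing usable by both developers and auditors is to hold it as data in the code base. The Python sketch below is only an illustration: the class and function names are invented here, and the user and project names in the usage lines are made up. The two entries are the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorEntry:
    number: int
    title: str
    severity: str
    text: str   # printf-style template, as in the listing above
    usage: str


CATALOG = {
    entry.number: entry
    for entry in (
        ErrorEntry(22, "Unknown User", "Fatal",
                   "unknown uid %u: who are you?",
                   "User ID is not a known or valid UID"),
        ErrorEntry(23, "User not part of requested project", "Warn",
                   'user "%s" is not a member of project "%s"',
                   "The user requested access to resources for a project "
                   "they are not a member of"),
    )
}


def format_error(number, *args):
    """Render a catalogued error so logs always carry its number and severity."""
    entry = CATALOG[number]
    return "[%d] %s: %s" % (entry.number, entry.severity, entry.text % args)


print(format_error(22, 1001))
print(format_error(23, "alice", "apollo"))
```

Because every message is emitted with its number and severity, alerting rules can match on those fields rather than on free text.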

 

3. List of critical and configuration files or directories

    - Name of file or directory

    - Standard path to file or directory

    - Description of file or directory

 

Example:

db.conf;/etc/db/db.conf; primary server configuration file

data files;/db/data; database files are stored in this directory with an extension of .db

 

Reasons:

    Developers:  Understanding of which files or directories to use to find/store important application components

    System Admins: Easier to find information about important files or directories

    Security:

        a) Can document location of critical and important files or directories

        b) Can set up auditing, monitoring, and alerting of important files and directories for unexpected or unauthorized changes
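
To make reason (b) concrete, here is a minimal Python sketch of change detection driven by such a list. It assumes SHA-256 digests are an acceptable fingerprint and handles regular files only; a directory such as /db/data would need to be walked file by file. The function names are hypothetical, and the demonstration uses a temporary file rather than a real /etc/db/db.conf.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def snapshot(paths):
    """Map each path to its digest, or to None if it is not a regular file."""
    result = {}
    for path in paths:
        path = Path(path)
        result[str(path)] = sha256_of(path) if path.is_file() else None
    return result


def changed_files(baseline, current):
    """Return the baseline paths whose digest differs in the current snapshot."""
    return sorted(p for p in baseline if baseline[p] != current.get(p))


# Demonstration on a stand-in for the documented configuration file.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "db.conf"
    conf.write_text("port=5432\n")
    baseline = snapshot([conf])
    conf.write_text("port=6543\n")  # simulate an unexpected modification
    print(changed_files(baseline, snapshot([conf])))
```

The list of paths passed to snapshot() is exactly what the proposed documentation would supply.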

 

Other things that it would help to have documented include things such as the basic architecture and dataflow and how credentials (username/passwd and certificates) are stored and protected.

 

I am interested in hearing what the group thinks of the proposal. I am willing to write the criteria if there is a consensus to add it.

 

Andy Murren

 


David A. Wheeler
 

> tl;dr  Too much information about how the software is architected and security relevant information is difficult to find or non-existent. This is an informal discussion about the possibility of adding criteria to document some architectural and security information.

> Before formally adding anything as a change request I want to discuss the applicability and feasibility of adding some documentation items to the criteria….

> Where I see a need is in documenting the software more than in implementing it.  The criteria for more in depth documentation of the software architecture could be criteria for more advanced badges, but some could be helpful for projects of any level.

 

I think that at least for now this would go into a “more advanced” badge, because so many projects currently *don’t* document their architecture.  But that’s okay – we need to work out the criteria for those higher levels!

 

I agree that some information is really valuable for anyone who’s trying to analyze the software/system for security.

 

> Things I think that would help security people like me and I suspect some others include:

>  1. If a database is use, document the schema

 

Sure, and I completely agree that knowing schemas (where applicable) is vital.

 

*However* - and this applies to some of the things you listed – I think we should encourage documenting “how to find or generate that information” where possible, instead of requiring some duplication of information that will rapidly go out of date.

 

For example, in the BadgeApp, the underlying database schema is generated and maintained in the file “db/schema.rb”.  The documentation should *not* duplicate its contents, because the “db/schema.rb” file is kept current.  But telling people “where to get the current information” is easy, and frankly more useful.

 

>   2. Listing of standard error messages

 

>         a) Increases the possibility of developers using error handling

>         b) Reduces the possibility of developers inventing their own error messages

>     Security: Makes it easier to establish and maintain auditing, alerting, and audit reduction

 

I’m skeptical this should really be *mandated*.  On a lot of systems this is nigh impossible to do, and even after you do it, it’s not clear that sifting through all the errors really tells you a lot.  What’s more, increasingly software is built from mostly reused components, which are repeatedly updated – are those included?

 

The bigger problem is the stuff that does *not* report errors, in which case this won’t help.

 

 

> 3. List of critical and configuration files or directories

 

Sure!

 

> Other things that it would help to have documented include things such as the basic architecture and dataflow and how credentials (username/passwd and certificates) are stored and protected.

 

I would start with that.  I think “basic architecture” (what are the main components & how do they interact) is the most important, especially if you include “where are connections to external untrusted entities”?  Credential handling is also very important.
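
As a sketch of what a minimal, reviewable "basic architecture" record might look like, the Python below lists components and data flows and picks out the flows that touch an untrusted entity. Everything in it is hypothetical: the component names, the flows, and the simple trusted/untrusted flag are illustrative assumptions, not a description of any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    trusted: bool  # False for external or otherwise untrusted entities


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    data: str


# Hypothetical three-component system.
COMPONENTS = {
    c.name: c
    for c in (
        Component("browser", trusted=False),
        Component("web_app", trusted=True),
        Component("database", trusted=True),
    )
}

FLOWS = [
    Flow("browser", "web_app", "login credentials"),
    Flow("web_app", "database", "SQL queries"),
]


def untrusted_flows(components, flows):
    """Return the flows where at least one endpoint is untrusted."""
    return [
        f for f in flows
        if not (components[f.source].trusted and components[f.target].trusted)
    ]


for f in untrusted_flows(COMPONENTS, FLOWS):
    print("%s -> %s (%s)" % (f.source, f.target, f.data))
```

Even a record this small answers the two questions raised here: what the main components are, and where the system meets external untrusted entities.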

 

I would think we could reference some materials, instead of just trying to create this list from scratch.  Anyone have a reasonable citation offhand?

 

Anyone else have thoughts about this?

 

--- David A. Wheeler

 


Daniel Stenberg
 

On Thu, 9 Jun 2016, andrew murren wrote:

> tl;dr Too much information about how the software is architected and security relevant information is difficult to find or non-existent. This is an informal discussion about the possibility of adding criteria to document some architectural and security information.

I think we risk getting too far into nitty-gritty details here. I'm afraid of bloating the set of criteria. Feature creep, if you will. Asking for very specific details in the documentation isn't, I think, a FOSS best practice; that's just asking for good and accurate documentation. I think even more important qualities for documentation are that it is "fresh", up to date, and written in clear language that can be understood easily. But I don't see how we can add requirements for that.

If a project already reaches 100% on the 66 existing criteria we have now, I'd say that is a pretty well-run project.

If that project would lack details in the documentation (it must have to be 100%), there's a known way to report that problem (it must have to be 100%) and the project provides the sources in a code repo (it must have to be 100%) so you can send updates yourself to help out (otherwise it wouldn't be 100%).

I would much rather that we, once we see more projects reaching 100% so that we have a set to actually get some stats from, see if we can figure out patterns in the remaining flaws (or perhaps "less than ideal artifacts") among these projects, think of existing best practices that would improve them, and then add such criteria.

This is not me saying projects shouldn't improve their documentation. We all should.

--

/ daniel.haxx.se


David A. Wheeler
 

Daniel Stenberg:

> I'm afraid of bloating the set of criteria.

Me too.

We've always planned to add additional badges/badge levels ("gold" and "platinum") at some future time. However the criteria for those additional levels (whatever they turn out to be) are planned to be a separate set of criteria, not affecting the current set, and the expectation is that many projects won't try to achieve those higher levels. I have no doubt that criterion #1 for the higher level would be "meet the 'passing' set first".

Detailed design documentation is certainly much less common in FLOSS projects, so I think it should *not* be in the current badge level... but I think it's reasonable to entertain it at higher badge levels. That doesn't mean we should *do* it, just that it's worthy of discussion.

> Asking for very specific details in the documentation I think isn't a FOSS best practice, that's just asking for good and accurate documentation. I think even more important qualities for documentation are that it is "fresh", up to date and written in a clear language that can be understood easily. But I don't see how we can add requirements for that.

The *freshness* is a *BIG* concern to me. Some of you may know that I used to be a lead validator under the Common Criteria (CC), which focuses very heavily on document review, so I'm quite familiar with how that works. Documentation review has a number of problems (big surprise). One of the biggest problems, which is subtle yet devastating, arises when the documentation being reviewed isn't really representative of the actual system.

I would want to maximize automatically generated materials. E.g., don't give me a markdown or Word document with the schema - show me the command that prints the current schema. Doxygen can use code, and while the comments may be out of date, their proximity to the code they describe increases the likelihood of their being correct. That sort of thing.
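
A minimal sketch of "show me the command that prints the current schema", written in Python and assuming an SQLite database purely for illustration (the thread does not name a database engine, and other engines expose their catalogs differently). The function name is hypothetical, and the demonstration table is borrowed from the suffix_tbl example earlier in the thread.

```python
import sqlite3


def current_schema(conn):
    """Return {table: [(column, declared_type, is_key), ...]} read from the live database."""
    schema = {}
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in rows:
        if table.startswith("sqlite_"):
            continue  # skip SQLite's internal bookkeeping tables
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute('PRAGMA table_info("%s")' % table).fetchall()
        schema[table] = [(col[1], col[2], bool(col[5])) for col in columns]
    return schema


# Demonstration on an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suffix_tbl ("
    "suffix_id INTEGER PRIMARY KEY, suffix_abbv CHAR(5), suffix_long VARCHAR(25))"
)

for table, columns in current_schema(conn).items():
    print("Table:", table)
    for name, declared_type, is_key in columns:
        print("    %s, %s%s" % (name, declared_type, ", key" if is_key else ""))
```

Because the listing is read from the database itself, it cannot drift out of date the way a hand-maintained document can.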

> If that project would lack details in the documentation (it must have to be 100%), there's a known way to report that problem (it must have to be 100%) and the project provides the sources in a code repo (it must have to be 100%) so you can send updates yourself to help out (otherwise it wouldn't be 100%).

I'm not sure how you're measuring those things, but it sounds plausible. Can you be a little more specific on how you'd word that?

BTW, I think we need to focus on the "minimum documentation needed" not "ideal"; there's no end to writing it.

> I would much rather that we, once we see more projects reaching 100% so that we get a set to actually get some stats from, see if we can figure out patterns in remaining flaws (or perhaps "less than ideal artifacts") among these projects that we can think of existing best-practices that would improve them and then add such criteria.

Yes, that's definitely the plan.

> This is not me saying projects shouldn't improve their documentation. We all should.

:-).

--- David A. Wheeler


Alan Robertson <alanr@...>
 

On 06/09/2016 06:05 PM, Wheeler, David A wrote:
> I would want to maximize automatically generated materials.  E.g., don't give me a markdown or Word document with the schema - show me the command that prints the current schema.  Doxygen can use code, and while the comments may be out of date, their proximity to the code they describe increases the likelihood of their being correct.  That sort of thing.

<semi-shameless-plug>
One of the things we do in the Assimilation Suite is discover "stuff". We know which systems have databases on them, lots of security settings of all kinds. Something I'd like to get to is discovering the schema and things like which columns are encrypted, etc.

This would be relatively easy to do - if someone cares enough to say they'd use it in real life.

I usually say there are three kinds of documentation:
    - the most common kind is documentation you do not have
    - the second kind is documentation which is incorrect today
    - the third and last kind is documentation which will be incorrect tomorrow

There is no fourth kind of documentation. This is why we concentrate on keeping our database up to date with the mechanically discoverable information.
</semi-shameless-plug>


--

Alan Robertson / CTO / +1 303.947.7999

Assimilation Systems Limited
http://AssimilationSystems.com


Daniel Stenberg
 

On Thu, 9 Jun 2016, Wheeler, David A wrote:

> I would want to maximize automatically generated materials. E.g., don't give me a markdown or Word document with the schema - show me the command that prints the current schema. Doxygen can use code, and while the comments may be out of date, their proximity to the code they describe increases the likelihood of their being correct. That sort of thing.

I hear you, and I agree with that in general. I just find that in many cases there's a battle between nicely written docs that are clear and the desire to generate them and have them derived from code. I.e., Doxygen and similar tools are good in the sense that having docs is better than not having docs, but they tend to generate docs that are hard to read and end up messy. Typically people don't write enough detail in their code for the generated docs to be comprehensive.

I typically favor *not* generating docs simply because it can make the docs so much better and easier to understand when done manually. Harder to keep up-to-date sure, but better for the audience when it is.

> I'm not sure how you're measuring those things, but it sounds plausible. Can you be a little more specific on how you'd word that?

I think I'd rephrase myself and say that such criteria would be better put in your suggested "gold" or "platinum" levels.

--

/ daniel.haxx.se