raven/devdocs/coding-standards.txt

# “Do the simplest thing that will work.”

//COMMENT FLAGS:
//TODO: means something that needs to be done but is awaiting something else.  This is a *MUST* do.
//LOOKAT: means something that I want to think about and revisit later but is not urgent
//BEFORE_RELEASE: means something that MUST be changed before release, usually special debugging code for development or testing
//BUGBUG: means there's a bug here that is known


Error messages / Numbers
	- All server error codes start with E1000, all API error codes start with E2000
	- Look for English text in all the messages so far, see if it can be localized (even crudely by Google Translate), and do so
	- Make sure error numbers have a consistent system and don't conflict. I think there are two sets of error numbers; there should only be one
	- Make sure every error has a number and that it is documented in the manual
	- Locale keys for error numbers??  i.e. E1000, "blah blah error 1000"
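
One possible shape for a single error-number system is sketched below. It is only an illustration: the catalog entries, the locale key names, and the reading of "start with" as the ranges E1000-E1999 (server) and E2000 and up (API) are all assumptions, not decisions recorded in this document.

```javascript
// Hypothetical catalog entries: each error code maps to a locale key,
// and the client resolves the key to display text.
const ERROR_CATALOG = [
  { code: "E1000", localeKey: "ErrorServerUnknown" },
  { code: "E1001", localeKey: "ErrorServerDatabaseUnavailable" },
  { code: "E2000", localeKey: "ErrorApiUnknown" },
];

// Builds a code -> locale key lookup and rejects malformed or duplicate codes,
// so two conflicting sets of numbers cannot creep in unnoticed.
function buildErrorIndex(entries) {
  const index = new Map();
  for (const { code, localeKey } of entries) {
    if (!/^E\d{4}$/.test(code)) throw new Error(`Malformed error code: ${code}`);
    if (index.has(code)) throw new Error(`Duplicate error code: ${code}`);
    index.set(code, localeKey);
  }
  return index;
}

// Assumption: server codes occupy E1000-E1999, API codes E2000 and up.
function errorArea(code) {
  return Number(code.slice(1)) >= 2000 ? "api" : "server";
}
```

Running buildErrorIndex over the whole catalog at build or startup time would flag conflicts early.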

C# code convention:
	All names are PascalCaseOnly with the following two exceptions:
		- function parameter names are ALWAYS camelCased
		- CONST values are ALL_CAPS with underscores in place of spaces


DATES, TIMES, TIMESTAMPS - All dates and times sent or retrieved from the REST interface must be in UTC / GMT time zone.
It is the client's responsibility to display and accept dates in local format but interaction with the server is in UTC only.
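
A minimal client-side sketch of the UTC rule, assuming the wire format is an ISO 8601 string ending in "Z" (this document does not fix a format, so that choice and the function names are hypothetical):

```javascript
// Client -> server: always send UTC. Date.prototype.toISOString() emits UTC with a "Z" suffix.
function toWireFormat(date) {
  return date.toISOString();
}

// Server -> client: refuse anything that is not explicitly marked as UTC.
function fromWireFormat(text) {
  if (!text.endsWith("Z")) throw new Error(`Expected a UTC timestamp, got: ${text}`);
  return new Date(text);
}

// Display is the client's job: format in the user's own locale and time zone.
function toDisplayText(date, locale) {
  return date.toLocaleString(locale);
}
```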

Localized text - All text presented by the server will be a locale key only, and it is the client's responsibility to display it in local text format, with the exception of:
    - Ops logs, metrics, event logs: these items and any other future pre-generated text items will be localized at the server according to the logged-in user's locale


JAVASCRIPT
(mostly using AirBNB standard but using Google standard for naming things)
https://google.github.io/styleguide/javascriptguide.xml?showone=Naming#Naming
In general, use functionNamesLikeThis, variableNamesLikeThis, ClassNamesLikeThis, EnumNamesLikeThis, methodNamesLikeThis, CONSTANT_VALUES_LIKE_THIS, foo.namespaceNamesLikeThis.bar, and filenameslikethis.js.

C# Naming
https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/capitalization-conventions

OLDER STUFF
=-=-=-=-=-=

#DISTRIBUTION / DEPLOYMENT
- Linux folders to use:
    - Program files in /opt
    - Data files in /var/lib
    - Log files /var/log/


#PROCESS

- LEAST PRIVILEGE: Code everything for least privilege possible. The principle of “least privilege” mandates that a process should have the lowest level of privilege needed to accomplish its task.
- Test / Evaluation data generator developed hand in hand with testing and application code
    - Data generator is very important, take the time to get it right
    - Production size and complexity data is required for proper development right from the start
    - As new modules are added need to add data generation for them as well
- Test driven development
- Dependency injection
    - https://joonasw.net/view/aspnet-core-di-deep-dive
- Separation of concerns: a Customer object should not have to deal with its persistence directly at all, for example.
- Rapid release cycles of small features (every 2-4 weeks)
- NEVER select *, always specify columns, shouldn't be a problem with EF but remember this, it's for futureproofing and simultaneous versions running in api
- AGILE-like PROCESS
    - (http://jack-vanlightly.com/blog/2017/9/17/you-dont-need-scrum-you-need-agility)
    - At every decision favor whatever will in future involve the least amount of supervision by me, manual steps, or work
    - Roughly "Agile":
        - Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
        - Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage.
        - Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
        - Business people and developers must work together daily throughout the project.
        - Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
        - The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
        - Working software is the primary measure of progress.
        - Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
        - Continuous attention to technical excellence and good design enhances agility.
        - Simplicity--the art of maximizing the amount of work not done--is essential.
        - The best architectures, requirements, and designs emerge from self-organizing teams.
        - At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.


- BEST PRACTICES
    - SECURITY (https://docs.microsoft.com/en-us/aspnet/core/security/)
        - Enforce SSL: https://docs.microsoft.com/en-us/aspnet/core/security/enforcing-ssl
    - PERFORMANCE
        - Don't use session state at all if possible.  Better to reload a session based on JWT userID from persistent storage.
    - LEAST PRIVILEGE: DO NOT REQUIRE ROOT ACCESS. Code everything for least privilege possible. The principle of “least privilege” mandates that a process should have the lowest level of privilege needed to accomplish its task.
    - CONFIGURATION: have as few sources of configuration as possible, ideally in one place or one copy replicated.
        - PASSWORDS: passwords to production databases should be kept separate from any other configuration files. They should especially be kept out of the installation directory for the software
        - SECRETS: should be using Environment variables in dev and production.  MS says in dev do differently but that is not ideal for consistency and testing
        - We do not want to have to set multiple configs in multiple locations and hunt around or worse yet have it in code.
        - Track configuration changes somewhere for analysis when shit goes south.  If a user changed a config need to know when and what was changed.
        - Configuration should be highly protected, it contains database passwords and other important stuff
        - Never keep the configuration file in the application folder, it will be overwritten on install or restore and if hacked it may be available to the hacker compromising other things
        - Name config properties according to their function not their nature.  Don't use Hostname, use AuthenticationServer
        - A UI might be helpful here, one that can also track old versions of config for reference so user can revert or look back.

    - A database failure should not bring down the webapi server, it should be able to handle that gracefully with helpful diagnostic information
    - Need stress and failure testing early and often, i.e. kill database server - what happens? etc
    - Need longevity testing: have the system keep testing in the background continuously for a week or more with high and low volume transaction testing
    - SMALL QUERIES: Favor smaller simpler multiple queries through EF rather than trying to get a whole large graph with many joins at once, i.e. build up the object graph from several small queries rather than one huge one.
        - (this is my own idea, based on issues with things in AyaNova leading to enormous SQL)
        - Tailor small objects that satisfy what would take a much heavier query to satisfy UI requirements, i.e. a tailored customer list for a specific need would be better than loading all the customer data for the list
    - ISOLATE reporting and history / audit log functionality from transactional functionality
        - we don't want to use reporting objects in transactions as they are often ad hoc, slower to query, and not as streamlined
        - Transactions are more important than reporting, don't let reporting hurt transactions
        - Consider a separate database for storing anything not required to process transactions like history, audit, INACTIVE objects that are archived, some reporting etc
        - PURGING "A rigorous regimen of data purging is vital to the long-term stability and performance of your system."
            - a second long term storage db can be used for purged data to keep it out of the active transaction data.
        - Transaction code should be vanilla ORM sql issued, avoid hand crafted sql in business transaction code as it makes db tuning much harder
    - RESOURCE POOLING is critical
        - Be very careful with it
        - Do not allow callers to block forever. Make sure that any checkout call has a timeout and that the caller knows what to do when it doesn’t get a connection back
        - Undersized resource pools lead to contention and increased latency. This defeats the purpose of pooling the connections in the first place. Monitor calls to the connection pools to see how long your threads are waiting to check out connections.
    - CACHING
        - Needs a memory limit
        - Monitor hit rates for the cached items to see whether most items are being used from cache
        - avoid caching things that are cheap to generate
        - Seldom-used, tiny, or inexpensive objects aren’t worth caching
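
A rough sketch of a cache with a hard cap and a hit-rate counter follows. It limits by entry count as a stand-in for a real memory limit and evicts the least recently used entry; the class name and both of those choices are assumptions for illustration.

```javascript
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.items = new Map(); // Map keeps insertion order, oldest entry first
    this.hits = 0;
    this.misses = 0;
  }

  get(key) {
    if (!this.items.has(key)) {
      this.misses++;
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    const value = this.items.get(key);
    this.items.delete(key);
    this.items.set(key, value);
    this.hits++;
    return value;
  }

  set(key, value) {
    this.items.delete(key);
    this.items.set(key, value);
    if (this.items.size > this.maxEntries) {
      // Evict the least recently used entry to stay under the cap.
      const oldestKey = this.items.keys().next().value;
      this.items.delete(oldestKey);
    }
  }

  // Fraction of lookups served from cache; a low value suggests the cache is not earning its keep.
  hitRate() {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```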
    - PRECOMPUTE anything large and relatively static that changes infrequently so it isn't dynamically generated every time
    - NETWORK (at data center level, these might not apply but making note here)
        - good network design for the data center partitions the backup traffic onto its own network segment
        - Have a separate network for administration purposes (like we do with softlayer)
            - This is actually very important for security and performance; a hacker of the public interface can't access admin functionality if it's bound to another NIC / network
        - The app needs to make configurable which network interface is used for which part, so that it's not listening on *all* networks and exposing the private network to danger
    - SLA Even if we never have a service level agreement with our users we should implement it internally so we know if we are living up to it
        - Good section in the "Release It" book about this, but for now:
        - Have some metrics to watch
        - Have a service that does synthetic transactions to monitor the live service and log issues.
        - SLA should be concerned with specific features not as a whole because some functionality is more important than others and can have radically different possible SLA because it may or may not rely on 3rd parties.
        - An SLA can only ever be as good as the poorest SLA of anything our service relies on.  If an integral component has no SLA then we can't have one either.
    - LOAD BALANCING: Need it in hosting scenario but it relies on the underlying architecture so this is more in the area of the CONTAINERIZATION Research
    - DESIGN FOR FAILURE MODES UP FRONT: failures will happen, need to control what happens when parts fail properly
        - One way to prepare for every possible failure is to look at every external call, every I/O, every use of resources, and every expected outcome and ask, “What are all the ways this can go wrong?”

        - Mock client: if hosting, need an external (not co-located) automated mock client that can detect if the system is down
        - TIMEOUTS - Always use Timeouts for external resources like database connections, other remote servers network comms etc
            - Instead of handling timeouts all over the place for similar ops, abstract it into an object (i.e. QueryObject) that has the timeout code in it
            - Use a generic Gateway to provide the template for connection handling, error handling, query execution, and result processing.
            - Timeouts have natural synergy with circuit breakers. A circuit breaker can tabulate timeouts, tripping to the “off” state if too many occur.
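
The idea of keeping timeout handling in one place can be sketched as a single wrapper that every outbound call goes through. The names withTimeout and TimeoutError are hypothetical, and the example is in JavaScript only for brevity; the same shape would apply to server-side code.

```javascript
class TimeoutError extends Error {}

// Rejects with TimeoutError if the given promise has not settled within `ms` milliseconds.
// Note: this stops waiting, it does not cancel the underlying operation.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((resolve, reject) => {
    timer = setTimeout(
      () => reject(new TimeoutError(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clear the timer so it cannot fire later.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A caller would wrap each external call, e.g. withTimeout(runQuery(), 2000, "customer list query"), where runQuery is whatever function performs the call. A TimeoutError is also a natural signal to feed into a circuit breaker's failure count.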
        - Fail fast: when a resource times out send an error response immediately and drop that transaction
            - Fail Fast applies to incoming requests, whereas the Timeouts pattern applies primarily to outbound requests. They’re two sides of the same coin.
            - DON'T JUST TRY A TRANSACTION: Check resource availability at the start of a transaction (check with circuit breakers what their state is) and fail fast if not able to process
            - Do basic user input validation even before you reserve resources. Don’t bother checking out a database connection, fetching domain objects, populating them, and calling validate( ) just to find out that a required parameter wasn’t entered.

        - CIRCUIT BREAKERS
            - Coupled with timeouts usually for external resources but could also be used for internal critical code
            - Very useful to be able to trip them on demand from an OPERATIONS point of view or reset them on demand
            - if a call fails increment a count, if it passes threshold immediately fast fail and stop making that call and start a timeout
            - After fail timeout then half open and try the call again if it passes then close the circuit breaker and go back to normal
            - If a half open fails again then start the fail timeout again
            - These are critical incidents to report
            - If the circuit breaker is open it should fail fast with a message back to its caller indicating the fault
            - Popping a Circuit Breaker always indicates there is a serious problem. It should be visible to operations. It should be reported, recorded, trended, and correlated.
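
The closed / open / half-open cycle described above can be sketched as a small state machine. This is an illustrative sketch, not a production implementation: it wraps synchronous calls only, the option names are made up, and the clock is injectable purely so the behaviour can be tested without waiting.

```javascript
class CircuitOpenError extends Error {}

class CircuitBreaker {
  constructor({ failureThreshold, resetAfterMs, now = Date.now, onStateChange = () => {} }) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
    this.onStateChange = onStateChange; // hook for reporting state changes to ops
    this.state = "closed";
    this.failures = 0;
    this.openedAt = 0;
  }

  call(operation) {
    if (this.state === "open") {
      // Fail fast until the fail timeout has elapsed, then allow one trial call.
      if (this.now() - this.openedAt < this.resetAfterMs) {
        throw new CircuitOpenError("Circuit breaker is open");
      }
      this.transition("half-open");
    }
    try {
      const result = operation();
      this.failures = 0;
      this.transition("closed");
      return result;
    } catch (err) {
      this.failures++;
      // A failed trial call, or too many failures, (re)starts the fail timeout.
      if (this.state === "half-open" || this.failures >= this.failureThreshold) this.trip();
      throw err;
    }
  }

  // trip() and reset() can also be called on demand by operations.
  trip() {
    this.openedAt = this.now();
    this.transition("open");
  }

  reset() {
    this.failures = 0;
    this.transition("closed");
  }

  transition(next) {
    if (next === this.state) return;
    this.state = next;
    this.onStateChange(next);
  }
}
```

The onStateChange callback is where a trip would be logged and surfaced on the ops dashboard.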
        - For a website using service-oriented architectures, “fast enough” is probably anything less than 250 milliseconds.
        - Protect against unbounded result sets (i.e. sql query suddenly returning millions of rows when only a few expected [USE LIMIT CLAUSE ALWAYS])
        - DO not allow unbounded result sets to be returned by our api, always enforce a built in limit per transaction with paging if no limit is specified
        - this way a user can't request unlimited data in one call
        - Also if due to something unexpected a ton of records are created in a table this will prevent a crash from sending all that data back
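
A minimal sketch of enforcing a built-in limit on every list request. The parameter names (limit, offset) and the two constants are hypothetical values chosen for illustration.

```javascript
const DEFAULT_LIMIT = 100; // used when the caller specifies no limit (assumed value)
const MAX_LIMIT = 1000;    // hard ceiling per request (assumed value)

// Turns raw query-string values into a safe { limit, offset } pair.
function resolvePaging(query) {
  const requestedLimit = Number.parseInt(query.limit, 10);
  const limit =
    Number.isInteger(requestedLimit) && requestedLimit > 0
      ? Math.min(requestedLimit, MAX_LIMIT)
      : DEFAULT_LIMIT;

  const requestedOffset = Number.parseInt(query.offset, 10);
  const offset =
    Number.isInteger(requestedOffset) && requestedOffset >= 0 ? requestedOffset : 0;

  return { limit, offset };
}
```

The resolved limit would then be applied to the query itself, so a table that unexpectedly holds millions of rows still returns one bounded page.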
    - TESTING
        - REPLICATE PRODUCTION LOADS EARLY IN TESTING: an hour or two of development time spent creating a data generator will pay off many times over
        - MULTIPLE SERVERS: if a configuration requires multiple servers in production, be sure to test it that way.
            - using Virtual Machines if necessary.
            - If testing on one machine what would normally run on multiple it's easy to miss something vital
        - FIREWALLS: enable a full firewall on a testing machine and then carefully document any ports that need to be opened, as this will be needed for production / installation
    - STARTUP AND SHUTDOWN
        - Build a clean startup sequence that verifies everything before flipping a switch to let users in (preflight check)
        - Don't accept connections until startup is complete
        - Don't just start up and then exit if the preflight check (PFC) fails; it should be up and running to be interrogated by an administrator
        - Clean shutdown: don't just hard shutdown, have a mechanism for each module to complete its work but not accept new work until all transactions are completed
        - Timeout the shutdown so if something hangs it can't stop the whole thing from being shut down.
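
The startup half of this can be sketched as a preflight gate: the process stays up either way, but only flips to accepting requests when every check passes. The class name and the check shape ({ name, run }) are assumptions for illustration, and the shutdown side is not covered here.

```javascript
class ServerLifecycle {
  // `checks` is a list of { name, run } where run() returns true on success.
  constructor(checks) {
    this.checks = checks;
    this.acceptingRequests = false; // no connections until preflight has passed
    this.failedChecks = [];
  }

  // Runs every check, records the failures, and only then decides whether to let users in.
  start() {
    this.failedChecks = this.checks
      .filter((check) => !check.run())
      .map((check) => check.name);
    this.acceptingRequests = this.failedChecks.length === 0;
    return this.acceptingRequests;
  }
}
```

On a failed preflight the failedChecks list is what an administrator would interrogate.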
    - ADMINISTRATION
        - Ability to set entire API to read only mode both on demand (control panel) and in code (for backup process)
        - Simple html based admin is ok but command line is better because it can be automated / accessed over a remote shell easily.
        - Don't have a fancy native app gui admin because it will piss off administrators and be hard to use over remote access
        - Ideally a simple html for regular users and a command line one for power users.
        - Try to make every admin function scriptable from the command line
        - "Jumphost": a single machine, very tightly secured, that is allowed to connect via SSH to the production servers
        - The ability to restart components, instead of entire servers, is a key concept of recovery-oriented computing
    - OPS TRANSPARENCY / DASHBOARD
        - This is important and needs to be in there just as much as the rest
        - Think of a dashboard that can be seen at a glance or left up all day on a screen in a "command center"
        - Should show real time snapshot but also scheduled daily events, whether they succeeded or not, i.e. notifications being sent out etc
        - transparency: historical trending, predictive forecasting, present status, and instantaneous behavior
        - Log to an "ops" database ("OpsDB"). See page 300 of Release It for more guidance on this.
            - Client side api to feed data to ops db
        - This is important, see the Release It book page 271 for some guidance on what to track
        - For the most utility, the dashboard should be able to present different facets of the overall system to different users. An engineer in operations probably cares first about the component-level view. A developer is more likely to want an application-centric view, whereas a business sponsor probably wants a view rolled up to the feature or business process level.
        - COLOR CODING:
            - *Green* All of the following must be true:
                - All expected events have occurred.
                - No abnormal events have occurred.
                - All metrics are nominal.
                - All states are fully operational.
            - *Yellow* At least one of the following is true:
                - An expected event has not occurred.
                - At least one abnormal event, with a medium severity, has occurred.
                - One or more parameters is above or below nominal.
                - A noncritical state is not fully operational. (For example, a circuit breaker has cut off a noncritical feature.)
            - *Red* At least one of the following is true:
                - A required event has not occurred.
                - At least one abnormal event, with high severity, has occurred.
                - One or more parameters is far above or below nominal.
                - A critical state is not at its expected value. (For example, “accepting requests” is false when it should be true.)
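
The color rules above can be rolled up in one function, checking red first, then yellow, with green as what remains. The field names are hypothetical. One assumption to note: the rules do not say what a low-severity abnormal event means, so this sketch treats any abnormal event below high severity as yellow, since green requires that none have occurred.

```javascript
function dashboardColor({
  requiredEventsMissed = 0,
  expectedEventsMissed = 0,
  abnormalEventSeverities = [], // e.g. ["medium", "high"]
  parametersFarOffNominal = 0,
  parametersOffNominal = 0,
  criticalStatesDown = 0,
  noncriticalStatesDown = 0,
} = {}) {
  const isRed =
    requiredEventsMissed > 0 ||
    abnormalEventSeverities.includes("high") ||
    parametersFarOffNominal > 0 ||
    criticalStatesDown > 0;
  if (isRed) return "red";

  const isYellow =
    expectedEventsMissed > 0 ||
    abnormalEventSeverities.length > 0 || // assumption: any abnormal event rules out green
    parametersOffNominal > 0 ||
    noncriticalStatesDown > 0;
  return isYellow ? "yellow" : "green";
}
```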
        - LOGGING
            - Always allow OPS to set the location of the log file
            - Use a logging framework, don't roll one. (LOG4NET?)
            - Log files are human readable so they constitute a human computer interface and should be designed accordingly
                - Clear, accurate and actionable information
                - columnar space padded, can be read and scanned quickly by humans and also read by software:
                    - [datetime] errornumber location/source severity message
            - Messages should include some kind of transaction id to trace the steps of a transaction if appropriate (user id, session id, arbitrary ID generated on first step of transaction etc)
            - Design with purging / pruning log files in mind up front
            - Don't log to a resource used by the production system (i.e. don't log in the same database as the app is using, don't log to the same disk or volume)
            - Always use a rolling log format, don't just keep appending.
            - Do NOT deploy with full debug logs enabled, it's too much noise to spot problems (see AyaNova current log for that)
            - Ensure an ERROR message is relevant to OPS, not just a business logic issue.  To be error level it should be something that requires action.
            - ** Use short message codes / code numbers so users can convey them easily instead of the long text message!!!
            - CATALOG OF MESSAGES: building a catalog of all the messages that could appear in the log file is helpful to end users
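
The columnar layout mentioned above ([datetime] errornumber location/source severity message) might be produced by a formatter along these lines. The column widths and the function name are arbitrary choices for illustration; in practice a logging framework's layout setting would do this job.

```javascript
// Formats one log entry as space-padded columns so lines can be scanned by eye
// and split by software. The timestamp is written in UTC.
function formatLogLine({ time, errorNumber, source, severity, message }) {
  return [
    `[${time.toISOString()}]`,
    errorNumber.padEnd(6),
    source.padEnd(24),
    severity.padEnd(5),
    message,
  ].join(" ");
}
```

Because every column before the message has a fixed width, the message text starts at the same position on every line as long as the values fit their columns.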
        - MONITORING SYSTEMS
            - Logging of severe errors to OS application log can be used to integrate to automatic monitoring systems so it should be an option
            - Page 297 of Release It has some idea of what to expose and how to expose it.
    - ADAPTABILITY / CODING DESIGN DECISIONS / FUTUREPROOF
        - VERSIONING
            - Static assets should be in a version folder right off the bat, i.e. wwwRoot/css/v1/app.css, wwwRoot/js/lib/v1/jqueryxx.js or wwwRoot/js/templates/v1/
            - I think naming them similarly to the api endpoint versioning is a good idea, i.e. "v1" or "v2.1" etc.
            - this way they can still be served up to old clients without breaking new ones
            - Need the flexibility of having different version numbers at the backend and frontend.  I.E. can refer to AyaNova backend v8.1 and front end v8.3 but keep it in the family of 8.x?
            - ?? Or maybe the backend is just an incrementing number like a schema update, could be 1000 for all it matters?
            - ?? Not sure how to handle the index page, maybe it needs to be version agnostic and in turn call another page or something,
                - maybe Index.html with menu to select indexV2.html or indexV2.3.html
                - "A new version is available, switch to version 8.5?" user selects and they book mark to that new version indexv8.5.html?
            - Database versioning (this one is trickiest of all, can't remove old objects until the api is unsupported, but they might need to change, will require creative solutions)
                - Select * is bad with reversioning, instead selecting exact columns is safer and MORE FUTUREPROOF
                - Can't drop old columns or set IS NOT NULL on some if they changed that way until after the new release is fully adopted and the old can be removed.
        - Refactoring
            - Constantly improving the design of existing code
            - Only possible with unit testing
            - Test driven development: write the test first then write the code to pass the test
            - Write just enough code to make the test pass and not one line more (YAGNI), once the test passes you can refactor all you want as long as the test passes
            - Mocks are good because they immediately cause the object under test to be "re-used", once in production and once in testing with a mock object, so reuse is tested as well.
        - Dependency injection
            - components should interact through interfaces and shouldn’t directly instantiate each other.
            - Instead, some other agency should “wire up” the application out of loosely coupled components
            - The container wires components together at runtime based on a configuration file or application definition
            - Encourages loose coupling
            - Helps with testing
            - Defining and using interfaces is the main key to successfully achieving flexibility with dependency injection
            - Objects collaborating through interfaces can have either endpoint swapped out without noticing.
            - That swap can replace the existing endpoint with new functionality, or the substitute can be a mock object used for unit testing.
            - Dependency injection using interfaces preserves your ability to make localized changes
            - https://joonasw.net/view/aspnet-core-di-deep-dive
            - https://docs.microsoft.com/en-us/aspnet/core/fundamentals/dependency-injection
            - http://deviq.com/strategy-design-pattern/
            - http://deviq.com/separation-of-concerns/
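
As a small illustration of components collaborating through an interface, the sketch below injects a store into a service through its constructor, so the same service runs against a real store in production and an in-memory stand-in under test. All class and method names are hypothetical, and JavaScript has no declared interfaces, so the "interface" here is simply the load/save pair the service relies on.

```javascript
// Business logic: knows nothing about how customers are persisted.
class CustomerService {
  constructor(customerStore) {
    this.customerStore = customerStore; // anything with load(id) and save(customer)
  }

  rename(id, newName) {
    const name = newName.trim();
    // Validate before touching the store at all.
    if (name === "") throw new Error("Customer name is required");
    const customer = this.customerStore.load(id);
    const updated = { ...customer, name };
    this.customerStore.save(updated);
    return updated;
  }
}

// Test double standing in for a database-backed store.
class InMemoryCustomerStore {
  constructor(customers) {
    this.customers = new Map(customers.map((c) => [c.id, c]));
    this.saveCount = 0;
  }

  load(id) {
    if (!this.customers.has(id)) throw new Error(`No customer with id ${id}`);
    return this.customers.get(id);
  }

  save(customer) {
    this.customers.set(customer.id, customer);
    this.saveCount++;
  }
}
```

Swapping InMemoryCustomerStore for a database-backed class requires no change to CustomerService, which is the localized-change property described above.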

    - SEPARATION OF CONCERNS
            - Presentation layer
                - The Presentation Layer should include all components and processes exclusively related to the visual display needs of an application, and should exclude all other components and processes
            - Service interface layer
                -
            - Business layer
                - The primary goal of the Business Layer is to encapsulate the core business concerns of an application exclusive of how data and behavior is exposed, or how data is specifically obtained. The Business Layer should include all components and processes exclusively related to the business domain of the application, and should exclude all other components and processes.
                - Object model
                - Business logic
                - Workflow
            - Resource access layer
                - The goal of the Resource Access Layer is to provide a layer of abstraction around the details specific to data access.
                - The Resource Access Layer should include all components and processes exclusively related to accessing data external to the system, and should exclude all other components and processes
    - FIPS
        - Don't use managed encryption if we want to support FIPS


TOOLING
=-=-=-=
NO PROPRIETARY OR COMMERCIAL COMPONENTS OR TOOLS WHEREVER POSSIBLE
Need to automate the fuck out of anything that can be automated.
    Do this early on so time is saved right from the start.


NAMING
=-=-=-

.net Namespace:
COMPANY.PRODUCT.AREA (server)
GZTW.AyaNova.whatever-whatever


Files, routes, urls etc:
Use lowercase entirely everywhere, do not use uppercase, this avoids future confusion all around.
No spaces in names, this avoids having to use quotes in paths etc
Use spinal (kebab) delimiter, i.e.: coding-standards.txt
Here are some REST API naming guidelines:
https://github.com/Microsoft/api-guidelines/blob/vNext/Guidelines.md#16-naming-guidelines
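
The file and route naming rule (all lowercase, no spaces, kebab delimiter) could be checked mechanically with something like the following. The function name is hypothetical, and the pattern is one reading of the rule for a single path segment, allowing digits and dotted extensions.

```javascript
// Lowercase letters and digits, words joined by single hyphens, optional dotted extensions.
const KEBAB_NAME_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*(?:\.[a-z0-9]+)*$/;

function isValidSegmentName(name) {
  return KEBAB_NAME_PATTERN.test(name);
}
```

For example, isValidSegmentName("coding-standards.txt") passes, while "Coding Standards.txt" and "coding_standards.txt" do not.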

CSS:
BEM naming - http://getbem.com/


#TESTING
- THINGS TO TEST
    - Concurrency exceptions with each db type as it could be an issue
- Coding should go hand in hand with testing, don't write anything that can't be tested immediately
- Write a data generator that goes hand in hand with testing, need large, realistic dataset generatable on demand to support testing
- Unit tests where useful but a main focus on integration tests, need to be able to hit one button and be certain a build is passing
- Going to need to test all architecture levels early and continuously. I.e. in a docker container, stand-alone, different DB types etc
- Test should include exported data from v7 regularly.


```A second, more subtle effect is produced through consistent unit testing.
You should never call an object “reusable” until it has been reused.
When an object is subjected to unit testing, it is immediately used in
two contexts: the production code and the unit test itself. This forces
the object under test to be more reusable. Testing the object means you
will need to supply stubs or mocks in place of real objects. That means
the object must expose its dependencies as properties, thereby making
them available for dependency injection in the production code. When
an object requires extensive configuration in its external context (like
the previously mentioned Customer object), it becomes difficult to unit
test. One common—and unfortunate—response is to stop unit testing
such objects. A better response is to reduce the amount of external context required.
In the example of the Customer domain object, extracting
its persistence responsibilities reduces the amount of external context
you have to supply. This makes it easier to unit test and also reduces
the size of Customer’s crystal—thereby making Customer itself more malleable.
The cumulative effect of many such small changes is profound.```


DOCUMENTATION
=-=-=-=-=-=-=
All documentation will be primarily in Markdown format following the Commonmark spec http://commonmark.org/help/.
See tooling doc for how to use commonmark markdown
If other formats are required they will be generated *from* the markdown.
The api should be self documenting so docs can be generated and api routes can provide information and examples
    i.e. while coding write the docs for each route / method etc.
IF we want to do a web sequence diagram there is a handy tool:
    - https://www.websequencediagrams.com/


ERROR MESSAGES
The 4 H’s of Error Messages

Human
Helpful
Humorous
Humble