© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
L. Harding, L. Bayliss, Salesforce Platform Governance Method, https://doi.org/10.1007/978-1-4842-7404-0_13

13. Data Architecture & Management (Phase B) Resource Base

Lee Harding1   and Lee Bayliss2
(1)
Clayton-le-Woods, UK
(2)
Clifton, UK
 

This chapter contains the resources required to govern Phase B of the Salesforce Platform Governance Method. These resources should be considered a starting point and can therefore be tailored depending on your project and governance requirements. However, we have made every effort to capture the main topics that have appeared throughout our Salesforce engagements and experience, so these should serve as a good reference in relation to the method.

Tip

You do not need to use this resource base as a strict method or process. The idea is that this will give you a good indication of what you should be taking into consideration. The expectation is that you will use this as a guide and then build upon it as you navigate the Salesforce ecosystem.

Guidelines and Best Practices

This section contains the guidance and best practices that are available from Salesforce, as well as other resources that we have determined will be valuable. This section should serve to provide a good set of guidelines that can be reviewed by anyone delivering the governance function within your organization.

We know there is an infinite number of resources available on the Web, and although some are better than others, this resource base will provide you with a good selection that we recommend you review. As per all the resource base documentation in this book, we do not intend to go into low-level detail for each item, as we should not consume any more paper than absolutely necessary. Instead, we focus on pointing you to the details or explaining the high-level concepts for you to then follow on with additional reading where necessary.

As is the case with all the resource base chapters in this book, the links to the resources will be managed in the Salesforce Platform Governance Method GitHub account. The URL for this account is as follows: github.com/salesforceplatformgovernancemethod.

Design & Optimization

This resource base complements the governance method described in Chapter 3, “Data Architecture & Management.” To begin, Table 13-1 supplies resource links for the content discussed in this section.

Data architecture design and optimization are critical parts of your Salesforce org, and how you optimize your data model will have a direct impact on the success of your application. In this resource base you will find references to the concepts and technical considerations that will help you ensure that your Salesforce data architecture design is fit for its purpose and has considered the controls and limitations that should govern the overall design. We have tried to cover the scenarios for which you will almost certainly find yourself seeking guidance or a deeper level of understanding. As with all Salesforce concepts, there is a mountain of information, discussion topics, views, and opinions from other professionals in the industry available for review. However, this resource base provides you with a comprehensive list of concepts that we recommend you review in the context of data architecture and management design.
Table 13-1

Data Optimization Resources

Artifact

GitHub Ref

Description

SFDC Data Modeling Introduction

Data Modeling

This Trailhead module is a great introduction to Salesforce data modeling concepts, including simple object relationships, standard and custom objects, and the schema builder. If you are new to the Salesforce ecosystem, this would be a great place to start your data modeling journey.

Data Model – Object Relationship Overview

Data Model Object Relationships

The next resource covers object relationships: which relationship types are available and how they work. Remember that object relationships will be a key part of your data model design. You need to decide if a lookup relationship will serve your needs, or whether a master–detail relationship would be more appropriate. Finally, it’s good practice to document your data model clearly and create a data model dictionary. This document will prove vital for good data model governance. Best practice is to add clear descriptions for all objects and fields that you add. When other team members come to view the fields available in the model, they will understand the intention behind creating each element and can decide if it suits their requirement or business purpose.

Considerations for Relationships

Data Model Relationship Considerations

Review this resource for data model relationship considerations before creating relationships between your objects. If you have reviewed the previous resource base, this will serve as a good reminder for the options Salesforce has to offer.

External Data Access

Define External Objects with Salesforce Connect

With Salesforce, you can access external data sources and view data as if it resides on the Salesforce platform. There are many ways in which this can be achieved, and this resource covers this concept using Salesforce Connect and the associated adaptors; for example, cross-org, OData (2.0, 4.0), or custom Apex adaptors.
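To make this concrete, external objects carry the __x suffix and can be queried much like regular sObjects, with rows fetched from the external system at query time. The following is a minimal sketch, assuming a hypothetical external object Order__x with a custom field Order_Total__c:

```apex
// A minimal sketch, assuming a hypothetical external object Order__x
// (external objects use the __x suffix) surfaced through Salesforce
// Connect with a custom field Order_Total__c. Rows are retrieved from
// the external system at query time, not from Salesforce storage.
List<Order__x> recentOrders = [
    SELECT ExternalId, Order_Total__c
    FROM Order__x
    LIMIT 10
];
for (Order__x o : recentOrders) {
    System.debug('External order total: ' + o.Order_Total__c);
}
```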

Things to Consider using External Objects or Data Sources

Salesforce Connect Considerations for all Adaptors

Salesforce Connect, although useful, does have some special behaviors and limitations that should be considered in your solution. For example, not all object relationship types are available, and if the external data source is unavailable or suffering degraded performance, it could directly impact your application’s performance when loading external data views.

What Are Custom Metadata Types

Custom Metadata Types

We know Salesforce and we know what metadata is (data about data), but with custom metadata types we can define data in our apps as metadata. This gives us the control we need to manage application data (metadata) via the Metadata API, package this data up as managed or unmanaged packages, and even build logic using validation rules to validate data entered by users against standards that you define. The main benefit of using custom metadata types is the ability to deploy app data from one environment to another. Consider a custom setting, for example: you can only move its metadata from environment to environment, so any values you specify in the custom setting cannot be managed in the same way as data defined in a custom metadata type. The reason is that the data defined in your custom metadata type IS metadata, so when you migrate custom metadata types from one environment to another, the values are treated not as data, but as metadata. You can even define relationships between custom metadata types; we use this feature to create relationships with other custom metadata types, custom objects and fields, and static resources.
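As a brief sketch of how this looks in Apex, assume a hypothetical custom metadata type Integration_Endpoint__mdt with a custom field Base_URL__c; the getAll and getInstance methods return the records without consuming SOQL queries against governor limits:

```apex
// A minimal sketch, assuming a hypothetical custom metadata type
// Integration_Endpoint__mdt with a custom field Base_URL__c.
// getAll() returns every record, keyed by DeveloperName, and these
// reads do not count as SOQL queries against governor limits.
Map<String, Integration_Endpoint__mdt> endpoints =
    Integration_Endpoint__mdt.getAll();

// Fetch a single record by its DeveloperName.
Integration_Endpoint__mdt billing =
    Integration_Endpoint__mdt.getInstance('Billing_Service');
if (billing != null) {
    System.debug('Billing endpoint: ' + billing.Base_URL__c);
}
```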

Custom Settings Description and Examples

Custom Settings

Custom settings are similar to custom objects in that they allow you to create custom data sets across your org. The main benefit of custom settings is that the data is available in the application cache; therefore, when you need to access this data, you can do so efficiently without multiple queries to the database. However, there are limitations that you should note, the main one being that you cannot easily move custom settings data from one environment to another. This is because, unlike custom metadata, custom settings values are classed as data. A good use case for a custom setting would be an application that needs to list country codes for all the countries it uses; you can then access this data without expensive database queries.
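Using the country-code use case just described, a minimal sketch might look like the following, assuming a hypothetical list custom setting Country_Code__c with a custom field Dial_Code__c; these reads come from the application cache rather than issuing SOQL:

```apex
// A minimal sketch, assuming a hypothetical list custom setting
// Country_Code__c with a custom field Dial_Code__c. These reads come
// from the application cache, so no SOQL query hits the database.
Map<String, Country_Code__c> allCodes = Country_Code__c.getAll();

// Fetch a single data set record by its Name.
Country_Code__c uk = Country_Code__c.getValues('GB');
if (uk != null) {
    System.debug('UK dial code: ' + uk.Dial_Code__c);
}
```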

Picklist Limitations

Picklists vs. Custom Objects and associated Limitations

As discussed in the method, part of your governance process should look at the data granularity and how normalized your data model design is. All of these things contribute to your app performance, so you need to be aware of how certain decisions you make impact your application overall. For example, it is ideal to have fewer than 1,000 values in your picklist; any more than this and a custom object would be more appropriate. There are other limits that should also be top of mind when considering options. This resource should provide some guidance on limits and their performance implications.

Data Management Strategy

Data Management Best Practice Guides

Use this resource to really challenge your thinking and approach to data management. The integrity of your data must always be the single most important thing when you consider your data integrations and how and where data is mastered. These resources will help you to think about best practices and to consider how your application integration methods could affect data integrity, as well as to consider the overall strategy you adopt to control how data flows in and out of your Salesforce implementation.

Large Data Volumes

Introduction to Large Data Volume Concepts

Salesforce enables customers to easily scale their applications up from small to large volumes of data. This resource provides you with an introduction to large data volume (LDV) concepts, such as search options and the use of skinny tables. LDV is a term used to describe an org where typically there are tens of millions of records, hundreds of gigabytes of data, or where an object has in excess of 5 million records. This resource provides a good starting point for grasping the fundamentals before exploring deeper technical reference material. For example, consider the impact on performance as your data scales: we must assume that as data volumes increase, the time required for certain operations, such as search, may also increase. So, the ways in which architects design and configure data structures and operations can have a significant impact on the expected performance of an application.

LDV Best Practices

Best Practices for Deployments with Large Data Volumes

This definitive Salesforce resource provides you with a very comprehensive guide to large data volumes, search architecture, optimization techniques, skinny tables, and the associated best practice recommendations. Things like making queries selective or using techniques to reduce the data stored in Salesforce will all be key to understanding and controlling how the volume of data impacts your application characteristics, performance, and the design decisions you make.

Query Plan Tool

Query Plan Tool

It is a good idea to use the Query Plan tool in the Salesforce Developer Console to check the query plan for any SOQL queries that are slow or inefficient. The tool will help you understand how efficient your query is and how to make it more performant.

Custom Indexes

Custom Indexes

Within this resource, you will learn that the best query is a selective query, but if you have issues and need to specify additional indexes, then a custom index could be the right approach. Review this resource to learn more about how custom indexes can be applied to your specific use case.

Skinny Tables

Skinny Tables

Looking deeper into the topic of LDV, we meet the concept of skinny tables. This is something we advise you to consider when your query performance is causing production issues and you are sure that you’ve performed all due diligence on your query design, explored custom indexes, and reviewed the issues with Salesforce customer support (where applicable). Having done all that, there is still something that can be done to improve query performance, and that’s to implement skinny tables. Salesforce will have to do this for you, as this is not something that you can configure on the platform yourself. A skinny table is a custom table that contains a subset of fields from the source object. The basic idea is that with a skinny table you define the fields that are required by your business case. Then when your query executes, the scan process should be able to return more rows of data (due to the reduced number of fields in your table), as query throughput is significantly increased. So for queries on objects with millions of rows of data, a skinny table could be just the ticket for a performant solution.

Big Objects – Trailhead, Implementation Guide

Salesforce Big Object Reference

For applications that have huge data requirements (for example, where the purpose of the Salesforce application is to retire a legacy system), it may be necessary to use a big object on the Salesforce platform. If the legacy app has hundreds of millions of records, or perhaps even more (say, more than a billion), then to maintain application performance you’ll need to focus on using a big object. However, there are several considerations that must be reviewed so that it’s clear how big objects work, how they are used, how you interact with them, and what other limitations could influence your decision to use them for your use case. Bear in mind that the main use cases for big objects are archiving huge amounts of data or bringing in large data sets from an external data source or legacy application. This resource covers all the basic concepts of big objects and also provides guidance for increasing the limits should this be a consideration for your use case.

Query Performance

Maximizing the Performance of Force.com SOQL, Reports, and List Views

Query performance is key when designing Salesforce applications. Think about it: your app could have a great UI that’s far superior to your competitors’, but this will pale into insignificance if the performance of your application is hindered by bad search and query design. The time it takes to retrieve data from the database for presentation will ultimately determine whether your application merely works or plays a pivotal part in its success and in delighting your customers. Having responsive and selective queries is therefore a vital part of the overall architecture. Review this set of resources to ensure that your queries are optimal and follow the guidelines for selectivity.

Selective SOQL Queries

Make SOQL Query Selective

Now we know that selectivity is the key to a fast, responsive query set. But what do we really mean by selectivity? It all comes down to writing a query that makes appropriate use of the WHERE clause (among others) to narrow down the returned data set using indexed fields. And this is the key to selectivity: using indexed fields in the query and keeping the number of returned rows within a system-defined threshold.
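As a hedged illustration, the sketch below contrasts the two cases; Status__c and External_Key__c are hypothetical fields (the latter assumed to be an External ID field, which is indexed by default):

```apex
// A minimal sketch contrasting query selectivity. Status__c is a
// hypothetical non-indexed custom field; External_Key__c is assumed
// to be an External ID field (indexed by default), and CreatedDate
// is a standard indexed field.

// Likely non-selective: negative operators cannot use an index,
// so this tends to force a full scan on large objects.
List<Account> slow = [SELECT Id FROM Account WHERE Status__c != 'Closed'];

// Selective: indexed fields and a narrow date range keep the
// returned row count within the selectivity thresholds.
List<Account> fast = [
    SELECT Id FROM Account
    WHERE External_Key__c = 'ACME-001'
    AND CreatedDate = LAST_N_DAYS:30
];
```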

Large SOQL Queries

Working with Very Large SOQL Queries

This resource provides additional examples and considerations for working with very large SOQL queries, where the result set can exceed the defined heap limit and cause an error.
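A common mitigation, shown in the sketch below, is the SOQL for loop, which retrieves records in chunks through an internal query locator rather than materializing the whole result set in heap:

```apex
// A minimal sketch: a SOQL for loop retrieves records in chunks
// (batches of 200) via an internal query locator, so the full
// result set is never held in heap at once.
for (List<Contact> chunk : [
        SELECT Id, Email
        FROM Contact
        WHERE LastModifiedDate = LAST_N_DAYS:7]) {
    for (Contact c : chunk) {
        // Process each record here; only one chunk is in memory.
        System.debug(c.Id);
    }
}
```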

Search and Query Optimization

Salesforce Query Search Optimization Cheat Sheet

Using this resource, you should be able to quickly determine whether your query is selective in nature. It gives you the information you need to understand the thresholds that govern how many records you can return in your data set while maintaining a level of performance. Given what we have already covered in this resource base, having this knowledge and applying its guidance to your design will be a huge factor in performance.

Data Movement

In this section of the resource base, the focus is on data; more specifically, the movement of data. You will undoubtedly have many requirements throughout your Salesforce career to migrate, archive, back up, load, or remove data, create external data integrations, or perform many other actions that all revolve around the management and security of the data that powers your Salesforce applications.

There are several topics and resources that we should be aware of as a result of the requirement to control the data flow, especially when we need to apply our governance method successfully. For example, loading data into the Salesforce platform should be a relatively simple task, and it is; however, you need to prepare for, and understand, the impact that loading huge data volumes has on your production service. We’ll cover this and more in the resources provided in Table 13-2.
Table 13-2

Data Movement Resources

Artifact

GitHub Ref

Description

Data Load Resources

Data Load – Importing Data into Salesforce; Choosing a Method for Importing Data

Loading data into Salesforce can be a simple task. But that depends on several factors and characteristics of the data you are trying to load; for example, the number of records you wish to upload will play a significant part in the decision as to which tool you use. The first thing to consider is the storage limit of your Salesforce org. Not importing more data than your storage allocation seems like an obvious point, but it’s best to adopt this mindset from the start. You will then need to consider the load order of the records. Which records should you load first? As a general rule of thumb, the load order would be users, then accounts, and then opportunities, for example; you can see the chain of record dependency that dictates the order. Then you need to look at the tool you intend to use, a choice largely driven by the number of records that you wish to load. But first, review the resources in this section, as they will provide you with the guidance and information you need to get informed.

Data Import Tools

Choosing a Method for Importing Data

One choice you will need to make is the data load tool that is most appropriate for your requirement. The main consideration is the number of records. This resource provides a table that will help you to derive the correct tool to use.

Bulk Data Loads

Bulk API 2.0 and Bulk API Developer Guide

With Data Loader you can choose to load your records using the Bulk API. The Bulk API allows you to load your records by creating a job that contains one or more batches. This is a very useful and important tool that you have at your disposal for loading very large sets of records. Using the Bulk API, you could effectively load up to 150,000,000 records in a 24-hour period; that is 15,000 batches, each with up to 10,000 records. However, there are a number of factors to consider in order to use the Bulk API effectively, and this resource will provide you with the information you need and serve as a good reference point in the future. For example, one of the common issues with loading data using the Bulk API is lock contention (also see granular locking). This is where your load fails because multiple batches try to update the same account record at the same time when using the Bulk API in parallel mode. An issue like this might force you to load the records using serial mode; the downside is that it will increase the execution time significantly. You will also need to consider how to bypass any object automations that you have configured (Process Builder processes or triggers, for example). These will need to be bypassed for the duration of the data load; if you do not do this, your data load will almost certainly fail or will take an inordinate amount of time to execute. So, in review: the number of records, automation bypass, database locking, execution time, batch size, operational impact, the data load plan, and the choice of load tool are all examples of what you must consider when loading data into Salesforce. Something else that is often overlooked is the prospect of data skew, which we consider next.
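As a sketch of the automation-bypass point, one common pattern gates the trigger on a hierarchy custom setting so that automation can be switched off for the data-load user only; Load_Settings__c and Bypass_Triggers__c below are hypothetical names:

```apex
// A minimal sketch of a load-time automation bypass, assuming a
// hypothetical hierarchy custom setting Load_Settings__c with a
// checkbox field Bypass_Triggers__c enabled for the data-load user.
trigger AccountTrigger on Account (before insert, before update) {
    // getInstance() resolves the setting for the running user,
    // falling back through profile and org-level defaults.
    Load_Settings__c settings = Load_Settings__c.getInstance();
    if (settings != null && settings.Bypass_Triggers__c == true) {
        return; // Skip automation for the duration of the data load.
    }
    // Normal trigger logic goes here.
}
```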

Data Skew

Designing Record Access for Enterprise Scale; Managing Lookup Skew

There are three types of data skew that we should be concerned about when moving or importing data into Salesforce or managing large data volumes. As previously discussed, these are account, data ownership, and lookup skew. Data skew can essentially be described as the effect of having a very large number of child records (more than 10,000, say) associated with a single parent record. This can cause application performance issues and record locking when performing DML (Data Manipulation Language) actions. Use these resources to learn more about skew, the effect it has on your implementation, and how to avoid falling foul of this silent application performance killer.

Granular Locking — Group Membership Locking

Granular Locking

This has been touched on already (and will be detailed in the Sharing & Visibility chapter), but one thing you need to be aware of is the potential for lock errors. This can occur when two processes are trying to write to the same database table at the same time, causing the write to fail due to the table’s being locked by the first process. In Salesforce, this occurs when you are updating the role hierarchy or when making changes to the group membership. Review this resource in order to note the impact of locking and how granular locking helps you to avoid errors.

Data Backup & Restore

Data Backup

I don’t think you need to read this book to understand why you need to protect the data that you store in your Salesforce implementation. It’s your data, and it is what you rely on to provide a service to your customers or your business function. You must ask yourself: What would be the impact if the data that underpins my business function suddenly became unavailable? How would we get the data back? To what point in time, and how long would it take? These are all common questions that IT organizations have been asking ever since the dawn of the data revolution. But this is the cloud, so why worry about backup? While Salesforce will provide you with a platform to propel your business to stratospheric heights, until recently it did not provide an automatic data backup service out of the box. This has now changed: Salesforce began piloting its Backup and Restore service during Summer 2021, and the product went GA in November 2021. You can still utilize the AppExchange partner solutions for backup and restore, and there is also the option of contacting Salesforce support to perform data restoration.

Phase B Standards

As described in the resource base for application architecture, as much as possible we want to ensure that your Salesforce org is self-documented. There are many resources in the Phase A “Standards” section that also apply here for the control and management of data architecture components. So, we must reiterate the importance of documenting the overall data model and object relationships. We also underline the importance of following naming standards to make it easier to identify objects, either by the type or the projects to which they belong. All description fields are required to be completed in line with the description standards that follow.

Aside from loop iterators such as i, j, and k, variable names, object names, class names, and method names should always be descriptive. Single-letter names are not acceptable. Object names should use CapitalizedCamelCase. Method or function names should use lowerCamelCase. Constants should be CAPITALIZED_WITH_UNDERSCORES.

Note

In fact, this is just an example of what is advised in the “Naming Conventions” link in Table 12-1 from the Phase A resource base.

Underscores should not be used for any variable name, object name, class name, or method name except as an application prefix. Overriding standard tabs, objects, and standard names should not be allowed without first seeking approval from your central governance team.

Names must be meaningful. Abbreviated names must be avoided; for example:

Good                  Bad
computeAverage()      CompAvg()
boolean isError       boolean isNoError
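To pull these conventions together, here is a minimal Apex sketch; all identifiers are hypothetical examples rather than prescribed names:

```apex
// A minimal sketch of the naming conventions above; every name here
// is a hypothetical example, not a prescribed identifier.
public class OrderPriceCalculator {                      // CapitalizedCamelCase class name
    private static final Integer MAX_LINE_ITEMS = 200;   // CAPITALIZED_WITH_UNDERSCORES constant

    public Decimal computeAverage(List<Decimal> lineTotals) {  // lowerCamelCase method name
        if (lineTotals == null || lineTotals.isEmpty()) {
            return 0;
        }
        Decimal sum = 0;
        for (Integer i = 0; i < lineTotals.size(); i++) {  // loop iterator i is acceptable
            sum += lineTotals[i];
        }
        return sum / lineTotals.size();
    }
}
```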

Data Model Definition

When we define our data model, it’s best practice to document the model using some form of tool. Remember, your data model describes the data entities, relationships, and associated attributes, and will be reviewed by the governance team in order to understand how the solution data all hangs together.

The data model should provide all the metadata (data about data) for each object that you define. For example, the model should be structured in such a way that not only is the object naming clear, but all the related fields are defined, including attributes such as type and description. This is what makes a well-defined data model easy to understand; everyone on a project will use it as part of the solution development. There are three models that you will potentially need to create: conceptual, logical, and physical. A conceptual model provides a high-level view of the data; the logical model depicts the model independently of the data’s physical storage; and, finally, the physical model shows the lowest-level, physical representation of how data is stored in the database. There are many ways to provide a physical view of our data model. The most common is a physical diagram that shows the relationships, usually accompanied by a spreadsheet or data dictionary where the low-level descriptive data for each object is defined.

As an example, you could document the Business Contact object for our Bright Sound Guitars business as follows:

Object Label: BSG Business Contact
Object API:   BSG_Business_Contact__c

Label                   API                  Type       Reference   Attributes                 Description
Business                Business__c          Lookup     Account                                Lookup to Account Record
Business Contact Name   Name                 Text(80)                                          Name of Business
Contact                 Contact__c           Lookup     Contact                                Lookup to Contact Record
Key                     Key__c               Text(255)              Unique, Case insensitive   Unique Identifier
First Time User         First_Time_User__c   Checkbox               Boolean                    Denotes New Customer Eligibility Status
View Contact            View_Contact__c      Lookup     Contact                                Dynamic contact ownership field used to drive sharing

This method of documenting your data model is good practice and will be an important artifact in your project’s documentation. Each object in the model should have its own tab, and every field should have as much information as possible defined.

Objects

Remember, objects represent database tables that contain your organization’s information. For example, the central object in the Salesforce data model represents accounts: companies and organizations involved with your business, such as customers, partners, and competitors. The term record describes a particular occurrence of an object (such as a specific account like IBM or United Airlines that is represented by an Account object). A record is analogous to a row in a database table.

Objects already created for you by Salesforce are called standard objects. Objects you create in your organization are called custom objects. Objects you create that map to data stored outside your organization are called external objects.
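In Apex terms, the distinction is simple: the object is the table, and each inserted row is a record. A minimal sketch:

```apex
// A minimal sketch of the object/record distinction: Account is the
// object (the table); the inserted row is one record.
Account acme = new Account(Name = 'IBM');
insert acme; // Creates a single Account record.
System.debug('New record Id: ' + acme.Id);
```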

Object names will start with a capital letter and may include a prefix, as outlined here:

Enterprise Object

Format: [ObjectName]__c
Example: If your company manufactured guitars, you might want a custom object called Guitar__c.

Project Object

Format: [BusinessArea][Project]_[ObjectName]__c
Example: If your company has a business area called Sales and Marketing, and it has a project called Accelerate to develop a custom configuration on the Salesforce platform, your custom object might be called SMACC_Guitar__c, where SM refers to “Sales and Marketing” and ACC refers to the project “Accelerate.”

[Project] is mandatory and must be populated; [BusinessArea] is optional and only needs populating if the object is not limited to a single business area.

Note

The transition of a project object to an enterprise object must go through the central governance team given the potential impact on existing reports, sharing, workflows, and triggers, etc. Promotion of a project object to an enterprise object WILL require effort.

Fields

Capture your unique business data by storing it in custom fields. When you create a custom field, you configure where you want it to appear and optionally control security at the field level.

Custom field names should have words delimited by an underscore. Whole words should be used, and the use of acronyms and abbreviations should be avoided. The API field name must meet the following convention:

Enterprise Fields (Global, Reusable)

Format: [FieldName]__c
Example: If a new field were created to define a customer’s secondary email address, it would look like Secondary_Email_Address__c.

Project-Specific Fields

Format: [BusinessArea][Project]_[FieldName]__c
Example: A project-specific field could look like SMACC_Secondary_Email_Address__c.

Note

[Project] is optional and only needs populating if not generic across all business areas.

Custom Settings

Custom settings are similar to custom objects in that they let you customize org data. Unlike custom objects, which have records based on them, custom settings let you utilize custom data sets across your org. Custom settings also let you distinguish a particular set of users or profiles based on custom criteria.

Format: [BusinessArea][Project]_CustomSetting
Example: SM_ACC_Billing_Data__c

Checklists

The phase checklist simply tracks that each step and sub-step within the phase is governed correctly and completely. Each sub-step may have several subject areas to form complete coverage from a governance perspective.

Governance Step: Govern Design & Optimization (Pass / Fail)

- Govern the solution for optimal performance, scalability, usability, and maintenance
- Govern the appropriate use of declarative and programmatic functionality
- Govern the considerations for a single-org or dedicated-org strategy
- Govern the usage of license types (capabilities and constraints)
- Govern the data modeling concepts and implications of database design
- Govern the usage of reports and analytics (platform considerations and trade-offs)
- Govern the usage of external applications (Salesforce AppExchange and application integration)

Governance Step: Govern Data Movement (Pass / Fail)

- Govern the usage of the platform’s internationalization functionality (multiple currencies, translations, and languages)
- Govern the use of visual workflow, taking into account the limitations and considerations of a visual workflow solution
- Govern the capabilities and limitations of Salesforce actions
- Govern the integration with social capabilities of the platform