Chapter 25. Demystify the Source and Illuminate the Data Pipeline

Meghan Kwartler

You’ve been assigned to a new project, new team, or new company. You want to dig in and add business value. It can be tempting to start writing code immediately, but if you resist the urge to make initial assumptions and instead focus on setting up a solid foundation, it will pay dividends moving forward.

First, discover where and how the data originates. When your data originates with users, it is useful to get their perspective on the data entry experience. Each time I walk the floor of a manufacturing plant or talk to a machine operator about how they use a system, I gain valuable knowledge. Often I discover that users are entering data in ways that are inconsistent with the original system design, or I learn the valid reasons why they are omitting data. If you don’t have access to the people entering data, study their training documentation and talk to the business analysts associated with that function.

Get to know the specifications when your data originates from sensors, equipment, or hardware. Dig into the manuals and documentation to clarify how the data is generated. You will then have a clear understanding when you encounter this data later in your analysis. Knowing the expected values also helps you identify possible malfunctions in the source equipment.
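
As a minimal sketch of how documented specifications can be put to work, the snippet below flags readings that fall outside an expected range. The sensor name, the range limits, and the sample readings are all hypothetical; in practice the limits would come from the equipment manual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorSpec:
    """Expected operating range for one sensor (hypothetical values)."""
    name: str
    min_value: float
    max_value: float


def out_of_spec(readings, spec):
    """Return (index, value) pairs for readings outside the documented range."""
    return [
        (i, value)
        for i, value in enumerate(readings)
        if not (spec.min_value <= value <= spec.max_value)
    ]


# Assumed limits; replace with the range stated in the manual.
spec = SensorSpec(name="oven_temp_c", min_value=20.0, max_value=250.0)
readings = [22.5, 180.0, -999.0, 251.3, 249.9]

suspect = out_of_spec(readings, spec)
print(suspect)  # [(2, -999.0), (3, 251.3)]
```

A value such as -999.0 is often a sentinel for a failed read rather than a real measurement, which is the kind of detail the manual may confirm.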

Now examine the metadata. You have discovered the originating business events for implicit, explicit, manually entered, or automatically generated data. Descriptive metadata accompanies each of these sources. For example, metadata includes equipment event timestamps, device types, or descriptive user data. Determine whether this metadata from numerous sources is consistent and could be unified. For instance, timestamp formats and time zones may differ from one source to the next.
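
One way to unify timestamps is to convert each source to UTC as it enters the pipeline. The sketch below assumes two hypothetical plants that record local time in different formats and at different fixed UTC offsets. Fixed offsets ignore daylight saving time, so a real pipeline would more likely use named zones from the standard `zoneinfo` module.

```python
from datetime import datetime, timedelta, timezone


def to_utc(raw, fmt, utc_offset_hours):
    """Parse a local timestamp string and convert it to UTC.

    utc_offset_hours is the source's fixed offset from UTC (an assumption
    made for this sketch; it does not account for daylight saving time).
    """
    source_tz = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(raw, fmt).replace(tzinfo=source_tz)
    return local.astimezone(timezone.utc)


# Hypothetical sources: the same event recorded by two systems.
plant_a = to_utc("2024-03-05 14:30:00", "%Y-%m-%d %H:%M:%S", -6)
plant_b = to_utc("05/03/2024 21:30", "%d/%m/%Y %H:%M", 1)

print(plant_a.isoformat())  # 2024-03-05T20:30:00+00:00
print(plant_a == plant_b)   # True
```

Once both records are in UTC, they can be compared or joined directly, regardless of how each source originally formatted them.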

Now it’s time to trace. Whether you’ve identified one or one hundred sources of data, how does that data move through the pipeline and arrive in the place where you will access it? Also consider that data types may be converted, and business translation may be required, as data moves through systems.

If you have the good fortune to receive this type of information during your onboarding, be extremely grateful and recognize that this documentation is a gift. It will set you up for success. Then reference it. Sometimes real effort goes into developing valuable documentation, but more often than not it isn’t used to its fullest capacity.

During your investigation, create documentation if it doesn’t exist. It doesn’t need to be overly complicated; simple documentation can still make a strong impact. Pay it forward and help the next person who will be tasked with using data to benefit the business.
