Data Citizens: Rogue Users or Team Players?

Posted by Peter Cresse on Jan 27, 2017 2:48:24 PM

The rise of data citizens has sparked lively discussions in the data community. From citizen integrators to citizen data scientists or citizen analysts, the list keeps growing. Generally, data citizens can be seen as data users harnessing analytical tools and certain skills to get insights from data without waiting on an IT team for answers.

While this seems like an efficient route in the highly distributed world of data, does it offer more value or more difficulty for organizations? I approach data citizens with considerable caution rather than immediate praise. As data management moves into its future, can data teams adequately support the fast-paced, self-service timelines of data citizens while also protecting and nurturing valuable data assets? In other words, where’s the modicum of governance in a distributed data world?

Data citizens’ rights—and the cost of being wrong

As expertise and access to data sets grow, we do recognize data citizens as a part of the data community. But they are not the leaders, governors, or owners of the data asset. And while their level of experience is growing, it also varies, which raises the question: are these the right people to be managing such valuable assets? Stan Christiaens’ 2017 predictions come down in favor, offering, “All data citizens have the right to use data to unlock unlimited possibilities for the business. Too often an organization's data is in a state of anarchy or ruled by data dictators.” Another view, from Gartner, sees citizen integrators empowering business, but hedges with a dour assessment: “Through 2016, only 10% of self-service business intelligence initiatives will be sufficiently well-governed to prevent inconsistencies that adversely affect the business.”

This ambivalence calls into question data citizens’ ability to critically analyze not just data reports but the nature of the data sets themselves. At the top level, data citizens can see data trends from raw feeds or, say, an aggregate pulled from a data lake. In some cases that may be enough.

In comparison, data managers have more know-how to look deeper into the nature of the data, beyond what comes out of ingest. Let’s take a seemingly simple example: web traffic analysis. A data citizen might see a report about an increase in web traffic and accept it as a good thing. Because data citizens often lack the experience and knowledge to dig into the data, they may end up building insights based on skewed reports. Data professionals, on the other hand, have been trained to pause at good news, consider the “why” behind the results more deeply, and return to the data set for answers. This more critical reflection might reveal a website under attack, which inflates the traffic numbers without adding real visitors. With this case in mind, it’s important for data organizations to understand the risk of individuals misinterpreting the data, potentially in a much more dangerous scenario, and making decisions unaware of their mistakes.
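To make the web traffic example concrete, here is a minimal sketch of the kind of check a data professional might run before celebrating a spike. It is an illustration only: the function name `traffic_concentration`, the `(ip, path)` record layout, and the sample data are all assumptions, not something taken from a real log format or product.

```python
from collections import Counter


def traffic_concentration(requests):
    """Return (unique_visitors, top_share) for a list of (ip, path) records.

    top_share is the fraction of all requests sent by the single busiest IP.
    A value close to 1.0 suggests the "growth" comes from one source.
    """
    counts = Counter(ip for ip, _path in requests)
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0
    _ip, busiest = counts.most_common(1)[0]
    return len(counts), busiest / total


# Hypothetical sample: three ordinary visitors plus one address
# hammering the login page 97 times.
sample = [("10.0.0.1", "/"), ("10.0.0.2", "/pricing"), ("10.0.0.3", "/blog")]
sample += [("203.0.113.9", "/login")] * 97

visitors, top_share = traffic_concentration(sample)
print(visitors, round(top_share, 2))  # 4 0.97
```

A report built on this sample would show 100 requests, yet only four distinct sources, one of which accounts for 97% of the traffic. A real investigation would look at far more than one ratio, but even this simple cut separates "more visitors" from "more requests".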

Data citizens in governed data organizations

As more data citizens gain unbounded access to data, the question becomes: where’s the management imperative? I’m open to assessing better ways to do business and finding where data citizens can fit in the larger community, but I’m not sure that IT leaders want an individual line of business (LOB) determining its own best data path without a guiding hand. And a CIO or CDO (Chief Data Officer) would surely hesitate to accept data citizens being closely involved in security, workflow management, or decisions about the best uses of data assets. There’s too much at stake. Many Fortune 500 companies see owning and managing their data as a top priority, and they invest in leaders and leading software to do it right. Therefore, I’d disagree with Christiaens’ predictions, and would add some caveats to the rights of data citizens.

Often, there is a delay from IT teams, who might not prioritize data citizens’ needs, which calls for intervention and self-service tools. But without proper governance to enable controlled self-sufficiency, a data citizen might go rogue or make critical errors without measured caution. If an organization is able to develop and apply a clearly defined management structure, there should be no reason why data citizens can’t participate in the data community. So how can the relationship among all data users in an organization be managed? Self-service analytics, for example, is a reasonable approach to integrate the “citizen” into a collaborative relationship with the IT team.

You can see this realization dawning on big-name vendors like Tableau, which is well aware of the situation it unintentionally, and in good faith, helped to worsen with its easily accessible tools. Its remedy started with the Drive Best Practices framework two years ago, and continues to work its way into the core of the product as Certified Data Sources. As one customer has said, “Drive helped us build the framework needed to support a self-service, agile culture supported by a Business/IT Center of Excellence. We now have employees who are identifying problems and fixing them for themselves in hours instead of days and building reports in weeks instead of months.” But who’s minding the data?

Leveraging the data citizen

No one citizen can or should take data matters into their own hands; rather, managing and analyzing data assets should be a thoughtful, governed process. If this is prioritized, it’s possible for data citizens to have a large degree of freedom, while also ensuring the right levels of security, access, and relationship management to protect the integrity of data insights. By the way, software toolsets, through smart transformations and orchestration, can manage this.

Whenever the term “data citizen” is touted as the way of the future, it’s important to discern whether it’s being utilized as a shifty way to muscle past IT consultants and developers, or if it truly enables the next generation of self-service or LOB functionality. If data citizens can be managed well when approaching data assets, they’ll have the potential to uncover insights quickly to drive the organization towards a more competitive edge.
