Data Profiling with CloverDX Profiler beta

Posted by Jakub Lehotsky on October 31, 2011

Data integration, data migration, consolidation and other data manipulation projects consist of a variety of steps and tasks. Javlin supports many critical tasks within these projects with a versatile ETL tool that provides technical solutions to transform data and connect different systems and data sources with various data formats. However, there are additional aspects to achieving success with data processing projects. One project phase that is important for future success is the analysis phase. At Javlin, we are now on the verge of introducing a new tool that will boost productivity in the early stages of a project. This product is our new data profiling tool - CloverDX Profiler.

CloverDX Profiler helps give you insight into your data. By creating a profile of your data using various metrics, you can get the big picture of the current condition of your data. This information has a large impact on the next stages of your project.

Real World Use - Data Warehousing Project

Imagine you are a project manager with a complex task at hand - building a data warehouse. You have a couple of data sources: database tables, CSV files, log files you want to analyze, and probably some others. After you define the basic requirements for your data warehouse and its structure, you need to check the condition of your source data.

As you use the CloverDX Profiler, you will gain:

  • insight into the quality of your data, indicating whether you need to perform a data cleansing phase in your project:
    • detect which data are missing (a high percentage of missing or invalid values) and decide whether to look for a different source
    • include data cleansing tools in the project budget and reserve tasks in your project plan
  • information about the structure of data in all your sources:
    • detect the need for ETL tools on specific data sources to transform your input data into a common format
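
To make a metric such as "percentage of missing values" concrete, here is a minimal hand-rolled sketch in Python. It is not how CloverDX Profiler works internally; the function name `profile_csv`, the set of tokens treated as missing, and the sample data are all assumptions made for illustration.

```python
import csv
import io


def profile_csv(text, missing_tokens=("", "NULL", "N/A")):
    """Return simple per-column metrics for CSV text that has a header row.

    Hypothetical helper for illustration only; the tokens counted as
    "missing" are an assumption and would differ per data source.
    """
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    stats = {name: {"rows": 0, "missing": 0, "distinct": set()} for name in columns}

    for row in reader:
        for name in columns:
            # Short rows yield None for absent fields; treat those as empty.
            value = (row.get(name) or "").strip()
            col = stats[name]
            col["rows"] += 1
            if value in missing_tokens:
                col["missing"] += 1
            else:
                col["distinct"].add(value)

    profile = {}
    for name, col in stats.items():
        rows = col["rows"]
        profile[name] = {
            "rows": rows,
            "missing_pct": 100.0 * col["missing"] / rows if rows else 0.0,
            "distinct": len(col["distinct"]),
        }
    return profile


# Made-up sample data: two of the four e-mail values are missing.
sample = "id,email,country\n1,a@example.com,CZ\n2,,CZ\n3,NULL,US\n4,d@example.com,\n"
for column, metrics in profile_csv(sample).items():
    print(column, metrics)
```

Even this small sketch has to decide what counts as "missing" and has to be repeated for every source format, which is the kind of effort a dedicated profiling tool is meant to save.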

Instead of writing your own tools and testing the data in a complex way, you are just a few clicks away from creating a detailed profile of your data.

Product Vision

Using the CloverDX Profiler tool in the project analysis phase is just the beginning. As part of our vision for the future, we are on the path to creating a product that is fully integrated into the CloverDX Designer and CloverDX Server family.

Its main purposes will be to aid:

  • CloverDX Designer: to assist the development of ETL transformations by allowing the creation of profiles of data at any stage of the transformation and checking the results during the debugging process
  • CloverDX Server: to constantly monitor the data condition on the production server and invoke actions based on specific conditions detected in the profile of your data

Beta Program

Right now, we are excited to announce the beginning of our beta program for CloverDX Profiler. We invite you, people from the data integration community, to join the testing phase; it’s a great opportunity to not only have a look behind the scenes, but to also be a part of the development process by contributing your insights.

Data integration software and ETL tools provided by the CloverDX platform (formerly known as CloverETL) offer solutions for data management tasks such as data integration, data migration, or data quality. CloverDX is a vital part of enterprise solutions such as data warehousing, business intelligence (BI) or master data management (MDM). CloverDX Designer (formerly known as CloverETL Designer) is a visual data transformation designer that helps define data flows and transformations in a quick, visual, and intuitive way. CloverDX Server (formerly known as CloverETL Server) is an enterprise ETL and data integration runtime environment. It offers a set of enterprise features such as automation, monitoring, user management, real-time ETL, data API services, clustering, or cloud data integration.