Image Metadata Manipulation Using CloverDX

Posted by Jay Benedetti on Apr 22, 2015 4:04:06 PM

Time and again, CloverDX has proven it can solve some pretty interesting problems. Have you ever thought about image metadata manipulation? It turns out there’s a ton of information in image metadata that people and systems find useful. For example, companies such as Facebook and Twitter strip the copyright information from the metadata of images uploaded to their servers. Imagine you're a photographer selling your pictures for a living. It would probably upset you if someone else took ownership of them, right?

In this blog, we’ll show you how to use cloud data storage for images: read the metadata of all the shared images, modify it, and write the images back to their original location. To accomplish this, you’ll need CloverDX Designer and Server and a shared Dropbox location that CloverDX can read from and write to. You’ll also need to install Exiv2, an open source image metadata library.

Potential Use Cases

1. Image metadata standardization – You might decide that you want all your images to follow a particular standard. CloverDX can accomplish this by editing existing metadata or adding new metadata to the image.

2. Image metadata validation – You may have already adopted a policy for image metadata standardization, but because you receive images from many different sources, you have no way to enforce it. With CloverDX, you can validate images by automatically accepting or rejecting them based on their metadata.

3. Managing a photobank – You can manage and manipulate the metadata of all photos contained within a photobank.

Practical Implementation

To change the metadata with CloverDX, you'll need one jobflow and one graph. In this example, we'll update image metadata, specifically the comment metadata tag.
In the jobflow, CloverDX lists the files in your image location and uses a filter so that only the correct file types (.jpg, .gif, etc.) are processed.
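
Outside CloverDX, the filtering step can be pictured with a few lines of Python. This is only a sketch of the idea, not the jobflow itself: the extension set is an assumption (the post names only .jpg and .gif), and the file listing is made up for illustration.

```python
from pathlib import PurePosixPath

# Assumed set of accepted extensions; the post only names ".jpg, .gif, etc."
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}


def filter_image_files(paths):
    """Keep only the paths whose extension is an accepted image type (case-insensitive)."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in IMAGE_EXTENSIONS]


# Hypothetical listing of a shared folder.
listing = ["Red_Apple.jpg", "orange.JPG", "notes.txt", "cherry.jpg", "changes.xls"]
print(filter_image_files(listing))
```

Comparing the lower-cased suffix means files such as orange.JPG are not dropped just because of how the camera or operating system named them.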

Preparing data for image metadata manipulation

Jobflow feeding the data into graph

After that, it executes the following graph, which takes care of the metadata modification.

Graph used for Image metadata manipulation

Graph for processing, reading, writing metadata to image files

The SystemExecute component at the top of the graph uses the Exiv2 tool to read the metadata of a single image (see our blog on jobflows to understand how this process works) and extracts it into a JSON-formatted file that CloverDX can process.
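
To give a feel for what this extraction step does, here is a rough Python sketch. It assumes Exiv2 is invoked as `exiv2 -pa <image>`, which prints one tag per line as key, type, count, and value, and it converts that text to JSON by hand. The exact command and conversion used in the example graph are not shown in this post, so treat the function names and the sample text below as illustrative only.

```python
import json
import subprocess


def read_metadata_text(image_path):
    """Run `exiv2 -pa` on one image and return its text output (needs exiv2 on PATH)."""
    result = subprocess.run(
        ["exiv2", "-pa", image_path], capture_output=True, text=True, check=False
    )
    return result.stdout


def metadata_text_to_json(text):
    """Turn 'key type count value' lines into a JSON object mapping key to value."""
    tags = {}
    for line in text.splitlines():
        parts = line.split(None, 3)  # key, type, count, then the rest is the value
        if len(parts) < 3:
            continue  # skip blank or unexpected lines
        tags[parts[0]] = parts[3].strip() if len(parts) == 4 else ""
    return json.dumps(tags, indent=2)


# Hand-written sample in the assumed `exiv2 -pa` layout, so the sketch runs without exiv2.
sample = (
    "Exif.Image.Make                              Ascii       6  Canon\n"
    "Exif.Image.Model                             Ascii      14  Canon EOS 40D\n"
)
print(metadata_text_to_json(sample))
```

Keeping the parsing separate from the call to Exiv2 makes the conversion easy to check without having the tool installed.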

This is where the standard data processing capabilities of CloverDX come into play. We've set up two different readers. One is a SpreadsheetDataReader that reads an XLS file containing the data you want to inject into the metadata of the image. In this example we use an XLS file to drive the metadata changes, but you can use any data source CloverDX can process, such as a database, a web interface, or CSV files.

Here is an example of the XLS file we are using.

Filename      | Transaction type | Metadata comment tag change
Red_Apple.jpg | Add              | Processed by: CloverDX
orange.jpg    | Modify           | john.doe@email-examples.com
cherry.jpg    | Add              | Filmed by – Jon Smith

The other reader is a JSONReader named "Read Image Metadata" that reads the image metadata extracted by Exiv2 in phase 0. A DataIntersection component then merges the two data flows.
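
Conceptually, an intersection step like this matches records from both flows on a shared key and separates them into three groups: records found only in the first flow, records found in both, and records found only in the second. The Python sketch below mimics that behavior on plain dictionaries. The field names (`filename`, `transaction`, `new_comment`, `comment`) are hypothetical, and the actual join key and transformation used in the example graph may differ.

```python
def data_intersection(changes, images, key="filename"):
    """Split two record lists into (only_changes, both, only_images) by a shared key."""
    images_by_key = {record[key]: record for record in images}
    change_keys = {record[key] for record in changes}

    only_changes, both = [], []
    for change in changes:
        image = images_by_key.get(change[key])
        if image is None:
            only_changes.append(change)
        else:
            # Matching records are combined; change-request values win on name clashes.
            both.append({**image, **change})
    only_images = [record for record in images if record[key] not in change_keys]
    return only_changes, both, only_images


# Hypothetical records: two rows from the spreadsheet, one image read from storage.
changes = [
    {"filename": "Red_Apple.jpg", "transaction": "Add", "new_comment": "Processed by: CloverDX"},
    {"filename": "orange.jpg", "transaction": "Modify", "new_comment": "john.doe@email-examples.com"},
]
images = [{"filename": "Red_Apple.jpg", "comment": ""}]

only_changes, both, only_images = data_intersection(changes, images)
print(both)
```

Only the records in the middle group carry both the current metadata and the requested change, so they are the ones worth passing on to the editing step.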

Once the new metadata is prepared in the Reformat component named "Edit Image Metadata", we use the JSONWriter component to save it. Finally, another SystemExecute component writes the new metadata into the original image and uploads the image to Dropbox.
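
As a rough illustration of the final write step, the sketch below builds an Exiv2 command line that sets a comment on one image. It assumes the comment lives in the `Exif.Photo.UserComment` tag and uses Exiv2's `-M` option with a `set` command. The post does not say which tag or command the example graph actually uses, so both are assumptions, as are the function names.

```python
import subprocess


def build_write_command(filename, comment):
    """Build an argument list asking exiv2 to set the (assumed) UserComment tag."""
    return ["exiv2", "-M", f"set Exif.Photo.UserComment {comment}", filename]


def write_comment(filename, comment):
    """Run the command against a real file (needs exiv2 on PATH); returns the exit code."""
    return subprocess.run(build_write_command(filename, comment), check=False).returncode


# Only build and show the command here; nothing is executed.
command = build_write_command("Red_Apple.jpg", "Processed by: CloverDX")
print(command)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a comment contains spaces or colons.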

Check out the end result below. As you can see, the comment metadata tag has been changed to "Processed by: CloverDX", as we specified in the XLS file.

Image metadata manipulation - before and after

Conclusion

In this example, we’ve covered how to use CloverDX to modify image metadata. Remember, this is only a basic example that demonstrates some interesting capabilities of CloverDX. Using a similar framework, you can modify the metadata of other media files (images, music, videos, and so on) stored on your local drives or in a cloud service.

If you want to try this example yourself, download the graph exifmetadata.zip and CloverDX for free, and you’ll be working with media metadata in no time.
