Tag: google

Mariano Gonzalez on Tuesday, March 25, 2014

Batch Module Reloaded


With Mule’s December 2013 release we introduced the new batch module. We received great feedback about it, and we even have some users happily using it in production! However, we know that the journey of batch has just begun, and for the Early Access release of Mule 3.5 we added a bunch of improvements. Let’s have a look!

Support for non-Serializable Objects

A limitation in the first release of batch was that all records needed to have a Serializable payload. This is because batch uses persistent queues to buffer the records, making it possible to process “larger than memory” sets of data. However, we found that non-Serializable payloads were way more common than we initially thought. So, we decided to have batch use the Kryo serializer instead of Java’s standard one. Kryo is a very cool serialization library that allows:

  • Serializing objects that do not implement the Serializable interface
  • Serializing objects that do not have (nor inherit) a default constructor
  • Serializing way faster than the Java serializer while producing smaller outputs

Introducing Kryo into the project not only made batch more versatile by removing limitations, it also had a great impact on performance. During our testing, we saw performance improvements of up to 40% by doing nothing but using Kryo. Of course, the speed boost depends on the job’s characteristics: if you have a batch job that spends 90% of its time doing IO, the impact on performance won’t be as visible as in one that juggles between IO and CPU processing.
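To make the original limitation concrete, here is a minimal sketch that uses only the JDK’s standard serialization. It is not Kryo and it is not the batch module’s persistent queue; the class names (SerializableContact, PlainContact, SerializationLimitDemo) and the canBeQueued helper are hypothetical, made up for this illustration. It simply shows that the standard serializer rejects a payload that does not implement Serializable, which is the kind of object the first batch release could not buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical record payloads, invented for this illustration.
class SerializableContact implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    SerializableContact(String name) { this.name = name; }
}

class PlainContact { // deliberately does NOT implement Serializable
    final String name;
    PlainContact(String name) { this.name = name; }
}

class SerializationLimitDemo {

    // Returns true if standard Java serialization can write the payload,
    // false if it fails because the payload is not Serializable.
    static boolean canBeQueued(Object payload) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(payload);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canBeQueued(new SerializableContact("Ada")));
        System.out.println(canBeQueued(new PlainContact("Bob")));
    }
}
```

A serializer that does not depend on the Serializable marker interface, such as Kryo, avoids this failure mode for payloads like PlainContact.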

When we announced the December 2013 release, an exciting new feature also saw daylight: the Batch Module. If you haven’t read the post describing the feature’s highlights, you should, but today I’d like to focus on how the <batch:commit> block interacts with Anypoint™ Connectors and, more specifically, how you can leverage your own connectors to take advantage of the feature.

<batch:commit> overview

In a nutshell, you can use a Batch Commit block to collect a subset of records for bulk upsert to an external source or service. For example, rather than upserting each individual contact (i.e. record) to Google Contacts, you can configure a Batch Commit to collect, let’s say, 100 records, and then upsert all of them to Google Contacts in one chunk. You can use a Batch Commit to wrap an outbound message processor, and a batch step is the only place where you can apply it. See the example below:
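The grouping idea itself can be sketched in plain Java, outside of Mule. This is not the batch module’s implementation or configuration: CommitSketch and groupForCommit are hypothetical names, and emitting the leftover records as a smaller final group is a choice made by this sketch. It only illustrates how 250 records turn into a few bulk calls instead of 250 individual ones.

```java
import java.util.ArrayList;
import java.util.List;

class CommitSketch {

    // Splits records into consecutive groups of at most commitSize elements.
    static <T> List<List<T>> groupForCommit(List<T> records, int commitSize) {
        if (commitSize <= 0) {
            throw new IllegalArgumentException("commitSize must be positive");
        }
        List<List<T>> groups = new ArrayList<>();
        for (int start = 0; start < records.size(); start += commitSize) {
            int end = Math.min(start + commitSize, records.size());
            // Copy the slice so each group is independent of the source list.
            groups.add(new ArrayList<>(records.subList(start, end)));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> contacts = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            contacts.add("contact-" + i);
        }
        for (List<String> group : groupForCommit(contacts, 100)) {
            // A real flow would invoke the connector's bulk operation here.
            System.out.println("upserting " + group.size() + " records in one call");
        }
    }
}
```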



It sounds like the title for a fantasy movie, but Google, OAuth and the “confused deputy” make for a very common issue. Wikipedia defines a confused deputy as “a computer program that is innocently fooled by some other party into misusing its authority. It is a specific type of privilege escalation”.

The Wikipedia article shares an example of a compiler exposed as a paid service. This compiler receives an input source code file and the path where the compiled binary is to be stored. This compiler also keeps a file called BILLING where billing information is updated each time a compilation is requested. If a user were to request a compilation setting the output path to “BILLING”, then the file would be overwritten and the billing information lost. In this case, the compiler is a “confused deputy” because, although the client doesn’t have access to the file, it has tricked the compiler (which does have access) into altering the file.
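The compiler scenario can be modelled in a few lines of Java. This is a toy sketch under stated assumptions: CompilerService, compile and compileChecked are hypothetical names, an in-memory map stands in for the file system, and the guard in compileChecked is just one naive way to refuse the request, not a general fix for the confused deputy problem.

```java
import java.util.HashMap;
import java.util.Map;

class CompilerService {
    static final String BILLING = "BILLING";

    // In-memory stand-in for the files the service has authority to write.
    final Map<String, String> files = new HashMap<>();

    CompilerService() {
        files.put(BILLING, "");
    }

    // Vulnerable: records the charge, then writes the output wherever the
    // client asks, using the service's own authority.
    void compile(String client, String source, String outputPath) {
        files.put(BILLING, files.get(BILLING) + client + ";");
        files.put(outputPath, "binary(" + source + ")");
    }

    // Naive guard: refuses to write client output over the billing file.
    void compileChecked(String client, String source, String outputPath) {
        if (BILLING.equals(outputPath)) {
            throw new SecurityException("client may not write to " + BILLING);
        }
        compile(client, source, outputPath);
    }

    public static void main(String[] args) {
        CompilerService service = new CompilerService();
        service.compile("alice", "a.c", "a.out");
        System.out.println("billing after alice: " + service.files.get(BILLING));
        service.compile("mallory", "m.c", BILLING);
        System.out.println("billing after mallory: " + service.files.get(BILLING));
    }
}
```

In the vulnerable method the client never touches BILLING directly; the service overwrites it on the client’s behalf, which is exactly what makes it a deputy acting on misplaced authority.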

Hello there! If you remember, a couple of months back we started a series about the Google Cloud Connectors Suite. In the first post we introduced the suite, took a look at how to install the connectors in Studio and built a very simple yet cool iApp that takes contacts from a Google Spreadsheet and turns them into Salesforce contacts, Google Contacts, Google Calendar Events and Tasks. Then, in the second post, we gave some quick code examples of common uses of the connectors.

Google Apps offers a cloud alternative to many of the office products. If you have a Gmail account then you have Google Apps, including Spreadsheets, Docs, Presentations, Contacts, Calendars and Tasks. Of course Google Apps have APIs, and of course we have the connectors to make it easy to connect Google Apps and your applications together. Let’s get the connectors and then take a look at what you can do.

Mark Zuckerberg once said: “How can you connect the world if you leave out China?” Well, I now hereby say: “How can you connect the cloud if you leave out Google?” I know I don’t have his net worth, but I have a point nevertheless. The reality is that Google has done a great job building a gazillion different and very cool APIs, and you’d be right to feel that it’s hard to keep up with their pace. To help you with that, we proudly present to you the first release of the Google Cloud Connectors Suite.

Most people who have ever worked on real-world data integration projects agree that at some point custom code becomes necessary. Pre-fabricated connectors, filters and pipeline logic can only go so far. And to top it off, using those pre-fabricated integration logic components often becomes cumbersome for anything but the most trivial data integration and processing tasks.

With RESTx, a platform for the rapid creation of RESTful web services, we recognize that custom code will always remain part of serious data integration tasks. As developers, we already know a concise, standardized and very well defined way to express what we want: the programming languages we use every day! Why should we have to deal with complex, unfamiliar configuration files or UI tools that still restrict us in what we can do, when it is often so much more concise and simple to just write down in code what you want to have done?

Therefore, RESTx embraces custom code: writing it and expressing your data integration logic with it is made as simple as possible.

Let me illustrate how straightforward it is to integrate data using just a few lines of clear, easy-to-read code.