“Translytics” is a portmanteau that derives from two different types of database workloads: transaction processing (trans) and analytics (lytics). Translytics is analytic processing that is performed on transactional data as soon as possible after it is created or ingested. This is called “real-time” analytic processing. “Real-time” is also used to describe the timeliness of the data processed in this way.
Translytics (or “translytical”) can also refer to a data-processing platform that consolidates both types of workloads in a single context—typically, a database. A translytical database is usually positioned as an alternative to separate online transaction processing (OLTP) and online analytic processing (OLAP) databases. OLTP databases are associated with operational business applications, such as ERP, HR and CRM; common OLAP-like systems include data warehouses and data marts.
Traditionally, organizations used monitoring to automate different kinds of actions in response to discrete events, alerts, messages, and so on. The translytics model of analytical processing is different. Like real-time analytics, it uses pre-built analytic models to process data in real time, ideally coincident with its creation or ingestion. The models identify patterns or signatures that correlate more or less strongly with specific phenomena, such as fraud.
Most real-time analytic architectures consist of a stack of systems—including OLTP databases, ETL/ESB integration software, change-data capture (CDC) software and a stream-processing bus—that together process data in real time. This is usually done by feeding the data into a separate platform, such as a data warehouse, an operational data store, a stream-processing platform such as Apache Kafka, or a compute engine such as Apache Spark.
A translytical database does all of this work in a single system. In a production environment, translytics processing could kick off a workflow that automates a sequence of remediations—for example, in the case of suspected fraud, voiding a debit transaction, disabling a credit card, and/or sending a text message to a customer. Translytics can also be used to accelerate event-driven operations in the context of core business workflows.
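As an illustrative sketch of the remediation sequence described above, the snippet below shows how a model-flagged transaction might trigger a chain of automated actions. All function names, event fields and the scoring rule are hypothetical, not a real translytics API:

```python
# Hypothetical sketch: a translytics-triggered fraud remediation workflow.
# The scoring rule and all field names are illustrative only.

def score_transaction(txn):
    """Stand-in for a pre-built analytic model; returns a fraud probability."""
    # A real model would weigh amount, location, velocity, history, etc.
    foreign = txn["country"] != txn["home_country"]
    return 0.97 if txn["amount"] > 10_000 and foreign else 0.02

def remediate_fraud(txn, actions):
    """Run the remediation sequence, recording each completed step."""
    log = []
    if "void" in actions:
        log.append(f"voided debit transaction {txn['id']}")
    if "disable_card" in actions:
        log.append(f"disabled card {txn['card']}")
    if "notify" in actions:
        log.append(f"sent SMS to customer {txn['customer']}")
    return log

txn = {"id": "T-1001", "card": "C-42", "customer": "U-7",
       "amount": 25_000, "country": "RO", "home_country": "US"}

steps = []
if score_transaction(txn) > 0.9:  # the model flags a likely-fraud signature
    steps = remediate_fraud(txn, ["void", "disable_card", "notify"])
```

In a real deployment the scoring would happen inside the translytical database's ingest path, and the remediation steps would call out to card-management and messaging services rather than append to a log.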
What Are the Requirements of the Translytics Model of Analytic Processing?
What will you need to take advantage of translytics? A translytical database, for starters. And "database" is the operative word: a translytics processing platform consolidates OLTP and analytics workloads, both of which require the strict transactional safeguards that (for example) a relational database enforces.
In production, a translytical database ingests, performs operations on and manages the transactional data generated by the applications, services, systems, and so on that undergird common business workflows. In its analytic database function, it preserves a derived history of all transactional data. In performing translytics processing, it runs current transactional data against analytic models; this may or may not also involve processing historical data.
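The dual role described here, recording a transaction and scoring it against history in one step, can be sketched as follows. The history structure, window size and scoring signal are all invented for illustration:

```python
# Hypothetical sketch: a translytical ingest path that records a transaction
# and immediately scores it against the customer's recent history.
from collections import defaultdict

history = defaultdict(list)  # stands in for the database's derived history

def ingest_and_score(txn, window=5):
    """Record the transaction, then flag it if it deviates from recent history."""
    past = history[txn["customer"]][-window:]
    baseline = (sum(t["amount"] for t in past) / len(past)) if past else txn["amount"]
    history[txn["customer"]].append(txn)  # the transactional write
    # Simple illustrative signal: amount far above the recent baseline.
    return txn["amount"] > 5 * baseline

ingest_and_score({"customer": "U-7", "amount": 100})
ingest_and_score({"customer": "U-7", "amount": 120})
flagged = ingest_and_score({"customer": "U-7", "amount": 9_000})
```

The point of the sketch is the ordering: the analytic check runs in the same code path as the write, so there is no separate extract-and-load hop between the transactional store and the analytic engine.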
A translytical database supports common analytics processing use cases (for example, operational reporting and ad hoc query/analysis), along with advanced practices such as analytic discovery and data science. Its data and analytics processing capabilities are potentially useful to developers, ML and AI engineers, as well as a diversity of other, non-traditional consumers. The upshot is that a translytical database may be required to support a large number of concurrent users.
A translytics database is not self-contained. In production usage, it will most likely also ingest data from external sources. These include NoSQL databases, connected devices, RESTful endpoints, file systems and (not least) other relational databases. For this reason, most organizations will use data and application integration technologies to facilitate access to external sources. Common integration technologies include ETL, ESB and stream-processing, as well as data replication and CDC.
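To make the CDC idea concrete, the sketch below applies a feed of change events from an external OLTP source to a local store. The event envelope (`op`, `key`, `row`) is an invented shape; real CDC tools each define their own:

```python
# Hypothetical sketch: applying a change-data-capture (CDC) feed from an
# external database to a translytical store. The event shape is invented.
store = {}

def apply_change(event):
    """Apply one CDC event (insert/update/delete) to the local store."""
    op, key, row = event["op"], event["key"], event.get("row")
    if op in ("insert", "update"):
        store[key] = row
    elif op == "delete":
        store.pop(key, None)

feed = [
    {"op": "insert", "key": "cust:1", "row": {"name": "Ada", "tier": "gold"}},
    {"op": "update", "key": "cust:1", "row": {"name": "Ada", "tier": "platinum"}},
    {"op": "insert", "key": "cust:2", "row": {"name": "Grace", "tier": "silver"}},
    {"op": "delete", "key": "cust:2"},
]
for event in feed:
    apply_change(event)
```

Replaying the source's change log in order, as here, is what lets the translytical database stay continuously synchronized with external systems instead of relying on periodic batch loads.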
What Is Translytics Useful for?
The category of true real-time and event-driven use cases is relatively small at this time. It includes fraud detection, financial trading, sports betting, healthcare monitoring and quality control in manufacturing. In each of these cases, the time dimension is so critical as to be definitive.
In fraud detection, for example, it is critical to identify fraudulent transactions before they are committed—that is, prior to the transfer of money or goods. A financial trade is made on the basis of the point-in-time valuation of an asset. At the very least, a delay in processing could result in reduced profits, or even significant losses. In the same way, quickly identifying a production anomaly and shutting down the affected manufacturing processes could save money as well as improve yields.
Translytics Is about Fresher Right-time Data, Too
While the use cases for true real-time and event-driven data may be relatively few right now, many common business scenarios stand to benefit from access to fresher “right-time” data.
After all, an ability to ingest data at a more rapid rate notionally translates into an ability to process data at a more rapid rate, too. This could permit an enterprise to design more tightly coupled event-driven apps, services, workflows, etc. Think of this as a "right-time," as distinct from a real-time, dependency.
For example, some of the most common workflows or processes associated with sales and marketing (such as customer creation and validation; name and address validation; or context-dependent upsell and cross-sell) are right-time dependent. Certain workflows and processes in HR (such as employee onboarding and employee termination), information security (intrusion detection and remediation), finance, and procurement, among others, also are nominally right-time dependent.
In particular, workflows or business processes that cut across multiple business function areas are right-time dependent. The sales process, for example, is not confined to sales and marketing: behind the scenes, a sales workflow might query a supply chain system as to the availability of an item (along with that of potential upsell/cross-sell items), or a finance system as to the feasibility of offering a customer credit.
The Translytics Model of Analytic Processing Requires Special Software
Read and write latencies must be very low, and data throughput fast and consistent, for a database to reliably ingest and perform operations on data as soon as possible after it is created. One way to accomplish this is to use in-memory processing—that is, new data get loaded directly into memory, without first landing in a persistence layer. On top of this, the entire contents of the database live in RAM.
A proven way to accelerate analytics processing is to distribute data processing across a cluster of servers. This is the specialty of the massively parallel processing, or MPP, database. However, an MPP database relies on software features (for example, an MPP database kernel and query optimizer) that are highly specialized, and few MPP databases are explicitly positioned as in-memory translytical systems.
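The MPP idea, partition the data, aggregate each partition on its own worker, then merge the partial results, can be sketched in miniature. The "nodes" here are just local threads; a real MPP database distributes partitions across servers and relies on a specialized kernel and optimizer, as noted above:

```python
# Hypothetical sketch of MPP-style aggregation: hash-partition rows across
# "nodes" (threads here), aggregate each partition, then merge the partials.
from concurrent.futures import ThreadPoolExecutor

def partition(rows, n_nodes):
    """Hash-partition (key, value) rows across n_nodes workers."""
    parts = [[] for _ in range(n_nodes)]
    for key, value in rows:
        parts[hash(key) % n_nodes].append((key, value))
    return parts

def local_sum(part):
    """Each node aggregates only its own partition."""
    totals = {}
    for key, value in part:
        totals[key] = totals.get(key, 0) + value
    return totals

rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(local_sum, partition(rows, 4)))

merged = {}
for totals in partials:
    for key, value in totals.items():
        merged[key] = merged.get(key, 0) + value
```

Because hash partitioning sends every occurrence of a key to the same node, the per-node totals are disjoint and the final merge is a simple union, which is what makes this pattern scale across a cluster.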
The Translytics Model of Analytic Processing Requires Special Technology
The technology that underpins a translytical database almost always makes use of high-speed/low-latency hardware components. (This is true in the cloud context, too.) However, the in-memory data processing requirement, in particular, presents several challenges, especially for analytics workloads.
At the database level, OLTP data volumes tend to be just a fraction of the size of analytical data volumes. So, for example, an enterprise data warehouse usually contains a derived subset of all of the information ever recorded in the OLTP context, along with data from other contexts. The total volume of all of this historical data is usually several orders of magnitude larger than that of current OLTP data.
The challenge with scaling an in-memory database is that physical memory is limited and volatile in a way that physical storage is not. For example, the contents of RAM vanish as soon as a system loses power. Similarly, physical storage can be provisioned at far greater capacities than can physical memory. For this reason, an in-memory translytical database invariably uses a persistent storage tier of some kind. If nothing else, data must be read into memory from storage each time the system restarts.
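The restart behavior described above, an in-memory table backed by a persistent snapshot that is reloaded when the system comes back up, can be sketched as follows. The class, file format and write-behind strategy are all simplifications for illustration:

```python
# Hypothetical sketch: an in-memory table that snapshots to disk so its
# contents survive a restart (RAM is volatile; storage is not).
import json
import os
import tempfile

class InMemoryTable:
    def __init__(self, snapshot_path):
        self.path = snapshot_path
        self.rows = {}
        if os.path.exists(self.path):  # restart: read data back into memory
            with open(self.path) as f:
                self.rows = json.load(f)

    def put(self, key, row):
        self.rows[key] = row           # the in-memory write (the fast path)
        with open(self.path, "w") as f:  # naive write-behind persistence
            json.dump(self.rows, f)

path = os.path.join(tempfile.mkdtemp(), "snapshot.json")
table = InMemoryTable(path)
table.put("k1", {"amount": 42})

restarted = InMemoryTable(path)  # simulated restart: contents are reloaded
```

A production system would use an append-only log or incremental checkpoints rather than rewriting the whole snapshot on every write, but the principle is the same: memory serves reads and writes, while storage exists to rebuild memory.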
The Cloud Is Not Always Hospitable to Translytics
If a translytics workload really does require real-time processing, it will probably perform better in the on-premises context. This is because real-time workloads are especially sensitive to latency. In general, cloud infrastructure does not consistently achieve low enough latency to permit reliable translytics processing in real time. That said, Amazon, Google, Microsoft and other providers now offer low-latency infrastructure services that may be suitable for certain types of translytics workloads.
Because latency is less predictable in the cloud than in the on-premises context, cloud infrastructure services are usually better suited for right-time, as distinct from real-time, translytics workloads.
Translytics is not a new idea; it is a newly feasible idea, thanks to the maturation of enabling software (in-memory databases, MPP databases, analytic modeling tools) and the vastly improved scalability of commodity technologies such as CPUs, memory and flash storage.