Does SDS support operation execution under hybrid cloud data sources?

Andy Pang · February 8, 2019

Hi, suppose we have some data in an on-premise Hadoop cluster, while other data is stored in other data sources in the public cloud. Does SDS support executing an operation (e.g. k-means) across both of these data sources to produce a combined result? What would be the best way to achieve this? Cheers, Andy

Steven Hillion · February 8, 2019

Yes, SDS can handle processing from multiple sources. To be precise, it supports hybrid data sources within one workflow. So you can take a dataset in Oracle/Teradata/whatever, run whatever transformations you like on it (filtering, aggregation, windowing, etc.), then move it into Hadoop, combine it with a Hadoop dataset, and then build an ML model. The data sources can be anywhere (on-prem, cloud) and anything (Hive, Hadoop, RDBMS).

And of course, every single operation is pushed down into the underlying data source (Oracle, Hadoop, etc.).

To minimise data movement, you can perform all the pre-aggregation and transformations directly in the source database.
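To make the pattern concrete, here is a rough sketch in plain Python. It does not use any SDS API. An in-memory SQLite database stands in for the source RDBMS, a dict stands in for the dataset on the cloud side, and the table, column, and function names (`orders`, `customer_id`, `preaggregate_in_source`, and so on) are all made up for illustration. The idea it shows is the one described above: aggregate inside the source, move only the small result, combine, then cluster.

```python
import sqlite3


def preaggregate_in_source(conn):
    """Run the GROUP BY inside the source database so only one row per customer leaves it."""
    rows = conn.execute(
        "SELECT customer_id, SUM(amount) FROM orders "
        "GROUP BY customer_id ORDER BY customer_id"
    ).fetchall()
    return dict(rows)


def combine(spend_by_customer, visits_by_customer):
    """Inner join of the two pre-aggregated datasets on customer_id."""
    return {
        cid: (spend_by_customer[cid], float(visits_by_customer[cid]))
        for cid in sorted(spend_by_customer)
        if cid in visits_by_customer
    }


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, k, iterations=10):
    """Plain Lloyd's k-means, seeded with the first k points so runs are repeatable."""
    centroids = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, [nearest(p, centroids) for p in points]


def run_demo():
    # Stand-in for the on-prem or RDBMS source (hypothetical schema).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10.0), (1, 12.0), (2, 11.0), (2, 9.0), (3, 200.0), (3, 220.0), (4, 400.0)],
    )
    spend = preaggregate_in_source(conn)
    conn.close()

    # Stand-in for the dataset held on the cloud side (hypothetical values).
    visits = {1: 3, 2: 2, 3: 40, 4: 38, 5: 7}

    combined = combine(spend, visits)
    centroids, labels = kmeans(list(combined.values()), k=2)
    return spend, combined, dict(zip(combined, labels))


if __name__ == "__main__":
    spend, combined, cluster_of = run_demo()
    print(combined)
    print(cluster_of)
```

Only four aggregated rows leave the "source" here instead of seven raw order rows, which is the point of doing the pre-aggregation where the data lives. In a real workflow the k-means step would also run on the platform holding the combined data rather than in local Python.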
