Does SDS support operation execution across hybrid cloud data sources?

Andy Pang

Hi, Suppose we have some data in an on-premise Hadoop cluster, while other data is stored in other data sources in the public cloud. Does SDS support executing an operation (e.g. k-means) using both of these data sources and coming up with a combined result? What would be the best way to achieve this? Cheers, Andy

Yes, SDS can handle processing from multiple sources. To be precise, it supports hybrid data sources within one workflow. So you can take a dataset in Oracle/Teradata/whatever, run whatever transformations you like on it (filtering, aggregation, windowing, etc.), then move it into Hadoop, combine it with a Hadoop dataset, and then build an ML model. The data sources can be anywhere (on-prem, cloud) and anything (Hive, Hadoop, RDBMS).

And of course, every single operation is pushed down into the underlying database (Oracle, Hadoop, etc).

To minimise data movement, you can perform all the pre-aggregation and transformations directly in the source database.
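To make the pattern concrete, here is a rough Python sketch of "aggregate in the source, move only the small result, combine, then cluster". It is not SDS's API (this thread doesn't show any); everything in it is an assumption for illustration: an in-memory SQLite database stands in for the on-prem RDBMS, a plain list stands in for the cloud dataset, and the table, column, and function names (`transactions`, `pre_aggregate_in_source`, `combine`, `kmeans`) are made up. The k-means is a bare-bones Lloyd's loop seeded with the first k points, just enough to show a combined result.

```python
import sqlite3


def pre_aggregate_in_source(conn):
    """Run the GROUP BY inside the source database so only one row per customer leaves it."""
    rows = conn.execute(
        "SELECT customer_id, SUM(amount) AS total_spend "
        "FROM transactions GROUP BY customer_id ORDER BY customer_id"
    ).fetchall()
    return {customer_id: total for customer_id, total in rows}


def combine(spend_by_customer, remote_rows):
    """Join the small aggregated result with the second dataset on customer_id."""
    return [
        (spend_by_customer[customer_id], visits)
        for customer_id, visits in remote_rows
        if customer_id in spend_by_customer
    ]


def kmeans(points, k, iterations=10):
    """Plain Lloyd's algorithm; the first k points seed the centroids, so runs are deterministic."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical stand-in for the on-premise database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 10.0), (1, 15.0), (2, 12.0), (2, 8.0),
     (3, 200.0), (3, 300.0), (4, 250.0), (4, 260.0)],
)

# Hypothetical stand-in for the cloud dataset: (customer_id, site_visits).
cloud_rows = [(1, 2), (2, 3), (3, 40), (4, 45)]

spend = pre_aggregate_in_source(conn)      # eight raw rows shrink to four before moving
points = combine(spend, cloud_rows)        # one feature from each source per customer
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

The point of the sketch is the ordering: the heavy aggregation happens where the data lives, and only the reduced result crosses the network to be joined and modelled.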
