Alexandre Verri Posted December 18, 2019 Posted December 18, 2019 Having a list of DataSet in IronPython, what is the most efficient way to create a new DataTable with the the contents of all DataSets
Rayees Wani Posted December 18, 2019 Posted December 18, 2019 Will be great if you can elaborate your problem description. Below article may help. How to Add New Data Table Based on Existing Data Table in Analysis in TIBCO Spotfire Using IronPython Scripting: https://community.spotfire.com/wiki/how-add-new-data-table-based-existing-d...
Alexandre Verri Posted December 18, 2019 Author Posted December 18, 2019 Thanks, but I think the problem is clear enough, and you have given a solution that I haven't asked. I have asked about creating a DataTable from existing list of DataSets, not creating a DataTable from existing DataTable! This is the context using code examples: from System.Data import DataSet from System.Data.Odbc import OdbcConnection, OdbcDataAdapter # Get a dataset for the given SQL def get_dataset(sql, conn_str): connection = OdbcConnection(conn_str) connection.Open() dataSet = DataSet() adaptor = OdbcDataAdapter(sql, connection) adaptor.Fill(dataSet) connection.Close() return dataSet # The connection string to the target database. conn_str = 'DSN=mydatabase;Pooling=true;Max Pool Size=20;Enlist=true' # A list of different queries. sqls = ['sql1', 'sql2', 'sql3'] # This is the list of data sets, one for each query. data_sets = [get_dataset(sql, conn_str) for sql in sqls] # Now, having the data sets, how to create a table with this list in the most efficient way
Rayees Wani Posted December 19, 2019 Posted December 19, 2019 Not sure if you would be able to use the above DataSets as a datasource as they are. Something below may just help: from Spotfire.Dxp.Data.Import import DatabaseDataSource from Spotfire.Dxp.Data.Import import DatabaseDataSourceSettings from System import * sqlCommand = "SELECT * from MYTABLE" dbsettings = DatabaseDataSourceSettings( "System.Data.OracleClient", "Server=localhost;UID=test;PWD=test",sqlCommand) ds = DatabaseDataSource(dbsettings) Document.Data.Tables.Add("Test",ds)Again, I am sorry if this does not really help much.
Alexandre Verri Posted December 19, 2019 Author Posted December 19, 2019 Hi Rayees, I appreciate your time trying to answer the question. I had found the example you provided, but it does not solve our issue. We need to convert from DataSet to DataTable, the main reason is performance. There is a big performance impact in our use case. We have to run multiple SQL queries and build a DataTable with the results. This is taking more than a minute, which is too much for the user to wait. One way we have found to improve the performance is to run the queries in parallel, using multithreading. The IronPython multithreading works fine in Spotfire if we don't use any Spotfire API in a different thread. The code you provided would work if we could run it in parallel, like this: from Spotfire.Dxp.Data.Import import DatabaseDataSource from Spotfire.Dxp.Data.Import import DatabaseDataSourceSettings from System import * from multiprocessing.pool import ThreadPool # Create a thread pool with 3 threads. pool = ThreadPool(3) def build_table(sql): dbsettings = DatabaseDataSourceSettings( "System.Data.OracleClient", "Server=localhost;UID=test;PWD=test", sql) ds = DatabaseDataSource(dbsettings) Document.Data.Tables.Add("Test",ds) def build_tables_in_parallel(sqls): pool.map(build_table, sqls) sqls = ['select * from t1', 'select * from t2', 'select * from t3'] build_tables_in_parallel(sqls) ... but it does not work, Spotfire can't run the function build_table in a different thread. That's why we need to run the SQL and get DataSet instead, because there is no Spotfire code involved in this, so we can run in different threads.Maybe you can give a hint about how to run multiple queries in parallel Using DataFunctions maybeThank you.
Hitesh Ganger 2 Posted January 8, 2020 Posted January 8, 2020 Here is the logic we are currently using to write the datatable but this is extremely slow. This function gets data from generator function that iterates over the dataset. def createDataTable(textData): # Turn it into an in-memory text data source. dataTableName = 'Data Table' columnNames=orderedColumns stream = MemoryStream() writer = StreamWriter(stream) writer.WriteLine(';'.join(columnNames) + 'rn') writer.Flush() for line in textData: writer.WriteLine(line) writer.Flush() settings = TextDataReaderSettings() settings.Separator = ";" settings.AddColumnNameRow(0) settings.ClearDataTypes(False) for i in range(len(columnNames)): if columnNames[i] == 'MEASUREVALUE': settings.SetDataType(i, DataType.Real) elif columnNames[i] == 'DATETIME_ID': settings.SetDataType(i, DataType.DateTime) elif columnNames[i] == 'DATE_ID': settings.SetDataType(i, DataType.Date) stream.Seek(0, SeekOrigin.Begin) fs = TextFileDataSource(stream, settings) if Document.Data.Tables.Contains(dataTableName): dataTable = Document.Data.Tables[dataTableName] settings = AddRowsSettings(dataTable, fs) dataTable.AddRows(fs, settings) else: # Create Table if not already present print "Adding Data Table" dataTable = Document.Data.Tables.Add(dataTableName, fs) def generate_text_data(dataSet,columnNames): currRow = "" global orderedColumns for row in dataSet.Tables[0].Rows: currRow = [] for col in orderedColumns: if col in columnNames: currRow.append(str(row[col])) else: currRow.append('') yield "%srn" % (';'.join(currRow))
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now