My company is running several different database systems, including Oracle and Ingres. To produce financial reports, we need to build a central repository for the relevant reporting data, which we'll extract from these source systems. I was thinking of using SQL Server as the repository and Data Transformation Services (DTS) to perform the extraction. Would that approach work? Also, I read that the new SQL Server 2005 Express is limited to storing databases with a maximum size of 4GB each. Does SQL Server 2000 also have this database-size limit?
DTS is an effective extraction, transformation, and loading (ETL) tool, but it relies heavily on the quality of the OLE DB providers you use to read from the heterogeneous systems you mentioned. Many customers have found that it can be faster to unload data to delimited text files on each host, ship the text files to SQL Server by using a file-transfer tool (or make them available to DTS as a network share), and then use DTS to load them into SQL Server. DTS, which uses bulk-load interfaces, is highly optimized for loading data into SQL Server. However, if I were loading data into Oracle, I'd probably use DTS to run a SQL*Loader command task to get maximum throughput (and Ingres offers comparable bulk-load facilities).
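For example, once the delimited extract files are on a share the server can reach, a DTS Execute SQL task (or Bulk Insert task) can drive SQL Server's bulk-load interface directly. A minimal sketch, assuming a hypothetical staging table dbo.SalesStaging and a pipe-delimited extract file; adjust the table, path, and delimiters to match what your Oracle or Ingres unload jobs actually produce:

```sql
-- Hypothetical staging table and UNC path; names and delimiters are
-- assumptions, not taken from the reader's environment.
BULK INSERT dbo.SalesStaging
FROM '\\etlshare\extracts\sales.txt'
WITH (
    FIELDTERMINATOR = '|',   -- column delimiter written by the unload job
    ROWTERMINATOR   = '\n',  -- one row per line
    TABLOCK                  -- table lock enables a minimally logged load
);
```

Loading into an empty staging table with TABLOCK (and, ideally, the database in bulk-logged recovery mode for the duration of the load) keeps transaction-log overhead to a minimum, which is where most of the bulk-load speed advantage comes from.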
As for your question about the new SQL Server 2005 Express, a replacement for Microsoft SQL Server Desktop Engine (MSDE): that edition is designed largely for organizations that need a database engine to package with an application. MSDE's size limit per database is even smaller, at 2GB. But the full versions of SQL Server 2000 and the upcoming SQL Server 2005 don't have the database-size limit you noted. So if you're storing more than 2GB of data, you should consider SQL Server 2000 Standard Edition, which also gives you a set of database management tools that you aren't licensed to use with MSDE. And if you buy a per-processor license, you can connect as many users as your hardware will allow, up to the 32,767-connection limit.