We have an application that requires us to log all user activity against a SQL Server 2000 database, then make the logs viewable online. We need to keep the activity logs for 7 years. We're expecting to capture about 50GB of data in just 1 year. To minimize backup storage requirements, we were planning to break the data into two databases: an active-archive database and an inactive-archive database. The active archive would contain activity collected over the past month, and the inactive archive would contain everything else. Is this a valid approach? If so, what's the best way for the application programmer to extract data from the two databases—by using distributed partitioned views?
Using distributed partitioned views is an option for retrieving data from the two databases, but this approach could also result in end users inadvertently querying the larger, inactive database if the application doesn't use the appropriate partitioning key to constrain the query. Using a Live System query page and an Archive System query page might be a simpler solution, as long as you set users' expectations for potentially slower response for queries against the Archive System. Rolling data into an archive system is a good idea. As you note, this architecture lets you update and back up the archive system less often, saving tape and disk space—although 50GB seems smaller and smaller every year!