As humans (and especially as IT professionals), we're wired towards optimization and the need to refine. Or, as Antoine de Saint Exupéry said in The Little Prince, "It seems that perfection is attained not when there is nothing more to add, but when there is nothing more to remove."
Related: Replication Basics
Consequently, when setting up replication, it's pretty common to want to struggle and fit everything into a single publication. This can be a huge pain when replicating code (views, sprocs, and udfs). Especially when there are lots of overlapping dependencies (sprocs written against views that, in turn, rely upon UDFs—and so on). In a perfect world, some of the complexities that plague replication in terms of dependencies, could simply be refactored or removed from the code-base. But in far too many environments, attempting such a chore would be a cross between a Herculean and a Sisyphean undertaking.
Think in Terms of Long-term (Ownership) Costs
For many publications, putting everything (schema, data, code) into a single publication is obviously the best way to approach things. For more complex publications, I've found that it's actually MUCH easier to break problematic publications into two, distinct, publications. That may seem to fly in the face of all that is holy and ordered in the universe, but the long-term management overhead ends up being significantly less hassle than trying to get everything working correctly in the same publication.
Consider some benefits:
- Snapshots/Re-initialization. If you're replicating roughly, say, 5GB of data, it can take a while for that data to be snapshotted and delivered. If all 5GBs of data ends up being snapshotted and then delivered, but you run into a problem with dependencies in your code as part of the same snapshot/publication, then snapshot delivery fails, and you’re back to re-pushing that entire 5GB chunk of data again once you think you’ve addressed the root problem and are ready to try again. Or, in essence, when you bundle data and code in the same publication (with a publication that's highly volatile and frequently runs into issues), then you’re commonly playing a game of hurry-up and wait.
- Locks. Along the same lines as the benefit listed above, if you’re replicating 5GBs of data and need to snapshot or re-initialize that entire publication just to address problems with dependencies in your code, just remember that snapshot acquire and hold VERY aggressive locks while executing (meaning that you're likely going to run into problems with very ugly contention if you try to snapshot during production hours, or you're going to have to wait until off-peak hours to re-initialize).
- Data is Slow, Code is Quick. The primary benefit of the approach I'm proposing (for scenarios where a single publication ends up being unruly), is that data is slow and time-consuming to replicate while code can typically be replicated in a matter of literally seconds. As such, if you’re routinely (i.e., every month or three—or even more frequently) running into situations where regular code changes and churn either cause full-blown problems with regular replication schedules or (more commonly) where you just need to re-initialize because we all know that replication periodically just 'loses its damned mind'—then having your code in a nice, tidy package that can be replicated in just a minute or so—is a huge benefit when it comes to troubleshooting.
In short, while this technique is CLEARLY not something I’d recommend as a blanket approach to replication, I find—over and over again—the unconventionality of having two publications for unruly databases typically ends up saving HUGE amounts of time over the long haul and makes replication just that much easier to manage.