File storage and block storage are the two major types of data storage typically found on networked storage systems. The terms “file” and “block” refer to the way data is stored, managed and accessed on the storage media, such as hard disk drives, solid-state drives or tape. File and block are access protocols that use different methods to save and retrieve data.
File storage is associated with unstructured data. Unstructured data may vary in length, and its files don’t necessarily share common formats or features that would allow the data to be arranged in a meaningful, coherent and consistent fashion. Some examples of file data include word processing documents, presentations, email messages, videos and graphics.
Block data isn’t file oriented, but rather consists of chunks of data that comprise databases and other data forms that have specific structures that define each data segment by its constituents. Block data is often represented in columns and rows. A row represents a single entity, while a column defines the contents of that entity. Rows are generally referred to as “records” and columns as “fields.”
How Does File Storage Work?
Both file storage and block storage are forms of data virtualization because they provide organization and a management layer that operates above the storage media’s native management system.
File storage devices manage the data they hold centrally through a file system, which users typically browse with a file manager such as Windows File Explorer or Apple’s Finder utility on Macs. The data that comprises a single file as recognized by a word processing application, for instance, may in fact consist of many pieces that are spread across a drive or multiple drives. The file system maintains pointers that indicate the locations of those pieces. The system can then assemble those pieces in the proper order to present the full file.
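The pointer mechanism described above can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions: the class name ToyFileSystem, the 8-byte block size and the explicit placement list are all hypothetical, and real file systems use far more elaborate on-disk structures.

```python
BLOCK_SIZE = 8  # bytes per block; deliberately tiny so the scattering is visible


class ToyFileSystem:
    """Hypothetical model: a 'drive' of fixed-size blocks plus a pointer table."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks  # stands in for the storage media
        self.pointers = {}                 # file name -> ordered list of block indexes

    def write(self, name, data, placement):
        """Store data in the block indexes listed in placement (may be non-contiguous)."""
        pieces = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        if len(pieces) != len(placement):
            raise ValueError("placement must list one block index per piece")
        for index, piece in zip(placement, pieces):
            self.blocks[index] = piece
        self.pointers[name] = list(placement)

    def read(self, name):
        """Follow the pointers in order to reassemble the full file."""
        return b"".join(self.blocks[i] for i in self.pointers[name])


fs = ToyFileSystem(num_blocks=16)
# The four pieces of this file land in scattered blocks 9, 2, 14 and 5.
fs.write("report.txt", b"File storage hides the layout.", placement=[9, 2, 14, 5])
print(fs.read("report.txt"))
```

The application asking for report.txt never sees the block numbers; it receives the reassembled bytes, which is the abstraction file storage provides.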
The file system also stores metadata for each file. Metadata is basic information that helps to identify the file and includes file name, the size of the file, the date the file was created and when it was last modified. Files are listed in hierarchical fashion within multiple levels of folders.
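As a small illustration of that metadata, the following Python sketch writes a temporary file and then asks the file system for its name, size and last-modified time using the standard library's os.stat. The file name notes.txt is just a placeholder. Creation time is omitted on purpose, because the way it is exposed differs between operating systems.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")
    with open(path, "w") as handle:
        handle.write("hello")

    info = os.stat(path)  # metadata kept by the file system, not by the application
    metadata = {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": time.ctime(info.st_mtime),
    }
    print(metadata)
```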
In addition to the way data is stored on individual PCs, file storage is the main type of storage used for shared storage called network-attached storage, or NAS. NAS systems allow multiple servers and the users they support to access a defined share of a centralized storage pool.
File storage on NAS systems typically supports file access protocols such as network file system (NFS), which is native to Unix and Linux systems and their applications, and server message block (SMB) -- an early dialect of which was called common internet file system (CIFS) -- for Windows servers and the applications they host.
How Does Block Storage Work?
Block storage doesn’t overlay the inner workings of a drive to the degree that file storage does. It is more closely functionally related to the underlying native drive management system that controls the storage media. As such, block storage isn’t file oriented and maintains much less metadata for the storage it manages.
With block storage, users define volumes made up of blocks of storage capacity. To the server’s operating system and applications, a volume looks like a single drive. The blocks have a fixed size, and a volume can be difficult to resize once it is in use, so when allocating a volume to an application, users should anticipate data growth and make the appropriate allowances.
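To make the idea of a volume built from fixed-size blocks concrete, here is a minimal sketch that uses an ordinary file as a stand-in for a volume. The 4,096-byte block size, the eight-block capacity and the helper names are assumptions chosen for illustration; a real volume is provisioned by the storage array or operating system, not by application code like this.

```python
import os
import tempfile

BLOCK_SIZE = 4096   # assumed block size in bytes
VOLUME_BLOCKS = 8   # assumed capacity, fixed when the volume is created


def create_volume(path):
    """Preallocate a fixed-capacity 'volume' filled with zeros."""
    with open(path, "wb") as vol:
        vol.write(b"\x00" * BLOCK_SIZE * VOLUME_BLOCKS)


def check_block_number(number):
    if not 0 <= number < VOLUME_BLOCKS:
        raise IndexError("block number is outside the volume")


def write_block(path, number, payload):
    """Overwrite one block in place, addressed only by its number."""
    check_block_number(number)
    if len(payload) > BLOCK_SIZE:
        raise ValueError("payload does not fit in one block")
    with open(path, "r+b") as vol:
        vol.seek(number * BLOCK_SIZE)
        vol.write(payload.ljust(BLOCK_SIZE, b"\x00"))  # pad to a full block


def read_block(path, number):
    """Return the full contents of one block."""
    check_block_number(number)
    with open(path, "rb") as vol:
        vol.seek(number * BLOCK_SIZE)
        return vol.read(BLOCK_SIZE)


with tempfile.TemporaryDirectory() as folder:
    volume = os.path.join(folder, "volume.img")
    create_volume(volume)
    write_block(volume, 3, b"row-42")
    block = read_block(volume, 3)
    volume_size = os.path.getsize(volume)

print(block.rstrip(b"\x00"), volume_size)
```

Note that nothing here records a name, owner or timestamp for the data in block 3, and that the capacity is set once at creation, which is why growth has to be planned up front.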
Block is the type of storage used in storage area networks (SANs), which are shared storage resources accessible by many servers and applications. SANs are similar in concept to NAS systems, but typically host only block storage. Originally, SANs required their own specialized networking protocol, Fibre Channel (FC), which was developed specifically for networked storage. FC networks use different network interface cards, switches and other gear than Ethernet networks, so SANs that support block storage were typically built around their own specialized infrastructure.
In 2004, that restriction eased somewhat with the publication of the Internet Small Computer Systems Interface (iSCSI) standard. The iSCSI protocol allowed shared block storage systems to use conventional TCP/IP networks so that the same Ethernet components that connected servers and users could be used to attach those assets to shared block storage. About five years later, another storage networking protocol -- Fibre Channel over Ethernet (FCoE) -- was introduced. FCoE retained some of the performance aspects of FC but could run over a standard Ethernet network.
What Are the Benefits of File Storage?
File storage is familiar to anyone who has used a computer. It’s easy to use and readily accessible by a broad range of applications, including popular productivity apps like Microsoft Office.
As the basis for a shared storage resource -- NAS systems -- file storage offers relatively easy management with a minimum of administrative tasks required to make storage available to users and applications. And because NAS can use existing Ethernet networking facilities to make the storage available, it does not require any special networking knowledge or specialized networking components.
What Are the Benefits of Block Storage?
Because it operates close to native operations of the drives themselves, block storage usually performs at a higher level than file storage. That level of performance, combined with its block-based data access, makes block storage particularly suitable for databases. It is also suitable for other applications that use large amounts of data that is augmented and updated regularly.
In a disk drive block storage environment, even greater performance can be attained by short-stroking the disk drives. This involves using only the outer tracks of the disks, which move at a faster clip than the inner parts, so data can be written and retrieved more quickly. All-flash SAN arrays and flash-aided arrays are likely to produce even greater performance for block storage systems.
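The reasoning behind short-stroking can be checked with simple arithmetic: a platter spins at a constant rate, so a point farther from the center covers more distance per revolution. The sketch below assumes a 7,200 rpm drive with usable tracks between 20 mm and 46 mm from the spindle. Those figures are illustrative assumptions, not specifications of any particular drive.

```python
import math

RPM = 7200               # assumed rotational speed
OUTER_RADIUS_MM = 46.0   # assumed radius of the outermost usable track
INNER_RADIUS_MM = 20.0   # assumed radius of the innermost usable track


def linear_speed_m_per_s(radius_mm, rpm=RPM):
    """Speed of the platter surface passing under the head at a given radius."""
    revolutions_per_second = rpm / 60
    circumference_m = 2 * math.pi * (radius_mm / 1000)
    return circumference_m * revolutions_per_second


outer = linear_speed_m_per_s(OUTER_RADIUS_MM)
inner = linear_speed_m_per_s(INNER_RADIUS_MM)
ratio = outer / inner
print(f"outer {outer:.1f} m/s, inner {inner:.1f} m/s, ratio {ratio:.2f}")
```

Under these assumptions the ratio depends only on the two radii, so the outer track passes under the head a little more than twice as fast as the inner one.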
What Are the Drawbacks of File Storage?
The main drawback to file storage is that file systems are limited in the number of files they can manage. That number will vary considerably from one system to the next, but the common trait is that when the limit is reached, the system will not be able to store any more data.
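On Unix-like systems, one common form of this limit is the number of inodes, the per-file records a file system can hold. The following sketch reads the total and free counts with Python's os.statvfs. The helper name inode_usage is hypothetical, the call is not available on Windows, and some file systems report these counts as zero, so treat the output as indicative only.

```python
import os


def inode_usage(path="."):
    """Return total and free inode counts for the file system holding path.

    Returns None on platforms that do not provide os.statvfs.
    """
    if not hasattr(os, "statvfs"):
        return None
    stats = os.statvfs(path)
    return {"total_inodes": stats.f_files, "free_inodes": stats.f_ffree}


usage = inode_usage()
print(usage)
```

When the free count reaches zero on a file system with a fixed inode table, new files cannot be created even if unused capacity remains.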
File systems are hierarchical, containing levels of folders and sub-folders that hold files that must be tracked, managed and updated with current metadata. By limiting the number of folders and files, the storage system can manage everything in a reasonably timely manner so that users and applications get access to data without major delays.
Its hierarchical nature also limits file storage’s performance. While it may be possible to run modest databases and other non-office productivity applications on file storage, the performance might be insufficient for handling more challenging relational databases and related applications.
What Are the Drawbacks of Block Storage?
Block storage is typically more expensive than file storage. A networked storage resource may require a special network to support Fibre Channel -- unless the block storage is part of an iSCSI SAN using traditional Ethernet networking protocols.
Managing block storage in a SAN can be complicated and require more expertise than is needed for NAS administration. Although most SAN vendors that sell block storage have simplified administrative procedures over the past decade, capacity allocation and overall management often remain complex.
Block storage also doesn’t attach as much metadata to the data it stores compared with file storage, so other tools may be needed to identify and manage stored data.
Unified Storage: File Storage Coexists With Block Storage
Unified storage systems combine file and block storage in a single storage resource. Because iSCSI block storage can use Ethernet protocols just as file storage does, it became possible to provide both types of storage in a single shared resource. Later developments made it possible to include FC and FCoE storage protocols in unified storage systems.
With unified systems, users can decide how to allocate the storage capacity as either file or block. These systems are typically less expensive than block-only arrays, although block performance may not match that of dedicated block storage appliances. But the flexibility to allocate resources as needed may outweigh the trade-off in performance.
Conclusion: Use Cases Determine File or Block Storage
Both file storage and block storage have been around for a while. They are both mature technologies that address specific data center needs. That’s true even as the storage media has evolved from spinning disks to flash-based solid-state drives. The choice of file vs. block depends on the intended use cases and expectations related to performance.