Sunday, May 27, 2012

Remote BLOB Store [RBS] with SharePoint 2010




With SharePoint 2010, the Remote BLOB Storage (RBS) functionality, which allows documents to be stored in the file system instead of in the database itself, came into focus again. To make that happen, each content database is associated with a specific location in the file system where all of its documents are stored. These documents are nonetheless managed by SharePoint, and a database is still mandatory even when the documents are stored as external BLOBs, because metadata is written exclusively to the database.

1   What is RBS

RBS is a library API set that is incorporated as an add-on feature pack for Microsoft SQL Server.  It can be run on the local server running Microsoft SQL Server 2008 R2, SQL Server 2008 or SQL Server 2008 R2 Express. To run RBS on a remote server, you must be running SQL Server 2008 R2 Enterprise edition. RBS is not supported for Microsoft SQL Server 2005.
Binary large objects (BLOBs) are data elements that have either of the following characteristics:
·        Unstructured data that has no schema (such as a piece of encrypted data).
·        A large amount of binary data (many megabytes or gigabytes) that has a very simple schema, such as image files, streaming video, or sound clips.

RBS uses a provider to connect to any dedicated BLOB store that uses the RBS APIs. Storage solution vendors can implement providers that work with RBS APIs. SharePoint Server 2010 supports a BLOB storage implementation that accesses BLOB data by using the RBS APIs through such a provider. You can implement RBS for Microsoft SharePoint 2010 Products by using a supported provider that you obtain from a third-party vendor. Most third-party providers store BLOBs remotely.



In addition to third-party providers, you can use the RBS FILESTREAM provider that is available through the SQL Server Remote BLOB Store installation package from the Feature Pack for Microsoft SQL Server 2008 R2. The RBS FILESTREAM provider uses the SQL Server FILESTREAM feature to store BLOBs in an additional resource that is attached to the same database and stored locally on the server. The FILESTREAM feature manages BLOBs in a SQL database by using the underlying NTFS file system.
The location in which an RBS provider stores the BLOB data depends on the provider that you use. In the case of the SQL FILESTREAM provider, the data is not stored in the MDF file, but in another file that is associated with the database.
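
The split between metadata in the database and BLOB bytes in the file system can be illustrated with a small toy model. This is only a sketch of the idea in Python, not the RBS API or the FILESTREAM implementation; the class `ToyBlobStore`, its table layout, and the use of SQLite are all assumptions made for illustration.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


class ToyBlobStore:
    """Toy model: metadata rows in a database, BLOB bytes in files on disk."""

    def __init__(self, blob_dir: Path) -> None:
        self.blob_dir = blob_dir
        # Hypothetical metadata table; an in-memory SQLite database stands in
        # for the content database.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE docs (name TEXT PRIMARY KEY, size INTEGER, blob_id TEXT)"
        )

    def save(self, name: str, data: bytes) -> str:
        # The database keeps only a reference (here a content hash) to the BLOB.
        blob_id = hashlib.sha256(data).hexdigest()
        (self.blob_dir / blob_id).write_bytes(data)
        self.db.execute(
            "INSERT OR REPLACE INTO docs VALUES (?, ?, ?)",
            (name, len(data), blob_id),
        )
        return blob_id

    def load(self, name: str) -> bytes:
        # Look up the reference in the database, then read the bytes from disk.
        row = self.db.execute(
            "SELECT blob_id FROM docs WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(name)
        return (self.blob_dir / row[0]).read_bytes()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = ToyBlobStore(Path(tmp))
        store.save("training.wmv", b"\x00" * 1024)
        print(len(store.load("training.wmv")))  # 1024
```

As in the real feature, callers go through the store for every read and write; the files on disk are never addressed directly.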





2   The case for RBS

RBS can provide the following benefits:
  •         BLOB data can be stored on less expensive storage devices that are configured to handle simple storage.
  •         The administration of the BLOB storage is controlled by a system that is designed specifically to work with BLOB data.
  •         Database server resources are freed for database operations.


The following are common scenarios that benefit from RBS:

2.1   Large Database of Mostly Binary Data


If a given SQL Server database would grow to 500GB without RBS enabled, then RBS would be a beneficial option.  A database of that size is very large, and a very large database can have a negative impact on business continuity and maintenance operations, for example:
·        Backup and restore operations take considerably longer.
·        Index and statistics defragmentation takes considerably longer.  This is a particular concern if the database must be taken offline during defragmentation.

For these reasons, enabling RBS on an otherwise very large database can be very beneficial, as each of the concerns described above is alleviated.

2.2   Digital Asset Management Databases


Consider the scenario of a custom application that is designed to provide training content to users.  The application will contain fewer than 1,000 documents, but they are all training videos that average 200MB in size.  The database for this application would be nearly 200GB in size even though there are relatively few documents.
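
The arithmetic behind that estimate is simple enough to check directly. The sketch below assumes decimal units (1 GB = 1000 MB) and ignores metadata and index overhead; the function name is made up for this example.

```python
def estimated_blob_size_gb(document_count: int, average_size_mb: float) -> float:
    """Return the total BLOB volume in GB, using decimal units (1 GB = 1000 MB)."""
    return document_count * average_size_mb / 1000


# Upper bound for the scenario above: 1,000 videos of 200 MB each.
print(estimated_blob_size_gb(1000, 200))  # 200.0
```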

By enabling RBS, the binary data for these videos can be kept out of the database while the structured metadata in the database remains responsive and easy to manage.  This is a much more natural and integrated solution than allowing the video files to inflate the database or forcing the actual source videos to be stored in an unmanaged location outside of the control of the application.

2.3   When Storage Tiers need to be implemented


One of the most significant benefits of RBS lies in extensibility.  RBS doesn’t just have to be for getting BLOB data out of the database.  It can serve other creative purposes such as facilitating highly efficient storage tiers.

Consider the scenario where a document management solution has been or will be deployed for an organization.  A large percentage of the corporate user base will be adding and editing collaborative content on a daily basis.  Over time, a very large number of customer-centric documents are created, possibly declared as records for retention, and then archived.

In this scenario, an RBS provider that enables a tiered storage platform could provide tremendous cost savings by intelligently managing the storage location of BLOB data.  New and frequently accessed content could be stored in high-performance storage.  Older documents that are accessed only occasionally could be automatically moved to lower-cost, lower-performance storage such as SATA arrays.  Then over time, very old documents that are rarely accessed could be automatically moved again to extremely inexpensive cloud-based storage.  In all cases, end users are able to access content in real time, but the responsiveness and cost of the storage are intelligently managed.
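
A tiering policy of this kind boils down to a rule that maps a document's last access time to a storage tier. The following sketch shows one way such a rule might look; the thresholds of 90 days and two years, the tier names, and the function `choose_tier` are assumptions for illustration and do not come from any particular RBS provider.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; a real provider would make these configurable.
WARM_AFTER = timedelta(days=90)
COLD_AFTER = timedelta(days=730)


def choose_tier(last_accessed: datetime, now: datetime) -> str:
    """Pick a storage tier from the time elapsed since the last access."""
    age = now - last_accessed
    if age >= COLD_AFTER:
        return "cloud"             # very old, rarely accessed content
    if age >= WARM_AFTER:
        return "sata"              # older, occasionally accessed content
    return "high-performance"      # new and frequently accessed content


now = datetime(2012, 5, 27)
print(choose_tier(now - timedelta(days=3), now))     # high-performance
print(choose_tier(now - timedelta(days=200), now))   # sata
print(choose_tier(now - timedelta(days=1000), now))  # cloud
```

A provider would typically run such a rule as a background job and move BLOBs between tiers without changing how users address the documents.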

2.4   When Storage needs to be optimized


When BLOB data is allowed to inflate a SQL database, file I/O and processing load on the database server increase.  If the average size of BLOB data is 80KB or higher, then implementing RBS reduces I/O and processing load, which improves the performance of SQL Server.
RBS providers also have the advantage that they can perform additional processing on the BLOB stream as it is being passed to the BLOB store.

3   Using RBS together with SharePoint 2010 Products


SharePoint Server 2010 supports the FILESTREAM provider that is included in the SQL Server Remote BLOB Store installation package from the Feature Pack for SQL Server 2008 R2.
In SharePoint Server 2010, site collection backup and restore and site import or export will download the file contents and upload them back to the server regardless of which RBS provider is being used. However, the FILESTREAM provider is the only provider that is currently supported for SharePoint 2010 Products farm database backup and restore operations.

When RBS is implemented, SQL Server itself is regarded as an RBS provider. You will encounter this factor when you migrate content into and out of RBS.
If you plan to store BLOB data in an RBS store that differs from your SharePoint Server 2010 content databases, you must run SQL Server 2008 with SP1 and Cumulative Update 2. This is true for all RBS providers.

The FILESTREAM provider is recommended when upgrading stand-alone installations of Windows SharePoint Services 3.0 that have content databases larger than 4 gigabytes (GB) to SharePoint Server 2010. It associates data locally with the current content database and does not require SQL Server Enterprise Edition.

RBS does not enable any kind of direct access to any files that are stored in Microsoft SharePoint 2010 Products. All access must occur by using SharePoint 2010 Products only.


4   RBS Considerations

4.1   Review the environment


If the content database sizes meet the criteria for an RBS recommendation, you should then consider what kind of content is being accessed and how it is being used.

Content database sizes

You can expect to benefit from RBS in the following cases:
§  The content databases are larger than 500 gigabytes (GB).
§  The BLOB data files are larger than 256 kilobytes (KB).
§  The BLOB data files are at least 80 KB and the database server is a performance bottleneck. In this case, RBS reduces both the I/O and the processing load on the database server.
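
The three criteria above can be folded into a quick check. This is a minimal sketch that treats the criteria as alternatives (any one is enough) and uses the average BLOB size as a stand-in for "the BLOB data files"; the function name and parameters are assumptions made for illustration.

```python
KB = 1024
GB = 1024 ** 3


def rbs_likely_beneficial(
    db_size_bytes: int,
    avg_blob_bytes: int,
    db_server_is_bottleneck: bool,
) -> bool:
    """Return True if any of the three size criteria listed above is met."""
    if db_size_bytes > 500 * GB:
        return True  # content database larger than 500 GB
    if avg_blob_bytes > 256 * KB:
        return True  # BLOBs larger than 256 KB
    # BLOBs of at least 80 KB only count when the database server is a bottleneck.
    return avg_blob_bytes >= 80 * KB and db_server_is_bottleneck


print(rbs_likely_beneficial(600 * GB, 50 * KB, False))   # True
print(rbs_likely_beneficial(100 * GB, 100 * KB, True))   # True
print(rbs_likely_beneficial(100 * GB, 100 * KB, False))  # False
```

A positive result is only a starting point; content type, usage patterns, and provider capabilities still need to be weighed.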

Although the presence of many small BLOBs can create some decrease in performance, the cost of storage is usually the most important consideration when you evaluate RBS. The predicted decrease in performance is usually an acceptable trade-off for the cost savings in storage hardware.

Content type and usage

RBS is most beneficial in systems that store very large files, such as digital media. RBS is typically implemented in environments in which large stored files are infrequently accessed, such as an archive. If this situation describes your environment, you should consider implementing RBS.

If you are storing many small (less than 256 KB) files that are frequently accessed by many users, you might experience increased latency on the sites that hold those files once they are stored in RBS. Increased latency is one cost factor that you should consider when you evaluate RBS for your storage solution. However, it is unlikely to be the strongest consideration. The amount of increased latency also depends on the RBS provider that you use.

4.2   Evaluate provider options


RBS requires a provider that connects the RBS APIs and SQL Server.

 Important:

RBS can be run on the local server running Microsoft SQL Server 2008 R2, SQL Server 2008 or SQL Server 2008 R2 Express. To run RBS on a remote server, you must be running SQL Server 2008 R2 Enterprise edition. SharePoint Server 2010 requires you to use the version of RBS that is included with the SQL Server Remote BLOB Store installation package from the Feature Pack for Microsoft SQL Server 2008 R2. Earlier versions of RBS will not work with SharePoint Server 2010. In addition, RBS is not supported in SQL Server 2005.
           
BLOBs can be kept on commodity storage such as direct-attached storage (DAS) or network attached storage (NAS), as supported by the provider. The FILESTREAM provider is supported by SharePoint Server 2010 when it is used on local hard disk drives only. You cannot use RBS with FILESTREAM on remote storage devices, such as NAS.


The following table summarizes FILESTREAM benefits and limitations:


| Operational requirement | RBS with FILESTREAM | RBS without FILESTREAM |
|---|---|---|
| SQL Server integrated backup and recovery of the BLOB Store | Yes | Yes |
| Scripted migration to BLOBs | Yes | Yes |
| Supports mirroring | No | No |
| Log shipping | Yes | Yes, with provider implementation |
| Database snapshots | No | No |
| Geo replication | Yes | No |
| Encryption | NTFS only | No |
| Network Attached Storage (NAS) | Not supported by SharePoint 2010 Products | Yes, with provider implementation |


If the RBS provider that you are using does not support snapshots, you cannot use snapshots for content deployment or backup. For example, the SQL FILESTREAM provider does not support snapshots.


5   Install and configure Remote BLOB Storage (RBS) with the FILESTREAM provider (SharePoint Server 2010)


This section describes how to install and configure Remote BLOB Storage (RBS) with the FILESTREAM provider on a Microsoft SQL Server 2008 database server that supports a Microsoft SharePoint Server 2010 system. RBS is typically recommended in the case where the content databases are 4 gigabytes (GB) or larger.

 



5.1   Enable FILESTREAM and provision the RBS data store


FILESTREAM must be enabled and configured on the computer that is running SQL Server 2008 and hosts the SharePoint Server 2010 databases.
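
As a rough illustration of what this step involves on the SQL Server side, the sketch below composes a T-SQL script that sets the instance's FILESTREAM access level and adds a FILESTREAM filegroup and file to one content database. It is an untested outline rather than a verified procedure: the filegroup name `RBSFilestreamProvider`, the logical file name `RBSFilestreamFile`, and the example database name and path are assumptions, FILESTREAM must also be enabled for the Windows service in SQL Server Configuration Manager, and the creation of a database master key (which involves a password) is deliberately left out.

```python
def filestream_provisioning_sql(database: str, blob_path: str) -> str:
    """Compose T-SQL that provisions a FILESTREAM store for one content database."""
    # Reject characters that would break the bracket or string quoting below.
    if "]" in database or "'" in blob_path:
        raise ValueError("unsupported character in database name or path")
    statements = [
        # Allow T-SQL and Win32 streaming access to FILESTREAM data.
        "EXEC sp_configure 'filestream access level', 2;",
        "RECONFIGURE;",
        # Assumed filegroup and file names; adjust to local conventions.
        f"ALTER DATABASE [{database}] "
        "ADD FILEGROUP RBSFilestreamProvider CONTAINS FILESTREAM;",
        f"ALTER DATABASE [{database}] "
        f"ADD FILE (NAME = RBSFilestreamFile, FILENAME = '{blob_path}') "
        "TO FILEGROUP RBSFilestreamProvider;",
    ]
    return "\n".join(statements)


# Hypothetical database name and local path.
print(filestream_provisioning_sql("WSS_Content", r"C:\Blobstore"))
```

The script has to be generated once per content database, since each database gets its own FILESTREAM filegroup.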


5.2   Install RBS


RBS must be installed on the database server and on all Web servers and application servers in the SharePoint farm. RBS must be configured separately for each associated content database.

5.3   Enable and test RBS


RBS is enabled on one of the Web servers, and the RBS data store is then tested.
