Wednesday, January 22, 2014

SQL Server Data Compression Page vs Row

SQL Server Page Data Compression vs Row Data Compression
                                                      
                                                                       Today one of my friend ask me about SQL Server Data Compression, After that I have discuss with him about SQL Server Data Compression such as what are the advantages and disadvantages, how it's working like that... For this article I'm going to discuss about SQL Server Data Compression ,how is working and advantages and disadvantages.

Ok! First we will look, what is SQL Server Data Compression?
                                                                                               Data Compression is available only in the Enterprise Edition of SQL Server(And of course developer edition also) You can use the data compression feature to help compress the data inside a database. And this will support  for tables and indexes only. In addition to saving space, data compression can help to improve performance of I/O because after apply data compression it will be less data pages.
Page compression algorithm is reduce data duplication method which means if some records will duplicate within the page then SQL Server remove such duplicate values within a data page by replacing some kind of special values.
And row compression use remove unused spaced(bytes) in column.At the same time you can not use PAGE and ROW compression both, but if page compression enabled then automatically includes row compression. 

Normal Method,

USE [MSDN]
GO
--Drop Table
IF  EXISTS (SELECT 1 FROM sys.objects WHERE [object_id] = OBJECT_ID(N'[dbo].[tblWithOutDataCompression]') AND [type] in (N'U'))
DROP TABLE [dbo].[tblWithOutDataCompression];
GO
--Create Table
IF NOT EXISTS (SELECT 1 FROM sys.objects WHERE [object_id] = OBJECT_ID(N'[dbo].[tblWithOutDataCompression]') AND [type] in (N'U'))
BEGIN
CREATE TABLE [dbo].[tblWithOutDataCompression](
[rowID] [bigint] NULL,
[colName] [varchar](2000) NULL,
[colcity] [varchar](2000) NULL,
[colAddress] [varchar](4000) NULL
) ON [PRIMARY]
END
GO
--Create Clustered Index
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE [object_id] = OBJECT_ID(N'[dbo].[tblWithOutDataCompression]') AND name = N'PK_ClusteredIndex_tblWithOutDataCompression')
CREATE CLUSTERED INDEX [PK_ClusteredIndex_tblWithOutDataCompression] ON [dbo].[tblWithOutDataCompression]
(
[rowID] ASC
) ON [PRIMARY]
GO
INSERT INTO [dbo].[tblWithOutDataCompression]([rowID],[colName],[colcity],[colAddress])
SELECT bucketid,cacheobjtype,objtype,'SQL SERVER DATA COMPRESSION' FROM sys.syscacheobjects
WHERE
(LEN([cacheobjtype])<2000)  
AND (LEN([objtype])<2000) 
ORDER BY bucketid;
GO
DBCC IND('MSDN',tblWithOutDataCompression,-1);
GO
EXEC sp_spaceused 'tblWithOutDataCompression';

GO

After execute above SQL Statement you will notice it will returns no of data pages created and table space allocation.

Row compression,

USE [MSDN]
GO
--Drop Table
IF  EXISTS (SELECT 1 FROM sys.objects WHERE [object_id] = OBJECT_ID(N'[dbo].[tblDataCompressionROW]') AND type in (N'U'))
DROP TABLE [dbo].[tblDataCompressionROW];
GO
--Create Table
IF NOT EXISTS (SELECT 1 FROM sys.objects WHERE [object_id] = OBJECT_ID(N'[dbo].[tblDataCompressionROW]') AND [type] in (N'U'))
BEGIN
CREATE TABLE [dbo].[tblDataCompressionROW](
[rowID] [bigint] NULL,
[colName] [varchar](2000) NULL,
[colcity] [varchar](2000) NULL,
[colAddress] [varchar](4000) NULL
) ON [PRIMARY]
END
GO
--Create Clustered Index
IF NOT EXISTS (SELECT 1 FROM sys.indexes WHERE [object_id] = OBJECT_ID(N'[dbo].[tblDataCompressionROW]') AND name = N'PK_ClusteredIndex_tblDataCompressionROW')
CREATE CLUSTERED INDEX [PK_ClusteredIndex_tblDataCompressionROW] ON [dbo].[tblDataCompressionROW]
(
[rowID] ASC
)WITH (DATA_COMPRESSION = ROW) ON [PRIMARY]
GO
INSERT INTO [dbo].[tblDataCompressionROW]([rowID],[colName],[colcity],[colAddress])
SELECT bucketid,cacheobjtype,objtype,'SQL SERVER DATA COMPRESSION' FROM sys.syscacheobjects
WHERE
(LEN([cacheobjtype])<2000)  
AND (LEN([objtype])<2000) 
ORDER BY bucketid;
GO  
DBCC IND('MSDN',tblDataCompressionROW,-1);
GO
EXEC sp_spaceused 'tblDataCompressionROW';

GO

Well! After execute above row compression SQL Statement then you will noticed some small amount of number of data(8K) pages reduce. And allocation size also.(Note : if you have very less amount of data in above tables then you will not noticed any space gain.)

Page compression,

USE [MSDN]
GO
--Drop Table
IF  EXISTS (SELECT 1 FROM sys.objects WHERE [object_id] = OBJECT_ID(N'[dbo].[tblDataCompressionPAGE]') AND [type] in (N'U'))
DROP TABLE [dbo].[tblDataCompressionPAGE];
GO
--Create Table
IF NOT EXISTS (SELECT 1 FROM sys.objects WHERE [object_id] = OBJECT_ID(N'[dbo].[tblDataCompressionPAGE]') AND [type] in (N'U'))
BEGIN
CREATE TABLE [dbo].[tblDataCompressionPAGE](
[rowID] [bigint] NULL,
[colName] [varchar](2000) NULL,
[colcity] [varchar](2000) NULL,
[colAddress] [varchar](4000) NULL
) ON [PRIMARY]
END
GO
--Create Clustered Index
IF NOT EXISTS (SELECT 1 FROM sys.indexes WHERE [object_id] = OBJECT_ID(N'[dbo].[tblDataCompressionPAGE]') AND name = N'PK_ClusteredIndex_tblDataCompressionPAGE')
CREATE CLUSTERED INDEX [PK_ClusteredIndex_tblDataCompressionPAGE] ON [dbo].[tblDataCompressionPAGE]
(
[rowID] ASC
)WITH (DATA_COMPRESSION = PAGE) ON [PRIMARY]
GO
INSERT INTO [dbo].[tblDataCompressionPAGE]([rowID],[colName],[colcity],[colAddress])
SELECT bucketid,cacheobjtype,objtype,'SQL SERVER DATA COMPRESSION' FROM sys.syscacheobjects
WHERE
(LEN([cacheobjtype])<2000)  
AND (LEN([objtype])<2000) 
ORDER BY bucketid;
GO  
DBCC IND('MSDN',tblDataCompressionPAGE,-1);
GO
EXEC sp_spaceused 'tblDataCompressionPAGE';

GO

Hope After execute above page compression SQL Statement then you will noticed some significant amount of number of data(8K) pages reduce. And allocation size also.Following SQL Statements will help you to find more information about  Data Compression.

USE [MSDN]
GO
SELECT 
    t.[name] AS TableName,
    p.[rows] AS RowCounts,
    SUM(u.[total_pages]) AS TotalPages, 
    SUM(u.[used_pages]) AS UsedPages, 
    (SUM(u.[total_pages]) - SUM(u.[used_pages])) AS UnusedPages
FROM sys.tables t
INNER JOIN      
    sys.indexes i ON t.[object_id] = i.[object_id]
INNER JOIN 
    sys.partitions p ON i.[object_id] = p.[object_id] AND i.[index_id] = p.[index_id]
INNER JOIN 
    sys.allocation_units u ON p.[partition_id] = u.[container_id]
GROUP BY t.[name], p.[rows]
ORDER BY t.[name];
GO
EXEC sp_estimate_data_compression_savings 'dbo', 'tblWithOutDataCompression', NULL, NULL, 'ROW';
GO
EXEC sp_estimate_data_compression_savings 'dbo', 'tblWithOutDataCompression', NULL, NULL, 'PAGE';
GO



My personal view is this can be good move if you have more gain for disk IO. If you have limited CPU and memory then, this can be performance issue.Microsoft mention it will be extra CPU resources are required to compress and decompress the data but I hope it will affect tempdb also, depend on the table or index size.Best method is if you test this in your development or test server and then move to production database.
And after apply data compression then if you noticed not gain any page compression then better to remove it.
And I hope this should be monitoring job weekly/monthly/yearly because first 3 months you will notice good IO performance of data compression and after one year you will notice there is no gain, Because your table contains data (last 9 months) can not be compress then this can be disadvantage.
If you need more information hope this Microsoft white paper will help you. - Data Compression: Strategy, Capacity Planning and Best Practices   
     
Cheers...
and Most welcome your comments and ideas...                             

No comments:

Post a Comment