When an SSIS package runs Data Quality Services (DQS) cleansing through the DQS Cleansing component, each run creates a Data Quality project. These projects are useful for auditing the cleansed data produced by the package, and for exporting that data if a copy is needed. Over time, however, the projects accumulate in the DQS_PROJECTS database, the database can grow large, and the projects can become too numerous to delete easily by hand. To find the largest projects for cleanup, use the script described in KB 2685743, which lists the projects and the size of each.
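As a quick first check before looking at individual projects, the overall size of the database can be inspected with the built-in sp_spaceused procedure. This is only a minimal sketch: it is not the script from KB 2685743, and it reports totals for the whole database rather than per project.

```sql
-- Report total size and unallocated space for the DQS_PROJECTS database.
-- sp_spaceused with no arguments describes the current database.
USE DQS_PROJECTS;
EXEC sp_spaceused;
```

Running it before and after a cleanup gives a rough idea of how much space the deleted projects were holding.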
To delete projects manually:
1. In the Data Quality Client, click the button to open a Data Quality Project.
2. Review the list of DQS cleansing projects. Projects created by SSIS follow a distinctive naming convention: package name, transform name, run date and time, and a GUID.
3. Right-click a locked project (locked projects are shown in red text) and unlock it.
4. Deselect the unlocked project by moving up or down one row (this refreshes the menu), then right-click the unlocked project and choose Delete.
If there are too many projects to delete manually, you can use the T-SQL script described here to automate the cleanup for a given date range.
What the script does:
1. The script uses a date and time range to limit the deletion to projects created within that window.
2. The script bulk deletes Data Quality projects according to the 'Type' flag, which is set to 3 by default (1 = KB, 2 = Cleansing project DQS Client, 3 = SSIS Project).
3. The script deletes projects in both Locked and Unlocked states.
4. If a project fails to delete, the script continues with the remaining projects.
5. Printed text output shows the progress of the deletions and any errors that occur.
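To see how many entries of each type exist before deciding what to delete, a count grouped by the 'Type' flag can help. This is a minimal sketch that assumes the [TYPE] column of [dbo].[A_KNOWLEDGEBASE] holds the values listed above, as the deletion script's own WHERE clause implies.

```sql
-- Count knowledge bases and projects by type.
-- Per the list above: 1 = KB, 2 = Cleansing project (DQS Client), 3 = SSIS project.
USE DQS_MAIN;

SELECT [TYPE], COUNT(*) AS ProjectCount
FROM [dbo].[A_KNOWLEDGEBASE]
GROUP BY [TYPE]
ORDER BY [TYPE];
```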
To run the script:
1. The Windows account executing the T-SQL script should be a member of the dqs_administrator role; we recommend that the account also be a member of the sysadmin role on the server.
2. Run the script from SQL Server Management Studio while connected to the SQL Server instance that hosts DQS (the instance containing the DQS_MAIN and DQS_PROJECTS databases).
3. Modify the @FromDate and @ToDate dates and times in the script to define the window for cleaning up the projects.
SET NOCOUNT ON
USE DQS_MAIN

DECLARE
     @FromDate datetime
    ,@ToDate datetime
    ,@ProjectId bigint
    ,@LockClientId bigint
    ,@DqProject varbinary(max)
    ,@ResultRecords varbinary(max)
    ,@ErrMessage VARCHAR(max)
    ,@rowcount INT
    ,@errCount INT = 0

--Update From date and To date here before execution of script
SELECT
     @FromDate = CAST('2012-10-19 00:00:01.001' AS datetime)
    ,@ToDate = CAST('2012-10-19 23:59:59.997' AS datetime)

PRINT '***************************************************************'
PRINT CAST(GETDATE() AS VARCHAR(MAX)) + ' :: ' + 'Executing script for date range ' + CAST(@FromDate AS VARCHAR(MAX)) + ' to ' + CAST(@ToDate AS VARCHAR(MAX))

DECLARE DELETE_PROJECTS_CURSOR CURSOR FOR
    SELECT [ID], ISNULL([LOCK_CLIENT_ID],-1)
    FROM [DQS_Main].[dbo].[A_KNOWLEDGEBASE]
    WHERE [TYPE] = 3 -- BatchDQProject, projects that are generated by SSIS packages
    AND [CREATE_DATE] BETWEEN @FromDate AND @ToDate

OPEN DELETE_PROJECTS_CURSOR
FETCH NEXT FROM DELETE_PROJECTS_CURSOR INTO @ProjectId, @LockClientId

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY
        PRINT CAST(GETDATE() AS VARCHAR(MAX)) + ' :: ' + 'Operating on Project: [' + CAST(@ProjectId AS VARCHAR(MAX)) + ']'
        EXECUTE [KnowledgebaseManagement].[SetDataQualitySession] @clientId=@LockClientId, @knowledgebaseId=NULL

        -- a lock client id other than -1 means the project is locked; exit it first
        IF (@LockClientId != -1)
        BEGIN
            EXECUTE [KnowledgebaseManagement].[DQProjectGetById] @ProjectId,@DqProject OUTPUT
            EXECUTE [KnowledgebaseManagement].[DQProjectExit] @DqProject,@ResultRecords OUTPUT
        END

        -- delete project's activity archive
        DELETE FROM [dbo].[A_PROFILING_ACTIVITY_ARCHIVE]
        WHERE [ACTIVITY_ID] IN (SELECT ID FROM [dbo].[A_KNOWLEDGEBASE_ACTIVITY] WHERE [KNOWLEDGEBASE_ID] = @ProjectId)

        -- refresh the project state
        EXECUTE [KnowledgebaseManagement].[DQProjectGetById] @ProjectId,@DqProject OUTPUT

        PRINT CAST(GETDATE() AS VARCHAR(MAX)) + ' :: ' + 'Deleting project: [' + CAST(@ProjectId AS VARCHAR(MAX)) + ']'
        EXECUTE [KnowledgebaseManagement].[DQProjectDelete] @DqProject
        PRINT CAST(GETDATE() AS VARCHAR(MAX)) + ' :: ' + 'Deleted project: [' + CAST(@ProjectId AS VARCHAR(MAX)) + ']'
    END TRY
    BEGIN CATCH
        PRINT 'An error has occurred with the following details'
        PRINT 'Error ' + CONVERT(varchar(50), ERROR_NUMBER()) +
              ', Severity ' + CONVERT(varchar(5), ERROR_SEVERITY()) +
              ', State ' + CONVERT(varchar(5), ERROR_STATE()) +
              ', Procedure ' + ISNULL(ERROR_PROCEDURE(), '-') +
              ', Line ' + CONVERT(varchar(5), ERROR_LINE());
        PRINT 'Error Message: ' + ERROR_MESSAGE();
        SELECT @errCount = @errCount + 1
        PRINT 'Skipping this project because of errors'
    END CATCH

    FETCH NEXT FROM DELETE_PROJECTS_CURSOR INTO @ProjectId, @LockClientId
END

IF @errCount > 0
    PRINT 'Script completed with ' + CAST(@errCount AS VARCHAR(MAX)) + ' errors'
ELSE
    PRINT 'Script completed successfully'
PRINT '***************************************************************'

BEGIN TRY
    CLOSE DELETE_PROJECTS_CURSOR
    DEALLOCATE DELETE_PROJECTS_CURSOR
END TRY
BEGIN CATCH
    --Do nothing
END CATCH
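
Before running the deletion script, it may be worth previewing which projects fall inside the chosen window. The read-only query below is a minimal sketch that reuses the same filter as the script's cursor. The LockState column is an illustrative addition, and it assumes that a NULL [LOCK_CLIENT_ID] means the project is unlocked, which is how the script's ISNULL(..., -1) check treats it.

```sql
-- Preview the SSIS-generated projects that the deletion script would target.
-- Read-only: nothing is deleted. Use the same dates as in the deletion script.
USE DQS_MAIN;

DECLARE @FromDate datetime = '2012-10-19T00:00:01.001',
        @ToDate   datetime = '2012-10-19T23:59:59.997';

SELECT [ID],
       [CREATE_DATE],
       CASE WHEN [LOCK_CLIENT_ID] IS NULL THEN 'Unlocked' ELSE 'Locked' END AS LockState
FROM [dbo].[A_KNOWLEDGEBASE]
WHERE [TYPE] = 3  -- projects generated by SSIS packages
  AND [CREATE_DATE] BETWEEN @FromDate AND @ToDate
ORDER BY [CREATE_DATE];
```

If the row count or date range looks wrong, adjust @FromDate and @ToDate here first, then copy the same values into the deletion script.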
DQS Resources on TechNet Wiki