When you use an SSIS package to run Data Quality Services Cleansing activities via the DQS Cleansing Component, each run produces a Data Quality Project. The resulting Data Quality project is useful for auditing the cleansed data produced from the SSIS package,
and also to export the data if a copy is needed.
However, over time the DQS projects accumulate in the DQS_PROJECTS database, the database may grow large, and the projects may become too numerous to easily delete. If you need to find the largest projects for cleanup, use the script described in
KB 2685743 to identify the projects and the size of each.
In the Data Quality Client, click the button to open the Data Quality Project
Enumerate the list of DQS Cleansing Projects. The SSIS projects will have a unique naming convention with the Package Name, Transform Name, Run Date and Time, GUID.
Right click on a project to unlock it (red text means locked)
Deselect the unlocked project by moving up or down one row (this will refresh the menu), then right click on the unlocked project, and choose Delete.
In the case where there are too many projects to manually delete, you may use the TSQL script described here to automate the cleanup for a certain date range.
1. The script uses a date and time range to target the project deletion to a scope of time.
2. The script bulk deletes Data Quality projects according to the ‘Type’ flag which is set to 3 by default (1 = KB, 2 = Cleansing project DQS Client, 3 = SSIS Project)
3. The script deletes project in both Locked and Unlocked states.
4. If a project fails to delete, the script continues to delete the remaining projects.
5. Printed text output shows progress of the deletions and any errors which occur.
1. The Windows account executing the TSQL script should have a ‘dqs_administrator’ role; we recommend the account to have a sysadmin role on the box.
2. Run the script from SQL Server Management Studio while connected to SQL Server instance running the DQS instance (hosting the DQS_MAIN and DQS_PROJECTS databases)
3. Modify the ‘FromDate’ and ‘ToDate’ dates and times in the script to define the window for cleansing up the projects
SET
NOCOUNT
ON
USE DQS_MAIN
DECLARE
@FromDate datetime
DECLARE
@ToDate datetime
DECLARE
@ProjectId
TSQL Code
SET
yle="color:#006699;font-weight:bold;">bigint
DECLARE
@LockClientId
bigint
DECLARE
@DqProject varbinary(
max
)
DECLARE
@ResultRecords varbinary(
max
)
,@ErrMessage
VARCHAR
(
max
)
,@rowcount
INT
,@errCount
INT
= 0
--Update From date and To date here before execution of script
SELECT
@FromDate =
CAST
(
'2012-10-19 00:00:01.001'
AS
datetime)
SELECT
@ToDate =
CAST
(
'2012-10-19 23:59:59.997'
AS
datetime)
PRINT
'***************************************************************'
PRINT
CAST
(GETDATE()
AS
VARCHAR
(
MAX
)) +
' :: '
+
'Executing script for date range '
+
CAST
(@FromDate
AS
VARCHAR
(
MAX
)) +
' to '
+
CAST
(@ToDate
AS
VARCHAR
(
MAX
))
DECLARE
DELETE_PROJECTS_CURSOR
CURSOR
FOR
SELECT
[ID],
ISNULL
([LOCK_CLIENT_ID],-1)
FROM
[DQS_Main].[dbo].[A_KNOWLEDGEBASE]
WHERE
[TYPE] = 3
-- BatchDQProject, projects that are generated by SSIS packages
AND
[CREATE_DATE]
BETWEEN
@FromDate
AND
@ToDate
OPEN
DELETE_PROJECTS_CURSOR
FETCH
NEXT
FROM
DELETE_PROJECTS_CURSOR
INTO
@ProjectId, @LockClientId
WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN
TRY
PRINT
CAST
(GETDATE()
AS
VARCHAR
(
MAX
)) +
' :: '
+
'Operating on Project: ['
+
CAST
(@ProjectId
AS
VARCHAR
(
MAX
)) +
']'
EXECUTE
[KnowledgebaseManagement].[SetDataQualitySession] @clientId=@LockClientId, @knowledgebaseId=
NULL
IF (@LockClientId != -1)
BEGIN
EXECUTE
[KnowledgebaseManagement].[DQProjectGetById] @ProjectId,@DqProject
OUTPUT
EXECUTE
[KnowledgebaseManagement].[DQProjectExit] @DqProject,@ResultRecords
OUTPUT
&nde>
BEGIN
EXECUTE
[KnowledgebaseManagement].[DQProjectGetById] @ProjectId,@DqProjebsp;
END
-- delete project's activity archive
DELETE
FROM
[dbo].[A_PROFILING_ACTIVITY_ARCHIVE]
WHERE
[ACTIVITY_ID]
IN
(
SELECT
ID
FROM
[dbo].[A_KNOWLEDGEBASE_ACTIVITY]
WHERE
[KNOWLEDGEBASE_ID] = @ProjectId)
-- refresh the project state
EXECUTE
[KnowledgebaseManagement].[DQProjectGetById] @ProjectId,@DqProject
OUTPUT
PRINT
CAST
(GETDATE()
AS
VARCHAR
(
MAX
)) + ' ::
' + '
Deleting project: [
' + CAST(@ProjectId AS VARCHAR(MAX)) +'
]
'
EXECUTE [KnowledgebaseManagement].[DQProjectDelete] @DqProject
PRINT CAST(GETDATE() AS VARCHAR(MAX)) + '
::
' + '
Deleted project: [
' + CAST(@ProjectId AS VARCHAR(MAX)) +'
]
'
END TRY
BEGIN CATCH
PRINT CAST(GETDATE() AS VARCHAR(MAX)) + '
::
' + '
An error has occurred
with
the following details
'
PRINT CAST(GETDATE() AS VARCHAR(MAX)) + '
::
' + '
Error
' + CONVERT(varchar(50), ERROR_NUMBER()) +
'
, Severity
' + CONVERT(varchar(5), ERROR_SEVERITY()) +
'
, State
' + CONVERT(varchar(5), ERROR_STATE()) +
Error
' + CONVERT(varchar(50), ERROR_NUMBER()) +
&="color:blue;">'
,
Procedure
' + ISNULL(ERROR_PROCEDURE(), '
-
') +
'
, Line
' + CONVERT(varchar(5), ERROR_LINE());
PRINT CAST(GETDATE() AS VARCHAR(MAX)) + '
::
' + '
Error Message:
' + ERROR_MESSAGE();
SELECT @errCount = @errCount + 1
PRINT CAST(GETDATE() AS VARCHAR(MAX)) + '
::
' + '
Skipping this project because
of
errors
'
END CATCH
FETCH NEXT FROM DELETE_PROJECTS_CURSOR INTO @ProjectId, @LockClientId
END
IF @errCount > 0
PRINT CAST(GETDATE() AS VARCHAR(MAX)) + '
::
' + '
Script completed
with
' + CAST(@errCount AS VARCHAR(MAX)) + '
errors
'
ELSE
PRINT CAST(GETDATE() AS VARCHAR(MAX)) + '
::
' + '
Script completed successfully
'
PRINT '
***************************************************************'
BEGIN
TRY
CLOSE
DELETE_PROJECTS_CURSOR
DEALLOCATE
DELETE_PROJECTS_CURSOR
END
TRY
BEGIN
CATCH
--Do nothing
END
CATCH
See Also