Text Search in SQL Server Stored Procedures: Overcoming sys.comments Insufficiencies

Jun 09 2010

Searching a SQL Server database’s sys.comments table has long been a quick way to find stored procedure usage of another database, table, or column.  Likewise, clicking “view dependencies” on a table is a quick way to find the opposite: which stored procedures (on the local db) refer to that table.  I recently came across a scenario where neither met my needs, so I had to expand the capabilities of the former.  The limitation of the first approach is parsing the text field to find multiple items.  CHARINDEX and SUBSTRING functions only get you to the first, and looping can be incorrect as line breaks may occur haphazardly mid-text. 

A client wants to decommission a legacy data warehouse so all stored procedures which extract data from it need to be remapped to a new data source.  How do you retrieve all the tables that any stored procedure refers to and compile a report of table names and counts?  The below solution is a quick method which isn’t overly clever, but very useful.  For this example, I ran the code on a database called “Extracts” which houses the stored procedures that pull data from the database “Datawarehouse”.  Temp tables rather than variable tables are necessary in this solution because of the usage of dynamic SQL. 

First, compile a list of procedure names which refer to your database in question.  We need to loop through the table of stored procedure names and use the sp_helptext system procedure to dump the contents of the procedure into the table, with the name of the stored procedure in the other column.  This process is more accurate than directly using sys.comments because the line breaks are set by the developer and are compiled, thus complete.  Note: feel free to use your looping mechanism of choice; I used a cursor but respect the ongoing “evil cursors” debate.  The @ToFind and @Limiter variables are declared and set at the top of the SQL to promote reusability.

DECLARE @ToFind VARCHAR(200),@Limiter VARCHAR(200),@Name VARCHAR(200),@SQL VARCHAR(3000)
SELECT @ToFind = 'Datawarehouse.dbo.',@Limiter = ' '

CREATE TABLE #Text(Name VARCHAR(200),Text VARCHAR(MAX))

DECLARE _Cursor_ CURSOR FOR
SELECT DISTINCT a.Name
FROM sys.objects a
INNER JOIN syscomments b ON a.object_id = b.id
AND b.text LIKE '%' + @ToFind + '%'

OPEN _Cursor_
FETCH NEXT FROM _Cursor_ INTO @Name

WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @SQL = 'INSERT INTO #TEXT(Text) EXEC sp_helpText ' + @Name

EXEC (@SQL)
UPDATE #Text SET Name = @Name WHERE Name IS NULL

FETCH NEXT FROM _Cursor_ INTO @Name
END

CLOSE _Cursor_
DEALLOCATE _Cursor_

This process took 00:11 seconds and the #Text table holds 47,903 rows.  Much of this data is useless to us as all content from the stored procedures is present.  We are only interested in obtaining the references to tables keyed with “Datawarehouse.dbo.”.  I chose to save the name of the stored procedure with the text.  Even though my final output does not make use of this information, it’s likely the client may wish to have the breakdown by stored procedure, which is easily producible with this field. 

 At first I thought I could just query this list for instances of “Datawarehouse.dbo.” but I realized it’s possible that developers put multiple references to tables on the same line; a looping practice is still needed.  Again, to promote reusability, I created a string search function which returns a list of occurrences of a “Text to Find” up until a limiter.

CREATE FUNCTION [dbo].[ufn_StringSearch](
@Text VARCHAR(MAX)
,@ToFind VARCHAR(200)
,@Limiter VARCHAR(200)
)

RETURNS @Output TABLE(FoundText VARCHAR(MAX))

AS

BEGIN

--Index is always the starting position of the ToFind text

DECLARE @Index INT,@Length INT
SELECT @Index = CHARINDEX(@ToFind,@Text),@Length = LEN(@Text)

--Insure that there is text we care about and still a limiter present

WHILE @Index > 0 AND CHARINDEX(@Limiter,@Text) > 0
BEGIN
       --Trim the Text to begin with what we care about (ToFind) and reset Length

       SELECT @Text = SUBSTRING(@Text,@Index,@Length)
SELECT @Length = LEN(@Text)

       --Save the ToFind up until the Limiter (giving everything inbetween)
--If there's no limiter present, just insert the ext as is.

INSERT INTO @Output
SELECT CASE   WHEN CHARINDEX(@Limiter,@Text) = 0 THEN @Text
ELSE SUBSTRING(@Text,0,CHARINDEX(@Limiter,@Text)) END

       --Now find the start of ToFind text in the rest of the Text, after the Limiter

       SELECT @Text = SUBSTRING(@Text,CHARINDEX(@Limiter,@Text) + 1,@Length)
SELECT @Index = CHARINDEX(@ToFind,@Text)

END

RETURN

END

By returning a table, the function is easily joined to with the CROSS APPLY statement.  Thus the results of our inquiry, “how many times these Datawarehouse tables appear in the stored procedures”, is produced with:

SELECT RTRIM(REPLACE(b.FoundText,@ToFind,'')),COUNT(*)
FROM #Text a
CROSS APPLY dbo.ufn_StringSearch(a.Text,@ToFind,@Limiter) b
GROUP BY RTRIM(REPLACE(b.FoundText,@ToFind,''))
ORDER BY 1

Here, I simply stripped out the “Datawarehouse.dbo.” prefix of the outputted data.  This query returns my data (152 tables and a count for each table) in 00:08 seconds.  This process is also useful for finding column references, developer comments, or any other fixed text assuming you have a beginning and ending criteria.

About the Author

 

Disclaimer

The words and opinions expressed here are those of each article's respective author, and do not necessarily represent the views of CapTech Ventures.