Monday, May 21, 2007

When to not not use Cursors

Cursors are usually bad to use; almost always there is a more efficient way to solve your problem using a set based solution. However there are some cases when cursors allow you to create a solution that is exponentially easier to implement than a set based solution. In general you should only resort to using cursors when a difficult set based solution becomes trivial when solved using cursors.

A classic example is that you have 5 classrooms of various sizes, and 3 classes of various sizes, and you want to assign a class to each classroom utilizing the minimum space required.


we'll simplify this problem and leave dates and times out of this issue :). Now this problem is very difficult to solve using a set based solution (it is possible, google itzik). However this is pretty trivial to solve using a cursor based solution, psuedo code below

  1. Declare 2 cursors, one of the list of classrooms (lets call it: RoomsCursor) sorted by increasing capacity (number of seats), and another cursor for the list of classes (ClassesCursor) sorted by increasing number of students.
  2. Now Fetch the first (smallest since you sorted ascending) class from the RoomsCursor
  3. While the fetch returned a class that needs a classroom
    1. Fetch the smallest unused classroom from RoomsCursor. if there is no available room, or the room is too small, continue and fetch the next smallest. Repeat fetching new rooms until you find a room that has fit or run out of rooms
    2. If you didnt run out of rooms (and the last fetch yielded a room and the number of seats in the room is smaller than the number of students in the current room:
      1. if you found a big enough room, schedule the class
      2. else, you ran out of rooms!
      3. fetch another Class
  4. Return the scheduled events
In this case we are scanning both the classrooms and the classes in order. We never back up the cursor. We schedule classes by matching classes to class rooms until we either run out of classses to find classrooms for or we run out of rooms to accomidate classes. The only time there is an error is when no solution exists.

This solution runs in O(N) time since we are simply stepping through the cursor, the worst case solution is that we look at each class or classroom once.

this will set up the problem

USE tempdb;
GO
IF OBJECT_ID('dbo.Classes') IS NOT NULL
DROP TABLE dbo.Classes;
GO
IF OBJECT_ID('dbo.Classrooms') IS NOT NULL
DROP TABLE dbo.Classrooms;
GO

CREATE TABLE dbo.Classrooms
(
classroomid VARCHAR(10) NOT NULL PRIMARY KEY,
classSize INT NOT NULL
);

INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('C001', 2000);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('B101', 1500);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('B102', 100);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('R103', 40);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('R104', 40);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('B201', 1000);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('R202', 100);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('R203', 50);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('B301', 600);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('R302', 55);
INSERT INTO dbo.Classrooms(classroomid, classSize) VALUES('R303', 55);

CREATE TABLE dbo.Classes
(
classid INT NOT NULL PRIMARY KEY,
eventdesc VARCHAR(25) NOT NULL,
attendees INT NOT NULL
);

INSERT INTO dbo.Classes(classid, eventdesc, attendees)
VALUES(1, 'Mikes Adv T-SQL Seminar', 193);
INSERT INTO dbo.Classes(classid, eventdesc, attendees)
VALUES(2, 'CIQ .NET Pages', 51);
INSERT INTO dbo.Classes(classid, eventdesc, attendees)
VALUES(3, 'How to Break the DB', 232);
INSERT INTO dbo.Classes(classid, eventdesc, attendees)
VALUES(4, 'XAML ROCKS', 89);
INSERT INTO dbo.Classes(classid, eventdesc, attendees)
VALUES(5, 'CIQ Security Issues', 897);
INSERT INTO dbo.Classes(classid, eventdesc, attendees)
VALUES(6, 'Data Modeling 101', 46);
GO

CREATE INDEX idx_att_eid_edesc
ON dbo.Classes(attendees, classid, eventdesc);
CREATE INDEX idx_classSize_rid
ON dbo.Classrooms(classSize, classroomid);
GO
Cursor Solution:

DECLARE
@classroomid AS VARCHAR(10), @classSize AS INT,
@classid AS INT, @attendees AS INT;

DECLARE @Result TABLE(classroomid VARCHAR(10), classid INT);

DECLARE CClassrooms CURSOR FAST_FORWARD FOR
SELECT classroomid, classSize FROM dbo.Classrooms
ORDER BY classSize, classroomid;
DECLARE CClasses CURSOR FAST_FORWARD FOR
SELECT classid, attendees FROM dbo.Classes
ORDER BY attendees, classid;

OPEN CClassrooms;
OPEN CClasses;

FETCH NEXT FROM CClasses INTO @classid, @attendees;
WHILE @@FETCH_STATUS = 0
BEGIN
FETCH NEXT FROM CClassrooms INTO @classroomid, @classSize;

WHILE @@FETCH_STATUS = 0 AND @classSize < @attendees
FETCH NEXT FROM CClassrooms INTO @classroomid, @classSize;

IF @@FETCH_STATUS = 0
INSERT INTO @Result(classroomid, classid) VALUES(@classroomid, @classid);
ELSE
BEGIN
RAISERROR('Not enough Classrooms for Classes.', 16, 1);
BREAK;
END

FETCH NEXT FROM CClasses INTO @classid, @attendees;
END

CLOSE CClassrooms;
CLOSE CClasses;

DEALLOCATE CClassrooms;
DEALLOCATE CClasses;

SELECT classroomid, classid FROM @Result;
GO

1 comment:

Anonymous said...

This problem can still be solved with a set-based approach. It's also faster than the cursor solution (39% vs 61%)

With the CTEs and windowing functions made available in SQL 2005, there is almost no reason at all to use a cursor.


with matches as
(
select
c.classId
, r.classRoomId
, c.classSize
, r.classroomSize
, pref = row_number() over (partition by c.classId order by r.classroomSize asc)
from Classes c
cross join Classrooms r
where c.classSize <= r.classroomSize
)
select
m.classId
, m.classRoomId
from matches m
where not exists (select 'x' from matches where classId = m.classId and pref < m.pref)
and not exists (select 'x' from matches where classRoomId = m.classRoomId and pref < m.pref)