Friday, March 13, 2009

Curricular Networks

I wrote the following report years ago and recently stumbled upon it again. I have edited it somewhat. Most of us are probably used to the idea of curriculum mapping as a way to find out what gets taught where, but there is another way to make a map of the curriculum.


1. Introduction

Course prerequisites ideally keep students from taking classes they are unprepared for. As a logistical side effect, they also limit enrollment in the classes that have them, at least in theory. Taken together, prerequisites form a network of courses that describes the curriculum in terms of logistical flow. We will look at this network as it exists in theory and compare it to some actual data.

There are many kinds of prerequisites, but they all fall into two categories, which may be combined with a conjunction (AND).

AND Requirements

These simply list a group of courses that must be taken prior to the target course. Each listed course restricts further the potential enrollment for the target.

OR Requirements

In the most general description, these are requirements to take a fixed number of courses from a list. In the simplest case, it’s of the form Take ABC-101 or ABC-102 or ABC-103. In the more complicated form, they may require more than one course from a list, as in Take 2 courses from ABC-101, ABC-102, or ABC-103.

All of the prerequisites that currently exist are described in our registration database as being one or the other of the above types, or a combination of the two conjoined with AND.
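To make the two categories concrete, here is a minimal sketch of how such requirements might be represented and checked. The class names `Choose` and `AllOf` and the function `satisfied` are illustrative assumptions, not the format used by the registration database, and the course codes are the made-up ABC examples from above.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Choose:
    """OR-type requirement: take at least `k` courses from `options`."""
    k: int
    options: Tuple[str, ...]


@dataclass(frozen=True)
class AllOf:
    """AND-type requirement: every part must be satisfied."""
    parts: Tuple[Union[str, Choose], ...]


def satisfied(req, taken):
    """Return True if the set of completed courses `taken` meets `req`."""
    if isinstance(req, str):        # a single required course
        return req in taken
    if isinstance(req, Choose):     # k courses from a list
        return sum(c in taken for c in req.options) >= req.k
    if isinstance(req, AllOf):      # conjunction of the above
        return all(satisfied(p, taken) for p in req.parts)
    raise TypeError(f"unknown requirement: {req!r}")


# Hypothetical requirement: ABC-100 AND (2 of ABC-101, ABC-102, ABC-103)
req = AllOf(("ABC-100", Choose(2, ("ABC-101", "ABC-102", "ABC-103"))))
print(satisfied(req, {"ABC-100", "ABC-101", "ABC-103"}))  # True
print(satisfied(req, {"ABC-100", "ABC-101"}))             # False
```

The simple "A or B or C" case is just `Choose(1, ...)`, so a single OR type covers both forms described above.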

2. Method of Analysis

Course prerequisites are described in a systematic way in the administrative database. These were obtained in electronic form and parsed. Student transcripts were then used to count how many students satisfied the requirements for a class and how many actually took it. This process is shown in the schematic below.

In the first step, a symbolic logic representation of each prerequisite tree is generated. Some of these get quite complicated, and in fact there were a few that were infinitely long because of ‘looping’ prerequisites in the Dance curriculum. A few such prerequisites were artificially truncated in order to let the computation finish within the lifetime of the universe.
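Looping prerequisites like these can be found before any expansion is attempted. The following is a sketch of a standard depth-first search for cycles, not the procedure used in the original analysis (which truncated the expansion instead). The function name `find_cycle` and the Dance course codes are hypothetical.

```python
def find_cycle(prereqs):
    """Return one list of courses forming a prerequisite loop, or None.

    `prereqs` maps each course to the courses it directly requires.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    state = {}
    path = []

    def visit(course):
        state[course] = GREY
        path.append(course)
        for dep in prereqs.get(course, ()):
            s = state.get(dep, WHITE)
            if s == GREY:          # dep is already on the path: a loop
                return path[path.index(dep):] + [dep]
            if s == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[course] = BLACK
        return None

    for course in list(prereqs):
        if state.get(course, WHITE) == WHITE:
            found = visit(course)
            if found:
                return found
    return None


# Hypothetical course codes; the real Dance prerequisites are not listed here.
looping = {"DAN-201": ["DAN-202"], "DAN-202": ["DAN-201"], "DAN-101": []}
print(find_cycle(looping))  # ['DAN-201', 'DAN-202', 'DAN-201']
```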

As an illustration, here’s the recursive logic for BIO 211:
BIO-211=(MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*BIO-101+MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111+MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*BIO-102+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*BIO-102+MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111*MAT-100*MAT-101*BIO-102)*(MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*BIO-101+MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111+MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*BIO-102+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*BIO-102+MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111*MAT-100*MAT-101*BIO-102)*(MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*BIO-101+MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111+MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*BIO-102+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*BIO-102+MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111*MAT-100*MAT-101*BIO-102)*(MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*BIO-101+MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111+MAT-00*MAT-101*BIO-110*MAT-100*MAT-101*BIO-102+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*BIO-102+MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111*MAT-100*MAT-101*BIO-102)
This is by no means the simplest way to express the relationship, but it has the advantage of being completely machine generated, and it can be processed by existing algorithms of the kind normally used in teaching computer architecture courses.
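To show how much redundancy the machine-generated form carries, here is a sketch that reads `*` as AND and `+` as OR and applies two Boolean identities: X*X = X (handled by storing each product as a set) and X + X*Y = X (absorption). The helper names are assumptions for illustration, and the parser handles only flat, parenthesis-free groups. The four parenthesized groups in the BIO-211 expression appear to be the same group repeated, so the group is written out once and ANDed with itself.

```python
def parse_sop(expr):
    """Parse a flat 'A*B+C*D' string into a set of frozensets (OR of ANDs)."""
    return {frozenset(term.split("*")) for term in expr.split("+")}


def absorb(terms):
    """Drop every term that strictly contains another term (X + X*Y = X)."""
    return {t for t in terms if not any(o < t for o in terms)}


def and_groups(a, b):
    """AND two sums of products by distributing, then simplify."""
    return absorb({x | y for x in a for y in b})


# One parenthesized group of the BIO-211 expression, copied from above.
GROUP = (
    "MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*BIO-101"
    "+MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111"
    "+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111"
    "+MAT-100*MAT-101*BIO-110*MAT-100*MAT-101*BIO-102"
    "+MAT-100*MAT-101*BIO-101*MAT-100*MAT-101*BIO-102"
    "+MAT-100*MAT-101*MAT-100*MAT-101*BIO-110*BIO-111*MAT-100*MAT-101*BIO-102"
)

group = absorb(parse_sop(GROUP))
bio_211 = group
for _ in range(3):                 # the group occurs four times in total
    bio_211 = and_groups(bio_211, group)

simplified = sorted("*".join(sorted(t)) for t in bio_211)
for term in simplified:
    print(term)
# BIO-101*BIO-102*MAT-100*MAT-101
# BIO-101*BIO-110*MAT-100*MAT-101
# BIO-102*BIO-110*MAT-100*MAT-101
# BIO-110*BIO-111*MAT-100*MAT-101
```

Under these assumptions the whole expression collapses to four alternatives, each requiring MAT-100 and MAT-101 plus two Biology courses.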

Once the logic was generated, a ten-year range of transcripts was processed to compute how many students in a given semester had satisfied the prerequisites for courses offered that semester. It was a simple matter to count how many actually took each course, so the two numbers could be compared.

Both recursive and non-recursive methods were used, and ultimately I decided that the non-recursive ones were the most useful. That means a student is counted as having satisfied the requirements for a course if he or she meets the overt requirements, but not necessarily the sub-requirements (the prerequisites of the prerequisites).

The long process of working through decisions like this one boiled down to finding an appropriate metaphor for the curricular network. Ideas that seemed reasonable but were ultimately discarded included modeling the requirement tree on a digestive system (yuck), a river (too simple), an electrical network (no way to include the complicated ‘or’ prerequisites), a quantum mechanical system (no one would understand it), pure probability (leads to the computation of huge correlation matrices), and multi-valued logic (problems with commutativity). I finally settled on the model of a rather simple computer chip. You could, in fact, burn the course requirements onto a piece of silicon (excepting the Dance department, with its infinite loops).
The diagram above shows an imaginary set of course requirements for ABC-300, implemented with logic gates. The one for the biology example above could be simplified to fit on one page, but would be more work than it’s worth.
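The non-recursive counting described in this section might look roughly like the sketch below. The data layout is an assumption: each transcript is a list of `(semester, course)` pairs with integer semesters, and each requirement is an OR of ANDs over direct prerequisites only. Only students with at least one record in the given semester are counted, which is also an assumption about the original method.

```python
NO_PREREQS = [frozenset()]   # an empty AND term is met by every student


def qualifies(requirement, completed):
    """`requirement` is a list of sets; any one fully completed set suffices."""
    return any(term <= completed for term in requirement)


def qualified_and_enrolled(transcripts, requirements, course, semester):
    """Count students who met the direct prerequisites and who enrolled."""
    qualified = enrolled = 0
    for records in transcripts.values():
        if not any(s == semester for s, _ in records):
            continue                         # not active this semester
        completed = {c for s, c in records if s < semester}
        if qualifies(requirements.get(course, NO_PREREQS), completed):
            qualified += 1
        if (semester, course) in records:
            enrolled += 1
    return qualified, enrolled


# Hypothetical transcripts: student id -> [(semester, course), ...]
transcripts = {
    "s1": [(1, "ABC-101"), (2, "ABC-201")],
    "s2": [(1, "ABC-101"), (2, "XYZ-100")],
    "s3": [(1, "XYZ-100"), (2, "ABC-101")],
}
requirements = {"ABC-201": [{"ABC-101"}]}

print(qualified_and_enrolled(transcripts, requirements, "ABC-201", 2))  # (2, 1)
print(qualified_and_enrolled(transcripts, requirements, "ABC-101", 2))  # (3, 1)
```

Each such pair of counts corresponds to one course in one semester.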

3. Historical Data

It would be natural to expect that the more severe the prerequisite restrictions, as reflected by the number of students who qualify to take a given target course, the fewer students would actually take it. In other words, one would expect to see the emergence of a straightforward linear relationship with nice correlation. The truth is a lot stranger. The scatter plot below summarizes five years of data.

Each dot is an individual course type taught during a semester, so all sections of HIS-210 in a semester are lumped together, for example. The vertical scale shows how many students during that semester met the requirements for that course, according to their transcripts (transfer courses were NOT included, which must be taken into account when drawing conclusions). The lines at the top are actually accumulations of dots that represent courses with no prerequisites, so all students qualified. The horizontal scale shows how many students actually took a course type (all sections within a semester being aggregated).

The data have a correlation coefficient of 0.2, which means the relationship isn’t particularly linear, a conclusion that is obvious from the picture. The conclusion that does force itself upon us, just from inspection, is that there are essentially two types of courses: those with no prerequisites (top of the chart) and those with prerequisites (bottom of the chart). If we only consider the group at the bottom, we get a correlation coefficient of 0.4, which is a stronger indication of linearity than before, but still not very compelling (it would need to be near 1).
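For readers who want to reproduce this kind of comparison, here is a sketch of the Pearson correlation coefficient, computed once over all courses and once over only those with prerequisites. The numbers are invented purely to show the mechanics and are not the data behind the scatter plot. `TOTAL` is an assumed total student count.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Invented (enrolled, qualified) pairs, one per course per semester.
TOTAL = 1000   # assumed student body; courses without prerequisites sit here
courses = [(120, 1000), (40, 1000), (25, 300), (18, 150), (9, 60), (30, 400)]

r_all = pearson([e for e, _ in courses], [q for _, q in courses])

# Keep only courses whose prerequisites exclude at least one student.
restricted = [(e, q) for e, q in courses if q < TOTAL]
r_restricted = pearson([e for e, _ in restricted], [q for _, q in restricted])

print(round(r_all, 2), round(r_restricted, 2))
```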

Since many prerequisites are of the AND type, one would expect demand to drop exponentially with each additional requirement, which is a strong hint that we should look at the graph with a log scale on the vertical axis.
Now we can clearly see the ‘funnel effect’ of restrictions on class entry. The relationship isn’t a simple linear one, where fewer restrictions mean more enrollment. Rather, the number of qualified students sets a statistical upper limit on class enrollment. It’s easy to see this boundary where the dots stop toward the right of the graph at each level.
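The expectation of an exponential drop can be written out under a deliberately crude assumption that is not part of the original report: suppose there are N students and each AND requirement is met, independently of the others, by the same fraction p of them. Then the number of students Q_k who qualify for a course with k such requirements, and the enrollment E it can draw, satisfy:

```latex
% Assumptions (illustrative only): N students in total; each AND requirement
% is satisfied independently by a fraction p of them, with 0 < p < 1.
Q_k = N\,p^{k}
\quad\Longrightarrow\quad
\log Q_k = \log N + k \log p ,
\qquad
E \le Q_k .
```

Since log p is negative, log Q_k falls by a constant step with each added requirement, which is why a logarithmic vertical axis spreads the restricted courses into visible levels, and E ≤ Q_k is the boundary on the right of each level.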

Of course, there are many factors that determine why a student takes one course rather than another. One would guess that a main consideration is the student’s major. This is borne out in an analysis of class correlations. The chart below shows a matrix of correlations between classes taught over a ten-year period. Courses are in alphabetical order, which has the advantage of grouping together ones from the same discipline.
A perfectly distributed curriculum, where every student was equally likely to take any given class, would give a perfectly uniform matrix. The ‘clumpiness’ visible above is evidence of a high degree of selectivity. The dot size, ranging from specks to balloons, is indicative of how strong the correlation is. The fact that most of the larger ones are distributed around the diagonal shows that students have a tendency to choose classes within the same discipline. General education classes or other popular classes will appear as horizontal and vertical streaks (correlation matrices are symmetrical). The main diagonal of the matrix has been omitted, because all it would show is relative class size for each class, and would obscure the relationships we’re looking for.
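One plausible way to build such a matrix, offered as a sketch since the report does not say exactly how the correlations were computed, is to treat each class as a 0/1 indicator per student and take the Pearson correlation of each pair of indicators (the phi coefficient). The rosters below are invented, and the diagonal is left as `None` to mirror the omitted main diagonal.

```python
import math


def phi(a, b, n):
    """Correlation of two 0/1 enrollment indicators over n students.

    `a` and `b` are the sets of student ids who took each class.
    """
    na, nb, nab = len(a), len(b), len(a & b)
    denom = math.sqrt(na * (n - na) * nb * (n - nb))
    if denom == 0:
        return 0.0        # a class taken by nobody or by everybody
    return (n * nab - na * nb) / denom


def correlation_matrix(rosters, n):
    """Return (courses in alphabetical order, symmetric matrix of phi values)."""
    courses = sorted(rosters)
    matrix = [
        [None if i == j else phi(rosters[ci], rosters[cj], n)
         for j, cj in enumerate(courses)]
        for i, ci in enumerate(courses)
    ]
    return courses, matrix


# Invented rosters: course -> set of student ids, four students in total.
rosters = {"HIS-210": {3, 4}, "BIO-101": {1, 2}, "BIO-102": {1, 2}}
courses, matrix = correlation_matrix(rosters, n=4)
print(courses)        # ['BIO-101', 'BIO-102', 'HIS-210']
print(matrix[0][1])   # 1.0   (same students took both Biology classes)
print(matrix[0][2])   # -1.0  (no overlap with History)
```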

4. Conclusions

Prerequisites for classes create barriers to entry. The example in Exhibit 6 below shows a diagram of the prerequisites in Math and History.


The average number of prerequisites per course is 2.09 for Math and 0.95 for History.
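An average like this could be computed as in the sketch below, assuming that only direct prerequisites are counted and that the department is the part of the course code before the hyphen. Neither assumption is confirmed by the report, and the catalog shown is made up.

```python
from collections import defaultdict


def mean_prereqs_by_department(prereqs):
    """Average number of direct prerequisites per course, per department."""
    totals = defaultdict(lambda: [0, 0])     # dept -> [prereq count, courses]
    for course, deps in prereqs.items():
        dept = course.split("-")[0]
        totals[dept][0] += len(deps)
        totals[dept][1] += 1
    return {dept: count / n for dept, (count, n) in totals.items()}


# Made-up catalog: course -> list of direct prerequisites.
catalog = {
    "MAT-100": [],
    "MAT-101": ["MAT-100"],
    "MAT-201": ["MAT-100", "MAT-101"],
    "HIS-210": [],
}
print(mean_prereqs_by_department(catalog))  # {'MAT': 1.0, 'HIS': 0.0}
```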

Given the same number of majors, you could get by with fewer History classes per semester than Math classes. This depends also on actual requirements for the degree, of course, and not just on the information presented above.

Obviously if most students can’t take a particular course, it’s going to have lower enrollment. More than half of courses taught probably fall into that category. In many cases we can’t do anything about that unless we radically change the way we think about majors and disciplines.

In some cases we have introductory level courses segregated by the type of student. For example, Biology has separate courses for majors and non-majors, while English has a separate course for transfer students. This is another kind of prerequisite.

It would be a good thing for programs to periodically review their curricula from this “barrier to entry” perspective, in order to reduce unnecessary complications and perhaps to unwind an infinite loop or two.
