A second example – college course advising

For a more practical example, we implemented a textual interface for the TAROT course advising system (TAROT: A Course Advising System for the Future, J. Eckroth, R. Anderson, Journal of Computing Sciences in Colleges, 34(3), pp. 108-116, 2018). TAROT's domain logic is implemented in Prolog, like our Pokémon example. However, TAROT is not open source, so we included the Pokémon example to show a complete solution. Yet, the course advising example shows a few interesting variations that we detail in the following text.

TAROT itself runs an HTTP server (http://tarotdemo.artifice.cc:10333) that accepts a JSON list containing the name of the Prolog rule to execute followed by its arguments. Some of those arguments will be variables (starting with a capital letter), which causes TAROT to compute their values. The HTTP server returns all the values for these variables that meet the rule's constraints. These returned values are formatted as a JSON list of key-value pairs, where the keys are the variable names and the values are the variables' values.

For example, the following curl command, which can be run from a macOS or Linux terminal, executes TAROT's finishDegreeFromStudentId rule. This rule takes a student's ID number (for example, 800000000), their desired major, the number of semesters they have remaining, the starting year and semester (Fall or Spring), and any courses they do not need (for example, courses the student received permission from the instructor to skip). The rule computes values for the remaining arguments, including the courses the student has already taken (obtained by reading the student's information from a database), their multi-year schedule of courses as determined by TAROT, their current, minimum future, and maximum future grade point averages, and their course credit counts:

curl -X "POST" "http://tarotdemo.artifice.cc:10333/tarot" -H 'Content-Type: application/json; charset=utf-8' \
-d $'[
  "finishDegreeFromStudentId",
  "800000000",
  "Student",
  "ClassLevel",
  "Advisors",
  "Major",
  "[csci]",
  "4",
  "2018",
  "fall",
  "[csci141,fsem,jsem]",
  "Taken",
  "PlannedSemesters",
  "Gpa",
  "MinGpa",
  "MaxGpa",
  "CreditCount",
  "AllCreditCount",
  "PlannedCreditCount",
  "PlannedAllCreditCount"
]'
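
The same request can also be issued from Java. The following is a minimal sketch, assuming Java 11 or later for java.net.http; the TarotRequest class and its quote, buildBody, and post methods are hypothetical names introduced here for illustration and are not part of TAROT. It builds the JSON list (rule name first, then the arguments in order) and shows one way to POST it. The escaping only covers backslashes and double quotes, which is enough for simple values like the ones in this request.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; the name and methods are not part of TAROT.
class TarotRequest {

    // Wrap a value in JSON string quotes, escaping backslashes and quotes.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Build the JSON list: the rule name first, then its arguments in order.
    static String buildBody(String rule, List<String> args) {
        String items = args.stream()
                .map(TarotRequest::quote)
                .collect(Collectors.joining(","));
        return "[" + quote(rule) + (items.isEmpty() ? "" : "," + items) + "]";
    }

    // POST the body to the given URL and return the raw JSON response text.
    static String post(String url, String body) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) {
        // Capitalized arguments are Prolog variables that TAROT fills in.
        String body = buildBody("finishDegreeFromStudentId",
                List.of("800000000", "Student", "ClassLevel", "Advisors", "Major",
                        "[csci]", "4", "2018", "fall", "[csci141,fsem,jsem]",
                        "Taken", "PlannedSemesters", "Gpa", "MinGpa", "MaxGpa",
                        "CreditCount", "AllCreditCount", "PlannedCreditCount",
                        "PlannedAllCreditCount"));
        System.out.println(body);
        // To send it: post("http://tarotdemo.artifice.cc:10333/tarot", body)
    }
}
```

Keeping the body construction separate from the network call makes the request format easy to check without contacting the server.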

The data returned from the request includes all variable values, but the only value we will be interested in is the PlannedSemesters schedule. This value will be a string with a specific syntax that we will break apart with regular expressions in our Java code. The string contains a semester-by-semester schedule that shows all the classes the student should take to finish their degree.

Here is an example of the result of running the finishDegreeFromStudentId HTTP request (with newlines added for formatting):

[
  {
    "Advisors":"[(\"Smith\",\"Jane\")]",
    "AllCreditCount":"39.0",
    "ClassLevel":"senior",
    "CreditCount":"24.0",
    "Gpa":"3.333333333333333",
    "Major":"csci",
    "MaxGpa":"3.75",
    "MinGpa":"1.25",
    "PlannedAllCreditCount":"79.0",
    "PlannedCreditCount":"64.0",
    "PlannedSemesters":"[(2018,fall,[(csci311,_2148,4),
(csci321,_2304,4),(math142,_2460,4),(pcb141,_2616,4)]),
(2019,spring,[(csci301,_3516,4),(csci304,_3672,4),
(csci331,_3828,4),(pcb142,_3984,4)]),
(2019,fall,[(csci498,_4966,4),(free,x,x),(free,x,x),(free,x,x)]),
(2020,spring,[(csci499,_5342,4),(free,x,x),(free,x,x),(free,x,x)])
    ]",
    "Student":"\"Doe\",\"John\"",
    "Taken":"[(noyear,transfer,[(astr180,tr,3),(chem110,tr,4),
(csci111,tr,4)]),
(2016,fall,[(csci142,3.33,4),(csci211,3.67,4),(rels390,p,4)]),
(2017,spring,[(csci201,3.0,4),(csci221,3.67,4),(hlsc219,4.0,4),
(math141,2.33,4)])]"
  }
]

A value that begins with an underscore in PlannedSemesters (for example, _2148) is an unbound Prolog variable; here it indicates that the grade is unknown because the course has not been taken yet. Additionally, a course written as (free,x,x) means the course can be any general education requirement or a general elective.
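
To illustrate how such a string can be broken apart with regular expressions, here is a minimal sketch. It is not TAROT's or the chapter's actual parsing code; the PlannedSemestersParser class, its parse method, and both patterns are assumptions based only on the sample output shown above. It maps each planned semester to the list of course identifiers in that semester, ignoring the grade and credit fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the PlannedSemesters string format shown above.
class PlannedSemestersParser {

    // One semester: (year,term,[ ...course tuples... ])
    private static final Pattern SEMESTER =
            Pattern.compile("\\((\\d{4}),\\s*(fall|spring),\\s*\\[([^\\]]*)\\]\\)");

    // One course tuple: (course,grade,credits); only the course is captured.
    private static final Pattern COURSE =
            Pattern.compile("\\(([a-z0-9]+),[^,()]*,[^,()]*\\)");

    // Returns semesters in schedule order, keyed like "fall 2018".
    static Map<String, List<String>> parse(String planned) {
        Map<String, List<String>> schedule = new LinkedHashMap<>();
        Matcher semester = SEMESTER.matcher(planned);
        while (semester.find()) {
            String key = semester.group(2) + " " + semester.group(1);
            List<String> courses = new ArrayList<>();
            Matcher course = COURSE.matcher(semester.group(3));
            while (course.find()) {
                courses.add(course.group(1)); // "free" marks an open slot
            }
            schedule.put(key, courses);
        }
        return schedule;
    }

    public static void main(String[] args) {
        String planned = "[(2018,fall,[(csci311,_2148,4),(pcb141,_2616,4)]),"
                + "(2019,fall,[(csci498,_4966,4),(free,x,x),(free,x,x)])]";
        System.out.println(parse(planned));
    }
}
```

A LinkedHashMap is used so that the semesters come back in the order TAROT planned them, which is the order a response sentence would list them in.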

For Rasa training, we wrote a Chatito script that supports multiple variations of three types of queries: "What courses do I need next semester?", "What courses do I need to finish my degree?", and "What courses are the prerequisites for a specific course?":

%[schedule_single_semester]('training':'1000')
  ~[what] ~[classes?] ~[can_i] ~[take] next ~[semester?]?
  ~[what] ~[classes?] ~[must_i] ~[take] next ~[semester?]?
  ~[what] ~[classes?] ~[can_i] ~[take] this ~[semester]?
  ~[what] ~[classes?] ~[must_i] ~[take] this ~[semester]?

%[schedule_finish_degree]('training':'500')
  ~[what] ~[classes] ~[are] ~[left]?
  ~[what] ~[can_i] ~[take] to finish?
  ~[what] ~[can_i] ~[take] to graduate?
  ~[what] ~[are] my ~[4yr]?
  ~[what] ~[are] a ~[4yr]?

%[prereqs]('training':'1000')
  ~[what] ~[are] the ~[prereqs] ~[for] ~[course]?
  ~[what] ~[are] ~[course] ~[prereqs]?
  ~[what] ~[must_i] ~[take] ~[for] ~[course]?

~[what]
  what
  which
  show me
  find

~[classes]
  classes
  courses
  sections

... (etc.)

Because there are many possible courses (CSCI141, CSCI142, MATH340, and so on), we do not wish to list all of these courses in the Rasa training data. Instead, we extract the course from the query ourselves when we need to pass it to TAROT (only for the query that determines prerequisites). We take this approach instead of using Rasa's entity extraction capabilities because we cannot reasonably provide Rasa with all the training examples (all the courses) necessary for it to reliably extract the course as an entity.

But mentioning a course in a query is a good indicator that the query is about course prerequisites. Thus, we still want Rasa to learn about how courses are written. To do this, we use Rasa's regular expression matching feature. Chatito does not support regexes directly, so we need to create a JSON file of Rasa options that we provide to Chatito. This file contains the following information:

{
  "rasa_nlu_data": {
    "regex_features": [
      {
        "name": "course",
        "pattern": "[A-Za-z]{2,4}\\s*\\d{3}[A-Za-z]?"
      }
    ]
  }
}

This single regex tells Rasa what a course looks like (two to four letters, optional whitespace, and three digits, possibly followed by a letter for general education courses). We tell Chatito to include this configuration in its Rasa output:

npx chatito training/tarot-training.chatito \
--format=rasa --formatOptions=training/tarot-rasa-options.json

Rasa's support for regular expressions is sometimes misunderstood. The regular expression's name (course in the preceding example) does not relate to any intents or entities in the Rasa training examples; it's just a name for the regex. Furthermore, in most Rasa configurations, a regex does not yield an entity even if the regex does match the input. In other words, regexes in Rasa do not create intents or entities. They only help detect intents and entities. Whenever the regex matches one of the training examples, that training example records the fact that the regex matched. If the regex also matches the input string (the user's query), then Rasa looks for all examples, and their associated intents, that matched the regex as well. Thus, regular expressions in Rasa help identify the intent and let us avoid creating a separate training example for every variation of the course names.
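
Since the regex does not produce an entity, the course still has to be pulled out of the query separately before it can be passed to TAROT. The following is a minimal sketch of that step, not the chapter's actual code: the CourseExtractor class and its extract method are hypothetical, and the pattern simply reuses the course regex from the Rasa options with capture groups added. It returns the first course-like token in lowercase with any inner whitespace removed.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that extracts a course identifier from a user query.
class CourseExtractor {

    // Same shape as the Rasa regex feature: letters, optional space, digits.
    private static final Pattern COURSE =
            Pattern.compile("([A-Za-z]{2,4})\\s*(\\d{3}[A-Za-z]?)");

    // Returns the first match normalized like "csci211", or empty if none.
    static Optional<String> extract(String query) {
        Matcher m = COURSE.matcher(query);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of((m.group(1) + m.group(2)).toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(extract("tell me the prereqs for csci211."));
        System.out.println(extract("what are math 142's prereqs?"));
        System.out.println(extract("what's left?"));
    }
}
```

This sketch does not map spoken department names such as "comp sci" to their codes; a query like that would need an additional lookup step.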

The last aspect to address in the TAROT example is NLG. Again, we use SimpleNLG. Our use of SimpleNLG is similar to the Pokémon example, but there is one interesting case. When listing a course's prerequisites, we have a few different possible scenarios:

  • There are no prerequisites
  • There is one set of prerequisite courses (a single course or multiple)
  • There are multiple different prerequisites (each a single course or multiple); for example, CSCI211's prerequisites are either both CSCI141 and MATH125, or both CSCI141 and MATH141

For the first case, with no prerequisites, we just create a simple String that says "CSCI111 has no prerequisites." (where the course is whatever the user asked about). There is no benefit in using SimpleNLG to generate a simple statement that has no variation.

We handle the second and third cases with the same code. To do so, we use CoordinatedPhraseElement, with an or conjunction, for the different subsets of prerequisite courses. For each course in the subset, we use a new CoordinatedPhraseElement with an and conjunction (the default conjunction). If we ultimately only add a single course or a single subset to the respective CoordinatedPhraseElement, then SimpleNLG will simply not write the and or the or.
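
The nesting itself can be illustrated without the library. The sketch below is a plain-Java stand-in, not SimpleNLG and not the chapter's code: the NestedCoordination class and its join and describe methods are hypothetical, and the punctuation it produces may differ from what SimpleNLG's realiser generates. It only shows the idea of an outer "or" over subsets with an inner "and" within each subset, and that a single item is written without any conjunction.

```java
import java.util.List;
import java.util.stream.Collectors;

// Library-free illustration of nested coordination (not SimpleNLG itself).
class NestedCoordination {

    // Join items as "A", "A and B", or "A, B and C" using the given word.
    static String join(List<String> items, String conjunction) {
        if (items.isEmpty()) {
            return "";
        }
        if (items.size() == 1) {
            return items.get(0); // a single item needs no conjunction
        }
        String head = String.join(", ", items.subList(0, items.size() - 1));
        return head + " " + conjunction + " " + items.get(items.size() - 1);
    }

    // Outer "or" between subsets, inner "and" within each subset.
    static String describe(List<List<String>> subsets) {
        List<String> inner = subsets.stream()
                .map(subset -> join(subset, "and"))
                .collect(Collectors.toList());
        return join(inner, "or");
    }

    public static void main(String[] args) {
        System.out.println(describe(List.of(
                List.of("CSCI141", "MATH125"),
                List.of("CSCI141", "MATH141"))));
        System.out.println(describe(List.of(List.of("CSCI141"))));
    }
}
```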

Lastly, if there are multiple subsets, we use plurals in the sentence, and if there are more than three subsets, we stop after three and say how many more there are. At our university, for example, the required senior research course (CSCI498) has complex prerequisites that can be satisfied in many different ways: two 300+ level CSCI courses and another 300+ level CSCI or CINF course. The following method implements this logic:

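import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import simplenlg.features.Feature;
import simplenlg.framework.CoordinatedPhraseElement;
import simplenlg.framework.NLGFactory;
import simplenlg.phrasespec.NPPhraseSpec;
import simplenlg.phrasespec.PPPhraseSpec;
import simplenlg.phrasespec.SPhraseSpec;
import simplenlg.realiser.english.Realiser;

public static String respondPrereqs(NLGFactory nlgFactory, Realiser realiser, String course, List<String> prereqs) {

  // If no prereqs (nothing returned, or an empty list "[]"), return immediately
  if(prereqs.isEmpty() || prereqs.get(0).equals("[]")) {
    return course.toUpperCase() + " has no prerequisites.";
  }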

  SPhraseSpec p = nlgFactory.createClause();
  NPPhraseSpec subject = nlgFactory.createNounPhrase("the", "prerequisite");

  // if multiple prereq subsets, make sure "prerequisites"
  // is plural and the verb is "are" instead of "is"
  if(prereqs.size() > 1) {
    subject.setPlural(true);
    p.setPlural(true);
  }
  PPPhraseSpec prep = nlgFactory.createPrepositionPhrase("for", nlgFactory.createNounPhrase(course.toUpperCase()));

  // set the sentence subject to "the prerequisite",
  // and add a pre-modifier that says "for [course]",
  // resulting in "the prerequisite for [course]"
  p.setSubject(subject);
  p.addPreModifier(prep);
  p.setVerb("is");

  // build a disjunction ("or") between subsets of courses
  CoordinatedPhraseElement prereqOptions = new CoordinatedPhraseElement();
  prereqOptions.setFeature(Feature.CONJUNCTION, "or");

  // for each course mentioned in the subset
  // (extract with regexes due to how TAROT returns the data);
  // allow two to four letters so that codes such as pcb141 also match
  Pattern coursePattern = Pattern.compile("([a-z]{2,4}[0-9]{3})");

  // show at most 3 subsets
  for(int i = 0; i < Math.min(prereqs.size(), 3); i++) {
    String pr = prereqs.get(i);

    // start a conjunction ("and") for the course list
    CoordinatedPhraseElement prereqsConj = new CoordinatedPhraseElement();

    // extract each course
    Matcher prMatcher = coursePattern.matcher(pr);
    int count = 0;
    while(prMatcher.find()) {
      prereqsConj.addCoordinate(prMatcher.group(1).toUpperCase());
      count++;
    }

    // if we have multiple courses, say "all of ..."
    // or "both ..."
    if(count > 2) {
      prereqsConj.addPreModifier("all of");
    } else if(count == 2) {
      prereqsConj.addPreModifier("both");
    }
    prereqOptions.addCoordinate(prereqsConj);
  }

  // if we had lots of prereq subsets, just summarize the rest
  if(prereqs.size() > 3) {
    prereqOptions.addCoordinate(nlgFactory.createNounPhrase((prereqs.size() - 3) + " more options"));
  }

  // the object of the sentence is the list of courses
  p.setObject(prereqOptions);
  return realiser.realiseSentence(p);
}

Some examples of the output of our TAROT example can be seen in the following table:

| Query | Response |
|---|---|
| what classes should I take next? | In Fall 2018, you should take CSCI311, CSCI321, MATH142, and PCB141. |
| what do I need to take this semester? | [Same response as previous.] |
| what do I take next? | [Same response as previous.] |
| what's left? | In Fall 2018, you should take CSCI311, CSCI321, MATH142, and PCB141. In Spring 2019, you should take PCB142, CSCI304, CSCI301, and CSCI331. In Fall 2019, you should take CSCI498 and 3 general eds. In Spring 2020, you should take CSCI499 and 3 general eds. |
| what must I take to graduate? | [Same response as previous.] |
| what's left in my 4-year plan? | [Same response as previous.] |
| tell me the prereqs for csci211. | The prerequisites for CSCI211 are both CSCI141 and MATH125 or both CSCI141 and MATH141. |
| What must I take for comp sci 142? | The prerequisite for CSCI142 is CSCI141. |
| what are math 142's prereqs? | The prerequisite for MATH142 is MATH141. |
| what's required to take math 401? | The prerequisite for MATH401 is all of MATH211, MATH221, and MATH243. |
| What's required for computer science 498? | The prerequisites for CSCI498 are all of CINF490, CSCI485, and CSCI490, all of CINF490, CSCI471 and CSCI490, all of CINF490, CSCI471 and CSCI485 or 1765 more options. |
