It should be evident that the quality of the output depends directly on the quality of the rule set developed for the content realizer. In one sense, the control mechanism for the rules has very little to do with output quality: given a sufficiently insightful and complete set of rules, any control mechanism should be capable of producing quality output. On the other hand, the control mechanism must be able to handle the set of rules it is given. If the most insightful rules refer to data that the control mechanism cannot obtain, the achievable quality will suffer severely. For instance, if decisions made by parent nodes cannot be made dependent on the content of their children, then serious restrictions are placed on the set of rules that can be developed. This bears on the ``completeness'' of the system. Completeness refers to the ability of the planner to solve all varieties of problems with no inherent limitations.
Completeness
A central aim of this project was to create a complete planner. One of the drawbacks inherent in a text generator like Penman is the inability of the modules to communicate with each other. As stated in the problem statement, the main characteristic of text generation driving this project is the interaction of choices available at different levels. This planner specifically allows for these interactions. There are, however, two classes of constraints that this system cannot at present address (see ``further research''). First, it cannot use constraints that arise from the combination of surface features. Thus a constraint such as ``if the clause is already 30 words long, make a sentence boundary'' cannot be used. Related to this is the limitation on using discourse constraints such as focus, which relies critically on surface ordering. There are two ways these limitations could be overcome. First, a post-analyzer could simply take the output of the present system, build the structures up, and apply the surface and discourse constraints. This would have the advantage of not requiring intermediate structures to be built during the planning process, but it has the liability that plans that could have been pruned early by surface constraints are instead carried along to the end. The second option would be to carry along during planning the most limited set of surface information necessary. For instance, keeping an explicit record of the clause ordering would enable focus tracking. This approach would require that more information be calculated and carried along during the planning process, but would enable a whole different class of constraints to be applied during, and not after, the main planning.
I would tend to opt for the first approach because there seem to be very few decisions that must be made on the basis of surface-oriented information. Generating pronouns would seem to be an exception, except that I have found that the pronoun generation subsystem I developed works fine as a post-processor to the main planning. The only area that is severely impacted is the planning of sentence boundaries. It is difficult to know where and when to break sentences without knowing the surface ordering of the clauses and how long each of them is. As for the length (in words) of the sentence, this can be estimated from the number of concepts involved (which this system can already do). My suggestion, then, is to carry along only a representation of surface ordering during the planning process, leaving the remaining surface-oriented decisions to post-processing. This would minimize the overhead while addressing the main problem.
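A minimal sketch of this suggestion, under stated assumptions, is given below. It is not the planner's actual implementation: the `Clause` record, the function names, and the words-per-concept factor are hypothetical, and the 30-word limit simply echoes the example constraint mentioned earlier. The sketch shows only that an explicit clause ordering plus a concept count per clause is enough information to place sentence boundaries.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical values, chosen for illustration only.
WORDS_PER_CONCEPT = 3      # assumed average number of words realized per concept
MAX_SENTENCE_WORDS = 30    # echoes the "30 words long" example constraint


@dataclass
class Clause:
    label: str          # identifier for the planned clause
    concept_count: int  # number of concepts the clause expresses


def estimate_words(clause: Clause) -> int:
    """Estimate surface length from the concept count (no surface form needed)."""
    return clause.concept_count * WORDS_PER_CONCEPT


def plan_sentence_boundaries(ordered_clauses: List[Clause]) -> List[List[str]]:
    """Group clauses, in their recorded surface order, into sentences.

    A boundary is placed before a clause whenever adding it would push the
    estimated length of the current sentence past MAX_SENTENCE_WORDS.
    """
    sentences: List[List[str]] = []
    current: List[str] = []
    current_words = 0
    for clause in ordered_clauses:
        words = estimate_words(clause)
        if current and current_words + words > MAX_SENTENCE_WORDS:
            sentences.append(current)
            current, current_words = [], 0
        current.append(clause.label)
        current_words += words
    if current:
        sentences.append(current)
    return sentences


ordering = [Clause("a", 4), Clause("b", 5), Clause("c", 3), Clause("d", 8)]
print(plan_sentence_boundaries(ordering))
```

The only surface information consulted is the order of the clauses; the length is an estimate derived from planning-level data, which is why the overhead of carrying it along should remain small.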
Quality
Determining the actual quality of the rule set is also problematic. Two factors contribute to quality here: insightfulness and completeness. One can write a set of rules that is very thorough but not particularly insightful, with the result that the output will be poor and the processing time large. One can also tackle a few problems insightfully but lack completeness, with the result that the overall output will still be poor. Obviously, it is extremely difficult and time-consuming to write a complete set of rules for a given language. Perhaps less obviously, it is extremely difficult to write insightful rules, even for a small fragment of a language. Entire PhD dissertations have been produced covering small aspects of a language. Describing a language is an extremely hard thing to do well! And it must be noted that the relation between the size of the fragment described and the time it takes to describe it is not linear. As the number of features included grows, the number of possible interactions between features grows exponentially. And, quite probably, it is taking those interactions into account that will produce the highest quality output.
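To make this growth concrete, the following sketch counts potential interactions among n features. It assumes, purely for illustration, that any group of two or more features may interact; the discussion above does not commit to a particular way of counting, and the function names are hypothetical.

```python
from math import comb


def pairwise_interactions(n: int) -> int:
    """Number of unordered feature pairs, n choose 2 (grows quadratically)."""
    return comb(n, 2)


def all_interactions(n: int) -> int:
    """Number of feature subsets of size two or more: 2**n - n - 1.

    Of the 2**n subsets overall, the empty set and the n single-feature
    subsets are excluded. This counting scheme is an illustrative assumption.
    """
    return 2**n - n - 1


# Print n, the pairwise count, and the count of all groups of size >= 2.
for n in (5, 10, 20):
    print(n, pairwise_interactions(n), all_interactions(n))
```

Under this assumption, ten features give 45 pairs but 1013 possible groups, and twenty features give 190 pairs but over a million groups, which is the sense in which the descriptive effort is far from linear in the size of the fragment.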
To summarize this section, the control mechanism should be judged by its ability to handle various types of rules. It should be determined what types of information might be helpful in language generation, and then demonstrated that the control mechanism can find and process that information. This system is complete with respect to the largest class of constraints necessary for text generation, and, with some extensions, could be complete for all but the most unusual of surface-related constraints. As for the quality of the rule set, unless a major effort is undertaken, ``quality'' general output is, practically speaking, unattainable. The approach taken here is to confine the semantic inputs to a very limited domain. Taking this restriction into account, and appreciating the difficulty of producing a rule set for even limited domains, the output quality of this system is excellent.