A quick tutorial on ontology building

Background Readings:

1. World modeling for NLP. Carlson and Nirenburg. CMU-CMT-90-121.

2. The lexicon in the scheme of KBMT things. Onyshkevych and Nirenburg. MCCS-94-277.

How to enter the Mikrokarat Tool:

1. cd ~mikro/karat

2. Type in ukarat ogygia. For the present, I am running the server on ogygia; there will soon be a new machine for this. If the server is not running on ogygia, you will get a warning message, but you will still be able to use the tool for present purposes.

3. Type in your user name and password.

4. Left click (i.e., click the left mouse button) on CURRENT DATABASE. This will show a menu of database names.

5. Choose the appropriate database by clicking on it and then clicking on OK. For example, select lori-trial@ or dan-trial@. These databases have been set up for your experimentation for now, with the proper configuration files created using the mksetup program. Please remember, if you are looking at other databases, that any changes you make inadvertently may get stored permanently in those databases!

6. Click on Browser. It will ask you whether the root displayed should be the node ALL. Either click OK or edit the root to some other node (e.g., illocutionary act).

7. You will now see a part of the ontology displayed in the browser window. When a node has more children than the maximum set for display purposes, a dark triangle is shown after the displayed child nodes to indicate that more children are present. To see these hidden children, click on this triangle. I have set up your databases so that the display depth is 3 and the number of children shown is 5. This ensures that the tool responds fairly quickly whenever it redraws the graph. In my experience, setting the depth to any value greater than 4 slows down the system drastically.

8. To go to a different part of the ontology, type in the root node name in the root window in the top left corner of the browser and then click on DISPLAY TREE.

8. To go up in the hierarchy, left click on the root. The tree rooted at its parent node will be displayed. If the node has multiple parents, you will be asked in a separate pop-up window to select which parent you want to go up to.

9. To go down, click (by default, click means click the left mouse button) on the Children button and then click on any leaf node. Any children of that leaf node will be added to the current display of the tree. If you simply left click on any node other than the root, it will display the tree rooted at that node.

10. You can also look up the ancestors of a node all the way to ALL by clicking on the Ancestors button and then on the node.

11. To add a new node to the ontology, click on ADD LINK and then on the parent node to which you want to add a child. When it asks for the child's name, enter the name of the new node you want to create. If the child node already exists, an IS-A link (and its inverse, the SUBCLASSES link) will be added between the two nodes. If not, you will be asked whether you want a new frame created.

12. To delete a link, click on DELETE LINK and then on the link (not the node) you want to delete. If you delete all links to a node, it will ask you whether you want to delete the frame. Say yes to delete the frame from the database. If you want to delete a link and reattach the child frame to a different parent, you must first add the link to the new parent and then delete the link to the old one.

13. Each node is a frame. To edit a frame, right click on the node. In the frame editor, you will see the definition string displayed on top and all the slots and their VALUE facets shown below the definition. If you want to see other facets (such as SEM), you may click on Show Facets.

You may also enter the frame editor directly from the MikroKARAT window by clicking on Edit and selecting Frame. You will then be asked for the name of the frame you want to edit. If you select Ontology instead of Frame under the Edit button, the tool will crash. Remember, however, that it is not a good idea (yet) to enter the frame editor directly. For instance, the current tool will not display the definition string at the top when you enter it this way.

14. In the frame editor, you can add a slot by clicking on ADD SLOT and then typing in the slot name.

15. To add a facet to a slot, right click on the slot and choose Add Facet. Then type in the facet name.

16. To add a filler to a facet, right click on the facet and choose Add Filler. Type the filler in.

17. Do not try Delete Filler. To delete a filler, just Backspace it out!

18. You may also delete facets and slots. However, sometimes, you may have to exit the tool and reenter to see the changes you made. (I am not sure this has been rectified in the new version of the tool.)

19. To save and quit the frame editor, click on SAVE, then on QUIT EDITOR, and say Yes if it asks whether to save.

20. You must enter a different frame editor every time you want to edit a frame. You cannot switch from one frame to another in the same editor.

21. Do not QUIT the browser often. The tool crashes when you quit the browser.

Occasionally, the tool crashes due to bus errors or other "irrecoverable FramePaC errors" and so on. Bear with us while the tool is being fixed and improved.
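
The link operations in steps 11 and 12 can be pictured with a small Python sketch. It is a toy model under stated assumptions, not the MikroKARAT or FramePaC API: frames are plain dictionaries, and the helper names new_frame, add_link, and delete_link are invented for illustration. Where the real tool asks for confirmation, the sketch simply proceeds.

```python
# Toy model only: this is not the MikroKARAT or FramePaC API.
# A frame is a dict mapping slot name -> facet name -> list of fillers.

def new_frame():
    """Create an empty frame with the two hierarchy slots."""
    return {"IS-A": {"VALUE": []}, "SUBCLASSES": {"VALUE": []}}

def add_link(ontology, parent, child):
    """Add an IS-A link from child to parent plus the inverse SUBCLASSES link."""
    parent, child = parent.upper(), child.upper()  # names are stored in upper case
    if child not in ontology:
        ontology[child] = new_frame()  # the real tool asks before creating a frame
    if parent not in ontology[child]["IS-A"]["VALUE"]:
        ontology[child]["IS-A"]["VALUE"].append(parent)
    if child not in ontology[parent]["SUBCLASSES"]["VALUE"]:
        ontology[parent]["SUBCLASSES"]["VALUE"].append(child)

def delete_link(ontology, parent, child):
    """Remove both directions of a link; drop the child frame if no parent is left."""
    parent, child = parent.upper(), child.upper()
    ontology[child]["IS-A"]["VALUE"].remove(parent)
    ontology[parent]["SUBCLASSES"]["VALUE"].remove(child)
    if not ontology[child]["IS-A"]["VALUE"]:
        del ontology[child]  # the real tool asks for confirmation here

# Start from a single root frame for this sketch.
ontology = {"ILLOCUTIONARY-ACT": new_frame()}
add_link(ontology, "Illocutionary-act", "Expressive-act")
add_link(ontology, "Illocutionary-act", "thank")

# Reattach THANK under EXPRESSIVE-ACT: add the new link first, then delete the old one.
add_link(ontology, "Expressive-act", "thank")
delete_link(ontology, "Illocutionary-act", "thank")

print(ontology["THANK"]["IS-A"]["VALUE"])  # ['EXPRESSIVE-ACT']
```

In this model, doing the two reattachment calls in the opposite order would leave THANK with no parent after the delete, and the frame would be dropped before the new link could be added. That is the reason for the add-then-delete order in step 12.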

More on frame editing:

22. Every frame must have a definition slot whose value facet has a filler that is a string (i.e., enclosed in double quotes) which is a definition of the concept in that frame. This is intended for the human browser who is searching for concepts in the ontology. The semantic analyzer does not look at this value. If the concept is listed in the Mikrokosmos glossary, enter the string in the glossary as the definition. Otherwise, enter an intuitive or dictionary description of the concept as its definition string.

You cannot edit the definition string that is displayed at the top of the frame editor. Instead, click on Show Facets and edit the same string where it appears in the VALUE facet of the DEFINITION slot.

23. IS-A and SUBCLASSES slots are automatically filled when you do an ADD LINK in the browser. It is not a good idea to edit them manually. If you must, always make sure that the links are edited properly in both directions, so that there is no inconsistency between the SUBCLASSES slots of parent concepts and the IS-A slots of child concepts. Also, the tree display may be corrupted when you reorganize the hierarchy by editing these slots manually. Clicking on DISPLAY TREE should restore a correct display. In the worst case, you might have to quit the tool and reenter to see a correct display.

24. For all other slots, you will typically be filling in the SEM facets with names of other concepts. Each such filler acts as a selectional constraint on the (VALUE) fillers of that slot.

25. As a rule, do not enter arbitrary strings in slots. If you notice that any concept already has an arbitrary string or some other strange symbol in it, please make a note of it and we will correct it.
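
The rules in steps 22 and 23 lend themselves to a mechanical check. The sketch below is a toy illustration, assuming frames are available as nested Python dictionaries (slot, then facet, then a list of fillers). That layout and the function name check_ontology are assumptions made for this example; they are not the tool's storage format or API. The definition strings are placeholders.

```python
# Toy check only: frames are plain dicts, not MikroKARAT's real storage format.

def check_ontology(ontology):
    """Return a list of human-readable problems found in the frames."""
    problems = []
    for name, frame in ontology.items():
        # Step 22: every frame needs a DEFINITION slot whose VALUE is a string.
        definition = frame.get("DEFINITION", {}).get("VALUE", [])
        if not (len(definition) == 1 and isinstance(definition[0], str) and definition[0]):
            problems.append(f"{name}: missing DEFINITION string")
        # Step 23: every IS-A filler needs the inverse SUBCLASSES filler ...
        for parent in frame.get("IS-A", {}).get("VALUE", []):
            children = ontology.get(parent, {}).get("SUBCLASSES", {}).get("VALUE", [])
            if name not in children:
                problems.append(f"{name}: IS-A {parent} has no inverse SUBCLASSES link")
        # ... and every SUBCLASSES filler needs the inverse IS-A filler.
        for child in frame.get("SUBCLASSES", {}).get("VALUE", []):
            parents = ontology.get(child, {}).get("IS-A", {}).get("VALUE", [])
            if name not in parents:
                problems.append(f"{name}: SUBCLASSES {child} has no inverse IS-A link")
    return problems

ontology = {
    "ILLOCUTIONARY-ACT": {
        "DEFINITION": {"VALUE": ["A placeholder definition."]},
        "SUBCLASSES": {"VALUE": ["EXPRESSIVE-ACT"]},
    },
    "EXPRESSIVE-ACT": {
        "DEFINITION": {"VALUE": ["A placeholder definition."]},
        "IS-A": {"VALUE": ["ILLOCUTIONARY-ACT"]},
        "SUBCLASSES": {"VALUE": ["THANK"]},
    },
    # THANK was edited by hand: its IS-A filler and its definition are missing.
    "THANK": {"IS-A": {"VALUE": []}},
}

problems = check_ontology(ontology)
for problem in problems:
    print(problem)
```

With the sample data, the check reports the one-directional link from EXPRESSIVE-ACT to THANK and the missing definition on THANK. These are the kinds of inconsistency that manual edits to IS-A and SUBCLASSES can introduce.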

Building an ontology involves lots of communication among the builders in the ontology team and also with other groups, especially the lexicon building group. Whenever in doubt, ask your colleagues or note things in the file ~mikro/onto/ontology.questions. We might have to resort to lots of discussions, both in meetings and informally, to build a consistent ontology in a team. Thank you for your cooperation!

A Walkthrough Example:

NOTE: In writing this tutorial, I might have already added the concept that I am describing how to add below. Feel free to try adding a different concept on the same lines as outlined below.

I invoke ukarat and select the new-trial database (or lori-trial or dan-trial as the case may be; they are all the same at present). I enter the Browser and ask for Illocutionary-act as the root. In the tree displayed I want to see if there are any children of Expressive-Act. To see this, I click on the Children button and then on Expressive-Act. There is no change and hence no children. I would like to add a child node that corresponds to a thanking event. But, first, I would like to see the definition and other details of the Expressive-act frame to be sure that this is the right parent for my thank frame. I right-click on this frame to enter the frame editor.

In the frame editor, I just see the definition string at the top and a list of slot names and their VALUE facets. To see their other facets and fillers, I click on the Show Facets button. Now I see facet names such as SEM. The slots I see do not have any fillers. I scroll down a couple of times until I see the DEFINITION and IS-A slots which do have fillers. I see that the IS-A of Expressive-act is Illocutionary-act. This is why the browser is showing an arrow from Illocutionary-act to Expressive-act. Now, having read the definition string of this frame and having looked at its other slots, I am convinced that an event of thanking someone belongs as a child of Expressive-act.

I do not need to edit the Expressive-Act frame, so I click on Quit Editor. Since I did not make any changes, I won't be asked whether to save them. If I had made any changes inadvertently, I would have been asked whether the changes must be saved and I would have said NO.

Now, I turn to the browser again and get to the task of adding Thank as a child of Expressive-Act. I click on Add Link and then on Expressive-act. A window pops up asking me for the child's name. I type in the child's name as thank. All names are converted internally to upper case and it does not matter whether you type in upper or lower case. This popup window is also showing that the link that will be added is an IS-A link. This is what we want in this case and hence we will not change it to a Part-Of link. After I type in the child's name, I click on OK. The tool asks me whether to create a frame. I say Yes and the tool adds a link to a new node called Thank. This new node may be shown at a corner of the window with the link to it at an odd angle. It is also possible that there is no arrowhead to the link to Thank. To see a fresh display, I click on Display Tree (a button on the top left corner of the browser window). I get what I want.

Now it is time to enter the Thank frame and edit it. For example, we must add a definition to it. I right click on the Thank frame and enter the frame editor. I again click on Show Facets to see the facets. I notice that the tool has already added an IS-A filler to link this frame to Expressive-Act. It will have added a corresponding inverse link, a SUBCLASSES filler, from Expressive-Act to Thank.

Now, I would like to add a definition to Thank. If the database has been configured properly, there may already be a definition slot added to this frame. If not, I click on Add Slot. When asked for the name of the slot, I type in definition and say OK. The slot is added at the end of the list of slots. I scroll down to see it and notice that it already has the default SEM facet. I need to add a value to this slot. (By the way, we add VALUE fillers only to definition slots. All other slots get SEM fillers in the ontology and get VALUE fillers only in the instances which are in the onomasticon or are instantiated in the TMRs built by the analyzer.)

To add a value facet to Definition, I right click on it and select Add Facet. When asked for the facet name, I type in Value and say Ok. The facet is added at the bottom and I might have to scroll down to see it. To add a filler to the value facet, I right click on the value facet and select Add filler. This opens a new window to hold the filler. I left click on this window to activate the cursor in it. Now I can simply type in the value to be added. I do this by starting and ending with a double quote since this value must be a string. Let us say the string for Thank is "An expressive act of thanking someone."

We must also add an Agent and a Theme slot for Thank and set their SEM facets to Human. That is, by filling in Human in the SEM facets of Agent and Theme, we are constraining the agents and themes of Thank events to humans. This can be done in the same way as we added the definition string above. The only difference is this time we do not enclose the filler Human in quotes since it is the name of a concept and not a string.

I now save and quit the frame editor and I am done adding the Thank concept to the ontology.
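
As a summary of the walkthrough, the finished Thank frame can be written out as data, together with one plausible reading of how a SEM filler constrains slot fillers. This is a sketch under stated assumptions: the nested-dictionary layout and the helpers is_a and allowed are invented for illustration, and the REPORTER and COMPANY frames (including the placement of REPORTER under HUMAN) are hypothetical. The source does not specify how the analyzer actually applies SEM constraints.

```python
# Toy model only: frames are plain dicts and the helper names are hypothetical.
ontology = {
    "HUMAN": {},
    "REPORTER": {"IS-A": {"VALUE": ["HUMAN"]}},  # assumed placement, for illustration
    "COMPANY": {},                               # assumed not to be a kind of HUMAN
    "EXPRESSIVE-ACT": {"SUBCLASSES": {"VALUE": ["THANK"]}},
    "THANK": {
        # The definition is a string, so it is quoted and stored in VALUE.
        "DEFINITION": {"VALUE": ["An expressive act of thanking someone."]},
        "IS-A": {"VALUE": ["EXPRESSIVE-ACT"]},
        # Concept names are not quoted in the tool and go into SEM, not VALUE.
        "AGENT": {"SEM": ["HUMAN"]},
        "THEME": {"SEM": ["HUMAN"]},
    },
}

def is_a(ontology, concept, ancestor):
    """True if concept equals ancestor or reaches it by following IS-A links."""
    if concept == ancestor:
        return True
    parents = ontology.get(concept, {}).get("IS-A", {}).get("VALUE", [])
    return any(is_a(ontology, parent, ancestor) for parent in parents)

def allowed(ontology, frame_name, slot, candidate):
    """True if candidate satisfies at least one SEM constraint on the slot."""
    constraints = ontology[frame_name].get(slot, {}).get("SEM", [])
    return any(is_a(ontology, candidate, constraint) for constraint in constraints)

print(allowed(ontology, "THANK", "AGENT", "REPORTER"))  # True
print(allowed(ontology, "THANK", "AGENT", "COMPANY"))   # False
```

Under these assumptions, a REPORTER can be the agent of a Thank event because it is a kind of HUMAN, while a COMPANY cannot.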

Note that the databases you have been given (lori-trial and dan-trial) may have nodes and links that you think are illogical or plain unwanted. Feel free to clean up the databases by deleting, reorganizing, or otherwise editing these questionable parts. However, do not spend much time cleaning up these databases: they are for learning purposes only, and there is no way to merge the results with other "real" databases. We will be better off cleaning up the real databases after we have learned how to do things by playing with the trial databases.

More Help:

For further help or if you have any questions, contact Mahesh, mahesh@crl.nmsu.edu. Some documentation on the tool is available in ~mikro/karat/doc. Many html help files for individual buttons and features are available in ~mikro/karat/help. There is also on-line help in the tool (which uses the above html files).

Exercise

Shown below is a list of concepts from the Docteur Andreu text. Please add these concepts appropriately to the trial database. If you decide some of these should not be concepts in the ontology, tell us why.

Concept list:

 say
 create
 own
 advise
 engage
 inform
 acquire
 company
 Spain
 transact
 amount
 stock
 consult
 reporter
 expert
 estimate
 sign
 Madrid
 agree
 continue
 produce
 sell
 product
 know
 manufacture
 pill
 heal 
 penetrate
 area
 cardiology
 rheumatology
 publish
 develop
 medicine
 vitamin
 chemical
 country
 employee
 drug
 veterinary
 earn
 increase 
 invest
 research
 develop 
 profit
 flow
 pinpoint
 diagnostic

Feel free to merge several of the concepts above, rename them, and so on. The above is actually a word list, not a concept list. It would have been nice if we already had a lexicon for these words. In the absence of that, assume the word sense that is appropriate for the Docteur Andreu text.

Enjoy!
