MRC File Primer
What is "MRC"?
MRC stands for "Multiple Rows per Concept". MRC is a more human-readable format for specifying the contents of a TBX-Basic terminology file. Instead of using XML, with all of its tags and escape sequences, it uses rows and columns to specify terminological data.
TBX-Basic is, however, the more commonly used format in interchange. For this reason, we provide a utility for converting MRC files into TBX-Basic files.
What follows is a very short tutorial to get you started using the MRC format for your terminology files. A more thorough tutorial can be found here. Also, here is a sample MRC file to help get you started.
An MRC file is a plain text file, easily edited with a simple text editor such as Notepad. Each row (line) represents one piece of information. Let's start an MRC file. The first three lines of an MRC file are as follows:
=MRCtermTable
A	workingLanguage	en
A	sourceDesc	The description of where this data comes from
You can ignore the first row (though it must be present). The second row specifies the working language of the document using a standard abbreviation ("en" for English). The third row gives a description of the source of the data, and should be edited accordingly.
Most rows in an MRC file have three columns, separated by tab characters. The first column is an identifier which helps to organize the information hierarchically and to uniquely identify a specific row. The "A" used in the second and third rows above marks a row as a header, applying to All the information in the document. The second column specifies the type of information contained in the third column.
When you start specifying actual terminological information, you will need to use longer IDs. For concepts in your term base, the IDs should start with "C", followed by some number. If a given row specifies a word or term, its ID will also need a language abbreviation and another number. Let's look at an example:
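The three-column layout described above can be read with a few lines of Python. This is only a minimal sketch: the helper name `read_mrc_rows` is made up for illustration and is not part of the conversion utility, and it assumes that the "=MRCtermTable" marker is the only row without three columns.

```python
def read_mrc_rows(text):
    """Return (row_id, data_type, value) tuples for the three-column rows in text."""
    rows = []
    for line in text.splitlines():
        # Skip blank lines and the one-column "=MRCtermTable" marker row.
        if not line.strip() or line.startswith("=MRCtermTable"):
            continue
        # Split on the first two tabs only; anything after them belongs to the value.
        columns = line.split("\t", 2)
        if len(columns) != 3:
            raise ValueError(f"expected 3 tab-separated columns: {line!r}")
        rows.append(tuple(columns))
    return rows


sample = (
    "=MRCtermTable\n"
    "A\tworkingLanguage\ten\n"
    "A\tsourceDesc\tThe description of where this data comes from\n"
)
for row_id, data_type, value in read_mrc_rows(sample):
    print(row_id, data_type, value, sep=" | ")
```

Splitting on only the first two tabs is a design choice of this sketch, so that a stray tab inside the third column does not produce extra columns.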
C002	subjectField	Restaurant Menus
C002fr1	term	pois chiches
C002fr1	partOfSpeech	noun
C002en1	term	garbanzo beans
C002en1	partOfSpeech	noun
C002en2	term	chick peas
C002en2	partOfSpeech	noun
The first line specifies the subject field to which the concept belongs. Notice the use of an ID beginning with C (for "Concept") and followed by a number. Notice also that the rest of the rows begin with the same sequence of characters in their IDs.
The rest of the rows have a language abbreviation in their ID, indicating that the data applies to a specific language. They also have numbers after the abbreviation so that separate terms which point to the same concept and belong to the same language may be distinguished. Here, the French term "pois chiches" can be rendered as either "garbanzo beans" or "chick peas" in English.
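To make the ID structure concrete, here is a small sketch that breaks an ID into its parts. The function `parse_id` and the regular expression are illustrative assumptions based only on the IDs shown in this tutorial ("A", "C002", "C002fr1"); the full specification may allow forms that this pattern rejects.

```python
import re

# "C", a concept number, then optionally a language abbreviation and a term number.
ID_PATTERN = re.compile(r"^C(\d+)(?:([A-Za-z]+)(\d+))?$")


def parse_id(row_id):
    """Describe an MRC row ID as a dict (hypothetical helper for illustration)."""
    if row_id == "A":
        return {"level": "header"}
    match = ID_PATTERN.match(row_id)
    if match is None:
        raise ValueError(f"unrecognized ID: {row_id!r}")
    concept, lang, term = match.groups()
    if lang is None:
        # No language part: the row describes the concept as a whole.
        return {"level": "concept", "concept": concept}
    return {"level": "term", "concept": concept, "lang": lang, "term": term}


for row_id in ["A", "C002", "C002fr1", "C002en2"]:
    print(row_id, parse_id(row_id))
```

Rows whose IDs share the same concept number belong to the same concept, which is how "C002fr1", "C002en1", and "C002en2" are tied together.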
By the way, you can also choose to specify the subject field for your entire term base through the use of a header at the top of your document:
A	subjectField	Restaurant Menus
MRC supports many more kinds of information: notes, term types, other grammatical categories, and links between entries. Remember, however, that the types of information that are allowed in an MRC file are limited by the specification. You can't, for example, add the following row:
A	purpose	To save the company money by keeping terms together
The reason you can't add a "purpose" row is that there is no "purpose" data type in the MRC specification. Refer to the documentation linked at the beginning of this tutorial for a list of allowed types and how to use them.
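A quick check for unsupported data types might look like the sketch below. Note the assumption: `KNOWN_TYPES` lists only the types that appear in this tutorial, not the full set allowed by the MRC specification, so it should be extended from the documentation before real use. The function name `unknown_types` is likewise made up for this example.

```python
# Assumption: only the data types mentioned in this tutorial; the specification allows more.
KNOWN_TYPES = {
    "workingLanguage",
    "sourceDesc",
    "subjectField",
    "term",
    "partOfSpeech",
}


def unknown_types(rows):
    """Return (row_id, data_type) pairs whose data type is not in KNOWN_TYPES."""
    return [
        (row_id, data_type)
        for row_id, data_type, _value in rows
        if data_type not in KNOWN_TYPES
    ]


rows = [
    ("A", "subjectField", "Restaurant Menus"),
    ("A", "purpose", "To save the company money by keeping terms together"),
]
for row_id, data_type in unknown_types(rows):
    print(f"row {row_id}: unsupported data type {data_type!r}")
```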
Using Excel to Create an MRC File
Using Excel, or any other spreadsheet program, makes it easier to work with MRC files. Rows and columns are visually distinguishable, and it is easy to sort rows by any column (note that our MRC-TBX conversion script requires MRC rows to be sorted by ID). Try opening Excel and pasting in some of the data from this tutorial. Excel automatically splits text on tab characters, so copying and pasting
C002en2	partOfSpeech	noun
into Excel will result in three separate columns. Once you have finished filling in the terminological data, you must save the Excel file as a tab-delimited text file. Detailed directions, pictures and all, can be found on HowToGeek.
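If you prefer a script to a spreadsheet, the sort-by-ID and tab-delimited steps can be sketched in Python as follows. The exact ordering the conversion script expects is not spelled out in this tutorial, so the plain string sort used here is an assumption, and `to_mrc_text` is a hypothetical helper rather than part of any provided tool.

```python
def to_mrc_text(rows):
    """Build tab-delimited MRC text from (row_id, data_type, value) tuples."""
    # Assumption: a plain string sort on the ID column is an acceptable order.
    # sorted() is stable, so rows sharing an ID keep their original relative order.
    ordered = sorted(rows, key=lambda row: row[0])
    lines = ["=MRCtermTable"]  # the marker row must come first
    lines.extend("\t".join(row) for row in ordered)
    return "\n".join(lines) + "\n"


rows = [
    ("C002en1", "term", "garbanzo beans"),
    ("A", "workingLanguage", "en"),
    ("C002", "subjectField", "Restaurant Menus"),
    ("C002en1", "partOfSpeech", "noun"),
]
print(to_mrc_text(rows), end="")
```

Keep in mind that a string sort places "C10" before "C2", so zero-padded concept numbers such as "C002" are the safer choice if you rely on this approach.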
One thing you need to be careful of is that Excel interprets any cell whose content starts with "=" as a formula. This will clobber the first line of the file,
=MRCtermTable. To counteract this, add a single quote (') before the "=" character (so that it is