eHRAF Tutorial

 

 

eHRAF Tutorial: Text Search
by Christiane Cunnar,

Human Relations Area Files (HRAF) at Yale University

 


What to expect when searching in eHRAF
When searching the eHRAF Collection of Ethnography database, you must keep in mind the ethnographic nature of this database.  According to the Encyclopædia Britannica Online, "ethnography" is the:

"descriptive study of a particular human society or the process of making such a study; based almost entirely on fieldwork; requires the complete immersion of the anthropologist in the culture and everyday life of the people who are the subject of study; uses method called participant-observation, which emphasizes objectivity; focuses on formulating generalizations about culture and on drawing comparisons." 

Reference from: "Ethnography" Britannica Student Encyclopedia,  <http://search.eb.com/ebi/article?eu=347529&query=ethnography>, 
[Accessed September 15, 2002]. 

eHRAF is an ethnographic database and its topics are culture-specific.  For example, you might find a document titled "Ethnobotany of the Blackfoot Indians," or documents that describe the specific use of herbal plants by a culture.  However, you won't find a document titled "Ethnobotany of Native North Americans," because such a document refers to a group of cultures rather than a single culture.  

When searching eHRAF also keep in mind that  ethnographies span many years (some are from the early 20th century; others are more recent).  You will find "rare" ethnographies that have been translated into English by HRAF that cannot be found at many libraries. 


Search: Overview

The Search menu consists of two parts: Text Search and the eHRAF Source Bibliography. The Text Search allows you to search OCM subject codes and words in the documents contained in the culture files. The eHRAF Source Bibliography search allows you to search citations of the documents contained in the culture files. Each citation consists of publication information and evaluation information written by HRAF indexers. Please note that this tutorial focuses only on Text Search. Please consult the "Help" files in the database for the eHRAF Source Bibliography search.

Text Search
In Text Search (Figure 15) you can search in any one of three sections: OCM subject codes, Exact word or phrases, and Cultures/OWC. Each section can be used separately, or in combination. A "Quick Search" box is available for a simple word search, but cannot be used to search for OCM subject codes. 

In the database, the various "Help" buttons in the Text Search explain the function of the sections in detail.  In this tutorial, rather than reiterating the information found in the "Help" buttons (of the database), I want to concentrate on how the various search features can fit your searching needs.

In the eHRAF database click on the Search button (in the blue horizontal bar) to enter the default Text Search (Figure 15) and view each section.

Figure 15. Text Search in the eHRAF Collection of Ethnography database 

Guidelines to Searching in eHRAF
The following guidelines will help you in "building" your search.  

#1. The Power of Boolean Operators (and, or, not)
If you combine one search section (e.g., OCM subject codes, Exact Word or Phrase) with another, you will automatically perform an "and" Boolean search. Within the OCM Subject Search and the Exact Word or Phrase Search you can use Boolean operators (and, or, not) by using the pull-down menus (Figure 15). Do not type the words "and," "or," and "not" in the boxes.

Understanding Boolean operators, especially the difference between "and" and "or" is vital for performing powerful searches. 

The drop-down menus with Boolean operators include:
· And (both terms must be present) >>restricts a search 
· Or (either term must be present) >> expands a search 
· Not (excludes second term) >> restricts a search 
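
If it helps, you can think of the three operators as set operations on the paragraphs each term retrieves. The following is only a minimal Python sketch under that assumption: the paragraph numbers, the terms, and the `combine` helper are made up for illustration, and nothing here describes how the eHRAF search engine is actually built.

```python
# Toy model: each search term maps to the set of paragraph IDs that contain it.
# The IDs and terms are invented for illustration; this is not eHRAF's engine.
hits = {
    "myth": {1, 2, 3, 5},
    "legend": {2, 4},
    "wolf": {3, 4, 5},
}


def combine(first, operator, second):
    """Combine two result sets the way the pull-down operators are described."""
    if operator == "and":   # both terms must be present: restricts the search
        return first & second
    if operator == "or":    # either term may be present: expands the search
        return first | second
    if operator == "not":   # excludes the second term: restricts the search
        return first - second
    raise ValueError(f"unknown operator: {operator}")


and_result = combine(hits["myth"], "and", hits["wolf"])
or_result = combine(hits["myth"], "or", hits["legend"])
not_result = combine(hits["myth"], "not", hits["wolf"])

print(sorted(and_result))  # [3, 5]
print(sorted(or_result))   # [1, 2, 3, 4, 5]
print(sorted(not_result))  # [1, 2]
```

Notice that "and" and "not" can only shrink the first result set, while "or" can only grow it.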

#2. Exact Word or Phrase Search in Text Search: Paragraphs vs. Titles
When you use the "Exact Word or Phrase" boxes to search for words, you can search at two different levels using the pull-down box (Figure 15): Paragraphs (the default) or Titles.

Figure 16 is an excerpt of full text from the document titled "The Sun god's children" by James Willard Schultz and shows the titles in the table of contents on the left-hand side and the indexed paragraphs on the right-hand side.

Paragraphs
Every paragraph of each document is indexed with OCM subject codes (see Figure 16).  The paragraph is the basic unit of a text in eHRAF.  Occasionally a paragraph can be extremely short, consisting of only one or two words, such as a list or an index entry.

Titles (in Text Search)
A title in eHRAF is a chapter or subchapter heading and can be found in the table of contents (TOC) of a document (see left-hand side in Figure 16). Unlike a paragraph, a title is NOT indexed with OCM subject codes. Please note that in eHRAF you can also search for book titles, but you need to use the eHRAF Source Bibliography search (Figure 15, left-hand side).

Figure 16. Excerpt of full text with titles and paragraphs, indexed with OCM Subject Codes.

Regardless of whether you are performing a word search in the paragraphs or titles of the eHRAF texts, the search engine will look for the exact spelling of the words or phrases searched.  This means that for your search to be successful, your word choice has to match the word choice used by the author.  Matching words is sometimes not an easy task for the eHRAF user, and that is why I stress using OCM subject codes for your search!

When writing, an author usually writes about very specific aspects of broader topics (e.g., about the role of wolves in mythology) in the text of a paragraph, but lists broader topics (e.g., mythology) in the chapter titles.  Keep this concept in mind when searching for words or phrases in eHRAF. Although you can search for any words, a good rule of thumb is to use broad terms when searching titles and narrow terms when searching paragraphs.  Alternatively, use the OCM subject codes if you want to be less dependent on choosing the "correct" word, or use the OCM subject code in combination with a word to narrow a topic.  The power of an OCM subject code search is that it retrieves concepts rather than words that merely appear in the text without the proper context.  The following section shows search examples for a word-in-title search, an OCM subject code search, and an OCM/word search.

Search Examples
In the beginning of the tutorial I suggested a particular topic, geographical region and culture to be used in this tutorial.  Let's revisit the idea-- imagine you are interested in studying the different types of mythologies (e.g., creation story, hero myths, myths of good and evil, etc.).  You are particularly interested in studying a culture from the northwestern Plains of the United States. Furthermore, you are particularly interested in a certain aspect of mythology, specifically the role of animals and, in particular the role of the "wolf" in folklore.  This gives us quite a bit of information to get started with the eHRAF search.  First it is best to analyze the various aspects of your research question(s):

You are interested in:

1. a culture from a particular geographic area (e.g., northwestern Plains in the United States). 
2. a subject (e.g., mythology) 
3. a narrow aspect of a broader topic (e.g., the wolf as mythological creature)

The database is structured by cultures rather than sub-regions. So you must first "translate" your regional preferences into a culture or cultures. In the first part of the eHRAF Tutorial you have learned how to find a culture representing a certain geographic region. In this case, you have found that the Blackfoot are located in Montana, representing the northwestern Plains region of the United States.  In the first part of the eHRAF Tutorial you have also learned how to find OCM subject codes representing certain topics. In this case, the OCM Subject Code "773" represents the topic "mythology." 


Word-in-Title search

Example 1: A "not-so-good"  Word-in-Title Search

Ex1: Text Search
You have first decided to perform a "Word-in-Title" search for "mythology."  Your initial search may look something like shown in Figure 17.  You launched your search and got zero results. 

What has happened? You've filled out too many boxes and typed in too many words.  Remember eHRAF searching is based on finding the  EXACT words or phrases.  The search engine simply didn't find the word and phrase combination in a single chapter title.  
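
To see why three filled-in boxes joined by "and" can come back empty, here is a minimal Python sketch. It assumes a simplified model in which every phrase must occur in the same chapter title. Only the two Schultz titles come from this tutorial; the third title and the function names are hypothetical, and plain substring matching stands in for whatever the real search engine does.

```python
# Only the first two titles appear in this tutorial (Figure 16 and Example 3);
# the third is invented so that the one-word search has something to find.
titles = [
    "The Scar-Face Myth",
    "Chapter III When Men and Animals were Friendly",
    "Mythology and Religion",
]


def title_matches_all(title, phrases):
    """True only if every phrase occurs in the same title (an "and" search)."""
    lowered = title.lower()
    return all(phrase.lower() in lowered for phrase in phrases)


def search_titles(phrases):
    """Return the titles that contain all of the given phrases."""
    return [title for title in titles if title_matches_all(title, phrases)]


too_many = search_titles(["mythology", "American Indians", "northwestern Plains"])
just_one = search_titles(["mythology"])

print(too_many)  # []
print(just_one)  # ['Mythology and Religion']
```

Every extra phrase is one more condition a single title has to satisfy, so each added box makes a match less likely.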

Let's reexamine the text of an eHRAF document.  In the eHRAF Tutorial scroll up to Figure 16 and look at the wording of the titles in the left-hand table of contents (TOC) and you will see the word "myth" but no references to "Northwestern Plains" or "American Indians."  

In general, culture names and regions should not be put in the Exact Word or Phrase search boxes because they are often not found in the body of the ethnography texts.  It is better to choose cultures in the "Culture/OWC" section or choose cultures after a search is executed.

In the database, in the Exact Word or Phrase section, type "mythology" in the first box, "American Indians" in the second box, and "northwestern Plains" in the third box. Change the pull-down box to read "Titles."  Click the gray Search button to execute the search. This should produce a window with an error message.  Click on your Internet browser's Back button to return to Text Search and delete the words in the boxes.

Figure 17. Search example of a not-so-good  Word-in-Title search.


Example 2: "Good" Word-in-Title Search

Ex2: Text Search
The magic to performing a "good" search in eHRAF is that your search entries have to be basic--the fewer words, the better the results!  To capture the word variations such as mythologies, myths, mythological, etc., you can truncate the word "mythology" to "myth*" (see Figure 18.).  To search for similar words, the other boxes with words such as "legend", "folklore," etc., can be used, but the Boolean operator must be changed to "or." For general topics such as mythology, it is good to search at the title level rather than paragraphs. Choose a culture (e.g., Blackfoot) in the "Culture/OWC" section or choose the culture or cultures after a search is executed.
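
The effect of truncation combined with "or" can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the eHRAF implementation: I treat the asterisk as a shell-style wildcard (via the standard library's `fnmatch`), and the second and third titles, like the helper names, are invented.

```python
import re
from fnmatch import fnmatchcase

# The first and last titles appear in this tutorial; the middle two are invented.
titles = [
    "The Scar-Face Myth",
    "Blackfoot Legends",
    "Folklore of the Plains",
    "Chapter III When Men and Animals were Friendly",
]


def words(text):
    """Split a title into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def title_has(title, pattern):
    """True if any word in the title matches the (possibly truncated) pattern."""
    return any(fnmatchcase(word, pattern.lower()) for word in words(title))


def or_search(patterns):
    """Return titles matching at least one pattern (Boolean "or")."""
    return [t for t in titles if any(title_has(t, p) for p in patterns)]


exact = or_search(["mythology"])
truncated = or_search(["myth*"])
expanded = or_search(["myth*", "legend*", "folk*"])

print(exact)      # []
print(truncated)  # ['The Scar-Face Myth']
print(expanded)   # the first three titles
```

The untruncated word finds nothing in this toy list, the truncated form picks up "Myth," and the "or" combination widens the net to "Legends" and "Folklore" as well.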

In the database in the Exact Word or Phrase section type the truncated words "myth*" in the first box, the word "legend*" in the second box, and the word "folk*" in the third box.  Change the Boolean operators to "or" to expand the search and the pull-down box to "Titles." In the "Culture/OWC" select Blackfoot, NF06. Click the gray Search button to execute the search. 

Figure 18. Search example of a "good" Word-in-Title search 

Ex2: Culture View
Figure 19 is an excerpt of the "culture results" page showing the Blackfoot File and the listings for number of documents and matches found.   From this list you can pick the culture you are interested in (in this case the Blackfoot). 

In the eHRAF database, the search engine found the Blackfoot File with 27 matches in 10 documents.  Click on Blackfoot, NF06.

Figure 19. Blackfoot File with number of matches found in documents 

Ex2: Document View
In the Blackfoot file several chapter titles with the words "myth," "legend," or "folk" were found.  Figure 20 shows a short excerpt from the Blackfoot file with word results found in the document titled "The Sun god's children" by James Willard Schultz. Compare the subchapter title "The Scar-Face Myth" to how the same title is displayed in the table of contents in Figure 16 of the eHRAF Tutorial.

In the database, once you have clicked on the culture name Blackfoot, NF06 in the "culture results" page, you have entered the "document results" page.  View all the document results, but in particular notice how Schultz's document titled "The Sun god's children" is displayed in a word-in-title search. Click on the various matches in the chapter titles, then return to Text Search by clicking on Search at the top of the screen.

Figure 20. Short excerpt of the document titled "The Sun god's children" of the Blackfoot culture file


Example 3: OCM Subject Codes Search

Ex3: Text Search
Now let's improve our search in eHRAF by using OCM subject codes. Figure 21 shows a Text Search with an OCM subject code search for a particular culture. Using OCM subject code "773" for "Blackfoot" will search all the paragraphs of the documents in the Blackfoot file for the subject "mythology."

Important!  When using an OCM subject code, the pull-down box must be left at the default "Paragraphs" or else the search will not work.

In the database type in 773 in the OCM subject code section (see Figure 21).  In the "Culture/OWC" section highlight the culture name "Blackfoot, NF06" and press the gray Search button to execute your search. 

Figure 21.  OCM subject code search in Text Search

Ex3: Culture View
Figure 22 is the "culture view" page showing that the search retrieved 534 matches in 21 documents of the Blackfoot file.  Now scroll up in the eHRAF Tutorial and view again the "Word-in-Title" search that retrieved 27 matches in 10 documents.  The OCM subject code search retrieved a significantly higher number of matches and about twice as many documents.

In the database you should now see the "culture results" page showing the "Blackfoot, NF06" with 534 matches found within 21 documents (see Figure 22).  Click on Blackfoot, NF06 to enter the "documents results" page (see Figure 23 in the eHRAF Tutorial).

Figure 22. Culture view with the number of matches for "773" and the number of documents found in the Blackfoot culture file

Ex3: Document View
The document view shows all the retrieved documents and matches for the OCM subject code found in the paragraphs of a culture file. Figure 23 shows an excerpt from the document view with the document titled "The Sun god's children" containing a great number of matches for OCM 773 found in the paragraphs. 

Notice that now  the document titled "The Sun god's children" not only shows the matches for chapters with the word "myth" in the title but also other chapters, not previously retrieved, such as the chapter titled  "Chapter III When Men and Animals were Friendly" with 79 matches!

When you search the documents in eHRAF for only one OCM subject code, you will usually come across documents containing a rather large number of matches found in paragraphs (see Figure 23).  Treat a high number of matches for an OCM subject code in paragraphs as an indication that significant information can be found on that particular topic.

As you are developing your search strategies, you might first want to focus on the documents containing a high number of matches and then work your way down to the documents containing fewer matches.  When you see a high number of matches, rather than clicking on every paragraph match, click on the hyperlinked heading right above the string of matches and paragraphs.  For example, if you click on the title "Chapter III When Men and Animals were Friendly" you would retrieve all 79 matches for OCM subject code "773" (mythology) in the context of the chapter.  Scroll up in the eHRAF Tutorial and in Figure 16 you will see an excerpt of the chapter titled "Chapter III When Men and Animals were Friendly" with the paragraphs containing the OCM subject code 773.
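
The idea of counting matches per chapter and starting with the largest count can be sketched as follows. This is a hypothetical model only: the paragraph records, their OCM assignments, and the `matches_per_chapter` helper are invented for illustration, and the counts are far smaller than the real ones shown in Figure 23.

```python
from collections import Counter

# Invented paragraph records: each paragraph belongs to a chapter and carries
# the OCM subject codes an indexer might have assigned to it.
paragraphs = [
    {"chapter": "Chapter III When Men and Animals were Friendly", "ocm": {"773", "825"}},
    {"chapter": "Chapter III When Men and Animals were Friendly", "ocm": {"773"}},
    {"chapter": "The Scar-Face Myth", "ocm": {"773"}},
    {"chapter": "A hypothetical chapter", "ocm": {"825"}},
]


def matches_per_chapter(records, code):
    """Count paragraphs indexed with `code` per chapter, largest count first."""
    counts = Counter(r["chapter"] for r in records if code in r["ocm"])
    return counts.most_common()


ranked = matches_per_chapter(paragraphs, "773")
for chapter, count in ranked:
    print(count, chapter)
```

Reading the list from the top down mirrors the strategy of starting with the chapters that have the most matches.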

In the database in the "documents view" of the Blackfoot file scroll down the list to locate the document title "The Sun god's children" by James Willard Schultz. Underneath the chapter title "Chapter III When Men and Animals were Friendly" click on the number 1 to the right of the word "matches" (see Figure 23). This will retrieve a single paragraph (as shown in Figure 24).  In that paragraph click on heading titled Chapter III When Men and Animals were Friendly to find the paragraph within the context of the chapter.  Click your Internet browser Back button twice to put you back in to the "documents view" page.  Now click on the heading title Chapter III When Men and Animals were Friendly and you will see that it put you in the same place as clicking on the heading in the paragraph. 

Figure 23. An excerpt from the "documents results" page with the number of paragraphs and matches found for the OCM subject code "773" for the document titled "The Sun god's children."

Ex3: Paragraph View 
Each numbered "match" links to a paragraph containing the OCM subject code (see Figure 24). To see a larger section and the paragraph in context of a chapter click on the heading that is right above the gray "hide the OCM codes" button (Figure 24).

Figure 24. Paragraph containing OCM 773 in the document titled "The Sun god's children" 

OCM subject codes work very well in combination with other OCM subject codes or words.  Example 4 discusses how OCM subject codes can be combined with words. 


Example 4: OCM/Word Search

Ex4: Text Search
The OCM/Word Search is good to use if you want to narrow your search to a more specific aspect of an OCM concept.  For example, you might be interested in the role of animals in mythology, and your special interest is in the wolf as a mythological creature.  When you perform your OCM/Word search (see Figure 25), try to think of word variations and other names for wolves including their Latin name.  If possible, try to truncate your word to expand your search results.  For example, you can truncate "mythology" to "myth*" to capture the word variations such as mythologies, myths, mythological, etc.  Fill in the other word boxes with words such as "wolves", "canis," etc., but make sure to change the Boolean operator to "or."  In the "Culture/OWC" section you could highlight and select the names of the cultures you wish to search.  However, for this search we will use the default "All Cultures" and then select the cultures from the "cultures view" page. 

In the eHRAF database first click on the blue Search button to return to Text Search.  In the OCM Subject Codes box type in 773.  In the Exact Word or Phrase section type the truncated words "wolf*" in the first box, the word "wolves*" in the second box, and the word "canis*" in the third box.  Change the Boolean operators to "or" and make sure that the pull-down box is at the default "Paragraphs." In the "Culture/OWC" select All Cultures. Click the gray Search button to execute the search. 
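
Here is a minimal Python sketch of the OCM/word combination, under the assumption that a paragraph matches when it carries the OCM subject code AND contains at least one of the "or"-joined words. The three sample paragraphs are sentences I made up (they are not quotations from any ethnography), the function names are hypothetical, and the asterisk is again modeled with the standard library's `fnmatch`.

```python
import re
from fnmatch import fnmatchcase

# Invented sample paragraphs with the OCM codes they might be indexed under.
paragraphs = [
    {"id": 1, "ocm": {"773"}, "text": "The wolves taught the first hunters."},
    {"id": 2, "ocm": {"773"}, "text": "The sun was once a young man."},
    {"id": 3, "ocm": {"825"}, "text": "A wolf was seen near the camp."},
]


def has_word(text, patterns):
    """True if any word in the text matches any truncated pattern ("or")."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(fnmatchcase(tok, pat) for tok in tokens for pat in patterns)


def ocm_word_search(records, code, patterns):
    """IDs of paragraphs indexed with `code` that also contain a matching word."""
    return [r["id"] for r in records
            if code in r["ocm"] and has_word(r["text"], patterns)]


wolf_words = ["wolf*", "wolves*", "canis*"]
mythology_hits = ocm_word_search(paragraphs, "773", wolf_words)
ethnozoology_hits = ocm_word_search(paragraphs, "825", wolf_words)

print(mythology_hits)     # [1]
print(ethnozoology_hits)  # [3]
```

Keeping the words and swapping the code from 773 to 825 (Ethnozoology) returns a different paragraph, which is the sense in which the OCM code places the word in context.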

Figure 25. OCM Subject Code "773" combined with the words "wolf," "wolves," or "canis."

Ex4: Culture View
Figure 26 is the "culture view" page showing a list of cultures organized by region, with the number of matches and documents found for each.

In the database scroll down to "North America" in the "culture view." Click on Blackfoot, NF06 to enter the "document view" (see Figure 26 in the eHRAF Tutorial). 

Figure 26. List of cultures with number of matches found in documents 

Ex4: Document View
As you can see in Figure 27, all the paragraphs containing the words "wolf" or "wolves" or "canis" will appear in the document view, but the OCM subject code 773 is not shown at this stage. Clicking on a "match" will open up the entire paragraph.  If you see a high number of "matches," you can go immediately to all the paragraphs within the context of a chapter or section by clicking on the title heading right above the match.  Notice how the OCM/word search has changed the document view for the Schultz document titled "The Sun god's children." Scroll up in the eHRAF Tutorial and in Figure 16 you will see an excerpt of the chapter titled "Chapter III When Men and Animals were Friendly" with the paragraphs containing the OCM subject code 773.

In the database, in the document view underneath the chapter title "Chapter III When Men and Animals were Friendly" click on the match (see Figure 27 in the eHRAF Tutorial). This will retrieve a single paragraph (not shown in a figure).  In that paragraph click on the heading titled Chapter III When Men and Animals were Friendly to find the paragraph within the context of the chapter.  Click your Internet browser's Back button twice to return to the "documents view" page.  Now click on the heading title Chapter III When Men and Animals were Friendly and you will see that it puts you in the same place as clicking on the heading in the paragraph.

Figure 27. "Snapshot" view of paragraphs containing the word "wolf" 

Example 5: Search for Graphics
I am often asked how to search for images in the eHRAF database.  Using the computer programming term "<graphic" in the word box can help you locate images.  Figure 28 is an excerpt from Text Search and shows how the word "<graphic" is used in combination with the OCM subject code "773" to search for any illustrations pertaining to mythologies.  

In the database first click on the Search button to return to Text Search.  In the OCM Subject Codes box type in "773."  In the Exact Word or Phrase section type the word "<graphic" in the first box.  Make sure that the pull-down box is at the default "Paragraphs." In the "Culture/OWC" select All Cultures. Click the gray Search button to execute the search. Browse the documents for results. 

Figure 28. Searching for "<graphic" and "773" in Text Search

Summary for Text Search
I hope that the search examples have given you some insight into the "search logic" of eHRAF. The first search example demonstrated the common mistakes that users make when first using Text Search in eHRAF--producing zero results (or a very low number of results) by simply using too many terms in the boxes and by using too many terms not likely to be found in ethnographic texts.   Choosing the appropriate Boolean operator is important because whether you are choosing "and" or "or" will make a great difference in your search results. 

The search examples have shown how you can develop a search and the different ways you can retrieve information from one and the same document of one culture. The "Word-in-Title" search, shown in Figure 18, makes for a good start in searching in eHRAF, but the OCM subject search (Example 3) is the ultimate power search.  

Search Example 4 showed you how to refine your OCM subject code search by adding word(s) to search more narrow aspects of topics.  For example, OCM 773 with the word "wolf" found information  on the "role of wolf as mythological creature." 

I usually call the OCM/word search a "filter" search because the OCM subject code "filters" the word into the appropriate context.  You can completely redirect your search by changing the OCM subject code.  For example, if you changed the OCM to "825" (Ethnozoology) but left the word "wolf," your search would retrieve information on a culture's notions about wolves instead of mythology.

Truncating your words can often make a big difference in your search results.  For example, searching the word "mythology" in titles retrieves 48 matches in 43 documents; searching the word "myth*" in titles increases the results to 407 matches in 176 documents. Also note that you can truncate not only words but also OCM subject codes, to two digits (e.g., 77*).
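
Truncating an OCM subject code can be pictured as a prefix match on the code. The short Python sketch below assumes exactly that; the `ocm_matches` helper is hypothetical, and the code 775 is included only as a placeholder for "another code beginning with 77."

```python
def ocm_matches(code, pattern):
    """Match a full OCM code, or any code sharing the prefix if the pattern ends in *."""
    if pattern.endswith("*"):
        return code.startswith(pattern[:-1])
    return code == pattern


# 773 and 825 are mentioned in this tutorial; 775 is only a placeholder.
codes = ["773", "775", "825"]

narrow = [c for c in codes if ocm_matches(c, "773")]
broad = [c for c in codes if ocm_matches(c, "77*")]

print(narrow)  # ['773']
print(broad)   # ['773', '775']
```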

Because language varies widely over time, a word search should avoid "trendy" or recent vocabulary. The OCM categories are particularly valuable when wording is very variable over time.

The eHRAF database is an extremely powerful database if you are interested in exploring cultural diversity.  However, it is not the most intuitive database and searching it does take some practice and patience.  You usually have to "build" your searches into meaningful results by trying various search methods. I find that the texts in eHRAF can be a wonderful help in building a search.  Once you find texts with good results take a closer look at the paragraphs.  The type of the words used by the author and other OCMs appearing on the top of the paragraphs can often give you leads in redefining your search. Visit the section "Search Methods, tips, and examples" in the eHRAF User Guide for more ideas on how to use words in combinations with OCM Subject Codes.  The eHRAF User Guides at www.yale.edu/hraf/userguides.html also contain a HRAF Glossary of Terms and brief overviews of the Browse and Search menus. 



For database support call HRAF at 203-764-9401, 1-800-520-4723 (9 am to 5 pm, EST), or email HRAF at hraf@yale.edu
