eHRAF
Tutorial: Text
Search
by Christiane Cunnar,
Human Relations Area Files
(HRAF) at Yale University
What to expect when searching in eHRAF
When searching the eHRAF Collection of Ethnography database, you must
keep in mind the ethnographic nature of this database. According to the Encyclopædia Britannica
Online, "ethnography" is the:
"descriptive study of a particular human society or the process of making such a study; based almost entirely on fieldwork; requires the complete immersion of the anthropologist in the culture and everyday life of the people who are the subject of study; uses method called participant-observation, which emphasizes objectivity; focuses on formulating generalizations about culture and on drawing
comparisons."
Reference from: "Ethnography" Britannica Student Encyclopedia,
<http://search.eb.com/ebi/article?eu=347529&query=ethnography>,
[Accessed September 15, 2002].
eHRAF is an ethnographic database and its topics are
culture-specific. For example, you might find a document titled "Ethnobotany
of the Blackfoot Indians," or documents that describe the specific use of herbal
plants by a culture. However, you won't find a document titled
"Ethnobotany of Native North Americans," because such a document refers to a
group of cultures rather than a single culture.
When searching eHRAF,
also keep in mind that ethnographies span many years (some are from
the early 20th century; others are more recent). You will find
"rare" ethnographies, translated into English by
HRAF, that cannot be found at many libraries.
Search: Overview
The Search menu consists of two parts: Text Search
and the eHRAF Source Bibliography. The Text Search
allows you to search OCM subject codes and words in
documents contained in the culture files. The eHRAF Source Bibliography
search allows you to search citations of the documents contained in the culture files. Each citation consists of publication information and evaluation information written by HRAF indexers.
Please note that this tutorial will only focus on Text
Search. Please consult the "Help" Files in the
database for the eHRAF Source Bibliography search.
Text Search
In Text Search (Figure 15) you can search in any one of three sections: OCM subject codes, Exact Word or Phrase, and
Cultures/OWC. Each section can be used separately, or in combination.
A "Quick Search" box is available for a simple word search, but
cannot be used to search for OCM subject codes.
In the database, the various "Help" buttons in the
Text Search explain the function of the sections in detail. In this tutorial, rather than reiterating the
information found in the "Help" buttons (of the database), I want to concentrate on how the various search features can fit your
searching needs.
In the eHRAF database click on the Search button (in the blue
horizontal bar) to enter the
default Text Search (Figure 15) and view each section.
Figure 15. Text Search in the eHRAF Collection of Ethnography database
Guidelines to
Searching in eHRAF
The following guidelines will help you in
"building" your search.
#1. The Power of Boolean Operators (and, or, not)
If you combine one search section (e.g., OCM subject codes, Exact Word or
Phrase) with another you will automatically perform an "and"
Boolean search. Within the OCM Subject Search and the Exact Word or Phrase Search you can use Boolean operators (and, or, not) by using the pull-down
menus (Figure 15). Do not type the words "and," "or," and "not" in the boxes.
Understanding Boolean operators, especially the difference between
"and" and "or," is vital for performing powerful
searches.
The drop-down menus with Boolean operators include:
· And (both terms must be present) >> restricts a search
· Or (either term must be present) >> expands a search
· Not (excludes second term) >> restricts a search
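The three operators behave like set operations on the paragraphs that each term matches. The following is a minimal sketch in Python, assuming a toy index in which each term maps to a set of invented paragraph numbers; it is not how the eHRAF search engine is implemented, only a way to see why "and" and "not" restrict a search while "or" expands it.

```python
# Toy index: each search term maps to the set of paragraph numbers containing it.
# The terms and numbers are invented for illustration; this is not eHRAF's engine.
hits = {
    "myth": {1, 2, 3, 5},
    "legend": {3, 4},
}

def combine(left, operator, right):
    """Combine two sets of paragraph numbers with a Boolean operator."""
    if operator == "and":
        return left & right   # both terms must be present: restricts
    if operator == "or":
        return left | right   # either term may be present: expands
    if operator == "not":
        return left - right   # first term without the second: restricts
    raise ValueError(f"unknown operator: {operator}")

and_result = combine(hits["myth"], "and", hits["legend"])
or_result = combine(hits["myth"], "or", hits["legend"])
not_result = combine(hits["myth"], "not", hits["legend"])

print(sorted(and_result))  # paragraphs containing both words
print(sorted(or_result))   # paragraphs containing at least one word
print(sorted(not_result))  # paragraphs with "myth" but without "legend"
```

In this toy data "or" returns five paragraphs, "and" returns one, and "not" returns three.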
#2. Exact Word or Phrase Search in Text Search:
Paragraphs vs. Titles
When you use the "Exact Word or
Phrase" boxes to search for words you can search at two different
levels using the pull-down box (Figure 15) -- Paragraphs (default)
or Titles.
Figure 16 is an excerpt of full text from the document
titled "The Sun god's children" by James Willard Schultz and
shows the titles in the table of contents on the left-hand side and the
indexed paragraphs on the right-hand side.
Paragraphs
Every paragraph of each document is indexed with OCM subject
codes (see Figure 16). The paragraph is the basic unit of a text in eHRAF.
Occasionally a paragraph can be
extremely short, consisting of only one or two words, such as a list or an index
entry.
Titles (in Text Search)
A title in eHRAF is considered a chapter or subchapter and can be
found in the table of contents (TOC) of a document (see left-hand side in
Figure 16). Unlike a paragraph, a
title is NOT indexed with OCM subject codes.
Please note that in eHRAF you can also search for book titles, but you
need to
use the eHRAF Source Bibliography search (Figure 15, left-hand side).
Figure 16. Excerpt of full text with titles and paragraphs, indexed with OCM Subject Codes.
Regardless of whether you are performing a word search in the paragraphs or
titles of the eHRAF texts, the search engine will look for the exact
spelling of the words or phrases searched. This means
that for your search to be successful, your word choice has to match the
word choice used by the author. Matching words is sometimes not an easy
task for the eHRAF user and that is why I stress using OCM subject codes
for your search!
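To see why exact matching makes word choice so important, here is a small Python sketch. It assumes that matching is done on whole words and ignores capitalization (both are my assumptions for the illustration, not documented eHRAF behavior), and the sample sentence is invented.

```python
import re

def exact_word_match(query, text):
    """Return True if the exact word or phrase occurs in the text.

    Assumption for this sketch: whole-word, case-insensitive matching.
    """
    pattern = r"\b" + re.escape(query) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Invented sample paragraph; the author happens to write "myths".
paragraph = "The old men told many myths about the wolf."

found_mythology = exact_word_match("mythology", paragraph)
found_myth = exact_word_match("myth", paragraph)
found_myths = exact_word_match("myths", paragraph)

print(found_mythology, found_myth, found_myths)
```

Only the query that matches the author's own spelling ("myths") succeeds, even though all three queries are about the same topic.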
When writing, an
author (usually) writes about very specific aspects of broader topics (e.g.,
about the role of wolves in mythology) in the text of a paragraph, but
usually lists broader topics (e.g., mythology) in the chapter titles. Keep this
concept in mind when searching for words or phrases in eHRAF. Although you
can search for any words, a good rule of thumb is to use broad terms when searching
titles and narrow terms when searching paragraphs. Alternatively, use OCM subject codes
if you want to be less
dependent on choosing the "correct" word, or use an OCM subject
code in combination with a word to narrow a topic. The
power of an OCM subject code search is that it retrieves
concepts rather than just words that happen to appear in the text without the
proper context. The following
section shows search examples for a word-in-title search, an OCM subject
code search, and an OCM/word search.
Search Examples
In the beginning of
the tutorial I suggested a particular topic,
geographical region and culture to be used in this
tutorial. Let's revisit the idea: imagine you are interested in studying the different types of
mythologies (e.g., creation story, hero myths, myths of good
and evil, etc.). You are particularly interested in studying a
culture from the northwestern Plains of the United States. Furthermore, you are interested in a
certain aspect of mythology, specifically the role of animals and, in
particular, the role of the "wolf" in folklore. This gives us quite a bit of
information to get started with the eHRAF search. First it is best to
analyze the various aspects of your research question(s):
You are interested in:
1. a culture from a particular geographic area (e.g., northwestern Plains in the United States).
2. a subject (e.g., mythology)
3. a narrow aspect of a broader topic (e.g., the wolf as mythological
creature)
The
database is structured by cultures rather than sub-regions.
So you must first "translate" your regional
preferences into a culture or cultures. In the first part of the
eHRAF Tutorial you have learned how to find a culture
representing a certain geographic region. In this case, you have
found that the Blackfoot are located in Montana,
representing the northwestern Plains region of the
United States. In the first part of
the eHRAF Tutorial you have also learned how to find OCM
subject codes representing certain topics. In this case,
the OCM Subject Code "773" represents the topic
"mythology."
Word-in-Title search
Example 1: A "not-so-good" Word-in-Title Search
Ex1: Text Search
You have first decided to perform a "Word-in-Title" search
for "mythology." Your initial search
may look something like the one shown in Figure 17. You launched your search
and got zero results.
What has happened? You've filled out too many boxes and typed in too many words.
Remember, eHRAF searching is based on finding the EXACT words or phrases.
The search engine simply didn't find the word and phrase combination in a
single chapter title.
Let's reexamine the text of an eHRAF document.
In the eHRAF Tutorial scroll up to Figure 16 and look at the
wording of the titles in the
left-hand table of contents (TOC) and you will see the word
"myth" but no references to "Northwestern Plains" or "American
Indians."
In general, culture names and regions should not
be put in the Exact Word or Phrase search boxes because
they are often not found in the body of the ethnography
texts. It is better to choose cultures in the
"Culture/OWC" section or choose cultures after
a search is executed.
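The zero-result search can be reproduced with a few lines of Python. This is only a sketch: the two chapter titles are the ones mentioned in this tutorial for the Schultz document, and the simple "contains" test stands in for whatever matching the database actually performs.

```python
# Chapter titles mentioned in this tutorial for the Schultz document.
titles = [
    "The Scar-Face Myth",
    "Chapter III When Men and Animals were Friendly",
]

# The three entries typed into the boxes in Example 1, combined with "and".
terms = ["mythology", "American Indians", "northwestern Plains"]

def titles_containing(term):
    """Titles that contain the term as typed (ignoring capitalization)."""
    return [t for t in titles if term.lower() in t.lower()]

# How many titles match each entry on its own.
per_term = {term: len(titles_containing(term)) for term in terms}

# "and" requires every entry to appear in the same title.
all_terms = [
    t for t in titles
    if all(term.lower() in t.lower() for term in terms)
]

print(per_term)
print(all_terms)
```

No single title contains all three entries, and in this sample none of the entries matches a title even on its own, because chapter titles rarely name the culture or region.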
In the database, in the
Exact Word or Phrase section, type
"mythology" in the first box, "American
Indians" in the second box, and "northwestern
Plains" in the third box. Change the pull-down box
to read "Titles." Click the gray Search button
to execute the search. This should
produce a window with an error message. Click on
your Internet browser's Back button to return to
Text Search and delete the words in the boxes.
Figure 17. Search example of a not-so-good Word-in-Title search.
Example 2: "Good" Word-in-Title Search
Ex2: Text Search
The magic to performing a "good" search in eHRAF is that
your search entries have to be basic--the fewer
words, the better the results! To capture the word variations
such as mythologies, myths, mythological, etc., you can truncate the word "mythology" to "myth*" (see
Figure 18). To search for
similar words, the other boxes with words such as "legend", "folklore,"
etc., can be used, but the Boolean operator must be changed to
"or." For general topics such as mythology, it is good to search
at the title level rather than paragraphs. Choose a culture (e.g.,
Blackfoot) in the
"Culture/OWC" section or choose the culture or cultures after a search
is executed.
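Truncation combined with "or" can be pictured as follows. This Python sketch assumes that a trailing asterisk means "any word starting with these letters"; the third title is hypothetical and is included only so that "legend*" has something to match.

```python
import re

def word_matches(term, word):
    """Match one word against a term; a trailing * means 'starts with'."""
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term

def title_matches_any(title, terms):
    """True if any word in the title matches any of the terms ('or' search)."""
    words = re.findall(r"[a-z]+", title.lower())
    return any(word_matches(term, word) for term in terms for word in words)

titles = [
    "The Scar-Face Myth",
    "Chapter III When Men and Animals were Friendly",
    "Legends of the Camp Fire",  # hypothetical title for illustration
]

truncated = [t for t in titles if title_matches_any(t, ["myth*", "legend*", "folk*"])]
untruncated = [t for t in titles if title_matches_any(t, ["mythology"])]

print(truncated)
print(untruncated)
```

The truncated "or" search finds two of the three titles, while the single untruncated word "mythology" finds none.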
In the database
in the Exact Word or Phrase section type the truncated
words "myth*" in the first box, the word
"legend*" in the second box, and the word
"folk*" in the third box. Change the
Boolean operators to "or" to expand the search
and the pull-down
box to "Titles." In the "Culture/OWC"
select Blackfoot, NF06. Click the gray Search button
to execute the search.
Figure 18. Search example of a "good" Word-in-Title search
Ex2: Culture View
Figure 19 is an excerpt of the
"culture results" page showing the Blackfoot
File and the listings for number of documents and matches found.
From this list you can pick the culture you are
interested in (in this case the Blackfoot).
In the eHRAF database,
the search engine found the Blackfoot File with 27
matches in 10 documents. Click on Blackfoot, NF06.
Figure 19. Blackfoot File with number of matches found in documents
Ex2: Document View
In the Blackfoot file several
chapter titles with the words "myth,"
"legend," or "folk" were found.
Figure 20 shows a short excerpt from the Blackfoot file
with word results found in the document titled "The Sun
god's children" by James Willard Schultz. Compare
the subchapter title "The Scar-Face Myth" to
how the same title is displayed in the table of contents
in Figure 16 of the eHRAF Tutorial.
In the
database, once you have clicked on the culture name Blackfoot,
NF06 in the "culture results" page you
have entered the "document results"
page. View all the document results, but in particular
notice how Schultz's document titled "The Sun god's
children" is displayed in a word-in-title search.
Click on the various matches in the chapter titles, then
return to Text Search by clicking on Search at
the top of the screen.
Figure 20. Short excerpt of the document titled "The Sun god's children" of the Blackfoot culture file
Example 3: OCM Subject Codes Search
Ex3: Text Search
Now let's improve our search in eHRAF by using OCM subject codes.
Figure 21 shows a Text Search with an OCM subject code
search for a particular culture. Using OCM subject code
"773" for "Blackfoot" will search
all the paragraphs of the documents in the Blackfoot
file for the subject "mythology."
Important! When using an OCM
subject code, the pull-down menu must be left at the default "Paragraphs," or else the search will
not work.
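Because the OCM subject codes are attached to paragraphs, an OCM search is really a lookup on those paragraph labels rather than on the words. The Python sketch below models that idea; the class, the sample sentences, and the code assignments are all invented for illustration and do not reflect how HRAF stores its data.

```python
from dataclasses import dataclass

@dataclass
class Paragraph:
    culture: str     # OWC code of the culture file, e.g. "NF06" for Blackfoot
    ocm_codes: set   # OCM subject codes assigned to this paragraph
    text: str        # invented sample text

paragraphs = [
    Paragraph("NF06", {"773"}, "Long ago the animals could talk with men."),
    Paragraph("NF06", {"773", "825"}, "The wolf taught the people how to hunt."),
    Paragraph("NF06", {"825"}, "Wolves were common on the plains."),
]

def search_ocm(paragraphs, code, culture):
    """Return the paragraphs of one culture file indexed with the given code."""
    return [
        p for p in paragraphs
        if p.culture == culture and code in p.ocm_codes
    ]

results = search_ocm(paragraphs, "773", "NF06")
for p in results:
    print(p.text)
```

Note that the first sample paragraph is retrieved even though it never uses the word "myth," which is the advantage of searching by subject code rather than by word.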
In the database type in 773 in the OCM subject code
section (see Figure 21). In the "Culture/OWC" section
highlight the culture name "Blackfoot, NF06" and press the gray Search
button to execute your search.
Figure 21. OCM subject code search in Text Search
Ex3: Culture View
Figure 22 is the "culture view" page showing that the
search retrieved 534 matches in 21 documents of the Blackfoot file.
Now scroll up in the eHRAF Tutorial and view again the
"Word-in-Title" search that retrieved 27 matches in 10 documents.
The OCM subject code search retrieved a significantly higher number of
matches and many more documents.
In the database you should
now see the "culture results" page showing the
"Blackfoot, NF06" with 534 matches found
within 21 documents (see Figure 22). Click on Blackfoot, NF06 to
enter the "documents results" page (see Figure
23 in the eHRAF Tutorial).
Figure 22. Culture view with the number of matches for "773" and the number of documents in the Blackfoot culture file
Ex3: Document View
The document view shows all the retrieved documents and
matches for the OCM subject code found in the paragraphs
of a culture file. Figure 23 shows an excerpt from the
document view with the document titled "The Sun
god's children" containing a great number of
matches for OCM 773 found in the paragraphs.
Notice that now the document titled "The Sun
god's children" not only shows the matches for chapters with the word "myth"
in the title but also
other chapters, not previously retrieved, such as the chapter titled
"Chapter III When Men and Animals were Friendly" with 79
matches!
When you search the documents in eHRAF for only one OCM
subject code, you will usually come across documents containing
a rather large number of matches found in paragraphs
(see Figure 23). Treat a high number of matches
for an OCM code in paragraphs as an indication that significant
information can be found for that particular
topic.
As you are developing your search
strategies, you might first want to focus on the
documents containing a high number of matches and then
work your way down to the documents containing fewer
matches. When you see a high number of
matches, rather than clicking on every paragraph match,
click on the hyperlinked heading right above the string
of matches and paragraphs. For example, if you
click on the title "Chapter III When Men and
Animals were Friendly" you would retrieve all the
79 matches for OCM subject code "773"
(mythology) in the context of the chapter. Scroll
up in the eHRAF Tutorial and in Figure 16 you will see
an excerpt of the chapter titled "Chapter III
When Men and Animals were Friendly" with the
paragraphs containing the OCM subject code 773.
In the database in the
"documents view" of the Blackfoot file scroll
down the list to locate the document title "The Sun god's children" by James Willard
Schultz. Underneath the chapter title "Chapter III When Men and
Animals were Friendly" click on the number 1
to the right of the word "matches" (see Figure
23). This will retrieve a single paragraph (as shown in
Figure 24). In that paragraph click on the heading
titled Chapter III When Men and Animals were Friendly
to find the paragraph within the context of the
chapter. Click your Internet browser's Back
button twice to return to the "documents
view" page. Now click on the heading title Chapter
III When Men and Animals were Friendly and you will
see that it puts you in the same place as clicking on the
heading in the paragraph.
Figure 23. An excerpt from the "documents results" page with the number of paragraphs and matches found for the OCM subject code "773" for the document titled "The Sun god's children."
Ex3: Paragraph View
Each numbered "match" links to a paragraph containing the
OCM subject code (see Figure 24). To see a larger section and the
paragraph in context of a chapter click on the heading that is right above
the gray "hide the OCM codes" button (Figure 24).
Figure 24. Paragraph containing OCM 773 in the document titled "The Sun god's children"
OCM subject codes work very well in combination with other OCM
subject codes or words. Example 4 discusses how OCM subject codes
can be combined with words.
Example 4: OCM/Word Search
Ex4: Text Search
The OCM/Word Search is good to use if you want to narrow your search to a more
specific aspect of
an OCM concept. For example, you might be interested in the role of animals
in mythology, and your special interest is in the wolf as a mythological creature.
When you perform your OCM/Word search (see Figure 25), try to think of word variations and other names for wolves including their
Latin name. If possible, try to truncate your word to expand your search
results. For example, you can truncate "mythology" to
"myth*" to capture the word variations such as mythologies,
myths, mythological, etc. Fill in the other word boxes with
words such as "wolves", "canis," etc., but make sure
to change the Boolean operator to "or." In the
"Culture/OWC" section you could highlight and select the names
of the cultures you wish to search. However, for this search we will use the default "All Cultures" and then select the
cultures from the "cultures view" page.
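The logic of an OCM/word search is "this subject code AND (word 1 OR word 2 OR word 3)." A minimal Python sketch of that combination follows; the sample paragraphs and their code assignments are invented, and the prefix matching is my assumption about how truncation behaves.

```python
import re

# Invented sample paragraphs, each with the OCM codes assigned to it.
# "825" stands in for any subject code other than 773.
paragraphs = [
    {"ocm": {"773"}, "text": "Old Man made the wolf and sent him to the prairie."},
    {"ocm": {"773"}, "text": "The Sun gave the people the medicine lodge."},
    {"ocm": {"825"}, "text": "Wolves follow the buffalo herds in winter."},
]

def has_word(text, truncated_terms):
    """True if any word starts with one of the truncated terms ('or' search)."""
    words = re.findall(r"[a-z]+", text.lower())
    prefixes = [term.rstrip("*") for term in truncated_terms]
    return any(word.startswith(prefix) for word in words for prefix in prefixes)

def ocm_word_search(paragraphs, code, truncated_terms):
    """Paragraphs indexed with the code AND containing at least one term."""
    return [
        p for p in paragraphs
        if code in p["ocm"] and has_word(p["text"], truncated_terms)
    ]

hits = ocm_word_search(paragraphs, "773", ["wolf*", "wolves*", "canis*"])
for p in hits:
    print(p["text"])
```

Only the first sample paragraph qualifies: the second has the right subject code but no wolf, and the third mentions wolves but is not indexed under 773.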
In the eHRAF database first click on
the blue Search button to return to Text Search. In the OCM
Subject Codes box type in 773. In the Exact Word or Phrase section type the truncated
words "wolf*" in the first box, the word
"wolves*" in the second box, and the word
"canis*" in the third box. Change the
Boolean operators to "or" and make sure that
the pull-down
box is at the default "Paragraphs." In the "Culture/OWC"
select All Cultures. Click the gray Search button
to execute the search.
Figure 25. OCM Subject Code "773" combined with the words "wolf," "wolves," or "canis."
Ex4: Culture View
Figure 26 is the culture view page showing a list of cultures
organized by regions, indicating the document matches.
In the database scroll down to "North
America" in the "culture view." Click on Blackfoot, NF06
to enter the "document view" (see Figure 26 in the eHRAF
Tutorial).
Figure 26. List of cultures with number of matches found in documents
Ex4: Document View
As you can see in Figure 27, all the paragraphs containing the words "wolf" or
"wolves" or "canis" will appear in the document view, but the OCM subject code 773
is not shown at this stage. Clicking on a "match" will
open up the entire paragraph. If you see
a high number of "matches," you can immediately go to all the
paragraphs within the context of a chapter or section by clicking on the title
heading right
above the match. Notice how the OCM/word search has changed the
document view for the Schultz document titled "The Sun God's
Children." Scroll up in the eHRAF Tutorial and in Figure 16 you will
see an excerpt of the chapter titled "Chapter III When Men and
Animals were Friendly" with the paragraphs containing the OCM subject
code 773.
In the database, in the
document view underneath the chapter title "Chapter III When Men and
Animals were Friendly" click on the match (see Figure 27 in
the eHRAF Tutorial). This will retrieve a single paragraph (not shown as
a figure). In that paragraph click on the heading titled Chapter III
When Men and Animals were Friendly to find the paragraph within the
context of the chapter. Click your Internet browser's Back
button twice to return to the "documents view"
page. Now click on the heading title Chapter III When Men and
Animals were Friendly and you will see that it puts you in the same
place as clicking on the heading in the paragraph.
Figure 27. "Snapshot" view of paragraphs containing the word "wolf"
Example 5: Search for Graphics
I am often asked how to search for images in the eHRAF database. Using the
computer programming term "<graphic" in the
word box can help you locate images. Figure 28 is
an excerpt from Text Search and shows how the word "<graphic"
is used in combination with the
OCM subject code "773" to search for any
illustrations pertaining to mythologies.
In the database
first click on the Search button to return
to Text Search. In the OCM Subject Codes box type
in "773." In the Exact Word or Phrase section type the
word "<graphic" in the first box.
Make sure that the pull-down
box is at the default "Paragraphs." In the "Culture/OWC"
select All Cultures. Click the gray Search button
to execute the search. Browse the documents for
results.
Figure 28. Searching for "<graphic" and "773" in Text Search
Summary for Text Search
I hope that the search
examples have given you some insight into the
"search logic" of eHRAF. The first search
example demonstrated the common mistakes that users make
when first using Text Search in eHRAF--producing zero
results (or a very low number of results) by simply using
too many terms in the boxes and by using too many terms
not likely to be found in ethnographic texts.
Choosing the
appropriate Boolean operator is important because
whether you are choosing "and" or
"or" will make a great difference in your
search results.
The search examples
have shown how you can develop a search and the
different ways you can retrieve information from one and
the same document of one culture. The
"Word-in-Title" search, shown in Figure 18,
makes for a good start in searching in eHRAF, but the
OCM subject search (Example 3) is the ultimate power
search.
Search Example 4 showed you how to refine
your OCM subject code search by adding word(s) to search
narrower aspects of topics. For example, OCM
773 with the word "wolf" found
information on the "role of wolf as
mythological creature."
I usually call the OCM/word search
a "filter" search because the OCM
"filters" the word into the appropriate
context of text. You can completely redirect your search
by changing the OCM subject code. For example, if
you change the OCM to "825" (Ethnozoology)
but leave the word "wolf," your search would
retrieve information on a culture's notions about
wolves instead of mythology.
Truncating your words can often make a big difference in
your search results. For example, searching the
word "mythology" in titles retrieves 48
matches in 43 documents; however, searching the
word "myth*" in titles increases the
results to 407 matches in 176 documents. Also note that
you can truncate not only words but also OCM subject
codes to two digits (e.g., 77*).
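Truncating a subject code works the same way as truncating a word: the two remaining digits select every three-digit code that begins with them. The Python sketch below assumes this "starts with" behavior; the list of codes is just a sample and says nothing about what each code covers.

```python
def code_matches(pattern, code):
    """Match an OCM code against a pattern; a trailing * means 'starts with'."""
    if pattern.endswith("*"):
        return code.startswith(pattern[:-1])
    return code == pattern

# A small sample of three-digit subject codes (meanings not shown here).
codes = ["771", "773", "778", "825"]

broad = [c for c in codes if code_matches("77*", c)]
narrow = [c for c in codes if code_matches("773", c)]

print(broad)
print(narrow)
```

The truncated pattern selects three of the four sample codes, while the full code selects only itself.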
Because language varies widely
over time, a word search should avoid "trendy"
or recent vocabulary. The OCM categories are
particularly valuable when wording is very variable over
time.
The eHRAF database is an extremely powerful database if
you are interested in exploring cultural
diversity. However, it is not the most intuitive
database and searching it does take some practice and
patience. You usually have to "build"
your searches into meaningful results by trying various
search methods. I find that the texts in eHRAF can be a
wonderful help in building a search. Once you find
texts with good results take a closer look at the
paragraphs. The kinds of words used by the
author and the other OCMs appearing at the top of the
paragraphs can often give you leads in redefining your
search. Visit the section "Search Methods, tips,
and examples" in the eHRAF User Guide for more
ideas on how to use words in combinations with OCM
Subject Codes. The eHRAF User Guides at www.yale.edu/hraf/userguides.html
also contain a HRAF Glossary of Terms and brief
overviews of the Browse and Search menus.