Digital Humanities

Here you will find a guide to Digital Humanities resources.

What is Voyant Tools?

According to their website, "Voyant Tools is a web-based reading and analysis environment for digital texts." You will use it to analyze a large number of texts at once.

How Do I Upload Texts into Voyant Tools?

The group of texts you add to Voyant Tools at one time is called a corpus.

Home page of Voyant Tools

As you can see from the screenshot above, you can add texts in four ways: pasting the URLs of the texts on separate lines, pasting the full body of the texts, opening one of the two corpora Voyant Tools provides, or uploading multiple texts you have already downloaded to your computer. For this example, I will be using Voyant Tools' corpus of "Austen's Novels."

Screenshot of uploading Voyant Tools' corpus of Austen's Novels to analyze

 

Navigating Voyant Tools

When you first upload your corpus and click "Reveal," your screen will change into something similar to this.

The main page of Voyant Tools once a corpus is uploaded. There is a circle and "1" next to "Cirrus" on the top left corner, a circle and "2" next to "Reader" on the middle of the top of the page, a circle and a "3" next to "Trends" in the top right corner, a circle and a "4" next to "Summary" on the bottom left corner, and a circle and a "5" next to "Contexts" in the bottom right corner.

There are five boxes of analysis Voyant Tools immediately gives you. As numbered above, those boxes are:

1. Cirrus

2. Reader

3. Trends

4. Summary

5. Contexts

Because these five tend to be some of the most used tools, I will go into more depth on each of them in the following boxes and then give an overview of other tools you can use.

If you want to focus on one tool, you can open it in a new window. Let's do it with the Cirrus tool:

A close-up of the "Cirrus" section on the top left corner of the main page. There is a circle around a box with an arrow coming out of its top right corner.

As marked in the screenshot above, click the small box with an arrow leaving its top right corner.

The "Export" pop-up with "a URL for this view (tools and data)" selected.

This small menu will pop up. Click "Export," and a new window will open with just the tool you want to focus on.

Using Voyant Tools: #1 - Cirrus

As the first tool I described in the previous box was Cirrus, let us take a look at how to use it. Cirrus is a great way to visualize the words used within a corpus!

The "Cirrus" tab in its own window. It is filled with different words of varying sizes and colors. There is an arrow with a "1" pointing to the "Scale" drop-down bar on the bottom left corner next to the "Terms" slider bar with an arrow and "2" pointing at it.

Here we have the Cirrus box in its own window. It is also known as a word cloud, and it shows the frequency of words used within the corpus. The size of each word depends on how frequently it is used within the corpus (larger ones = more uses, smaller ones = fewer uses). Numbered in the bottom left corner are:

1. Scale: Where you can select whether Cirrus will show the word frequencies of the entire corpus you have uploaded or only of the texts you select.

2. Terms: Use the slider to adjust how many terms you want Cirrus to include.

A close-up of the "Cirrus" word bubble. "mr" is the largest word. It is in blue, near the middle of the page, and has a bar over it saying "mr: 3117."

If you put your cursor over any of the words, Cirrus will tell you how many times that word is used within the corpus. For example, "mr" is used 3,117 times within Jane Austen's work. Clicking on any of the words in Cirrus' own window or on the main tool page will affect the other tools by having them analyze the word selected (which we will explore shortly).
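If you are curious what kind of counting sits behind a word cloud, here is a minimal Python sketch. It is not Voyant Tools' own code; the two-sentence corpus, the simple letters-only tokenizer, and the function names `tokenize` and `term_counts` are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple letters-only word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(documents):
    """Count how often each word occurs across every document in the corpus."""
    counts = Counter()
    for text in documents:
        counts.update(tokenize(text))
    return counts


# Hypothetical miniature corpus, invented for this example.
corpus = [
    "Mr Darcy bowed. Mr Bingley smiled.",
    "Mrs Bennet spoke to Mr Bennet.",
]

counts = term_counts(corpus)

# The terms with the highest counts are the ones a word cloud would draw largest.
for term, count in counts.most_common(3):
    print(f"{term}: {count}")
```

In this toy corpus "mr" occurs three times, so it would be the largest word, much as "mr" is the largest word in the Austen example.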

Using Voyant Tools: #2 - Reader

Our next tool to explore is Reader. This one is fairly simple, as its main function is to let you read the texts you uploaded.

The "Reader" tab in its own window. There is text from Austen's works on top of boxes of various colors and sizes. There is also a vertical blue line at the very beginning of the boxes.

As seen above, the main area of this tool is where you can read the texts you have uploaded as a corpus. The different colored and sized boxes on the bottom represent the different documents of the corpus. Clicking any location in the boxes will take you to that location within the text. In the above picture, you can see a vertical blue line at the beginning of the boxes; this lets you know where in the corpus you are.

The "Reader" tab. The word "LOVE" in the first line of the text is highlighted, and there is a box below it that says "document frequency: 54."

If you place your cursor over any word in the document, it will tell you how many times that term is used within the specific document you are in. For example, the word "love" appears 54 times in the first work of this corpus.

 

Let's see what happens when we select a word from the drop-down bar near the bottom left corner. I'm going to choose "mr," since it is the most used word in this corpus.

The "Reader" tab. The boxes on the bottom of the screen now have frequency lines going through them. In the fourth, purple box, there is a pop-up box that reads "1813 Pride and Prejudice  mr: 0.0004909823." On the bottom left corner, there is a circle highlighting a drop-down bar where "mr" is selected.

There are now jagged lines throughout all the boxes of this corpus. This shows you how frequently a word is used in each document. If you place your cursor over a section of a line, it will tell you the frequency of the word in relation to all words within the document. For example, the frequency of "mr" in a section of Pride and Prejudice is 0.0004909823.

Using Voyant Tools: #3 - Trends

Next, we have the "Trends" tool. As you will see below, Voyant Tools automatically chose to analyze the top five most used words in this corpus: "mr," "mrs," "said," "miss," and "think." If you would like to change the words the tool analyzes, click on the drop-down menu on the lower left corner, similar to other tools. Or, you could click on the word above the chart that you would like to get rid of. Voyant Tools automatically chooses the "Line + Stacked Bar" graph to show you the trends, but you can change this by clicking the "Display" drop-down menu on the bottom left corner.

The "Trends" tool tab. On the top are the words "mr" next to a blue circle, "mrs" next to a green circle, "said" next to a pink circle, "miss" next to a purple circle, and "think" next to a teal circle. Below is a graph showing the frequencies of each word in each document through bar graphs and lines. The x-axis is each document in the corpus while the y-axis is the "Relative Frequencies" of each word. On the bottom left corner, there is a circle around a drop-down menu for term selection next to a circle around the "Display" drop-down menu. There are boxes highlighting the frequencies of "mrs," "said," and "think" in Sense and Sensibility.

If you hover your cursor over any of the points or bars, Voyant will tell you the frequency of the word in that text. For instance, in Sense and Sensibility, "mrs" has a relative frequency of 0.0044182, "said" has one of 0.0033095, and "think" has one of 0.0017506. As the pop-up boxes say, you can "Double-click to drilldown." Doing so will bring up a pop-up menu that allows you to choose to make a graph either of the term ("View the distribution of this term within all documents") or of the document ("View the distribution of all current terms within this document"). Below, you can see the distribution of "mr" across all documents (each color representing a different text) and then the distribution of all terms in Pride and Prejudice.

Distribution of the term "mr" throughout each text.

Distribution of each term throughout Pride and Prejudice.
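To get a feel for what a relative frequency is, here is a short Python sketch. It assumes the definition given in the Reader section (how often a word occurs in relation to all words within the document); Voyant Tools' exact tokenization may differ, and the two mini-documents and the function name `relative_frequency` are made up for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple letters-only word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relative_frequency(term, text):
    """Occurrences of `term` divided by the total number of words in `text`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return tokens.count(term.lower()) / len(tokens)


# Hypothetical miniature documents, invented for this example.
documents = {
    "Document A": "Mrs Jennings said that Mrs Palmer would think it over.",
    "Document B": "Mr Darcy said nothing.",
}

# One value per term per document, like the points on a Trends graph.
for title, text in documents.items():
    for term in ("mrs", "said"):
        print(f"{title}  {term}: {relative_frequency(term, text):.4f}")
```

Because each count is divided by the length of its own document, a short text and a long text can be compared on the same graph.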

Using Voyant Tools: #4 - Summary

Compared to the other tools, the "Summary" tool is quite straightforward. 

A photo of the "Summary" tool for the corpus of Austen's Works. The different summaries include "Document Length," "Vocabulary Density," "Average Words Per Sentence," "Readability Index," "Most frequent words in the corpus," and "Distinctive words (compared to the rest of the corpus)"

As you can see above, there are many different summaries of the entire corpus for you to explore! You can compare "Document Length," "Vocabulary Density," "Average Words Per Sentence," "Readability Index," "Most frequent words in the corpus," and "Distinctive words (compared to the rest of the corpus)." Next to the first four summaries are small graphs that visualize the highs and lows of each summary in the order the works appear in the corpus.

You can adjust how many words you would like to appear in the "Most frequent words in the corpus" and "Distinctive words (compared to the rest of the corpus)" summaries by moving the "Items" slider bar circled below. As you can see, there are many more words in these sections than before.

The "Most frequent words in the corpus" and "Distinctive words (compared to the rest of the corpus)" summaries of the "Summary" tool with many more words than there were before. On the bottom left corner, there is a circle around the "items" slider bar.

Finally, similar to Voyant's other tools, if you click any of the document titles or any of the "Most frequent words," Voyant will take you to a new main page with the focus on the selected words and/or titles.
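Three of the Summary measures are easy to approximate yourself. The Python sketch below is only an illustration, not Voyant Tools' implementation: it assumes that vocabulary density means unique words divided by total words, it uses a naive sentence splitter, and the sample text and the names `summarize` and `split_sentences` are invented here. The Readability Index is left out because this guide does not say how it is calculated.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple letters-only word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break on runs of '.', '!' or '?'."""
    return [part for part in re.split(r"[.!?]+", text) if part.strip()]


def summarize(text):
    """Return a few Summary-style measures for a single document."""
    tokens = tokenize(text)
    sentences = split_sentences(text)
    return {
        # Total number of word tokens.
        "document_length": len(tokens),
        # Assumed definition: unique words divided by total words.
        "vocabulary_density": len(set(tokens)) / len(tokens) if tokens else 0.0,
        # Total words divided by the number of sentences found.
        "average_words_per_sentence": (
            len(tokens) / len(sentences) if sentences else 0.0
        ),
    }


# Hypothetical sample text, invented for this example.
sample = "She said yes. He said no. They said nothing more."
summary = summarize(sample)

for name, value in summary.items():
    print(f"{name}: {value}")
```

A lower vocabulary density means the text repeats its words more often; here "said" appears three times, which pulls the density below 1.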

Using Voyant Tools: #5 - Contexts

The fifth and final tool on the main page is "Contexts." It shows you the context of different words within the different works of the corpus, helping you to better understand how the authors used each word in their texts.

I have numbered the major components of this tool.

1. Document: Tells you which document the context of the word was found in.

2. Left: The words before the usage of the word you have selected to analyze; the first half of the context the word is in.

3. Term: The word (or term) you have selected to analyze. As "mr" is the most used term in the corpus, it is our example. 

4. Right: The words after the usage of the word you have selected to analyze; the second half of the context the word is in.

5. Search Bar: Similar to other tools, this is where you state which term(s) you want to find the context of.

6. Context: Using this slider bar will increase or decrease the number of terms on both left and right sides of the context.

7. Scale: Using this drop-down menu, you can select which document(s) you want to analyze in particular.

The "Contexts" tool. There is a circle around and a "1" near the "Document" column on the top left, a circle around and a "2" near the "Left" column in the top middle, circle around and a "3" near the "Term" column, and circle around and a "4" near the "Right" column, all following the "Left" column. On the bottom left, there is a circle around the search bar with a "5" next to it, a circle around the "context" sliding bar with a "6" next to it, and a circle around the "Scale" drop-down bar with a "7" next to it.

As seen below, if you click on the small "+" in the box to the left of a row (seen as a "-" in the photo below, as it is already expanded), you can see how the term fits into a larger context of the text. The term you have selected to analyze will always be highlighted in yellow. Using the "expand" slider bar will increase or decrease the number of terms you will see in the expanded excerpt; this has no relation to the number of terms you will see in the "Left" and "Right" columns.

The "Contexts" tool with a section expanded. There is a circle on the top left side around a "-" in a box, showing how the selection has expanded. There is also a circle around the "expand" slider bar near the bottom center. There is a larger excerpt of words in the main area with one of the terms "Mr" highlighted.
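The Left, Term, and Right columns described above form what is often called a keyword-in-context listing. Here is a minimal Python sketch of the idea. It is not Voyant Tools' code; the sample sentence and the function name `keyword_in_context` are made up, and the `context` parameter is only meant to mimic the "Context" slider.

```python
import re


def keyword_in_context(text, term, context=5):
    """Return one (left, term, right) row for each occurrence of `term`.

    `context` is the number of words kept on each side of the term.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == term.lower():
            left = " ".join(tokens[max(0, i - context):i])
            right = " ".join(tokens[i + 1:i + 1 + context])
            rows.append((left, token, right))
    return rows


# Hypothetical sentence, invented for this example.
sample = "Yesterday Mr Bennet walked to town and met Mr Bingley near the church."

rows = keyword_in_context(sample, "mr", context=2)
for left, term, right in rows:
    print(f"{left:>12} | {term} | {right}")
```

Raising `context` keeps more words on both sides of each match, which is the same trade-off you make when you move the "Context" slider.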

Final Things

That's how the five main tools of Voyant work! However, there are many more for you to explore on your own!

As you can see on the main page, there are other tools listed next to the main tool. Clicking on one of those will replace the tool in the window with the new one. You can also click the Windows button to see a list of all the different tools!

The main Voyant Tools page. There are circles around the named tools next to each of the five tools already covered. There is also a circle around the Windows symbol on the top left window with another circle around the list of tools below it.

If you would like further guidance, click this link to Voyant Tools' Help Guide: https://voyant-tools.org/docs/#!/guide

Stopwords:

Voyant Tools already has a list of stopwords that will not show up in your analyses. For example, the words "the," "you," and other common words will be excluded from analysis. If you would like to see the stopwords and edit them (for example, edit out any random links that show up in your corpus), click on the tab circled below in any of the tool windows.

A close-up of the top left box. There is a circle around the tab next to the Window symbol and question mark.

The following screen will show up:

The "Options" pop-up box. There is a circle around the "Edit List" button, and the two terms next to it ("Auto-detect" and "apply globally") are underlined.

Click the "Edit List" button to view your list (as shown below). Above, you can also see that "Auto-detect" is underlined; if you would like to replace the given stopwords with none or with your own, click the drop-down menu. Checking the box next to "apply globally" will apply all the stopwords in your list to every tool; leaving it unchecked will apply them only to the tool window you opened the options in.

The "Edit Stoplist" pop-up. Some of the stopwords are visible. Above the words is written "This is the stoplist, one term per line."
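To see what a stoplist does to a word count, here is a small Python sketch. The six-word `STOPWORDS` set is a made-up stand-in (Voyant Tools' real lists are far longer), and the sample sentence and the function name `count_without_stopwords` are also assumptions for illustration.

```python
import re
from collections import Counter

# Tiny hypothetical stoplist, invented for this example.
STOPWORDS = {"the", "you", "and", "to", "a", "of"}


def count_without_stopwords(text, stopwords=STOPWORDS):
    """Count words in `text`, skipping any word found in `stopwords`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(token for token in tokens if token not in stopwords)


# Hypothetical sentence, invented for this example.
sample = "You and the captain walked to the end of the pier."

counts = count_without_stopwords(sample)
print(sorted(counts))

# Editing the list: add your own term, one entry per word.
custom_stopwords = STOPWORDS | {"pier"}
custom_counts = count_without_stopwords(sample, custom_stopwords)
print(sorted(custom_counts))
```

Without the stoplist, "the" would be the most frequent word in the sentence; with it, only the content words are left to analyze.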

Finally, hovering over any of the question mark symbols, either above the tool window or in the search box, will provide you with more details for each tool and how to manipulate the syntax of your terms. Clicking on the question mark symbol will bring up a pop-up box that you can click to learn more information. As seen below, the cursor is hovering over the "?" next to the "Trends" search bar, causing a syntax information box to float above it. There is also a circle around the "?" of the "Trends" window itself.

A close-up of the "Trends" window. The details are described in the passage above.

Good luck on your Voyant Tools journey! Feel free to ask for help from a professor or librarian at any time.

Contributor

Fall 2022 Library Digital Humanities Intern: Meeghan Bresnahan