Experimental linguistics with Python (I)

Displaying your experiments in HTML

Posted by Mario Casado on 2020-10-16
Experimental linguistics, Web development

Introduction

When running an experiment, we often need to gather data from participants: auditory information, judgements or even sociological aspects. Designing an interface to cope with this has several advantages. First, it saves you from handling individual forms for each task and participant. Second, it removes the later step of putting all the experiment data together, shaping it and typing it into the computer. Third, it makes data preprocessing easier, as the data can be treated with all the power of Python right after it is collected and before it is stored.

The point here is: why would you bother to design a form and collect data on paper, only to type it into the computer afterwards, when you could have collected it on the computer in the first place?

Prerequisites

  • Basic knowledge of HTML and templates
  • Basic knowledge of Python
  • Familiarity with experimental linguistics

Designing the experiment

First of all, we must make our experiment design clear. In this particular case, we will play different synthesized voices and ask participants to rate how natural each voice sounds on a scale from 1 to 3, where 1 is very unnatural and 3 is very natural. We will also collect some personal data from participants: age, origin (city, country) and highest qualification.

To preserve participants' anonymity, they are assigned a random ID as soon as they fill out the personal data form. At no point will the experiment keep any details that identify them. During the experiment, personal data and responses are appended to JSON files containing all the experiment data.
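A minimal sketch of that idea, using only the standard library: the two records below share nothing but a random ID, so responses can be grouped per participant without storing a name. The helper name new_participant_id() and all field values are made up for illustration.

```python
from uuid import uuid4


def new_participant_id():
    """Return a random identifier that carries no personal information."""
    return str(uuid4())


participant = new_participant_id()

# Hypothetical records: survey answers and one judgement, linked only by the ID
survey_record = {"id": participant, "born": "1990-05-01", "origin": "Madrid, España", "studies": "3"}
judgement_record = {"user": participant, "audio": "voice_a.wav", "judgement": "2"}

# The shared ID is the only thing that ties both records together
print(survey_record["id"] == judgement_record["user"])
```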

Tools

For the web server, we will be using Flask, a web framework for Python that works with HTML templates. For the graphical interface, we will be using Flask-Bootstrap, an extension that brings Bootstrap into Flask. Bootstrap is an open-source front-end library that offers plenty of pre-designed HTML elements ready to be added to your code. It will let us style our interface quickly.

Setting up our environment

First of all, we need to get the project dependencies installed. Using a virtual environment is highly recommended. Just run the following command and wait for it to finish.

$ pip3 install Flask flask-bootstrap

Building Flask structure

Flask lets you set up a web server with very few lines of code. To create and initialize the app, create a Python file (for instance, app.py) as follows:

# app.py file

from flask import Flask, render_template, request
from flask_bootstrap import Bootstrap

app = Flask(__name__)
Bootstrap(app)

That creates our Flask app (the app variable) and applies the Bootstrap styles to it (the Bootstrap(app) call).

Routing structure

Web services are structured in routes, much like folder structures. Think of browsing the files on your PC. You start at your user's root directory, represented as user/, and you go deeper into folders that keep adding to that path: user/documents/personal/. Websites work alike: the root is the main URL, say www.example.com, and each possible route adds to it: www.example.com/users, www.example.com/login, etc.
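To make the folder analogy concrete, here is a small sketch that splits a few URLs into host and route with the standard library's urllib.parse. The helper route_of() is a hypothetical name, and the example.com addresses are the ones used above.

```python
from urllib.parse import urlparse


def route_of(url):
    """Return the path part of a URL, which is what a web framework matches routes against."""
    path = urlparse(url).path
    # A bare URL has an empty path, which corresponds to the root route "/"
    return path or "/"


for url in ("http://www.example.com", "http://www.example.com/users", "http://www.example.com/login"):
    print(urlparse(url).netloc, route_of(url))
```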

Our web service will have two routes or endpoints: one for the main page or index and the other for the proper experiment page.

Designing our templates

Flask incorporates an HTML template engine (Jinja2) that reads from a templates/ folder. That means we don't need to write every page of our site as a standalone HTML file. We can design the parts of our website separately and reuse those HTML blocks wherever we want.

Flask-Bootstrap has a base template with predefined blocks to start with. To start building pages, we just have to extend that template (bootstrap/base.html) and start writing our own blocks.

Common elements

All of the pages in our website will have a navigation bar (navbar) at the top with the title of the site (could be our name, the experiment’s name, etc.) and all of the main body contents inside a Bootstrap container so they are shown with some margins. Because of that, we will include these two elements within our base file.

<!-- templates/base.html file -->

{% extends 'bootstrap/base.html' %}

{% block navbar %}
<nav class="navbar navbar-default">
    <div class="container-fluid">
        <div class="navbar-header">
            <a class="navbar-brand" href="/">Experiment</a>
        </div>
    </div>
</nav>
{% endblock %}

{% block content %}
<div class="container">
    {% block main %}{% endblock %}
</div>
{% endblock %}

As you can see, we first declare that we are extending the Bootstrap base template with the {% extends %} tag. We then define each block that we are modifying within the tags {% block %}{% endblock %}. Within the content block, we have drawn a Bootstrap container and, so that all of the page contents fit inside it, we have defined a new block called main. Save this code as base.html inside the templates/ folder. It will be our personal base template for the whole website.
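As a rough analogy of what block reuse buys us, the sketch below fills one shared page skeleton with different main contents. It does not use Flask's template engine: it relies on string.Template from the standard library, and the names BASE and render_page() are invented for this illustration.

```python
from string import Template

# Shared skeleton: a navbar plus a container with a slot for the page body
BASE = Template(
    "<nav>$title</nav>\n"
    '<div class="container">\n'
    "$main\n"
    "</div>"
)


def render_page(main_html, title="Experiment"):
    """Fill the shared skeleton with the contents of one page."""
    return BASE.substitute(title=title, main=main_html)


index_page = render_page("<h1>Welcome to the experiment</h1>")
end_page = render_page("<h1>This is the end</h1>")
print(index_page)
```

Both pages get the same navbar and container for free, which is the role our base.html plays for every template that extends it.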

Index page

Our index page will add a welcome message and a brief presentation of the experiment to the base template, together with a button to get it started. As before, we just have to extend our base template and start writing our blocks.

<!-- templates/index.html file -->

{% extends 'base.html' %}

{% block main %}
<h1>Welcome to the experiment</h1>
<p>We are glad you have decided to participate in our project. We want to thank you for your help.</p>
<p>This project seeks the highest degree of naturalness in synthesized voices. That's why we are putting some of our best <i>Text-to-Speech</i> voices to the test.</p>
<p>In this experiment you will be played some audio clips with voices and you will be expected to grade them on a scale from 1 to 3, where 1 is very unnatural and 3 is very natural.</p>
<p>We don't collect any private data. All the info gathered in this experiment is stored preserving the anonymity of participants.</p>
<a role="button" class="btn btn-default" href="/experiment">Get started</a>
{% endblock %}

As you can see, this time we have extended our personal base template (the one containing our navbar) and we have filled in the main block (the one we defined inside a container). Observe how we have set the href attribute of the a tag to /experiment, so clicking that button takes the participant to the experiment. Save this file as index.html and we're done.

You should have by now the following directory structure:

project/
    app.py
    templates/
        base.html
        index.html

Checking our index page

Now that we have our first page ready, let's take a quick look at it. For that, we need to create the route within our site. We could host the index under the route /index, /home, etc.; however, we will just use the root (our bare URL), as we want this page to be the first one that users see. To define a new route in Flask, we use the decorator @app.route() on a function that serves our HTML template. That function simply returns the result of Flask's render_template(), which sends the desired HTML template to the browser.

# app.py file

@app.route('/')
def index():
    # Serve the index template at the root URL
    return render_template('index.html')

Run $ flask run -p 3131 in a terminal and open a web browser at the URL localhost:3131. You will see your index page. You can stop the Flask server by pressing Ctrl+C in the terminal.

Experiment templates

For the experiment’s page, we have to cover three different scenarios:

  • At the beginning, we display a survey for personal data.
  • After the survey is filled in, we render a form to complete the experiment.
  • When all the audio files have been played, we should show a final thanks message.

To accomplish that, we will design a different template to cover each scenario and we will then set up the conditions to render one template or another in our Flask app.

The survey

For the survey we will use the basic Bootstrap form and rewrite it with our data.

<!-- templates/survey.html file -->

{% extends 'base.html' %}

{% block main %}
<h1>Fill in your info</h1>
<form action="/experiment" method="POST">
    <div class="form-group">
        <label>When were you born?</label>
        <input type="date" name="born">
    </div>
    <div class="form-group">
        <label>Where are you from? (City, Country)</label>
        <input type="text" name="origin" placeholder="Madrid, España">
    </div>
    <div class="form-group">
        <label>What's your highest level of studies?</label>
        <select name="studies">
            <option selected disabled>Choose one</option>
            <option value="1">None</option>
            <option value="2">Primary studies</option>
            <option value="3">Secondary studies</option>
            <option value="4">Higher studies</option>
        </select>
    </div>
    <input type="hidden" name="survey">
    <button type="submit" class="btn btn-default">Send</button>
</form>
{% endblock %}

Again, we extend our base template and build this page's main block. To collect the values that participants send, it is important to use an HTML <form> and set its method attribute to POST. Otherwise, the browser would send the values in the URL and they would not show up in request.form, which is where our app will read them. We have set the form to send all of the data to the /experiment route, as seen in the action attribute. Within the form, there is one <input> or <select> tag for every piece of data that we want to get from the participant.

In order to easily detect that the form comes from the survey and not from other steps, there is a hidden <input> tag. A hidden input lets us send a field in the form that is not shown to the user on the website. In this case, we send a field named survey, and the app only needs to check that it is present.
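To see what such a submission roughly looks like on the wire, the sketch below encodes the survey fields the way a browser encodes a POST form body and decodes them again with urllib.parse. The field values are invented, and Flask does this decoding for us behind request.form; this is only an illustration with the standard library.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical answers; the hidden "survey" field travels with an empty value
body = urlencode({"born": "1990-05-01", "origin": "Madrid, España", "studies": "3", "survey": ""})
print(body)

# keep_blank_values=True keeps the empty hidden field instead of dropping it
form = {key: values[0] for key, values in parse_qs(body, keep_blank_values=True).items()}

# The presence of the key is enough to tell the survey apart from a judgement
is_survey = "survey" in form
print(is_survey, form["origin"])
```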

This form will look like this: [screenshot of the rendered survey form]

Final page

Once participants have listened to and judged the last audio clip, they are shown a final page thanking them for participating. Say…

<!-- templates/end.html file -->

{% extends 'base.html' %}

{% block main %}
<h1>This is the end</h1>
<p>Thank you for participating</p>
{% endblock %}

Which would render as follows: [screenshot of the final page]

The experiment

The last and most important part is managing the experiment page. First, we have the audio clips. Web browsers cannot read arbitrary files from the server's file system, so you cannot simply link from a page to any folder of the project. To serve files on a website, we need to place them in the static folder, the only folder from which Flask serves files to the browser. It usually holds JavaScript and CSS files, images and videos shown on the website, etc.

Flask serves to the browser any file that you place within a folder called static/. Accordingly, you should now have the following project structure.

project/
    app.py
    templates/
        base.html
        end.html
        index.html
        survey.html
    static/

Place all the audio clips in the static directory.

We need to serve audio files one after another. However, every time the participant sends the form to the app, the experiment function runs again from top to bottom. We cannot run a loop within the experiment endpoint, as it would be reset again and again. In order to keep track of the audios already played and keep them in the right order, we will create a Python dictionary that stores each audio with its position in the list. We can use os.listdir() to get the list of audio files and enumerate() to number them, and then build a dictionary for easily accessing each position.

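# app.py file

from os import listdir

# Map each position to an audio file; sorted() keeps the order stable,
# since listdir() returns the names in arbitrary order
data = dict(enumerate(sorted(listdir('./static'))))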

Now all the audio clips are identified by a numeric key in the data variable. The Flask app will exchange the audio key (code from now on) with the forms: in each request received from the experiment, we will get the code, increment it and serve the next audio in our list.
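Here is a small sketch of that bookkeeping that runs without touching the disk. It assumes a hypothetical list of file names instead of a real static/ folder, keeps only .wav files and sorts them so the order is the same on every run; build_playlist() and next_code() are illustrative names, not part of the app.

```python
def build_playlist(filenames, extension=".wav"):
    """Number the audio clips in a stable (alphabetical) order."""
    clips = sorted(name for name in filenames if name.lower().endswith(extension))
    return dict(enumerate(clips))


def next_code(playlist, code):
    """Return the code of the following clip, or None after the last one."""
    following = code + 1
    return following if following in playlist else None


# Hypothetical folder contents, including a file that is not an audio clip
playlist = build_playlist(["voice_b.wav", "notes.txt", "voice_a.wav"])
print(playlist)
print(next_code(playlist, 0), next_code(playlist, 1))
```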

Let’s create the experiment form.

<!-- templates/experiment.html file -->

{% extends 'base.html' %}

{% block main %}
<h1>How natural do you think the following utterance is?</h1>
<audio autoplay>
    <source src="{{ url_for('static', filename=audio) }}" type="audio/wav">
    Your browser does not support the audio element.
</audio>
<form action="/experiment" method="POST">
    <div class="form-group">
        <input type="hidden" name="code" value="{{ code }}">
        <input type="hidden" name="user" value="{{ user }}">
        <label>
            <input type="radio" name="judgement" value="1"> 1 Very unnatural
        </label>
    </div>
    <div class="form-group">
        <label>
            <input type="radio" name="judgement" value="2"> 2 Natural
        </label>
    </div>
    <div class="form-group">
        <label>
            <input type="radio" name="judgement" value="3"> 3 Very natural
        </label>
    </div>
    <button type="submit" class="btn btn-default">Send</button>
</form>
{% endblock %}

The form is similar to that of the survey. There is an HTML audio element that uses Flask's url_for() helper. That function takes the audio variable sent by the app as its filename argument and builds the URL to that file within the static folder. Observe the hidden <input> tags that will send the code and user variables back. Flask allows us to render the template together with some variables and access them by using the {{ }} symbols. We will render this template with three variables: the filename of the audio to be played, the audio code and the user ID.
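As a rough approximation of what ends up in the src attribute, the sketch below joins the static prefix with a percent-encoded file name. Flask's real url_for() does more than this (it looks up the endpoint and respects the app configuration), so static_url() is only a hypothetical stand-in and the file names are invented.

```python
from urllib.parse import quote


def static_url(filename, static_prefix="/static"):
    """Approximate the URL under which a file of the static folder is served."""
    # quote() escapes characters such as spaces that are not allowed in URLs
    return f"{static_prefix}/{quote(filename)}"


print(static_url("voice_a.wav"))
print(static_url("voice 1.wav"))
```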

Take a look at the result: [screenshot of the experiment form]

We will now turn to the web service route /experiment. We need to include there the conditions to serve either the survey, the experiment form or the end page.

# app.py file

from uuid import uuid4

@app.route('/experiment', methods=['GET', 'POST'])
def experiment():
    if not request.form:
        # No form in the request: the participant has just arrived
        return render_template('survey.html')
    else:
        if 'survey' in request.form:
            # Survey just sent: start with the first clip and a fresh random ID
            code = 0
            user = str(uuid4())
        else:
            if int(request.form['code']) == max(data.keys()):
                # The last clip has been judged
                return render_template('end.html')
            else:
                code = int(request.form['code']) + 1
                user = request.form['user']
        return render_template('experiment.html', user=user, code=code, audio=data[code])

First of all, we must enable both the GET and POST methods in order to not only render the HTML templates, but also receive the forms. Observe the conditions: if there isn't a form in the request to the /experiment endpoint, we render the survey, since every other stage of the experiment always sends a form. If the form exists, we check whether it contains the key survey among its fields. If so, the user has just sent the survey form and is waiting for the first clip. We set the code to 0 to send the first audio of our list, and then we create a random ID with the uuid library to link the participant's information.

Whenever there is a form but it is not the survey, we are in the middle of the experiment. We need to evaluate whether we have to render the end page or another audio. We accomplish that by checking whether the code received in the form is the maximum value of our audio codes. If they match, the user has just listened to the last audio and we render the end page. Otherwise, there are audios left to be played, so we increment the code and render the experiment form with three variables: the user ID, the audio clip code and the audio clip name.
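The same decision logic can be written as a plain function that is easy to try out without a running server. This is only a sketch: next_step() is a hypothetical helper that takes the form as a dictionary and returns the template name plus its variables, and the playlist below stands in for the data dictionary.

```python
from uuid import uuid4


def next_step(form, playlist):
    """Decide which template to render and with which variables."""
    if not form:
        return "survey.html", {}
    if "survey" in form:
        # Survey just sent: first clip and a fresh random ID
        code, user = 0, str(uuid4())
    else:
        code = int(form["code"])
        if code == max(playlist):
            return "end.html", {}
        code, user = code + 1, form["user"]
    return "experiment.html", {"user": user, "code": code, "audio": playlist[code]}


playlist = {0: "voice_a.wav", 1: "voice_b.wav"}  # hypothetical clips

print(next_step({}, playlist)[0])
print(next_step({"survey": ""}, playlist)[0])
print(next_step({"code": "0", "user": "abc", "judgement": "2"}, playlist))
print(next_step({"code": "1", "user": "abc", "judgement": "3"}, playlist)[0])
```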

Processing data

So far we have enabled a graphical web service to collect data from participants. However, we still have data processing left. For this experiment, we will create a function to feed data from and to two JSON files: one for the user’s info and the other for the experiment’s judgments.

First of all, we have to preprocess the information that we receive from the HTML pages. Templates are sending data that we don’t need to keep in our database (audio code, survey property…). We will store in a Python dictionary the values that we want to keep:

# For user's data
form_data = {
    'id': user,
    'born': request.form['born'],
    'origin': request.form['origin'],
    'studies': request.form['studies']
}

# For the judgements
form_data = {
    'user': request.form['user'],
    'audio': data[int(request.form['code'])],
    'judgement': request.form['judgement']
}
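Since the design mentions age while the survey asks for a date of birth, this is one place where a bit of preprocessing could happen before storing. The sketch below assumes that the born field arrives as an ISO date string (YYYY-MM-DD), which is how browsers submit an input of type date; age_on() is a hypothetical helper and the dates are made up.

```python
from datetime import date


def age_on(born, today):
    """Return the age in full years on a given day."""
    birthday = date.fromisoformat(born)
    # Subtract one year if this year's birthday has not happened yet
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)


print(age_on("1990-10-17", date(2020, 10, 16)))
print(age_on("1990-10-16", date(2020, 10, 16)))
```

In the app, date.today() could be passed as the second argument; a fixed date is used here so the result does not change over time.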

And then, we will need a function that loads JSON files, appends the new contents gathered from the forms and saves them back. We will use Python’s json library with the functions load() and dump().

from json import dump, load

def save_data(data, file):
    # Load the records stored so far
    with open(file, 'r', encoding='utf-8') as f:
        db = load(f)
    # Append the new record and write everything back
    with open(file, 'w', encoding='utf-8') as w:
        db.append(data)
        dump(db, w)

The first with block loads a JSON file, and the second appends the current data and writes the file back to our project folder. Before going on, it is very important that you create both JSON files containing an empty list (i.e., []) each. The reason is that this function expects existing data to load: if it doesn't find the file, or the file doesn't contain a list, an error will be raised. So by now, we have the following project:

project/
    app.py
    judgements.json
    users.json
    templates/
        base.html
        end.html
        experiment.html
        index.html
        survey.html
    static/
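If you would rather not create the JSON files by hand, a more tolerant variant of the saving function could start from an empty list when the file is missing or empty. This is a sketch under that assumption, not the function used in the rest of the tutorial: append_record() is a hypothetical name, and the demo writes to a temporary folder so it leaves nothing behind.

```python
import json
import tempfile
from pathlib import Path


def append_record(record, path):
    """Append one record to a JSON list file, creating the list if needed."""
    path = Path(path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    records = json.loads(text) if text.strip() else []
    records.append(record)
    path.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
    return len(records)


# Demo in a temporary folder with made-up judgements
with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "judgements.json"
    first = append_record({"user": "abc", "audio": "voice_a.wav", "judgement": "2"}, target)
    second = append_record({"user": "abc", "audio": "voice_b.wav", "judgement": "3"}, target)
    stored = json.loads(target.read_text(encoding="utf-8"))

print(first, second, len(stored))
```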

To make this function run, we will call it after the dictionaries we just created. Let’s see the final look of our app.py file.

# app.py file

from flask import Flask, render_template, request
from flask_bootstrap import Bootstrap
from os import listdir
from uuid import uuid4
from json import dump, load

app = Flask(__name__)
Bootstrap(app)
# sorted() keeps the clip order stable, since listdir() returns names in arbitrary order
data = dict(enumerate(sorted(listdir('./static'))))

def save_data(data, file):
    with open(file, 'r', encoding='utf-8') as f:
        db = load(f)
    with open(file, 'w', encoding='utf-8') as w:
        db.append(data)
        dump(db, w)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/experiment', methods=['GET', 'POST'])
def experiment():
    if not request.form:
        return render_template('survey.html')
    else:
        if 'survey' in request.form:
            code = 0
            user = str(uuid4())
            form_data = {
                'id': user,
                'born': request.form['born'],
                'origin': request.form['origin'],
                'studies': request.form['studies']
            }
            save_data(form_data, 'users.json')
        else:
            form_data = {
                'user': request.form['user'],
                'audio': data[int(request.form['code'])],
                'judgement': request.form['judgement']
            }
            save_data(form_data, 'judgements.json')
            if int(request.form['code']) == max(data.keys()):
                return render_template('end.html')
            else:
                code = int(request.form['code']) + 1
                user = request.form['user']
        return render_template('experiment.html', user=user, code=code, audio=data[code])

Error page

It is not rare for users to get a route or URL wrong. For this reason, it's always smart to design a 404 error page. That way, whenever someone enters a wrong path within our website, they won't get an ugly default error message: we will show our base template with a custom text instead. Here's my suggestion.

<!-- templates/404.html file -->
{% extends 'base.html' %}

{% block main %}
<h1>Error 404</h1>
<div>
    <p>Nothing was found. Please, try again.</p>
</div>
{% endblock %}

In order to render this template on 404 errors, we need to add the following error handler to our app:

# app.py file

@app.errorhandler(404)
def page_not_found(e):
    # Return the custom page together with the 404 status code
    return render_template('404.html'), 404

This is the look: [screenshot of the 404 error page]

Run

To run the web server locally, open a terminal in the project directory and run flask run -p 3131. After that, just open a web browser and go to localhost:3131.

Conclusion

In this tutorial, we have built a simple dynamic website to carry out our experiments. We have used Python's Flask framework for the site's architecture and Flask-Bootstrap for the design. Our web service has two endpoints: an index page at the root path that welcomes the participants, introduces the project and the experiment to them and gets them started; and the route to the experiment. The latter is made up of three HTML templates served under conditional evaluations. If the user has just landed on the page, they are asked for the personal information required for the experiment, which is stored as soon as the app receives it. After that, they are shown the experiment form as many times as there are audios in our corpus. Each time the participant sends a judgement, our app stores the data within our JSON database. Once they have listened to the last clip, a final page is shown thanking them for participating.

Browse the files

You can browse the project we have just built on GitHub.