Sphinx IQ

Extract information from texts

Reminder: the transformation Extract information from texts is available for text variables.

Choose the type of variable you wish to create: lexical, lemmatised, or thematic (analysis via a thematic dictionary).

Lexical variable

You can generate different types of variables whose possible modalities are the words you choose (separated by « ; »). In our example, we consider the three words most frequently cited (health, life and well-being) in response to the question « Picture_comments », which asks respondents to cite the words that come to mind when they see the wall of images.

The lexical measures create variables that let you analyse the content of your text variable. The length is the number of words in the response (if the answer is « life is beautiful », the length is 3, provided tool words are counted). The richness is the number of different words in a response. The banality is the average frequency of the words in a response (the more frequent the words, the higher the banality). Finally, the intensity gives the « rate of presence » of the chosen words (in our case « health », « life » and « well-being ») in a response. For example, if the answer is « The life », the intensity of the word « life » is 50%, because the response contains two words, one of which is « life » (provided tool words are counted).
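The four measures can be illustrated with a minimal Python sketch. It is not Sphinx IQ's implementation: the tokenizer, the function names and the choice to compute banality from word counts over all responses are assumptions made for the illustration.

```python
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, split on whitespace, keep tool words.
    return text.lower().split()


def lexical_measures(responses, chosen_words):
    """Return one dict of measures per response."""
    chosen = {w.lower() for w in chosen_words}
    tokenized = [tokenize(r) for r in responses]
    # Word counts over all responses, used for the banality score (assumption).
    corpus = Counter(word for tokens in tokenized for word in tokens)
    rows = []
    for tokens in tokenized:
        n = len(tokens)
        rows.append({
            "length": n,                      # number of words
            "richness": len(set(tokens)),     # number of different words
            "banality": sum(corpus[w] for w in tokens) / n if n else 0.0,
            "intensity": sum(w in chosen for w in tokens) / n if n else 0.0,
        })
    return rows


rows = lexical_measures(
    ["health and happiness", "my health"],
    ["health", "life", "well-being"],
)
print(rows[0]["length"], rows[0]["richness"])  # 3 3
print(rows[1]["intensity"])                    # 0.5
```

In the second response, one word out of two belongs to the chosen list, hence an intensity of 50%.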

Lemmatised variable

This assistant allows you to create:

  • text variables whose content is the text of the selected variables in lemmatised form (infinitive, masculine, singular): for example, « he drinks » becomes « he drink ».
  • several text variables, one for each grammatical category of the selected variable (verbs, nouns, adjectives): for example, for « I drink water », one variable Verb is created containing the value « drink » and one variable Noun containing the value « water ».

Thematic analysis

From a dictionary of words organised by theme, which you load by clicking Choose a thematic, this assistant creates an ordered variable of the most cited themes, as well as a closed variable for each theme.
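As a rough illustration of the idea, the sketch below ranks themes by the number of dictionary words found in a response. The dictionary content, the function name and the tie-breaking rule are hypothetical; the actual dictionary file format and the matching rules used by Sphinx IQ are not described here.

```python
def thematic_analysis(response, theme_dictionary):
    """Return (themes ordered from most to least cited, hit count per theme)."""
    words = response.lower().split()
    hits = {}
    for theme, vocabulary in theme_dictionary.items():
        vocab = {v.lower() for v in vocabulary}
        count = sum(w in vocab for w in words)
        if count:
            hits[theme] = count
    # Most cited themes first; ties broken alphabetically (assumption).
    ranked = sorted(hits, key=lambda t: (-hits[t], t))
    return ranked, hits


# Hypothetical thematic dictionary.
themes = {
    "Health": ["health", "sport", "doctor"],
    "Pleasure": ["party", "friends"],
}
ranked, hits = thematic_analysis("sport and health with friends", themes)
print(ranked)  # ['Health', 'Pleasure']
print(hits)    # {'Health': 2, 'Pleasure': 1}
```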

Before finishing, select where to insert the new variables.

Regroup the dates

Reminder: the transformation Regroup the dates is available for date variables.

This assistant allows you to group dates by category or class. For example, we can decide to group all our dates by year.

You have two options:

  • Either group the dates into predefined periods, such as month/year, quarter, semester, etc.
  • Or group the dates into custom periods, entered manually (format DD.MM.YYYY; for example, « 01.02.2003 ; 01.02.2004 ; 01.02.2005 » creates four periods: before 01.02.2003, between 01.02.2003 and 01.02.2004, between 01.02.2004 and 01.02.2005, and after 01.02.2005).
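Both options can be sketched in a few lines of Python. The function names are hypothetical, and the handling of a date that falls exactly on a bound is an assumption, since the text does not say which period such a date belongs to.

```python
from bisect import bisect_right
from datetime import date, datetime


def predefined_period(d, kind):
    # Two of the predefined groupings, returned as (year, sub-period) tuples.
    if kind == "month":
        return (d.year, d.month)
    if kind == "quarter":
        return (d.year, (d.month - 1) // 3 + 1)
    raise ValueError(f"unsupported period kind: {kind}")


def parse_bounds(spec):
    # "01.02.2003 ; 01.02.2004" -> sorted list of date objects (DD.MM.YYYY).
    parts = [p.strip() for p in spec.split(";") if p.strip()]
    return sorted(datetime.strptime(p, "%d.%m.%Y").date() for p in parts)


def custom_period(d, bounds):
    # 0 = before the first bound, len(bounds) = on or after the last one.
    # Assumption: a date equal to a bound belongs to the later period.
    return bisect_right(bounds, d)


bounds = parse_bounds("01.02.2003 ; 01.02.2004 ; 01.02.2005")
print(custom_period(date(2002, 6, 1), bounds))         # 0
print(custom_period(date(2004, 6, 1), bounds))         # 2
print(predefined_period(date(2004, 6, 1), "quarter"))  # (2004, 2)
```

Three bounds give period indices 0 to 3, that is, the four periods of the example.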

3. Choose whether to modify the selected variable by recoding it (the initial variable will be lost) or to create a new variable.

4. If you create a new variable, enter at least its name and, optionally, its title.

5. Then choose where to insert the new variable (after the selected variable or at the end of the questionnaire).

Group numbers into classes

Reminder: the transformation Group numbers into classes is available for open numerical variables.

This assistant assigns each observation to a class that you define, so that these classes of observations can be used later in the analyses.

In our example, we want to create age classes. To do so, we select the variable Age (numerical) and arrive at this assistant, which offers two ways of defining classes: either we define the class limits manually (custom class bounds), or we use a method that distributes the observations across a fixed number of classes.

Define the limits of classes

We define the class limits by separating them with « ; ». In our case, we wish to create the classes under 18 years, from 18 to 25 years, from 26 to 50 years, and over 50 years.

We can later modify the class titles, which are created by default, by clicking the dedicated button. For example, we can decide to name the under-18 class « Minors », the class from 18 to 25 years « Active youth », etc.
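The effect of manual limits and custom titles can be sketched as follows. The bounds, the last two titles and the rule that each bound is the first value of the next class are assumptions chosen to reproduce the four age classes of the example; the exact values to enter in the assistant may differ.

```python
from bisect import bisect_right

# Hypothetical bounds and titles matching the four age classes of the example.
BOUNDS = [18, 26, 51]
TITLES = ["Minors", "Active youth", "From 26 to 50 years", "More than 50 years"]


def age_class(age, bounds=BOUNDS, titles=TITLES):
    # Assumption: each bound is the first value of the next class.
    return titles[bisect_right(bounds, age)]


for age in (17, 18, 25, 26, 50, 51):
    print(age, age_class(age))
```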

  • For steps 1 and 2, refer to the introductory section of Transforming a variable.
  • Step 3: choose whether to modify the selected variable by recoding it (the initial variable will be lost) or to create a new variable.
  • Step 4: if you create a new variable, enter at least its name (and, optionally, its title).
  • Step 5: then choose where to insert the new variable (after the variable to transform or at the end of the questionnaire).

Use classes definition method

You can choose between four methods for defining classes:

  • Same amplitude: all the classes created have the same width. For example, in our case we obtain the classes 20-29, 30-39, 40-49, etc.
  • Same value: creates as many classes as there are distinct values of the selected variable. In our example, with the variable Age, if three observations have the values 21, 21 and 42, the method creates two classes: the class 21, which groups two observations, and the class 42, with only one observation.
  • Around the average: creates 3, 5 or 7 interval classes, each interval having a « length » of ½, 1 or 2 standard deviations. The middle class is the interval containing the average. For example, suppose your data set has the values 10, 20, 30 and 40 for the variable « Age ». The average is 25 and the standard deviation is 12.91. If you create five classes around the average with a length of 1 standard deviation, each interval is 12.91 long and the middle one contains the average, 25.
  • Same count: creates classes containing approximately the same number of observations (balanced distribution of observations).
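The following sketch shows one way to compute class boundaries for three of these methods. It is an illustration under stated assumptions, not the software's algorithm: the function names are hypothetical, the middle class is assumed to be centred on the average, the outer classes are assumed to be open-ended, and the sample standard deviation is used because it reproduces the value 12.91 of the example.

```python
from statistics import mean, quantiles, stdev


def equal_width_bounds(values, n_classes):
    # Same amplitude: inner boundaries of n_classes classes of equal width.
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + i * width for i in range(1, n_classes)]


def around_mean_bounds(values, n_classes=5, width_in_sd=1.0):
    # Around the average: n_classes - 1 inner boundaries, spaced by
    # width_in_sd standard deviations, with the middle class centred on the mean.
    m = mean(values)
    w = width_in_sd * stdev(values)
    return [m + (i - (n_classes - 2) / 2) * w for i in range(n_classes - 1)]


def equal_count_bounds(values, n_classes):
    # Same count: cut points that split the data into groups of similar size.
    return quantiles(values, n=n_classes)


ages = [10, 20, 30, 40]
print(equal_width_bounds(ages, 3))                       # [20.0, 30.0]
print([round(b, 2) for b in around_mean_bounds(ages)])   # four boundaries around 25
print(equal_count_bounds(ages, 2))                       # [25.0]
```

With the ages of the example, the middle class under these assumptions runs from about 18.55 to about 31.45, which does contain the average of 25.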

Regroup the codes

Reminder: the transformation Regroup the codes is available for code variables (for example, a postal code). In our case, we have a question that asks respondents for their postal code.

3. You can reduce the code to a certain number of characters. In our example, if we tick this option and enter (1 ;2), the codes appear in the form 38, 69, 73 and 74; if we enter (1 ;2 ;3 ;4 ;5) or leave the field empty, the codes appear in the form 38000, 69000, etc.

4. You can group the codes using a dictionary (.txt file). In our case, we wish to group the codes by department in order to produce statistics by department (or by region, etc.) in the analyses.

5. Or you can group them in a custom way, by manually grouping your codes and naming the resulting groups (for example, select 73 and 74 and create the group Land of Savoie).
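The reduction and grouping of codes can be sketched as below. The function names and the in-memory mapping are hypothetical; in particular, the layout of the .txt dictionary file is not specified here, so the sketch uses a plain Python dict instead.

```python
def reduce_code(code, positions):
    # positions are 1-based character positions, as in "(1 ;2)";
    # an empty list keeps the whole code.
    if not positions:
        return code
    return "".join(code[p - 1] for p in positions if p <= len(code))


def group_code(code, positions, groups):
    # Reduce the code, then replace it by its group name when one is defined.
    short = reduce_code(code, positions)
    return groups.get(short, short)


# Hypothetical custom grouping, following the example in the text.
groups = {"73": "Land of Savoie", "74": "Land of Savoie"}

print(reduce_code("38000", [1, 2]))         # 38
print(reduce_code("38000", []))             # 38000
print(group_code("74000", [1, 2], groups))  # Land of Savoie
print(group_code("69000", [1, 2], groups))  # 69
```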

Manage the modalities

Reminder: the transformation Manage the modalities is available for scale variables and closed variables (unique and multiple).

You can group, rename, delete or reorder the modalities (as a reminder, the modalities are all the possible values or responses of a variable), or create a variable for each modality of the selected variable (the procedure is described below).

In the case of a multiple closed variable, you can « Create a variable containing the number of effective responses » (for example, if the respondent may make at most three choices for a question, the new variable takes the value 0, 1, 2 or 3 according to the number of choices made by the respondent). You can also « Create a variable from the combinations of their modalities » (for example, if the respondent has selected « Home » and « Restaurant » for a question asking « where do you mostly drink », the variable created contains, for this observation, the modality « Home_Restaurant »).

The particular case of a ranked multiple closed question gives you the possibility to « Create one variable for each rank ». (If, for a question, the respondent selects « Home » first and « Restaurant » second, the assistant creates two variables that have « Home » and « Restaurant » respectively as their modality.)
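These three derived variables can be illustrated with a short Python sketch. The function names, the variable names `rank_1`, `rank_2` and the use of `None` for an empty cell are assumptions made for the illustration.

```python
def count_responses(selected, empty_value=0):
    # Number of modalities ticked by the respondent.
    # empty_value is what an empty response yields: 0 or None (empty cell).
    return len(selected) if selected else empty_value


def combine_modalities(selected, separator="_"):
    # One modality built from the combination of the selected ones.
    return separator.join(selected)


def one_variable_per_rank(ranked, n_ranks):
    # rank_1 holds the first choice, rank_2 the second, and so on;
    # ranks the respondent did not fill stay empty (None).
    return {
        f"rank_{i + 1}": (ranked[i] if i < len(ranked) else None)
        for i in range(n_ranks)
    }


answer = ["Home", "Restaurant"]
print(count_responses(answer))           # 2
print(combine_modalities(answer))        # Home_Restaurant
print(one_variable_per_rank(answer, 2))  # {'rank_1': 'Home', 'rank_2': 'Restaurant'}
```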

Regroup the modalities

In our example, we will group the images selected on the wall of images according to whether they show an alcoholic or a non-alcoholic beverage.

To create a group (that is, to group modalities), select the relevant modalities and then click the plus button. In our example, we selected all the modalities referring to an image of an alcoholic beverage, then used the plus button to group them in a group named « Alcoholic beverages ». We proceeded in the same way for the non-alcoholic beverages and for the images that do not reveal the nature (alcoholic or not) of the beverage shown.

Once all the groups have been created (three groups here), click Next.
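Conceptually, grouping modalities amounts to replacing each modality by the name of its group. The sketch below illustrates this for a multiple-response answer; the modality names are hypothetical (the actual image names are not listed here), and keeping ungrouped modalities unchanged is an assumption.

```python
def regroup(selected, groups):
    """Replace each selected modality by its group name, without duplicates."""
    lookup = {member: group for group, members in groups.items() for member in members}
    result = []
    for modality in selected:
        group = lookup.get(modality, modality)  # ungrouped modalities are kept as-is
        if group not in result:
            result.append(group)
    return result


# Hypothetical modalities for the wall of images.
groups = {
    "Alcoholic beverages": ["Beer", "Wine"],
    "Non-alcoholic beverages": ["Water", "Juice"],
}
print(regroup(["Beer", "Wine", "Water"], groups))
# ['Alcoholic beverages', 'Non-alcoholic beverages']
```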

Choose whether to modify the selected variable by recoding it (the initial variable will be lost) or to create a new variable.
Click Finish to validate and quit the assistant.

Create one variable for each of the modalities

From this assistant, you can create a unique closed variable for each modality of the selected variable.

Two options are offered:

  • The new variables contain « Yes » or « No » for each of the modalities chosen on the original variable. Suppose your original variable offers two modalities, for example « Always » and « Never », and the respondent selects « Always ». The two new variables created in your data set will be « Always », with the value « Yes », and « Never », with the value « No ».
  • The new variables contain, depending on the modalities chosen on the original variable, either the name of the modality chosen or a non-response. Suppose your original variable offers two modalities, for example « Always » and « Never », and the respondent selects « Always ». The two new variables created in your data set will be « Always », with the value « Always », and « Never », which will be a non-response (empty field).
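The two options correspond to two encodings of the same information, sketched below. The function name and the `mode` labels are hypothetical, and `None` stands for a non-response (empty field).

```python
def one_variable_per_modality(modalities, selected, mode="yes_no"):
    """Build one new variable per modality of the original variable."""
    chosen = set(selected)
    if mode == "yes_no":
        # Option 1: each new variable holds "Yes" or "No".
        return {m: ("Yes" if m in chosen else "No") for m in modalities}
    if mode == "name_or_empty":
        # Option 2: each new variable holds the modality name or a non-response.
        return {m: (m if m in chosen else None) for m in modalities}
    raise ValueError(f"unknown mode: {mode}")


modalities = ["Always", "Never"]
print(one_variable_per_modality(modalities, ["Always"]))
# {'Always': 'Yes', 'Never': 'No'}
print(one_variable_per_modality(modalities, ["Always"], mode="name_or_empty"))
# {'Always': 'Always', 'Never': None}
```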

If you create new variables, clicking Finish creates as many variables as the original variable has modalities. If you have selected the modalities « Yes and No », the new variables will have « Yes » or « No » as modalities: if the respondent chose the corresponding modality of the original variable, the new variable takes the value « Yes »; otherwise it takes the value « No ».

If you have selected « One modality in the name of the original modality », only the variable corresponding to the modality chosen by the respondent takes the name of the original modality as its value; the others are left « empty ».

Create a variable containing the number of actual responses

From this assistant, you can create a numerical variable containing the number of effective responses to the initial variable (in this case, « Selected Pictures »). For an empty response, you can assign to the new variable either a non-response (empty cell) or the numerical value 0.

In this case, it is not possible to recode the existing variable; a new variable is always created. Choose where you wish to insert this variable (after the initial variable or at the end of the questionnaire). Click Finish to validate and quit the assistant.

Create a variable for each rank

This assistant creates as many new variables as the initial variable has ranks (if the variable has two ranks, two variables are created). Each variable contains the full set of modalities of the original variable and takes as its value the modality chosen at the corresponding rank. Click Finish to validate and quit the assistant.
