Data preparation
Data preparation is a Sphinx iQ feature that gathers, in a single dialog box, the functions found in the Data tab of the Spreadsheet environment. This dialog box can be opened in two ways:
- In the Data tab of the Spreadsheet, click the Data preparation button.
- From the home panel, click the Data preparation button in the Data management stage.

The dialog box that opens lists the functions by tab. Most of them are covered in the Spreadsheet part of the tutorial.
Only two functions are not offered in the Spreadsheet part and are therefore available only from this dialog box: Automatically number, in the Variables tab, and Collect identical surveys, in the Compare / collect surveys tab.
Data qualification is a Sphinx iQ feature used mainly to identify poorly documented or poorly answered variables, to single out unusual observations, and to replace non-responses. For example, some respondents always give the same answer to scale questions in order to finish the survey faster. The resulting dataset may be biased, so its data must be qualified to make them more relevant. This feature can be opened in two ways:
- In the Data tab of the Spreadsheet, click the Qualify button.
- From the home panel, click the Qualify button in the Data management stage.
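To make the idea of a respondent who always gives the same answer to scale questions concrete, here is a minimal Python sketch. It is not how Sphinx iQ performs the detection; the function name `is_straight_liner`, the respondent identifiers, and the sample answers are all hypothetical, and `None` is assumed to mark a non-response.

```python
def is_straight_liner(scale_answers):
    """Return True when every answered scale question has the same value.

    Assumption: None marks a non-response and is ignored.
    """
    answered = [a for a in scale_answers if a is not None]
    # Require at least two answers: a single answer cannot show a pattern.
    return len(answered) >= 2 and len(set(answered)) == 1


# Hypothetical answers to five scale questions (values 1 to 5).
respondents = {
    "r1": [3, 3, 3, 3, 3],      # same answer everywhere
    "r2": [4, 2, 5, 3, 4],      # varied answers
    "r3": [1, 1, None, 1, 1],   # same answer, one non-response
}

# Identifiers of the observations worth reviewing.
flagged = [rid for rid, answers in respondents.items()
           if is_straight_liner(answers)]
print(flagged)
```

Flagged observations are only candidates for review: a respondent may legitimately hold the same opinion on every item.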

You reach a dialog box in which you must select the type of change you wish to make to your data:
- Spot singular or poorly documented observations
- Replace non-responses (with the most cited modality, for example)
- Spot non-relevant variables
- Extract a cylindered sample
1 Select the operation you wish to perform on your dataset. In our example, we will replace the non-responses (the empty cells in the table in the central area).
2 Optionally, click Dataset quality to get a graphical overview of the relevance of your data. The Options button lets you run the dataset quality analysis on a selected group of variables.
3 Select the variables whose non-responses you wish to replace.
4 Choose the value that replaces the non-responses: average, mode (for numerical variables), or previous value (for all types of variables).
Click Detailed results to view the observations affected by the replacement, and Finish to apply the replacements to the dataset.
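The three replacement values listed in step 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the Sphinx iQ implementation: the function `replace_non_responses` and the sample data are hypothetical, `None` stands for an empty cell, and "previous value" is assumed to mean the last answer found earlier in the same column.

```python
from statistics import mean, mode


def replace_non_responses(values, method):
    """Return a copy of values with each None replaced according to method.

    method is "average", "mode", or "previous" (hypothetical names).
    """
    answered = [v for v in values if v is not None]
    if method in ("average", "mode"):
        # mean() and mode() raise StatisticsError if nothing was answered.
        fill = mean(answered) if method == "average" else mode(answered)
        return [fill if v is None else v for v in values]
    if method == "previous":
        result, last = [], None
        for v in values:
            if v is None:
                v = last  # stays None when no earlier answer exists
            result.append(v)
            last = v
        return result
    raise ValueError(f"unknown method: {method}")


# Hypothetical numerical variable with two non-responses.
ages = [20, None, 30, 30, None, 40]

by_average = replace_non_responses(ages, "average")
by_mode = replace_non_responses(ages, "mode")
by_previous = replace_non_responses(ages, "previous")
```

Each call returns a new list and leaves `ages` untouched, which mirrors the idea of reviewing the affected observations before validating the replacements.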
Data management
Data management gives you a detailed view of the data and lets you add, modify, delete, or transform them. It is split into four sections:
- the spreadsheet,
- the qualification,
- the adjustment,
- the data preparation.
The spreadsheet displays the answers given by respondents as a table.
Data qualification lets you spot unusual observations and, if needed, modify some data (non-responses, poorly documented observations…) in order to establish the overall quality of the dataset.
Data adjustment consists in assigning a weight to each respondent according to the category the respondent belongs to.
Data preparation lets you transform or combine variables, or create new variables based on custom calculations. It is also possible to perform merges, imports, etc.
Despite all the effort put into selecting individuals, during e-mailing phases for example, a sample may be "biased": its composition is unsatisfactory because it does not match the representativeness criteria defined beforehand. For example, we expected to reach 60% women and 40% men, but the actual sample turns out differently. We must then "adjust" the sample.
This feature can be opened in two ways:
- In the Data tab of the Spreadsheet, click the Adjust button.
- From the home panel, click the Adjust button in the Data management stage.

The adjustment dialog box opens:
1 Click Add to select the variable on which you wish to adjust the sample.
2 Enter the percentage you wish to obtain for each modality.
In our example, 17.26% of the respondents are currently French, whereas our objective is 20%. The same applies to the other nationalities, the goal being to make the sample representative of the surveyed population on this criterion.
Two calculation methods are available:
- Weight the observations assigns a weight to each observation. For example, if your sample does not contain enough French respondents, each observation with the value France for the country-of-origin variable is assigned a weight greater than 1. Conversely, as we have 33% of Spanish respondents, observations with the value Spain for that variable are assigned a weight lower than 1.
- Extract a sample extracts a sub-sample by assigning the modality 1 or 0 to the new variable Adjustment (1 corresponding to the observations retained to represent the sample).
When you click Finish, a new variable "Adjustment" is created. This variable can be used as a sub-sample (profile) in the board: you can therefore view the boards on the new sample, which the adjustment has made representative of the population studied.
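The weighting method can be illustrated with a short Python sketch. It assumes the simplest scheme, in which the weight of a modality is its target share divided by its observed share; the tutorial does not state the exact formula Sphinx iQ applies, so treat this as an assumption. Under it, the French respondents of the example would receive a weight of about 20 / 17.26 ≈ 1.16. The function `adjustment_weights`, the sample, and the target shares below are hypothetical.

```python
from collections import Counter


def adjustment_weights(observations, targets):
    """Map each modality to target share / observed share.

    Assumption: every modality in targets occurs at least once in
    observations; otherwise the division fails.
    """
    counts = Counter(observations)
    total = len(observations)
    return {m: targets[m] / (counts[m] / total) for m in targets}


# Hypothetical sample of 10 respondents and hypothetical target shares.
countries = ["France"] * 2 + ["Spain"] * 5 + ["Italy"] * 3
targets = {"France": 0.25, "Spain": 0.40, "Italy": 0.35}

weights = adjustment_weights(countries, targets)

# One weight per observation, playing the role of the Adjustment variable.
adjustment = [weights[c] for c in countries]
```

Under-represented modalities (France, Italy) get a weight above 1 and the over-represented one (Spain) a weight below 1. Because the target shares sum to 1, the weights sum to the number of observations, so the weighted sample keeps its original size.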