
The Northern Sweden Diet Database, NSDD

The NSDD is a uniform database integrating self-reported questionnaire data on diet from two large research projects in northern Sweden: DietVIP within the Västerbotten Intervention Programme (VIP) and DietMON within the Northern Sweden MONICA project.

Information on habitual diet in northern Sweden has been collected since 1985 by means of a semi-quantitative food frequency questionnaire.

Both DietVIP and DietMON are ongoing research projects, and the NSDD is thus regularly updated with new data from VIP and the recurrent MONICA screenings.


The Northern Sweden Diet Database (NSDD)

The Northern Sweden Diet Database (NSDD) is a standardised database built around dietary data collected from two major research projects in northern Sweden: DietVIP within the Västerbotten Intervention Programme (VIP) and DietMON within the MONICA screenings.

Since 1985, information on dietary habits has been collected using Food Frequency Questionnaires (FFQ) that capture the intake frequency of common foods in the region, reflecting the entire dietary intake.

Both DietVIP and DietMON are ongoing research projects, which allows the NSDD to be regularly updated with new information collected as part of VIP and from regular MONICA screenings.

The structure of the diet database NSDD

Umeå University is the accountable authority. The database is stored on a secure server at Umeå University and has approval from the Swedish Data Protection Authority.

The NSDD harmonises dietary data from the DietVIP and DietMON projects, along with validation studies of dietary questionnaires and processed dietary data. It has been developed with support from the Swedish Research Council, the Research Council for Working Life and Social Work (FAS), the Research Council for Health, Working Life and Welfare (Forte), and the Swedish Cancer Society.

What is NSDD?

In DietVIP and DietMON, dietary information is collected using a validated, semi-quantitative dietary questionnaire (see the tab “Different versions of the dietary questionnaire”). DietVIP includes data collected annually starting in 1985, and DietMON includes data from screenings in 1986, 1990, 1994, 1999, 2004, 2009 and 2014. The revised and digital version of the dietary questionnaire (FFQ2020) was used to collect dietary data in the 2022 MONICA study (see the tab “The digital questionnaire FFQ2020”).

Since 1992, the dietary questionnaire has been optically readable, and most of the questionnaires collected between 1985 and 1992 have been manually entered.

The division into Long (blue) and Short (orange) refers to the number of questions in the questionnaire. See the tab “Different versions of the dietary questionnaire”.

The design of the dietary questionnaire

Participants in the Västerbotten Intervention Programme and the MONICA study have completed a semi-quantitative dietary questionnaire consisting of three main sections:

- Most common portion size: Provided for three separate food groups (potatoes/rice/pasta, meat/fish, and vegetables) using photo illustrations of four portion sizes according to the figure below.

- Intake frequency: Reported on a nine-point scale from never, once a year, 1–3 times/month, 1 time/week, 2–3 times/week, 4–6 times/week, 1 time/day, 2–3 times/day to 4 times/day or more.

- Dietary supplements: The research participant can indicate intake of multivitamins, multiminerals, iron and selenium over the last 14 days and the last year, respectively.

Some general information is also collected regarding specific dietary and meal habits.

Dietary variables

The basic data from the completed questionnaires include an ordinal variable for each dietary question, with a value on the nine-point frequency scale: never, once a year, 1–3 times/month, 1 time/week, 2–3 times/week, 4–6 times/week, 1 time/day, 2–3 times/day, and 4 times/day or more. In the data files, these are labelled ma01–ma84 for questions from the longer version of the questionnaire, and mat01–mat66 for questions from the shorter version. These basic variables are rarely used directly by researchers, but they can be requested in the withdrawal form.

The basic variables ma01–ma84 and mat01–mat66 are converted into numerical variables that express times/day, where for example 1 time/week becomes 1/7 ≈ 0.143 times/day. These converted variables are listed in the withdrawal form and are what researchers usually request. In the data files, these are labelled da01–da84 for questions from the longer version of the questionnaire, and dat01–dat66 for questions from the shorter version (see the tab “Different versions of the dietary questionnaire”).
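The conversion from ordinal answers to times/day can be sketched as a simple lookup. Only the value for 1 time/week (1/7) is stated in the text; the values used here for the range categories (midpoints) and for the open-ended top category are assumptions for illustration, not the documented NSDD rule, and the category strings and function name are hypothetical.

```python
# Hypothetical lookup from the nine ordinal categories to times/day.
# Only "1 time/week" -> 1/7 is given in the text. Range categories use
# midpoints and the top category uses 4.0: both are assumptions.
TIMES_PER_DAY = {
    "never": 0.0,
    "once a year": 1 / 365,
    "1-3 times/month": 2 / 30,
    "1 time/week": 1 / 7,
    "2-3 times/week": 2.5 / 7,
    "4-6 times/week": 5 / 7,
    "1 time/day": 1.0,
    "2-3 times/day": 2.5,
    "4 times/day or more": 4.0,
}


def to_times_per_day(category):
    """Convert one ordinal answer (ma/mat style) to times/day (da/dat style)."""
    return TIMES_PER_DAY[category]


print(round(to_times_per_day("1 time/week"), 3))  # 0.143
```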

Intake frequencies have also been converted to intake expressed as grams/day. These use the portion sizes for potatoes/rice/macaroni, meat/fish, and vegetables that participants indicated in the photos, along with gender- and age-standardised portion sizes for other food groups and natural unit sizes for items such as fruit. These grams/day variables can also be requested in the withdrawal application. Please note, however, that valid measurements of daily intake ideally require a weighed dietary record, or at least a food frequency questionnaire with a large number of pictures of different portion sizes. This is not the case with the NSDD, which is based on a semi-quantitative food frequency questionnaire. Estimates of intake expressed in grams/day therefore do not provide better precision than intake expressed in frequency/day.

In addition to information on intake frequencies of individual foods and food groups, estimated energy and nutrient intakes are also available. Energy and nutrient intakes are calculated by multiplying the intake frequency (times/day) by the energy and nutrient content of each food, taken from the Swedish Food Agency’s database and based on gender- and age-standardised portion sizes (Johansson et al., 2002).
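As a rough illustration of the arithmetic described above, the sketch below multiplies frequency by portion size to get grams/day, and then by nutrient content to get intake per day. The function names, the per-100 g convention and all numeric values are hypothetical; the actual portion sizes and food composition values come from the NSDD and the Swedish Food Agency's database.

```python
def grams_per_day(times_per_day, portion_g):
    """Intake in grams/day = frequency (times/day) x portion size (g)."""
    return times_per_day * portion_g


def nutrient_per_day(foods, content_per_100g):
    """Sum a nutrient (or energy) over foods.

    foods: {food: (times_per_day, portion_g)}
    content_per_100g: {food: amount of the nutrient per 100 g of the food}
    """
    total = 0.0
    for food, (freq, portion_g) in foods.items():
        total += grams_per_day(freq, portion_g) * content_per_100g[food] / 100
    return total


# Hypothetical example values, not taken from the NSDD.
foods = {"potatoes": (1.0, 150.0), "bread": (2.5, 30.0)}
kcal_per_100g = {"potatoes": 77.0, "bread": 250.0}
print(nutrient_per_day(foods, kcal_per_100g))
```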

Different versions of the dietary questionnaire

Until the latter half of the 1990s, both the Västerbotten Intervention Programme and the MONICA study used the original dietary questionnaire, with questions about consumption of 84 different foods. Subsequently, some shorter versions (64–66 different foods), all of which are optically readable, have been used in the Västerbotten Intervention Programme.

Within the MONICA study, the original variant has been used with the exception of the 1990 screening, when a shorter version with 49 questions was used. Financial considerations were behind the shorter version. All the different versions used the identical scale of intake frequencies and identical sections for food questions and food combinations. The 2022 MONICA study used the updated version of the dietary questionnaire, FFQ2020.

Printed dietary questionnaires used 1985–2021:

The digital questionnaire FFQ2020

FFQ2020 is an updated and digitised version of the dietary questionnaires previously used in VIP and MONICA. The questionnaire has been revised to better capture current food choices and eating habits. It has 108 frequency questions. Researchers can apply to use FFQ2020 in their own dietary studies. Contact Anna Winkvist or Maria Wennberg for more information.

FFQ2020 was used in the 2022 MONICA screening. Contact info.brs@umu.se if you want to use dietary data from MONICA 2022 in research projects.

Creating standardised dietary data

An initial step investigated how many dietary questionnaires from the two main groups, i.e. the longer one (84 foods) and the shorter one (64–66 foods), and other versions, had been used. The different questionnaire versions were compared to see if any of them needed to be excluded. Criteria for exclusion of questionnaire versions were (i) frequency options deviate in an unmanageable way, (ii) information about portion size is missing, or (iii) the food questions differ in a way that is unmanageable. Based on these criteria, the following questionnaire versions were included in the NSDD:

  • Longer questionnaire (84 foods): the optically readable versions BASN, BASG, and BAS6, the MONICA questionnaire, and the non-optically readable “apricot” version.
  • Shorter questionnaire (64–66 foods): the optically readable versions AC4, AC5, AC6, AC00, AC03, AC05, and AC11.

Longer and shorter questionnaire versions have been harmonised. The variables da01–da84 have frequencies/day for research participants who responded to the longer questionnaire. These have been converted to correspond to dat01–dat66 by combining answers from a number of questions, so that the answers correspond to what was requested in the shorter survey. For example, da40 captures intake of fried potatoes and da41 intake of fries, and the questions come from the longer questionnaire with 84 questions. The variable dat31 captures intake of both fried potatoes and fries and comes from the shorter questionnaire. In a harmonised version of data from both questionnaire versions, responses from the long questionnaire have been recoded so that da40+da41 corresponds to dat31. This recoding is indicated on the withdrawal form.
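The recoding from long-questionnaire to short-questionnaire variables can be sketched as a table of sums. Only the da40 + da41 to dat31 pair is taken from the text above; the table name and function are hypothetical, and the complete recoding is the one given in the withdrawal form.

```python
# Hypothetical recoding table: short-questionnaire variable -> the
# long-questionnaire variables whose frequencies are summed to match it.
RECODE = {
    "dat31": ["da40", "da41"],  # fried potatoes + fries
}


def harmonise(long_row, recode=RECODE):
    """Build short-format (dat) frequencies by summing long-format (da) ones."""
    return {
        target: sum(long_row[source] for source in sources)
        for target, sources in recode.items()
    }


# A participant who eats fried potatoes once a week and fries 2.5 times a week.
row = {"da40": 1 / 7, "da41": 2.5 / 7}
print(harmonise(row))
```

Summing is appropriate here because the variables are already expressed in times/day; the ordinal ma/mat categories could not be added in the same way.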

A question about egg intake was introduced in later versions of the questionnaires for both VIP and MONICA. Contact PI Anna Winkvist or research advisor Maria Wennberg for more information if intake of eggs is central for the research question.

Researchers can request a data file from DietVIP that only has research participants who responded to the longer questionnaire, a data file with only research participants who responded to the shorter questionnaire, or a file with merged data, i.e., created with all research participants where recoding is done to harmonise the response options from the different questionnaire versions. It is therefore important to know what assumptions are used when merging the longer and shorter questionnaires into a single file. For this reason, carefully read the tab “How do I work with NSDD data” below before starting to analyse the data.

Researchers can request data from DietMON from the seven different screenings (1986, 1990, 1994, 1999, 2004, 2009, and 2014) or a combined data file. Note that the 1990 dietary questionnaire for the MONICA screening consists of only 49 questions and that no nutritional values have been calculated for it. Dietary data from the 2022 MONICA screening (MO2022) were collected with FFQ2020 and can be requested, but these have not yet been harmonised with dietary data collected with previous dietary questionnaires.

In addition, dietary data for nested case-control studies within the Northern Sweden Health and Disease Study (NSHDS) can be requested. These often include dietary data collected with all types of questionnaires, which must be considered when matching cases and controls.

Making withdrawals from the NSDD

The NSDD is a validated data collection originating from VIP (Västerbotten Intervention Programme) and the various MONICA screenings. It includes a large number of calculated dietary factors.

Contact the PI for NSDD, Anna Winkvist, or the scientific advisor specialising in nutrition, Maria Wennberg, to begin discussions and for advice on your planned dietary project.

Send the following documents to the Section of Biobank and Registry Support (BRS):

  • Withdrawal form for dietary data
  • Research plan
  • Ethical review application and approval

In the withdrawal form, write the project title, describe whether ethical approval has been received or is in process, provide a brief statement of the purpose of the project and explanation of why dietary variables are needed, and indicate the format in which data are requested. Also specify which parts of the NSDD are requested (VIP, long/short questionnaire version, MONICA, other criteria).

For the dietary variables, indicate which of all the food items are requested. Indicate whether you want frequency (times/day) and/or grams/day. If all foods are requested, this should be explained in the introductory withdrawal request describing the purpose of the research and the reason dietary variables are needed. Intake frequencies for 84 foods are available for participants who completed the longer questionnaire (da01–da84 variables). Intake frequencies for 66 foods are available for participants who completed the shorter questionnaire (dat01–dat66 variables). If a harmonised data file that includes participants who responded to both questionnaire versions is requested, the intake frequency from the longer questionnaire is converted to correspond to dat01–dat66 by combining answers from a number of questions in the 84-food questionnaire, so that the responses correspond to what was requested in the shorter questionnaire. The withdrawal form indicates how this merger was done. For example, the questions White (soft) bread (da12) and Thin crisp bread (da13) have been merged into one question (dat12), which is shown in the withdrawal form.

This is followed by a section where you indicate which calculated nutrients are requested. These are provided in weight/day according to appropriate units. If all nutrients are requested, this should be explained in the initial withdrawal section describing the purpose of the research and the reason dietary variables are needed. Some variables, for example phytoestrogens, require permission from the responsible researchers before they can be provided. Finally, there is a section on supplements and meal patterns.

Addresses

All documents are sent unsigned by email. We will contact you later for signatures.

Email: info.brs@umu.se

or by post to the following address:

Anette Forsgren
Section of Biobank and Registry Support
NUS, Building 5B, floor 1, destination point P11
Umeå University
901 87 Umeå

Note: If additional information about the research participants is requested beyond the data included in the variable list for NSDD, the request must be made using a variable withdrawal form, which is also to be sent to the Section of Biobank and Registry Support (info.brs@umu.se).

How do I work with NSDD data?

Excluding research participants because of incomplete responses

In the NSDD data files provided to researchers, no research participants have been excluded due to incomplete responses to the dietary questions. However, incomplete responses have been examined in the data and the results summarised in a variable (exclude) that can be used to exclude research participants judged to have very incomplete responses and/or biologically unreasonable values (see below).

No imputation is applied to the basic frequencies ma01–ma84 and mat01–mat66. Some corrections for incomplete responses are applied when converting from the basic variables to the variables da01–da84 and dat01–dat66. If a response is missing or illegible within a section of similar foods, a response for number of intakes per day is imputed by weighting the other responses within that section. For example, if a research participant has marked high intake of the spreadable fat Bregott, but information for spreadable low-fat margarine, butter or regular margarine is missing, these are set to “never” on the assumption that the research participant has a preference for a specific variant within the food group. If intake frequency for a food is completely missing, a response is imputed from the median intake of this food in the entire database, by gender and 10-year age group. Researchers who intend to use the recoded day frequencies (da and dat variables) must take this into account and decide, based on the exclude variable, whether individuals with many missing responses should be excluded.
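The median-based step described above can be sketched as follows. This is a minimal illustration, not the NSDD's actual procedure: the field names "gender" and "age", the way 10-year age groups are formed, and the handling of strata with no observed values are all assumptions.

```python
from statistics import median


def age_group(age):
    """Hypothetical 10-year age band, e.g. 47 -> 40."""
    return (age // 10) * 10


def impute_missing(rows, var):
    """Fill None in rows[var] with the median for the same gender and age band."""
    observed = {}
    for r in rows:
        if r[var] is not None:
            key = (r["gender"], age_group(r["age"]))
            observed.setdefault(key, []).append(r[var])

    filled = []
    for r in rows:
        value = r[var]
        key = (r["gender"], age_group(r["age"]))
        if value is None and key in observed:
            value = median(observed[key])
        # Rows in a stratum with no observed values stay missing (assumption).
        filled.append({**r, var: value})
    return filled


rows = [
    {"gender": "F", "age": 42, "da01": 1.0},
    {"gender": "F", "age": 47, "da01": 0.5},
    {"gender": "F", "age": 45, "da01": None},
    {"gender": "M", "age": 45, "da01": None},
]
result = impute_missing(rows, "da01")
print(result[2]["da01"])
```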

The “exclude” variable has been created to highlight for researchers those research participants who should potentially be excluded due to incomplete information. It is coded as follows:

  • 0: intake frequency is missing on less than 10% of the questions and information on portion sizes is complete; the individual is to be retained in the data file.
  • 1: intake frequency is missing on more than 10% of the food questions.
  • 2: any indication of portion size is missing, so that nutrient intake/day cannot be calculated.

Research participants lacking both of these data points are coded “exclude”=1. This variable can be used by researchers to exclude research participants depending on their research question. If the focus is on the intake of a few individual foods and does not involve the entire diet or intake of energy and nutrients, then research participants coded “exclude”=1 can also be included in the analyses.
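The coding rules above can be written out as a small function. This is a sketch of the stated rules only: the function name and input format are hypothetical, and the text does not say how a participant with exactly 10% missing answers is coded, so treating that case as 0 is an assumption.

```python
def exclude_code(frequency_answers, portion_answers):
    """Return 0, 1 or 2 following the coding described in the text.

    Both arguments are lists in which None marks a missing response.
    """
    n_missing = sum(answer is None for answer in frequency_answers)
    share_missing = n_missing / len(frequency_answers)
    portion_missing = any(answer is None for answer in portion_answers)

    if share_missing > 0.10:
        # Also covers participants lacking both kinds of data.
        return 1
    if portion_missing:
        return 2
    # Exactly 10% missing ends up here: an assumption, not stated in the text.
    return 0


complete = [1] * 84
many_missing = [None] * 10 + [1] * 74  # 10 of 84 is about 12% missing
print(exclude_code(complete, [1, 2, 3]))
print(exclude_code(many_missing, [1, 2, 3]))
print(exclude_code(complete, [1, None, 3]))
```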

Excluding research participants because of biologically unreasonable responses

Researchers should consider whether to identify and exclude research participants who have indicated biologically unreasonable daily intakes of energy and/or specific nutrients. To support this, the variable “FIL” (food intake level = estimated energy intake/basal metabolism) has been created. It is up to the researcher to decide on appropriate cut-off values based on the relevant question. Some have chosen to remove the highest and lowest 1%; others have removed participants with a “FIL” value below the 5th percentile or above the 97.5th percentile of the distribution in the population.

The cut-offs should be applied separately by gender and by questionnaire version (long or short). Use the “antfrag” variable to distinguish between the long and short questionnaire versions.
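A percentile-based exclusion applied within each stratum could look like the sketch below. The default cut-offs mirror one of the choices mentioned above (lowest 5% and highest 2.5%). The “FIL” and “antfrag” names come from the text; the "gender" field name, the example antfrag values and the linear-interpolation percentile are assumptions, and other percentile definitions give slightly different cut-offs.

```python
def percentile(values, p):
    """Linear-interpolation percentile of values, with p in [0, 100]."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100
    lower = int(k)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (k - lower)


def flag_fil_outliers(rows, low_p=5, high_p=97.5):
    """Flag rows whose FIL lies outside the cut-offs of their own stratum.

    Strata are defined by gender and questionnaire version (antfrag).
    """
    strata = {}
    for r in rows:
        strata.setdefault((r["gender"], r["antfrag"]), []).append(r["FIL"])

    cutoffs = {
        key: (percentile(values, low_p), percentile(values, high_p))
        for key, values in strata.items()
    }

    flags = []
    for r in rows:
        low, high = cutoffs[(r["gender"], r["antfrag"])]
        flags.append(r["FIL"] < low or r["FIL"] > high)
    return flags


# Hypothetical data: one stratum with FIL values 0.0, 1.0, ..., 100.0.
rows = [{"gender": "F", "antfrag": "long", "FIL": float(i)} for i in range(101)]
print(sum(flag_fil_outliers(rows)))
```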

Managing repeated measurements

A number of research participants within NSDD have participated in the VIP and the MONICA study on multiple occasions. In VIP, the variable “id” is unique for each individual and the variable “enumber” is unique for each occasion. In MONICA, the variable “pidnr” is unique for each individual and the variable “mo_seqno” is unique for each occasion.
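When working with repeated measurements, a first step is often to see which occasions belong to which individual. The sketch below groups occasion identifiers by individual using the variable names given above; the function name and the example values are hypothetical.

```python
from collections import defaultdict


def occasions_per_individual(rows, person_key="id", occasion_key="enumber"):
    """Map each individual to the list of their occasion identifiers.

    Defaults match the VIP variables; for MONICA, pass
    person_key="pidnr" and occasion_key="mo_seqno".
    """
    grouped = defaultdict(list)
    for r in rows:
        grouped[r[person_key]].append(r[occasion_key])
    return dict(grouped)


# Hypothetical VIP-style records.
vip_rows = [
    {"id": 1, "enumber": 101},
    {"id": 2, "enumber": 102},
    {"id": 1, "enumber": 103},
]
repeated = {
    person: occasions
    for person, occasions in occasions_per_individual(vip_rows).items()
    if len(occasions) > 1
}
print(repeated)  # {1: [101, 103]}
```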

Adjusting dietary energy intake in your analyses

In studies of the relationship between dietary intake and health outcomes, it is common to adjust for total energy intake. Individuals with a larger body mass and/or high physical activity eat more and thus have a greater intake of most nutrients, whereas researchers usually want to capture differences in nutritional quality between individuals. Energy adjustment is also an appropriate way to deal with under- and over-reporting, which are common methodological problems in dietary studies.

Ways of adjusting for total energy intake include expressing macronutrient intake as a percentage of energy, including total energy intake as a covariate in multivariable analyses, or the residual method. These methods are described in:

Willett WC, Howe GR, Kushi LH. Adjustment for total energy intake in epidemiologic studies. Am J Clin Nutr 1997;65(4):1220S-8S.

Further discussions on how different ways of adjusting energy affect what we measure in our analyses can be found in the following article:

Tomova GD, Arnold KF, Gilthorpe MS, Tennant PWG. Adjustment for energy intake in nutritional research: a causal inference perspective. Am J Clin Nutr 2022;115:189-98.
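Two of the approaches named above, energy percent and the residual method, can be sketched in a few lines. This is an illustrative implementation using ordinary least squares with a single predictor, not code supplied with the NSDD; the function names and example values are hypothetical, and the energy factor per gram is passed in by the caller.

```python
def energy_percent(grams, kcal_per_gram, total_kcal):
    """Share of total energy (in %) contributed by a macronutrient."""
    return 100 * grams * kcal_per_gram / total_kcal


def residual_adjust(nutrient, energy):
    """Residual method: regress nutrient intake on total energy intake.

    Returns each residual plus the intake predicted at mean energy, so the
    adjusted values stay on the original scale of the nutrient.
    """
    n = len(energy)
    mean_e = sum(energy) / n
    mean_n = sum(nutrient) / n
    sxx = sum((e - mean_e) ** 2 for e in energy)
    sxy = sum((e - mean_e) * (x - mean_n) for e, x in zip(energy, nutrient))
    slope = sxy / sxx
    intercept = mean_n - slope * mean_e
    predicted_at_mean = intercept + slope * mean_e  # equals mean_n
    return [
        x - (intercept + slope * e) + predicted_at_mean
        for e, x in zip(energy, nutrient)
    ]


# Hypothetical data: nutrient intake rises exactly in line with energy,
# so all energy-adjusted values collapse to the mean nutrient intake.
energy = [1500.0, 2000.0, 2500.0]
nutrient = [50.0, 70.0, 90.0]
print(residual_adjust(nutrient, energy))
print(energy_percent(75.0, 4.0, 2000.0))
```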

Validation studies

A validation of the original, longer food frequency questionnaire (84 food items) was conducted among 196 research participants with repeated 24-hour dietary interviews and measurement of plasma β-carotene levels (Johansson et al., 2002). Participants also completed the same food frequency questionnaire on two occasions, one year apart. Good agreement was obtained in energy and nutrient intake between the two questionnaire occasions. Slightly higher dietary intake was reported with the food frequency questionnaire compared to the 24-hour dietary interview. Despite a high proportion of under-reporting participants (Johansson et al., 2001), the validity of the questionnaire is considered to be equivalent to that of dietary questionnaires in other prospective cohort studies. Intake of a number of fatty acids, fatty acid profile in erythrocyte membranes (Wennberg et al., 2009), B-vitamin intake (Johansson et al., 2010) and phytosterol intake (Klingberg et al., 2013) have also been validated against 24-hour dietary interviews.

The updated version of the dietary questionnaire (FFQ2020) has been validated against 4–6 diet registrations (Wennberg et al., 2024). A comparison has also been made between intake according to FFQ2020 and intake according to the previously used dietary questionnaire (Winkvist et al., 2024).

References to the validation studies

STROBE 

STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) is a research initiative to improve the quality of epidemiological research so that reporting is done in an equivalent and comparable way worldwide.

www.strobe-statement.org

Over time, the STROBE criteria have been adapted for different subject areas and a growing number of scientific journals now support the STROBE initiative. For research on nutrition and health, STROBE-nut is an excellent checklist. As such, we recommend that all users of the Northern Sweden Diet Database read and use these:

STROBE-nut checklist
STROBE-nut expansion

 
Latest update: 2025-11-04