QR code nutritional data V1.0 (draft)

Introduction

In many countries it is mandatory for food labelling to include standardised nutritional information. There are many health apps that allow people to track their intake of key nutritional metrics.

There are also good reasons for foods to be clearly marked with allergens, and with information about whether they are suitable for vegetarians.

There is, however, very little reliable machine-processable data on food. A lot of crowd-sourced data exists, but it is incomplete and inaccurate. There are also many cases, such as made-to-order Subway sandwiches, where the data needs to be customised, and ideally included on the receipt.

This document proposes a system for encoding the basic nutritional information (not the full ingredients list) in a QR code.

This includes means to identify main allergen groups, and other attributes such as whether the item is suitable for vegetarians.

The proposal allows for extensions, allowing additional values, flags, etc, to be added in a backwards compatible way.

Main objectives

A compact 2D bar code providing key data for health, allergens, and key preferences (e.g. vegetarian). The design considers the alphabet used in QR codes, and a compact format, but also aims for a coding that can still be read by humans (once decoded from the QR code) if necessary.

Applications

The main concept is to allow mobile apps to read data on products, and provide that data to the user.

Alphabet

The QR code standard (ISO/IEC 18004) has a number of alphabets that can be used. The main alphanumeric alphabet uses a set of characters that includes upper case letters A-Z, digits 0-9, and symbols $, %, *, +, -, ., /, :, and space.

The design of this format makes use of this restricted character set so as to allow more compact encoding. However, readers should accept and parse lower case letters as equivalent to their upper case counterparts.

The final part of the coding allows for a product description string. It is recommended that the same restricted character set be used, but more extensive characters and even UTF-8 characters can be used if desired. To minimise the size of the QR code, the encoding should make use of the most efficient combination of encoding alphabets and switch during the code as necessary. This is one reason why the description is the last element in the format.
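
As a rough illustration of the two points above, the following Python sketch checks whether a string fits the QR alphanumeric alphabet and folds lower case to upper case before parsing. The function names are hypothetical, and leaving the description (everything from the first %) untouched is an assumption made here so that mixed-case descriptions survive.

```python
# The 45 characters available in QR alphanumeric mode.
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def fits_alphanumeric(text: str) -> bool:
    """True if every character can be encoded in QR alphanumeric mode."""
    return all(ch in QR_ALPHANUMERIC for ch in text)


def normalise(code: str) -> str:
    """Fold lower case to upper case in everything before the description.

    Assumption: the description (from the first % onward) is left as-is.
    """
    head, sep, description = code.partition("%")
    return head.upper() + sep + description


print(fits_alphanumeric("NUQR:EU1J1100"))   # True
print(fits_alphanumeric("nuqr:eu1j1100"))   # False
print(normalise("nuqr:eu1j1100%Tesco/Bombay mix"))
```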

Format

This is version 1.0. Simply adding new fields does not justify an increase in the major version number. Readers should ignore unknown fields in most cases (except allergens).

The encoding consists of a header and version followed by a number of fields. The fields can be of various types: scale, numeric data fields, information fields, allergens, flags, and the description.

The numeric data fields and information fields can be in any order among themselves, but they must all come before any allergens or flags. This is because parsing would not work if an allergen or flag came immediately before a numeric data field or information field. The description is always the last field, because it allows any characters within it.

The scale field applies to following numeric data fields, and so the placement of numeric data fields relative to the scale field is important.

Header

The format starts with NUQR: followed by the two letter country code for the data, and digits for the major version, in this case 1 for version one as described here.

The country code is used to indicate the reference definitions for the values, so where there is a harmonised region such as the EU, the country code can be simply EU. This is not the country of origin of the product, it is the country / region for the reference definitions.
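
A minimal Python sketch of reading the header might look like the following. The name parse_header and the choice to return the unparsed remainder are assumptions for illustration, not part of the format.

```python
import re

# NUQR: then a two letter country/region code, then the major version digits.
HEADER = re.compile(r"NUQR:([A-Z]{2})(\d+)", re.IGNORECASE)


def parse_header(code: str):
    """Return (region, major_version, rest_of_code) or raise ValueError."""
    m = HEADER.match(code)
    if m is None:
        raise ValueError("not a NUQR code")
    region = m.group(1).upper()
    major = int(m.group(2))
    return region, major, code[m.end():]


region, major, rest = parse_header("NUQR:EU1I8722700166221J1100")
print(region, major, rest)
```

A reader that only understands major version 1 could reject the code at this point if the returned version is higher.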

Scale

The default unit for mass is grammes (g) and for energy is kilojoules (kJ). Whilst numeric data fields can include decimal points, it is recommended that a scale is defined instead. This is done by a scale header. The scale header can appear anywhere, and indicates the scale from that point on. The scale applies to numeric data fields, whether mass, energy or some other unit.

The format is * followed by a scale. Each numeric value that follows is divided by this scale. E.g. *10 before mass units means mass is then in 0.1g units, so 123 would mean 12.3g.

The scale applies to all numeric values that follow, until replaced by a new scale header. A new scale header does not compound; it replaces the previous scale header, so *1 returns to single units.

It is recommended that fields not requiring a scale are included first, then the scale, then those that need the scale. E.g. J and W before a scale, and F, C, etc, after the scale, as per the example shown later. Applying a scale before J and W is valid, but is not expected to be needed and may simply cause confusion. It is also recommended that scales 1, 10, 100 are used and not obscure values like 3.
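
The replace-not-compound behaviour can be sketched in Python as below. This assumes the code has already been split into tokens and that field codes are single letters; read_values is a hypothetical helper, and Decimal is used only to avoid floating point rounding in the division.

```python
from decimal import Decimal


def read_values(tokens):
    """Apply the current scale to each numeric token in sequence."""
    scale = Decimal(1)                      # default: single units
    values = []
    for tok in tokens:
        if tok.startswith("*"):
            scale = Decimal(tok[1:])        # replaces the previous scale
        else:
            values.append((tok[0], Decimal(tok[1:]) / scale))
    return values


# J is read at scale 1, P at scale 10, and *1 returns S to single units.
print(read_values(["J1100", "*10", "P340", "*1", "S2"]))
```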

Numeric data fields

The numeric data fields consist of one (or more) letters followed by one (or more) digits. Only single letters are defined so far, but two (or more) letter fields could be added in the future.

The letter(s) represent the field type. The digits represent the value of the field. The digits can include decimal points, but it is recommended that the scale setting is used instead.

In addition, the digits can, in some cases, be followed by / and a secondary value.

Apart from W packet/portion size, all mass values are g per 100g (i.e. a percentage) but are subject to the scale setting. All energy values are in kJ per 100g, but are subject to the scale setting. Applications that use this data can present it per pack and per portion as necessary. If an application wishes to show data in other units such as Calories, or oz, that is a conversion to be done in the application; there is no variant format for different base units in the QR code format.
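
For instance, an application could derive per-portion figures and Calories along these lines. This is only a sketch; the function names are hypothetical, and the factor of 4.184 kJ per kcal is the usual thermochemical conversion rather than anything defined by this format.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie; an application-side choice


def per_portion(per_100g: float, portion_g: float) -> float:
    """Convert a per-100g value to the amount in a portion of portion_g grammes."""
    return per_100g * portion_g / 100


def kj_to_kcal(kj: float) -> float:
    """Convert kilojoules to kilocalories (food Calories)."""
    return kj / KJ_PER_KCAL


# 1100 kJ per 100g with an 8g portion gives 88 kJ, about 21 kcal.
energy_kj = per_portion(1100, 8)
print(energy_kj, round(kj_to_kcal(energy_kj), 1))
```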

Code  Primary value                   Secondary value
W     Mass of entire pack in grammes  Optional portion size in grammes
J     Energy in kJ per 100g
F     Fat                             Saturated
C     Carbohydrate                    Sugars
B     Fibre
P     Protein
S     Salt

A special case of a value that is just a . is valid to indicate a trace amount. Where there is zero present the field may simply be omitted.
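
Putting the pieces of this section together, a single numeric data field could be read roughly as follows. This is a sketch under assumptions: parse_numeric_field is a hypothetical name, a trace amount is represented by the string "trace", and an absent secondary value by None.

```python
import re
from decimal import Decimal

# Letters, then a value, then an optional / and secondary value.
FIELD = re.compile(r"([A-Z]+)([0-9.]+)(?:/([0-9.]+))?")


def _value(text, scale):
    if text is None:
        return None              # no secondary value given
    if text == ".":
        return "trace"           # a lone . marks a trace amount
    return Decimal(text) / scale


def parse_numeric_field(token, scale=Decimal(1)):
    """Return (code, primary, secondary) for one field such as C300/12."""
    m = FIELD.fullmatch(token.upper())
    if m is None:
        raise ValueError(f"not a numeric field: {token!r}")
    code, primary, secondary = m.groups()
    return code, _value(primary, scale), _value(secondary, scale)


print(parse_numeric_field("C300/12", Decimal(10)))  # carbohydrate 30, sugars 1.2
print(parse_numeric_field("F."))                    # trace fat
```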

Unknown fields should be ignored.

Information fields

In addition to the numeric fields as above, information fields in the same format are defined.

Code  Value
I     Identity (UPC/EAN bar code number)

Unknown fields should be ignored.

Allergens

Allergens should be included, if necessary. These can be in any order. Each allergen is given an alphabetic code of one or more characters. Each is included prefixed with $.

Unlike data fields, an unknown allergen should be flagged to the user as an unknown allergen code, with advice that the user reads the label.

Code  Meaning
G     Cereals containing gluten, namely: wheat (such as spelt and Khorasan wheat), rye, barley, oats
C     Crustaceans for example prawns, crabs, lobster, crayfish
E     Eggs
F     Fish
P     Peanuts
S     Soybeans
M     Milk
N     Nuts; namely almonds, hazelnuts, walnuts, cashews, pecan nuts, Brazil nuts, pistachio nuts, macadamia (or Queensland) nuts
CE    Celery (including celeriac)
MS    Mustard
SM    Sesame
SD    Sulphur dioxide/sulphites, where added and at a level above 10mg/kg in the finished product. This can be used as a preservative in dried fruit
LU    Lupin which includes lupin seeds and flour and can be found in types of bread, pastries and pasta
MO    Molluscs like clams, mussels, whelks, oysters, snails and squid
X     Generic unspecified allergen - READ THE LABEL! This is used where a code has yet to be allocated for some new allergen.
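
A reader might handle allergen codes, including unknown ones, along the lines of this Python sketch. The short names in the dictionary are abbreviated from the table above, and describe_allergen and the exact warning wording are assumptions for illustration.

```python
ALLERGENS = {
    "G": "Cereals containing gluten", "C": "Crustaceans", "E": "Eggs",
    "F": "Fish", "P": "Peanuts", "S": "Soybeans", "M": "Milk", "N": "Nuts",
    "CE": "Celery", "MS": "Mustard", "SM": "Sesame",
    "SD": "Sulphur dioxide/sulphites", "LU": "Lupin", "MO": "Molluscs",
    "X": "Unspecified allergen - READ THE LABEL",
}


def describe_allergen(token: str) -> str:
    """Describe a $-prefixed allergen token; never silently drop unknown codes."""
    code = token.lstrip("$").upper()
    if code in ALLERGENS:
        return ALLERGENS[code]
    # Unlike other fields, an unknown allergen must be surfaced to the user.
    return f"Unknown allergen code {code} - read the label"


for tok in ["$G", "$CE", "$ZZ"]:
    print(tok, "->", describe_allergen(tok))
```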

Flags

Additional flags can be added, in any order. These indicate some attribute of the product. They consist of either + or - followed by letters. The use of + asserts the flag is true, and - asserts it is not true. Absence of a flag makes no assertion.

For example +V means suitable for vegetarians but -V means not suitable for vegetarians. If there is neither, then the QR code does not stipulate whether the product is suitable for vegetarians or not.

Code  Meaning
V     Suitable for vegetarians
VG    Suitable for vegans
K     Kosher
KP    Kosher for Passover
H     Halāl

Unknown fields may be ignored.

The codes in the allergen and flag tables do not overlap. The allergen codes can therefore be used with + to indicate "may contain", i.e. where processed in the same factory, or with - to assert the allergen is definitely not present, e.g. -G to assert definitely gluten free.
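
The three possible states of a flag (asserted true, asserted false, not stated) map naturally onto True, False and None. The following Python sketch shows one way to hold them; read_flags, lookup and describe are hypothetical names, and the tokens are assumed to have been split out already.

```python
from typing import Dict, List, Optional

ALLERGEN_CODES = {"G", "C", "E", "F", "P", "S", "M", "N",
                  "CE", "MS", "SM", "SD", "LU", "MO", "X"}


def read_flags(tokens: List[str]) -> Dict[str, bool]:
    """Map each +CODE / -CODE token to True / False."""
    flags = {}
    for tok in tokens:
        sign, code = tok[0], tok[1:].upper()
        if sign not in "+-":
            raise ValueError(f"not a flag: {tok!r}")
        flags[code] = (sign == "+")
    return flags


def lookup(flags: Dict[str, bool], code: str) -> Optional[bool]:
    """True or False if asserted, None if the code makes no assertion."""
    return flags.get(code.upper())


def describe(code: str, asserted: bool) -> str:
    """Allergen codes used as flags mean 'may contain' or 'free from'."""
    if code in ALLERGEN_CODES:
        return f"may contain {code}" if asserted else f"free from {code}"
    return f"{code}: yes" if asserted else f"{code}: no"


flags = read_flags(["+V", "-G"])
print(lookup(flags, "V"), lookup(flags, "G"), lookup(flags, "VG"))
```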

Description

The final field is the description. The format is % followed by manufacturer, then / followed by description.
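
Splitting the description field could be sketched in Python as follows; parse_description is a hypothetical name, and splitting on the first / only (so that any later / stays in the description) is an assumption.

```python
def parse_description(field: str):
    """Split %MANUFACTURER/DESCRIPTION into its two parts."""
    if not field.startswith("%"):
        raise ValueError("description must start with %")
    manufacturer, _, description = field[1:].partition("/")
    return manufacturer, description


print(parse_description("%TESCO/BOMBAY MIX"))
```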

It is recommended that these fields use the same restricted alphanumeric upper case character set, but mixed case and UTF-8 can be used if desired. The QR code should be efficiently encoded, with changes of alphabet as necessary.

Parsing

The header must be present, and this is followed by digits for the version. After this the fields are parsed in different ways depending on the first character.

The content is broken into fields using the following rules, which allow for expansion of the specification.
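
One possible way to split a code on its first characters is sketched below in Python. The rule set is an assumption inferred from the field descriptions in this document (* for scale, $ for allergen, + or - for flag, % for description, letters for numeric and information fields); the names TOKEN and tokenize are hypothetical.

```python
import re

TOKEN = re.compile(
    r"(?P<header>NUQR:[A-Za-z]{2}[0-9]+)"           # header and major version
    r"|(?P<description>%.*)"                        # % runs to the end
    r"|(?P<scale>\*[0-9.]+)"
    r"|(?P<allergen>\$[A-Za-z]+)"
    r"|(?P<flag>[+-][A-Za-z]+)"
    r"|(?P<field>[A-Za-z]+[0-9.]+(?:/[0-9.]*)?)",   # numeric or information
    re.DOTALL,
)


def tokenize(code: str):
    """Split a whole code into (kind, text) pairs, or raise ValueError."""
    pos, tokens = 0, []
    while pos < len(code):
        m = TOKEN.match(code, pos)
        if m is None:
            raise ValueError(f"cannot parse at offset {pos}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


for kind, text in tokenize("NUQR:EU1J2067W200/25*10F289/58$N+V%TESCO/BOMBAY MIX"):
    print(kind, text)
```

With these rules an allergen placed directly before a numeric field, such as $GJ1100, fails to parse because the letters run together, which is why numeric and information fields must come first.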

Examples

Marmite

This is an example with no fibre, and trace fat content.

NUQR:EU1           Header, version 1, EU reference
I8722700166221     EAN barcode 8722700166221
J1100              1100kJ per 100g
W400/8             400g jar, 8g portions
*10                Units are 0.1g
F.                 Fat - trace amount
C300/12            Carbohydrates 30.0g, of which sugars 1.2g, per 100g
P340               Protein 34.0g, per 100g
S108               Salt 10.8g, per 100g
+V                 Suitable for vegetarians
$G                 Contains gluten
$CE                Contains celery
%UNILEVER/MARMITE  Manufacturer UNILEVER, Product MARMITE

The entire code is therefore:

NUQR:EU1I8722700166221J1100W400/8*10F.C300/12P340S108+V$G$CE%UNILEVER/MARMITE

This encodes to a QR code for example:

Bombay mix

NUQR:EU1          Header, version 1
I5000436850816    EAN barcode 5000436850816
J2067             2067kJ per 100g
W200/25           200g pack, 25g portions
*10               Units are 0.1g
F289/58           Fat 28.9g, of which saturates 5.8g, per 100g
C415/95           Carbohydrates 41.5g, of which sugars 9.5g, per 100g
B89               Fibre 8.9g, per 100g
P130              Protein 13.0g, per 100g
S12               Salt 1.2g, per 100g
$N                Contains nuts
$G                Contains gluten
$SD               Contains sulphur dioxide
+V                Suitable for vegetarians
%TESCO/BOMBAY MIX Manufacturer TESCO, product BOMBAY MIX

The entire code is therefore:

NUQR:EU1I5000436850816J2067W200/25*10F289/58C415/95B89P130S12$N$G$SD+V%TESCO/BOMBAY MIX

This encodes to a QR code for example:

Authors

Idea proposed by James Kennard, specification version 1.0 by Adrian Kennard, Dec 2017.